[semi-off-topic] SFNTQA / Porting FontValidator from C# to Python? #416
For the sake of completeness, I’m attaching below my writeup proposing SFNTQA from June 2014 that I circulated among some interested parties. I'd like to revive the idea that I pitched to Behdad in November 2013 under the name "pyftvalidator". For the time being, I'd like to call the project / tool "SFNTQA". The actual final name may differ. In short, I'm proposing:
Below is a more detailed discussion of my proposal. Please feel free to comment, and also to add other relevant people to this thread if needed.
What We Had in the Past
[MSFontVal] Microsoft Font Validator
Probably the first tool that tackled SFNT validation was Microsoft Font Validator (MSFontVal for short). It was originally very useful as a GUI validator (written as a .NET GUI app), but today it's rather dated. It hasn't seen updates in years, so, for example, it complains about "ss01" being an "unregistered tag" and doesn't recognize recent additions to the OT standard. I don't think Microsoft is interested in maintaining it. Still, the report style pioneered by MSFontVal (XML+CSS) was rather neat. It established a nice model for generating reports, with four color-coded levels (info, pass, warning, error), and it has inspired some other efforts. Both FontVal and FontQA generate XML+XSL reports. They don't render well in modern browsers, just MSIE, because both are quite old packages. But they show the principle.
What We Have Now: Python
Two packages that do some QA/validation and are written in Python already exist.
[FontQA] FontQA
Pros:
Cons:
[AFDKO] Adobe FDK for OpenType
The CompareFamily.py tool is particularly interesting (though there are some other tools of interest, e.g. spot).
Pros (CompareFamily):
Cons (CompareFamily):
What We Have Now: Commandline / Lib
In addition, there are a number of command-line tools that work on different OSes, can be run externally, and return some useful results.
[TTX] fontTools/TTX
[WOFFT] woffTools
[OTS] OT Sanitiser
[FTX] OSX Font Tools Release 4, Beta 1
[FT2] FreeType 2 Demos
[FF] FontForge
[HB] HarfBuzz
[sfntly] Google sfntly
General Scope Of Validation
Here are several types of mandatory and optional features for validation, in increasing order of complexity.
Mandatory features
Optional features
General Process Of Validation
Generally speaking, a validation tool should be able to perform the following operations:
Input
Processing
Output
Proposal For The SFNTQA Package
Python
I believe that we will all agree that Python is probably the best starting point for considering validation of SFNT fonts:
Input
Architecture
Organization of validation
The idea for this condition is that three "levels" of tests for a particular aspect could exist:
Interoperability / external tools
Documentation
Technical requirements
Examples
Initial examples of tests that could be implemented are:
Character set
a) testing the character set against a defined character set. As test data, the Adobe Latin 1 character set could be used:
b) testing whether all input fonts have the same character set
Linespacing
a) testing whether each set of linespacing values (hhea.Asc/Desc/LGap and OS/2.WinAsc/Desc and OS/2.TypoAsc/Desc/LGap) is consistent across all input fonts
b) testing whether the three sets of linespacing values within each font yield the same total value
c) reporting what effective line height in relation to the UPM the total linespacing values will yield (in "em" or percentage, e.g. 120% if the UPM is 1000 and the total linespacing is 1200).
SFNT tests
For a set of SFNT tables which are reasonably expected to have the same values across all fonts, or in which at least some fields are expected to have the same values, test whether that is true.
Conclusions
These are the initial conditions I can think of for such a package. I think once an initial framework is in place, we will find developers in the community who will implement additional modules, especially since people from Google, Adobe, the UFO/RoboFab community, but also FontLab and potentially Apple are already familiar with TTX. As you see, my goal is to have a multi-headed beast of a system. But I think the font community is really ready for this system. The problem so far has been that most existing tools were written by people who don't really have much of an idea of good software architecture, and if in Python, they were written using old Python paradigms (old-style classes, no class inheritance, etc.). The best "worst" example is the Adobe CompareFamily.py tool, which does its job but is not really done in a "proper" way (as Adobe acknowledges).
We would need
Someone needs to design the foundation of the framework with extensibility in mind. Someone needs to set up this pluggable architecture and tell the community how to extend it. Then, we'll see contributions. I'm certain that I'll contribute, but I'm not able to take on the duties described above. I would certainly act as a consultant/contributor (unpaid of course), and so, probably, would some of you. So I'd like all of you to think about it yourselves but also, if possible, gather additional support from wherever you can. There are additional players in the industry whom I'd like to involve in this while the project gets some initial traction. Please let me know what you think about this. Best, |
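To make the "pluggable architecture" idea in the writeup above a bit more concrete, here is a minimal sketch of what a check registry could look like, using the linespacing tests from the Examples as plugins. Everything in it is an assumption made for illustration: the names `Level`, `Result`, `register`, `REGISTRY` and `run_all` are hypothetical, and a "font" is just a plain dict of metric values rather than a real font object. A real tool would presumably fill those values from the head, hhea and OS/2 tables (for example via fontTools).

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Dict, List


class Level(Enum):
    # The four report levels mentioned for MSFontVal-style reports.
    INFO = "info"
    PASS = "pass"
    WARNING = "warning"
    ERROR = "error"


@dataclass
class Result:
    check: str
    level: Level
    message: str


# Hypothetical stand-in for a font: a dict of metric values.
Font = Dict[str, int]
Check = Callable[[List[Font]], List[Result]]

# Hypothetical plugin registry: check name -> check function.
REGISTRY: Dict[str, Check] = {}


def register(name: str) -> Callable[[Check], Check]:
    """Decorator that adds a check function to the registry."""
    def decorator(func: Check) -> Check:
        REGISTRY[name] = func
        return func
    return decorator


def linespacing_totals(font: Font) -> Dict[str, int]:
    # hhea and typo descenders are negative numbers; winDescent is positive.
    return {
        "hhea": font["hheaAscent"] - font["hheaDescent"] + font["hheaLineGap"],
        "win": font["winAscent"] + font["winDescent"],
        "typo": font["typoAscender"] - font["typoDescender"] + font["typoLineGap"],
    }


@register("linespacing/same-total")
def check_same_total(fonts: List[Font]) -> List[Result]:
    """Do the three sets of linespacing values yield the same total?"""
    name = "linespacing/same-total"
    results = []
    for i, font in enumerate(fonts):
        totals = linespacing_totals(font)
        if len(set(totals.values())) == 1:
            results.append(Result(name, Level.PASS, f"font {i}: totals agree"))
        else:
            results.append(Result(name, Level.WARNING, f"font {i}: totals differ: {totals}"))
    return results


@register("linespacing/line-height")
def report_line_height(fonts: List[Font]) -> List[Result]:
    """Report the hhea total as a percentage of the UPM."""
    name = "linespacing/line-height"
    results = []
    for i, font in enumerate(fonts):
        percent = 100 * linespacing_totals(font)["hhea"] / font["unitsPerEm"]
        results.append(Result(name, Level.INFO, f"font {i}: {percent:.0f}% of UPM"))
    return results


def run_all(fonts: List[Font]) -> List[Result]:
    """Run every registered check, in name order."""
    results: List[Result] = []
    for name in sorted(REGISTRY):
        results.extend(REGISTRY[name](fonts))
    return results
```

With a UPM of 1000 and a total linespacing of 1200, the second check reports 120%, matching the example in the writeup. The point of the decorator is only that a contributor can add a new check without touching the runner; whether a real SFNTQA would use decorators, entry points or something else is left open.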
@davelab6 ping |
I think this consolidation is a fine idea :) however, I'm not sure who wants to actually put the work in. I wonder if any Googlers would volunteer? :) |
My current thinking:
|
TL;DR... (only recently learned this short-hand, seems appropriate to use). Am a bit disappointed that neither Dave nor Behdad mentioned our previous small-group discussion: .net dll's are usable as python classes under IronPython (python interpreter implemented on top of .net - it is shipped as a standard part of Mono). Assuming ttx runs under IronPython, you can sort of seamlessly go between ttx and Font Validator's internals within the same python session. I did some experiments in that direction (as in playing with Font Validator's internals under IronPython) rather early on. Didn't spend more time on it, just because there wasn't any demand/interest. |
The discussion happened towards the very beginning, around or even before the time the MS code fell into my hands. I did a bit of looking in that direction, because, you know, the original plan with Font Validator was just that some of its code would be released to allow its know-how to go into ttx/fonttools... |
Actually I have documented this possibility: in the Extended TODO's, line 45 "IronPython scripted access to Font Validator's internals." |
Humm. Not sure if I was part of that discussion. |
@behdad : Argh. sorry. @davelab6 , @anthrotype , as well as of course @aaronbell , were in on that 2-3 e-mail discussion about IronPython. Here is a quick intro about IronPython:
In terms of which part(s) of Font Val you want to draw from to enhance FontTools, I have come to the conclusion that the main value of Font Val is in the glyf tests and the rasterization tests. The proprietary rasterization test backend was not released by Microsoft, and that part is pending enhancement in FreeType ( HinTak/Font-Validator#5 ) to do that; but once FreeType is up for it, FontTools does not need to go through FontVal to get at that, but can interact with FreeType directly, so that's outside the current discussion. If you want to port over the glyf test from Font Val, the bulk of the code is in "GMath" and "Glyph" . Yes, the glyf test in Font Val has two dll's devoted to it, whereas all the rest of the table parsing machinery is in "OTFontFile" , and all the table checking machinery is in "OTFontFileVal" . I reckon FontTools does not need either of those and people looking to put some of Font Val's logic into FontTools should concentrate on "GMath" and "Glyph", but you are free to think otherwise. |
I'd be interested to know how much of FontTools works under IronPython, so if there is a testsuite or something that doesn't take too much (human) time to do, I'd be interested to hear about it. |
You just need to |
@behdad @davelab6 @anthrotype @aaronbell @twardoch @miguelsousa @n8willis : I needed a quick and dirty script to get at fonts' creation/modification times, so I wrote one in IronPython - it demonstrates how to use CPython modules with IronPython, and how to access Font Validator's internals from a python script. Edit the two lines after "NOTE:" to your preference. You need MS Font Validator, mono, ironpython (it comes with the windows version of mono definitely, and possibly also the mac OS version); CPython is optional. IronPython example to use Microsoft Font Validator's internals to get at font creation and modification times: |
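The IronPython script itself is not reproduced in this thread. As a separate illustration of the same task, here is a minimal sketch that reads the creation and modification times straight from the head table in plain CPython, without Font Validator or IronPython. It is not the approach described above: it parses the sfnt table directory by hand with `struct`. The function names `find_table` and `head_timestamps` are made up for this sketch, and it assumes a plain TrueType/OpenType file (no TTC, WOFF or error recovery).

```python
import struct
from datetime import datetime, timedelta, timezone

# LONGDATETIME fields count seconds since 1904-01-01 00:00 UTC.
SFNT_EPOCH = datetime(1904, 1, 1, tzinfo=timezone.utc)


def find_table(data: bytes, tag: bytes, font_offset: int = 0):
    """Return (offset, length) of a table in an sfnt font, or None."""
    # numTables is a uint16 at byte 4 of the 12-byte offset table.
    num_tables = struct.unpack_from(">H", data, font_offset + 4)[0]
    for i in range(num_tables):
        # Each table record is 16 bytes: tag, checksum, offset, length.
        record = font_offset + 12 + 16 * i
        rec_tag, _checksum, offset, length = struct.unpack_from(">4sIII", data, record)
        if rec_tag == tag:
            return offset, length
    return None


def head_timestamps(data: bytes):
    """Return (created, modified) from the head table as aware datetimes."""
    found = find_table(data, b"head")
    if found is None:
        raise ValueError("no head table found")
    offset, _length = found
    # created and modified are signed 64-bit values at byte offsets
    # 20 and 28 within the head table.
    created, modified = struct.unpack_from(">qq", data, offset + 20)
    return (SFNT_EPOCH + timedelta(seconds=created),
            SFNT_EPOCH + timedelta(seconds=modified))
```

Typical use would be something like `head_timestamps(open(path, "rb").read())`, where `path` is whatever font file you want to inspect.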
I was a bit wrong about some of the IronPython bits - current ipy releases are at 2.7.5 and 3.0, and not too far behind; and should be sufficient to run the pure python part of ttx/FontTools . |
@behdad @davelab6 @anthrotype @aaronbell @twardoch @miguelsousa @n8willis : Another IronPython example script to use FontVal's internals via python - this one does the same thing as "ttx -l ", except it recurses through ttc members and also takes multiple font file arguments. https://gist.github.com/HinTak/3cbc107a4394b0546f4647ecbe49d92c |
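For readers who just want to see what "listing tables while recursing through ttc members" involves, here is a rough sketch in plain CPython. It is not the gist linked above and does not use Font Validator: it reads the TTC header and the sfnt table directories directly with `struct`. The function names `font_offsets` and `list_tables` are invented for this example, and there is no validation of the input.

```python
import struct


def font_offsets(data: bytes):
    """Offsets of each member font: one for a plain sfnt, several for a TTC."""
    if data[:4] == b"ttcf":
        # TTC header: tag (4), major/minor version (2+2), numFonts (4),
        # then one uint32 offset per member font.
        num_fonts = struct.unpack_from(">I", data, 8)[0]
        return list(struct.unpack_from(f">{num_fonts}I", data, 12))
    return [0]


def list_tables(data: bytes):
    """Return one list of (tag, offset, length) tuples per member font."""
    fonts = []
    for base in font_offsets(data):
        num_tables = struct.unpack_from(">H", data, base + 4)[0]
        tables = []
        for i in range(num_tables):
            # 16-byte table record: tag, checksum, offset, length.
            tag, _checksum, offset, length = struct.unpack_from(
                ">4sIII", data, base + 12 + 16 * i)
            tables.append((tag.decode("latin-1"), offset, length))
        fonts.append(tables)
    return fonts
```

Handling several font file arguments is then just a loop over `sys.argv[1:]` that reads each file and prints the result of `list_tables`.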
To be honest, I'm not sure what value this provides... |
I am not sure either, but there are at least 3 people who said re-writing FontVal into python is a good idea :-). I am just offering an alternative: the IronPython interpreter is a .NET executable - in fact it is a small wrapper around about 3 dotnet/mono dll's. So in principle (well, it is not too hard), I can bundle it with FontVal to give it the ability to access any native python script or modules, old or new. That includes ttx, and the parts of afdko which are reasonably standalone. Instead of re-writing FontVal into Python, it can take on an embedded python interpreter, and be able to access and process some python scripts/modules. It is about an extra 4MB. |
Well, porting it to Python is useful to avoid having to install mono. You add IronPython, and now we have two problems. :) |
Haven't you heard? Since January, FontVal does not need a separate installation of mono on Mac OS X anymore. (and it shouldn't need mono on windows). Adding IronPython does not change that. I don't package it for Linux that way, because the system call interfaces on Linux change too often, and executables just aren't very transferable between Linux boxes of different vintage/distro. |
Huh? |
More FontVal-based IronPython scripts, to merge/split ttc's. |
@schriftgestalt @aaronbell - The sheer number of parameters set-able for the rasterization test ( HinTak/Font-Validator#5 ) is huge. It would actually make sense to just add one new option "-rp-script parameters.py" to set all of them in a heavily annotated python script. That adds an extra 4MB to bundle an embedded IronPython interpreter core, instead of adding a dozen options to set them individually, and adding all the option-parsing and string-to-number conversion code. The total of rasterization parameters is basically all the public members of the ValidatorParameters class ( https://github.com/HinTak/Font-Validator/blob/master/OTFontFileVal/ValidatorParameters.cs ) plus all of the RastTestTransform class ( https://github.com/HinTak/Font-Validator/blob/master/OTFontFileVal/RastTest.cs ). That's 4 for stretchX, stretchY, skew, rotation, a 3x3 matrix for transform, 3 toggles for BW/Gray/ClearType, the 4 ClearType toggles, x/Res, and a list of point sizes. |
There does not need to be any explicit conversion - one can construct the two parameter classes directly on the python side, in a script. |
I’m not sure if the raster test needs to expose all those settings. And do you really need a python environment to parse one settings file? |
@schriftgestalt : At least during the testing phase, it would be nice to be able to set all the options individually - of course most people don't need to do that, and can simply not load a script. The script is not compulsory. In that case, it would just be using the internal default settings, just like most people don't change the GUI options for those.
There are already 3 other python scripts that extend FontVal (not counting the first one; the ttc splitter/mergers are quite useful). Might as well include a script-parsing functionality. It is only 4 MB extra for the binary, and very little extra new code - a lot less than it would be for all the individual options, parsing and converting from strings to numbers. I don't need a "python" environment - but a way of running a simple script (in whatever language variant) is nice. C# is actually also script'able, and the additional code to load a C# script vs a python script is about the same. |
@schriftgestalt @aaronbell : Here is the already tested python script to set any of the rasterization test parameters: https://gist.github.com/HinTak/cf7528519552f0eded236fa8e9f41247 When the rasterization test works (and the embedded python interpreter is hooked up), you do "FontValidator -rp-parms editedversion.py ..." in an expert-usage mode - not using "-rp-parms" just lets the defaults do their job. It could do with a lot more comments and annotations, of course. In https://github.com/HinTak/Font-Validator/blob/master/FontValidator/CmdLineInterface.cs , line 203, it already does "ValidatorParameters vp = new ValidatorParameters();", so it needs only a rather small number of new lines a bit below to over-write vp, with something like this:
Much cleaner than adding piles of new code just to parse and convert a dozen strings to numbers or booleans. |
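As a language-neutral illustration of the "defaults unless a script overrides them" idea discussed above, here is a small sketch in plain CPython. It is only an analogy: the real proposal runs the script under an embedded IronPython interpreter and builds Font Validator's own C# parameter classes, whose members are not reproduced here. The class `RasterParams`, its field names and default values, and the function `load_params` are all hypothetical and chosen only to mirror the kinds of settings listed in the thread.

```python
from dataclasses import dataclass, field, fields
from typing import List, Optional


@dataclass
class RasterParams:
    # Hypothetical names and defaults, not Font Validator's actual members.
    stretch_x: float = 1.0
    stretch_y: float = 1.0
    skew: float = 0.0
    rotation: float = 0.0
    test_bw: bool = True
    test_gray: bool = True
    test_cleartype: bool = True
    x_res: int = 96
    y_res: int = 96
    point_sizes: List[int] = field(default_factory=lambda: [8, 9, 10, 11, 12])


def load_params(script_source: Optional[str] = None) -> RasterParams:
    """Start from the defaults and let an optional script override them."""
    params = RasterParams()
    if script_source is None:
        # No script given: the internal defaults do their job.
        return params
    # The script sees a single name, `params`, and assigns to its attributes.
    # exec runs arbitrary code, so only trusted scripts should be loaded.
    exec(script_source, {"params": params})
    # Catch typos such as `params.rotatoin = 15`.
    unknown = set(vars(params)) - {f.name for f in fields(RasterParams)}
    if unknown:
        raise AttributeError(f"unknown parameter(s): {sorted(unknown)}")
    return params
```

A parameter script is then just a few annotated assignments, for example `params.rotation = 15.0` followed by `params.point_sizes = [12, 24]`, and no string-to-number conversion code is needed because the script already produces typed values.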
I have bundled 4MB of IronPython's dll with FontVal. So it just works - you can do 'FontValidator.exe somescript.py ...' now, where you might be doing 'ipy.exe somescript.py ...' earlier. Announcement on binary: code: It addresses #17 in a way too. |
Apologies for a semi-off-topic issue.
Microsoft recently opensourced Font Validator (FontVal) under the super-liberal MIT license. Somewhat unfortunately, Font Validator is written in C#, but it seems to me that, generally speaking, that code is relatively self-contained, and seems to be cleanly written. Do you think it would be feasible to port it to Python (to make "PyFontVal" — which shouldn't really become part of fontTools, much more likely a wholly-separate package)? Given that C# is more strongly typed than Python, it should be possible, at least theoretically.
I've just looked at one function, val_JSTF.cs, and tried to automatically convert a bit of it to Python using the VaryCode converter, which, after the free registration, limits the conversion to 2048 characters.
Here's the input:
and here is the generated Python code:
At least it does not look bad. Perhaps, after a conversion and some refactoring, this could be used as a basis for a more-generic font validation library which could integrate fontTools, the "PyFontVal" port and a somewhat refactored AFDKO CompareFamily.py ?