Unexplained phrase in GPOS explanation
Theunis de Jong
Posts: 112
In Microsoft's official OpenType spec page on GPOS Pair Positioning, PosFormat 1, there is a small side-remark which is not explained any further:
A PairPos subtable also defines an offset to a Coverage table
(coverageOffset) that lists the indices of the first glyphs in each
pair. More than one pair can have the same first glyph, but the Coverage
table will list that glyph only once.
(my emph.)
The problem lies here:
When investigating a font, you can write out the PairPos code for a font such as Unicode.org's TestGPOSTwo.otf (https://github.com/unicode-org/text-rendering-tests/tree/master/fonts) like this:
subtable 1:
  posFormat: 1
  coverage: uni25EF
  valueFormat1: X_ADVANCE
  valueFormat2: (None)
  pairSetCount: 2
    pairValueCount: 1
      uni25EF -> sun x1=-800
    pairValueCount: 1
      ...
Now what character comes at the '...'? PairSetCount is 2; and
A PairSet table enumerates all the glyph pairs that begin with a covered glyph.
so this must indeed be a case where uni25EF would originally have appeared twice in the coverage list, and was therefore included only once.
For other fonts, the value of pairSetCount is equal to the length of Coverage; each covered glyph only gets used once as the "first glyph". The specification acknowledges this may not be the case but does not offer an alternative.
A number of other tools and libraries agree with my interpretation so far (opentype.js, otl_dump.php). The otherwise reliable DTL OTMaster Light reports the second occurrence as '.notdef'.
I am unable to write a valid .fea file that exactly mimics TestGPOSTwo's behaviour. makeotf complains that it sees two identical kerning pairs and only uses the larger value (it must be a common mistake), and my Feature-Fu is not strong enough to fool it otherwise.
It is very likely TestGPOSTwo.otf was artificially constructed, but the accompanying document states:
The second subtable has two PairSets, both kerning
◯ U+25EF LARGE CIRCLE and ☼ U+263C WHITE SUN WITH RAYS. The first
PairSet applies kerning so that the two symbols will exactly
overlap. If the second PairSet was applied, it would add spacing
to move the two symbols away from each other. But a correct text
rendering engine should walk the PairSets in the order given by
the font, and stop processing after finding the first
match.
which indicates that the first character ought to be valid for both pairs, even though a bad renderer might use the second value instead of the first.
Any insights on this?
Comments
-
It's not clear to me what you're asking. The subject seems to suggest you're looking for clarification on the OT spec. But your first specific question is in this context:
When investigating a font, you can write out the PairPos code for a font such as Unicode.org's TestGPOSTwo.otf (https://github.com/unicode-org/text-rendering-tests/tree/master/fonts) like this:
subtable 1:
  posFormat: 1
  coverage: uni25EF
  valueFormat1: X_ADVANCE
  valueFormat2: (None)
  pairSetCount: 2
    pairValueCount: 1
      uni25EF -> sun x1=-800
    pairValueCount: 1
      ...
now what character comes at the '...'?
Now, guessing at how this higher-level interpretation is formatted, I would expect what is presented at the '...' would be the glyph ID for the second glyph in the first glyph pair.
Again, guessing at this higher-level format, it appears that the coverage table lists exactly one glyph, uni25EF. But if PairSetCount is 2, there need to be at least two glyphs in the coverage table for the font to be valid: for each PairSet table, there needs to be a separately listed glyph in the coverage table. If there is positioning data for two glyph pairs and both pairs have uni25EF as the first glyph, then there must be one PairSet table, not two, and that one PairSet table must include two PairValueRecords.
PairSetCount is 2;
Something seems amiss. It's not clear to me if the font itself is invalid (I'd guess not), or (more likely) the dump you're showing is not an accurate reflection of the font.
and
A PairSet table enumerates all the glyph pairs that begin with a covered glyph.
so this must indeed be a case where originally uni25EF appeared twice in the coverage list – and thus included only once.
For other fonts, the value of pairSetCount is equal to the length of Coverage; each covered glyph only gets used once as the "first glyph".
Just so: that's how the format is spec'd. To clarify: the sequence of PairSet tables corresponds respectively to the sequence of glyphs in the coverage table, and each PairSet table provides positioning data for all pairs that begin with a given glyph in the coverage table (hence the sequence of glyphs in the coverage table must be at least as long as the array of PairSet table offsets).
The specification acknowledges this may not be the case but does not offer an alternative.
Eh?? If you're saying that the spec allows for a covered glyph to be used twice—i.e., to pertain to two different PairSet tables—then that is incorrect: the spec does not allow for that.
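To make the correspondence described above concrete, here is a minimal sketch of a strict Format 1 lookup, using plain Python lists in place of the binary tables. The function name `pair_adjustment` and the data layout are assumptions for illustration only; no real shaping engine is claimed to work exactly this way, and a lenient engine might ignore the extra PairSet instead of rejecting the subtable.

```python
def pair_adjustment(coverage, pair_sets, first, second):
    """Return the XAdvance adjustment for (first, second), or None.

    coverage  -- list of first-glyph names, in Coverage Index order
    pair_sets -- list of PairSets; each is a list of (second_glyph, x_advance)
    """
    # Format 1 pairs each covered glyph with exactly one PairSet.
    if len(coverage) != len(pair_sets):
        raise ValueError(
            f"invalid subtable: {len(coverage)} covered glyph(s) "
            f"but {len(pair_sets)} PairSet(s)"
        )
    if first not in coverage:
        return None
    # The coverage index of the first glyph selects its PairSet.
    pair_set = pair_sets[coverage.index(first)]
    for second_glyph, x_advance in pair_set:
        if second_glyph == second:
            return x_advance  # first matching record wins
    return None


# One covered glyph, one PairSet: the layout the spec describes.
print(pair_adjustment(["uni25EF"], [[("sun", -800)]], "uni25EF", "sun"))

# The layout shown in the dump: one covered glyph, two PairSets.
try:
    pair_adjustment(["uni25EF"], [[("sun", -800)], [("sun", 200)]],
                    "uni25EF", "sun")
except ValueError as exc:
    print(exc)
```

With only one glyph in the coverage list, the second PairSet has no coverage index that could ever select it, which is why the dump cannot name a first glyph for it.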
-
Apologies for providing an interpreted version of the original binary information. Here it is:
00 01 00 26 00 04 00 00 00 02 00 0E 00 14 00 01 00 02 FC E0 00 01 00 02 00 C8
(where the first 00 01 is the posFormat value). TTX shows it as
<PairPos index="1" Format="1">
  <Coverage Format="1">
    <Glyph value="uni25EF"/>
  </Coverage>
  <ValueFormat1 value="4"/>
  <ValueFormat2 value="0"/>
  <!-- PairSetCount=2 -->
  <PairSet index="0">
    <!-- PairValueCount=1 -->
    <PairValueRecord index="0">
      <SecondGlyph value="sun"/>
      <Value1 XAdvance="-800"/>
    </PairValueRecord>
  </PairSet>
  <PairSet index="1">
    <!-- PairValueCount=1 -->
    <PairValueRecord index="0">
      <SecondGlyph value="sun"/>
      <Value1 XAdvance="200"/>
    </PairValueRecord>
  </PairSet>
</PairPos>
(one you may be more familiar with)
It's not clear to me if the font itself is invalid (I'd guess not) ...
The font itself must be considered invalid. Only now I found this (apologies, again), the exact phrase I was overlooking earlier on:
The PairSet array contains one offset for each glyph listed in the Coverage table and uses the same order as the Coverage Index.
and the suspect behavior tested in https://github.com/unicode-org/text-rendering-tests under GPOS-2 is not due to a specific renderer fault (which is what the test is intended to detect) but to the offered scenario itself: a font where length(Coverage) != PairSetCount is simply an invalid font.
I am going to ignore the results from this test, then. I should probably post it as an issue on the Unicode Github page with a request to withdraw this particular test.
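For anyone who wants to check the hex dump by hand, the following is a minimal sketch that decodes the 26 quoted bytes with Python's `struct` module. It assumes the PairPos Format 1 header layout from the spec (five uint16 fields followed by the PairSet offsets) and handles only the case present here, where valueFormat1 is 4 (X_ADVANCE) and valueFormat2 is 0. The function name is hypothetical.

```python
import struct

# The 26 bytes quoted in the post, big-endian as in every OpenType table.
DATA = bytes.fromhex(
    "0001 0026 0004 0000 0002 000E 0014 "
    "0001 0002 FCE0 "
    "0001 0002 00C8"
)


def decode_pairpos_format1(data):
    """Decode a PairPos Format 1 subtable whose only value is XAdvance."""
    (pos_format, coverage_offset, value_format1,
     value_format2, pair_set_count) = struct.unpack_from(">5H", data, 0)
    if pos_format != 1:
        raise ValueError("not a PairPos Format 1 subtable")
    if (value_format1, value_format2) != (0x0004, 0x0000):
        raise NotImplementedError("sketch only handles X_ADVANCE / none")

    # The PairSet offsets follow the 10-byte header.
    offsets = struct.unpack_from(f">{pair_set_count}H", data, 10)
    pair_sets = []
    for offset in offsets:
        (count,) = struct.unpack_from(">H", data, offset)
        records = []
        for i in range(count):
            # Each record: uint16 secondGlyph ID, then int16 XAdvance.
            second, x_advance = struct.unpack_from(">Hh", data, offset + 2 + 4 * i)
            records.append((second, x_advance))
        pair_sets.append(records)
    return coverage_offset, pair_set_count, pair_sets


coverage_offset, pair_set_count, pair_sets = decode_pairpos_format1(DATA)
print(coverage_offset)   # 38
print(pair_set_count)    # 2
print(pair_sets)         # [[(2, -800)], [(2, 200)]]
```

Note that coverageOffset is 38, which lies beyond the 26 bytes quoted, so the Coverage table itself is not part of this excerpt. The records hold glyph IDs rather than names; glyph ID 2 is presumably the glyph TTX labels "sun".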
-
Yes, that’s an invalid font and you should probably raise an issue with Unicode. Additionally, I’d argue that there is a bug in fontTools if you can generate that font in the first place. If you can recompile that ttx dump back into a font without any errors or warnings, I’d suggest raising an issue with fontTools too.
-
I've posted the issue on Unicode's Github for this.
ttx indeed recompiles it back to the exact same font, byte for byte. Does it show errors or warnings for similar mis-counts in the other tables? In that case this one got overlooked.
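A check of this kind could be scripted on top of fontTools. The sketch below is an illustration under assumptions, not fontTools' own validation: `find_pairset_mismatches` and `check_font` are hypothetical names, and the attribute names (`LookupList.Lookup`, `SubTable`, `Coverage.glyphs`, `PairSet`, `ExtSubTable`) reflect my understanding of fontTools' decompiled GPOS objects and have not been verified against the test font. The demonstration at the end uses stand-in objects, so it runs without fontTools or a font file.

```python
from types import SimpleNamespace


def find_pairset_mismatches(gpos_table):
    """Yield (lookup_index, subtable_index, n_covered, n_pairsets) for every
    PairPos Format 1 subtable whose Coverage and PairSet lengths differ."""
    lookup_list = gpos_table.LookupList
    if lookup_list is None:
        return
    for li, lookup in enumerate(lookup_list.Lookup):
        for si, subtable in enumerate(lookup.SubTable):
            lookup_type = lookup.LookupType
            if lookup_type == 9:  # Extension positioning wraps the real subtable
                lookup_type = subtable.ExtensionLookupType
                subtable = subtable.ExtSubTable
            if lookup_type != 2 or subtable.Format != 1:
                continue
            n_covered = len(subtable.Coverage.glyphs)
            n_pairsets = len(subtable.PairSet)
            if n_covered != n_pairsets:
                yield li, si, n_covered, n_pairsets


def check_font(path):
    """Run the check on a font file (requires the third-party fontTools)."""
    from fontTools.ttLib import TTFont
    font = TTFont(path)
    if "GPOS" not in font:
        return []
    return list(find_pairset_mismatches(font["GPOS"].table))


# Stand-in objects mimicking the subtable from the TTX dump above.
subtable = SimpleNamespace(
    Format=1,
    Coverage=SimpleNamespace(glyphs=["uni25EF"]),
    PairSet=[SimpleNamespace(), SimpleNamespace()],
)
lookup = SimpleNamespace(LookupType=2, SubTable=[subtable])
gpos = SimpleNamespace(LookupList=SimpleNamespace(Lookup=[lookup]))
print(list(find_pairset_mismatches(gpos)))  # [(0, 0, 1, 2)]
```

Under the same assumptions, `check_font("TestGPOSTwo.otf")` would list each offending lookup and subtable index.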
-
Perhaps the font was crafted explicitly to be invalid? (It is a test case, after all.)