Fun with unattached marks

Simon Cozens
edited September 2021 in Font Technology
Here's a weird corner of OpenType that I don't quite understand, and which cost me an hour and a lot of head scratching this morning.

Assume that sdb.yb is a mark, and that everything else is a base. Assume also that sdb.yb does not have a mark attachment rule positioning it.

With the text "BEi17|sdb.yb|SINi1|BEm7|sdb.yb" and the following rules:

    lookup FixYBPositions {
        pos BEi17 <NULL> [sdb.yb] <330 0 0 0>;
        pos BEm7 <NULL> [sdb.yb] <330 0 0 0>;
    } FixYBPositions;

Both rules will fire, repositioning both sdb.yb glyphs. But add another rule:

    lookup FixYBPositions {
        pos BEi17 <NULL> [sdb.yb] <330 0 0 0>;
        pos BEm7 <NULL> [sdb.yb] <330 0 0 0>;
        pos SINi1 <NULL> [sdb.yb] <290 0 0 0>;
    } FixYBPositions;

Now only the BEi17 rule fires: the first sdb.yb is repositioned, but the second is not.

For more fun: remove the classes and replace them with a bare "sdb.yb", which changes the lookup from a format 2 PairPos subtable to a format 1 subtable, and both rules will fire. This happens with both HarfBuzz and CoreText.

I expect we're well into "undefined behaviour" territory here, but just in case anyone ever comes across something like this in the future, hopefully it will save you the same head-scratching I went through.

Comments

  • John Hudson
    edited September 2021
    Is this in right-to-left text layout? 
  • Yes.
  • Not sure I completely understand your examples, but since you mentioned pairpos:

    In RTL glyph runs, pairpos adjustments need to be implemented as corresponding negative dx and width adjustments on the right (first) glyph; otherwise every other pair in the string gets skipped. I am not sure what the syntax is in AFDKO code. This is what it looks like in VOLT:


  • The feature code applies x-placements to the second glyph of each pair. This should not be subject to the RTL strangeness where OpenType can only modify advance widths by moving the left side bearing, regardless of the text direction.

    What puzzled me is the <NULL> in the code, which appears to be an empty positioning record (unsupported by Adobe?). It is supported by Glyphs 3, when I tried out the code. At least it is accepted and has an effect, which may or may not be intentional. It adds extra class pairs to the lookup using class 0 for the second class. I am not trying to figure out what that means in the context of RTL shaping, but it should give confusing results.

    Following is what the binary contains, with Latin-based placeholder glyphs. The "a" in the coverage is the implicit class 0 of the first classDef. What is shifted by -1 should be all glyphs except the acute. These entries disappear when <NULL> is removed from the code.

    lookup 0 pair

    RightToLeft no
    IgnoreBaseGlyphs no
    IgnoreLigatures no
    IgnoreMarks no

    coverage definition begin
    a
    b
    c
    coverage definition end

    firstclass definition begin
    b 1
    c 2
    class definition end

    secondclass definition begin
    acute 1
    class definition end

    right x placement    0    0    -1
    right x placement    0    1    330
    right x placement    1    0    -1
    right x placement    1    1    330
    right x placement    2    0    -1
    right x placement    2    1    290

    lookup end
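To illustrate where those class 0 entries come from, here is a minimal Python sketch of how a class-based (format 2) pair lookup is laid out as a full matrix. The function name `build_class_matrix` and the dictionary layout are hypothetical, chosen only for illustration; this is not how Glyphs or makeotf actually build the table. It assumes that a cell no rule mentions means "no adjustment" and fills it with 0.

```python
def build_class_matrix(rules, first_classes, second_classes):
    """Lay out pair rules as a Class1 x Class2 matrix of x-placements.

    Glyphs missing from a class definition fall into the implicit class 0.
    """
    n1 = max(first_classes.values(), default=0) + 1
    n2 = max(second_classes.values(), default=0) + 1
    # Every (first class, second class) cell exists, whether or not a rule names it.
    matrix = [[0] * n2 for _ in range(n1)]
    for (first, second), x_placement in rules.items():
        row = first_classes.get(first, 0)
        col = second_classes.get(second, 0)
        matrix[row][col] = x_placement
    return matrix


# Placeholder glyphs as in the dump: "a" is class 0 of the first classDef.
rules = {("a", "acute"): 330, ("b", "acute"): 330, ("c", "acute"): 290}
first_classes = {"b": 1, "c": 2}
second_classes = {"acute": 1}

matrix = build_class_matrix(rules, first_classes, second_classes)
for row, cells in enumerate(matrix):
    print(row, cells)
```

Column 0 of each row is the cell for "any second glyph that is not the acute". It carries no adjustment, but it is still a cell in the subtable, so a pair landing there counts as found.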

  • Grzegorz Rolek
    edited September 2021
    Jelle Bosma:
    What puzzled me is the <NULL> in the code, which appears to be an empty positioning record (unsupported by Adobe?). It is supported by Glyphs 3, when I tried out the code. At least it is accepted and has an effect, which may or may not be intentional.
    True, it is intentional, and Glyphs 3 will indeed produce an empty (null) value record for <NULL>. Furthermore, because zero adjustments are omitted from a value record by default, <0 0 0 0> (or 0 for that matter) will also result in a null value record (assuming, of course, that the subtable's value format allows for that). This is different from makeotf, which will produce a dummy (zero x advance) value record in such a case (for both <0 0 0 0> and 0).

    Now, makeotf does produce a null value record, but only for the format B positioning rule (pos a b <0 0 10 0> or pos [a] [b] <0 0 10 0>), which is equivalent to pos a <0 0 10 0> b <NULL> (though makeotf still won't accept the <NULL> here).

    I have yet to investigate why makeotf produces those dummy value records, even if it normally skips zero adjustments as well. Typically in such cases, I would blame some layout engine for requiring it, despite the OpenType spec being clear that each value format for each subtable type can be null. But because makeotf does produce a null value record for the rules above, I now wonder if that’s just some makeotf legacy or even a bug. Note that the presence of those dummy value records seems to have no effect in Simon's case.

    Following is what the binary contains, with Latin-based placeholder glyphs. The "a" in the coverage is the implicit class 0 of the first classDef. What is shifted by -1 should be all glyphs except the acute. These entries disappear when <NULL> is removed from the code.
    This is unrelated to all of the above, but there has indeed been a bug in Glyphs 3 that would set those meant-to-be-zero class 0 adjustments to -1 for the second class in a pair, with one specific configuration of value formats and in a class-based subtable only. A pretty extreme scenario, and the value is unnoticeable to the naked eye, but it could nonetheless affect the overall line length within a long run of text. This is now fixed, anyway, and many thanks for finding that out!
  • I have yet to investigate why makeotf produces those dummy value records, even if it normally skips zero adjustments as well.
    Just as an update: the moment I wrote that, I checked the spec and found this:
    If valueFormat2 is set to 0, then the second glyph of the pair is the “next” glyph for which a lookup should be performed.
    So those dummy zero-advance records are a means to let people decide whether to include the second glyph when the lookup is applied next (<NULL>, format zero) or to skip it and jump straight to the next pair (<0 0 0 0>, dummy zero-advance). This is how it has worked in Glyphs 3 since then, anyway.
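The effect of that rule on a run of glyphs can be sketched in a few lines of Python. This is a simplified model written for illustration, not code from any shaping engine: `matched_pairs` is a hypothetical helper, the glyph names are placeholders, and the only part of the value format it models is whether it is zero.

```python
X_PLACEMENT = 0x0001  # ValueFormat flag for an x-placement field


def matched_pairs(glyphs, pairs, value_format2):
    """Return the pairs a single pair lookup would match, in order."""
    matched = []
    i = 0
    while i + 1 < len(glyphs):
        pair = (glyphs[i], glyphs[i + 1])
        if pair in pairs:
            matched.append(pair)
            # valueFormat2 == 0: the second glyph becomes the next first glyph.
            # Otherwise the second glyph is consumed and the walk resumes after it.
            i += 1 if value_format2 == 0 else 2
        else:
            i += 1
    return matched


glyphs = ["A", "V", "A"]
pairs = {("A", "V"), ("V", "A")}

chained = matched_pairs(glyphs, pairs, value_format2=0)
consumed = matched_pairs(glyphs, pairs, value_format2=X_PLACEMENT)
print(chained)   # both pairs are matched
print(consumed)  # only the first pair is matched
```

With a null second value record the lookup sees both A V and V A; as soon as the second record carries any field, even a zero one, V is consumed and V A is never tested.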
  • RichardW
    Was Jelle Bosma's explanation understood? The critical fact is that the glyph pair <SINi1, BEm7> matched <class 2, class 0>, and therefore the pair <BEm7, sdb.yb> wasn't considered. The coverage field ensures that only pairs whose first member is BEi17, BEm7 or SINi1 are matched, but it is not clear that there is any mechanism to constrain what can be matched as a second member.