Fun with unattached marks

Simon Cozens · September 2021

Here's a weird corner of OpenType that I don't quite understand, and which cost me an hour and a lot of head scratching this morning.

Assume that sdb.yb is a mark, and that everything else is a base. Assume also that sdb.yb does not have a mark attachment rule positioning it.

With the text "BEi17|sdb.yb|SINi1|BEm7|sdb.yb" and the following rules:

    lookup FixYBPositions {
        pos BEi17 <NULL> [sdb.yb] <330 0 0 0>;
        pos BEm7 <NULL> [sdb.yb] <330 0 0 0>;
    } FixYBPositions;

Both rules will fire, repositioning both sdb.yb glyphs. But add another rule:

    lookup FixYBPositions {
        pos BEi17 <NULL> [sdb.yb] <330 0 0 0>;
        pos BEm7 <NULL> [sdb.yb] <330 0 0 0>;
        pos SINi1 <NULL> [sdb.yb] <290 0 0 0>;
    } FixYBPositions;

Now only the BEi17 rule will fire, repositioning the first sdb.yb, and the second glyph is not repositioned.

For more fun: Remove the classes and replace them with bare "sdb.yb", changing it from a format 2 pairpos subtable to a format 1 subtable, and both rules will fire. This happens with both HarfBuzz and CoreText.

I expect we're well into "undefined behaviour" territory here, but just in case anyone ever comes across something like this in the future, hopefully it will save you the same head-scratching I went through.

John Hudson · September 2021

Is this in right-to-left text layout?

Simon Cozens · September 2021

Yes.

John Hudson · September 2021

Not sure I completely understand your examples, but since you mentioned pairpos:

In RTL glyph runs, pairpos adjustments need to be implemented as corresponding negative dx and width adjustments on the right (first) glyph, otherwise every other pair in the string gets skipped. Not sure what the syntax is in AFDKO code. This is what it looks like in VOLT:

Image: https://us.v-cdn.net/5019405/uploads/editor/q9/1o37dy167ix3.png

Jelle Bosma · September 2021

The feature code produces x-placements to the second glyph of pairs. This should not be subject to the RTL strangeness where OpenType can only modify advance widths by moving the left-sidebearing, regardless of the text direction.

What puzzled me is the <NULL> in the code, which appears to be an empty positioning record (unsupported by Adobe?). It is supported by Glyphs 3, when I tried out the code. At least it is accepted and has an effect, which may or may not be intentional. It adds extra class pairs to the lookup using class 0 for the second class. I am not trying to figure out what that means in the context of RTL shaping, but it should give confusing results.

Following is what the binary contains, with Latin-based placeholder glyphs. The "a" in coverage is the implicit class 0 for the first of classDef. What is shifted with -1 should be all glyphs except the acute. These entries disappear when <NULL> is removed from the code.

    lookup 0 pair
    RightToLeft no
    IgnoreBaseGlyphs no
    IgnoreLigatures no
    IgnoreMarks no
    coverage definition begin
    a
    b
    c
    coverage definition end
    firstclass definition begin
    b 1
    c 2
    class definition end
    secondclass definition begin
    acute 1
    class definition end
    right x placement 0 0 -1
    right x placement 0 1 330
    right x placement 1 0 -1
    right x placement 1 1 330
    right x placement 2 0 -1
    right x placement 2 1 290
    lookup end

Grzegorz Rolek · September 2021

Jelle Bosma:

What puzzled me is the <NULL> in the code, which appears to be an empty positioning record (unsupported by Adobe?). It is supported by Glyphs 3, when I tried out the code. At least it is accepted and has an effect, which may or may not be intentional.

True, it is intentional and Glyphs 3 indeed will produce an empty (null) value record for <NULL>. Furthermore, because zero adjustments are skipped from a value record by default, <0 0 0 0> (or 0 for that matter) will also result in a null value record (assuming, of course, that the subtable's value format allows for that). This is different from makeotf, which will produce a dummy (zero x advance) value record in such a case (for both <0 0 0 0> and 0).

Now, makeotf does produce a null value record, but only for the format B positioning rule (pos a b <0 0 10 0> or pos [a] [b] <0 0 10 0>), which is equivalent to pos a <0 0 10 0> b <NULL> (though makeotf still won't accept the <NULL> here).

I have yet to investigate why makeotf produces those dummy value records, even if it normally skips zero adjustments as well. Typically in such cases, I would blame some layout engine for requiring it, despite the OpenType spec being clear that each value format for each subtable type can be null. But because makeotf does produce a null value record for the rules above, I now wonder if that’s just some makeotf legacy or even a bug. Note that the presence of those dummy value records seems to have no effect in Simon's case.

Jelle Bosma:

Following is what the binary contains, with Latin-based placeholder glyphs. The "a" in coverage is the implicit class 0 for the first of classDef. What is shifted with -1 should be all glyphs except the acute. These entries disappear when <NULL> is removed from the code.

This is unrelated to all of the above, but there has indeed been a bug in Glyphs 3 that would set those meant-to-be-zero Class 0 adjustments to -1 for the second class in a pair, with a certain specific value format configuration and in a class-based subtable only. A pretty extreme scenario, and the value is unnoticeable to the naked eye, but it could nonetheless affect the overall line length within a long run of text. This is now fixed, anyway, and many thanks for finding that out!

Grzegorz Rolek · March 2022

Grzegorz Rolek:

I have yet to investigate why makeotf produces those dummy value records, even if it normally skips zero adjustments as well.

Just as an update: the moment I wrote that, I checked the spec and found this:

If valueFormat2 is set to 0, then the second glyph of the pair is the “next” glyph for which a lookup should be performed.

So those dummy zero-advance records are a means to let people decide whether to include the second glyph as the first glyph of the next pair (<NULL>, format zero) or to skip it and jump straight to the pair after it (<0 0 0 0>, dummy zero-advance). This is how Glyphs 3 has worked since, anyway.
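The effect of that sentence in the spec can be sketched in a few lines of Python. This is only an illustration of the rule as quoted, not the code of any shaping engine: the function name `pairs_visited` and its parameters are made up for the example, and it assumes that every adjacent pair in the run matches the lookup.

```python
def pairs_visited(n_glyphs, value_format2_is_zero):
    """Return the (first, second) index pairs a pair lookup would try.

    Hypothetical helper: assumes every adjacent pair matches, so the only
    thing that varies is how far the cursor moves after each match.
    """
    visited = []
    i = 0
    while i + 1 < n_glyphs:
        visited.append((i, i + 1))
        # valueFormat2 == 0: the second glyph starts the next pair.
        # Otherwise the second glyph is consumed and skipped.
        i += 1 if value_format2_is_zero else 2
    return visited


print(pairs_visited(5, True))   # [(0, 1), (1, 2), (2, 3), (3, 4)]
print(pairs_visited(5, False))  # [(0, 1), (2, 3)]
```

With a non-zero second value format, every other pair is never looked at, which is the skipping behaviour mentioned earlier in the thread.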

RichardW · March 2022

Was Jelle Bosma's explanation understood? The critical fact is that the glyph pair <SINi1, BEm7> matched <class 2, class 0>, and therefore the pair <BEm7, sdb.yb> wasn't considered. The coverage field ensures that only pairs whose first member is BEi17, BEm7 or SINi1 are matched, but it is unclear whether there is any mechanism to constrain what can be matched as a second member.
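A rough Python model of this explanation, applied to the glyph run from the first post, reproduces the reported results. It is a simplified sketch and not how HarfBuzz or CoreText are actually written. The function names and the class numbers are invented for the example. It assumes the run is processed in the order listed, that the first-class table doubles as the coverage, and that the second value format is non-zero so a matched second glyph is consumed.

```python
# Glyph run from the first post, assumed to be processed in this order.
RUN = ["BEi17", "sdb.yb", "SINi1", "BEm7", "sdb.yb"]


def apply_class_pairpos(run, class1, class2, x_placement):
    """Simplified class-based (format 2) pair lookup.

    Only glyphs listed in class1 can start a pair (coverage). Any second
    glyph missing from class2 falls into class 0, so it still matches.
    Returns {index of second glyph: x placement} for non-zero shifts.
    """
    shifts = {}
    i = 0
    while i + 1 < len(run):
        first, second = run[i], run[i + 1]
        if first not in class1:
            i += 1
            continue
        dx = x_placement.get((class1[first], class2.get(second, 0)), 0)
        if dx:
            shifts[i + 1] = dx
        i += 2  # the pair matched, so the second glyph is consumed
    return shifts


def apply_glyph_pairpos(run, pairs):
    """Simplified glyph-based (format 1) pair lookup: only listed pairs match."""
    shifts = {}
    i = 0
    while i + 1 < len(run):
        dx = pairs.get((run[i], run[i + 1]))
        if dx is None:
            i += 1  # no match, so the second glyph may start the next pair
            continue
        shifts[i + 1] = dx
        i += 2
    return shifts


MARK_CLASS = {"sdb.yb": 1}  # every other glyph is implicitly class 0

two_rules = apply_class_pairpos(
    RUN, {"BEi17": 1, "BEm7": 2}, MARK_CLASS, {(1, 1): 330, (2, 1): 330})
three_rules = apply_class_pairpos(
    RUN, {"BEi17": 1, "BEm7": 2, "SINi1": 3}, MARK_CLASS,
    {(1, 1): 330, (2, 1): 330, (3, 1): 290})
format1 = apply_glyph_pairpos(
    RUN, {("BEi17", "sdb.yb"): 330, ("BEm7", "sdb.yb"): 330,
          ("SINi1", "sdb.yb"): 290})

print(two_rules)    # {1: 330, 4: 330}
print(three_rules)  # {1: 330}
print(format1)      # {1: 330, 4: 330}
```

In the three-rule class-based case, SINi1 is covered and BEm7 lands in class 0, so that pair matches with a zero adjustment and consumes BEm7 before it can pair with the final sdb.yb. In the glyph-based case there is no class 0 to fall into, so <SINi1, BEm7> does not match and BEm7 gets its turn as a first glyph.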
