Order of execution of OpenType features

Just checking something here. The OpenType Cookbook (by Tal Leming) suggests that it's the designer's responsibility to order features:

The order in which you list your features is very important. This is the order in which they will be processed.

The Standard, though, suggests that the shaper has the responsibility of determining the order of features (at least for the Devanagari-related features):

The application is expected to process this feature and certain other features in an appropriate order to obtain the correct set of basic forms: 'nukt', 'akhn', 'rphf', 'rkrf', 'pref', 'blwf', 'half', 'pstf', 'cjct'

The AFDKO documentation is ambiguous, just saying that the shaper will "assemble" the list of features:

Do the following first for the GSUB and then for the GPOS:
Assemble all features (including any required feature) for the glyph run’s language system.
Assemble all lookups in these features, in LookupList order, removing any duplicates. (All features and thus all lookups needn’t be applied to every glyph in the run.)
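
To make the quoted steps concrete, here is a minimal Python sketch of "assemble the features, then assemble their lookups in LookupList order, removing duplicates". The feature-to-lookup table and the name `assemble_lookups` are hypothetical and only illustrate the idea; this is not how any particular shaper is implemented.

```python
from typing import Dict, List

def assemble_lookups(feature_lookups: Dict[str, List[int]],
                     enabled: List[str]) -> List[int]:
    """Collect the lookup indices of all enabled features,
    drop duplicates, and return them in LookupList (index) order."""
    indices = set()
    for tag in enabled:
        indices.update(feature_lookups.get(tag, []))
    return sorted(indices)

# Hypothetical font: feature tag -> indices into the LookupList.
FEATURE_LOOKUPS = {"liga": [3], "smcp": [0, 1], "calt": [1, 2]}

# The order in which features are named does not change the result.
print(assemble_lookups(FEATURE_LOOKUPS, ["liga", "smcp", "calt"]))  # [0, 1, 2, 3]
print(assemble_lookups(FEATURE_LOOKUPS, ["calt", "smcp", "liga"]))  # [0, 1, 2, 3]
```

Read this way, the quoted procedure leaves no room for feature order to matter: only the position of each lookup in the LookupList does.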

Is there a canonical understanding of the order in which features are processed?

Comments

  • The order of features does not matter, but the order of lookups does. However, the lookups of certain features are always processed in a certain order (for example `ccmp` lookups are processed first), and layout engines don’t usually agree on this (for HarfBuzz you can find this somewhere in the source code, for others you will have to either experiment with fonts or ask).
  • OK, that’s confusing - surely the order of features has to matter!

    Consider something like smcp and rlig: if you do the rlig first, you’ll replace f i by f_i, and then you may not have a substitution to f_i.sc. But if you do the smcp first, the ligature doesn’t apply.


  • Jens Kutilek
    edited December 2019
    OK, that’s confusing - surely the order of features has to matter!

    Consider something like smcp and rlig: if you do the rlig first, you’ll replace f i by f_i, and then you may not have a substitution to f_i.sc. But if you do the smcp first, the ligature doesn’t apply.


    That is because each feature creates an implicit lookup. If you put the lookup definitions before the feature definitions, and just reference the lookups in the features, you can see that it is really the lookup order, not the feature order that matters (for "normal" features as Khaled explained).
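
The smcp/rlig example can be simulated with a toy model in Python. This is only a sketch: lookups are modelled as plain functions over a list of glyph names, the two "fonts" are hypothetical, and a real shaper does far more (glyph classes, contexts, per-glyph feature masks). It does show that swapping the lookup order changes the result while the order of the enabled features does not.

```python
from typing import Callable, Dict, List

Glyphs = List[str]
Lookup = Callable[[Glyphs], Glyphs]

def single_subst(mapping: Dict[str, str]) -> Lookup:
    """Toy single substitution: replace each glyph found in the mapping."""
    def apply(glyphs: Glyphs) -> Glyphs:
        return [mapping.get(g, g) for g in glyphs]
    return apply

def ligature_subst(components: Glyphs, ligature: str) -> Lookup:
    """Toy ligature substitution: replace each run of components by one glyph."""
    def apply(glyphs: Glyphs) -> Glyphs:
        out, i, n = [], 0, len(components)
        while i < len(glyphs):
            if glyphs[i:i + n] == components:
                out.append(ligature)
                i += n
            else:
                out.append(glyphs[i])
                i += 1
        return out
    return apply

def shape(lookup_list: List[Lookup], feature_lookups: Dict[str, List[int]],
          enabled: List[str], glyphs: Glyphs) -> Glyphs:
    """Apply the lookups of all enabled features in LookupList order."""
    indices = sorted({i for tag in enabled for i in feature_lookups[tag]})
    for i in indices:
        glyphs = lookup_list[i](glyphs)
    return glyphs

small_caps = single_subst({"f": "f.sc", "i": "i.sc"})
fi_ligature = ligature_subst(["f", "i"], "f_i")

# Font A: the small-caps lookup comes first in the LookupList.
font_a, features_a = [small_caps, fi_ligature], {"smcp": [0], "rlig": [1]}
# Font B: the ligature lookup comes first.
font_b, features_b = [fi_ligature, small_caps], {"rlig": [0], "smcp": [1]}

print(shape(font_a, features_a, ["rlig", "smcp"], ["f", "i"]))  # ['f.sc', 'i.sc']
print(shape(font_b, features_b, ["smcp", "rlig"], ["f", "i"]))  # ['f_i']
```

In a feature file, the order of the feature blocks normally decides which of these two LookupLists the compiler produces, which is why it looks as if feature order matters.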
  • So it sounds like there are two lists of lookup orderings: the shaper first pulls out lookups related to ccmp and possibly the Devanagari features (and possibly other stuff too) and executes them first (in some shaper-defined order), and then the remaining lookups are processed in the (designer-specified) order they appear in the table.

    Do any shapers also have a list of features they pull out to execute at the end of processing?
  • I haven’t ever heard of a shaper that pulls out stuff to execute AFTER the stuff they don’t explicitly worry about. If they do something other than “order of lookups in the font,” they all do it in “their own order” and then do anything else, not previously specified, afterwards—in the order those lookups are in the font.
  • Some features have a specific order. The biggest group is the features connected to Indic scripts. They are executed in the order written in the spec.
    Then there is 'rvrn'. It should always be executed first (that makes it unusable for general purpose feature variation substitution).
    I just ran a quick test with 'ccmp': InDesign respects the lookup order, but Safari doesn't.

  • OK, digging around in the HarfBuzz source I found references to "the spec", which I assumed would be the OpenType spec but is actually the Microsoft Script Development Spec. This does define an expected order of processing for features for different scripts:

    Regardless of the model an application chooses for supporting layout of standard scripts, Uniscribe requires a fixed order for executing features within a run of text to consistently obtain the proper basic form. This is achieved by calling features one-by-one in the standard order listed below.

    Uniscribe by default processes features in the order ccmp, liga, clig, dist, kern, mark, mkmk. HarfBuzz does it in the order rvrn, (ltra,ltrm)/(rtla,rtlm), frac, numr, dnom, rand, trak, HARF, BUZZ, abvm, blwm, ccmp, locl, mark, mkmk, rlig, and then either (calt, clig, curs, dist, kern, liga, rclt) or vert. Then user-specified features come after that.

    The Script Development Spec specifies additional feature orderings for USE scripts, Arabic, Buginese, Hangul, Hebrew, different Indic scripts, Javanese, Khmer, Lao, Myanmar, Sinhala, Syriac, Thaana, Thai and Tibetan.

    What's fascinating is that nobody seemed to know that. :-)
  • Generally, it works like this: 

    First, the layout engine reads the GSUB features for the current languagesystem of the text run, and determines which features it should apply. It classifies the features into groups: 

    - Pre-shaping (ccmp, rvrn, locl)
    - Shaping (script-specific)
    - User-controllable

    For each group, shapers have their own rules for applying them: either the lookups associated with all features enabled for a given group are pulled and executed in lookup order, or the lookups associated with each feature are pulled and executed feature by feature, in a predefined order of features.

    In a way, you could say that it is always the list of enabled lookups that is pulled, in the order defined in the font, but then some lookup groups are re-sorted (moved to the top of the list).

    Then it does the same for GPOS.

    The FEA syntax uses the ordering of feature definitions to implicitly control the order of lookups. In FEA you can create lookups explicitly, but if you don't, lookups are created implicitly inside feature definitions.
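
The grouping model described in this comment can be sketched in Python as follows. The function name `plan_lookups`, the example font, and the single pre-shaping stage are all assumptions made for illustration; real shapers use their own stage lists, and this sketch models only the first variant (all lookups of a stage executed in lookup order), deduplicating within a stage only.

```python
from typing import Dict, List

def plan_lookups(feature_lookups: Dict[str, List[int]],
                 enabled: List[str],
                 stages: List[List[str]]) -> List[int]:
    """Return lookup indices in execution order: one batch per shaper-defined
    stage, then one batch for all remaining enabled features. Within each
    batch, lookups run in LookupList (index) order."""
    remaining = set(enabled)
    plan: List[int] = []
    for stage in stages:
        tags = [t for t in stage if t in remaining]
        remaining.difference_update(tags)
        plan.extend(sorted({i for t in tags for i in feature_lookups.get(t, [])}))
    # Everything the shaper does not treat specially runs last, in font order.
    plan.extend(sorted({i for t in remaining for i in feature_lookups.get(t, [])}))
    return plan

# Hypothetical stage list, taken from the "pre-shaping" group named above.
PRE_SHAPING = ["ccmp", "rvrn", "locl"]

# Hypothetical font in which the designer put liga and smcp before ccmp and locl.
FONT = {"liga": [0], "smcp": [1], "ccmp": [2], "locl": [3]}

print(plan_lookups(FONT, ["liga", "smcp", "ccmp", "locl"], [PRE_SHAPING]))
# [2, 3, 0, 1]
```

The ccmp and locl lookups are moved to the front even though they come later in the font, and the remaining lookups keep the designer's order.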


  • Theunis de Jong
    edited December 2019
    Suppose you have a ligature f_i and a small caps feature. A font might need a smallcap FI if it also has a hardcoded Unicode glyph fi, but if it hasn't, surely you don't need to include an explicit small cap f_i.smcap too? (And for every other ligature as well...) So there must be some rule "scap goes first, then liga", right?

    Using FreeType I wrote myself a small feature tester, and since I don't know the "official" order, I just apply them in the order given on the command line. The results vary wildly with different ordering for some feature combos.

    It should totally be possible to make any program expose its internal workings with a specially crafted feature file; that's been on my Nothing Else To Do list for some time now.
  • So there must be some rule "scap goes first, then liga", right?

    I don’t know if there is a rule per se, but that is a good way to do it, and I think most do it this way. (Also, it’s “smcp” not “scap”.)

    Also, you don’t need both /f_i and /fi in a font. Using just /fi is common practice, rather than /f_i. (And using /fi as the sub for f i is okay in liga.)
  • I put them both in, but mostly out of habit. The only reason to do it is to get a better underlying text representation for PDFs created from print streams without access to the original font ... surely a real trivia point these days.