Hi! I'm trying to understand how OpenType layout works for GSUB rules in right-to-left scripts, and I have some questions I can't find answers to.
In the case of precomposed Hebrew letters with combining marks (such as alefdagesh-hb, uniFB30):
- Does it make sense to add a composition for them in ccmp?
- If so, should I wrap these rules in a lookup with the hebr script and IWR language labels? The documentation says that ccmp is not script- or language-sensitive, so it looks like that doesn't make sense.
- Is it necessary to specify the right-to-left (RTL) direction for such a lookup?
sub alef-hb dagesh-hb by alefdagesh-hb;
Thanks in advance.
Comments
2. Yes, you should associate the relevant ccmp lookups with the hebr script tag, but you can use the dflt language system tag, which will work with any text in Hebrew script.
3. No, it is not necessary to set the RTL direction tag on the Hebrew lookups, but if you are using a graphical GSUB tool like VOLT you might want to, as it will flip the input UI so that the logical input and visual order correspond. The important thing to remember is that GSUB uses logical order, so you need to think in terms of RTL directionality of the glyphs even though the coding might specify the glyph sequence LTR, as in your example.
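As a rough illustration of points 2 and 3 in feature-file syntax: this is only a sketch, using the glyph names from the question. The languagesystem lines are an assumption about what the rest of the feature file declares.

```fea
# Sketch only: glyph names are taken from the question above;
# the languagesystem declarations are assumed, not known.
languagesystem DFLT dflt;
languagesystem hebr dflt;

feature ccmp {
    script hebr;      # register the following rules under the Hebrew script tag
    language dflt;    # default language system: applies to any Hebrew-script text

    # No RightToLeft lookupflag is set here.
    # The sequence is written in logical (typing) order: base letter, then mark.
    sub alef-hb dagesh-hb by alefdagesh-hb;
} ccmp;
```

The rule reads left to right in the source, but it describes the order in which the characters are stored, not how they appear on screen.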
Thank you for the detailed explanation of lookups, and especially of RTL.
I still do not understand why the language dflt should be specified instead of IWR in this case. Does language dflt mean all possible languages in Hebrew script (such as Hebrew, Yiddish, Ladino, etc.), or does it mean just the language the user has activated (the default for their OS or application)?
And what will happen if I specify the script but do not specify a language at all?
@Simon Cozens
Yes, that's really not what I was thinking.
The former. OpenType Layout begins with script itemisation based on the Unicode script property: a string of Hebrew characters is recognised as Hebrew text and passed to the layout engine responsible for Hebrew OTL processing. The dflt language system tag is used to process the features and lookups unless a) the text has been tagged as a specific language in some way that the layout engine can associate with a different langsys tag, and b) the font contains some language-specific shaping for that language that differs from the dflt shaping for the script. So dflt means default shaping for the script.
A Hebrew font, for example, might contain some language-specific lookups for Yiddish digraphs, but most shaping would be dflt.
Every script tag should have a dflt langsys tag associated with it, while other langsys tags are all optional.
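To make the dflt versus language-specific split concrete, here is a hedged sketch in feature-file syntax. JII is the registered langsys tag for Yiddish. The glyph names vav-hb and vavvav-hb are hypothetical, and putting the digraph rule in ccmp is only for illustration; a real font might use a different feature.

```fea
# Sketch only: vav-hb and vavvav-hb are hypothetical glyph names,
# and the choice of ccmp for the Yiddish rule is illustrative.
languagesystem DFLT dflt;
languagesystem hebr dflt;
languagesystem hebr JII;    # Yiddish, optional and language-specific

feature ccmp {
    script hebr;
    language dflt;          # default shaping for all Hebrew-script text
    sub alef-hb dagesh-hb by alefdagesh-hb;

    language JII;           # used only when text is tagged as Yiddish
    # In feature-file syntax the dflt rules above are included here by default
    # (include_dflt), so Yiddish gets them plus this extra rule.
    sub vav-hb vav-hb by vavvav-hb;
} ccmp;
```

Text tagged as Hebrew (IWR) has no langsys of its own in this sketch, so it would fall back to the dflt rules.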
It is interesting that when the feature is generated automatically by type design applications, the script and language may be specified inside the lookup, so it may look like this:
And after the export it is converted into the structure you pointed out:
This difference has confused me about which structure is correct. So thanks for clarifying.