OpenType feature for positioning base glyph according to diacritic mark

The short version:
In which OpenType feature should I put a pos rule that contextually positions a base glyph when combined with a certain diacritic mark?

The long version:
I would like to respace the Hebrew letter VAV when combined with the diacritical mark DAGESH in order to make room for the DAGESH.
This is very similar to the situation with l with middle dot (used in Catalan), which typically requires more space (x advance) than just l.
Note that while there's a Unicode character Hebrew Letter Vav with Dagesh, it decomposes into VAV followed by DAGESH, so to the best of my understanding it's irrelevant (i.e., including a glyph for this character in a font won't help, as it will be ignored).
It sounds like the most straightforward way to respace VAV when combined with DAGESH would be using contextual positioning (the VAV would be the target and the DAGESH would provide the context).
Is that right?
Where should I put this pos rule?

Thanks!

Comments

  • Hi Ori,
    this is a contextual kern feature.
    All the kern features must be put before the mark and mark-to-mark features.
    Do you work with Microsoft VOLT?
    Regards
    Sami
  • "All the kern features must be put before the mark and mark-to-mark features."

    Do you mean after? Kerning is run after mark positioning.

    More generally: there is no required relationship between the type of lookup (mark to mark, mark to base, positioning, substitution, etc.) and the name of the feature. The names of the features are only useful insofar as (i) they specify the order in which lookups are processed, and (ii) they determine which lookups are required and which are discretionary (can be turned on and off by users). (And a third special case: Indic and USE shapers use certain feature names for reordering.)

    To check the order of processing, use the HarfBuzz cheat sheet:


    Lookups are processed in the order given, except that within horizontal blocks (e.g. mark/mkmk/rlig) they are processed in order of appearance in the font binary. So no matter which way around you order your lookups, mark/mkmk will always come before dist and kern.

    So you could put it in kern; you could put it in dist if you don't want people to turn it off; you could put it in curs if you want to be weird. It doesn't really matter, so long as it's in something after the abvm/blwm/ccmp/locl/mark/mkmk/rlig block.

  • "All the kern features must be put before the mark and mark-to-mark features."

    "Do you mean after? Kerning is run after mark positioning."

    With Uniscribe/DirectWrite, if mark positioning lookups come first, the marks are positioned, and if any later lookup moves the base glyph, the marks stay where they are rather than moving with it.
  • Hey Sami & Simon, thank you for these answers!
    I like the idea of using dist. I don't particularly want the user to be able to turn it off, and I wasn't keen on using kern for aesthetic reasons: I think of this situation as slightly different from kerning (there's no second glyph involved).
    Sami, no, I don't use VOLT. Doesn't VOLT use a totally different language? Does it even have the notion of feature?
  • "All the kern features must be put before the mark and mark-to-mark features."

    "Do you mean after? Kerning is run after mark positioning."

    "With Uniscribe/DirectWrite, if mark positioning lookups come first, the marks are positioned, and if any later lookup moves the base glyph, the marks stay where they are rather than moving with it."
    Thank you, @Khaled Hosny, this is the reason kerning must come before mark positioning.
    Ori, VOLT has the notion of features, lookups, languages, and scripts, and it is much better and much easier to work with for Hebrew OpenType features.
    If you want, send me a message and I will send you a Hebrew font with a VOLT project.
    Sami
  • Ori Ben-Dor
    edited November 2022
    I have two follow-up questions, actually.

    But first I should note that I'm not in the middle of the production of a specific font, so I'm not looking for solutions that would fit in a very specific process. I'm interested in understanding these problems and possible solutions more generally.

    1. Text-rendering engines typically apply kerning information regarding a given glyph also to combinations of that glyph with marks, right? But in this particular situation a pair of VAV with DAGESH followed by another letter should not be kerned like a pair of VAV followed by the same letter (think of kerning small l with middle dot versus just small l). What do you do then? How do you make text-rendering engines change their default behavior in this case? I guess you could just fix each kerning pair with a dedicated rule, but that sounds annoying. Is there some other approach?

    2. Assume you develop a font using a font editor such as FontLab, Glyphs, or RoboFont. If it's a single-master font, you could just add a few lines of code in the features section, something like

    feature dist {
        pos vav-hb' dagesh-hb <X 0 X 0>;
    } dist;

    and that would do the trick.

    But how do you handle multiple-master fonts, where different masters require different values of X?

    The deltas should go in the GDEF table, right? How do you inject that information there? Does FL/Glyphs/RF offer a way to do that? What other tools might I use?

  • Thanks, @Sami Artur Mandelbaum, sent you a message.

  • feature dist {
        pos vav-hb' dagesh-hb <X 0 X 0>;
    } dist;

    and that would do the trick.

    But how do you handle multiple-master fonts, where different masters require different values of X?

    This is an important question, and there are multiple ways to solve it - which is not ideal. In Glyphs 3, you would solve this by making X a "number" $x; essentially a variable which can be given a value for each master. In a UFO-based workflow, you obviously have a different features.fea for each master, so it's easy, although keeping the features files in sync once you make changes is less easy. My preferred solution is the AFDKO variable features syntax proposal, which is implemented by fontmake and HighLogic.
  • Very interesting, thanks, Simon!
  • John Hudson
    "Sami, no, I don't use VOLT. Doesn't VOLT use a totally different language? Does it even have the notion of feature?"
    VOLT more closely follows OTL table structure, so there are indeed features, but the lookups associated with the features are independent of them. It is now possible to set up AFDKO fea code in this way, by putting all the lookups at the front of the code and then calling them in the features, but VOLT has a graphical interface, which is nicer to work with.
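
    As a rough sketch of the layout described above, the feature code could define the lookup first and have the feature only reference it. The glyph names follow the earlier example in this thread; the lookup name and the value 60 are placeholders, not taken from any real font.

    ```fea
    # Lookup defined on its own, outside any feature block.
    # Hypothetical lookup name and placeholder value (60 units).
    lookup VavDageshSpacing {
        # When vav is followed by dagesh, shift vav by 60 units in x and
        # widen its advance by the same amount.
        pos vav-hb' <60 0 60 0> dagesh-hb;
    } VavDageshSpacing;

    # The feature only calls the lookup by name.
    feature dist {
        lookup VavDageshSpacing;
    } dist;
    ```

    Keeping lookups separate like this means the same lookup could be referenced from a different feature without rewriting the rule.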

  • Earlier this month I gave a presentation at ATypI Tech Talks 2022 that showed how to add kerning triplets to a variable font. Here is a tutorial:
  • It might be a lot easier to include vavdagesh-hebr and then add this ligature to ccmp:
      sub vav-hebr dageshormapiqcomb-hebr by vavdagesh-hebr;

  • As Erwin points out, there is a Unicode code point for vav-dagesh, FB35, as there are code points for all other characters that take the dagesh. Using them would obviate the need for any fancy workarounds.
  • By the way, if you increase the left side bearing of vav-dagesh, you will need to make far fewer kerning adjustments than if you leave it the same as the regular vav.

  • Thank you all for your comments!

    I was under the impression Unicode's vavdagesh, having a canonical decomposition, would be simply substituted by the text-rendering engine with its decomposition (vav followed by dagesh). I understand this isn't the case?
    (Of course even if it was the case, one could always use a vavdagesh glyph of their own.)

    "By the way, if you increase the left side bearing of vav-dagesh, you will need to make far fewer kerning adjustments than if you leave it the same as the regular vav."

    I'm not following, isn't it obvious that you need to increase the left side bearing in order to make room for the dagesh? Oh, maybe you're referring to designs where the vav has a roof and then it's not obvious?

  • "I was under the impression Unicode's vavdagesh, having a canonical decomposition, would be simply substituted by the text-rendering engine with its decomposition (vav followed by dagesh)."
    Definitely an appropriate assumption. Moreover, you should allow for U+FB35 not occurring in the document in the first place; the document could instead contain the combining sequence.
  • John Hudson
    ...but that combining sequence can be represented by a /vavdagesh ligature—accessed via the ccmp feature in the font—and the same glyph can also be encoded as U+FB35.
  • I agree with John and Peter. Furthermore, be aware that no one who types liturgical or biblical Hebrew will type FB35 directly (except through a glyph palette); they will type a vav followed by dagesh, which becomes FB35 via the ligature entered in the ccmp feature. If the vav glyph has a positioning mark for dagesh, but the font does not call for the vavdagesh ligature, then you will get a vav-dagesh with insufficient left side bearing. That’s the consequence implied by my previous statement.
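
    Put together, the ligature approach discussed in the last few comments might look like the sketch below. The glyph names are the ones used earlier in the thread and are assumptions about the font's naming; as noted above, vavdagesh-hebr can also be encoded as U+FB35.

    ```fea
    feature ccmp {
        # Replace the typed sequence vav + dagesh with the precomposed
        # glyph, which has its own side bearings and can be kerned
        # independently of the plain vav.
        sub vav-hebr dageshormapiqcomb-hebr by vavdagesh-hebr;
    } ccmp;
    ```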