Kerning after a mark character in Tibetan

Gregg Weber · June 2021

I want to add some space in the following sequence in Tibetan script:

(simple glyph with a mark) space (simple glyph with a mark)

example: ཏོཡོ

This kind of sequence occurs in transliterated Sanskrit in Tibetan.

My question is, since I cannot put an advance after the vowel mark, what is the best alternative?

The feature snippet below shows two possibilities, along with the rule that does not work.

So, is there a better way that I am missing?

Also, is there an advantage to using the "kern" feature instead of the "abvm" feature?

Thanks for considering this.

languagesystem DFLT dflt;

languagesystem tibt dflt;

feature abvm {
    lookup abvm3 {
        pos tib.Ta' tib.Naro' tib.Ya' <200 0 200 0> tib.Naro'; # works
        #pos tib.Ta' tib.Naro' <0 0 200 0> tib.Ya tib.Naro; # does not work
        #pos tib.Ta' <0 0 200 0> tib.Naro' <-200 0 0 0> tib.Ya tib.Naro; # works also
    } abvm3;
} abvm;

John Hudson · June 2021

This is the classic problem of mark and spacing interaction in OpenType GPOS: space is carried by the base glyphs, but needs to be adjusted relative to the mark glyphs.

I would probably handle your Tibetan example by contextually adding space on the left side of the second base glyph, triggered by the preceding base+mark and the following mark. You can usefully filter mark groups in the lookup so that e.g. below-base consonant stacking forms are ignored and the spacing adjustment applies regardless of whether they are present.

I would perform the contextual spacing adjustment in either the dist feature or the kern feature (depending on whether you want the output to be editable by a user: think of the dist feature as 'required kerning' that is always applied and not editable).

It is possible to kern a base glyph off a mark glyph, but it gets complicated because you have to take the anchor distances of the mark into account. We do this in Thai fonts, but that is because the anchors are mostly on the right side of the preceding base, so the distances are predictable. In Tibetan, this is probably too complex because of the optical centering of the marks on bases of varying width.

The other thing to bear in mind is that spacing adjustments in GPOS are additive, while anchor positioning is substitutive. So the output of each dist or kern lookup is added to previous dist and kern lookups, and will be adjusted by further lookups. But the output of each abvm or other mark positioning lookup will replace that of previous lookups. Bearing this in mind, you will need to consider your spacing adjustments sequentially. There are two ways to do this: one is to start with the shorter contexts, and then extend the context in subsequent lookups, adding a bit of space each time; the other is to use a subtable structure and start with the longest contexts, so that once a context is met all the subsequent, shorter contexts are ignored.
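The two sequencing strategies described above can be illustrated with a toy model in Python. This is only a sketch of the bookkeeping, not of a real shaping engine: the rule tuples, the function names, and the 100/200-unit values are all assumptions made up for the example.

```python
from typing import Sequence, Tuple

# A rule is (backtrack, target glyph, lookahead, adjustment in font units).
# This tuple layout is a hypothetical simplification of a contextual rule.
Rule = Tuple[Tuple[str, ...], str, Tuple[str, ...], int]


def rule_matches(glyphs: Sequence[str], pos: int, rule: Rule) -> bool:
    """True if the rule's context matches around the glyph at index pos."""
    backtrack, target, lookahead, _ = rule
    start = pos - len(backtrack)
    end = pos + 1 + len(lookahead)
    if start < 0 or end > len(glyphs):
        return False
    return (
        tuple(glyphs[start:pos]) == backtrack
        and glyphs[pos] == target
        and tuple(glyphs[pos + 1:end]) == lookahead
    )


def additive_lookups(glyphs: Sequence[str], pos: int, rules: Sequence[Rule]) -> int:
    """Strategy 1: each rule is its own lookup, so every match adds space."""
    return sum(rule[3] for rule in rules if rule_matches(glyphs, pos, rule))


def first_match_subtables(glyphs: Sequence[str], pos: int, rules: Sequence[Rule]) -> int:
    """Strategy 2: one lookup, longest context first; the first match wins."""
    for rule in rules:
        if rule_matches(glyphs, pos, rule):
            return rule[3]
    return 0


NARO, YA = "tib.Naro", "tib.Ya"
short_rule = ((NARO,), YA, (), 100)        # base that follows a marked base
long_extra = ((NARO,), YA, (NARO,), 100)   # ...and is itself marked: add more
long_total = ((NARO,), YA, (NARO,), 200)   # same long context, full amount

both_marked = ["tib.Ta", NARO, YA, NARO]
second_unmarked = ["tib.Ta", NARO, YA]
```

With these made-up values, `additive_lookups(both_marked, 2, [short_rule, long_extra])` and `first_match_subtables(both_marked, 2, [long_total, short_rule])` both arrive at 200 units on the second base, and both give 100 for `second_unmarked`. The difference is where the total lives: spread across lookups in the first strategy, stated in full on the longest rule in the second.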

John Hudson · June 2021

@Simon Cozens has been doing interesting work developing automated GPOS, which he has demonstrated using the very complex Nastaliq case. I suspect it could adapt quite well to Tibetan stack spacing.

Gregg Weber · June 2021

Thanks! Your points are very helpful to me. I will try out dist for kerning.

This is very interesting: "You can usefully filter mark groups in the lookup so that e.g. below-base consonant stacking forms are ignored and the spacing adjustment applies regardless of whether they are present."

Microsoft Himalaya, and every other Tibetan Unicode font I have seen, has not tried that. The Himalaya font puts those below-base consonant forms in a class, and then has 4 lookups with 1, 2, 3, or 4 of those forms between the base and the vowel.

I don't see them using mark filtering at all. I will investigate your approach.

As for the two ways to do sequential additive positioning, that is good for me to keep in mind.

Automated kerning would be wonderful indeed! I am glad Simon Cozens has shown it can be done. I could produce a lot of very nicely kerned training data. First, I want to try an algorithmic approach, which is simple to implement, since I have a good feel for what Tibetans consider elegant-looking kerning (in u-chen).

Simon Cozens · June 2021

You should absolutely make friends with the "UseMarkFilteringSet" keyword. It reduces this kind of problem space considerably.

What I've been doing is building an environment which makes it easy to generate layout rules algorithmically in Python. With that, you can enumerate each base/topmark(s)/base/topmark(s) combination, find the bounding box of the left and right clusters, and add positioning rules to ensure the kind of space you want between the clusters.
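A rough sketch of that idea in plain Python follows. It is not Simon's environment: the metrics tables, the `MIN_GAP` value, and the `ClusterSpacing` and `@TopMarks` names are hypothetical, and real extents would have to be measured from the font's outlines and anchors. It enumerates cluster pairs, compares the horizontal ink extents, and emits feature-file text that adds space on the left of the second base in a `dist` lookup with a mark filtering set, along the lines John suggested earlier in the thread.

```python
from itertools import product
from typing import List, Tuple

Cluster = Tuple[str, str]  # (base glyph, top mark)

# Hypothetical metrics in font units. ADVANCE is each base's advance width;
# CLUSTER_EXTENT is the horizontal ink extent (xmin, xmax) of the base+mark
# cluster, measured from the base glyph's origin.
ADVANCE = {"tib.Ta": 600, "tib.Ya": 650}
CLUSTER_EXTENT = {
    ("tib.Ta", "tib.Naro"): (20, 700),
    ("tib.Ya", "tib.Naro"): (-60, 720),
}
TOP_MARKS = ["tib.Naro"]
MIN_GAP = 40  # assumed minimum clear space between neighbouring clusters


def extra_space(left: Cluster, right: Cluster) -> int:
    """Space to add before the right base so the clusters clear by MIN_GAP."""
    _, left_xmax = CLUSTER_EXTENT[left]
    right_xmin, _ = CLUSTER_EXTENT[right]
    gap = ADVANCE[left[0]] + right_xmin - left_xmax
    return max(0, MIN_GAP - gap)


def build_rules() -> List[str]:
    """One contextual rule per cluster pair that needs extra space."""
    rules = []
    for left, right in product(sorted(CLUSTER_EXTENT), repeat=2):
        space = extra_space(left, right)
        if space:
            rules.append(
                f"        pos {left[0]} {left[1]} {right[0]}' "
                f"<{space} 0 {space} 0> {right[1]};"
            )
    return rules


def build_feature() -> str:
    """Wrap the rules in a dist lookup that only sees the top marks."""
    lines = [
        f"@TopMarks = [{' '.join(TOP_MARKS)}];",
        "feature dist {",
        "    lookup ClusterSpacing {",
        "        lookupflag UseMarkFilteringSet @TopMarks;",
        *build_rules(),
        "    } ClusterSpacing;",
        "} dist;",
    ]
    return "\n".join(lines)
```

Calling `print(build_feature())` writes out the lookup text. The filtering set only skips glyphs that are classed as marks in GDEF, so the below-base forms would need that class for the rules to match across them.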

Gregg Weber · June 2021

Thank you Simon! It sounds like you also support the idea of treating the below-base consonants as "marks" in a mark filtering set. I will look at your Python environment, but instead of bounding boxes, I am finding the intersection of images of the two stacks (with vowels) to be kerned, moving the right stack (keeping track of the amount of movement) until the intersection is zero, and then adding some small space to the movement amount.

I am using hb-shape to get the starting advance, hb-view to create the images, and OpenCV to manipulate the images.
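The search loop described here can be sketched in pure Python. This is a stand-in and not the actual pipeline: sets of ink pixels take the place of the hb-view/OpenCV images, and `rect`, `kern_by_overlap`, `padding`, and `max_shift` are hypothetical names. Returning 0 when the stacks never touch is also an assumption.

```python
from typing import Set, Tuple

Pixel = Tuple[int, int]


def rect(x0: int, x1: int, y0: int, y1: int) -> Set[Pixel]:
    """Ink pixels of a filled rectangle, x in [x0, x1) and y in [y0, y1)."""
    return {(x, y) for x in range(x0, x1) for y in range(y0, y1)}


def kern_by_overlap(
    left_ink: Set[Pixel],
    right_ink: Set[Pixel],
    start_advance: int,
    padding: int = 2,
    max_shift: int = 500,
) -> int:
    """Shift the right stack one pixel at a time until no ink overlaps.

    right_ink is given relative to its own origin and is first placed at
    start_advance (the advance reported by the shaper). Returns the shift
    plus padding, or 0 if the stacks did not overlap to begin with.
    """
    for shift in range(max_shift + 1):
        moved = {(x + start_advance + shift, y) for x, y in right_ink}
        if not (left_ink & moved):
            return shift + padding if shift else 0
    raise ValueError("stacks still overlap after max_shift pixels")


# Toy stacks: a 10x10 base with a top mark that overhangs its base.
left_stack = rect(0, 10, 0, 10) | rect(10, 15, 10, 12)   # mark juts out right
right_stack = rect(0, 10, 0, 10) | rect(-3, 7, 10, 12)   # mark juts out left
```

Here `kern_by_overlap(left_stack, right_stack, start_advance=10)` returns 10: the marks stop colliding after an 8-pixel shift, and 2 pixels of padding are added. The result is in pixels, so it still has to be scaled back to font units at the size the images were rendered.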
