Can you do reorder to the glyphs and apply ligature after that?

WAY KYI · June 2022

Can you reorder the glyphs and apply a ligature after that?
For example: after shaping, the glyphs are ordered like this:
ABMC. I have a ligature for B_C. Could the process be quicker if
I reorder ABMC (switching B and M) to get AMBC, and then apply
the ligature to BC to get the final output AMB_C?
Since this involves two steps, can it be done in one step that
both reorders and ligates, and how? I am afraid the shaping engine
will reorder the text again right after I reorder B and M.
What is the best way to do this? Please share any suggestions you have.
And will the text still be correct if I change the font? Thanks

Erwin Denissen · June 2022

Reordering glyphs can be done through OpenType layout features, but in general this is not a good idea.

Can you be more specific, e.g. are all the glyphs base glyphs? If M is a mark, then setting the IgnoreMarks flag might solve your problem.

WAY KYI · June 2022

Erwin Denissen said:

Reordering glyphs can be done through OpenType layout features, but in general this is not a good idea.

Can you be more specific, e.g. are all the glyphs base glyphs? If M is a mark, then setting the IgnoreMarks flag might solve your problem.

OK, A is a base glyph, M is a medial, and B and C are above-base glyphs. The shaping engine reorders them correctly, but B and C must be next to each other for the B_C ligature to work, so I need to reorder to bring B and C together. That is it. There are no mark glyphs in there. Thanks

John Hudson · June 2022

It would help to know what actual writing system you are working with, rather than trying to abstract this to ABMC, because character properties, shaping engine behaviour, and glyph categorisation are all factors.

Peter Constable · June 2022

@WAY KYI Regarding Erwin's question, what matters is how the glyphs are classified in your font data (the GDEF table), and the key distinctions are base, ligature, mark and component (not used in GSUB). See the Glyph Class Definition subtable. This classification can be used in lookup tables to tell the engine to ignore certain glyphs when it is processing. In addition, you can also create your own classification through filtering sets, and tell the engine to ignore particular sets of glyphs. See lookup flags and mark filtering sets in the Lookup table description.
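To make the classification and lookup-flag mechanics described above concrete, here is a minimal Python sketch of how a lookup might decide which glyphs to pass over. The GDEF class numbers and lookup-flag bit values follow the OpenType specification; the function name `is_skipped`, the glyph names A/M/B/C, and the sample class assignments are hypothetical and chosen only for illustration. Real shaping engines handle more cases (for example mark attachment types), so treat this as a toy model rather than an engine implementation.

```python
from enum import IntEnum, IntFlag


class GlyphClass(IntEnum):
    # GDEF Glyph Class Definition values from the OpenType spec.
    BASE = 1
    LIGATURE = 2
    MARK = 3
    COMPONENT = 4


class LookupFlag(IntFlag):
    # Lookup flag bits from the OpenType spec (MarkAttachmentType mask omitted).
    RIGHT_TO_LEFT = 0x0001
    IGNORE_BASE_GLYPHS = 0x0002
    IGNORE_LIGATURES = 0x0004
    IGNORE_MARKS = 0x0008
    USE_MARK_FILTERING_SET = 0x0010


def is_skipped(glyph, flags, gdef, mark_filter_set=None):
    """Return True if a lookup with these flags would pass over `glyph`."""
    cls = gdef.get(glyph)  # None means the glyph is unclassified
    if cls == GlyphClass.BASE and flags & LookupFlag.IGNORE_BASE_GLYPHS:
        return True
    if cls == GlyphClass.LIGATURE and flags & LookupFlag.IGNORE_LIGATURES:
        return True
    if cls == GlyphClass.MARK:
        if flags & LookupFlag.IGNORE_MARKS:
            return True
        if flags & LookupFlag.USE_MARK_FILTERING_SET:
            # Marks that are not in the filtering set are skipped.
            return glyph not in (mark_filter_set or frozenset())
    return False


# Hypothetical classification, for illustration only.
GDEF = {
    "A": GlyphClass.BASE,
    "M": GlyphClass.MARK,
    "B": GlyphClass.MARK,
    "C": GlyphClass.MARK,
}
```

With a filtering set of `{"B", "C"}` and `USE_MARK_FILTERING_SET`, the sketch skips M but still sees B and C, which is the situation a B_C ligature lookup would need if M were classified as a mark.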

WAY KYI · June 2022

The script is MYM2 (Myanmar). Since many of you may not know it, I just gave the example with ABMC. You can access the font here https://drive.google.com/file/d/1RoC7fFV-ZVrAnJS3Hsp62h7QZC5l96S-/view?usp=sharing and see the images below:

Image: https://us.v-cdn.net/5019405/uploads/editor/80/ms7rdu2iodsm.jpg

Image: https://us.v-cdn.net/5019405/uploads/editor/ip/d6i6v6cgkf8q.jpg

In the second image, at "End Lookup 90", uni102E is replaced with uniE392. But that is not the end. If I could reorder uniE390 and uni103B (switch their places) before lookup 90 and use the ligature uniE390 + uni102E = uniE392, I would not need to go through the rest of the steps starting from lookup 90. Sorry, I am not too knowledgeable about skipping Medial u103B here. How can I do this without going through so many steps to get there? Thanks

Simon Cozens · June 2022

The way I handle this in Myanmar is to make a ligature with the IgnoreBaseGlyphs flag set in the rphf feature:


feature rphf {
    lookupflag IgnoreBaseGlyphs UseMarkFilteringSet @abovemarks;
    sub kinzi-myanmar iMark-myanmar by kinzi_iMark-myanmar;
} rphf;

IgnoreBaseGlyphs normally causes more problems than it solves (you can get interactions between faraway marks), but in the rphf feature it's scoped to the current cluster, not to the whole run.

John Hudson · June 2022

I am not too knowledgeable about skipping Medial u103B here.

A typical procedure for skipping a glyph like this would be to categorise the /uni103B/ glyph as a mark in the font’s GDEF table, and then to filter all marks out of your ligature lookup processing by setting the process marks flag to NONE, or to set the flag to only allow a group of marks that does not include /uni103B/.

As Simon has indicated, it is also possible to set a lookup flag to ignore non-mark base glyphs, which can be useful if you want to manage mark interaction and skip intervening base glyphs.

It has been a while since I worked on Myanmar, but when I am a bit less busy I will go and look at how we handled this in Microsoft’s Myanmar Text fonts.

WAY KYI · June 2022

Simon Cozens said:
The way I handle this in Myanmar is to make a ligature with the IgnoreBaseGlyphs flag set in the rphf feature:
feature rphf {
    lookupflag IgnoreBaseGlyphs UseMarkFilteringSet @abovemarks;
    sub kinzi-myanmar iMark-myanmar by kinzi_iMark-myanmar;
} rphf;
IgnoreBaseGlyphs normally causes more problems than it solves (you can get interactions between faraway marks), but in the rphf feature it's scoped to the current cluster, not to the whole run.

Thank you very much for your suggestion. Let me try and see this will work. I will update you with the result. One other thing, I just want to know - can you put any kind of glyphs in markclass/markset? What is the difference between the two, and when and why would you use one rather than the other? You want to set markset here, right? Just my original question about reorder ( which opentype feature can do this task?? ) and do ligature will work in this case too? Thanks

WAY KYI · June 2022

John Hudson said:

I am not too knowledgeable about skipping Medial u103B here.
A typical procedure for skipping a glyph like this would be to categorise the /uni103B/ glyph as a mark in the font’s GDEF table, and then to filter all marks out of your ligature lookup processing by setting the process marks flag to NONE, or to set the flag to only allow a group of marks that does not include /uni103B/.

As Simon has indicated, it is also possible to set a lookup flag to ignore non-mark base glyphs, which can be useful if you want to manage mark interaction and skip intervening base glyphs.

It has been a while since I worked on Myanmar, but when I am a bit less busy I will go and look at how we handled this in Microsoft’s Myanmar Text fonts.

Glad to get to know the original Myanmar Text type engineer. Thank you very much. The Pyidaungsu font was developed maybe 3-4 years later than Microsoft's Myanmar Text font. It is still widely used as the default font for Myanmar here. Myanmar Unicode font development is still not easy, and font designers reuse the OpenType programming from either Pyidaungsu or Myanmar Text and replace the glyphs with new ones, without needing to know a thing about OpenType. I am originally from Myanmar; I lived and worked as a Sr. Software Engineer in the USA for about 30 years, then came back to retire in Myanmar. I started learning font development over a year ago, and I want to change this and hand my experience and knowledge on to the younger generation in Myanmar to carry on.

OK, back to the subject of setting the /uni103B/ glyph as a mark in the font’s GDEF table. I have not been able to find where to set this in FontForge, or how to set it in the table. Could you point me to documentation or a sample showing how to implement it? Thanks

Simon Cozens · June 2022

WAY KYI said:

Thank you very much for your suggestion. Let me try and see this will work.

It does work, I promise. :-)

One other thing, I just want to know - can you put any kind of glyphs in markclass/markset?

I think you can only put those glyphs which have the GDEF category Mark.

What is the difference between the two, and when and why would you use one rather than the other? You want to set markset here, right?

Mark attachment classes are an older mechanism and the main characteristic of them is that a glyph can only belong to one class. Mark sets can overlap, in that glyphs can belong to more than one mark set. In general the rule is: always use mark sets, never use mark classes.

Just my original question about reorder ( which opentype feature can do this task?? ) and do ligature will work in this case too? Thanks

It depends on what you want to do. You may want to do the swap afterwards, because in situations like င်္ကျ, I believe the above marks should be anchored onto the medial ya. (Even though they are not in the font on this board!) This is much easier to achieve if you have only one above mark glyph to swap with the medialYa instead of trying to move all the above marks, so you need to ligate first and then swap. Swapping glyphs in OpenType is not supported directly, but you can do it with something like this:

lookup AddYaBefore {
        sub kinzi-myanmar by medialYa-myanmar kinzi-myanmar;
        sub repha-myanmar by medialYa-myanmar repha-myanmar;
        sub iMark-myanmar by medialYa-myanmar iMark-myanmar;
        ...
} AddYaBefore;

lookup RemoveYa {
       lookupflag UseMarkFilteringSet @abovemarks;
       sub kinzi-myanmar medialYa-myanmar by kinzi-myanmar;
       sub repha-myanmar medialYa-myanmar by repha-myanmar;
       sub iMark-myanmar medialYa-myanmar by iMark-myanmar;
       ....
} RemoveYa;

feature abvs {
      sub @abovemarks' lookup AddYaBefore medialYa-myanmar' lookup RemoveYa;
} abvs;

(These rules are a lot easier to generate in my FEZ language):

Routine AddYaBefore {Substitute @abovemarks -> medialYa-myanmar $1; };
Routine RemoveYa    {Substitute @abovemarks medialYa-myanmar -> $1; } UseMarkFilteringSet @abovemarks;

Notice how the reordering works: the "AddYaBefore" applies to the first glyph in the sequence and the "RemoveYa" applies to the second glyph. So with kinzi-myanmar|medialYa-myanmar, this is what happens:

kinzi-myanmar|medialYa-myanmar
AddYaBefore applies to first glyph, giving:
medialYa-myanmar|kinzi-myanmar|medialYa-myanmar
RemoveYa applies to the new second glyph, giving:
medialYa-myanmar|kinzi-myanmar
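The walk-through above can be sketched in Python as a toy model of the add-then-remove trick. The function names are hypothetical, only the three above marks spelled out in the rules are included, and lookup flags are not modelled; the point is only to show how a one-to-many substitution at the first input position followed by a ligature-style substitution at the second position has the net effect of a swap.

```python
MEDIAL_YA = "medialYa-myanmar"
# Only the glyphs written out in the rules above; the elided ones are omitted.
ABOVE_MARKS = {"kinzi-myanmar", "repha-myanmar", "iMark-myanmar"}


def add_ya_before(glyphs, i):
    """One-to-many substitution at index i: X -> medialYa X."""
    return glyphs[:i] + [MEDIAL_YA, glyphs[i]] + glyphs[i + 1:]


def remove_ya(glyphs, i):
    """Ligature-style substitution at index i: X medialYa -> X."""
    assert glyphs[i + 1] == MEDIAL_YA
    return glyphs[:i + 1] + glyphs[i + 2:]


def swap_with_medial_ya(glyphs):
    """Apply the contextual rule wherever an above mark is followed by medialYa."""
    out = list(glyphs)
    i = 0
    while i + 1 < len(out):
        if out[i] in ABOVE_MARKS and out[i + 1] == MEDIAL_YA:
            out = add_ya_before(out, i)   # lookup at input position 0
            out = remove_ya(out, i + 1)   # lookup at the new second glyph
            i += 2                        # continue after the swapped pair
        else:
            i += 1
    return out


print(swap_with_medial_ya(["kinzi-myanmar", "medialYa-myanmar"]))
```

Calling `swap_with_medial_ya` on a run with no above mark before the medial ya leaves the run unchanged, since the contextual rule never matches.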

WAY KYI · June 2022

It seems like your suggestion is the way to go. The other one seems rather complicated. I found from the .fea file that the Pyidaungsu font already has the mark set below:

@GDEF_Mark = [\uni102D \uni102E \uni102F \uni1030 \uni1033 \uni1035 \uni1037
\uni103D \uni103E \uniE1D1 \uni103D.blws \uniE1D1.blws \uniE1F2 \uniE430 ];

But 103B is not included. It is in the @GDEF_Simple set. So if I include 103B in the mark set, that is OK, right? It will be in both sets. One more question, after you do -

feature rphf {
    lookupflag IgnoreBaseGlyphs UseMarkFilteringSet @abovemarks;
    sub kinzi-myanmar iMark-myanmar by kinzi_iMark-myanmar;

} rphf;

The kinzi_iMark-myanmar will be the last glyph, right? Thanks

Simon Cozens · June 2022

John and I are talking about two different approaches. In my way, we make 103B a base and skip it using IgnoreBaseGlyphs. You still need the MarkFilteringSet to ignore the below marks. In John's way, you make 103B a mark and skip it using the MarkFilteringSet.

WAY KYI · June 2022

Simon Cozens said:

John and I are talking about two different approaches. In my way, we make 103B a base and skip it using IgnoreBaseGlyphs. You still need the MarkFilteringSet to ignore the below marks. In John's way, you make 103B a mark and skip it using the MarkFilteringSet.

OK, I will try it; whichever works is fine for me. I was able to find information on IgnoreBaseGlyphs but could not find UseMarkFilteringSet. You said "You still need the MarkFilteringSet to ignore the below marks." - I am confused, because the text will only include base, above-base and medial glyphs in this case. So where is the below mark coming from? Sorry, there are so many things I need to study and these terms are so new to me. I need all the help I can get, or directions to the right place. Thank you both very much; I will update you later. Thanks

Simon Cozens · June 2022

Consider a sequence like င်္က္ကျိ. This will turn out to be something like:

ka-myanmar | kinzi-myanmar | virama-myanmar | ka-myanmar | medialYa-myanmar | iMark-myanmar

When you form the kinzi_iMark ligature, you need to skip over the medialYa, but also you need to skip over the conjunct glyphs as well.
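As a rough illustration of the skipping described here, the following Python sketch models a ligature lookup over that glyph run. The GDEF classes assigned to each glyph, the contents of the above-marks filtering set, and the function names are assumptions made for the example (medialYa is treated as a base, as in the IgnoreBaseGlyphs approach); a real shaping engine does considerably more.

```python
BASE, MARK = "base", "mark"

# Assumed GDEF classes for this example (medialYa classified as a base).
GDEF = {
    "ka-myanmar": BASE,
    "medialYa-myanmar": BASE,
    "kinzi-myanmar": MARK,
    "virama-myanmar": MARK,
    "iMark-myanmar": MARK,
}
ABOVE_MARKS = {"kinzi-myanmar", "iMark-myanmar"}  # assumed mark filtering set


def skipped(glyph, ignore_bases, filter_set):
    """True if the lookup passes over this glyph."""
    if GDEF[glyph] == BASE:
        return ignore_bases
    # Marks are skipped only when a filtering set is active and excludes them.
    return filter_set is not None and glyph not in filter_set


def ligate(glyphs, components, ligature, ignore_bases=False, filter_set=None):
    """Replace the first match of `components` with `ligature`."""
    for start, g in enumerate(glyphs):
        if g != components[0] or skipped(g, ignore_bases, filter_set):
            continue
        matched = [start]
        pos = start + 1
        while len(matched) < len(components) and pos < len(glyphs):
            if skipped(glyphs[pos], ignore_bases, filter_set):
                pos += 1          # step over ignored glyphs
                continue
            if glyphs[pos] != components[len(matched)]:
                break             # a visible glyph blocks the match
            matched.append(pos)
            pos += 1
        if len(matched) == len(components):
            drop = set(matched[1:])
            # The ligature takes the place of the first component;
            # skipped glyphs stay where they were.
            return [ligature if i == start else x
                    for i, x in enumerate(glyphs) if i not in drop]
    return list(glyphs)


RUN = ["ka-myanmar", "kinzi-myanmar", "virama-myanmar",
       "ka-myanmar", "medialYa-myanmar", "iMark-myanmar"]
PAIR = ["kinzi-myanmar", "iMark-myanmar"]

print(ligate(RUN, PAIR, "kinzi_iMark-myanmar",
             ignore_bases=True, filter_set=ABOVE_MARKS))
```

In this model, calling `ligate` on the same run without the two flags leaves it unchanged, because the virama sits directly after the kinzi and blocks the match.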

WAY KYI · June 2022

Right, in this case you need to consider the below mark too. Now I see that you wanted me to look at all possible combinations, so that everything related to kinzi + above mark is completely thought out. This is expert-level advice. Thank you, thank you very much.
PS: I now understand that IgnoreBaseGlyphs and UseMarkFilteringSet are switches that flag the lookup to ignore or use those glyphs as needed. I also found how to set a mark set in FontForge. Thanks

WAY KYI · July 2022

This is my test font. rphf was working on the first try; then I added the blwf features through Merge Feature File, and now they only work in the FontForge Metrics window. I generated a font and tried it in CorelDRAW, MS Word and AI, and neither feature works at all. I don't know what happened. Can someone take a look? My friend said FontLab reported Script/Language problems. Below is my test font:

https://drive.google.com/file/d/1sK7AAjoi1TMqv_c351-n1CWSq_cXdckH/view?usp=sharing

Erwin Denissen · July 2022

Your font has no OpenType layout features at all.

WAY KYI · July 2022

Arrr... I saw them in FontForge. What happened when I generated the font??? Below are the features I am trying to get to work. Thanks

lookup rphfRephForminMyanmar2lookup0 {
    sub \uni1004 \uni103A \uni1039 by \uniE02D;
} rphfRephForminMyanmar2lookup0;

feature rphf {
    script DFLT;
    language dflt ;
    lookup rphfRephForminMyanmar2lookup0;
    script mym2;
    language dflt ;
    lookup rphfRephForminMyanmar2lookup0;
} rphf;

lookup blwfBelowBaseFormsinMyanmar2lookup1 {
    sub \u1039 \u1000 by \uE000;
    sub \u1039 \u1001 by \uE001;
    sub \u1039 \u1002 by \uE002;
    sub \u1039 \u1003 by \uE003;
    sub \u1039 \u1004 by \uE004;
    sub \u1039 \u1005 by \uE005;
    sub \u1039 \u1006 by \uE006;
    sub \u1039 \u1007 by \uE007;
    sub \u1039 \u1008 by \uE008;
    sub \u1039 \u100A by \uE00A;
    sub \u1039 \u100B by \uE00B;
    sub \u1039 \u100C by \uE00C;
    sub \u1039 \u100D by \uE00D;
    sub \u1039 \u100F by \uE00F;
    sub \u1039 \u1010 by \uE010;
    sub \u1039 \u1011 by \uE011;
    sub \u1039 \u1012 by \uE012;
    sub \u1039 \u1013 by \uE013;
    sub \u1039 \u1014 by \uE014;
    sub \u1039 \u1015 by \uE015;
    sub \u1039 \u1016 by \uE016;
    sub \u1039 \u1017 by \uE017;
    sub \u1039 \u1018 by \uE018;
    sub \u1039 \u1019 by \uE019;
    sub \u1039 \u100A by \uE00A;
    sub \u1039 \u100B by \uE00B;
    sub \u1039 \u101C by \uE01C;
    sub \u1039 \u101D by \uE01D;
    sub \u1039 \u100E by \uE00E;
    sub \u1039 \u100F by \uE00F;
    #sub \u1039 \u1020 by \uE020;
} blwfBelowBaseFormsinMyanmar2lookup1;

Erwin Denissen · July 2022

I didn't know you are allowed to use code-points instead of glyph names.

Apart from that it seems valid to me, but be aware blwfBelowBaseFormsinMyanmar2lookup1 is unused.

WAY KYI · July 2022

I got it working now after reinstalling FF. Thank everyone!!!

Thomas Phinney · July 2022

Erwin Denissen said:

I didn't know you are allowed to use code-points instead of glyph names.

Those are glyph names, which happen to look like code points. Both “uXXXX” and “uniXXXX” styles are standard strings for glyph names, from Adobe’s naming approaches as documented in the Adobe Glyph List. https://github.com/adobe-type-tools/agl-specification

(At one point, long ago, there was a separate document called “Unicode and Glyph Names,” but I think that got folded into the AGL spec somewhere along the way.)

Simon Cozens · July 2022

Thomas Phinney said:
Those are glyph names, which happen to look like code points. Both “uXXXX” and “uniXXXX” styles are standard strings for glyph names

Of course, if you used FEZ, you could use Unicode glyph selectors and get the best of both worlds. :-)

Erwin Denissen · July 2022

Thomas Phinney said:

Those are glyph names, which happen to look like code points. Both “uXXXX” and “uniXXXX” styles are standard strings for glyph names, from Adobe’s naming approaches as documented in the Adobe Glyph List. https://github.com/adobe-type-tools/agl-specification

(At one point, long ago, there was a separate document called “Unicode and Glyph Names,” but I think that got folded into the AGL spec somewhere along the way.)

Thank you for your feedback, but the specific glyph names all start with \uni while lookup blwfBelowBaseFormsinMyanmar2lookup1 contains \u1039, etc.

Therefore I have added support for both conventions in FontCreator, so it now successfully compiles the fea code provided by WAY KYI.

Image: https://us.v-cdn.net/5019405/uploads/editor/7b/tgqi8bhg52ws.png

Thomas Phinney · July 2022

Erwin Denissen said:

Thank you for your feedback, but the specific glyph names all start with \uni while lookup blwfBelowBaseFormsinMyanmar2lookup1 contains \u1039, etc.

No, that is not correct—at least, not as a matter of the glyph naming spec.

In short:

for a single BMP codepoint, one can use either form.
for beyond-BMP (“supra-BMP”) codepoints, one must only use “u” and not “uni”

There are some complications with ligatures:

if one wishes to express a ligature of BMP codepoints, one can do so with “uni” by stringing them together like “uni20AC0034”, an option unavailable with “u”
one could however do “u20AC_u0034” which is clearer, and no longer… but would become longer if there were more than two codepoints involved. This can be a reason to use “uni” names with ligatures involving long strings of BMP codepoints, if one is worried about total glyph name length.

See section 2 of the Readme portion of the AGL, partly quoted here:

Otherwise, if the component is of the form ‘uni’ (U+0075, U+006E, and U+0069) followed by a sequence of uppercase hexadecimal digits (0–9 and A–F, meaning U+0030 through U+0039 and U+0041 through U+0046), if the length of that sequence is a multiple of four, and if each group of four digits represents a value in the ranges 0000 through D7FF or E000 through FFFF, then interpret each as a Unicode scalar value and map the component to the string made of those scalar values. Note that the range and digit-length restrictions mean that the ‘uni’ glyph name prefix can be used only with UVs in the Basic Multilingual Plane (BMP).

Otherwise, if the component is of the form ‘u’ (U+0075) followed by a sequence of four to six uppercase hexadecimal digits (0–9 and A–F, meaning U+0030 through U+0039 and U+0041 through U+0046), and those digits represents a value in the ranges 0000 through D7FF or E000 through 10FFFF, then interpret it as a Unicode scalar value and map the component to the string made of this scalar value.
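The two quoted rules can be sketched in Python as follows. This is only an illustration of those two rules: the function name is hypothetical, the earlier steps of the AGL procedure (splitting names on underscores, stripping suffixes, looking up the glyph list itself) are left out, and returning an empty string when neither rule applies is an assumption of this sketch.

```python
import re

# 'uni' followed by one or more groups of four uppercase hex digits.
_UNI = re.compile(r"uni((?:[0-9A-F]{4})+)")
# 'u' followed by four to six uppercase hex digits.
_U = re.compile(r"u([0-9A-F]{4,6})")


def _is_scalar(value, limit):
    """True for 0000-D7FF or E000-limit (surrogates excluded)."""
    return 0x0000 <= value <= 0xD7FF or 0xE000 <= value <= limit


def component_to_string(component):
    """Map one glyph-name component to a string using the two quoted rules."""
    m = _UNI.fullmatch(component)
    if m:
        digits = m.group(1)
        values = [int(digits[i:i + 4], 16) for i in range(0, len(digits), 4)]
        if all(_is_scalar(v, 0xFFFF) for v in values):
            return "".join(chr(v) for v in values)
    m = _U.fullmatch(component)
    if m:
        value = int(m.group(1), 16)
        if _is_scalar(value, 0x10FFFF):
            return chr(value)
    return ""  # assumption: neither rule applies, so no mapping


# Both spellings of a BMP code point map to the same character.
print(component_to_string("uni1039") == component_to_string("u1039"))
```

Under these rules `uni20AC0034` maps to a two-character string, while a five-digit name such as `u1F600` is only reachable through the `u` form.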

Erwin Denissen · July 2022

Thomas Phinney said:

Erwin Denissen said:

Thank you for your feedback, but the specific glyph names all start with \uni while lookup blwfBelowBaseFormsinMyanmar2lookup1 contains \u1039, etc.

No, that is not correct—at least, not as a matter of the glyph naming spec.

The font and the fea code are not mine, but I ensured that FontCreator can cope with it.

WAY KYI · July 2022

OK, here is my first try, but it failed. I imported it into FF (FontForge) and got a lot of errors, and the features were not imported. I don't know where I can put lookup flags in FF, but I was able to add abovemarks there. It seemed so easy, but my syntax is not correct. Please look at my code and tell me where I went wrong... Thanks

# GDEF Mark Attachment Sets
@abovemarks=[\uni102D \uni102E \uni1032 \uni1036 ];
@Kinzi_abovemarks=[\uniE030 \uniE031 \uniE032 \uniE033 ];

languagesystem DFLT dflt;
languagesystem mym2 dflt;

lookup rphfRephForminMyanmar2lookup0 {
    lookupflag 0;
    sub \uni1004 \uni103A \uni1039 by \uniE02F;
} rphfRephForminMyanmar2lookup0;

lookup rphfRephFormlookup2 {
    lookupflag IgnoreBaseGlyphs UseMarkFilteringSet @abovemarks;
    sub \uniE02F @abovemarks by @Kinzi_abovemarks;
} rphfRephFormlookup2;

feature rphf {
    script DFLT;
    language dflt ;
    lookup rphfRephForminMyanmar2lookup0;
    lookup rphfRephFormlookup2;
    script mym2;
    language dflt ;
    lookup rphfRephForminMyanmar2lookup0;
    lookup rphfRephFormlookup2;
} rphf;

Here is my font at --> https://drive.google.com/file/d/1NAP0He9BJui_oCpcobgR96bfNsBFodGy/view?usp=sharing

WAY KYI · July 2022

Erwin Denissen said:
Thank you for your feedback, but the specific glyph names all start with \uni while lookup blwfBelowBaseFormsinMyanmar2lookup1 contains \u1039, etc.

Therefore I have added support for both conventions in FontCreator, so it now successfully compiles the fea code provided by WAY KYI.

In FF, I am always able to use uniXXXX, and it is written out as \UXXXX in the .fea file. But it works. Thanks for your help; I am using FontForge for my font development. Thanks

Erwin Denissen · July 2022

The OpenType layout features don't match that fea code.

This is inside your font:

Image: https://us.v-cdn.net/5019405/uploads/editor/q3/lvnyxjgsr556.png

And this is what it looks like after importing the fea code:

Image: https://us.v-cdn.net/5019405/uploads/editor/q1/1kdrenj8yjo4.png

WAY KYI · July 2022

Erwin Denissen said:

The OpenType layout features don't match that fea code.

Yes, I knew the font only has the first part of rphf, plus blwf. I was not able to import the second part of the rphf lookup. I was following Simon Cozens's suggestion and I failed; the fault is mine for not being able to follow the directions through. There is no setting for lookup flags in FF, so I tried to get it in through .fea file import, which also failed. Simon and John know what I was trying to do, if you follow this from the very beginning. Thank you very much for trying to help me here. We are on different font tools. But thanks

Erwin Denissen · July 2022

Well, FontCreator was able to import your fea code into your font, including IgnoreBaseGlyphs and MarkFilteringSet.

See:

https://www.high-logic.com/tmp/playground/unitest2b.ttf

Hope this helps.
