Can you reorder the glyphs and apply a ligature after that?

Comments

  • WAY KYI
    WAY KYI Posts: 130
    edited July 2022
    Thank you very much, Mr. Erwin Denissen. This is very close to what I am trying to do. By the way, the lookupflag options are now visible in FF, so I think I will be able to set those options now. Thanks. But it is still not working as I expected; see the image below. The left side is my font and the right side is the correct text. Let me check where things go wrong. It seems the second rphf ligature is probably not working, so the glyphs are bumping into each other. Thanks
    Here are the steps it went through: the second rphf feature lookup never runs, so no ligature is substituted. Link here:

    https://drive.google.com/file/d/1dUxkm4P_LWCsiDyIRpvcbfbbVYij1qEq/view?usp=sharing


  • RichardW
    RichardW Posts: 100
    I thought HarfBuzz ignored IgnoreBaseGlyphs - is this (now) not true?
  • WAY KYI
    WAY KYI Posts: 130
    RichardW said:
    I thought HarfBuzz ignored IgnoreBaseGlyphs - is this (now) not true?
    I tested on Windows applications ( CorelDraw ), Notepad and Word. It is not working either. Windows uses Uniscribe ( USE ) right? So, there must be an error within the codes. Thanks
  • Erwin Denissen
    Erwin Denissen Posts: 302
    edited July 2022
    In the font I provided uniE031 is a Base, so it is ignored by rphfRephFormlookup2.


  • RichardW
    RichardW Posts: 100
    WAY KYI said:
    Windows uses Uniscribe ( USE ) right? So, there must be an error within the codes. Thanks
    Windows' native shaper is contained in Uniscribe and DirectWrite.  As the Uniscribe DLL has shrunk recently, and changing the DLL seems to no longer change the shaping ability, I suspect the script-specific shapers invoked by Uniscribe are contained in or shared with DirectWrite at the executable code level.  Not all applications supplied by Microsoft use the native shaper.  Notably, when MS Edge became based on Chromium, changes included using HarfBuzz as its shaper.  HarfBuzz is still included in MS Edge's list of credits.

    The Universal Shaping Engine (USE) is one specific Microsoft shaper, nominally at least shared by many scripts.  It is not used for Myanmar and may well be incompatible.

  • WAY KYI
    WAY KYI Posts: 130
    Mr. RichardW, thank you for your input. MS USE is definitely used for Myanmar ( as Microsoft has documented how to develop and support fonts for Myanmar script ). There are also Myanmar fonts in Windows: the Myanmar Text fonts ship as Windows default fonts. Thus, its USE engine supports Myanmar script. I don't really need to know the technical details of the shaping engine, but it would be very helpful to know why things are not working ( in this case, the second rphf feature ) and how to fix them. Thanks
  • WAY KYI
    WAY KYI Posts: 130
    Erwin Denissen said:
    In the font I provided uniE031 is a Base, so it is ignored by rphfRephFormlookup2.

    A glyph can be included in GDEF_SIMPLE and GDEF_MARK too, right? Can I remove it from Base class? 
  • John Hudson
    John Hudson Posts: 3,227
    Myanmar script is not shaped by USE (Universal Shaping Engine) on Windows. It is shaped with its own dedicated Myanmar shaping engine, which is very similar in its cluster model to USE, but predates USE by a short time. I believe Microsoft considered passing Myanmar text to USE, but since some fonts had already been made for the Myanmar shaping engine, they decided to avoid the risk of compatibility issues.

    As I recall, HarfBuzz also uses a dedicated Myanmar shaping engine instead of its USE implementation.
  • WAY KYI
    WAY KYI Posts: 130
    John Hudson said:
    Myanmar script is not shaped by USE (Universal Shaping Engine) on Windows. It is shaped with its own dedicated Myanmar shaping engine, which is very similar in its cluster model to USE, but predates USE by a short time. I believe Microsoft considered passing Myanmar text to USE, but since some fonts had already been made for the Myanmar shaping engine, they decided to avoid the risk of compatibility issues.

    As I recall, HarfBuzz also uses a dedicated Myanmar shaping engine instead of its USE implementation.
    Please check the document here: https://docs.microsoft.com/en-us/typography/script-development/myanmar (it was posted on 18 June 2022). In the docs, MS says they use the Uniscribe engine to support Myanmar script. I thought it was the USE engine. Sorry, and thanks for your correction. Thanks
  • Erwin Denissen
    Erwin Denissen Posts: 302
    WAY KYI said:
    A glyph can be included in GDEF_SIMPLE and GDEF_MARK too, right? Can I remove it from Base class?
    I have seen "Simple" before, but I think the official terminology is Base.

    No, a glyph can be either assigned to a single class or unassigned.
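
    A minimal Python sketch of the point above, not FontCreator's or any shaper's actual code: each glyph has at most one GDEF class, and a lookup with the IgnoreBaseGlyphs flag does not see glyphs classed as Base. The glyph run and the class assignments are illustrative assumptions, not data taken from either font.

```python
# Toy model: GDEF maps each glyph to at most one class; a missing key means unassigned.
def visible_to_lookup(run, gdef, ignore_base_glyphs):
    """Return the glyphs a lookup can match, dropping bases if the flag is set."""
    return [
        glyph for glyph in run
        if not (ignore_base_glyphs and gdef.get(glyph) == "base")
    ]

# Hypothetical glyph run; the names are borrowed from the thread for illustration only.
run = ["uniE02F", "uni1000", "u102E"]

# u102E classed as Base: a base-skipping lookup cannot use it as a ligature component.
gdef_base = {"uniE02F": "mark", "uni1000": "base", "u102E": "base"}
seen_when_base = visible_to_lookup(run, gdef_base, ignore_base_glyphs=True)

# u102E left unassigned: it stays visible, so a ligature with uniE02F can match.
gdef_unassigned = {"uniE02F": "mark", "uni1000": "base"}
seen_when_unassigned = visible_to_lookup(run, gdef_unassigned, ignore_base_glyphs=True)

print(seen_when_base)
print(seen_when_unassigned)
```

    In this model the only difference between the two cases is the class of u102E, which is why changing the class of the ligature glyphs alone would not help.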

    It will work as soon as glyph uniE031 is unassigned. Not at the exact position as in your screenshot though:



  • WAY KYI
    WAY KYI Posts: 130
    edited July 2022
    Thanks Mr. Erwin Denissen. It seems that it was getting the wrong ligature one (UE030). The right one is UE031 ( which you unassigned ). Also, although you input U102E ( the right glyph, which you unassigned too ), it comes out as U102D. See the image: there is a little curve inside U102E and no curve in U102D. Thanks a lot anyway, and please send me the feature file or the font. I should be able to figure out the rest and update you later. At least the second rphf is working in your font. Thanks
  • John Hudson
    John Hudson Posts: 3,227
    Erwin Denissen said:
    I have seen "Simple" before, but I think the official terminology is Base.
    This is an odd one. The terminology used in the glyph classification section of the GDEF table spec is indeed

    1 Base glyph (single character, spacing glyph)
    2 Ligature glyph (multiple character, spacing glyph)
    3 Mark glyph (non-spacing combining glyph)
    4 Component glyph (part of single character, spacing glyph)

    but almost all font tools—including Microsoft’s own VOLT tool—use ‘Simple’ instead of Base.

    BTW, I have never found any practical use for the ‘Component glyph’ classification.

  • Erwin Denissen
    Erwin Denissen Posts: 302
    WAY KYI said:
    Thanks Mr. Erwin Denissen. It seems that it was getting the wrong ligature one (UE030).
    It is a glitch within our shaping engine, but the exported font contains the correct ligatures.
  • Simon Cozens
    Simon Cozens Posts: 752
    edited July 2022

    John Hudson said:
    BTW, I have never found any practical use for the ‘Component glyph’ classification.
    Ah, I have! You might have a set of base glyphs that you want to skip over in some rules but not in others. In Gulzar I make the Urdu space glyph a component glyph so that I can apply contextual substitution across spaces in some contexts but not in other contexts.

    edit: I’m misremembering. I tried to use it for that but of course there’s no flag to ignore them. So I used the ligature class instead.
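
    To illustrate why the Component class cannot be skipped, here is a small Python sketch of the GDEF glyph classes and the class-based lookup flag bits as I understand them from the OpenType specification. The helper `is_skipped` and the table `IGNORE_FLAG_FOR_CLASS` are hypothetical names for this example, and the sketch leaves out mark filtering sets and mark attachment types, which only affect marks.

```python
from enum import IntEnum, IntFlag

class GlyphClass(IntEnum):
    """GDEF glyph class values."""
    BASE = 1
    LIGATURE = 2
    MARK = 3
    COMPONENT = 4

class LookupFlag(IntFlag):
    """Lookup flag bits that select glyphs to skip by class."""
    RIGHT_TO_LEFT = 0x0001
    IGNORE_BASE_GLYPHS = 0x0002
    IGNORE_LIGATURES = 0x0004
    IGNORE_MARKS = 0x0008

# Only three of the four classes have a matching "ignore" bit.
IGNORE_FLAG_FOR_CLASS = {
    GlyphClass.BASE: LookupFlag.IGNORE_BASE_GLYPHS,
    GlyphClass.LIGATURE: LookupFlag.IGNORE_LIGATURES,
    GlyphClass.MARK: LookupFlag.IGNORE_MARKS,
}

def is_skipped(glyph_class, flags):
    """True if a glyph of this class is skipped by a lookup with these flags."""
    flag = IGNORE_FLAG_FOR_CLASS.get(glyph_class)
    return flag is not None and bool(flags & flag)

all_ignores = (LookupFlag.IGNORE_BASE_GLYPHS
               | LookupFlag.IGNORE_LIGATURES
               | LookupFlag.IGNORE_MARKS)

# Even with every ignore bit set, a Component glyph is still visible.
print(is_skipped(GlyphClass.COMPONENT, all_ignores))
# Reclassifying the glyph as a Ligature makes it skippable.
print(is_skipped(GlyphClass.LIGATURE, LookupFlag.IGNORE_LIGATURES))
```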
  • WAY KYI
    WAY KYI Posts: 130
    Erwin Denissen said:
    WAY KYI said:
    Thanks Mr. Erwin Denissen. It seems that it was getting the wrong ligature one (UE030).
    It is a glitch within our shaping engine, but the exported font contains the correct ligatures.
    Thanks a lot Mr. Erwin Denissen. I found where to set all these GDEF Glyph Classes in FF. I will try and update you later. Thanks 
  • WAY KYI
    WAY KYI Posts: 130
    WAY KYI said:
    Erwin Denissen said:
    WAY KYI said:
    Thanks Mr. Erwin Denissen. It seems that it was getting the wrong ligature one (UE030).
    It is a glitch within our shaping engine, but the exported font contains the correct ligatures.
    Thanks a lot Mr. Erwin Denissen. I found where to set all these GDEF Glyph Classes in FF. I will try and update you later. Thanks 

    Update: I assigned E030 to E036 as No Class, then generated the font and tested it in the Windows apps. Nothing changed, and E02F still bumped into 102E. So the second rphf did not run...
    Here is my font at -->  https://drive.google.com/file/d/1MbHwB09qQYZncHOB_5U_t2NbL6ohY38l/view?usp=sharing
  • Erwin Denissen
    Erwin Denissen Posts: 302
    That font has no unassigned glyphs, but you should have done that for u102E. See this font:

    And an online demo:

  • WAY KYI
    WAY KYI Posts: 130
    Thank you, Mr. Erwin Denissen, for all your help. I misunderstood your advice and set all those ligature glyphs to no class, while keeping 102d, 102e, 1032 and 1036 in both the Base and Mark classes. The right thing to do was to remove the mark glyphs from the Base class, so that they are no longer assigned to the Base class. Thank you very much for your patience with me and your continued support. I learned a lot from you. Take care.
  • RichardW
    RichardW Posts: 100
    WAY KYI said:
    RichardW said:
    I thought HarfBuzz ignored IgnoreBaseGlyphs - is this (now) not true?
    I tested on Windows applications ( CorelDraw ), Notepad and Word. It is not working either. Windows uses Uniscribe ( USE ) right? So, there must be an error within the codes. Thanks

    The statement I had (see https://lists.freedesktop.org/archives/harfbuzz/2013-June/003331.html and perhaps my OP in that thread) applied to GPOS rather than GSUB.  However, in https://github.com/harfbuzz/harfbuzz/issues/2647 (dated 2020), Khaled Hosny reports, "Oddly enough, both Uniscribe and Core Text ignore the lookup flags and don’t for[m] the ligatures".  Thus, in at least some situations, skipping bases doesn't work.

    Given the complexity of what is being done, what I would do in this situation, though it is a bit redundant, is to use a context lookup with context "kinzi, medial-ya, vowel-above" that triggers two subsidiary lookups.  The first, applying at offset 0, would replace kinzi by a 'null mark', i.e. a non-inking, non-spacing mark.  The second, applying at offset 2, would replace the vowel above by the ligature mark for kinzi and vowel-above.  (I don't write fluent AFDKO.)

    It would be tempting to immediately merge kinzi and medial-ya into just medial-ya, but that changes the offset for the second subsidiary lookup to 1, and not all renderers can handle that, and I suspect some compilers can't.
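
    The contextual approach described above can be sketched in Python as a toy model. This is not feature-file code and not any shaper's implementation. The glyph names (`ka`, `kinzi`, `medial_ya`, `vowel_above`, `null_mark`, `kinzi_vowel_above`) and the function `apply_context` are hypothetical. Both subsidiary lookups are one-to-one substitutions, so the run keeps its length and offset 2 still points at the vowel after the first substitution has been applied.

```python
def apply_context(run, context, records):
    """Apply (offset, mapping) single substitutions wherever `context` matches."""
    out = list(run)
    i = 0
    while i + len(context) <= len(out):
        if out[i:i + len(context)] == context:
            for offset, mapping in records:
                glyph = out[i + offset]
                # One-to-one replacement: the sequence length never changes.
                out[i + offset] = mapping.get(glyph, glyph)
            i += len(context)
        else:
            i += 1
    return out

# Context and subsidiary lookups, using hypothetical glyph names.
CONTEXT = ["kinzi", "medial_ya", "vowel_above"]
RECORDS = [
    (0, {"kinzi": "null_mark"}),                # offset 0: hide the kinzi
    (2, {"vowel_above": "kinzi_vowel_above"}),  # offset 2: combined mark
]

shaped = apply_context(["ka", "kinzi", "medial_ya", "vowel_above"], CONTEXT, RECORDS)
print(shaped)
```

    Merging kinzi and medial-ya into one glyph in the first step would shorten the run, which is the offset change the paragraph above warns about.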





  • WAY KYI
    WAY KYI Posts: 130
    I see your points, Mr. RichardW. For Windows applications this works fine. I don't have any other operating systems at hand and am thus unable to test this there. I will ask some friends to test on Apple and Linux systems. I will update you later. Thank you.
  • RichardW
    RichardW Posts: 100
    But if you have Windows, you should have a HarfBuzz application, namely MS Edge!  You might also have another HarfBuzz application, namely LibreOffice.
  • WAY KYI
    WAY KYI Posts: 130
    I tried it in LibreOffice on Windows as a Text Document and it worked OK. Sorry, I haven't tried it on Apple and Linux yet. In the part of the world where I live now, the Internet connection is not so good. Stay tuned..., thanks
  • RichardW
    RichardW Posts: 100
    edited July 2022
    WAY KYI said:
    I tried it in LibreOffice on Windows as a Text Document and it worked OK. Sorry, I haven't tried it on Apple and Linux yet. In the part of the world where I live now, the Internet connection is not so good. Stay tuned..., thanks

    How did you test it for a Windows shaper?

    Incidentally, what do you mean by 'try it on Linux'?  While most applications on Linux use HarfBuzz, there are other shapers.   For example, at least by default, Emacs Version 26 uses M17n.
  • WAY KYI
    WAY KYI Posts: 130
    I tested on Windows using Notepad, MS Word, CorelDraw, LibreOffice and Adobe Illustrator. I don't know exactly which apps use which shaper. I typed the text in visual order ( I am using a Windows visual keyboard ), and when I pressed ENTER, the text appeared as expected in all these apps. I am sorry, I know nothing about Linux systems. I rely on friends to test it for me. Will update you later. Thanks
  • RichardW
    RichardW Posts: 100
    Notepad (on Windows) is very definitely still using Uniscribe or DirectWrite, and LibreOffice uses HarfBuzz even on Windows.  Of the major shapers, that just leaves CoreText to test the font on.
  • RichardW
    RichardW Posts: 100
    I have now tested this feature in a modification of my Da Lekh font, currently but temporarily publicly available as version 0.13.1 (version in the head table being the rather non-standard 0.0131) in https://wrdingham.co.uk/lanna/test/test/dalekh.woff and displayed in the web page font_test.htm in the same directory, in line 1 of vowel combination 53 under heading Vowel Combination Check.  I broke a lot of related shaping to put it in.

    In the context uni1A75 uni1A63 uni1A74 (mark, base, mark according to the GDEF table), I execute, at offset 0, the ligation "sub uni1A75 uni1A74 by spawning_mai_ek" skipping base glyphs.  The glyph spawning_mai_ek is declared to be a mark glyph.  I then apply the unconditional multiple substitution "sub spawning_mai_ek by uni1A74 uni1A75".  The intention is to replace "uni1A75 uni1A63 uni1A74" by "uni1A74 uni1A75 uni1A63".  I tested the effect in Firefox on Ubuntu (HarfBuzz shaper), Safari on iPhone 6 with iOS 12.5.5 (CoreText shaper) and MS Edge Legacy on Windows 10 (Build 15063.rs2_release 170317-1834, DirectWrite shaper) with a few sets of automatic updates.  I got the intended display for HarfBuzz and CoreText.  With DirectWrite I got an unexplained display that looked like "uni1A74 uni1A63 uni1A75".
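
    The two steps above can be modelled with a short Python sketch. It is a simplification for illustration, not the font's source and not how any particular shaper is written. The helper names are hypothetical. It assumes the usual OpenType behaviour that the ligature glyph takes the position of the first matched component while skipped glyphs stay where they are.

```python
# GDEF classes as stated in the post above.
GDEF = {"uni1A75": "mark", "uni1A63": "base", "uni1A74": "mark",
        "spawning_mai_ek": "mark"}

def match_at(glyphs, start, components, gdef):
    """Return the positions of `components` from `start`, skipping bases between them."""
    positions = []
    j = start
    for wanted in components:
        # Bases are only skipped after the first component has matched.
        while positions and j < len(glyphs) and gdef.get(glyphs[j]) == "base":
            j += 1
        if j >= len(glyphs) or glyphs[j] != wanted:
            return None
        positions.append(j)
        j += 1
    return positions

def ligate_skipping_bases(run, components, ligature, gdef):
    """Ligature substitution with the IgnoreBaseGlyphs behaviour."""
    out = list(run)
    i = 0
    while i < len(out):
        positions = match_at(out, i, components, gdef)
        if positions:
            out[i] = ligature                 # ligature replaces the first component
            for j in reversed(positions[1:]):
                del out[j]                    # later components are removed
        i += 1
    return out

def multiple_subst(run, target, replacement):
    """Unconditional one-to-many substitution."""
    out = []
    for glyph in run:
        out.extend(replacement if glyph == target else [glyph])
    return out

step1 = ligate_skipping_bases(["uni1A75", "uni1A63", "uni1A74"],
                              ["uni1A75", "uni1A74"], "spawning_mai_ek", GDEF)
step2 = multiple_subst(step1, "spawning_mai_ek", ["uni1A74", "uni1A75"])
print(step1)
print(step2)
```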

    I suspect the base-mark association under DirectWrite is corrupt.  The original character input <1A75, 1A63, 1A74> is corrupted by the Universal Shaping Engine to <uni1A75, uni25CC, uni1A73, uni25CC, uni1A74> and then cleaned up by the font to <uni1A75, uni1A63, uni1A74> before Siamese-style shaping begins.  Most Tai Tham styles only need this shaping when the uni1A63 ligates with the preceding base consonant, but the Siamese style dominates Northern Thai typesetting.

    My conclusion is that ignoring bases works if the glyph classes are defined in the GDEF table, but is risky if there has already been a lot of ligaturing of marks and bases.

  • RichardW
    RichardW Posts: 100
    (The source code is actually at version 0.13.1.1.)
  • WAY KYI
    WAY KYI Posts: 130
    edited July 2022
    Thank you very much, Mr. RichardW. I could not download your test font - 404 not found. And your language is very different from mine. Your conclusion, "ignoring bases works if the glyph classes are defined in the GDEF table, but is risky if there has already been a lot of ligaturing of marks and bases", seems good enough for my language, because there are ligatures and mark classes but they do not overlap enough to need lookupflags/skipping glyphs. Thanks
  • RichardW
    RichardW Posts: 100
    The correct and full references are https://wrdingham.co.uk/lanna/test/dalekh.woff and https://wrdingham.co.uk/lanna/test/font_test.htm.  I apologise for the typo.

    'Ligature' is a confusing word.  There are ligature glyphs, and there are ligature substitutions, which for many scripts will not yield ligature glyphs.  The problems I've seen in developing Da Lekh chiefly relate to applying a ligature substitution to a base (commonly uni25CC) and a mark to yield a mark.  The problems don't relate to lookupflags, but rather to which base and mark glyphs a mark glyph can be associated with by an attachment lookup.  The OpenType 'specification' does not define this behaviour except for very simple cases.