Tonos and oxia

mauro sacchetto
mauro sacchetto Posts: 353
edited May 2020 in Font Technology
When implementing Greek in a font, I realized that, when using the polytonic script, some glyphs resolve to those of the monotonic set. This happens both in LibreOffice and in LaTeX, even though I specify the <polutonicogreek> option for the engine I use, namely Lualatex.
I should note that my monotonic and polytonic accents are slightly different. The problem is that the polytonic glyphs with the acute accent, which should be the ones with oxia, resolve instead to the monotonic glyphs with tonos (unless, instead of typing with keyboard shortcuts, I enter the glyph with the correct Unicode value directly).
Of course, this problem doesn't show up in the professional fonts I've checked, but they don't all use the same solution: some have a ccmp lookup, others use Contextual Alternates with substitution rules. Honestly, though, I don't understand how to proceed.
How do you ensure the correct mapping between monotonic and polytonic forms, specifically for these accents?
Thank you

Comments

  • I guess it is advisable to sort out first whether your issue is
    • a glyph design matter
    • an encoding problem
    • a character input issue
    • a layout engine issue.
  • 1. The glyphs are slightly differentiated in the size and inclination of the accent (although I am considering eliminating this difference, which seems to have no historical-philological basis).
    2. The glyphs are placed in their proper slots (UnicodeBmp encoding) and correspond to the assigned Unicode values.
    3. I have also considered the input problem. I write Greek directly in a Unicode text editor and use shortcuts for the diacritics. If I insert a vowel with an acute accent through the relevant shortcut, I always get the letter with tonos; I get the one with oxia only if I copy and paste that glyph from some website. But I don't know how to produce that character from the keyboard, since it evidently has a different Unicode value.
    4. The problem occurs both with Lualatex, with which I can manage the various kinds of Greek, and with LibreOffice.
  • I did an experiment: in two other fonts (EB Garamond and Garamond Premier Pro) I replaced the alpha with tonos by an alpha carrying a bullet above. In both cases this modified "alpha with bullet/tonos" always appears, while the glyph with oxia is always ignored.
    At this point it may be an input problem, but I don't know how to type the uni1F71 glyph with any shortcut.
    Can anyone produce the two glyphs, uni03AC and uni1F71, directly from the keyboard?
    I thought of a 'locl' lookup for <grek {PGR}>, and I'm now trying to create it.
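    A quick way to see which of the two codepoints a keyboard shortcut actually produces is to inspect the characters with a few lines of Python (a sketch using only the standard unicodedata module; paste your own typed text into the string):

    ```python
    import unicodedata

    # The two visually near-identical alphas, written as explicit escapes;
    # replace the string with characters typed via your keyboard layout.
    for ch in "\u03ac\u1f71":
        print(f"U+{ord(ch):04X}  {unicodedata.name(ch)}")
    # U+03AC  GREEK SMALL LETTER ALPHA WITH TONOS
    # U+1F71  GREEK SMALL LETTER ALPHA WITH OXIA
    ```

    If the shortcut only ever yields U+03AC, the oxia glyph can never be reached from the keyboard, whatever the font does.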
  • mauro sacchetto
    mauro sacchetto Posts: 353
    edited May 2020
    I created the following lookup:

    lookup loclLocalizedFormsinGreeklookup5 {
      lookupflag 0;
        sub \Alphatonos by \uni1FBB ;
        sub \Epsilontonos by \uni1FC9 ;
        sub \Etatonos by \uni1FCB ;
        sub \Iotatonos by \uni1FDB ;
        sub \Omicrontonos by \uni1FF9 ;
        sub \Upsilontonos by \uni1FEB ;
        sub \Omegatonos by \uni1FFB ;
        sub \iotadieresistonos by \uni1FD3 ;
        sub \alphatonos by \uni1F71 ;
        sub \epsilontonos by \uni1F73 ;
        sub \etatonos by \uni1F75 ;
        sub \iotatonos by \uni1F77 ;
        sub \upsilondieresistonos by \uni1FE3 ;
        sub \omicrontonos by \uni1F79 ;
        sub \upsilontonos by \uni1F7B ;
        sub \omegatonos by \uni1F7D ;
    } loclLocalizedFormsinGreeklookup5;
    
    feature locl {
      script grek;
         language PGR  exclude_dflt;
          lookup loclLocalizedFormsinGreeklookup5;
    
    } locl;


    but it produces no result: no substitutions are made. Is something wrong with my setup or with the syntax?
    Moreover, to rule out a layout engine issue, I add that I have compiled a test file with Lualatex both with the <\usepackage[greek.ancient]{babel}> and with the <\usepackage[greek.polutoniko]{babel}> options. Perhaps the problem lies in making the text identify itself as PGR...
  • KP Mawhood
    KP Mawhood Posts: 296
    [quoting Mauro's points 1–4 from the post above]
    This might shed some light on a couple of these points, although it doesn't solve your overall problem.

    There are some "duplicate" Greek characters that exist for compatibility with legacy systems, also known as compatibility characters. They are canonically and semantically equivalent. Most fonts, search engines and other software will treat and display both versions identically. The Greek keyboard in use decides which of the two is input.

    The duplicated characters preserved a false distinction from legacy code tables, following the Greek spelling reform of the 1980s. The Unicode database initially established normalization rules and since 2016 has formally deprecated the versions in the Greek Extended range in favour of those in the basic range (characters are never actually removed from the standard).

    There are sixteen (16) affected characters:

    Unicode Basic range Extended range Name
    ά U+03AC U+1F71 GREEK SMALL LETTER ALPHA WITH TONOS
    έ U+03AD U+1F73 GREEK SMALL LETTER EPSILON WITH TONOS
    ή U+03AE U+1F75 GREEK SMALL LETTER ETA WITH TONOS
    ί U+03AF U+1F77 GREEK SMALL LETTER IOTA WITH TONOS
    ό U+03CC U+1F79 GREEK SMALL LETTER OMICRON WITH TONOS
    ύ U+03CD U+1F7B GREEK SMALL LETTER UPSILON WITH TONOS
    ώ U+03CE U+1F7D GREEK SMALL LETTER OMEGA WITH TONOS
    Ά U+0386 U+1FBB GREEK CAPITAL LETTER ALPHA WITH TONOS
    Έ U+0388 U+1FC9 GREEK CAPITAL LETTER EPSILON WITH TONOS
    Ή U+0389 U+1FCB GREEK CAPITAL LETTER ETA WITH TONOS
    Ί U+038A U+1FDB GREEK CAPITAL LETTER IOTA WITH TONOS
    Ό U+038C U+1FF9 GREEK CAPITAL LETTER OMICRON WITH TONOS
    Ύ U+038E U+1FEB GREEK CAPITAL LETTER UPSILON WITH TONOS
    Ώ U+038F U+1FFB GREEK CAPITAL LETTER OMEGA WITH TONOS
    ΐ U+0390 U+1FD3 GREEK SMALL LETTER IOTA WITH DIALYTIKA AND TONOS
    ΰ U+03B0 U+1FE3 GREEK SMALL LETTER UPSILON WITH DIALYTIKA AND TONOS
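    These correspondences can be checked mechanically; a short Python sketch (standard library only, not part of any tool mentioned above) confirms that NFC normalization folds each Extended-range character into its basic-range counterpart:

    ```python
    import unicodedata

    # Extended-range (oxia) -> basic-range (tonos) pairs from the table above.
    pairs = {
        0x1F71: 0x03AC, 0x1F73: 0x03AD, 0x1F75: 0x03AE, 0x1F77: 0x03AF,
        0x1F79: 0x03CC, 0x1F7B: 0x03CD, 0x1F7D: 0x03CE, 0x1FBB: 0x0386,
        0x1FC9: 0x0388, 0x1FCB: 0x0389, 0x1FDB: 0x038A, 0x1FF9: 0x038C,
        0x1FEB: 0x038E, 0x1FFB: 0x038F, 0x1FD3: 0x0390, 0x1FE3: 0x03B0,
    }
    for ext, basic in pairs.items():
        # NFC replaces each oxia character with its tonos twin.
        assert unicodedata.normalize("NFC", chr(ext)) == chr(basic)
    print("all 16 pairs normalize to the basic range")
    ```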
  • mauro sacchetto
    mauro sacchetto Posts: 353
    edited May 2020
    The correspondences you list are precisely those between the glyphs with tonos and the glyphs with oxia. It probably makes no sense to draw them differently (EB Garamond, for example, has a slightly larger and less slanted tonos than oxia), because the difference seems to have no historical-philological basis.
    In any case I still do not understand the mechanism, despite having tried multiple keyboards (at least Greek and Greek Polytonic), and above all I do not understand why the lookup I reported in the post above does not work.
    Or do you mean that in any case the character used is U+03AC (i.e. alphatonos) and never the duplicate U+1F71?

  • John Hudson
    John Hudson Posts: 3,221
    I think Katy has nailed it. The behaviour Mauro reports is likely due to the test environments applying Unicode normalisation, so e.g.

    1F71 ά GREEK SMALL LETTER ALPHA WITH OXIA
    ≡ 03AC ά  greek small letter alpha with tonos

    Not all environments apply normalisation, so you may get different results in different apps/platforms with the same fonts.

    There's no really robust way around this other than to make the two identical (or to make the polytonic and monotonic separate fonts). Making them identical also makes sense if one takes the view that the tonos is the oxia, that the monotonic system didn't create a new accent, but simply got rid of all-except-one of the existing accents (and reformed some of the rules around its use).

    I can imagine wanting to modify the length of the tonos, to make it shorter than the oxia, because as a single mark it doesn't need to have the same graphical presence as in a more complex system of marks, but I can live with a slightly too tall tonos if that's the price of my polytonic looking correct.

    See also Nick Nicholas' discussion of oxia vs tonos.
  • Thanks for the listing, Katy. I have always understood these characters to be doublets, and I give each pair identical glyphs. Another point: if one wants to check Unicode codepoints in some output, I find the Unicode Fallback font very useful for cases of ambiguity such as this.
  • In fact, from a historical point of view there is no difference between these accents; differentiating them is a licence of modern design, meant to make the font more homogeneous. That's why I was considering eliminating it.
    I would like to ask @JohnHudson to be kind enough to clarify his statement:
    Not all environments apply normalization, so you may get different results in different apps/platforms with the same fonts.
    What do you mean by "environments"? Are LibreOffice or Lualatex environments in your sense?
    Is the discussion about oxia vs tonos you are referring to the following one?
  • John Hudson
    John Hudson Posts: 3,221
    edited May 2020
    An environment is any specific combination of application, operating system, versions of either, algorithms, optional libraries, etc. that might affect the outcome of testing particular operations. In the case of text processing and display, one of the factors that can play a role is normalisation: a process by which the multiple possible ways of encoding the same piece of text are reduced to one of the forms defined by Unicode's normalisation standard. In this case, the Greek letters with oxia have singleton canonical decompositions to the corresponding letters with tonos. This means that every normalisation form converts the oxia characters to the tonos characters as a one-way operation: singleton decompositions are excluded from recomposition, so no normalisation form ever restores the oxia characters.
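    As an illustration of this one-way behaviour, a small Python sketch (standard library only) shows that every Unicode normalisation form maps the oxia character onto the tonos one, and that no form maps it back:

    ```python
    import unicodedata

    oxia, tonos = "\u1F71", "\u03AC"  # alpha with oxia / alpha with tonos
    for form in ("NFC", "NFD", "NFKC", "NFKD"):
        # Both characters reduce to the same string in every form,
        # and the result never contains the oxia codepoint.
        assert unicodedata.normalize(form, oxia) == unicodedata.normalize(form, tonos)
        assert oxia not in unicodedata.normalize(form, tonos)
    print("oxia folds into tonos under every normalization form")
    ```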

    So yes, LibreOffice and Lualatex are environments, and may be part of larger environments — host operating systems, as well as default algorithms or preference settings within the applications — that could affect what is happening to text.

    Is the discussion about oxia vs tonos you are referring to the following one?

    Yes, that's the one. Sorry, I thought I had included a link, but it didn't work.
  • mauro sacchetto
    mauro sacchetto Posts: 353
    edited May 2020
    I understand the situation you describe, even if I find it rather disconcerting that the result may change involuntarily, through a complex set of circumstances. My "working environment" applies a normalization, but it is impossible for me to determine all the factors that produce one result rather than another.

    Evidently, though, these factors determine the outcome to such an extent that the lookup I reported above does not produce the desired result. Do you find the lookup correct, or does it contain errors? I believed that by this means I could adapt the behaviour to my "environment", but it produces no result...

    One detail. To prevent some letters with stacked diacritics from becoming "too tall" (I am thinking in particular of the vowels with psili or dasia plus perispomeni), is it acceptable for the psili or dasia to be drawn smaller in this case than on vowels without stacked diacritics?
  • mauro sacchetto
    mauro sacchetto Posts: 353
    edited May 2020
    I have made further investigations. There is certainly a problem in that, in many cases, the editor and/or the OS normalizes the input into some Unicode normal form (typically NFC). In Unicode, the characters with tonos and oxia are canonically equivalent, so you always end up with tonos in your output.
    The solution is to add a substitution rule which recovers the oxia, but you have to be sure that the replacement isn't itself normalized.
    With Lualatex the replacement works because, to prevent the system from normalizing the replaced text as well, the rule specifies the targets not as glyph names but as Unicode codepoint numbers.

    \directlua {
    fonts.handlers.otf.addfeature{
      name = "tonosoxia",
      % features = {grek = {pgr = true}}, % Restrict the change to Polytonic Greek. Doesn't work here because babel only sets the language to greek
      type = "substitution",
      data = {
        Alphatonos = 0x1FBB,
        Epsilontonos = 0x1FC9,
        Etatonos = 0x1FCB,
        Iotatonos = 0x1FDB,
        Omicrontonos = 0x1FF9,
        Omegatonos = 0x1FFB,
        Upsilontonos = 0x1FEB,
        alphatonos = 0x1F71,
        epsilontonos = 0x1F73,
        etatonos = 0x1F75,
        iotatonos = 0x1F77,
        iotadieresistonos = 0x1FD3,
        omicrontonos = 0x1F79,
        omegatonos = 0x1F7D,
        upsilontonos = 0x1F7B,
        upsilondieresistonos = 0x1FE3,
      },
    }
    }

    But I don't know if and how this can be done in the font editor.



    In any case, as you can see from the attached screenshot, the replacement appears correct in the lookup window.
    Now the problem is to understand what is wrong with the lookup I reported above, because (if I am not mistaken) it operates after the normalization of the text has been carried out.


  • John Hudson
    John Hudson Posts: 3,221
    Glyph substitutions should all be occurring after normalisation (which is a character operation). I say ‘should’ because I am aware of at least one situation in which Adobe re-normalises output from the ccmp feature, but that's clearly a stupid bug on their part even if they've resisted fixing it for more than a decade. Harrumph!

    Now, your substitution seems to depend on the PGR language system tag, so in order to work it requires that you are able to tag text in your environment in a way that invokes the PGR language system feature lookups in the font. This is far from the most robust aspect of OpenType Layout, with varying levels and methods of support. So there are quite a lot of ways it can fail.
  • mauro sacchetto
    mauro sacchetto Posts: 353
    edited May 2020
    In fact, the problem is probably finding a way to tag the text as PGR for Lualatex, and choosing <greek.polutoniko> as an option doesn't seem to be enough. But this is a LaTeX problem, not a typography one.

    One last question. The replacement works fine if I set, in the lookup's metadata, grek{PGR, dflt}: what drawbacks can adding <dflt> produce?

    Will that replacement take place even if I use monotonic Greek?
  • John Hudson
    John Hudson Posts: 3,221
    Yes. The dflt language system features and lookups are what will be used for the grek script if no other language system tag is invoked. So whatever you put in the grek dflt combination will be fallback behaviour in any situation where language tagging isn't picked up.
    _____

    Hmm. I've thought of another way you could get your oxia form to substitute only and specifically in polytonic setting, but it's a bit crazy (and would only work for sequences of polytonic words containing at least one other polytonic accent). Probably too silly to consider a viable solution.

    You'd be better off making separate polytonic and monotonic fonts if you really want the accent forms to differ. That's what I do when I have language-specific forms that I want to be really reliably used, e.g. Tiro Devanagari Hindi, Tiro Devanagari Marathi, Tiro Devanagari Sanskrit.
  • mauro sacchetto
    mauro sacchetto Posts: 353
    edited May 2020
    This discussion has finally clarified my ideas.
    In summary, I see three possible solutions:
    1) Radical solution (probably the best, which avoids unnecessary headaches): use the same accents for tonos and oxia, as the Brill font does, among others. This eliminates the problem completely and is largely justified by the fact that differentiating tonos and oxia appears to be a designer's habit rather than a necessity, since it has no philological foundation. (EB Garamond, for example, does differentiate them; note that, without particular "corrections", EB Garamond can also mix vowels with tonos and vowels with oxia in the same text. The glyphs are so similar that at 10–12 pt only a particularly attentive eye, or a font designer, notices it, but the difference, however minimal, is there.)
    2) Exaggerated solution: as @JohnHudson suggests, create two different fonts. This makes sense in the cases he cites, but seems to me out of proportion to my goal: it involves only a dozen glyphs, and building two fonts of over a thousand glyphs each to diversify twelve is like shooting sparrows with a cannon.
    3) Platform-dependent software solution: for word processors such as LibreOffice I have no idea whether and how it can be done, but with Lualatex it is possible to write code that appropriately tags the text as grek{PGR} so that the 'locl' lookup is applied.
  • John Hudson
    John Hudson Posts: 3,221
    4) crazy solution:
    a) Decompose all combined diacritics to letter + mark combinations using the ccmp feature.
    b) Ensure marks are identified as such in the GDEF table.
    c) Use GPOS mark anchor attachment to correctly position marks on letters.
    d) Add an rclt feature substitution tonos -> oxia in a chaining context of other polytonic marks (i.e. look behind and ahead for varia, perispomeni, etc.) and set the lookup flag to not process base glyphs.

    The last step, the lookup flag, is the reason for the initial decomposition: you want to get to a state where you can process the mark glyphs independent of the letters.

    That would work for any sequences of polytonic words that contain multiple accent types. Obviously it wouldn't work for a string that only contained the tonos, as there would be no context to trigger the substitution. There would also be an issue in some environments that don't process the word space as part of a string, so won't catch context from adjacent words.
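    At the character level, the trigger condition of step (d) can be sketched in Python (a hypothetical illustration only; the set of marks is an assumption covering the common polytonic diacritics, and real shaping operates on glyphs, not characters):

    ```python
    import unicodedata

    # Combining marks that occur only in polytonic Greek: varia (U+0300),
    # perispomeni (U+0342), psili (U+0313), dasia (U+0314), ypogegrammeni
    # (U+0345). The acute (U+0301) alone is ambiguous: tonos or oxia.
    POLYTONIC_MARKS = {"\u0300", "\u0342", "\u0313", "\u0314", "\u0345"}

    def word_is_polytonic(word: str) -> bool:
        """Trigger condition of step (d): the acute counts as oxia only when
        an unambiguously polytonic mark appears in the same word."""
        decomposed = unicodedata.normalize("NFD", word)
        return any(mark in decomposed for mark in POLYTONIC_MARKS)

    print(word_is_polytonic("\u03ba\u03b1\u03bb\u03cc\u03c2"))       # "καλός" -> False
    print(word_is_polytonic("\u1f00\u03b3\u03b1\u03b8\u03cc\u03c2"))  # "ἀγαθός" -> True
    ```

    A word containing only the acute gives no context at all, which is exactly the failure case described above.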
  • This is very remarkable work. It remains to be evaluated whether it is worth the effort, and whether in certain circumstances it might still fail...
  • In the end I solved it by managing to create the right combination of font lookups and Lualatex code.



    As you can see from the image, on the left Adobe's Garamond Premier Pro itself shows the problem: in polytonic text it uses the acute accents of the monotonic set, which are slightly different from the polytonic ones, causing a lack of homogeneity in the rendering (see the third line). On the right is the final rendering of the font I am working on.

    Thanks a lot for your cooperation!