What is i.latnTRK?

I don't see a Unicode value for it. It's in a font I started a while back, and I haven't figured out where it came from.

Comments

  • Chris Lozos
    Chris Lozos Posts: 1,458
    edited April 2015
    It is probably an additional dotted i for Turkish substitution, used when you want it to read back (in a PDF) as a dotted i instead of a dotless one. I often see it as i.TRK.
  • Hmmmm.
    Thanks.
  • Mark Simonson
    Mark Simonson Posts: 1,739
    edited April 2015
    Any glyph that has a dot in the name will necessarily not have a Unicode value. The part before the dot is usually (but not necessarily) the name of one of the standard Unicode characters. The .whatever indicates a variant of that character, usually accessed through an OpenType feature.
  • So, why not use an i instead of idotaccent? Is a special glyph actually required?
  • Mark Simonson
    Mark Simonson Posts: 1,739
    edited April 2015
    It has to do with automatic ligatures.

    In Turkish, i is distinct from ı. If one of these follows an f, there may be confusion if the ligature is designed in a way that would make the presence or absence of a dot unclear. By substituting i with idotaccent (identical to the i in appearance) for Turkish, you automatically suppress the fi ligature as a side effect (you purposely don't include an f+idotaccent ligature), which is the proper way to handle it in Turkish. (Alternatively, you could suppress the fi ligature explicitly in the locl feature, but using a special Turkish i is simpler.)
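
    As a rough sketch, that approach might look like this in feature-file syntax. The glyph names are assumptions: the Turkish i (called idotaccent above) is named i.TRK here, and the ligature is named f_i. The locl block comes before liga so that, in typical compilers, its lookup is applied first.

    ```fea
    # Sketch only: i.TRK and f_i are assumed glyph names.
    languagesystem DFLT dflt;
    languagesystem latn dflt;
    languagesystem latn TRK;

    feature locl {
        script latn;
        language TRK exclude_dflt;
        sub i by i.TRK;   # same outline as i, but a different glyph
    } locl;

    feature liga {
        # There is deliberately no rule for f + i.TRK,
        # so Turkish text never forms this ligature.
        sub f i by f_i;
    } liga;
    ```

    With this in place, text tagged as Turkish reaches liga with i.TRK instead of i, so the f i rule simply never matches.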

    idotaccent is one name that is used, but I’ve also seen i.trk, i.dotaccent, i.dot, and others. It doesn’t really matter since it’s not encoded and not a standard name, although using a name with the dot (i.whatever) helps with PDFs since it signifies that it’s really just an i (for searches, etc., as Chris mentioned).
  • @Mark, since it is a standard glyph required for Turkish typography, why would it not have a Unicode value, like any other accented character? Why not treat dotless i as an accent and use the regular Latin 'dotted' i as idotaccent in lowercase? (And the reverse for uppercase.)

    It sounds convoluted, actually.

    I see that. :smile: 

    Thanks everyone for the help and information.

  • Mark Simonson
    Mark Simonson Posts: 1,739
    It’s just how it all went down when localization of computers for Turkish happened, as far as I know. What we call “dotless i” was adopted as a normal i in Turkish. Our normal i became an accented i in Turkish. It gets a bit confusing because our normal capital I has no dot, so it can be used as is in Turkish. All that was needed to support Turkish was a capital I with a dot accent (İ), which does have its own Unicode value. Thankfully, type designers don’t have to worry about how the software sorts this out (like for changing case) as long as the codes are properly assigned. It only becomes an issue for type designers when you get to ligatures, as noted above.
  • attar
    attar Posts: 209
    What do you want a Unicode codepoint for? i.trk is just a trick to prevent substitution, first used by Adobe so they wouldn't have to special-case every lookup that substitutes i with something else.
    In my upcoming font the fi ligature has a dot, so I'm not going to use that; I'm just going to need a locl-specific substitution for the small caps.
  • Adam Twardoch
    Adam Twardoch Posts: 515
    edited April 2015
    Yes, the point of this is: in most Latin-script languages we have the lowercase-to-uppercase conversion:
    i -> I

    But in Turkish, we have (actually more logical):
    i -> İ
    ı -> I

    In Turkish, the letter "i" represents a vowel similar to English "ee" in "been" or English "y" in "belly" or English "ie" in "fierce". That vowel in Polish is represented by "i".

    The Turkish letter "ı" represents a vowel similar to English "i" in "bin" or "lift", which in Polish is written as "y".

    And Turkish "y" represents a consonant like "y" in English "you" or "boy", in Polish written using "j".

    In English, the distinction between the "ee", "i" and "y" sounds is a bit unclear but in Turkish (or Polish) it's very distinct orthographically.

    In Turkish, the case conversions (i -> İ, ı -> I) differ from those of other languages, and case conversion also applies to the smcp feature. The visual distinction between i and ı should also be maintained in ligatures, yet some designers tend to blur it in f_i ligatures by folding or merging the dot. This is how you'd deal with all of that in an OT font:

    1. You make an i.TRK glyph which is identical to i but it'll be treated differently in subsequent features.

    2. You have a special languagesystem latn TRK in which you only perform one substitution: i -> i.TRK in the "locl" feature

    3. From now on, you know that in the Turkish context, one of two glyphs will appear: i.TRK (with a dot) or dotlessi (without a dot).

    4. In features such as liga, you can safely do things like f i -> f_i. The rule only matches the plain i, never i.TRK, so you don't need to think about special lookups for ligatures "with an i" and ligatures "without an i".

    5. Similarly, in the smcp feature, you can safely do i -> I.sc (dot vanishes), i.TRK -> Idotaccent.sc (dot stays), dotlessi -> I.sc (no dot at all).

    For most languages, the dotted i's uppercase or smallcaps counterpart is a dotless I but for Turkish, the dot needs to stay. So without i.TRK, you'd need special treatment whenever "i" is involved in a feature.

    With "i.TRK", you disambiguate the normal "i" and the Turkish "i" early (locl is among the first features to be processed), and all your subsequent features can be written in a simple way.
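
    For illustration, the small-caps part (step 5) might be written like this. It is only a sketch: I.sc and Idotaccent.sc are assumed names for the small-cap glyphs, and it relies on a locl rule earlier in the file having already replaced i with i.TRK for latn/TRK (step 2).

    ```fea
    # Sketch only: I.sc and Idotaccent.sc are assumed glyph names.
    # Assumes locl has already swapped i for i.TRK in Turkish text.
    feature smcp {
        sub i by I.sc;               # default: the dot vanishes
        sub dotlessi by I.sc;        # no dot to begin with
        sub i.TRK by Idotaccent.sc;  # Turkish: the dot stays
    } smcp;
    ```

    The feature needs no language-specific branch of its own; the one decision made in locl carries through.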

    Of course, i.TRK is not always required. If your font doesn't have small caps and your f_i ligatures are drawn so that the dot over the i is fully visible (because you shorten the f's tail rather than elongate it), you don't need a TRK languagesystem or i.TRK.

    f_i ligatures can be used in Turkish but only if those ligatures don't blur the visual distinction between the dotted and the dotless i in such a ligature.

    Best,
    Adam
  • Adam Twardoch
    Adam Twardoch Posts: 515
    edited April 2015
    Mark is right that the name i.TRK or i.dotaccent is better than idotaccent because that last one does not resolve to anything (and it should resolve to the Unicode of i). Using a period in the name improves PDF accessibility.

    Also, if for any reason your lowercase glyph "i" does not have a dot (because your lowercase glyphs are small caps or look like uppercase, e.g. in Adobe's Trajan), then you should draw your i.TRK separately, with a dot. In most Latin-script languages, the dot over i is sort of optional, but in Turkish it's mandatory.
  • In Glyphs, you can use idotaccent as a design name to make clear what it is. On export it will use a production name: i + dot + suffix.
  • Adam Twardoch
    Adam Twardoch Posts: 515
    edited May 2015
    Here's a recap to hopefully clarify it once and for all:

    "i.TRK" or "i.dotaccent" is an identical copy of "i", with a dot.

    "dotlessi" has a separate Unicode. The Turks input "i" and "dotlessi" as separate letters from the keyboard.

    For "dotlessi", case conversion, ligature forming and small-caps conversion are straightforward: no dot on uppercase or small caps, no ligation.

    But with "i", it's different: in other languages, it loses the dot on case or small caps conversion and in ligation. In Turkish, it always keeps the dot, so you make the shadow copy of "i" as "i.TRK" or "i.dotaccent", substitute it in locl and then treat it separately (no ligation, conversion to "i.TRK.smcp" which has a dot etc).

    If Unicode were created today, they would have added "dotlessi" and "idotaccent" for Turkish. "idotaccent" as used in Turkish is de facto a different letter than "i". In the Latin "i", the dot is sort of optional. In the Turkish "i", i.e. "idotaccent", the dot is mandatory regardless of the letter case.

    But since Unicode is old and full of horrors, OpenType fonts need to address it post-factum.

    So the correct way is:

    1. Have these glyphs:

    LOWERCASE:
    dotlessi (without dot)
    i (with dot)
    i.TRK (with dot)

    UPPERCASE:
    I (without dot)
    Idotaccent (with dot)

    SMALL CAPS (pick one name or duplicate your glyphs to have them all):
    i.smcp / I.c2sc / dotlessi.smcp (without dot)
    i.TRK.smcp / Idotaccent.c2sc (with dot)

    2. Add this feature:

    feature locl {
        script latn;
        language TRK exclude_dflt;
        sub i by i.TRK;
    } locl;

    Then, form ligatures from "i" only, and for small caps, substitute "dotlessi", "i" and "I" by a small-cap i without dot, and substitute "i.TRK" and "Idotaccent" by a small-cap i with dot.
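
    The uppercase half of that last sentence could be sketched as a c2sc feature. The glyph names follow the list above, with i.smcp taken as the dotless small cap and i.TRK.smcp as the dotted one; adjust them to whatever your font actually uses.

    ```fea
    # Sketch only: i.smcp (no dot) and i.TRK.smcp (with dot) are the
    # small-cap names picked from the glyph list above.
    feature c2sc {
        sub I by i.smcp;               # plain capital I: no dot
        sub Idotaccent by i.TRK.smcp;  # Turkish capital İ: dot stays
    } c2sc;
    ```

    Because I and Idotaccent are separately encoded, this part needs no locl step at all.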
  • Thank you for that great recap.