Where do I find which glyphs are required for a given language?

Comments

  • Nick Shinn Posts: 1,767
    John Hudson wrote:

    langsys tags … regional preferences

    Examples of this in Latin script are a more vertically inclined Polish kreska (acute) accent, a lowered capital dieresis in German, and extra guillemet sidebearings in French.

    These are quite rare and discretionary, not a standard.  
  • John Hudson said:
    But there is no documented de facto standard for most of the langsys tags...

    Fair enough, though at least when there's a mapping to an ISO 639 identifier it's clear in what contexts it would be appropriate to apply. Also, while I don't know about details of tags that were registered very early on, when people have since requested that tags be registered, then I would assume that's because they have distinctions they want to make in fonts.
  • John Hudson Posts: 2,205
    edited August 24
    Also, while I don't know about details of tags that were registered very early on, when people have since requested that tags be registered, then I would assume that's because they have distinctions they want to make in fonts.
    I was pondering that when considering the requests that Bob Hallissy just submitted. I took a look at the Tamil fonts that are using some of the new langsys tags, and as far as I can tell from a quick GSUB table review, only two of the four new tags were using locl variants: the other two seemed to mimic the dflt processing.

    Of course, though, once you have a model that treats language X as default processing in a font and languages Y and Z as having variant behaviour, you have created the possibility that someone may want to build a font that treats Y or Z as default processing and the others the variants. So unless a de facto standard implies always that a particular language is default, registering all the languages with behaviours that vary from each other becomes necessary.
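The kind of GSUB review described above can be sketched in a few lines. This is only a minimal model, not a font-parsing tool: the dictionary stands in for one script's language systems (with "dflt" standing in for the default language system), each mapping feature tags to lookup indices, and the langsys tags "LNGA" and "LNGB" as well as the lookup numbers are hypothetical placeholders.

```python
from typing import Dict, Tuple

# feature tag -> lookup indices referenced by that feature
FeatureMap = Dict[str, Tuple[int, ...]]


def classify_langsys(script: Dict[str, FeatureMap]) -> Dict[str, str]:
    """Label each non-default langsys by how it relates to 'dflt'."""
    default = script["dflt"]
    result: Dict[str, str] = {}
    for tag, features in script.items():
        if tag == "dflt":
            continue
        if features == default:
            # Same features and lookups: the tag only mimics dflt processing.
            result[tag] = "same as dflt"
        elif "locl" in features:
            result[tag] = "uses locl"
        else:
            result[tag] = "differs without locl"
    return result


# Hypothetical data for a single script table.
example_script: Dict[str, FeatureMap] = {
    "dflt": {"akhn": (0, 1), "pres": (2,)},
    "LNGA": {"akhn": (0, 1), "pres": (2,)},
    "LNGB": {"locl": (3,), "akhn": (0, 1), "pres": (2,)},
}

print(classify_langsys(example_script))
```

A tag reported as "same as dflt" is registered in the font but changes nothing about shaping, which matches the "mimic the dflt processing" case above.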
  • John Savard Posts: 860
    edited August 25
    But there is no de facto standard.

    Well, yes. But if enough font designers bring out fonts purporting to support these language tags, according to their own ideas of what that should entail, eventually a de facto standard for those tags would emerge.
    That's what I was viewing as preferable to no attempted implementations at all, on the grounds that "a de facto standard is better than none". But in the absence of even a de facto standard, incompatible implementations are also better than no implementations, because a de facto standard cannot be born from silence, but it can emerge from chaos. (Of course, a real standard could emerge without preceding implementations, if there were interest enough for some suitable body to develop one, but that is precisely what seems to be absent.)
  • To be fair, Unicode CLDR was first released on 2003-12-19. In the dark times before that, technicians (including me) did "something". CLDR allows locale tags based on BCP 47, which can specify language, script, region, further variants and private-use extensions.

    Thus, for a transcription of Yiddish into the German writing system following the JIVO standard, for example, one can specify:

    yi-Latn-x-jivo

    or for current US/international YIVO

    yi-Latn-x-yivo

    For French of France in IPA

    fr-FR-x-ipa

    For Ancient Greek as used by French scholars (polytonic is, AFAIK, a reserved tag), e.g.

    grc-polytonic-x-sorbonne1833

    For historical German I use e. g.

    de-x-1750-x-longs

    And for the transliteration to current alphabet (not orthography)

    de-x-1750-x-rounds

    For transcription to modern orthography, 1901 (1st spelling reform) and 1996 (2nd) are reserved tags:

    de-1996-x-1750

    But at the font level it's IMHO seldom necessary to specify this in such detail. Sure, German Fraktur will need special care for ligatures and accents. Maybe there are a few glyphs needing variants in Czech or Polish Fraktur.
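To make the structure of such tags concrete, here is a minimal sketch of splitting a BCP 47 tag into language, script, region, variants and private-use parts. It is an assumption-laden simplification, not a full RFC 5646 parser: it ignores extlang subtags, extension singletons other than "x", and grandfathered tags, and it checks only syntax, not whether a subtag is actually registered. The names `LangTag` and `parse_tag` are invented for this illustration.

```python
import re
from typing import NamedTuple, Optional, Tuple


class LangTag(NamedTuple):
    language: str
    script: Optional[str]
    region: Optional[str]
    variants: Tuple[str, ...]
    private_use: Tuple[str, ...]


_ALNUM = re.compile(r"^[A-Za-z0-9]+$")


def parse_tag(tag: str) -> LangTag:
    """Split a simple BCP 47 tag; raise ValueError if it is malformed."""
    subtags = tag.split("-")
    # Every subtag is ASCII alphanumeric and at most eight characters long.
    if not all(_ALNUM.match(s) and len(s) <= 8 for s in subtags):
        raise ValueError(f"malformed subtag in {tag!r}")
    language = subtags[0].lower()
    if not (language.isalpha() and 2 <= len(language) <= 3):
        raise ValueError(f"unsupported language subtag in {tag!r}")
    rest = subtags[1:]
    script = region = None
    variants = []
    i = 0
    # Optional script: four letters, e.g. Latn.
    if i < len(rest) and len(rest[i]) == 4 and rest[i].isalpha():
        script = rest[i].title()
        i += 1
    # Optional region: two letters or three digits, e.g. FR or 419.
    if i < len(rest) and (
        (len(rest[i]) == 2 and rest[i].isalpha())
        or (len(rest[i]) == 3 and rest[i].isdigit())
    ):
        region = rest[i].upper()
        i += 1
    # Variants: 5 to 8 characters, or 4 characters starting with a digit.
    while i < len(rest) and rest[i].lower() != "x":
        s = rest[i]
        if 5 <= len(s) <= 8 or (len(s) == 4 and s[0].isdigit()):
            variants.append(s.lower())
            i += 1
        else:
            raise ValueError(f"unexpected subtag {s!r} in {tag!r}")
    # Everything after the first "x" is private use.
    private = []
    if i < len(rest):
        private = [s.lower() for s in rest[i + 1:]]
        if not private:
            raise ValueError(f"empty private-use section in {tag!r}")
    return LangTag(language, script, region, tuple(variants), tuple(private))


print(parse_tag("yi-Latn-x-yivo"))
print(parse_tag("de-1996-x-1750"))
```

Under these syntax rules any subtag longer than eight characters, such as sorbonne1833, is rejected, so a private-use label of that length would need to be shortened.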
  • Of course, though, once you have a model that treats language X as default processing in a font and languages Y and Z as having variant behaviour, you have created the possibility that someone may want to build a font that treats Y or Z as default processing and the others the variants. So unless a de facto standard implies always that a particular language is default, registering all the languages with behaviours that vary from each other becomes necessary.

    Indeed.

    Use of langsys to effect substitution of glyphs is, effectively, nearly equivalent to creating separate fonts with distinct names and using font names to select which bundle of glyphs will be displayed. A designer could create a set of fonts with names like "Foo", "Foo Betta Kurumba", "Foo Irula", etc. (see Bob's request), or any number of additional fonts for specific languages. But then if someone comes along wanting to use one of the fonts for some other language written in Tamil script, they have to determine which, if any, of the provided fonts works. If they find that one is suitable, then they can just use it.

    But with langsys tags that are selected automatically by layout software based on content metadata (e.g., html lang), they can only get the glyphs needed for that other language if (a) a langsys tag for that other language is added to the font, or (b) the content metadata lies about the actual language of the content.

    Fortunately, adding an additional langsys tag to a font, perhaps with a corresponding 'locl' feature, adds little data and does not bloat the file size.
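The selection problem described above can be illustrated with a small sketch. It assumes a layout engine that maps the primary language subtag of the content's lang value to an OpenType langsys tag and falls back to default processing when the font does not list that tag. The function name and the four-entry mapping table are illustrative only (a tiny excerpt, not the full registry), and real software may behave differently.

```python
from typing import Set

# Tiny excerpt of an ISO 639 -> OpenType langsys mapping (tags are
# four characters, padded with a trailing space).
LANG_TO_OT = {
    "de": "DEU ",
    "fr": "FRA ",
    "pl": "PLK ",
    "ta": "TAM ",
}


def select_langsys(lang_attr: str, font_langsys: Set[str]) -> str:
    """Return the langsys tag to apply, or 'dflt' if none matches."""
    primary = lang_attr.split("-")[0].lower()
    ot_tag = LANG_TO_OT.get(primary)
    if ot_tag is not None and ot_tag in font_langsys:
        return ot_tag
    # Unknown language, or the font has no langsys for it:
    # default processing is all the user gets.
    return "dflt"


font_tags = {"PLK ", "DEU "}
print(select_langsys("pl", font_tags))     # font has a Polish langsys
print(select_langsys("ta-IN", font_tags))  # no Tamil langsys in this font
```

In this model, content in a language the font does not list always receives default processing, which is why the only ways out are adding a tag to the font or mislabelling the content.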
  • John Hudson Posts: 2,205
    But with langsys tags that are selected automatically by layout software based on content metadata (e.g., html lang), they can only get the glyphs needed for that other language if (a) a langsys tag for that other language is added to the font, or (b) the content metadata lies about the actual language of the content.
    Even if a langsys tag is included in the font with appropriate glyph substitution or other specific behaviour, the whole mechanism relies on too many underspecified things happening in software, such that even if the correct glyphs and shaping for a non-default language are displayed in one place, there is no guarantee that they will elsewhere on the system, or in the same apps on other systems. And when text is copied and pasted between software, language tagging is liable to be lost. It is a fragile mechanism, which is why in recent years I have favoured making separate, language-specific fonts.