Language support & character set standards: how do foundries label their Latin-based products today?

Clients often ask me what languages are covered by most professional fonts and I still have to tell them that there’s no simple answer. A good foundry clearly illustrates each font’s support, but a lack of a standard makes it difficult to compare between foundries. 

There are Unicode standards and character encodings, but most font makers offer products that cover sets of languages somewhere between ASCII/Basic Latin, Latin Extended A, and full Unicode.

Is it fair to say most font makers now use the labels “Basic Latin” and “Latin Extended” to describe their Latin-based character sets? How standardized are those sets? What is your favorite reference for those sets that explains to the average human which languages are covered?
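As a point of reference for the block names in the question, here is a minimal sketch that reports which Unicode Latin blocks a handful of letters fall into. The block ranges are the ones published in the Unicode Standard; the per-language letter samples and the helper names (`block_of`, `blocks_needed`, `SAMPLES`) are illustrative assumptions, not complete alphabets or anyone's official definition.

```python
# Unicode block ranges for the main Latin blocks (from the Unicode Standard).
LATIN_BLOCKS = [
    ("Basic Latin", 0x0000, 0x007F),
    ("Latin-1 Supplement", 0x0080, 0x00FF),
    ("Latin Extended-A", 0x0100, 0x017F),
    ("Latin Extended-B", 0x0180, 0x024F),
    ("Latin Extended Additional", 0x1E00, 0x1EFF),
]


def block_of(ch):
    """Return the name of the Latin block containing ch, or "other"."""
    cp = ord(ch)
    for name, lo, hi in LATIN_BLOCKS:
        if lo <= cp <= hi:
            return name
    return "other"


def blocks_needed(letters):
    """Return the set of block names touched by the given letters."""
    return {block_of(ch) for ch in letters}


# Hypothetical, abbreviated samples of accented letters (not full alphabets).
SAMPLES = {
    # a-ogonek, c-acute, e-ogonek, l-stroke, n-acute, o-acute, s-acute, z-acute, z-dot
    "Polish": "\u0105\u0107\u0119\u0142\u0144\u00f3\u015b\u017a\u017c",
    # a-breve, a-circumflex, i-circumflex, s-comma-below, t-comma-below
    "Romanian": "\u0103\u00e2\u00ee\u0219\u021b",
    # a-breve, d-stroke, o-horn, u-horn, a-dot-below
    "Vietnamese": "\u0103\u0111\u01a1\u01b0\u1ea1",
}

if __name__ == "__main__":
    for lang, letters in SAMPLES.items():
        print(lang, sorted(blocks_needed(letters)))
```

Even with these small samples, the letters of a single language spread across several blocks, which is one reason a block name such as “Latin Extended-A” does not translate directly into a list of languages.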


    the labels “Basic Latin” and “Latin Extended”
    These terms have lost what meaning they had several years ago. I don't know if they have any value now other than causing confusion. A list of covered languages is more telling but much longer to spell out. The "Pro" term has likewise outlived its usefulness. Perhaps it would be better if there were true and universally accepted standard labels. God help us if we try to agree on what those are, though!
    Even if we used a figure for the percentage of all Latin-script languages covered [YourFont 90%], a user would still want to know which languages those were.
    I'm confused by the coverage of Latin Extended as well. One foundry will list Latin Extended with support for about 59 languages, but another foundry with the same glyph set (basically Latin Extended-A) will claim more than 140 languages. What am I missing here? It's the exact same glyph set.
  • I don’t label my fonts with any character set because there are enough labels out there to be confusing. I put a complete character set in the PDF specimen, and my newer stuff includes a list of languages I know are supported by that character set. I don’t pad my list, though, with different versions of Norwegian, mutually intelligible dialects, constructed languages, and politically disputed names of languages.

    Even better, all of my vendors will display a dump of the character set and/or have a type tester that supports diacritical marks, so designers can look this up themselves. That is something they should be able to learn in about five minutes: it’s not hard to look up a language on Omniglot or Wikipedia and see what letters it uses. This would probably be a good topic for a Typographica article. World languages are fun stuff, and designers would probably be happy to know that one doesn’t need a degree in linguistics to look this stuff up.
  • Ray Larabie
    Fontspring has a system that displays most of the languages covered under the tech specs tab. It seems to scan for minimum character sets for each language rather than using codepage flags. For example, it can differentiate between a few Greek symbols for mathematics and a proper, usable Greek set.
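The idea of scanning for a minimum character set per language can be sketched roughly as follows. This is not Fontspring's actual implementation, which is not public as far as this thread shows; the `REQUIRED` table is a hypothetical, heavily abbreviated stand-in for real per-language data, and the font is modelled simply as a set of code points. With a real font file, such a set could be obtained from its cmap table, for example with fontTools' `TTFont(path).getBestCmap()`.

```python
import unicodedata

# Hypothetical, abbreviated "minimum character sets" beyond a-z/A-Z.
# Real language data would be longer and more carefully researched.
REQUIRED = {
    "German": "äöüßÄÖÜ",
    "Polish": "ąćęłńóśźżĄĆĘŁŃÓŚŹŻ",
    # Lowercase Greek alpha..omega, including final sigma.
    "Greek": "".join(chr(cp) for cp in range(0x03B1, 0x03CA)),
}


def required_codepoints(lang, required=REQUIRED):
    """Code points a font must map to count as supporting lang."""
    # NFC guards against the table having been typed in decomposed form.
    return {ord(c) for c in unicodedata.normalize("NFC", required[lang])}


def missing(font_codepoints, lang, required=REQUIRED):
    """Sorted list of required characters the font does not map."""
    gaps = required_codepoints(lang, required) - set(font_codepoints)
    return sorted(chr(cp) for cp in gaps)


def supported_languages(font_codepoints, required=REQUIRED):
    """Languages whose whole minimum set is present in the font."""
    return sorted(lang for lang in required
                  if not missing(font_codepoints, lang, required))


# Assumed example font: everything from space through Latin Extended-A,
# plus a few Greek letters often included as math symbols.
FONT = set(range(0x0020, 0x0180)) | {0x0394, 0x03A9, 0x03BC, 0x03C0}

if __name__ == "__main__":
    print(supported_languages(FONT))
    print(len(missing(FONT, "Greek")), "Greek letters missing")
```

Under these assumptions the font is reported as covering German and Polish but not Greek, even though it contains a few Greek letters. Two vendors using different `REQUIRED` tables would also report different language counts for the same glyph set.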
  • There are other standardized character sets. For instance, my Cormorant family covers Adobe Latin 4.

  • Trying to be as brief and clear as I can when writing, I most often describe Latin faces as covering Western European languages, Central European languages, or both, and beyond that by the scripts covered. It’s a little sloppy, but as we’ve discussed here there’s no perfect solution. FontShop’s family pages do list the languages covered by a given family, which I think gives as specific an answer as is needed.
  • With web fonts, I think you can no longer claim to support a language unless you offer mark features.

    Do browsers automatically (and reliably) build accented letters on the fly?
    Browser support for combining mark sequences should be pretty reliable because the layout engines all the major browsers use support GPOS mark positioning.
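    Why mark positioning matters for language claims can be illustrated at the encoding level. The sketch below, using only Python's standard `unicodedata` module, checks whether a base-plus-marks sequence has a single precomposed code point. It says nothing about what any particular browser or font does; the function name and the two sample sequences are my own illustrative choices.

```python
import unicodedata


def needs_mark_positioning(sequence):
    """True if the sequence still contains combining marks after NFC.

    NFC composes a base letter and its marks into a precomposed code
    point wherever Unicode defines one. Any combining mark left over
    has no precomposed form to fall back on, so the renderer has to
    place it on the base glyph (for example via GPOS mark positioning).
    """
    composed = unicodedata.normalize("NFC", sequence)
    return any(unicodedata.combining(ch) for ch in composed)


# e + combining acute: Unicode has a precomposed form (U+00E9).
E_ACUTE = "e\u0301"

# e + combining dot below + combining acute, as used in Yoruba:
# only e-with-dot-below (U+1EB9) is precomposed, the acute stays separate.
E_DOT_ACUTE = "e\u0323\u0301"

if __name__ == "__main__":
    for seq in (E_ACUTE, E_DOT_ACUTE):
        nfc = unicodedata.normalize("NFC", seq)
        print([f"U+{ord(c):04X}" for c in nfc], needs_mark_positioning(seq))
```

    For the first sequence a font can get away with a precomposed glyph; for the second, no precomposed code point exists, so the result depends on mark positioning in the font and the layout engine.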
    If only Google Fonts served from the CDN weren't stripped of GPOS... Google Chrome is actually monkey-patching this as it substitutes precomposed glyphs on the fly, even if the font doesn't. Firefox reveals the ugly truth (about both GF and Google Translate in this case):

  • We provide a list of the languages supported, on the theory that this will make more sense to end users.
  • I agree with the notion that it is tricky to give the long answer on the matter, because a common understanding of the underlying definitions hardly exists. Fontspring lists e.g. Arapaho, Cebuano, Gilbertese and Warlpiri for some of my fonts. How useful is that? I don’t know.
    To give a brief answer, I tend to label my fonts “complete Euro-Latin”. There is no official definition of Euro-Latin, but I hope it gives people a sensible clue that the font will work for all Romance, Germanic, Gaelic, Slavic, Finno-Ugric and Baltic as well as Turkic languages: nearly everything that is likely to occur in a European context. By that label I admittedly also imply coverage of the major languages of the Americas. Since these are basically English, French, Spanish and Portuguese, I usually don’t mention “American” explicitly, and I’m not aware that anyone does.
    Vietnamese, however (this corresponds to this earlier discussion), is another matter, as are Azeri (geographically Asian but linguistically a branch of the Euro-related Turkic complex) and the indigenous American languages, which require special attention in some respects.
    I know my view of this is somewhat Euro-centric. What I’m not convinced by is the (in my opinion) completely outdated labeling as “Western”, “Eastern” or “Central European”; I see no point in splitting the Latin realm by those categories.
    I also think that, for truly global mastering, more insight into concepts from Asia, Africa and the Americas would be welcome. The other discussion (see link above) is now 5 years old, and little seems to have moved since then.

