Clients often ask me what languages are covered by most professional fonts and I still have to tell them that there’s no simple answer. A good foundry clearly illustrates each font’s support, but a lack of a standard makes it difficult to compare between foundries.
There are Unicode standards and character encodings, but most font makers offer products that cover sets of languages somewhere between ASCII/Basic Latin, Latin Extended-A, and full Unicode.
Is it fair to say most font makers now use the labels “Basic Latin” and “Latin Extended” to describe their Latin-based character sets? How standardized are those sets? What is your favorite reference for those sets, one that explains to the average human which languages are covered?
Comments
Even if we quoted a figure for the percentage of all Latin-script languages covered (say, “YourFont: 90%”), a user would still want to know which languages those were.
Even better, all of my vendors display a dump of the character set and/or have a type tester that supports diacritical marks, so designers can look this up themselves. That is something they should be able to learn in about five minutes: it’s not hard to look up a language on Omniglot or Wikipedia and see what letters it uses. This would probably be a good topic for a Typographica article. World languages are fun stuff, and designers would probably be happy to know that one doesn’t need a degree in linguistics to look this stuff up.
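For anyone who would rather script that lookup than eyeball it, here is a minimal Python sketch. Everything in it is an assumption for illustration: `FONT_CHARSET` is a made-up font that covers only Basic Latin and the Latin-1 Supplement, and `EXTRA_LETTERS` holds hand-typed lowercase letters beyond a–z for two languages (check Omniglot or Wikipedia before trusting them for real work).

```python
import unicodedata

# Hypothetical font character set: printable Basic Latin plus Latin-1 Supplement.
FONT_CHARSET = set(range(0x20, 0x7F)) | set(range(0xA0, 0x100))

# Illustrative, hand-typed samples: lowercase letters beyond a-z only.
EXTRA_LETTERS = {
    "French": "àâæçéèêëîïôœùûüÿ",
    "Polish": "ąćęłńóśźż",
}


def missing_letters(letters, charset):
    """Return the letters whose code points are absent from charset, sorted."""
    # Normalize to NFC so each accented letter is a single code point
    # wherever Unicode provides a precomposed form.
    composed = unicodedata.normalize("NFC", letters)
    return sorted(ch for ch in set(composed) if ord(ch) not in charset)


def coverage_report(alphabets, charset):
    """Map each language name to the list of letters the charset lacks."""
    return {
        lang: missing_letters(letters, charset)
        for lang, letters in alphabets.items()
    }


if __name__ == "__main__":
    for lang, missing in coverage_report(EXTRA_LETTERS, FONT_CHARSET).items():
        status = "covered" if not missing else "missing " + " ".join(missing)
        print(f"{lang}: {status}")
```

A per-language list of missing letters is more useful to a customer than a single percentage, because it says exactly what would break. It only looks at code points, though, so it says nothing about how well the glyphs are drawn or positioned.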
http://adobe-type-tools.github.io/adobe-latin-charsets/adobe-latin-4.html
As James alluded to, it depends upon how one decides to define “language” and “coverage.”
The difference between a language and a dialect is a fuzzy one and different foundries will choose to draw the line in different places. (Or, more accurately perhaps, the sources they relied upon drew the lines in different places.)
Is Dalecarlian a dialect (or group of dialects) of Swedish or an independent language deserving of being listed separately? (I’ve seen it on at least one list, although I cannot find a source for the Latin alphabet required.)
Coverage can also be a fuzzy area. The Guaraní alphabet includes a g with a tilde over it. This character has no precomposed code point in Unicode; it can only be represented as a g followed by a combining tilde (U+0303). Most fonts do not include a gtilde glyph, while they often cover the rest of the diacritics required. So, do they cover Guaraní? Some will include this in their list, some will not. Does it depend upon having a combining tilde? Does it depend upon having a {mark} feature to place that tildecmb over the g or not?
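The g-with-tilde case can be shown with Python's standard `unicodedata` module. This is only a sketch of two possible "coverage" policies, not anyone's official definition: the `is_covered` helper and its `allow_decomposed` switch are hypothetical, and `LATIN1` is a made-up character set standing in for a font.

```python
import unicodedata

# Hypothetical font character set: printable Basic Latin plus Latin-1 Supplement.
LATIN1 = set(range(0x20, 0x7F)) | set(range(0xA0, 0x100))

COMBINING_TILDE = 0x0303
G_TILDE = "g\u0303"  # g + combining tilde; Unicode has no precomposed form
N_TILDE = "n\u0303"  # n + combining tilde; composes to U+00F1 under NFC


def is_covered(unit, charset, allow_decomposed=False):
    """Check one letter (possibly base + combining marks) against a charset.

    Strict policy: every code point of the NFC form must be present.
    Lenient policy (allow_decomposed=True): base letter plus combining
    marks from the NFD form are accepted as a fallback.
    """
    composed = unicodedata.normalize("NFC", unit)
    if all(ord(ch) in charset for ch in composed):
        return True
    if not allow_decomposed:
        return False
    decomposed = unicodedata.normalize("NFD", unit)
    return all(ord(ch) in charset for ch in decomposed)


if __name__ == "__main__":
    with_tilde = LATIN1 | {COMBINING_TILDE}
    print(len(unicodedata.normalize("NFC", N_TILDE)))  # one code point
    print(len(unicodedata.normalize("NFC", G_TILDE)))  # still two code points
    print(is_covered(G_TILDE, LATIN1))      # no combining tilde in the set
    print(is_covered(G_TILDE, with_tilde))  # g and U+0303 are both present
```

Even when the code points are all there, the result only says the font can encode the letter. Whether the tilde actually sits properly over the g depends on the font's mark positioning (or a dedicated glyph), which a code point check cannot see. That is one reason two foundries can publish different language lists for similar character sets.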
Jèrriais is the form of the Norman language spoken in Jersey, one of the Channel Islands off the coast of France. There are, perhaps, a couple thousand speakers. The alphabet does not require any diacritics beyond those used for French. Does Jèrriais need to be listed by a foundry as a covered language? Some do, some do not.
Beyond the usual suspects, the issue of language support gets murky and it may be tricky (if not impossible) to give a complete, exhaustive, and definitive list for any given set of codepoints.
Do browsers automatically (and reliably) build accented letters on the fly?