And U+0315 is useful for the Slovak 'caron' composites.
Ray Larabie said:
... I think this topic is very important. I imagine there are a lot of new type designers who are curious about extending Latin language coverage.
John, I’d never considered that one. Isn’t the Slovak preference for this form of háček not to look like a comma or apostrophe? You wouldn’t use U+0315 for actual decomposition, would you? Or am I taking the Unicode name too literally?
Thirty years ago it must have been very very difficult to become language-savvy in a big way.
As far as I know, U+0315 was encoded specifically for
compatibility with older encodings that used a postscript mark,
following typewriter practice, to write the Slovak Ľ ď ľ and ť.
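On the decomposition question, here is a minimal sketch using Python's standard `unicodedata` module. It prints the canonical decomposition of the four Slovak letters mentioned above, which uses U+030C COMBINING CARON rather than U+0315. The helper name `canonical_parts` is made up for this illustration.

```python
import unicodedata

def canonical_parts(ch):
    """Return (code point, name) pairs for the NFD form of a single character."""
    decomposed = unicodedata.normalize("NFD", ch)
    return [(f"U+{ord(c):04X}", unicodedata.name(c)) for c in decomposed]

# The four Slovak letters from the post above.
for ch in "Ľďľť":
    parts = " + ".join(f"{cp} {name}" for cp, name in canonical_parts(ch))
    print(f"{ch} (U+{ord(ch):04X}) -> {parts}")

# U+0315 is a separate combining mark with its own name.
print("U+0315", unicodedata.name("\u0315"))
```

So the comma-like shape in Ľ ď ľ ť is a glyph design matter; the encoded decomposition still points at the ordinary combining caron.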
The most interesting links were already posted here and in other threads, but I would like to add a few others. The first one, from Unicode, is quite important, although the overwhelming number of items makes navigation slow. The last one lists scripts along with their proposals to Unicode. These proposals are often the best documentation one can find on ancient alphabets, lesser-known languages, or rare characters.*
Unicode CLDR Locales
Script Encoding Initiative
As for Underware's infographics, they are impressive, but some of the data are wrong or at least dubious, as Frode already reported. I did not check exhaustively, but the usage information for Æ and ĸ is wrong.
* My order of confidence to know about scripts, languages and strange characters is: 1. To see if John Hudson posted something about it; 2. To find the proposal presented to Unicode; 3. To get information from a specialized site; 4. To see what Wikipedia says.
Now, Unicode's Gujarati page lists 85 characters. So what's up with the 82 character difference?
John Hudson said:
Now, Unicode's Gujarati page lists 85 characters. So what's up with the 82 character difference?
I presume Glyphs is reporting the number of glyphs it considers necessary for Gujarati support (based on its internal automation data, which might or might not match what a designer considers necessary for a given design), not the number of characters needed. So, for example, none of the 56 conjuncts reported would be encoded characters: they're ligatures.
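To check the character side of that count yourself, here is a rough sketch that counts the assigned code points in the Gujarati block (U+0A80..U+0AFF) using Python's `unicodedata`. It assumes the "Gujarati page" means the block's code chart. The result depends on the Unicode version bundled with your Python, so it may not match the 85 quoted above. The name `assigned_characters` is just for this example.

```python
import unicodedata

GUJARATI_BLOCK = range(0x0A80, 0x0B00)  # U+0A80..U+0AFF

def assigned_characters(block):
    """Return the code points in `block` that have a character name."""
    return [cp for cp in block if unicodedata.name(chr(cp), None) is not None]

assigned = assigned_characters(GUJARATI_BLOCK)
print(len(assigned), "encoded characters in Unicode", unicodedata.unidata_version)
```

This only counts encoded characters. Conjuncts and other ligatures never show up here, which is why a glyph count from a font editor will be much larger.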
Igor Freiberger said:
3. Quechua is reported as a younger brother of English, but it is a native South American language with no relation to English. The same goes for Indonesian and Albanian, which are also shown as brothers of English.
I've visited a lot of sites that go into Gujarati in some depth, but what glyphs are needed, isn't clear cut.
Now, unless I'm missing something, this means the current empty test font I've got loaded in Glyphs is missing everything Gujarati. Zero vowels, consonants, etc...
Kent Lew said:
I've visited a lot of sites that go into Gujarati in some depth, but what glyphs are needed, isn't clear cut.
Rich: you are going to run into that problem with most all of the Indic languages, and others as well. There is not a one-to-one relationship between the characters required to encode the language and the glyphs required or desirable to represent the language typographically.
The issue of conjuncts is not a sharply contained one. There may be grey areas regarding which are “required.” And there will be different design approaches to supporting them (pre-composed ligatures vs modular components). Thus a definitive list of glyphs may be elusive.
Because conjuncts are unencoded, and instead result from codepoint combinations and interactions, you may find it tricky to set up a one-size-fits-all investigation.
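As a small illustration of the character/glyph mismatch, the sketch below spells out one Gujarati conjunct, ક્ષ, as the code point sequence that encodes it. The choice of this conjunct and the helper name `describe` are mine, not from the thread. Whether a font draws the sequence as a single ligature glyph is a design decision that the text encoding does not record.

```python
import unicodedata

# KA + VIRAMA + SSA: the encoded sequence behind the conjunct.
conjunct = "\u0A95\u0ACD\u0AB7"

def describe(text):
    """Return (code point, name) pairs for each character in `text`."""
    return [(f"U+{ord(c):04X}", unicodedata.name(c)) for c in text]

for cp, name in describe(conjunct):
    print(cp, name)

print(len(conjunct), "code points, no precomposed character for the conjunct")
```

A font that supports this conjunct typically maps the three-character sequence to a glyph through OpenType substitution, so the glyph has no code point of its own.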
John Hudson said:
Rich: http://www.unicode.org/versions/Unicode8.0.0/ch12.pdf
Read the entire Devanagari section before reading the Gujarati section, as it provides the archetype for Indic script encoding and processing.
The table of Gujarati conjuncts is representative, rather than exhaustive.