What is your standard Latin character set?

How do you decide about your default character set for Latin typefaces?
Do you include Vietnamese?
Do you care for special American or Eskimo (non-English/French/Spanish/Portuguese) characters, currency signs, etc.?
What about special African characters/alphabets? Why are they catered for so little?
Are you supplying ə (Azeri) or other special characters? What is your standard set of languages to be furnished?


  • James Puckett Posts: 1,946
    How do you decide about your default character set for Latin typefaces?

    I took the old Fontlab on Steroids set Tiro developed and stripped out the Esperanto characters. Then I did a lot more research to determine what languages were covered.

    Do you include Vietnamese?


    Do you care for special American or Eskimo…

    Regarding American Indian and Eskimo languages, no. Regarding currency, I support the new Turkish Lira sign because Turks buy Latin fonts and the new Indian Rupee sign because I enjoy drawing it. Otherwise I ignore most stuff that isn’t already in my list. 

    What about special African characters/alphabets? Why are they catered for so little?

    Not all African languages have a fixed orthography, so some languages are a mess I’m not going to wade into. Others are used by people whose economic circumstances make it unlikely they’ll ever buy (or even have a use for) any of my fonts. That said, I am considering dropping ð and þ from my fonts so I can use that time designing a few extra letters with diacritical marks that could support more Africans than Iceland has residents. I should note that I regularly sell fonts to Japanese buyers, whose native writing system is different, but never to Icelandic designers.

    Are you supplying ə (Azeri) or other special characters?


    What is your standard set of languages to be furnished?
    What is your standard Latin character set?

    See pages six and sixteen of this PDF: http://www.dunwichtype.com/pdfs/Ironstrike_Specimen.pdf

  • Keep in mind that the buyers of your font are not necessarily the same as its readers.
  • > the buyers of your font are not necessarily the same as its readers

    what do you mean by that?
  • Stephen Coles Posts: 985
    edited March 2015
    James was implying that his fonts aren't bought in Iceland so there isn't a demand for Icelandic characters. What matters is whether your buyers are setting type that will be read by those outside the common Western European languages. The largest licenses often go to large corporations for whom broad language support is vital.

    A note about Vietnamese: support is surprisingly rare among professional fonts so offering Vietnamese in your standard set can be a differentiator.
  • Chris Lozos Posts: 1,456
    "Vietnamese: support is surprisingly rare among professional fonts"
    Not so surprisingly. The potential return for the amount of work is not high.  Icelandic is a no-brainer yes.  Not a huge undertaking so you need far less of a return than say Vietnamese.
  • Mark Simonson Posts: 1,594
    It’s not uncommon to see Vietnamese characters in use in the US. There are several Phở restaurants not far from where I live in Saint Paul.
  • Phở is amazing, but there's probably one problem: the typography is Arial Unicode. Therefore, I would most definitely provide Vietnamese. Notice that the ở is most likely in a different font. Check this out: hởh. As for the character set, it depends. If I am in a particularly good mood, I'll leave my (relatively) comfortable Basic Latin for some accents. Maybe it'll be fun to get those funky accents positioned. My standard is probably fairly low: maybe some accents and some basic things like /©/™/®.
  • Russell McGorman Posts: 255
    edited March 2015
    I'm convinced -- he said with resignation. 

    I find the thought of adding Vietnamese language support intimidating, but now, when I see the grossly doctored signs in front of Vietnamese restaurants, I feel sad.

  • I really wonder about providing Vietnamese by default. Why shouldn’t we? The amount of extra work is moderate; it’s all just setting up composites, plus the dong sign.
    Vietnamese is the one big separate Latin character set outside the Atlantic world, if I’m not missing something.
    On the other hand, it seems a bit queer to support Icelandic and Lule Sami but not Vietnamese.
    The other question that occurred to me was about languages of the Americas (apart from English/French/Spanish/Portuguese). Are languages like Navajo or Guaraní worth supporting? Is there any ‘good practice’ on that?
  • Thomas Phinney Posts: 2,584
    I do Adobe Latin 3 as my base character set (331 characters). I'm fairly easily convinced to do Adobe Latin 4 (619).

    I'll note that the difference between AL-4 and AL-4 without Vietnamese is still about 90 characters. I have yet to add Vietnamese support to the family I've been working on most recently.
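
For a rough sense of what Vietnamese adds, here is a minimal Python sketch. It assumes (my assumption, not Adobe's definition of any character set) that the precomposed letters at U+1EA0 to U+1EF9 make up the bulk of the Vietnamese-specific additions. It uses only the standard `unicodedata` module to break each letter into a base plus combining marks, which is roughly how the composites would be assembled in a font editor. The names `VIETNAMESE_RANGE`, `components` and `marks_needed` are made up for illustration.

```python
import unicodedata

# Assumption: U+1EA0..U+1EF9 holds the precomposed letters used mainly for Vietnamese.
VIETNAMESE_RANGE = range(0x1EA0, 0x1EF9 + 1)


def components(char):
    """Return (base letter, list of combining marks) from the canonical decomposition."""
    decomposed = unicodedata.normalize("NFD", char)
    return decomposed[0], [f"U+{ord(m):04X}" for m in decomposed[1:]]


def marks_needed(codepoints):
    """Collect every combining mark that appears in the decompositions."""
    marks = set()
    for cp in codepoints:
        marks.update(components(chr(cp))[1])
    return sorted(marks)


if __name__ == "__main__":
    print(len(VIETNAMESE_RANGE), "precomposed letters in the range")
    print("o with horn and hook above =", components("\u1edf"))
    print("marks to draw and position:", marks_needed(VIETNAMESE_RANGE))
```

The point of the sketch is that the glyph count is large while the number of distinct marks is small, so most of the work is composition and positioning rather than new drawing.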


  • I have recently been working with Omnibus-Type expanding its linguistic coverage for languages that use the Latin writing system, including Vietnamese, Croatian and Guaraní. The standard Latin set that we have adopted covers the 105 languages that OTMaster lists as languages that require Latin letters. It also includes the Underware Latin Plus diacritics coverage and the letters necessary for Pinyin transliteration. We also support the Polish kreska and a variation of the letter Y for when it's used as a vowel in Guaraní. I think that this could be a lot of work, but you never know who might be interested in your font, and non-covered niches could present a business opportunity. Supporting Guaraní causes a problem with «G tilde», because this character doesn't have a codepoint. Nevertheless, with the OpenType mark positioning and mark-to-mark features you can not only solve this issue, but also place arbitrary diacritical marks over the characters without needing to draw all the glyphs.

  • Vietnamese

    The character set I created is not a good yardstick, as it tries to support all Latin-based alphabets. But for any "pro" font I consider Vietnamese support a mandatory feature. The language has 90 million speakers and a long written tradition. It certainly demands more work, but with components it is not a huge task. IMHO, it is completely inconsistent to build a font with Sami or old Greenlandic characters while omitting Vietnamese.

    Native South and Central American Languages

    As a general rule, any character set supporting Spanish and Portuguese will also support almost all native South and Central American languages. They were not written before European colonization, so the Spanish/Portuguese alphabet was used to record them. Later research also introduced other diacritical combinations, but again all of them are already present in fonts supporting Western European languages.

    There is an issue with the glottal stop, marked with an apostrophe and very common in those languages. In recent years scholars have pointed out that the original sign was not an apostrophe but a similar entity called "saltillo". This character was encoded in Latin Extended-D (U+A78B) not long ago. So, for fully correct support of these languages, the saltillo pair (UC/lc) is needed. Its usage is especially important for linguists working with Mayan languages, but is likely to grow as fonts offer the character.

    Of all those languages, Guarani and Quechua are the most relevant. Guarani is official and widely spoken in Paraguay. Quechua is official in Bolivia and Peru, with about 10 million speakers in a vast region from Argentina to Colombia. Deeper support for native Brazilian languages needs special kerning for pairs of diacritical combinations like ïï and îî, but those are for very specific languages.
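
One way to see why the saltillo is handled as a letter rather than as punctuation is to compare its Unicode properties with those of the apostrophes it often gets replaced by. This is a minimal sketch using Python's standard `unicodedata` module; the helper name `describe` and the labels in `CANDIDATES` are only for illustration.

```python
import unicodedata

CANDIDATES = {
    "typewriter apostrophe": "\u0027",
    "right single quote": "\u2019",
    "capital saltillo": "\ua78b",
    "small saltillo": "\ua78c",
}


def describe(char):
    """Return (code point, Unicode name, general category) for one character."""
    return f"U+{ord(char):04X}", unicodedata.name(char), unicodedata.category(char)


if __name__ == "__main__":
    for label, char in CANDIDATES.items():
        print(label, describe(char))
    # The saltillo has a real case pair, which punctuation does not.
    print("\ua78b".lower() == "\ua78c")
```

The two apostrophes come out in punctuation categories, while the saltillo pair comes out as an uppercase and a lowercase letter, which matches the UC/lc pair mentioned above.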
  • Nicolas, AFAIK G̃ is not part of the Guarani alphabet, nor is a different y needed to show it as a vowel. It would be very nice to know which references you used to adopt those criteria.
  • Hi Igor, 

    Thanks for sending this information about the use of typographical characters in the indigenous languages of Latin America and for asking me about including the G̃ in the Guaraní set and an alternate shape of the “Y” to distinguish its use as a consonant or a vowel.

    First of all I would like to apologize for not clarifying that I am not a Guaraní speaker nor a Guaraní linguistic expert. I want to let you know that I am just somebody who tries to learn something new everyday, and I am very happy when people help me correct the knowledge or ideas that I have that could be wrong.

    My introduction to the type needs posed by Guaraní was through colleagues from Paraguay, Brazil, Argentina and Uruguay who have been working over the last 20 years in order to recover and support the typographic identity of Guaraní. Juan Heilborn, Rubén Fontana, and Pablo Cosgaya are some of the people that have influenced my point of view with respect to the character set to support Guaraní.

    Having a distinctive visual feature for a letter that is used for two different phonemes in a specific language isn’t a new idea. The reason for this is to stay consistent with the premise of “one character to represent each phoneme”. We have seen Paul Renner’s sketches of the German digraphs “ch” and “ck” for Futura (Burke, 2000), the Spanish digraphs “ch”, “ll” and “rr” from Andralis (Fontana, 2007) and the Guaraní digraphs “ch”, “mb”, “nd”, “ng”, and “rt”, designed by Juan Heilborn (a Paraguayan designer), as ligatures in order to highlight the fact that these two letters are a linguistic unit. Those examples have been design decisions based on linguistic and historical arguments in order to give a specific language a unique identity in a typeface. I can’t speak on behalf of Omnibus-Type, but having worked with its members and closely on the font production workflow, I think that the Omnibus-Type team has been introducing an alternative Guaraní localized shape of “Y” in its fonts in order to support the efforts of the Paraguayan typeface design community to emphasize the different use [in Guaraní] of this character in comparison with Portuguese and Spanish. Something like the difference between an acute and a kreska, or an inverted circumflex and a háček.

    With respect to the inclusion of the “G̃”, this character is actually part of the official Guaraní alphabet, standardized in 1950 at the Guarani Language Congress in Montevideo and recognized by the Ministry of Education of Paraguay, the Linguistic Guaraní Institute of Paraguay and Guaraní’s Language and Culture Athenaeum. You can also check the following webpage [in Spanish] that mentions the disagreement about the existence of the “G̃”.

    I am writing a letter to the new Academy of the Guaraní Language (Ava Ñe'e Rerekuá Pave) in order to ask if they consider the “G̃” to be part of the Guaraní alphabet.

    Nevertheless, my position as a typeface designer is not to take sides in a linguistic debate. I think that as a typeface designer/font developer, my duty is to support as many glyphs as possible. This could take a few more hours, but once the font is published, there might be a few people who feel grateful to have a font that supports glyphs that are not common in other fonts.

    Thanks, Igor, for making me question the inclusion of the  “G̃” in the Guaraní set, and inspiring me to write about that topic here. In that way we can learn from each other and find a way to come to an agreement on this issue in the typeface design community.

    Here are other sources that I consulted:

  • Nicolas,

    Thank you VERY much for sharing your deep research on this issue. I really appreciate learning about the G̃ debate, and I completely agree with you about the role of type designers in these questions.

    The sources I consulted were studies published by SIL and by the Linguistic Lab of UnB (Universidade de Brasilia). One paper even analyses just the nasalization in Guarani, where I discovered some kerning challenges like pĩtã and ãmõcĩĩ. But none of them includes G̃/g̃.

    I suppose this is related not only to a linguistic split, but also to differences between Paraguayan Guarani and the Mbyá variant which prevails in Brazil. The difficulty of writing G̃/g̃ may also have inhibited its adoption (even the Abecedario page you linked uses Ĝ/ĝ instead of G̃/g̃, probably due to lack of OT support). For a fuller picture, I should have read more sources about the Paraguayan side.

    Regarding the y, I now understand it is not a linguistic need but a design choice based on linguistic criteria. Thanks for clarifying.
  • You don't need a Gtilde glyph. There is no Unicode code point for it, so the standard encoding is a G followed by a tildecomb. All you need to do is add a mark feature.
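
The "no code point" claim can be checked with Unicode normalization. This is a minimal sketch using Python's standard `unicodedata` module; `has_precomposed` is a made-up helper, and the vowel list is my own assumption about which nasal letters matter for Guaraní, not something taken from an official alphabet chart.

```python
import unicodedata

TILDE = "\u0303"  # COMBINING TILDE


def has_precomposed(base, mark=TILDE):
    """True if Unicode normalization (NFC) can merge base + mark into one code point."""
    return len(unicodedata.normalize("NFC", base + mark)) == 1


if __name__ == "__main__":
    # Assumed list of letters that take a tilde; g is the odd one out.
    for base in "aeiouyg":
        for letter in (base, base.upper()):
            status = "precomposed" if has_precomposed(letter) else "needs base + combining mark"
            print(letter + TILDE, status)
```

Any pair that fails this check can only be typed as a sequence, so how it looks depends on the font's mark positioning (or on an unencoded composite glyph substituted by the font).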
  • Gentlemen, thank you very much for providing such excellent reading on the matter.

    I think it would also be sensible to cater for Azeri. There are about 26M native speakers. They have used the Latin alphabet officially since the 1990s and they need one particular extra character, the schwa: Ə ə (U+018F, U+0259). Enjoy drawing it!
    It seems that font producers have not yet generally noticed that this character is a regular part of the modern Latin writing system; however, I think it belongs in any character set that also supports e.g. Turkish.
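
A quick way to audit a character set against the extras named in this thread is a small coverage check. The sketch below is plain Python with no dependencies. `REQUIRED` is a hypothetical table holding only the special characters mentioned here (not complete alphabets), and `supported` is a stand-in for the set of code points a font actually maps.

```python
# Hypothetical requirement lists: only the extra characters named in this thread,
# not complete alphabets.
REQUIRED = {
    "Azeri schwa": "\u018f\u0259",
    "Icelandic eth and thorn": "\u00d0\u00f0\u00de\u00fe",
    "Saltillo": "\ua78b\ua78c",
    "Currency (dong, rupee, lira)": "\u20ab\u20b9\u20ba",
}


def missing_characters(supported, required=REQUIRED):
    """Map each group to the characters that are absent from `supported`."""
    report = {}
    for group, chars in required.items():
        missing = [f"U+{ord(c):04X}" for c in chars if ord(c) not in supported]
        if missing:
            report[group] = missing
    return report


if __name__ == "__main__":
    # Stand-in for a font's character map: Basic Latin plus Latin-1 Supplement.
    supported = set(range(0x20, 0x7F)) | set(range(0xA0, 0x100))
    for group, missing in missing_characters(supported).items():
        print(group, "->", ", ".join(missing))
```

With a real font, the `supported` set could instead be built from the keys of the font's cmap, for example via fontTools' `TTFont(path).getBestCmap()`.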


  • Chris Lozos Posts: 1,456
    I usually include the Schwa as well.
  • For Navajo there seems to be another diacritics issue:

    hááhgóóshį́į́   –  see Wiki

    any insights?
  • Andreas,

    Besides the usual resources needed to implement such stacked diacritics (combining diacritics, cmap entries for them, and the mark and mkmk OT features), there are two tricky issues with Native North American languages.

    1. Ogonek is widely used to indicate nasalization, similar to the tilde in Central and South American languages. It appears in Navajo, Apache, Diné Bizaad and several languages from the Pacific Coast and Yukon region. But, contrary to Polish and Lithuanian usage, the preferred position of the ogonek for a is at the bottom center.

    2. Macron below and lowline are used in Northwest languages to denote the strongest syllable. As both nasalization and accent can fall on the same vowel, you often have stacked diacritics below. Not only is OT support for this almost non-existent, but you also need more room below the baseline to properly position the macron/lowline with the ogonek.

    There was some discussion about this theme on Typophile some years ago.
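
To find which letters in a text actually depend on stacked or combining marks, one can normalize to NFC and list the clusters that still contain combining characters. This is a minimal sketch using Python's standard `unicodedata` module; `clusters_needing_marks` is a made-up helper, and its grouping is deliberately simplistic (a base followed by its combining marks), not full grapheme segmentation.

```python
import unicodedata


def clusters_needing_marks(text):
    """Return base+mark clusters that still contain combining marks after NFC."""
    clusters = []
    for char in unicodedata.normalize("NFC", text):
        if unicodedata.combining(char) and clusters:
            clusters[-1] += char  # attach the mark to the preceding base
        else:
            clusters.append(char)
    return [c for c in clusters if len(c) > 1]


if __name__ == "__main__":
    # The Navajo word quoted earlier in the thread, spelled out with escapes:
    # i with ogonek (U+012F) exists precomposed, but the acute on top does not.
    word = "h\u00e1\u00e1hg\u00f3\u00f3sh\u012f\u0301\u012f\u0301"
    for cluster in clusters_needing_marks(word):
        print([f"U+{ord(c):04X}" for c in cluster])
```

Clusters reported this way are the ones whose appearance depends on the font's mark and mkmk positioning, or on extra unencoded composite glyphs.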
  • What do you mean 'OT support for this is almost non existent'?
  • John Hudson Posts: 2,818
    But, contrary to Polish and Lithuanian usage, the preferable position of ogonek for a is at the bottom center.

    Isn't that just an outcome of mechanical typewriter implementations, which overstruck the ogonek on the width of the preceding letter?
  • Igor Freiberger Posts: 239
    edited April 2015

    I mean that the number of fonts which include combining diacritics and the OT code for them is almost nonexistent.

    When I did my research (2010-2012), the only quality fonts with such support were the ones released by SIL, Language Geek and Linguist's Software. The others available were low-quality adaptations of Times, Arial or Courier, barely suitable for professional use. And the ogonek at the bottom middle was present only in a special adaptation made for a Yukon NGO.

    Later Brill and Huronia joined the quality group. This has probably improved a bit by now with newer OSes, Reading's influence and other releases. But minority non-European languages using Latin script are still far from the mainstream focus.
  • John,

    Maybe this explains the origin of the preference. Linguists I contacted in Canada and the USA said the bottom middle should be used, if available. Specialized adaptations of digital fonts were made this way, indicating a choice made already in the OT era. For me, this is enough to include alternate glyphs/combinations for those languages.
  • John Hudson Posts: 2,818
    I go back and forth on this issue, Igor. Sometimes I think, okay this is an established community preference, regardless of its origins, and we should support minority cultural preferences. But I also think that divergent conventions that arise out of technical limitations should be deprecated when those limitations do not apply. In the case of invented orthographies for aboriginal languages, I'm more inclined to the first position, on the grounds that the technologically determined convention isn't diverging from a pre-existing practice within that culture (unlike, say, over-use of half forms in Hindi). In the case of the ogonek, though, mid-attachment is certainly a divergence from the normative e caudata.
  • I like your objective and technical criteria. The problem I see is how to identify which divergent conventions could be deprecated. Many variants in minority alphabets are results of technical limitations. When does a variant remain a reversible adaptation, and when does it become part of a culture? When does divergence from some norm become a new norm?

    I don't know anything about the typewriters used for Native American languages. It would be interesting to know which solutions were adopted, since an ogonek dead key attached at the right does not fit /O/o/e/, while one in the middle departs from the European standard and also fails to connect on /A/. This would also be an issue for Polish and Lithuanian.
  • John Hudson Posts: 2,818
    I've not looked at a lot of native American original materials, but have looked at a lot of early publishing in African orthographies with similar development (invented by linguists, often incorporating IPA letters, reproduced from typescript). The typescript is typically monospaced, and produced on manual typewriters (the linguists are often working in areas without electricity; portable typewriters were a key tool; sometimes, but not always, the typewriters have been modified to incorporate custom characters; sometimes the typescripts are augmented with manuscript interventions, e.g. a hook drawn on an n to indicate eng). In African orthographies, one sees a number of features that seem to derive from this practice. One is central positioning of all accent marks, which are input by typing the letter, backspacing and then typing the mark. Another, related, is a significant use of lowercase only text (marks could typically not be raised above the cap height without manually adjusting the page advance). And one also sees unusual selection of marks, such as cedilla to indicate nasalisation, which I suspect is due to whatever the linguist had available on his or her typewriter. 