Vietnamese with combined diacritical marks


I'm adding Vietnamese language support for a client's existing Latin typeface.
I remember vaguely, not being an expert in Asian languages, that Vietnamese writing would contain combined diacritical marks that would stack atop or below letters.
Unfortunately, I'm *so* not finding any information on that on the internet that I'm writing a forum post :)
After some research, it appears to me that the Vietnamese alphabet consists of the following encoded characters, and that mark positioning isn't even necessary: aAàÀảẢãÃáÁạẠăĂằẰẳẲẵẴắẮặẶâÂầẦẩẨẫẪấẤậẬbBcCdDđĐeEèÈẻẺẽẼéÉẹẸêÊềỀểỂễỄếẾệỆ

Or, alternatively, which Latin-based languages do you know of that use stacks of diacritical marks?
I'm looking for information as well as samples of encoded strings to use for designing and testing.
At the moment I believe that my mind is playing tricks on me.

Thank you.


  • Though they look like combined marks, the letters that appear to carry two marks are really each a letter with a single mark, e.g., ề is really the Vietnamese letter ê with a grave tone mark.

    Here are some possible references to look at:

    Hook Above:

    I thought I had some other references lying around. When I run across them, I'll try to remember to post them. Right now, though, I have to get ready for my dance troupe's auditions tonight.

    I've been aiming for both precomposed glyphs and mark/mkmk positioning, in general.
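
The "letter plus one tone mark" reading can be checked against Unicode's own decomposition data. Here is a minimal Python sketch using the standard `unicodedata` module; the helper names `one_step` and `names` are made up for illustration.

```python
import unicodedata

def one_step(ch):
    """Return the single-level canonical decomposition of ch as a list of characters.

    Assumes ch has a canonical (untagged) decomposition, as the Vietnamese
    precomposed vowels do.
    """
    codes = unicodedata.decomposition(ch).split()
    return [chr(int(code, 16)) for code in codes]

def names(chars):
    """Return the Unicode character names for each character in chars."""
    return [unicodedata.name(c) for c in chars]

e_circ_grave = "\u1ec1"  # ề

# One level down: the letter ê plus a combining grave.
print(names(one_step(e_circ_grave)))

# Fully decomposed (NFD): plain e plus two combining marks.
print(names(unicodedata.normalize("NFD", e_circ_grave)))
```

The first call shows the letter ê with a single tone mark; only the full NFD form splits the circumflex off as a second combining mark.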
  • yanone
    Thanks for the resources, though I ran across all of them yesterday.
    So it's really only a fixed set of encoded characters.

    However, here's the answer to my question:
    Input methods that involve typing the vowel and the diacritics separately. But contrary to my memory, the results aren't infinitely stackable diacritics, only that exact same fixed set of characters, just with separate keyboard input and mark/mkmk positioning.

    My memories were probably influenced by things like Glitchr, which (mis)uses mark/mkmk positioning.
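
For test strings, the fixed set can be generated rather than typed. The sketch below assumes the usual inventory of 12 Vietnamese vowel letters and 5 written tone marks; it composes each pair with NFC and relies on every combination having a precomposed code point. The names `BASE_VOWELS`, `TONE_MARKS` and `toned_vowels` are illustrative only.

```python
import unicodedata

# a, ă, â, e, ê, i, o, ô, ơ, u, ư, y (written with escapes to avoid encoding surprises)
BASE_VOWELS = "a\u0103\u00e2e\u00eaio\u00f4\u01a1u\u01b0y"

# The five written tone marks as combining characters.
TONE_MARKS = {
    "grave": "\u0300",
    "hook above": "\u0309",
    "tilde": "\u0303",
    "acute": "\u0301",
    "dot below": "\u0323",
}

def toned_vowels(uppercase=False):
    """Compose every vowel with every tone mark and return the NFC results."""
    result = []
    for base in BASE_VOWELS:
        if uppercase:
            base = base.upper()
        for mark in TONE_MARKS.values():
            result.append(unicodedata.normalize("NFC", base + mark))
    return result

lower = toned_vowels()
upper = toned_vowels(uppercase=True)

# Precomposed form for ordinary text, decomposed form for testing {ccmp} or mark/mkmk.
nfc_sample = " ".join(lower)
nfd_sample = unicodedata.normalize("NFD", nfc_sample)
print(nfc_sample)
print(len(lower), "lowercase and", len(upper), "uppercase toned vowels")
```

Every composed result is a single code point, which matches the "fixed set of encoded characters" conclusion. Add the unmarked vowels and đ/Đ separately.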

  • John Hudson
    Vietnamese text almost always uses the precomposed diacritics, and even if the text is decomposed -- e.g. Unicode normalisation form D -- it is easiest to represent with the precomposed glyphs via the {ccmp} feature, rather than trying to use GPOS. The reason for this is that the conventions of Vietnamese mark positioning are not a linear stack, and the nested positions generally benefit from some modifications to the size and weight of the marks in combination.

    Input methods are usually deadkey mechanisms, so the separate letters and marks keyed do not correspond to separate letter and mark characters stored in text. Instead, the key combinations are mapped to the precomposed diacritic characters.

    If you are making a font in which you want to support both Vietnamese diacritics and also generic mark and mkmk GPOS, you'll want to precompose mark combinations in {ccmp} for the Vietnamese language system tag, but decompose the combinations in {ccmp} for default script processing.
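
The precomposing half of that setup could be sketched roughly as follows. This Python snippet writes feature-file-style {ccmp} rules for the Vietnamese language system from Unicode's NFD data. Several things here are assumptions and not anything stated in the thread: the `uniXXXX` glyph names (a real font may name its glyphs differently), the helper names, and the exact layout of the block, which has not been run through a feature compiler. It handles only fully decomposed input; partially composed sequences such as ê followed by a combining grave would need their own rules, and so would the decomposing rules for default script processing.

```python
import unicodedata

def glyph(ch):
    """Hypothetical glyph name in uniXXXX style for a single character."""
    return f"uni{ord(ch):04X}"

def precompose_rules(chars):
    """One ligature-style substitution per character: NFD sequence -> precomposed glyph."""
    rules = []
    for ch in chars:
        parts = unicodedata.normalize("NFD", ch)
        if len(parts) > 1:  # skip characters that do not decompose
            source = " ".join(glyph(c) for c in parts)
            rules.append(f"sub {source} by {glyph(ch)};")
    return rules

def vietnamese_ccmp(chars):
    """Wrap the rules in a ccmp block scoped to the Vietnamese language system."""
    lines = ["feature ccmp {", "    script latn;", "    language VIT;"]
    lines += ["    " + rule for rule in precompose_rules(chars)]
    lines.append("} ccmp;")
    return "\n".join(lines)

# ề and ệ as a small sample; note that NFD puts the dot below before the circumflex.
print(vietnamese_ccmp("\u1ec1\u1ec7"))
```

The source sequences follow canonical ordering, so the rule for ệ expects the dot below before the circumflex, which is how normalised decomposed text stores it.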
  • To add to John’s comment, if you look at Gentium, you’ll see that something like “ề” has a Vietnamese variant and a variant in which the accents “stack”. The “stacked” variant is mapped to Unicode while, I assume, the Vietnamese variant is activated when you switch the language in your document.
  • John Hudson
    There is, by the way, legitimate variation in placement of the grave tone mark relative to the circumflex modifier in Vietnamese. In some fonts it is placed to the upper left, and in some fonts it is placed to the upper right, following the positioning of acute and hook. I've looked into practices in Vietnam, and have come to the conclusion that either is acceptable.
  • Hi Yanone, 'twas good to see your talk @ ATypI, thanks!

    You inquired about other Latin-based alphabets stacking diacritics. One example is Livonian, which employs a wider array of diacritics than the neighbouring Finno-Ugric languages, Estonian and Finnish, which avoid stacking by reduplication of long vowels. The Livonian orthography, unfortunately, followed the Latvian standard of marking long vowels with a macron, thus yielding the dubious skyscrapers: Adieresismacron, Odotaccentmacron, Otildemacron and their respective lowercase relatives.
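
For encoded test strings, those three glyph names appear to correspond to precomposed Unicode characters. In the Python sketch below, the code points are my reading of the glyph names and not something given in the thread, and `LIVONIAN_STACKS` and `test_pairs` are illustrative names. It lists each character in precomposed and fully decomposed form.

```python
import unicodedata

# Assumed mapping from the glyph names above to uppercase code points.
LIVONIAN_STACKS = {
    "Adieresismacron": "\u01de",   # A with diaeresis and macron
    "Odotaccentmacron": "\u0230",  # O with dot above and macron
    "Otildemacron": "\u022c",      # O with tilde and macron
}

def test_pairs():
    """Return (precomposed, decomposed) pairs for upper- and lowercase forms."""
    pairs = []
    for upper in LIVONIAN_STACKS.values():
        for ch in (upper, upper.lower()):
            pairs.append((ch, unicodedata.normalize("NFD", ch)))
    return pairs

for composed, decomposed in test_pairs():
    codes = " ".join(f"U+{ord(c):04X}" for c in decomposed)
    print(composed, unicodedata.name(composed), "=", codes)
```

Each decomposed form is a base letter followed by two combining marks above, so these make handy inputs for checking mkmk stacking.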

  • John Hudson
    Standard Latin transcriptions of some Indian languages also involve vertical stacked marks, such as combinations of macron with acute above, indicating length and stress.
  • John: Where did you see grave on the left? Or rather how often, relatively speaking, did you see it in Vietnamese publications?
  • John Hudson
    I've mostly looked at Vietnamese signage, including hand-painted, as I don't have access to a lot of Vietnamese print publishing. In signage I've seen the grave on the left probably about as often as I've seen it on the right. Hence I conclude that there's legitimate variation, at least in terms of what people read and accept. BTW, I've also seen acute on the left, but that was only in one sign.
  • Yes, acute on the left can be seen in Alexandre de Rhodes’ dictionary or in some Deberny & Peignot types, but it is rather rare today.