Exciting News for Mongolian Script Type Design (Hopefully…)

Jacob Casal · June 2019

I have followed Unicode support for the Mongolian script for some time. Needless to say, it has had a rough road, but it has made some great steps forward in recent years. Most recently, I have been reading through the documents from Mongolian Working Group Meeting #3, held in April of this year. Some robust proposals (for both minimal and major changes) have been made to standardize the OpenType coding, to create clear character definitions for designers, and to make the script easy and accurate for users to use (the last perhaps most important). At the very least, we can look forward to thorough Unicode Technical Notes on designing for it, which will greatly help interoperability among fonts.

I bring this up here both to note this hopeful step forward for Mongolians using the script and out of curiosity to hear from those more experienced about revisions made to Unicode blocks. With the sort of “set in stone” attitude Unicode seems to take toward changes to code charts (a la legacy characters), I wonder how difficult it will be to refine the Mongolian Script block, even after minor changes bring it to a more unified, stable form. The major changes would be quite a hassle to make, I imagine.

John Savard · June 2019

Unicode, of course, isn't just used for printing and display. It's also used for communications, like ASCII, on which it is based.

Of course, ASCII wasn't always set in stone. Or, rather, ASCII was merely one version of an international standard, International Alphabet No. 5 (aka ISO 646), so other countries could replace some of the less-used characters that occupied "national use" positions of the code.

Then a shift to an 8-bit code took place, and there were several codes under the heading of ISO 8859, where the first half of the table was fixed to correspond to ASCII, but there were several alternative high halves.

Of course, this was necessary. Neither 7 bits nor 8 bits would be enough to specify all the characters anyone might need.

Unicode started out life as a 16-bit code, but was quickly extended to a 31-bit code space (since capped at code points up to U+10FFFF). As that is enough, at least one might hope, to cover all the characters anyone might need, it's no longer necessary to replace parts of the code depending on the language or the application. That's one of the great advantages of, and reasons for, Unicode. A certain binary code, if assigned to a particular character, will always represent that character and no other, so an electronic text file in Unicode won't ever confuse people because they didn't realize it was prepared for an older version. The meaning of binary bits is now as permanent as the meaning of ink on paper.

So they have a reason for wanting the code to be "set in stone"; it's not an ego thing.
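
To make the "alternative high halves" point concrete, here is a minimal Python sketch. It assumes only the standard library's ISO 8859 codecs; the helper name `decode_with` is made up for illustration. The same byte is read three different ways depending on which part of ISO 8859 the reader assumes, while the ASCII half stays fixed.

```python
# One byte from the "high half" of an 8-bit code.
raw = bytes([0xE4])

def decode_with(codecs, data):
    """Return {codec name: decoded text} for the same bytes (hypothetical helper)."""
    return {name: data.decode(name) for name in codecs}

# Latin-1, Cyrillic, and Greek parts of ISO 8859.
readings = decode_with(["iso8859_1", "iso8859_5", "iso8859_7"], raw)

for name, text in readings.items():
    # Show which Unicode code point each reading corresponds to.
    print(f"{name}: U+{ord(text):04X} {text}")

# The low (ASCII) half is shared, so 0x41 is "A" under all three.
low_half = bytes([0x41])
shared = {name: low_half.decode(name) for name in readings}
print(shared)
```

In Unicode each of those three letters has its own code point, so no out-of-band knowledge about "which table" is needed to read the text back.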

Jacob Casal · June 2019

That was poor wording on my part, sorry, I didn’t mean it as an ego thing either. I just get a little excited about Mongolian (and I cannot neglect Manchu, Todo, Sibe, the Ali Gali extensions, etc. in this as well), and really want their script to flourish in the digital space more than it does now.

I actually had not heard about the 31-bit code, only the 16-bit one. Unicode is great in the grander scheme of support, without a doubt. Though the lack of confusion across versions holds up, the problem for Mongolian Unicode, unfortunately, is the confusion its encoding has caused within each version. One example mentioned in the MWG#3 documents that cannot be quickly refined: the letter Ga has the separate letter Ge as a variant under it in the specs (the same happens with the letters Qa and Ke). Another good in-depth look comes from Bolorsoft: The Aesthetics of Mongolian Script in 800 years.
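
For anyone unfamiliar with how variants are stored, here is a rough Python sketch using the standard `unicodedata` module. The helper `describe` is a made-up name, and the sketch only shows the stored code points; which glyph a letter plus a Free Variation Selector (FVS) actually produces is up to the font and shaping engine, which is exactly where the disagreement lies.

```python
import unicodedata

def describe(text):
    """List (code point, character name) pairs for a string (hypothetical helper)."""
    return [(f"U+{ord(ch):04X}", unicodedata.name(ch, "<unnamed>")) for ch in text]

ga = "\u182D"    # MONGOLIAN LETTER GA
fvs1 = "\u180B"  # MONGOLIAN FREE VARIATION SELECTOR ONE

# A variant form is requested by following the letter with an FVS.
sequence = ga + fvs1

for code_point, name in describe(sequence):
    print(code_point, name)

# The stored text is two code points, even though it displays as one letter form.
print(len(sequence))
```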

John Savard · June 2019

I certainly wouldn't want to say that Unicode is perfect. I am aware of complaints about its support of Burmese: basically, while major scripts have legacy characters that permit relatively easy encoding, the minor ones instead are expected to receive advanced implementations making full use of variant forms, composed characters, and so on.

Mongolian, apparently, has the opposite problem, but since they have the option of marking characters as deprecated, I think that it will be possible to sort this out, while Burmese is likely to receive nothing but continued inaction.

Jacob Casal · June 2019

Mm, I had used Burmese before on a project but was unaware of their troubles with it. I’ll read some more into that too, thanks for letting me know. I imagine in both the Mongolian and Burmese cases it started out with a lack of deep information about the scripts and their derived usage and history.

Aaron Bell · June 2019

I wish I could say that I think that it'll be sorted out soon, but it has taken 7 years to get to this point (at least, that's when I first got involved with Mongolian back in my MSFT days), and the number of stakeholders has only increased (rather than decreased!) since then. The biggest challenge will be deciding *which* of the different proposals to take forward, and achieving alignment will, I think, take some time.

Jacob Casal · June 2019

I can only wish them the best in their efforts. Personally, I would think some of Bolorsoft’s approaches are sound for later on, once the “smaller” encoding changes (figuring out what to do with the FVSs and the MVS) have been settled. Those would be a good foundation for the bigger character changes when the time comes to sort them out.

P.S. Should anyone have read Bolorsoft’s main PowerPoint for the meeting, the dead link they mentioned, to Microsoft's insufficient design notes, can be found via the Wayback Machine.
