How would you redo Unicode?

Tofu Type Foundry
Posts: 50
I’m not really sure how to word this question. If you could “redo” Unicode right now with no issues regarding compatibility in legacy software, what changes would you make? It seems that some of the decisions made decades ago have led to unexpected issues with modern digital typesetting. I’m curious what Unicode would look like in an “optimized” release with all language systems and languages accounted for from the start.
Comments
-
For Latin: Abolish quotesingle and quotedbl.
That might prompt keyboard manufacturers to provide separate keys for all four “curly quotes.”
I doubt that having separate code points for quoteright and apostrophe would solve more problems than it would create.
-
Completely eliminate all decomposable diacritic characters and enforce use of letter + combining mark sequences in all languages.
Fix Hebrew canonical combining class assignments.
Consistently assign script properties based on the context in which characters are used, not on the script from which they historically derive (looking at you, not-Greek ᶿ).
Avoid unification or compatibility decompositions for letter/symbol lookalikes, so e.g. separate codepoints for hooked-f and florin sign, and for lowercase mu and micro sign.
Provide recommendations for encoding choices for similar and confusable characters, especially for digitally disadvantaged languages.
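Several of the points above can be observed in the current character database. The following is only an illustrative sketch using Python's standard `unicodedata` module; the helper names are invented for this example, and the results depend on the Unicode version bundled with the interpreter. It shows a precomposed letter decomposing to letter + combining mark, the micro sign folding to Greek mu under compatibility normalization, the single code point shared by hooked f and the florin sign, and Hebrew points being reordered by their canonical combining classes.

```python
import unicodedata

MICRO_SIGN = "\u00b5"  # MICRO SIGN
GREEK_MU = "\u03bc"    # GREEK SMALL LETTER MU
F_HOOK = "\u0192"      # LATIN SMALL LETTER F WITH HOOK (also serves as the florin sign)
E_ACUTE = "\u00e9"     # LATIN SMALL LETTER E WITH ACUTE
BET = "\u05d1"         # HEBREW LETTER BET
DAGESH = "\u05bc"      # HEBREW POINT DAGESH OR MAPIQ
SHEVA = "\u05b0"       # HEBREW POINT SHEVA


def decomposition(ch):
    """Return the raw decomposition mapping for ch, or '' if it has none."""
    return unicodedata.decomposition(ch)


def compat_fold(text):
    """Apply compatibility normalization (NFKC)."""
    return unicodedata.normalize("NFKC", text)


def decompose(text):
    """Apply canonical decomposition (NFD), including canonical reordering."""
    return unicodedata.normalize("NFD", text)


def code_points(text):
    """Format a string as a list of U+XXXX labels for inspection."""
    return [f"U+{ord(ch):04X}" for ch in text]


# Precomposed e-acute becomes base letter + combining acute.
print(code_points(decompose(E_ACUTE)))

# The micro sign carries a compatibility decomposition to Greek mu.
print(decomposition(MICRO_SIGN), code_points(compat_fold(MICRO_SIGN)))

# Hooked f has no decomposition: letter and currency sign share one code point.
print(repr(decomposition(F_HOOK)))

# Hebrew points have distinct combining classes, so normalization reorders them.
print(unicodedata.combining(DAGESH), unicodedata.combining(SHEVA))
print(code_points(decompose(BET + DAGESH + SHEVA)))
```

Because normalization is applied by many systems without the author's knowledge, these properties affect text even when a font or application would prefer a different order or a different distinction.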
-
John Hudson said: Completely eliminate all decomposable diacritic characters and enforce use of letter + combining mark sequences in all languages.
-
...we would need to define where to place the cedilla in letters like H, K, N, R or turned V.
That’s already the case, independent of how such things are encoded. Personally, I am fine with floating the cedilla under the middle of these letters, in the absence of any attested forms in actual use.
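To illustrate that the encoding question is separate from the positioning question, here is a small sketch using Python's standard `unicodedata` module (the function names are hypothetical, chosen for this example). It checks which of the letters mentioned have a precomposed form that a base letter + U+0327 COMBINING CEDILLA sequence composes to under NFC. Either way, a font still has to decide where the mark sits.

```python
import unicodedata

CEDILLA = "\u0327"   # COMBINING CEDILLA
TURNED_V = "\u0245"  # LATIN CAPITAL LETTER TURNED V


def compose_with_cedilla(base):
    """Return the NFC form of base + combining cedilla."""
    return unicodedata.normalize("NFC", base + CEDILLA)


def has_precomposed_cedilla(base):
    """True if the sequence composes to a single precomposed code point."""
    return len(compose_with_cedilla(base)) == 1


for base in ["H", "K", "N", "R", TURNED_V]:
    composed = compose_with_cedilla(base)
    labels = " ".join(f"U+{ord(ch):04X}" for ch in composed)
    print(f"{unicodedata.name(base)}: {labels}")
```

In the version of the character database this was written against, H, K, N and R compose to a single code point, while turned V stays as a two-code-point sequence and relies entirely on mark positioning in the font.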
-
John Hudson said: Completely eliminate all decomposable diacritic characters and enforce use of letter + combining mark sequences in all languages.
And, of course, if I were re-doing Unicode, I would do exactly the opposite. I would provide the less popular languages, the languages of countries which entered the computer age later, the languages of countries that are less economically powerful, with a full set of pre-composed characters - as are often found in unofficial, unrecognized encodings people have used in those countries - if they are desired.
Why?
Because pre-composed characters make it simpler to process text in those languages. Less processing power, less complicated algorithms, less sophisticated programs are required.
But I have to admit, this isn't a no-brainer. It seems like a natural consequence of the current desire to provide all peoples with full equality.
But a program that handles a given language can only be simpler thanks to pre-composed characters if it handles only the pre-composed versions of those characters. Otherwise, having two alternatives that both need to be handled just makes things more complicated.
And that means those programs won't work properly - they won't be compatible with other, more sophisticated programs that do handle combining mark sequences properly, which presumably would also exist and would likely be running on the same computers.
So I do admit that what I would prefer is seriously flawed.
Thus, perhaps what I would really want to see is for Unicode to be succeeded by two codes - one done the way John Hudson advocates, one done the way I propose, each designed to serve a different purpose.
His successor to Unicode would serve as a logical standard for worldwide communications.
My successor to Unicode would either serve as a computer code, or be closely related to a computer code or codes, well suited to simple and straightforward computation in each particular language.
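The trade-off described here can be made concrete with a minimal sketch in Python's standard `unicodedata` module (the function names are invented for illustration). A program that compares strings code point by code point treats the pre-composed and the combining-sequence spellings of the same word as different, and only agrees with more sophisticated software once both sides are normalized.

```python
import unicodedata

PRECOMPOSED = "caf\u00e9"   # e-acute as a single pre-composed code point
DECOMPOSED = "cafe\u0301"   # e followed by COMBINING ACUTE ACCENT


def naive_equal(a, b):
    """Compare raw code point sequences, as a 'simple' program might."""
    return a == b


def normalized_equal(a, b):
    """Compare after bringing both strings to the same normalization form."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


print(len(PRECOMPOSED), len(DECOMPOSED))
print(naive_equal(PRECOMPOSED, DECOMPOSED))
print(normalized_equal(PRECOMPOSED, DECOMPOSED))
```

The naive comparison is cheaper, but it is only correct if every other program on the system is guaranteed to emit the pre-composed form, which is exactly the compatibility problem raised above.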
-
The one thing that is 100% for sure worse than John Savard’s proposal, is his additional proposal to have two encoding standards.
Good grief, please, no. That way lies madness.
-
Thomas Phinney said: The one thing that is 100% for sure worse than John Savard’s proposal, is his additional proposal to have two encoding standards.
Good grief, please, no. That way lies madness.
Sadly, we've already passed this point. The world of standards is already in the grip of that sort of madness.
Of course, though, my goals can be achieved without having two standards. Add in all the desired precomposed characters for those who need them... but deprecate both them and the existing ones to point modern systems in the better direction.
-
John Savard said: Because pre-composed characters make it simpler to process text in those languages.
Doing the complex stuff by default makes things better for minority languages. Trying to turn the processing of minority languages into the same process used for majority languages is precisely the wrong direction, and the thing that got us into this mess in the first place.
-
How is waiting years before a precomposed accented character is added and usable on updated devices a good approach?
-
How would you redo Unicode?
a) do basic research and systematics about notation systems first
b) define usable standards with regard to font technology – not only for combined characters, but also for variant characters and ligatures
c) re-order code blocks
d) straighten out terminology
e) edit glyph bugs and annotation faults
Since all this will never happen, f):
paint a picture in oil with a flat landscape at sunset (purple sky), a timber barn on the left side (with open door), a white unicorn with golden hair on the right side and a black horse with white figures painted on it, in the middle.