I have a bunch of stupid questions about OpenType tables. Let's kick off with one about localisation in the name table: What character encoding should be used for non-Latin entries? The specification says:
Note that OS/2 and Windows both require that all name strings be defined in Unicode. Thus all 'name' table strings for platform ID = 3 (Windows) will require two bytes per character. Macintosh fonts require single byte strings.
"Unicode" isn't an encoding, but "will require two bytes per character" suggests to me that this means UTF-16 when platformID=3. Could or should that be made more explicit?
But the one about the Macintosh completely confuses me. Let's say I have a Japanese string. The spec suggests I could have a string which is for the Mac platform (platformID=1) and then I choose the appropriate language ID for the platform (languageID=11). Now I have to choose the encoding ID, which on this platform is a "script manager code." Looking in the table, there's a script manager code for Japanese which means I set encodingID=1. Now I need to encode the string itself in a "single byte string" encoding of Japanese, which as far as I'm aware doesn't exist.
What's going on here? Why do the Macintosh script manager codes even exist? Perennial question: does anything actually use them in practice, or should all non-Latin stuff be restricted to platformID=3 entries?
Comments
Illustrator:
Photoshop:
(Win 8.1)
MacOS provides a function to convert a string (NSString or CFString) to different encodings: CFStringCreateExternalRepresentation (see external_string_encodings).
In which case "Macintosh fonts require single byte strings" is completely wrong.
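To make the byte counts concrete, here is a minimal Python sketch. It assumes `utf_16_be` for platformID=3 strings and uses the standard library's `shift_jis` codec as a stand-in for the legacy MacJapanese encoding (a Shift-JIS variant for which Python ships no codec). The sample strings are purely illustrative.

```python
# Illustrative sample strings, not taken from any real font.
japanese = "日本語"
mixed = "Aあ"

# platformID=3 (Windows): big-endian UTF-16, two bytes per BMP character.
windows_bytes = japanese.encode("utf_16_be")

# platformID=1, encodingID=1 (Japanese).
# ASSUMPTION: shift_jis only approximates MacJapanese; it is used here
# because the Python standard library has no MacJapanese codec.
mac_bytes = japanese.encode("shift_jis")

# Character count, then byte counts in each encoding.
print(len(japanese), len(windows_bytes), len(mac_bytes))

# The Shift-JIS family is variable-width: one byte for ASCII, two for kana/kanji.
print(len(mixed.encode("shift_jis")), len(mixed.encode("utf_16_be")))
```

Under these assumptions the Japanese string takes two bytes per character in both encodings, so a "single byte string" cannot describe a Japanese Macintosh entry.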
These are legacy Macintosh encodings from before OS X. The ones listed in the OT spec can be found there, at least 0 to 11 and 21 to 29. Unless you’re making a font for System 7, Mac OS 8 or Mac OS 9, don’t use these.
That said, you may need a couple of names with platformID=1, encodingID=0 for some Mac applications to be happy, as mentioned before, but the other encodingIDs are useless.
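As a rough sketch of that advice, the records for one name might look like the following. `NameRecord` and `make_records` are hypothetical helpers, not part of any library, and the family names are placeholders. The IDs used are Mac Roman/English (1, 0, 0) and Windows Unicode BMP (3, 1) with language IDs 0x0409 (English, US) and 0x0411 (Japanese).

```python
from typing import NamedTuple


class NameRecord(NamedTuple):
    """Hypothetical container mirroring the fields of a 'name' table record."""
    platform_id: int
    encoding_id: int
    language_id: int
    name_id: int
    data: bytes


def make_records(name_id: int, latin: str, japanese: str) -> list:
    """Build one legacy Mac Roman entry plus Windows Unicode entries."""
    return [
        # Macintosh, Roman script, English: kept only to satisfy some Mac apps.
        NameRecord(1, 0, 0, name_id, latin.encode("mac_roman")),
        # Windows, Unicode BMP, English (US): UTF-16BE string data.
        NameRecord(3, 1, 0x0409, name_id, latin.encode("utf_16_be")),
        # Windows, Unicode BMP, Japanese: the only non-Latin entry.
        NameRecord(3, 1, 0x0411, name_id, japanese.encode("utf_16_be")),
    ]


# nameID 1 is the font family name; both strings are placeholders.
records = make_records(1, "Example Sans", "サンプル")
for rec in records:
    print(rec.platform_id, rec.encoding_id, hex(rec.language_id), len(rec.data))
```

The non-Latin string appears only under platformID=3, so no legacy script manager encoding is needed.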
Remember that this part of the spec was written when Unicode was still a new thing.