Non-Latin in the "name" table

I have a bunch of stupid questions about OpenType tables. Let's kick off with one about localisation in the name table: What character encoding should be used for non-Latin entries? The specification says:

Note that OS/2 and Windows both require that all name strings be defined in Unicode. Thus all 'name' table strings for platform ID = 3 (Windows) will require two bytes per character. Macintosh fonts require single byte strings. 

"Unicode" isn't an encoding, but "requires two bytes per character" suggests to me that this means UTF-16 when platformID=3. Could or should that be made more explicit?

But the one about the Macintosh completely confuses me. Let's say I have a Japanese string. The spec suggests I could have a string which is for the Mac platform (platformID=1) and then I choose the appropriate language ID for the platform (languageID=11). Now I have to choose the encoding ID, which on this platform is a "script manager code." Looking in the table, there's a script manager code for Japanese which means I set encodingID=1. Now I need to encode the string itself in a "single byte string" encoding of Japanese, which as far as I'm aware doesn't exist.

What's going on here? Why do the Macintosh script manager codes even exist? Perennial question: does anything actually use them in practice, or should all non-Latin stuff be restricted to platformID=3 entries?
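
On the platformID=3 half of the question, here is a minimal Python sketch. It assumes the record uses encodingID=1 or 10 and that the string is stored as big-endian UTF-16; the helper name `encode_windows_name` is made up for illustration. It also shows that "two bytes per character" only holds inside the Basic Multilingual Plane.

```python
# Sketch only: encode a string the way a platformID=3 'name' record would
# store it, ASSUMING big-endian UTF-16 (encodingID=1 or 10).

def encode_windows_name(text: str) -> bytes:
    """Return the UTF-16BE bytes for a Windows-platform name string."""
    return text.encode("utf-16-be")


latin = encode_windows_name("Regular")
japanese = encode_windows_name("日本語")
astral = encode_windows_name("\U0001F600")  # a code point outside the BMP

# Byte length per string; the astral character needs a surrogate pair.
print(len(latin), len(japanese), len(astral))
```

Because characters outside the BMP become surrogate pairs, it seems safer to measure the string in bytes than to assume exactly two bytes per character.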

Comments

  • Platform ID=3 only. Photoshop for Windows does not use these names.

    Illustrator: [screenshot]

    Photoshop: [screenshot]

    (Win 8.1)
  • I tend to just forget about Macintosh names, since this is mostly pre-OS X stuff. However, I’ve been told that some obsolete applications on Mac (old versions of Word so far) require Macintosh names. In that case I’d include only MacRoman names and forget about anything else; localized names for the Macintosh platform are just broken.
  • You can also check FontTools, IIRC it has code to decode (most of?) the non-Unicode name entries.
  • What I don't understand is how this could have ever made sense. Why specify that you can write Mac platform strings in Mongolian or Devanagari if there wasn't even a way to correctly encode those strings?
  • Georg Seifert
    edited October 2016
    It is possible to encode all those scripts. It is just not 8-bit.
    macOS provides a function to convert a string (NSString or CFString) to different encodings:

    CFStringCreateExternalRepresentation

    external_string_encodings

  • It is possible to encode all those scripts. It is just not 8-bit.

    In which case "Macintosh fonts require single byte strings" is completely wrong.

  • Georg Seifert
    edited October 2016
    Yes. I'll check that to be sure. 
  • What I don't understand is how this could have ever made sense. Why specify that you can write Mac platform strings in Mongolian or Devanagari if there wasn't even a way to correctly encode those strings?
    @Simon Cozens Have a look at http://unicode.org/Public/MAPPINGS/VENDORS/APPLE/
    These are legacy Macintosh encodings from before OS X. The ones listed in the OT spec can be found there, at least 0 to 11 and 21 to 29. Unless you’re making a font for System 7, Mac OS 8 or Mac OS 9, don’t use these.

    That said, you may need a couple of names with platformID=1, encodingID=0 for some Mac applications to be happy, as mentioned before, but the other encodingIDs are useless.

    Remember that this part of the specs was written when Unicode was still a new thing.
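
To make the thread's conclusion concrete, here is a rough Python sketch of how a (platformID, encodingID) pair selects a byte encoding. The `CODECS` table and `decode_name` helper are hypothetical and cover only a few pairs. Python's `shift_jis` codec is used as a stand-in for MacJapanese, which is an assumption: the real Apple mapping differs in some code points. The point is only that the legacy Mac script encodings for Japanese are multi-byte, not single-byte.

```python
# Hypothetical lookup table: (platformID, encodingID) -> Python codec name.
# Only a few pairs are covered. The Mac Japanese entry is an APPROXIMATION:
# Shift-JIS is close to, but not identical with, Apple's MacJapanese.
CODECS = {
    (3, 1): "utf_16_be",   # Windows, Unicode BMP
    (3, 10): "utf_16_be",  # Windows, Unicode full repertoire
    (1, 0): "mac_roman",   # Macintosh, Roman script
    (1, 1): "shift_jis",   # Macintosh, Japanese script (approximation)
}


def decode_name(data: bytes, platform_id: int, encoding_id: int) -> str:
    """Decode raw name-record bytes using the codec chosen by the two IDs."""
    codec = CODECS.get((platform_id, encoding_id))
    if codec is None:
        raise ValueError(
            f"no codec known for platformID={platform_id}, "
            f"encodingID={encoding_id}"
        )
    return data.decode(codec)


# A Japanese string in the approximate Mac encoding takes two bytes per kanji.
mac_japanese = "日本語".encode("shift_jis")
print(len(mac_japanese), decode_name(mac_japanese, 1, 1))

# A MacRoman name really is one byte per character.
mac_roman = "Café".encode("mac_roman")
print(len(mac_roman), decode_name(mac_roman, 1, 0))
```

For real fonts, FontTools (mentioned above) is likely the better tool, since it already carries its own handling for these legacy entries.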