I recently did a quick-and-dirty survey of the cmap tables of all the fonts installed on my computer. Here are the results:
I'm not entirely sure what to draw from this, other than "put (0,3), (1,0) and (3,1) subtables in your fonts, and (3,10) if you need code points outside the BMP." Is that a fair statement?
- Is (0,10) really a thing? I can't find it in the Unicode platform section on the "name" table page. The text on the "cmap" page says very little about platform 0, and should probably say more.
- There are lots of platform 1 encodings, but I don't see why there should be. The "cmap" table spec says that "When building a font that will be used on the Macintosh, the platform ID should be 1 and the encoding ID should be 0." "Should" is not the same as "must", of course - and more encodings are listed on the "name" table page, which might be confusing people.
- Similarly, "when building a Unicode font for Windows, the platform ID should be 3 and the encoding ID should be 1." I'm guessing this is legacy language that wasn't updated after the creation of encoding ID 10. The language "Microsoft strongly recommends using Unicode 'cmap' subtables for all fonts" should probably be clarified. I think it means a Unicode encoding, not the Unicode platform, but the text directly follows the description of the Unicode platform, and Unicode encodings have not been introduced at that point, so it's unclear.
The code I used to generate the survey information is available here
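
For anyone who wants to repeat the survey on their own machine, here is a minimal sketch of the idea. It is not the script linked above. It assumes each font is a single sfnt file (TrueType or OpenType, not a `.ttc` collection, and not WOFF) already read into memory as bytes. The function names `cmap_encodings` and `survey` are made up for illustration. It reads only the encoding records at the top of the cmap table, which is all that is needed to count (platform ID, encoding ID) pairs.

```python
import struct
from collections import Counter


def cmap_encodings(data):
    """Return the (platformID, encodingID) pairs listed in a font's cmap table.

    Assumes `data` holds a single sfnt font; collections and WOFF are not handled.
    """
    # sfnt header: uint32 version, uint16 numTables, then three uint16 search fields.
    num_tables = struct.unpack_from(">H", data, 4)[0]
    for i in range(num_tables):
        # Each table record is 16 bytes: tag, checksum, offset, length.
        tag, _checksum, offset, _length = struct.unpack_from(
            ">4sIII", data, 12 + 16 * i
        )
        if tag == b"cmap":
            # cmap header: uint16 version, uint16 number of encoding records.
            _version, num_records = struct.unpack_from(">HH", data, offset)
            # Each encoding record is 8 bytes: platformID, encodingID, subtable offset.
            return [
                struct.unpack_from(">HH", data, offset + 4 + 8 * j)
                for j in range(num_records)
            ]
    return []  # no cmap table found


def survey(font_blobs):
    """Count how many fonts contain each (platformID, encodingID) pair."""
    counts = Counter()
    for data in font_blobs:
        # Use a set so a pair repeated within one font is counted once.
        counts.update(set(cmap_encodings(data)))
    return counts
```

To use it, read each font file with `open(path, "rb").read()` and pass the resulting list of byte strings to `survey`. Sorting the returned counter by key gives a table comparable to the one above.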