Unicode Question - why 17 planes?

André G. Isaak Posts: 633
edited February 2018 in Font Technology
This isn't a font-related question per se, but it's bugging me and I thought someone here (Michael Everson?) might know the answer. Why does Unicode comprise 17 rather than 16 planes? 17 seems like a rather non-computer-friendly number.

André

Comments

  • This link may provide you with the answer:
    https://en.wikipedia.org/wiki/Plane_(Unicode)


  • Indeed it does! I should have thought to check Wikipedia. Instead I was going through the Unicode Standard, assuming it would give an answer (which it doesn’t, at least not that I could find).

    André
  • John Savard Posts: 1,126
    Given that UTF-8, as originally designed, can encode values of up to 31 bits, and other encodings related to Unicode went up to 32 bits, I would propose that the code points FFF0 to FFF3 be reserved for use as high surrogates for additional planes. Each would encode the first two bits of a 32-bit character; the remaining 30 bits would be encoded by three consecutive low-surrogate code units, 10 bits each, from the existing range of DC00-DFFF.

    This encoding would continue to have the same good properties as UTF-8, at least. The only characters to be encoded this way would be those not found in the first 17 planes of Unicode, so it would start with the character encoded as FFF0 DC01 DC40 DC00.
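
The bit layout in the proposal above can be sketched in a few lines of Python. This is only an illustration of the scheme as described in the comment, not anything defined by the Unicode Standard; the function names `encode_extended` and `decode_extended` are made up for the example, and it assumes each low-surrogate unit carries 10 bits (the full DC00-DFFF range).

```python
# Hypothetical sketch of the proposed 4-unit extension; not part of Unicode.
HIGH_BASE = 0xFFF0      # FFF0..FFF3 carry the top 2 bits of a 32-bit value
LOW_BASE = 0xDC00       # DC00..DFFF each carry 10 bits
MAX_UNICODE = 0x10FFFF  # last code point of the existing 17 planes


def encode_extended(value):
    """Encode a value above U+10FFFF as four 16-bit code units."""
    if not (MAX_UNICODE < value <= 0xFFFFFFFF):
        raise ValueError("value must lie above U+10FFFF and fit in 32 bits")
    return [
        HIGH_BASE + (value >> 30),            # top 2 bits
        LOW_BASE + ((value >> 20) & 0x3FF),   # next 10 bits
        LOW_BASE + ((value >> 10) & 0x3FF),   # next 10 bits
        LOW_BASE + (value & 0x3FF),           # lowest 10 bits
    ]


def decode_extended(units):
    """Invert encode_extended for a list of exactly four code units."""
    if len(units) != 4:
        raise ValueError("expected four code units")
    high, lows = units[0], units[1:]
    if not (HIGH_BASE <= high <= HIGH_BASE + 3):
        raise ValueError("first unit must be in FFF0..FFF3")
    value = high - HIGH_BASE
    for low in lows:
        if not (LOW_BASE <= low <= 0xDFFF):
            raise ValueError("trailing units must be in DC00..DFFF")
        value = (value << 10) | (low - LOW_BASE)
    return value


if __name__ == "__main__":
    first = encode_extended(0x110000)
    print(" ".join(f"{u:04X}" for u in first))  # FFF0 DC01 DC40 DC00
```

Running it on 0x110000, the first value past the existing 17 planes, reproduces the sequence FFF0 DC01 DC40 DC00 given in the comment.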