Code points for alternate glyphs

Sorry if this is a bit of a basic question, but I'm wondering what the code point should be for alternate glyphs. For example, let's say I have glyph A and I also have glyph A.ss01. What should the Unicode value be for A.ss01? Should it be the same as A? Should it have a code point at all?

Comments

  • James Puckett
    James Puckett Posts: 1,992
    “Should it have a code point at all?”
    In general, no. Giving Private Use Area (PUA) code points to alternate glyphs is bad practice: using the alternate changes the underlying text, so the text has to be retyped before it will work with other fonts. However, if you’re designing display fonts for the scrapbook crowd, they’ve come to expect PUA-encoded fonts for use in software that wasn’t designed for accessing alternate glyphs.
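
    To illustrate the difference, here is a minimal Python sketch. The two glyph tables (`unencoded_plan`, `pua_plan`) and the choice of U+E000 are hypothetical and only for illustration; the one real fact it relies on is that private-use code points have the Unicode general category `Co`.

```python
import unicodedata


def is_private_use(code_point: int) -> bool:
    """Return True if the code point has general category Co (private use)."""
    return unicodedata.category(chr(code_point)) == "Co"


# Hypothetical glyph-name -> code-point tables for a font with one alternate.
# None means "unencoded": the glyph is reachable only through an OpenType
# feature such as 'ss01', so the stored text stays a plain "A".
unencoded_plan = {"A": 0x0041, "A.ss01": None}

# Scrapbook-style variant: the alternate gets a PUA code point (U+E000 is an
# arbitrary choice for this example).
pua_plan = {"A": 0x0041, "A.ss01": 0xE000}


def pua_encoded_glyphs(plan):
    """List the glyph names in a plan that are mapped to private-use code points."""
    return sorted(
        name for name, cp in plan.items()
        if cp is not None and is_private_use(cp)
    )


# Text typed with the PUA alternate is a different string from the plain text,
# which is why it has to be retyped for any other font.
text_with_feature = "A"          # alternate selected by the 'ss01' feature
text_with_pua = chr(0xE000)      # alternate selected by typing U+E000
```

    With the unencoded plan the text remains "A" whichever font is applied; with the PUA plan the document contains U+E000, which means nothing outside that one font.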
  • AbrahamLee
    AbrahamLee Posts: 262
    Very grateful to both of you for sharing your thoughts. It gives me the clarity and direction I needed.
  • Jacob Casal
    Jacob Casal Posts: 99
    It felt more appropriate to mention this here than in Clint’s thread over here:
    I have seen fonts in which an alternate glyph has a code point in the PUA while the same glyph also exists unencoded. Menk Qagan Tig by Menksoft (Mongolian Unicode block) has glyphs in the PUA that don’t seem to be attached to anything. I imagine they are there in case a user wants to type an alternate by itself (though there are already ways to do that via a /nirugu or zero-width characters), while the unencoded ones actually attach to and modify the other letters.
  • Thomas Phinney
    Thomas Phinney Posts: 2,866
    edited May 2019
    Yes, one can have the same glyph both encoded and unencoded by duplicating the glyph. In some rare cases this may even make a fair bit of sense (though usually not).

    If you care about the ability to backwards-derive Unicode from a stream of glyphs, and you have a situation like the wacky one with the standard Latin/western f-ligatures, then sure.

    So, the situation is that Unicode encoded the ff, fi, fl, ffi and ffl ligatures because they exist in ancient “expert sets” and there was a legacy compatibility issue, and some people want to give those glyphs their Unicode codepoints. But when these glyphs appear because of ligature behaviors, in the get-Unicode-from-the-glyph-stream situation you want them to decode back to f and i, not to the codepoint for the hardcoded fi ligature. So you can possibly “help” things by having a glyph “fi” that is encoded but has no OpenType feature relationships, and a glyph “f_i” that is unencoded and is the result of f+i and a 'liga' feature.

    These are mostly corner cases, but there are some, for sure.
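
    The fi versus f_i distinction can be sketched in Python. This is a simplified take on the glyph-naming convention described in the Adobe Glyph List Specification (drop everything after the first period, split ligature components on underscores, look each component up). `AGL_SUBSET` is a tiny hypothetical stand-in for the real glyph list, and whether any particular application recovers text from glyph names this way is an assumption.

```python
# Tiny stand-in for the Adobe Glyph List; a real tool would load the full list.
AGL_SUBSET = {
    "A": "A",
    "f": "f",
    "i": "i",
    "l": "l",
    "fi": "\uFB01",  # the hardcoded fi ligature character
}

HEX_DIGITS = set("0123456789ABCDEF")


def glyph_name_to_text(name: str) -> str:
    """Derive text from a glyph name using simplified AGL-style rules."""
    base = name.split(".", 1)[0]           # "A.ss01" -> "A"
    pieces = []
    for component in base.split("_"):      # "f_i" -> ["f", "i"]
        is_uni_name = (
            len(component) == 7
            and component.startswith("uni")
            and set(component[3:]) <= HEX_DIGITS
        )
        if is_uni_name:
            # "uni0041" -> U+0041 (only the single four-digit form is handled)
            pieces.append(chr(int(component[3:], 16)))
        else:
            # Unknown names contribute nothing in this sketch.
            pieces.append(AGL_SUBSET.get(component, ""))
    return "".join(pieces)


def glyph_stream_to_text(glyph_names):
    """Concatenate the text derived from each glyph name in a stream."""
    return "".join(glyph_name_to_text(n) for n in glyph_names)
```

    Under these rules the unencoded “f_i” decodes to the two letters f and i, the encoded “fi” decodes to U+FB01, and “A.ss01” decodes to a plain A, which is the behavior the naming scheme above is meant to get.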
  • Jacob Casal
    Jacob Casal Posts: 99
    Fair enough, in looking at the legal info it seems the font goes back to 2012, so I’m sure things have changed a lot since then.