Code points for alternate glyphs

AbrahamLee · May 2018

Sorry if this is a bit of a basic question, but I'm wondering what the code point should be for alternate glyphs? For example, let's say I have glyph A and I also have glyph A.ss01. What should the Unicode value be for A.ss01? Should it be the same as A? Should it have a code point at all?

James Puckett · May 2018

AbrahamLee said:

Should it have a code point at all?

In general, no. Giving PUA code points to alternate glyphs is bad practice because when the alternate is used it changes the text, which means the text has to be re-typed to use it with other fonts. However, if you’re designing display fonts for the scrapbook crowd they’ve come to expect PUA encoded fonts for use in software that wasn’t designed for accessing alternate glyphs.
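A minimal Python sketch of why a PUA-encoded alternate "changes the text". The code point U+E000 for A.ss01 is a hypothetical assignment chosen for illustration, not taken from any real font.

```python
import unicodedata

# Hypothetical assignment: a font maps U+E000 (Private Use Area) to A.ss01.
PUA_A_SS01 = "\uE000"

plain = "AVA"
with_pua_alternate = PUA_A_SS01 + "V" + PUA_A_SS01

# The stored text is now a different string, so another font, a search for
# "AVA", or a spell checker no longer sees the letter A at all.
text_changed = plain != with_pua_alternate

# Unicode classifies PUA code points as "Co" (private use), while "A" is
# "Lu" (uppercase letter). The PUA character carries no letter identity.
pua_category = unicodedata.category(PUA_A_SS01)
letter_category = unicodedata.category("A")

print(text_changed, pua_category, letter_category)
```

With an unencoded alternate reached through a feature, the stored string stays "AVA" and only the displayed glyphs change.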

Thomas Phinney · May 2018

Two different glyphs can’t share the same code point. One is encoded, and the other is either unencoded, or differently encoded.
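As a rough sketch of that relationship, a font's character map can be modeled as a dictionary from code point to a single glyph, with the alternate left out of it and reached by substitution instead. The dictionaries and the `shape` function below are a toy model for illustration, not a real font API; only the glyph names A and A.ss01 come from the question above.

```python
# Toy model: each code point maps to exactly one glyph.
cmap = {0x41: "A", 0x42: "B"}

# The alternate is unencoded. It is only reachable through a substitution,
# here standing in for an OpenType stylistic set (ss01).
ss01 = {"A": "A.ss01"}

def shape(text, use_ss01=False):
    """Map characters to glyph names, optionally applying the ss01 substitution."""
    glyphs = [cmap[ord(ch)] for ch in text]
    if use_ss01:
        glyphs = [ss01.get(g, g) for g in glyphs]
    return glyphs

text = "AB"
default_glyphs = shape(text)
alternate_glyphs = shape(text, use_ss01=True)

print(default_glyphs)
print(alternate_glyphs)
```

The underlying text is "AB" in both cases; turning the feature on or off only changes which glyphs are drawn.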

General practice these days is to not encode alternates. Speaking as the person who originally made/advocated the decision to PUA encode alternate glyphs in the early Adobe OpenType fonts, I will take this opportunity to say: if I had it to do over again, I would leave them unencoded. The fact that Adobe encoded those glyphs in OpenType to start with left Adobe a legacy compatibility issue so they couldn't un-encode them when revising the same fonts.

The whole reason for encoding alternate glyphs was to allow access to the most common and interesting alternate glyphs, in a world where OpenType feature support was just starting. So, there were reasons. But honestly, they didn’t get heavy use.

Things may be different for different audiences, of course. If most of your users are using apps that are not feature-savvy, or have limited feature support (e.g. MS Word), you may have good reasons to PUA-encode some or all of your alternate glyphs. For example, if I were Laura Worthington and had her audience, I might have a different take.

AbrahamLee · May 2018

Very grateful to both of you for sharing your thoughts. It gives me the clarity and direction I needed.

Jacob Casal · May 2019

It felt more appropriate to mention this here than in Clint’s thread over here:

I have seen alternate glyphs that have a code point in a PUA while the same glyph also exists unencoded. Menk Qagan Tig by Menksoft (Mongolian Unicode block) has glyphs in the PUA that don’t seem attached to anything. I imagine they are there in case a user wants to type an alternate by itself (though there are ways to do that already via a /nirugu or zero width characters), while the unencoded ones actually attach to and modify the other letters.

Thomas Phinney · May 2019

Yes, one can have the same glyph both encoded and unencoded by duplicating the glyph. In some rare cases this may even make a fair bit of sense (though usually not).

If you care about the ability to backwards-derive Unicode from a stream of glyphs, and you have a situation like the wacky one with the standard Latin/western f-ligatures, then sure.

So, the situation is that Unicode encoded the ff, fi, fl, ffi and ffl ligatures because they exist in ancient “expert sets” and there was a legacy compatibility issue. So, some people want to give them their Unicode codepoints. But when these glyphs are happening because of ligature behaviors, in the get-Unicode-from-the-glyph-stream situation, you want them to decode back to f and i, not the codepoint for the hardcoded fi ligature. So you can possibly “help” things by having a glyph "fi" that is encoded but has no OpenType feature relationships, and the glyph “f_i” that is unencoded and is the result of f+i and a 'liga' feature.
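The encoded "fi" versus unencoded "f_i" arrangement can be sketched in Python as follows. This is a toy model under stated assumptions: the `shape` and `decode` functions are hypothetical, and the decoder assumes that unencoded ligature names join their component glyph names with an underscore. Real text extractors recover Unicode in their own ways.

```python
# Toy font data using the glyph names from the post above.
cmap = {0x66: "f", 0x67: "g", 0x69: "i", 0xFB01: "fi"}  # "fi" is encoded at U+FB01
liga = {("f", "i"): "f_i"}  # "f_i" is unencoded and only produced by 'liga'

def shape(text):
    """Map characters to glyphs, then apply the two-glyph ligature rule."""
    glyphs = [cmap[ord(ch)] for ch in text]
    out, i = [], 0
    while i < len(glyphs):
        pair = tuple(glyphs[i:i + 2])
        if pair in liga:
            out.append(liga[pair])
            i += 2
        else:
            out.append(glyphs[i])
            i += 1
    return out

reverse_cmap = {glyph: chr(cp) for cp, glyph in cmap.items()}

def decode(glyphs):
    """Recover text from a glyph stream."""
    chars = []
    for g in glyphs:
        if g in reverse_cmap:
            # Encoded glyph: use its own code point.
            chars.append(reverse_cmap[g])
        else:
            # Assumption: an unencoded ligature name lists its components,
            # separated by underscores.
            chars.extend(reverse_cmap[part] for part in g.split("_"))
    return "".join(chars)

typed_glyphs = shape("fig")             # ligature formed by 'liga'
round_trip = decode(typed_glyphs)       # decodes back to f + i + g
hardcoded = decode(shape("\uFB01g"))    # text that really contains U+FB01
print(typed_glyphs, round_trip == "fig", hardcoded == "\uFB01g")
```

If the 'liga' rule produced the encoded "fi" glyph instead, the same decoder would return U+FB01 for text that was typed as f followed by i, which is the outcome the duplicate glyph is meant to avoid.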

These are mostly corner cases, but there are some, for sure.

Jacob Casal · May 2019

Fair enough. Looking at the legal info, it seems the font goes back to 2012, so I’m sure things have changed a lot since then.
