Have you ever had conflicts between double-coded glyphs?

Vasil Stanev Posts: 775
edited November 2018 in Technique and Theory
Has it ever happened to you that math symbols like mu, Pi and Sum come into conflict with the Greek glyphs you included in your font? The same goes for Omega/Ohm (the latter is the unit of electrical resistance), Tcedilla/Tcommaaccent, assorted localized versions of the same glyph, and whatever else I am missing.

I do not have the problem currently, but I wish to be prepared if it does occur.

Comments

  • If you keep your unicodes and production names right then there shouldn’t be a problem. 
  • Double-encoding a glyph carries only a slight risk, and only when an end user makes a PDF from the font via some route that does not preserve the original text encoding but uses only the print stream. For example, they print a document to a PostScript file, then use Acrobat Distiller to make a PDF from that file.

    In such cases, the PDF creation app may end up using the glyph encoding or glyph name to determine what the underlying character is. At that point, the PDF will still look and print fine, but an attempt to copy/paste from the PDF might result in technically the wrong character being pasted (though it might look very similar, depending on the font). Or an attempt to search the PDF for a string containing that character might fail, because the character is unexpectedly different.

    I don’t think these scenarios for PDF creation are terribly common any more. I don’t double-encode glyphs, out of habit and because of workflows I learned long ago. I figure I might as well make the font a bit more bulletproof, as it were. But I doubt many users would notice the difference.
  • John Hudson Posts: 3,190
    Tcedilla/Tcommaaccent
    With regard to these (and the corresponding lowercase), I recommend against double-encoding the commaaccent character U+021A to the cedilla character U+0162. We used to do this, but the feedback from users in Romania was that if the older 'with cedilla' codepoints are used for the Romanian S/s and T/t diacritics, and locl feature forms are not triggered, they would much rather have both letters get the wrong, cedilla form than have the S/s get the cedilla and the T/t get the commaaccent. Consistency is considered preferable to only one letter having the correct Romanian form.
  • Tcedilla/Tcommaaccent
    Some languages, such as Mankanya or Manjak and, if I’m not mistaken, Gagauz, expect Tcedilla to look like a T with a cedilla and not a T with a comma below.
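
For anyone who wants to check a font for the double-encodings discussed in this thread, here is a minimal Python sketch. It is only an illustration under stated assumptions: `sample_cmap` is a made-up excerpt of a codepoint-to-glyph-name mapping (with fontTools, `TTFont(path).getBestCmap()` returns a dict of this shape), and `find_double_encoded` and `round_trip` are hypothetical helper names. The "lowest codepoint wins" rule in `round_trip` is an arbitrary stand-in for whatever a PDF tool without the original text might do; it is not how any particular application is known to behave.

```python
from collections import defaultdict
import unicodedata


def find_double_encoded(cmap):
    """Return {glyph_name: [codepoints]} for glyphs mapped from two or more codepoints."""
    by_glyph = defaultdict(list)
    for codepoint, glyph_name in cmap.items():
        by_glyph[glyph_name].append(codepoint)
    return {name: sorted(cps) for name, cps in by_glyph.items() if len(cps) > 1}


def round_trip(text, cmap):
    """Map text to glyphs and back using only glyph identity.

    Assumption for illustration: when a glyph has several codepoints,
    the lowest one is chosen on the way back.
    """
    reverse = {}
    for codepoint in sorted(cmap):
        reverse.setdefault(cmap[codepoint], codepoint)
    return "".join(
        chr(reverse[cmap[ord(ch)]]) if ord(ch) in cmap else ch for ch in text
    )


# Hypothetical cmap excerpt: Omega and mu are double-encoded,
# while the two T diacritics keep separate glyphs.
sample_cmap = {
    0x03A9: "Omega",    # Greek capital omega
    0x2126: "Omega",    # ohm sign
    0x00B5: "mu",       # micro sign
    0x03BC: "mu",       # Greek small mu
    0x0162: "uni0162",  # T with cedilla
    0x021A: "uni021A",  # T with comma below
}

for glyph_name, codepoints in find_double_encoded(sample_cmap).items():
    labels = ", ".join(
        f"U+{cp:04X} {unicodedata.name(chr(cp))}" for cp in codepoints
    )
    print(f"{glyph_name}: {labels}")

original = "10 k\u2126"  # typed with the ohm sign
recovered = round_trip(original, sample_cmap)
print("\u2126" in recovered)  # a search for the ohm sign in the recovered text
```

The second half mirrors the copy/paste and search scenario described above: once only the glyph is known, a double-encoded glyph can be mapped back to just one of its characters, so the recovered text may contain the Greek letter where the ohm sign was typed.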