Monotonic Greek in double encoded unicase fonts
Michael Rafailyk
Question 1 — quadruple encoding
Since Alpha and Alphatonos have identical letter shapes in all-caps Monotonic Greek (and the same is true in double-encoded unicase fonts), I would like to encode all the ~tonos versions on their base letters. Is this a good idea? If so, is my encoding map correct?
Alpha 0391 03B1 0386 03AC ← Alphatonos
Beta 0392 03B2
Gamma 0393 03B3
Delta 0394 03B4
Epsilon 0395 03B5 0388 03AD ← Epsilontonos
Zeta 0396 03B6
Eta 0397 03B7 0389 03AE ← Etatonos
Theta 0398 03B8
Iota 0399 03B9 038A 03AF ← Iotatonos
Iotadieresis 03AA 03CA 0390 ← iotadieresistonos
Kappa 039A 03BA
Lambda 039B 03BB
Mu 039C 03BC
Nu 039D 03BD
Xi 039E 03BE
Omicron 039F 03BF 038C 03CC ← Omicrontonos
Pi 03A0 03C0
Rho 03A1 03C1
Sigma 03A3 03C3 03C2 ← sigmafinal
SigmaLunateSymbol 03F9 03F2
Tau 03A4 03C4
Upsilon 03A5 03C5 038E 03CD ← Upsilontonos
Upsilondieresis 03AB 03CB 03B0 ← upsilondieresistonos
Phi 03A6 03C6
Chi 03A7 03C7
Psi 03A8 03C8
Omega 03A9 03C9 038F 03CE ← Omegatonos
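A map like the one above can be cross-checked against Unicode's own decomposition and case data. The following is only a sketch of that idea: the function names `unicase_target` and `build_map` are made up for illustration, and it checks only which codepoints fold together, not whether collapsing them in the cmap is advisable.

```python
import unicodedata

TONOS = "\u0301"  # combining acute accent; the Greek tonos decomposes to this


def unicase_target(codepoint: int) -> int:
    """Return the capital base letter a Greek codepoint would share a glyph with."""
    decomposed = unicodedata.normalize("NFD", chr(codepoint))
    without_tonos = decomposed.replace(TONOS, "")
    # Recompose first so a remaining dialytika stays attached, then fold case.
    folded = unicodedata.normalize("NFC", without_tonos).upper()
    folded = unicodedata.normalize("NFC", folded)
    if len(folded) != 1:
        raise ValueError(f"U+{codepoint:04X} does not fold to a single letter")
    return ord(folded)


def build_map(codepoints):
    """Group codepoints by the capital letter they fold to."""
    groups = {}
    for cp in codepoints:
        groups.setdefault(unicase_target(cp), []).append(cp)
    return groups


# A few rows of the table, listed in the same order as in the post.
sample = [0x0391, 0x03B1, 0x0386, 0x03AC,   # Alpha row
          0x03AA, 0x03CA, 0x0390,           # Iotadieresis row
          0x03A3, 0x03C3, 0x03C2]           # Sigma row
groups = build_map(sample)
for target, members in groups.items():
    print(f"{target:04X} <- " + " ".join(f"{m:04X}" for m in members))
```

Running the same check over every row is a quick way to catch a mistyped codepoint before the map goes into the font source.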
Question 2 — suppress tonos in ccmp
Also, when the tonos is a separate character in the text, I would like to compose it in ccmp, but directly to the Alpha glyph instead of Alphatonos (which is not present in the font and is now encoded on the Alpha glyph). Will this substitution break the source text if the user copies it or changes the font?
feature ccmp {
    script grek;
    language dflt;
    # Compose base letter + combining acute onto the bare base glyph,
    # so the tonos is suppressed rather than drawn.
    lookup ccmp_grek_1 {
        sub Alpha acutecomb by Alpha;
        sub Epsilon acutecomb by Epsilon;
        sub Eta acutecomb by Eta;
        sub Iota acutecomb by Iota;
        sub Omicron acutecomb by Omicron;
        sub Upsilon acutecomb by Upsilon;
        sub Omega acutecomb by Omega;
    } ccmp_grek_1;
} ccmp;
Answers
To confirm my understanding: this is an all-caps font, and you are expecting text to be displayed with the tonos mark suppressed, even at the beginning of words, correct?
It can get a little more complicated than your proposed multiple encoding scheme suggests, because the diaeresis can behave contextually in all-caps text. I suspect this is a little more frequent in polytonic, but I think there are also instances in monotonic where a sequence of two vowel letters not forming a diphthong are distinguished by the tonos being applied to the first vowel instead of the second, but in all-caps the suppressed tonos is replaced by a diaeresis on the second vowel:
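As an illustration of the rule just described (supplied here, not taken from the original post): Μάιος is commonly capitalised as ΜΑΪΟΣ, because dropping the tonos from ά would otherwise leave ΑΙ reading as a digraph. The sketch below works at the character level only. The `DIGRAPHS` list is an assumption about which vowel pairs need the treatment, and `monotonic_upper` is a hypothetical helper, not part of any library.

```python
import unicodedata

TONOS = "\u0301"      # combining acute (tonos)
DIAERESIS = "\u0308"  # combining diaeresis (dialytika)

# Assumed list of vowel pairs that read as a digraph once the tonos is gone.
DIGRAPHS = {"αι", "ει", "οι", "υι", "αυ", "ευ", "ηυ", "ου"}


def clusters(text):
    """Split text into [base, combining marks] pairs using NFD."""
    result = []
    for ch in unicodedata.normalize("NFD", text):
        if unicodedata.combining(ch) and result:
            result[-1][1] += ch
        else:
            result.append([ch, ""])
    return result


def monotonic_upper(text):
    """Uppercase, drop the tonos, and add a diaeresis where the tonos
    on the first vowel was what kept two vowels from forming a digraph."""
    out = []
    prev_base, prev_had_tonos = "", False
    for base, marks in clusters(text):
        had_tonos = TONOS in marks
        marks = marks.replace(TONOS, "")
        pair = (prev_base + base).lower()
        if prev_had_tonos and pair in DIGRAPHS and DIAERESIS not in marks:
            marks += DIAERESIS
        out.append(base.upper() + marks)
        prev_base, prev_had_tonos = base, had_tonos
    return unicodedata.normalize("NFC", "".join(out))


print(monotonic_upper("Μάιος"))     # tonos on the first vowel of the pair
print(monotonic_upper("καλημέρα"))  # ordinary case, tonos simply dropped
```

The key point is that the function must see the tonos on the first vowel in order to decide about the second one.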
"Will this substitution break the source text if the user copies it or changes the font?"
No, because the substitution is happening at the glyph level, and does not affect the source text character string. Of course, if the user switches fonts, they will see different results, presuming the other font does not include the kind of code you are proposing.
However, it could affect downstream text in a print-stream-distilled PDF. Acrobat (and some other PDF viewers?) parse glyph names in the embedded font to attempt to reconstruct the text for searching and copying. This is always an issue with multi-encoded glyphs, because the glyph name will only map to one of the possible encodings.
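The glyph-name issue can be pictured with a toy model. This is a deliberately simplified sketch and not how any particular PDF viewer is implemented: `CMAP` and `NAME_TO_CODEPOINT` are made-up tables covering three letters, with the assumption that each glyph name resolves to exactly one codepoint.

```python
# Multi-encoded font: four codepoints share the Alpha glyph.
CMAP = {
    0x0391: "Alpha", 0x03B1: "Alpha", 0x0386: "Alpha", 0x03AC: "Alpha",
    0x039B: "Lambda", 0x03BB: "Lambda",
    0x03A6: "Phi", 0x03C6: "Phi",
}

# Reconstructing text from glyph names: one name, one codepoint (assumed).
NAME_TO_CODEPOINT = {"Alpha": 0x0391, "Lambda": 0x039B, "Phi": 0x03A6}


def glyphs_for(text):
    """Character string -> glyph names, as the font's cmap would map them."""
    return [CMAP[ord(ch)] for ch in text]


def text_from_glyph_names(names):
    """Glyph names -> character string, as a name-based extractor would guess."""
    return "".join(chr(NAME_TO_CODEPOINT[name]) for name in names)


source = "\u03ac\u03bb\u03c6\u03b1"  # άλφα, as typed by the author
recovered = text_from_glyph_names(glyphs_for(source))
print(source, "->", recovered)
print("search for original succeeds:", source in recovered)
```

In this model the round trip returns capitals without the tonos, so both copying and searching see a different string from the one that was typed.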
John Hudson said: "this is an all-caps font"
Yes.
"and you are expecting text to be displayed with the tonos mark suppressed, even at the beginning of words, correct?"
I expect the font to appear according to Gerry Leonidas' Monotonic conversion, as when the text is set in all-caps.
"there are also instances in monotonic where a sequence of two vowel letters not forming a diphthong are distinguished by the tonos being applied to the first vowel instead of the second, but in all-caps the suppressed tonos is replaced by a diaeresis on the second vowel"
This is the most complicated part, and I'm not sure I understand it completely. John, are you talking about exceptional cases? How rare are they?
"the substitution is happening at the glyph level, and does not affect the source text character string"
Thanks for confirming that. Things are easier to understand when uppercase and lowercase are encoded to different glyphs, and with double encoding I was confused. So it's good to know now.
"However, it could affect downstream text in a print-stream-distilled PDF"
I remember a discussion about this. As I understand it, the problem occurs only when copying text from such a PDF. It should not be a problem for just viewing the PDF, right?
"This is the most complicated part, and I'm not sure I understand it completely. John, are you talking about exceptional cases? How rare are they?"
I don’t know what the frequency is. The point is that this is an orthographic rule that is possible to catch at the glyph level, but not if the cmap has already removed the distinction between accented and unaccented letters. In order to handle this rule contextually, one needs to preserve the character distinction into the GSUB level.
"As I understand it, the problem occurs only when copying text from such a PDF. It should not be a problem for just viewing the PDF, right?"
Text searching within the PDF would also be affected.
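To see why the distinction has to survive past the cmap, compare the glyph runs a collapsed and a non-collapsed mapping would hand to GSUB. This is a sketch only: the `NAMES` table and the `glyph_run` helper are hypothetical, and the two sample words (Μάιος and παιδί) are chosen here for illustration.

```python
import unicodedata

TONOS = "\u0301"

# Hypothetical glyph names for the handful of letters used below.
NAMES = {
    "\u0391": "Alpha", "\u0394": "Delta", "\u0399": "Iota", "\u039C": "Mu",
    "\u039F": "Omicron", "\u03A0": "Pi", "\u03A3": "Sigma",
}


def glyph_run(text, collapse):
    """Map characters to glyph names. With collapse=True the accented
    letters land on the base glyph, as in the proposed encoding map."""
    run = []
    for ch in unicodedata.normalize("NFC", text):
        decomposed = unicodedata.normalize("NFD", ch)
        name = NAMES[decomposed[0].upper()]
        if TONOS in decomposed and not collapse:
            name += "tonos"
        run.append(name)
    return run


for collapse in (True, False):
    mayos = glyph_run("Μάιος", collapse)   # the iota needs a diaeresis in caps
    paidi = glyph_run("παιδί", collapse)   # here alpha + iota is a true digraph
    print(collapse, mayos[1:3], paidi[1:3])
```

With the collapsed map both words present the same Alpha, Iota pair, so no contextual lookup can tell which Iota should take the diaeresis. With separate glyphs the first word presents Alphatonos, Iota, which a contextual substitution could match before a later lookup swaps in the unaccented shape.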
