A little curiosity about COMBINING GRAPHEME JOINER
mauro sacchetto
In the glyphs table, at uni034F, there is COMBINING GRAPHEME JOINER.
From Wikipedia I learn that it «is a Unicode character that has no visible glyph and is "default ignorable"».
So how should it be "designed"? Must the slot remain empty, or something else?
Thank you
ms
Comments
-
An empty glyph with proper Unicode, just that.
-
Ah, ok... thank you
-
Normally, you do not need to include it in a font at all.
-
A bit more background:
This character ended up not being used for its original intended purpose, so the name is a bit misleading. In fact, the most common use of this character now is to prevent some kinds of interaction between characters. In particular, this character is recommended by Unicode to be used to force overrides of canonical combining class reordering during string normalisation (and if you don't know what that means, you probably don't need to know). The best known use case is Biblical Hebrew, for which the canonical combining classes assigned to Hebrew vowel and cantillation marks are wrong but cannot be fixed due to stability agreements between Unicode and other standards bodies. So Unicode specifies that U+034F can be inserted between combining marks to prevent their reordering during normalisation.
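For anyone who wants to see the reordering happen, here is a minimal Python sketch using the standard `unicodedata` module. The sequence lamed + patah + hiriq is only an illustrative choice on my part, and the helper name `codepoints` is made up for the example.

```python
import unicodedata

LAMED = "\u05DC"   # HEBREW LETTER LAMED, combining class 0
PATAH = "\u05B7"   # HEBREW POINT PATAH, combining class 17
HIRIQ = "\u05B4"   # HEBREW POINT HIRIQ, combining class 14
CGJ = "\u034F"     # COMBINING GRAPHEME JOINER, combining class 0


def codepoints(text):
    """Hypothetical helper: render a string as a list of U+XXXX values."""
    return " ".join(f"U+{ord(ch):04X}" for ch in text)


# Typed order: patah before hiriq.
plain = LAMED + PATAH + HIRIQ
# Same order, with CGJ inserted between the two marks.
guarded = LAMED + PATAH + CGJ + HIRIQ

nfc_plain = unicodedata.normalize("NFC", plain)
nfc_guarded = unicodedata.normalize("NFC", guarded)

# Without CGJ the marks are sorted by combining class (14 before 17).
print(codepoints(nfc_plain))    # U+05DC U+05B4 U+05B7
# With CGJ (class 0) in between, the typed order survives.
print(codepoints(nfc_guarded))  # U+05DC U+05B7 U+034F U+05B4
```

The same result should hold for NFD, since canonical reordering is part of both forms.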
If included in fonts, the glyph for this character should be empty and zero-width.
-
Thomas Phinney said: Normally, you do not need to include it in a font at all.
-
The name is confusing because it's more the opposite.
There are examples in the English and German Wikipedias:
https://en.wikipedia.org/wiki/Combining_Grapheme_Joiner
https://de.wikipedia.org/wiki/Combining_Grapheme_Joiner
Technically it blocks normalisation across it: e. g. a + CGJ + combining_dieresis should not be composed into the code point ä. As canonical reordering is part of normalisation, that is blocked as well.
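A quick way to check the composition case is Python's standard `unicodedata` module. This is only a sketch; the variable names are mine.

```python
import unicodedata

CGJ = "\u034F"        # COMBINING GRAPHEME JOINER
DIAERESIS = "\u0308"  # COMBINING DIAERESIS

# a + combining diaeresis: NFC composes this into the single code point ä.
composed = unicodedata.normalize("NFC", "a" + DIAERESIS)

# a + CGJ + combining diaeresis: CGJ sits between the base and the mark,
# so NFC leaves the three code points as they are.
blocked = unicodedata.normalize("NFC", "a" + CGJ + DIAERESIS)

print(len(composed), composed == "\u00E4")            # 1 True
print(len(blocked), [f"U+{ord(c):04X}" for c in blocked])
# 3 ['U+0061', 'U+034F', 'U+0308']
```

So the two strings are no longer canonically equivalent, even though they should look the same when rendered.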
Other examples are digraphs like <ch>, <ph>, <sh> or multigraphs like <sch>, which some languages treat as graphemes (in the linguistic sense) because they represent single phonemes, and which language-specific sorting (collation order) may therefore treat as a single unit. In this case CGJ can be inserted between the letters so that the sequence is treated as separate characters again, e. g. for sorting.
It may also prevent ligatures, but for this purpose ZWNJ should be used, or if the letters belong to two different syllables the soft hyphen would be a better choice.