The Mysteries of the Unicode Table


Comments

  • Chris Lozos Posts: 1,093
    edited March 2016
    @Ray Larabie

    The chi, the lambda, and most Greek lowercase letters are not as tightly controlled a system as Latin glyphs. Greek is more free-flowing and not as dependent on geometry as Latin may appear. Think of it as crafted writing rather than construction and architecture. While the more modern, Latinized Greek fonts do take on more rigidity of construction, they do not shy away from this; they embrace it. The more traditional Greek lowercase forms have more the feel of Matisse gesture drawings, with vitality. They flow together rather than fit together. Latinized Greek is more like soldiers marching in step, while the more humanized Greek forms are line-dancing in harmony with each other.
  • Ray Larabie Posts: 786
    The General Punctuation range (2000-206F) is full of interesting stuff. I think the following glyphs are zero width: 200B, 200C, 200D, 200E, 200F, 2028, 2029, 202A, 202B, 202C, 202D, 202E, 2060, 2061, 2062, 2063, 2064, 2066, 2067, 2068, 2069, 206A, 206B, 206C, 206D, 206E, 206F. 2024 seems to be a period for some primordial Xerox encoding.

    Two more questions:

    Spacing Modifier Letters (02B0-02FF) contains the non-zero width accents we usually include in our fonts. For example: ring at 02DA. Apart from using these glyphs in composites, when are these actually used? We already have a combining ring at 030A. Who actually uses the 02DA ring? Do some applications use these as combining accents?

    If I'm using regular accents for lowercase and unencoded compact accents for capitals, how do I deal with combining accent substitution? I've got @comb, @combcap and @cap classes so I can check if the preceding glyph is a capital and substitute the alternate combining accent. Should these combining accents be placed at lowercase height so applications will then raise them to cap height? If I place those alternate combining accents at cap level, will applications bump them up too high over the capitals?
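
One way to sanity-check the General Punctuation list in this post is to look up each code point's Unicode general category. This is a minimal sketch using Python's standard `unicodedata` module. The mapping from category to "zero width" is an assumption made for illustration: the Unicode data says which characters are format controls or separators, but the actual advance width is whatever the font assigns.

```python
import unicodedata

# Code points from the list in the post (General Punctuation block).
CANDIDATES = [
    0x200B, 0x200C, 0x200D, 0x200E, 0x200F, 0x2028, 0x2029,
    0x202A, 0x202B, 0x202C, 0x202D, 0x202E,
    0x2060, 0x2061, 0x2062, 0x2063, 0x2064,
    0x2066, 0x2067, 0x2068, 0x2069,
    0x206A, 0x206B, 0x206C, 0x206D, 0x206E, 0x206F,
]

# Assumption: format controls (Cf) and line/paragraph separators (Zl, Zp)
# are the categories a font would normally draw with no advance width.
INVISIBLE_CATEGORIES = {"Cf", "Zl", "Zp"}


def describe(cp):
    """Return (code point, general category, character name)."""
    ch = chr(cp)
    return cp, unicodedata.category(ch), unicodedata.name(ch, "<unnamed>")


def looks_invisible(cp):
    """True if the general category is one of the assumed invisible ones."""
    return unicodedata.category(chr(cp)) in INVISIBLE_CATEGORIES


if __name__ == "__main__":
    # U+2024 is appended for comparison; it is a visible punctuation mark.
    for cp in CANDIDATES + [0x2024]:
        code, cat, name = describe(cp)
        flag = "invisible?" if looks_invisible(cp) else "visible"
        print(f"U+{code:04X}  {cat}  {flag:10}  {name}")
```

Running it also prints the formal name of U+2024, which is a quick way to see what the character was encoded as.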
  • Ray, the composites you refer to actually use the combining diacritics. Spacing modifier letters are used in phonetics, in programming, and also as metainformation (you use a spacing modifier tilde to show the tilde). I suppose there are other uses I am not aware of.

    Regarding the accent substitution, this is done with ccmp and mark positioning. You define, for example, that á is made by a+acutecomb while Á is made by A+acutecomb.uc.
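
The substitution logic described here can be sketched in a few lines of Python. This is only a simulation of what a contextual ccmp rule would do, not a shaping engine and not anyone's actual font source. The glyph names and the class contents are hypothetical, modelled on the @cap, @comb and @combcap classes mentioned earlier in the thread.

```python
# Hypothetical glyph classes, named after the @cap / @comb / @combcap
# classes mentioned in the thread. All glyph names are illustrative.
CAP = {"A", "E", "O"}
COMB_TO_COMBCAP = {
    "acutecomb": "acutecomb.uc",
    "gravecomb": "gravecomb.uc",
    "ringcomb": "ringcomb.uc",
}


def apply_cap_accents(glyphs):
    """Swap a combining accent for its .uc form when it follows a capital.

    Roughly what a contextual rule along the lines of
    `sub @cap @comb' by @combcap;` is meant to achieve. This walks a list
    of glyph names; a real shaper works on the font's GSUB tables.
    """
    out = []
    for glyph in glyphs:
        prev = out[-1] if out else None
        # An accent also counts as "following a capital" when it sits on
        # top of an already-substituted cap accent (stacked marks).
        follows_cap = prev in CAP or prev in COMB_TO_COMBCAP.values()
        if glyph in COMB_TO_COMBCAP and follows_cap:
            out.append(COMB_TO_COMBCAP[glyph])
        else:
            out.append(glyph)
    return out


if __name__ == "__main__":
    print(apply_cap_accents(["a", "acutecomb"]))
    print(apply_cap_accents(["A", "acutecomb"]))
    print(apply_cap_accents(["A", "acutecomb", "gravecomb"]))
```

The sketch only covers the glyph swap. Where the cap accents end up vertically is decided separately by the mark anchors, which is the part Ray's positioning question is about.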
  • edited March 2016
    Spacing Modifier Letters (02B0-02FF) contains the non-zero width accents we usually include in our fonts. For example: ring at 02DA. Apart from using these glyphs in composites, when are these actually used?

    Some Austronesian languages use glottal stops with different names and occasional local shape variations, including the okina, eta and fakauʻa (U+02BB). Skolt Sami distinguishes between ʼ (U+02BC), ʹ (U+02B9), and ˈ (U+02C8). The modifier letter apostrophe (U+02BC) also surfaces in some Na-Dené languages.

    Not implying that is the extent of their use.
  • edited March 2016
    For example: ring at 02DA.

    Note that the legacy diacritical marks (see under the heading Legacy Marks) included in Unicode are now primarily useful for discussing the marks themselves. I.e., I can type ˚ (U+02DA) when I talk about the ring accent, instead of having to type a space before the combining ring above (U+030A) for a similar result:  ̊

    Apart from using these glyphs in composites, when are these actually used?

    I personally don’t use them for composites either, as that would require me to split the diacritical marks in our character set between spacing marks and zero-width marks, or keep some of them without Unicode value (there is, for example, no spacing comma below).

    (Edited for clarity)
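
The relationship between the spacing ring (U+02DA) and the combining ring (U+030A) can be read straight from the Unicode character data. A minimal sketch using Python's standard `unicodedata` module; the helper name `profile` is made up for this example.

```python
import unicodedata

SPACING_RING = "\u02DA"    # the spacing ring discussed in the post
COMBINING_RING = "\u030A"  # the combining ring above


def profile(ch):
    """Collect the properties that separate spacing from combining marks."""
    return {
        "name": unicodedata.name(ch),
        "category": unicodedata.category(ch),
        # Non-zero means the character is a combining mark that takes
        # part in canonical reordering.
        "combining_class": unicodedata.combining(ch),
        # Empty string if the character has no decomposition mapping.
        "decomposition": unicodedata.decomposition(ch),
    }


if __name__ == "__main__":
    for ch in (SPACING_RING, COMBINING_RING):
        print(f"U+{ord(ch):04X}", profile(ch))

    # Compatibility normalization turns the spacing ring into
    # "space + combining ring", the workaround described in the post.
    nfkd = unicodedata.normalize("NFKD", SPACING_RING)
    print([f"U+{ord(c):04X}" for c in nfkd])
```

The compatibility decomposition is why typing a space followed by U+030A gives a similar-looking result to U+02DA.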


  • U+200C and U+200D are a little different. They can be no-outline glyphs, but for scripts in which these are used as layout control characters it is helpful to have visual representations for editing purposes. Software like MS Word has an option to display control characters in text, and does so by using glyphs in the font; when this option is disabled, which is most of the time, display of these glyphs is suppressed. Conventions for display of these and other layout control characters vary, but typically involve a thin vertical bar to make it easy to identify the insertion point in text when displayed, topped by a small symbol indicating the character. And, of course, all zero width. These are the forms I have come to favour, the first few following Microsoft conventions:



    Thanks for pointing this out.

    John Hudson said:
    BTW, note that the 'combining grapheme joiner' is a bit of an oddity, not least because it doesn't join graphemes. There was a period when it looked like it might be deprecated, but then it was found to be useful for preventing mark reordering during normalisation, which is necessary for Biblical Hebrew.
    The Wikipedia page for 034F does a good job of explaining. With Hebrew examples, too. 
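
The reordering that the combining grapheme joiner prevents can be reproduced with Python's standard `unicodedata` module. This is a small sketch, assuming the lamed + patah + hiriq sequence as the test case; the variable names are made up for the example.

```python
import unicodedata

LAMED = "\u05DC"  # Hebrew letter lamed
PATAH = "\u05B7"  # Hebrew point patah, canonical combining class 17
HIRIQ = "\u05B4"  # Hebrew point hiriq, canonical combining class 14
CGJ = "\u034F"    # combining grapheme joiner, combining class 0

# The author's intended order: patah first, then hiriq.
plain = LAMED + PATAH + HIRIQ
# Same sequence with a CGJ separating the two points.
guarded = LAMED + PATAH + CGJ + HIRIQ


def codepoints(s):
    """Format a string as a list of U+XXXX code points."""
    return " ".join(f"U+{ord(c):04X}" for c in s)


# Normalization sorts adjacent marks by combining class, so the hiriq
# (14) is moved in front of the patah (17) in the plain sequence.
normalized_plain = unicodedata.normalize("NFC", plain)
# CGJ has class 0, which blocks the sort across it.
normalized_guarded = unicodedata.normalize("NFC", guarded)

if __name__ == "__main__":
    print("plain:  ", codepoints(plain), "->", codepoints(normalized_plain))
    print("guarded:", codepoints(guarded), "->", codepoints(normalized_guarded))
```

Without the CGJ the two points are swapped by normalization; with it, the original order survives.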
  • The General Punctuation range (2000-206F) is full of interesting stuff. I think the following glyphs are zero width: 200B, 200C, 200D, 200E, 200F, 2028, 2029, 202A, 202B, 202C, 202D, 202E, 2060, 2061, 2062, 2063, 2064, 2066, 2067, 2068, 2069, 206A, 206B, 206C, 206D, 206E, 206F. 2024 seems to be a period for some primordial Xerox encoding.
    Thanks. Gotta check all these characters out.
    Sorry to go off topic, but wasn't Primordial Xerox the name of a heavy metal band from the eighties?
  • Chris Lozos Posts: 1,093
    No, Richard, they were just copies ;-)
  • John Hudson Posts: 1,486
    Note that the legacy diacritical marks (see under the heading Legacy Marks) included in Unicode are now primarily useful for discussing the marks themselves. I.e., I can type ˚ (U+02DA) when I talk about the ring accent, instead of having to type a space before the combining ring above (U+030A) for a similar result:  ̊ 

    A distinction should be made between legacy spacing accents such as U+0060 GRAVE ACCENT and the spacing modifier signs in the 02XX block. The latter are used in phonetics.

  • edited March 2016
    Thanks for the clarification, John. I mixed them up in my head somehow. Please disregard my last comment.
  •  Chris Lozos said:
    No, Richard, they were just copies ;-)
    Yet another tribute band - a nickel a piece.   
  • Ray Larabie Posts: 786

    Greek question: φ vs ψ (phi vs psi)

    Does it look better if the vertical strokes start at the same height? Assuming it's the style of φ that has a vertical stroke, not the curly type. From looking at other fonts it seems pretty arbitrary. To my non-Greek eye, it looks better when they match.

    Cyrillic question: ф vs dp (ef vs dp)

    I've noticed that in many contemporary Cyrillic typefaces, a single loop ф is favored over the type that looks like a dp collision. What sort of typeface suits each style? Old-fashioned, ultra-modern, script, handwriting etc.
  • Chris Lozos Posts: 1,093
    Does it look better if the vertical strokes start at the same height?

    Greek is not vertically benchmarked as rigidly as Latin.  The question is more one of balance than alignment.  Think of a Bach phrase compared to a Sousa march.  They are both in an order, but not the same one.
  • John Hudson Posts: 1,486
    Ray:

    Cyrillic question: ф vs dp (ef vs dp)
    I've noticed that in many contemporary Cyrillic typefaces, a single loop ф is favored over the type that looks like a dp collision. What sort of typeface suits each style? Old-fashioned, ultra-modern, script, handwriting etc.

    Hopefully Maxim will chime in with a fuller and more nuanced explanation, but in the meantime:

    Idiomatically, the double-bowl ф works best in types with a vertical axis and an expansion stroke model (neoclassical/transitional and romantic/didone). The single bowl ф works better in the neo-renaissance humanist style.

    In sans serif types, this feature doesn't always translate in the same way that didone -> grotesque sans and renaissance -> humanist sans features of Latin script tend to. There are well known early 20th Century Russian grotesques with a single-bowl ф, often quite squarish. So that's the route I took with Helvetica World, despite the stroke construction of Helvetica suggesting a low-contrast didone structure. See page 22 of my Serebro Nabora presentation.


  • Nick Shinn Posts: 1,306
    Ray, it doesn’t make any difference to readers.

    Those who know type generally like to see new faces which are close to existing styles conform to the familiar letter shapes. So some research is necessary if you want to please that audience.

    Otherwise, and especially if you are designing a face with some novel personality, make both variants and judge for yourself which “reads” best in native language setting.




  • edited June 2016
    The English Wikipedia entry for Ȳ claims that it is used in Livonian. There is no source for this claim, and I’d venture to say it is a mistake, seeing as neither Livonian, Latvian nor Estonian uses the letter Y (except in loanwords).

    (Was there a question here somewhere?)

  • @Frode Bo Helland -- Unicode says it is used in Livonian; uni0232 and uni0233.
  • Where does this come from? None of my sources corroborate. There is, however, the U with macron.
  • George Thomas Posts: 441
    edited June 2016
    Michael Everson shows it in parens, and writes: "Letters in (parentheses) are fundamental letters normal to the alphabet of a language, used in writing native or naturalized (non-foreign) words, but which are, in the sources, interfiled with the base letter."

    Geonames.de shows the base /Y only, in parens; other sources don't show the base letter.
  • Katy Mawhood Posts: 190
    edited January 2017
    System of Scansion
    In 2-level notation, if the classical nonictus (metrical breve) uses uni23D1, what is the Unicode value for the classical ictus? Is it a "horizontal scan"?

    https://en.wikipedia.org/wiki/Scansion#2-level_notations
  • John Hudson Posts: 1,486
    I believe the en dash character is typically used for the classical ictus. I know when I was designing the Brill type, I coordinated the width and height of U+23D1 with the en dash for this reason.
  • Ray Larabie Posts: 786

    What's permanent paper?
    267E permanent paper symbol.
  • Kent Lew Posts: 799
    Ray — Here’s what typical usage looks like:



    That language is standardized, as indicated in the Wikipedia article Khaled linked to.

    Incidentally, we included this character in Miller Text when it was expanded a few years ago, since Matthew had already drawn it at the special request of someone (possibly Will Powers?). I’ve taken to including it in my own designs now also, given my target market.
  • There is a second use for the permanent paper sign. Organizations with a documentation policy, or those adopting ISO 9xxx standards, need to indicate the validity of a given document or how many years it should be kept. Documents destined for permanent use or archiving could receive the permanent paper sign.