Space glyphs

Hi, what about these space glyphs, thin space (uni2009) etc.? Are they really needed in a font?
Do they override the app's own handling if they are present, and do apps handle them if they are not?


  • Kent Lew Posts: 905
    I managed to find our old Typophile thread on this topic in the Wayback Machine. Might be of interest to you:

  • Thank you, Kent!
  • Kent Lew said:
    I managed to find our old Typophile thread on this topic in the Wayback Machine. Might be of interest to you:

    Thanks for digging that up, Kent.  A trip down memory lane.
    And my hat is off, because I tried dredging up the Typophile site on Wayback a while back, with no luck at all.
  • @Simon Cozens Thank you! There's a lot of good info there.
  • I had a harddisk crash about 10 days ago so that set me back a bit :(
  • Kent Lew Posts: 905
    Richard — I just happened to have that thread bookmarked from long ago, so I had the specific node number, which made it easier to search Wayback.
  • @Dave Crossland and @Simon Cozens - Access to the threads and tutorials on Typophile would be great. Kudos. Let us know. 

  • @Tobias Kvant - since that thread on Typophile, from 2010, the HTML5 recommendation was published by the W3C. You'll find there a long list of Unicode code points that are mapped to human-friendly names, PostScript style.

    In the HTML4 rec there is a smaller list called character "entities", but in HTML5 they are called character "references".  (Since the "entities" from HTML4 are a subset of HTML5's "references", I guess the term "entities" can be considered deprecated.)
    Two things:
    1) There are new references corresponding to empty space characters in HTML5.
    In addition to &emsp;, there is now &emsp13; which is the 3-per-em space (U+02004) and &emsp14; which is the 4-per-em space (U+02005). You'll see those Unicode points listed in the image contained in the initiating post from the old Typophile thread.
    [Note on the naming convention: I assume 13 is meant to mean the fraction 1/3, that is, the space created is one-third of an em. Same with 14 - the space created is one-quarter (1/4) of an em.]

    There are also these other glyphless HTML references to consider:

    From HTML4 ...

    &zwj;   200D
    &zwnj;  200C

    From HTML5 ...

    &ThinSpace;         2009  (equivalent to &thinsp;, just an alternate spelling)
    &VeryThinSpace;     200A  (equivalent to &hairsp;, just an alternate spelling)
    &NonBreakingSpace;  00A0  (equivalent to &nbsp;, just an alternate spelling)

    &numsp;             2007  (width calculated same as the standard glyph name 'figurespace')
    &ThickSpace;        205F 200A
    (I am assuming ThickSpace creates a space equal to the two spaces added together contiguously, but I'm going to check.)

    &ZeroWidthSpace;         200B
    &NegativeVeryThinSpace;  200B
    &NegativeThinSpace;      200B
    &NegativeMediumSpace;    200B
    &NegativeThickSpace;     200B
    [Note: I find the above a little weird. I just read that: "this character is intended for invisible word separation and for line break control; it has no width, but its presence between two characters does not prevent increased letter spacing in justification". So why it is being short-named 'negativespace' of varying dimensions, I don't know. Can anybody shed some light on what the thinking was behind this?]

    &rlm;  200F
    &lrm;  200E

    2) All modern browsers support these references. They are implemented as universally as is possible. 
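
    The code points quoted in this post can be cross-checked against a copy of the HTML5 named character reference table. A minimal sketch, assuming Python 3 and its standard-library html.entities.html5 table (whose keys carry the trailing semicolon); the helper names code_points and same_character are made up for illustration:

```python
from html import unescape
from html.entities import html5

# A selection of the space-related reference names discussed in this thread.
SPACE_REFERENCES = [
    "emsp13;", "emsp14;", "zwj;", "zwnj;",
    "ThinSpace;", "VeryThinSpace;", "NonBreakingSpace;",
    "numsp;", "ThickSpace;", "ZeroWidthSpace;",
    "NegativeThinSpace;", "rlm;", "lrm;",
]


def code_points(name: str) -> str:
    """Return what a named reference expands to, as 'U+XXXX' tokens."""
    return " ".join(f"U+{ord(ch):04X}" for ch in html5[name])


def same_character(first: str, second: str) -> bool:
    """True if two named references decode to the same text."""
    return unescape("&" + first) == unescape("&" + second)


if __name__ == "__main__":
    for name in SPACE_REFERENCES:
        print(f"&{name:<22} {code_points(name)}")
```

    For ThickSpace the table gives two code points, U+205F followed by U+200A, that is, a medium mathematical space followed by a hair space.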

    My takeaway from revisiting all this is that, while it's very nice that InDesign has a menu which will compute, from the font, the value for these spaces, other environments don't. So there is no solid reason for not including them in every font's character set, universally. 
    As Si's comment on the thread from 2010 says, in essence - you lose nothing by including them and in doing so you ward off the possibility of a user complaint.
    If Microsoft considers it best practice, why wouldn't I?  And if you think the font will end up being adapted for web use at some point in the future, you're just asking for trouble leaving them out. IMHO. 

  • joeclark Posts: 122
    [Dave Crossland] and I have managed to dredge up the whole Typophile site... and are secretly working on extracting the goodness in a user-friendly form. 
    Yet Andy Crewdson’s New Series (2002) sits on one decaying CD-ROM that one person has, yet will not publish.

    On the other forum that shall not be named but shall have its own world to rule over post-Rapture, every year I asked where Andy was. And nobody ever knew.
  • joeclark Posts: 122
    One reason why space “glyphs” go unencoded in some applications and on some platforms, and did so for quite a few years, can be traced back to PostScript and PDF. It was not atypical to find PDFs that did not encode space characters but essentially lifted the pen (à la the children’s programming language Logo), moved over to the right a little bit, and dropped the pen and continued typesetting. You the human viewer interpreted the gap as a space character, but it did not exist in the underlying file.

    Tagged PDF, including PDF/A and PDF/UA (ask me about the latter), specifically reinserts space characters and carriage returns inside PDFs. (I don’t think there is a thorough and exhaustive list of whitespace characters that are reinserted.) That way screen readers and a host of other technologies, not incidentally including search engines (which are blind users), can correctly read the text. It does seem baffling in retrospect that anyone ever thought this was a viable way to typeset.

    Next! In HTML development (especially in standards-compliant development, not that anyone remembers that and not that anyone but me publishes valid HTML anymore), it is a terrible idea to use the actual space characters instead of a named entity or reference of some kind. Why? Because you cannot tell these space characters apart at a glance in source code. Try finding the one errant nonbreaking space in a 500-word article if it looks exactly the same as a default wordspace.

    So in fact the only space character one should use as a space character is the common wordspace. Everything else should use some kind of entity. (There will be rare JavaScript scenarios when you can’t.)
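
    One way to act on this advice is to scan source text for the look-alike characters and swap in named references. A minimal sketch in Python, assuming a hand-picked mapping (the table VISIBLE_NAMES and the functions find_spaces and name_spaces are hypothetical names, not part of any library):

```python
import re

# Hard-to-see characters mapped to HTML named references.
# The selection is illustrative, not exhaustive.
VISIBLE_NAMES = {
    "\u00A0": "&nbsp;",
    "\u2002": "&ensp;",
    "\u2003": "&emsp;",
    "\u2004": "&emsp13;",
    "\u2005": "&emsp14;",
    "\u2007": "&numsp;",
    "\u2009": "&thinsp;",
    "\u200A": "&hairsp;",
    "\u200B": "&ZeroWidthSpace;",
    "\u200C": "&zwnj;",
    "\u200D": "&zwj;",
}

# One character class that matches any key of the table.
_PATTERN = re.compile("[" + "".join(VISIBLE_NAMES) + "]")


def find_spaces(text: str) -> list:
    """List (offset, reference) pairs for every special space in text."""
    return [(m.start(), VISIBLE_NAMES[m.group()]) for m in _PATTERN.finditer(text)]


def name_spaces(text: str) -> str:
    """Replace each special space with its named reference."""
    return _PATTERN.sub(lambda m: VISIBLE_NAMES[m.group()], text)
```

    The ordinary wordspace (U+0020) is deliberately left out of the table, so it passes through untouched.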
  • ...I asked where Andy was. And nobody ever knew.

    Is he supposed to have disappeared or something? He appears to be pretty much alive and well, working as a web developer. It took me less than fifteen minutes to locate him.

  • Richard Fink Posts: 165
    edited February 2016
    joeclark said:

    So in fact the only space character one should use as a space character is the common wordspace. Everything else should use some kind of entity. (There will be rare JavaScript scenarios when you can’t.)
    @joeclark Thanks for the tip about using the reference (or are "entities" preferred by you?), instead of the Unicode point.
    And, BTW, somebody cares about the standards, 'cause I was shocked by the fact that all the browsers support the HTML5 Character References. At least on the browser end of things, if not the author end of things, compliance seems to count for quite a lot.
    Especially after the kerfuffle between Microsoft and the EU court instigated by Opera's complaint about IE's lack of support for the current standard.  Different world today, it really is.
    [Addendum: I didn't mean to say that the only thing Microsoft reacted to was a kick in the pants by the EU. There's more to the story than brute legal force.
    There was, simultaneously - and I've read first-hand reporting about it - 
    a change in perception regarding standards and the need for compliance, and also toward open-source code and the need, at the very least, for developers to know what code major vendors like Microsoft consider proprietary and what they consider free for everybody to use.]
  • joeclark said:

    Yet Andy Crewdson’s New Series (2002) sits on one decaying CD-ROM that one person has yet will not publish.
    What was it, actually? The link you provide is quite terse :) 
  • Thanks everyone for your input. Seems including them is close to obligatory nowadays, then.
  • Richard Fink Posts: 165
    edited February 2016
    @Tobias Kvant 

    Not so fast!

    Yesterday morning I thought of an HTML test that would prove to me that a font SHOULD have these characters defined as a matter of best practice. Simply as a matter of giving authors the behavior they expect in any medium.

    I ran some tests. I explain the methodology and, for myself and everybody in the world, I now consider the matter closed:

    I created a font with no spaces defined in it named nospacesfont.
    I created a font with spaces defined in it named spacesfont but deliberately expanded the EM space (U+02003).

    I wrote an HTML page that defined three font families in a typical CSS "font stack" for the body of the page:
    font-family: nospacesfont, spacesfont, serif;

    In the body of the page, within a paragraph tag, I put this:


    The result in the browser was an elongated emspace  like this:

    As it was supposed to do, the browser found that no &emsp; had been defined in the nospacesfont, the first font in the stack, and it then went on to see if the next font in the stack, spacesfont, had the &emsp; defined and, finding that it did, the browser went ahead and displayed it. (Tested on FF, IE, Chrome.)
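
    The lookup described here can be modelled as a per-character walk down the font stack. A minimal sketch in Python, assuming each font is reduced to the set of code points it covers; the coverage tables and the names pick_font and fonts_per_character are hypothetical, and real browsers do considerably more than this:

```python
from typing import Dict, List, Optional, Set, Tuple


def pick_font(char: str, stack: List[str], coverage: Dict[str, Set[int]]) -> Optional[str]:
    """Return the first family in the stack that covers char, else None."""
    for family in stack:
        if ord(char) in coverage.get(family, set()):
            return family
    return None  # left to the generic family or system fallback


def fonts_per_character(
    text: str, stack: List[str], coverage: Dict[str, Set[int]]
) -> List[Tuple[str, Optional[str]]]:
    """Pair every character of text with the family that would render it."""
    return [(ch, pick_font(ch, stack, coverage)) for ch in text]


# Hypothetical coverage: the first font has letters only,
# the second also defines the em space (U+2003).
LETTERS = set(map(ord, "abcdefghijklmnopqrstuvwxyz"))
COVERAGE = {
    "nospacesfont": set(LETTERS),
    "spacesfont": LETTERS | {0x2003},
}
STACK = ["nospacesfont", "spacesfont", "serif"]

if __name__ == "__main__":
    for ch, family in fonts_per_character("a\u2003b", STACK, COVERAGE):
        print(f"U+{ord(ch):04X} -> {family}")
```

    In this model the letters stay in nospacesfont and only the em space is taken from spacesfont, which matches the behavior reported above.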

    Font fallback is a very profound difference between text in browsers and text in graphic mediums! 

    Now, you might say: Hey, Rich, your test page is rigged. What's the big deal? These space characters have fixed widths in any font. What's the difference if it uses the emspace character from the system font? It's still going to be the same width!

    For the emspace character, that will probably be true. However, in focusing on the width, you are forgetting that font fallback also brings the vertical metrics of the fallback font into visual play because even if it's displaying a single character from the fallback font, the browser will still use the vertical metrics of that font to calculate the line height.

    So I wrote another test page to demonstrate the effect. It was very similar to what I did in the first test page but in this one, I added to the vertical metrics of the fallback font to demonstrate what happens. And I had to add more text to the page to get full sentences to show how the line-height widens where the fallback font comes into play.

    You can see additional leading between the second and third lines:

    I inserted a thinspace character (&thinsp;) in the third line - it doesn't matter where. Since there is no thinspace defined in the font at the top of the font stack, the browser takes the thinspace character from the fallback font along with the vertical metrics of that font. And that's what causes the extra space.
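
    The effect can be illustrated with a deliberately simplified model: assume the height of a line is the largest ascent plus the largest descent among the fonts actually used on that line. This is a sketch of the behavior described in this post, not of the full CSS line layout rules, and the Font class, the metric values, and the function line_height are all hypothetical:

```python
from dataclasses import dataclass
from typing import FrozenSet, List


@dataclass(frozen=True)
class Font:
    name: str
    ascent: int   # font units above the baseline
    descent: int  # font units below the baseline
    code_points: FrozenSet[int]


def line_height(line: str, stack: List[Font]) -> int:
    """Simplified line height: max ascent + max descent of the fonts used."""
    used = []
    for ch in line:
        for font in stack:
            if ord(ch) in font.code_points:
                used.append(font)
                break  # first covering font wins, as in a font stack
    if not used:
        return 0
    return max(f.ascent for f in used) + max(f.descent for f in used)


# Hypothetical metrics: the fallback font has taller vertical metrics
# and is the only one that defines the thin space (U+2009).
PRIMARY = Font("nospacesfont", 800, 200, frozenset(map(ord, "abc")))
FALLBACK = Font("spacesfont", 1100, 400, frozenset(map(ord, "abc\u2009")))

if __name__ == "__main__":
    print(line_height("abc", [PRIMARY, FALLBACK]))
    print(line_height("a\u2009bc", [PRIMARY, FALLBACK]))
```

    Under these assumptions a single thin space is enough to pull the fallback font's metrics into the line, even though every visible glyph still comes from the first font.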

    One definition of quality control is quite simple: Deliver a product that fully meets the customer's expectations.
    Would introducing a mysterious and buggy-looking result like this meet the expectations of any professional font creator's customer?

    And you know, I'm not sure I even know how to go about fixing a problem like this using CSS alone. Extra line-height showing up seemingly at random like this would drive me nuts.

    Now, I'm done.