Special dash things: softhyphen, horizontalbar

Comments

  • Adolfo Jayme said:
    That would be the case in Spanish and Portuguese as well, and I guess it’s due to the lack of support for U+2015 in most publishing fonts. In these languages, as a matter of style you can use both U+2014 and U+2013 as quotation dashes, but to avoid the wrong line-breaking behavior of these characters, you may need to add U+2060 next to them.
    I found this thread while trying to understand the softhyphen logic (and I decided not to include it as a glyph, as John suggested).

    I find relevant what Adolfo says about the use of endash: I see both the endash (U+2013) and the emdash (U+2014) used as quotation dashes. But as far as Italian books go, my impression is that in hot metal the emdash was more often used for quotations.
    In Italian, it seems to have been the predominant use in novels, historically.
    Whether there were two separate glyphs in lead typefaces, one for the emdash and one for the quotation mark/horizontal bar (U+2015), I don’t know, but more or less the length used is always the one of the emdash.
    So I think I will just double-encode the emdash as U+2014 and U+2015.
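A minimal sketch of what double-encoding amounts to, assuming the font's character map is modeled as a plain Python dict from code point to glyph name. The glyph names and the `double_encode` helper are hypothetical; a real font editor or build tool will have its own way of doing this.

```python
# Hypothetical character-to-glyph map; glyph names are illustrative only.
cmap = {
    0x2013: "endash",
    0x2014: "emdash",
}


def double_encode(cmap, source_cp, extra_cp):
    """Return a copy of cmap in which extra_cp points at source_cp's glyph."""
    if source_cp not in cmap:
        raise KeyError(f"U+{source_cp:04X} is not mapped")
    updated = dict(cmap)
    updated[extra_cp] = updated[source_cp]
    return updated


# Map U+2015 (horizontal bar) onto the existing em dash glyph.
cmap = double_encode(cmap, 0x2014, 0x2015)
```

Both code points then resolve to the same glyph, so any kerning defined for that glyph applies to either character.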
  • The soft-hyphen (U+00AD) is a control character which tells the application that this is a possible hyphenation point (to supplement automatic hyphenation or to replace it), and as such it does not need a glyph at all, or even to be present in the font.

    Applications that handle this “correctly” will use the soft-hyphen as a line-breaking opportunity, and if the line is broken there they will insert the glyph of hyphen (U+2010) or hyphen-minus (U+002D), so the glyph for soft-hyphen (U+00AD) will never be used.

    However, there is no shortage of broken applications; some applications will not use the soft-hyphen unless it has a glyph in the font, and others will use its glyph when breaking the line at it. So I agree with @Kent Lew that the best approach is to double-encode it with the hyphen (U+2010).
    There are other ways that it is used. It has a different role in pre-formatted text than in endpoint-formatted text.
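The “correct” behavior described in this post can be sketched roughly as follows. This is only an illustration under simplifying assumptions: the function name `render_line_break` and its parameters are hypothetical, and a real layout engine decides where to break from measured line widths rather than from an index passed in by hand.

```python
SOFT_HYPHEN = "\u00AD"
HYPHEN = "\u2010"


def render_line_break(word, break_index=None, visible_hyphen=HYPHEN):
    """Return (first_part, second_part) of `word` as it would be displayed.

    `break_index` selects which soft hyphen (0-based) the line breaks at;
    None means the word is not broken. Soft hyphens are never displayed:
    the chosen one is replaced by a visible hyphen, the rest are dropped.
    """
    parts = word.split(SOFT_HYPHEN)
    if break_index is None:
        return "".join(parts), ""
    if not 0 <= break_index < len(parts) - 1:
        raise ValueError("no soft hyphen at that index")
    head = "".join(parts[: break_index + 1]) + visible_hyphen
    tail = "".join(parts[break_index + 1:])
    return head, tail


word = "ty\u00ADpog\u00ADra\u00ADphy"
unbroken = render_line_break(word)    # soft hyphens vanish
broken = render_line_break(word, 1)   # break at the second soft hyphen
```

Note that the glyph mapped to U+00AD is never requested in this model, which is why a well-behaved application does not need one in the font.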
  • Are there any low-end applications that can be induced to use the correct representation for language and script? Especially when the form of a hyphen is dependent on house rules. Using U+200B for intra-word breaks feels like a violation of the text.
  • In Spanish, when setting a bibliography, some publishing houses – namely the Fondo de Cultura Económica – use three emdashes to indicate that the author is the same as in the preceding entry. I guess this is a practice that the digital typesetters learnt from the Linotype typesetters, but I am not 100% sure. Ideally, to get it right, the emdash must not have sidebearings at all, so it looks like a continuous line… or, as I do it, I kern emdash versus emdash. So, if someone actually wants to use it that way, it does the trick automatically.
  • Nick Shinn
    Yes, I usually leave a tiny sidebearing on the em dash, and have kerned it to itself for continuous effect in some types—but I don’t always remember to.
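The arithmetic behind kerning the em dash against itself is simple: the visible gap between two adjacent dashes is the right sidebearing of the first plus the left sidebearing of the second, so a kern equal to the negative of that sum closes it. A small sketch, with a hypothetical helper name and made-up sidebearing values in font units:

```python
def emdash_pair_kern(left_sidebearing, right_sidebearing):
    """Kern value (font units) that closes the gap between two em dashes."""
    return -(left_sidebearing + right_sidebearing)


def kerned_run_advance(advance_width, kern, count):
    """Total advance of `count` consecutive dashes with the pair kern applied."""
    return count * advance_width + (count - 1) * kern


# Assumed values for illustration: 1000-unit advance, 20 units on each side.
kern = emdash_pair_kern(20, 20)
run = kerned_run_advance(1000, kern, 3)
```

With these assumed numbers the kern is -40 units and three dashes advance 2920 units in total, leaving only the outer sidebearings around one continuous line.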
  • You can also use the three-em dash (U+2E3B, ⸻), but few fonts offer it.
  • You can also use the three-em dash (U+2E3B, ⸻), but few fonts offer it.
    And few typesetters are aware of it.
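Since few fonts offer U+2E3B, a typesetting script might prefer it only when the font supports it and otherwise fall back to repeated em dashes. A minimal sketch, assuming the font's coverage is available as a set of integer code points; the function name and the `fallback_count` parameter are hypothetical.

```python
THREE_EM_DASH = "\u2E3B"
EM_DASH = "\u2014"


def same_author_dash(supported_codepoints, fallback_count=3):
    """Pick the string used for 'same author as the previous entry'."""
    if ord(THREE_EM_DASH) in supported_codepoints:
        return THREE_EM_DASH
    # Fall back to consecutive em dashes; these only join into one line
    # if the font's em dash has no sidebearings or is kerned to itself.
    return EM_DASH * fallback_count


with_glyph = same_author_dash({0x2014, 0x2E3B})
without_glyph = same_author_dash({0x2014})
```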
  • While I don’t know the intent of the Unicode Consortium in designating U+2015 as a “horizontal bar,” it appears to me that it is a 3-em dash, which is an essential in the typography of English-language bibliographies, perhaps in other languages as well. It is used to indicate “same author as the previous entry” (see below). In metal type, it was a standard cast character in text fonts. But in today’s fonts, there’s seldom a need for it, as one generally sets two or three em dashes consecutively. That’s why the em dash should not have side bearings (pace Nick Shinn). Modern em dashes are generally wider than those cast in metal, so two em dashes are usually sufficient to make a 3-em dash when needed.


  • Modern em dashes are generally wider than those cast in metal…

    I am of the belief—possibly mistaken?—that modern em dashes are much more often narrower than those cast in metal.

    I have analyzed quite a few modern digital fonts for em dash width, as it has previously come up as a subject of discussion. I found:
    • very few of them ever had an advance width of more than an em
    • advance width varies from about 2/3 of an em to a full em (with some exceptions for condensed and extended fonts, which is how one also gets a few more than an em. But very few!).
    • the actual dash length within the advance width varies; some designers/foundries extend it a tiny bit beyond the advance width at both ends; some make it exactly equal; still others give it a bit of white space on either side like a hyphen
    Perhaps I am mistaken about metal em dashes. I thought they were generally generic sorts and not font-specific. I am of the impression that they are/were an em in width. How wide do you believe metal em dashes are/were?
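The comparison described in this post comes down to dividing the em dash's advance width by the font's units per em. A small sketch of that calculation; the helper name is hypothetical and the sample figures are invented for illustration, not measurements of any real font.

```python
from fractions import Fraction


def advance_in_ems(advance_width, units_per_em):
    """Express an advance width (font units) as a fraction of the em."""
    if units_per_em <= 0:
        raise ValueError("units_per_em must be positive")
    return Fraction(advance_width, units_per_em)


# Invented sample data: (em dash advance width, units per em).
samples = {
    "Font A": (1000, 1000),
    "Font B": (1365, 2048),
    "Font C": (750, 1000),
}

ratios = {name: advance_in_ems(adv, upm) for name, (adv, upm) in samples.items()}
wider_than_em = [name for name, ratio in ratios.items() if ratio > 1]
```

Running the same calculation over real fonts would require reading the advance width and units per em from each font file, which this sketch leaves out.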
  • @Scott-Martin Kosofsky
    While I don’t know the intent of the Unicode Consortium in designating U+2015 as a “horizontal bar,” it appears to me that it is a 3-em dash, which is an essential in the typography of English-language bibliographies, perhaps in other languages as well.
    I’ll try to find out what the original intent of the encoding was, but my suspicion is that no one knows or remembers. The character was inherited into Unicode from some earlier 8-bit codepages: ISO 8859-10, IBM Symbols, Windows CP932 (Japanese), and Windows CP1253 (Greek). But it is likely only in retrospect that the long dash characters in those different codepages were unified as U+2015.
  • If one made the em-dash with zero sidebearings, then a setting like
    1920–1985
    would produce a clash, at least with figures like 0, 6, 9. A similar problem occurs with something like
    Hamburg–Berlin
    in which the dash means “to”.
    – So much for German typesetting conventions concerning the em-dash.

    (the dash is too short in this typeface here, to my taste)
  • Andreas, your examples should be en dashes, not em dashes. En dashes need to have equal sidebearings, as the same issues will appear on both sides. (They also require kerning adjustments.) 

    Thomas, you are half correct about foundry dashes being generic. Foundries kept a variety of dashes on hand in each size, in various line weights and widths, and matched them to this font or that. It was seldom that one found an em dash that was the full em width, though I have seen some. In (metal) Monotype, the width of the en and em dashes could vary with the chosen “set width.” In general, the “set em” was defined as the width of the widest letter in the font, and all other characters are a fixed proportion of the set size. (Sorry, the language of metal Monotype is rather complicated and peculiar!) The presence of a 3-em dash in the matrix case depended on the size of the type and the matrix case arrangement. When it was not possible to include it, such as in the largest composition sizes, a separately cast sort had to be inserted by hand. I seem to recall that the same was true for metal Linotype, at least with some of the lesser models that didn’t have expanded matrix cases. In Linotype, though, it was a matrix that was inserted by hand into the line, not a cast sort.

  • … should be en dashes, not em dashes. …


    damn … you are right!