Capital Eszett in OpenType code

James Grieshaber · May 2013

I’ve been drawing these Capital Eszetts for fonts lately, and I am aware there is a bit of controversy over whether to use the Capital Eszett or simply replace it with SS. That being said, how do you deal with substituting the cap for the lowercase in the OpenType code? Which way makes the most sense and why?

Christoph Koeberlin · May 2013

I’d suggest to only make the glyph a source glyph, not a target glyph. So, substitute Germandbls by Germandbls.smcp etc, but not germandbls by Germandbls.
If people want this glyph, they’ll access it directly. Most people will confuse it, though.

James Puckett · May 2013

Do enough Germans even agree on the legitimacy of Germandbls that the majority of times it gets used it won’t be perceived as a problem?

Chris Lozos · May 2013

I make all substitutions for eszett with SS. That way, only the users who know they really want it can apply it manually. The only substitution I make is when creating all smallcaps. If the user has inserted the cap Eszett glyph (uni1E9E), it will be substituted by uni1E9E.sc when c2sc is selected.

Rainer Erich Scheichelbauer · May 2013

I suggest you make an exact copy of uni1E9E and name it germandbls.case, then you can put something like this in calt:

sub @Uppercase germandbls' @Uppercase by germandbls.case;
sub @Uppercase @Uppercase germandbls' by germandbls.case;

… and also in locl, right after the script latn; and before the first language tag, so it is always on in InD and QXp. And you can substitute germandbls by germandbls.case in case, but that won't be effective in InD.

Christoph Koeberlin · May 2013

James, most people don’t even know of this character. In German orthography it’s Straße and STRASSE. Chris’ implementation totally makes sense.

Rainer Erich Scheichelbauer · May 2013

The only real problem is when users type ß in all-caps, like STRAßE. If they do, the OT code above fixes its appearance, i.e., at least the ß fits the caps.

John Hudson · May 2013

Case conversion is a character operation performed by layout software. Generally speaking it is a bad idea to try to mimic it through glyph substitutions such as the {case} feature, which is intended to provide all-caps compatible variants of non-casing characters such as punctuation and symbols.

In the case of the eszett, there are two different casing options. The standard casing is still ß -> SS, and that is enshrined in both DIN standards and Unicode casing rules. So while Rainer's contextual substitution quite cleverly deals with the case in which a lowercase ß character is keyed in the middle of all-caps, it also masks what is actually encoded in the text, and will result in what may be unexpected changes in the appearance if someone then selects and applies an all-caps case conversion to the text: what appeared as an uppercase eszett glyph will become SS. I think it is better to encourage clean text encoding, now that we have an uppercase eszett character to support the non-standard casing option.
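As a rough illustration of the character-level behaviour described above, the following Python sketch relies on Python's built-in Unicode case mappings (not on any layout application's implementation). It shows that ß uppercases to SS, while U+1E9E is left alone by uppercasing and lowercases to ß. The constant and function names are made up for this example.

```python
import unicodedata

LOWER_ESZETT = "\u00df"  # ß
UPPER_ESZETT = "\u1e9e"  # ẞ


def describe(ch: str) -> str:
    """Return the code point and Unicode name of a single character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"


print(describe(LOWER_ESZETT))  # U+00DF LATIN SMALL LETTER SHARP S
print(describe(UPPER_ESZETT))  # U+1E9E LATIN CAPITAL LETTER SHARP S

# Standard casing: ß -> SS, whether or not the rest is already in caps.
print("Straße".upper())                         # STRASSE
print(("STRA" + LOWER_ESZETT + "E").upper())    # STRASSE

# An explicitly encoded capital eszett survives uppercasing
# and lowercases back to ß.
print(("STRA" + UPPER_ESZETT + "E").upper())    # STRAẞE
print(("STRA" + UPPER_ESZETT + "E").lower())    # straße
```

This is why a glyph substitution that merely displays ß as a capital form can change appearance after a later case conversion: the underlying character is still U+00DF.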

Craig Eliason · May 2013

> Case conversion is a character operation performed by layout software. Generally speaking it is a bad idea to try to mimic it through glyph substitutions such as the {case} feature, which is intended to provide all-caps compatible variants of non-casing characters such as punctuation and symbols.

Does that mean I shouldn't have sub @lc by @uc; in my case feature code?

John Hudson · May 2013

Definitely not. The full name of the {case} feature is Case-Sensitive Forms and its purpose is to provide forms of some non-casing characters that look better in the context of all-caps (usually by being raised to align better with capital letters). It is not intended to perform case conversion.

Jackson Showalter-Cavanaugh · May 2013

See also: All-Cap Greek

Craig Eliason · May 2013

Okay, thanks! But I gather that smcp is a different animal, correct?

John Hudson · May 2013

I find it makes more sense to handle all-cap (and smallcap) Greek diacritic suppression/substitution in the {calt} feature, because of the variables in {case} feature implementation. Neither feature is really ideal, but {calt} works more reliably in more places and has the advantage that it can be toggled on and off, since the Greek diacritic suppression in all-caps is a modern convention arising from left-positioned marks on caps, so not always to be followed in all styles of typography.

James Grieshaber · May 2013

Thank you @John, your comments are very helpful.

Let's say I do want to implement a feature, then the best place to put it is in calt.

@Rainer. I don't understand: why would you need to rename uni1E9E at all?

I understand your suggestion for calt:
sub @Uppercase germandbls' @Uppercase by uni1E9E;
sub @Uppercase @Uppercase germandbls' by uni1E9E;

But wouldn't you still need something like:
sub @Uppercase germandbls' space @Uppercase @Uppercase by uni1E9E;
...just in case the words Aß and Iß appear in an all caps title?

Denis Moyogo Jacquerye · May 2013

Just use the character U+1E9E instead of having incoherent OpenType hacks.
Chris and John are right.

What you're trying to do is have hacks to guess what the user means. That might be right in some cases but will be wrong in others.

Rainer Erich Scheichelbauer · May 2013

Okay, not in case then. I’ll adapt my blogpost accordingly.

So in calt, I don’t think I am case-converting, am I? That’s why I use the regular germandbls with a suffix (perhaps .case is confusing, but I deemed it the most appropriate one) instead of uni1E9E. And I use calt because the ß is simply adapting its shape to its context.

(Now I see, I should have used germandbls.calt instead.)

In official German orthography, there simply is no uppercase ß. That’s why it’s tricky to encourage its use. And sometimes it is simply impossible, think all-caps last names in passports or other documents, where ß cannot be replaced by double-S or cap-ß. But its shape can change to adapt to the letters surrounding it.

> But wouldn't you still need something like:
> sub @Uppercase germandbls' space @Uppercase @Uppercase by uni1E9E;

Hmmm, no. »Iß ALLES auf!« (‘eat up EVERYTHING’) is an example where you must not capitalize the ß. This would be the guessing Denis referred to, whereas in ‘WEIß’ or ‘HEIßEN’, it is clear.
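To make the difference between the clear cases and the guessing case concrete, here is a small Python simulation of the contextual rules quoted in this thread. It works on plain characters rather than glyphs, so it is only a sketch of the logic, not of how a shaping engine processes calt. The function names, the `space_rule` flag, and the use of U+1E9E as a stand-in for the substituted glyph are assumptions made for the example.

```python
ESZETT = "\u00df"    # ß as typed
CAP_FORM = "\u1e9e"  # stand-in for the cap-height glyph (germandbls.calt or uni1E9E)


def is_upper(text: str, i: int) -> bool:
    """True if position i exists and holds an uppercase letter."""
    return 0 <= i < len(text) and text[i].isupper()


def apply_rules(text: str, space_rule: bool = False) -> str:
    """Mimic the calt rules from the thread on plain characters."""
    out = []
    for i, ch in enumerate(text):
        if ch == ESZETT:
            # sub @Uppercase germandbls' @Uppercase by ...
            between = is_upper(text, i - 1) and is_upper(text, i + 1)
            # sub @Uppercase @Uppercase germandbls' by ...
            after_two = is_upper(text, i - 2) and is_upper(text, i - 1)
            # sub @Uppercase germandbls' space @Uppercase @Uppercase by ...
            before_space = (
                space_rule
                and is_upper(text, i - 1)
                and text[i + 1 : i + 2] == " "
                and is_upper(text, i + 2)
                and is_upper(text, i + 3)
            )
            if between or after_two or before_space:
                ch = CAP_FORM
        out.append(ch)
    return "".join(out)


for sample in ("STRAßE", "WEIß", "Straße", "Iß ALLES auf!"):
    print(sample, "->", apply_rules(sample), "|", apply_rules(sample, space_rule=True))
```

With only the first two rules, "STRAßE" and "WEIß" get the cap form and "Iß ALLES auf!" is left alone. Turning on the additional space rule also changes the ß in "Iß ALLES auf!", which is the unwanted guess described above.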

John Hudson · May 2013

> So in calt, I don’t think I am case-converting, am I?

That isn't a matter of what feature is used, but a question of whether you are representing a character in one case with a glyph that represents the character in a different case. The situation of the German ß is complicated because in the standard orthography the uppercase of ß is SS (which is not the same as saying 'there simply is no uppercase ß'), but a minority usage has employed an uppercase form, now encoded but not case mapped as U+1E9E. With that encoding, I believe any other mechanism to display an uppercase form of ß should be discontinued. There is now one standard way to include the uppercase glyph form of ß in text, and that is to use U+1E9E. In light of this, I would argue that ‘WEIß’ or ‘HEIßEN’ are simply incorrectly encoded, and a glyph substitution solution to display the uppercase glyph simply masks the encoding error. These words should be either encoded as ‘WEISS’ or ‘HEISSEN’, using the standard orthography, or as ‘WEIẞ’ or ‘HEIẞEN’, using the minority practice.

Rainer Erich Scheichelbauer · May 2013

Technologically: yes. But I tend to disagree with what Unicode says on this topic. Orthographically, double-S is more of a substitute for an uppercase in the absence of a real uppercase. The wording of the orthographic commission suggests that. Similarly, lowercase double-s is considered a substitute spelling if ß is not available or in Swiss-German orthography. See §25 E2 and E3 in http://www.neue-rechtschreibung.de/regelwerk.pdf, also see page 15 where it says:

Jeder Buchstabe existiert als Kleinbuchstabe und als Großbuchstabe (Ausnahme ß)
[‘each letter exists as lowercase and uppercase (except ß)’]

In official documents, where you have to print your last name in uppercase letters, you are not allowed to substitute ß by SS, even. E.g., one would have to write W-E-I-ß, whereas W-E-I-S-S would stand for Weiss only. That’s one reason why I think InD’s practice of always substituting double-S for ß in all-caps is problematic. I should have the option to choose between no substitution, double S, SZ or cap ß.

These technicalities aside, you know there’s not only prescriptive linguistics, there’s also descriptive linguistics. And from a descriptive standpoint, people do use ß in all caps. I see it so often, even handwritten, perhaps it’s even the majority. A type designer, who also sells fonts to these users, may want the font to look good for that kind of use as well. I have encountered people who defended the practice, even.

As a compromise solution, I am tempted to suggest that uni1E9E and germandbls.calt may very well look different, the former being a ‘true’ cap, the latter the lowercase shape, but reaching only the cap height, and drawn a little wider to better fit between caps.

(Sorry about adding yet another sidenote to this endless topic. I really don't want to be the new you-know-who of this forum.)

John Hudson · May 2013

I understand the complexities of the orthographic issue (although note, in addition to what you have written, that since the spelling reform of the 1990s ss is no longer a reliably acceptable substitute for ß in German orthography, since the two are now used differentially), and the Unicode casing rules, which I believe were inherited from DIN, mean that there are unwanted character results from case conversion in some situations. But it is precisely to address the situation of all-caps with eszett, as it occurs in official documents for example (this was the case repeatedly cited in favour of the new encoding), that U+1E9E was added to Unicode. The point here is that German all-cap settings should always be explicitly encoded and checked to confirm that the desired representation is present, because automatic case conversion of ß is unreliable, since the desired representation might be SS, SZ or ẞ. An OpenType GSUB solution that displays a lowercase ß as ẞ, or as something like ẞ, encourages bad encoding practice.
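One way to picture "explicitly encoded" all-caps is a conversion step where the person setting the text has to name the eszett form they want, so the result is real characters rather than a display trick. The Python sketch below is a hypothetical helper written for this illustration; it is not a feature of any layout application, and the names `to_all_caps` and `ESZETT_CHOICES` are invented here.

```python
ESZETT = "\u00df"  # ß

# The choices named in the thread: SS, SZ, capital ẞ (U+1E9E), or leave ß untouched.
ESZETT_CHOICES = ("SS", "SZ", "\u1e9e", ESZETT)


def to_all_caps(text: str, eszett: str) -> str:
    """Uppercase text, replacing every ß with the explicitly chosen form."""
    if eszett not in ESZETT_CHOICES:
        raise ValueError(f"unsupported eszett replacement: {eszett!r}")
    # Handle ß ourselves; every other character uses the normal Unicode mapping.
    return "".join(eszett if ch == ESZETT else ch.upper() for ch in text)


for choice in ESZETT_CHOICES:
    print(to_all_caps("Weiß", choice))
# WEISS
# WEISZ
# WEIẞ
# WEIß
```

Because the choice is a required argument, there is no silent default: the encoded text always records which of the representations was intended.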

Georg Seifert · May 2013

Just to add some personal experience to this endless topic. If I read a Swiss newspaper I always stumble across one word in particular (in Switzerland, they decided to get rid of the "ß" and replace it with "ss"). You read "Er bezahlte die Busse". In German German that means "he paid the buses". But they mean "he paid a fine".

Nick Shinn · May 2013

Homonyms occur in many languages.
“Eats shoots and leaves” is a recent classic example.
Somewhat related, I recently came across this book by Lean In (something like Lianne Ng?):

Mark Simonson · May 2013

Also read/read and lead/lead, where you can only tell the meaning and the pronunciation from context, and not always then. Spelling reform might help with these, but then you would lose something with words that sound the same but can be distinguished by how they are spelled, like red/read, led/lead.

[Deleted User] · May 2013

The Busse vs Buße sample is a funny one in that it is a reminder that Swiss German and German German (or Austrian German) prefer different terms. In Germany, Buße is used mostly in religious contexts, while ‘fine’ might be something like Bußgeld, which would not pose much of a problem even if spelled the Swiss way as Bussgeld – the double s indicates it is not a Busgeld.

My own experience was almost the opposite. Last year I read a couple of books and articles all printed before ca 1905. It took a couple of them until I noticed that there was not a single ‘ß’ in them.

Btw, I do consider smcp as problematic in that it implicitly maps lowercase (shapes) to uppercase (shapes). InDesign’s approach is a clean one, iirc, first doing casing on application level and then applying cpsp.

Chris Lozos · May 2013

Please don't remove all the ambiguity, though, we punsters need fuel as well ;-)

Thomas Phinney · May 2013

> InDesign’s approach is a clean one, iirc, first doing casing on application level and then applying cpsp.

Karsten meant to write 'c2sc' not 'cpsp', I am sure.

Although I was initially surprised when this approach was suggested, I have become more and more sure it is a good choice over the years.

John Hudson · May 2013

> InDesign’s approach is a clean one, iirc, first doing casing on application level and then applying [c2sc].

And then applying the {smcp} feature lookups to catch anything left over in the glyph run that wasn't transformed by {c2sc} but should be by {smcp}. This last step is important, because not everything mapped in {smcp} necessarily has a case mapping.
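The three-step order described here (application-level casing, then c2sc, then smcp for leftovers) can be sketched in Python as follows. This is a toy model based only on the description in this thread, not on InDesign's actual code. The `.c2sc` and `.smcp` suffixes are hypothetical glyph labels chosen to show which step produced each glyph, and ĸ (U+0138, which has no uppercase mapping in Unicode) is used as an example of a letter the casing step cannot transform.

```python
def all_small_caps(text: str) -> list:
    """Return hypothetical glyph labels for an 'all small caps' run."""
    cased = text.upper()  # step 1: character-level casing by the application
    glyphs = []
    for ch in cased:
        if ch.isupper():
            glyphs.append(ch + ".c2sc")  # step 2: c2sc maps capitals to small caps
        elif ch.islower():
            glyphs.append(ch + ".smcp")  # step 3: smcp catches lowercase leftovers
        else:
            glyphs.append(ch)            # spaces, punctuation: unchanged in this sketch
    return glyphs


print(all_small_caps("Buße"))     # ß becomes SS before any glyph lookup runs
print(all_small_caps("\u1e9e"))   # an encoded capital eszett goes through c2sc
print(all_small_caps("\u0138"))   # ĸ has no uppercase, so only smcp can reach it
```

In this model a typed ß never reaches a small-cap eszett glyph, because the casing step has already turned it into SS; only an explicitly encoded U+1E9E does.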

[Deleted User] · May 2013

Thomas & John, yes!
