OpenType case & eszett

I tried to specify two special substitutions in my case feature:

ß > ẞ
ĸ > K‘

In Adobe’s environment ß becomes SS and ĸ remains ĸ, but OpenType-savvy browsers respect my feature. I understand there is some discussion about the cap eszett, and I am not sure about the one-to-many substitution for the old Greenlandic k. Would it be better to leave these out altogether? Are there other “hard-coded” case variants I should be aware of?

Comments

  • John Hudson Posts: 1,236
    The case feature is not intended to perform case transforms, which are character-level operations normally performed by software (as you found with Adobe apps). The case feature should be used to substitute variant forms of non-casing characters, e.g. punctuation or numerals, within an all-caps setting.
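
The character-level transform described in the comment above can be observed outside any font. The following is a minimal sketch using Python's built-in `str.upper()`, which follows Unicode's default case mappings. It is an assumption that a given layout application behaves the same way; the variable names are purely illustrative.

```python
# Unicode default case mapping, as software applies it before any font feature runs.
eszett = "\u00df"  # ß LATIN SMALL LETTER SHARP S
kra = "\u0138"     # ĸ LATIN SMALL LETTER KRA (old Greenlandic k)

upper_eszett = eszett.upper()  # one-to-many mapping defined by Unicode
upper_kra = kra.upper()        # no uppercase mapping is defined for this character

print(repr(upper_eszett))  # 'SS'
print(repr(upper_kra))     # 'ĸ' (unchanged)
print("Straße".upper())    # STRASSE
```

This matches what the original post reports for Adobe's environment: ß becomes SS and ĸ stays ĸ, regardless of what the font's case feature contains.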
  • I agree, but where would you put something like the ĸ?
  • Nick Shinn Posts: 1,192
    The only feature that should involve cap Eszett is <c2sc>
  • I think it might be suitable as a stylistic alternate. Last year when I visited Germany, I saw a couple of store signs with a cap Eszett.
  • John Hudson Posts: 1,236
    I agree, but where would you put something like the ĸ? 
    It's still a character level mapping, although in that case there is no standard Unicode case mapping. In other words, if someone wants the K‘ representation in all-caps, they need to key that. Bear in mind that a) this is an obsolete orthography, so is only going to be encountered in historical documents, and b) case mapping isn't standardised.
    _____

    I think it might be suitable as a stylistic alternate. Last year when I visited Germany, I saw a couple of store signs with a cap Eszett.

    Stylistic alternate of what?

    The uppercase ẞ character was added to Unicode precisely to accommodate uses such as you describe, which are still a minority, non-standard representation. If you really want the uppercase ẞ, best to use the specific character, which again means entering that rather than relying on case transformation.


  • Igor Freiberger Posts: 106
    edited October 2015
    When I asked about Eszett substitution on Typophile, German colleagues (I think Andreas Stötzner and Nina Stössinger) strongly advised me not to do that. Firstly, because the cap Eszett is still being adopted. Secondly, because there are situations where the correct uppercase for ß is SS.

    Greenlandic k substitution is eligible for smcp and c2sc (and petite caps, though it seems no one except Mota Italic and myself is designing fonts with this additional case).
  • In the situation where you have a cap Eszett written in the text and wish to maintain this identity in small caps, why would you not want to allow this?
  • Yes, certainly you should have it in 'c2sc'. Just not in 'case'.
  • Christian Thalmann Posts: 965
    edited October 2015
    Secondly, because there are situations where the correct uppercase for ß is SS.

    I can't imagine how such a situation could arise, given that the rules for how to use ß are case-agnostic.

    The Glyphs tutorials recommend using CALT to replace ß by ẞ under all-caps conditions:

    https://glyphsapp.com/tutorials/localize-your-font-german-capital-sharp-s
  • John Hudson Posts: 1,236
    The uppercase standard case mapping for ß is always SS. This is enshrined in both official German practice, as recorded in Duden, and also in Unicode case folding. The uppercase character ẞ is a variant usage, and was encoded as such, without case folding.

    The Glyphs tutorials recommend using CALT to replace ß by ẞ under all-caps conditions

    I can imagine the thinking being that if character level case folding or user intervention has not already converted ß to SS in all-caps, then the best graphical option is to perform a glyph substitution of ẞ. This is, however, contrary to the general principle that OpenType substitutions should not be used to represent one character with the default form of another character.
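
The asymmetry described in this comment can be sketched with Python's Unicode tables. This is illustrative only, and the results depend on the Unicode data bundled with the interpreter: U+1E9E lowercases to ß, but ß uppercases to SS, so the pair does not round-trip.

```python
import unicodedata

cap_eszett = "\u1e9e"    # ẞ
small_eszett = "\u00df"  # ß

cap_name = unicodedata.name(cap_eszett)  # official Unicode character name
lowered = cap_eszett.lower()             # ẞ -> ß
raised = small_eszett.upper()            # ß -> SS, never ẞ
round_trip = cap_eszett.lower().upper()  # ẞ -> ß -> SS

print(cap_name)
print(lowered, raised, round_trip)
```

In other words, text that contains ẞ keeps it only as long as no lowercase-then-uppercase transform is applied to it.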

  • Stylistic alternate of what?
    An alternate of ß in an all-caps display font. Or should we put the ẞ in the ß position and leave it at that?
  • The uppercase standard case mapping for ß is always SS. This is enshrined in both official German practice, as recorded in Duden, and also in Unicode case folding. The uppercase character ẞ is a variant usage, and was encoded as such, without case folding.
    Similarly, the uppercase standard case mapping for ä used to be Ae. Luckily, the convention was changed when the availability of Ä became widespread. Personally, I'd rather my fonts went with the times than that they propagated an obsolete practice.
  • Using case here is definitely wrong, but recently I came to the conclusion that adding smcp to OpenType was a mistake, because it puts a text transformation process (case mapping) in the hands of fonts, which is not the proper place for it (case mapping is complex and not static). Small caps should have been handled by the c2sc feature only: the application first maps the text to uppercase, then applies the c2sc feature to it. This way the text transformation is done outside the font and does not depend on type designers handling all its subtleties properly or keeping up to date.
  • Out of curiosity/ignorance, why is implementing it in CASE more wrong than CALT?
  • John Hudson Posts: 1,236
    edited October 2015
    Small caps should have been handled by the c2sc feature only; the application first maps the text to uppercase then applies c2sc feature to it. This way the text transformation is done outside the font and does not depend on type designers handling all its subtleties properly or keeping up to date.

    That's what at least one of Adobe's layout engines does, which is why Frode reported ß becoming SS despite his case feature mapping to a ẞ glyph.

    _____

    Similarly, the uppercase standard case mapping for ä used to be Ae.

    Standard? Or simply conventional in the absence of Ä on typewriters? When I say standard, I mean defined in a standard, and, in the case of a technical standard such as Unicode, with clearly defined outcomes. You can try to trick a layout environment into displaying ß as ẞ in an all-caps setting, but to do so you have to work around the standard case folding behaviour. There is, of course, a correct way to do this, which is a character-level custom case folding, as reportedly needed in some database software in Germany to handle casing of personal names, where the distinction between ß and ss needs to be preserved in data that is stored as all-caps in some places and mixed case elsewhere. Again, this is a character encoding and case folding issue, not something to try to fake in glyph processing.

    Out of curiosity/ignorance, why is implementing it in CASE more wrong than CALT?

    I'd say both are wrong, because they are misrepresenting the encoded text. The case feature is arguably more wrong because of the way in which it is applied as a secondary function of an already dubious case display function in InDesign.
  • John Hudson Posts: 1,236
    PS. As use of ẞ increases, I can easily imagine word processing and page layout software introducing user preference settings to case fold ß to ẞ instead of SS. That's the better way to handle this emerging situation: change the uppercase character to which ß maps, rather than misrepresenting the lowercase ß with an uppercase form.
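
The user preference imagined in this comment could look roughly like the sketch below: the uppercase target of ß is changed at the character level, before any glyph processing. The function name `to_upper_german` and its `prefer_capital_eszett` flag are hypothetical and do not correspond to any existing application setting.

```python
def to_upper_german(text: str, prefer_capital_eszett: bool = False) -> str:
    """Uppercase text; optionally map ß to ẞ (U+1E9E) instead of the default SS."""
    if prefer_capital_eszett:
        # Swap the character first, so the default ß -> SS mapping never applies.
        text = text.replace("\u00df", "\u1e9e")
    return text.upper()


default_result = to_upper_german("Straße")
custom_result = to_upper_german("Straße", prefer_capital_eszett=True)

print(default_result)  # STRASSE
print(custom_result)   # STRAẞE
```

Because the result is real uppercase text containing U+1E9E, the font only needs to supply the glyph; no substitution feature is involved.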
  • Igor Freiberger Posts: 106
    edited October 2015
    Khaled, this approach causes a problem with mixed uppercase and small caps. In the absence of smcp, the application would apply a faux small caps to lowercase or simply do nothing. Both are greater problems than having this control on the font side.

    Regarding why not to do the automatic conversion ß >  ẞ, I found this additional data in a thread I saved from Typophile: ẞ was adopted in Germany, but not Austria or Switzerland (the info is from mid-2012). And in Switzerland, ß is treated as a kind of digraph, not a single letter.
  • John Hudson Posts: 1,236
    Khaled, this approach causes a problem with mixed uppercase and small caps. In the absence of smcp, the application would apply a faux small caps to lowercase or simply do nothing.


    As I understand it, Khaled's suggestion, which is the same as numerous other people have made, is that it should be the job of the software to keep track of to which characters the smallcap styling is applied, then perform a buffered case transform before applying a single cap-to-smallcap feature. So the distinction between all-to-smallcaps and lowercase-to-smallcaps is a software feature, not a font lookup distinction.

    Now, I think that's a sensible solution to a lot of situations, and would enable the contents of GSUB to be greatly reduced and simplified. But there are a handful of special cases — excuse the pun — where the glyph outcome of an all-to-smallcaps function and a lowercase-to-smallcaps function might differ — I'm thinking of things like hwair —, for which a rump smcp feature would remain useful.

  • John Hudson Posts: 1,236
    And in Switzerland, ß is treated as a kind of digraph, not a single letter.
    The Swiss abandoned the eszett altogether, no?

  • edited October 2015
    For the record, I do not recommend replacing ß (lowercase sharp s) with ẞ (cap sharp s) between caps, but rather with a .calt variant of ß that looks OK between uppercase letters (i.e., usually a little wider and smaller).

    Official orthography knows no capital sharp s at all (also not in Germany), but there are situations where ß must be typed in otherwise all-cap words, and must not be substituted by SS, e.g., family names in passports. To be precise, this is not the cap ẞ, but rather a variant of (still lowercase) ß. In my tutorial, I called it germandbls.calt. Sure, depending on your intentions, it can (and will likely) look like the cap ẞ.

    Also, people simply don’t (always) stick to official orthography, but will type the lowercase ß between caps, because they don’t know better, or cannot (or don’t know how to) access the ẞ. The solution I propose would take care of that too.
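
The contextual idea in this comment can be simulated in a few lines of Python. This is only a sketch of the logic, not the tutorial's feature code: the rule used here (ß is swapped only when both neighbours are uppercase) is an assumption, and `shape_with_calt` is a hypothetical helper. The glyph name `germandbls.calt` is the one mentioned above.

```python
def shape_with_calt(text: str) -> list:
    """Return a glyph sequence in which ß between two capitals becomes germandbls.calt."""
    glyphs = []
    for i, ch in enumerate(text):
        prev_upper = i > 0 and text[i - 1].isupper()
        next_upper = i + 1 < len(text) and text[i + 1].isupper()
        if ch == "\u00df" and prev_upper and next_upper:
            # Glyph-level swap only; the encoded character is still ß.
            glyphs.append("germandbls.calt")
        else:
            glyphs.append(ch)
    return glyphs


all_caps = shape_with_calt("STRAßE")
mixed_case = shape_with_calt("Straße")

print(all_caps)
print(mixed_case)
```

The input string is never modified, which is the point of the proposal: the text remains a lowercase ß by encoding and only its appearance between capitals changes.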
  • edited October 2015
    Regarding why not to do the automatic conversion ß >  ẞ, I found this additional data in a thread I saved from Typophile: ẞ was adopted in Germany, but not Austria or Switzerland (the info is from mid-2012). And in Switzerland, ß is treated as a kind of digraph, not a single letter.

    Both pieces of information are wrong. Neither did Germany officially adopt the ẞ, nor does ß count as a digraph in Switzerland.

    Official orthography (‘amtliche Rechtschreibung’) is regulated by the Rat für deutsche Rechtschreibung, and is the same in all areas where German is an official language. That is, with the notable exception of Switzerland and Liechtenstein where ß is simply not in use, also not as digraph. The only other regional exception I know of is that after a greeting in a letter, the Swiss can leave out the comma.
  • Nina Stössinger Posts: 150
    edited October 2015
    PS. As use of ẞ increases, I can easily imagine word processing and page layout software introducing user preference settings to case fold ß to ẞ instead of SS
    This is insightful — nice to think that fonts may not need to be updated again to accommodate changing rules and expectations on that end.

    I think the main thing to wait for is the official orthography Rainer mentions. Currently, mapping ß to ẞ may seem both logical and desirable, but seriously don’t do it — orthographically this is still not the recommended usage, and German orthography is quite tightly regulated by the rules in effect. To be more dramatic, I would even say that in the current situation, fonts that automatically use the cap eszett in cap settings might be effectively unusable for German. The thing we can do with the uni1E9E now is to prepare the glyph and have it ready for people who do want to use it, but it is too early to enforce its use on anyone.

    And indeed, like others have said, this is not an issue in Switzerland at all, which uses neither the uppercase nor the lowercase eszett.
  • Igor Freiberger Posts: 106
    edited October 2015
    Rainer, thank you very much for that information and those links. Maybe the person who posted that meant "adopted" in the sense of "begun to be used", not an official orthography change. Anyway, it is always great to get precise data from native speakers.

    John, it seems I am missing something here. Let me try to explain again so you and others can clarify it. OpenType smcp and c2sc exist separately because they have quite different aims:



    Both are desirable formatting options. The first is preferable within running text and the second is largely used in titles. Without an smcp feature, the application will build faux small caps in formatting 2, scaling letters down to 58% (by default).

    Removing the smcp feature seems feasible once applications are able to recognise that a glyph with an xxx.sc name should be used instead of a scaling procedure to achieve formatting 2. But when (and if) this becomes reality, it would also be possible to remove c2sc, as the application would do both transformations. That scenario is very unlikely, as most applications still barely support OT.

    Sorry for my lack of understanding, but I cannot see how to do proper formatting 2 without smcp. Nor do I see the need for c2sc once the software is able to control all transformations without font features.
  • edited October 2015
    Thanks everyone. Great to be reminded of all this again.

    @Igor Freiberger: I was under the impression c2sc should not convert any lowercase to smallcap. “All small caps”, meaning smcp+c2sc.
  • Christian Thalmann Posts: 965
    edited October 2015
    For the record, I do not recommend replacing ß (lowercase sharp s) with ẞ (cap sharp s) between caps, but rather with a .calt variant of ß that looks OK between uppercase letters (i.e., usually a little wider and smaller).
    Wow, I had totally missed that part. So you're saying the substituted glyph should still be a ß by encoding?

    Isn't the ẞ basically «a variant of ß that looks OK between uppercase letters», though? Would it be OK if I created a copy of my ẞ called /germandbls.calt and that got used for the replacement?

    EDIT: Re-read the tutorial; never mind the question. :grimace: 
  • John Hudson Posts: 1,236
    Sorry for my lack of understanding, but I cannot see how to do proper formatting 2 without smcp. 

    That's because you are presuming that the font is responsible for knowing which characters get substituted for smallcap forms and which do not, but that is begging the question. As a thought experiment, imagine how you would have to solve this formatting distinction if you only had one smallcaps mapped from uppercase characters as an option in the font, i.e. imagine that 'smcp' never existed. It isn't difficult at all to imagine that an application Caps+Smallcaps function could keep track of which characters were originally lowercase — the backing encoding, after all, would remain unchanged — and apply a 'c2sc' feature only to those characters after having first performed a buffered case transform.

    This is, in fact, what Adobe's older line and paragraph composer did, which is why, when that composer is activated in InDesign, a lowercase ß will result in a smallcap SS sequence when the regular smallcap function is applied to a word, leaving an initial cap as a full uppercase form. I've not tried it, but I suspect with that implementation, one would get exactly the same results regardless of whether the font contained an 'smcp' feature, so long as it had a 'c2sc' one.

    Adobe's World Ready Composer, on the other hand, does not do this, but instead maps what it finds for ß in the 'smcp' feature (which in the font I was testing turned out to be a smallcap form of ẞ, which probably annoys a lot of German users; well, the client was Dutch).
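
The thought experiment in this comment (an application-side Caps+Smallcaps function that relies on 'c2sc' alone) can be sketched as follows. This is a simulation under stated assumptions, not how any Adobe composer is actually implemented: `caps_and_small_caps` is a hypothetical function, and the `.sc` suffix stands in for whatever small-cap glyph the font's 'c2sc' feature would supply.

```python
def caps_and_small_caps(text: str) -> list:
    """Simulate Caps+Small Caps using only a cap-to-smallcap substitution."""
    glyphs = []
    for ch in text:
        if ch.islower():
            # Buffered case transform: the stored text itself is left untouched.
            for upper_ch in ch.upper():
                # 'c2sc' is applied only to characters that were originally lowercase.
                glyphs.append(upper_ch + ".sc")
        else:
            # Originally uppercase (or uncased) characters keep their full-size form.
            glyphs.append(ch)
    return glyphs


result = caps_and_small_caps("Straße")
print(result)
```

With this model a lowercase ß yields a small-cap SS sequence while the initial capital stays full size, which is the behaviour described above for the older composer, and the font needs no 'smcp' lookup to get there.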
  • So for smcp the ß should map to SS, but for c2sc the SS maps to SS and the ẞ maps to ẞ.sc?

    Which would leave the user to first decide to use SS or ẞ and once they do that then the features take care of either sc conversion.

    Related, doesn't most software transform ß to ẞ (if it is present in the font glyphset) when using an All Caps or Uppercase transformation rather than transforming ß to SS?
  • John Hudson Posts: 1,236
    Which would leave the user to first decide to use SS or ẞ and once they do that then the features take care of either sc conversion.

    Yes, the decision to use SS or ẞ is fundamentally a character encoding preference, which is exactly why ẞ was added to Unicode, so that users could make that choice.

    Related, doesn't most software transform ß to ẞ (if it is present in the font glyphset) when using an All Caps or Uppercase transformation rather than transforming ß to SS?

    No, absolutely not. Both the standard orthographic rule of the Rat für deutsche Rechtschreibung and the Unicode casing rule are explicit that the standard uppercase form of ß is SS. ẞ is only available as a minority option outside of the standard orthography.
  • Thank you very much, John. Now I see the whole picture. In short, we have two possible models: (1) the application reads smcp and applies the substitution; (2) the application identifies the corresponding uppercase, does a buffered case change, reads c2sc and applies the substitution.

    Although model 2 enables simpler fonts, it demands more complex operation on the software side. Does it represent an improvement in performance? Isn't it better to keep fonts responsible for defining the smcp/c2sc transformations and so ensure greater standardization of results? I thought the OT feature system existed precisely to let fonts embed everything needed for correct results, while the host application just reads that.
  • Michael Jarboe Posts: 212
    edited October 2015
    So if one were designing a titling face SS would be put in place of ß and ẞ ideally would be an ss01 alternate?

    *update Actually, how does that work? Do you actually add an SS glyph?

    *update2 Actually could you leave uni1E9E in place of ß and use some substitution code so that a user inputting ß would actually get /S /S (separate glyphs), and only if they activate an alternate (say… ss01) would they get ẞ, which in this scenario I'm describing would simply be a revert back to the ß glyph?