Separate language codes for different Englishes

John Hudson · January 2018

And I’m getting a lot of flak from the principled purists, of whom I would ask, do you have a better idea? Certainly it’s true that as Kent says, “What you need is [a better algorithm]”, but I already know that ain’t gonna happen, and it’s not something I can come up with—but a font hack is.

Nick, the assumption of this statement is that this is somehow a problem for you to solve. I don't think it is, any more than it is a problem for me to solve. We're font makers and this is not a font problem.

With regard to standards for distinguishing different forms of English, yes, there are such standards, but because they deal primarily with spelling, i.e. with things for which fonts are not responsible, specific English language locale tags don't have corresponding OpenType Layout language system tags.

Deleted Account · January 2018

“I would like to replace quotesingle with a right quote mark for North America, to remedy the “smartquote” fail that generates ‘18, rock ‘n’ roll, etc.”

Is there nothing else?

We had a similar issue come up with using “Smart Quotes” for something not at all related to quotations before, from French l’ or something. Contractions not working is, imho, not a failure of smart quote logic, if you follow my logic.

I think a “better idea”, if contracting, is to turn off smart quotes, be the typographer, and input the glyph you want to appear.

If you want reversed spun, mirrored, reversed unspun or twisted quotish glyphs of any kind put them into the font. If you can, put them in in such a way as to work with “smart quote” software, or as a feature or as both. Nobody is stopping anybody from making, popularizing or using dumb, smart or manual quotes, apostrophes, repostrophies or quopostrophies.

On this old subject in general, I’ve concluded that the greatest hypocrisy is to do nothing. Simple — all talk-no fonts on this molehill-issue forever, coupled with typographic inaction, someone could go on talking their own tough brand of purity through sloth, ‘till the end of time.

Hrant Հրանդ Փափազեան Papazian · January 2018

Simon Cozens said:
This whole discussion reminds me of my favourite bit of OpenType code,
sub period space space by period space
If you think fonts should be opinionated about linguistic conventions, you should probably include that one too.

I'm actually opposed to that.
To me there's a difference between somebody consciously twice hitting one of the best-defined keys out there versus hitting an overloaded key that ends up showing up as one of many things, dependent on various layers of software.

@John Hudson Or you could just disagree without casting my attempts at distillation as sloganeering. Not least because public discourse is not about you or me.

https://twitter.com/hhpapazian/status/931200071445573632

John Hudson said:
We're font makers and this is not a font problem.

That's not a decision for you to make.
Everybody has different reasons to make fonts, and sees their place in society differently. It's completely OK for somebody to violate some purists' compartmentalization of letter versus character versus glyph versus whatnot if they believe that will help the highest beneficiary of any act of Design: the user; which is ultimately the reader. That's not a slogan, it's a principle; something we should all have, and preferably share.

The apostrophe being replaced by an open single quote is a pervasive cultural embarrassment, and although it doesn't exactly cause world hunger, in our domain it's yet another detail to try to overcome. Yes, sometimes by treading on something else.

D. Epar ted said:

Nobody is stopping anybody from making, popularizing or using dumb, smart or manual quotes, apostrophes, repostrophies or quopostrophies.

Except human nature.

Kent Lew · January 2018

The apostrophe being replaced by an open single quote is a pervasive cultural embarrassment,

Sure. And yet, in my Apple Mail program, if I type '18 or rock 'n' roll, using simple keyboard straight quotes, I actually get ’18 and rock ’n’ roll. So, in some cases this has in fact already been solved. (I presume that Mail is using a system-level algorithm, so this may be “solved” in other places as well.)

As it so happens, I had occasion to reference the letter ‘n’ in an email to a colleague the other day, and I specifically chose to use single quotation marks to distinguish my reference. Yet, Mail “corrected” my input to ’n’.

I had to manually override by typing the curly quotes myself. Thank goodness my font didn’t then subvert my efforts. ;-)

John Hudson · January 2018

As it so happens, I had occasion to reference the letter ‘n’ in an email to a colleague the other day, and I specifically chose to use single quotation marks to distinguish my reference. Yet, Mail “corrected” my input to ’n’.

A good example, Kent, of the difficulty of trying to algorithmically determine which option a user wants when there are conventions valid for either/or. Part of what I find bemusing about this discussion is that the assumption seems to be that smart quote algorithms are stupidly wrong, and in need of fixing, rather than that they're bound to be a compromise getting as much right as they can given that it isn't always possible to determine what is wrong. The other thing that bemuses me is the notion that it will be possible to solve this at the OpenType Layout lookup level, with its clumsy context mechanism, given that it is evidently hard to do with the much more powerful capabilities of a true programming language.
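
To make the compromise concrete, here is a minimal Python sketch of a smart-quote rule. The rule itself is an assumption chosen for illustration (a straight quote at the start of the text, or after whitespace or an opening bracket, becomes an opening quote; anything else becomes a closing quote or apostrophe); it is not the algorithm used by Mail or any other application, and the function name is hypothetical.

```python
LSQ = "\u2018"  # LEFT SINGLE QUOTATION MARK
RSQ = "\u2019"  # RIGHT SINGLE QUOTATION MARK (also the typographic apostrophe)


def naive_smart_quotes(text: str) -> str:
    """Replace each straight quote (U+0027) using only the preceding character.

    Assumed rule, for illustration only: opening quote at the start of the
    text or after whitespace or an opening bracket; closing quote otherwise.
    """
    out = []
    for i, ch in enumerate(text):
        if ch != "'":
            out.append(ch)
        elif i == 0 or text[i - 1].isspace() or text[i - 1] in "([{":
            out.append(LSQ)
        else:
            out.append(RSQ)
    return "".join(out)


print(naive_smart_quotes("don't"))           # apostrophe inside a word
print(naive_smart_quotes("the letter 'n'"))  # quoted letter
print(naive_smart_quotes("rock 'n' roll"))   # contraction, same local context
```

Under this rule the quoted letter comes out as intended, but '18 and rock 'n' roll get an opening quote where an apostrophe is wanted. Tuning the rule the other way would fix those and break the quoted letter, since the characters around 'n' are the same in both inputs.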

Hrant Հրանդ Փափազեան Papazian · January 2018

Purists need to realize: it's never about solving, but reducing errors.

One thing we need is a way to detach the apostrophe from the quotes key. Also good would be to switch to guillemets. Such things are the job of simply anybody who cares enough.

Nick Shinn · January 2018

What I have to resolve is my inconsistency on this topic.
On the one hand, I appear to be a stickler for grammatical correctness regarding apostrophes, yet on the other I’m not so rigid about following the propriety of font behaviour.

John Hudson · January 2018

Hrant,

Or you could just disagree without casting my attempts at distillation as sloganeering.

I don't think discussion needs or benefits from distillation, and the result really does just come across as slogans, leaving the impression that, far from carefully reading and distilling what other people have taken time to write, you are lazily responding with one-liners. I also take issue with your characterisation of the people who disagree with you in this discussion as 'purists', which I take as simply dismissing or avoiding having to address their arguments, instead framing the discussion as an ideological conflict. Everyone with whom you have disagreed in this thread has put forward reasons why it is better to resolve character level problems at the character level. I don't see anything ideological about those reasons: they're practical, taking into account the nature of the technology and the broader benefits of not idiosyncratically misrepresenting characters in glyph space, especially when so many methods exist to get the correct characters into the text.

One thing we need is a way to detach the apostrophe from the quotes key. Also good would be to switch to guillemets. Such things are the job of simply anybody who cares enough.

So go make yourself a custom keyboard layout, and open source it if you'd like other people to adopt it. It isn't difficult, and it's an entirely appropriate solution to what you want to achieve.

Michel Boyer · January 2018

John Hudson said:

So go make yourself a custom keyboard layout, and open source it if you'd like other people to adopt it. It isn't difficult, and it's an entirely appropriate solution to what you want to achieve.

Just having a custom keyboard does not solve the problem; I expect a custom keyboard to put a straight quote (i.e. U+0027, APOSTROPHE) in my file if that character corresponds to the key I am hitting. That will not happen unless you disable smart quotes either at the system level or at the application level. On the Macintosh, if you do not disable smart quotes in the system preferences (Keyboard > Text > Use smart quotes and dashes), the substitution automatically occurs in TextEdit, Mail and other OS X applications whatever keyboard you are using (and choosing the straight quote with the character palette does not help either). Word on the Macintosh (I have Office 365 from my employer) does not use the system preferences and you need to deactivate smart quotes from the application preferences if you want to keep control. Keeping control is getting harder and harder as “intelligent software” is making more and more decisions for us.

André G. Isaak · January 2018

I’ve always wished there were some single, global preference “leave my text the $%& alone” which all applications would obey, and which would turn off absolutely everything related to smart-quotes, auto-correction, auto-formatting, auto-anything-else. In other words, just have the system assume that what I typed was what I intended to type.

If people start trying to implement these sorts of things in GSUB tables, that creates yet one more place where I have to turn things off.

Even worse, it creates a situation where the behaviour of your text may be entirely different depending on which font you choose to use. This is already a (relatively minor) problem in that different fonts often implement various features differently — some examples that come to mind are 'frac' and 'ordn' where some fonts allow you to apply these features to runs of text and then try to figure out what you intended contextually, whereas others expect you to apply the feature only to the characters you intend to be affected.

Such inconsistent behaviours are undesirable from a user’s point of view and I think that inconsistency outweighs any convenience which might be achieved by having the font try to guess your intentions.

André

Hrant Հրանդ Փափազեան Papazian · January 2018

André G. Isaak said:

the behaviour of your text may be entirely different depending on which font you choose to use.

That's what fonts are for.

André G. Isaak · January 2018

I expect the *appearance* to change when I change fonts. I’d like the *behaviour* of the text to remain the same wherever possible.

André

Hrant Հրանդ Փափազեան Papazian · January 2018

Appearance is all the behavior fonts have.

Nick Shinn · January 2018

'frac' and 'ordn' where some fonts allow you to apply these features to runs of text and then try to figure out what you intended contextually, whereas others expect you to apply the feature only to the characters you intend to be affected.

André, isn’t the <frac> feature quite straightforward?

I’m assuming the method I use, which is almost identical to that promoted by Tal Leming in the early days of OpenType, is pretty much a de facto standard, enabling the global application, via paragraph style, of changing:

figure(s) space figure(s) slash figure(s)

into

figure(s) thinspace* numerator(s)* virgule* denominator(s)*

*not changed characters of course, but “ .frac” alternate glyphs which don’t alter the canonical text.
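
A rough Python model of that mapping may help readers who have not seen the feature code. It is a sketch under stated assumptions, not OpenType feature code and not Tal Leming's actual implementation: the glyph names (`thinspace`, `fraction`, and the `.numr` / `.dnom` suffixes), the `shape` function, and the rule that skips runs containing more than one slash are all hypothetical choices made for this example.

```python
import re

DIGIT_NAMES = ["zero", "one", "two", "three", "four",
               "five", "six", "seven", "eight", "nine"]

# Default glyph for each character we care about (hypothetical glyph names).
GLYPH = {str(i): name for i, name in enumerate(DIGIT_NAMES)}
GLYPH.update({" ": "space", "/": "slash"})

# Optional whole number plus space, then numerator, slash, denominator.
# The lookarounds reject runs touching further digits or slashes,
# so something like 2018/01/07 is left alone (an assumed exclusion rule).
FRACTION = re.compile(r"(?<![0-9/])(?:([0-9]+) )?([0-9]+)/([0-9]+)(?![0-9/])")


def shape(text: str) -> list[str]:
    """Return glyph names for `text`; the underlying characters are untouched."""
    glyphs = []
    pos = 0
    for m in FRACTION.finditer(text):
        glyphs += [GLYPH.get(c, c) for c in text[pos:m.start()]]
        whole, num, den = m.groups()
        if whole:
            glyphs += [GLYPH[d] for d in whole] + ["thinspace"]
        glyphs += [GLYPH[d] + ".numr" for d in num]
        glyphs.append("fraction")
        glyphs += [GLYPH[d] + ".dnom" for d in den]
        pos = m.end()
    glyphs += [GLYPH.get(c, c) for c in text[pos:]]
    return glyphs


print(shape("1 1/2"))       # mixed number: whole part, thin space, fraction
print(shape("2018/01/07"))  # two slashes: left as ordinary figures
```

Only the glyph sequence changes; the text string passed in is never modified, which is the point of doing this with alternate glyphs.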

Now, some fonts may have variant coding which isn’t quite so thorough, for instance requiring each individual instance to be selected, or (wrongly) altering the text, but in the absence of an officially mandated standard, the presence of different outcomes with fonts from different foundries is impossible to avoid—it’s just another aspect of quality (in the broadest sense) that results from variety in the marketplace. Undesirable, perhaps, from the user’s perspective, but, on the other hand, a necessary evil given the fact that standards exist as a continuum, many by “social contract”, as it were, and that there is a great diversity of font makers.

André G. Isaak · January 2018

André, isn’t the <frac> feature quite straightforward?

The method you outline is certainly commonplace, but when applied at the paragraph level it entails some guesswork on the part of the application (i.e. excluding things like 2018/01/07) and so forth. That guesswork gets considerably more complex if the font allows the creation of fractions from things like x/y or 2n!/z, and switching from a font which supports only numeric fractions to one that supports additional fractions may well result in text becoming mangled. Also, until applications start supporting type 8 lookups (reverse chaining) Tal Leming’s approach involves imposing an arbitrary length restriction on fractions.

This is why I’ve always preferred fonts which require you to select the thing you want to be a fraction, not the paragraph as a whole. I suspect I’m probably the minority in this regard, but I don’t want my fonts trying to guess my intentions (just like I don't want MS Word trying to guess which kind of quotes I want).

André

Nick Shinn · January 2018

(i.e. excluding things like 2018/01/07)

Tal Leming’s approach involves imposing an arbitrary length restriction on fractions.

Actually, Tal’s method takes account of those—and although the length restriction is arbitrary, it’s quite long!

I take your point about complex fractions involving x and y etc., but wouldn’t a typographer be using special maths fonts for those?

Kent Lew · January 2018

Tal’s original {frac} code was partly a result of a conversation he and I and Christian Schwartz had. Prior to this, the Adobe approach required selecting and applying the feature only to the components involved, which was highly impractical for my work at the time with cookbooks, crafts & project books, and such. It seemed to me a more comprehensive approach was possible.

(This was back when I was still primarily a book designer, and before I became adept at OTL feature-writing myself. I believe that GREP searches and GREP styles were not widely implemented yet at that time. So, such fractions were rather production intensive and time-consuming.)

The global {frac} code still works, of course, when the layout has applied it selectively only to the fraction components themselves. The contextual approach allows the possibility of applying {frac} at a paragraph or whole-text level when the nature of the text is amenable. But it does not require it be applied globally, if either the text doesn’t lend itself or if the workflow environment argues against such an application (e.g., texts destined to be re-purposed in multiple formats with unspecified fonts).

I agree that having multiple, incompatible approaches in the marketplace is not ideal and can cause problems. The main problem is when text that has successfully had {frac} applied globally is changed to a font that has Adobe-style {frac} code.

I don’t necessarily consider that the fault of the contextual-{frac} font, however.

Just as with Smart Quotes, the contextual {frac} feature necessarily involves compromise; it is not (and cannot be) fool-proof.

In my opinion, all such algorithmic features, like Smart Quotes and contextual {frac}, are an assistive convenience, not a replacement for knowledge and discretion. The responsibility still lies with the user.

John Savard · January 2018

In the past, people entered text on keyboards that had the ASCII character ' (U+0027) and possibly the ASCII character ` (U+0060). Unicode didn't exist yet, so there was no way to specify ´ (U+00B4) acute accent, ʹ (U+02B9) modifier letter prime, ‘ (U+2018) left single quotation mark, or ’ (U+2019) right single quotation mark.

Since the bi-directional quote ' only existed on a typewriter, while printer's fonts had left and right single quotes, with the right single quote also being used as the apostrophe, it seemed reasonable to substitute the right single quotation mark for 0027 and the left single quotation mark for 0060. That is just a character translation; it leaves control in the hands of the user, which is important since simple software wouldn't be able to detect apostrophes.
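
As a minimal sketch of what "just a character translation" means, here is the one-to-one mapping in Python. The table and function names are hypothetical; the mapping is context-free, so the user decides which mark appears by choosing which key to press.

```python
# One-to-one translation: no context is examined, so nothing is guessed.
TYPEWRITER_TO_PRINTER = str.maketrans({
    "'": "\u2019",  # U+0027 APOSTROPHE   -> U+2019 RIGHT SINGLE QUOTATION MARK
    "`": "\u2018",  # U+0060 GRAVE ACCENT -> U+2018 LEFT SINGLE QUOTATION MARK
})


def translate_quotes(text: str) -> str:
    """Map ' and ` to right and left single quotes; leave everything else."""
    return text.translate(TYPEWRITER_TO_PRINTER)


print(translate_quotes("`rock 'n' roll'"))
```

Typing ` for an opening quote and ' for a closing quote or apostrophe gives the intended marks every time, and the double quote " passes through unchanged because it has no second key to pair with.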

There was, however, no companion character to ", hence "smart quotes".

Putting something like smart quotes in a font would indeed be a bad thing. For the simple reason that there's no way to turn them off.

But in a word processor, they can be turned off with a menu option, if the software is designed correctly. In that case, it isn't a hack, it's an input method.

A computer terminal keyboard is not quite the same as a typewriter keyboard; for American English, for example, it replaces ¼, ½, and ¢ with [, ], and ^... and perhaps ± and ° by ~ and `, and ¶ and § by < and >. (and £ and ¾ (or ³ and ²) by { and }) But neither one is a typesetting keyboard, which would perhaps replace " and ' with ‘ and ’, and which might also offer † and ‡.

So a word processing program could offer options to simulate both of those kinds of keyboard with one's computer keyboard - although, as far as I know, no one has tried to simulate a typewriter keyboard on a computer terminal keyboard. Instead, the extra characters might be made available via the Alt key (which acts like the Code key on some older IBM devices).
