Open source implementations for stacked fractions in OTL

John Hudson · February 2014

https://github.com/TiroTypeworks/Nutso

Two related implementations for handling arbitrary vertically stacked fractions in OpenType Layout. These are licensed under the Apache 2.0 license, so can be used in non-open licensed fonts.

Kent Lew · February 2014

Thanks for the mention. Do you think we could persuade Miguel to also translate the .vtp of Nutso2 to .fea syntax? I might be interested in trying to marry your {mark}/{mkmk} back with some of my and Tal’s approaches.

The hitch there is that the World-Ready Composer, which is required for {mark} activation, has a bug that interferes with one part of Tal’s “fraction fever” {frac}. The World-Ready Composer appears not to properly respect any substitutions involving an alternate space glyph, so the adjustment between integers and fractions is not implemented. I don’t know if there’s a good reason for that and/or if it will be fixed at any point.

John Hudson · February 2014

Yes, I'm hoping Miguel will add .fea code for the Nutso2 implementation.

While Tal's contextual fraction isolation is clever, I think this is properly something for markup or, if a robust plain text mechanism is needed, use of ZWNJ to separate integers and fractions. While Tal's approach is convenient, I think it is a mistake. The {frac} feature is a formatting feature for particular pieces of text, and having it turned on for whole documents makes no sense to me: it's like turning on italics for a whole document and then using contextual substitutions to determine what text you actually want to display in italics. It also means that text built in this way will break when it is changed to a font that uses a selection targeted {frac} implementation.

The Unicode Standard actually specifies fraction behaviour with regard to the U+2044 slash, and if I were working on text that made heavy use of fractions I would definitely try to conform to that behaviour, which would mean a) applying {frac} to fractions and not the whole text (note that the HarfBuzz layout engine does this automatically based on the Unicode specification for U+2044), and b) inserting a delimiter between integers and fractions. I think it makes sense for ZWNJ (U+200C) to be this delimiter.
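To make the delimiting rule concrete, here is a minimal Python sketch of how a text process might locate the runs that should receive {frac}: decimal digits, then U+2044, then decimal digits. It is an illustration only, not HarfBuzz's actual code, and the function name `fraction_spans` is hypothetical. A ZWNJ before the numerator ends the run, so the integer stays outside it.

```python
import re

FRACTION_SLASH = "\u2044"
ZWNJ = "\u200c"

# One or more decimal digits, the fraction slash, one or more decimal digits.
# In Python 3, \d matches Unicode decimal digits; ZWNJ is not a digit,
# so it terminates the numerator run.
_FRACTION = re.compile(r"\d+" + FRACTION_SLASH + r"\d+")


def fraction_spans(text):
    """Return (start, end) index pairs of runs that would receive {frac}."""
    return [m.span() for m in _FRACTION.finditer(text)]


# With a ZWNJ between integer and fraction, only "5/16" is selected.
print(fraction_spans("1" + ZWNJ + "5" + FRACTION_SLASH + "16"))  # [(2, 6)]
# Without the delimiter, the whole of "15/16" is one fraction.
print(fraction_spans("15" + FRACTION_SLASH + "16"))              # [(0, 5)]
```

An ordinary ASCII slash is deliberately not matched here, since the Unicode behaviour under discussion is tied to U+2044.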

Kent Lew · February 2014

As always, John, it is hard to argue with your logic. You know that I respect your rigorous standards-oriented stance. And the italic analogy is compelling.

In an ideal text production environment, I would of course agree with you entirely. However, I have no experience with such a beast. The trade publishing workflow (the Editorial-Design-Production machine that I am familiar with) is like an ocean liner: it does not turn on a dime. And the kinds of reforms that would approach such an ideal text production environment are still a long way off for most. I know that those improvements I was able to effect with my employer during the time I was in a position to do so were hard-fought and hard-won and far from sweeping.

So, I also have appreciation and even respect for a nondestructive, albeit nonstandard, pragmatic approach. There is nothing about Tal’s {frac} code that precludes it from being applied in a conventional delimited fashion.

“The Unicode Standard actually specifies fraction behaviour with regard to the U+2044 slash,”

Yes, and I don’t disagree that this properly belongs at the text-shaping level. But from a pragmatic standpoint, in the absence of a robust text-shaping engine in the layout app to handle the Unicode recommendation for U+2044 (and if you set aside for the time being that single line of code that targets the space, which is completely separate and severable from the global, contextual fraction mechanism), Tal’s “fraction-first” {frac} code essentially mimics the Unicode recommendation: it acts only upon those digits in contiguity with the slash, delimited by a non-digit.

Is this not the same contextual determination that a text-shaping engine makes (albeit based upon the character properties of the codepoints, instead of glyph names)? The fact that at the layout level we have to accomplish this by activating the {frac} feature globally, whereas a text-shaping engine would invoke the behavior or apply markup locally, seems almost a distinction without a difference.

(I suppose my case would be stronger if the code were revised to act solely upon the fraction glyph and not the slash; and if writers and editors were trained to reliably use that character, I would advocate a move in that direction.)

But, again setting aside that space substitution, there is nothing else about Tal’s code that seems to me to violate the spec for the {frac} feature — depending upon just how literally you read the recommendations.

Take the Function from the spec: “Replaces figures separated by a slash with 'common' (diagonal) fractions.” I realize that this element of the registry is always just a broad summary; but taken literally, one could argue that what I will call the “numr-first” conventional {frac} feature does not conform to this. If you select a given numeral and apply the {frac} feature, it will convert it to a numerator. That’s not the stated function of the feature. Based on the summary, one could reasonably expect the feature to apply a substitution only if the figure is separated by a slash from another.

Okay, the first part of Application Interface states: “The application must define the full sequence of GIDs to be replaced, based on user input (i.e. user selection determines the string's delimitation).” But as I said, there is nothing about Tal’s code that precludes this approach. It just happens to go beyond.

You’re right, of course: a text processed with a global application of this kind of {frac} will be out of sync when reset in a font with a “numr-first” {frac}. Think of global application as off-label use, to be done with full knowledge of the potential side effects. ;-)

And really — from the innocent user’s perspective, which feature do you think they would say is “broken”?

Still, I recognize that you find such enabling and compensatory behavior — and the arguments in favor, however pragmatic — to be wrong-headed.

Kent Lew · February 2014

Thinking more about my somewhat playful “off-label” characterization, I’ll add this:

The global applicability of Tal’s code is really just a byproduct of the “fraction-first” approach to arbitrary fractions — that is, the code’s contextualization keys primarily off the presence of the fraction bar.

In “numr-first” code, first any figure is converted to a numerator and any slash to fraction, indiscriminately. Only then is the fraction used as a contextual trigger to convert numerators to denominators. As a result, an application of this {frac} feature outside the bounds of a bona fide fraction will have the unintended consequence of converting a normal figure into a numerator.
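The two passes described above can be simulated on a list of glyph names. This is a Python sketch of the logic only, not anyone's actual feature code; the glyph names and the `.numr`/`.dnom` suffixes are assumptions chosen for illustration.

```python
FIGURES = set("zero one two three four five six seven eight nine".split())


def numr_first(glyphs):
    """Simulate a 'numr-first' {frac}: convert everything, then fix denominators."""
    # Pass 1: every figure becomes a numerator and every slash a fraction bar,
    # regardless of context.
    out = []
    for g in glyphs:
        if g in FIGURES:
            out.append(g + ".numr")
        elif g == "slash":
            out.append("fraction")
        else:
            out.append(g)
    # Pass 2: a numerator preceded by the fraction bar, or by a glyph already
    # turned into a denominator, becomes a denominator.
    for i in range(1, len(out)):
        prev = out[i - 1]
        if out[i].endswith(".numr") and (prev == "fraction" or prev.endswith(".dnom")):
            out[i] = out[i][: -len(".numr")] + ".dnom"
    return out


print(numr_first(["one", "slash", "one", "six"]))
# ['one.numr', 'fraction', 'one.dnom', 'six.dnom']
print(numr_first(["seven"]))
# ['seven.numr']
```

The second call shows the side effect in question: a lone figure that is not part of any fraction still comes out as a numerator.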

In the fraction-first code, a slash is only targeted for conversion to fraction based on a context of backtrack figure and lookahead figure. Then, the fraction itself is used as the key to contextually convert contiguous figures on either side to numerator or denominator as applicable. As I said, this more closely echoes the Unicode specification for U+2044 behavior, as well as the stated function from the spec.

As a byproduct, figures that do not occur in the context of a fraction (or pseudo-fraction, like a MM/DD date abbreviation) are not affected. Thus there are not nearly the same unintended consequences when applied broadly. And thus the opportunity for “off-label” use, albeit at one’s own risk.
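For comparison, here is a Python sketch of the fraction-first logic on a list of glyph names. It simulates the idea only and is not Tal's code, which has more to it (the space adjustment, for one); the glyph names and suffixes are again assumptions.

```python
FIGURES = set("zero one two three four five six seven eight nine".split())


def fraction_first(glyphs):
    """Simulate a 'fraction-first' {frac}: the fraction bar is the only trigger."""
    out = list(glyphs)
    n = len(out)
    # Step 1: a slash becomes a fraction bar only between two figures.
    for i in range(1, n - 1):
        if out[i] == "slash" and out[i - 1] in FIGURES and out[i + 1] in FIGURES:
            out[i] = "fraction"
    # Step 2: walk outward from each fraction bar, converting contiguous figures.
    for i in range(n):
        if out[i] != "fraction":
            continue
        j = i - 1
        while j >= 0 and out[j] in FIGURES:
            out[j] += ".numr"
            j -= 1
        j = i + 1
        while j < n and out[j] in FIGURES:
            out[j] += ".dnom"
            j += 1
    return out


print(fraction_first("one space five slash one six".split()))
# ['one', 'space', 'five.numr', 'fraction', 'one.dnom', 'six.dnom']
print(fraction_first(["seven"]))
# ['seven']
```

The integer before the space and the lone figure are both left alone, which is why broad application has fewer side effects. A date such as 12/25 would still be converted, as noted above.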

Regardless of one’s stance on the wisdom of applying such a {frac} feature on a widespread basis, I stand by the fraction-first approach as being the more precise and rigorous routine.

Kent Lew · February 2014

Getting back to the other issue of integer + fraction. I think this is a challenging and possibly unique quandary. In plain text you want an unambiguous delineation; in typeset text, you want a closely integrated presentation. Is it possible to effect a non-lossy round-trip between the two? That was part of the challenge I put to Tal initially.

I don’t agree that the ZWNJ is a robust plain text mechanism for separating integers and associated fractions.

You tell me: in this [relatively] plain text forum environment which doesn’t happen to recognize my fraction markup — is the following expression greater than or less than 1?:

1‌5⁄16

In fact it is greater than 1. What I intended to signify with numerals was “one and five-sixteenths.” But with a ZWNJ for a delineator, this is not unambiguous in plain text.
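The two readings can be worked through in a few lines of Python. This is a sketch; `parse_mixed` is a hypothetical helper that treats a ZWNJ, when present, as the boundary between integer and fraction.

```python
from fractions import Fraction

ZWNJ = "\u200c"
FRACTION_SLASH = "\u2044"


def parse_mixed(text):
    """Parse an optional integer plus ZWNJ, then numerator, U+2044, denominator."""
    # rpartition returns ('', '', text) when no ZWNJ is present.
    whole, sep, rest = text.rpartition(ZWNJ)
    num, _, den = rest.partition(FRACTION_SLASH)
    value = Fraction(int(num), int(den))
    return value + int(whole) if sep else value


encoded = "1" + ZWNJ + "5" + FRACTION_SLASH + "16"  # what was typed
as_seen = "15" + FRACTION_SLASH + "16"              # what a reader sees in plain text

print(parse_mixed(encoded))  # 21/16
print(parse_mixed(as_seen))  # 15/16
```

The encoded string means 21/16, but because the ZWNJ is invisible, a human reader of the plain text sees the digits of 15/16.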

I continue to think that this challenge is best met at the presentation level, not in the text encoding. In many ways, it remains a typesetter’s problem. Implementing this presentational adjustment in {frac} seems a convenient mechanism to me. Even if one heeds the injunction not to apply {frac} globally, there is the option to sweep up an intervening space between integer and fraction within one’s <frac> markup where applicable.

This one line is arguably the most nonstandard and perhaps controversial aspect of Tal’s code.

I am open to other ideas about solutions or mechanisms, but this one still strikes me as eminently practical and efficient.

John Hudson · February 2014

Nicely thought through, Kent. When I originally developed the 'numerator-first' model for arbitrary fractions, I think I favoured it mainly because it was more efficient, and existing fraction ligature implementations already assumed that the feature would be applied selectively. It never crossed my mind that someone would want to turn on {frac} for entire documents, so having a more complex contextual lookup structure didn't seem a reasonable trade-off. [At the time, I was a lot more fussed about OTL performance hits than I am now, both because of less powerful computing and fewer optimisations in the table code produced by OTL tools.]

Miguel has now pushed .fea code for the Nutso2 implementation to GitHub.

I suppose the next stage would be a Nutso3 implementation that does 'fraction-first' and enables something like Tal's contextual handling. I think that's something for someone else to contribute. Hint.

Kent Lew · February 2014

Thanks to Miguel for the .fea translation.

As you know, I already have my own implementation for stacked piece fractions that builds on the fraction-first model, and/but which does not use your mark-anchor approach. I think the practical advantage of my GPOS type 8 approach is that it does not rely upon the World-Ready Composer, and thus works by default in InDesign going back to at least CS3.

Karsten made a suggestion to me that, if I can figure out how to implement and compile, might even make it workable in Illustrator.

I don’t know if/when I might release my code publicly. I’m still working on generalizing formulas for the components and lookups to enable easily building support for n places and possibly combining it all into a script that takes a few initial conditions & variables and constructs the rest.

But philosophically I am interested in your distinction between mark vs kern positioning, and so I might be interested in further investigating your GPOS 4/6 approach as well, as I mentioned initially here.

If I do so, obviously I will contribute that to the public repository.
