Precomposed fractions — waste of time and space?

Adam Jagosz · January 2020

In a quick peek through MyFonts, I was only able to find a handful of foundries that provide precomposed fractions. Are they considered obsolete, kind of like /ldot and /napostrophe?

Thomas Phinney · January 2020

It really depends on your audience.

In web fonts, file size matters. Precomposed glyphs made from components (TT) or subroutinized away (CFF) are smaller than regular glyphs, but still take up some space.

(Same with precomposed diacritic+base-char combinations that could be dynamically created with GPOS. With the diacritics, precomposed have some benefits for kerning from having real glyphs. But there is still the size penalty. Tradeoffs....)

The only reason to have precomposed glyphs for fractions would be if the text engines are not smart enough to compose the characters from glyph pieces on the fly. I would be curious to hear the results of testing on this, how clever the various engines are these days—and how far back one has to go to get failures. There are still people using Internet Explorer, even.

Adam Jagosz · January 2020

Thomas Phinney said:

The only reason to have precomposed glyphs for fractions would be if the text engines are not smart enough to compose the characters from glyph pieces on the fly.

Are you alluding to means of composing fractions other than the OpenType frac feature?

Erwin Denissen · January 2020

In general include all characters you think your customers use.

But which ones do you mean, only ¼ ½ ¾, or more?

Adam Jagosz · January 2020

Very few fonts include the extended fractions, but even basic fractions are a rarity in new fonts. Some fonts don't even include encoded subscript and superscript numbers, even though they do have the glyphs for sups and subs features.

Image: https://us.v-cdn.net/5019405/uploads/editor/ca/nggckiego890.png

This is technically impeccable by conforming to the rule of substituting glyphs but never characters. Is there a sound rationale behind this rule, though? (A better one than being able to distill character runs from an Adobe PDF stream encoded in some long-forgotten standard — which doesn't even apply in this case, as the original text will contain regular figures, and obtaining encoded superscripts could be seen as an improvement.)

Would it not be practical to encode the superscript and subscript glyphs with 2070-2079 (plus 00B9, 00B2, 00B3) and 2080-2089?
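
As a rough illustration of that proposal, the sketch below builds the extra cmap entries that would encode existing sups/subs glyphs. The glyph names (`two.sups`, `four.subs`, and so on) are assumed naming conventions, not taken from any particular font. Note that the superscript block is not contiguous: 1, 2 and 3 live at U+00B9, U+00B2 and U+00B3, while U+2071 is a superscript letter i and U+2072/U+2073 are unassigned.

```python
# Unicode code points for superscript and subscript digits 0-9.
# Superscript 1, 2, 3 are legacy Latin-1 characters, so the list has gaps.
SUPERSCRIPT_DIGITS = [0x2070, 0x00B9, 0x00B2, 0x00B3,
                      0x2074, 0x2075, 0x2076, 0x2077, 0x2078, 0x2079]
SUBSCRIPT_DIGITS = list(range(0x2080, 0x208A))  # U+2080..U+2089

DIGIT_NAMES = ["zero", "one", "two", "three", "four",
               "five", "six", "seven", "eight", "nine"]


def extra_cmap_entries(sups_suffix=".sups", subs_suffix=".subs"):
    """Return {code point: glyph name} for already-drawn sups/subs glyphs.

    The suffixes are hypothetical; use whatever the font actually calls them.
    """
    entries = {}
    for name, sup, sub in zip(DIGIT_NAMES, SUPERSCRIPT_DIGITS, SUBSCRIPT_DIGITS):
        entries[sup] = name + sups_suffix
        entries[sub] = name + subs_suffix
    return entries


cmap = extra_cmap_entries()
print(cmap[0x00B2])  # two.sups
print(len(cmap))     # 20
```

Twenty extra cmap entries pointing at glyphs that already exist cost very little in file size.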

Btw, can you set fractions next to old-style figures (in the integer part)?

Adam Jagosz · January 2020

MS Office, for one, doesn't support any fractions other than precomposed ones (apart from the Math module in Word, which uses Cambria Math). LibreOffice does, through this one nifty hack.

Image: https://us.v-cdn.net/5019405/uploads/editor/a2/dxsmtj2no7o2.png

Adam Jagosz · January 2020

As for webfonts and size considerations... is it a common situation that subsetting cannot be applied? Isn't restricting the base character set a sort of premature optimization?

Helmut Wollmersdorfer · January 2020

If the glyph is in the font, it makes no sense for it to be reachable only through rules and not through its code point. I discussed this with the author of a font in which the long s had a code point in the PUA but could not be used directly through its standard code point. You could, however, write the text with round s, and the complicated German rules for long s were applied. He didn't understand my point that there are cases where a long s appearing without context, or even against the rules, is the right thing. I just changed the font myself.

If the glyph is in the font it can be addressed by additional code points. That doesn't take much space.

Thomas Phinney · January 2020

> Very few fonts include the extended fractions, but even basic fractions are a rarity in new fonts.

I would be sure to cover the three basic WinANSI fractions as a minimum in most fonts (except possibly a wacky display face).

If one has actual position-shifted glyphs in the font (as opposed to using GPOS for the features, both are options), then I favor always supporting both hardcoded Unicode, and OpenType features, for the superscript and subscript glyphs.

The only question then is whether to do that with a single set of glyphs, or two sets. The general OpenType philosophy is that formatting shouldn’t change code points (an argument for using duplicate glyphs). But the underlying problem is that we have an OpenType feature that changes character codes, which OT features shouldn’t do.

Yet I don’t see anyone deprecating the sups/subs/numr/dnom features. And they are super handy for users. So ...

Andreas Stötzner · January 2020

Please help me: I always thought that subscript and superscript characters are to be distinguished from the numerator/denominator sets (which exist as glyphs, not as characters).

Am I wrong?

Daniel Benjamin Miller · January 2020

I've seen fonts where fraction features are included, as well as the ANSI fractions, but the result of using the fraction feature is not the same as the display of the ANSI fraction (e.g., the numbers are different sizes)!

Adam Jagosz · January 2020

Thomas Phinney said:

The general OpenType philosophy is that formatting shouldn’t change code points (an argument for using duplicate glyphs). But the underlying problem is that we have an OpenType feature that changes character codes, which OT features shouldn’t do.

That's exactly what I've heard repeated ad nauseam. BUT. Is there actually an environment that will allow you to detect the character code change? In any app I tried that in (apart from FontLab), after copying the text with applied features, I still got the original textual representation. OpenType features are something applied on top of that, in the layout process, and don't actually change anything for good. If an app allows you to copy text with character codes changed by OpenType, I'd blame the app.

This rule is more about not replacing (c) with the copyright sign, or devising new clever solutions like => turning into an arrow or the already-not-so-recent trend of programming ligatures. These things might be useful and an attractive novelty, but must be applied with consideration if at all. I think it isn't harmful to assume that subs, sups, and maybe ordn could be safely exempted, because these substitutions do not obfuscate the semantics.

@"Andreas Stötzner" I think you're right, as they mostly demand different alignment. But they do not need differentiation per se, so it depends on the design (vide wacky display fonts).

Adam Jagosz · January 2020

Thomas Phinney said:

I would be sure to cover the three basic WinANSI fractions as a minimum in most fonts (except possibly a wacky display face).

I'd drop in /onethird and /twothirds as well.

John Savard · January 2020

Adam Jagosz said:

Thomas Phinney said:

I would be sure to cover the three basic WinANSI fractions as a minimum in most fonts (except possibly a wacky display face).

I'd drop in /onethird and /twothirds as well.

Nothing wrong with that. However, there is a reason why ½, ¼, and ¾ are more important. They appear as characters on several national keyboards for Microsoft Windows.

In fact, this makes me realize that a list of "characters accessible by ordinary typing" from Windows, the Macintosh, or Linux would be a useful resource, as it would help to define a "minimum" set of characters for a font to support.

Of course, most fonts would still support less than that, since it is possible to type, using built-in Input Method Editors, in the East Asian languages on a modern computer. So one would still have to also further restrict language support.

John Savard · January 2020

Adam Jagosz said:

This is technically impeccable by conforming to the rule of substituting glyphs but never characters. Is there a sound rationale behind this rule, though?

There is a sound rationale behind the rule of substituting glyphs but never characters.

But that rule does not explain why H-subscript 2-O should ever fail to print correctly. If the font has the glyph, then it should use the glyph whenever the characters call for it, in rendering all the characters to which the glyph is applicable.

Thomas Phinney explained that apparently fonts need to break the rule to work because of a mistake in OpenType, which apparently is genuinely useful because of older fonts that don't use certain glyphs everywhere that they would be useful.

For a font, rather than a word processor, to turn 1/2 into ½ is bad. While I'm not familiar with the technical details of OpenType, it appears to me that what is being discussed is, for a typeface lacking both a desired fraction as a precomposed fraction and the digits used for composing fractions, to use the superscript and subscript numerals for the same purpose.

No rule is broken if a word processor does this. There's no reason to break this rule inside a font that I can see either; if a font wishes to use the same glyph as a superscript digit and as part of composed fractions, it can do so. However, I didn't think that superscripts and subscripts worked that way. I thought that when you said "superscript", you used all the font's regular glyphs, in a smaller point size and repositioned, and there was no such thing as a glyph specifically for a superscript or subscript character (with the exception of superscript and subscript characters uttered in response to certain Unicode code points).

In that case, using superscripts and subscripts to compose fractions seems to require referring to the parent characters of the glyphs for the regular digits.

However, it is at this point that I become confused.

If OpenType lets you change the size of a glyph on the fly when using it, so that one doesn't need to draw the digit glyphs again, or even copy and manipulate them, to create glyphs for digits used in composing fractions, then one can go from a composed character digit to a digit glyph with a size and position specification in the font. So if the word processor lets you change the size of superscripts and subscripts (and why not?) those changes would not affect fractions, as should be the case.

A feature which simply allows a font designer to have a pointer point to the wrong thing doesn't seem to be useful. So I'm clearly missing something here.

Adam Jagosz · January 2020

> to turn 1/2 into ½ is bad

That's how the frac OT feature works (kind of, actually the result should be one.numr fraction two.dnom). But this feature is discretionary, so I don't see a problem with that.

> If OpenType lets you change the size of a glyph on the fly when using it, so that one doesn't need to draw the digit glyphs again

Not per se. OT fonts contain numeric indications of how faux super- and subscripts should be sized and positioned, but these are not always respected by software (just like the underline instructions). For true sub- and superscripts, we do want to draw them again.
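
For reference, those numeric indications are the OS/2 table fields such as ySuperscriptYSize and ySuperscriptYOffset, expressed in font units. A minimal sketch of how an application might turn them into a faux superscript follows; the metric values are invented for illustration, and whether a given application honours the fields at all is exactly the open question.

```python
from dataclasses import dataclass


@dataclass
class ScriptMetrics:
    units_per_em: int
    y_size: int    # e.g. OS/2 ySuperscriptYSize, in font units
    y_offset: int  # e.g. OS/2 ySuperscriptYOffset, in font units


def faux_superscript(metrics, font_size_pt):
    """Return (scaled size, baseline shift) in points for a synthesized superscript."""
    size = font_size_pt * metrics.y_size / metrics.units_per_em
    shift = font_size_pt * metrics.y_offset / metrics.units_per_em
    return size, shift


# Hypothetical values, not measured from any real font.
metrics = ScriptMetrics(units_per_em=1000, y_size=600, y_offset=350)
print(faux_superscript(metrics, 10))  # (6.0, 3.5)
```

Scaling the regular digits this way also thins their strokes, which is one reason to draw true sub- and superscripts separately.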

Erwin Denissen · January 2020

Adam Jagosz said:

> to turn 1/2 into ½ is bad
That's how the frac OT feature works

There is a huge difference between a word processor that auto-magically changes a sequence of characters into another character, and an OpenType feature that shows a substitute, but doesn't change the codepoints.

There are only a few special features (e.g. subs and sups) that do change codepoints, but that shouldn't affect fractions.

Adam Jagosz · January 2020

The difference is essential, I agree. But the only codepoint changed by frac is slash -> fraction (the numerators and denominators are unencoded). The rationale is that fraction is hard to type, I presume.

John Hudson · January 2020

I almost always include the basic three (¼ ½ ¾) because they're part of the ANSI character subset, but only include more precomposed fractions if clients request them.
_____

But the only codepoint changed by frac is slash -> fraction

Note that there are no codepoint changes in OpenType GSUB: all GSUB lookup types apply only at the glyph level, and the underlying text string remains unchanged. I presume what you mean is that the substitution points from a glyph that is mapped to one codepoint to a glyph that is mapped to another codepoint. That's discouraged for a couple of reasons, but the only time it actually results in a codepoint change is in a PDF distillation workflow where the original text string is lost and a new one constructed by parsing glyph names.
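
A toy model of that distinction, with hypothetical glyph names and a hard-wired stand-in for a frac lookup: the stored string never changes, but rebuilding text from glyph names (dropping the suffix after the first period, then looking up the base name) yields a different string, because `fraction` maps to U+2044.

```python
# Tiny excerpt of name/character data; the names follow common
# Adobe Glyph List conventions, the glyph set itself is hypothetical.
CHAR_TO_NAME = {"1": "one", "2": "two", "/": "slash"}
NAME_TO_CHAR = {"one": "1", "two": "2", "slash": "/", "fraction": "\u2044"}


def shape_with_frac(text):
    """Return the glyph run for `text`; the input string is never modified."""
    glyphs = [CHAR_TO_NAME[ch] for ch in text]
    # Toy stand-in for a frac lookup, valid only for the input "1/2".
    toy_frac = {"one": "one.numr", "slash": "fraction", "two": "two.dnom"}
    return [toy_frac[g] for g in glyphs]


def text_from_glyph_names(glyphs):
    """Rebuild text the way a glyph-name parser would."""
    return "".join(NAME_TO_CHAR[g.split(".", 1)[0]] for g in glyphs)


text = "1/2"
run = shape_with_frac(text)
print(run)                                       # ['one.numr', 'fraction', 'two.dnom']
print(text == "1/2")                             # True: the stored string is untouched
print(text_from_glyph_names(run) == "1\u20442")  # True: the slash came back as U+2044
```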

Simon Cozens · January 2020

John Savard said:
For a font, rather than a word processor, to turn 1/2 into ½ is bad.

And yet a font turning f i into ﬁ (and here I have copied-and-pasted in U+FB01 LATIN SMALL LIGATURE FI) is just fine? It's the same thing - and in neither case are codepoint substitutions involved.

The thing is, formatting never changes codepoints, as John mentions. In fact, once the character to glyph mapping has been done by the shaper, codepoints are no more - we're just in glyph land from now on. (Indeed, the "ccmp" feature really should be called "gcmp" and the words "maps the character sequence" in the Standard's description of ccmp are arguably a bug.)

What I've written about Turkish i and the small caps feature may be helpful here.

Erwin Denissen · January 2020

I find the specs rather unclear. I probably should have paid more attention at school...

What is meant with:

When the 'frac' table does not contain an exact match, the application performs two steps. First, it uses the 'numr' feature to replace figures (as used in the 'numr' coverage table) preceding the slash with numerators, and to replace the typographic slash character (U+002F) with the fraction slash character (U+2044).

And with:

For GIDs found in the 'subs' coverage table, the application passes a GID to the feature and gets back a new GID. Note: This is a change of semantic value. Besides the original character codes, the application should store the code for the new character.

John Hudson · January 2020

Ignore the spec for the frac feature. It's still the original text as registered by Adobe way back in the 1990s before anyone had actually implemented OTL. Adobe had this complex idea that some features would need to reference other features — despite no way to actually do this in the GSUB or GPOS table structure —, and that the frac feature would be a kind of higher level interface to the numr and dnom features. In fact, the numr and dnom features are completely unnecessary, and should have been deprecated when everyone, including Adobe, started implementing frac using chained contextual substitutions that enable arbitrary fractions.
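
To make the contextual idea concrete, here is a small Python simulation (not feature code, and not any foundry's actual lookups): a slash with digits on both sides becomes `fraction`, the digit run before it becomes numerators, and the digit run after it becomes denominators. The `.numr`/`.dnom` glyph names are assumed conventions.

```python
DIGITS = {"zero", "one", "two", "three", "four",
          "five", "six", "seven", "eight", "nine"}


def apply_frac(glyphs):
    """Simulate a context-sensitive frac on a list of glyph names."""
    out = list(glyphs)
    for i, g in enumerate(glyphs):
        if g != "slash" or i == 0 or i + 1 >= len(glyphs):
            continue
        # Only act when the slash sits directly between two digits.
        if glyphs[i - 1] not in DIGITS or glyphs[i + 1] not in DIGITS:
            continue
        out[i] = "fraction"
        j = i - 1
        while j >= 0 and glyphs[j] in DIGITS:      # digit run before the slash
            out[j] = glyphs[j] + ".numr"
            j -= 1
        j = i + 1
        while j < len(glyphs) and glyphs[j] in DIGITS:  # digit run after it
            out[j] = glyphs[j] + ".dnom"
            j += 1
    return out


# "2 13/16": the integer part is separated by a space and stays full-size.
print(apply_frac(["two", "space", "one", "three", "slash", "one", "six"]))
# ['two', 'space', 'one.numr', 'three.numr', 'fraction', 'one.dnom', 'six.dnom']
```

Because the rule keys on context rather than on a fixed list, any numerator/denominator pair works without a precomposed glyph.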

John Hudson · January 2020

[I've been pushing for years for a total editorial overhaul of the OTL feature registry to bring the feature descriptions in line with actual implementations, but no one's come up with a budget for the work. I was able to get the topographical/joining form features — isol, init, medi, fina — revised a couple of years ago, so those now reflect actual use, but there's still a lot of outdated and just plain wrong information in the registry.]

Erwin Denissen · January 2020

John Hudson said:

Ignore the spec for the frac feature.

This is all important information and an update to the specs is welcome indeed.

John Hudson said:

In fact, the numr and dnom features are completely unnecessary, and should have been deprecated when everyone, including Adobe, started implementing frac using chained contextual substitutions that enable arbitrary fractions.

FontCreator also generates frac with chained contextual substitutions, as shown here:

Simon Cozens said:

What I've written about Turkish i and the small caps feature may be helpful here.

We've written a similar tutorial: Localized Forms – Turkish i.

Georg Seifert · January 2020

The hard-coded fi ligature is just a stylistic choice, whereas a hard-coded fraction actually has meaning. 1/2 can mean something other than ½.

Adam Jagosz · January 2020

A hard-coded fi ligature is a disservice to everyone. It renders the text unsearchable and unnormalized. It belongs entirely to the song of the past.
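
A quick way to see the search problem, using only Python's standard library: a plain substring search misses a word containing U+FB01 unless the text is first folded with compatibility normalization (NFKC). Whether a given search tool applies such folding is an assumption one cannot rely on.

```python
import unicodedata


def fold(s):
    """Apply Unicode compatibility normalization (NFKC)."""
    return unicodedata.normalize("NFKC", s)


ligated = "of\uFB01ce"  # "office" typed with U+FB01 LATIN SMALL LIGATURE FI

print("office" in ligated)           # False: a naive search misses it
print("office" in fold(ligated))     # True: NFKC expands the ligature to "fi"
print(fold("\u00BD") == "1\u20442")  # True: ½ folds to 1, FRACTION SLASH, 2
```

The last line also shows that ½ normalizes to a sequence with U+2044 FRACTION SLASH, not the ordinary slash.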

Adam Jagosz · January 2020

@Georg Seifert All correct but what is your point? Once again: frac is discretionary. Activated at user's discretion. Most likely applied to a selection rather than a whole block of text. Right?

Thomas Phinney · January 2020

Agreed on the local vs global. Early on, some people thought that 'frac' could be applied globally, assuming the implementation would always be programmed so it only worked in and around the slash.

There were two problems with that theory: (1) the assumption wouldn’t always hold; and (2) sometimes it is very hard to tell other things apart from fractions (9/11 being one classic example).

Nick Shinn · January 2020

Image: https://us.v-cdn.net/5019405/uploads/editor/ys/0hgcjim1uayi.jpg

I have included a basic repertoire of Unicoded, precomposed, nut (stacked) fractions in several fonts, activated by the <frac> feature.

Above: Dair. Left: osf + nut, Centre: lining + “slash virgule”, Right: osf + “slash virgule”.

The reason: “slash” fraction figures are too similar in size to old-style figures (see above right). So this is a good practice for fonts where the default figure style is old-style.

Also, I think nut fractions are really cool; I have tried to code them in OpenType with numerators and denominators, but it was much too tricky!

A complication is that if text includes fractions not included in the precomposed range, those will show as “slash virgule” fractions, and for consistency’s sake the other, nut fractions have to be de-activated by applying a Stylistic Set.

Nick Shinn · January 2020

@Thomas Phinney:

There were two problems with that theory: (1) the assumption wouldn’t always hold; and (2) sometimes it is very hard to tell other things apart from fractions (9/11 being one classic example).

The <frac> feature may be applied selectively or globally, like any other feature. Why is this a problem?!

But even if it is, there are work-arounds, like the f_i ligature in Turkish reading as f_dotlessi, so there is a <locl> fix. Certainly, the global <frac> code I use dis-applies it to date text in the 00/00/00 format.
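
The date exclusion can be sketched outside a font with a regular expression; this is only an analogy for the kind of contextual rule involved, not the actual feature code. The pattern accepts a digit run, one slash and a digit run, and rejects anything with a further slash or digit on either side.

```python
import re

# "01/02/20" never matches because every candidate pair touches another slash.
FRACTION = re.compile(r"(?<![\d/])\d+/\d+(?![\d/])")


def fraction_candidates(text):
    """Return substrings that look like simple fractions, skipping a/b/c dates."""
    return FRACTION.findall(text)


print(fraction_candidates("Mix 1 1/2 cups on 01/02/20"))  # ['1/2']
print(fraction_candidates("9/11"))                        # ['9/11']
```

The second call shows the limit of any such rule: a two-part date like 9/11 is indistinguishable from a fraction by form alone.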

In general, I think that a global <frac> feature is very useful for things like recipes.

Adam Jagosz · January 2020

In cookbooks, there's a good chance there won't be a date in the month/day format. Again, it is at user's discretion to consider whether or not there appear dates in their text, and then apply fractions globally or selectively.
