Making JSTF better

Simon Cozens · January 2020

I've been trying to implement OpenType justification (JSTF table) in SILE, and, well, I can see why nobody else has done much with it so far. If I were writing the specification, here are some things about JSTF that I would change:

The concept of extender glyphs is underspecified. Here is what the spec says about it: "Each script also may supply a list of extender glyphs, such as kashidas in Arabic. A client may use the extender glyphs in addition to the justification suggestions." This does not tell you (a) where extender glyphs may be inserted; (b) when they may be inserted and how this interacts with the other extension suggestions (at the start? before each one? at the end?); and (c) how this interacts with shaping. (If the client inserts a tatweel between beh and reh, are we expected to get another round of shaping to tidy up the connections?) I'm also unconvinced that the idea is necessary. You can get around all of the above problems by removing the extender glyphs concept altogether and encoding the kashida insertion rules as substitution suggestions. (suggestion 1: "sub beh' reh' by beh_tatweel1_reh", suggestion 2: "sub beh' reh' by beh_tatweel2_reh", etc.)
The fact that each suggestion includes ways of shrinking and expanding a line is confusing. Are we supposed to use both at the same time? A list of ways to make a line longer and a list of ways to make a line shorter would make more sense and be easier to process.
A priority list of suggestions is not the most flexible way to communicate extension/shrinkage opportunities. From a layout engine perspective, justification is normally implemented using a dynamic programming algorithm to minimize a penalty variable. It would be helpful to have the font specify penalty values for each substitution, rather than a flat order of execution.
The OpenType Specification's references to "clients" are not helpful. There is often a division of labour in OpenType into three parts - font, shaper, layout engine. Shapers deal with text runs and don't really know about lines. Layout engines, which know about lines and talk to shapers, can tell shapers about features but not about lookups. So specifying the suggestions in terms of lookups doesn't help anyone. More generally, implementation-wise, the JSTF table is a mess. It's patterned somewhat after GSUB/GPOS, but not enough that we can reuse code. It contains references to data in two different external tables, so you have to jump into those anyway, but it uses its *own* language/script matrix instead of theirs.
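
To make the point about penalties concrete, here is a minimal sketch of how a layout engine could use per-suggestion penalties instead of a fixed priority order. Nothing here comes from the JSTF specification or from SILE: the `Suggestion` record, the penalty numbers, and the `space_cost` parameter are all hypothetical, and a real engine would work on whole paragraphs rather than a single line.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Suggestion:
    """A hypothetical justification suggestion supplied by the font."""
    name: str     # illustrative label only
    delta: int    # change in line width, in font units (positive = wider)
    penalty: int  # assumed cost the font assigns to using this suggestion


def cheapest_fit(shortfall: int, suggestions: list[Suggestion],
                 space_cost: int = 1) -> tuple[int, int, list[str]]:
    """Choose the subset of suggestions with the lowest total cost.

    Cost = sum of the chosen penalties + space_cost for every unit of width
    that is still left for the spaces to absorb (or that overshoots).
    Returns (width gained, total cost, names of chosen suggestions).
    """
    # Dynamic programming over reachable width totals:
    # total delta -> (cheapest penalty so far, names used to get there)
    best: dict[int, tuple[int, tuple[str, ...]]] = {0: (0, ())}
    for s in suggestions:
        updated = dict(best)
        for total, (pen, names) in best.items():
            key = total + s.delta
            cand = (pen + s.penalty, names + (s.name,))
            if key not in updated or cand[0] < updated[key][0]:
                updated[key] = cand
        best = updated

    def cost(item: tuple[int, tuple[int, tuple[str, ...]]]) -> int:
        total, (pen, _) = item
        return pen + space_cost * abs(shortfall - total)

    total, (pen, names) = min(best.items(), key=cost)
    return total, pen + space_cost * abs(shortfall - total), list(names)


if __name__ == "__main__":
    options = [
        Suggestion("kashida_short", 200, 50),
        Suggestion("kashida_long", 400, 120),
        Suggestion("swash_kaf", 300, 30),
    ]
    print(cheapest_fit(500, options))
    # -> (500, 80, ['kashida_short', 'swash_kaf'])
```

With a flat order of execution the engine would have to try the suggestions in the order the font lists them; with penalties it can skip an expensive early suggestion in favour of a cheaper combination that fits the line exactly.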

I've written some code to add a JSTF table to a font by pulling lookups out of specially named GPOS/GSUB features, but I can't help thinking that if the specially-named features were the way that the justification system was specified in the first place, it would be an awful lot easier for layout engines to implement.

Now, given that the table's already in the spec (Harfbuzz has code to parse the thing even if it does nothing else with it), is there anything that can be done about this? And given that we have jalt and JSTF already, is there any mileage in suggesting a third OpenType justification approach?

Simon Cozens · January 2020

(I’ve realised I didn’t even get into the JstfMax idea, which works beautifully for the space glyph but for everything else just raises the question “but how am I meant to stretch/shrink it?” and is begging for an answer involving variable fonts.)

I am honestly not trying to rubbish the JSTF table. I think it’s actually pretty close. I sat down with Titus today and tried to work out how, in an ideal world, we would like to specify justification features of a font, and JSTF has almost all the elements we wanted. It’s not that it’s totally wrong; it’s just a little bit shy of being right. Maybe it’s time to put that version field to use.

Simon Cozens · January 2020

OK. I actually got something working and implemented! It's a rigged demo (sorry, I mean "proof of concept"), but some parts of it do actually work.

Image: https://us.v-cdn.net/5019405/uploads/editor/h5/h6piwdcfn1cb.png

Glyphs are from a font by @Titus Nemeth. We created a version of the font with a JSTF table by adding the swash lookups into the "exs1" feature and using my JSTF table builder script.

I then created a version of SILE which knows about JSTF. It manually applies the lookups specified in the JSTF extension tables, and offers both the original token and the extended token as alternatives to the line breaking engine.

So we are dealing with a real layout engine and a real JSTF-enabled font. Obviously I have chosen the text, column width, and linebreaking options carefully, but the main text (in black) is the same in each paragraph. SILE is genuinely using the JSTF table to choose alternative, extended versions in order to fill the line better. So in that sense: it works!

There is more work to be done: the paragraph builder is failing horribly in cases which were not carefully chosen; I think there are some bugs in my "alternate-choosing code" which is a bag-on-the-side extension to Knuth's line breaking algorithm which, let's face it, I only barely understand. In an ideal world, I would write a completely new paragraph builder which (a) is maintainable and extensible by people who aren't Knuth, and (b) is designed from the start to allow one to choose between a blend of strategies for justification - stretching glyphs and replacing glyphs - rather than purely extending spaces.

Executing the JSTF lookups inside the layout engine is obviously horrible. SILE only supports single substitution lookups, because strangely enough I didn't fancy implementing a whole OpenType shaper. The JSTF functionality should live in the shaper itself, and before that happens we need an interface for the layout engine to tell the shaper that it needs a wider/narrower line.

But there we go. Rudimentary JSTF support.

Simon Cozens · January 2020

I wrote a new paragraph building algorithm which supports multiple methods of justification. I am ridiculously proud of it.

Image: https://us.v-cdn.net/5019405/uploads/editor/wy/jgpybufowtku.gif

Georg Seifert · January 2020

That looks very good.

Simon Cozens · January 2020

Thanks! The shaping on the qaf (not taking medial form) is a bug but I think it's Safari's fault. I've just fixed a bug to make it work in Firefox and it's looking even better there.

Adam Jagosz · January 2020

Cool! Can this be applied for Latin? (Say connected Blackletter design based on calligraphy, which indeed had justification alternates.)

Simon Cozens · January 2020

Sure, there's nothing Arabic-specific about it; it's just that Arabic is quite easy to demonstrate extension and substitution. Grab Alter Littera's Gutenberg B font and knock yourself out - just be warned that computing all the different possibilities will make rendering extremely slow.

Helmut Wollmersdorfer · January 2020

Adam Jagosz said:

Cool! Can this be applied for Latin? (Say connected Blackletter design based on calligraphy, which indeed had justification alternates.)

You can get the Gutenberg Bible-Textura with the 390 original forms at http://www.kps-fonts.ch/ for free to play around with. But Textura is not what I would imagine as Blackletter calligraphy. Simon Cozens may be right, though, that it needs adapting to the available horizontal space, imitating how Gutenberg used it.

One with many flourish variants, begin, middle and end forms, Kanzleischrift (German for Cancelleresca), cut by Hans Schönsperger, Augsburg 1514, looks clean and relatively modern. You can find a font for it at kps-fonts under the name 1513_gebetbuch.

Image: https://us.v-cdn.net/5019405/uploads/editor/av/ayibr5bx186x.png

Don't know how difficult it is to automatically choose flourish variants depending on vertical space. The /longs in the 3rd line would collide with the /g in the 1st, if directly under it.

Historically it is coming from professional writing masters ("Schriftmeister"). Shortly later the first Fraktur-Designs appeared, 1517_theuerdank, 1518_neudoerffer_fraktur, 1520_gilgengart. German handwriting developed to a style called Kurrent, which is not black but broken, used until the first half of the 20th century.

Simon Cozens · January 2020

You asked for it, you got it.

Image: https://us.v-cdn.net/5019405/uploads/editor/ne/xfcbr9s4dd0r.gif

I have to admit that for anyone whose idea of justification is "you just make the spaces bigger or smaller", this is weird to watch.

Adam Jagosz · January 2020

Yas! Exactly what I had in mind. I have my own textura typeface (based strictly on calligraphy and not metal typefaces) that I would love to apply this to, but I gotta admit Gutenberg gets amazingly close to handwriting.

You mentioned implementing JSTF support for your engine SILE. And you said the support otherwise is not great. Just some of the browsers?

Also thanks for open-sourcing your script, I'll dig into it later.

John Hudson · January 2020

I'm really looking forward to the late Byzantine cursive Greek example.

Simon Cozens · January 2020

Adam Jagosz said:
You mentioned implementing JSTF support for your engine SILE. And you said the support otherwise is not great. Just some of the browsers?

Well, we need to back up a bit here because I've been talking about two different things. I'm approaching the OpenType justification problem from two directions and hoping they'll meet in the middle.

The first direction is: what information should the font carry about justification, and how should that information be interpreted? At the moment that's specified (in theory, because there were no implementations) through the JSTF table. The purpose of implementing JSTF in SILE was to see whether that information was enough. I think it's almost enough but not quite. The way JSTF is specified makes it a pain to implement, but beyond that I think it would be helpful to have the notion of penalties for individual lookups and strategies so that they can be effectively combined, rather than "first try this, then try that". It also needs to be more aware of glyph extension via variable fonts.

So then we flip over to the other direction of travel: what does a justification engine need to do to support all the things we want to do with it?

Working with Titus, we identified three different justification strategies that are combined in different orders and to different degrees to produce a pleasing line: space adjustment, glyph substitution, glyph extension. (Four if you include hyphenation; I think this can be implemented as a special case of glyph substitution - substitute breakable hyphen for zero-width non-breaking space - but I could be wrong.) And these are things that JSTF supports, but it doesn't support blending those strategies in different ways.

Most justification engines at present only offer space adjustment (TeX and its derivatives). Some do whole-line glyph extension ("add 10% of width to all glyphs" etc. - Adobe apps, some TeX extensions and of course the hz-Program); I don't know if they do extension on specific glyphs. Some do justification alternates (InDesign Arabic).

I tried to make SILE support JSTF as much as it could, including justification alternates, but it didn't work particularly well. (That was what I meant about "support" in the first few posts.) It also doesn't know anything about glyph extension. So SILE only has half an implementation of half of what we need. And I don't think any of the existing engines allow you to say "first try this strategy, then try these two strategies at the same time" or similar. Neither do they support variable fonts.
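
As a rough illustration of what "first try this strategy, then try these two strategies at the same time" could mean in practice, here is a minimal sketch. It is not how SILE or newbreak work: the `distribute` function, the stage structure, and the capacity numbers are hypothetical. It also treats every strategy as continuously adjustable, which is a simplification for glyph substitution, where widths change in discrete steps.

```python
def distribute(shortfall: float, stages: list[dict[str, float]]) -> dict[str, float]:
    """Spread a line's shortfall over justification strategies, stage by stage.

    `stages` is an ordered list. Each stage maps a strategy name to the
    maximum width that strategy can absorb on this line. Stages are tried
    in order; strategies within the same stage are applied together, each
    in proportion to its capacity.
    """
    used: dict[str, float] = {}
    remaining = shortfall
    for stage in stages:
        capacity = sum(stage.values())
        if remaining <= 0 or capacity <= 0:
            continue
        take = min(remaining, capacity)
        for name, cap in stage.items():
            used[name] = used.get(name, 0.0) + take * cap / capacity
        remaining -= take
    # Whatever no stage could absorb is reported rather than hidden.
    used["unfilled"] = max(remaining, 0.0)
    return used


if __name__ == "__main__":
    # First stretch spaces; then substitute and extend glyphs together.
    plan = [
        {"space": 40.0},
        {"substitution": 30.0, "extension": 90.0},
    ]
    print(distribute(100.0, plan))
    # -> {'space': 40.0, 'substitution': 15.0, 'extension': 45.0, 'unfilled': 0.0}
```

Reordering the stages, or moving a strategy from one stage to another, changes the blend without touching the engine, which is the kind of flexibility a flat priority list does not express.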

So implementing OpenType justification will not only need some tweaks to JSTF, it also needs a new kind of justification engine. This new engine will eventually need to be put into SILE if we're going to properly support these three strategies. I then went over to prototyping a new algorithm in JavaScript (because it's quite a nice environment for coding, for testing different widths and for producing attractive demos...), which is newbreak. In other words, newbreak was not really designed for doing complex justification on the web - that's just a helpful side-effect of developing in JavaScript.

Of course once you do stuff on the web, you have to deal with all the infelicities of different browsers, their layout engines and their JavaScript implementations - that was what I meant by "support" in the newbreak-related posts.

But that's not what I'm actually interested in. If people want to use newbreak to do funky things on their web sites, fine with me, but the purpose of it was to prototype a new justification system which can be used in SILE (and anything else that wants it).

So my next step is to port newbreak to Lua (SILE's implementation language) - or more probably C, then other things can use it as well - and also to get SILE able to handle variable fonts; this will require me understanding how Harfbuzz supports variable fonts, because SILE uses Harfbuzz for shaping. Once those two things are done, we will have a much better idea of how useful JSTF turns out to be in practice.

Helmut Wollmersdorfer · February 2020

Simon Cozens said:

I have to admit that for anyone whose idea of justification is "you just make the spaces bigger or smaller", this is weird to watch.

In this book [Faulmann 1882] the author has the theory that early printing had spaces of fixed width. Therefore they used abbreviation signs or set the word with normal letters depending on available space.

Here https://archive.org/details/illustrirtegesch00faul_0/page/42/mode/2up the author gives an example from a book printed by Mellin existing in two versions, the second one corrected:

Image: https://us.v-cdn.net/5019405/uploads/editor/hx/jxc3oed678q9.png

Look at the first letter v exchanged against V and "in" at the end exchanged against i with abbreviation sign.

Adam Jagosz · February 2020

What's more puzzling is that the last word, fine, was changed to fie.

Check out the Book of Psalms from the Zwolle Bible: most pages consist of three almost identical columns, and often two of them are exactly the same, sometimes even all three.

Lines 3–5: Domini est salus: (et) super populum tuum benedictio tua. The first and second column look quite different yet read alike.

Image: https://us.v-cdn.net/5019405/uploads/editor/fa/6q0ulv4r6v82.png

Btw, the font I mentioned was based on Zwolle-Bibel, and was recently released. Take it for a spin on a test drive page. The justification alts (for c, g, s, t, st: medial and final forms) ended up as stylistic set.

adamwhite · March 22

Simon Cozens said:

I wrote a new paragraph building algorithm which supports multiple methods of justification. I am ridiculously proud of it.

Your algorithm is interesting, but you shouldn't do that to a script if you don't understand the nature of it. First of all, the Arabic typeface industry needs a reform: typefaces should be designed (and skeletal modifications determined) in two groups.

1) To design traditional styles following RULES of relations between letters, its ligatures and possible layout structures, WITHOUT RELATING stylistic contextual forms to dedicated unicodes, but to only use base unicodes for characters.

2) To design typographic fonts which will not try to imitate traditional writing ligatures and letter relations, as simplified typographic scripts for modern usage and as well keeping only base character unicodes.

Any visual variations should be as contextual variants, and NEVER hardcoded.

If you make for an example naskh, then you make it all the way. Making simple typographic font/naskh crossover with non-existing and irregular morphing, variations, etc. is distorting the typeface design science.

To understand it easier just compare styles like blackletter, penmanship, and similar to publishing typefaces like TNR or Helvetica. You can't make Helvetica and introduce penmanship ligatures or elongations.

Simon Cozens · March 23

Not sure why people are disagreeing, as I think it's a very good idea to use contextual variants for visual variant glyphs instead of the legacy positional Unicode codepoints - as anyone can see by inspecting any of the Arabic fonts I have engineered...

Nick Shinn · March 23

For typography that is completely puréed, ya gotta thesaurize.

michele casanova · March 23

@Simon Cozens I didn't know about this opentype feature (I'm not an expert).

Are there any practical examples of how to use this JSTF feature with Fontlab?

I'd be interested in experimenting with it in this font, which contains some abbreviation signs and ligatures.

John Hudson · March 23

JSTF is not a feature: it’s a separate font table, and it is not really supported in software. Simon is one of the few people who has actively experimented with it.

The idea of JSTF is derived from the understanding that justification in Arabic can be analysed in terms of prioritisation of different mechanisms, which can be algorithmically applied to fill out a line measure, and these mechanisms are style- and design-specific, so need to be defined at the font level rather than at the application layout level. Unfortunately, that is also where support for JSTF collapsed, because a lot of software that nominally supports Arabic includes internal algorithmic justification, so getting software makers to support a more complex font-dependent model was a hard sell. Basically, JSTF has been mostly ignored.

michele casanova · March 24

John Hudson said:

JSTF is not a feature: it’s a separate font table, and it is not really supported in software. Simon is one of the few people who has actively experimented with it.

Thanks a lot.

jeremy tribby · March 26

interesting, hadn’t seen newbreak the first time around, I’m going to give it a try. looks really nice, simon! I’ve recently built a font renderer for three.js (more on that soon) and for layout it is using knuth-plass, but I keep running into edge cases where KP favors the texture of a paragraph a little too much and is a little too willing to make one really poorly set line as a result. I probably need to adjust penalties or work on something specifically for the first line. but it’s been frustrating to work with because there are some aspects to the algorithm that I truly don’t totally understand (maybe yet, maybe ever). excited to dive into newbreak, especially given that it’s already typescript
