Contextual substitution issue in a multiline text

Michael Rafailyk · May 2024

The background

Here's my recent Latin/Cyrillic design, which plays on the contrast between wide/tall/decorative uppercase letters and low/condensed lowercase letters. I'm thinking about how to translate it into Hebrew, but Hebrew has no upper/lowercase logic, so there is a dilemma of adaptation. If I adapt only the lowercase shapes into Hebrew, the balance between upper- and lowercase will be lost, although it would be an acceptable neutral solution. And if I adapt only the uppercase design, it will look just wrong: too decorative and difficult to read.

Image: https://us.v-cdn.net/5019405/uploads/editor/a2/gf2eduproeay.png

Looking for a combined adaptation for Hebrew

One idea was to make the first letter of a Hebrew text sequence wider (not taller) within the Hebrew unicase framework, and a bit more decorative. Hrant and I had a discussion about that in the replies to this tweet. It may feel more like a Latin concept (throw a stone at me if you think I'm trying to Latinize Hebrew), and I don't even know how a Hebrew reader would perceive it.

Substitution

Here's an approach that uses wide letters as the defaults (to ensure that the first letter in the sequence is wide), while all the others (except after period+space) are contextually substituted with condensed versions. This approach is technically heavy, however, so I'm not sure about its efficiency. Here's what the substitution looks like (where hbWide are the default glyphs):

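# Keep the wide default after a sentence break (period + space).
ignore sub period space @hbWide';
# Otherwise narrow any wide letter that follows a letter or punctuation.
sub @hbWide @hbWide' by @hbNarrow;
sub @hbNarrow @hbWide' by @hbNarrow;
sub @punctuation @hbWide' by @hbNarrow;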

Multiline text issue

The first tests reminded me that multi-line text is processed as separate text sequences (one per line) and OpenType cannot look back to the previous line. So the substitution didn't work as I expected: unfortunately, every first character of a new line appears wide (marked in red).

Questions

Have you faced such a dilemma and how did you solve it?

Are there cases of initials being used in Hebrew (perhaps decorative ones for book chapters)?

Should I abandon this idea of a wide/decorative first Hebrew letter, and just find an optimal unicase design, which will probably be closer to the Latin lowercase shapes?

John Hudson · May 2024

Looking just at the technical issue: there is no way to solve this for all cases within OpenType Layout. You can prevent the contextual form from occurring at the beginning of every line, but then it won’t appear at the beginning of those lines where you do want it (the first line and where the preceding line ends in a full stop). You may also run into the problem of some text segmentation implementations triggering the context where you don’t want it or failing to trigger it where you do want it, e.g. some browser versions treated each individual word as a separate run for OTL processing.
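
A minimal sketch of the inverted setup described above, assuming the narrow forms become the default glyphs and the wide forms are reached only through context. The calt feature tag, the glyph names, and the three-letter classes are illustrative assumptions, not taken from Michael's font.

```fea
languagesystem DFLT dflt;
languagesystem hebr dflt;

# Hypothetical glyph names; a real font would list the full alphabet.
@hbNarrow = [alef-hb bet-hb gimel-hb];
@hbWide   = [alef-hb.wide bet-hb.wide gimel-hb.wide];

feature calt {
    script hebr;
    # Wide form only after a sentence break that is visible in the same run.
    sub period space @hbNarrow' by @hbWide;
} calt;
```

With this arrangement a line break can no longer produce an unwanted wide letter, but the very first letter of the text stays narrow too, and so does any sentence that starts a new line in a layout engine that shapes each line separately.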

Text in bicameral scripts like Latin is encoded with combinations of upper- and lowercase letters because the rules of capitalisation, which also vary by language, cannot be reliably captured contextually, and even with dictionary support may fail.

Ray Larabie · May 2024

I was recently working on a Cree font where I used a modifier character that acted as a shift to alter the form of characters. You could create substitutions that turn a backslash followed by a Hebrew character into the capital form. It's unlikely that anyone will be using your font to enter DOS commands, so the backslash will likely never be used anyway. That gives the user full control over the effect while making it easy to type on a keyboard. Also consider ^ or ~. Since this will be only used for Hebrew text, those characters will look and perform normally in other languages.
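
A rough sketch of how such a shift character might be wired up, not Ray's actual Cree implementation. It assumes narrow default glyphs, hypothetical glyph names, and a hypothetical empty zero-width glyph called backslash.hidden that stands in for the backslash once it has done its job. It also assumes the shaping engine places the backslash in the same Hebrew run as the letter that follows it.

```fea
languagesystem DFLT dflt;
languagesystem hebr dflt;

# Hypothetical glyph names; a real font would list the full alphabet.
@hbNarrow = [alef-hb bet-hb gimel-hb];
@hbWide   = [alef-hb.wide bet-hb.wide gimel-hb.wide];

feature calt {
    script hebr;

    # Step 1: a letter typed right after a backslash takes its wide form.
    lookup ShiftToWide {
        sub backslash @hbNarrow' by @hbWide;
    } ShiftToWide;

    # Step 2: swap the used backslash for the empty zero-width glyph.
    lookup HideShift {
        sub backslash' @hbWide by backslash.hidden;
    } HideShift;
} calt;
```

The two lookups run in order, so the second one only hides a backslash that actually triggered a wide letter; a backslash before anything else stays visible.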

Michael Rafailyk · May 2024

@John Hudson I see, there are a lot of pitfalls in such an approach. Such things always sound better in my head (than in practice).

@Ray Larabie that's a pretty interesting train of thought. It's not active by default (it requires the user to type the modifier key), but many people here may agree that such experimental things should not be active by default. So this is an option. And I agree that the substitution should be isolated to the current script.

However, the difficulty with this approach is educational – how to make sure users have seen the instruction poster with explanation or read the feature description?

John Hudson · May 2024

Michael Rafailyk said:

However, the difficulty with this approach is educational – how to make sure users have seen the instruction poster with explanation or read the feature description?

That’s only one of the problems with Ray’s approach. It also means that text is being broken to achieve a visual result in one particular font. The text is no longer searchable, indexable, or sortable as Hebrew, and it will display as broken in any other font.

Ray Larabie · May 2024

@Michael Rafailyk You make a good point. In my case, the font is shipped with keyboard decals, instructions, and a transcoding tool. But it's harder to relay instructions for typical font use.

André G. Isaak · May 2024

John Hudson said:

That’s only one of the problems with Ray’s approach. It also means that text is being broken to achieve a visual result in one particular font. The text is no longer searchable, indexable, or sortable as Hebrew, and it will display as broken in any other font.

Apart from the problems that John raises, this approach won't really address Michael’s original concern since these sorts of custom formatting characters would need to be re-entered every time text reflows after edits.

John Hudson · May 2024

André G. Isaak said:

Apart from the problems that John raises, this approach won't really address Michael’s original concern since these sorts of custom formatting characters would need to be re-entered every time text reflows after edits.

Not if they were used to explicitly trigger ‘capitalisation’ at e.g. the beginning of sentences, certain words, etc. But the other issues remain. A default ignorable formatting control character would be a slightly better option in terms of not mangling text in other fonts or getting in the way of other text operations, but then there is the added difficulty of users not knowing how to input such characters (leaving aside that the proper use of such characters is defined by Unicode on a script-by-script basis, and any non-standard use risks breaking in future).

This may be a case where a stylistic set feature is the simplest and least damaging way to implement the special letter forms.
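
A minimal sketch of what a stylistic set could look like here, assuming the narrow forms are the defaults and the wide forms are alternates. The choice of ss01, the UI name, and the glyph names are all illustrative assumptions.

```fea
languagesystem DFLT dflt;
languagesystem hebr dflt;

# Hypothetical glyph names; a real font would list the full alphabet.
@hbNarrow = [alef-hb bet-hb gimel-hb];
@hbWide   = [alef-hb.wide bet-hb.wide gimel-hb.wide];

feature ss01 {
    # Optional UI label shown by applications that support feature names.
    featureNames {
        name "Wide decorative Hebrew forms";
    };
    # One-to-one swap, applied only where the user turns the set on.
    sub @hbNarrow by @hbWide;
} ss01;
```

The user would select the letters that should be wide and enable the set on that selection. Because the choice is stored as formatting on those characters, the underlying text stays plain Hebrew and the wide letters should follow the text when lines reflow.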
