How long a glyph sequence can an OpenType lookup/feature cover?

We are working on a text editing application that supports OpenType features.
While the user is typing inside a paragraph, we must re-analyze the text surrounding the change so that the OpenType shaping can be updated.
We want to re-analyze as little text as possible so that the program stays fast and light. We do not want, and should not need, to analyze the whole paragraph again.
We would be glad to hear your opinion on this: what is the shortest sequence of text that we must analyze, given the lookbehind and lookahead that OpenType allows? What is the maximum span of text that an OpenType lookup can reach?
  
Comments

  • Well, it can’t go past a line-ending, so there’s that. Many apps choose to break the text runs that are subject to such analysis at word boundaries, which would also impose some fluid but tending-to-be-low limits.

    I am not aware of any specific limitation other than those. I suspect each implementation of OpenType support (e.g. HarfBuzz, Adobe’s CoolType, Microsoft’s DirectWrite, etc.) has limits of its own.
  • The following document describes how text should be broken up into glyph runs:

    https://docs.microsoft.com/en-ca/typography/opentype/processing-part1
  • There is no practical limit on the maximum number of glyphs to process. (Other than the intrinsic property of the OpenType specifications which does not allow certain lists to exceed 32,768 elements. But a substitution can have multiple lists, for example in lookaround constructions.)

    The documentation André points to notes that the longest possible sequence would be
    "[...] normally a maximum of one line in length and consists of a text string formatted in a single font at a particular size, in a single direction, in a particular script and a particular language system."

    and all desired carry-over effects to next lines (or, as it happens, preceding ones) should be handled by the layout software.

    If this influences your software design then you need to re-think it. Until now, I've found all OpenType substitutions and positioning can be implemented with a basic linked list of glyphs. This linked list needs only be generated once per line (actually, per text run – see above) to display it. And if you're afraid this might take too long, you can cache the resulting glyph sequence.

    If a user edits the paragraph, you need to start from scratch: calculate line breaks, provide justification, and re-process any OpenType instructions. It'd be a bad idea to cache "intermediate" steps inside a single glyph run. You cannot know in advance how far back or ahead the effects of a contextual substitution are going to reach, other than that, by definition, they cannot reach beyond the current text run.
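
The caching suggested in the comment above could look roughly like this. It is a minimal sketch, not the API of any shaping engine: `RunKey`, `ShapedRunCache`, and `fake_shape` are hypothetical names, and `fake_shape` only stands in for whatever shaper the application really calls. The key holds the properties that the quoted Microsoft text lists as defining a text run, so a run is reshaped only when one of them changes.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Tuple

Glyphs = Tuple[int, ...]


@dataclass(frozen=True)
class RunKey:
    """Properties that define one text run (hypothetical field names)."""
    text: str
    font: str
    size: float
    direction: str  # e.g. "ltr" or "rtl"
    script: str     # e.g. "latn" or "arab"
    language: str   # e.g. "ENG"


class ShapedRunCache:
    """Cache shaped glyph sequences per text run, never per partial run."""

    def __init__(self, shape: Callable[[RunKey], Glyphs]) -> None:
        self._shape = shape
        self._cache: Dict[RunKey, Glyphs] = {}
        self.misses = 0  # number of times the shaper was actually called

    def get(self, key: RunKey) -> Glyphs:
        if key not in self._cache:
            self.misses += 1
            self._cache[key] = self._shape(key)
        return self._cache[key]


def fake_shape(key: RunKey) -> Glyphs:
    # Stand-in for a real shaper: maps each character to its code point.
    return tuple(ord(ch) for ch in key.text)
```

Because the whole run text is part of the key, any edit inside a run produces a new key and the run is shaped again from scratch, which matches the advice not to cache intermediate steps inside a single glyph run.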

  • Yeshurun Kubi (edited September 2019)
    @Khaled Hosny thanks for that interesting issue on GitHub.
    The discussion there goes really deep, and in the end we could not figure out whether HarfBuzz has a clear method for deciding when text needs to be reshaped after typing.

    We did, however, think of a solution that looks like a better and safer way to check this.
    See this post on stack: Here
  • In theory, ∞. You can easily write a GSUB that changes the entire line's visuals with only one keystroke.
  • You are probably prematurely optimizing here. Shaping is not usually that slow, even for relatively big paragraphs.
    Shaping a file with 20574 words (one per line) using the hb-shape utility from HarfBuzz with a relatively complex Arabic font takes less than half a second on my system, and that includes loading the font and all the setup required to use the font for the first time.

  • Thank you all - it is a great help.  :)
    Our application cannot afford to reshape a whole paragraph, because a major part of the text that the program needs to handle is arranged as a single paragraph: a large text file in one paragraph.
    So we are trying to find the best way to deal with that.
    TypeDrawers is probably not the place to talk about it.
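
For a single huge paragraph, the word-boundary approach mentioned in the first comment is one way to bound the work. The sketch below assumes the application accepts that contextual effects reaching across whitespace are lost, which a font is free to define, so it is a trade-off rather than a guarantee. The function name `run_around_edit` is hypothetical, and whitespace is used as a crude stand-in for real word segmentation.

```python
from typing import Tuple


def run_around_edit(text: str, edit_pos: int) -> Tuple[int, int]:
    """Return the [start, end) range of the whitespace-delimited run
    that touches the caret position edit_pos."""
    if not 0 <= edit_pos <= len(text):
        raise ValueError("edit_pos out of range")

    # Walk left until the character before the range is whitespace.
    start = edit_pos
    while start > 0 and not text[start - 1].isspace():
        start -= 1

    # Walk right until the next character is whitespace.
    end = edit_pos
    while end < len(text) and not text[end].isspace():
        end += 1

    return start, end
```

Only `text[start:end]` would then be handed to the shaper after a keystroke. When the typed character is itself whitespace, the runs on both sides of it change, so the function would need to be called once for each side.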