New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
UNSAFE_TO_CONCAT [Was Clarify UNSAFE_TO_BREAK and expose OS2V2Tail::usMaxContext as context.] #1463
Comments
I’m not sure |
Hmmm... yes, I see less than enthusiastic comments about its value, for example fonttools/fonttools#466 (comment). Of course, there's always the slow way of just calculating it, which may actually be fine for an editor where there aren't actually that many fonts in play but layout is going to be done many times. Without some sort of value like this correct editing (and sometimes line breaking) is more like n^2 on the length. I happen to know the guy who wrote the first code for detecting ground states in the TrueType state machine shaping to determine how far back to re-shape when appending. Just picking a number like LibreOffice seems to do seems as bogus as using a semi bogus number from the font. Also, until UNSAFE_TO_BREAK was introduced in HarfBuzz there wasn't any reason to know about this value since it was necessary to re-shape the entire run anyway for correctness. This would explain why no open source project was ever interested in this value. I am interested if there is a example where truncation of the of the initial code point sequence requires full re-shaping. It's super tempting on truncation to simply find the previous safe to break position and just re-shape from there to the truncation point. If this is always possible it would be great to state that, if not a counter example should be stated in the docs to scare implementors away from doing it. |
Thanks Ben. You make the use-csae very clear.
However, all of the above is irrelevant. You are assuming that appending a char to a buffer can affect at most usMaxContext chars before it, and as such one can find any unsafe-to-break point before that range and reshape from there forward? There's at least one very easy reason why that's completely unsafe: ReverseChainContextSubst subtables. With those, eg. you can have a sequence of 'a's shape to themselves. But as soon as you append a 'b', all the 'a's before it also change to 'b'. The usMaxContext for that substitution is short (1 or 2), but the effects propoagate backwards. 'morx' table can also do this easily as morx subchains can process input in reverse. Even without ReverseChainContextSubst subtables, the usMaxContext needs to be multiplied by the number of lookups, since having 100 lookups each of which can look forward 5 glyphs, when combined, can look forward 500 glyphs and substitute.... In summary, I suggest you don't try to get this information out of shaping / font. IMO the best way to handle this is to reshape from last line break. |
CC @kojiishi |
This is great information. I suppose that means https://cs.chromium.org/chromium/src/third_party/blink/renderer/core/layout/ng/inline/ng_line_breaker.cc?l=525&rcl=664f94acf2faf985bd33be531b5a9a0f8fcaff46 may need to be reconsidered? So, restating as I understand it, safe to break is good only for partitioning an unchanging sequence of code points at safe to break points. Changing the sequence of code points in any way will require re-shaping any shaping runs which the change affects (often an entire paragraph). Attempting to partition at an unsafe to break point (e.g. while line breaking) will require reshaping from the beginning of the sequence of code points (which is often the beginning of the current partition / line). In theory a real maximum context could be computed from the font, but this number may be very large and generally not useful. At any given point a tighter maximum local context could be computed, but that may be more expensive that just re-shaping (and a lot of code). |
I think the best way forward is for HarfBuzz to expose a new flag bit that does exactly what you want. Since most fonts don't have back-to-front lookups, this will be effective for most fonts. If font has ay back-to-front lookups, we'll mark the entire run unsafe. Let me think about exactly how to implement that efficiently. Need a name for the new flag. |
Like this example site, reported at crbug.com/479370? I think I was under the assumption that UNSAFE_TO_BREAK includes these cases. If not, and if we're adding a new flag, it's easy for LayoutNG to switch to it. We could optimize further by distinguishing two cases as @bungeman pointed out, but I think I'd like to do it incrementally, when we know better if it's causing the perf problems. /cc @eaenet in case he has different opinions. |
While in theory true, I doubt that the implementation will use the flag I'm going to add properly to avoid reshaping when text changes. That requires a lot of change tracking.... |
Currently LayoutNG tracks changes in nodes, but not part of text nodes. In Phase 1, we will not ship editing part though, we switch to the current engine for editable elements. The flag may help fixing cases such as:
though I'm not sure how common the problem is, and Blink will need some additional work in addition to using the new flag. We will start working on editing after phase 1. At that point, we will know better how much text-offset-level change tracking is needed. /cc @yosinch |
As a potential counter-example without explicit backward running features, consider that I construct a font with a simple 1:1 mapping of characters to glyphs except for two features, one of which (say 'rlig') which has a lookup 1 which maps "ived" with a special glyph and another (say 'liga') which has lookup 2 which maps "tri" to a different special glyph and shape the string "contrived" with this font. (The same effect might happen with one feature with two lookups.) As I understand it, the feature lookups run in the order of the lookup list (after determining which features to run in a given pass). As a result, "contrived" should map to [c o n t r ived] with 'ived' marked as safe to break. However, should there be a possible line breaking point found between "e" and "d" such that we should be shaping "contrive" we should end up with [c o n tri v e]. However, this would not be the result if going back to just before 'ived' and just re-shaping the "ive" into [i v e]. It seems like this is a general issue any time a prefix of an early lookup is a suffix of a later lookup. If this is a valid example, it seems that it would be more difficult to take this sort of behavior into account than simply looking to ensure no backward looking features are in the font. I suppose most fonts don't have this property, though now I think I'm going to create this one just to see the effect. |
IIUC we all agree you're correct, and appreciate to bring this problem into our focus. It's even easier to hit than your example. Consider, user typed "f", shape, then type "i". If apps is smart enough to know "f" is unchanged, because after "f" is safe-to-break, the app will not reshape "f" and fail to get the ligature. Behdad's point is that apps is not that smart today, and he's right on that regard. Blink reshapes all the text as long as they're in a same node. The Blink code you quoted above is only for removing trailing spaces of line end, so it should be ok. Blink may want the smartness when we work on editing in post phase 1 if the current per-node change tracking hits the performance bad enough. If Blink switch to per-character change tracking, this case will be much more common. |
Okay. I thought more about this. There are some things that I misunderstood before about our current unsafe-to-break flag. I like to correct that understanding, and together we can design a new flag for stronger guarantees. The way I previously suggested the unsafe-to-break can be used was this: full paragraph shaped, then line-break position determined, and if line-break position is unsafe-to-break, then walk back from it to the first safe-to-break, and shape the tiny resulting piece and append it to the shaped results for the chunk of line before it. This, unfortunately doesn't work. What safe-to-break guarantees, and only thing it guarantees, is that if you break at this point, and shape the two sides separately and append the results, it would be same as shaping the whole thing. But if you break here and then change one of the pieces, eg by truncating it, then there's no guarantee that shaping can be done in pieces anymore. Ie. if "abcd" as text results in glyphs "AB|CD" where "|" means safe-to-break, then it must be the case that "ab" shapes to "AB" and "cd" shapes to "CD". Let's imagine "c" also shapes to "C", but there's no guarantee from these that "abc" shapes to "ABC". It might shape to "XYZ". Back to line-breaking. The optimization can be changed to this: if chosen line-break is safe-to-break, then good, just break the glyphstring and move on. Otherwise, the line needs to be reshaped. In a separate commit I'll talk about how to design a flag that can give stronger guarantees. |
Oh I didn't mean that any font that doesn't have backward features is safe... I was designing a new flag that can be implemented to discover such break positions. I'll expand on it in more detail separately. |
Novice question; doesn't OpenType match the longest prefix? I mean, if "ab" shapes to "AB" and "abc" shapes to "XYZ", shouldn't "abcd" shape to "XYZd" rather than "ABCD" in the first place? |
Fonts can have complicated context-sensitive rules. In this case, can have rules such that abc shapes to XYZ only if not followed by another letter, or not followed by d. |
I now have a simple line breaker implementation (still working on it) at https://skia-review.googlesource.com/c/skia/+/156641/6/modules/skshaper/src/SkShaper_harfbuzz.cpp#715 which illustrates the issue. First, any time an unsafe to break location is considered it is necessary to reshape from the beginning of the text under consideration (essentially this issue as described best above at #1463 (comment)). Second the only way to know to stop looking at line breaking opportunities (instead of looking at all of them) is to stop looking at safe to break opportunities after the first safe to break opportunity which is too long (later unsafe opportunities would still need to be looked at). Note that it is possible to use the heuristic of "just stop looking after the first break opportunity which is too long" (I think this is effectively what layoutNG currently does), but this gives up the nice trait of always fitting the most characters onto a single line. If it were known ahead of time that all breaking opportunities were either safe to break or the kind where you can just go back to the previous safe to break and re-shape the little bit then we know all breaking opportunities are monotonic and the heuristic is correct. Otherwise... I suppose some heuristic would probably still need to be used in practice since the unoptimized 'correct' algorithm one would need to fall back to is so slow. In any event, any information which can be used to speed up these cases is of interest, and I now have some example code around to tweak to test out proposed changes (which, depending on the change, may be easier to reason about than all of layoutNG). Note that both of these cases are seen most easily in practice with Times New Roman which likes to kern [space 'T]' (I forget if it's actually [period space 'T']). |
I thought more about this. It is possible to add a flag to HB such that after, say, shaping "ab" and getting "AB" as results, it will signal that no matter what you add after "ab", the shape for it will not change and stay "AB". However, this itself is not enough either. Because if you want to shape "abc", you still have to shape the whole thing to get the part after "AB". You cannot assume it's the same as shaping "c" alone. That said, if we expose the same kind of flag on both side of the string, then we can tell whether concatenating is safe. So, that one would be UNSAFE_TO_CONCAT unlike UNSAFE_TO_BREAK. If we do have the above flag exposed in the middle of the shaping result as well, then one can shape a paragraph, then look for fragments between UNSAFE_TO_CONCAT flags and cache the shaping result for those (a trie fits more naturally, but since these are short, quite possibly a hash table works best in practice) and "shape" new text by concating fragments as found. The above is certainly possible, though I'm not sure whether benefits are justified. I'd rather see actual numbers before we engineer too many things. In particular, this UNSAFE_TO_CONCAT flag would be more expensive than UNSAFE_TO_BREAK to compute. |
Or should that be UNSAFE_TO_UNBREAK? :D |
We discussed this with Behdad couple of weeks ago and he asked me to explain how we solved this problem in DWrite. We actually spent 3 years in on/off discussions with multiple restarts to figure out final solution, so it was very interesting to me to go back and revisit whole thing, rather than just give you our results. So this text turned out to be quite long, but it shows what effort went into defining this simple bit. I hope you enjoy reading it as much I was when writing it :). Logic described here is implemented in current version of DWrite, starting from Windows Redstone 5 (Autumn 2018) update and already used by Word and line layout in Edge proper. They don't use these bits to full extent yet, but it seems like logic is flexible enough to satisfy all our goals. Here are our with high-level goals, as we identified them:
And this is principles and assumptions, we used:
And this is how bit is defined as part of per-character DWRITE_SHAPING_TEXT_PROPERTIES structure in dwrite.h, but accompanying comment does not list full list of guarantees provided by DWrite:
So this is full description and it will use following definitions:
Shaping library guarantees that:
In essence, we say that simple bit gives you only some information to do scissor-cut, but not much beyond that. Our bits provide different approach. Cluster serves as a separator for independent pieces: when unchanged between calls, any text on either side doesn't affect shaping on another side. Important thing to note is that when Layout engine uses this bit, it is flexible in terms of how much prefix before changed text it will use to reshape. Goal is to balance amount of reshaped text with probability of finding unchanged cluster. Piece to reshape could be from single cluster (two works better in real life) to whole preceding word, to whole line (not practical). This decision doesn't affect rest of the logic at all, only probability of successful reuse. If any cluster in prefix doesn't change, we can trust all clusters on the line before (or after) it didn't change. If there are no fixed clusters in the string we tried, we can extent context or simply back to shaping this line as a whole. Examples of how this applies to various fonts and scripts:
Implementation by shaping library:
And finally, this is how line layout engine may use new bits:
|
Excellent, @SergeyMalkin - thank you so much for taking the time to explaing your findings and design. |
Thanks again @SergeyMalkin for writing that down. I read it again now, and I am certain that what you are describing matches my proposed |
I've given this a first stab finally, in #3297. I believe that matches what @SergeyMalkin describes. |
The current documentation clearly states what UNSAFE_TO_BREAK (or it's absence) can be interpreted to mean. However, it should also be documented as to what it is not, since it's very easy to be lured into a false sense of security with it.
The current behavior (as I understand it) is that one can shape a sequence of code points into glyphs, pick any two safe to break locations in the resulting buffer, reshape the associated sub-sequence of code points in isolation and get the same buffer output as is seen between the original two safe to break locations. Essentially, any pair of safe to break locations is stating the the larger context of the original sequence of code points did not affect the shaping of that run. (This is actually a bit stronger of a statement than the current documentation implies, please correct if this is not true.)
However, it is important to note that the inverse is not true, it is not the case that such a run of code points will always shape this way; any modification to the original sequence of code points can arbitrarily modify the overall shaping. For instance I can make a font which creates ligatures of all the sentences in the works of Shakespeare such that "There is a written scroll" will come back with many safe to break locations, but "There is a written scroll!" will shape into a single glyph which visually represents the icon for the Prince of Morocco. In other words, adding to or modifying any of the original sequence of code points invalidates everything about the shaping before doing so (including any safe to breaks).
In order to avoid requiring complete re-shaping when editing (especially simply adding to the end of a paragraph), it is necessary to let the user know the maximum affected area. For OpenType this is the purpose of OS2V2Tail::usMaxContext, in theory even TrueType knows the answer to this question for at least appending (keep track of the last ground state). With this sort of information it should be possible for a sophisticated user to only re-shape a portion of the overall sequence of code points by dirtying the maximum context area and finding the safe to break locations surrounding that area. The pathological Shakespeare font I propose above would obviously have quite a large value for the maximum context, but for most fonts this value is three or less.
In between these two cases (breaking at safe to break locations and mutating the sequence of code points) there is the case of determining the extent of re-shaping if truncating the original sequence of code points. Is it enough to find some safe to break location before the truncation and shape only the remaining portion? I cannot construct an example to answer in the negative, but it is not obvious that this is always allowable either. I think this is equivalent to asking if any shaping operation can operate by negative look ahead in a way which doesn't affect safe to break. With the maximum context information it seems this could be at least bounded, but it would be nice to know if this common case can be handled even more efficiently (may be common when line breaking).
With all of that as background, the two actual requests here are:
Can general and/or end of shaping maximum context information be exposed to reduce re-shaping when editing?
Can anything be assumed about code point sequence truncation and safe to break information?
The text was updated successfully, but these errors were encountered: