I am trying to check the final ordering of the glyphs after the shaping engine has done its job.
I am trying to get the exact ordering in Unicode numbers, so that I will know how to write OpenType
features based on the final result of glyph ordering.
Note that reordering is performed by the shaper after the codepoints have been mapped to glyphs (although the shaper internally remembers the initial codepoints). So the reordering will appear in terms of glyphs, not Unicode codepoints. For example:
Here you can see that the shaping engine first took a buffer of codepoints, then mapped them to glyphs, then performed initial Indic reordering on those glyphs (placing the evowelsign at the beginning of the cluster). Next it executed the lookups in the half-form feature, before doing final reordering (which on this string didn't do anything).
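To make that sequence of stages concrete, here is a toy Python sketch that records a snapshot of the glyph buffer after each step. It is not how any real shaper is implemented: the glyph names, the `CMAP`, `PRE_BASE` and `HALF_FORMS` tables, and the function names are all hypothetical, and the "cluster" is simply the whole input. It only illustrates why the trace is expressed in glyphs rather than codepoints once the cmap step has run.

```python
# Toy model only. Real shaping engines are far more involved; every table
# and name below is a made-up stand-in for illustration.

# Hypothetical cmap: Bengali KA, VIRAMA, TA and VOWEL SIGN E.
CMAP = {0x0995: "ka", 0x09CD: "virama", 0x09A4: "ta", 0x09C7: "evowelsign"}

# Glyphs treated as pre-base vowel signs by this toy reordering step.
PRE_BASE = {"evowelsign"}

# Hypothetical half-form substitution: consonant + virama -> half form.
HALF_FORMS = {("ka", "virama"): "ka.half"}


def map_to_glyphs(codepoints):
    """Map each codepoint to a glyph name via the toy cmap."""
    return [CMAP[cp] for cp in codepoints]


def initial_reordering(glyphs):
    """Move pre-base vowel signs to the start of the (single) cluster."""
    pre = [g for g in glyphs if g in PRE_BASE]
    rest = [g for g in glyphs if g not in PRE_BASE]
    return pre + rest


def apply_half_forms(glyphs):
    """Replace each matching glyph pair with its half form, left to right."""
    out, i = [], 0
    while i < len(glyphs):
        pair = tuple(glyphs[i:i + 2])
        if pair in HALF_FORMS:
            out.append(HALF_FORMS[pair])
            i += 2
        else:
            out.append(glyphs[i])
            i += 1
    return out


def final_reordering(glyphs):
    """No-op here, mirroring a string where final reordering changes nothing."""
    return list(glyphs)


def shape_with_trace(codepoints):
    """Run the toy pipeline and return a (step name, glyph list) trace."""
    glyphs = map_to_glyphs(codepoints)
    trace = [("cmap", list(glyphs))]
    steps = (
        ("initial reordering", initial_reordering),
        ("half", apply_half_forms),
        ("final reordering", final_reordering),
    )
    for name, step in steps:
        glyphs = step(glyphs)
        trace.append((name, list(glyphs)))
    return trace


if __name__ == "__main__":
    for name, glyphs in shape_with_trace([0x0995, 0x09CD, 0x09A4, 0x09C7]):
        print(f"{name:20} {' '.join(glyphs)}")
```

Once the first step has run, every later snapshot lists glyph names, so any feature code written against the final order has to target those glyphs rather than the original Unicode sequence.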
I agree that a DWrite step-by-step debugger would be a great thing to have. I presume Microsoft has one that they use internally, but I am not aware of it ever being made available to anyone else. Mind you, I never asked for it...
The OTM Text Viewer tool is quite useful for examining the interaction of individual layout features with the shaping engine, but it doesn’t provide the kind of information @WAY KYI is seeking about glyph order. Also, it can be misleading because it doesn’t graphically indicate the presence of mark glyphs or provide any way to interact with them to identify them. So, for example, in this case the rectangular boxes suggest that there are two glyphs present, when in fact there are three.
What would the steps be? One step each time the shaper has either reordered glyphs or processed a feature that results in some change to the glyph sequence?
For font debugging, one wants to know what has caused the change in the glyph sequence (whether by substitution or positioning), so the steps need to be at the lookup level. Basically, one wants something like the VOLT proofing tool, in which it is possible to step through the string lookup-by-lookup and glyph-by-glyph, but with shaping engine support so that character-level operations and reordering outcomes are also analysable step-by-step. So this would start with things like buffered character composition of base+combining mark cmap lookups, and initial reordering for those script shaping engines that perform it, then the GSUB, with final reordering occurring at the appropriate point.
A complication is that some fonts might have a lookup order that the shaping engine overrides in its processing blocks, so stepping through the lookups in the order in which they appear in the GSUB table might not correspond to the processing that the shaping engine actually applies with regard to glyph ordering. @Simon Cozens, how do you handle that sort of thing in Crowbar?
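The mismatch between table order and processing order can be sketched with a small Python example. This is only an illustration under stated assumptions: the lookup list in `FONT_LOOKUPS` is hypothetical, the `STAGES` list is a heavily simplified stand-in for the processing blocks of an Indic-style shaper, and it is not a description of how Crowbar or any particular engine does it. The idea is that lookups are grouped by the stage their feature belongs to, and only within a stage are they applied in lookup-list order.

```python
# Hypothetical font: (lookup index in the GSUB lookup list, feature tag).
FONT_LOOKUPS = [
    (0, "pres"),
    (1, "half"),
    (2, "akhn"),
    (3, "blws"),
    (4, "half"),
]

# Assumed, simplified shaper stages. Each inner list is one processing
# block; features in the same block are applied together.
STAGES = [["akhn"], ["half"], ["pres", "blws"]]


def processing_order(lookups, stages):
    """Return [(stage features, lookup indices)] in the order applied.

    Across stages, the shaper's stage order wins. Within one stage,
    lookups run in ascending lookup-list index.
    """
    order = []
    for stage in stages:
        in_stage = sorted(idx for idx, tag in lookups if tag in stage)
        order.append((tuple(stage), in_stage))
    return order


def flat_order(lookups, stages):
    """Flatten the staged order into a single list of lookup indices."""
    return [idx for _, indices in processing_order(lookups, stages)
            for idx in indices]


if __name__ == "__main__":
    print("table order:     ", [idx for idx, _ in FONT_LOOKUPS])
    print("processing order:", flat_order(FONT_LOOKUPS, STAGES))
```

With these made-up inputs, the table order is 0, 1, 2, 3, 4 while the staged order is 2, 1, 4, 0, 3, so a debugger that steps through lookups needs to follow the second sequence and insert the reordering steps between the relevant stages.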