CCMP one-to-many InDesign

I have a feature that decomposes some accented glyphs, for example Aogonek to A + combining ogonek. This works as expected in Chrome, but it doesn’t work in InDesign – not even if I use the World-Ready composer. The underlying text still shows one character: 0x104. Is this expected behaviour, or is there something wrong on my side?

Comments

  • The underlying text still shows one character.

    That is because features are not supposed to change the underlying text, only the derived glyphs. I would suggest testing the ccmp by changing the look of one of the decomposed glyphs, so that you can tell at a glance whether the substitution has been applied.
  • Eric is right: there are two distinct operations. One is Unicode decomposition (into the NFD form, where accented letters are encoded as base letters followed by combining marks; you can convert your text into NFD using Unicode Checker's Tools palette). The other is one-to-many glyph substitution (via "ccmp" or any other feature, which will do whatever you put into the font). The latter will not influence your character codes.

    Note that some apps automatically convert your text into NFC (composed) form behind the scenes, so testing "ccmp" in apps may prove a bit tricky. 
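
A minimal sketch of the character-level side of this distinction, using Python's standard `unicodedata` module on the U+0104 example from the question. The helper name `codepoints` is made up for illustration. This only changes character codes; it says nothing about what a font's "ccmp" feature does to glyphs.

```python
import unicodedata

# U+0104 LATIN CAPITAL LETTER A WITH OGONEK, the character from the question.
composed = "\u0104"

# NFD splits it into the base letter plus U+0328 COMBINING OGONEK.
decomposed = unicodedata.normalize("NFD", composed)

# NFC puts the pair back together, as some apps do behind the scenes.
recomposed = unicodedata.normalize("NFC", decomposed)


def codepoints(text):
    """Return the code points of text as 'U+XXXX' strings (hypothetical helper)."""
    return [f"U+{ord(ch):04X}" for ch in text]


print(codepoints(composed))    # ['U+0104']
print(codepoints(decomposed))  # ['U+0041', 'U+0328']
print(codepoints(recomposed))  # ['U+0104']
```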
  • Aha! InDesign is “re-composing” the separate parts again.
  • That is true: InDesign likes to default to the precomposed glyph if it is available for the respective letter+mark combo. You could circumvent that by adding another combining mark, because it only does that when there is a single mark following a letter, IIRC.
  • Kent Lew
    because it only does that when there is a single mark following a letter, IIRC.

    That’s only true with InDesign’s standard composer. A second combining accent will break up the automatic composition of the base and the first (unless, of course, there is a relevant double-accent combined NFC codepoint — Vietnamese, for instance — and the encoded glyph is present).

    In the same situation, the World-Ready composer will maintain the first composed glyph and just place the second combining accent.

    BTW, there’s an additional difference in the handling of combining accents between InDesign’s standard composer and the World-Ready composer. While both appear to prefer precomposed glyphs when the corresponding NFC codepoint is available in the font, they behave differently when one is not.

    If there is no "mark" feature to manage the positioning of a combining accent, then the standard composer will place the zero-width glyph at the advance width (right sidebearing) of the previous glyph. The World-Ready composer, on the other hand, will center it on the advance width of the previous glyph.

  • Kent Lew
    P.S. Note that World-Ready composer appears to center the *bounding box* of the combining accent on the advance width of the base glyph in these situations. So it yields the same results regardless of whether the font employs a strategy of offsetting combining accents to the left or just has them centered on the zero point.
  • Chris Lozos
    If InD doesn't even do it properly, what apps can you count on to do it right?
  • edited August 2015
    Is there any way to inspect what is actually happening to the “derived” text behind the scenes in InDesign? Or browsers, for that matter?
  • Adam Twardoch
    edited August 2015
    Čĥĕčķ őúť váŕĩőúś “fáńčŷ ťĕжť” ğĕńĕŕáťőŕś*, ťĥĕń úśĕ Uńĩčőďĕ Cĥĕčķĕŕ’ś Diff, Normalize áńď Split Up úťĩĺĩťĩĕś. Diff áńď Split Up áĺĺőŵ ŷőú ťő śĕĕ ťĥĕ čőďĕpőĩńťś ĩń á ťĕжť, áńď Normalize áĺĺőŵś ŷőú ťő čőńvĕŕť ßĕťŵĕĕń NFC áńď NFD. 

    *) 
    http://www.messletters.com/en/
    http://www.jamfoo.com/text-generators/
    http://www.mallubar.com/text-convertors/fancy-text.php

    A. :)
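
For readers without Unicode Checker, a rough stand-in for a "split up" view can be sketched with Python's standard `unicodedata` module. The function name `split_up` is made up here and is not Unicode Checker's API. The sample is the first word of the fancy text above, written with escapes so that its stored form is unambiguous.

```python
import unicodedata


def split_up(text):
    """Return (code point, Unicode name, is combining mark) per character.

    Hypothetical helper, loosely imitating a "split up" view of a string.
    """
    return [
        (f"U+{ord(ch):04X}", unicodedata.name(ch, "<unnamed>"),
         unicodedata.combining(ch) != 0)
        for ch in text
    ]


# Precomposed (NFC) spelling of the first fancy-text word.
sample = "\u010C\u0125\u0115\u010D\u0137"

for row in split_up(sample):
    print(row)

# After NFD, every letter is followed by its own combining mark.
nfd = unicodedata.normalize("NFD", sample)
for row in split_up(nfd):
    print(row)

print(len(sample), len(nfd))  # 5 10
```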
  • BTW, this tool is good for testing mark attachment: http://www.messletters.com/en/stripes/ :) 
  • Haha. Thanks!
  • I've found BabelPad by Andrew West to be very helpful in understanding what is going on at character-level and glyph-level.
    http://www.babelstone.co.uk/Software/index.html
    (Windows only)
  • Adam Jagosz
    edited October 5
    I've been trying to replace ccmp with more straightforward substitutions to make it possible to compose accented alternates in InDesign. However, I reached a point where the whole thing stopped working even in non-Adobe apps — Word, browsers. The following code:
        ignore sub ecaron' @letters;
        sub ecaron' by e.fina uni030C;
    results in /ecaron getting substituted by only the first specified glyph, e.fina — the accent disappears. What's wrong?
    I tried to find in the doc if one-to-many substitutions are allowed with the ignore statement, but there's no mention of it, though lack of such examples seems to mean they aren't.
  • Adam Jagosz
    edited October 5
    Is the following approach a valid & safe workaround? (If so, I'm probably reinventing the wheel here). It works, but... there's always a “but”.
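        # Step 1: decompose, inserting the placeholder glyph mark_separator between base and mark.
        sub egrave by e mark_separator gravecomb;
        sub eacute by e mark_separator acutecomb;
        sub ecircumflex by e mark_separator uni0302;
        sub edieresis by e mark_separator uni0308;
    ...
        # Step 2: swap in final forms, except before a letter (directly, or after separator + mark).
        ignore sub @nonFinal' @letters,
            @nonFinal' mark_separator @combiningMarks @letters;
        sub @nonFinal' by @final;
    ...
        # Step 3: remove the placeholder again by ligating it into the base glyph.
        sub e mark_separator by e;
        sub e.fina mark_separator by e.fina;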
    This trick amends InDesign's eagerness to be smarter than its users.
  • Kent Lew
    edited October 5
    I tried to find in the doc if one-to-many substitutions are allowed with the ignore statement, but there's no mention of it,
    What you are doing here is foremost a GSUB Type 5 Contextual Substitution. Your contextual substitution then parses down to a resulting Type 2 Multiple Substitution.
    Under section 5.f.i., you’ll find this statement:
    If there is only a single substitution operation, and it is either single substitution or ligature substitution, then the operation can be specified in-line and its type will be auto-detected from the input[...]
    Since your final substitution is neither a single nor a ligature substitution, I don’t think you can write it in the format that you did. It seems like it should be possible to construct it as a valid rule, but I think you would need to use the more general syntax to get it to compile properly.
    Your workaround looks quite cumbersome. I didn’t take time to parse all of it, but surely there’s a better way. (Although, when it comes to trying to “outsmart” InDesign, perhaps not.)
  • Kent Lew
    FWIW, here is how I would interpret the generalized syntax for your rule:
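        # Standalone lookup holding the one-to-many (Type 2) substitution.
        lookup finalAccent {
            sub ecaron by e.fina uni030C;
        } finalAccent;

        feature ccmp {
            # Leave ecaron alone when a letter follows; otherwise apply the lookup to it.
            ignore sub ecaron' @letters;
            sub ecaron' lookup finalAccent;
        } ccmp;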
    I have never used this format before, myself. And I have not tested compiling this, so I can’t vouch for whether it would work the way you intend.
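
The intended effect of that rule can be checked on paper with a toy model. The sketch below is plain Python, not feature syntax, and it is not how any shaping engine is implemented; it only mimics "decompose ecaron unless a letter follows". The set `LETTERS` is a made-up stand-in for `@letters` (the thread never lists its members), and `apply_ccmp` is a hypothetical name.

```python
# Hypothetical stand-in for the @letters class; its real members are not given above.
LETTERS = {"a", "e", "n", "ecaron"}

# Mirrors "lookup finalAccent": one glyph in, two glyphs out.
FINAL_ACCENT = {"ecaron": ["e.fina", "uni030C"]}


def apply_ccmp(glyphs):
    """Decompose ecaron unless a letter follows it (the 'ignore sub' context)."""
    out = []
    for i, glyph in enumerate(glyphs):
        following = glyphs[i + 1] if i + 1 < len(glyphs) else None
        if glyph in FINAL_ACCENT and following not in LETTERS:
            out.extend(FINAL_ACCENT[glyph])  # word-final: e.fina plus combining caron
        else:
            out.append(glyph)                # excluded context or unrelated glyph
    return out


print(apply_ccmp(["n", "ecaron"]))  # ['n', 'e.fina', 'uni030C']
print(apply_ccmp(["ecaron", "n"]))  # ['ecaron', 'n']
```

If a compiled font drops the accent, as in the original report, the output would correspond to `['n', 'e.fina']` rather than the first line above.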
