Accessing the Font Book language requirements database?

I recently noticed that Font Book seems to have changed its database with respect to what it considers necessary for each language to show up as supported. For example, some of my fonts that were flagged as supporting Dutch and Turkish in older versions of Font Book (or Mac OS) are not showing these two any more.

Is there a way to access the language database in Font Book? Does anyone know which database they use? Although I wouldn’t consider Font Book the only authoritative source on this, it would be good to know which characters I need to include to make Font Book show a certain language as supported.

Comments

  • To (partly) answer my own question, the data can be found inside Font Book.app (in the Finder, use Show Package Contents), in Resources/LanguageAlphabets.plist.

    Still trying to figure out what the letter pairs mean, e.g. Dutch requires:
    a á ä b c d e é ë f g h i í ï ij j k l m n o ó ö p q r s t u ú ü v w x y z
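For anyone who wants to inspect the file programmatically, here is a minimal Python sketch. It assumes the plist is a flat dictionary mapping a language code to a space-separated string, which is what the Dutch entry suggests but has not been verified for the whole file. `SAMPLE_PLIST` and `load_alphabets` are hypothetical names, and the sample is a one-entry stand-in rather than the real file.

```python
import plistlib
import unicodedata

# Hypothetical one-entry stand-in for LanguageAlphabets.plist. The flat
# "language code -> space-separated string" layout is an assumption.
SAMPLE_PLIST = (
    '<?xml version="1.0" encoding="UTF-8"?>\n'
    '<plist version="1.0"><dict>\n'
    "<key>nl</key>\n"
    "<string>a á ä b c d e é ë f g h i í ï ij j k l m n o ó ö "
    "p q r s t u ú ü v w x y z</string>\n"
    "</dict></plist>\n"
).encode("utf-8")


def load_alphabets(plist_bytes):
    """Return {language_code: [entry, ...]} from plist data."""
    raw = plistlib.loads(plist_bytes)
    result = {}
    for lang, value in raw.items():
        if isinstance(value, str):
            # Normalize so precomposed and decomposed spellings compare equal.
            result[lang] = [unicodedata.normalize("NFC", e) for e in value.split()]
    return result


alphabets = load_alphabets(SAMPLE_PLIST)
# Entries that are longer than one code point are the "letter pairs".
multi = [entry for entry in alphabets["nl"] if len(entry) > 1]
print(multi)  # ['ij']
```

On a real system the bytes would come from the file inside the app bundle, for example via `pathlib.Path(...).read_bytes()`; the location of Font Book.app depends on the macOS version.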

  • Isn’t that just the preview string that is shown in Font Book, depending on the system language?
  • John Hudson
    Still trying to figure out what the letter pairs mean, e.g. Dutch requires:
    a á ä b c d e é ë f g h i í ï ij j k l m n o ó ö p q r s t u ú ü v w x y z

    Possibly a compatibility decomposition of the ij digraph character (U+0133)?

  • TimAhrens
    edited February 2017
    I have a feeling Jens is right but couldn’t confirm it. I tweaked LanguageAlphabets.plist in a copy of Font Book.app but it did not have any effect on the sample string (even after restarting the computer). Maybe it is cached somewhere really persistently.

    John, I also thought of a decomposition but there are other pairs such as ch and dd for Welsh, which aren’t decompositions, I assume?

    So, this puzzle is still not quite solved.
  • Denis Moyogo Jacquerye
    edited February 2017
    Aren’t they derived from CLDR locale exemplar characters?
    The list of locales is pretty similar; however, the content is slightly different in some cases, which probably means it is from an older version or has been processed in some way.

    For example, CLDR’s nl.xml contains:
    <exemplarCharacters>[a á ä b c d e é ë f g h i í ï {ij} {íj\u0301} j k l m n o ó ö p q r s t u ú ü v w x y z]</exemplarCharacters>
    and in LanguageAlphabets.plist there is:
    <key>nl</key>
    <string>a á ä b c d e é ë f g h i í ï ij j k l m n o ó ö p q r s t u ú ü v w x y z</string>
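The difference between the two Dutch entries can be made explicit with a small Python sketch. The parser below is a deliberate simplification: it only handles single characters, {multi-character} clusters and \uXXXX escapes, not the full UnicodeSet syntax that CLDR allows (ranges, nested sets, properties). `parse_exemplars`, `CLDR_NL` and `PLIST_NL` are hypothetical names; the two strings are copied from above.

```python
import re
import unicodedata

CLDR_NL = (r"[a á ä b c d e é ë f g h i í ï {ij} {íj\u0301} j k l m n "
           r"o ó ö p q r s t u ú ü v w x y z]")
PLIST_NL = ("a á ä b c d e é ë f g h i í ï ij j k l m n "
            "o ó ö p q r s t u ú ü v w x y z")


def parse_exemplars(text):
    """Split a CLDR exemplarCharacters value into a list of entries.

    Simplified: ranges such as a-z and nested sets are NOT supported.
    """
    text = text.strip()
    if text.startswith("[") and text.endswith("]"):
        text = text[1:-1]
    # Turn \uXXXX escapes into the characters they stand for.
    text = re.sub(r"\\u([0-9A-Fa-f]{4})",
                  lambda m: chr(int(m.group(1), 16)), text)
    text = unicodedata.normalize("NFC", text)
    entries = []
    # An entry is either a {cluster} or a single non-space character.
    for cluster, single in re.findall(r"\{([^}]*)\}|(\S)", text):
        entry = cluster or single
        if entry:
            entries.append(entry)
    return entries


cldr = parse_exemplars(CLDR_NL)
plist = [unicodedata.normalize("NFC", e) for e in PLIST_NL.split()]

only_in_cldr = [e for e in cldr if e not in plist]
only_in_plist = [e for e in plist if e not in cldr]
print([[f"U+{ord(c):04X}" for c in e] for e in only_in_cldr])
print(only_in_plist)
```

For these two strings the only entry missing from the plist side is the cluster with the combining acute, which fits the guess that the plist was derived from CLDR with some processing.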
  • John Hudson
    edited February 2017
    That makes sense, Denis. In that case, the information is recording digraphs that may sort independently.
  • Thanks Denis! Looks like this is exactly the source I was looking for. Now I need to find an elegant way to get this data into my font editor.
  • Jens Kutilek
    edited February 2017
    Alphabet Type’s charset builder/checker are based on the same Unicode CLDR data, and they offer a web API: http://www.alphabet-type.com/tools/charset-builder/

    What editor do you use?
  • TimAhrens
    edited February 2017
    The Charset Builder looks very useful. But does it really mean the selected languages will be indicated as supported in Font Book? I have a feeling the Charset Builder does not include combining diacritics (such as U+0301 for Dutch) although they are in the CLDR, and they may be required by Font Book. Will run some tests later.

    I’m mainly using Glyphs, btw.
  • The combining diacritics omission in Charset Builder seems weird ... It’s the same in all languages that I checked. Maybe ask Alphabet Type about it. 
  • Eigi
    Yes - I must confess that is based on a lazy interpretation of this bit of information from CLDR’s nl.xml: 
    {íj\u0301}
    So what do you think - does a font that lacks the combining acute (U+0301) not support Dutch? Any native speaker here for help?
  • Hi Eigi, thanks for jumping in, and thanks for providing the Charset Builder!

    I didn’t mean to start a discussion on the question which characters are necessary for Dutch or any other specific language. I am sure there would be a lot to discuss on this topic.

    As I said above, I wouldn’t consider Font Book the only authoritative source on this but it would be good to know which characters I need to include to make Font Book show a certain language as supported.

    I always try to form my own opinion on which characters make sense to include, but in the case of Font Book language support, it may be easier to just follow along and provide the specified characters so as not to confuse users who, unlike us, might rely entirely on the information given by Font Book.
  • Eigi
    Hi Tim, I totally agree with you.
    I'm not absolutely sure how the information in curly brackets should be interpreted at all. My understanding is that {íj\u0301} means the iacute-jacute ligature, which is not encoded in Unicode. In this case CLDR lists the components needed to build this character. But simply adding the combining acute is not enough to fully support the Dutch iacute-jacute ligature case.
    On the other hand, if I add the combining acute to the list of required characters for Dutch, the Charset Checker (which uses the same database) will mark Dutch as unsupported for a whole bunch of existing fonts, which I don't want either.
    It is a dilemma...
    And as a side note: not everything that Apple has done in the past with regard to Unicode was 100% correct ;-)
    Eigi
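The dilemma above can be illustrated with a short Python sketch that checks a set of encoded code points against a required-character list. It is only an assumption that a checker works this way (every code point of every entry must be in the font); how Font Book actually decides is not known from this thread. `missing_codepoints` and `font_cmap` are hypothetical, and the "font" is a made-up set that has the precomposed Dutch letters but no combining marks.

```python
import unicodedata


def missing_codepoints(entries, cmap):
    """Return the sorted code points used by entries but absent from cmap."""
    needed = {ord(ch) for entry in entries for ch in entry}
    return sorted(needed - set(cmap))


def fmt(codepoints):
    return [f"U+{cp:04X}" for cp in codepoints]


# Hypothetical font: a-z plus precomposed á ä é ë í ï ó ö ú ü, no U+0301.
font_cmap = set(range(0x61, 0x7B)) | {
    0xE1, 0xE4, 0xE9, 0xEB, 0xED, 0xEF, 0xF3, 0xF6, 0xFA, 0xFC,
}

# Dutch list as found in LanguageAlphabets.plist (no combining acute).
plist_nl = [unicodedata.normalize("NFC", e) for e in (
    "a á ä b c d e é ë f g h i í ï ij j k l m n "
    "o ó ö p q r s t u ú ü v w x y z").split()]

# Same list plus the CLDR cluster {íj\u0301}.
cldr_nl = plist_nl + ["\u00edj\u0301"]

missing_plist = missing_codepoints(plist_nl, font_cmap)
missing_cldr = missing_codepoints(cldr_nl, font_cmap)
print(fmt(missing_plist))  # []
print(fmt(missing_cldr))   # ['U+0301']
```

Under these assumptions the same font passes against the plist list and fails against the full CLDR list, purely because of the combining acute. With fontTools installed, `TTFont(path).getBestCmap()` returns a dictionary keyed by code point that could be passed as `cmap` for a real font.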