Inclusion of 'meta' table Design Languages in fonts

Peter Constable Posts: 141
edited November 26 in Font Technology
OpenType 1.8 introduced a 'meta' (metadata) table, which Apple had previously started using. But Microsoft and Apple worked out together a couple of new entries for this table: "Design languages" and "Supported languages". 

These are meant to supersede the 'OS/2' code page and Unicode range fields, which can't scale beyond Unicode 5.2 (they ran out of bits), which never worked well because it was never clear what a set bit actually meant, and which did not make an important distinction between what a font is capable of displaying versus what the font is designed for.

We actually defined these new fields over a year or maybe two before OT1.8 was published (they were documented in Apple's TT spec), and in Windows we started supporting these in DirectWrite APIs and using them in fonts in the first Windows 10 release (summer 2015). But there wasn't an OT spec update until September 2016.

At any rate, I'm wondering whether support for these fields has started to get into tools and whether any foundries are starting to put them into fonts?

We'd like to start making more use of these fields in Windows. (If you look at the Fonts control panel in Windows 7 or later, it reports "Designed for" information, but up to now that comes from a separate data file, not from fonts themselves, and we only maintained that data for Windows and Office fonts. The 'meta' table of individual fonts is where this belongs, however.) So, I'm hoping they start getting included in more fonts.

Are people aware of the 'meta' table and including it in their fonts?

Comments

  • Our font editor has supported dlng and slng since version 10, which was released back in June 2016.

    It is hard to tell if people actually make use of it, but I suspect there is not much interest in these fields right now. I'm sure more font designers will put effort into this as soon as Windows makes more use of these fields.

    We've had support requests related to the "Designed for" information, as people expect that information to come from within the font. They were disappointed to learn that it comes from a separate file.

  • I’ll add support for it to Glyphs.
    Two questions:
    The specs from Apple and MS differ in the header. Apple has a `dataOffset` where MS has a `(reserved)`. Which is correct? FontTools follows the Apple spec.

    Is there any definition of what "supported" means? (That is the same question as which unicodeRange bits should be set.) If we start adding a new mechanism, we could also give it a bit more meaning. Are there any suggestions about the minimal character set needed for each language?
  • @Georg Seifert

    I noticed the discrepancy while preparing the draft text for OT 1.8 and followed up with Apple. They did update the description for the dataOffset field in the data map record to state that the offsets are from the start of the 'meta' table, and that makes clear where the data corresponding to a given record is located. It also makes clear that the dataOffset field in the header isn't required. Their spec also indicates that the data map records follow the header.

    They probably should update the header description in their spec.
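
A minimal sketch of reading that layout, assuming the OT 1.8 structure: a header of four uint32 fields (version, flags, the reserved field that Apple called dataOffset, and the data map count), followed directly by 12-byte data map records (tag, offset, length). The helper names `parse_meta` and `build_meta` and the sample values are hypothetical, for illustration only; the point is that each record's offset is resolved from the start of the table and the header's reserved field is never consulted.

```python
import struct

HEADER_SIZE = 16   # version, flags, reserved, dataMapsCount (4 x uint32)
RECORD_SIZE = 12   # tag (4 bytes), dataOffset (uint32), dataLength (uint32)


def parse_meta(table: bytes) -> dict:
    """Return {tag: raw bytes} for a raw 'meta' table."""
    version, _flags, _reserved, count = struct.unpack_from(">4L", table, 0)
    if version != 1:
        raise ValueError(f"unexpected 'meta' version {version}")
    entries = {}
    for i in range(count):
        # Data map records follow the header immediately.
        tag, offset, length = struct.unpack_from(
            ">4sLL", table, HEADER_SIZE + RECORD_SIZE * i
        )
        # Offsets count from the start of the table, not from the data area.
        entries[tag.decode("ascii")] = table[offset:offset + length]
    return entries


def build_meta(entries: dict) -> bytes:
    """Build a small 'meta' table with the reserved header field zeroed."""
    data_start = HEADER_SIZE + RECORD_SIZE * len(entries)
    records, payload = b"", b""
    for tag, data in entries.items():
        records += struct.pack(
            ">4sLL", tag.encode("ascii"), data_start + len(payload), len(data)
        )
        payload += data
    header = struct.pack(">4L", 1, 0, 0, len(entries))  # reserved = 0
    return header + records + payload


# Hypothetical sample values, used only to exercise the round trip.
sample = build_meta({"dlng": b"Latn", "slng": b"tr-Latn,Latn"})
parsed = parse_meta(sample)
```

Because the parser ignores the third header field, it reads tables written against either spec as long as the record offsets are table-relative.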

    As confirmation of what I'm saying, here's an excerpt from discussion I had with John Jenkins -- this is John writing:

    "Yes, the offsets are really from the start of the 'meta' table and not the start of the data. (We recently updated our spec to reflect this.) I think the best thing to do—since there are already implementations out there—is to change the 'offsetToData' field in the header to 'unused; set to 0'..."

    And later in the discussion,

    "I think it would still be a good idea to have zeroing it out be the norm."


    Wrt "supported", that is still ambiguous, but let me make some observations: 

    First, "supported" is the less interesting of the two concepts. It's mainly useful for font fallback or font binding scenarios as a heuristic to decide, "Is it worth even trying to display this string using this font?" Ultimately, the question of whether a font can display a string is determined from the 'cmap', and data such as the "supported" field or OS/2 fields is just a first-level heuristic for this.

    Secondly, the OS/2 Unicode range fields were initially based on Unicode blocks, and blocks don't really correspond to anything that has practical use. For example, consider the currency symbols block: there aren't very many usage scenarios that involve all or even a large number of those characters at once. Nobody writes messages in currency symbols, for instance. Or, consider Romanian or Turkish text, say: there are multiple Unicode blocks you would need to check to determine if a font can display it. But a font certainly doesn't need to have complete coverage of all those blocks to support Romanian or Turkish. Should a font designed to support Romanian and Turkish but not, say, Lithuanian set Unicode range bits for those blocks? If it does, that may result in misleading an app used for Lithuanian.

    "Supports" allows one to be specific, if desired; e.g., to say that the font can support Romanian and Turkish, but not list Lithuanian. For widely-used scripts like Latin, though, I wouldn't want to be the font developer struggling over what languages to specifically mention and how to verify that. (Btw, see the thread on testing for language coverage.) 

    It's much easier to simply make a declaration about a script: "This font supports Latin script". Of course, particularly for Latin and Cyrillic, or even Arabic, there are so many characters used for certain languages but not others, this can be very ambiguous. My guidance would be to declare support for the script if there are any major languages that could be displayed using the font. Keeping in mind that this is no more than a heuristic for font fallback, I think that's good enough.
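
The two-stage check described above (a cheap "supported" heuristic first, the 'cmap' as the final word) could be sketched roughly as follows. This is an illustration under assumptions, not how any particular platform does it: fonts are modelled as plain dictionaries, the 'cmap' as a set of code points, and the names `declares_script`, `can_display`, and `pick_fallback` are hypothetical.

```python
def declares_script(slng: str, script: str) -> bool:
    """True if any tag in the comma-separated slng value names the script."""
    for tag in slng.split(","):
        subtags = [s.lower() for s in tag.strip().split("-")]
        if script.lower() in subtags:
            return True
    return False


def can_display(cmap: set, text: str) -> bool:
    """The definitive check: every non-space character must be mapped."""
    return all(ord(ch) in cmap for ch in text if not ch.isspace())


def pick_fallback(fonts: list, text: str, script: str):
    """Return the name of the first font that passes both checks, else None."""
    for font in fonts:
        # Stage 1: heuristic. Is it worth even trying this font?
        if not declares_script(font["slng"], script):
            continue
        # Stage 2: the cmap decides whether the string can really be shown.
        if can_display(font["cmap"], text):
            return font["name"]
    return None


ascii_only = set(range(0x20, 0x7F))
# Hypothetical fonts, for illustration only.
fonts = [
    {"name": "A", "slng": "Arab", "cmap": set(range(0x0600, 0x0700))},
    {"name": "B", "slng": "Latn", "cmap": ascii_only},
    {"name": "C", "slng": "tr-Latn,Latn", "cmap": ascii_only | {0x011F, 0x0131, 0x015F}},
]
choice = pick_fallback(fonts, "şiir", "Latn")
```

Font B declares "Latn" and so passes the heuristic, but it is still rejected at the cmap stage because it lacks U+015F. That is the sense in which the declaration is only a first-level filter.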
  • @Erwin Denissen 
    Thanks for supporting these fields in your tools. :smile: 

    I worked on the changes in Windows 7, and devised the metadata used in the control panel. I had the intention of later creating ways to pull in data for third-party fonts, and we even prototyped a tool that font developers could use to author that metadata and submit it. Those were early days of using cloud services, but we already had a control panel for devices that was pulling metadata from a cloud service for devices you install, and I planned to leverage that. (The devices team were already piggy-backing on a service created for music.) But org priorities after Windows 7 went in a very different direction (including me no longer being involved with fonts).

    But when I put the designed-for info into that metadata for Windows 7, I had in mind from the outset to eventually get that into fonts directly. And that is something I did get to do after returning to work on text and fonts.
  • I see the point of adding a "designed for" field to the font. But as there is no clear recommendation or rule for what "supported" should include or mean (or what it means when something is not included), the data doesn't seem to be useful. Why add it then?
  • George Thomas Posts: 389
    edited October 13
    "Supports" allows one to be specific, if desired; e.g., to say that the font can support Romanian and Turkish, but not list Lithuanian. For widely-used scripts like Latin, though, I wouldn't want to be the font developer struggling over what languages to specifically mention and how to verify that.

    That is a good explanation of "support" but if one lists all the languages believed to be supported by the font it could get pretty lengthy. I would still include it in my fonts because it would be helpful to some.


  • "Supports" allows one to be specific, if desired; e.g., to say that the font can support Romanian and Turkish, but not list Lithuanian.
    But in what case does it mean something that Lithuanian is not mentioned in the list? Did the producer of the font intentionally omit it, or did he not know or care about it?

    To whom could that information be helpful? I can only see a benefit if there are rules that determine the scripts in the list. But then everybody could run that analysis on the cmap himself and the font wouldn't need that info.

    So the info could be used in a case where the client/renderer needs to pick a font for a certain text. The text is in Lithuanian. The font might have all the glyphs to support it, but the "supported" list doesn't mention it. Should the font be skipped?

    Or am I missing something?
  • That's why for "supports" in the case of a large and fractured script like Latin I'd probably just list the script. 

    If a font declared slng="tr-Latn,Latn", then that would be declaring (i) explicit support specifically for Turkish (Latin script), and also (ii) a generic (hence more ambiguous) support for Latin script. That might be used to decide that the font could be checked for Lithuanian support (by virtue of declaring "Latn"), but the declaration doesn't ensure any confidence level for that use. But if the text was in Turkish, then the explicit declaration of "tr-Latn" should give a high level of confidence that the font can support Turkish.

    But really, if I'm looking for a good font to read Turkish or Lithuanian text, then I'd want to look at dlng (designed for) declarations. A Japanese font might be capable of displaying Turkish (or Lithuanian) text, but not be a great choice for that because the design is really targeted at a Japanese user and not a Turkish user. That font probably would include "Latn" in the slng value, but it should not include "Latn" in the dlng value.

    In the dlng value, there is still the question for a large script like Latin of what declarations to make. If I start listing languages, I could potentially list dozens, but still have gaps. You can only list so many, but where to stop? I would still include "Latn" (assuming it is primarily meant for Latin-script usage) as that still provides benefits. If nothing else, it can get that filtered out if what I really want to find are good fonts for Arabic, not Latin. 
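
As a rough sketch of how a client might read such declarations: an exact tag match gives high confidence, a bare script match only says the font is worth checking, and dlng is consulted separately when looking for a font that suits the reader. The function names and the labels "explicit", "script-only", and "none" are hypothetical, as are the sample slng/dlng values for the Japanese font.

```python
def parse_tags(value: str) -> list:
    """Split a comma-separated dlng/slng value into trimmed tags."""
    return [t.strip() for t in value.split(",") if t.strip()]


def script_of(tag: str):
    """Return the four-letter script subtag of a tag, or None."""
    for sub in tag.split("-"):
        if len(sub) == 4 and sub.isalpha():
            return sub
    return None


def support_confidence(declared: str, wanted: str) -> str:
    """Classify how strongly a declaration covers the wanted tag."""
    tags = [t.lower() for t in parse_tags(declared)]
    if wanted.lower() in tags:
        return "explicit"      # e.g. "tr-Latn" is listed for Turkish text
    script = script_of(wanted)
    if script and script.lower() in tags:
        return "script-only"   # generic "Latn": worth checking, no guarantee
    return "none"


def designed_for(dlng: str, wanted: str) -> bool:
    """True if the font's dlng value covers the wanted tag at all."""
    return support_confidence(dlng, wanted) != "none"


turkish = support_confidence("tr-Latn,Latn", "tr-Latn")
lithuanian = support_confidence("tr-Latn,Latn", "lt-Latn")

# Hypothetical Japanese font: it can display Latin text but is not designed for it.
jp_slng, jp_dlng = "Jpan,Latn", "Jpan"
jp_can_try = support_confidence(jp_slng, "tr-Latn")
jp_good_choice = designed_for(jp_dlng, "tr-Latn")
```

With these inputs the Japanese font still qualifies as a fallback candidate through its slng value, but it is filtered out when the search is for fonts designed for Turkish readers.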
  • John Hudson Posts: 1,239
    The script/language data in the meta table seems to me primarily useful if a font covers a script that is not included in the available OS/2 Unicode range bits. In that case, I would consider it strongly recommended to include a meta table recording this.
  • "Note: Implementations that use 'slng' values in a font may choose to ignore Unicode-range bits set in the OS/2 table."

    Some applications may start to do this, and I definitely expect some applications will start to use dlng without any reference to OS/2 (since OS/2 cannot provide the needed information).