Pet peeve: empty .notdef character.


Comments

  • WAY KYI
    WAY KYI Posts: 130
    Does the Unicode Consortium assign a code number to .notdef? Or can you just create it with that name and place it in the PUA or a new unmapped slot?
  • .notdef has no unicode value. It always occurs at GID 0.
  • That's why it's called .notdef . :smile:

  • WAY KYI
    WAY KYI Posts: 130
    edited October 2021
    Thanks. I see some font has .notdef at u10000, a rectangle with a question mark inside. Now I have more questions about these keys: Nul, Tab, Esc, Backspace, Insert, Delete, Shift, Alt and Ctrl keys. Where are they, and what are their code numbers? Is GID 0 NULL character? Which of them can I use in the font, and how? Thanks
  • > I see some font has .notdef at u10000

    As noted by @André G. Isaak above, this is incorrect. It should be unencoded.

    > Is GID 0 NULL character?

    No, the .notdef glyph should be at GID 0. (Again, André is correct.)

    > these keys: Nul, Tab, Esc, Backspace, Insert, Delete, Shift, Alt and Ctrl keys

    None of these require glyphs in a font, and most apps treat these characters as non-marking control characters and will not render them even if they are present. (Nul has a bit more history, but is also not needed.)
  • u10000 is a Linear B syllable. Which font is encoding notdef there?

    Shift, alt, control, etc. aren't even characters. They're keys which allow you to make other characters. :-)
  • WAY KYI
    WAY KYI Posts: 130
    edited October 2021

    I am still confused about .notdef. See the images above: even Microsoft's system fonts use the GID 0 slot in two different ways. In Arial it is used as .notdef and holds a rectangle; in Calibri it is used as NULL with a zero-width glyph. Are both usages valid? Also, in Arial most of the initial slots assigned to control characters were removed, but not in Calibri. I thought we should not touch these slots. So which font is implemented correctly here? But my main question is: is GID 0 for .notdef or NULL? I am very grateful for all your teaching and help here. Thanks
  • I'm not familiar with the font editor you're using to display the above images, but I have a suspicion that the first image is displaying glyphs sorted by GID and the second is displaying them sorted by unicode value. NULL should have a unicode value of 0 and a GID of 1.
  • WAY KYI
    WAY KYI Posts: 130
    I use FontForge and both fonts are displayed in Unicode. So, GID 0 = -1 in Unicode? Yes it is, see image. But it is placed at U+10000 slot. So, it is correctly implemented in Calibri, then?
  • It doesn't matter what order FontForge displays your glyphs in: it will order them correctly when you generate a font, with .notdef first.

    When FontForge displays -1 in the "Unicode Value" field of the "Glyph Info" dialog, that means the glyph is unencoded. Not that the Unicode value is actually -1: that's just a placeholder number that FontForge uses for all unencoded glyphs. It means that in the generated font there will be no mapping of that glyph to a Unicode value in the cmap table.
  • Another thing about your image, @WAY KYI. There are two numbers displayed at the top of the Font View window, just below the menu. The first, reported in both decimal and hex (for .notdef, it's 65536 (0x10000)), has to do with how FontForge orders glyphs internally. It has nothing to do with the Unicode value, and it doesn't have to match the order of glyphs in the generated font.

    The other number is the Unicode value, here reported as U+????. That means the same thing as -1 in the "Unicode Value" field of the "Glyph Info" dialog: no encoding at all for the .notdef glyph.
  • WAY KYI
    WAY KYI Posts: 130
    > I'm not familiar with the font editor you're using to display the above images, but I have a suspicion that the first image is displaying glyphs sorted by GID and the second is displaying them sorted by unicode value. NULL should have a unicode value of 0 and a GID of 1.

    So, where do you create a .notdef with a rectangle and a question mark inside? And where is GID 0? Sorry, I was confusing GID and Unicode. Now I know they are two different things. Can you point me to any reading about GID? Thanks
  • WAY KYI
    WAY KYI Posts: 130
    > Another thing about your image, @WAY KYI. There are two numbers displayed at the top of the Font View window, just below the menu. The first, reported in both decimal and hex (for .notdef, it's 65536 (0x10000)), has to do with how FontForge orders glyphs internally. It has nothing to do with the Unicode value, and it doesn't have to match the order of glyphs in the generated font.
    >
    > The other number is the Unicode value, here reported as U+????. That means the same thing as -1 in the "Unicode Value" field of the "Glyph Info" dialog: no encoding at all for the .notdef glyph.

    Thank you very much for your explanation. I always thought the 0x number was the same as U+????. I never went over PUA and that is the case. So .notdef won't be encoded in Calibri, right? So where do you encode .notdef, and which slot should I use? Sorry, please bear with me and help me get it right. Thanks
  • You don't have to worry about where .notdef goes: Just let FontForge put it anywhere.

    Do this: on the Encoding menu, choose "Add Encoding Slots," and enter the number 1 when prompted. This creates (despite the name of the menu item) an unencoded slot. Scroll (if necessary) to the very bottom of the font view window: the slot you created will be the very last one in the font.
    Open the glyph and draw what you want apps to display when a character is missing. Then open your character info dialog, name the glyph .notdef, and you should be done. FontForge will do the right thing with the glyph when it generates the font.

    The GID is simply the index number of the glyph in the font. The first glyph is 0, the second 1, and so on, to the end of the font. Software uses it to retrieve glyphs from the font. The Unicode number is the encoding assigned to a character by the Unicode Consortium. A character's GID may be different from font to font (except for those first few GIDs). The Unicode number, on the other hand, is always the same. A text is a sequence of Unicode numbers. An application displays or prints text by looking up the Unicode number in the cmap table, getting the GID there, and using that to retrieve the glyph from the font.
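The lookup described above can be sketched with a toy model in plain Python. This is only an illustration: the glyph names, code points and function names are made up for the example, and a real text engine does much more (shaping, font fallback) than this.

```python
# Toy font: the list index is the GID, so .notdef sits at GID 0.
# Glyph names and code points are illustrative, not taken from a real font.
GLYPH_ORDER = [".notdef", ".null", "nonmarkingreturn", "space", "A", "B", "A.sc"]

# Toy cmap: Unicode value -> GID. Unencoded glyphs (.notdef, A.sc) have no entry.
CMAP = {0x0000: 1, 0x000D: 2, 0x0020: 3, 0x0041: 4, 0x0042: 5}

NOTDEF_GID = 0


def gids_for_text(text):
    """Map each character to a GID; unmapped characters fall back to GID 0."""
    return [CMAP.get(ord(ch), NOTDEF_GID) for ch in text]


def glyph_names_for_text(text):
    """Retrieve the glyphs (here just their names) by GID."""
    return [GLYPH_ORDER[gid] for gid in gids_for_text(text)]


if __name__ == "__main__":
    print(gids_for_text("AB Z"))
    print(glyph_names_for_text("AB Z"))
```

"Z" has no cmap entry in this toy font, so it resolves to GID 0 and the .notdef glyph is what would be drawn. Note that .notdef itself never appears as a key in the cmap.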
  • Interesting. I thought GID was still pretty arbitrary and didn't know it made any difference in how glyphs were displayed in palettes. I don't think FontForge gives you a way to control the order. For a while (back when I used FontForge) I did that via the Adobe Font Development Kit, but I gave it up because I couldn't see the point. But maybe there is one.

  • André G. Isaak
    André G. Isaak Posts: 634
    edited October 2021
    It's definitely worth sorting the glyphs sensibly if possible. As a user, I always find it aggravating when the glyph palette has characters scattered around in random order. For encoded glyphs, Unicode order is generally fine, though grouping by script also makes sense. For unencoded glyphs, I prefer these to be grouped together sensibly (i.e. all small caps together, ligatures together, swash characters together, etc.) and alphabetized within groups (or sorted numerically for alternate figures).

    Note that I rarely enter characters/glyphs via the glyph palette. But it's extremely useful for getting an overview of what's available in a given font.

    I've found lots of fonts where all glyphs are sorted alphabetically by glyph name. This is particularly irritating since figures and punctuation are scattered throughout, ß comes next to g, etc.
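One way to picture the difference between the two orderings is a small sort key in Python. This is a sketch under assumptions: the glyph list is invented, and grouping unencoded glyphs by the suffix after the first period is just one possible reading of "grouped together sensibly".

```python
# (glyph name, Unicode value or None for unencoded). Illustrative data only.
GLYPHS = [
    ("g", 0x67), ("germandbls", 0xDF), ("one", 0x31), ("a.sc", None),
    ("A", 0x41), (".notdef", None), ("one.tnum", None), ("b.sc", None),
    ("h", 0x68),
]


def palette_sort_key(glyph):
    """.notdef first, then encoded glyphs by Unicode, then unencoded by suffix."""
    name, codepoint = glyph
    if name == ".notdef":
        return (0, 0, "")
    if codepoint is not None:
        return (1, codepoint, name)
    # Unencoded: group by suffix (sc, tnum, ...), alphabetize within the group.
    _base, _dot, suffix = name.partition(".")
    return (2, suffix, name)


by_name = sorted(name for name, _ in GLYPHS)
by_palette = [name for name, _ in sorted(GLYPHS, key=palette_sort_key)]

if __name__ == "__main__":
    print(by_name)
    print(by_palette)
```

In the name-sorted list, germandbls lands between g and h and the figures end up among the letters; with the key above, encoded glyphs follow Unicode order and the small caps stay together.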
  • > I've found lots of fonts where all glyphs are sorted alphabetically by glyph name. This is particularly irritating since figures and punctuation are scattered throughout, ß comes next to g, etc.

    I think I used to do this. Quite right: it made no sense at all. :/

  • WAY KYI
    WAY KYI Posts: 130
    edited October 2021
    Wow... it is so nice of you all to lend a helping hand. Now I am a lot smarter :smile: So, I went back to FF and found something about CID (but it is very buggy and crashed on me when I tried things there). I am OK now that I know how to create a .notdef, and I won't worry about it any more. Thanks
  • While not familiar with FontForge, I suspect that the CID option you found is most likely for creating CID-keyed fonts. CID-keyed fonts are used for Chinese/Japanese/Korean fonts.
  • John Hudson
    John Hudson Posts: 3,229
    CID-keyed fonts are a special PostScript font format for efficient packaging of East Asian characters by having the character encoding and name data stored externally to the font and mapped from the glyph index order. So yes, GID is important to CID, but they are not the same thing. If FontForge is able to make CID-keyed fonts, though, that suggests there is some way to manage glyph order.
  • Do not attempt to use FontForge to make CID-keyed fonts. (Well, of course, we cannot demand you do or not do anything, it's free software, so this is just a strong recommendation hehe.)

    Our FAQ states in §D6:

    CID-keyed fonts are a legacy font format, including so-called "SFNT-wrapped" CID-keyed fonts. Of course, we still will merge code that fixes them, but making new ones (especially) is not a development priority.

    To use a CID-keyed font as the base of a new font, you should flatten it first. See №3955 for more information.

    There is truly nothing but misery to be had in using FontForge on a CID-keyed font for anything but flattening it. The code is very buggy; I have no idea why George thought his implementation of certain functions would be better than no function at all.
  • WAY KYI
    WAY KYI Posts: 130
    edited October 2021
    > Do not attempt to use FontForge to make CID-keyed fonts. (Well, of course, we cannot demand you do or not do anything, it's free software, so this is just a strong recommendation hehe.)

    Hi Fredrick, thank you very much for your info. First, I love FF very much, not because it is free but because its workspace and navigation are simple, clean and uncluttered. The best thing is that OpenType features are so easy to create without knowing a thing about scripting. I used Fontographer (a long time ago), FontCreator, VOLT and others for OpenType features, and I got confused a lot there. The FF Lookups UI is so easy that I created Ligatures, Stylistic Alternates, aalt, Marks and Kerning classes in no time without needing to know how to code them. That is the best of FF. The FF developers also helped me a lot with my silly or naïve questions. I recommend FF to friends not only because it is free but also because it makes high-end font functionality so easy to master. It certainly is the best for a first-time learner of type creation like me. Please keep up the good work.
  • John Hudson
    John Hudson Posts: 3,229
    Apparently there is a way to define a glyph order in FontForge via a custom encoding text file:


    The format looks fairly straightforward, although more complex than that for FontLab where Unicode mapping and glyph order are managed in separate files, making the latter a super simple list of glyph names. [FontLab also gives users the ability to drag and drop glyphs into a specific order in one font window mode, which is great.]

    Getting back to the original /.notdef/ topic, I am guessing that FontForge might have code that automatically locates a glyph with that name at GID 0?
  • Possibly a silly question, but why is glyph order within the binary font something that designers expect to control? There are both cmap and layout (coverage table) optimisations available by reordering the glyphs and assigning different GIDs and thereby getting different table formats. Is it about order in character pickers?
  • Paul van der Laan
    Paul van der Laan Posts: 242
    edited October 2021
    > Possibly a silly question, but why is glyph order within the binary font something that designers expect to control? There are both cmap and layout (coverage table) optimisations available by reordering the glyphs and assigning different GIDs and thereby getting different table formats. Is it about order in character pickers?

    Character pickers can be a reason, yes. InDesign character palette can be sorted by unicode index or by GID. The latter can be much easier to navigate when the font contains lots of unencoded glyphs -- such as sets of numerals -- that can be grouped together in a custom order.

    Another reason for us is proofing tools that use glyph order to display all glyphs in a font. We always create specific glyph layouts for all our fonts, and it is a great help to see all glyphs in the same order in our editors as in our proofing tools.
  • The order in the font editor does not need to match the export order, but keeping the same order in both cases makes it easier to navigate and proof an exported font file. So while GIDs could be reorganized on export for a smaller cmap table, the size benefits were never significant in my tests.

    I can imagine a compressed export option that would be enabled for webfonts or similar environments where byte count is paramount.
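To see why glyph order can affect cmap size at all: a format 4 cmap subtable can cover a run of consecutive code points that map to consecutive GIDs with a single segment and a constant delta. The Python sketch below counts such runs as a rough proxy only. The function name and sample mappings are invented for the example, and real compilers have other options (such as a glyph index array), so this is not a measurement of actual table size.

```python
def count_delta_runs(cmap):
    """Count runs where code points and GIDs both increase by exactly 1.

    `cmap` maps Unicode values to GIDs. Each run could, in principle,
    be stored as one constant-delta segment.
    """
    runs = 0
    prev_cp = prev_gid = None
    for cp in sorted(cmap):
        gid = cmap[cp]
        if prev_cp is None or cp != prev_cp + 1 or gid != prev_gid + 1:
            runs += 1  # a new run starts here
        prev_cp, prev_gid = cp, gid
    return runs


# Illustrative data: A-D with GIDs in Unicode order vs. a shuffled order.
unicode_ordered = {0x41: 1, 0x42: 2, 0x43: 3, 0x44: 4}
shuffled = {0x41: 3, 0x42: 1, 0x43: 4, 0x44: 2}

if __name__ == "__main__":
    print(count_delta_runs(unicode_ordered))
    print(count_delta_runs(shuffled))
```

The ordered mapping collapses into one run while the shuffled one needs four. Whether that difference matters in a real font is a separate question, as the post above notes.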
  • WAY KYI
    WAY KYI Posts: 130
    > Apparently there is a way to define a glyph order in FontForge via a custom encoding text file:
    >
    > The format looks fairly straightforward, although more complex than that for FontLab where Unicode mapping and glyph order are managed in separate files, making the latter a super simple list of glyph names. [FontLab also gives users the ability to drag and drop glyphs into a specific order in one font window mode, which is great.]
    >
    > Getting back to the original /.notdef/ topic, I am guessing that FontForge might have code that automatically locates a glyph with that name at GID 0?

    Just to report to all of you: there is an automatic GID function in FF. You reach it through the menu Encoding / Reencode / Glyph Order. It creates a GID view with GID 0 = .notdef, GID 1 = NULL, and so on. Thanks
  • Ray Larabie
    Ray Larabie Posts: 1,436
    There's one situation I deal with where I need to control the glyph order: fonts for closed caption systems for North American television sets. Some chipsets still require a specific CCTV encoding from the 1970s.