Do you sort your glyphs?

Adam Jagosz
Adam Jagosz Posts: 689
edited January 2020 in Technique and Theory
Do you sort your glyphs in any particular order during development and/or in the final output file?
If so, do you sort by Unicode or according to a custom order (alphabetically, logically)? What about unencoded glyphs?
Or are you not bothered by this at all, since it technically doesn’t affect the user’s experience (all that much... or does it? I know the order will be used in InDesign’s Glyphs panel, and... might it affect the font’s performance somehow?).

Comments

  • Chris Lozos
    Chris Lozos Posts: 1,458
    I make a custom file to sort my glyphs so that they are easier for me to work on.
  • I make a custom file to sort my glyphs so that they are easier for me to work on.

    Same here.

    There are assorted technical implications of glyph ordering.

    Some tools will display glyphs in glyph ID (GID) order. Besides the InDesign Glyphs panel, most tools that poke at the insides of fonts show the glyphs in GID order.

    If you are generating a font in OpenType CFF (PostScript outlines, .otf), AND you are using CFF and not CFF2, AND you keep the first several hundred glyphs in the order specified in the CFF spec, you can save on file size because glyph names won’t be needed. Of course, this only works to the extent you have those same standard glyphs with the same names. Character set is ~ Adobe Latin 1 + Adobe CE as I recall. I think it is something like ~ 340 glyphs.

    Keeping matching glyphs in contiguous blocks helps with writing OT code in AFDKO syntax, for things such as mapping caps to small caps, or lining numerals to oldstyle numerals.

    I can’t think of any other benefits of glyph ordering offhand, but I may well be forgetting something.

  • Keeping matching glyphs in contiguous blocks helps with writing OT code in AFDKO syntax, for things such as mapping caps to small caps, or lining numerals to oldstyle numerals.
    It is not just about the feature syntax: some lookup types can store ranges, and that can save some space. The same goes for cmap.
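    To put rough numbers on the range point, here is a minimal sketch that compares the byte size of the two OpenType Coverage table formats for the same set of glyphs. Format 1 lists every glyph ID, format 2 stores start/end ranges. The glyph IDs are made up for illustration, and the helper names `toRanges` and `coverageSizes` are hypothetical, not part of any font tool.

```javascript
// Group glyph IDs into runs of consecutive values.
function toRanges(glyphIds) {
  const sorted = [...new Set(glyphIds)].sort((a, b) => a - b);
  const ranges = [];
  for (const gid of sorted) {
    const last = ranges[ranges.length - 1];
    if (last && gid === last.end + 1) {
      last.end = gid; // extends the current run
    } else {
      ranges.push({ start: gid, end: gid }); // starts a new run
    }
  }
  return ranges;
}

// Byte sizes of the two Coverage table formats for a set of glyph IDs.
function coverageSizes(glyphIds) {
  const count = new Set(glyphIds).size;
  const rangeCount = toRanges(glyphIds).length;
  return {
    // format + glyphCount (2 bytes each), then one uint16 per glyph
    format1: 4 + 2 * count,
    // format + rangeCount, then start/end/startCoverageIndex per range
    format2: 4 + 6 * rangeCount,
  };
}

// Hypothetical glyph IDs: 26 glyphs stored as one block vs. scattered.
const contiguous = Array.from({ length: 26 }, (_, i) => 100 + i);
const scattered = Array.from({ length: 26 }, (_, i) => 100 + 3 * i);

console.log(coverageSizes(contiguous)); // { format1: 56, format2: 10 }
console.log(coverageSizes(scattered));  // { format1: 56, format2: 160 }
```

    A compiler would normally pick the smaller format, so in this made-up case the contiguous block costs 10 bytes and the scattered one 56.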
  • Paul Miller
    Paul Miller Posts: 273
    edited January 2020
    High Logic Font Creator has something called 'Design Mode' which sorts the glyphs so that all the capital 'A' based glyphs are together, then capital 'B' and so on.  This makes it easy to work on because the base glyph is followed by its composites.
    Before release I sort by Unicode code point.  But during development I change the sort order on the fly to make whatever I'm doing convenient.
  • I used to customize my glyph sorting back in the Fontlab days. Now I just use the default sorting in Glyphs.
  • I always manage the ordering of glyphs, usually in some way that I hope will be intuitive to users but mostly helpful to myself and collaborators during development. Every font project I work on begins with a spreadsheet to keep track of naming, encoding, OTL features, weight progressions, prioritisation and completion, and any other data related to the development, so this provides the basis for the glyph ordering.
    John, do you have a default order you use for fonts that are not particularly complex or with specific features? Is there a sort of standard?
    Mostly, for purely typographic variants, such as Small Caps, case-specific punctuation and currency symbols, etc., is there a sort of standard on where to order them (at the end of the file?).
  • Thanks much. In case one wants to use a pre-defined method, would it be acceptable to sort them by Unicode? Not ideal, of course, but I am still unsure how to handle the glyph ID/index in Fontlab 6 and 7.
  • Sorting by Unicode only works for glyphs that are encoded in Unicode. So what happens to unencoded glyphs?

    Also, Unicode based sorting has some odd effects in terms of how some writing systems get broken up between basic and extended blocks, with other stuff in between. And also how basic ASCII separates from other stuff. For example, my custom sorting keeps typographic curly quotes right after the ASCII typewriter quote marks, and keeps superscript 1, 2 and 3 next to the other superscripts, etc.


  • Adam Jagosz
    Adam Jagosz Posts: 689
    edited January 2020
    I can see some benefits of Unicode-based sorting: it’s easily achievable, commonly implemented, and... familiar to those who know the ins and outs of Unicode >:)
    FontLab 7, when sorting by Unicode, sorts the remainder at the end by Unicode again, deducing it from the pre-suffix portion of the glyph name. I wish there was a command to generate an Encoding file from within the program, based on the current font, as reordering glyphs by dragging is easy and convenient.
    @Claudio Piccinini FontLab 7 has some mechanism described here https://help.fontlab.com/fontlab/7/manual/Custom-data-files-and-locations/#encoding-files-enc but I haven't tried that yet.
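    The "encoded glyphs first, then the rest by the Unicode of their base name" idea can be sketched roughly as follows. This is not FontLab's actual algorithm, only a guess at the behaviour described above. The glyph list, the `unicode` field and the function names are hypothetical, and `.notdef` is pinned to the front because it has to stay at glyph ID 0.

```javascript
// Deduce a code point for an unencoded glyph from the part of its name
// before the first period (e.g. "a.sc" -> the code point of "a").
function deducedCodePoint(glyph, byName) {
  const base = glyph.name.split(".")[0];
  const baseGlyph = byName.get(base);
  if (baseGlyph && baseGlyph.unicode != null) return baseGlyph.unicode;
  // Fall back to uniXXXX / uXXXXX style names.
  const m = /^uni([0-9A-F]{4})$/.exec(base) || /^u([0-9A-F]{4,6})$/.exec(base);
  return m ? parseInt(m[1], 16) : Infinity; // unknown names go last
}

function sortByUnicode(glyphs) {
  const byName = new Map(glyphs.map((g) => [g.name, g]));
  const key = (g) => {
    if (g.name === ".notdef") return [-1, 0];     // always first
    if (g.unicode != null) return [0, g.unicode]; // encoded glyphs
    return [1, deducedCodePoint(g, byName)];      // unencoded remainder
  };
  return [...glyphs].sort((a, b) => {
    const [groupA, cpA] = key(a);
    const [groupB, cpB] = key(b);
    if (groupA !== groupB) return groupA - groupB;
    if (cpA !== cpB) return cpA < cpB ? -1 : 1;
    return a.name < b.name ? -1 : a.name > b.name ? 1 : 0; // tie-break by name
  });
}

// Hypothetical font contents in arbitrary order.
const glyphs = [
  { name: "a.sc", unicode: null },
  { name: "B", unicode: 0x42 },
  { name: "A.alt", unicode: null },
  { name: ".notdef", unicode: null },
  { name: "a", unicode: 0x61 },
  { name: "A", unicode: 0x41 },
];

console.log(sortByUnicode(glyphs).map((g) => g.name).join(" "));
// .notdef A B a A.alt a.sc
```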
  • There is a FontLab VI/7 script here to generate an encoding file from the current font ordering:
    https://forum.fontlab.com/fontlab-vi/exporting-custom-encoding/

  • Adam Jagosz
    Adam Jagosz Posts: 689
    edited January 2020
    To be honest, the reason I asked this was the dispersion of the non-European portion of Latin: particularly how far apart, and in what odd order, the hooked and other pan-African letters are placed. Within European Latin there is some nuisance too, but it is not nearly as disconcerting.
    > I used to customize my glyph sorting back in the Fontlab days. Now I just use the default sorting in Glyphs.
    Which is? Since Christian uses Glyphs, I suppose it's the order found in Cormorant? (letters immediately followed by diacritics, then alternates grouped by feature, I presume).
    For some reason I find sorting /Thorn after /P slightly offensive (even if it's convenient for development), similarly to sorting Schwa after S.
    I never sorted (except for some unencoded alternates that I just had to rein in) and I pretty much learned bits and pieces of Unicode order as what I assumed was “part of the lore”. Now that I'm considering it, I would go for something along the lines of what John suggested, but maybe move base letters / accents around like this?
    • A-Z
    • any caps that are not automated away or require adjustment, so Ą Ɓ, Ɔ, etc.
    • a-z
    • the same for LC
    • punctuation, numerals, operators, symbols, currencies, you name it
    • combining marks
    and only now
    • UC accents
    • LC accents
    • spacing marks (based on combining)
    In other words, I'd push the parts that don't require frequent adjustment to the bottom.
    Then again, what requires adjustment and what does not depends on the design, and you've got to draw the line somewhere. So maybe having all accents together with base letters is the most logical way. And having accents near bases is neat, since you can easily see if the changes are propagated or if something broke. :/
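    For what it's worth, an order like the one listed above boils down to ranking each glyph by a bucket and sorting on that rank. A minimal sketch follows. The bucket labels and the `category` field are invented for illustration (a real font editor would have to supply that classification), and the order within a bucket is left as it was, relying on the sort being stable.

```javascript
// Bucket order following the list above; labels are hypothetical.
const BUCKET_ORDER = [
  "uc-base",      // A-Z
  "uc-special",   // caps that need manual adjustment
  "lc-base",      // a-z
  "lc-special",   // the same for lowercase
  "other",        // punctuation, numerals, operators, symbols, currencies
  "combining",    // combining marks
  "uc-accented",  // UC accents
  "lc-accented",  // LC accents
  "spacing-mark", // spacing marks (based on combining)
];

function sortByBucket(glyphs) {
  const rank = (g) => {
    const i = BUCKET_ORDER.indexOf(g.category);
    return i === -1 ? BUCKET_ORDER.length : i; // unknown buckets go last
  };
  // Array.prototype.sort is stable, so glyphs keep their relative
  // order inside each bucket.
  return [...glyphs].sort((a, b) => rank(a) - rank(b));
}

// Hypothetical glyphs in arbitrary order.
const sample = [
  { name: "Aacute", category: "uc-accented" },
  { name: "a", category: "lc-base" },
  { name: "acutecomb", category: "combining" },
  { name: "A", category: "uc-base" },
  { name: "Aogonek", category: "uc-special" },
  { name: "period", category: "other" },
  { name: "aacute", category: "lc-accented" },
  { name: "acute", category: "spacing-mark" },
];

console.log(sortByBucket(sample).map((g) => g.name).join(" "));
// A Aogonek a period acutecomb Aacute aacute acute
```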
  • Adam Jagosz
    Adam Jagosz Posts: 689
    edited January 2020

    If you are generating a font in OpenType CFF (PostScript outlines, .otf), AND you are using CFF and not CFF2, AND you keep the first several hundred glyphs in the order specified in the CFF spec, you can save on file size because glyph names won’t be needed. Of course, this only works to the extent you have those same standard glyphs with the same names. Character set is ~ Adobe Latin 1 + Adobe CE as I recall. I think it is something like ~ 340 glyphs.
    From uni0000 up to /longs uni017F, excluding the control characters uni0001-uni001F and uni007f-uni009F (Type 1 Adobe Standard encoding), there's 338 glyphs, so that seems about right. That magnitude of file size saving must be only relevant in embedded systems though, right? Or maybe webfonts.
    Keeping matching glyphs in contiguous blocks helps with writing OT code in AFDKO syntax, for things such as mapping caps to small caps, or lining numerals to oldstyle numerals.
    But the syntax only has ranges based on glyph names, e.g. A-Z or uni2000-uni2009 (it doesn't even work with hex numbers, so it breaks at uni200A, and in FontLab even decimal fails most of the time)... right? It doesn't depend on glyph order. Maybe it can affect the conciseness of the compiled features though?
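    To illustrate the name-based behaviour, here is a simplified sketch of how a range between two glyph names could be expanded. It assumes the rule is that both names have the same length and differ either in a single letter of the same case or in a run of up to three decimal digits. That is my reading of the feature file syntax, not a copy of any compiler's code, and `expandNameRange` is a hypothetical helper.

```javascript
// Expand a glyph-name range such as A-Z or uni2000-uni2009 purely
// from the names, without looking at glyph order.
function expandNameRange(first, last) {
  if (first.length !== last.length) {
    throw new Error("range endpoints must have the same length");
  }
  // Locate the span of characters in which the two names differ.
  let start = 0;
  while (start < first.length && first[start] === last[start]) start++;
  if (start === first.length) return [first]; // identical names
  let end = first.length;
  while (end > start && first[end - 1] === last[end - 1]) end--;

  const prefix = first.slice(0, start);
  const suffix = first.slice(end);
  const a = first.slice(start, end);
  const b = last.slice(start, end);
  const names = [];

  // Case 1: a single letter, both endpoints in the same case.
  const sameCaseLetters =
    (/^[A-Z]$/.test(a) && /^[A-Z]$/.test(b)) ||
    (/^[a-z]$/.test(a) && /^[a-z]$/.test(b));
  if (sameCaseLetters && a < b) {
    for (let c = a.charCodeAt(0); c <= b.charCodeAt(0); c++) {
      names.push(prefix + String.fromCharCode(c) + suffix);
    }
    return names;
  }

  // Case 2: a run of up to three decimal digits (hex digits not allowed).
  if (/^[0-9]{1,3}$/.test(a) && /^[0-9]{1,3}$/.test(b) && Number(a) < Number(b)) {
    for (let n = Number(a); n <= Number(b); n++) {
      names.push(prefix + String(n).padStart(a.length, "0") + suffix);
    }
    return names;
  }

  throw new Error(`cannot expand range ${first}-${last}`);
}

console.log(expandNameRange("A", "Z").length);           // 26
console.log(expandNameRange("uni2000", "uni2009").pop()); // uni2009
// expandNameRange("uni2000", "uni200A") throws: "A" is not a decimal digit.
```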
  • Thomas Phinney
    Thomas Phinney Posts: 2,918
    edited January 2020
    Certainly, the file size savings of being able to omit (edit: SOME) glyph names is usually a pretty tiny fraction (aside from a few weird special case situations, such as Last Resort or Adobe Blank). Whether you or your client(s) care, I can’t say.  ;)

    My recollection is that in order for the AFDKO code to actually compile and function as expected, the glyphs have to be ordered appropriately in the input font. But even if not, certainly it would make for smaller compiled code.
  • You mean it can omit all glyph names, even for glyphs beyond the initial 338/340?
    I made a test feature defined as sub [a - z] by [A - Z]; and it compiled within FontLab even when I rearranged the glyphs. But dunno.
  • Theunis de Jong
    Theunis de Jong Posts: 112
    edited January 2020
    I took great care to manually sort my (pre-Minion 3) phonetic extension font. In Unicode order, it's pretty much random; new glyphs were added to the standard by popular demand, and with manual sorting I got all a's together, all e's, and so on. It helped if I had to insert an "r but upside down and with a long leg" – it'd be near all other r's.

    Now, with Minion 3, I can just apply the same font to all text if the correct Unicode was used. But manually browsing for the right character got slower.
  • You mean it can omit all glyph names, even for glyphs beyond the initial 338/340?
    I made a test feature defined as sub [a - z] by [A - Z]; and it compiled within FontLab even when I rearranged the glyphs. But dunno.

    1) No, sorry I wasn’t clear enough in the follow-up. Just that set, and only if they are in that order, etc.

    2) Cool that it compiles. The resulting font will be the tiniest bit larger, then, is all.
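    A rough way to see where the "tiniest bit larger" comes from: a GSUB single substitution can be stored as one constant glyph ID delta (format 1) when every output glyph sits the same distance from its input, or as an explicit list of output glyphs (format 2) otherwise. The sketch below only counts the subtable bytes (Coverage excluded) for made-up glyph IDs. Whether a given compiler actually picks format 1 is an assumption here, and `singleSubstSize` is a hypothetical helper.

```javascript
// pairs: [[inputName, outputName], ...]; gidOf: Map of name -> glyph ID.
function singleSubstSize(pairs, gidOf) {
  // deltaGlyphID is applied modulo 65536, so compare deltas that way.
  const deltas = new Set(
    pairs.map(([i, o]) => (gidOf.get(o) - gidOf.get(i) + 65536) % 65536)
  );
  const usesDelta = deltas.size === 1;
  return {
    usesDelta,
    // Format 1: substFormat + coverageOffset + deltaGlyphID.
    // Format 2: substFormat + coverageOffset + glyphCount + one uint16 each.
    bytes: usesDelta ? 6 : 6 + 2 * pairs.length,
  };
}

const lower = "abcdefghijklmnopqrstuvwxyz".split("");
const upper = lower.map((c) => c.toUpperCase());
const pairs = lower.map((name, i) => [name, upper[i]]); // sub [a-z] by [A-Z]

// Hypothetical glyph IDs: a-z at 10..35, A-Z at 40..65, both in order.
const ordered = new Map([
  ...lower.map((n, i) => [n, 10 + i]),
  ...upper.map((n, i) => [n, 40 + i]),
]);

// Same font with A and B swapped in the glyph order.
const rearranged = new Map(ordered);
rearranged.set("A", ordered.get("B"));
rearranged.set("B", ordered.get("A"));

console.log(singleSubstSize(pairs, ordered));    // { usesDelta: true, bytes: 6 }
console.log(singleSubstSize(pairs, rearranged)); // { usesDelta: false, bytes: 58 }
```

    In this toy case the rearranged order costs 52 extra bytes for the lookup, which fits the description of a very small difference.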
  • There is a FontLab VI/7 script here to generate an encoding file from the current font ordering:
    https://forum.fontlab.com/fontlab-vi/exporting-custom-encoding/

    Thanks much. I have installed the script correctly; it took me a bit to figure out that I had to place the generated .enc file in the Application support folder.
    It would definitely be handy to have it included as default functionality in the next Fontlab 7 update…
  • Ray Larabie
    Ray Larabie Posts: 1,441
    Does anyone know if there's an official Cyrillic order? The default order in the Unicode table is a jumble. The caps and lowercase follow a different order. After the main set, each cap is followed by lowercase. I know the Russian alphabetical order but when you consider 50 languages use Cyrillic, they must all have their own order. When I'm working on them, I sort by visual similarity but not sure how to sort the index for the exported font.
  • A starting point might be trying to sort the Cyrillic using JavaScript; it does a pretty neat job.

    "ЀЁЂЃЄЅІЇЈЉЊЋЌЍЎЏАБВГДЕЖЗИЙКЛМНОПРСТУФХЦЧШЩЪЫЬЭЮЯабвгдежзийклмнопрстуфхцчшщъыьэюяѐёђѓєѕіїјљњћќѝўџѠѡѢѣѤѥѦѧѨѩѪѫѬѭѮѯѰѱѲѳѴѵѶѷѸѹѺѻѼѽѾѿҀҁ҂҃҄҅҆҇҈҉ҊҋҌҍҎҏҐґҒғҔҕҖҗҘҙҚқҜҝҞҟҠҡҢңҤҥҦҧҨҩҪҫҬҭҮүҰұҲҳҴҵҶҷҸҹҺһҼҽҾҿӀӁӂӃӄӅӆӇӈӉӊӋӌӍӎӏӐӑӒӓӔӕӖӗӘәӚӛӜӝӞӟӠӡӢӣӤӥӦӧӨөӪӫӬӭӮӯӰӱӲӳӴӵӶӷӸӹӺӻӼӽӾӿ"
    .split('')
    .sort((a,b) => a.localeCompare(b, 'ru-Cyrl', { caseFirst: 'upper' }))
    .join('')

    yields

    "҈҉҆҅҄҇҃҂АаӐӑӒӓӘәӚӛӔӕБбВвГгЃѓҐґҒғӺӻҔҕӶӷДдЂђҘҙЕеЀѐӖӗЁёЄєЖжӁӂӜӝҖҗЗзӞӟЅѕӠӡИиЍѝӤӥӢӣҊҋІіЇїЙйЈјКкЌќҚқӃӄҠҡҞҟҜҝЛлӅӆЉљМмӍӎНнӉӊҢңӇӈҤҥЊњОоӦӧӨөӪӫПпҦҧҀҁРрҎҏСсҪҫТтҬҭЋћУуЎўӰӱӲӳӮӯҮүҰұѸѹФфХхӼӽӾӿҲҳҺһѠѡѾѿѼѽѺѻЦцҴҵЧчӴӵҶҷӋӌҸҹҼҽҾҿЏџШшЩщЪъЫыӸӹЬьҌҍѢѣЭэӬӭЮюЯяѤѥѦѧѪѫѨѩѬѭѮѯѰѱѲѳѴѵѶѷҨҩӀӏ"
  • DTL file system
    In the relatively ancient IKARUS-based file system, which is used in the DTL/URW font tools, the glyph data is ‘physically’ separated from the character naming and encoding information. The glyphs are stored in a database under numbers that correspond to the entries in the Character Layout (.cha) file. In principle, the order in the database does not matter: it can be easily re-ordered in any way via (entries in) a .cha file. This means that the order of the characters in the exported font is independent of the storage order.
    Character Layout file
    The structure of a .cha file (simple text) is fairly simple. The conversion information from one numbering system to another is included in several columns. The content of each column is characterized by a keyword (of which the order is arbitrary) and the entries for the different columns are separated by a semicolon. Each line can have a length of up to 255 characters.
    [Images: FoundryMaster_encoding]