Do you sort your glyphs?

Adam Jagosz · January 2020

Do you sort your glyphs in any particular order during development and/or in the final output file?

If so, do you sort by Unicode or according to a custom order (alphabetically, logically)? What about unencoded glyphs?

Or are you not bothered by this at all, since it technically doesn’t affect the user’s experience (all that much... or does it? I know the order will be used in InDesign’s Glyphs panel, and... might it affect the font’s performance somehow?).

Chris Lozos · January 2020

I make a custom file to sort my glyphs so that they are easier for me to work on.

Thomas Phinney · January 2020

> I make a custom file to sort my glyphs so that they are easier for me to work on.

Same here.

There are assorted technical implications of glyph ordering.

Some tools will display glyphs in glyph ID (GID) order. Besides the InDesign Glyphs panel, most tools that poke at the insides of fonts show the glyphs in GID order.

If you are generating a font in OpenType CFF (PostScript outlines, .otf), AND you are using CFF and not CFF2, AND you keep the first several hundred glyphs in the order specified in the CFF spec, you can save on file size because glyph names won’t be needed. Of course, this only works to the extent you have those same standard glyphs with the same names. Character set is ~ Adobe Latin 1 + Adobe CE as I recall. I think it is something like ~ 340 glyphs.

Keeping matching glyphs in contiguous blocks helps with writing OT code in AFDKO syntax, for things such as mapping caps to small caps, or lining numerals to oldstyle numerals.

I can’t think of any other benefits of glyph ordering offhand, but I may well be forgetting something.

Georg Seifert · January 2020

Thomas Phinney said:

> Keeping matching glyphs in contiguous blocks helps with writing OT code in AFDKO syntax, for things such as mapping caps to small caps, or lining numerals to oldstyle numerals.

It is not just about the feature syntax: some lookup types can store glyph ranges, and that can save some space. The same goes for cmap.
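To make the point about ranges concrete, here is a minimal sketch that compares the byte size of the two OpenType Coverage table formats for the same number of glyphs. The glyph IDs are made up for illustration, and the helper names `toRanges` and `coverageSizes` are hypothetical; only the per-record sizes (2 bytes per glyph in format 1, 6 bytes per range record in format 2, plus a 4-byte header) come from the OpenType table layout.

```javascript
// Collapse a list of glyph IDs into contiguous [start, end] ranges.
function toRanges(glyphIds) {
  const sorted = [...new Set(glyphIds)].sort((a, b) => a - b);
  const ranges = [];
  for (const gid of sorted) {
    const last = ranges[ranges.length - 1];
    if (last && gid === last[1] + 1) {
      last[1] = gid; // extend the current range
    } else {
      ranges.push([gid, gid]); // start a new range
    }
  }
  return ranges;
}

// Byte sizes of the two Coverage table formats for a set of glyph IDs.
function coverageSizes(glyphIds) {
  const count = new Set(glyphIds).size;
  const ranges = toRanges(glyphIds);
  return {
    format1: 4 + 2 * count,         // header + one uint16 per glyph
    format2: 4 + 6 * ranges.length, // header + one 6-byte record per range
  };
}

// Hypothetical GIDs for 26 small caps: one contiguous block vs. scattered.
const contiguous = Array.from({ length: 26 }, (_, i) => 400 + i);
const scattered = Array.from({ length: 26 }, (_, i) => 400 + 3 * i);

console.log(coverageSizes(contiguous)); // { format1: 56, format2: 10 }
console.log(coverageSizes(scattered));  // { format1: 56, format2: 160 }
```

A compiler would normally pick the smaller format, so in this made-up case the contiguous block costs 10 bytes and the scattered one 56. The saving per table is small, which fits the "some space" wording above.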

Paul Miller · January 2020

High-Logic FontCreator has something called 'Design Mode', which sorts the glyphs so that all the capital 'A'-based glyphs are together, then capital 'B', and so on. This makes it easy to work on because each base glyph is followed by its composites.

Before release I sort by Unicode code point. But during development I change the sort order on the fly to make whatever I'm doing convenient.

James Puckett · January 2020

I used to customize my glyph sorting back in the FontLab days. Now I just use the default sorting in Glyphs.

John Hudson · January 2020

I always manage the ordering of glyphs, usually in some way that I hope will be intuitive to users but mostly helpful to myself and collaborators during development. Every font project I work on begins with a spreadsheet to keep track of naming, encoding, OTL features, weight progressions, prioritisation and completion, and any other data related to the development, so this provides the basis for the glyph ordering.

In large projects that are updated over a period of years, which is typical of the work we've done for Microsoft, the idea of logical and intuitive ordering breaks down because new additions are appended to the end of the existing glyph set, not inserted into the sequence where they would logically belong. But this is a situation in which controlling the glyph ordering is, if anything, even more important, since we want to avoid reordering existing glyphs after a font has shipped.
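The append-only rule can be sketched as a small helper. This is only an illustration under assumptions: the function name `extendGlyphOrder` and the glyph lists are hypothetical, and treating a removed glyph as an error is a design choice of this sketch, not something stated above.

```javascript
// Keep every shipped glyph at its existing position and append new glyphs
// at the end, in the order they appear in the working glyph set.
function extendGlyphOrder(shippedOrder, currentGlyphs) {
  const shipped = new Set(shippedOrder);
  const current = new Set(currentGlyphs);

  // Assumption of this sketch: dropping a shipped glyph is an error,
  // because it would shift the glyph IDs of everything after it.
  const missing = shippedOrder.filter((name) => !current.has(name));
  if (missing.length > 0) {
    throw new Error(`Shipped glyphs removed: ${missing.join(', ')}`);
  }

  const additions = currentGlyphs.filter((name) => !shipped.has(name));
  return [...shippedOrder, ...additions];
}

const shippedOrder = ['.notdef', 'A', 'B', 'a', 'b'];
const workingOrder = ['.notdef', 'A', 'Aogonek', 'B', 'a', 'aogonek', 'b'];

console.log(extendGlyphOrder(shippedOrder, workingOrder));
// [ '.notdef', 'A', 'B', 'a', 'b', 'Aogonek', 'aogonek' ]
```

The working file can stay in a logical order for editing, while the exported order keeps existing glyph IDs stable.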

Claudio Piccinini · January 2020

John Hudson said:

> I always manage the ordering of glyphs, usually in some way that I hope will be intuitive to users but mostly helpful to myself and collaborators during development. Every font project I work on begins with a spreadsheet to keep track of naming, encoding, OTL features, weight progressions, prioritisation and completion, and any other data related to the development, so this provides the basis for the glyph ordering.

John, do you have a default order you use for fonts that are not particularly complex and have no special features? Is there a sort of standard?
In particular, for purely typographic variants such as small caps, case-specific punctuation, and currency symbols, is there a standard for where to place them (at the end of the file)?

John Hudson · January 2020

I tend to group glyphs by type:

• Basic alphabetic uppercase (e.g. A–Z for Latin)
• Uppercase diacritics, sorted according to basic alphabetic, followed by specials such as Eth and Thorn
• Basic alphabetic lowercase
• Lowercase diacritics etc.
[Smallcaps, following same ordering as above, and ligature forms inserted here if present.]
• Combining marks
• Punctuation
• General symbols (pilcrow, section mark etc.)
• Currency symbols
• Default numerals
• Additional numeral styles
• Math operators

Within each of these groups, I have a conventional order I use, e.g. for diacritics: grave, acute, circumflex, caron, tilde, diaeresis, dot, and so forth.
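A grouping scheme like this can be expressed as a comparator: rank by group first, then by base letter, then by accent. The sketch below is only an illustration. The group labels, the `base` and `accent` fields, and the glyph records are hypothetical, and only the first few groups and accents from the lists above are included.

```javascript
// Group and accent ranks follow the order given in the post.
const GROUP_ORDER = [
  'uppercase', 'uppercase-diacritic', 'lowercase', 'lowercase-diacritic',
  'mark', 'punctuation', 'symbol', 'currency', 'numeral', 'math',
];
const ACCENT_ORDER = [
  'grave', 'acute', 'circumflex', 'caron', 'tilde', 'dieresis', 'dotaccent',
];

// Position of a value in a list; unknown or missing values sort last.
function rank(list, value) {
  const i = list.indexOf(value);
  return i === -1 ? list.length : i;
}

function compareStrings(a, b) {
  return a < b ? -1 : a > b ? 1 : 0;
}

// Compare by group, then base letter, then accent.
function compareGlyphs(a, b) {
  return (
    rank(GROUP_ORDER, a.group) - rank(GROUP_ORDER, b.group) ||
    compareStrings(a.base, b.base) ||
    rank(ACCENT_ORDER, a.accent) - rank(ACCENT_ORDER, b.accent)
  );
}

// Hypothetical glyph records, deliberately out of order.
const glyphs = [
  { name: 'a', group: 'lowercase', base: 'a' },
  { name: 'Eacute', group: 'uppercase-diacritic', base: 'E', accent: 'acute' },
  { name: 'Atilde', group: 'uppercase-diacritic', base: 'A', accent: 'tilde' },
  { name: 'B', group: 'uppercase', base: 'B' },
  { name: 'Agrave', group: 'uppercase-diacritic', base: 'A', accent: 'grave' },
  { name: 'A', group: 'uppercase', base: 'A' },
  { name: 'Acaron', group: 'uppercase-diacritic', base: 'A', accent: 'caron' },
];

const sortedNames = [...glyphs].sort(compareGlyphs).map((g) => g.name);
console.log(sortedNames.join(' '));
// A B Agrave Acaron Atilde Eacute a
```

In practice the group and accent data would probably come from a spreadsheet or similar source rather than being typed by hand.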

The approach for complex scripts is similar, e.g. for Northern Indian scripts:

• Independent vowel letters
• Consonant letters
• Dependent vowel signs
• Other marks (candrabindu, virama, etc.)
• Consonant + vowel sign ligature forms
• Consonant conjunct forms (sorted alphabetically)
• Punctuation and symbols
• Numerals

Claudio Piccinini · January 2020

Thanks much. If one wants to use a predefined method, would it be acceptable to sort them by Unicode? Not ideal, of course, but I am still unsure how to handle the glyph ID/index in FontLab VI and 7.

Thomas Phinney · January 2020

Sorting by Unicode only works for glyphs that are encoded in Unicode. So what happens to unencoded glyphs?

Also, Unicode-based sorting has some odd effects: some writing systems get broken up between basic and extended blocks, with other stuff in between, and basic ASCII gets separated from related characters. For example, my custom sorting keeps typographic curly quotes right after the ASCII typewriter quote marks, and keeps superscript 1, 2 and 3 next to the other superscripts, etc.
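A quick way to see this effect is to sort a handful of related characters by code point. This is a small sketch; the code points are from the Unicode charts, while the labels are informal descriptions rather than official character names.

```javascript
// [code point, informal label]
const samples = [
  [0x00B9, 'superscript one'],
  [0x00B2, 'superscript two'],
  [0x00B3, 'superscript three'],
  [0x2074, 'superscript four'],
  [0x0022, 'typewriter double quote'],
  [0x0027, 'typewriter single quote'],
  [0x201C, 'left curly double quote'],
  [0x201D, 'right curly double quote'],
];

// Plain Unicode order: sort numerically by code point.
const byCodePoint = [...samples].sort((a, b) => a[0] - b[0]);

for (const [cp, label] of byCodePoint) {
  const hex = cp.toString(16).toUpperCase().padStart(4, '0');
  console.log(`U+${hex} ${String.fromCodePoint(cp)} ${label}`);
}
```

In the sorted output, superscript one (U+00B9) lands after superscript two and three (U+00B2, U+00B3), superscript four (U+2074) ends up in a different block entirely, and the curly quotes sit far from the typewriter quotes.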

Adam Jagosz · January 2020

I can see some benefits of Unicode-based sorting: it’s easily achievable, commonly implemented, and... familiar to those who know the ins and outs of Unicode.

FontLab 7, when sorting by Unicode, places the unencoded remainder at the end and sorts it by Unicode again, deducing the code point from the part of the glyph name before the suffix. I wish there were a command to generate an Encoding file from within the program, based on the current font, as reordering glyphs by dragging is easy and convenient.
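The behaviour described above can be approximated in a few lines. This is a sketch of the idea only, not FontLab's actual algorithm. The glyph records and the function name `sortByUnicodeWithFallback` are hypothetical, and it assumes the suffix is everything after the first period in the glyph name.

```javascript
function compareNumbers(a, b) {
  return a < b ? -1 : a > b ? 1 : 0;
}

// Encoded glyphs first, by code point; then unencoded glyphs, ordered by
// the code point of the base name before the first "." in their name.
function sortByUnicodeWithFallback(glyphs) {
  const codeByName = new Map(
    glyphs.filter((g) => g.unicode !== null).map((g) => [g.name, g.unicode])
  );

  // Code point inferred from the base name; Infinity if it cannot be inferred.
  const inferred = (g) => {
    const base = g.name.split('.')[0];
    return codeByName.has(base) ? codeByName.get(base) : Infinity;
  };

  // .notdef is kept first, since OpenType expects it at glyph ID 0.
  const notdef = glyphs.filter((g) => g.name === '.notdef');
  const rest = glyphs.filter((g) => g.name !== '.notdef');

  const encoded = rest
    .filter((g) => g.unicode !== null)
    .sort((a, b) => compareNumbers(a.unicode, b.unicode));
  const unencoded = rest
    .filter((g) => g.unicode === null)
    .sort((a, b) =>
      compareNumbers(inferred(a), inferred(b)) ||
      (a.name < b.name ? -1 : a.name > b.name ? 1 : 0)
    );

  return [...notdef, ...encoded, ...unencoded];
}

// Hypothetical glyph set.
const font = [
  { name: 'a.sc', unicode: null },
  { name: 'a', unicode: 0x61 },
  { name: 'f_i', unicode: null },
  { name: 'one.osf', unicode: null },
  { name: 'A', unicode: 0x41 },
  { name: '.notdef', unicode: null },
  { name: 'A.ss01', unicode: null },
  { name: 'one', unicode: 0x31 },
];

const resultNames = sortByUnicodeWithFallback(font).map((g) => g.name);
console.log(resultNames.join(' '));
// .notdef one A a one.osf A.ss01 a.sc f_i
```

Glyphs whose base name is not encoded in the font, such as the ligature `f_i` here, fall to the very end in this sketch.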

@Claudio Piccinini FontLab 7 has a mechanism described here: https://help.fontlab.com/fontlab/7/manual/Custom-data-files-and-locations/#encoding-files-enc but I haven't tried it yet.

Thomas Phinney · January 2020

There is a FontLab VI/7 script here to generate an encoding file from the current font ordering:
https://forum.fontlab.com/fontlab-vi/exporting-custom-encoding/

Adam Jagosz · January 2020

To be honest, the reason I asked was the dispersion of the non-European portion of Latin: in particular, how far apart and in what odd order the hooked and other pan-African letters are placed. Within European Latin there is some nuisance, but nothing nearly as disconcerting.

> I used to customize my glyph sorting back in the Fontlab days. Now I just use the default sorting in Glyphs.

Which is? Since Christian uses Glyphs, I suppose it's the order found in Cormorant? (letters immediately followed by diacritics, then alternates grouped by feature, I presume).

For some reason I find sorting /Thorn after /P slightly offensive (even if it's convenient for development), and likewise sorting Schwa after S.

I never sorted (except for some unencoded alternates that I just had to rein in) and I pretty much learned bits and pieces of Unicode order as what I assumed was “part of the lore”. Now that I'm considering it, I would go for something along the lines of what John suggested, but maybe move base letters / accents around like this?

A-Z
any caps that are not automated away or require adjustment, so Ą, Ɓ, Ɔ, etc.
a-z
the same for LC
punctuation, numerals, operators, symbols, currencies, you name it
combining marks

and only now

UC accents
LC accents
spacing marks (based on combining)

In other words, I'd push the parts that don't require frequent adjustments to the bottom.

Then again, what requires adjustment and what does not depends on the design, and you've got to draw the line somewhere. So maybe having all accents together with base letters is the most logical way. And having accents near bases is neat, since you can easily see if the changes are propagated or if something broke.

Adam Jagosz · January 2020

Thomas Phinney said:

> If you are generating a font in OpenType CFF (PostScript outlines, .otf), AND you are using CFF and not CFF2, AND you keep the first several hundred glyphs in the order specified in the CFF spec, you can save on file size because glyph names won’t be needed. Of course, this only works to the extent you have those same standard glyphs with the same names. Character set is ~ Adobe Latin 1 + Adobe CE as I recall. I think it is something like ~ 340 glyphs.

From uni0000 up to /longs uni017F, excluding the control characters uni0001-uni001F and uni007F-uni009F (Type 1 Adobe Standard encoding), there are 338 glyphs, so that seems about right. That magnitude of file size saving must only be relevant in embedded systems though, right? Or maybe webfonts.

Thomas Phinney said:

> Keeping matching glyphs in contiguous blocks helps with writing OT code in AFDKO syntax, for things such as mapping caps to small caps, or lining numerals to oldstyle numerals.

But the syntax only has ranges based on glyph names, e.g. A-Z or uni2000-uni2009 (it doesn't even work with hex numbers, so it breaks at uni200A, and in FontLab even decimal fails most of the time)... right? It doesn't depend on glyph order. Maybe it can affect the conciseness of the compiled features though?
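Name-based range expansion can be sketched as follows. This is an approximation based on one reading of the feature file range rule: both names must have the same length and may differ only in a single letter of the same case or in a run of up to three decimal digits. The function name `expandNameRange` is hypothetical, and real compilers may behave differently.

```javascript
// Expand a glyph name range such as "a"-"z" or "uni2000"-"uni2009"
// purely from the names, without looking at glyph order.
function expandNameRange(first, last) {
  if (first.length !== last.length) {
    throw new Error(`Bad range: "${first}-${last}" (lengths differ)`);
  }
  // Length of the common prefix.
  let p = 0;
  while (p < first.length && first[p] === last[p]) p++;
  // Length of the common suffix, not overlapping the prefix.
  let s = 0;
  while (
    s < first.length - p &&
    first[first.length - 1 - s] === last[last.length - 1 - s]
  ) s++;

  const prefix = first.slice(0, p);
  const suffix = first.slice(first.length - s);
  const lo = first.slice(p, first.length - s);
  const hi = last.slice(p, last.length - s);
  const names = [];

  const isUpper = /^[A-Z]$/, isLower = /^[a-z]$/, isDigits = /^[0-9]{1,3}$/;
  if (lo < hi && ((isUpper.test(lo) && isUpper.test(hi)) ||
                  (isLower.test(lo) && isLower.test(hi)))) {
    // Single differing letter of the same case.
    for (let c = lo.charCodeAt(0); c <= hi.charCodeAt(0); c++) {
      names.push(prefix + String.fromCharCode(c) + suffix);
    }
    return names;
  }
  if (lo < hi && isDigits.test(lo) && isDigits.test(hi)) {
    // Run of decimal digits, zero-padded to the original width.
    for (let n = parseInt(lo, 10); n <= parseInt(hi, 10); n++) {
      names.push(prefix + String(n).padStart(lo.length, '0') + suffix);
    }
    return names;
  }
  throw new Error(`Bad range: "${first}-${last}"`);
}

console.log(expandNameRange('a', 'z').length);            // 26
console.log(expandNameRange('uni2000', 'uni2009').length); // 10
try {
  expandNameRange('uni2000', 'uni200A'); // hex digit: not a valid range here
} catch (e) {
  console.log(e.message);
}
```

Under this reading the expansion never consults glyph IDs, which matches the observation that it does not depend on glyph order. Whether the compiled lookup ends up smaller is a separate question.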

Thomas Phinney · January 2020

Certainly, the file size savings of being able to omit (edit: SOME) glyph names is usually a pretty tiny fraction (aside from a few weird special case situations, such as Last Resort or Adobe Blank). Whether you or your client(s) care, I can’t say.

My recollection is that in order for the AFDKO code to actually compile and function as expected, the glyphs have to be ordered appropriately in the input font. But even if not, certainly it would make for smaller compiled code.

Adam Jagosz · January 2020

You mean it can omit all glyph names, even for glyphs beyond the initial 338/340?

I made a test feature defined as sub [a - z] by [A - Z]; and it compiled within FontLab even when I rearranged the glyphs. But dunno.

Theunis de Jong · January 2020

I took great care to manually sort my (pre-Minion 3) phonetic extension font. In Unicode order, it's pretty much random; new glyphs were added to the standard by popular demand, and with manual sorting I got all a's together, all e's, and so on. It helped if I had to insert an "r but upside down and with a long leg" – it'd be near all other r's.

Now, with Minion 3, I can just apply the same font to all text if the correct Unicode was used. But manually browsing for the right character got slower.

Thomas Phinney · January 2020

Adam Jagosz said:

> You mean it can omit all glyph names, even for glyphs beyond the initial 338/340?
> I made a test feature defined as sub [a - z] by [A - Z]; and it compiled within FontLab even when I rearranged the glyphs. But dunno.

1) No, sorry I wasn’t clear enough in the follow-up. Just that set, and only if they are in that order, etc.

2) Cool that it compiles. The resulting font will be the tiniest bit larger, then, is all.

Claudio Piccinini · January 2020

Thomas Phinney said:

> There is a FontLab VI/7 script here to generate an encoding file from the current font ordering:
> https://forum.fontlab.com/fontlab-vi/exporting-custom-encoding/

Thanks much. I have installed the script correctly; it took me a bit to figure out that I had to place the generated .enc file in the Application Support folder.
It would definitely be handy to have it included as default functionality in the next FontLab 7 update…

Ray Larabie · January 2020

Does anyone know if there's an official Cyrillic order? The default order in the Unicode table is a jumble. The caps and lowercase follow a different order. After the main set, each cap is followed by its lowercase. I know the Russian alphabetical order, but considering that 50 languages use Cyrillic, they must all have their own order. When I'm working on them, I sort by visual similarity, but I'm not sure how to sort the index for the exported font.

Adam Jagosz · January 2020

A starting point might be to sort the Cyrillic using JavaScript; it does a pretty neat job.

"ЀЁЂЃЄЅІЇЈЉЊЋЌЍЎЏАБВГДЕЖЗИЙКЛМНОПРСТУФХЦЧШЩЪЫЬЭЮЯабвгдежзийклмнопрстуфхцчшщъыьэюяѐёђѓєѕіїјљњћќѝўџѠѡѢѣѤѥѦѧѨѩѪѫѬѭѮѯѰѱѲѳѴѵѶѷѸѹѺѻѼѽѾѿҀҁ҂҃҄҅҆҇҈҉ҊҋҌҍҎҏҐґҒғҔҕҖҗҘҙҚқҜҝҞҟҠҡҢңҤҥҦҧҨҩҪҫҬҭҮүҰұҲҳҴҵҶҷҸҹҺһҼҽҾҿӀӁӂӃӄӅӆӇӈӉӊӋӌӍӎӏӐӑӒӓӔӕӖӗӘәӚӛӜӝӞӟӠӡӢӣӤӥӦӧӨөӪӫӬӭӮӯӰӱӲӳӴӵӶӷӸӹӺӻӼӽӾӿ"
.split('')
.sort((a,b) => a.localeCompare(b, 'ru-Cyrl', { caseFirst: 'upper' }))
.join('')

yields

"҈҉҆҅҄҇҃҂АаӐӑӒӓӘәӚӛӔӕБбВвГгЃѓҐґҒғӺӻҔҕӶӷДдЂђҘҙЕеЀѐӖӗЁёЄєЖжӁӂӜӝҖҗЗзӞӟЅѕӠӡИиЍѝӤӥӢӣҊҋІіЇїЙйЈјКкЌќҚқӃӄҠҡҞҟҜҝЛлӅӆЉљМмӍӎНнӉӊҢңӇӈҤҥЊњОоӦӧӨөӪӫПпҦҧҀҁРрҎҏСсҪҫТтҬҭЋћУуЎўӰӱӲӳӮӯҮүҰұѸѹФфХхӼӽӾӿҲҳҺһѠѡѾѿѼѽѺѻЦцҴҵЧчӴӵҶҷӋӌҸҹҼҽҾҿЏџШшЩщЪъЫыӸӹЬьҌҍѢѣЭэӬӭЮюЯяѤѥѦѧѪѫѨѩѬѭѮѯѰѱѲѳѴѵѶѷҨҩӀӏ"

LeMo aka PatternMan aka Frank E Blokland · January 2020

In the relatively ancient IKARUS-based file system, which is used in the DTL/URW font tools, the glyph data is ‘physically’ separated from the character naming and encoding information. The glyphs are stored in a database under numbers that correspond to the entries in the Character Layout (.cha) file. In principle, the order in the database does not matter: it can be easily re-ordered in any way via (entries in) a .cha file. This means that the order of the characters in the exported font is independent of the storage order.

The structure of a .cha file (simple text) is fairly simple. The conversion information from one numbering system to another is included in several columns. The content of each column is characterized by a keyword (of which the order is arbitrary) and the entries for the different columns are separated by a semicolon. Each line can have a length of up to 255 characters.
