I am sure this has been discussed before, but I could find nothing relevant.
Kerning is a time-consuming slog on large fonts. Automated kerning adds tons of useless pairs and classes that no one needs. (I use FontCreator 10.1).
Surely there are open-source files somewhere with lists of kerning pairs used by different languages?
Does anyone kern lowercase-lowercase pairs? I add a few pairs for y, k- r-, and maybe ke, ko, etc., but by far the majority of pairs are for Uppercase-Uppercase, or Uppercase-lowercase pairs. I never add pairs for lowercase-Uppercase.
Automatically generating the kerning values is not difficult. The hard part is defining the kerning classes, and the kerning pairs.
For example, most of my customers are in the USA. That doesn't mean I can limit myself to pairs for English and Spanish. Because an American magazine might have a story about a Chinese basketball player eating dinner with a Slovenian hockey player, at an Ethiopian restaurant, during which they discuss their families and home towns.
My maths may be off, but I think there would be over 65 thousand pairs with just 256 glyphs. My fonts have over three hundred lowercase and uppercase glyphs, plus punctuation and figures. Then there are Small Capitals, Petite Capitals, and Superscripts.
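The maths checks out: 256 glyphs give 256 × 256 = 65,536 ordered pairs. A quick sketch of the arithmetic, where `ordered_pairs` is just an illustrative helper and 300 is a stand-in for "over three hundred":

```python
def ordered_pairs(glyph_count: int) -> int:
    """Count every possible (left, right) pair, including a glyph next to itself."""
    return glyph_count * glyph_count

print(ordered_pairs(256))  # 65536
print(ordered_pairs(300))  # 90000
```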
If you're using left/right kerning classes you can strategically include glyphs from Latin and other scripts to reduce the number of pairs required. For example, when I kern Cyrillic and Greek, I kern some pairs that would never appear together in actual use. Since other Cyrillic or Greek letters share a common parent, the kerning ends up correct. Often the right side of Cyrillic к is the same as k. Sometimes the right side of k is the same as x. That reduces the number of pairs quite a bit since I can use x as a parent. No k kerns required.
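To show why a shared parent saves so much work, here is a minimal sketch of how one kerned parent pair fans out to every member of its classes. The class contents, the -20 value, and the names `first_classes`, `second_classes` and `expand` are all made up for illustration; this is not any font editor's actual data model.

```python
from itertools import product

# Hypothetical classes keyed by parent glyph.
# first_classes: glyphs whose RIGHT side matches the parent (left glyph of a pair).
first_classes = {"x": ["x", "k", "uni043A"]}  # uni043A is Cyrillic к
# second_classes: glyphs whose LEFT side matches the parent (right glyph of a pair).
second_classes = {"o": ["o", "c", "e"]}

# A single kerned parent pair (the value is an arbitrary example).
class_pairs = {("x", "o"): -20}

def expand(class_pairs, first_classes, second_classes):
    """Flatten class pairs into glyph-to-glyph pairs."""
    flat = {}
    for (first, second), value in class_pairs.items():
        lefts = first_classes.get(first, [first])      # unclassed glyphs stand alone
        rights = second_classes.get(second, [second])
        for left, right in product(lefts, rights):
            flat[(left, right)] = value
    return flat

flat = expand(class_pairs, first_classes, second_classes)
print(len(flat))  # 9 glyph pairs from 1 kerned pair
```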
I have a kerning table based on pushing my luck. In the best-case scenario, I group similar glyphs under a parent and everything works out okay; the majority of it does. Now let's say I'm kerning and I notice that sharing the right side of the x and the right side of the k isn't working out. My luck was pushed too far. I make a note of it and keep pushing forward, finishing up my usual kerning script. After I've pushed through and made more notes, I remove the problem glyphs from their respective class and create new classes. In this case, I'd remove the k and other k-like glyphs from the _xLeft class and make a _kLeft class. I have a kerning script that has all my usual kern parents and orphans, kerned with ~. I search and replace the ~ with k and make further adjustments to the k pairs.
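The search-and-replace step could be sketched roughly like this. The three-column "left right value" format, the sample values, and the names `TEMPLATE` and `instantiate` are assumptions for illustration only, not the actual script format described above.

```python
# Hypothetical template: "~" marks the slot for the newly split-off parent.
TEMPLATE = """\
~ o -20
~ e -20
~ hyphen -40
T ~ -10
"""

def instantiate(template: str, glyph: str, placeholder: str = "~"):
    """Return (left, right, value) pairs with the placeholder swapped for glyph."""
    pairs = []
    for line in template.splitlines():
        if not line.strip():
            continue  # skip blank lines
        left, right, value = line.split()
        left = glyph if left == placeholder else left
        right = glyph if right == placeholder else right
        pairs.append((left, right, int(value)))
    return pairs

for pair in instantiate(TEMPLATE, "k"):
    print(pair)
```

The values that come out are only starting points; the new k pairs would still need adjusting by eye.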
Once I've gone through and finalized my kerning classes and made all the adjustments, it's time to do real kerning. The first few million possibilities have been taken care of and now I can take care of the stragglers. I'm using paragraphs of sample text for various languages. I might still need to create more classes as I go but not as many. Greek is especially full of surprises because there are so many letters that can't be classed unless it's an ultramodern Greek...it's just not as modular as Cyrillic. But most of the capitals are.
When I started getting into more languages, I got to the point where checking for accent collisions got out of control: too many possibilities to test and too much margin for error. These days, I deal with the accent collisions before I even make my first kern...using spacing. It's not as pretty but I just don't want any collisions. Vietnamese...I don't know if I can deal with every possibility. It depends on the typeface design and how high I can stack the lowercase accents...still learning.
FontCreator used to use text files with kerning pairs, much like a GPOS kerning lookup. Now it generates kerning tables automatically based on letter shape, but this brute-force approach is not ideal, so I don't use it. I create my own kerning classes based on alphabets rather than just shapes. I don't mix V and W, for example, as they often require different spacing. I also make separate classes for lowercase vowels with accents on top, like ä à and å, because Tä Tà Tå require different kerning from Ta Tą etc.
I think the automatic kerning could be improved by using a word list.
The method used by James can be used for testing, and I do sometimes use a list of common pairs like that at Kern King, but perhaps there are better sources that would help me to create one big pair adjustment lookup that I can import into all of my fonts?
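As a rough idea of what a word-list approach might look like, this sketch counts adjacent letter pairs in a list of words and keeps the uppercase-lowercase ones. The five sample words are stand-ins; a real run would read a per-language word list from a file, and `pair_frequencies` is a hypothetical helper name.

```python
from collections import Counter

def pair_frequencies(words):
    """Count every adjacent two-letter pair across the given words."""
    counts = Counter()
    for word in words:
        for left, right in zip(word, word[1:]):
            counts[left + right] += 1
    return counts

# Stand-in sample; replace with words loaded from a real word list.
words = ["Table", "Tavern", "Water", "Yvonne", "AVATAR"]
counts = pair_frequencies(words)

# Keep only Uppercase-lowercase pairs.
upper_lower = {p: n for p, n in counts.items() if p[0].isupper() and p[1].islower()}
print(upper_lower)
```

Pairs that never occur in any word list for the languages a font supports are candidates to leave unkerned, although names and abbreviations will always turn up combinations that no word list contains.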
Although I don't have exactly a word list or a class list to build kerning, I did collect some text files to test. They are similar to James' description. IIRC, most of these were shared in Typophile some years ago. Hope they could be useful.
Again, not word lists, but these online resources are very good:
http://www.impallari.com/testing/
http://justanotherfoundry.com/generator
http://readableweb.com/tests/draganddrop/pangram-dragndrop-testfont.htm
My font now has nearly 62K pairs, up from 54K.
The font compiles without problems with FontCreator, and the classes are retained in the exported font, unless there are Class overlap errors.
The compiled font has only one subtable.
Some classes are large, e.g.
class @A_Caps [A Agrave Aacute Acircumflex Atilde Adieresis Aring Amacron Abreve Aogonek Acaron Adieresismacron Adotaccentmacron Aringacute uni0200 uni0202 uni0226 uni1E00 uni1EA0 uni1EA2 uni1EA4 uni1EA6 uni1EA8 uni1EAA uni1EAC uni1EAE uni1EB0 uni1EB2 uni1EB4 uni1EB6 AA];
While a few consist of one or two glyphs:
class @F_Caps [F uni1E1E];
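Since class overlap errors are the thing to watch for, a small check outside the font editor can help. This sketch parses `class @Name [ ... ];` lines in the form shown above and reports any glyph listed in more than one class. Treating "overlap" as "the same glyph in two classes" is my assumption about what the error means, and the `@Bad` class is invented to trigger it.

```python
import re
from collections import defaultdict

# Matches lines of the form: class @Name [glyph glyph ...];
CLASS_RE = re.compile(r"class\s+(@\w+)\s*\[([^\]]*)\]\s*;")

def parse_classes(script: str) -> dict:
    """Map each class name to its list of glyph names."""
    return {name: body.split() for name, body in CLASS_RE.findall(script)}

def find_overlaps(classes: dict) -> dict:
    """Return glyphs that are listed more than once, with the classes involved."""
    owners = defaultdict(list)
    for name, glyphs in classes.items():
        for glyph in glyphs:
            owners[glyph].append(name)
    return {g: names for g, names in owners.items() if len(names) > 1}

script = """
class @A_Caps [A Agrave Aacute AA];
class @F_Caps [F uni1E1E];
class @Bad [F Aacute];
"""  # @Bad is a made-up class that deliberately overlaps the other two

overlaps = find_overlaps(parse_classes(script))
print(overlaps)
```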
How do you deal with those DZ dz digraphs if you're not using half of the Z and z kerns? How do you make sure the Æ kerns the same as the A and E? What about the right side of the n and h? Are you kerning left side of the C, G and Q separately? For years I avoided left and right kern classes because I thought they didn't work. But it was* just a bug in the FontLab preview window.
*is
Having said that, a few exceptions may arise, of course. Most important: look at r, f, j in neighborhood of ” , ) and so forth. Some i-accents love to collide with T, F or f. Some other accents (e.g. with ogoneks) also have a special affinity to tackle someone else …
UC-UC kerning is very important, though here too I tend to limit the business to really notorious pairings like TY LA KY WA etc. And of course UC-lc pairings like Ac Y… T… and so on. I don't put every accented character into the kerning classes.
Also take care of the figures ( - . , ) but make sure you leave the tabular ones untouched.
I had not dealt with DZ digraphs before you mentioned it. They don't need a lot of pairs, so I might use single glyph pairs in such cases. I just added classes for Dz, Dž and dz, dž for pairs with Dz, dz hyphen, and Adz, Tdz, Vdz, Wdz, Ydz. I think I don't need more than that.
I have separate classes for AE and ae. I also have separate classes for n and h lowercase. So far, h is only kerned with hyphen; n is kerned with hyphen on both sides, and with Capital classes.
I have gradually been refining my kerning classes since FontCreator first introduced kerning groups (now classes) in June 2014. Garava Regular is my test bed for new features. When I am happy with the results, I will import the script into the Italic, Bold, and Bold Italic, then run Autokern and manually check for problems.
I only kern the digit 1 with superscripts, and in Garava the default figures are tabular, so I don't kern them at all with other glyphs. I might need to add some pairs for OldStyle figures, which are designed to use with lowercase text, but adjusting the side-bearing could perhaps avoid the need for kerning pairs.
Thank you to everyone for the input. It has already helped me a lot.
I'm surprised you haven't overflowed the kern table and/or kern GPOS feature if you haven't done something to further manage the space consumption.
How many glyphs are in this font?
Garava contains 3,181 glyphs including classes for Petite Capitals, Small Capitals, superscripts, and Swash variants. Caps are paired with pcap and smcp, and swashes are paired with Capitals and lowercase.
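To put rough numbers on the overflow concern, here is a back-of-envelope sketch for a single class-based pair adjustment subtable (GPOS PairPos format 2). It assumes each class pair stores only one 2-byte horizontal advance adjustment, and that the 16-bit offsets inside the subtable have to reach past the class matrix. The class counts are invented, and real compilers may split or arrange subtables differently.

```python
HEADER_BYTES = 16        # eight uint16 fields at the start of a format 2 subtable
OFFSET16_LIMIT = 0xFFFF  # largest value a 16-bit offset can hold

def class_matrix_bytes(class1_count: int, class2_count: int,
                       bytes_per_record: int = 2) -> int:
    """Approximate size of the header plus the class1 x class2 value matrix."""
    return HEADER_BYTES + class1_count * class2_count * bytes_per_record

def fits_offset16(class1_count: int, class2_count: int) -> bool:
    """True if the header plus matrix stays within reach of a 16-bit offset."""
    return class_matrix_bytes(class1_count, class2_count) <= OFFSET16_LIMIT

print(class_matrix_bytes(150, 150), fits_offset16(150, 150))  # 45016 True
print(class_matrix_bytes(200, 200), fits_offset16(200, 200))  # 80016 False
```

Note that the matrix grows with the number of classes on each side, not with the number of glyphs or pairs, which is one reason a font with many pairs can still fit in one subtable.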