Text Files for Kerning Pairs

Bhikkhu Pesala Posts: 210
edited February 2017 in Technique and Theory
I am sure this has been discussed before, but I could find nothing relevant.

Kerning is a time-consuming slog on large fonts. Automated kerning adds tons of useless pairs and classes that no one needs. (I use FontCreator 10.1). 

Surely there are open-source files somewhere with lists of the kerning pairs used by different languages?

Does anyone kern lowercase-lowercase pairs? I add a few pairs for y, k-, r-, and maybe ke, ko, etc., but by far the majority of pairs are uppercase-uppercase or uppercase-lowercase. I never add pairs for lowercase-uppercase.

Automatically generating the kerning values is not difficult. The hard part is defining the kerning classes, and the kerning pairs. 

Comments

  • Calibri has a lot of kerning pairs, but even that does not kern all permutations. It contains no pairs at all for lowercase-Uppercase.  

    My maths may be off, but I think there would be over 65 thousand pairs with just 256 glyphs. My fonts have over three hundred lowercase and uppercase glyphs, plus punctuation and figures. Then there are Small Capitals, Petite Capitals, and Superscripts. 
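
As a quick check of the arithmetic above: kerning pairs are ordered (left glyph, right glyph), so n glyphs give n × n possible pairs. A minimal Python sketch; the function name is illustrative only:

```python
def possible_pairs(glyph_count: int) -> int:
    """Number of ordered (left, right) glyph pairs, including a glyph paired with itself."""
    return glyph_count * glyph_count


print(possible_pairs(256))  # 65536
print(possible_pairs(300))  # 90000
```

So 256 glyphs do give just over 65 thousand permutations, and a set of three hundred letters alone already gives 90 thousand.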
  • Ray Larabie Posts: 1,425
    edited February 2017
    The first step is to eliminate a few million pairs with classes before you really start kerning.

    If you're using left/right kerning classes you can strategically include glyphs from Latin and other scripts to reduce the number of pairs required. For example, when I kern Cyrillic and Greek, I kern some pairs that would never appear together in actual use. Since other Cyrillic or Greek letters share a common parent, the kerning ends up correct. Often the right side of Cyrillic к is the same as k. Sometimes the right side of k is the same as x. That reduces the number of pairs quite a bit since I can use x as a parent. No k kerns required.

    I have a kerning table based on pushing my luck. It's the best-case scenario, where I can group similar glyphs under a parent and everything works out okay; the majority of it works. Now let's say I'm kerning and I notice that sharing the right side of the x and the right side of the k isn't working out. My luck was pushed too far. I make a note of it and keep pushing forward, finishing up my usual kerning script. After I've pushed through and made more notes, I remove the problem glyphs from their respective class and create new classes. In this case, I'd remove the k and other k-like glyphs from the _xLeft class and make a _kLeft class. I have a kerning script that has all my usual kern parents and orphans, kerned with ~. I search and replace the ~ with k and make further adjustments to the k pairs.

    Once I've gone through and finalized my kerning classes and made all the adjustments, it's time to do real kerning. The first few million possibilities have been taken care of and now I can take care of the stragglers. I'm using paragraphs of sample text for various languages. I might still need to create more classes as I go but not as many. Greek is especially full of surprises because there are so many letters that can't be classed unless it's an ultramodern Greek...it's just not as modular as Cyrillic. But most of the capitals are.

    When I started getting into more languages, I got to the point where checking for accent collisions got out of control: too many possibilities to test and too much margin for error. These days, I deal with the accent collisions before I even make my first kern...using spacing. It's not as pretty but I just don't want any collisions. Vietnamese...I don't know if I can deal with every possibility. It depends on the typeface design and how high I can stack the lowercase accents...still learning.
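
The search-and-replace step described above could be sketched roughly as follows in Python. The template strings are invented for illustration, since the actual contents of the kerning script are not given in this thread; only the idea of a ~ placeholder that gets swapped for the new parent glyph comes from the description.

```python
# Hypothetical test strings; "~" marks where the new class parent goes.
TEMPLATE = ["n~n", "o~o", "~T", "~V", "~-"]


def fill_template(template, parent, placeholder="~"):
    """Return the template lines with the placeholder replaced by a parent glyph."""
    return [line.replace(placeholder, parent) for line in template]


for line in fill_template(TEMPLATE, "k"):
    print(line)
```

The same template can then be reused for any other glyph that has to be split out into its own class.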
  • I didn’t mean that every permutation needs a pair. Just that you need to kern all of the permutations that do need pairs. 
  • Bhikkhu Pesala Posts: 210
    edited February 2017
    James Montalbano said:
    And if you think kerning is a slog, you are probably in the wrong business
    Fortunately, I do this as a hobby, so I can take my time to fix things gradually, without any bailiffs standing behind me. Still, my fonts are large because I want them to be useful to Buddhists who need to write Pāḷi words with diacritics whether their language is English, German, French, Spanish, Polish, or Vietnamese. My fonts include complete character sets for Latin-1 Supplement, Latin Extended-A, Latin Extended Additional, and some from Latin Extended-B (117 of 208). 

    FontCreator used to use text files with kerning pairs, much like a GPOS kerning lookup. Now it generates kerning tables automatically based on letter shape, but this brute-force approach is not ideal so I don't use it. I create my own kerning classes based on alphabets rather than just shapes, so I don't mix V and W, for example, as they often require different spacing, and I make separate classes for lowercase vowels with accents on top like ä à and å because Tä Tà Tå require different kerning to Ta Tą etc.

    I think the automatic kerning could be improved by using a word list. 

    The method used by James can be used for testing, and I do sometimes use a list of common pairs like that at Kern King, but perhaps there are better sources that would help me to create one big pair adjustment lookup that I can import into all of my fonts? 
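
One way the word-list idea might be prototyped is to count which adjacent letter pairs actually occur in a list of words, so that only those pairs are considered for kerning. This is a minimal Python sketch under that assumption, not a description of how FontCreator or any other tool works; the sample words are placeholders for a real word list.

```python
import unicodedata
from collections import Counter


def pair_counts(words):
    """Count adjacent character pairs across all words (after NFC normalization)."""
    counts = Counter()
    for word in words:
        # Compose base letter + combining accent so each accented letter is one character.
        word = unicodedata.normalize("NFC", word)
        for left, right in zip(word, word[1:]):
            counts[left + right] += 1
    return counts


# Tiny stand-in for a real word list.
words = ["Table", "Tall", "Value", "Yoga"]
counts = pair_counts(words)

# Keep only pairs that start with an uppercase letter, most frequent first.
upper_first = [(pair, n) for pair, n in counts.most_common() if pair[0].isupper()]
print(upper_first)
```

Run against a large word list per language, the most frequent pairs would be candidates for a shared pair-adjustment list, while pairs that never occur could be skipped.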
  • Igor Freiberger Posts: 268
    edited February 2017
    Bhikkhu Pesala,

    Although I don't exactly have a word list or a class list for building kerning, I did collect some text files for testing. They are similar to what James described. IIRC, most of these were shared on Typophile some years ago. I hope they are useful.

    Again, not word lists, but these online resources are very good:
    http://www.impallari.com/testing/
    http://justanotherfoundry.com/generator
    http://readableweb.com/tests/draganddrop/pangram-dragndrop-testfont.htm

  • Thanks Igor. I will take a good look at these later. 
  • I found the file "Sequences to adjust kerning.txt" helpful. I found a number of missing kerning pairs, and a couple of glyphs with side-bearing issues that were causing too many clashes, so I increased the right side-bearings instead of kerning them.

    My font now has nearly 62K pairs, up from 54K. 
  • Are you using classes, or not? If you are really doing that as pairs, you have presumably blown past the maximum 64K size of a kern table and/or GPOS kern feature subtable, so I would be mildly surprised if your font compiled (or if it did, if all the kerning worked).
  • Are you using classes, or not? 
    Yes, of course. My Garava font currently has 1,330 kerning classes and 67,174 adjustment pairs. As I said above:-
    I create my own kerning classes based on alphabets rather than just shapes, so I don't mix V and W, for example, as they often require different spacing, and I make separate classes for lowercase vowels with accents on top like ä à and å because Tä Tà Tå require different kerning to Ta Tą etc.
    I could reduce the number of classes more by combining some glyphs based on letter shape, but I find it easier to manage by keeping them alphabetically sorted. There are separate classes for pcap and smcp, but superscripts are mostly in a single class as I only kern a few pairs of superscripts - most are only kerned with 1, A, or L caps.

    The font compiles without problems with FontCreator, and the classes are retained in the exported font, unless there are Class overlap errors.

    The compiled font has only one subtable. 

    Some classes are large, e.g. 
    class @A_Caps [A Agrave Aacute Acircumflex Atilde Adieresis Aring Amacron Abreve Aogonek Acaron Adieresismacron Adotaccentmacron Aringacute uni0200 uni0202 uni0226 uni1E00 uni1EA0 uni1EA2 uni1EA4 uni1EA6 uni1EA8 uni1EAA uni1EAC uni1EAE uni1EB0 uni1EB2 uni1EB4 uni1EB6 AA];

    While a few consist of one or two glyphs:
    class @F_Caps [F uni1E1E];
    class @Q_Caps [Q];


  • Ray Larabie Posts: 1,425
    Using separate left and right pairs significantly reduces the required kerns but it's definitely more difficult to manage. I usually print a glyph chart and write down the left and right parents for each glyph. That way I don't have to track down conflicts later.

    How do you deal with those DZ dz digraphs if you're not using half of the Z and z kerns? How do you make sure the Æ kerns the same as the A and E? What about the right side of the n and h? Are you kerning left side of the C, G and Q separately? For years I avoided left and right kern classes because I thought they didn't work. But it was* just a bug in the FontLab preview window.

    *is
  • I do almost no lowercase-lowercase kerning, but I spend a good deal of time beforehand precisely adjusting all the sidebearings. A well-designed typeface needs no lc-lc kerning; think of the old days, when lead type letters just stood there side by side in peace and looked good.
    Having said that, a few exceptions may arise, of course. Most important: look at r, f, j in the neighborhood of ” , ) and so forth. Some i-accents love to collide with T, F or f. Some other accents (e.g. with ogoneks) also have a special affinity to tackle someone else …

    UC-UC kerning is very important, though here too I tend to limit the business to really notorious pairings like TY LA KY WA etc. And of course UC-lc pairings like Ac Y… T… and so on. I don't put every accented character into the kerning classes.

    Also take care of the figures ( - . , ) but make sure you leave the tabular ones untouched ;)
  • Using separate left and right pairs significantly reduces the required kerns but it's definitely more difficult to manage.

    How do you deal with those DZ dz digraphs if you're not using half of the Z and z kerns? How do you make sure the Æ kerns the same as the A and E?
    I only divide a class into left and right if I need to. It is difficult to manage, but FontCreator's Code Editor detects class overlaps with the validator, and tells you where to look for the problems.

    I had not dealt with DZ digraphs before you mentioned it. They don't need a lot of pairs, so I might use single glyph pairs in such cases. I just added classes for Dz, Dž and dz, dž for pairs with Dz, dz hyphen, and Adz, Tdz, Vdz, Wdz, Ydz. I think I don't need more than that. 

    I have separate classes for AE and ae. I also have separate classes for n and h lowercase. So far, h is only kerned with hyphen; n is kerned with hyphen on both sides, and with Capital classes. 

    I have gradually been refining my kerning classes since FontCreator first introduced kerning groups (now classes) in June 2014. Garava Regular is my test bed for new features. When I am happy with the results, I will import the script into the Italic, Bold, and Bold Italic, then run Autokern and manually check for problems.
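
For anyone without a validator to hand, a class-overlap check of the kind mentioned above might look something like this in Python. This is only a sketch of the general idea, not FontCreator's implementation; the class and glyph names are hypothetical, and it treats all classes as belonging to the same side of a pair.

```python
from collections import defaultdict


def find_overlaps(classes):
    """Return {glyph: [class names]} for glyphs that appear in more than one class."""
    membership = defaultdict(list)
    for name, glyphs in classes.items():
        for glyph in glyphs:
            membership[glyph].append(name)
    return {g: names for g, names in membership.items() if len(names) > 1}


# Hypothetical classes: "Dz" has been left in @D_Caps after getting its own class.
classes = {
    "@D_Caps": ["D", "Dcaron", "Dz"],
    "@Dz_Caps": ["Dz", "Dzcaron"],
    "@Z_Caps": ["Z", "Zcaron"],
}
print(find_overlaps(classes))  # {'Dz': ['@D_Caps', '@Dz_Caps']}
```

An empty result means every glyph belongs to at most one class on that side.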
  • Bhikkhu Pesala Posts: 210
    edited February 2017
    I do almost no lowercase-lowercase kerning, but I spend a good deal of time beforehand precisely adjusting all the sidebearings. A well-designed typeface needs no lc-lc kerning; think of the old days, when lead type letters just stood there side by side in peace and looked good.

    Also take care of the figures ( - . , ) but make sure you leave the tabular ones untouched ;)
    I requested an option to turn off lowercase-lowercase pairs in FontCreator's Autokern feature, but some pairs are needed. After checking with the above "Sequences to adjust kerning.txt" file, I added a few more. Previously, ke, ko and pairs with hyphen or y comma were the only ones that I used. 

    I only kern the digit 1 with superscripts, and in Garava the default figures are tabular, so I don't kern them at all with other glyphs. I might need to add some pairs for OldStyle figures, which are designed for use with lowercase text, but adjusting the side-bearing could perhaps avoid the need for kerning pairs. 

    Thank you to everyone for the input. It has already helped me a lot. 
  • Wait, you have 1330 classes and 67K adjustment pairs? That seems like an awful lot of both. I could imagine 67K as the count of flattened kerning from the class kerning, but as adjustments in addition to the class kerning, that seems perhaps... high.

    I'm surprised you haven't overflowed the kern table and/or kern GPOS feature if you haven't done something to further manage the space consumption.

    How many glyphs are in this font?
  • Bhikkhu Pesala Posts: 210
    edited February 2017
    67K adjustment pairs means after flattening the classes. 

    Garava contains 3,181 glyphs including classes for Petite Capitals, Small Capitals, superscripts, and Swash variants. Caps are paired with pcap and smcp, and swashes are paired with Capitals and lowercase.