Most common kern pairs

@Wei Huang asked me on Twitter whether, in the course of my messing about with automated kerning, I had compiled a table of the most common kern pairs. I replied that I hadn't but that it wouldn't be hard to do. So I did it. Based on a set of 514 fonts, here is a file with the ten thousand most common kern pairs and how many of those 514 fonts implement them.

The top fifty are:
504 L T
499 L V
495 L Y
493 T A
491 P comma
490 T a
489 V A
489 T period
489 A V
488 Y o
488 Y a
488 P A
486 Y A
486 T o
486 T comma
486 P period
484 F comma
484 A Y
481 V period
481 V a
480 F period
480 A T
477 T e
475 V o
474 L W
473 Y period
472 Y e
470 Y comma
469 F A
467 V comma
462 V e
461 T AE
460 Y u
459 W A
459 L quoteright
457 T hyphen
457 A W
452 W a
452 T oe
449 v period
446 y period
445 w period
443 W period
442 T w
442 T ae
441 V AE
439 W o
438 P AE
437 r period
436 v comma
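
A tally like the one above can be sketched in a few lines of Python. This is only an illustration, not the script used for the list: the `fonts` dictionary and its values are made-up stand-ins for the flattened kern pairs of real fonts, and `tally_kern_pairs` is a hypothetical helper name.

```python
from collections import Counter

def tally_kern_pairs(fonts):
    """Count, for each (left, right) pair, how many fonts kern it."""
    counts = Counter()
    for pairs in fonts.values():
        # Each pair is a dict key, so it is counted at most once per font.
        # Pairs whose value is zero are not counted as implemented.
        counts.update(pair for pair, value in pairs.items() if value != 0)
    return counts

# Hypothetical data: font name -> {(left glyph, right glyph): kern value}
fonts = {
    "FontA": {("L", "T"): -80, ("T", "a"): -60, ("A", "V"): -50},
    "FontB": {("L", "T"): -70, ("A", "V"): -40, ("T", "a"): 0},
    "FontC": {("L", "T"): -90},
}

counts = tally_kern_pairs(fonts)
# Print in the same "count left right" layout as the list above.
for (left, right), n in counts.most_common():
    print(n, left, right)
```

Sorting by `most_common()` gives the same descending order as the published list.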

Comments

  • Benedikt Bramböck Posts: 36
    edited May 17
    Interesting stuff, thanks for sharing.

    Could you also give some background information on what kind of fonts you were examining? Are these fonts from 514 different families?
    What stylistic characteristics do they have – sans, serif, script, blackletter, … ?
    Did you also include italics?
    How is your analysis dealing with class kerning?
    Does the size of the kerning value play a role too?
  • Ramiro Espinoza Posts: 677
    edited May 17
    One list I am really interested in is one of kerning pairs that never occur in the Latin alphabet, organised by language. 
  • Ramiro, you mean every possible combination that is not in use anywhere? Kern pairs such as "rX" and "pY"? There must be 100!s of these (using ! in its mathematical meaning as 'factorial').
  • Craig Eliason Posts: 753
    Though count on things like RacerX brand and LeapYear Inc. to arise and make even those useful. 
  • Ramiro Espinoza Posts: 677
    edited May 17
    @Theunis de Jong Nope, I mean kerning pair combinations that make sense. I don't kern lc/uc. I kern every possible combination of uc/uc, uc/lc, lc/lc, uc/sc, sc/sc. Of course some of these will never occur in any language. Therefore, I would like to know which pairs these are, to avoid spending time on them and also to crop existing kerning tables.

  • Kent Lew Posts: 800
    Ramiro — A couple years ago, I spent a few days compiling a massive word list from several Latin-using languages, using the corpora at the An Crúbadán project, and then writing a script to extract the most common words for every possible lc-lc (including a subset of Uc-lc where the pair is at the beginning of the word).

    I could probably run an analysis for which pairs are absent. It would be limited by my initial choice of languages to combine and by the limitations of the source corpora. If I find myself in the next few days looking for a distraction from work for a while, I'll see what I can come up with.

    Unless someone beats me to it. ;-)
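
As a rough illustration of the "which pairs are absent" analysis described above, here is a minimal sketch. It is not Kent's script: the three-word `words` list is a hypothetical stand-in for a real corpus, and restricting the alphabet to unaccented a to z is an assumption that a multilingual word list would not satisfy.

```python
from itertools import product
import string

def seen_pairs(words, alphabet=string.ascii_lowercase):
    """Collect every adjacent letter pair that occurs in the word list."""
    pairs = set()
    for word in words:
        # zip(word, word[1:]) walks over neighbouring characters.
        for left, right in zip(word, word[1:]):
            if left in alphabet and right in alphabet:
                pairs.add((left, right))
    return pairs

def absent_pairs(words, alphabet=string.ascii_lowercase):
    """Return all possible pairs over the alphabet that never occur."""
    all_pairs = set(product(alphabet, repeat=2))
    return all_pairs - seen_pairs(words, alphabet)

# Hypothetical stand-in for a large multi-language word list.
words = ["kerning", "table", "type"]
missing = absent_pairs(words)
print(len(missing), "of", 26 * 26, "lc-lc pairs never occur in this list")
```

As noted above, the result is only as good as the corpus: a pair counts as absent merely because no word in the list happens to contain it.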
  • How useful is that sort of statistics in practice?

    It gives a hint about the most common instances in ‘normal’ typefaces, sure. But it says very little about equally necessary cases like f” f] (j – and so on. The choice of pairings also depends on the specific design of a typeface, e.g. whether it is a blackletter or scriptish design or otherwise different from mainstream models.
    I was always more interested in a systematic approach towards the generally most important pairings, as well as in a practical overview of language-specific pairings which are not that obvious at first glance for most of us (e.g. f_ð).

  • Hrant H. Papazian Posts: 1,142
    edited May 17
    @Mark Simonson However (and this is veering even more off-topic...) automating/detaching spacing from the letterforms implies a favoring of the black over notan. We spend so much time tweaking the black to within 1/1000 or finer, but when it comes to the white (at least the whites between glyphs) we settle for things like ±5/1000 or even ignore many pairs. Sure it's an expediency, but at the very least we need to admit the flaw. And ideally, if we do automate the inter-glyph white, it needs to depend on more than the black's lateral profiles.
  • Simon Cozens Posts: 315
    Answers to Benedikt's questions:

    Are these fonts from 514 different families? What stylistic characteristics do they have – sans, serif, script, blackletter, … ? Did you also include italics? 

    It's a really mixed bag; the idea was to produce a fairly representative training set of all the different things you might throw at an autokerner, so that it would cope well with whatever it saw. In practice this meant some families and some individual styles; no blackletters but lots of sans, a few serif, and one or two script. Some text, but generally display. (I wish there were a subcategory of "display" meaning "not exactly a text font, but at the same time something more like Optima rather than graffiti, handdrawn, roughened unicase alphabets, and pictures of cats.")

    How is your analysis dealing with class kerning?

    These are computed kern values between the pair - the value you would send to a layout engine. So after class kerns have been "decomposed".

    Does the size of the kerning value play a role too?

    Nope, anything non-zero is in there.
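
To make "decomposed" concrete, here is a minimal sketch of flattening class kerning into per-glyph pairs and dropping the zero values. The data model (named left and right classes plus glyph-level exceptions) and the function `flatten_class_kerning` are assumptions made for illustration; real kerning tables in a font are organised differently, and this is not the code behind the list.

```python
def flatten_class_kerning(left_classes, right_classes, class_kerns, exceptions=None):
    """Expand class-to-class kerns into glyph-to-glyph pairs.

    left_classes / right_classes: class name -> list of glyph names
    class_kerns: (left class, right class) -> kern value
    exceptions: (left glyph, right glyph) -> kern value, overriding the class value
    """
    flat = {}
    for (lclass, rclass), value in class_kerns.items():
        for left in left_classes[lclass]:
            for right in right_classes[rclass]:
                flat[(left, right)] = value
    # Glyph-level exceptions take precedence over the class value.
    flat.update(exceptions or {})
    # Keep only pairs that actually move something: anything non-zero.
    return {pair: value for pair, value in flat.items() if value != 0}

# Hypothetical classes and values.
left_classes = {"A_group": ["A", "Aacute"]}
right_classes = {"V_group": ["V", "W"]}
class_kerns = {("A_group", "V_group"): -50}
exceptions = {("A", "V"): -60, ("Aacute", "W"): 0}

flat = flatten_class_kerning(left_classes, right_classes, class_kerns, exceptions)
for (left, right), value in sorted(flat.items()):
    print(left, right, value)
```

One class pair becomes several glyph pairs here, which is why accented variants show up as separate entries in a flattened tally.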
  • Ramiro Espinoza Posts: 677
    Cool! Thanks!
  • Frode Posts: 54
    edited May 18
    Would you not also want your kerning to work for acronyms, business names, product names etc? Some languages merge compounds and form letter pairs not found in any dictionary. Some languages stick lowercase letters to the left of capitals.
  • Ramiro Espinoza Posts: 677
    edited May 18
    IMHO, you shouldn't kern everything (and I kern a lot). Some situations, like brands etc., are best left to the typesetters or designers. 
  • Frode Posts: 54
    Oh, I kern lc-to-cap. And do my best to solve the problematic ones with drawing and spacing before kerning.
  • Hrant H. Papazian Posts: 1,142
    There's even a country name with an intercap now (although rarely requiring a kern).
    http://www.bbc.com/news/world-africa-43821512
  • Wei Huang Posts: 88
    edited May 24
    @Kent Lew
    I like your idea a lot! 

    @Simon Cozens
    Is there a way to make it find the group kerning, like one can assume all the following:

    Aring V
    Atilde V
    Agrave V
    Acircumflex V
    Aacute V

    Should be `A V`. Can you run it through the kerning table instead?
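
Without access to the original kerning classes, one way to approximate this grouping is to fold accented glyph names onto their base letter before counting. The sketch below is a name-based heuristic under stated assumptions: the accent suffix list is illustrative and incomplete, `base_glyph` and `collapse_to_base` are hypothetical helpers, and the kern values are made up.

```python
import re
from collections import defaultdict

# One base letter followed by a known accent name, e.g. "Aring" or "ecircumflex".
ACCENTED = re.compile(
    r"^([A-Za-z])(grave|acute|circumflex|tilde|dieresis|ring|"
    r"caron|macron|breve|ogonek|cedilla)$"
)

def base_glyph(name):
    """Map an accented glyph name to its base letter; leave other names alone."""
    match = ACCENTED.match(name)
    return match.group(1) if match else name

def collapse_to_base(pairs):
    """Group pairs by base glyphs, collecting the distinct kern values seen."""
    collapsed = defaultdict(set)
    for (left, right), value in pairs.items():
        collapsed[(base_glyph(left), base_glyph(right))].add(value)
    return dict(collapsed)

# Hypothetical flattened pairs from one font.
pairs = {
    ("Aring", "V"): -50,
    ("Atilde", "V"): -50,
    ("Agrave", "V"): -50,
    ("A", "V"): -50,
    ("T", "AE"): -30,
}

groups = collapse_to_base(pairs)
for (left, right), values in sorted(groups.items()):
    print(left, right, sorted(values))
```

A base pair with a single value suggests the accented variants were kerned as a group; more than one value means some variant was kerned differently and should probably stay separate.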