Czech/Slovak vertical caron kern string

I ran a script through a database of Czech and Slovak texts to find as many kerning pairs as possible that involve the vertical caron letters. I thought I’d share the result with the TD community.
ď! ď, ď. ď; ď? ďa ďh ďk ďl ďm ďo ďt ďu ďá ďž Ľ. Ľa Ľu Ľú ľ, ľ. ľ: ľ; ľa ľb ľg ľh ľk ľm ľn ľo ľs ľt ľu ľv ľú ľč ľň ľš ť! ť" ť) ť, ť- ť. ť: ť; ť? ť` ťa ťc ťd ťj ťk ťm ťn ťo ťr ťs ťt ťu ťá ťž ť“
…and all of them can be at the end of a word as well, so you may need positive kerning pairs with space.

These are the resources I used:
http://www.gutenberg.org/browse/languages/cs
http://texty.citanka.cz
http://zlatyfond.sme.sk/autori

I didn’t download all the texts; I kept adding more until the script output stopped changing. So I cannot guarantee that every possible combination is in there.
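For anyone who wants to try the same approach outside Glyphs, here is a minimal sketch of the idea in plain Python. It is not the script that produced the list above. The function names, the sample strings, the choice to skip whitespace, and the NFC normalization step are all assumptions made for illustration.

```python
import unicodedata


def collect_pairs(text, first_letters):
    """Return every two-character pair whose first character is in first_letters."""
    # Assumption: normalize to NFC so that e.g. d + combining caron counts as ď.
    text = unicodedata.normalize("NFC", text)
    first_letters = unicodedata.normalize("NFC", first_letters)
    pairs = set()
    for left, right in zip(text, text[1:]):
        # Assumption: pairs with whitespace are skipped, as in the list above.
        if left in first_letters and not right.isspace():
            pairs.add(left + right)
    return pairs


def collect_until_stable(texts, first_letters):
    """Accumulate pairs text by text and count the texts seen since the set last grew."""
    found = set()
    texts_without_change = 0
    for text in texts:
        new_pairs = collect_pairs(text, first_letters) - found
        if new_pairs:
            found |= new_pairs
            texts_without_change = 0
        else:
            texts_without_change += 1
    return sorted(found), texts_without_change


if __name__ == "__main__":
    # Hypothetical sample strings standing in for downloaded text files.
    samples = ["choďte", "sieť. ľahko", "choďte"]
    pairs, idle = collect_until_stable(samples, "ďĽľť")
    print(" ".join(pairs), "| texts without new pairs:", idle)
```

A large count of texts without new pairs suggests the list has stabilized for that corpus, but it does not prove that no further combinations exist.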

Comments

  • Ray Larabie
    Thanks very much; this is very useful.
  • Kent Lew
    Thanks, Rainer.

    What do folks suppose this one is about?: ť`

    Why would a tcaron be followed by a plain grave accent? Seems like an anomaly to me. But maybe a native speaker can clarify.
  • I cannot think of any instance. As far as I remember, Czech and Slovak do not use the grave at all.
  • Kent Lew
    Yeah, if I had to guess, I’d say this was probably some kind of bogus quote mark in some funky text.
  • I think so too. Some people, especially (cough cough) Windows users, use plain graves and acutes instead of proper quotation marks and apostrophes. Somehow this made it into the corpus I had at my disposal.

    BTW, I’ve modified the script so you can use it for any number of letter combinations in any text corpus. It will be on my GitHub repository soon.
  • Here we go: https://github.com/mekkablue/Glyphs-Scripts/tree/master/Metrics

    Run the scripts in Glyphs. Each one prompts you for the letters you want to search for, and then for the text files you want to search. You can download all sorts of language-specific texts from sites like the ones I mentioned above, put them in a folder, and search all of them at once. Output goes into the Macro window.

    The ‘1st char’ script looks for all pairs where the letters entered are the first letter of the pair; the ‘2nd char’ script does the same for second letters. I may merge them into one script and add check boxes at some point in the future.

    Happy Easter.
  • Thanks Rainer.
  • Nick Shinn
    There seems to be a lot of redundancy.
    Firstly, between the three vertical caron accented characters themselves, and secondly between following kerning classes—e.g. b, h, k and l.
    Couldn’t this list be expressed more economically?
  • Craig Eliason
    “Couldn’t this list be expressed more economically?”
    I think comprehensiveness was the very purpose of the list!
  • The point of the script (and me pasting its result) is to give you a preview string with all possible combinations that actually exist in literature. The list I pasted may have some redundancy for kerning certain fonts, but not for all fonts. There are fonts where h, l and k could not go in the same group. And kerning is just one possible application. For instance, the script helped me find out that I do not need a fþ ligature.

    The script could be changed to check for kerning groups in the current font (and you’re free to modify it for your own needs of course), but I don’t even think redundancy is a bad thing here. If you have set up a kerning class that includes b, h, k and l, you automatically kern all three anyway. And the additional pairs serve as a good way to double check. Plus you may want to introduce a kerning exception or a ligature for a specific pair, and then it is important to know which combinations there are.
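The class-based reduction discussed in this exchange could be sketched roughly as follows. This is plain Python, not part of the Glyphs scripts, and it does not read kerning groups from a font: the function name and the `hypothetical_classes` mapping are assumptions for illustration only.

```python
def collapse_by_class(pairs, classes):
    """Keep the first pair seen for each (left class, right class) combination."""
    # Map each member letter to its class name.
    class_of = {member: name
                for name, members in classes.items()
                for member in members}
    representatives = {}
    for pair in pairs:
        left, right = pair[0], pair[1]
        # Letters that belong to no class act as a class of their own.
        key = (class_of.get(left, left), class_of.get(right, right))
        representatives.setdefault(key, pair)
    return list(representatives.values())


if __name__ == "__main__":
    # Hypothetical class: suitable for some fonts only, as noted above.
    hypothetical_classes = {"ascender": "bhkl"}
    print(collapse_by_class(["ľb", "ľh", "ľk", "ľa", "ťk"], hypothetical_classes))
```

Because the class mapping is font-specific, the full list stays the safer preview string; the reduced list is only as good as the classes fed into it.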
  • Nick Shinn
    Nonetheless, I think that looking at redundancy would be useful.
    For instance, the very first character combination you show -- dcaron followed by exclam -- is not followed through, as there is no listing of lcaron_exclam or tcaron_exclam.