List of glyphs to omit

Vasil Stanev
Vasil Stanev Posts: 775
edited July 2018 in Technique and Theory
We often go all the way when designing a font and include every glyph on the Unicode list. I know I have. However, many of them are completely obsolete and do not need to be included, even for past-proofing. Everyone, please post such glyphs from your own corner of the world here, so we can save ourselves precious time.

My 2 cents:
You do NOT need to include historical Cyrillic glyphs from old orthographies before WW2.
Big yus (https://ru.wikipedia.org/wiki/Юс_большой) and yat (https://ru.wikipedia.org/wiki/Ять) are not used anywhere.


Comments

  • You do not *need* to include historical glyphs like yat if your only goal is to support modern orthography, but that doesn't mean there aren't people who might still find them useful. That 'yat' is only used in historical contexts is already indicated in the Unicode standard, as is the case for other historical glyphs.

    I think using lists of characters or unicode blocks to be included or lists of characters to be omitted from those blocks is the wrong way to go about deciding on a character set. Instead, one should decide upon their target audience and the glyphs necessary to satisfy the needs of that target audience.

    André
  • Kent Lew
    Kent Lew Posts: 944
    Maxim Zhukov has made persuasive arguments for including ѣ in an otherwise general-purpose Cyrillic font. Even if modern Russian orthography does not use it, there are ex-patriate communities outside of the former Soviet Union where certain generations may still use the old orthography.

    I happen to have a book of the correspondence between Vladimir Nabokov and Edmund Wilson, his friend and literary critic, from the period of 1940–1971. The two men were multilingual and their letters are naturally sprinkled with Russian. Nabokov used the older orthography and ѣ is common throughout the text.

  • Kent Lew
    Kent Lew Posts: 944
    Regarding the main topic of the post, there is this older thread on obsolete characters. (Like many threads, it wanders on and off topic.)

    There is also this classic thread about character sets and what to consider including or not.

    As André suggested, using Unicode blocks as a primary basis for a character set is probably not the most practical approach.
  • I never really understood Big Yus. It looks pretty much the same size as Little Yus to my eyes.
  • Nick Shinn
    Nick Shinn Posts: 2,216
    André, LOL, that little Big Yus ain’t no Little Yus.

    One great truth, which I first heard expressed by Maxim Zhukov, is that the finer points of our multi-script designs are for other type designers (and perhaps a few expert typographers). This was confirmed at the St Petersburg ATypI conference in 2008, when I was showing some Scotch Modern print-outs to some Russian graphic designers and one pointed to the Big Yus and said, “That’s a strange-looking Zhe”. The upshot is that the majority of Cyrillic font users know very little about their historic, deprecated characters.
  • Vasil Stanev
    Vasil Stanev Posts: 775
    edited July 2018
    Kent Lew said:
    Regarding the main topic of the post, there is this older thread on obsolete characters. (Like many threads, it wanders on and off topic.)

    There is also this classic thread about character sets and what to consider including or not.
    I have read this and forgotten it. I have a good mind to map out the most important threads into a list so this does not repeat. I started working on the list some years ago, but it's a long, tedious process, and double-checking with Typophile is sub-optimal because the threads break down more and more the further back you go in time. Threads from 2003 are unreadable, often abruptly ending in the middle of a word before an apostrophe.

    My dream is that some sunny day I will be able to compile a comprehensive pro bono database of all Latin, Cyrillic and Greek glyphs, Unicode codepoint by Unicode codepoint. That would be the ultimate reference and would save us and newcomers a ton of time and sweat. It is not so much the major rules like "start with the H and O", but the special cases that need attention. Same as with law and jurisprudence, in my opinion. But there is always something coming up and I have to make a living like everybody else.
  • I have read this and forgotten it. I have a good mind to map out the most important threads into a list so this does not repeat. I have started working on the list some years ago but it's a long, tedious process
    Maybe this is too meta, but would it not make sense to create a thread on typedrawers where all/many major/interesting threads are listed and categorized? Many folks here seem to just remember age-old threads, but I for one completely forget them.
  • … would it not make sense to create a thread on typedrawers where all/many major/interesting threads are listed and categorized?
    I had the same thought recently.
  • Ray Larabie
    Ray Larabie Posts: 1,436
    Here's a spreadsheet I made with some Unicode ranges. It's an ODS spreadsheet so it'll work in LibreOffice, OpenOffice etc. It requires Liberation Sans. The idea was to flag each character with a color by changing the cell background. Instead of simply indicating "you don't need these glyphs", the flag color can guide a type designer working on a certain type of project. Here are some of the flags I'd planned to include:
    • do not include: genuine deprecated characters or stupid characters included in Unicode that nobody ever used like ₯. Not-recommended characters like Ldot.
    • historical characters
    • uncommon characters. These are characters which are still technically in use for their respective languages, but for languages with under 10000* readers and/or which have few or no websites set in that language
    • scholarly: common IPA characters that you might see on Wikipedia. Common historical characters like long s. Ring acutes. Imagine someone is making a typeface that's unlikely to be used to set a textbook...these could be omitted.
    • uncommon mathematical symbols like ⌀ (diameter) that are likely to be used outside of a math textbook
    • heavy mathematical (maybe overlaps with scholarly?)
    • text-only: characters which aren't required for a pure display typeface. Examples: paragraph and section symbols
    Dropbox link

    There are no colors because I had to put the project on hold a few years ago. I hope this spreadsheet will save you a few weeks of research.

    * It's hard to draw the line on whether or not a language is "worth supporting". I think if you're making an OS font, you'll want to include everything. But sometimes I think it's a statistical impossibility that some glyphs will ever be used. There are orthographic representations of spoken languages or non-Latin languages which aren't in common use. For example, Pan Nigerian and Pinyin would fit with scholarly. Maybe there should be a flag for "disputed" so people can read an explanation and make their own decision.
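The flag scheme described in the post above could be sketched roughly like this. It is only an illustration: the names `Flag`, `FLAGS` and `glyphs_to_draw` are hypothetical, the flag list follows the post, and the handful of entries are just the examples mentioned in this thread, not a researched classification.

```python
from enum import Enum


class Flag(Enum):
    """Flags from the post; the names are made up for this sketch."""
    DO_NOT_INCLUDE = "do not include"
    HISTORICAL = "historical"
    UNCOMMON = "uncommon"
    SCHOLARLY = "scholarly"
    UNCOMMON_MATH = "uncommon mathematical"
    HEAVY_MATH = "heavy mathematical"
    TEXT_ONLY = "text-only"
    DISPUTED = "disputed"


# Illustrative entries only, using examples named in the thread.
FLAGS = {
    0x20AF: Flag.DO_NOT_INCLUDE,  # ₯ drachma sign
    0x013F: Flag.DO_NOT_INCLUDE,  # Ŀ Ldot
    0x0463: Flag.HISTORICAL,      # ѣ yat
    0x017F: Flag.SCHOLARLY,       # ſ long s
    0x2300: Flag.UNCOMMON_MATH,   # ⌀ diameter sign
    0x00B6: Flag.TEXT_ONLY,       # ¶ paragraph symbol
    0x00A7: Flag.TEXT_ONLY,       # § section symbol
}


def glyphs_to_draw(candidates, skip):
    """Return the candidate codepoints whose flag is not in `skip`.

    Codepoints without any flag are always kept.
    """
    return [cp for cp in candidates if FLAGS.get(cp) not in skip]


# A pure display face might skip deprecated and text-only characters.
display_set = glyphs_to_draw(
    [0x0041, 0x20AF, 0x00B6, 0x017F],
    {Flag.DO_NOT_INCLUDE, Flag.TEXT_ONLY},
)
```

A per-project `skip` set keeps the decision with the designer, which seems closer to the intent of the flags than a single global "omit" list.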
  • The only characters I think are truly useless in most fonts are the summation, product, and integral symbols unless your font comes in a very wide range of optical sizes. Even then, though, anyone actually typesetting these is either (a) a masochist or (b) using specialized software which likely supplies its own fonts for this purpose anyways.
  • notdef
    notdef Posts: 168
    ¦ <–––– 
  • Christian Thalmann
    Christian Thalmann Posts: 1,988
    edited July 2018
    The only characters I think are truly useless in most fonts are the summation, product, and integral symbols unless your font comes in a very wide range of optical sizes. Even then, though, anyone actually typesetting these is either (a) a masochist or (b) using specialized software which likely supplies its own fonts for this purpose anyways.
    I can certainly see myself using ∑ in an e-mail to students, but it's admittedly an extreme fringe case. :grimace:

    (That was supposed to be the sum symbol. Apparently, it's not in the forum font...)
  • I can certainly see myself using ∑ in an e-mail to students, but it's admittedly an extreme fringe case. :grimace:

    And here Christian proves me wrong with a counterexample to my claim which, I think, proves the futility of attempting to identify which glyphs will never be used.

    André
  • Ray Larabie
    Ray Larabie Posts: 1,436
    edited July 2018
    And here Christian proves me wrong with a counterexample to my claim which, I think, proves the futility of attempting to identify which glyphs will never be used
    A typeface used in a forum about typefaces doesn't have the same glyph requirements* as a display font or a mathematical font. What doesn't exist is a simple guide for type designers to let them make an informed choice. Every font has a glyph limit; everyone draws the line somewhere. Unicode ranges aren't an efficient way to decide what to include. Even though I've researched every Cyrillic glyph, I can't remember off-hand which ones are in current use and which ones are for biblical transliteration or historical use. That lack of a simple guide encourages people to stop at Unicode range borders, which is a shame. Only 37 of the 160 characters in the Cyrillic 0460-04FF range are in current use, and all are simple variations that would probably take most type designers a few minutes to build. The IPA Extensions range includes 2 characters in non-scholarly use, which have capital counterparts outside of that range. There are 160 characters in 1E00-1E9F but only 29 in current use. Which 4 characters (apart from the ones at the very end) in the Vietnamese range aren't required? Which diacriticals are needed for language support and which are only for scholarly/historical use? The UCAS block contains 640 characters, but 169 of those characters, many of which are a real pain to draw, haven't been in use since the 1960s.

    *off topic but I really don't think this site should be using embedded fonts of any kind. It's an impediment to the discussion of type. Give me OS core fonts, please.
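As a quick check on the arithmetic in the post above, the range sizes can be computed from the endpoints. This is only a sketch: the in-use counts are copied from the post and not verified here, the helper names are made up, and the UCAS endpoints (1400-167F) are added from the Unicode block list since the post does not state them.

```python
# (first, last, count in current use according to the post)
RANGES = {
    "Cyrillic 0460-04FF": (0x0460, 0x04FF, 37),
    "1E00-1E9F": (0x1E00, 0x1E9F, 29),
    # Endpoints are an addition; the post gives 640 total and 169 unused.
    "UCAS 1400-167F": (0x1400, 0x167F, 640 - 169),
}


def range_size(first, last):
    """Number of codepoints in an inclusive range."""
    return last - first + 1


def in_use_share(first, last, in_use):
    """Fraction of the range that is in current use."""
    return in_use / range_size(first, last)


for name, (first, last, in_use) in RANGES.items():
    size = range_size(first, last)
    share = in_use_share(first, last, in_use)
    print(f"{name}: {in_use} of {size} in current use ({share:.0%})")
```

The sizes agree with the figures quoted in the post (160, 160 and 640), so stopping at a range border means drawing far more than the in-use subset in the first two cases.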
  • Oddly enough, just as I was reading this thread, a link with the following title showed up on my Facebook feed:

    I ღ Cats


    So apparently you can't even make Latin fonts without throwing in a few Mkhedruli glyphs just in case people need them as emojis.
  • Jens Kutilek
    Jens Kutilek Posts: 364
    That’s why font fallbacks were invented. I wonder why it doesn’t work for the forum font.
  • Ray Larabie
    Ray Larabie Posts: 1,436
    I see fallback fonts for obscure characters in this thread in Windows Chrome. When I look at the classic thread, they're often missing.
  • Paul Miller
    Paul Miller Posts: 273
    edited July 2018
    When I did the Kelvinch font I didn't know what to put in and what to leave out. It started out quite a modest size, but uncertainty begat mission creep and so I ended up putting everything and the kitchen sink in there.

    Just because I didn't want to leave out anything which was important.

    It was a lot of extra work but it didn't turn out too bad in the end.
  • Max Phillips
    Max Phillips Posts: 474
    Aside from basic operators (+−×÷±=<>≤≥≠), I don't understand why you'd include math symbols in a non-specialist face, since anyone setting serious math will need a specialist typeface anyway.

    And has anyone ever seen an f_f_j ligature in the wild? A Google search for *ffj* returns only the acronyms for Fermented Fruit Juice and Full-Fledged Jerk.
  • Kent Lew
    Kent Lew Posts: 944
    edited July 2018
    Max — I believe this is a combination that you will find in Maltese. For example, in the second bio on this page, the last sentence of the first paragraph starts with the word Irreffja.

    And the headline on this page features the word jirreffja.

    Pretty uncommon, but you asked . . .
  • ffj can pop up in languages where compound words combine into one. Examples: hoffjegermester (Norwegian), Stoffjacke (German).
  • Stoffjacke (German).
    Not in this case. In German a ligature must not cross the border between compound words, so here it would be Stoff with a f_f and then a normal j.
    I wonder about other ligatures which some providers include: f_b, f_k for example. Are they used in any language?
  • Sander Pedersen
    Sander Pedersen Posts: 33
    edited July 2018
    Andreas Stötzner said:
    Not in this case. In German a ligature must not cross the border between compound words, so here it would be Stoff with a f_f and then a normal j.
    Now I seem to recall that rule in an old Norwegian handbook on hand setting type. Doesn't that cause colliding glyphs in some typefaces?
  • Doesn't that cause colliding glyphs in some typefaces?

    Not if you have proper kerning ;)

  • Kent Lew
    Kent Lew Posts: 944
    I wonder about other ligatures which some providers include: f_b, f_k for example. Are they used in any language?
    The classic English examples are surfboard and Kafka.

    Note that English does not have any official prohibition against ligatures across the boundaries of compounds like surfboard or offhand.

  • Michel Boyer
    Michel Boyer Posts: 120
    edited July 2018
    To prevent the ligature in Stoffjacke in XeLaTeX, a zero-width non-joiner (U+200C) between f and j appears to be a solution (I prefer using an explicit \kern, but I guess that's not the way to go with InDesign). But then, you kern what against what?
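For anyone preparing such text programmatically, inserting the non-joiner at a compound boundary might look like the sketch below. The helper name `break_ligature` is made up, and the boundary position has to be supplied by the caller. Whether the ligature is actually suppressed depends on the shaping engine and the font, so treat this as an assumption to test rather than a guarantee.

```python
ZWNJ = "\u200c"  # U+200C ZERO WIDTH NON-JOINER


def break_ligature(word, boundary):
    """Insert a ZWNJ at index `boundary` of `word`.

    The intent is that glyphs on either side of the boundary are not
    ligated, while ligatures within each part remain possible.
    """
    return word[:boundary] + ZWNJ + word[boundary:]


# Stoff|jacke: the f_f inside "Stoff" can still ligate, the j stays separate.
marked = break_ligature("Stoffjacke", len("Stoff"))
```

This leaves the kerning question from the post open, since the non-joiner now sits between the f and the j.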