How to display the evolution of language using Unicode?
Oxford University Press is publishing a table with the evolution of the English alphabet, in the Oxford Companion to the English Language. The editor is keen to use Unicode values, and accepts there may be some deviation from the previous edition.
We may either:
Would anyone have any insight (e.g. specific glyphs / fonts), corrections or amendments?
Your posting is missing a few words and is generally unclear.
Do you mean:
Again, I don’t know if you mean “use Unicode values in creation of the document” or “list for the actual reader what the relevant Unicode values are.”
That needs to be completely reworded.
Your PDF should not be three PDFs, and you should have cropped unused pages out of the scans. I fixed that for you and created a new PDF with bookmarks (attached).
You are dealing with a serious and potentially irreconcilable set of conflations between underlying character and presented glyph.
I am sensitive to Natalia Ilyin’s observation that design history teaches us that event X happened followed by event Y. If anything happened in between, well, that’s lost to history and obviously didn’t matter. Further, every design history always shows both X and Y in that order. Everything’s already laid out in nice discrete steps that everyone “agrees on.”
Are you doing the same thing here by recapitulating the same kinds of tables of images of the so-called evolution of “the” alphabet that I have been reading for 40 years?
What if you’re making mistakes large and small that have gone unnoticed because everyone, literally everyone, has just copied and pasted previous work on the subject?
The awkward position is that I have a week to work on this table, which would benefit from several dedicated months of investigation, or ideally a lifetime of scholarly research. I would rather not be an authority on the evolution of the English language, as I do not believe that I am qualified.
- "In the print and electronic production of our publication, what steps would we take to make sure we were using correct Unicode values wherever possible? And in what cases would those values not exist, hence using them wouldn’t be possible?"
- “use Unicode values in creation of the document”
- It's a draft.
- The scanned lettering is from the original version, 1992; this is already published. The first draft using Noto and Pfeffer-Mediaeval is a very rough draft, not a final version. It serves as an educational tool, and as you've rightly assessed – there's a great deal of improvement that can be made on the 1992 version. However, in managing the expectations of our editors, it's important to find a middle ground.
- I can agree with what you say, but I cannot present that to our third-party editors, or even internal staff. It would come across as abrasive. How would you present that discussion to them?
Our editor writes…
"I’ve been looking at the print version of the old edition and I’m not convinced a scan would be a good idea – it does look very old-fashioned and the type quality is not high."
So, this will go ahead… how can it be improved?
In illustrating palaeography, I think it is almost always preferable to use actual historical exemplars (either images cropped from manuscripts, or the same forms rewritten by a competent scribe) rather than fonts, which are seldom accurate representations.
Thanks again for your help, I've passed this across and they've taken it on board. We'll try to progress with a scan of the original or remove the table from the text completely.
Good suggestion, thanks for the help.
If one is setting a new English-language document, “normal” (whatever that means) text will always be Unicode encoded these days. The question is more about the more obscure historical characters and the preservation of variant shapes.
Where exactly one draws the line between “normal” text and “other” text that needs a different treatment, and what that treatment should be or needs to be, are the two intertwined questions.
Many obscure shapes will either have a unique Unicode value already, or be among several shapes that seem like variants of the same Unicode character. Preserving such distinctions can be done in numerous possible ways: using different fonts; using variant glyphs in a single font (if such a font exists); or just using images/scans.
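As a rough illustration of the first case (shapes that already have their own Unicode value), here is a minimal Python sketch using the standard library's unicodedata module. The choice of letters is my own assumption about what such a table might include, not a list taken from the OUP table, and the helper name describe is made up for this example.

```python
import unicodedata

# Illustrative selection only (an assumption, not the contents of the table):
# thorn, eth, wynn, yogh, ash, long s.
LETTERS = ["\u00fe", "\u00f0", "\u01bf", "\u021d", "\u00e6", "\u017f"]


def describe(ch: str) -> str:
    """Return the code point and official Unicode name of one character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"


if __name__ == "__main__":
    for ch in LETTERS:
        print(ch, describe(ch))
```

A lookup like this only tells you that a character is encoded. It says nothing about which historical shape a given font will draw for it, which is the part that still needs fonts, variant glyphs or scans.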
In a digital representation, images/scans can also be given an alt-text encoding, if that seems appropriate.
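For the digital edition, one way to picture this is to pair each scanned glyph with the Unicode character it stands for. The sketch below is only an assumption about how that could look in HTML output; the function name img_with_alt and the file name are hypothetical, and a real production workflow may handle this quite differently.

```python
from html import escape


def img_with_alt(src: str, alt: str) -> str:
    """Build an HTML img tag whose alt text carries the encoded character."""
    # escape(..., quote=True) protects quotes and angle brackets in attributes.
    return f'<img src="{escape(src, quote=True)}" alt="{escape(alt, quote=True)}">'


if __name__ == "__main__":
    # Hypothetical file name for a scanned thorn; the alt text is U+00FE.
    print(img_with_alt("thorn-scan.png", "\u00fe"))
```

The reader sees the scan, while search, copy and paste, and screen readers get the encoded character.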
Given the time frame and the nature of the table, John suggests that using images/scans or a single set of custom-calligraphy glyphs is likely the way to go.
a) The mess of this discussion starts with the strange mixing up of the terms language and alphabet. This seems plain thoughtless to me. – what Joe said.
b) Katy, if you are speaking as a representative of OUP, if I were you, I would try to phrase my postings a little bit more carefully.
c) I can’t see much sense in beginning a history of the English alphabet with Phoenician.
d) Much research has already been done on special Latin characters, of the Middle Ages in particular. The extensive specific works of MUFI address exactly such editorial needs. It even went into the encoding (official UCS and PUA) of several variant glyph characters which are relevant here. And the fonts MUFI is providing are known, I suppose.
e) If a category of uncials is to be incorporated in the system of those tables and you are going to create them based on typesetting (rather than scans), you can solve this only on the font level. Not by choosing codepoints (there is no Unicode-middlecase-encoding).
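Regarding the PUA encoding mentioned in d): text that relies on Private Use Area code points only displays correctly with a font that agrees on those assignments, so it can be useful to know where such characters occur. Here is a minimal Python sketch that lists them. The function name is hypothetical and the sample string uses an arbitrary PUA code point, not an actual MUFI assignment.

```python
import unicodedata


def private_use_chars(text: str) -> list[str]:
    """List the code points in text that fall in a Private Use Area."""
    # General category "Co" is Unicode's label for private-use characters.
    return [f"U+{ord(ch):04X}" for ch in text if unicodedata.category(ch) == "Co"]


if __name__ == "__main__":
    # U+E000 is just the first PUA code point, chosen arbitrarily here.
    sample = "\u00fe\ue000\u00f0"
    print(private_use_chars(sample))
```

Anything this reports depends on a specific font, which fits the point in e) that such distinctions are solved at the font level rather than by choosing codepoints.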
a) This ambiguity could be the result of different educational backgrounds. The terms alphabet and language are separate, but can be used in the above context when it is understood that language is a vehicle of communication that subsumes all channels, incl. visual. Design communities talk frequently of visual language.
b) When I visit this site, I am not a representative of OUP. Or, at least, that point is unclear. Thanks all the same.
c) As a separate exercise, where would you start it from? For this title, there was an expectation to follow the table in the previous edition.
d) Yes, and the MUFI fonts are known. If required, the mapping of those MUFI characters as an alphabetic evolution would be best addressed by editorial. In the absence of that editorial skillset, and without a facing illustration > transcription, the illustration has worked well.
e) Yes, that's what I assumed for uncials. Thanks for clarifying.
Maybe the original poster has revised her original post, but I'm puzzled by the comments about how confused and unclear it is supposed to be.
Given that eth and thorn and those other guys have well-known Unicode values, though, I can indeed suspect that some of the comments here are right on one important point: that the table is at such a level of detail that a scan as an image instead of an attempt to typeset its contents is indeed appropriate.
I assume this didn’t work out well in the end.
We have expertise publishers do not. Our expertise is what’s needed, not whatever assets publishers think they have. “Showing up with incomprehensible postings one week before press time” has now been proven not to work. People generally do not learn from what is proven not to work (Typedrawers never has, for example), which means they never learn.
The smartest people frequently make the dumbest mistakes.
Further, if you can’t show your boss an honest list of everything that’s wrong with your document, publish the document you already have and leave us alone, please.
It worked fine: we scanned the previous edition and applied the current text design. Thanks to everyone for their generous help!
Please keep coming back for help. It's all interesting to me and I think that's what this place is for.
Oh, wait, they also "applied the new text design", so they actually did typeset most of the book so as to be able to come out with an updated new edition. They used the scanned pieces that weren't worth the trouble of trying to typeset again with new digital tools that weren't yet ready.
Maybe they should have put more effort into developing a new extended font with historical English characters. There are projects out there that have enriched typography, such as that described in Dr. Richard S. Cook's paper, "The Extreme of Typographic Complexity: Character Set Issues Relating to Computerization of the Eastern Hàn Chinese Lexicon Shuō Wén Jiě Zì."... but not everyone is prepared to attempt to scale such heights.
Given the time frame and the nature of the table, we opted for the scans. This is a change from our starting point, clumsily using Noto and Pfeffer-Mediaeval (which display differently to the 1992 original edition).
- single set of custom-calligraphy
- clickable/selectable Unicode text, as a separate layer
- MUFI with font level solutions to uncials.
Note that countless more issues are solved in-house than posted here. I hesitate to post our challenges, as it does feel like cheating. Yet, I've learnt a lot by doing so. I hope that a larger project would acknowledge the need for specialized expertise and extended time frames.