Table of the Evolution of English Language

KP Mawhood · March 2018

How to display the evolution of language using Unicode?

Oxford University Press is publishing a table with the evolution of the English alphabet, in the Oxford Companion to the English Language. The editor is keen to use Unicode values, and accepts there may be some deviation from the previous edition.

We may either:

Scan of the original;
Unicode values in modern font formats.

Would anyone have any insight (e.g. specific glyphs / fonts), corrections or amends?

joeclark · March 2018

Your posting is missing a few words and is generally unclear.

How to display the evolution of language using Unicode?

Do you mean:
- In the print and electronic production of our publication, what steps would we take to make sure we were using correct Unicode values wherever possible? And in what cases would those values not exist, hence using them wouldn’t be possible?
  or
- If we wished to publish a list of Unicode values for these old and current characters, which exact values would we use, and what errors would we need to be alert not to make?

Oxford University Press is publishing a table with the evolution of the English alphabet in The Oxford Companion to the English Language. The editor is keen to use Unicode values, and accepts there may be some deviation from the previous edition.

Again, I don’t know if you mean “use Unicode values in creation of the document” or “list for the actual reader what the relevant Unicode values are.”

We may either:
- Scan of the original;
- Unicode values in modern font formats.

That needs to be completely reworded.

Your PDF should not be three PDFs, and you should have cropped unused pages out of the scans. I fixed that for you and created a new PDF with bookmarks (attached).

You are dealing with a serious and potentially irreconcilable set of confabulations between underlying character and presented glyph.
- By my count, 10 of your existing columns are hand-lettered. You aren’t using fonts there. And, as happens with novices encountering a script for the first time, the exact drawn form you see before you becomes permanently cemented in your mind as the true shape of the character. A classic example of this was the logotype for Pravda (Wikipedia image). Are the third and sixth letters the same, and do those letters always have two crossbars? (No, they’re both just А.)
  - “Current forms” as listed have the same problem and are seriously wrong in some cases as stated. Italic f does not always take a descender, for example. Italic Q can look like a 2 lots of times.
  - It’s not clear why you’re using a serif typeface (or Baskerville). That usage should not go unexamined.
  - Obliques and sloped romans seriously complicate how one would even express “current forms.”
- John Hudson will know this for sure (so might the unsympathetic Michael Everson), but are all the old forms, including variants, actually encoded in Unicode?
- Your typeset table uses pictures of the true glyphs that are impossibly small to read, hence extend beyond pointless to the counterproductive. There is no reason to make these tables so small. Stretch to six pages or more if you need to.
- For Carolingian, “Roman cursive,” and “Roman uncial” forms (I would check those last two terms), you seem to be using a font.
- Why aren’t you showing us current handwritten forms?

I am sensitive to Natalia Ilyin’s observation that design history teaches us that event X happened followed by event Y. If anything happened in between, well, that’s lost to history and obviously didn’t matter. Further, every design history always shows both X and Y in that order. Everything’s already laid out in nice discrete steps that everyone “agrees on.”

Are you doing the same thing here by recapitulating the same kinds of tables of images of the so-called evolution of “the” alphabet that I have been reading for 40 years?

What if you’re making mistakes large and small that have gone unnoticed because everyone, literally everyone, has just copied and pasted previous work on the subject?

KP Mawhood · March 2018

@joeclark

"In the print and electronic production of our publication, what steps would we take to make sure we were using correct Unicode values wherever possible? And in what cases would those values not exist, hence using them wouldn’t be possible?"
“use Unicode values in creation of the document”
Ok.
It's a draft.
The scanned lettering is from the original version, 1992; this is already published. The first draft using Noto and Pfeffer-Mediaeval is a very rough draft, not a final version. It serves as an educational tool, and as you've rightly assessed – there's a great deal of improvement that can be made on the 1992 version. However, in managing the expectations of our editors, it's important to find a middle ground.
I can agree with what you say, but I cannot present that to our third-party editors, or even internal staff. It would come across as abrasive. How would you present that discussion to them?

The awkward position is that I have a week to work on this table, which would benefit from several dedicated months of investigation, or ideally a lifetime of scholarly research. I would rather not be an authority on the evolution of the English language, as I do not believe that I am qualified.

Our editor writes…

"I’ve been looking at the print version of the old edition and I’m not convinced a scan would be a good idea – it does look very old-fashioned and the type quality is not high.

I have spoken to Lise [current editor] once more, and she feels that having a clear format that will not be distorted if enlarged is better in terms of quality than a less than ideal scan. So she would prefer we go with Katy’s table. I know I have been vacillating on this, and I apologise (I think it’s because I have felt supremely unqualified in this area!). But I’m making a final decision now to go with Katy’s table."

So, this will go ahead… how can it be improved?

John Hudson · March 2018

My tuppence:

In illustrating palaeography, I think it is almost always preferable to use actual historical exemplars (either images cropped from manuscripts, or the same forms rewritten by a competent scribe) rather than fonts, which are seldom accurate representations.

Nick Shinn · March 2018

Isn’t it possible to have historical scans in a document, and also clickable/selectable Unicode text in the same position, as a separate layer?

KP Mawhood · March 2018

@John Hudson
Thanks again for your help, I've passed this across and they've taken it on board. We'll try to progress with a scan of the original or remove the table from the text completely.

@Nick Shinn
Good suggestion, thanks for the help.

Thomas Phinney · March 2018

I am going to take the liberty of re-phrasing the discussion, just for the benefit of other readers. Nothing here that Katy, John and Nick don’t know.

If one is setting a new English-language document, “normal” (whatever that means) text will always be Unicode encoded these days. The question is more about the more obscure historical characters and the preservation of variant shapes.

Where exactly one draws the line between “normal” text and “other” text that needs a different treatment, and what that treatment should be or needs to be, are the two intertwined questions.

Many obscure shapes will either have a unique Unicode value already, or be among several shapes that seem like variants of the same Unicode. Preserving such distinctions can be done in numerous possible ways: using different fonts; using variant glyphs in a single font (if such exists); or just using images/scans.

In a digital representation, images/scans can also be given an alt-text encoding, if that seems appropriate.

Given the time frame and the nature of the table, John suggests that using images/scans or a single set of custom-calligraphy glyphs is likely the way to go.
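
As a rough illustration of the alt-text idea above, here is a minimal Python sketch that pairs a scanned glyph image with its Unicode character in the image's alt text. It assumes HTML output, which nobody in the thread specified; the function name `img_with_alt` and the file path are hypothetical.

```python
import html
import unicodedata


def img_with_alt(src: str, ch: str) -> str:
    """Build an HTML <img> tag whose alt text carries the encoded character.

    `src` is a hypothetical path to a cropped scan; `ch` is the single
    Unicode character that the scanned glyph is taken to represent.
    """
    # The alt text gives the character itself plus its code point and name.
    alt = f"{ch} (U+{ord(ch):04X} {unicodedata.name(ch)})"
    return (
        f'<img src="{html.escape(src, quote=True)}" '
        f'alt="{html.escape(alt, quote=True)}">'
    )


# Example: a scan of a thorn, labelled with U+00FE.
print(img_with_alt("scans/thorn.png", "\u00fe"))
```

The scan stays the visual authority, while the alt text carries the searchable character. Which code point best represents a given historical shape remains an editorial decision.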

Andreas Stötzner · March 2018

For what it is worth:
a) The mess of this discussion starts with the strange mixing up of the terms language and alphabet. This seems plain thoughtless to me. – what Joe said.
b) Katy, if you are speaking as a representative of OUP, if I were you, I would try to phrase my postings a little more carefully.
c) I can’t see much sense in beginning a history of the English alphabet with Phoenician.
d) Much research has already been done on special Latin characters, those of the Middle Ages in particular. The extensive work of MUFI addresses exactly such editorial needs. It even extended to the encoding (official UCS and PUA) of several variant glyph characters that are relevant here. And the fonts MUFI provides are known, I suppose.
e) If a category of uncials is to be incorporated into the system of those tables and you are going to create them by typesetting (rather than from scans), you can solve this only at the font level, not by choosing code points (there is no Unicode middle-case encoding).
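
To make the UCS versus PUA distinction in point d) concrete, here is a minimal Python sketch that flags which characters in a string sit in a Private Use Area and therefore only display as intended with a font that defines them. The helper name `private_use_chars` is hypothetical, and U+E000 in the sample is an arbitrary private-use code point, not a claim about any particular MUFI assignment.

```python
import unicodedata
from typing import List


def private_use_chars(text: str) -> List[str]:
    """Return the code points ('U+XXXX') of private-use characters in text.

    Unicode gives every Private Use Area code point the general
    category 'Co', so that category is used as the test here.
    """
    return [f"U+{ord(ch):04X}" for ch in text if unicodedata.category(ch) == "Co"]


# Sample: thorn (U+00FE), an arbitrary PUA code point (U+E000),
# and thorn with stroke (U+A765), which is a regular encoded letter.
sample = "\u00fe\ue000\ua765"
print(private_use_chars(sample))
```

Anything this check reports depends on a specific font or a private agreement such as MUFI's, whereas the other characters can be exchanged as ordinary Unicode text.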

KP Mawhood · March 2018

@Andreas Stötzner

a) This ambiguity could be the result of different educational backgrounds. The terms alphabet and language are separate, but can be used in the above context when it is understood that language is a vehicle of communication that subsumes all channels, incl. visual. Design communities talk frequently of visual language.
b) When I visit this site, I am not a representative of OUP. Or, at least, that point is unclear. Thanks all the same.

c) As a separate exercise, where would you start it from? For this title, there was an expectation to follow the table in the previous edition.
d) Yes, and the MUFI fonts are known. If required, the mapping of those MUFI characters as an alphabetic evolution would be best addressed by editorial. In absence of that editorial skillset, and without a facing illustration > transcription, the illustration has worked well.
e) Yes, that's what I assumed for uncials. Thanks for clarifying.

John Savard · March 2018

The evolution of the English alphabet is one aspect of the evolution of the English written language, at least. And I find nothing unclear in that a table, typeset with hot metal, now has to have the Unicode values of the characters in it located in order that it can be again typeset - with updates and revisions based on new research - for a new digitally typeset edition.

Maybe the original poster has revised her original post, but I'm puzzled by the comments about how confused and unclear it is supposed to be.

Given that eth and thorn and those other guys have well-known Unicode values, though, I can indeed suspect that some of the comments here are right on one important point: that the table is at such a level of detail that a scan as an image instead of an attempt to typeset its contents is indeed appropriate.
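
For readers who want to check such values themselves, here is a minimal Python sketch using the standard `unicodedata` module to print the code point and official name of a few historical English letters. The selection of letters is illustrative only and is not taken from the table under discussion; the helper name `describe` is hypothetical.

```python
import unicodedata

# Illustrative selection of historical English letters (lowercase forms):
# thorn, eth, wynn, yogh, ash, long s.
HISTORICAL_LETTERS = ["\u00fe", "\u00f0", "\u01bf", "\u021d", "\u00e6", "\u017f"]


def describe(ch: str) -> str:
    """Return 'U+XXXX NAME' for a single character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"


for ch in HISTORICAL_LETTERS:
    print(ch, describe(ch))
```

This only confirms that a character is encoded. It says nothing about whether a given font draws it in the historical shape the table needs, which is the glyph question raised earlier in the thread.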

joeclark · April 2018

I assume this didn’t work out well in the end.

We have expertise publishers do not. Our expertise is what’s needed, not whatever assets publishers think they have. “Showing up with incomprehensible postings one week before press time” has now been proven not to work. People generally do not learn from what is proven not to work (Typedrawers never has, for example), which means they never learn.

The smartest people frequently make the dumbest mistakes.

Further, if you can’t show your boss an honest list of everything that’s wrong with your document, publish the document you already have and leave us alone, please.

KP Mawhood · April 2018

@joeclark

joeclark said:

I assume this didn’t work out well in the end…

It worked fine: we scanned the previous edition and applied the current text design. Thanks to everyone for their generous help!

Andreas Stötzner · April 2018

Katy Mawhood said:

… we scanned the previous edition …

What an achievement. Congratulations.

Ray Larabie · April 2018

@Katy Mawhood
Please keep coming back for help. It's all interesting to me and I think that's what this place is for.

John Savard · April 2018

Andreas Stötzner said:

Katy Mawhood said:

… we scanned the previous edition …

What an achievement. Congratulations.

Yes, think of all the unfortunate Linotype machine operators this heartless decision has put out of work!

Oh, wait, they also "applied the new text design", so they actually did typeset most of the book so as to be able to come out with an updated new edition. They used the scanned pieces that weren't worth the trouble of trying to typeset again with new digital tools that weren't yet ready.

Maybe they should have put more effort into developing a new extended font with historical English characters. There are projects out there that have enriched typography, such as that described in Dr. Richard S. Cook's paper, "The Extreme of Typographic Complexity: Character Set Issues Relating to Computerization of the Eastern Hàn Chinese Lexicon Shuō Wén Jiě Zì."... but not everyone is prepared to attempt to scale such heights.

Direct link, referring page.

KP Mawhood · April 2018

John Savard said:

Andreas Stötzner said:

Katy Mawhood said:

… we scanned the previous edition …

What an achievement. Congratulations.

They used the scanned pieces that weren't worth the trouble of trying to typeset again with new digital tools that weren't yet ready.

Absolutely, the scan is based on the guidance in this thread for illustrating palaeography:

images/scans
single set of custom-calligraphy
clickable/selectable Unicode text, as a separate layer
MUFI with font level solutions to uncials.

Given the time frame and the nature of the table, we opted for the scans. This is a change from our starting point, clumsily using Noto and Pfeffer-Mediaeval (which display differently to the 1992 original edition).

joeclark said:

Our expertise is what’s needed, not whatever assets publishers think they have.

Note that countless more issues are solved in-house than posted here. I hesitate to post our challenges, as it does feel like cheating. Yet, I've learnt a lot by doing so. I hope that a larger project would acknowledge the need for specialized expertise and extended time frames.
