Making Windows Fonts Do Something They Shouldn't

Before OpenType, and before Windows had Unicode support, anyone who wanted to type text in Georgian, Greek, or Armenian had to use a special font that displayed the desired glyphs at character codes within ASCII, or at least ISO 8859-1.

Although that's a highly deprecated practice, it is awkward to change keyboards in Windows. Furthermore, I'm not aware of a chess diagram keyboard, for example.

It has occurred to me, therefore, that some people might find it very useful to be able to type special characters by the following process:

Have 7-bit or 8-bit fonts that replace ordinary characters with characters from the more distant parts of Unicode, but that also store, for each character, the Unicode code point of the character whose glyph is actually being displayed...

and provide in word processors, including simple editors with text styles (e.g. WordPad and the like), a "Paste as Unicode" function. (Of course, that pastes text in a default font; one would then have to change it to the Unicode font one wants.)

Comments

  • I must confess that I do not understand this proposal.
  • John Hudson, Posts: 3,190
    What do you mean by '7-bit or 8-bit fonts'?

    "...it is awkward to change keyboards in Windows."

    Not really. The process is similar to that for other platforms. If you want to make it easier to provide a custom keyboard, e.g. for entering chess piece symbols, I recommend Keyman.
  • Bhikkhu Pesala, Posts: 210
    edited December 2018
    I use an Ornaments feature (ornm) to type Chess Symbols in my fonts.

    By 7-bit fonts, I think John Savard means fonts with 128 characters (ASCII), and by 8-bit fonts I guess he means 256 characters (ANSI fonts).
  • John Hudson, Posts: 3,190
    The trouble with using ASCII or ANSI codes with an OTL feature such as ornm is that what is stored in the text is those codes, not the Unicode chess symbol characters. John seems to be suggesting some kind of dual encoding where '7-bit or 8-bit fonts' also map the glyphs to 'Unicode codes', which would then somehow be accessible via a special paste function that would somehow know that the user actually wants the Unicode chess characters and not the ASCII or ANSI characters (which are, of course, also part of Unicode). To do that, you need some kind of defined codepage relationship, such as the ones systems use to map from legacy 8-bit encodings for various scripts to Unicode.
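As a minimal illustration of such a codepage relationship, the sketch below uses Python's built-in codecs (Python is chosen here only for illustration; nothing in the thread specifies a language). The same three bytes decode to accented Latin letters under ISO 8859-1 and to Greek letters under ISO 8859-7, which is the kind of defined mapping a system needs in order to get from a legacy 8-bit encoding to Unicode.

```python
# Three byte values taken from the upper half of an 8-bit encoding.
raw = bytes([0xE1, 0xE2, 0xE3])

# The bytes carry no script information; the codepage supplies it.
as_latin1 = raw.decode("iso8859_1")  # interpreted as Western European text
as_greek = raw.decode("iso8859_7")   # interpreted as Greek text

# Show the resulting Unicode code points for each interpretation.
print([f"U+{ord(ch):04X}" for ch in as_latin1])
print([f"U+{ord(ch):04X}" for ch in as_greek])
```

A "Paste as Unicode" function would need an equivalent agreed table for chess symbols, and, as far as this discussion goes, none is defined.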
  • What are “ordinary characters”? Oh, you mean characters in YOUR language. Which are ordinary for you, but not for Chinese or Russian or Arabic or... well, you see the point. So which characters somebody would want to use for the plain-text would be region-specific.

    One way to do what you want: at the font level, encode the glyphs to multiple characters. That is easily done. However, it does not really tell anyone which are the primary characters, and which are a bogus convenience character mapping.

    But the other problem is, what mapping of special characters to “ordinary characters” do you use? How do you share those mappings? This proposal involves re-inventing codepages all over again.

    Keyboards are in fact the elegant solution to this. I am not convinced that switching a software keyboard is harder than the proposed system—I find it fairly easy. If it *is* harder, then the solution is to make it easier to switch software keyboards.

    Don’t get me wrong. If I invent a font of wacky arbitrary symbols that don’t have legit encodings, aimed at a relatively lay audience, I will encode them to the same Unicode slots as ASCII without worrying too much about it. I have done this.

    But if you want to make it easier to access chess symbols, darn, just make a chess keyboard! Way easier than anything else.
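The double-encoding idea mentioned above can be sketched with a toy character map. This is a simplified stand-in, assuming a plain Python dict and hypothetical glyph names such as whiteKing; it is not the binary cmap table of any real font format.

```python
# Toy stand-in for a font's Unicode cmap: code point -> glyph name.
# Glyph names are hypothetical and used only for illustration.
CMAP = {
    0x004B: "whiteKing",   # 'K', the convenience mapping
    0x2654: "whiteKing",   # WHITE CHESS KING, the proper encoding
    0x0051: "whiteQueen",  # 'Q', the convenience mapping
    0x2655: "whiteQueen",  # WHITE CHESS QUEEN, the proper encoding
}

def render(text):
    """Return the glyph names a renderer would pick for each character."""
    return [CMAP.get(ord(ch), ".notdef") for ch in text]

def code_points_for(glyph_name):
    """Return every code point mapped to the given glyph, in sorted order."""
    return sorted(cp for cp, name in CMAP.items() if name == glyph_name)

# Both inputs display the same glyph, but the stored text differs.
print(render("K"), render("\u2654"))
print([hex(cp) for cp in code_points_for("whiteKing")])
```

Going from the glyph back to its code points yields both entries with equal standing, so nothing in the map says which one is the primary character and which is the convenience mapping.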
  • John Savard, Posts: 1,126
    I'm sorry if my suggestion, which indeed goes against the philosophy of standards, was hard to understand. Basically, in the old days of computers, when people used Windows 3.1, they didn't have Unicode.

    So they used fonts for ASCII or ISO 8859-1, but the glyphs were replaced by glyphs for other characters. I was proposing that similar fonts be made now, but with every character replaced by a different character having embedded in the font the Unicode value for the character to which the glyph properly belongs.

    And then a special paste function would read those values.
  • Theunis de Jong, Posts: 112
    edited December 2018
    That'd basically be the OpenType Ornament ornm tag, then.
  • Sort of, yes!

    I am struggling to understand what the real advantage is to this. Because we *do* have Unicode now. Plus, we can even encode a single glyph to multiple codepoints in existing formats. So you can get very close to this capability without anything new.

    If this proposal is implemented in a way that requires new kinds of data in fonts, I just can’t imagine it getting the traction it would need, because the costs are high and the benefits are very small.
  • John Hudson, Posts: 3,190
    edited December 2018
    "So they used fonts for ASCII or ISO 8859-1, but the glyphs were replaced by glyphs for other characters. I was proposing that similar fonts be made now, but with every character replaced by a different character having embedded in the font the Unicode value for the character to which the glyph properly belongs."
    You seem to be thinking that there is a way to make a font that is limited to an 8-bit decimal encoding of the kind to which e.g. PS Type 1 fonts were limited, but that also has a Unicode encoding for characters beyond that limit. The point that Thomas and I are trying to make is that Unicode is now the default encoding in pretty much any text API: no one is handling text using 8-bit decimal encodings any more. So as Thomas suggests, the only way to have chess symbol glyphs mapped to both ANSI-compatible codes and the Unicode codes to which those symbols are assigned would be to double-encode the glyphs. But you'd still be double-encoding them in a Unicode cmap, and only the entry method is going to determine which code is entered in the text string. That's why the keyboard is the proper place to handle this.

    "And then a special paste function would read those values."

    How would it know which values? It's all Unicode.

    You could write a text edit macro that converts a text input using characters from e.g. the US English keyboard to Unicode chess symbol characters, but I don't see that having to run such a macro every time you want to create a document including chess symbols is any easier than switching to a custom keyboard.

    Heck, if you want people to be able to avoid switching keyboards on their system, make a web virtual keyboard. It's not like people will be composing lengthy texts composed of chess symbols such that they need to be able to touch-type them. A webpage with a simple text field and clickable chess symbols to input would work fine, and could be used in concert with regular typing from the keyboard for non-chess characters.
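The text-conversion macro mentioned above could look something like the following minimal sketch. The letter assignments are an assumption made for illustration (uppercase for white pieces, lowercase for black), and the function name is hypothetical; the thread does not define any such mapping.

```python
# Hypothetical mapping from US English keyboard letters to the Unicode
# chess symbols (U+2654 to U+265F). Uppercase = white, lowercase = black.
ASCII_TO_CHESS = str.maketrans({
    "K": "\u2654", "Q": "\u2655", "R": "\u2656",
    "B": "\u2657", "N": "\u2658", "P": "\u2659",
    "k": "\u265A", "q": "\u265B", "r": "\u265C",
    "b": "\u265D", "n": "\u265E", "p": "\u265F",
})

def to_chess_symbols(selection: str) -> str:
    """Convert a selected run of piece letters to Unicode chess symbols.

    Characters without an entry in the mapping pass through unchanged.
    """
    return selection.translate(ASCII_TO_CHESS)

# The macro has to be applied to a deliberately selected run of text.
print(to_chess_symbols("KQ kq"))
```

Because the input letters are themselves ordinary Unicode characters, the function cannot tell a piece letter from normal prose. The user has to select the run to convert each time, which is the extra step being weighed against switching keyboards.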
  • John Hudson, Posts: 3,190
    "That'd basically be the OpenType Ornament ornm tag, then."
    No, because John S does want the correct Unicode codepoints for the chess symbols in the text string eventually. He's just trying to come up with a way to get them there that doesn't involve someone having to switch to a custom keyboard.

    Maybe I've been dealing with multiple scripts and languages, and specialist publishing, for too long, but switching keyboards really doesn't seem to me a difficulty. Millions of people around the world do it several times each day.