Pictograms, Unicode, and ISO

Titus Nemeth · April 2021

I am developing a bespoke type family for a client that requires a set of fairly general pictograms to be included. There are toilets, elevator, stairs, parking, etc.
I didn't think much about it at first, assuming that such basic pictograms, extensively documented by ISO, would certainly be encoded in Unicode...
But it turns out some are not, even though they are probably amongst the most widely used signs you could think of. 'First aid', for example, does not seem to have a corresponding Unicode value, and neither do 'exit' (!!), 'parking', 'wheelchair accessible toilet', 'defibrillator', 'dogs prohibited', 'waiting area', 'stairs', etc...

Because I cannot quite believe that this is the case, I am asking for your expert advice: am I right, are they really missing from Unicode? Or did I miss a whole set of code charts? And if so, is there a background story that even comes close to explaining such omissions in view that any shit, literally, has a Unicode point? Is it because ISO claims some kind of IP on these signs? It's just baffling to me.

Thanks in advance!
Titus

Peter Constable · April 2021

Unicode is a character encoding standard. It encodes characters used in text. It does encode symbols and emoji if they are used in text. But it does not encode many signs, icons and pictograms that are not used in text.

To make the point, computer keyboards have characters printed on the keys—the characters that get generated when keys are pressed, and those characters are encoded in Unicode. But keyboards might have other icons printed on the keys for special actions, and those icons won't necessarily be encoded in Unicode: they would only be candidates for encoding in Unicode if they are used in text.

Titus Nemeth · April 2021

Yes.

And by implication you seem to suggest that, in the images below from a random Google search, some are worthy of being called text and thus encoded, and the others not? How does the use of these signs differ? As far as I can see both – emoji and pictograms – work as iconographic depictions alongside alphabetic text.

Image: https://us.v-cdn.net/5019405/uploads/editor/rv/x4b6hiloq53b.jpg

Image: https://us.v-cdn.net/5019405/uploads/editor/m3/6ea63clqc8iu.jpg

Image: https://us.v-cdn.net/5019405/uploads/editor/ea/pwz7adnlwu75.jpg

Image: https://us.v-cdn.net/5019405/uploads/editor/kg/9wwrh12rqe38.jpg

Andreas Stötzner · April 2021

Titus,

you may wish to have a look at the overview “symbols and punctuation” on this site (scroll down a bit), and consult in particular the entries under Emoji & Pictographs. It is very likely that you will find there much of what you are looking for. It is also likely that you will find some pieces missing from those charts; these are then possible candidates for future encoding.

Titus Nemeth · April 2021

Andreas,
thank you for your kind suggestion. I have consulted the Unicode website and code charts, which formed the basis of my understanding and brought me here. I guess this confirms what I had already established, that's a start.

Thomas Phinney · April 2021

For “dogs prohibited” and similar signs, I will note that there is a combining “prohibited” symbol in Unicode, U+20E0, “Combining enclosing circle backslash”
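As a rough illustration of how such a sequence is put together, here is a minimal Python sketch that appends U+20E0 to a base character and prints what each code point is. The choice of U+1F415 DOG as the base and the helper name `prohibit` are assumptions made for the example. Whether the result actually renders as a dog inside the circle depends entirely on the font and the layout engine.

```python
import unicodedata

PROHIBITED = "\u20E0"   # COMBINING ENCLOSING CIRCLE BACKSLASH
DOG = "\U0001F415"      # DOG (base character chosen for illustration)


def prohibit(base: str) -> str:
    """Return the base character followed by the combining prohibition mark."""
    return base + PROHIBITED


sequence = prohibit(DOG)

# The sequence is two code points: a symbol followed by an enclosing mark.
for ch in sequence:
    print(f"U+{ord(ch):04X}", unicodedata.name(ch), unicodedata.category(ch))
```

The combining mark has general category `Me` (enclosing mark), so in stored text it always follows the character it encloses.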

Kamal Mansour · April 2021

Titus,
The Unicode Standard does not intrinsically absorb symbols from existing ISO standards, although at its inception it did so with certain national standards. At present, every new character enters the standard through the help of an advocate; that is, someone who proposes its addition. For instance, if you don't find certain AIGA travel symbols, you could gather that information, prepare a proposal, and submit it for consideration. The submission procedure is explained on the Unicode site.

Ray Larabie · April 2021

@Titus Nemeth
ISO symbols are in the public domain but are not included in Unicode. Some of them have Unicode equivalents here and there but you won't find a complete ISO set in Unicode.
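One quick way to check for "equivalents here and there" is to look characters up by name. The following is a minimal sketch using Python's standard `unicodedata` module; the helper name `find_codepoint` and the candidate names are assumptions for illustration, and the last one is deliberately made up. A miss only tells you that no character has exactly that name in the Unicode version bundled with your Python, not that no equivalent exists: the usual parking symbol, for instance, is named NEGATIVE SQUARED LATIN CAPITAL LETTER P (U+1F17F).

```python
import unicodedata


def find_codepoint(name: str):
    """Return 'U+XXXX' for an exact Unicode character name, or None if absent."""
    try:
        return f"U+{ord(unicodedata.lookup(name)):04X}"
    except KeyError:
        return None


# Candidate names to test; the last one is invented and should not resolve.
candidates = [
    "WHEELCHAIR SYMBOL",
    "RESTROOM",
    "NO ENTRY",
    "HYPOTHETICAL DEFIBRILLATOR SIGN",
]

for name in candidates:
    print(f"{name}: {find_codepoint(name) or 'not found'}")
```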

Igor Freiberger · April 2021

There isn't and there won't be equivalence between ISO graphic symbols and Unicode symbols. ISO was conceived to embrace every standard possible, from corporate behavior to electric cables. Unicode was conceived to encode writing systems.

Despite its focus on text, Unicode is far from pure text due to backward compatibility. Many symbols were already present in previous text standards. Many others were added later (following the procedure Kamal described). And there was the landslide of emojis, making Unicode less text-focused. Actually, Unicode is a bit like Frankenstein's creature, but still an amazing tool.

As a type designer, I do not limit the symbols to Unicode. If I find that something may be really useful to the user, I include it. Often, Unicode ends up including these symbols in future versions. Version 13.0 brought 278 new symbols, and ~40 more are planned for version 14.0.

Some symbols I include and aren't in Unicode:

Image: https://us.v-cdn.net/5019405/uploads/editor/0k/u6qt9mbx0qf4.png

There is also another difference: Unicode is a non-profit consortium and its charts and technical documentation are freely available. ISO is a commercial organization that sells its standards and feeds a huge web of certifications for people and companies. I find it paradoxical to adopt a standard by paying for it, but this is the way it's done.

Hrant Հրանդ Փափազեան Papazian · April 2021

Titus Nemeth said:

in view that any shit, literally, has a Unicode point?

You said it, brother.
Emoji is to communication what coprophagy is to nutrition.

Nick Shinn · April 2021

There are single-foot and footprints (above) emoji in Unicode, but a <Left Footprint> and <Right Footprint> would have been more suitable for making “Stand Here” COVID floor graphics.

Titus Nemeth · April 2021

Thank you all for taking the time to reply to my post. I was aware of the difference between ISO and Unicode, and also the numerous (historical) inconsistencies of the Standard. I was still surprised about the absence of such commonly used symbols. I would have expected that no later than FF Transit Pict many of them would have been included.

@Thomas Phinney
I did see this Unicode value and wondered about the intended implementation in a font. It's not particularly user friendly for the general office worker, but certainly better than nothing.

A common approach seems to be assigning PUA values to unencoded pictograms, but that is so conceptually unsound that I am reluctant to go down this road. As @Igor Freiberger notes, the Standard does evolve, and re-assigning a 'proper' value to a glyph hitherto encoded with a PUA value would wreck existing documents.
Obviously I also 'do not limit the symbols to Unicode', but unfortunately there are still apps that struggle with unencoded glyphs and OT support remains inconsistent and incomplete throughout, making the accessibility of glyphs and thus the usability of the fonts a real concern.

Igor Freiberger · April 2021

Sorry for talking about things you already know, Titus. It is one of the side effects of the way we communicate nowadays.

To add or not to add PUA codes, that is the question!

For now, I follow John Hudson's advice and do not add them. But I may change this, because in FontLab 7 you can assign more than one code point to a given glyph. So it is easy and safe to update a font with future additions to Unicode.

John Savard · April 2021

Thomas Phinney said:

For “dogs prohibited” and similar signs, I will note that there is a combining “prohibited” symbol in Unicode, U+20E0, “Combining enclosing circle backslash”

I remember that a few years back, some makers of various pictographic signs using that symbol were sued by a company which claimed ownership of the red circle with slash over an icon, despite its origin in European traffic signs. Unfortunately, I do not remember the details.

Nick Curtis · April 2021

Reserve a private-use block, name the glyphs, then let your font editing program generate the code (Miscellaneous Symbols 2?).

I have two freeware fonts composed of ISO symbols and some non-standard glyphs with what I consider to be elegant solutions. You are welcome to use my outlines in exchange for 15% of gross billings of your finished product.

Sander · May 2021

Regarding user-friendliness: I made a font with 123 icons that portray my hometown. The icons were incorporated in the font as ligatures of the related words. See: https://www.tilburgsans.nl/en/. Users can access the icons by just typing their names. No Unicode needed.

John Hudson · May 2021

work as iconographic depictions alongside alphabetic text.

The distinction between text and alongside text, while obviously porous, is the reason some iconography does not get encoded in Unicode. The airport signage examples you provide, Titus, are good examples of a product in which there is no functional benefit to the icons being encoded as text: the plane landing and plane taking off iconography are pictures alongside text.

Fonts seem like convenient delivery mechanisms for little pictures, until you want to use them for a little picture that has no text encoding. We need a better delivery mechanism for little pictures, and cramming emoji into a text encoding didn’t make that need go away.

John Hudson · May 2021

Reserve a private-use block

You can’t reserve a private use block. The whole point of Private Use Area codepoints is that they’re entirely unstandardised: anyone can use them for anything.
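For anyone auditing a font's character set or a document for such code points, here is a minimal Python sketch. It relies on the fact that all Private Use code points (the BMP range U+E000 to U+F8FF and the two supplementary private use planes) carry the general category `Co`. The helper name `is_private_use` is an assumption for the example.

```python
import unicodedata


def is_private_use(ch: str) -> bool:
    """True if the character is a Private Use code point (general category Co)."""
    return unicodedata.category(ch) == "Co"


# A few sample code points: two from the BMP Private Use Area,
# one from Supplementary Private Use Area-A, and one ordinary letter.
samples = ["\uE000", "\uF8FF", "\U000F0000", "A"]

for ch in samples:
    print(f"U+{ord(ch):04X}", "private use" if is_private_use(ch) else "standardised")
```

Nothing in the category says what a given private use code point means; that is exactly the point made above, since two fonts can assign the same code point to unrelated glyphs.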

Ray Larabie · May 2021

Consider using shortcodes for your PUA characters so they'll resolve to something if the font fails or is substituted. For example: A laser hazard warning symbol in PUA. Make a calt that subs the sequence {laser hazard} for your custom character.

If the font fails, you'll get something readable instead of a .notdef. Consider that other fonts might have their own PUA characters. In the case of font substitution it might not display a .notdef...maybe something that changes the meaning entirely. If you use shortcodes you have the option of leaving those custom characters unencoded.

I did this for the Canada1500 typeface and used several languages for each shortcode. It's in the public domain if you want to check the source. If you're nervous about people inadvertently triggering shortcodes with brackets, use a typographically unimportant keyboard character like backslash or asciitilde. I used \shortcode\
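To make the shortcode idea concrete, here is a hedged Python sketch that generates the text of a `calt` feature in OpenType feature-file syntax, with one many-to-one substitution per shortcode. It is not taken from the Canada1500 source. The target glyph names (`laser_hazard`, `exit_sign`), the function names, and the assumption that the font names its letter glyphs `a` to `z` plus `space`, `backslash` and `asciitilde` are all hypothetical.

```python
# Glyph names for the non-letter characters used in shortcodes (assumed names).
GLYPH_NAMES = {" ": "space", "\\": "backslash", "~": "asciitilde"}


def glyph_name(ch: str) -> str:
    """Map one typed character to the glyph name assumed to exist in the font."""
    if ch in GLYPH_NAMES:
        return GLYPH_NAMES[ch]
    if ch.isascii() and ch.isalpha():
        return ch  # assumes letter glyphs are named after the letter itself
    raise ValueError(f"no glyph name known for {ch!r}")


def shortcode_rule(shortcode: str, target: str, delimiter: str = "\\") -> str:
    """Build one substitution rule: delimiter + shortcode + delimiter -> target glyph."""
    typed = f"{delimiter}{shortcode}{delimiter}"
    sequence = " ".join(glyph_name(ch) for ch in typed)
    return f"    sub {sequence} by {target};"


def calt_feature(shortcodes: dict) -> str:
    """Wrap all shortcode rules in a calt feature block."""
    rules = [shortcode_rule(code, target) for code, target in shortcodes.items()]
    return "feature calt {\n" + "\n".join(rules) + "\n} calt;"


# Hypothetical shortcodes and unencoded target glyphs.
print(calt_feature({"laser hazard": "laser_hazard", "exit": "exit_sign"}))
```

If the font is substituted, the reader still sees the typed shortcode, for example \exit\, rather than a .notdef box or another font's unrelated PUA glyph.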

Titus Nemeth · May 2021

John,

you say

a product in which there is no functional benefit to the icons being encoded as text: the plane landing and plane taking off iconography are pictures alongside text.

But how is a basketball next to some text different? Besides, it may not be relevant for the final product whether the little pictures are encoded as text, but for the makers of the product it is very pertinent. After all it also makes no difference for readers of printed books whether the sources that were used are Unicode based, or based on a codepage or some other form of storage, but for the maker of the books it is key.

Fonts seem like convenient delivery mechanisms for little pictures, until you want to use them for a little picture that has no text encoding.

The problem is not exclusively encoding, but also the fragmentary support for things that the OT spec provides. If I had 'ornm' or an equivalent to the glyphs palette in MS Word I wouldn't be here.

Ray,

thanks for your suggestion. I don't consider using PUA for the reasons that you have outlined. The 'shortcodes' that you suggested are a pragmatic workaround that I will probably end up using in some way or another. I didn't think of using the calt feature for them, but this may be a good option, thank you. Yet it remains a kind of hack that the perfectionist in me feels a bit uncomfortable about. There are awkward side effects too, like the selection area referencing the number of characters that the 'shortcode' contains. That may be alright for 'exit', but becomes unwieldy for 'wheelchair friendly exit left', especially so in a language other than English: 'barrierefreier Ausgang links'....
