Adobe provides slots for: 1) oldstyle figures (from uniF730 to uniF739); 2) small caps (from uniF761 to uniF77A); 3) ligatures (from uniFB00 to uniFB06).
However, in most fonts those glyphs are placed freely in the PUA. Not in all of them, though: in some fonts the ligatures sit in the predefined slots.
So: a) Is it correct, or by now obsolete, to place those glyphs in the slots provided by Adobe (which, it seems to me, has for some time advised against their use), or should they go elsewhere in the PUA instead?
b) Why, in the case of ligatures, do some fonts use the predefined slots while others put them in the PUA, and what is the best solution?
Thank you
ms
Comments
The question most frequently asked is whether to encode them at all — i.e. whether to assign them to the PUA or to leave them unencoded. While opinions on this vary, I think the majority these days recommend against assigning PUA codepoints.
Also, note that the PUA code points used in the past by Adobe for small caps, ligatures, etc. are by no means standardized. Fonts from other vendors often assigned these same glyphs to entirely different PUA codepoints, so software cannot make any assumptions about what a particular PUA value is intended to represent.
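To see which of the ranges from the question actually fall in the Private Use Area, one can ask the Unicode character database. The following is a minimal sketch using only Python's standard `unicodedata` module; the helper name `is_private_use` and the range labels are made up for illustration.

```python
import unicodedata

def is_private_use(codepoint: int) -> bool:
    """Return True if the code point has general category Co (Private Use)."""
    return unicodedata.category(chr(codepoint)) == "Co"

# Ranges mentioned in the question; the labels are only descriptive.
ranges = {
    "oldstyle figures": (0xF730, 0xF739),
    "small caps": (0xF761, 0xF77A),
    "f-ligatures": (0xFB00, 0xFB06),
}

for label, (first, last) in ranges.items():
    all_pua = all(is_private_use(cp) for cp in range(first, last + 1))
    print(f"{label}: U+{first:04X}-U+{last:04X} private use: {all_pua}")
```

The first two ranges report as private use, so nothing outside a given font defines what they mean; the ligature range does not, because those are regular assigned characters.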
PUA use is not standardized. As André pointed out, even Adobe does not use these codepoints consistently. Actually, the only standard for the PUA that I am aware of is the medieval set by MUFI.
As a general suggestion, do not use PUA codes. You can have glyphs not coded in your font without any problem. These glyphs should be accessible through OpenType features, like small caps, discretionary ligatures, or alternate designs.
You can search for John Hudson's posts about PUA and Unicode. John gave excellent explanations about the subject several times.
So, current practice, for fonts aimed at professional users, is to not encode anything in the PUA that can reasonably be handled by OpenType features.
For fonts aimed at non-professional users, there are a variety of opinions, including the possibility of having a bogus encoding that puts special characters in "normal" character slots, or of shipping extra fonts with small caps in the lowercase slots and oldstyle figures in the regular number slots.
BUT, the codepoints at U+FB00–U+FB06 are standard Unicode codepoints for f-ligatures, not PUA; they exist because of legacy encodings and have been in Unicode since 1993. However, one very rarely encounters such “hardcoded” ligatures in incoming text streams. It is a judgment call whether to bother using those slots for those particular ligatures nowadays (versus leaving them unencoded, or even possibly doing both, but that is probably overkill).
Note that using the FBXX codepoints does not replace having the 'liga' and 'dlig' features for those same ligatures.
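As a rough illustration of why the encoded ligatures are no substitute for the features: ordinary text contains the plain letters, and the U+FB00 to U+FB06 characters are compatibility characters that normalize back to those letters. The sketch below uses only Python's standard `unicodedata` module; `expand_ligatures` is a hypothetical helper name, not part of any library.

```python
import unicodedata

# The seven Latin presentation-form ligatures, U+FB00 through U+FB06.
LIGATURES = {chr(cp) for cp in range(0xFB00, 0xFB07)}

for ch in sorted(LIGATURES):
    # NFKC replaces a compatibility character with its plain-letter equivalent.
    plain = unicodedata.normalize("NFKC", ch)
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)} -> {plain!r}")

def expand_ligatures(text: str) -> str:
    """Replace only the hardcoded Latin ligatures with their plain letters."""
    return "".join(
        unicodedata.normalize("NFKC", ch) if ch in LIGATURES else ch
        for ch in text
    )

print(expand_ligatures("o\ufb03ce"))  # prints "office"
```

Text that arrives as plain "f" followed by "i" never touches U+FB01, so the ligature glyph is only reached if a feature such as 'liga' substitutes it.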
As long as your small caps, ligatures, etc. are all accessible via OpenType features, they don’t need to have any Unicode value.
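For anyone wanting to check an existing font against this advice, here is a minimal sketch that splits glyphs into PUA-encoded and unencoded ones. It works on plain Python data so that it runs anywhere; the function name `audit_encoding`, the glyph names, and the sample mapping are all invented for illustration. With fontTools installed, the two inputs could presumably come from `TTFont(path).getBestCmap()` and `TTFont(path).getGlyphOrder()`.

```python
def in_pua(codepoint: int) -> bool:
    """True for the BMP Private Use Area and the two supplementary PUA planes."""
    return (
        0xE000 <= codepoint <= 0xF8FF
        or 0xF0000 <= codepoint <= 0xFFFFD
        or 0x100000 <= codepoint <= 0x10FFFD
    )

def audit_encoding(cmap: dict, glyph_order: list):
    """Return (glyph names encoded in the PUA, glyph names with no code point)."""
    encoded_names = set(cmap.values())
    pua_encoded = sorted(name for cp, name in cmap.items() if in_pua(cp))
    unencoded = [name for name in glyph_order if name not in encoded_names]
    return pua_encoded, unencoded

# Hypothetical data standing in for a real font's character map and glyph list.
cmap = {0x0061: "a", 0x0066: "f", 0x0069: "i", 0xF761: "a.sc_legacy"}
glyph_order = [".notdef", "a", "f", "i", "a.sc", "f_i", "a.sc_legacy"]

pua_encoded, unencoded = audit_encoding(cmap, glyph_order)
print("PUA-encoded:", pua_encoded)  # prints PUA-encoded: ['a.sc_legacy']
print("Unencoded:", unencoded)      # prints Unencoded: ['.notdef', 'a.sc', 'f_i']
```

In this made-up example, `a.sc` and `f_i` are the glyphs one would expect to reach through features, `.notdef` is unencoded by design, and `a.sc_legacy` is the kind of PUA assignment the comments above recommend dropping.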
For the Latin script these ligatures are defined in Unicode 12:
$ uni s ligature | grep -i latin
'Ĳ' U+0132 LATIN CAPITAL LIGATURE IJ (Uppercase_Letter)
'ĳ' U+0133 LATIN SMALL LIGATURE IJ (Lowercase_Letter)
'Œ' U+0152 LATIN CAPITAL LIGATURE OE (Uppercase_Letter)
'œ' U+0153 LATIN SMALL LIGATURE OE (Lowercase_Letter)
'ﬀ' U+FB00 LATIN SMALL LIGATURE FF (Lowercase_Letter)
'ﬁ' U+FB01 LATIN SMALL LIGATURE FI (Lowercase_Letter)
'ﬂ' U+FB02 LATIN SMALL LIGATURE FL (Lowercase_Letter)
'ﬃ' U+FB03 LATIN SMALL LIGATURE FFI (Lowercase_Letter)
'ﬄ' U+FB04 LATIN SMALL LIGATURE FFL (Lowercase_Letter)
'ﬅ' U+FB05 LATIN SMALL LIGATURE LONG S T (Lowercase_Letter)
'ﬆ' U+FB06 LATIN SMALL LIGATURE ST (Lowercase_Letter)
If your precomposed glyph, e.g. a c_h ligature, doesn't have a code point in Unicode, then don't define a code point for it in your font.