I've noticed that some fonts have two versions of entire glyph sets.
Adobe Garamond Premiere Pro has the small caps (suffix: .sc) and an almost identical set (suffix: .a, which I believe stands for "alternative"), in which, with the exception of a very few glyphs (such as the two versions of Q or W), all the glyphs are the same. The names differ: A.a = Latin Capital Letter A and a.sc = a small cap (Corporate Use), but I do not understand the purpose of such extensive duplication.
This is even more confusing to me when I look, for example, at the 'locl' lookup for the two types of Turkish <i> (dotted and dotless).
Here two different rules are used, apparently for the same substitution. This certainly has its own precise logic, but it escapes me:
----------------
| I.a | u0130.a |
| i | i.dot |
| i.sc | i.dotsc |
----------------
It is likely that I.a and i.sc have a different logical function, but the glyph is the same: the dotless small cap I. Similarly, u0130.a and i.dotsc are both the dotted small cap i.
Therefore, in cases like this, what is the function of the double glyphs .a and .sc?
Comments
Is it possible that the .a glyphs are small caps from caps whereas the .sc are simply small caps? That would be consistent with what you write above (A.a vs. a.sc) but I can't confirm this from my more dated version. If this is the case, then this is simply to ensure that proper Unicode values can be reconstructed from glyph names (i.e. that the uppercase/lowercase distinction can be reconstructed from glyph names after c2sc is applied).
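To illustrate the idea of reconstructing text from glyph names, here is a minimal Python sketch. It follows the general approach of dropping everything after the first period and then decoding the base name; the `AGL_SUBSET` table and the function name are hypothetical stand-ins (a real tool would use the full Adobe Glyph List), so treat this as a simplification rather than any particular application's implementation.

```python
import re

# Tiny, hypothetical stand-in for the Adobe Glyph List (assumption:
# a real tool would load the complete list instead).
AGL_SUBSET = {
    "A": "A", "a": "a", "I": "I", "i": "i", "f": "f",
    "Agrave": "\u00C0", "agrave": "\u00E0",
}

def glyph_name_to_text(name):
    """Recover the underlying text from a glyph name (simplified)."""
    # Drop the suffix: "A.a" -> "A", "a.sc" -> "a".
    base = name.split(".", 1)[0]
    chars = []
    # Underscores separate ligature components: "f_i" -> "f", "i".
    for part in base.split("_"):
        if part in AGL_SUBSET:
            chars.append(AGL_SUBSET[part])
        elif re.fullmatch(r"uni(?:[0-9A-F]{4})+", part):
            digits = part[3:]
            for k in range(0, len(digits), 4):
                chars.append(chr(int(digits[k:k + 4], 16)))
        elif re.fullmatch(r"u[0-9A-F]{4,6}", part):
            chars.append(chr(int(part[1:], 16)))
        # Unknown components contribute nothing.
    return "".join(chars)

# Both small-cap glyphs look alike, but the names keep the case apart.
print(glyph_name_to_text("A.a"))   # capital A
print(glyph_name_to_text("a.sc"))  # lowercase a
```

With distinct names for the two sets, text extraction can give back "A" for a small cap that came from a capital and "a" for one that came from a lowercase letter, even though the outlines are identical.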
A lot of Adobe fonts from Font Folio 11 have .sc, .a and .alt glyphs....
That is the correct way to do small caps, although I usually don’t bother.
It does seem like a lot of unnecessary work, as it really only addresses the rare occurrence where a PDF file has <c2sc> text applied, and someone wants to extract the original text in U&lc.
Any other situations where it would be useful?
> This means that they need to duplicate smallcap glyphs to distinguish those mapped from uppercase letters and those mapped from lowercase letters

But in any case, from a graphical point of view the glyph is the same, right? Apart from a few alternates...

> Any other situations where it would be useful?

In a font I'm working on, I'm able to enable both Turkish <i>. But with LaTeX, with this code:

    \textbf{dotted}

    dotted = İ i \textsc{i}

    MakeUppercase \MakeUppercase{aabbccddiixx}

    MakeLowercase \MakeLowercase{AABBCCDDİİ}

    %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

    \textbf{dottless}

    dottless = I ı \textsc{ı}

    MakeUppercase \MakeUppercase{aabbccddeeffgghhıı}

    MakeLowercase \MakeLowercase{AABBCCDDEEFFGGHHII}
I'm not able to get correct upper- and lowercase. For example, in the first \MakeUppercase, I don't see any <İ>... The resulting string is only AABBCCDDXX. Moreover, in Adobe Garamond Premiere Pro the ligatures f_i and f_f_i do not work correctly for Turkish (to preserve the dot). So I think that in f_i and f_f_i it identifies the <i> not as the "normal" <i>.
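As a side illustration of why Turkish casing needs special handling, here is a small Python sketch. It is not a diagnosis of the LaTeX behaviour above; it only shows that default (language-neutral) case mapping pairs i with I, while Turkish pairs i with İ and ı with I. The function names are hypothetical.

```python
DOTTED_CAP_I = "\u0130"   # İ, Latin Capital Letter I With Dot Above
DOTLESS_I = "\u0131"      # ı, Latin Small Letter Dotless I

def turkish_upper(text):
    """Uppercase with Turkish pairing: i -> İ, ı -> I."""
    # Handle the two special letters first, then let the default
    # mapping take care of everything else.
    return text.replace("i", DOTTED_CAP_I).replace(DOTLESS_I, "I").upper()

def turkish_lower(text):
    """Lowercase with Turkish pairing: İ -> i, I -> ı."""
    return text.replace(DOTTED_CAP_I, "i").replace("I", DOTLESS_I).lower()

# Default mapping loses the distinction: both i and ı become I.
print("i".upper(), DOTLESS_I.upper())

# Turkish-aware mapping applied to the test strings from the post.
print(turkish_upper("aabbccddiixx"))
print(turkish_lower("AABBCCDDEEFFGGHHII"))
```

Whatever performs the case change in a given workflow has to apply this language-specific pairing before the font's glyph substitution sees the text.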
But that doesn’t explain the choice of “.a” names.
For some larger fonts, built as OT-CFF with name-keyed glyphs (as opposed to CID-keyed), Adobe sometimes ran into issues with a subtable that uses the glyph names overflowing its 64K limit.
Changing the glyph name suffixes was one way to pull back the total size of the set of glyph name strings, without hurting the most important uses of parsing glyph names. (As described by John above.)
So, one sees a few Adobe fonts where many or even all the glyph names that have “.something” suffixes have those suffixes as .a, .b, .c, etcetera, to shorten the names.
In such a case instead of A.c2sc and a.smcp, you might have A.a and a.b. (Or whatever arbitrary extension was substituted for each feature.)
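A rough Python sketch of the effect of shortening suffixes on the total size of the glyph name strings. The four glyph names and the suffix mapping are illustrative assumptions, and the count ignores any per-name storage overhead in the actual font table.

```python
def total_name_bytes(names):
    """Sum of the name lengths (glyph names are ASCII, so 1 byte per char)."""
    return sum(len(n) for n in names)

def shorten_suffixes(names, mapping):
    """Replace descriptive suffixes with short arbitrary ones."""
    result = []
    for name in names:
        base, dot, suffix = name.partition(".")
        if dot and suffix in mapping:
            result.append(base + "." + mapping[suffix])
        else:
            result.append(name)  # no suffix, or one we do not remap
    return result

# Hypothetical example names and mapping.
long_names = ["A.c2sc", "a.smcp", "Agrave.c2sc", "agrave.smcp"]
mapping = {"c2sc": "a", "smcp": "b"}

short_names = shorten_suffixes(long_names, mapping)
saved = total_name_bytes(long_names) - total_name_bytes(short_names)
print(short_names, saved)
```

Each renamed glyph saves three bytes here; across thousands of suffixed glyphs that can be enough to bring the name data back under a size limit, while the base name before the period stays parseable.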
Now it remains for me to understand why the substitution between uppercase and lowercase fails. In any case it doesn't depend on the lookups, but on the substitution tables. I have to look more closely inside them.
A last dumb question. Can all these glyphs be placed anywhere, since they have no predefined slots? In the Private Use Area or in the Corporate Use area? What is the difference? And why are they not contiguous? I mean: in one area I find the alphabet, and in a completely different place some particular glyphs with dots, accents and so on. Why not all together?
PS
I'm no longer able to insert images...
Adobe formerly chose to encode all alternates in new OpenType fonts, in the PUA (starting at E000 and upwards), in the early days of OpenType. The idea was that apps that were Unicode-savvy but not OpenType-savvy would still offer users some way to access these glyphs. Adobe’s choice of PUA slots was semi-standardized, at least within a family.
The counter-argument was that bogus codepoints would not help anyone in the long run, and once OpenType feature access was more common would just become another legacy irritant.
This was a contentious decision internally at the time, a tough call. I was one of the main people on the “encode” side of the fence, and that view won out. I was wrong. Very few users made use of the capability (even I found myself reluctant to do it, it was a pain), and now years later it just confuses type designers and end users alike.
Which is the more convenient and up-to-date behavior?
Otherwise, if you choose to encode them for some reason, you probably should put them in one of Unicode’s defined Private Use Areas (see https://en.wikipedia.org/wiki/Private_Use_Areas). One is in the Basic Multilingual Plane (BMP) and sees the most use, but they are all valid.
If the font has no lowercase, of course, you could consider using lowercase slots. Or if it has no caps, you could use the cap slots.
One thing you should definitely NOT do is use unassigned bits of Unicode as places to encode your small caps. Such unassigned codepoints are subject to later use. There really is no reason I can think of to use them now. (Unless you are trying to make a fake font from the future as “evidence” of time travel? But even then you would not be putting small caps in those slots.)
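As a quick check of the advice above, this Python sketch tests whether a code point lies in one of Unicode's three Private Use Areas, and whether it is currently unassigned according to the Unicode data bundled with the Python interpreter (which may lag behind the latest standard). The function names are hypothetical.

```python
import unicodedata

# The three Private Use Areas defined by Unicode.
PRIVATE_USE_RANGES = [
    (0xE000, 0xF8FF),      # Private Use Area in the BMP
    (0xF0000, 0xFFFFD),    # Supplementary Private Use Area-A
    (0x100000, 0x10FFFD),  # Supplementary Private Use Area-B
]

def is_private_use(codepoint):
    """True if the code point falls inside any Private Use Area."""
    return any(lo <= codepoint <= hi for lo, hi in PRIVATE_USE_RANGES)

def is_unassigned(codepoint):
    """True if the code point has general category Cn in this Python's
    Unicode database; such slots may be assigned in a later version."""
    return unicodedata.category(chr(codepoint)) == "Cn"

def acceptable_slot_for_alternate(codepoint):
    """Only Private Use slots are acceptable for encoding alternates."""
    return is_private_use(codepoint)

print(acceptable_slot_for_alternate(0xE000))  # start of the BMP PUA
print(acceptable_slot_for_alternate(0x1E00))  # a real assigned letter
```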
Is there a particular reason to encode these glyphs?
PDF creation can be dependent on glyph names to determine encoding of glyphs, IFF all the following are true (rare workflows these days):
- the PDF is created from a PostScript print stream instead of directly from a live document. Acrobat Distiller is an example of this.
- the PDF creator does not have access to the original font (or does not make use of its access)
- the font in the print stream has no inherent encoding info. With OpenType CFF, the usual print method for PS was one in which the CFF was lifted from its OT wrapper, ditching the cmap. With TTF (OpenType or not), as long as it isn’t a truly ancient workflow that converts the TTF to a PS font, I *think* you should be OK. (Native support for TTF was introduced early in PostScript Level 2 history, but not the initial release of it.)
So, yeah, if the stars are badly aligned, it can happen. It isn't super common. A designer could reasonably decide they can't be bothered to worry about supporting such corner cases. If creating a font for a particular purpose or client, you might need to check whether it is needed. Of course, just communicating back and forth about this with the client and trying to figure out if they care might be as much work as just doing the glyph names the good old safer way in the first place.
Certainly it is I who have not fully understood how to proceed in practice.
So: in slot 192 I find the glyph called Agrave as a capital letter, and there is also the small-cap version of it (agrave.sc). I duplicate the small-cap glyph (with copy and paste) and assign only a glyph name to the copy, in this example Agrave.a.
Then in slot 7680 I find the glyph of Latin Capital Letter A with Ring Below, which has the Unicode value uni1E00, and also its small-cap version in slot 63626, named only uniF88A. I also copy this glyph "somewhere" into a slot with no Unicode value already assigned: at this point, is it correct to give it as glyph name the Unicode value of Capital Letter A with Ring Below, that is, uni1E00.a? Because in itself this last glyph has no name and no Unicode value (the program gives it to me as NameMe.slot_number ...
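For reference, the decimal slot numbers above can be turned into the usual `uniXXXX` style of glyph name with a few lines of Python. This is only a naming sketch: the helper name is hypothetical, and whether `.a` is the right suffix depends on the convention the font already uses.

```python
def uni_glyph_name(codepoint, suffix=""):
    """Build a glyph name from a code point: 'uni' + 4 uppercase hex
    digits for the BMP, 'u' + hex digits beyond it, plus an optional
    suffix after a period."""
    if codepoint <= 0xFFFF:
        name = "uni{:04X}".format(codepoint)
    else:
        name = "u{:X}".format(codepoint)
    return name + "." + suffix if suffix else name

print(uni_glyph_name(7680))        # slot of A with Ring Below
print(uni_glyph_name(63626))       # slot of the existing small cap
print(uni_glyph_name(7680, "a"))   # unencoded copy named after the capital
```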