Ordinal/superscript feature for French





If the above characters are included in a font, how – and to what extent – should OpenType be involved in rendering them correctly for French? Would it make sense to use the ordn feature? Is this a stylistic choice, on par with raised “st” in English 1st? I assume it is not preferred default behaviour, but I could be wrong.

Comments

  • Chris Lozos Posts: 991
    Makes sense to me. Just include a sub for all the included sups.
  • It is the preferred default behaviour in typographic guides.
    If the above characters are included in a font
    You do mean glyphs, right? One shouldn’t use the characters 1ᵉʳ, but should use instead the ordinal feature on 1er to get those glyphs.
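    To make the character/glyph distinction concrete, here is a minimal Python sketch that lists the code points behind the two ways of typing "1er". It assumes the raised letters in the example are U+1D49 and U+02B3 (written as escapes below); the helper name `describe` is only illustrative.

```python
import unicodedata


def describe(text):
    """Return (code point, Unicode name) pairs for each character in text."""
    return [(f"U+{ord(ch):04X}", unicodedata.name(ch)) for ch in text]


# Plain letters: the font's ordinal feature is expected to raise "er".
plain = "1er"
# Assumed encoded form: digit one followed by modifier letters e and r.
encoded = "1\u1D49\u02B3"

for label, text in (("plain", plain), ("encoded", encoded)):
    print(label, describe(text))
```

    The plain string keeps ordinary letters in the text, so search and copy/paste still see "1er"; the raised shapes are then only a matter of which glyphs the font shows.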
  • Thanks, Denis. I think I get the terminology wrong a lot of the time re. glyphs/characters. Would it be right to say character refers to a linguistic unit and glyph refers to a designed instance of that unit?

    It is the preferred default behaviour in typographic guides.

    Meaning it could be implemented as a calt feature that is always on by default in French? I would think it should be an active choice.

  • Yeah, that’s the difference between character and glyph one makes to avoid the confusion.

    I wouldn’t put these in a calt feature always on by default in French. The user, or a smart application, should decide when to activate the ordn feature.
  • edited May 2016
    This led me down a rabbit hole. The ordn feature commonly replaces a with ª and o with º, but these ordinals have their own Unicode values. Isn’t this really a hack that disrupts the data stream?

    I am inclined to replace letters (e.g. ‘egrave’, U+00E8) with unencoded character variants (e.g. ‘egrave.sups’, no Unicode value) in the sups feature, remove the ordn feature, and let users input ª and º manually.
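    As a rough sketch of that approach, the Python below writes out a sups feature in AFDKO feature-file syntax, mapping each letter to an unencoded ".sups" variant. The glyph list and the assumption that every "name.sups" glyph exists in the font are hypothetical; only ‘egrave.sups’ is taken from the post above.

```python
def build_sups_feature(glyph_names, suffix=".sups"):
    """Build AFDKO feature code substituting each glyph by its suffixed variant.

    Assumes a glyph called name + suffix exists in the font for every name.
    """
    lines = ["feature sups {"]
    for name in glyph_names:
        lines.append(f"    sub {name} by {name}{suffix};")
    lines.append("} sups;")
    return "\n".join(lines)


# Hypothetical glyph set: the letters needed for French "er", "re", "ème".
fea = build_sups_feature(["e", "r", "m", "egrave"])
print(fea)
```

    Because the substituted glyphs have no code points of their own, the underlying text stays as plain letters, which is the point of keeping them unencoded.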





  • Thank you, Denis.


    As for the rabbit hole, different uses of superior and inferior figures and letters seem to require different implementations. Some require a Unicode value. Examples include:

    Chemical formulas
    Pb⁴⁺
    C₆H₁₂O₆

    Math/measurements
    15 cm³
    60 m²

    Linguistics
    … scholarly outstripped long-vowel system (ē=h₁, ā=eh₂, ō=h₃).
    aidʰ-stu-s
    h₁ h₂ h₃ or hₑ hₐ hₒ



    Others seem more like “stylistic” variants:

    Note markers
    … as stated by Ulysses S. Grant.2

    Ordinals
    1st
    1a
    2ème

    Abbreviations
    Mlle
    Mgr


    If I wanted to cover all cases, would it be best to have both encoded and unencoded superiors/inferiors? In your sources I found some accented raised letters as well: í é ó. How would I begin to decide what to include and what to leave out?

    Btw, here’s another, related case from Proto-Indo-European. I have no idea how this might be encoded, but it probably should be, because if I copy this text, the semantic meaning is lost.









  • John Hudson Posts: 1,236
    edited May 2016
    The Proto-Indo-European transcription example should presumably be encoded as

    ch=gᵘ̯ʰ

    i.e. with superscript characters, not <sups> feature styling, and would rely on the font to position the subscript inverted breve combining mark appropriately (possibly with variant mark form for added refinement).

    [The fallback font used on my system to display those characters handles it well, albeit with too much space between the superscript letters.]
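    For anyone who wants to check what is actually in such a string, here is a small Python sketch. It assumes the sequence after "ch=" is g, U+1D58, U+032F, U+02B0 (written as escapes so nothing depends on how the characters display); the `classify` helper is illustrative and relies on the compatibility decomposition tags in the Unicode Character Database.

```python
import unicodedata

# Assumed sequence: g + superscript u + combining inverted breve below + superscript h.
sequence = "g\u1D58\u032F\u02B0"


def classify(ch):
    """Label a character using its decomposition tag and combining class."""
    decomp = unicodedata.decomposition(ch)
    if decomp.startswith("<super>"):
        return "encoded superscript"
    if decomp.startswith("<sub>"):
        return "encoded subscript"
    if unicodedata.combining(ch):
        return "combining mark"
    return "base character"


for ch in sequence:
    print(f"U+{ord(ch):04X}", unicodedata.name(ch), "->", classify(ch))
```

    The raised letters here are characters in their own right, so the meaning survives copy and paste; the font only has to position the combining mark under the superscript u.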
  • edited May 2016
    Ah, the U+032F, and then possibly a small size variant activated contextually + anchors on the superscripted characters. Did I understand that correctly?

    As for the other question –
    If I wanted to cover all cases, would it be best to have both encoded and unencoded superiors/inferiors?
    – do you have any thoughts?




  • John Hudson Posts: 1,236
    Ah, the U+032F, and then possibly a small size variant activated contextually + anchors on the superscripted characters. Did I understand that correctly? 

    Correct.

    If I wanted to cover all cases, would it be best to have both encoded and unencoded superiors/inferiors?

    Yes to all the encoded super/subscript characters, if you want full coverage. For unencoded superscript variants, the most I've ever been asked for is A–Z, a–z[è], 0–1, Α–Ω, α–ω (for Brill), and only 0–1 for subscripts.

    If you're not fussed about PDF text reconstruction, you can use the same glyphs for both encoded and unencoded.

  • edited May 2016
    Thanks, John. I am still a little confused. You have been vocal in the past about the problems with substituting different semantic units for each other, like for example x by ×. Why is this different?

  • John Hudson Posts: 1,236
    edited May 2016
    Using the same glyph for an encoded superscript and a <sups> superscript isn't confusing the semantics except in the unique case of Acrobat text reconstruction from glyph names in printstream-distilled PDFs. It's not messing with the semantics at the text creation level, and it isn't misrepresenting the encoded characters (since the visual representation of encoded and <sups> superscripts is the same). It's only an issue if someone is creating a PDF in a way that doesn't preserve the original text encoding. Some customers will care about this, but many others won't. Brill wanted clean text reconstruction, so their fonts contain duplicate superscript glyphs for encoded and unencoded.
  • Wei Huang Posts: 72
    Sorry to revive an old thread: what's your typical character set for superscript letters?
  • Nick Shinn Posts: 1,191
    I don’t put a Superscript feature in every typeface, but when I do, I include the basic alphabet, plus egrave.

    I also include figures, parentheses, plus, minus, period, comma, dollar, cent and ®.

  • Thomas Phinney Posts: 850
    edited December 3
    ® Should almost always be superscript. (And © should always be full-size.)
  • Nick Shinn Posts: 1,191
    ® Should almost always be superscript. 
    If rendered thus, it will be much too small when used in footnotes.
  • Ray Larabie Posts: 676
    ® Should almost always be superscript.
    Superscript ® is so annoying. Isn't it easier for users to superscript an ® than unsuperscript an ®? At least 3 times I've had to change the font for the ® on a web site footer because it was barely visible. And last week I had to swap an ® on a circuit board. It's even worse with older fonts where they didn't bother making different weights for the ®. With those, you can't even grab a lighter one from another weight and scale/shift it.

