IPA: Best practice?

Comments

  • Denis Moyogo Jacquerye
    edited May 17
    They don’t require ball terminals. They most often have them because of the typefaces used for IPA. Take a look at Gill in Daniel Jones, The phoneme, its nature and use, 1962.

    There was a whole discussion on the rhotic hook https://typedrawers.com/discussion/3242/modifier-letter-rhotic-hook/p1

    If you include it, you should differentiate ꭤ from ɑ. Giving ɑ a more alpha-like shape may help. Also consider including a slanted-a for italics, as a stylistic alternate if you prefer, to help distinguish it from ɑ.

    ɲ, ɟ, and ʄ are originally inspired by j

    Not originally. Turned f and turned f-with-hook were used because they looked like j, to represent palatal consonants. They’re now most often interpreted as modified js.






  • Christian Thalmann
    Christian Thalmann Posts: 1,966
    edited May 17
    I don't think I'm going to go into the more obscure levels of IPA unless I'm infinitely more bored than I am right now... those tables of rhotacized vowels are giving me a headache (pretty though they are).
  • I should have clarified that I was referring to serif designs of modern or transitional styles when I said that the retroflex hooks traditionally had bulbs or teardrop terminals. There's no need for sans designs to have these of course, and there's a long history of using sans for IPA as shown in that Daniel Jones example.

    But in those cases I'm mostly used to more monolinear designs with little modulation, so the pronounced tapering is a bit jarring even if they result from the stroke logic. Maybe tone down the tapering a bit?
    Not originally. Turned f and turned f-with-hook were used because they looked like j, to represent palatal consonants. They’re now most often interpreted as modified js.

    I think we were thinking of the same thing but expressing it differently. As I explained in that comment, typographically, ɟ was indeed originally a turned f (and ʄ a turned ƒ, though I didn't mention it then). This was in keeping with the tendency to adopt turned versions of existing glyphs as phonetic symbols. But the reason for choosing them was that they looked like j, which was what I meant by saying it was the original inspiration. In fact, I brought up the fact that ɟ was originally an upside-down f as a reason for the palatal hooks to be stronger than the tail of the j.

    I haven't encountered any of the rhotacized vowel combinations in the wild besides the separately encoded ɚ and ɝ except for maybe ɑ˞. No need to fuss over all the combinations. Focus on the symbols that are actually used. Often you see fonts that seem to include all the base letters for IPA but not the common combining marks, making them unusable in many cases.

    A stylistic variant of italic a as opposed to ɑ is indeed useful. I've used it myself when quoting examples from a romanization of Persian that distinguished a and ɑ.


  • @Jongseong Park Indeed, you made it clear with the low stroke and the hook shape that it wasn’t just a modified j.

    For the rhotacized vowels, keep it simple: design a rhotic hook that works with the common vowels. The custom forms may be nice but they are not required.
  • Christian Thalmann
    Christian Thalmann Posts: 1,966
    Alright, I added a little bit of flare... enough? I added alpha-latin_rhotichookmod to CCMP.
  • Christian Thalmann
    Christian Thalmann Posts: 1,966
    edited May 17
    Denis Moyogo Jacquerye said:
    - For U+031A, most fonts place it on top of the base glyph, but it should sit on the top right. Either use a topright anchor or substitute it for a spacing glyph. John’s Brill has that bug as do too many fonts aiming to support IPA. This is a relatively common diacritic, it should be done right.

    If I use a top right anchor, then it should hover right-aligned above the glyph, right? Would it then have to be lifted sky-high above the ascender of /d/?

    I would much prefer to «substitute it for a spacing glyph», but how would I go about that without causing too much damage? I suppose leaving out the anchors wouldn't be enough?

    (Oh, and is leftanglebelowcomb also supposed to be spacing, or does it just link to the bottom anchor?)


  • John Hudson
    John Hudson Posts: 3,086
    I have just updated handling of U+031A for Brill 5.00, and simply changed it to a spacing glyph, removed the old anchors that incorrectly positioned it over the letters, and added some kerning to pull it in a bit when it follows non-ascending letters (I did this only for the IPA stop consonants; other letters inherit the quoteright class kerning as a fallback).
  • I think the latest revision looks much better.

    From a design standpoint I would prefer the left angle above to be spacing, but I have the feeling that giving positive width to combining characters might break things. The width you assign might or might not be respected according to the rendering environment if the character is expected to have zero width.

    Would it be worth trying to keep it as a combining character but with the anchor actually overshooting the right edge of the base letter? The problem is that this is often followed immediately by a closing bracket, e.g. [t̚], so you would have to worry about clashing.

    As much as I would like the left angle above to be spacing, I think the safest way would be to align it top right or slightly crossing the right edge if you can afford it. For d̚ for example, it would have to be raised above the ascender.

    The left angle below, if you mean U+0349, is a combining mark. It is semantically very different from the left angle above, and I wouldn't even necessarily expect them to have the same shape (though they can be, of course). It would be in the same family of combining signs as up tack below, down tack below, bridge below, inverted bridge below, plus sign below, minus sign below, etc. Accordingly, it would be centred with the base letter using the same bottom anchor as the others.
  • Christian Thalmann
    Christian Thalmann Posts: 1,966
    John Hudson said:
    I have just updated handling of U+031A for Brill 5.00, and simply changed it to a spacing glyph
    How does one change something into a spacing glyph...?
    I'm thinking of just offering ligatures with all the basic stops. (I guess ingressive stops don't occur without release...?)
  • John Hudson
    John Hudson Posts: 3,086
    How does one change something into a spacing glyph...?
    Give it sidebearings. Don’t give it mark positioning anchors. Avoid classing it as a mark in the GDEF table.

    The character is still classed as a mark, so will tend to behave like one in text editing, e.g. cursor movement might ignore the sidebearings and treat the base+mark sequence as a unit.

    It is possible that your font tool may also have assumptions about the glyph based on its character encoding, but there is probably some way to override that.

  • Christian Thalmann
    Christian Thalmann Posts: 1,966
    edited May 18
    Christian Thalmann said:
    I'm thinking of just offering ligatures with all the basic stops. (I guess ingressive stops don't occur without release...?)
    I stuffed these into CCMP; I hope that covers the realistic use cases...?

  • Igor Freiberger
    Igor Freiberger Posts: 263
    edited May 19
    This is my phonetic block. Posting here with the glyph names as it may be useful to some fellow. Note that these images include the glyphs that are used only in phonetic notation. That is, glyphs like ɑ, ɛ, or ɔ are not here because they are also part of "regular" alphabets.

    The precomposed vowel+rhotic and vowel+doublerhotic glyphs may include some unnecessary things, as I know there are sounds that are considered purely theoretical in linguistics, with no real existence. But my knowledge is not enough to identify what is disposable. I tried to sort the glyphs in some kind of a-z collation, but fixes are welcome. Well, actually, any repair is always welcome.

    (Images are shown in small size, but if you save them, the actual size is far more visible.)
  • In the vast majority of cases, the left angle above will only be applied to stops, but nasal consonants can also be marked as having no audible release as in [m̚], [n̚], [ŋ̚], etc. I think it would also be possible for laterals as in [l̚], [ɭ̚], [ʎ̚], etc., but apart from [l̚] I haven't seen these.

    The list of phonetic symbols is endless, but if you're prioritizing based on use cases, ᵻ and ᵿ are non-IPA letters that would be useful to include as several dictionaries use them in otherwise IPA transcriptions of English.
  • Christian Thalmann
    Christian Thalmann Posts: 1,966
    Alright, I added these ligatures as well, although I'm rather confused at their usefulness; I can't say I often hear audible releases from nasals or liquids.

  • Denis Moyogo Jacquerye
    edited May 22
    @Christian Thalmann I don’t understand why you’re building ligatures, but I don’t understand why people build ligatures with apostrophe either.

    You can add a uni031A.spacing glyph and have `sub uni031A by uni031A.spacing;`. As long as uni031A.spacing is not in the GDEF mark class, that should prevent applications from treating it as a non-spacing glyph.
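
As a rough illustration of the substitution approach described above, the following Python sketch assembles a feature-file fragment as text. Only `uni031A` and `uni031A.spacing` come from the post. The helper name `build_fea`, the use of `ccmp` as the feature, the class names `@BASES` and `@MARKS`, and the example glyph lists are all assumptions made for illustration. Whether a given font editor accepts a hand-written GDEF block alongside its own automation would need to be checked in that tool.

```python
def build_fea(bases, marks, source_glyph="uni031A", spacing_glyph="uni031A.spacing"):
    """Return feature-file text that swaps a combining mark for a spacing alternate.

    The spacing alternate is listed with the base glyphs and filtered out of
    the mark list, so the GDEF mark class never contains it.
    """
    # Drop the spacing alternate from the marks, whatever the caller passed in.
    mark_class = [g for g in marks if g != spacing_glyph]

    # Make sure the spacing alternate is classified as a base glyph instead.
    base_class = list(bases)
    if spacing_glyph not in base_class:
        base_class.append(spacing_glyph)

    lines = [
        f"@BASES = [{' '.join(base_class)}];",
        f"@MARKS = [{' '.join(mark_class)}];",
        "",
        "feature ccmp {",
        f"    sub {source_glyph} by {spacing_glyph};",
        "} ccmp;",
        "",
        "table GDEF {",
        # Order is base, ligature, mark, component; unused classes are left empty.
        "    GlyphClassDef @BASES, , @MARKS, ;",
        "} GDEF;",
    ]
    return "\n".join(lines)


# Hypothetical glyph lists, for illustration only.
print(build_fea(["t", "d", "k"], ["gravecomb", "acutecomb", "uni031A.spacing"]))
```

Keeping the classification in one place like this makes it harder for the spacing alternate to slip back into the mark class by accident, which is the condition the post says must hold.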


  • Christian Thalmann
    Christian Thalmann Posts: 1,966
    @Denis Moyogo Jacquerye, I'm building ligatures because I don't know enough about what makes a glyph spacing; for instance, I don't know what GDEF is. Wouldn't uni031A.spacing be treated as a form of uni031A by, say, Glyphs, and thus receive the same flags and treatments as the base glyph?
  • John Hudson
    John Hudson Posts: 3,086
    Glyphs’ automated handling of some aspects of glyphs and layout is determined by its GlyphData.xml file, and you can override that behaviour using a custom version of that file. As I recall, you can have multiple versions of GlyphData.xml and place one in the same folder as a .glyphs source if you only want to use it for a specific project.

    GDEF is the Glyph Definition Table in a font, and is one of the three OpenType Layout tables. The GDEF table stores several sets of data about glyphs that are then used by layout engines when applying the features and lookups in the Glyph Substitution (GSUB) and Glyph Positioning (GPOS) tables. One set of data is the GlyphClassDef table that classifies the glyph type for processing, the most functionally important of which is the definition of some glyphs as marks. Glyphs automates which glyphs get classified as marks when it writes the GDEF table based on Unicode character properties, but this isn’t always useful because a) some unencoded glyphs that do not map to Unicode characters may need to be treated as marks, and b) not everything that Unicode considers a mark is usefully implemented as a mark at the font level (as in the case of U+031A). So it is important for font makers to have control over which glyphs are classified as marks in GDEF and not to rely on tool automation.

    Glyphs need to be classified as marks in GDEF if a) they are going to be positioned on bases and/or on other marks using anchor attachment positioning, and b) they need to be selectively filtered in GSUB and/or GPOS lookup, e.g. skipped or not skipped in kerning or in contextual substitutions.

    Note that layout engines will tend to collapse the width of any glyph classed as a mark in GDEF to zero, because of the expectation that marks are always zero-width. So it is best to set the width to zero in the design source, and I understand Glyphs does this automatically when it generates a font containing glyphs that it classes as marks in GDEF. In Glyphs, unlike FontLab, my colleagues tell me it is difficult to work with zero-width glyphs in the editing UI, hence the reliance on zeroing widths during the font generation process. I prefer having manual control of such things.
  • Florian Pircher
    Florian Pircher Posts: 176
    GlyphData.xml files are great if you want to reuse your custom glyph definitions. For one-off projects, you can set any glyph to be a non-spacing mark by selecting it in Font View and adjusting its Category and Subcategory properties manually with Edit → Info for Selection…


  • Christian Thalmann
    Christian Thalmann Posts: 1,966
    edited May 22
    Oh, cool! Thanks Florian, that sounds very convenient! I only hope the user-end applications respect those non-standard definitions...?
    In any case, I dumped all those ligatures, gave /leftangleabovecomb a negative left sidebearing and some positive kerning after ell-type letters. Much easier this way.
  • Florian Pircher
    Florian Pircher Posts: 176
    edited May 22
    I only hope the user-end applications respect those non-standard definitions...?
    These are no more or less standard than any other glyph definitions (unless the user-end app gives special treatment to specific glyph names, which is not ideal but happens.) Glyphs ships with its own GlyphData.xml, defining these properties for certain glyph names, but that does not make them any more special or standard in the exported font file than your custom glyph property values, whether set via Info for Selection or custom GlyphData.xml file. [For more app-specific discussions, it’s probably best to continue that on the Glyphs forum, if desired. Back to the great IPA best practices topic here.]
  • John Hudson
    John Hudson Posts: 3,086
    edited May 22
    I only hope the user-end applications respect those non-standard definitions...?
    In terms of layout engines, as Florian notes, this is all standard stuff within the context of OpenType: fonts are the glyph domain, and it is expected that font makers determine glyph categories and behaviours within the OpenType Layout model. This is why the GDEF table exists in individual fonts, rather than layout trying to rely on some kind of external reference of glyph classification.

    Text editing protocols, on the other hand, may rely on Unicode property data when making decisions about caret movement and deletion. The fact that U+031A is considered a mark in Unicode may mean that in some environments it may behave differently during editing than e.g. IPA spacing modifier letters.
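
To see the Unicode side of this distinction concretely, here is a small sketch using only Python's standard `unicodedata` module. It reports the general category and canonical combining class that the character database assigns to U+031A, U+0349, and two spacing modifier characters chosen for comparison. The helper name `describe` and the choice of comparison characters are assumptions for illustration; the values shown depend on the Unicode version bundled with the Python build.

```python
import unicodedata


def describe(ch):
    """Return (code point, name, general category, combining class) for one character."""
    return (
        f"U+{ord(ch):04X}",
        unicodedata.name(ch),
        unicodedata.category(ch),    # e.g. "Mn" for a non-spacing mark
        unicodedata.combining(ch),   # canonical combining class, 0 for non-marks
    )


# U+031A and U+0349 are the combining left angles discussed in this thread;
# U+02DE and U+02B0 are spacing modifier characters included for contrast.
for ch in ("\u031A", "\u0349", "\u02DE", "\u02B0"):
    print(describe(ch))
```

A font can reclassify the glyph for U+031A in GDEF, but the character-level properties printed here are what a text editor may consult for caret movement and deletion, independent of the font.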

  • Christian Thalmann
    Christian Thalmann Posts: 1,966
    Ok, that sounds like an acceptable risk.
  • Christian Thalmann
    Christian Thalmann Posts: 1,966
    edited June 4
    Hm, it wasn't all too hard to draw the binary ligatures for the tone marks. The trinary ones are, what, only five times as many? ;) Easy!

    Although, do you need to support triplets like extralow-mid-extrahigh that end up collinear? Wouldn't a user just use extralow-extrahigh for that?
  • Igor Freiberger
    Igor Freiberger Posts: 263
    Hm, it wasn't all too hard to draw the binary ligatures for the tone marks. The trinary ones are, what, only five times as many? ;) Easy!

    Although, do you need to support triplets like extralow-mid-extrahigh that end up collinear? Wouldn't a user just use extralow-extrahigh for that?
    The existence of some tone variations in actual speech is debatable, but I prefer to include them all and let the linguists discuss whether this or that is unnecessary. By the way, the whole set of two- and three-tone variations is 140.
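
The post does not say how the figure of 140 was reached. One counting rule that happens to reproduce it is sketched below: take five tone levels, enumerate every two- and three-level sequence, and skip only the sequences that stay on a single level throughout. The rule itself, the level names `low` and `high`, and the helper names `contours` and `is_collinear` are assumptions for illustration. The sketch also counts the collinear triplets raised in the question above, meaning three-level sequences that move in equal steps.

```python
from itertools import product

# Five tone levels; "low" and "high" are assumed names for the intermediate levels.
LEVELS = ("extralow", "low", "mid", "high", "extrahigh")


def contours(length):
    """All sequences of `length` levels, skipping those that never change level."""
    indices = range(len(LEVELS))
    return [seq for seq in product(indices, repeat=length) if len(set(seq)) > 1]


def is_collinear(seq):
    """True when a three-level sequence moves in equal steps (a straight line)."""
    a, b, c = seq
    return b - a == c - b


pairs = contours(2)
triples = contours(3)
collinear = [seq for seq in triples if is_collinear(seq)]

print("two-level contours:", len(pairs))
print("three-level contours:", len(triples))
print("total:", len(pairs) + len(triples))
print("collinear triplets:", [tuple(LEVELS[i] for i in seq) for seq in collinear])
```

Under this rule the collinear triplets are a small subset, so including them costs little if the aim is to cover every sequence a user might type.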
  • That should be all of them, right...? Now, to code the OT...