Tcomma and Tcedilla

PabloImpallari · April 2013

I'm a little confused about /Tcomma and /Tcedilla

According to Adobe Latin 3:
http://blogs.adobe.com/typblography/latin_charsets/Adobe_Latin_3.html

Unicode	Character	Name	Description
0162	Ţ	Tcommaaccent	LATIN CAPITAL LETTER T WITH CEDILLA
021A	Ț	uni021A	LATIN CAPITAL LETTER T WITH COMMA BELOW

Why are they using the /Tcommaaccent "name" for the cedilla glyph? Is there a reason for that, or is it just a bug?

When consulting the Unicode chart, I see "commas" on both glyphs...
http://www.fileformat.info/info/unicode/char/162/index.htm
http://www.fileformat.info/info/unicode/char/21a/index.htm
The only difference is that the first shows a Sans font, and the second one shows a Serif font.

So far I have been using:

/uni0162 0162 for Tcedilla
/uni021A 021A for Tcomma

/uni0163 0163 for tcedilla
/uni021B 021B for tcomma

/uni015E 015E for Scedilla
/uni0218 0218 for Scomma

/uni015F 015F for scedilla
/uni0219 0219 for scomma

Is that OK? Can anyone confirm or clarify?
Thanks in advance

John Hudson · April 2013

What you are using is good, but you will want to add a Romanian {locl} feature GSUB lookup to map from the cedilla form glyphs to the comma form glyphs.

The background here is that Unicode initially unified encoding of the Turkish S with cedilla and the Romanian S with comma accent, but then later decided to disunify them, keeping the existing U+015E and U+015F for the Turkish diacritic and adding the new U+0218 and U+0219 for the Romanian diacritic. [Ignore the T diacritic for a moment.]

However, the old 8-bit codepages covering Romanian, a lot of Romanian fonts, and Romanian keyboard drivers, had all used the unified encodings with S cedilla, so there was a lot of existing Romanian text using those characters instead of the new ones. Hence the need to provide {locl} mapping from the S cedilla characters to the S comma accent preferred glyph form for Romanian.

Now, about the T diacritic...

Unlike the S diacritic, whose unified encoding was used by two different languages with different preferred forms, the T diacritic was only used in Romanian. So although Unicode provided a new, disunified encoding for it as well as for the S diacritic, for a while Adobe, Tiro and others were using the comma accent form for both the old and new T diacritic encodings, since that was the form preferred for the only language using either encoding. A few years ago, though, through Microsoft's regional subsidiaries, we received feedback from Romanian users that in the event of failure of the {locl} feature, i.e. when software didn't do the glyph level substitution, it was preferable for both S and T diacritics to have the cedilla form instead of one having the cedilla and one the comma accent. In other words, the inconsistency is considered more objectionable than the incorrect diacritic form. So since then we've followed the Unicode character name for these diacritic characters (but not for the Baltic 'cedilla' diacritics, which all properly take the comma accent).
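A minimal sketch of the {locl} mapping described above, written as a small Python script that prints OpenType feature-file text. The glyph names are the uniXXXX names from Pablo's list. The helper name `build_locl_feature`, the use of both the ROM and MOL language tags, and the layout of the generated text are assumptions for illustration, not anyone's production code; a real feature file would normally also declare the matching `languagesystem` lines.

```python
# Legacy cedilla glyphs (keys) and the comma-below glyphs they should
# display as in Romanian (values), using the uniXXXX names from this thread.
CEDILLA_TO_COMMA = {
    "uni015E": "uni0218",  # S
    "uni015F": "uni0219",  # s
    "uni0162": "uni021A",  # T
    "uni0163": "uni021B",  # t
}


def build_locl_feature(mapping, languages=("ROM", "MOL")):
    """Return feature-file text for a locl lookup (hypothetical helper)."""
    lines = ["feature locl {", "    script latn;"]
    for tag in languages:
        lines.append(f"    language {tag};")
        for source, target in mapping.items():
            lines.append(f"        sub {source} by {target};")
    lines.append("} locl;")
    return "\n".join(lines)


feature_text = build_locl_feature(CEDILLA_TO_COMMA)
print(feature_text)
```

With this in place, the cedilla glyphs stay as the default for U+015E/015F/0162/0163, and only text tagged as Romanian or Moldavian gets the comma-below forms, which matches the fallback behaviour described above when software does not apply {locl}.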

Jeff Kellem · April 2013

Here's a blog post from a Romanian point of view on Romanian diacritic marks (with some possible history of the changes):
http://kitblog.com/2008/10/romanian_diacritic_marks.html

John Hudson · April 2013

Thanks for the link, Jeff. I'd not seen that page before; it is very good. Just a pity it isn't bilingual: I'd love to know the Romanian word for clusterfuck.

Jackson Showalter-Cavanaugh · April 2013

Also, Moldavian.

Rainer Erich Scheichelbauer · April 2013

Pablo, this is what you need to do to implement the ROM/MOL locl feature in Glyphs.

PabloImpallari · April 2013

Thanks to all of you!

Nick Shinn · April 2013

The best design solution is to make the commaaccent and cedilla identical.
I keep meaning to do that, but always end up with the usual cedilla already implemented by the time I get around to the commaaccent.

Denis Moyogo Jacquerye · April 2013

This is a mess.

Some Romanian/Moldovan speakers say they do not want the locl feature substituting the comma form for the cedilla form. This kind of substitution only extends the confusion between ţ and ț or ş and ș: it only solves what the characters look like, not what they actually are (which is bound to break something at some point or another).

In AGLFN 1.7 there is no tcommaaccent nor scommaaccent anymore, only uni-names.
See http://sourceforge.net/projects/aglfn.adobe/files/ with the comment in aglfn.txt:
- removed mappings for commaaccent names. These should now be assigned "uni" names.

For the Baltic cedilla letters with commas, they are also used in other languages, transcription systems or transliteration systems where a proper cedilla is required.

Having the comma below and cedilla identical seems nice on paper, but doesn't really help identify characters which is important on the computer.

Rainer Erich Scheichelbauer · April 2013

At least on a Mac, it is not possible to type the old codes (S/s/T/tcedilla, 015* and 016*) anymore. You can only type the new ones (S/s/T/tcommaaccent, 021*). I assume it’s similar for (newer versions of) Windows. The old codes are still in widespread use on Romanian websites though. But my hope is that their use will fade out over time.
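For text that is known to be Romanian, the old codes can also be converted at the character level. This is a minimal sketch under that assumption; the function name `modernize_romanian` is made up for illustration, it only handles the four precomposed letter pairs discussed in this thread, and it should not be run on Turkish or other text where the cedilla characters are correct.

```python
# Legacy S/T-with-cedilla code points mapped to the comma-below code points.
LEGACY_TO_COMMA_BELOW = str.maketrans({
    "\u015E": "\u0218",  # S cedilla -> S comma below
    "\u015F": "\u0219",  # s cedilla -> s comma below
    "\u0162": "\u021A",  # T cedilla -> T comma below
    "\u0163": "\u021B",  # t cedilla -> t comma below
})


def modernize_romanian(text):
    """Replace legacy cedilla characters with comma-below characters."""
    return text.translate(LEGACY_TO_COMMA_BELOW)


# "stiinta" typed with the old codes, rewritten with the new ones.
print(modernize_romanian("\u015Ftiin\u0163\u0103"))
```

Unlike a {locl} substitution, this changes what the characters are, not just how they look.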

John Hudson · April 2013

Yes, newer versions of the Windows 'Romanian (Standard)' keyboard use the new comma accent characters; however, the older keyboard is still available, identified as the 'Romanian (Legacy)' keyboard.

John Hudson · April 2013

For the Baltic cedilla letters with commas, they are also used in other languages, transcription systems or transliteration systems where a proper cedilla is required.

Do you have specific languages or transcription systems in mind?

In a European context, I've not found any instances in which these characters should be displayed with a cedilla, and for the most part font developers are making Latin fonts for European language support. It's also worth noting that Unicode explicitly annotates these characters as 'Latvian', and the 'WITH CEDILLA' naming is acknowledged as incorrect (but cannot be changed because Unicode character names are normative and covered by stability agreements).

Denis Moyogo Jacquerye · April 2013

Marshallese uses n and l with cedilla along with m and o with cedilla.
In Unicode these decompose to base letters with combining cedilla, not with combining comma below. It is not a naming mistake but a blurry unification of the comma below with the cedilla, like it was for t and s with comma below or cedilla. It’s only in Version 3.0 that some cedillas were changed to look like commas to accommodate Latvian and Livonian.
See http://www.unicode.org/L2/L2013/13037r-cedillas-and-commas-below.pdf (it's publicly available now).
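The names and decompositions mentioned above can be checked against the Unicode Character Database with Python's standard `unicodedata` module. This is just a quick inspection sketch; the helper name `describe` is made up for illustration.

```python
import unicodedata


def describe(char):
    """Return (code point, Unicode name, canonical decomposition)."""
    return (
        f"U+{ord(char):04X}",
        unicodedata.name(char),
        unicodedata.decomposition(char),
    )


# T with cedilla, T with comma below, then n and l "with cedilla"
# as used for Latvian and Marshallese.
for char in "\u0162\u021A\u0146\u013C":
    print(*describe(char), sep="  ")

# The two combining marks the decompositions point to.
for mark in "\u0327\u0326":
    print(f"U+{ord(mark):04X}", unicodedata.name(mark), sep="  ")
```

U+0146 and U+013C decompose to the base letter plus U+0327 COMBINING CEDILLA, while U+021A decomposes to T plus U+0326 COMBINING COMMA BELOW.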

Some ISO and DIN transliterations use the cedilla, and sometimes the comma, with d, n, t, k, etc. In those transliterations diacritics are supposed to look like what they are; otherwise you don’t know what you're transliterating anymore.

John Hudson · April 2013

It is not a naming mistake but a blurry unification of the comma below with the cedilla, like it was for t and s with comma below or cedilla.

The situation is a little different in that Unicode both annotates the characters as being for Latvian and, as you note, now presents them in the charts as being with the comma diacritic form. So, yes, it is a unification problem, but one that results in both a naming mistake and a canonical decomposition mistake; this is openly acknowledged by UTC members. Unlike the Turkish/Romanian situation, in this case Unicode has already explicitly identified some 'WITH CEDILLA' characters as correct encoding for Latvian (although annotations are informative, not normative), i.e. they have accepted that the characters are misnamed for their preferred display and major language use. Disunification would be even messier than that for Romanian.

Thanks for reminding me of the Marshallese use, and of Eric Muller's memorandum.

Denis Moyogo Jacquerye · April 2013

Yes, it’s both a naming and a unification mistake; I was really only disagreeing about which comes first, since they decompose to what the names say.
Unicode inherited the problem from the earlier ISO/IEC 8859-4 or -10, ECMA-94 or -144, or code pages, where the characters with cedilla in their names had a cedilla sometimes and a comma below other times in the reference documents.

Disunification for Romanian was done poorly; doing the right thing the wrong way or at the wrong time doesn’t help.

Denis Moyogo Jacquerye · April 2013

@Pablo: the Unicode charts are at http://www.unicode.org/charts
What you looked at was Fileformat.info, which shows what it finds in some fonts; sometimes it’s right but sometimes it’s wrong.

Wei Huang · October 2015

Thoughts on this note from Wes Adams about comma accent?

In Romania I have it on authority that they prefer them connected because they are 'part of the letter' and no afterthought. […] I think that's the rub in faces where it's a comma and not more of a cedilla. I could well be wrong about this, they prefer joined.

Bogdan Oancea · October 2015

No, in Romanian the comma accent shouldn't be connected to the S or T for Scommaaccent and Tcommaaccent. Whoever said that to Wes must have confused the comma accent with cedilla.

Dmitry Goloub · November 2015

I've noticed that in many fonts, the comma accent is turned 90 degrees for all accented letters except S and T. Mainly these are letters from Latvian or Lithuanian. So am I right that it's OK to use a rotated comma accent for R, L, N, etc. but not for S and T?

Thomas Phinney · November 2015

Wes was replying to a thread about ogoneks. I'm not sure why he brought up Romanian... I think he just got confused.

Adam Ladd · June 2019

I've been coming across more fonts that use a comma-style design (both attached to and detached from the letter) for the cedilla accent and do not appear to include a "traditional(?)" cedilla as well. Is this a design choice, and is it acceptable to readers?

Thomas Phinney · June 2019

What I have seen in this vein seems like a hybrid in-between accent of some sort.

(Whether it is acceptable (and acceptable to whom?) is another question....)

Vasil Stanev · June 2019

Bogdan Oancea said:

attached document

Wow, I have completely missed this gem! It's quite comprehensive and it warms my heart to see I always did the design right.
