Correct forms of Arabic letters based on Heh

Michael Rafailyk Posts: 145
edited September 30 in Technique and Theory
There are some Arabic letters based on Heh that have different letter forms in different sources. It would be great if someone could shed some light on which forms are correct and which are not.

Here's how Heh-based letters are represented in Wikipedia:
Heh U+0647
Heh Goal U+06C1
Heh Goal with Hamza U+06C2
Teh Marbuta U+0629
Teh Marbuta Goal U+06C3
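
For reference, the formal Unicode character names for these code points can be printed with Python's standard `unicodedata` module. This is only a small sketch for cross-checking the list above; the helper name `describe` is made up for the example.

```python
import unicodedata

# Code points quoted in the list above.
HEH_FAMILY = [0x0647, 0x06C1, 0x06C2, 0x0629, 0x06C3]


def describe(cp: int) -> str:
    """Return 'U+XXXX  <formal Unicode name>' for one code point."""
    return f"U+{cp:04X}  {unicodedata.name(chr(cp))}"


for cp in HEH_FAMILY:
    print(describe(cp))
```

Note that the formal name of U+06C2 is "ARABIC LETTER HEH GOAL WITH HAMZA ABOVE", slightly longer than the label used in the list.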

Questions about final/terminal forms (marked with red color).
1. Should Heh Goal final look like Heh Goal with Hamza final (just without Hamza)?
2. The same question about Teh Marbuta final. Should it look like Teh Marbuta Goal final?

Question about base forms (marked with blue color).
3. Should Teh Marbuta Goal look the same as Teh Marbuta? I mean, why do they look the same? Here is another source that depicts the base shape as similar to the final shape.

Question about medial and initial forms (marked with green color).
4. Heh Goal with Hamza – medial and initial forms. The Glyphs app does not have these forms in its Arabic sidebar lists. In addition, Wikipedia depicts them as 2-3 separate symbols. So are they used in practice at all?

Comments

  • John Hudson Posts: 3,113
    So this is really a question about Unicode text encoding and shaping rather than about Arabic letters per se. The background here is that certain stylistic forms of Arabic letters are used as distinct letters in some alphabets, which means they need to be distinctly encoded in the text. (Because those letters originate in stylistic forms, there may be typefaces in which the same shapes are used as a variant or even default style for other characters, in some languages.) Some of these characters are visually distinguished only in certain joining situations. So, for example, the teh marbuta goal is only distinguished from teh marbuta in its right-joining (final) form, and the primary purpose of this character is to maintain shaping consistency with the heh goal in Urdu, i.e. it is a regional variant of the teh marbuta with its own codepoint because of its shaping.
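
One way to inspect these joining situations in a font is to wrap a letter in U+200D ZERO WIDTH JOINER, which asks the shaping engine for a joined form without a neighbouring letter. The following is a minimal sketch of that idea; the function name `joining_samples` is hypothetical, and what actually renders depends on the font and shaper.

```python
ZWJ = "\u200D"  # ZERO WIDTH JOINER


def joining_samples(letter: str) -> dict[str, str]:
    """Build test strings (in logical order) that request each joining form.

    A ZWJ before the letter requests a join to the preceding letter,
    a ZWJ after it requests a join to the following letter.
    """
    return {
        "isolated": letter,
        "final": ZWJ + letter,
        "initial": letter + ZWJ,
        "medial": ZWJ + letter + ZWJ,
    }


# Compare Teh Marbuta and Teh Marbuta Goal in their final forms.
for cp in (0x0629, 0x06C3):
    samples = joining_samples(chr(cp))
    print(f"U+{cp:04X} final test string: {samples['final']!r}")
```

For a right-joining letter such as teh marbuta, only the isolated and final strings are meaningful, since the letter does not join to a following letter.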

    Unicode defines joining groups for Arabic letters, based on shared shaping patterns. These are explained in the Arabic chapter of the core spec, section 9.2.4 Arabic Joining Groups. There is a further subsection on the Letter heh.

    The Arabic Joining Groups are defined in the ArabicShaping.txt file, which is the standard specification for Arabic joining behaviour.
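
As a rough illustration, the data lines in ArabicShaping.txt are semicolon-separated fields (code point; schematic name; joining type; joining group), with `#` starting a comment. The sketch below parses text in that layout. The two sample lines are an abbreviated stand-in typed in for the example, so check the values against the real file from the Unicode Character Database; the function name `parse_arabic_shaping` is made up here.

```python
def parse_arabic_shaping(text: str) -> dict[int, tuple[str, str]]:
    """Map code point -> (joining type, joining group)."""
    table: dict[int, tuple[str, str]] = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        fields = [f.strip() for f in line.split(";")]
        if len(fields) < 4:
            continue  # not a data line in the expected layout
        code, _name, jtype, jgroup = fields[:4]
        table[int(code, 16)] = (jtype, jgroup)
    return table


# Abbreviated sample in the file's layout (assumption: verify against the real file).
SAMPLE = """\
# code point; schematic name; joining type; joining group
0629; TEH MARBUTA; R; TEH MARBUTA
0647; HEH; D; HEH
"""

table = parse_arabic_shaping(SAMPLE)
for cp, (jtype, jgroup) in sorted(table.items()):
    print(f"U+{cp:04X}  type={jtype}  group={jgroup}")
```

Here `R` means right-joining (isolated and final forms only) and `D` means dual-joining (all four forms), which is why teh marbuta has no initial or medial form to compare.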

    As I think you have encountered, some sources will illustrate the distinctive shape of a character as its default representation, rather than always illustrating the isolated form. This can be confusing, because it can lead people to think that this is the isolated form. So it is a good idea to refer to the Unicode Arabic specification, not to other sources.
  • Hi John. Thank you for the clarification and for the specification sources. It's helpful.
    "Some of these characters are visually distinguished only in certain joining situations."
    So here it is. It means that some Arabic letters, although they have their own Unicode value, perform the function of localized forms for certain languages, where the difference can be expressed only in the terminal or initial form. Got it. This is what confused me. And this is an interesting difference from how localized forms are used in Latin/Cyrillic scripts.
  • John Hudson Posts: 3,113
    Yes, localised forms in Arabic script tend to get encoded as distinct characters, most often because their localised form relates to other characters that might be contrastive letters. It is a pragmatic way to maintain graphical consistency within each alphabet.