Interesting piece on an Indonesian language using Hangul to write it

Comments

  • Abdurrahman Hanif
    edited November 2024
    I heard that it was Koreans who actively researched ways to get Hangul used outside Korea. They found Buton Island, which was historically an Islamic sultanate that used Arabic script, and then found that Hangul is a perfect match for the Cia-Cia language spoken there. I also heard that some Hangul letters were modified for the Cia-Cia language.
  • John Savard
    John Savard Posts: 1,136
    Although this is a heartwarming story, I do have some questions.
    The article refers to Cia-Cia as a "syllable timed" language; presumably this explains why the Korean writing system is not a bad fit for it.
    But I would have thought that the Latin alphabet could always have been modified to have symbols for all the phonemes of their language, as this has already been done so many times.
    The fact that they had to add symbols to the Korean Hangul alphabet for their language alarms me. Of course that won't affect their ability to write their language by hand.
    But, of course, when the script finally hits Unicode, it isn't going to get a complete set of precomposed syllables like Korean has. So use of that script with computers will be dependent on having a high level of Unicode support in the software and devices used.
    Maybe one day high levels of Unicode support will become so common that it won't be something to worry about...
  • John Hudson
    John Hudson Posts: 3,264
    edited November 2024
    The fact that they had to add symbols to the Korean Hangul alphabet for their language alarms me.
    I didn’t see anything in the article about needing to add symbols to Korean Hangul. Rather, Cia-Cia uses some historical Hangul characters that are not usually encountered in modern Korean.
    But, of course, when the script finally hits Unicode...
    It is already in Unicode. It is Hangul, including the historical characters.
    Maybe one day high levels of Unicode support will become so common that it won't be something to worry about...
    Unicode has been the underpinning of most digital text for more than twenty years. Hangul is massively supported across software.




  • John Savard
    John Savard Posts: 1,136
    edited November 2024
    I didn’t see anything in the article about needing to add symbols to Korean Hangul. Rather, Cia-Cia uses some historical Hangul characters that are not usually encountered in modern Korean.

    Thank you for the clarification. However, even in that case, I would think that it is likely that Cia-Cia will, in written form, include syllables not found in Korean, and thus it will have to go beyond the set of precomposed syllables included in Unicode for Korean. So the issue of being able to construct a character from individual Hangul would still be present, even in this more favorable case. (This assumes by "historical Hangul characters", you mean historical letters of the Hangul alphabet, not historical complete syllables.)
  • John Hudson
    John Hudson Posts: 3,264
    As a result of some historical wrangling with the South Korean national standards body, Unicode ended up including a set of 11,172 precomposed syllable presentation forms, even though these were never technically needed to render Hangul. This is a systematically derived set, based on combinatorial principles, not on a subset of syllables that actually occur in the Korean language. It is kind of insane.

    Handling Jamo-to-Syllable combination at the glyph level has always made more sense than relying on precomposed syllable encoding.
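The "systematically derived" nature of the precomposed set can be illustrated with the Hangul syllable arithmetic from the Unicode Standard: 19 modern initial consonants, 21 vowels, and 28 final slots (27 finals plus "no final") give 19 × 21 × 28 = 11,172 code points in the range U+AC00..U+D7A3. The following is only a minimal sketch; the function name `compose_syllable` and the constant names are chosen here for illustration and do not come from the thread.

```python
# Constants from the Hangul syllable composition arithmetic in the Unicode Standard.
S_BASE = 0xAC00                           # first precomposed syllable
L_COUNT, V_COUNT, T_COUNT = 19, 21, 28    # initials, vowels, finals (incl. "none")

# Every combination gets a code point, whether or not it occurs in Korean.
TOTAL = L_COUNT * V_COUNT * T_COUNT


def compose_syllable(l_index, v_index, t_index=0):
    """Map modern jamo indices to the precomposed syllable character.

    Hypothetical helper for illustration; indices are zero-based positions
    in the standard orderings of initials, vowels, and finals.
    """
    if not (0 <= l_index < L_COUNT and 0 <= v_index < V_COUNT
            and 0 <= t_index < T_COUNT):
        raise ValueError("index outside the modern jamo ranges")
    return chr(S_BASE + (l_index * V_COUNT + v_index) * T_COUNT + t_index)


print(TOTAL)
print(hex(ord(compose_syllable(18, 0, 4))))  # initial HIEUH, vowel A, final NIEUN
```

Obsolete jamo have no index in these ranges, which is why syllables that use them have no precomposed code point and are written as jamo sequences instead.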
  • But I would have thought that the Latin alphabet could always have been modified to have symbols for all the phonemes of their language, as this has already been done so many times.
    I would suggest reflecting on how "why did they have to use a different script when they could have just used Latin?" comes across.
  • Ray Larabie
    Ray Larabie Posts: 1,441
    What a great idea. Hangul, much like Canadian Syllabics, is a system where, once you figure out the basic rules, the rest is self-evident. It's a good way for kids to pick up a language quickly.
  • John Savard
    John Savard Posts: 1,136
    As a result of some historical wrangling with the South Korean national standards body, Unicode ended up including a set of 11,172 precomposed syllable presentation forms, even though these were never technically needed to render Hangul. This is a systematically derived set, based on combinatorial principles, not on a subset of syllables that actually occur in the Korean language. It is kind of insane.

    Handling Jamo-to-Syllable combination at the glyph level has always made more sense than relying on precomposed syllable encoding.

    I had thought that there were only a few hundred precomposed Korean syllables in Unicode. So I was going to say that, well, then, there would be no problem, unless Cia-Cia had consonant clusters like those of Czech, Polish, or even English - but then they wouldn't have chosen the Korean script. However, in looking up information about the Korean Unicode range, I found that even the set of 11,172 precomposed syllables does not include those with obsolete historical Jamo. So the problem remains.
    After all, one expects that a printer will only have limited intelligence, sporting at best, say, a 68000 microprocessor. Combining Jamo with the facility of an experienced calligrapher isn't something one expects a mere machine to do.
    Of course, these days, if the host CPU is doing it, and it has a good graphics card, it probably is possible.
    I would suggest reflecting on how "why did they have to use a different script when they could have just used Latin?" comes across.

    As ignorant, bigoted, or at least chauvinistic? But Bahasa Indonesia is written with the Latin alphabet, so that would have indeed been the obvious default for them as well.
  • John Hudson
    John Hudson Posts: 3,264
    I found that even the set of 11,172 precomposed syllables does not include those with obsolete historical Jamo. So the problem remains.
    Again, there is no problem. Handling of Jamo-to-Syllable composition has always made most sense as a glyph processing operation, and precomposed syllable characters have never technically been needed.

    After all, one expects that a printer will only have limited intelligence, sporting at best, say, a 68000 microprocessor. Combining Jamo with the facility of an experienced calligrapher isn't something one expects a mere machine to do.
    No, it is something that type designers do. This is font-level glyph processing, often using a combination of precomposed glyphs such as these, none of which are encoded as precomposed syllable characters,

    [image not preserved]

    with dynamic composing elements supporting rare or arbitrary combinations, e.g. this completely random sequence of three glyphs:

    [image not preserved]
  • But I would have thought that the Latin alphabet could always have been modified to have symbols for all the phonemes of their language, as this has already been done so many times.
    I would suggest reflecting on how "why did they have to use a different script when they could have just used Latin?" comes across.
    Now that the technology is ubiquitous, we can start writing all languages in IPA! ☺
  • But I would have thought that the Latin alphabet could always have been modified to have symbols for all the phonemes of their language, as this has already been done so many times.
    I would suggest reflecting on how "why did they have to use a different script when they could have just used Latin?" comes across.
    I think Latin script is still new for us, and Bahasa Indonesia, which uses Latin, is just our unifying language. Outside Jakarta, people speak their own dialects, or even different languages. Also, if Latin is used for a local language, it becomes confusing which language I'm reading and how to pronounce it correctly. As for modifying Latin, diacritics are not familiar to people; we just know from habit, for example, that "e" is read in three different ways:

    - e as in the word "ekor"
    - ə as in the word "senang"
    - ɛ as in the word "bebek"

    Historically, our local scripts came from outside, like Brahmic or Arabic, and were heavily influenced and spread by institutions (like a local kingdom or sultanate). In the case of Hangul for Cia-Cia, it is actively taught by Koreans who go to the schools there, and people adopt it for their daily life.
  • The Hangul orthography for Cia-Cia uses the obsolete consonant ㅸ for /β/. Wikipedia seems to suggest that another obsolete consonant ᄙ is used for /l/, but from what I've seen the sound is always written as a sequence of two ㄹ, to the point of adding an empty syllable 을 at the beginning of words that start with /l/.

    As John Hudson says, Unicode support isn't the issue. I can write 뗄레ᄫᅵ시 for televisi (Cia-Cia for "television") just fine. Unicode can handle obsolete jamo (the consonant and vowel letters that are the building blocks of Hangul) without the need for precomposed code points. Precomposed syllables for Hangul were included in Unicode for compatibility with existing standards, but are not technically needed.
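A small sketch of what this means at the encoding level, using Python's standard `unicodedata` module: under NFC normalization, a sequence of modern conjoining jamo collapses into one precomposed syllable, while a sequence that starts with the obsolete initial U+112B (the conjoining form of ㅸ) simply stays as a valid sequence of jamo. The helper name `nfc_codepoints` is made up for this illustration.

```python
import unicodedata


def nfc_codepoints(text):
    """Return the code points of `text` after NFC normalization (illustrative helper)."""
    return [f"U+{ord(ch):04X}" for ch in unicodedata.normalize("NFC", text)]


# Modern jamo: initial HIEUH + vowel A + final NIEUN.
# NFC composes these into a single precomposed syllable.
modern = "\u1112\u1161\u11ab"

# Obsolete initial U+112B + vowel I (U+1175), as in the third syllable of the
# Cia-Cia word quoted above. No precomposed syllable exists, so NFC keeps
# the two conjoining jamo as they are.
obsolete = "\u112b\u1175"

print(nfc_codepoints(modern))    # ['U+D55C']
print(nfc_codepoints(obsolete))  # ['U+112B', 'U+1175']
```

Both forms are ordinary encoded text; whether the second one displays as a single well-formed syllable block depends on the font and shaping engine, not on the encoding.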

    The real problem is the availability of fonts that support the obsolete jamo. There are lots of Hangul fonts on the market that don't even provide all 11,172 precomposed syllables that are possible with the modern jamo, often settling for as few as the 2,350 most common ones. For aesthetic reasons, type designers have to treat each syllable as a separate glyph, which makes designing Hangul typefaces very labour-intensive. There is little economic incentive to add support for obsolete jamo or the dynamic composition of Hangul jamo in most cases. So only a small number of fonts will be able to display text with ㅸ or other obsolete jamo.