New Arabic font system

Kaled Mana · February 2020

Raqeem font system

I want to make an Arabic font file, and the main thing you should know is that I need to change the Arabic font from "low logic" to "high logic", in the English style. I want to move away from the four positions of the classic Arabic font, which are:

1. Isolated

2. Initial

3. Medial

4. Final

These are the four positions a letter can take within a word in classic Arabic fonts, so we have four designs for every letter. In the new Raqeem system, whose glyphs I created, there are only two forms, upper and lower case, as in English. This requires deleting two positions from the classic set and using the keyboard Caps Lock key to switch between upper and lower case.

Shortcode combination.

Six letters require keyboard shortcode combinations to type.

I am looking for an expert who can build a new keyboard layout and font files in four formats:

1- .ttf

2- .otf

3- .woff

4- .woff2

You can see the glyphs I created in my Twitter post here:

https://twitter.com/Kaledm74/status/1216698924926410752

John Hudson · February 2020

How will the characters be encoded? Standard Arabic script encoding in Unicode does not have uppercase and lowercase characters, of course. This presents a challenge for your scheme, because you want to be able to input upper- and lowercase using Latin-style keyboard shift states, but keyboards input character codes.

Is layout for your scheme still right-to-left?

Kaled Mana · February 2020

Yes, it is still right to left.

John Hudson · February 2020

A few technical issues to consider:

1. If you want the text to be automatically laid out right-to-left, then you need to use standard Arabic Unicode character codes for each letter. You can't reliably use Private Use Area codepoints, because those all have default left-to-right directionality.

2. I presume you'd also want to be able to set arbitrary Arabic text strings using the new shapes, so that also implies mapping your shapes to standard Unicode Arabic character codes.

3. Given these encoding implications, the biggest challenge will be your desire to have Arabic behave as a bicameral script with upper- and lowercase. I've thought of one way to get the casing and keyboarding behaviour you want using standard Arabic codepoints, but it's a really strange hack, and I can imagine that some environments might reject it:

Encode the uppercase shape glyphs as standard Arabic characters.
Add to the font a format 14 cmap table that maps sequences of standard Arabic codepoints + a variation selector character to the lowercase shape glyphs.
[Or vice versa: i.e. you could decide to make the lowercase the default forms and map the uppercase to variation selector sequences.]
Create a custom keyboard driver that maps unshifted (or shifted, if using for uppercase) keys to the sequence of Arabic codepoint + variation selector.
There are a couple of likely dangers in this approach:
—Variation selector sequences are supposed to be defined by Unicode, and some environments could apply a validation check and reject your custom sequences;
—Variation selector characters are default ignorable, so some environments could simply not apply the mappings, in which case your display would fall back to whatever shapes are the default.

A couple of other possible hacks come to mind, but I couldn't claim any of them to be completely reliable, and what works in one environment might not work elsewhere.

4. Some layout environments for Arabic text will check a font to see if it has init, medi and fina OpenType Layout features, and may not use the font if these are absent (working from the assumption that Arabic layout requires these features, and hence the font is broken if they are not present). This means that even though you want to display your text using non-joining, static shapes, you may need to build the font as if it contained joining variants. [This is an issue that Saad Abulhab ran into with his own non-joining Arabic fonts.]
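
To make the variation-selector idea in point 3 concrete, here is a minimal Python sketch of what such a keyboard driver would emit. Everything in it is an assumption for illustration: the choice of U+FE00 (VARIATION SELECTOR-1), the three-key table, and the function names `key_to_text` and `strip_case` are hypothetical, and the sequences are not registered with Unicode. It assumes uppercase is the default form and lowercase is the variation sequence.

```python
# Hypothetical sketch of the keyboard side of the variation-selector hack.
# Assumption: uppercase shapes are the default glyphs, lowercase shapes are
# reached through <Arabic letter, U+FE00>. These sequences are NOT registered
# with Unicode; the font's format 14 cmap subtable would have to map them.

VS1 = "\uFE00"  # VARIATION SELECTOR-1, chosen arbitrarily for this sketch

# Hypothetical key assignments (tiny subset, not a real layout).
KEY_TO_LETTER = {
    "h": "\u0627",  # ARABIC LETTER ALEF
    "f": "\u0628",  # ARABIC LETTER BEH
    "l": "\u0645",  # ARABIC LETTER MEEM
}


def key_to_text(key: str, caps: bool) -> str:
    """Return the character sequence one key press would insert."""
    letter = KEY_TO_LETTER[key]
    # Caps state on  -> bare letter (default, uppercase shape).
    # Caps state off -> letter + selector (lowercase shape).
    return letter if caps else letter + VS1


def strip_case(text: str) -> str:
    """What an environment that ignores the selector effectively sees."""
    return text.replace(VS1, "")


typed = key_to_text("f", caps=True) + key_to_text("h", caps=False)
print([f"U+{ord(c):04X}" for c in typed])
print(strip_case(typed) == "\u0628\u0627")
```

Because the stored text is still ordinary Arabic letters, removing the selectors leaves valid Arabic. The cost is the fallback behaviour described above when an environment ignores or rejects the sequences.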

Helmut Wollmersdorfer · February 2020

John Hudson

Agreed, a character in the PUA has only default Unicode properties, which can have more side effects than LTR.

I also had the idea of inserting a non-spacing character. A feature rule could use the context to select an alternate glyph. But I don't know enough about the mechanics of Arabic forms to say whether this can work together.

Thomas Phinney · February 2020

Might one also need to map the single-form glyphs to multiple codepoints? So as to get reasonable behaviors when existing text is formatted using the font.

Seems like existing text would be all one case after just changing the font. One could try to work around this with fancy contextual magic, but that isn't likely to be entirely satisfactory.

John Hudson · February 2020

Might one also need to map the single-form glyphs to multiple codepoints? So as to get reasonable behaviors when existing text is formatted using the font.

That shouldn't be necessary. The legacy presentation form codepoints that were rolled into Unicode for backwards compatibility are pretty obsolete now, and one can make fully functional fonts that only encode base characters and use GSUB for shaping. Older text that might include the presentation form codepoints is easily normalised to the base characters.
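
As a rough illustration of that normalisation step, Python's standard `unicodedata` module can fold the legacy presentation-form codepoints back to base letters using NFKC. This is a sketch, not a complete conversion tool: NFKC also rewrites other compatibility characters, so it may be broader than wanted for some text.

```python
import unicodedata

# BEH in its four legacy presentation forms (Arabic Presentation Forms-B):
# isolated U+FE8F, final U+FE90, initial U+FE91, medial U+FE92.
legacy = "\uFE8F\uFE90\uFE91\uFE92"

# NFKC applies the compatibility decompositions, which map each
# presentation form to the base letter U+0628 ARABIC LETTER BEH.
normalised = unicodedata.normalize("NFKC", legacy)

print([f"U+{ord(c):04X}" for c in normalised])
```

After this step the text contains only base characters, which is what a font relying on GSUB for shaping expects.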

Seems like existing text would be all one case after just changing the font. One could try to work around this with fancy contextual magic, but that isn't likely to be entirely satisfactory.

Agreed.

Simon Cozens · February 2020

Since you’re also going to have a keyboarding issue for uppercase, my first thought would be just to use one set of code points (standard Arabic ones) and implement the case changes purely with features. I don’t know what apps and shapers would do if you hit the “small caps” button in Arabic text, but I’d like to find out and it would probably be the easiest interface to work with if it does work...
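
A sketch of what the feature side of this could look like, written as a small Python helper that emits OpenType feature-file text. The glyph names (`uni0628`, `uni0628.upper`, and so on) and the helper name `build_smcp_feature` are assumptions made for illustration. It also assumes the default glyphs are the lowercase shapes and that `smcp` switches to the uppercase ones. Whether an application actually passes `smcp` to the shaper for Arabic text is a separate question.

```python
# Hypothetical sketch: generate feature-file code that swaps default
# (lowercase-shape) glyphs for uppercase-shape alternates under 'smcp'.
# Glyph names are assumptions; a real font would use its own names.

def build_smcp_feature(codepoints, suffix=".upper"):
    """Return feature-file text with one single substitution per letter."""
    lines = ["feature smcp {"]
    for cp in codepoints:
        base = f"uni{cp:04X}"
        lines.append(f"    sub {base} by {base}{suffix};")
    lines.append("} smcp;")
    return "\n".join(lines)


fea = build_smcp_feature([0x0627, 0x0628, 0x0645])  # alef, beh, meem
print(fea)
```

Both case shapes share one standard Arabic codepoint here, so the stored text stays plain Arabic and the case distinction lives only in the formatting. A complete feature file would probably also need language system declarations for the Arabic script, which this sketch leaves out.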

Kaled Mana · February 2020

Hi guys

Actually, this is the first time in months that I have chatted with people who understand what this is about, but the situation does not look very good. I am a graphic designer who has no experience with coding, and I was unable to sleep last night after I read the posts.

Kaled Mana · February 2020

If I understand what you guys mean, what if I deal with this issue by dividing it into two parts and making two separate fonts, one with uppercase and another with lowercase?

The question here is:

Would this make the solution more achievable, and do you think this approach would be a good one?

Kaled Mana · February 2020

Hi Simon

If we run into that issue with the small caps button, what can we do? Are there possible solutions?

John Hudson · February 2020

Making two separate fonts will resolve the need to map two different shapes to the same Arabic codepoint, but you will need to manually select bits of text to apply one or other font. You won't be able to type dual-case text directly from the keyboard (unless you had an entire custom text processing environment that switched fonts based on keyboard state).

Question: am I right in thinking that you want these fonts primarily for your personal projects and for text that you would be creating? The more direct control you have over how the fonts are used, the fewer compatibility issues you need to worry about. If you were planning to make the fonts available to other people, things get more complicated because then you have to anticipate all the ways in which those people might want to use the fonts.

Kaled Mana · February 2020

Aha, let's forget the two-fonts trick.

The answer to the question

Yes, John, I want to release this creation for the public.

Christian Thalmann · February 2020

Why even have an uppercase? If you want the script as clean as possible, just have one form for each letter... maybe also consider using distinct forms for all letters rather than having some be dotted versions of others.

I admit I'm not quite seeing the motivation behind this writing system. If you want the advantages of Latin, just use Latin... it has the additional advantage of being extremely widely supported. I suppose you want the writing system to remain Arabic in nature, but most Arabic-writing people would probably consider it non-Arabic anyway due to its extreme departures from tradition. I expect you'll have a hard time finding users.

(BTW, is the triangle a mīm? It reminds me a lot of medial ʿayn.)

Kaled Mana · February 2020

As an Arab digital artist, I deal with graphics, shapes, and effects, such as typography, motion, and 3D. Traditional Arabic type has difficulty harmonizing with those effects and presenting them to users in a smooth, flexible, and responsive manner, and this is only part of the issue when we talk about positioning and space management. There are other issues that need elaboration, and I do not want to take all your time.

Regarding the use of Latin script instead of Arabic, this is impossible because it does not express Arab identity.

My creation is called Raqeem script. It may surprise you if I told you it is derived from Arabic regular script, and I believe it would be stranger for non-Arabs than for Arabs.

Christian Thalmann · February 2020

Kaled Mana said:

As an Arab digital artist, I deal with graphics, shapes, and effects, such as typography, motion, and 3D. Traditional Arabic type has difficulty harmonizing with those effects and presenting them to users in a smooth, flexible, and responsive manner, and this is only part of the issue when we talk about positioning and space management. There are other issues that need elaboration, and I do not want to take all your time.

Aha, interesting!

it may surprise you if I told you it is derived from Arabic regular script

No, that was pretty obvious for most of the letters. I didn't recognize the upside-down ones at first sight, but on second thought I guess they're bāʾ and shīn.

Simon Cozens · February 2020

Kaled Mana said:

other issues that need elaboration, and I do not want to take all your time.

Well, I want to keep thinking about these issues because for one thing, you need to get them resolved if you want this to be a font that other people can use, and for another, it's an interesting technical puzzle!

OK, so you want this to be useable by other people too. To summarize the issues we have mentioned in this thread, there are three questions you need to be able to answer:

1. How will a user get the text into the computer? Will they type "upper-case Arabic" on a keyboard - and if so, how? If not, how will they signal to the computer that certain glyphs are uppercase? For example: my suggested use of smallcaps would be one way to signal to the computer that glyphs are uppercase, and move the text processing to the font (other kinds of alternates/stylistic sets/etc. would do the same job, but be even harder for the user to access from the interface); another way is that, at least for OS X, the Arabic keyboard does have some glyphs mapped to shift-a (alef-madda), shift-b (peh) etc. But other operating systems map the Arabic keyboard in other ways, so that's not necessarily helpful.

2. Related, what will the document look like on disk - how will the codepoints for uppercase and lowercase Arabic be encoded? Will you use standard Arabic codepoints for "lowercase" or "uppercase" - or for both? Or neither, and use some kind of hack - either private use area or Latin or something else.

3. Third, how will you get the shaper to behave? If you use private use or Latin, you won't get the right-to-left shaping behaviour. How will you get the apps to give the right information to the shaper?

You'll need to work out how to fit together these three pieces of the puzzle. For example, I just made a test font with the "small caps" approach. It looked like it was going to work, and testing in hb-view turning on the smcp feature looked good. In TextEdit, it worked wonderfully; but in LibreOffice and Pages, the "small caps" feature was not passed to the shaper for Arabic text. So that's not a viable solution after all.

Simon Cozens · February 2020

John Hudson said:

You can't reliably use Private Use Area codepoints, because those all have default left-to-right directionality.

Frankly that sounds like a bug in Unicode...

Helmut Wollmersdorfer · February 2020

Simon Cozens said:

John Hudson said:

You can't reliably use Private Use Area codepoints, because those all have default left-to-right directionality.

Frankly that sounds like a bug in Unicode...

It's not a bug in Unicode, it's a feature. Unicode promises to never ever define properties for codepoints in the PUA.

From the Unicode FAQ http://www.unicode.org/faq/private_use.html

Q: What about properties for private-use characters?

A: One should not expect the rest of an operating system to override the character properties for private-use characters, since private use characters can have different meanings, depending on how they originated. In terms of line breaking, case conversions, and other textual processes, private-use characters will typically be treated by the operating system as otherwise undistinguished letters (or ideographs) with no uppercase/lowercase distinctions.

From the standard https://www.unicode.org/reports/tr9/#Bidirectional_Character_Types

3.2 Bidirectional Character Types

- Private-use characters can be assigned different values by a conformant implementation.

This means that PUA characters CAN have properties, but this must be implemented individually everywhere, in each software component using properties.

Simon Cozens · February 2020

Look, I understand the idea that any bug you're sufficiently snide about becomes a feature, but I still don't agree. Giving PUA codepoints left-to-right directionality is assigning them a property; it's just suggesting that LTR is so natural and normal that we don't even see it as a choice - that's LTR ethnocentrism right there! Suggesting that you can do something else with PUA codepoints if you manage to hack the shaper to agree with you... doesn't really help.

Put it this way: would it be odd if all the PUA codepoints had RTL directionality? If so, it's also odd to give them all LTR directionality.

Adam Jagosz · February 2020

Wouldn't it be the most neutral if PUA codepoints inherited directionality of preceding character, and failing that, they would be given the directionality from language tags? But failing that too, in language-agnostic environments... you've got to choose something.

notdef · February 2020

Preceding from which direction?

Helmut Wollmersdorfer · February 2020

Simon Cozens I couldn't find a default for BIDI properties of PUA characters in Unicode. It's undefined.

I know only one other standard, HTML, that defines LTR as default for the dir attribute.

Kaled Mana · February 2020

In the middle between Simon and Adam, without programming experience, you feel like you're on a tennis court... but it's a good game!

John Hudson · February 2020

@Helmut Wollmersdorfer

Simon Cozens I couldn't find a default for BIDI properties of PUA characters in Unicode. It's undefined.

Yes, but in practice this means that the default directionality of PUA codepoints is LTR, since this is the default state of almost all text layout environments.
_____

@Adam Jagosz

Wouldn't it be the most neutral if PUA codepoints inherited directionality of preceding character, and failing that, they would be given the directionality from language tags? But failing that too, in language-agnostic environments... you've got to choose something.

Neither inheriting directionality nor relying on language tagging is very reliable.

Since PUA codepoints only have knowable attributes in an instantiation of a given font, when you can actually see the glyph and, hopefully, know how it should be used, it makes sense that properties for PUA codepoints should be defined at the font level. Apple's TrueType format spec includes the Glyph Properties Table, one function of which is to identify mirrored directional pairs (OpenType uses the 'rtlm' OTL feature for the same purpose), and I've pondered whether this idea of defining directionality at the glyph level could be extended to enable assignment of properties to PUA codepoints.

Adam Jagosz · February 2020

@Frode Helland Character runs only have one direction. Maybe not the smartest idea though.

Helmut Wollmersdorfer · February 2020

John Hudson It seems that applications calculate the properties of PUA characters like unassigned characters. Numerical properties are set to zero. Bidi_Class=0 means Left_To_Right.

Here is what ICU, used by many applications, does: http://demo.icu-project.org/icu-bin/ubrowse?ch=E001#here. It has only graph=yes, print=yes.

Theoretically BIDI control characters could be used, but many applications ignore them.

A practical solution can only be at the font level.
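
The same kind of lookup can be reproduced with Python's standard `unicodedata` module, which reports the Bidi_Class that this particular library assigns. This is only a sketch of one implementation's defaults; the codepoints chosen are examples, and other libraries may answer differently.

```python
import unicodedata

samples = {
    "ARABIC LETTER BEH": "\u0628",
    "LATIN SMALL LETTER A": "a",
    "Private Use U+E001": "\uE001",
}

for label, ch in samples.items():
    # bidirectional() returns the Bidi_Class short name, e.g. 'AL', 'L', 'R'.
    print(f"{label}: category={unicodedata.category(ch)} "
          f"bidi={unicodedata.bidirectional(ch)}")
```

In CPython the private-use codepoint is reported with category `Co` and Bidi_Class `L`, while the Arabic letter is `AL`. A font built on PUA codepoints would therefore be laid out left-to-right by software that relies on these defaults.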

Kaled Mana · February 2020

Hi Simon

I'm sorry for answering late. I didn't notice the three questions above; please accept my apologies.

Kaled Mana · February 2020

If the links between Arabic letters are removed using smart concepts, a new reaction will start in many fields in the Arab nation, and definitely the world: art, fashion, font families, advertising, marketing, translation, and scanning, with bold, sporty, and responsive options. "70% of Arab societies are youth."

Answers to the three questions:

1

Yes, uppercase letters can be shown on the keyboard keys to illustrate this.

2

I am a designer and I focus on the output that the user sees, but do code points affect future development?

3

I want to achieve the goal correctly even on a limited scale. For example, if it works in Microsoft Office but fails in LibreOffice, that is fine for a while. If it works in Adobe products but not in CorelDRAW, that is fine too. I want it to work in all graphical environments, but I must focus on bringing the new type system to the widest possible number of users.

Kaled Mana · February 2020

Image: https://us.v-cdn.net/5019405/uploads/editor/2i/i2o517hzkv7j.png

One of the prevailing problems with Arabic script is the direction of the text, which may be related to the problem of aligning letters in translation and scanning software. It could stem from the difficulty of treating letters independently in the current system, where each letter is joined to the previous and following letters.

I faced a problem today when converting a PDF file to an Excel file: the program preserved the division of columns, cells, numbers, and e-mail addresses, but the direction of the Arabic text was switched from RTL to LTR. This is one of the problems of Arabic script.

Can separating letters and treating each letter as a separate constant value help to solve this type of problem?

André G. Isaak · February 2020

The above looks like one or both of the programs are not handling RTL text correctly. It isn't related to the connected nature of the characters since the text is connected in both cases (despite being backwards in one case). Do you know how the .pdf in question was created?
