Bug 107204 - Writer treats Hungarian Rovas (aka Old Hungarian) text as left-to-right script instead of right-to-left
Status: RESOLVED FIXED
Alias: None
Product: LibreOffice
Classification: Unclassified
Component: Writer
Version (earliest affected): 5.3.2.1 rc
Hardware: All All
Importance: medium normal
Assignee: ⁨خالد حسني⁩
URL:
Whiteboard: target:5.4.0
Keywords:
Duplicates: 40036
Depends on:
Blocks: RTL-CTL Font-Rendering 103405
Reported: 2017-04-16 12:11 UTC by Kovács Viktor
Modified: 2018-09-15 13:24 UTC

See Also:
Crash report or crash signature:


Attachments
Unicode 8.0 old hungarian conform font and screenshots (860.64 KB, application/zip)
2017-04-18 12:58 UTC, Kovács Viktor
Screenshot, looks OK (166.36 KB, image/png)
2017-04-18 14:30 UTC, ⁨خالد حسني⁩
sample ODT document using Unicode 10c80:10cff (12.57 KB, application/vnd.oasis.opendocument.text)
2017-04-19 18:17 UTC, V Stuart Foote
screen clip of sample ODT (99.10 KB, image/png)
2017-04-19 18:22 UTC, V Stuart Foote
Comparison before/after removing the wrong optimization (151.67 KB, image/png)
2017-04-19 19:48 UTC, ⁨خالد حسني⁩
Old / before rendering of tdf91083 bugdoc. (29.70 KB, application/pdf)
2017-05-02 07:17 UTC, Miklos Vajna
New / after rendering of tdf91083 bugdoc. (32.49 KB, application/pdf)
2017-05-02 07:17 UTC, Miklos Vajna

Description Kovács Viktor 2017-04-16 12:11:07 UTC
Description:
Old Hungarian is a right-to-left script (Unicode range: U+10C80-U+10CFF), but LibreOffice does not treat it as such.
It works perfectly in GNOME's gedit, but when the text is copied from gedit into Writer, it again appears left to right.

Actual Results:  
Old Hungarian script appears left to right.

Expected Results:
Old Hungarian script must appear right to left; it works in my browser:
Helyes = 𐲏𐳉𐳗𐳉𐳤 



Reproducible: Always

User Profile Reset: No

Additional Info:
[Information automatically included from LibreOffice]
Locale: hu
Module: StartModule
[Information guessed from browser]
OS: Linux (All)
OS is 64bit: yes
Builds ID: LibreOffice 5.3.2.1


User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Firefox/52.0
Comment 1 V Stuart Foote 2017-04-16 17:43:48 UTC
Confirmed on Windows 10 Pro 64-bit en-US (1703) with
Version: 5.3.2.2 (x64)
Build ID: 6cd4f1ef626f15116896b1d8e1398b56da0d0ee1
CPU Threads: 8; OS Version: Windows 6.19; UI Render: GL; Layout Engine: new; 
Locale: en-US (en_US); Calc: group

The Old Hungarian block of text will still need a font with glyphs defined for those Unicode points. What font is providing you support for that range?  I had none, suspect most users would not either.

The GNU Unifont Glyphs project provides low quality bitmap font coverage of the SMP, including this Unicode range [1]. When I install the SMP supplement, I confirm that character entry for the range is RTL rather than LTR.

With the same system and font, BabelStone BabelPad handles it correctly as RTL, so I assume there is hinting in the font.

=-ref-=
http://unifoundry.com/unifont.html
Comment 2 Kovács Viktor 2017-04-17 02:13:38 UTC
I am sorry, my Linux system has the required font for that Unicode range.
I will attach that kind of font later, I promise.
Comment 3 Kovács Viktor 2017-04-18 12:58:21 UTC
Created attachment 132651 [details]
Unicode 8.0 old hungarian conform font and screenshots

I have attached my own font with an Old Hungarian implementation, test layout source code for Linux and Windows operating systems, and screenshots of the LibreOffice Writer output (wrong) and GNOME's gedit output (right).
Comment 4 V Stuart Foote 2017-04-18 13:30:40 UTC
Thanks, that is a useful font.

Guess we need to determine if support for other RTL scripts from the Unicode Supplementary Multilingual Plane (SMP) is handled correctly or if it is a specific issue with HarfBuzz.

In hb is LTR/RTL read as a font attribute, or is it defined by Unicode block?

Khaled?
Comment 5 Kovács Viktor 2017-04-18 13:44:21 UTC
Unicode standard defines this block as rtl! I do not understand the question!
Comment 6 Kovács Viktor 2017-04-18 13:56:13 UTC
I tested the Windows layout, too. It works correctly only in Notepad; the "formatter-editors" fail under Windows, too. I would like there to be at least one "formatter-editor" that works correctly!
Comment 7 ⁨خالد حسني⁩ 2017-04-18 14:28:49 UTC
(In reply to V Stuart Foote from comment #4)
> Thanks, that is a useful font.
> 
> Guess we need to determine if support for other RTL scripts from the Unicode
> Supplementary Multilingual Plane (SMP) is handled correctly or if it is a
> specific issue with HarfBuzz.
> 
> In hb is LTR/RTL read as a font attribute, or is it defined by Unicode block?
> 
> Khaled?

Directionality is a property of the text, not of the font, and is not handled by HarfBuzz; we determine the text direction based on the Unicode bidirectional algorithm (we use ICU for that) before calling HarfBuzz.

The attachment does not contain any text documents, but the text in the bug description is shown RTL in Writer already as expected.
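
As a rough illustration of the point that direction comes from character properties rather than from the font, the sketch below looks up the Unicode Bidi Class of the sample word from the bug description. It uses Python's standard `unicodedata` module as a stand-in (Writer itself uses ICU, per the comment above), and it assumes a Python build whose Unicode data is version 8.0 or newer (Python 3.5 and later).

```python
import unicodedata

# Sample word copied from the bug description ("Helyes" in Old Hungarian).
sample = "𐲏𐳉𐳗𐳉𐳤"

# Bidi Class of every character in the sample; "R" means strong right-to-left.
classes = [unicodedata.bidirectional(ch) for ch in sample]

for ch, cls in zip(sample, classes):
    print(f"U+{ord(ch):04X} bidi class: {cls}")

# For comparison, a Latin letter has the strong left-to-right class "L".
print("H bidi class:", unicodedata.bidirectional("H"))
```

No font is consulted at any point: the lookup depends only on the code points and on the version of the Unicode data available to the library.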
Comment 8 ⁨خالد حسني⁩ 2017-04-18 14:30:20 UTC
Created attachment 132654 [details]
Screenshot, looks OK
Comment 9 V Stuart Foote 2017-04-18 16:13:57 UTC
@Khaled, OK thanks! 

And on master I can also force a paragraph to RTL (from my en-US locale's default LTR) and then paste special with the sample from comment 0 to match your clip.

Guess that with no Language defined for "Old Hungarian" script, and ICU UAX#31 suggests it would not be likely, our only choice is to toggle to RTL and author text without language tagging.

Should this be closed NOT A BUG?

=-ref-=
http://unicode.org/reports/tr31/#Table_Candidate_Characters_for_Exclusion_from_Identifiers
Comment 10 ⁨خالد حسني⁩ 2017-04-18 16:27:02 UTC
(In reply to V Stuart Foote from comment #9)
> @Khaled, OK thanks! 
> 
> And on master I also can force a paragraph to RTL (from my en-US local's
> default LTR) and then paste special with the sample from comment 0 to match
> your clip.

Paragraph direction and text direction are different (but related) things. The paragraph direction is set manually (Writer will try to be smart and use an appropriate default), but the text direction is automatic. The OP's screenshot shows LTR text direction and I think that is the issue being reported here, but I can’t reproduce it.

Question to the OP: are you using TDF or distro builds of LibreOffice? If the latter, then what is the version of ICU you have, and can you try with TDF builds?

> Guess that with no Language defined for "Old Hungarian" script, and ICU
> UAX#31 suggests it would not be likely, our only choice is to toggle to RTL
> and author text without language tagging.

Not sure what you are saying here. Text direction is language independent, it is controlled by fixed character properties provided by Unicode.
Comment 11 V Stuart Foote 2017-04-18 17:43:24 UTC
(In reply to Khaled Hosny from comment #10)
> (In reply to V Stuart Foote from comment #9)
> > @Khaled, OK thanks! 
> > 
> > And on master I also can force a paragraph to RTL (from my en-US local's
> > default LTR) and then paste special with the sample from comment 0 to match
> > your clip.
> 
> Paragraph direction and text direction are different (but related) things.
> The paragraph direction is set manually (Writer will try to be smart and use
> appropriate default), but the text direction is automatic. OP screenshot
> shows LTR text direction and I think that is the issue being reported here,
> but I can’t reproduce it.

If I set the text language to [none] and set the font for the style to OP's sample "Unicode_Maros_ext" font should it be detected as RTL for glyphs from the 10c80-10cff block? It is not.

And I can use the Tools -> Options -> Language Settings and "Ignore system input language", then in Paragraph Style dialog define the CTL Font and set "Hungarian (Szekely-Hungarian Rovas)" as the language.

Entering the sample SMP "Rovas" glyphs using the Special Character dialog does not toggle the direction of the text from LTR to RTL. It can be toggled with the Formatting toolbar buttons--but is not automatic for the language. Seems like it should be.

> > Guess that with no Language defined for "Old Hungarian" script, and ICU
> > UAX#31 suggests it would not be likely, our only choice is to toggle to RTL
> > and author text without language tagging.
> 
> Not sure what you are saying here. Text direction is language independent,
> it is controlled by fixed character properties provided by Unicode.

My thought was that if ICU recommends these scripts not be processed for identification, we would not. Guess that is not the case, as I then found bug 97406 and that Eike has provided some support for the script.  But it does not appear in the drop list of Default Languages for Documents for the Complex text layout languages, only in the Character style dialogs.

@Eike, beyond setting it up for i18n/l10n Pootle support, was there more to be done in source for bug 97406 to accommodate LANGUAGE_USER_HUNGARIAN_ROVAS and the "Old Hungarian" Unicode SMP block for honoring its RTL script direction?
Comment 12 ⁨خالد حسني⁩ 2017-04-18 18:03:40 UTC
(In reply to V Stuart Foote from comment #11)
> (In reply to Khaled Hosny from comment #10)
> > (In reply to V Stuart Foote from comment #9)
> > > @Khaled, OK thanks! 
> > > 
> > > And on master I also can force a paragraph to RTL (from my en-US local's
> > > default LTR) and then paste special with the sample from comment 0 to match
> > > your clip.
> > 
> > Paragraph direction and text direction are different (but related) things.
> > The paragraph direction is set manually (Writer will try to be smart and use
> > appropriate default), but the text direction is automatic. OP screenshot
> > shows LTR text direction and I think that is the issue being reported here,
> > but I can’t reproduce it.
> 
> If I set the text language to [none] and set the font for the style to OP's
> sample "Unicode_Maros_ext" font should it be detected as RTL for glyphs from
> the 10c80-10cff block? It is not.

The language setting should not have any effect on text direction, and indeed setting it to [none] makes no difference whatsoever here.

> And I can use the Tools -> Options -> Language Settings and "Ignore system
> input language", then in Paragraph Style dialog define the CTL Font and set
> "Hungarian (Szekely-Hungarian Rovas)" as the language.
> 
> Entering the sample SMP "Rovas" glyps using the Special Character dialog
> does not toggle the direction of the text from LTR to RTL. It can be toggled
> with the Formatting toolbar buttons--but is not automatic for the language.
> Seems like it should be.

I think you are still talking about paragraph direction (since that is the one you can change from the toolbar). LibreOffice has no way to manually change text direction, it is always automatic.

> > > Guess that with no Language defined for "Old Hungarian" script, and ICU
> > > UAX#31 suggests it would not be likely, our only choice is to toggle to RTL
> > > and author text without language tagging.
> > 
> > Not sure what you are saying here. Text direction is language independent,
> > it is controlled by fixed character properties provided by Unicode.
> 
> My thought was that if ICU recommends these scripts not be processed for
> identification, we would not.

First, ICU is a software library (http://site.icu-project.org) and Unicode is a standards body; you are confusing the two. Second, UAX #31 has nothing to do with text direction and I’m not sure why you are referring to it. It is a specification for identifiers like programming language variables or hashtags (http://unicode.org/reports/tr31/#Introduction) and has no relevance to the issue being discussed here.

> Guess that is not the case as I then I found
> bug 97406 and that Eike has provided some support for the script.  But it
> does not appear in the drop list of Default Languages for Documents for the
> Complex text layout languages, only in the Character style dialogs.

This also has no relevance to the issue of text direction.
Comment 13 V Stuart Foote 2017-04-18 19:14:08 UTC
(In reply to Khaled Hosny from comment #12)

Sorry, don't mean to be thick. 

At 5.3 we are claiming support for Hungarian (magyar) using the Rovás script, wherein the Hungarian is encoded RTL departing from "modern" Hungarian which is encoded LTR with Latin derived glyphs.

Seems the challenge is getting the assigned 10c80-10cff Unicode to render RTL when the language is set to Hungarian. Where is that breaking down?

If we are depending on the ICU libraries to identify the script from its Unicode code point range and pass it for rendering as RTL rather than LTR, that is not happening.

The UAX#31 suggested to me that it would not be, as the ICU project recommended against it. And that seemed to be the case.

Hungarian (default with Latin glyphs) is handled as a Western language, Hungarian (Szekely-Hungarian Rovas) as a CTL font, and I'm having trouble toggling between them; I keep getting dumped to Western and its LTR direction.

I don't see how I can force the UI to adjust, and the "automatic" detection seems to not work (if it was actually implemented for this SMP Unicode block).
Comment 14 Eike Rathke 2017-04-19 07:41:30 UTC
(In reply to V Stuart Foote from comment #11)
> But it
> does not appear in the drop list of Default Languages for Documents for the
> Complex text layout languages, only in the Character style dialogs.
Unrelated. Whether it's available as default language depends on whether locale data is available. Language tags for which no locale data is available are listed only for character attribution.

> @Eike, beyond setting it up for i18n/l10n Pootle support, was there more to
> be done in source for bug 97406 to accommodate LANGUAGE_USER_HUNGARIAN_ROVAS
> and the "Old Hungarian" Unicode SMP block for honoring its RTL script
> direction?
It's flagged as CTL (so it shows up in the CTL language list) and RTL (for whatever queries for it). However, for text rendering this is irrelevant. Text rendering uses the code points' Bidi Class property assigned by the Unicode Standard, which for the Unicode block "Old Hungarian" 10C80:10CFF *is* RTL. Given that was introduced with Unicode version 8.0, if it doesn't work it could be that LibreOffice was built against / is used with an ICU library that doesn't support Unicode 8 yet. For Unicode 8 support at least ICU 56 is needed. Any lower version will not do. The LibreOffice 5.3 internal ICU is 58, but that is used only in builds provided by TDF; Linux distributions usually build against the ICU version available in their release.

Given that for Khaled the issue does not occur and assuming he uses the LibreOffice internal ICU, my guess is that all boils down to the ICU version used. Maybe the original poster could answer that question? Setting NEEDINFO.

If it's due to the ICU version there's nothing we can do and we should close this as NOTABUG.
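
The version dependence described above can be sketched outside LibreOffice. The snippet below is only an analogy: it uses Python's `unicodedata` module rather than ICU, and the helper name `knows_old_hungarian` is made up for this example. With Unicode data older than 8.0 the code point is unassigned and the lookup returns an empty string, so no right-to-left class is reported.

```python
import unicodedata


def knows_old_hungarian() -> bool:
    """True if this runtime's Unicode data classifies U+10C80 as strong RTL."""
    # For code points unknown to the bundled Unicode data, bidirectional()
    # returns an empty string instead of "R".
    return unicodedata.bidirectional("\U00010C80") == "R"


# Major version of the Unicode data bundled with this Python build.
major = int(unicodedata.unidata_version.split(".")[0])

print("Unicode data version:", unicodedata.unidata_version)
print("Old Hungarian known as RTL:", knows_old_hungarian())
```

The same kind of check against the ICU library actually loaded by a distro build would answer the NEEDINFO question; how to query that depends on the distribution and is not shown here.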
Comment 15 Tamas Rumi 2017-04-19 08:36:24 UTC
Please provide me with some background information or a link on the "LibreOffice internal ICU version" so that I can understand the problem that arose here. Thanks in advance.
Comment 16 V Stuart Foote 2017-04-19 11:49:36 UTC
(In reply to Eike Rathke from comment #14)
> It's flagged as CTL (so it shows up in the CTL language list) and RTL (for
> whatever queries for it). However, for text rendering this is irrelevant.
> Text rendering uses the code points' Bidi Class property assigned by the
> Unicode Standard, which for the Unicode block "Old Hungarian" 10C80:10CFF
> *is* RTL. 

OK can't argue with that.

> Given that was introduced with Unicode version 8.0, if it doesn't
> work it could be that LibreOffice was build against / is used with an ICU
> library that doesn't support Unicode 8 yet. For Unicode 8 support at least
> ICU 56 is needed. Any lower version will not do. The LibreOffice 5.3
> internal ICU is 58 but that is used only in builds provided by TDF, Linux
> distributions usually build against the ICU version available in their
> release.
> 

Unfortunately, for Windows builds I am using Cloph's TDF configured TB62 Tinderbox for nightlies and his release builds. So on Windows at least the correct ICU libraries do not help, yet the Unicode block is not rendered to canvas RTL. So something is not right.
Comment 17 ⁨خالد حسني⁩ 2017-04-19 12:15:23 UTC
(In reply to V Stuart Foote from comment #16)

> Unfortunately, for Windows builds I am using Clop'hs TDF configured TB62
> Tinderbox for nightlies and his release builds. So on Windows at least the
> correct ICU libraries do not help, yet the Unicode block is not rendered to
> canvas RTL. So something is not right.

Please attach a screenshot.
Comment 18 V Stuart Foote 2017-04-19 18:17:14 UTC
Created attachment 132692 [details]
sample ODT document using Unicode 10c80:10cff

Attached is a Writer document prepared on Windows 8.1 Ent with
Version: 5.3.2.2 (x64)
Build ID: 6cd4f1ef626f15116896b1d8e1398b56da0d0ee1
CPU Threads: 8; OS Version: Windows 6.29; UI Render: GL; Layout Engine: new; 
Locale: en-US (en_US); Calc: group 

Cleared the language defaults.

The default style Western font is set to the sample "Unicode_Maros_ext" at 12 pt with no language. The default style CTL font is also set to "Unicode_Maros_ext" but at 40 pt, and the language set to Hungarian (Szekely-Hungarian Rovas).

With glyphs for codepoints used in a "Western" paragraph, the layout/shaping is not RTL.  But if forced to be CTL with an RTL paragraph, the codepoints are laid out/shaped as RTL.

If it were correct, shouldn't strings coded with the glyphs from 10c80:10cff always be RTL direction even when mixed into a Western class text? Presumably for example Hungarian (with Latin derived fonts) mixed with Hungarian strings using Rovás font.  Doesn't seem we can do that.
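
To make the expectation for mixed text concrete, here is a deliberately simplified sketch that splits a string into directional runs using only strong Bidi Classes. It is not the full Unicode Bidirectional Algorithm (neutral characters simply inherit the direction of the preceding strong character, and no embedding levels or reordering are computed), and the function names are invented for this illustration.

```python
import unicodedata


def strong_direction(ch):
    """Return "RTL", "LTR", or None for characters without a strong class."""
    cls = unicodedata.bidirectional(ch)
    if cls in ("R", "AL"):
        return "RTL"
    if cls == "L":
        return "LTR"
    return None


def directional_runs(text, paragraph_direction="LTR"):
    """Group characters into (direction, substring) runs.

    Simplification: neutral and weak characters join the current run.
    """
    runs = []
    current = paragraph_direction
    for ch in text:
        current = strong_direction(ch) or current
        if runs and runs[-1][0] == current:
            runs[-1][1].append(ch)
        else:
            runs.append((current, [ch]))
    return [(direction, "".join(chars)) for direction, chars in runs]


# Latin text with two Old Hungarian letters (U+10C8F, U+10C80) in the middle.
mixed = "ab \U00010C8F\U00010C80 cd"
for direction, run in directional_runs(mixed):
    print(direction, [f"U+{ord(c):04X}" for c in run])
```

Even inside a paragraph whose direction is LTR, the Old Hungarian letters form their own RTL run, which is the inline behaviour the comment above asks about.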
Comment 19 V Stuart Foote 2017-04-19 18:22:29 UTC
Created attachment 132693 [details]
screen clip of sample ODT

Note that with a paragraph with Western layout, the 10c80:10cff is laid out LTR. And when forced into a CTL (by setting paragraph RTL) the string for the SMP codepoints are laid out RTL.
Comment 20 ⁨خالد حسني⁩ 2017-04-19 19:18:40 UTC
I can reproduce the issue on Windows using 5.3.2.2 release builds as well as self built master. Though now I think the difference is not Windows/Linux or ICU versions, but rather that I’m testing on an RTL locale on Linux and LTR on Windows.

I suspect there is some simplistic optimization somewhere that skips doing bidi under some conditions.
Comment 21 ⁨خالد حسني⁩ 2017-04-19 19:48:00 UTC
Created attachment 132697 [details]
Comparison before/after removing the wrong optimization

There are two places in Writer that check for the default direction (whatever this is) and whether the text has any scripts we classify as complex before applying the bidi algorithm. Removing these checks seems to fix the issue. The left window in the screenshot is the fixed code.
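
The kind of shortcut described above can be sketched as follows. This is not the Writer code: the function names, the shape of the check, and the list of "complex" ranges are all assumptions made for illustration. It only shows how a guard of the form "RTL default, or a script we already know as complex" can skip bidi for a right-to-left script that is missing from the list.

```python
import unicodedata

# Hypothetical "complex script" table that does not include Old Hungarian:
# only the Hebrew and Arabic blocks are listed (an assumption for illustration).
KNOWN_COMPLEX_RANGES = [(0x0590, 0x05FF), (0x0600, 0x06FF)]


def has_known_complex_script(text):
    """True if any character falls in one of the listed ranges."""
    return any(lo <= ord(ch) <= hi
               for ch in text
               for lo, hi in KNOWN_COMPLEX_RANGES)


def has_strong_rtl(text):
    """True if any character has a strong right-to-left Bidi Class (R or AL)."""
    return any(unicodedata.bidirectional(ch) in ("R", "AL") for ch in text)


def runs_bidi_with_shortcut(text, default_rtl):
    """The shortcut: run bidi only for an RTL default or a listed script."""
    return default_rtl or has_known_complex_script(text)


rovas = "\U00010C8F"  # a code point from the Old Hungarian block

print(runs_bidi_with_shortcut(rovas, default_rtl=False))
print(runs_bidi_with_shortcut(rovas, default_rtl=True))
print(has_strong_rtl(rovas))
```

Under these assumptions the shortcut skips bidi for the Old Hungarian letter when the default direction is LTR but not when it is RTL, even though the character itself is strongly right-to-left. That would be consistent with the issue reproducing on an LTR locale and not on an RTL one.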
Comment 22 V Stuart Foote 2017-04-19 20:21:24 UTC
(In reply to Khaled Hosny from comment #21)
> Created attachment 132697 [details]
> Comparison before/after removing the wrong optimization
> 

That is more like what I'd expect. Does it also work with a new document without a CTL language default format defined?
Comment 23 ⁨خالد حسني⁩ 2017-04-19 22:17:55 UTC
(In reply to V Stuart Foote from comment #22)
> (In reply to Khaled Hosny from comment #21)
> > Created attachment 132697 [details]
> > Comparison before/after removing the wrong optimization
> > 
> 
> That is more like what I'd expect. Does it also work with a new document
> without a CTL language default format defined?

Yes.
Comment 24 Kovács Viktor 2017-04-20 07:44:26 UTC
I have several stupid questions: what should I do if I would like to use the new feature? Is it solved? Which version has or will have this feature?
Comment 25 V Stuart Foote 2017-04-20 08:37:48 UTC
(In reply to Kovács Viktor from comment #24)
> I have several stupid question: what can I do, if I would like use new
> feature? It solved? Which version has or will have this feature?

Khaled has a patch up [1] for code review that should clear this up and allow direct input of the 10c80:10cff glyphs RTL in line with other LTR Hungarian text. Text input would be using the Special Character dialog, or our toggle method for Unicode codepoints, e.g. enter U+10c8f and toggle to 𐲏 glyph using <Alt>+X 
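
For readers unfamiliar with the U+ notation used for the <Alt>+X toggle, the conversion from notation to character can be sketched like this. The helper name `char_from_notation` is made up for this example and is unrelated to how Writer implements the toggle. The sketch also shows that a code point in the SMP needs two UTF-16 code units (a surrogate pair).

```python
def char_from_notation(notation):
    """Convert a string such as 'U+10c8f' into the corresponding character."""
    if not notation.upper().startswith("U+"):
        raise ValueError(f"expected U+XXXX notation, got {notation!r}")
    return chr(int(notation[2:], 16))


ch = char_from_notation("U+10c8f")

# Code points above U+FFFF are encoded as a surrogate pair in UTF-16.
utf16_units = len(ch.encode("utf-16-le")) // 2

print(ch, hex(ord(ch)), "UTF-16 code units:", utf16_units)
```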

If it all works out it will first be available in nightly builds of master (testing appreciated), and be released at 5.4.0

It may then also be "back ported" to the 5.3 branch

More complete support for Hungarian (Szekely-Hungarian Rovas) as a supported locale in the User Interface awaits translation/transliteration work noted in bug 97406 c#23 [2] and bug 103405

=-refs-=
[1] https://gerrit.libreoffice.org/#/c/36704/

[2] https://bugs.documentfoundation.org/show_bug.cgi?id=97406#c28
Comment 26 Commit Notification 2017-04-27 13:40:52 UTC
Khaled Hosny committed a patch related to this issue.
It has been pushed to "master":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=e3b7ef45d4364fda15691b5748a9a88bc908afc6

tdf#107204: Make sure we always do Unicode Bidi

It will be available in 5.4.0.

The patch should be included in the daily builds available at
http://dev-builds.libreoffice.org/daily/ in the next 24-48 hours. More
information about daily builds can be found at:
http://wiki.documentfoundation.org/Testing_Daily_Builds

Affected users are encouraged to test the fix and report feedback.
Comment 27 V Stuart Foote 2017-04-28 02:26:16 UTC
This works nicely now, the Szekely-Hungarian Rovas script (10c80:10cff) enters RTL *inline* with modern LTR Hungarian text. Enter the Unicode one glyph at a time with <Alt>+X toggle, or pick via the Special Character dialog.

But guess we need to be sure the change to always do ICU bidirectional does not gum things up in multi-script texts. For example one place might be with handling punctuation and word breaks, e.g. Hebrew and English mixed text.

@Frank, could you please take this out on a test drive with today's build of master and comment?
Comment 28 Frank 2017-04-28 15:46:41 UTC
Stuart:

I was about to just delete this from my e-mail when I noticed that it appears that you added my e-mail address specifically. I'm assuming therefore that the request for "@Frank" referred to me. If that's a mistake, sorry to bother you. I'm pretty sure I never saw this bug report before, although it relates to the sorts of things that I followed before giving up on LO.

Although I purged LO a while back (got tired of the wide variety of issues and regressions with the 5.3 updates and needed to get some actual work done), I thought I would download the daily build and take a look anyway (much like Lucy, Charlie Brown, and the football if you follow Peanuts); the file I downloaded was:

libreoffice-5-3~2017-04-27_16.42.44_LibreOfficeDev_5.3.4.0.0_Linux_x86-64_deb.tar.gz

but ran into an error when trying to run either soffice or swriter (didn't try the others): I saw the following in the terminal:

[Java framework] Error in function createSettingsDocument (elements.cxx).
javaldx failed!
Warning: failed to read path from javaldx

and also got a GUI error box with the following:
The application cannot be started. 
[context="user"] caught unexpected com.sun.star.ucb.ContentCreationException: Cannot create folder (invalid path):

And, yes, I did modify bootstraprc. I didn't therefore bother unpacking the language pack or help files.

I'm not sure if this failure is due to an incorrect installation (i.e. my fault) or not, although I'd done it many times in the past years. So, much as I'd like to see if some of the issues I cared about have been addressed, I'm afraid I can't help. If I have some time on the weekend, I might fiddle just to see if I can figure out the whole "invalid path" thing, but that's doubtful.

Frank
Comment 29 V Stuart Foote 2017-04-28 16:38:54 UTC
Stuart -> Frank -- Yes sir 'twas intentional to you. 

Unfortunately I've no idea about that Linux install issue, I don't recall seeing anything recent coming across install related for javaldx; but if you are able to get a current nightly master/5.4.0 build installed think you'll be pleased with some of Khaled's rework of font/script handling for the HarfBuzz based Common SAL Layout. 

The issue here was corrected by removing a filter, so that the ICU bidirectional handling now determines direction for glyphs in LTR "Western" text as well as in our RTL "CTL" complex scripts--allowing the glyphs to change direction according to their Unicode definition.

Where it can be a problem is at the edges of changes between in line script--for numbers and punctuation that might go either direction, or even have glyphs (punctuation or numbers) pulled from the wrong script.

Think we need nuanced hands with polyglot scripts to suss out the revised behavior--and I thought of you :-)

Khaled already picked up one issue for fielded data https://gerrit.libreoffice.org/#/c/37050/
Comment 30 Frank 2017-04-29 15:16:56 UTC
Stuart:

Fixed the install problem by starting over with the same downloaded files – so no idea what I might have done wrong.

Let me express my sympathy for the tasks confronting Khaled, since his efforts are likely only visible to fanatics like myself – even though they go to the heart of what a word processor’s purpose is. But I would urge him to continue (I assume he’s reading this) even knowing that only a few appreciate his efforts.

Khaled and I had some correspondence a while back concerning what you refer to as “the edges of changes between in line script ...” I recognize that the current (and very common) behavior in these situations is the result of a difference in interpreting the various relevant specs (most specifically Unicode Standard Annex #9, the Unicode Bidirectional Algorithm). Apparently I interpret it differently than virtually everyone else. Having had to instruct a variety of folks in how to enter bidirectional text, I believe my interpretation makes more sense to a typical user, but I fully appreciate the risks in changing such a well-entrenched convention.

Since I got tired of explaining bidirectional text entry (to IT personnel, no less), I did a “self-help” paper on this a while back titled “Exploring Bi-Directional Text Entry,” which is the last item on the web page http://www.antikytherapubs.com/i18n.htm. (These papers all relate to multi-lingual, multi-script considerations when designing relational databases; like many coders, database designers can be completely befuddled when confronted with scripts and languages that are “foreign” to them. Nonetheless, it may help illustrate how I came to my interpretation of Annex #9).

The biggest problem even these IT-savvy users seem to have is that typing their code is much more difficult when one hand is occupied with scratching their heads in confusion at the (seemingly) randomly bouncing cursor.

Seriously, though, if that edge issue is to be momentarily set to the side while letting Khaled’s new changes settle in, might I suggest that the subject of rotating text be given some love and attention in the meantime. By this I mean specifically ripping it out entirely and motivating Khaled to rebuild it from scratch (the odd ways rotated text works suggests to me that its implementation has no connection or similarity to the mainstream methods of text entry).

To see what I mean, simply enter some normal text, and choose “Format/Character/Position/Rotation” and choose one of the options. Works just as you’d expect, right?

Now do the same with some right-to-left text (e.g. Hebrew, Arabic). Now try it with a bidirectional sentence. Now try both of those exercises within a table cell. If an author “needs” (well, “wants”) to do either of those things, you can see how this could get frustrating. Sometimes you can rotate the text and sometimes you can’t; sometimes you can, but the order of any right-to-left character sequences are reversed. Generalizing text entry to include rotated text could conceivably simplify LO’s internals - yet another benefit.

I’ll try to take a closer look at the progress in this area if I get a chance, but there is a voice from afar insisting that the weather is good enough to begin our annual post-winter garage cleaning; if Khaled thinks that sounds like less work than fiddling with decades-old spaghetti code, which it probably is, he’s welcome to join us. But I really hope he continues with what he’s doing. I’d really like to go back to Writer.

Frank
Comment 31 Miklos Vajna 2017-05-02 07:16:27 UTC
The patch has a side-effect, see how the tdf91083 bugdoc was rendered before, and how the rendering happens after (I'll attach before/after pdf export result in a moment). I talked to Khaled, he's OK with reverting the patch till it's clear how to fix this bug without side-effects.
Comment 32 Miklos Vajna 2017-05-02 07:17:14 UTC
Created attachment 133001 [details]
Old / before rendering of tdf91083 bugdoc.
Comment 33 Miklos Vajna 2017-05-02 07:17:42 UTC
Created attachment 133002 [details]
New / after rendering of tdf91083 bugdoc.
Comment 34 Commit Notification 2017-05-02 07:24:56 UTC
Miklos Vajna committed a patch related to this issue.
It has been pushed to "master":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=1b9d9a8bb640b7883c143cb39e3024be86528977

Revert "tdf#107204: Make sure we always do Unicode Bidi"

It will be available in 5.4.0.

Comment 35 Miklos Vajna 2017-05-02 07:41:32 UTC
https://gerrit.libreoffice.org/#/c/37050/ fixes the side-effects, so we're OK.
Comment 36 Commit Notification 2017-05-02 07:42:38 UTC
Khaled Hosny committed a patch related to this issue.
It has been pushed to "master":

http://cgit.freedesktop.org/libreoffice/core/commit/?id=270d6db63d933b7350dc8543b9b9ebc2133a116e

tdf#107204: Make sure we always do Unicode Bidi, take two

It will be available in 5.4.0.

Comment 37 ⁨خالد حسني⁩ 2017-10-20 22:06:46 UTC
*** Bug 40036 has been marked as a duplicate of this bug. ***