Use existing alternative hyphen glyphs instead of a fallback font #517

Closed
paperboyo opened this issue Jul 30, 2017 · 9 comments

@paperboyo

paperboyo commented Jul 30, 2017

Hello,

I'm terribly sorry if harfbuzz already does this or, indeed, if it's not something harfbuzz should be responsible for (I'm just too lazy to open several bugs against different applications that use harfbuzz :-). I've tried my best to understand if it belongs here, and this and that look like it may. Sorry, and feel free to close if I'm wrong!

Recently, I came across the latest Chrome using a (broken) fallback font's glyph for a missing non-breaking hyphen glyph: guardian/frontend#17506.

Wouldn't it be prudent to just use U+2010 (and then U+002D) from the same font if U+2011 (or U+2010) is missing instead of going down the fallback route which is unlikely to ever look correct?

Thank you!

Regards
m.

@behdad
Member

behdad commented Aug 2, 2017

> I'm terribly sorry if harfbuzz already does this or, indeed, if it's not something harfbuzz should be responsible for (I'm just too lazy to open several bugs against different applications that use harfbuzz :-). I've tried my best to understand if it belongs here, and this and that look like it may. Sorry, and feel free to close if I'm wrong!

We used to do that for all compatibility decompositions, but there was backlash (for good reason), so we disabled it and implemented spaces specially. I agree that we should do it for the noBreak cases. Thankfully there are only five such characters; three of them are spaces and already handled. That leaves two, which I will implement:

grep ';<noBreak' UnicodeData.txt   | grep -v WS | grep -v Zs
0F0C;TIBETAN MARK DELIMITER TSHEG BSTAR;Po;0;L;<noBreak> 0F0B;;;;N;;;;;
2011;NON-BREAKING HYPHEN;Pd;0;ON;<noBreak> 2010;;;;N;;;;;
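
For readers without a local copy of UnicodeData.txt, roughly the same query can be sketched with Python's standard `unicodedata` module. This is only an approximation of the grep above: the function name is made up for illustration, and the result depends on the Unicode version bundled with the Python interpreter.

```python
import sys
import unicodedata


def non_space_nobreak_chars():
    """Return (codepoint, base) pairs whose decomposition is tagged <noBreak>,
    skipping characters that are spaces (category Zs or bidi class WS)."""
    found = []
    for cp in range(sys.maxunicode + 1):
        ch = chr(cp)
        decomp = unicodedata.decomposition(ch)
        if not decomp.startswith("<noBreak>"):
            continue
        # Mirrors `grep -v WS | grep -v Zs` from the command above.
        if unicodedata.category(ch) == "Zs" or unicodedata.bidirectional(ch) == "WS":
            continue
        # The decomposition string looks like "<noBreak> 2010".
        base = int(decomp.split()[1], 16)
        found.append((cp, base))
    return found


if __name__ == "__main__":
    for cp, base in non_space_nobreak_chars():
        print(f"U+{cp:04X} {unicodedata.name(chr(cp))} -> U+{base:04X}")
```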

Thanks!

@paperboyo
Author

paperboyo commented Aug 2, 2017

Thanks, @behdad!

So, after your change, U+2010 will be used when U+2011 is missing. Great and shouldn't be controversial.

What about using U+002D when U+2010 itself is missing, too? One could argue that this may be more controversial (the design of a hyphen-minus could differ from the design of a hyphen). But I would say that if that were the intention, the font designer would include U+2010 in the font.

As things stand, many standard fonts - among them e.g. the latest Georgia (v. 5.058) and latest Noto Serif - do not have a glyph at U+2010. So your change would still see the fallback go beyond the initial font.

In my humblest of opinions there shouldn't be any backlash for also using U+002D when U+2010 is missing. WDYT?
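
A quick way to see which of these three code points a given font covers could look like the sketch below. It assumes the caller already has the font's character map as a container of code points (for instance the dict returned by fontTools' `TTFont(path).getBestCmap()`). The helper name and the sample cmap are hypothetical and are not data taken from Georgia or Noto Serif.

```python
# Code points in the order discussed above:
# NON-BREAKING HYPHEN, HYPHEN, HYPHEN-MINUS.
HYPHEN_CHAIN = (0x2011, 0x2010, 0x002D)


def hyphen_coverage(cmap):
    """Report which hyphen code points are present in `cmap`.

    `cmap` is any container of supported code points (set, dict, ...).
    """
    return {f"U+{cp:04X}": cp in cmap for cp in HYPHEN_CHAIN}


if __name__ == "__main__":
    # Made-up cmap for a font that only ships HYPHEN-MINUS.
    sample_cmap = {0x0041: "A", 0x002D: "hyphen"}
    print(hyphen_coverage(sample_cmap))
```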

@behdad
Member

behdad commented Aug 3, 2017

> So, after your change, U+2010 will be used when U+2011 is missing. Great and shouldn't be controversial.

Correct.

> What about using U+002D when U+2010 itself is missing, too? One could argue that this may be more controversial (the design of a hyphen-minus could differ from the design of a hyphen). But I would say that if that were the intention, the font designer would include U+2010 in the font.
>
> As things stand, many standard fonts - among them e.g. the latest Georgia (v. 5.058) and latest Noto Serif - do not have a glyph at U+2010. So your change would still see the fallback go beyond the initial font.
>
> In my humblest of opinions there shouldn't be any backlash for also using U+002D when U+2010 is missing. WDYT?

I'm not comfortable doing that. U+2010 does look distinctly wider than U+002D in most fonts. I'm sure good typographers wouldn't like me to substitute that... cc @jfkthame

@behdad
Member

behdad commented Aug 3, 2017

Actually, looks like we do handle U+2011, or at least try to. See 52e6c4e

Is this definitely not working for you?

@paperboyo
Author

> U+2010 does look distinctly wider than U+002D in most fonts.

Very true (although the other way around - minus is usually wider :)! But wouldn't substitution happen only if U+2010 is missing, i.e. in a situation where the font designer hasn't included a distinct U+2010? In my (limited!) experience, when U+2010 is missing, U+002D looks like a hyphen, and maybe was even treated as a genuine hyphen by the designer (going by the name HYPHEN-MINUS). A case in point would be Georgia again.

> Actually, looks like we do handle U+2011, or at least try to.

🎆 My bad, I haven't actually run the project myself. And the font that brought me here in the first place is missing both U+2011 and U+2010 (and U+002D looks and talks and walks like a hyphen).

I suspect 52e6c4e is working correctly, so I will wait on your decision regarding a fallback from missing U+2010 to U+002D and will close this. Sorry for the confusion and thank you!

@behdad
Member

behdad commented Aug 3, 2017

Sure, let's see what Jonathan says.

BTW, most users of HarfBuzz will not use this functionality, as they do font selection independently. The exceptions currently are Chrome and LibreOffice, which use a HarfBuzz-driven font fallback scheme.

@jfkthame
Collaborator

jfkthame commented Aug 3, 2017

My preference would be not to do U+2010 / U+002D substitution. IMO, harfbuzz is supposed to shape the text provided, using the font and features specified; deciding on substitutions for characters that are not supported by a given font should be handled at a separate level. (A client could presumably implement such a substitution by providing a custom get_glyph callback, if it doesn't want to do it in a separate layer prior to calling hb_shape.)

U+2011 / U+2010 seems acceptable, given that the only difference between these characters is expected to be breaking behavior. But mixing HYPHEN-MINUS into this would be a clear step onto a slippery slope of "probably look-alike" replacements (why not do it for U+2212 as well, after all?), and I think we should avoid that.
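
A minimal sketch of what such a separate client-side layer might look like, written in Python for illustration only. Nothing here is HarfBuzz API: `substitute_unsupported`, `has_glyph` and both fallback tables are hypothetical names, and the look-alike table is an opt-in policy the client would have to choose for itself. Line breaking would still need to be decided from the original text, since replacing U+2011 discards its no-break property.

```python
from typing import Callable, Dict

# Hypothetical default policy: only the pair that differs in breaking behaviour.
DEFAULT_FALLBACKS: Dict[int, int] = {0x2011: 0x2010}

# Hypothetical opt-in policy that also accepts HYPHEN-MINUS as a look-alike.
LOOKALIKE_FALLBACKS: Dict[int, int] = {0x2011: 0x2010, 0x2010: 0x002D}


def substitute_unsupported(
    text: str,
    has_glyph: Callable[[int], bool],
    fallbacks: Dict[int, int] = DEFAULT_FALLBACKS,
) -> str:
    """Replace characters the font lacks by following `fallbacks`.

    `has_glyph(cp)` stands in for whatever cmap lookup the client uses.
    If the chain ends on a code point the font also lacks, the original
    character is kept so that ordinary font fallback can still handle it.
    """
    out = []
    for ch in text:
        cp = ord(ch)
        seen = set()  # guards against accidental cycles in the table
        while not has_glyph(cp) and cp in fallbacks and cp not in seen:
            seen.add(cp)
            cp = fallbacks[cp]
        out.append(chr(cp) if has_glyph(cp) else ch)
    return "".join(out)


if __name__ == "__main__":
    # Made-up font coverage: letters and HYPHEN-MINUS only.
    supported = {ord(c) for c in "ab-"}
    text = "a\u2011b"
    print(ascii(substitute_unsupported(text, supported.__contains__)))
    print(ascii(substitute_unsupported(text, supported.__contains__, LOOKALIKE_FALLBACKS)))
```

With the default table the text comes back unchanged for a font that lacks both U+2011 and U+2010, which leaves the decision to the normal font fallback path; only the opt-in table maps it down to U+002D.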

@behdad
Member

behdad commented Aug 3, 2017

I agree.

@behdad behdad closed this as completed Aug 3, 2017
@paperboyo
Author

Another font where U+2010 is a direct (component) copy of U+002D is an alpha version of Noto Serif:

[image: noto]

I would think that if font designers want to include a proper minus, they use U+2212 MINUS SIGN...

> BTW, most users of HarfBuzz will not use this functionality, as they do font selection independently. The exceptions currently are Chrome and LibreOffice, which use a HarfBuzz-driven font fallback scheme.

Thanks, that's very useful info (sorry to have turned this issue into a pre-school for an ignoramus). I care mostly about browsers (so I will have a look at Edge and Firefox independently). I'm pretty sure Adobe apps already cleverly substitute U+2011>U+2010>U+002D (because our print fonts are missing the former two), but I can't double-check now.

@jfkthame

> U+2011 / U+2010 seems acceptable, given that the only difference between these characters is expected to be breaking behavior. But mixing HYPHEN-MINUS into this would be a clear step onto a slippery slope of "probably look-alike" replacements (why not do it for U+2212 as well, after all?), and I think we should avoid that.

Yep, agreed. Thank you!
