Hebrew need for another mark in unicode - Sheva Na, Dagesh Hazeq

Scott-Martin Kosofsky · March 2023

Thank you, John. I regret to say that my knowledge of shaping engines beyond those I encounter regularly is limited. As someone who works largely in print, my concerns are centered on whether something works in InDesign and Acrobat, and whether the performance is retained while passing through HTML (a workaround I use for extracting text from certain antiquated apps).

If there’s a not-too-onerous way of asking the Unicode Consortium to free up the holam haser for vav, that would solve Ben’s issue.

It’s worth noting that in metal type, the taamim that share a shape and differ only in position were cast as a single character. For example, geresh, geresh muqdam, and merkha were the same casting (merkha was simply turned upside down); similarly, tifcha, pashta, qadma, and dehi were all the same. Of the three lines required to set Bible texts, the top and the bottom lines were the same height and all of the diacritics were cast on same-size squares. I’ve often thought about how Hebrew fonts would be configured if OpenType (and the mk and mkmk features) had preceded Unicode, and if the shaping engines accommodated more than just singularities.

bdenckla · March 2023

@John Hudson which proposal do you think would be more likely to succeed?

extend U-HHFV to mean ḥolam male on alef
use ZWJ and/or ZWNJ to distinguish a ḥolam male on alef from a ḥolam ḥaser on a letter preceding an alef (details TBD)

Or maybe that is a malformed question since the details that I left TBD need to be determined to answer the question?

While we're at it, there are some ambiguities about how to represent what I call "stress helper" accents in Unicode. I cover some of these ambiguities as "asides" in my Taamey D documentation. Is there a forum where no new code point is being proposed, but a formalization of how existing code points should be used is being proposed? Is this what a Unicode Technical Report is for?

My two issues with stress helpers are these:

Should pashta and zarqa/tsinnor (U-ZINOR) be stress helpers for themselves, or should their "central cousins" be used? (Their central cousins are, respectively, qadma and tsinnorit (U-ZARQA)). (Sorry, discussing zarqa becomes unavoidably confusing due to Unicode naming issues.) This is also, theoretically, a question that should be answered about deḥi, whose central cousin is tipeḥa/tarḥa. I'm not aware of any text that provides stress helpers for deḥi, but IMO Unicode should specify how to do it if one were to do it. (This is inherently not an issue for yetiv.)
Can (or should) some special encoding (ZWJ, etc.) be used when an accent is used as a stress helper for itself? (For accents without central cousins, "self-help" is the only option. Those accents are segolta and the two telisha accents.) For lack of an accepted encoding for "self-help," fonts must infer this from context.
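For reference, the accents named in the two issues above can be laid out by code point. The following is a minimal Python sketch, assuming the "central cousin" pairings exactly as stated in this post. The names `CENTRAL_COUSIN`, `SELF_HELP_ONLY`, and `helper_candidates` are hypothetical; the sketch only enumerates the options that are in question and does not represent any standardized behavior.

```python
import unicodedata

# Accent -> its "central cousin", as paired in the post above.
# Unicode character names are noted because the ZARQA/ZINOR naming is confusing.
CENTRAL_COUSIN = {
    "\u0599": "\u05A8",  # pashta -> qadma
    "\u05AE": "\u0598",  # zarqa/tsinnor (named ZINOR) -> tsinnorit (named ZARQA)
    "\u05AD": "\u0596",  # dehi -> tipeha/tarha
}

# Accents without a central cousin, for which "self-help" is the only option:
# segolta (named SEGOL in the accents block) and the two telisha accents.
SELF_HELP_ONLY = ["\u0592", "\u05A0", "\u05A9"]


def helper_candidates(accent):
    """Return the code points that could encode a stress helper for `accent`.

    The accent itself ("self-help") always comes first; its central cousin,
    if it has one, comes second. Which one *should* be used is the open question.
    """
    candidates = [accent]
    cousin = CENTRAL_COUSIN.get(accent)
    if cousin is not None:
        candidates.append(cousin)
    return candidates


def describe(ch):
    """Format a single character as 'U+XXXX NAME'."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"


for accent in list(CENTRAL_COUSIN) + SELF_HELP_ONLY:
    options = " | ".join(describe(c) for c in helper_candidates(accent))
    print(f"{describe(accent)}: {options}")
```

Listing the candidates this way makes the ambiguity visible: for three accents there are two plausible encodings of a stress helper, and for the other three there is only one, with nothing in the text stream to say that the second occurrence is a helper rather than a real accent.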

For the record, I find the use of "central cousins" to be a bad idea. But what is really a bad idea is the lack of guidance (standardization) on this topic.

John Hudson · March 2023

use ZWJ and/or ZWNJ to distinguish a ḥolam male on alef from a ḥolam ḥaser on a letter preceding an alef (details TBD)

ZWNJ is already the de facto standard for that: it just works, and I think Unicode would be happy to document it as such (if it hasn’t already; it is a while since I read the Hebrew chapter of the standard). Given that this method exists, works, and may already be in use in documents, it is much more likely to remain the preferred option from Unicode’s perspective.
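To make the ZWNJ approach concrete at the code point level, here is a small Python sketch. The details are left TBD in this thread, so the position of ZWNJ used here (between the ḥolam and the following alef) is an assumption made purely for illustration, as are the names `classify`, `PLAIN`, and `MARKED`. The sketch does not say which rendering a font gives to either sequence; it only shows that the two sequences are distinguishable in plain text.

```python
import re
import unicodedata

HOLAM = "\u05B9"  # HEBREW POINT HOLAM
ALEF = "\u05D0"   # HEBREW LETTER ALEF
ZWNJ = "\u200C"   # ZERO WIDTH NON-JOINER

# ASSUMPTION: the ZWNJ sits between the holam and the following alef.
# The thread does not specify the exact sequence.
# Simplification: no accents or other marks intervene between holam and alef.
PLAIN = re.compile(HOLAM + ALEF)
MARKED = re.compile(HOLAM + ZWNJ + ALEF)


def classify(text):
    """Return 'marked', 'plain', or 'none' for holam-before-alef in `text`."""
    if MARKED.search(text):
        return "marked"
    if PLAIN.search(text):
        return "plain"
    return "none"


RESH, SHIN = "\u05E8", "\u05E9"
plain = RESH + HOLAM + ALEF + SHIN          # resh, holam, alef, shin
marked = RESH + HOLAM + ZWNJ + ALEF + SHIN  # same letters, ZWNJ inserted

for label, word in (("plain", plain), ("marked", marked)):
    codepoints = " ".join(f"U+{ord(c):04X}" for c in word)
    print(label, classify(word), codepoints)
```

Because the ZWNJ is an ordinary character in the text stream, a document can carry the distinction without any new code point; whether a given font or shaping engine honors it is a separate matter.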

I’ll digest your queries re. stress helpers on the weekend.

bdenckla · March 2023

@Scott-Martin Kosofsky, based on your feedback, I’ve begun to have some doubts about the alef ḥolam concept I proposed.

My new feeling is that the alef ḥolam concept is not “wrong” per se, but it is only one way of interpreting the meaning of a dot on the upper right of an alef. My new feeling is that it is equally valid to interpret this dot as a “backwards-facing ḥolam ḥaser” dot. In this light it seems far more optional for Unicode to have a way to encode this special dot placement. I.e., in this light, our current situation doesn't seem so bad, where it must be implemented by inference in fonts that want to look this way.

In my small collection of Hebrew Bibles, BHS is alone in having this special dot placement. We may be seeing yet another case of the huge (often undue) influence of BHS. Here I don’t think the influence is harmful, i.e., BHS has not done anything “wrong” per se, but it may have unduly promoted a rather idiosyncratic typographic distinction.

The real question then becomes: is this distinction present in the authoritative medieval manuscripts, and/or is it present in authoritative writings (medieval or modern) on the topic? I do not have answers to those questions. It is hard to study mark placements in the manuscripts. Not only is it a huge amount of work, but it is hard to tell accident from intention, and sometimes even intention is just whim, not “real” intention on the part of the scribe. Also, the scribes' letter spacing (could we call it “tracking”?) is usually so tight that a question like this is hard to answer.

John Hudson · March 2023

By ‘this special dot placement’, you mean the placing of the ḥolam dot above the top right side of alef? That is, as in the first word of 2 Samuel 1:9, which BHS gives as:

A lot of Hebrew typography regularly places the ḥolam as a kind of inter-dot, between a pair of letters, neither over the left side of the first letter nor over the right side of the alef. I believe this is the norm if modern Hebrew is vocalised in pedagogical materials. In scholarly and study Bibles, one sees this convention in Aron Dotan’s edition of BHL:

Also in the JPS Tanakh:

So I think Ben’s question about where the placement in BHS comes from is an interesting and important one. Most directly, and unsurprisingly, it comes from the BHS’s precursor prepared by Rudolf Kittel (BHK), which established most of the typographic conventions used in BHS:

The dot placement in BHK in this example looks a little more equivocal than in BHS, but I think that is due to the font and the way it was composed. Other examples in the same volume have the dot more definitely associated with the alef, e.g.:

So, to the Masoretic manuscripts...

Leningradensis itself, as the source text for BHK, BHS and BHL, seems a good place to start, so one can compare the typographic transcriptions against the manuscript original. There is a digitised version online, but it doesn’t seem to be searchable and isn’t indexed to book, chapter and verse, so someone with better Hebrew knowledge than me needs to go looking.

bdenckla · March 2023

@John Hudson yes that is the dot on the upper right of alef that was my concern. As I mentioned, now I'm less concerned about Unicode's lack of a standard way to "ask" for it. As you mention, it is still an interesting phenomenon to explore, regardless of what level of abstraction is most appropriate for it to "live" at.

Your images illustrate the issue well. I posted some images in an earlier comment on this thread, too.

FYI what I usually do to find the image corresponding to a verse is use Unicode/XML Leningrad Codex (UXLC) (tanach.us) which will tell you what percent into the page you should start looking for a verse.

But really here there's not much need to compare the manuscript with what BHS looks like; I think BHS is consistent enough that just from looking at the manuscript, I pretty much know what BHS will look like. Still, it would be a painstaking process, to read through the manuscript, scrutinizing the placement of ḥolam ḥaser dots and trying to decide what is perhaps undecidable, which is whether, in each instance, the scribe placed the dot in a way that suggests that the dot "belongs" to the letter preceding alef or to alef itself.

In my opinion a good way to think about the ḥolam ḥaser dot is as "belonging" to neither. It "belongs" to a phantom (or implied) vav in between the two letters. In this sense every word with ḥolam ḥaser can be seen as a kind of perpetual qere, i.e., a (pointed) ketiv whose qere is thought to be obvious, and therefore not spelled out. This way of thinking about ḥolam ḥaser makes it very unlike the other vowel marks. Indeed, visually, it is very unlike the other vowel marks! Why is it above, whereas all the other vowel marks are below? Why is it so far to the left (or even on the right of the next letter, if the next letter is a non-consonant alef), instead of centered, as all other vowel marks are?

If we ignore the visual appearance of ḥolam ḥaser, it is similar to other vowel marks, perhaps most similar to qubuts. Logically, qubuts is to shuruq (vav with a shuruq dot) much as ḥolam ḥaser is to vav ḥolam (vav with a ḥolam male dot).
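In code point terms, the analogy can be written out as follows. This is a small Python sketch using the standard `unicodedata` module; the dictionary name `PAIRS` and the helper `names` are hypothetical, and the sequences show only the plain encodings, not any of the disambiguation schemes discussed earlier in the thread.

```python
import unicodedata

VAV = "\u05D5"  # HEBREW LETTER VAV

# qubuts : shuruq  ::  holam haser : vav holam
PAIRS = {
    "qubuts": "\u05BB",           # a single mark on the consonant
    "shuruq": VAV + "\u05BC",     # vav carrying the shuruq dot (DAGESH OR MAPIQ)
    "holam haser": "\u05B9",      # a single mark on the consonant
    "vav holam": VAV + "\u05B9",  # vav carrying the holam male dot
}


def names(seq):
    """Return the Unicode character name of each code point in `seq`."""
    return [unicodedata.name(ch) for ch in seq]


for label, seq in PAIRS.items():
    print(f"{label}: {names(seq)}")
```

In both pairs the "short" member is a lone combining mark and the "full" member is a vav plus a dot, which is the logical parallel being drawn; the visual behavior of the ḥolam dot is where the parallel breaks down.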

But visually, perhaps ḥolam ḥaser has more in common with the "famous" orphan ḥiriq in words related to Jerusalem! (The parent of that orphan ḥiriq is a phantom (or implied) yod.) (Also, I always feel obligated to mention the little-known cousin of that ḥiriq, the much rarer orphan sheva on (only four!) words related to Jerusalem.) In this light, one begins to wonder whether a lot of pain and suffering could have been avoided by having a phantom vav and phantom yod in Unicode. Of course, that might have caused all sorts of other pain and suffering!

bdenckla · May 2025

Here's a follow-up post on sheva na (one of the original topics of this discussion).

bdenckla · May 2025

Here's a follow-up post on dagesh ḥazaq (one of the original topics of this discussion).
