Wiktionary talk:About Hebrew

Archives:
2007 · 2008 · 2009–10

Shortcut:
WT:T:AHE

"Participle" as a part-of-speech header in entries

There's been some discussion that has led to the proposal that Hebrew should use "Participle" as part-of-speech header (==Participle==) for the present-tense forms of all binyanim (e.g., זָבָה (zavá), מְבֻשָׁל (m'vushál)) and/or for the (various numbers and genders of) words of the form of אָהוּב (ahúv); also for construct forms of the same (זָבַת (zavát), מְלַמְּדֵי (m'lam'dé)), and forms with suffixes indicating possessors/objects (טוֹעֲמֶיהָ (toaméha)). Please see that discussion for more, and discuss and opine here. (Include in your discussion any other forms you think should be so headed.)​—msh210 (talk) 06:48, 23 March 2011 (UTC)

An initial opinion: I think it's a good idea for the present-tense forms (including construct or suffixed forms). That'd solve the issue of what header to use for them (verb? noun sometimes (מְלַמְּדֵי (m'lam'dé))? adjective sometimes (מְבֻשָׁל (m'vushál))?) I'm unsure as yet about אָהוּב (ahúv) and its forms.​—msh210 (talk) 06:48, 23 March 2011 (UTC)
I don't think it really solves the question of words like m'vushál, since then we have to ask — is it the present participle and present tense of bushál, or just the passive participle of bishél? Note that (1) although בושל \ בֻּשַּׁל (bushál, to be cooked) does exist, it's otherwise rare (at least in Modern Hebrew), and (2) its meaning is eventive, whereas m'vushál is normally stative: ha'óf bushál etmól implies that today it is m'vushál. In this respect m'vushál behaves more like the "past participle" כָּתוּב (katúv, written) than like the "present participle" נִכְתָּב (nikhtáv, written). (Of course, m'vushál can be used eventively, like nikhtáv, but that's rare. Specifically, it's exactly as rare as the other forms of bushál, which are always eventive.) So if we adopt ===Participle=== for forms like m'vashél and ===Adjective=== for forms like katúv, then I think we'll have to have two entries for m'vushál, one with each POS. Which, I hasten to add, I am not opposed to. I actually think that might make for a very clear presentation. But it doesn't resolve any questions; for any given m'fu'ál form, we'd still have to consider it individually to determine whether it's just a ===Participle===, just an ===Adjective===, or both. —RuakhTALK 14:42, 23 March 2011 (UTC)
I think Participle is not the best choice, possibly even a poor one. However, I too need a little more time to study the options. In the meantime I would like to direct your attention to a paper that raises this question and describes how it was resolved in the context of producing a Hebrew treebank. Its authors are linguists as well as lexicographers, and we would be standing on the shoulders of giants if we could find a way to emulate their solution within the wiki framework: http://mila.cs.technion.ac.il/treebank/tal.pdf OrenBochman 21:38, 23 March 2011 (UTC)
Thanks for that link. I don't pretend to understand all of it, but those authors apparently do tag some present-tense verbs as finite verbs, and do acknowledge that ha- is sometimes a relativizer rather than a definiteness marker, but they apparently tag them as nouns or adjectives when they're in the construct state. —RuakhTALK 21:41, 24 March 2011 (UTC)

new template:he-imperative of

Fyi, I've created this along the lines of the existing "he-[tense] of" templates (and, like them, with the parallel template:he-Imperative of).​—msh210 (talk) 17:58, 25 March 2011 (UTC)

nikúd+romanization as a sense label (context) and as a pronunciation label (accent).

Currently, if a different form has different nikúd or different romanization, then it has to have its own headword line, which means it has to have its own POS section, which normally means that it ends up with its own etymology section. It's very unwieldy, and I think it makes entries difficult to navigate.

So I was thinking, what if we created a template to be used at the head of a sense line that would give nikúd and transliteration specific to that form? I was thinking it could look something like this:

Verb

נִמְשַׁךְ (nimshákh) (nif'ál construction)

  1. (intransitive) To continue, to be continued.
  2. (נִמְשָׁךְ, nimshákh) Masculine singular present participle and present tense of נמשך (nimshákh).

Code-wise, I was thinking {{he-wv|נִמְשָׁךְ|tr=nimshákh}}, with the transliteration being optional, but I'm very open to other ideas. (I'm also open to ideas on how to handle the case where nikúd requires a different, defective spelling. I was thinking that just the defective spelling needs to go in the template — after all, the non-defective spelling is already in the headword, and it's not needed for disambiguation — but I'm not sure.)

Since this approach means we can't have separate pronunciation sections, I was thinking this template could also be used in the pronunciation section, alongside regular accent templates, when necessary.

What do y'all think? Does that seem like a good idea?

RuakhTALK 22:53, 29 May 2011 (UTC)

IMO for some things, the form entry can simply be omitted. For example, our headword line for יָשַׁב includes information on the pausal form יָשָׁב, which then doesn't have its own section on the page, and IMO that's sufficient. (Do you agree? Disagree?) But where it's needed (and I agree that the present tense masc. sing. of nif'al verbs is such a case) ... well, I'm not sure about your suggestion. It sounds good at a glance, but is a big change that I, at least, would need to think more about. —msh210 (talk) 06:52, 30 May 2011 (UTC)
Re: יָשָׁב: I'm O.K. with that; but if we do adopt my suggestion, then I think it might as well be used for this, too. At least, maybe not so much יָשָׁב, but I'd like to have a place to put דְּבַר־'s transliteration, and some way of giving its pronunciation someday.   Re: big change, requiring more thought: Definitely, take your time. I took a long time thinking about it, myself, so I certainly can't deny you the same privilege. —RuakhTALK 12:37, 30 May 2011 (UTC)
Re דְּבַר־: The pronunciation and transliteration of the pausal form יָשָׁב are on the page despite there not being a dedicated sense line (let alone section) for it. The same can be done with construct forms IMO. Moreover, even if we do adopt your suggestion, I'm not sure it should be used for pausal forms which are both rare/archaic words and very much just-another-form-of-the-other-word words.—msh210℠ on a public computer 14:34, 30 May 2011 (UTC)
Huh, interesting. I never noticed pausaltr=. Honestly, I'm starting to think it might be better if pausal forms only had their own sense lines — that is, if they weren't in the inflection line at all. We just have so much in the inflection line of noun entries (lemma + transliteration + defective spelling, gender, plural indefinite + defective spelling, singular construct + defective spelling, plural construct + defective spelling, other-gender counterpart + defective spelling, pausal form + transliteration), and almost as much in the inflection line of adjective entries. I mean, since I'm not the one adding pausal forms, it's no skin off my back if you prefer the current system; but I'm a bit loath to start adding singular-construct-transliteration in there as well. And I don't like not having a way to list their pronunciations short of giving them their own sections; not that I usually add pronunciations anyway, but it would be nice to be able to. —RuakhTALK 00:23, 31 May 2011 (UTC)
There is too much on the headword line. Re "not having a way to list their pronunciations short of giving them their own sections": That'd be an issue whether the forms appear in the headword line or in dedicated sense lines. However, (in either of those cases,) it's a nonissue really: see, e.g., יָשָׁב, whose pronunciation is given in the =Pronunciation= section despite the word's appearing only on the headword line.​—msh210 (talk) 05:20, 31 May 2011 (UTC)
Oh, you've already been using nikúd as an accent template! So I guess that aspect of my proposal is not going to be controversial. :-)   —RuakhTALK 10:24, 31 May 2011 (UTC)
And then context tags would appear after {{he-wv}}, as # {{he-wv|נִמְשָׁךְ|tr=nimshách}} {{rare|_|form|lang=he}} {{he-Present of|נמשך|tr=nimshákh|g=m|n=s|.=.}}?—msh210℠ on a public computer 14:44, 30 May 2011 (UTC)
I think so, and I admit that it looks a bit odd — but then, I'm not sure it ever actually comes up. I mean, when was the last time you put a context template on a form-of? —RuakhTALK 00:23, 31 May 2011 (UTC)
I'm still thinking about the major ideas here, but re this minor point, if you grep the dump for the string {{rare|_|form|lang=he}}, you should find a few entries represented among the results.​—msh210 (talk) 03:45, 31 May 2011 (UTC)
Yes, I think this will work and is a good idea, for differing tenses of verbs and for construct forms. The pausal forms just seem to me to be so similar to the base form (they have the same meaning, for example) that they don't deserve a separate sense line. They should stay in the inflection line. But if you really think otherwise, then okay, I guess.​—msh210 (talk) 18:16, 1 June 2011 (UTC)
O.K., I'll start doing that, then. Re: pausal forms: I do think otherwise, but since I'm not the one adding them, it doesn't really matter what I think. —RuakhTALK 19:09, 1 June 2011 (UTC)
It does, but okay.​—msh210 (talk) 20:11, 1 June 2011 (UTC)
Does this mean that past-tense masculine singular third-person forms of verbs will get their own sense line for such verbs as are defined as "to..."?​—msh210 (talk) 18:21, 1 June 2011 (UTC)
I hadn't planned to do that, no. Adding "form of" definitions for lemmata seems like it should be a Wiktionary-wide change, not Hebrew-specific. (In Hebrew the disparity between lemmata and to-infinitives is maybe more salient, but even in English the plain form of a verb is not equivalent to a to-infinitive: the former has various finite uses, and even among its non-finite uses there are fairly few where "to" can be freely inserted or deleted.) —RuakhTALK 19:09, 1 June 2011 (UTC)

The he-[form]-of templates probably shouldn't require tr now that the template might be used under the same headword as the tr is sought for.​—msh210 (talk) 18:07, 12 September 2011 (UTC)

Yeah. Or, more specifically: they should notice if the lemma is the current page-name, and if so (1) not link to it and (2) not require tr=. —RuakhTALK 22:55, 12 September 2011 (UTC)
No, because the lemma parameter would be the vowelized form (with <nowiki/> next to it to avoid linking), not the unvowelized. (Preferably.) So it wouldn't match the pagename.​—msh210 (talk) 23:14, 12 September 2011 (UTC)
I dunno, I think if we want to display the lemma with nikúd — and I've noticed that you do — then we should have actual parameters wv= and dwv=. —RuakhTALK 01:50, 13 September 2011 (UTC)
Like diff? (Well, that's wv, not dwv yet.)​—msh210 (talk) 15:47, 14 September 2011 (UTC)
Yes, I think so, except that it needs to be {{#ifeq:, not {{#ifeq|. (And maybe other changes, too; I haven't tested.) And probably the whole 1/wv/dwv logic should be in some sort of helper template that other form-of templates can use as well? —RuakhTALK 16:46, 14 September 2011 (UTC)
That's a grand idea. (Thanks for the revert, by the way.)​—msh210 (talk) 16:51, 14 September 2011 (UTC)
See {{he-lemma}} for one approach. It takes four parameters, 1 & wv & dwv & tr. It's mandatory to include either 1 or wv, in that it will display a literal {{{1}}} if both of them are absent/blank. It won't add any cleanup categories or anything, though; it leaves that for the calling template to do. —RuakhTALK 02:27, 16 September 2011 (UTC)
Looks good! Thanks.​—msh210 (talk) 15:34, 18 September 2011 (UTC)
I see you (Ruakh) have added it, and removed the requirement for tr when lemma is the current pagename, to he-form of noun: thanks. I've followed your lead and added it to the other Hebrew form-of templates except for defective spelling of and excessive spelling of.​—msh210 (talk) 22:14, 19 September 2011 (UTC)
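For readers who want the lemma logic from this thread in one place, here is a rough Python model of it. It is only a sketch, not the actual code of {{he-lemma}}: the function names, the returned dictionary, and the rule of stripping nikúd from wv to guess the page name are assumptions made for illustration. dwv is left out because the thread does not say how it is displayed.

```python
import unicodedata


def strip_nikud(text):
    """Drop Hebrew combining marks (vowel points, dagesh, shin/sin dots)."""
    return "".join(
        ch for ch in text
        if not ("\u0591" <= ch <= "\u05C7" and unicodedata.combining(ch))
    )


def he_lemma(pagename, p1="", wv="", tr=""):
    """Hypothetical model of the lemma display described in the thread."""
    if not p1 and not wv:
        # Neither 1 nor wv was supplied: show the literal placeholder.
        return {"text": "{{{1}}}", "linked": False, "needs_tr": False}
    # Assumption: when only wv is given, the page name is wv minus nikud.
    target = p1 or strip_nikud(wv)
    same_page = target == pagename
    return {
        "text": wv or p1,                       # prefer the vowelized form
        "linked": not same_page,                # do not link to the current page
        "needs_tr": not same_page and not tr,   # tr is optional on the same page
    }
```

Any cleanup categories would still be the calling template's job, as noted above; the model only reports whether a transliteration appears to be missing.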

Adding pausal=1 and defective=1 options to "form of" templates.

I was thinking that maybe the Hebrew "form of" templates should all accept Boolean pausal=... and defective=... options that just tack notes onto the end, to look something like this:

  1. Second-person feminine plural past tense (suffix conjugation) of פו (foo), pausal form, defective spelling.

perhaps with links.

Does this make sense?

It could even conceivably be used with pausal forms or defective spellings of what are otherwise lemma forms:

  1. Third-person masculine singular past tense (suffix conjugation) of פו (foo), pausal form, defective spelling.

though I'm not sure whether that's a good idea or not. (If we do want to do that, then we'll have to modify the templates to recognize the possibility. Currently {{he-past of}} adds an attention category if it's used with p=3, g=m, n=s, but it's easy to change it to allow that case as long as pausal= or defective= is set.)

Any thoughts? I don't add defective spellings very often, and I don't think I've ever added even a single pausal form, so it's not worth it unless someone else thinks they might use it. :-P

RuakhTALK 02:42, 30 May 2011 (UTC)

Sounds good. Perhaps excessive, too, for, e.g., אוכל (óchel, food, noun)?​—msh210 (talk) 06:47, 30 May 2011 (UTC)
excessive= makes sense in general to have, but why for אוכל? That's its usual spelling! —RuakhTALK 12:25, 30 May 2011 (UTC)
Heheh, that's its usual spelling when it's spelled without vowels, yes. Back to this discussion again.—msh210℠ on a public computer 14:38, 30 May 2011 (UTC)
Actually, this sounds good for defective (and, as I suggested above, excessive) spellings, but I'm not sure it's necessary for pausal forms, simply because I can't think of any pausal form spelled differently (modulo vowels) from its non-pausal form. (But perhaps such do exist.) But this depends on the discussion in the preceding section: if we want dedicated sense lines for pausal forms spelled the same (modulo vowels) as their main forms, then such a parameter in the sense-line templates would be desirable also.—msh210℠ on a public computer 14:38, 30 May 2011 (UTC)
Even if we don't want sense lines for pausal forms of lemmata, it seems like we should have sense lines for pausal forms of non-lemmata, unless we want to put those in the inflections lines of non-lemma forms. (I suppose this might also depend, to some extent, on the discussion in the preceding section: if every non-lemma form gets its own section, then it's at least possible to put its pausal variant in the inflection line, whereas if non-lemma forms can share sections with each other and/or with their lemmata, then that seems impossible. Er, though, in that case I guess {{he-wv}} could have some additional parameters for pausal forms . . .) —RuakhTALK 00:10, 31 May 2011 (UTC)
O.K., I'll just implement this with defective=1 and excessive=1, then. We can implement pausal=1 if and when you decide you want it. —RuakhTALK 19:10, 1 June 2011 (UTC)
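The behaviour proposed in this section can be modelled in a few lines of Python. This is a sketch under assumptions, not the templates' real code: the function names are invented, and the wording "excessive spelling" for the excessive= note is a guess by analogy with the two notes shown in the proposal.

```python
def form_of_line(description, lemma, tr, *, pausal=False,
                 defective=False, excessive=False):
    """Build a definition line, tacking the optional notes onto the end."""
    notes = []
    if pausal:
        notes.append("pausal form")
    if defective:
        notes.append("defective spelling")
    if excessive:
        notes.append("excessive spelling")  # assumed wording
    return ", ".join([f"{description} of {lemma} ({tr})"] + notes) + "."


def needs_attention(p, g, n, *, pausal=False, defective=False):
    """Flag p=3, g=m, n=s (the lemma form) unless a variant flag is set."""
    is_lemma_form = (p, g, n) == ("3", "m", "s")
    return is_lemma_form and not (pausal or defective)
```

The second function mirrors the change suggested for {{he-past of}}: the third-person masculine singular past is normally the lemma itself, so a form-of line for it is suspicious unless it describes a pausal form or a defective spelling.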

marking stress in monosyllabic words' transliterations

(This is an outgrowth of a discussion at Ruakh's talkpage, hereby brought hither so more people who care will, perhaps, see it.) Should we mark stress in monosyllabic words' transliterations, as in בַּד (bád)? We've been omitting such marks, as in בַּד (bad) and עַל כָּל פָּנִים (al kol paním). Arguments in favor include consistency (with polysyllabics, especially with polysyllabics in the same transliterated phrase as the monosyllabic) and readability (so no one thinks that's the English word bad or pronounced the same as it). Arguments against include lack of necessity and clutter. Other arguments may also exist, of course, and you can add them in your bullet point below. Please opine. The result of this will be enshrined on the main page as the standard for Hebrew. —msh210 (talk) 15:54, 6 July 2011 (UTC)

  • Weakly support, for the sake of consistency within a transliterated phrase.​—msh210 (talk) 15:54, 6 July 2011 (UTC)
  • I have no problem with doing this, if it's going to be implemented consistently. Stephen often does this for Russian. --EncycloPetey 16:18, 6 July 2011 (UTC)
  • Note that we're already doing this for polysyllabic words, such as bóker and bokér; the question here is just about monosyllabic words. (Sorry if you already caught that; somehow something in your comment made me think you might not.) —RuakhTALK 16:43, 6 July 2011 (UTC)
  • I also weakly support. Another argument in favor is that sometimes it's a bit debatable whether a given word is "monosyllabic", especially across different forms of Hebrew. (Consider v'gám and nóakh for two types of example.) —RuakhTALK 16:43, 6 July 2011 (UTC)
    • ...and bét-dín for another, though that issue is not obviated by the adoption of this new transliteration rule.​—msh210 (talk) 18:58, 17 October 2011 (UTC)

Fine: I've edited the main page to indicate that stress should be marked even on a monosyllabic word. Thank you all^W both for your input. —msh210 (talk) 21:38, 15 July 2011 (UTC)
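A quick way to find transliterations that predate this rule is to look for words carrying no acute accent at all. The following Python sketch is illustrative only. The function names are made up, words are split on whitespace (so a hyphenated group such as bét-dín counts as one word), and it deliberately does not try to count syllables, since that is exactly what the discussion above calls debatable.

```python
import unicodedata

ACUTE = "\u0301"  # combining acute accent, present in decomposed á, é, í, ó, ú


def stress_mark_count(word):
    """Count acute accents, whether precomposed or combining."""
    return unicodedata.normalize("NFD", word).count(ACUTE)


def words_missing_stress(translit):
    """Return the whitespace-separated words that carry no stress mark."""
    return [w for w in translit.split() if stress_mark_count(w) == 0]
```

For example, words_missing_stress("al kol paním") returns ["al", "kol"], the two monosyllables that the old practice left unmarked.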

Transliteration of ח

The page says ח can be both kh and ch. Is it always the same sound, /x/? --Anatoli 06:26, 19 September 2011 (UTC)

What's the meaning of ' in the transliteration of שחור (sha'chor)? The translation for black just gives shakhor. --Anatoli 06:30, 19 September 2011 (UTC)
Re: first question: Well, different people pronounce the letter differently from others, and there's good reason to think that historically (until at least Late Antiquity) it was pronounced somewhat differently in some words from others; but the <kh> vs. <ch> thing doesn't actually reflect those differences. It's just that we haven't reached an agreement yet on which to use, so for now both are allowed.
Re: second question: Those transliterations were added long before we had Wiktionary:About Hebrew, and have never been updated. It looks like Dubaduba (talk · contribs) was using ' to denote a syllable-break. (Or maybe he was using it both for /ə/ and for /ˈ/? If so, wow, confusing.) I'll go fix that entry now.
RuakhTALK 12:05, 19 September 2011 (UTC)
Just to clarify — not that Ruakh has said anything here with which I disagree — ח is I think most often pronounced /χ/ (uvular), though many pronounce it as /ħ/ (pharyngeal) and I think some pronounce it as /x/ (velar). (Nowadays.)​—msh210 (talk) 15:15, 19 September 2011 (UTC)
Thank you both. I'm a bit confused about the varieties of /χ/, /x/ and /ħ/ represented by one letter. I guess I'll have to read about the sound mergers and Biblical/Modern Hebrew phonology. --Anatoli 22:20, 19 September 2011 (UTC)
I suppose you already know, but am writing this in case you don't, that there's also כ, pronounced similarly. (That one I don't think is pronounced /ħ/: it's just /χ/, for some people /x/. Come to think of it, I'll change my above statement from "I think some pronounce [ח] as /x/" to "some pronounce [ח] as /x/".)​—msh210 (talk) 22:40, 19 September 2011 (UTC)

support for direct objects in verb-form templates

I've added (but have not yet documented) support, in template:he-future of and he-Future of, for direct objects attached to the verb form (as in הַשְׁלִיכֵהוּ (hashlichéhu, cast it, imperative)), with an eye to adding it to all finite-verb-form-definition-line templates. Suggestions and edits will be most welcome.​—msh210 (talk) 18:51, 17 October 2011 (UTC)

I guess template:he-infinitive of, too.​—msh210 (talk) 19:08, 17 October 2011 (UTC)

Paleo-Hebrew

Do Ancient Hebrew words in Phoenician script get entries? If so, how are they formatted? (Has this issue been discussed at all before?) --Yair rand 20:25, 14 November 2011 (UTC)

We don't currently have any, but I think that they would be welcome, where attested, if we have any editors with the knowledge to add them. But I think they should just be pointers to the square-script spelling. (Even the four-letter Name of G-d, which was written in the Paleo-Hebrew alphabet for centuries after the adoption of the square script — I remember seeing it on the Dead Sea Scrolls in the Israel Museum — would be best addressed by a usage note at the square-script spelling, IMHO.) —RuakhTALK 21:23, 14 November 2011 (UTC)

stress in unstressed words

In the (Masoretic, and hence only current, version of the) Bible, often a word is connected by a maqaf to the following word. (Sometimes multiple words are so joined.) In such a case, no word except the last has primary stress (though sometimes another word (or words) has secondary (meteg) stress).

Previously, I marked the stress of such words anyway: וְכָל צְבָאָם (v'chól ts'vaám) (Gen. 2:1).

However, there are a few problems with this: (1) Simply, it's incorrect. For example, וְכָל really doesn't have (at least primary) stress (and we don't mark secondary stress anywhere else). (2) It leads to oddities of transliteration. For example, the general rule is that there is never a stressed syllable with a kamatz katan; yet וְכָל is being marked as stressed.

For these reasons, I've recently started not marking stress in such transliterations. In order to clarify where the word's stress is, I have inserted maqafs into the Hebrew and hyphens into the transliteration (וְכָל־צְבָאָם (v'chol-ts'vaám)).

(Let me stress that this is only for biblical passages.)

I am, of course, willing to abide by the community's agreement that using stress marks is better. Thoughts?​—msh210 (talk) 20:47, 19 March 2012 (UTC)

As an example of where I did this, see the quotes from Samuel and Isaiah at נטל.​—msh210 (talk) 07:02, 20 March 2012 (UTC)
I agree with b'char-l'chá and kol-y'mé, but am ambivalent about cases where the first word has multiple syllables. —RuakhTALK 14:35, 20 March 2012 (UTC)
What's the difference?
Possibly relevant: Deut. 22:12 (link to a PDFed print version, since s:he:, here as often, skips the meteg) has תַּֽעֲשֶׂה־לָּ֖ךְ with secondary stress on the tav and primary on the lamed. (If I recall correctly, classical grammarians differ as to whether there's also (non-notated) stress on the sin, but I think standard practice is to omit it.) When I was marking stress on hyphenated words ("v'chól ts'vaám") I never knew what to do with things like this: "táase lách" looked wrong, since the standalone word is "taasé"; but "taasé lách" is wrong in context. (I think I generally went with "táase lách".) In any event, (a) this is another part of what motivated me to switch to not marking stress in such words at all (although another solution would be to add hyphens and to always mark secondary stress) and (b) how would the part of you (Ruakh) that is tending toward marking stress in polysyllabic pre-maqaf words mark that?​—msh210 (talk) 16:44, 20 March 2012 (UTC)
It would transliterate it as tá'ase-lákh, because it looked wrong to you. I don't think it takes much knowledge of Hebrew prosody to know that the primary stress goes on the second word, but it takes much more knowledge to know which syllable of tá'ase is stressed in such a case. (At least — I knew the first, but not the second. I'm not completely shocked, since I believe the same stress shift happens with vav hahipukh — if I'm not mistaken, "and you did" would be v'tá'ase, or maybe v'tá'as — but still, it wouldn't have occurred to me if you hadn't pointed it out.) Also, it's beyond the capacity of transliteration to paint for readers a full picture of a pasuk or sentence's prosodic patterns; when there's a conflict between identifying the prosody of an individual word and identifying that of a phrase, part of me would rather resolve it in favor of identifying the prosody of an individual word. (The other part of me dislikes the inconsistency inherent in making such a distinction, and wonders how it would apply to cases where more than two words are linked by makafs, and wonders if this same sort of inconsistent logic should also be applied when an unstressed single-syllable word is followed by a heavily stressed word but without a makaf.) —RuakhTALK 21:48, 20 March 2012 (UTC)
Makes sense. But the logical part of me (or the pseudo-OCD part, or something) finds marking stress on pre-maqaf polysyllabics and on standalone monosyllabics, but not on pre-maqaf monosyllabics, unpalatable. So if you really meant that you agree to not marking the monosyllabics and are ambivalent about the polysyllabics, then I'd prefer we go with not marking any. If, otoh, you really think we should mark the polysyllabics, then I'd prefer to mark the monosyllabics also. On the third hand, perhaps I should seek therapy and we can mark the polysyllabics and not the monosyllabics.​—msh210 (talk) 01:12, 21 March 2012 (UTC)
Yup, that's exactly how part of me feels. How did you know? :-)   —RuakhTALK 02:00, 21 March 2012 (UTC)
 :-)  Okay. Because of your (as you put it) ambivalence and the other considerations above, I've decided to continue the way I noted at the start of this section: not marking stress in any unstressed word, whether poly- or monosyllabic. —msh210 (talk) 17:48, 23 April 2012 (UTC)
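The convention settled on here (hyphens for maqafs, stress marked only on the last word of each joined group) is mechanical enough to sketch in Python. The function names below are invented for illustration, and the sketch assumes that hyphens in a transliteration only ever stand for maqafs, which holds for the biblical quotations this section is about but not necessarily elsewhere.

```python
import unicodedata

ACUTE = "\u0301"  # combining acute accent


def strip_acute(word):
    """Remove acute accents from a word, leaving other diacritics alone."""
    decomposed = unicodedata.normalize("NFD", word)
    return unicodedata.normalize("NFC", decomposed.replace(ACUTE, ""))


def unstress_pre_maqaf(translit):
    """Keep stress only on the last word of each hyphen-joined group."""
    groups = []
    for group in translit.split(" "):
        parts = group.split("-")
        # Every word before the final hyphen loses its stress mark.
        parts = [strip_acute(p) for p in parts[:-1]] + parts[-1:]
        groups.append("-".join(parts))
    return " ".join(groups)
```

Applied to the older style, unstress_pre_maqaf("v'chól-ts'vaám") gives "v'chol-ts'vaám", matching the Genesis 2:1 example at the top of the section. Secondary (meteg) stress is dropped as well, in line with the remark that it is not marked anywhere else.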

word-final מ

ם is the word-final form of מ. If I wanted to create an entry which ended in that letter, would I need to explicitly use ם, or would software correctly display/interpret things if I just used the 'regular' מ? I don't have any entry in mind, I'm just curious... - -sche (discuss) 07:48, 31 March 2012 (UTC)

Sorry: just came across this. You'd need to explicitly use the final form. Interestingly, some abbreviations that end in mem use the final form, others the medial, and others are found both ways. (Or maybe all are found both ways, but with different frequencies.) E.g., בע״מ and בע״ם both should be bluelinks. That'd be impossible with autoconversion (unless it looked for the ״ symbol, I suppose).​—msh210 (talk) 16:52, 23 April 2012 (UTC)
I also didn't see this till now. My gut tells me that abbreviations that are pronounced as their own words, with the mém making a terminal /m/ sound (e.g. או״ם (U.M., U.N.), pronounced /um/), are written with ם, whereas written-only abbreviations where the mém is not at the end of the word when you read it out (e.g. ק״מ (K.M., km), pronounced /ki.loˈme.teʁ/) are written with מ, as are abbreviations that are read out as a series of letters (though offhand I can't think of any examples like that). Do you know how בע״מ and בע״ם are pronounced? —RuakhTALK 18:41, 23 April 2012 (UTC)
To your last question: Not I. The rest sounds plausible; I have no idea whether it's true.​—msh210 (talk) 21:14, 23 April 2012 (UTC)
And re "abbreviations that are read out as a series of letters (though offhand I can't think of any examples like that)", examples are numbers — which I've seen written with final and with medial mems.​—msh210 (talk) 21:31, 23 April 2012 (UTC)
The pronunciation of numbers can go either way; consider ט״ו בשבט (T.V. biSh'vát), pronounced /tu/, and מצודת כ״ח (m'tsudát K.Kh.), pronounced /ˈko.aχ/. (Confoundingly, I think the tendency to pronounce numbers as words is a bit old-fashioned, and conversely I think the use of mém at the end of a word is relatively new — what I described above is my impression of current usage, not historical usage.) —RuakhTALK 22:00, 23 April 2012 (UTC)
And sometimes they're pronounced as if spelled out as words (e.g., with numbers less than eleven, ב׳ pronounced as שְׁנֵי). But I was thinking of years, actually, which (in my experience, at least) are usually pronounced as a series of letters.​—msh210 (talk) 00:16, 26 April 2012 (UTC)
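To make the autoconversion point concrete, here is a minimal Python sketch of what such a conversion might look like, including the gershayim check msh210 mentions. Nothing like this exists in the software as far as this thread indicates; the function name is hypothetical and the input is assumed to be a single unpointed word.

```python
# Medial letter -> word-final form, for the five letters that have one.
FINAL_FORMS = {
    "\u05DB": "\u05DA",  # kaf
    "\u05DE": "\u05DD",  # mem
    "\u05E0": "\u05DF",  # nun
    "\u05E4": "\u05E3",  # pe
    "\u05E6": "\u05E5",  # tsadi
}
GERSHAYIM = "\u05F4"  # the abbreviation mark


def finalize_word(word):
    """Use the final form of the last letter, leaving abbreviations alone."""
    if not word or GERSHAYIM in word:
        return word  # abbreviations are attested with either form
    last = word[-1]
    return word[:-1] + FINAL_FORMS.get(last, last)
```

Because abbreviations are passed through untouched, both spellings of an abbreviation would still need their own entries, which is why the explicit final form has to be typed in the first place.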

Marking stress on romanizations of single-syllable words

According to the page:

The position of the stress should be indicated using an acute accent on the main vowel of the stressed (or only) syllable (á, é, í, ó, ú).

Is it really necessary to mark the stress on the only syllable of a single-syllable word? Personally, I think this makes short words (like ) harder to read. --WikiTiki89 (talk) 06:42, 25 July 2012 (UTC)

See #marking stress in monosyllabic words' transliterations above. —RuakhTALK 12:24, 25 July 2012 (UTC)
Ok, you've won me over with the argument of Nóakh and v'gám. --WikiTiki89 (talk) 12:49, 25 July 2012 (UTC)

Nikkud order

I have noticed that when a letter has both a dagesh/mappik or a shin/sin dot and a vowel, many places on the internet including many of the Wiktionary pages put the vowel first and then dagesh/mappik or shin/sin dot. This does not make logical sense since the effect of the dagesh/mappik or shin/sin dot is pronounced before the vowel and should be tied closer to the consonant. The shin/sin dot may even be considered as part of the consonant.

I think we should establish a rule about the ordering of these marks:

  1. the shin/sin dot should come first
  2. then the dagesh/mappik
  3. and then the vowel

--WikiTiki89 (talk) 10:14, 6 August 2012 (UTC)

Annoyingly, that's more or less impossible: the MediaWiki software normalizes all input using NFC, which puts them in the exact opposite order from what you describe. (Try it!) There are some hacks around this limitation, but before we discuss them, I have to ask — why does it matter what order they're in? You say that you've "noticed that [] many places on the internet" put the nikúd in the backwards order. How have you noticed? What are you seeing? Because the NFC order is stupid, but it shouldn't actually matter unless there's another problem that it's interacting with. —RuakhTALK 12:04, 6 August 2012 (UTC)
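The effect of NFC can be checked outside MediaWiki with Python's standard unicodedata module. This small sketch takes a shin with its marks in the proposed order (shin dot, dagesh, vowel) and normalizes it; the variable names are just for illustration.

```python
import unicodedata

# Shin + shin dot + dagesh + qamats: the proposed "logical" order.
proposed = "\u05E9\u05C1\u05BC\u05B8"

# NFC sorts combining marks by canonical combining class (ascending).
normalized = unicodedata.normalize("NFC", proposed)

for ch in normalized:
    print(f"U+{ord(ch):04X}", unicodedata.combining(ch), unicodedata.name(ch))
```

The vowel points have lower combining classes than the dagesh, which in turn is lower than the shin and sin dots, so the normalized string comes out as letter, vowel, dagesh, shin dot: the reverse of the proposed order.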
I noticed this by pasting text into a box and pressing backspace to see what disappears first. Now that you mention that MediaWiki normalizes it, I realize that everything I have added myself has also been normalized.
Here is why this matters:
  • It makes it much easier to change vowels when editing after copying and pasting.
  • I noticed that iOS displays the dagesh/mappik on the next letter if there is a vowel between it and the previous letter. On websites that use my proposed order, everything is displayed correctly.
Is there any reason MediaWiki normalizes it the wrong way? If not, can this be changed?
--WikiTiki89 (talk) 12:57, 6 August 2012 (UTC)
Re: Editing: I know exactly what you mean. Maybe we should address this using JavaScript at the edit-screen?
Re: iOS: Indeed; see Template talk:he-noun#RTL issue.
Re: Is there any reason: Yes; as I said, the wrong way is specified by NFC. I highly doubt that we can convince MediaWiki to depart from NFC and invent its own normal form (or dispense with normalization entirely).
RuakhTALK 15:06, 6 August 2012 (UTC)
Perhaps we should make your script (at template talk:he-noun) sitewide.​—msh210 (talk) 19:04, 6 August 2012 (UTC)
Does the script slow down page loading? --WikiTiki89 (talk) 08:18, 7 August 2012 (UTC)
I can't imagine it would, but if you give it a try, I'd be interested to hear. :-)   —RuakhTALK 13:39, 7 August 2012 (UTC)
Ok, how do I use it? I've never played around with JavaScript on Wiktionary. --WikiTiki89 (talk) 14:05, 7 August 2012 (UTC)
Just copy the contents of the last JavaScript-box (the one I described as "this updated script") to Special:MyPage/common.js. (Warning: be cautious in copying people's JavaScript to Special:MyPage/common.js. Malicious JavaScript can compromise your account.) —RuakhTALK 22:26, 7 August 2012 (UTC)
It's OK, I speak JavaScript. I would never use any without first reading it.
The script works. But on my iPhone it only works on the regular site (when I'm logged in of course) but not on the mobile site (I would assume this is because you can't be logged in on the mobile site).
I agree with msh210 that the script should be made sitewide, but would it apply to the mobile site also?
--WikiTiki89 (talk) 06:38, 8 August 2012 (UTC)
Re: "would it apply to the mobile site also?": JS added to the regular site does not apply to the mobile site, but we can add JS to the mobile site by using MediaWiki:Mobile.js. However, this exact script won't work on typical mobile devices, because it uses jQuery, which is (from what I understand) not available on most mobile devices. The good news is, the only thing it uses jQuery for is to find all elements belonging to class Hebr, and according to http://caniuse.com/#search=getelementsbyclassname, document.getElementsByClassName seems to be reliably available on mobile browsers. But I don't feel comfortable adding something to MediaWiki:Mobile.js without being able to test it. —RuakhTALK 11:55, 8 August 2012 (UTC)

WT:AYI

This is just to notify all you active Hebrew editors that there is now an 'About' page for Yiddish, and we are holding a discussion on Wiktionary's Yiddish policy at Wiktionary talk:About Yiddish. If any of you are interested, we welcome your input.
Thanks, --Μετάknowledgediscuss/deeds 04:58, 19 September 2012 (UTC)

Hanukkah lexicography

Firstly, I was hoping to get a good FWOTD for the first night of חנוכה in Hebrew or Yiddish, if you guys are up to citing something. (I note we lack khanukiyá, maybe that's our best bet for this.) Any ideas?

Secondly, this is rather immature, but I'm trying to make the most complete list possible of CFI-citeable alt-forms at Hanukkah. Strange romanizations are welcome. !תודה רבה —Μετάknowledgediscuss/deeds 07:46, 5 December 2012 (UTC)

"(Ch|Kh|H)anukk?ah?" (I hope you can read regular expressions) are all pretty common. Some other less common variations may include a double "n" or "c"s in place of "k"s. --WikiTiki89 08:16, 5 December 2012 (UTC)
Yeah, but if you search every term that regex suggests, you'll find some simply can't be cited, and double n is more common than you think. Hanuka is the shortest, Channukkah the longest. I think every possibility I've seen is covered by (Ch|Kh|H|Ḥ|K)ann?(u|oo)kk?ah? but I'm still looking. Hard to say if all the cites at google books:"Hanookah" are independent. —Μετάknowledgediscuss/deeds 08:53, 5 December 2012 (UTC)
Don't forget the "c" variations: google books:"hanucca", google books:"hanuccah", google books:"hanuca" are a few successful ones I tried. --WikiTiki89 09:52, 5 December 2012 (UTC)
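For anyone who wants to see what the broader regular expression above actually covers, here is a rough Python sketch. It only enumerates and tests spellings; it says nothing about which ones are citable. The `PARTS` list and the helper names are assumptions made for this illustration.

```python
import itertools
import re

# The pattern quoted above; \u1E24 is H with a dot below.
PATTERN = re.compile(r"(Ch|Kh|H|\u1E24|K)ann?(u|oo)kk?ah?")

# The same pattern written out as the alternatives at each position.
PARTS = [
    ["Ch", "Kh", "H", "\u1E24", "K"],
    ["a"],
    ["n", "nn"],
    ["u", "oo"],
    ["k", "kk"],
    ["a"],
    ["", "h"],
]


def candidate_spellings():
    """Every spelling the pattern can generate, as a list of strings."""
    return ["".join(combo) for combo in itertools.product(*PARTS)]


def covered(spelling):
    """True if the whole spelling matches the pattern (case-sensitive)."""
    return PATTERN.fullmatch(spelling) is not None


candidates = candidate_spellings()
print(len(candidates))  # 80

for word in ["Hanuka", "Channukkah", "Hanookah", "Hanucca", "Hanuccah"]:
    print(word, covered(word))
```

The "c" spellings come out as not covered, which is consistent with the point that the pattern would need a further alternative for them.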
I've started working on adding and improving a bunch of Hanukkah-related terms (User:Wikitiki89/TODO#Hanukkah), if you wanna help. --WikiTiki89 10:55, 5 December 2012 (UTC)

ch vs. kh

(from User talk:Metaknowledge#ch vs. kh) Hey guys! Wanna come to a consensus about which to use for transliteration of ח and כ? I think standardization is important enough to make a move from this awful indecision necessary. Unless people have spontaneously changed their minds, the votes in so far are:

  • Msh210 - ch
  • Ruakh - kh (well, there's a reason his handle isn't Ruach)
  • Metaknowledge - kh

Hopefully we can get this done with semi-painlessly. Note: Feel free to bring up arguments for your side, but we've probably heard them all before anyway. —Μετάknowledgediscuss/deeds 08:07, 23 December 2012 (UTC)

I prefer ch, because it seems to me more intuitive to German speakers, but this is not backed by any specific knowledge. Not voting. -- Stf (talk) 09:13, 23 December 2012 (UTC)
Well, that's an argument I've never heard before. Unfortunately, most of us around here are English speakers (for whom ch suggests /tʃ/) rather than German speakers. —Μετάknowledgediscuss/deeds 17:11, 23 December 2012 (UTC)
If one of the two were transliterated "ch" and the other were "kh", that would increase the number of distinct glyphs with correspondingly distinct transliterations. One could say that using two transliterations would (incorrectly?) imply a difference in pronunciation, but so does the use of two different letters in Hebrew. And FWIW, the Academy of the Hebrew language and the ISO use two different transliterations; they both transliterate "ח" using "h" with a diacritic under it, and the Academy transliterates "כ" "kh" while the ISO prescribes "k" with a diacritic under it. So perhaps "ח"="ch", "כ"="kh"? Just something to consider... - -sche (discuss) 17:31, 23 December 2012 (UTC)
I find ch to be more intuitive. My instinct is to read kh as a plosive or affricate, or even as a sequence of two sounds. —CodeCat 17:37, 23 December 2012 (UTC)
@-sche: That's not our goal. We don't care to distinguish ת and ט, because to an Israeli they're both /t/. Similarly, most texts don't distinguish פ and פּ, even though one is /f/ and the other /p/. We use accents for stress marks, even though these are not found even in normal dotted Hebrew. It's not meant to be bijective.
@CodeCat: Hmmm, is that because you've mostly spent your time with Latin script Germanic languages? I'm just hoping to avoid people saying /ˈtʃæn.u.kə/ for /χa.nuˈka/. —Μετάknowledgediscuss/deeds 17:49, 23 December 2012 (UTC)
A few things I would like to mention:
  • I have never seen kh in any transliterations of Hebrew in "real life", only online on websites such as this one. Using ch would be more familiar to most non-linguists who have ever been around Hebrew. (I would have even said the same for Yiddish if it weren't for the YIVO standard using kh.)
  • If you want to be truer to the original Hebrew, then a scholarly transcription (such as bārūḵ for בָּרוּךְ) would be best, which I would probably support since there is no reason to indicate modern pronunciation in the transcription since that is what the pronunciation section is supposedly for.
  • Some Israelis, even today, still do distinguish ח and כ and various other pairs that are normally not distinguished in modern Hebrew and I think it is not fair not to give them any weight. Not to mention that for liturgical purposes such distinctions are actually very common.
--WikiTiki89 18:13, 23 December 2012 (UTC)
Re: "Some Israelis, even today, still do distinguish ח and כ": And I'm one of them (especially in liturgy; in everyday speech I do tend to khafinate my khets). But I assure you, I have never expected any English-speaker to preserve or even notice that distinction, let alone an English-speaker who hasn't yet learned the Hebrew alphabet. —RuakhTALK 18:30, 23 December 2012 (UTC)
I prefer kh, firstly because ch clearly signals /tʃ/ (when my oldest sister lived in the U.S., she used to hate when people called her /tʃɛn/, though the almost-as-incorrect /ʃɛn/ didn't bother her as much) whereas kh has no such problem; secondly, because uninitiated Americans do approximate non-word-initial /χ/ as /k/ (though admittedly, they approximate word-initial ח as /h/); and thirdly, because it sort-of shows a relationship with כּ k. ("Sort-of" in that, firstly, ח doesn't share that relationship, and secondly, in that we don't use h that way elsewhere in our transliteration scheme. Old-timey transliterations sometimes used ph and dh and so on for the forms without dageshes, but we don't do that.) By the way, since my username is ruakh and not rúakh, it should not be considered to reflect my views on any proposed transliteration scheme. :-)   —RuakhTALK 18:30, 23 December 2012 (UTC)
Well, my username isn't Μετάknowledge, just my signature, but that's because I'm not guaranteed access to my Greek keyboard whenever I want to login to Wiktionary. It could be the same for you, I don't know. —Μετάknowledgediscuss/deeds 18:45, 23 December 2012 (UTC)
My reason for preferring ch is that English speakers are used to pronouncing/hearing it /χ/ in German loanwords (especially Bach) whereas kh is completely unfamiliar. (That ch may be misread as /tʃ/ is true, but so may kh be misread as /kʰ/, so it's no better.)​—msh210 (talk) 04:24, 24 December 2012 (UTC)
Oh, and ch is /x/ in Scottish Gaelic too.—msh210℠ on a public computer 15:20, 25 December 2012 (UTC)

root categories

(I recall mentioning this at some point, but no longer recall where, and, anyway, it was IIRC in passing rather than as a standalone proposal, so let me do it justice here.)

I propose hereby that we have categories for Hebrew (generally triliteral) roots. In each such category would be any word built on the relevant root (and any compound or phrase built in part on the root). Assuming (as I do) that we continue to have entries for roots, those would go in the category also (listed first, I suppose). A category could be called, perhaps, category:Hebrew root י־ר־ד for example. And as usual when etymology is uncertain and we categorize in the relevant category anyway, we can do the same here, categorizing in the root category even if inclusion is uncertain.

  • It would enable users of the dictionary to see, at a glance, related words.
  • It would be easy to maintain once implemented, at least for some entries: adding {{he-root}} or {{he-verb}} would categorize the entry.

I'd appreciate others' thoughts on this.​—msh210 (talk) 05:46, 15 February 2013 (UTC)

I'm pretty neutral on this, but if we do it, I have a few thoughts:
  • I wonder if rather than categorizing entries, it would be nicer to create and categorize redirects from the spellings with nikúd. So, for example, we'd create [[הוֹרִיד]] as a redirect to [[הוריד]], and put the former, rather than the latter, in the category for the root. This way the listing in the category is a bit more useful. A complication: I know that you prefer to list words under the spellings they use when they have nikúd, but I don't, and I know I'm not alone. So the categories would end up containing a mixture of redirects to useful entries and redirects to "defective spelling of ____" entries.
  • When I started typing this comment, you seemed to be suggesting that the category would include inflected forms, but now you've edited it and seemingly removed that impression, so maybe this thought is obsolete, but: I don't think we should include inflected forms. For one thing, the presence of inflected forms would mask the existence of certain lemmata that are identical with them (masculine singular present participles that are also nouns and/or adjectives; feminine singular indefinite adjectives that are also adverbs; etc.).
  • I'm not sure we should keep the entries if we create the categories. Granted, the entries have some advantages over categories, and vice versa, so I see how it could make sense to have both and get the best of both worlds; but given that almost all of the information would be duplicated (the word-list, the definition, and the fact that they're Hebrew roots), I think it would end up just being confusing. Even things that seem to belong to an entry, such as an etymology, can IMHO be put in the category description without too much awkwardness.
  • If we do keep both entries and categories, then I think Wiktionary:Votes/2011-04/Representative entries would apply. (Though we might actually want to take a different approach anyway, and use some other kind of sort-key — say, maybe noun and verb and so on — rather than having half the entries start with the root's first letter and the other half start with a very small number of otiót shimúsh.)
  • When multiple roots are spelled the same way, I believe that he.wikt gives each its own category, and I think we should do the same.
RuakhTALK 05:50, 15 February 2013 (UTC)
Thanks for your thoughts. I'll try to reply to them in the order you've presented them:
  • The idea of categorizing vowelized forms is a good one in theory but will be hard to maintain: people will create entries (or add senses) that will require such categorization and either (a) not categorize or (b) categorize the entry itself.
  • Just by the way: I like listing words under the spellings they use with vowels primarily for biblical, and only for old, words. Words of recent origin are rarely spelled without their matres lectionis and I don't advocate listing them that way as anything other than a soft redirect (and that only if so attested, of course). Biblical words, on the other hand, are written in Tanach with their defective spellings (usually), so I advocate listing them that way even if they're still used and now usually with matres.
  • I suppose the redirects could be to the forms with matres when those are more common (or, as a practical substitute, when those are main entries and the defective spellings are soft redirects). That would get rid of the problem you mention about redirecting to soft redirects, while introducing the oddity of redirecting to something not quite equivalent to the redirected-from form.
  • No, I agree: we should not include form-of senses.
  • Like you, I'm not sure we should include the entries if we have the categories, but, then, I was never sure we should have them in the first place (IIRC). And I agree the duplication would be unnecessary in the main and slightly confusing.
  • Yes, the "Representative entries" vote would apply: I'd forgotten about that. Of course, its intent was mainly topical categories, but it doesn't mention them explicitly.
  • Sorting under "noun" and "verb" wouldn't work, as it would AFAICT mean just sorting under "n" and "v", which would be opaque. I don't see a better way than sorting alphabetically at the moment, but perhaps we can find one.
  • When multiple roots are spelled identically, I think it would be sufficient to handle that with text in the category. Moreover, splitting the category would further exacerbate the problem I mention in the first bullet point of this reply: it would make it hard for people to categorize entries.
​—msh210 (talk) 06:53, 15 February 2013 (UTC)
Two more things:
  • Having thought about it some more, I really feel that the objection in the first bullet point of my post just above — that it will be difficult to maintain this categorization if it's applied to the vowelized forms — is a good one, and we shouldn't do it that way unless a solution is found to that objection.
  • When I say "we should not include form-of senses" I mean inflections. Alternative spellings and obsolete forms should be included, much as we do in, for example, category:English verbs. And if we're doing this with unvowelized spellings, then probably we should include both the 'excessive' and the 'defective' spellings.
​—msh210 (talk) 16:49, 17 February 2013 (UTC)
Re: sorting under 'n' and 'v': That was actually exactly what I was picturing. I figured a note in the description along the lines of, "entries are listed here under their part of speech: 'a' for adjectives and adverbs, 'n' for nouns, 'v' for verbs" would be more than sufficient. But this becomes impossible if we categorize entries rather than redirects, so I guess it's academic.
Re: everything else: O.K.
RuakhTALK 06:16, 21 February 2013 (UTC)
Happy Purim. Do you think the benefits of categorizing the vowelized forms beat the costs? (I don't, but am willing to give in if you do and no one else chimes in vociferously agreeing with me.)​—msh210 (talk) 05:29, 24 February 2013 (UTC)
I don't know. Before Scribunto, I always assumed that we would eventually get around to creating redirects from vocalized spellings anyway, and if we start with that assumption, then these root categories seem as good a prompt as any. But Scribunto means that we can augment our did-you-mean logic to include a removal-of-vowels check, so those redirects no longer seem so eventually-necessary. —RuakhTALK 07:03, 24 February 2013 (UTC)

Okay, so next steps, I guess:

  • Since {{he-root}} says "From the root" and gives no option for hiding that text, I think it's safe to assume every entry transcluding it belongs in the category. So add:
{{#if:{{{3|}}}
 |[[category:
     Hebrew root {{he-dotless-shin|{{{1}}}}}־{{he-dotless-shin|{{{2}}}}}־{{he-dotless-shin|{{{3}}}}}{{#if:{{{4|}}}|־{{he-dotless-shin|{{{4}}}}}}}
   ]]
 |{{#if:{{NAMESPACE}}|[[category:he-root missing middle vav or yod]]}}
 }}
  • Revise {{he-verb}} to take the root literals (as well as, and eventually instead of, {{{sort}}}) as parameters. Categorize (in Hebrew verbs, the binyan categories, and the various defective-root categories) based on these parameters rather than sort, and use the parameters also to categorize in the root category. Parameter names p, a, and l would be nice, but some roots have four (or more) letters, so say r1, r2, and so on.
    • However, there is one problem with this, and it may be insurmountable: Currently, the weak-root categories are human-populated: someone decides to put a verb in any such category. Under this proposal, they'd be automatically populated. This would require a good deal more logic than it seems at first glance. For example, גָּבָה (gavá) should be in a paal lamed-he category, whereas גָּבַהּ (gaváh) should be in a paal lamed-g'ronit category. According to this scheme they'd be in the same category unless we include information from other parameters in the logic. Also, שכנע should be in a piel lamed-g'ronit category even though its fourth, not third, letter is ayin. (That second example is probably easier to deal with. But there may be other examples that are not.) So we probably have to keep the existing root-literal parameters for purposes of weak-root categorization and use the new root-literal parameters only for sorting in the main/binyan categories and for categorizing under a root. But expecting editors to use two different types of root-literal parameters is I think expecting too much.
      So I'm hoping we can keep the current sort parameter and use Scribunto to extract the root from it and from the pagename, and to categorize with an error if it's not extractable. But I don't yet know enough Scribunto to know whether this is doable, or how.
  • I propose the root categories always have at least three letters: that is, arguably-two-letter ones be treated as triliteral. (This is already alluded to in the first bullet point of this note, above.)
  • Then check for entries in the error categories, move info from the root entries to the categories… I'm sure I'm missing some steps.
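To make the intended category naming concrete, here is a rough Python sketch of what the {{he-root}} snippet in the first step computes. It is an illustration only, not the template itself: the function names are hypothetical, and the error-category branch of the wikitext is reduced to returning `None`.

```python
from typing import Optional

SHIN_DOT = "\u05C1"  # point that marks shin
SIN_DOT = "\u05C2"   # point that marks sin
MAQAF = "\u05BE"     # Hebrew hyphen used between root letters


def dotless_shin(letter: str) -> str:
    """Mimic {{he-dotless-shin}}: drop a shin or sin dot, keep the rest."""
    return letter.replace(SHIN_DOT, "").replace(SIN_DOT, "")


def root_category(*letters: str) -> Optional[str]:
    """Build a name like 'Hebrew root X-Y-Z' (joined with maqaf).

    Returns None when fewer than three letters are given; the wikitext
    would put such a page in an error category instead. The wikitext
    handles at most four letters, while this sketch joins however many
    it is given.
    """
    if len(letters) < 3:
        return None
    return "Hebrew root " + MAQAF.join(dotless_shin(l) for l in letters)


# Example: the root yod-resh-dalet used as the sample category name above.
print(root_category("\u05D9", "\u05E8", "\u05D3"))
```

A root whose first letter carries a shin dot lands in the same category as its undotted spelling, which is the point of running each letter through the dot-stripping step.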

​—msh210 (talk) 17:13, 26 February 2013 (UTC)

Module:he-utilities

I was thinking to move the various Hebrew "utility" templates into a Scribunto module, say, Module:he-utilities. That would include these:

  • {{he-ot-sofit}} (which maps kaf, mem, nun, pei, and tsadik to their sofit forms, while leaving everything else intact; also has an option to add a sh'va to khaf sofit)
  • {{he-ot-lo-sofit}} (which maps sofit letters to their non-sofit forms, while leaving everything else intact)
  • {{he-dagesh-kal}} (which adds a dagesh to begedkefet letters, while leaving everything else intact)
  • {{he-dotless-shin}} (which removes a shin dot from shin or a sin dot from sin, while leaving everything else intact)
  • {{he-x}} and its children (which are unnamed and undocumented, but seem to perform simple transformations similar to the above)

The benefits include:

  • Our inflection-table templates are heavily dependent on these utilities, so if we want to be able to move any inflection-table templates into Scribunto modules, this is almost a prerequisite.
  • These templates form a single logical group; putting them in a single module would be clearer, I think, than having them all in their own templates.
  • These templates have been artificially constrained by the limitations of what wikitext can do. Once they're in Lua, we can probably make some of them smarter, and add more similar utilities along these lines. (For example, imagine the uses of a general utility-function to strip nikud from a string, rather than the very limited {{he-dotless-shin}}.)

Does anyone have any thoughts on the matter?

RuakhTALK 06:12, 21 February 2013 (UTC)
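As a rough picture of what such a utilities module might contain, here is a Python sketch of four of the transformations listed above. The real module would be written in Lua for Scribunto; the function names, and the exact set of code points treated as nikud, are assumptions made for this illustration. The sh'va option of {{he-ot-sofit}} is not covered.

```python
import re

# kaf, mem, nun, pe, tsadi and their final (sofit) forms.
_REGULAR = "\u05DB\u05DE\u05E0\u05E4\u05E6"
_SOFIT = "\u05DA\u05DD\u05DF\u05E3\u05E5"
_TO_SOFIT = str.maketrans(_REGULAR, _SOFIT)
_FROM_SOFIT = str.maketrans(_SOFIT, _REGULAR)

# bet, gimel, dalet, kaf, pe, tav.
BEGEDKEFET = frozenset("\u05D1\u05D2\u05D3\u05DB\u05E4\u05EA")
DAGESH = "\u05BC"

# Assumed nikud set: points U+05B0-U+05BD, rafe, shin/sin dots, kamats katan.
_NIKUD = re.compile("[\u05B0-\u05BD\u05BF\u05C1\u05C2\u05C7]")


def ot_sofit(letter: str) -> str:
    """Map a letter to its sofit form; leave anything else intact."""
    return letter.translate(_TO_SOFIT)


def ot_lo_sofit(letter: str) -> str:
    """Map a sofit letter to its regular form; leave anything else intact."""
    return letter.translate(_FROM_SOFIT)


def dagesh_kal(letter: str) -> str:
    """Add a dagesh to a begedkefet letter; leave anything else intact."""
    return letter + DAGESH if letter in BEGEDKEFET else letter


def strip_nikud(text: str) -> str:
    """Remove vowel points and related marks from a whole string."""
    return _NIKUD.sub("", text)
```

The last function is the kind of general helper mentioned in the third benefit: it works on whole strings rather than on one letter at a time.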

Not I. And I can't intelligently support such a change, not yet being familiar with Scribunto. But if you think it's wise then I'll take your word for it and lend my (unintelligent) support.—msh210 (talk) 16:48, 21 February 2013 (UTC)

template:he-future of for cohortative and jussive?

(For background, see Gesenius on the cohortative and jussive: §48 especially, and §§108 and 109 if desired.)

I'm thinking it would be a good idea to edit {{he-future of}} as follows: change future tense to {{{mood|future tense}}}. (And update {{he-Future of}} accordingly, of course.) This would be documented with intended use for, and used for, mood=jussive or mood=cohortative (as appropriate per Gesenius).

I'd simply effect this except that:

  1. I don't know whether Gesenius is the accepted authority on the names of these moods in Hebrew. (Maybe they're usually called something else, or one of them is?)
  2. I'm not sure this is the best way to go about templatifying the definition lines for these words. Maybe adding more complicated code to allow for mood=j makes more sense? Maybe creating {{he-jussive of}} makes more sense? Maybe the latter, but as a UI of {{he-future of}} (much as {{he-Future of}} is)? Something else?

I'd appreciate others' input.​—msh210 (talk) 18:13, 18 April 2013 (UTC)

Hm, actually, for consistency in {{he-Future of}}, we'd want mood=| to display future tense. That means that {{he-future of}} would have to have {{#if:{{{mood|}}}|{{{mood}}}|future tense}} — and once we have that, I'm not sure how much more expensive {{#switch:{{{mood|}}}|=future tense|j=jussive|c=cohortative|future tense}} is.​—msh210 (talk) 18:21, 18 April 2013 (UTC)

Wow: I just realized we already have {{he-jussive of}}. (In fact, I created it — seemingly on the basis of past discussion — and I've used it.) I've now created {{he-cohortative of}} also.​—msh210 (talk) 14:51, 22 April 2013 (UTC)


Oh, drat. Now I've realized that there's not just a cohortative future but also a cohortative imperative (שְׁבָה (sh'vá, sit/stay)). That means {{he-cohortative of}} shouldn't exist and we add {{#switch:{{{mood|}}}|=future tense|j=jussive|c=cohortative|future tense}} or something like it to {{he-future of}} and {{#switch:{{{mood|}}}|=imperative|c=cohortative|imperative}} or something like it to {{he-imperative of}}. I guess I'll effect that, but invite comments first.​—msh210 (talk) 21:59, 23 April 2013 (UTC)
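The behaviour of the two #switch expressions proposed above can be sketched in Python as follows. This is only a model of the lookup, not template code; the dictionary and function names are made up for the illustration. As in the wikitext, an empty or unrecognized mood falls back to the plain label.

```python
# Mood codes and the text each one displays, per the #switch proposals.
FUTURE_MOODS = {"": "future tense", "j": "jussive", "c": "cohortative"}
IMPERATIVE_MOODS = {"": "imperative", "c": "cohortative"}


def future_label(mood: str = "") -> str:
    """Text shown by {{he-future of}} for a given mood parameter."""
    return FUTURE_MOODS.get(mood, "future tense")


def imperative_label(mood: str = "") -> str:
    """Text shown by {{he-imperative of}} for a given mood parameter."""
    return IMPERATIVE_MOODS.get(mood, "imperative")


print(future_label())         # future tense
print(future_label("j"))      # jussive
print(imperative_label("c"))  # cohortative
```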

Hm, Gesenius rather specifically does not call this a cohortative imperative, calling it instead only an imperative with a ה.​—msh210 (talk) 16:48, 24 April 2013 (UTC)

Hebrew terms and translations, which need attention

FYI, Category:Hebrew terms lacking transliteration needs some input, as does the new Category:Hebrew translations lacking transliteration. --Anatoli (обсудить/вклад) 01:17, 21 May 2013 (UTC)

A few things to think about

There are a few things I've noticed / thought about, just thought I'd bring them up.

  • I see that נכתב is listed as a derived form of כתב. Are we gonna have passive verbs be derived forms of active verbs? How about instead, on the root page כ-ת-ב, we have all the forms derived from the root (the verbs כתב, נכתב, הכתיב, הוכתב, התכתב as well as nouns like מכתב and כתיב). Or would you prefer to only have on that page כתב, הכתיב and התכתב, and have נכתב as derived from כתב and הוכתב as derived from הכתיב? Whether נפעל is truly the "passive" of פעל or is a "real" verb in its own right is up for debate. I mean, it DOES have its own infinitive and imperative forms.
  • Speaking of roots, words like בכה and רצה, would you say that the 3rd letter of the root is ה or י? On the one hand, it naively looks like a ה, and many textbooks write it as ה. But on the other hand, there is lots of evidence that it's actually י. If you look at the corresponding roots in Arabic or Proto-Semitic, they show that 3rd letter as a י. Forming the "pe'ila" template, "1e2i3a", from כתב you get "ketiva", from "רצה" you'd get retsihah... that's wrong... but from "רצי" you'd get retsiya, "רצייה", which is correct. Similarly words like בכי show the י there. Think of the minimal pair גבה, לגבות (root גבי) meaning to collect, and גבהּ, לגבוהּ (root גבה with a true final ה) meaning to get taller.
  • Speaking of minimal pairs, take a look at this. The root ע-ו-ר put into pi'el has *two* possibilities: עוֹרר, לעוֹרר (to arouse), and עיוְרר, לעוְרר (to blind). What's the difference? Well the first is a hollow root, the ו acts special as a ו weakening the root. The second though is a strong root, the ו just acts as a normal consonant. I propose that on the root page, you list the root as either strong or weak, and list out its weaknesses (e.g. פ"א, ע"ו, ל"י etc). This way, there are actually *two* עור roots. Both of them on the same page, because they're spelt the same, but they are different roots. One is hollow and the other is strong. Similarly, you've got all sorts of weird cases, like the י turns into a ו in הפעיל, but it might not in התפעל. e.g. ישב, הושיב, התיישב but יסף, הוסיף, התווסף.
  • On the conjugation table, it's missing out the שם פעולה, the "verbal noun" or "gerund" form. Although this is a derived noun from the verb (just like the participle is), it's part of the conjugation, each binyan (except the passive ones) has a fixed standard way of making its שם פעולה:
פעל -> פעילה
נפעל -> היפעלות
פיעל -> פיעול
etc.
And there are some special cases where the שם פעולה is in an unexpected form. For instance, רקד is pa'al, but its שם פעולה is ריקוד which is the usual pi'el form. And one says עברה not עבירה.
  • Should פעול be considered a special "passive participle" form of פעל?

So yeah, just some things that came to mind. AndreRD (talk) 12:42, 28 July 2013 (UTC)
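The "1e2i3a" argument above can be tried out mechanically. The following Python sketch fills a pattern with the letters of a root in transliteration; the function name and the choice to work on transliterations rather than Hebrew script are assumptions made for this illustration.

```python
import re


def apply_pattern(pattern, root):
    """Replace each digit n in the pattern with the n-th root letter.

    `root` is a sequence of transliterated radicals, e.g. ("k", "t", "v").
    """
    def radical(match):
        return root[int(match.group()) - 1]

    return re.sub(r"[1-9]", radical, pattern)


# The pe'ila pattern applied to k-t-v and to r-ts-y.
print(apply_pattern("1e2i3a", ("k", "t", "v")))   # ketiva
print(apply_pattern("1e2i3a", ("r", "ts", "y")))  # retsiya
```

Treating the third radical as y gives the attested form directly, which is the observation the second bullet relies on.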

Hi AndreRD,
Thanks for bringing these up. For the most part, what you suggest is actually what we have already decided. :-)
  • Re: the passive as a "derived form" of the active, vs. appearing directly at the root entry: I don't see these as mutually exclusive, and have been doing both. (But to be honest, I dislike your recent changes to [[כ־ת־ב]] — made-up headers, an ad hoc table, unattested and ungrammatical forms . . .)
  • Re: ל״ה roots: I give the root with ה (per tradition), but transliterate it as a gap: e.g. “From the root ר־צ־ה (r-ts-)” (per our general policy of not transliterating ה when it is just an ém k'ri'á). And I agree with tradition on this point: the ה becomes a י in some forms (רצוי) and a ת in some forms (רצתה), but in most forms it either remains ה or disappears completely. Furthermore, the י-insertion has generalized to some forms that don't have it etymologically, such as סנוי (which is ל״א).
  • Re: distinct roots spelled the same way: Yes, in such cases there will be two root entries on the same page. Per general Wiktionary convention, they will have L4 headers (====Root==== instead of ===Root===), because they will be nested under separate, numbered etymology sections (so the overall format is ==Hebrew== ===Etymology 1=== ====Root==== =====Forms===== ===Etymology 2=== ====Root==== =====Forms=====); see Wiktionary:Entry layout explained.
  • Re: verbal noun in conjugation tables: The conjugation template, {{he-verb-conjugation}}, supports it, but you have to specify it, which most places don't. This is a problem that requires fixing.
  • Re: passive participle: Yes, absolutely, but we treat it as a separate lemma: we list it in the conjugation table, but also under ====Derived forms====, and at the entry for the participle, we list it as a regular ===Adjective===, with its passive-participle-ness being mentioned under ===Etymology===.
RuakhTALK 01:45, 29 July 2013 (UTC)
Hi Ruakh, sorry for causing a mess :/. I'd really like to help you guys, but I'm not familiar with your whole system with all your templates and which conventions you've decided (after all, there are several ways to analyze the way things in Hebrew work). Like, should I make an article for kotev as distinct to katav? As a noun? An adjective? A participle? Anyway, how about you tell me what sort of thing you'd like me to do, and I do it, following your instructions. I'm a near fluent speaker of Hebrew, and I have lots of knowledge about linguistics too. How may I help? :)
(P.S. in {{he-verb-conjugation}}, should the label "present" be replaced with "present / participle" to show that it serves as both?)
AndreRD (talk) 04:20, 29 July 2013 (UTC)
Re: causing a mess: You edited one entry, I'd hardly call that a "mess". You certainly don't need to apologize. :-)
Re: templates, conventions: You might want to skim through the archives of this talk-page.
Re: kotév: We never have "participle" entries for Hebrew; we list participles as verb forms, under a ===Verb=== header. It should be listed as a noun and/or adjective only if it has life as a noun or adjective beyond its use as a verb form. (For example, we don't have separate noun and adjective entries for every English -ing form, because that information is implied by the verb-form entries.)
Re: "present" in {{he-verb-conjugation}}: I think "present" is fine. It's just a label, so the structure of the table is readily visible and users can quickly find what they're looking for; it's not intended to be a thorough explanation of the range of uses of the form. Our verb-form definition templates ("form-of templates", in en.wikt parlance), {{he-Present of}} and {{he-Past of}} and {{he-Future of}}, elaborate a bit further: they have "present participle and present tense" and "past tense (suffix conjugation)" and "future tense (prefix conjugation)", respectively. Even these are not maximally informative; for example, they don't mention that the past tense is often described as the "perfect aspect" instead. A full explanation of the forms would go in an appendix, which the form-of templates and conjugation templates would link to.
RuakhTALK 05:44, 29 July 2013 (UTC)
It's just that because participles are noun-like, there are extra noun inflections they can have, like construct state. כותביי (kotvéi) is the construct state of כותבים, which is itself a conjugation of כתב. Should this form be listed under the conjugation of כתב, or what? Also, I see the passive participle is just listed in the template as a single "non-finite form", rather than being conjugated for gender and number there. I guess that's fair enough if you consider it a lemma of its own. But if you don't consider the "participles" as their own distinct nouns/adjectives (since adjectives can act as nouns too, and also have construct states), I think you should put the construct forms *somewhere*.
Also, I'm assuming that the verbal nouns get their own page... כתיבה is a distinct enough word from כתב, they're usually separate in the dictionary. It's just difficult to work out whether a participle deserves its own mention as a noun or an adjective or whether those are just uses of the verb.......
And if my table is bad, I do think there should be a standard template for the derived terms of a root by binyan and corresponding verbal nouns.
And yeah, ok, I'll read through all the past discussions and try to contribute more productively ^_^
(Oh, and as for my "unattested" forms on the כ־ת־ב page, my dictionary has כיתב as to address something to someone, or (biblical) to engrave, and has מכותב as an addressee). AndreRD (talk) 07:52, 29 July 2013 (UTC)
Re: construct forms of present participles: Yeah, that's an open question. So far the only entry that I know of is that for זבת. It's not very smooth. I suggest just ignoring it for the moment, and coming back to it once you have a better grounding in Wiktionary basics. :-)
Re: construct forms of adjectives: We have that covered; see {{he-Form of adj}}, which supports all three adjective states.
Re: verbal nouns: Yes, definitely their own ===Noun=== entries.
Re: the table: I think I just disagree in principle. :-/
Re: reading through past discussions: I mean, there are a lot of them. Don't feel like you have to read through every word, and certainly don't feel like you shouldn't contribute until you've read them!
Re: "unattested": O.K., but מְכַּתֵּב (m'katév) (with 'k')? כותב \ כֻּתַּב (kutáv) (in past tense)?
BTW, it's worth noting, as a special case, that Modern pu'ál and huf'ál participles are really always adjectives. In many cases the verb is otherwise unattested; for example, m'tugán "fried" is perhaps better viewed as a passive participle of tigén "to fry" than as a present participle of ?tugán "to be fried".
RuakhTALK 14:57, 29 July 2013 (UTC)

template for derived terms of roots

How's this for a template for the derived terms from roots?
http://en.wiktionary.org/w/index.php?title=%D7%9B%D6%BE%D7%AA%D6%BE%D7%91&oldid=21472593
I don't know how you guys do your templates and stuff, but here's a nice table I made which shows all the verbs derived from a root by binyan, as well as the corresponding participles and verbal nouns which should have a page to themselves as they can act as nouns (and adjectives in the case of participles) in their own right. AndreRD (talk) 16:56, 28 July 2013 (UTC)

I'm sorry to say that I actually really dislike that. :-/   —RuakhTALK 01:46, 29 July 2013 (UTC)
In any case it should autocollapse. —Μετάknowledgediscuss/deeds 04:09, 29 July 2013 (UTC)

links to specific entries on a page

Right now, the links are of the form {term|lang=he|כתב|כָּתַב|tr=katáv}. This is a hyperlink that looks like כָּתַב but just takes you to the Hebrew section on the page כתב, not to the individual section for that particular word. I think it should link directly to the entry for that word, since there are several words listed, but I don't know how to do it with the template; things like {term|lang=he|כתב#Verb_2|כָּתַב|tr=katáv} don't work.
AndreRD (talk) 12:36, 29 July 2013 (UTC)

Hebrew Terms Database

I stumbled over the possibility of deep links into the Hebrew Terms Database of the Academy of the Hebrew Language, and I was wondering if we should make a template and add links to this database in the reference sections of articles (see למידה#References for an example). Due to some technical difficulties (the deep link has to provide URL-encoded characters encoded in Windows-1255/ISO-8859-8) I could not write the template myself, and before I ask some experts, I want to hear your opinion: Is it worth digging deeper? Is the Hebrew Terms Database a valuable resource? -- Stf (talk) 22:15, 27 October 2013 (UTC)

I have no strong opinion on whether we should link to them, but I do feel that we should be able to link to them, so I've created {{R:Hebrew Terms Database}} and edited למידה#References to use it. —RuakhTALK 02:25, 28 October 2013 (UTC)
Thank you very much. --Stf (talk) 05:42, 28 October 2013 (UTC)
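
For anyone wondering what the encoding difficulty mentioned above looks like in practice, here is a minimal sketch in Python (not the code behind {{R:Hebrew Terms Database}}, which isn't shown in this thread). The function name `encode_hebrew_query` is made up for illustration, and the database's actual URL and parameter names are deliberately left out because they aren't given here. It only shows percent-encoding a Hebrew term with Windows-1255 bytes instead of the usual UTF-8.

```python
from urllib.parse import quote


def encode_hebrew_query(term: str, encoding: str = "cp1255") -> str:
    """Percent-encode a Hebrew term using a legacy single-byte encoding.

    "cp1255" is Python's codec name for Windows-1255; "iso8859_8" could be
    passed instead. Characters the codec cannot represent raise
    UnicodeEncodeError (the default strict error handling).
    """
    # safe="" makes sure every reserved character is escaped as well.
    return quote(term, safe="", encoding=encoding)


if __name__ == "__main__":
    # Each Hebrew letter becomes one byte, hence one %XX escape.
    print(encode_hebrew_query("למידה"))
```

In UTF-8 every Hebrew letter would take two bytes (two %XX escapes), which is presumably why an ordinary URL-encoding function does not produce a link the site accepts.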

Displaying nikud

How can the diacritics be shown separately with a large font? E.g. ִ doesn't seem to work. --Anatoli (обсудить/вклад) 08:16, 18 March 2014 (UTC)

You have to either use a font that allows it, use an image instead of text, or put them on a letter like this: אִ. Do you need this for MediaWiki:Edittools or for the About Hebrew page? --WikiTiki89 12:43, 18 March 2014 (UTC)
Thanks. I have improved the display a bit at WT:HE TR. Please check if I made any mistakes, as I struggled a bit with the mixed text and I can't read Hebrew yet. --Anatoli (обсудить/вклад) 14:08, 18 March 2014 (UTC)

קדוש

I am not sure if I got the adjective and noun forms right. The entry he:קדוש is confusing, because it lists construct forms beside the adjective forms (maybe; I'm not sure about this). Besides, on milog.co.il I found the forms קְדוֹשִׁית/קְדוֹשִׁיּוֹת, which maybe are feminine nouns. Can someone please check קדוש? Thank you, -- Stf (talk) 07:20, 28 April 2014 (UTC)

@Stf: They look good to me. Perhaps קְדוֹשִׁית can be listed in a ===See also=== section. --WikiTiki89 07:23, 28 April 2014 (UTC)

feminine-looking masculine (singular) nouns

I'd like a [[category:Hebrew masculine nouns with feminine endings]] for בית,‎ זית,‎ לילה,‎ מות, and any others that fit in that category, just to serve as a useful reference for language learners. Any ideas on what to call it?​—msh210 (talk) 05:49, 13 May 2014 (UTC)

Perhaps Category:Hebrew masculine nouns ending in ־ה and Category:Hebrew masculine nouns ending in ־ת‏‎? By the way, note that your examples do not all have equally feminine endings, since לילה also takes a feminine-style plural whereas the others do not. —RuakhTALK 04:47, 14 May 2014 (UTC)
Interesting. Should Arabic feminine-looking masculine nouns also have Category:Arabic masculine nouns ending in ة, like خليفة (ḵalīfa, caliph)? --Anatoli (обсудить/вклад) 06:05, 14 May 2014 (UTC)
(1) Note that Category:Hebrew masculine nouns ending in ־ה would also include e.g. קונה ("purchaser masc."). We'd need Category:Hebrew masculine nouns ending in ־ָה. Is that too illegible, though? (2) There are few enough masculine nouns that it seems a shame to split the category into two. (3) Re "equally feminine endings", note that many, many masculine nouns have feminine-looking plurals. When I proposed this category, I intended it for singular nouns only; do you think otherwise?​—msh210 (talk) 05:59, 15 May 2014 (UTC)
I guess I'm not totally clear on your criteria. I mean, what makes záyit seem feminine? Are there any feminine nouns of the form Xáyit?   Re: #3: I agree that merely having a plural in -ót should not count for this category (though we probably should have a category for words like shulkhanót and m'komót); but to me a feminine-seeming singular ending seems "more feminine" if the plural ending matches. Does it not seem that way to you? —RuakhTALK 06:55, 19 May 2014 (UTC)
זית seems feminine to me because it ends in a ת. The rule always taught in my experience (and my criterion for the category) is that singular nouns that end with ת or with kamatz-ה are feminine (with very few exceptions). Re [לילה] "seems 'more feminine'" [than בית] — not to me. That may be just me, though.​—msh210 (talk) 06:30, 20 May 2014 (UTC)
Ah, O.K.; if it's the rule that's commonly taught, then I agree that it makes sense to use it as the baseline for a list of exceptions. How about Category:Hebrew masculine nouns ending in tav and Category:Hebrew masculine nouns ending in kamats-hei? —RuakhTALK 07:16, 20 May 2014 (UTC)
If I may butt in: it would be harder to type, but I think "Category:Hebrew masculine nouns ending in ת" (with or without a hyphen) would fit better with the way other categories are named than "...ending in tav" would. There is e.g. a "Category:Japanese words suffixed with 中" not *"...with chu", and a "Category:English terms spelled with μ" not "...with mu". - -sche (discuss) 17:27, 20 May 2014 (UTC)
Of course you may, and I agree. You'd have to ask Ruakh, but I suspect the reason he proposed it in transliteration is that Category:Hebrew masculine nouns ending in ־ָה is ugly or hard to read.​—msh210 (talk) 21:25, 27 May 2014 (UTC)

Names for Nikud

For the sake of clarity we should unify the names of nikud, at least for the lemma form and for the usage on project pages. I've collected some occurrences in the table below. I have a slight preference for names based on the transliteration rules, but maybe there are more sophisticated reasons for naming. What do you think about it?

Hebrew | נִקּוּד | שְׁוָא | חֲטַף סֶגּוֹל | חֲטַף פַּתָּח | חֲטַף קָמָץ | חִירִיק | צֵירֶה | סֶגוֹל | פַּתָּח | קָמָץ | חוֹלָם | קֻבּוּץ | דָּגֵשׁ | רָפֶה | ? | ?
ktiv male | ניקוד | שווא | חטף סגול | חטף פתח | חטף קמץ | חיריק | צירי | סגול | (פתח) | קמץ | חולם | קובוץ/שורוק | דגש | רפה | - | -
transliteration | nikúd | sh'vá | khatáf segól | khatáf patákh | khatáf kamáts | khirík | tséire | segól | patákh | kamáts | kholám | kubúts | dagésh | rafé | - | -
Unicode | - | sheva | hataf segol | hataf patah | hataf qamats | hiriq | tsere | segol | patah | qamats | holam | qubuts | dagesh | rafe | shin dot | sin dot
WT:AHE | - | shva | chataf segol | chataf patach | chataf kamats | chirik (or chiriq) | tseiri (or tsere) | segol | patach | kamats (or qamats) | cholam | kubuts (or qubuts) | dagesh | - | shin dot | sin dot
en:wt | nikud | schwa | - | - | - | - | - | segol | - | - | - | kubutz | dagesh | - | - | -
en:wp | Niqqud | Shva | Hataf Segol | Hataf Patah / Patach | Hataf Qamatz / Kamatz | Hiriq | Tzere | Segol | Patach | Kamatz / Qamatz | Holam | Kubutz | Dagesh | Rafe | - | -
Stf’s vote | nikud | shva | khataf segol | khataf patakh | khataf kamats | khirik | - | segol | patakh | kamats | kholam | kubuts | dagesh | rafe | - | -
msh210's | the transliteration used throughout enwikt (all columns)

I don't know the Hebrew words for sin dot and shin dot and the pronunciation of צירי. Feel free to complete the table accordingly. -- Stf (talk) 10:47, 18 May 2014 (UTC)

In my experience (reading older sources), the shin is called something like a "righted ש" and the sin something like a "lefted ש" (though I forget the exact wording), and there's no name for the dot. But my experience obviously is incomplete; in particular, it doesn't include newer grammars.​—msh210 (talk) 06:45, 20 May 2014 (UTC)

Using Lua to strip vowels in Hebrew templates

Template:l, which constructs a link, now uses Lua to strip vowels: thus, {{l|he|מִן}} yields מִן. Now, Hebrew uses [foo]wv parameters in our templates: e.g., template:he-present of uses 1= for the lemma and wv= for the lemma with vowels; template:he-adj uses mp= for the masculine plural form and mpwv= for the masculine plural form with vowels. It seems likely to me (not that I've looked at the code (or know Lua)) that whatever code strips the vowels for template:l can do the same for the latter templates, so that users need write only the with-vowels forms. Would that be a good idea? Can someone do it?​—msh210 (talk) 18:58, 29 July 2014 (UTC)

I pretty much already did this with Module:he-headword, but never actually rolled it out to be used by the templates. I guess if you double check User:Wikitiki89/בית, User:Wikitiki89/מצוין, and User:Wikitiki89/חם, then we can move User:Wikitiki89/template:he-noun and User:Wikitiki89/template:he-adj to Template:he-noun and Template:he-adj. --WikiTiki89 19:04, 29 July 2014 (UTC)
I looked at those entries and they seem okay. I can't read the Lua, and haven't thought of odd test cases possibly not covered. Pinging Ran, who authored most of the interesting Hebrew-specific template functionality and who (I think) knows Lua.​—msh210 (talk) 04:05, 30 July 2014 (UTC)
I've also just abstracted away some of the functionality into Module:he-common and created {{l/he}} and {{m/he}}, which support all of this (see the test page). --WikiTiki89 21:33, 29 July 2014 (UTC)
The eighth item on that test page displays unexpectedly/wrong, I think. That is, {{l/he||סיפר|dwv=סִפֵּר}}. Likewise the twelfth. In any event, ({{l/he||...}}) looks much like {{he-wv|...}}. I don't know whether there's any benefit to doing so, but it may be worth rewriting the latter to either use the former or be like it.​—msh210 (talk) 04:05, 30 July 2014 (UTC)
Well basically they are supposed to be exact clones of {{l}} and {{m}}, but also support wv= and dwv=. The ones you mention that you find display unexpectedly/wrong, are because I decided that the second parameter should always override the display, but I am willing to be convinced otherwise. Also, {{he-wv}} does not seem to support links (and I hadn't even heard of it before); this was actually supposed to make {{he-onym}} obsolete, but I guess it would make {{he-wv}} obsolete as well. --WikiTiki89 12:01, 30 July 2014 (UTC)
The use of {{he-wv}} is à la {{sense}}. Specifically, it's at the start of definition lines primarily but also pronunciation lines and maybe elsewhere to signify which vowelization of a word is being defined (or pronounced vel sim.). See e.g. the first L3 section of ישב#Hebrew, where the template is used for pronunciations and definitions. I suppose it can be done by using hand-coded parentheses and {{m/he}}, but, on the other hand, {{sense}} itself can likewise be done with hand-coding. {{he-wv}} is convenient, and I recommend not deleting it.​—msh210 (talk) 17:33, 30 July 2014 (UTC)
Oh, and note that, because it's intended for that use, {{he-wv}} also boldfaces its first parameter. (Well, enlarges, like all Hebrew boldfacing around here.)​—msh210 (talk) 17:36, 30 July 2014 (UTC)
Well in that case, I'm not suggesting deleting it (although perhaps it should be called {{he-sense}}). --WikiTiki89 17:43, 30 July 2014 (UTC)
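
To make the vowel-stripping idea in this thread concrete, here is a rough sketch in Python. It is not the code in Module:he-common or Module:he-headword (those are Lua and aren't quoted here); the name `strip_nikud` and the exact set of code points removed are assumptions for illustration only.

```python
import re

# Assumed set of marks to remove (an illustration, not the module's list):
#   U+05B0-U+05BD  sheva through meteg (the vowel points and dagesh)
#   U+05BF         rafe
#   U+05C1, U+05C2 shin dot and sin dot
#   U+05C7         qamats qatan
# Maqaf (U+05BE) and geresh (U+05F3) are left in place on purpose.
_NIKUD = re.compile("[\u05B0-\u05BD\u05BF\u05C1\u05C2\u05C7]")


def strip_nikud(text: str) -> str:
    """Return text with Hebrew vowel points and related marks removed."""
    return _NIKUD.sub("", text)


if __name__ == "__main__":
    print(strip_nikud("מִן"))    # the {{l|he|מִן}} example from above
    print(strip_nikud("סִפֵּר"))  # gives ספר, not the page title סיפר
```

The second call shows why a separate parameter such as dwv= still seems to be needed: stripping the points from a defective spelling does not produce the ktiv male page title, so the plain form can be derived automatically only when the two spellings share the same letters.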

Transliteration of צ׳ and other

How should צ׳ be transliterated ("tsh"?) as in צ׳יינהטאון in Chinatown and other letters with ׳? --Anatoli T. (обсудить/вклад) 02:31, 5 November 2014 (UTC)

We've generally been using "ch" (see צ׳יק צ׳ק), but that could be misinterpreted as "kh". The other alternative would be "tch". I don't think "tsh" is a good choice (it only makes sense for Yiddish because it transliterates the separate components of "טש"). The other letters with geresh are more straightforward: ג׳ (j), ז׳ (zh), ת׳ (th), etc. I really wish we used a system more similar to the one we use for Arabic, then these would be easier: č, ž, etc. --WikiTiki89 03:46, 5 November 2014 (UTC)
Thanks. Would you like to add these geresh examples to the page, as you see fit? I personally dislike "ch" for "kh", but whatever is acceptable is fine; if "ch" is used, then there should be an IPA or a comment, such as "ch" as in "chalk" or something. --Anatoli T. (обсудить/вклад) 04:30, 5 November 2014 (UTC)
I've taken the liberty of adding some geresh letters. Feel free to fix/add. I've got a couple of questions: why does צ׳יינהטאון use two yods? I've transliterated it as "chaynataun" and changed to the standard geresh symbol. Is "chaynataun" correct? --Anatoli T. (обсудить/вклад) 05:15, 5 November 2014 (UTC)
I moved them into the table and removed ת׳, since it is only used in transliterations and is usually not even pronounced as "th". --WikiTiki89 12:01, 5 November 2014 (UTC)
Thanks. --Anatoli T. (обсудить/вклад) 12:13, 5 November 2014 (UTC)
Also, in User:Conrad.Irwin/editor.js ' (apostrophe) should probably be replaced with ׳ but I don't know the codes for ' and ׳. --Anatoli T. (обсудить/вклад) 05:19, 5 November 2014 (UTC)
I'm not really sure what part of editor.js you're referring to. --WikiTiki89 12:01, 5 November 2014 (UTC)
I meant, e.g. Komi-Permyak koi: {from: "ÖöIi", to: "ӦӧІі"} (it replaces Latin lookalikes with Cyrillic ones) but instead of symbols ' and ׳ there should be codes. I forgot how to look them up. --Anatoli T. (обсудить/вклад) 12:13, 5 November 2014 (UTC)
They're just Unicode codes. But you can put the characters in literally anyway: diff. --WikiTiki89 12:29, 5 November 2014 (UTC)
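
For reference, the two code points asked about above can be looked up as in the following Python sketch. This is only an illustration and not part of editor.js (whose replacement entries are JavaScript objects like the {from: ..., to: ...} one quoted above); the helper names `codepoint` and `apostrophe_to_geresh`, and the rule of replacing only an apostrophe that directly follows a Hebrew letter, are assumptions made up for the example.

```python
import re

APOSTROPHE = "\u0027"  # ' APOSTROPHE
GERESH = "\u05F3"      # ׳ HEBREW PUNCTUATION GERESH


def codepoint(ch: str) -> str:
    """Format a single character as a U+XXXX code point label."""
    return f"U+{ord(ch):04X}"


# Assumed rule: match an apostrophe only when the preceding character is a
# Hebrew letter (U+05D0 alef through U+05EA tav).
_AFTER_HEBREW_LETTER = re.compile("(?<=[\u05D0-\u05EA])" + re.escape(APOSTROPHE))


def apostrophe_to_geresh(text: str) -> str:
    """Replace an apostrophe that follows a Hebrew letter with a geresh."""
    return _AFTER_HEBREW_LETTER.sub(GERESH, text)


if __name__ == "__main__":
    print(codepoint(APOSTROPHE), codepoint(GERESH))
    print(apostrophe_to_geresh("צ'יק"))
```

Restricting the replacement to apostrophes after a Hebrew letter leaves apostrophes in transliterations and English text untouched, which a blanket character-for-character swap would not.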