Module talk:ru-noun

From Wiktionary, the free dictionary

"Diminutive forms - worth adding?" discussion moved to Module_talk:ru-headword --Anatoli (обсудить/вклад) 08:51, 4 June 2014 (UTC)[reply]

Gender detection for common nouns ending in ж, ч, ш or щ

@Benwing2 I once noticed that a declension table asked me for the gender. It may help your detection rules to know that common nouns ending in ж, ч, ш or щ + ь as the last letter of the word are all feminine; masculine nouns don't have a final "ь" after ж, ч, ш or щ. There should be no exceptions to this rule: "ь" after these consonants only serves a grammatical role. This applies only to common nouns, not to proper nouns, especially surnames. --Anatoli T. (обсудить/вклад) 23:50, 1 November 2015 (UTC)

@Atitarev OK, I can implement this if you want, although there may not be so many nouns affected. Benwing2 (talk) 04:05, 2 November 2015 (UTC)[reply]
Not few either. Anything that helps simplify adding new entries should be welcome. This part, at least, is predictable. --Anatoli T. (обсудить/вклад) 04:10, 2 November 2015 (UTC)
OK. Benwing2 (talk) 04:15, 2 November 2015 (UTC)[reply]
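
The rule described in this thread could be sketched roughly as follows. This is only a Python illustration, not the module's actual Lua code; the function name `guess_gender_hushing` is hypothetical, and the rule is meant for common nouns only, as noted above.

```python
from typing import Optional

ACUTE = "\u0301"   # combining stress mark used in headwords
HUSHING = "жчшщ"

def guess_gender_hushing(lemma: str) -> Optional[str]:
    """Return "f" if a common noun ends in ж/ч/ш/щ + ь, otherwise None."""
    word = lemma.replace(ACUTE, "").lower()
    if len(word) >= 2 and word[-1] == "ь" and word[-2] in HUSHING:
        return "f"
    # No claim is made here about any other ending.
    return None

print(guess_gender_hushing("ночь"))  # f
print(guess_gender_hushing("нож"))   # None
```

Returning None rather than guessing keeps the sketch within what the thread states: the soft sign after these consonants signals feminine, while nothing is inferred for other endings.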

Proper nouns uncountable?

(moved from Talk:Бухара)

Wikitiki wrote:

@Benwing2: The template {{ru-proper noun+}} should not label the word as "uncountable". --WikiTiki89 14:23, 18 April 2016 (UTC)[reply]
This was a decision made a while ago that Anatoli requested. The default for proper nouns is singular-only, but you can specify |n=b to override this. (This is documented in the {{ru-proper noun+}} documentation.)
@Atitarev, Cinemantique, Wikitiki89, Wanjuscha, KoreanQuoter Benwing2 (talk) 18:46, 18 April 2016 (UTC)[reply]
@Benwing2: Maybe you misunderstood me. It should be singular-only, but not uncountable. These are not the same thing. All I'm saying is that the "uncountable" label and category should be removed from proper nouns. --WikiTiki89 18:48, 18 April 2016 (UTC)[reply]
@Wikitiki89 OK, the module (it's actually Module:ru-headword) doesn't currently make a distinction between singular-only and uncountable. This could be changed but it would be fairly major; we'd have to go through and decide for each singular-only common noun how to classify it. What should it say in place of uncountable? "Singulare tantum"? Benwing2 (talk) 18:55, 18 April 2016 (UTC)[reply]
It shouldn't say anything at all in its place. The only place this distinction is relevant is in the headword line. It shouldn't be too hard to make it so that if it is singular-only, then only add the uncountable label and category if it is a common noun. --WikiTiki89 19:01, 18 April 2016 (UTC)[reply]
I just did it myself: diff. Feel free to make any changes if I have strayed from your vision of how the module should work. --WikiTiki89 19:32, 18 April 2016 (UTC)[reply]
@Atitarev, Cinemantique, Wikitiki89, Wanjuscha, KoreanQuoter: Re-pinging because I'm pretty sure Benwing's ping didn't go through. Does anyone object? Singular-only proper nouns are still countable, but lack a plural only because there is only one of the referent. --WikiTiki89 20:24, 18 April 2016 (UTC)
I have no objections for proper nouns not to use "uncountable".--Anatoli T. (обсудить/вклад) 00:10, 19 April 2016 (UTC)[reply]

Display genitive plural in the headword inflections?

@Atitarev, Cinemantique, Wikitiki89, Wanjuscha, KoreanQuoter I notice we currently display the nom sg, gen sg, and nom pl in the headword inflections but not the gen pl, except for pluralia tantum. The gen pl is perhaps the most complicated case to form:

  • reducible feminines and neuters will have an extra vowel in the genitive plural, according to complex rules;
  • there are various endings (-ов, -ев, -ёв, -ей, -ь, no ending, etc.), with complicated rules regarding which one is chosen;
  • some nouns (e.g. железа́ (železá)) have a more-or-less unpredictable ё that appears only in the genitive plural;
  • lots of nouns have an irregular genitive plural.

I think it might make sense to include the genitive plural in the headword, either always or in certain situations (irregularities, reducible feminines and neuters, etc.). Now that nouns use {{ru-noun+}} for the headword, which includes full declension info, this should be fairly easy. The main issue here is that in some nouns the info in {{ru-noun+}} isn't quite correct for generating the genitive plural. This can probably be fixed by bot: the arguments to {{ru-noun+}} and {{ru-noun-table}} should always be the same (except for certain extra arguments in {{ru-noun+}} such as f=, m=, g=, g2=, etc.). Benwing2 (talk) 21:55, 7 August 2016 (UTC)[reply]

@Atitarev, Cinemantique, Wikitiki89, Wanjuscha, KoreanQuoter Pinging again, in case the multi-paragraph entry confused things. Benwing2 (talk) 21:56, 7 August 2016 (UTC)[reply]
@Benwing2 I agree that the gen pl is perhaps the most complicated case for beginners to understand. Of all the issues about the Russian plurals, I found the gen pl form to be somewhat tricky. It really depends on the gender and such, so yeah. --KoreanQuoter (talk) 02:58, 8 August 2016 (UTC)[reply]
Genitive plural is helpful but I'm not sure we need to make the headword even more complex (I'm not going to vote one way or the other either). Accusative plural is also helpful to understand the animacy. Unlike some other languages, you can't pick one or two forms to help determine with 100% accuracy the stress pattern or declension pattern, but some examples help to get an idea. --Anatoli T. (обсудить/вклад) 06:57, 8 August 2016 (UTC)
I 100% support having the gen. pl. It is certainly more important to include than even the gen. sg. Even native speakers often confuse this form (and I'm not only talking about myself). --WikiTiki89 15:19, 8 August 2016 (UTC)[reply]
I was just about to mention the possibility of not displaying the gen sg. IMO the gen sg is only useful (1) to distinguish accent classes a and b in end-stressed masculine nouns (most of which are class a); (2) to indicate that the noun is reducible; (3) to distinguish masculine and feminine nouns in -ь (which is also available from the indicated gender). My default Russian dictionary omits genitive singular in most cases, and is more likely to include genitive pl than genitive sg. Benwing2 (talk) 15:49, 8 August 2016 (UTC)[reply]
Yes, I've suggested that before, but Anatoli was opposed to it. For nouns ending in -ь it serves the double purpose of distinguishing gender and accent class. However, to fully show the accent class, you would also need to give the nominative plural and one other plural case (other than nominative and genitive), and for feminine nouns ending in -а you would have to give the accusative singular. That would put too much information in the headword line, so we shouldn't really attempt to give the full accent pattern by example (and that's what the declension table is for anyway). --WikiTiki89 15:55, 8 August 2016 (UTC)
Another advantage of displaying the genitive plural is that in conjunction with the nominative plural it will help to indicate accent classes e and f. Benwing2 (talk) 16:00, 8 August 2016 (UTC)[reply]
But it's not as reliable for that because of all the exceptional cases. --WikiTiki89 17:14, 8 August 2016 (UTC)[reply]
@Atitarev, Cinemantique, Wikitiki89, Wanjuscha, KoreanQuoter I implemented displaying the genitive plural. I'm still in the process of running a bot to fix up cases of disagreement between headword and decl, so there may be a few cases currently where the headword-displayed gen pl is wrong. These will be fixed shortly. Benwing2 (talk) 08:12, 9 August 2016 (UTC)[reply]
@Atitarev, Cinemantique, Wikitiki89, Wanjuscha, KoreanQuoter This should be fixed for common nouns. Benwing2 (talk) 18:56, 13 August 2016 (UTC)[reply]
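
The consistency check mentioned above (headword and declension-table arguments agreeing apart from headword-only extras) might look something like this in Python. It is a rough sketch, not the bot that was actually run: the naive `|` splitting, the function names, and the placeholder arguments `x`, `y`, `z` are all assumptions, and only the four extra parameters named in the thread are ignored.

```python
# Headword-only parameters named in the thread ("f=, m=, g=, g2=, etc.").
HEADWORD_ONLY = {"f", "m", "g", "g2"}

def parse_template(call: str):
    """Split a simple {{name|a|b|k=v}} call into (name, positional, named).
    Nested templates and links are not handled in this sketch."""
    text = call.strip()
    if not (text.startswith("{{") and text.endswith("}}")):
        raise ValueError("not a template call: " + call)
    parts = text[2:-2].split("|")
    positional, named = [], {}
    for part in parts[1:]:
        key, sep, value = part.partition("=")
        if sep:
            named[key.strip()] = value.strip()
        else:
            positional.append(part.strip())
    return parts[0].strip(), positional, named

def declension_args_match(headword_call: str, table_call: str) -> bool:
    """True if both calls carry the same declension arguments."""
    _, h_pos, h_named = parse_template(headword_call)
    _, t_pos, t_named = parse_template(table_call)
    h_named = {k: v for k, v in h_named.items() if k not in HEADWORD_ONLY}
    return h_pos == t_pos and h_named == t_named

print(declension_args_match("{{ru-noun+|x|y|g=z}}", "{{ru-noun-table|x|y}}"))  # True
```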

Alternative instrumental case forms of feminine compound nouns

@Atitarev, Cinemantique, Benwing2, Wanjuscha, KoreanQuoter: The instrumental case of, for example, Российская Федерация (Rossijskaja Federacija) can not only be “Российской Федерацией” or “Российскою Федерациею”, but also “Российской Федерациею” or “Российскою Федерацией”. This is one of the reasons that I think we should have left out declensions of compound nouns, instead pointing readers to the individual components. However, now that we do provide these full declensions, should we include all these forms? Should we indicate the existence of these variants in some more compact way, or should we just ignore them? Note that listing all of them could quickly get messy when there are more than two words in the compound. --WikiTiki89 18:28, 10 August 2016 (UTC)[reply]

I think what we're currently doing is OK, as it suggests that there are alternative forms that can be used. It would stand to reason that mixed forms can exist, and I think the user will infer this. I imagine these mixed endings won't be so common in any case. Benwing2 (talk) 20:11, 10 August 2016 (UTC)[reply]
Well the -ю forms are not so common themselves in the first place. When they are used, it's usually for euphony, and I suspect that needing it for euphony twice in a row is less likely than needing it just once, so I think mixed cases might actually be more common. But I was thinking maybe we should show something like Росси́йской/-ю Федера́цией/-ю, which is even more concise than what we have now, and also doesn't create links to super rare forms like Российскою Федерациею. --WikiTiki89 20:33, 10 August 2016 (UTC)[reply]
OK, I'm not in principle opposed to this but it will add some complexity to the module and may break a number of my bot scripts, so it will take a little while to get to (if we decide to do it). Benwing2 (talk) 22:28, 10 August 2016 (UTC)[reply]
Росси́йской/-ю Федера́цией/-ю is fine by me. Another option might be to add a little footnote about -ой/-ою, -ей/-ею and suggest that users go to the individual entries for more. --Anatoli T. (обсудить/вклад) 22:39, 10 August 2016 (UTC)
Yeah, just giving Российской Федерацией and including a footnote might be even better. --WikiTiki89 15:32, 11 August 2016 (UTC)[reply]
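
To illustrate why listing every mixed form gets messy, here is a small Python sketch that contrasts the full enumeration with the compact "/-ю" notation proposed above. The function names are hypothetical, stress marks are left out for simplicity, and the sketch assumes each word's short instrumental form ends in й.

```python
from itertools import product

def long_form(short: str) -> str:
    """Российской -> Российскою, Федерацией -> Федерациею."""
    if not short.endswith("й"):
        raise ValueError("expected an instrumental form ending in й")
    return short[:-1] + "ю"

def compact(words):
    """Compact display along the lines suggested in the thread."""
    return " ".join(word + "/-ю" for word in words)

def all_variants(words):
    """Every combination of short and long endings: 2**len(words) forms."""
    choices = [(word, long_form(word)) for word in words]
    return [" ".join(combo) for combo in product(*choices)]

words = ["Российской", "Федерацией"]
print(compact(words))            # Российской/-ю Федерацией/-ю
print(len(all_variants(words)))  # 4
```

A three-word compound would already yield eight full forms, whereas the compact notation stays one line long.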

Display of hypothetical/conjectural or rare/awkward forms

@Atitarev, Cinemantique, Wikitiki89, Wanjuscha, KoreanQuoter This is now supported in declension tables (not yet in headwords). You can either use overrides and put a * at the beginning of a word, or use something like plhypall=y. There are actually a number of *hyp* arguments for specifying sets of forms to mark as hypothetical, which exactly parallel the *tail* arguments, but by far the most common is plhypall=*, which marks all plural forms as hypothetical. Such words are displayed unlinked, and currently the code hardcodes the display as italic with a preceding *. Perhaps it would be better to use a CSS class? AFAIK you can specify the italics as a text style and the * as a character to display at the beginning of a span. Alternatively, we can hard-code the * but surround it with a second CSS class for customization purposes; this is what {{i}} does. Wikitiki, do you feel like implementing this? I'm not super-familiar with CSS. The line that needs to be modified is 4638 in Module:ru-noun; it might also make sense to convert the hard-coded #888 color in line 4660 to be a CSS class as well. Benwing2 (talk) 05:37, 14 August 2016 (UTC)

I'll take a look when I'm at a proper computer. But why exactly do we need to italicize the words? I know ru.wikt does it, but I'm still not sure why. Which of the accepted and understood uses of italics would this fall under? It doesn't fall under any in the list on Wikipedia, but maybe that's not a comprehensive list. --WikiTiki89 21:17, 14 August 2016 (UTC)
I think italics look good, but there are other possibilities, e.g. grayed-out text was suggested by someone. Mainly I want the text clearly set off from normal inflections. Benwing2 (talk) 23:09, 14 August 2016 (UTC)[reply]
OK, I went ahead and implemented this using a CSS class "hypothetical" (and also a "hypothetical" face, see tag_text in Module:script utilities). You can change the CSS class in MediaWiki:Common.css to change its look. Perhaps adding "color: #888;" would be good. There's also another CSS class "hypothetical-star" surrounding the * that precedes the hypothetical text, which can be used to customize the *. Benwing2 (talk) 05:03, 15 August 2016 (UTC)[reply]
I think graying out would be better than italics. Also, after seeing the results, I think it might be better to omit these forms from the headword line and include them only in the declension table. What do you think about that? --WikiTiki89 17:19, 15 August 2016 (UTC)[reply]
Both of these suggestions are fine with me. What should the headword line display instead? Should it just leave them blank, or display "conjectural/disused plural" or something? Benwing2 (talk) 17:45, 15 August 2016 (UTC)[reply]
I was thinking just omit it, but displaying "conjectural/disused plural" would work as well. I'm not sure. --WikiTiki89 17:46, 15 August 2016 (UTC)[reply]

Neo Vocative forms?

I don't know why, but some pages (like ребята, Лена) lack their neo-vocative forms. Should they be added? --Yoshiciv (talk) 16:30, 15 January 2017 (UTC)

@Atitarev, Cinemantique, Wikitiki89, Wanjuscha, KoreanQuoter I feel like we discussed this somewhere before but I'm not sure where. Anatoli? Benwing2 (talk) 17:30, 15 January 2017 (UTC)[reply]
I can't find any discussion either. Yes, they can be added but it should only be optional with a parameter, eg nv=y. Grammars don't define this colloquial feature yet. The Russian Wikipedia on звательный падеж mentions there is no agreement among scholars. When it's used, the final а/я is dropped, я is replaced with ь. @Yoshiciv, could you please show an example where it's used? --Anatoli T. (обсудить/вклад) 20:48, 15 January 2017 (UTC)[reply]
One example: a Google image search for "Ребят". You can see it used like "guys" in English. But I think I need a hand from native speakers. --Yoshiciv (talk) 03:51, 16 January 2017 (UTC)
@Yoshiciv Thanks. We never implemented this feature, which is not considered standard (it doesn't mean we can't or we won't do it). I've added before some links in "related terms" in some entries, e.g. Ка́тя (Kátja) -> Кать (Katʹ). Yes, "ребя́т" would be a colloquial "new vocative" or "neo-vocative" case (новозва́тельный паде́ж) for ребя́та (rebjáta). @Benwing2, Cinemantique, Wikitiki89, Wanjuscha, KoreanQuoter. --Anatoli T. (обсудить/вклад) 04:00, 16 January 2017 (UTC)[reply]
Thank you!--Yoshiciv (talk) 12:09, 16 January 2017 (UTC)[reply]
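
The formation rule given above (final а is dropped; final я is replaced with ь) can be sketched in Python as follows. This is an illustration only, since the feature was never implemented in the module; the function name is hypothetical, and declining to answer for stressed endings or for а/я after a vowel is an assumption of this sketch, as the thread does not cover those cases.

```python
from typing import Optional

ACUTE = "\u0301"            # combining stress mark
VOWELS = "аеёиоуыэюя"

def neo_vocative(lemma: str) -> Optional[str]:
    """Colloquial "new vocative": Ка́тя -> Ка́ть, ребя́та -> ребя́т."""
    if len(lemma) < 2 or lemma.endswith(ACUTE):
        return None          # stressed ending: not covered in the thread
    stem, last = lemma[:-1], lemma[-1]
    if last not in "ая":
        return None
    if stem[-1] == ACUTE or stem[-1].lower() in VOWELS:
        return None          # ending after a vowel: not covered either
    return stem if last == "а" else stem + "ь"

print(neo_vocative("Лена"))  # Лен
print(neo_vocative("Катя"))  # Кать
```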

-овщина

Feminines for proper nouns

(moved from Куйбышев)

@Benwing2: BTW, please add a feminine form for surnames in the header. --Anatoli T. (обсудить/вклад) 23:58, 13 May 2018 (UTC)[reply]
@Atitarev Ok, I'll write a script to do that. I'll make it do the following:
  1. -ев, -ёв, -ов, -о́в -> -ева, -ёва, -ова, -о́ва
  2. -ин, -и́н -> -ина, -и́на
  3. Фоми́н, Ильи́н -> Фомина́, Ильина́ (special exceptions noted in Zaliznyak)
  4. -ский -> -ская
  5. For other endings (e.g. -енко, -вич, -ян), no feminines will be indicated
Sound good? Benwing2 (talk) 00:09, 14 May 2018 (UTC)[reply]
@Benwing2: Yes, sounds great. I actually only asked for the header. There are other surnames with a gender difference, actually:
  1. -ый -> -ая (Безро́дный)
  2. -ой-> -а́я (Бережно́й)
  3. -ий -> -ая (not just -ский) --Anatoli T. (обсудить/вклад) 00:26, 14 May 2018 (UTC)[reply]
@Benwing2: Can't think of a good example of -ий surname, which is not -ский but they are basically formed from some adjectives. Not sure if such a surname exists, but Пе́гий is a possible surname formed from an adjective. --Anatoli T. (обсудить/вклад) 00:34, 14 May 2018 (UTC)[reply]
@Atitarev Done. I also had my bot fix a number of cases where adjectival surnames weren't properly marked as adjectival in the headword. Benwing2 (talk) 02:28, 14 May 2018 (UTC)[reply]
@Benwing2: Thanks, I have corrected two I noticed that were incorrect and should follow the rule you wrote "Фоми́н, Ильи́н -> Фомина́, Ильина́ (special exceptions noted in Zaliznyak)": Бородина́, Щедрина́. --Anatoli T. (обсудить/вклад) 02:36, 14 May 2018 (UTC)[reply]
@Atitarev Thanks. Didn't realize that the rule for Фоми́н and Ильи́н applied to all names in stressed -и́н. Benwing2 (talk) 02:42, 14 May 2018 (UTC)[reply]
@Benwing2: That's OK. I know it can be hard to follow sometimes. Thanks for the work! --Anatoli T. (обсудить/вклад) 03:26, 14 May 2018 (UTC)[reply]
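
Putting the rules from this thread together, including the later correction that stressed -и́н shifts the stress to the ending, gives roughly the following Python sketch. It is not the script that was actually run; the function name is hypothetical, the input is assumed to carry stress marks as combining acute accents, and cases the thread does not discuss (for example soft-stem adjectival surnames, or foreign surnames that happen to end in -ин or -ов) are not handled.

```python
from typing import Optional

ACUTE = "\u0301"  # combining stress mark

def feminine_surname(masc: str) -> Optional[str]:
    """Return the feminine form of a stress-marked surname, or None."""
    # Stressed -и́н: the stress moves to the ending (Фоми́н -> Фомина́).
    if masc.endswith("и" + ACUTE + "н"):
        return masc[:-3] + "ина" + ACUTE
    # -ев, -ёв, -ов, -о́в and unstressed -ин simply add -а.
    if masc.endswith(("ев", "ёв", "ов", "о" + ACUTE + "в", "ин")):
        return masc + "а"
    # Adjectival surnames: -о́й -> -а́я, -ый/-ий (including -ский) -> -ая.
    if masc.endswith("о" + ACUTE + "й"):
        return masc[:-3] + "а" + ACUTE + "я"
    if masc.endswith(("ый", "ий")):
        return masc[:-2] + "ая"
    # Other endings (e.g. -енко, -вич, -ян): no feminine is indicated.
    return None

print(feminine_surname("Куйбышев"))  # Куйбышева
print(feminine_surname("Шевченко"))  # None
```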

Enclitic accusative

(moved from Template talk:ru-noun-table)

What is the recommended way of marking declensions that have an unstressed enclitic accusative with certain prepositions? For example, на́ ноги (ná nogi), за́ полночь (zá polnočʹ), etc. Tetromino (talk) 21:18, 1 July 2018 (UTC)[reply]

(Notifying Atitarev, Cinemantique, KoreanQuoter, Useigor, Wanjuscha, Wikitiki89, Stephen G. Brown, Per utramque cavernam, Guldrelokk): I think for this you would just use a usage note in the lemma's page. There's no specific support for indicating such things in the declension table. (If you need to include a multisyllabic unaccented word in {{ru-noun-table}}, prefix it with *, but that's not what you're asking about.) Benwing2 (talk) 21:43, 1 July 2018 (UTC)[reply]

Names of declension templates

(Notifying Atitarev, Cinemantique, KoreanQuoter, Useigor, Wanjuscha, Wikitiki89, Stephen G. Brown, Per utramque cavernam, Guldrelokk): It has always bothered me that the names of the declension and headword templates are completely illogical:

  • {{ru-noun-table}} is for regular noun declensions, while {{ru-decl-noun}} is for irregular noun declensions. We also have {{ru-decl-noun-unc}} for irregular singular-only noun declensions, and {{ru-decl-noun-pl}} for irregular plural-only noun declensions.
  • For adjectives, the situation is exactly reversed: {{ru-adj-table}} is for irregular adjective declensions, while {{ru-decl-adj}} is for regular adjective declensions.
  • We also have {{ru-noun-old}} and {{ru-adj-old}} for pre-reform noun and adjective declensions, which look like they ought to be headword templates (cf. {{ru-noun}} and {{ru-adj}}). There are no special templates for pre-reform irregular noun and adjective declensions; you use |old=1 with the normal irregular declension templates.

I am thinking of fixing this up by renaming the irregular and pre-reform templates like this:

The number of pages requiring renaming is not that great (maybe a few hundred). This leaves only the following:

            Regular              Irregular
Noun        {{ru-noun-table}}    {{ru-decl-noun-irreg}}, {{ru-decl-noun-irreg-unc}}
Adjective   {{ru-decl-adj}}      {{ru-decl-adj-irreg}}

The reason for having two irregular declension templates is that the irregular noun declension templates take positional parameters to specify the declensions. You could as well use {{ru-decl-noun-irreg}} for uncountable nouns, but you'd have to put "-" in every other parameter (you could also just use {{ru-noun-table}} for all irregular nouns but you'd have to name the parameters: |nom_sg=, |nom_pl=, |gen_sg=, |gen_pl=, etc.).

This still leaves {{ru-noun-table}} vs. {{ru-decl-adj}}, but fixing that would require renaming templates on thousands of pages and so I'm less gung-ho about making the change unless others really think it's a good idea. Benwing2 (talk) 19:38, 8 July 2018 (UTC)[reply]

@Benwing2: No objections.--Anatoli T. (обсудить/вклад) 22:15, 8 July 2018 (UTC)[reply]

Adverbs like два́дцатью

(Notifying Atitarev, Cinemantique, Useigor, Wikitiki89, Stephen G. Brown, Guldrelokk, Fay Freak, Tetromino, Per utramque cavernam): ruwikt says there exist multiplicative adverbs like два́дцатью "twenty times as much", де́сятью "ten times as much", пя́тью "five times as much", and even gives an example sentence: пя́тью пять — два́дцать пять (pjátʹju pjatʹ — dvádcatʹ pjatʹ, "five times five is twenty-five"). My questions are: (1) Are these real? (Presumably so, although they aren't listed in Zaliznyak.) (2) For which numbers do these multiplicative adverbs exist, and how are they formed? For examples like пя́тью, де́сятью, два́дцатью, they differ from the corresponding instrumental пятью́, десятью́, двадцатью́ by the stress, but what about e.g. пятна́дцать, with instrumental пятна́дцатью? Can you form a multiplicative adverb meaning "fifteen times as much"? What about the multiplicative adverbs of два, три, четы́ре, два́дцать пять, etc.? Benwing2 (talk) 04:28, 12 February 2019 (UTC)

Yes, I can easily produce an expression of the form "N-ю пять" (like пя́тью пять) for the N = 1-20, 30, 50, 60, 70, 80. (Maybe there are more?) For 1-10, 20, 30, they differ from the instrumental case, sometimes significantly (compare единожды vs. одним, дважды vs. двумя). For 11-19, 50, 60, 70, 80 they coincide with the instrumental, so I suppose we shouldn't have a special page for those. Tetromino (talk) 05:25, 12 February 2019 (UTC)[reply]
List of exceptional ones that I can think of: еди́ножды (jedínoždy), два́жды (dváždy), три́жды (tríždy), четы́режды (četýreždy), пя́тью (pjátʹju), ше́стью (šéstʹju), се́мью (sémʹju), во́семью (vósemʹju), де́вятью (dévjatʹju), де́сятью (désjatʹju), два́дцатью (dvádcatʹju), три́дцатью (trídcatʹju). Tetromino (talk) 05:33, 12 February 2019 (UTC)[reply]
@Tetromino Thanks! What about e.g. "23 times 5" or "40 times 5"? How do you say that in Russian? Benwing2 (talk) 06:07, 12 February 2019 (UTC)[reply]
I would say that in full, двадцать три умножить на пять. Tetromino (talk) 13:24, 12 February 2019 (UTC)[reply]
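
The answers above can be summarised as a small lookup, sketched here in Python. The table holds exactly the exceptional forms listed in this thread; the function name and the three category labels are made up for illustration, and treating every other number as needing the full "умножить на" phrasing generalises from the 23 and 40 examples.

```python
# Multiplicative adverbs that differ from the instrumental case,
# as listed in the thread (stress marked with a combining acute).
DISTINCT_ADVERBS = {
    1: "еди́ножды", 2: "два́жды", 3: "три́жды", 4: "четы́режды",
    5: "пя́тью", 6: "ше́стью", 7: "се́мью", 8: "во́семью",
    9: "де́вятью", 10: "де́сятью", 20: "два́дцатью", 30: "три́дцатью",
}

# Numbers whose multiplicative form coincides with the instrumental.
SAME_AS_INSTRUMENTAL = set(range(11, 20)) | {50, 60, 70, 80}

def multiplier_kind(n: int) -> str:
    """Classify how "n times ..." is expressed, per the thread."""
    if n in DISTINCT_ADVERBS:
        return "adverb"
    if n in SAME_AS_INSTRUMENTAL:
        return "instrumental"
    return "periphrastic"   # e.g. двадцать три умножить на пять

print(multiplier_kind(5))   # adverb
print(multiplier_kind(15))  # instrumental
print(multiplier_kind(23))  # periphrastic
```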

Categorisation

@Benwing2 We currently have Category:Russian soft-stem masculine-form accent-e nouns, which I think could be categorised better:

Do you think this is ok? —Rua (mew) 13:42, 25 May 2019 (UTC)[reply]

@Rua Somehow I missed this ping from 6 months ago. I agree with your suggestions. If you want to implement them, feel free, or I will eventually get to them. Benwing2 (talk) 13:59, 14 November 2019 (UTC)[reply]

Gender of nouns in -ость, -тель, -шь, -чь, -жь, -щь

@Canonicalization (Notifying Atitarev, Cinemantique, Useigor, Wikitiki89, Guldrelokk, Fay Freak, Tetromino, Canonicalization): Canonicalization asked if the module should be smarter about nouns in -ость, defaulting to feminine. There's гость (gostʹ) that's masculine, but that's the only one I see. Similarly it could default to masculine for nouns in -тель, and feminine for nouns in -шь, -чь, -жь, -щь. I am not opposed to this; I wonder what others think. Benwing2 (talk) 14:01, 14 November 2019 (UTC)[reply]

BTW I found the following feminine nouns in -тель: артель (artelʹ), гантель (gantelʹ), добродетель (dobrodetelʹ), канитель (kanitelʹ), метель (metelʹ), обитель (obitelʹ), пастель (pastelʹ), постель (postelʹ), ракета-носитель (raketa-nositelʹ), as well as переключатель (pereključatelʹ) that's wrongly classified as feminine, vs. 242 masculine nouns in -тель. ракета-носитель (raketa-nositelʹ) is a weird case because ракета (raketa) is feminine and declined that way, while носитель (nositelʹ) is masculine and declined masculine. Benwing2 (talk) 14:10, 14 November 2019 (UTC)[reply]
BTW of the feminine nouns above in -тель, most are end-stressed, and the module could default to no gender in that case since AFAIK masculine nouns in -тель do not have end stress. Benwing2 (talk) 14:13, 14 November 2019 (UTC)[reply]
гость (gostʹ) looks like an exception, but that's because it's not using the suffix -ость (-ostʹ). The same reasoning applies to the nouns ending in -тель you've just mentioned: they're not using the suffix -тель (-telʹ) (обитель (obitelʹ) was wrongly etymologised; since fixed by Useigor). Maybe there's a way to somehow link gender detection and etymology templates? But that might be too convoluted. Canonicalization (talk) 14:18, 14 November 2019 (UTC)
@Benwing2, Canonicalization: I have no objection but it seems more predictable with -ость (-ostʹ) with гость (gostʹ) as an exception. --Anatoli T. (обсудить/вклад) 10:14, 17 November 2019 (UTC)[reply]
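
A default-with-exceptions approach for -ость and -тель, using only the words named in this thread, might look like the Python sketch below. It is hypothetical and not how the module works: the function name is invented, the exception sets contain just the nouns listed above, and hyphenated compounds such as ракета-носитель are deliberately left without a default.

```python
from typing import Optional

ACUTE = "\u0301"  # combining stress mark

# Exceptions taken from the thread.
MASCULINE_OST = {"гость"}
FEMININE_TEL = {
    "артель", "гантель", "добродетель", "канитель",
    "метель", "обитель", "пастель", "постель",
}

def default_gender(lemma: str) -> Optional[str]:
    """Suggest a default gender for nouns in -ость or -тель."""
    word = lemma.replace(ACUTE, "").lower()
    if "-" in word:
        return None  # compounds decline each part separately
    if word.endswith("ость"):
        return "m" if word in MASCULINE_OST else "f"
    if word.endswith("тель"):
        return "f" if word in FEMININE_TEL else "m"
    return None

print(default_gender("гость"))    # m
print(default_gender("метель"))   # f
print(default_gender("учитель"))  # m
```

An explicit exception list is easier to audit than a stress-based heuristic, although the end-stress observation above could shrink the -тель list.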

going to add non-lemma forms

@Atitarev I'm doing a run to generate non-lemma forms for Russian nouns and verbs, which hasn't been done in a while. Benwing2 (talk) 02:59, 30 July 2020 (UTC)

New vocative for plurals

@Benwing2: Hi. Could you please add the ability to add (new) vocative parameter for plurals? I know only two: девча́та (devčáta) and ребя́та (rebjáta). They can have colloquial new vocatives, especially девча́т (devčát) and ребя́т (rebját). Potentially all -а́т/-я́т plurals - котята, зверята, etc. --Anatoli T. (обсудить/вклад) 01:01, 13 October 2020 (UTC)[reply]

@Atitarev Let me see if I can get to this in the next day or so. Benwing2 (talk) 04:11, 13 October 2020 (UTC)[reply]
@Benwing2: Thanks. I meant vocative, not locative (I think you figured that out) :) --Anatoli T. (обсудить/вклад) 04:20, 13 October 2020 (UTC)[reply]
@Benwing2: Hi. Bumping my request. :) --Anatoli T. (обсудить/вклад) 23:47, 18 January 2021 (UTC)[reply]
@Atitarev Apologies, I'll try to get to this in the next day or so. Benwing2 (talk) 01:31, 21 January 2021 (UTC)[reply]
@Benwing2: Thank you! I can only think of the diminutive words above that may have colloquial vocatives in the plural. @Tetromino: Can you think of any others? --Anatoli T. (обсудить/вклад) 01:40, 21 January 2021 (UTC)[reply]
@Atitarev: I can imagine it being used for a few other pluralia tantum words like близня́та (bliznjáta), but I haven't heard it personally. Tetromino (talk) 16:03, 21 January 2021 (UTC)[reply]
@Benwing2: Another bump, LOL. I know you're busy, sorry for numerous pings today. --Anatoli T. (обсудить/вклад) 05:44, 27 December 2021 (UTC)[reply]
@Atitarev I added it to the todo list; apologies that the list keeps growing. It's at the top since you requested it a year and some ago :( ... I looked into just knocking it out but it will take a bit of time as the module is complex. Benwing2 (talk) 06:53, 27 December 2021 (UTC)[reply]

Soft н in 5*a nouns

@Benwing2: In some words with a reducible stem (especially in geographic, ethnographic, and linguistic terminology relating to China), the «н» in -нец is soft. For example: nominative ха́нец (xánec) → genitive singular ха́ньца (xánʹca), nominative plural ха́ньцы (xánʹcy), genitive plural ха́ньцев (xánʹcev). As far as I can tell, this is currently completely unsupported by the module; it allows a soft «л» (as in па́лец (pálec)), but not a soft «н».

Since there is no way to auto-detect a soft «н» in this case (and in the vast majority of words with -нец, the «н» is hard), can we add an explicit declension spec for a soft stem, maybe «ь»? Alternatively, add a way to explicitly set the stem such that it doesn't get mangled. Tetromino (talk) 15:14, 10 March 2021 (UTC)

@Tetromino Apologies, this request got lost among the shuffle. I'm thinking we need to add a flag that tells the reducing algorithm (which converts па́лец (pálec) -> па́льц (pálʹc) and коне́ц (konéc) -> конц (konc)) to add a ь after н followed by an elided vowel. This could be e.g. (нь) if this can only occur with н. Can it occur with any other paired consonant, e.g. р, с, т? If so maybe some other code would be better, like (*ь) (which could stand by itself in place of *). Benwing2 (talk) 07:01, 20 May 2021 (UTC)[reply]
@Benwing2, I cannot think of any examples with a soft р, с, or т. Tetromino (talk) 12:49, 20 May 2021 (UTC)[reply]
@Atitarev This was discussed here. Does it apply only to н? Benwing2 (talk) 02:51, 18 April 2023 (UTC)
@Benwing2: Yes, thanks, only to н. With тайва́нец (tajvánec) both hard and soft forms seem to be valid. I've made it to have both. ха́нец (xánec) has only forms with "ь". Anatoli T. (обсудить/вклад) 03:31, 18 April 2023 (UTC)[reply]
@Tetromino: Hi. Do you have any more words in mind, apart from ха́нец (xánec) and тайва́нец (tajvánec)? Anatoli T. (обсудить/вклад) 03:35, 18 April 2023 (UTC)[reply]
@Atitarev I bet terms like сычуанец (syčuanec) and тяньцзинец (tjanʹczinec) are attestable, which would decline the same way. Theknightwho (talk) 00:52, 19 April 2023 (UTC)[reply]
@Theknightwho: Good pickup, yes :) Anatoli T. (обсудить/вклад) 00:56, 19 April 2023 (UTC)[reply]
@Atitarev Are there any other suffixes where this could come up? Theknightwho (talk) 00:57, 19 April 2023 (UTC)[reply]
@Theknightwho: I can't think of any with problems. Feminine forms like ха́нька (xánʹka), for example, inflect regularly. No issues with related adjectives either. Anatoli T. (обсудить/вклад) 01:05, 19 April 2023 (UTC)[reply]
@Atitarev @Benwing2 I’ve added a category for these: Category:Russian nouns with soft final н in reduced stem (though there may be a better name). Curiously, I came across кида́нец (kidánec), which doesn’t come from the Palladius system, so that potentially leaves the possibility that this could come up in other ways (with other final consonants, maybe?). Theknightwho (talk) 07:07, 2 January 2024 (UTC)[reply]
@Theknightwho Are you sure киданец reduces in this fashion? I can't find this word in any dictionaries but кубанец from Кубань is given in Zaliznyak with normal reduction to кубанц-. Benwing2 (talk) 07:32, 2 January 2024 (UTC)[reply]
@Benwing2 киданьцы (kidanʹcy) is easily attestable, and is the expected form from the root кидань (kidanʹ). Theknightwho (talk) 07:35, 2 January 2024 (UTC)[reply]
@Theknightwho You removed your comment, but Zaliznyak specifically says кубанец is from Кубань. кубинец is listed as from Куба. Benwing2 (talk) 07:50, 2 January 2024 (UTC)[reply]
@Benwing2 Yeah, I checked after I wrote it. I can only assume in this case it's by analogy with other Chinese(-adjacent) ethnonyms. Theknightwho (talk) 07:52, 2 January 2024 (UTC)[reply]
чжурчжэнец (čžurčžɛnec) with the same reduction also looks to be attestable (with difficulty, but I think it's doable). Theknightwho (talk) 08:18, 2 January 2024 (UTC)[reply]
@Theknightwho, @Benwing2: The other soft ending is "л", which is already covered: скита́лец (skitálec).
The variations with soft and hard "н" only happened because there are special cases unique to demonyms derived from Chinese place names ending in -нь: тайва́нец (tajvánec), уха́нец (uxánec), чанчу́нец (čančúnec), тайюа́нец (tajjuánec), сиа́нец (siánec), даля́нец (daljánec), чжурчжэ́нец (čžurčžɛ́nec), etc.
It is still a relatively small category. Anatoli T. (обсудить/вклад) 09:19, 2 January 2024 (UTC)[reply]
@Atitarev @Benwing2 I would say чжурчжэ́нец (čžurčžɛ́nec) and кида́нец (kidánec) are exceptions to that, but are clearly analogous. It seems to correlate with the adjectival ending: compare чжурчжэ́ньский (čžurčžɛ́nʹskij) and куба́нский (kubánskij) (to use the example above).
With that being said, I can see terms like сясьский (sjasʹskij) exist (e.g. Сясьский канал). I'm genuinely curious about this, so I'll see if I can find something. Theknightwho (talk) 09:25, 2 January 2024 (UTC)[reply]
@Theknightwho: TBH, I am more interested in Thai/Khmer transliteration scrapers, since I am almost sure, Russian grammatical modules are over 99% accurate :)
Well, ся́сьцы (sjásʹcy) seem to exist, even if it sounds awkward. Anatoli T. (обсудить/вклад) 09:32, 2 January 2024 (UTC)[reply]
@Atitarev Thanks for the reminder :) ... I will try to get back to this. Benwing2 (talk) 09:37, 2 January 2024 (UTC)[reply]
@Benwing2: Thanks! And the Persian headword next or earlier, whatever you feel more comfortable with:) Anatoli T. (обсудить/вклад) 09:39, 2 January 2024 (UTC)[reply]
@Atitarev Sorry, what's the issue with Persian headwords again? Benwing2 (talk) 09:40, 2 January 2024 (UTC)[reply]
@Benwing2: At Wiktionary_talk:About_Persian#Template:head_allows_etym_languages and the next topic we agreed on what we would like the Persian headword to be: input and output. Remind me when you get to work on it, if you do :) Anatoli T. (обсудить/вклад) 09:44, 2 January 2024 (UTC)
@Atitarev OK thanks! I will probably try to work on Thai/Khmer scraping first. Benwing2 (talk) 09:45, 2 January 2024 (UTC)[reply]
@Benwing2: Yay! Anatoli T. (обсудить/вклад) 09:47, 2 January 2024 (UTC)[reply]
@Atitarev 1 result on Google haha! But yes, I suspect there will be others. I mostly just find it an interesting challenge. Theknightwho (talk) 09:37, 2 January 2024 (UTC)[reply]
@Theknightwho: If you
I found при́пятьцы - citizens of the abandoned city after Chernobyl. Anatoli T. (обсудить/вклад) 09:42, 2 January 2024 (UTC)[reply]
@Atitarev Yeah, and the singular exists as well (and may even pass CFI with some work - I see one use and one dictionary mention on GBooks). Okay, I think that's evidence enough that this is a real phenomenon in a really rare handful of cases. Theknightwho (talk) 09:44, 2 January 2024 (UTC)[reply]
There are two place names I could find ending -зь - Абезь, Кильмезь. Not sure about the stress here. You can try researching. Anatoli T. (обсудить/вклад) 09:37, 2 January 2024 (UTC)[reply]
@Atitarev Well Абезьцы (Abezʹcy) is used in a handful of news articles, so is definitely real. Nothing for the singular, though. Theknightwho (talk) 09:42, 2 January 2024 (UTC)[reply]
@Theknightwho: Singulars may be too rare or never used, especially for such little-known places; they are predictable, though. Note that in Russian, a form can still be correct even if that specific form is hard to attest. Anatoli T. (обсудить/вклад) 09:46, 2 January 2024 (UTC)[reply]
@Theknightwho All right, I may change the (нь) notation to just (ь) and generalize it to apply to any consonant. Benwing2 (talk) 09:51, 2 January 2024 (UTC)[reply]
@Benwing2 Thanks. Could you please change the category to "final consonant" instead of "final н"? Theknightwho (talk) 09:53, 2 January 2024 (UTC)[reply]
@Benwing2 Actually, one other thing: it's probably worth marking all the forms as irregular, given this is vanishingly rare and not predictable. Theknightwho (talk) 09:58, 2 January 2024 (UTC)[reply]
@Theknightwho, @Benwing2: I agree. Only the soft скита́лец (skitálec) is regular. Anatoli T. (обсудить/вклад) 10:00, 2 January 2024 (UTC)[reply]
@Benwing2, @Theknightwho: Will скита́лец (skitálec) be in that category? These are more common and always soft. Anatoli T. (обсудить/вклад) 09:59, 2 January 2024 (UTC)[reply]
@Atitarev I think no, given it's regular, but we should probably name the category to account for that, yeah. Maybe "irregular soft final consonant"? Theknightwho (talk) 10:00, 2 January 2024 (UTC)[reply]
@Atitarev @Benwing2 I think we should potentially mark the relational adjectives as being irregular as well, though that might need to be categorised manually. Theknightwho (talk) 10:31, 2 January 2024 (UTC)[reply]
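Note: the reduction discussed in this thread can be sketched roughly as follows. This is only an illustration in Python, not the code in Module:ru-noun; the function name and the `soft` flag (standing in for the proposed (ь) marker) are made up for the example. It assumes a stem-stressed noun in -ец whose stem ends in a consonant, and it only produces the genitive singular.

```python
ACUTE = "\u0301"  # combining acute accent used as the stress mark
VOWELS = "аеёиоуыэюя"


def genitive_of_ec_noun(lemma, soft=False):
    """Return the genitive singular of a stem-stressed noun in -ец.

    soft=True is a hypothetical stand-in for the (ь) marker: the consonant
    before -ец stays soft, so ь is written once the е drops out.
    """
    if not lemma.endswith("ец"):
        raise ValueError("expected a lemma ending in unstressed -ец")
    stem = lemma[:-2]
    if not stem or stem[-1] in (VOWELS + ACUTE):
        raise ValueError("stems ending in a vowel are not handled here")
    # л is always soft before the reduced suffix (the regular case).
    if soft or stem.endswith("л"):
        stem += "ь"
    return stem + "ца"
```

With these assumptions, скита́лец gives скита́льца without any marker, куба́нец gives куба́нца, and тайва́нец needs `soft=True` to give тайва́ньца. That asymmetry is why only the non-л cases would be flagged as irregular.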

Gender position

@Benwing2: The whole documentation of Template:ru-noun-table treats gender hints only a single time. I don’t know what to do with “nouns with multiple and/or mixed declensions” which need a gender hint. See пупы́рь (pupýrʹ), which has another declension correctly given by Russian Wiktionary. I tried multiple ways on the basis of лоску́т (loskút), стул (stul), кры́ло (krýlo), but only got errors or wrong display. I don’t understand the whole section Template:ru-noun-table#Declension spec. It does not say anything about sequential order or parameter names. Fay Freak (talk) 15:08, 19 May 2021 (UTC)[reply]

@Fay Freak I think you are referring to what to do when the gender hint and the mixed declension spec (or possibly multiple such specs) both need to be given. In general, you just jam all the specs together. You can also stick a semicolon between the different specs, which is ignored but might help the parsing. I'll update the docs to make this clearer. Benwing2 (talk) 06:26, 20 May 2021 (UTC)[reply]
@Fay Freak I took a look at the docs for Template:ru-noun-table and example 7 under "nouns with multiple and/or mixed declensions" shows this:
{{ru-noun-table|ка́мень|*m|or||*m-ья|каме́н|pltail=*|notes=* ''The plurals marked with an asterisk are antiquated forms.''}}
which is similar to your example of пупы́рь (pupýrʹ). Here, after the word or is a blank param for the lemma (which is copied from the previous lemma ка́мень) followed by three specs jammed together: * (reducible noun), m (gender hint, required for nouns in -ь) and -ья (irregular plural declension). Let me know if you need me to clarify this example or add another example. Benwing2 (talk) 06:32, 20 May 2021 (UTC)[reply]
@Benwing2: Now with this example I roughly understand how this template works. The first positional parameter is the stress pattern, the second the word itself with stress mark, the third is “declension spec” in some certain format. The obscure thing is that the gender has to be stated in the template, in this declension spec parameter, in the order listed by the section “declension spec”, separated by hyphens; nor do I understand what the special thing about the “special-case markers” is or why they are separated from declension classes and accent patterns; and you have a fourth parameter for “plural stems” but in an example озерцо́ you still manipulate the stem by parameter. The section “declension spec” says things like “gender is required in the following cases …”, “you should normally supply just +” but I cannot derive from it how to supply – I do not get a picture from it what a valid formatting is. Everything is chopped into parts surely for technical reasons but hardly similar to the way a language user understands the patterns, and I can’t understand what the module searches for to do what, without reading the source code (which I can’t). I still depend but on examples or similarities of existing usage. Fay Freak (talk) 13:12, 20 May 2021 (UTC)[reply]
@Fay Freak I'll see about cleaning up the documentation. This was the first declension module I implemented and all the subsequent ones use a somewhat different format that is maybe easier to understand. Yes, you are right about the order being (1) stress pattern (which can be omitted in which case all numbered params move one to the left), (2) lemma with accent mark, (3) declension modifiers, (4) plural stem. The business about manually specified declension classes almost never needs to be given; I'll rearrange things and move that to the very bottom. The "declension modifiers" are a heterogeneous class of extra information that helps the module in cases where the lemma doesn't include all the necessary information. The most common declension modifiers are gender (m, f or n), the reducibility spec *, the е/ё alternation spec , the "anomalous nominative plural" spec (1), and the "anomalous genitive plural" spec (2). These particular codes come fairly directly from Andrei Zaliznyak's Russian grammar. The various declension modifiers can be placed in the declension modifier parameter in any order, and are just jammed together. The documentation is out of date in this respect. The thing about озерцо́ (ozercó) is that a great number of nouns have an alternation between unstressed е in the lemma and stressed ё in some other form (either the entire plural or just the genitive plural, depending on the stress pattern), and so there's a special declension modifier spec for this purpose. The plural stem (4th param) is normally intended for weird cases like не́бо (nébo) plural небеса́ (nebesá), у́хо (úxo) plural у́ши (úši) or хозя́ин (xozjáin) plural хозя́ев (xozjájev) where the plural is simply formed from a different, unpredictable stem. All of these nouns are found in Category:Russian nouns with irregular plural stem. Benwing2 (talk) 04:41, 21 May 2021 (UTC)[reply]
@Fay Freak See the documentation now. There's a section on declension modifiers (the third param) that should hopefully be a bit clearer. Let me know if anything doesn't make sense. Benwing2 (talk) 06:24, 21 May 2021 (UTC)[reply]
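Note: for readers who, like the original poster, find it hard to picture what "jammed together" means, here is a small Python sketch that splits a declension modifier string into its parts. It is not how Module:ru-noun parses the parameter; the function name, the token labels and the limited token set (gender, *, (1), (2), -ья, and an ignored semicolon) are assumptions taken only from the codes mentioned in this thread.

```python
import re

# One alternative per modifier mentioned in the discussion above.
TOKEN = re.compile(
    r"""
      (?P<reducible>\*)        # reducibility spec
    | (?P<anomalous>\([12]\))  # anomalous nom. pl. (1) or gen. pl. (2)
    | (?P<plural>-ья)          # irregular plural declension
    | (?P<gender>[mfn])        # gender hint
    | (?P<sep>;)               # optional separator, ignored
    """,
    re.VERBOSE,
)


def split_modifiers(spec):
    """Split a jammed-together modifier string into (kind, text) pairs."""
    found = []
    pos = 0
    while pos < len(spec):
        match = TOKEN.match(spec, pos)
        if match is None:
            raise ValueError(f"unrecognised modifier at {spec[pos:]!r}")
        if match.lastgroup != "sep":
            found.append((match.lastgroup, match.group()))
        pos = match.end()
    return found
```

Under these assumptions, `split_modifiers("*m-ья")` yields the three specs from the ка́мень example, and `split_modifiers("m;*")` yields the same gender and reducibility specs in a different order, since order is said not to matter.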

paucal

@Benwing2 Thanks for another enhancement! I can't currently access Zaliznyak's dictionary on my computer. Could you please share the link again or point me to relevant pages? What is Zaliznyak calling the feature? Modern grammars don't describe this to my knowledge but I may be wrong. It seems a little obscure or I forgot. It is dated as you marked. Maybe they should only be on the pre-reform forms? Anyway, please let me read on this first. --Anatoli T. (обсудить/вклад) 09:38, 31 December 2021 (UTC)[reply]

@Benwing2 Hi. Just re-pinging. Not sure if you got my original one. --Anatoli T. (обсудить/вклад) 02:28, 2 January 2022 (UTC)[reply]
@Atitarev Apologies, I have emailed you my copy of Zaliznyak. I think there's a more recent one online but I'm not sure where. The paucal form is mentioned under the Wikipedia entry "Russian declension" under the "Count form" header. It goes by the name "счётная фо́рма" in Russian, but since that seems to refer to two different things, I've distinguished them as "count form" (во́семь бит, две́сти два́дцать вольт, etc.; replacement for the genitive plural, used with most numbers) and "paucal form" (два часа́, три ряда́, etc.; replacement for the genitive singular, only used with 1.5/2/3/4 and derived numbers). I don't think either of them is archaic. Russian Wikipedia also mentions them here: Счётная форма. The "paucal form" is called "паукальная счётная форма" = "paucal count form". Benwing2 (talk) 03:17, 2 January 2022 (UTC)[reply]
@Benwing2: Thanks. I got temporarily confused by the symbols used in the declension table. This is good and thank you! Definitely not dated. —Anatoli T. (обсудить/вклад) 03:25, 2 January 2022 (UTC)[reply]
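Note: the split between the two forms can be sketched as a lookup on the numeral. This is a minimal Python illustration under the usual agreement rules for whole numbers, not anything taken from the module; the function name and the returned labels are invented for the example, and 1.5 (which also takes the paucal form) is left out because only integers are handled.

```python
def governed_form(n):
    """Say which noun form a whole number n governs in Russian.

    Returns one of three illustrative labels:
      "nominative singular"  for 1, 21, 31, ...
      "paucal"               replaces the genitive singular (2-4, 22-24, ...)
      "count"                replaces the genitive plural (everything else)
    """
    n = abs(n)
    last_two = n % 100
    last = n % 10
    if 11 <= last_two <= 14:
        return "count"  # the teens behave like 5 and above
    if last == 1:
        return "nominative singular"
    if last in (2, 3, 4):
        return "paucal"
    return "count"
```

So 2 and 3 select the paucal form (два часа́, три ряда́), while 8 and 220 select the count form (во́семь бит, две́сти два́дцать вольт). For nouns without a distinct paucal or count form, the ordinary genitive singular or plural fills the same slot.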

defaults to b in -ёнок and -ёночек nouns

"defaults to b in -ёнок and -ёночек nouns" must be to a. Stress on stem.Longbowman (talk) 04:53, 23 September 2022 (UTC)[reply]

курица + proscribed forms

Hi @Benwing2.

I have tried to add some proscribed forms in the declension but ended up describing all plural forms. Is that the right way? Anatoli T. (обсудить/вклад) 00:00, 21 July 2023 (UTC)[reply]

@Atitarev Hi. If you put the proscribed forms last, you can use |pltail=* I think; otherwise you have to do what you did. Having the proscribed forms in the middle can be supported easily with the declension params for Ukrainian, Belarusian, etc.; eventually I will rewrite the Russian declension module to work similarly but it's a big task. Benwing2 (talk) 00:03, 21 July 2023 (UTC)[reply]
@Benwing2: thanks, the issue is that the regular forms didn't show automatically after "or", as in the version before my edits. Anatoli T. (обсудить/вклад) 00:14, 21 July 2023 (UTC)[reply]
@Atitarev Do you mean the version using both or and overrides? The way it's currently implemented, an override will override all forms. Benwing2 (talk) 00:20, 21 July 2023 (UTC)[reply]

Bug relating to multiple ё's

@Benwing2 @Atitarev I've noticed a bug when the stem has multiple ё's and ends with ё: the second-last ё is wrongly converted to е automatically. e.g. {{ru-noun-table|Гё́дёллё|n=sg}} gives:

Theknightwho (talk) 16:49, 15 January 2024 (UTC)[reply]

@Theknightwho @Atitarev This is presumably happening because of the initial stress; the ё's following it except any that are absolutely word-final get converted because they're treated as unstressed. In general if the stress moves off a ё, it's correct to convert it to е. There are some special hacks for when the ё precedes the stress because of words prefixed with трёх- and четырёх-. The code wasn't written to handle loanwords with multiple stable ё's. Any changes to this code need to be done carefully so that non-loanwords with ё don't get messed up. Benwing2 (talk) 22:23, 15 January 2024 (UTC)[reply]
@Benwing2 From doing a bit of testing, it seems to work backwards from the final ё, and stops when it encounters either another ё or a primary stress mark. Testing ё́ёёёё, ёё́ёёё, ёёё́ёё, ёёёё́ё, ёёёёё́ gives:
This also shows an issue with explicit primary stress being removed. Theknightwho (talk) 23:01, 15 January 2024 (UTC)[reply]
@Theknightwho Yeah, this code is in Module:ru-common and it's quite old and has been hacked upon quite a lot. It was definitely not written with these sorts of terms in mind. Benwing2 (talk) 23:12, 15 January 2024 (UTC)[reply]
@Benwing2 I might have a look over MOD:ru-common at some point, as I’ve noticed a few other odd things that could potentially come up with rare non-native terms (e.g. -ьей and -ьек becoming -ьья and -ьька in the genitive with a reduced stem). Theknightwho (talk) 23:56, 15 January 2024 (UTC)[reply]
@Theknightwho, @Benwing2: It also added a stressed "е́" in the prepositional for some reason: Гё́делле́ (Gjódellé) Anatoli T. (обсудить/вклад) 03:08, 16 January 2024 (UTC)[reply]
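Note: the general rule described above (a ё that loses the stress becomes е, with word-final ё spared) can be sketched as below. This is a simplified Python illustration, not the code in Module:ru-common: the function name and the `keep_final` switch are invented, lowercase ё only is handled, and the трёх-/четырёх- hacks and the stray stressed е́ reported in the prepositional are not modelled.

```python
YO = "\u0451"     # ё
YE = "\u0435"     # е
ACUTE = "\u0301"  # combining acute accent used as the stress mark


def destress_yo(word, keep_final=True):
    """Turn every ё that does not carry the explicit stress mark into е.

    keep_final=True spares a ё in absolutely word-final position, as the
    discussion above describes.
    """
    if ACUTE not in word:
        # No explicit stress mark: ё is taken to carry the stress itself.
        return word
    last = len(word) - 1
    out = []
    for i, ch in enumerate(word):
        stressed = i < last and word[i + 1] == ACUTE
        if ch == YO and not stressed and not (keep_final and i == last):
            out.append(YE)
        else:
            out.append(ch)
    return "".join(out)
```

Applied to Гё́дёллё this turns the middle ё into е and leaves the final one alone, which matches the second-last ё being converted in the original report. A fix for loanwords would presumably need a way to mark such ё's as stable, without touching native words where the conversion is correct.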