Wiktionary talk:About Russian

From Wiktionary, the free dictionary

Romanisation

Why is ц transliterated as ts? I would use c, I mean, if we already use ž, č and š... H. (talk) 16:40, 1 March 2007 (UTC)

And how about using the correct ʹ for ь instead of '? We recommend ʺ for ъ already. H. (talk) 16:44, 1 March 2007 (UTC)

Hear, hear! I also recommend c for ц. I also agree with the use of ʹ (U+02B9, MODIFIER LETTER PRIME) and ʺ (U+02BA, MODIFIER LETTER DOUBLE PRIME) for ь and ъ instead of ' (apostrophe) and " (quotation mark), as the latter might have other uses. – Krun 16:58, 1 March 2007 (UTC)
It's hard to type. Maybe it could be added to the "Slavic Roman" section in the edit menu? --Jaroslavleff 10:13, 19 March 2007 (UTC)
I agree, ʹ is too hard to type for such a small distinction. Using ' and " does not result in any confusion. As for ц, transliterating it with c will probably be confusing for many people when it is followed by a consonant or the vowels a or o: cap, carstvo, po-nemecki. With ts, they are clear to everyone. Also, I think most people who would understand c for ц probably read Cyrillic anyway and do not need it. The people who actually need the transliteration are less likely to know this use of c. —Stephen 09:47, 20 March 2007 (UTC)

True, but there are cases when one wants to say something like the following: Gogol's writings, where there should be two ' symbols; the first ' is conceived of as a soft sign and the second as an English apostrophe. It looks weird and is likely to confuse anybody who doesn't immediately get it. (For some reason, the Wikipedia editor won't let me type two consecutive apostrophes, so I can't get this comment to display the way I intend, with two instead of one.) Ghfowler (talk) 16:36, 21 October 2012 (UTC)

But this is a dictionary. We do not host the Russian Classics in transliteration, we just have entries for each separate word or phrase, and entries such as those do not have apostrophes. I think in all of our pages, there might be one or two examples that have the Russian quote marks (which may be either «н» or „н“). —Stephen (Talk) 17:37, 21 October 2012 (UTC)

Strange romanization rules.

This looks like a mess of transliteration and transcription. If the goal is transliteration - that's just script substitution, letter by letter. So it's gonna be požálujsta. If we need transcription - we should spell phonemically and that's pažálusta. So where does požálusta fit? As to the current rules:

  • e, ё have two spellings with j and without. Why not ю, я? They are all equal in that sense, i.e. pronounced with or without j. In fact there's rule for я in reflexive verbs, but why only in that place? Пять is certainly not pronounced pjat', but p'at'.
  • e, э use the same letter. But they're pronounced differently, albeit not always.

So, for vowels I think we should use: e = e / je, ё = 'o (not o!, but with exceptions) / jo, э = e / (maybe æ?), ю = 'u / ju, я = 'a / ja. Exceptions for ё are all right, though after ч, I would prefer to keep 'o. So (чёрт = č'ort, not čort), unless you wish to sound Cossack-like. G/h substitution - not good example. I wouldn't imagine pronouncing господи like hóspodi. Unless affecting Ukrainian accent that is. Classic example is лёгкий = l'ohkij. Anyway, if we use g/h and g/v substitution, why not others? That would be код = kot (=кот), etc. and that way we'll just have phonemical transcription which defeats the whole idea of transliteration. (more below). Резюме is not good either. First, it should be res'um... not resjum... (It's not резьюме, right? But according to current rules it is.) Second, it's mostly pronounced res'ume, with soft m. While hard m might be acceptable (which I'm not sure of), кафе or каре would be better example. But if we use ɛ for e sounding like э, we certainly should use it for э itself, shouldn't we? By the way, ɛ is just a square for me 'cause I don't have no Arial Unicode installed. Like many other users. And that makes me think, what is transliteration really for? I guess it's for making some representation of the word to users who do not know the script or simply can't see it. This should be readable (albeit not exactly correct phonetically) by anyone who knows just Latin alphabet. What's the point of writing ɛ instead of э? Is ɛ romanization? It's not Latin alphabet. We might just as well give full IPA and not bother with rules here. By the way, current policy specifically forbids giving pronunciation in translations (and that's where transliteration is mostly used). I think transliteration must enable any user to easily identify and distinguish different printed words in unfamiliar script. Phonemical transcription doesn't allow that. And even if it is used, it must be at least applied consistently.
Now, this is very important matter, so I'm not changing anything myself, but I'd like to see some feedback. --Panda34 20:07, 14 January 2008 (UTC)

It’s transliteration (i.e., po, not pa), but with enough flexibility to note exceptions to pronunciation. Normally, пожалуйста would be pronounced "pʌʒálujstə" and would be transliterated "požálujsta", but this word is a notable exception for its irregular pronunciation. Therefore, the parts that are dropped in pronunciation are dropped from the transliteration: požálsta or požálusta. There are not many cases like this, but each case must be done so that an American who is not a linguist will be able to see best how to say it. This is not for linguists or Russians, who use the Cyrillic. Another example is Ивановыч, which is pronounced irregularly as "Iványč".
I know what you are talking about with пять = "p'at'", but only linguists and native Russians will comprehend that. The best that 99% of Americans and British can do is "pjat". If you write "p'at'", most people will pronounce it "pat". To learn the real differences between "пя" and "пья" takes most English-speakers years, and even then only a handful ever get it. That's why я, ё and ю are always written ja, jo, ju (except after ch, etc., where spelling it "chjo" would result in a serious mispronunciation). You have to take into consideration the audience, few of whom will ever figure out palatalized consonants, and few of whom will even see the importance. As for the Russian letter е, it is too common, and transliterating it as 'e every time looks terrible and helps no one. Linguists and native Russians read the Cyrillic. The only people who need the transliteration are those who don’t get palatalization anyway. It is general practice among most who transliterate Russian to English to transliterate Russian е as "e", and serious users know that the English "e" following a consonant is to be pronounced approximately as "je". Only when Russian е comes at the beginning of a word or follows a vowel does it get transliterated as "je". Initial "e" is used for Russian э. Also, in some words where Russian е is pronounced э, we transliterate it with ɛ. This means that in transliteration, Roman "e" is treated exactly as Cyrillic е, and it is considered that it palatalizes a preceding consonant, without having to put in the mysterious-looking '.
а = a, я = ja
э = e, е = e (sometimes ɛ)
ы = y, и = i
о = o, ё = jo
у = u, ю = ju
except, of course, after ч, ш, etc., since writing čo šo allows an American to come as close as he can to the correct pronunciation, while writing čjo šjo would result in a very bad error. —Stephen 21:28, 14 January 2008 (UTC)
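The vowel conventions described in the comment above can be sketched in code. This is only an illustration under stated assumptions, not the guideline's official algorithm: the function name `romanize`, the consonant table, and the choice of ч, ш, щ, ж as the consonants after which ё loses its j (the comment says only "ч, ш, etc.") are all assumptions made for the example. Stress marks and lexical exceptions (ɛ for е pronounced э, dropped sounds as in пожалуйста) are deliberately not handled.

```python
# Plain vowels and the "j" vowels, following the pairs listed in the comment.
VOWELS = {
    "а": "a", "э": "e", "ы": "y", "и": "i", "о": "o", "у": "u",
    "я": "ja", "ё": "jo", "ю": "ju",
}

# ASSUMPTION: a consonant table in the style used elsewhere on this page
# (ž, č, š, ts for ц, ' for ь, " for ъ); щ = šč is a guess for illustration.
CONSONANTS = {
    "б": "b", "в": "v", "г": "g", "д": "d", "ж": "ž", "з": "z", "й": "j",
    "к": "k", "л": "l", "м": "m", "н": "n", "п": "p", "р": "r", "с": "s",
    "т": "t", "ф": "f", "х": "x", "ц": "ts", "ч": "č", "ш": "š", "щ": "šč",
    "ъ": '"', "ь": "'",
}

CYRILLIC_VOWELS = set("аэыиоуяёюе")

# ASSUMPTION: the "etc." in "after ч, ш, etc." is read here as ч, ш, щ, ж.
HUSHING = set("чшщж")


def romanize(text: str) -> str:
    """Romanize unstressed Russian text using the vowel rules sketched above."""
    out = []
    prev = ""  # previous Cyrillic character, "" at the start of the text
    for ch in text.lower():
        if ch == "е":
            # "je" at the beginning of a word or after a vowel, otherwise "e".
            at_word_start = not prev.isalpha()
            out.append("je" if at_word_start or prev in CYRILLIC_VOWELS else "e")
        elif ch == "ё" and prev in HUSHING:
            # čo, šo rather than čjo, šjo.
            out.append("o")
        elif ch in VOWELS:
            out.append(VOWELS[ch])
        else:
            # Unknown characters (spaces, punctuation) pass through unchanged.
            out.append(CONSONANTS.get(ch, ch))
        prev = ch
    return "".join(out)


for word in ["пять", "чёрт", "его", "моего", "пожалуйста"]:
    print(word, "->", romanize(word))
```

Because the exceptions discussed in this thread (požálusta, Iványč, -ого pronounced with v) are lexical rather than rule-based, a real implementation would presumably need a per-entry override rather than more rules.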
First, if you read carefully I never proposed e = 'e (and for exactly the same reason you provided). I agree about ja 'a. Anyway, it's not what I propose; I'm totally against any phonemic transcription whatsoever. I just pointed inconsistencies in the current system. And, by the way it's Иванович, not ...ыч. And officially it's pronounced exactly ivanovich. It's perfectly normal to say ivanovich in informal speech. But saying ivanych in officialese while not an error makes you look a little on 'hillbilly' side. And that's my whole point. Transliteration gives you an idea how the word is printed, transcription how it's pronounced. This mix only makes confusion. Those who see cyrillic for the first time in their life won't be much better off in pronouncing 'tovo' instead of 'togo'. Listening to heavy foreign accent, I would prefer 'togo' because it can be reverse-engineered (and it doesn't sound wrong, just Northern dialect). But suppose this person finally learns the letters (and I suppose it's fairly easy). What then he'll write? Тово? And then he'll find того outside of the dictionary, he won't even know what it is! The same is for spoken language. If you read it as 'tovo' you'll expect to hear it as 'tovo' but you'll hear 'tavo'. And when spelling something by syllables to somebody who has a problem hearing or understanding you, words are mostly pronounced orthographically. So, you see, it's printed 'togo' pronounced 'tavo' but transliterated 'tovo'. Russian orthography is muddled enough by itself without us inventing some third form.
Your point is that transliteration is used to give some idea of how the word is pronounced. Why don't we transliterate French then? My point is that if some non-linguist wants to pronounce something he would be much better off to look at IPA, SAMPA, at least he'll have a chance to get it right. Transliteration is not for pronouncing but for representing printed words in another script. Russians have to use translit very often in non-Cyrillic-enabled environment, and don't type 'tovo' unless affecting padonki-style. Why should Americans? Consider google hits for "kogo nibud" and "kovo nibud". Two orders of magnitude. The latter form is either unintelligible or illiterate. By providing half-phonemic transcription we don’t help pronunciation much but certainly mangle orthography. Transliteration should be used only in print. And another concern is a rather arbitrary choice of exceptions. In my native ear лёгкий pronounced orthographically sounds like 'too correct' pronunciation which is perfectly comprehensible, while e.g. vtoroj (if you could actually pronounce it) instead of ftaroj is a heavy foreign accent to the point of unintelligibility.
Also note that transliteration may be official. Geography, personal names, etc. Consider 'park gor'kovo' and 'park gor'kogo'. Check google hits. And those are English sites.
I agree that both transcriptions, "Ivanovich" and "Ivanych", should be provided, but "Ivanych" should not be left out, since it is a systematic alteration that applies to virtually all male and female patronymics, and something that the orthogram spelling does not reveal. There are many features and forms in Russian that strike a native speaker as provincial or hillbilly, but which nevertheless are deepseated and something that every Russian knows, even though most people avoid using them except in certain circumstances. It may be good to include a note that a foreigner should stick with "Ivanovich", but the pronunciation "Ivanych" still needs to be shown.
As for the use of transcriptions, they are only for the casual user who does not speak any Russian, cannot read Cyrillic, and is not interested in learning. Unlike Arabic, Indic and the Oriental scripts, Cyrillic can be learnt in only about 30 minutes, and anyone who wants to learn Russian will learn Cyrillic in the first lesson.
Anybody who has to rely on the transcription will be someone who does not speak Russian and you will not be hearing him say "tovó" to you. You can’t look up individual words in a dictionary, one after the other, and manage to pronounce a coherent question and receive and decipher a useful reply. A transcription is only good for someone who is not a linguist, does not speak the language, does not care to learn the language or the writing, but who only wants to know in an approximate way how a certain word or two is pronounced. Certainly a transcription is NOT intended as an alternative medium for native speakers. Everyone who speaks Russian, including Americans who have studied Russian for only a month in school, will use the Cyrillic which is supplied with every Russian word and phrase.
When someone starts to learn Russian, the first hour he learns the printed alphabet and the script alphabet. During the first week, the genitive singular case is introduced, and it is made instantly clear how -ого is to be pronounced. Nobody who learns "tovó" is ever going to go and say "éto vavnó". No one who cannot read Cyrillic and who does not know the pronunciations of г is going to have a conversation with you in Russian. Transliteration does not lead to confusion, because it is only used by those who know almost nothing about the language and who do not much care. In a very few instances, the transliteration can help a serious student to learn about a quirk in pronunciation such as "Ivanych".
We also use the transliteration to help indicate where the stress is. It would be much better to show the stress by putting the acute accent mark, but that is somewhat problematic. For one, the acute accent is hard to type. It is not on any keyboard that I have access to and I have to copy and paste it. For two, it looks bad in most fonts used by Westerners (гра́мма), unless you apply the template {{Cyrl|}}: гра́мма. And third, it confuses some search engines. If you search for "гра́мма", you will not find "грамма" (and vice versa).
You will not be pronouncing syllables orthographically to someone who does not know Cyrillic. To communicate with an American who does not know Cyrillic, you have to speak English. If he asks you how to say a certain word in Russian, such as "give me a beer", it really doesn’t matter whether you pronounce it naturally or orthographically, since he almost certainly will not learn it and will not use it. If he does try to use it once, he will be met with blank stares or a complex sentence in Russian which he won’t understand, so he won’t use it a second time. You are attaching entirely too much importance to the transliterations. They are only for people who do not know anything of the language and who are not likely to ever learn more (that is, most people). They are for people who have a passing interest in only a word or two, and the transliterations have an additional use, for serious users, as a way to inform them of unexpected quirks in the pronunciation of some words, such as "pozhalsta". (And the varying pronunciations of Cyrillic а and о are learnt along with the alphabet in the very first hour of study, so the important thing is not the o’s and a’s, but the dropped -uj-.)
We transliterate Cyrillic because many Americans believe it is a gigantic barrier. It only takes a half hour to learn it, but many Americans complain bitterly about Russian-English dictionaries because they don’t have transliterations. It’s very silly but it’s a fact of life. In the case of French, German, Spanish and Italian, nobody much cares about how the word is supposed to be pronounced. Since they are written in the Roman alphabet, people pronounce them as they wish and the foreigner is expected to understand. If an American says "джа" to a German (ja) and the German doesn’t understand, he thinks he is mentally retarded. But Cyrillic is seen as a fantastic and insurmountable barrier.
As for IPA and SAMPA, most people cannot decipher either of those. Most of us learn a way that approximately describes English pronunciation in dictionaries as we grow up, but that is neither IPA nor SAMPA. IPA and SAMPA are tools of linguists.
As for Russians reading and writing Russian in Roman script, obviously they are going to write "togo" and "Ivanovich" (or "Iwanovic"). Our transliterations are not, I repeat not, for native Russians. They are for people who cannot read Cyrillic and who don’t know Russian and who are almost certainly never going to learn (that is, most people).
Very few people who cannot read Cyrillic can pronounce "vtorój" at all, nor "ftorój". They will add a neutral vowel, or worse. I remember a student sweating over the pronunciation of всё, and the best he could do after much practice was "сфё". "Vtorój" is not intended for any serious students of Russian. Another thing learnt in the first day of study is how в before с is devoiced. The student still expects to see it transcribed as "vs", and "fs" is only for an IPA representation, which virtually nobody uses.
Yes, there are several official transliterations. The U.S. military uses one, which is the first one that I learned; our libraries use another, which is a common one. The U.S. Geological Service uses another one, which is quite common. The transcriptions that we use here are not intended to be any of the official ones, but are very close to them, yet with the flexibility to indicate exceptions in pronunciation. I don’t think there is much need to give ANY of the official transliterations, but for certain words, such as "Gor’kij park", if you wanted to include one or more official transliterations, you would need to indicate which: Scholarly, ISO/R 9:1968, ISO 9:1995, GOST, U.N., ALA-LC, or the BGN/PCGN (United States Board on Geographic Names). For a while, we did try to include some of these, such as GOST and BGN/PCGN, but people objected to every header that we could think of, so we stopped that effort.
The main thing to remember is that the transcriptions are not intended for native Russians or for linguists or for anyone who speaks Russian. They are for the average native English-speaker, who is a person who cannot read Cyrillic, cannot speak Russian, and is never going to learn. In addition, the transcriptions show unexpected exceptions to pronunciation, which is important to serious students of Russian. And lastly, and unfortunately, the transcriptions are a way to show the word stress for the citation form (as I explained above). —Stephen 16:33, 16 January 2008 (UTC)
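The search problem mentioned in this comment (a query for "гра́мма" not matching "грамма") comes from the stress mark being a separate combining character, U+0301 COMBINING ACUTE ACCENT. A minimal sketch of one way to normalize text before comparing it follows; the function name `strip_stress` is hypothetical, and nothing here describes how Wiktionary's own search actually works.

```python
import unicodedata

COMBINING_ACUTE = "\u0301"


def strip_stress(text: str) -> str:
    """Remove acute stress marks from Cyrillic text or Latin transliterations."""
    # Decompose first so that precomposed Latin letters such as "ó" (U+00F3)
    # are split into a base letter plus U+0301 and can be handled the same way.
    decomposed = unicodedata.normalize("NFD", text)
    without_acute = decomposed.replace(COMBINING_ACUTE, "")
    # Recompose so that й and ё, which NFD splits into a base letter plus a
    # breve or diaeresis, come back as single characters.
    return unicodedata.normalize("NFC", without_acute)


stressed = "гра\u0301мма"          # грамма with a stress mark on the first а
print(strip_stress(stressed) == "грамма")
print(strip_stress("jeg\u00f3"))   # Latin transliteration with stress
```

One caveat: this also strips the acute from letters where it is part of the normal orthography, such as Macedonian ѓ and ќ, so the sketch is only suitable for text known to be Russian.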
Ah, I got you now. What we have here for transliteration is a sort of curiosity for idle readers and occasionally a good help for learners to learn immediately of exceptional pronunciation. Makes sense. What I had in mind was helping people have written conversation in Russian via e-mail or IM. Like, you know, most common request for translation "how do I write 'I love you' in Russian using English letters". On second thought I see that it's just the same idle curiosity, anyone mildly interested in writing Russian could easily learn Cyrillic and then figure out how to transliterate it properly if need be. It's just that it's called transliteration that confused me in the first place, it is not. But it is more useful than transliteration, you convinced me. Panda34 20:19, 16 January 2008 (UTC)[reply]

The romanization section here contradicts Appendix:Russian transliteration. That guideline is based on the widely accepted transliteration system used in linguistics, while this one seems to be based on the whims of a bunch of Wiktionary editors.

Scientific transliteration seems to be appropriate for a dictionary (although it's a bad choice as the basis for a system meant to convey pronunciation to English-language readers). Other standardized systems have their advantages too, but there's no justification for making stuff up because we like it.

Pronunciation is already conveyed by IPA, as covered by the Wiktionary:Pronunciation guideline. If someone would like to supplement the IPA pronunciation with some other standardized phonemic or phonetic system, as is done for many English words, that would be fine. This belongs in the "pronunciation" section of an entry, not next to the headword and not in a "translations" section. This is where syllabification, stress, and akanye are also represented.

(By the way, spoken stress is an attribute of pronunciation, and is also represented in IPA. In Cyrillic dictionaries it is represented by accents on the headwords because these are more-or-less phonemic writing systems, but English dictionaries have separate pronunciation guides. Acute accents for stress are a foreign convention to English-language readers, and do not belong in English-language Wiktionary. [Hyphenation is an attribute of orthography represented in the headwords of English dictionaries, but syllabification is a separate attribute of pronunciation. see Wiktionary:Pronunciation#Hyphenation])

The purpose of romanization here is to convey the original orthography to English readers. We list headwords in English, not in IPA. The main headword for his is not hɪz or hĭz. For the same reasons, the main Latin-alphabet representation of the written word его in Russian should be jego (depending on the system), and certainly not jevó, yevo, ievo or jɪˈvo.

Wiktionary's transliteration guide does not clearly convey these principles. If we can agree what they are, then it will be clearer what to do here. I've started a discussion at Wiktionary talk:Transliteration#Purpose. —Mzajac 00:05, 7 March 2008 (UTC)

Acute accents for stress are a foreign convention to English-language readers, and do not belong in English-language Wiktionary. - This kind of argumentation is flawed. Why shouldn't common Slavistics practice for indicating stress/pitch accent be used for foreign language lexemes on English wiktionary? This is English wiktionary, but the words are not English only. Putting it in transliteration for languages written in Cyrillic script is a good way to get rid of combining diacritics. --Ivan Štambuk 00:18, 8 March 2008 (UTC)[reply]
Well, stress is an attribute of pronunciation, not orthography, and it is already indicated in the pronunciation section here (which is normally absent in Cyrillic dictionaries). We have two separate things: transliterations and pronunciation, and it could be confusing to unnecessarily mix their functions. Since this is a dictionary for English readers, but of words in all languages, we should stick to the basic English conventions, and not add exceptions for particular languages or writing systems where possible.
The argument of using common slavistics conventions does carry weight, however. But are accents commonly added to transliterations in English-language slavistics publications, or are they only used on the Cyrillic in dictionaries? I don't have a lot of paper references at hand. —Mzajac 01:23, 8 March 2008 (UTC)
I just had a look through the Ukrainian language and writing articles in my copy of Ukraine: a Concise Encyclopædia (1963), and indeed, I see that accents are used to indicate stress with scientific transliteration. This appears to be used both in transcription of spoken words and transliteration of written words, although the distinction isn't always clear to me. So there is some literature supporting this practice.
Note that this may still not be suitable for use with certain transliteration systems for some languages which use diacritics over vowels. —Mzajac 01:43, 8 March 2008 (UTC)
Well, the Russian dictionaries that permit some kind of preview on b.g.c. all appear to use diacritics on Cyrillic letters. I never really paid attention to transliterated Russian words whether they use accent marks or not, but I'm pretty sure that when it's used on various vowels, it was used to indicate stress. For what it's worth, Derksen's Slavic inherited lexicon (online version of a printed book) and Kortlandt's book on Slavic accentuation indicate the stress on transliterations via diacritics, although Stang's classical book "Slavonic accentuation" uses them on Russian Cyrillic letters. So stress marks appear to be used when they are important to note, whatever writing scheme (transliteration or original orthography) the author chooses to use.
Languages with relatively decent phoneme-to-grapheme ratio rarely need tools such as IPA. Indicating stress is sometimes of utmost importance (for example, in some South Slavic languages some of which are written in Cyrillic too, where e.g. 'novina' can mean 4 different things depending on how you pronounce it). The whole idea of =Pronunciation= section here on wiktionary is ill-suited for highly-inflective languages in which stress/pitch accent is not part of standard orthography (unlike e.g. polytonic Greek), but is very important in practice, especially its mobility in various cases. =Pronunciation= section idea is based on a fact that the knowledge on how to pronounce a lemma form would be enough. There's no way in entries such as он (on) or on but to indicate it using common stress markers inside the declension tables.
Be it inside the headword on Cyrillic letters, or in transliterations, accent marks are pretty important. Bear in mind, however, that thousands of Russian entries here already mark stress in transliterations, and doing that is much easier than figuring out how does one put acute on "ю". --Ivan Štambuk 10:15, 8 March 2008 (UTC)[reply]
Yes, at least in Russian dictionaries used by Americans, acute accents are standard fare. I’ve only seen one dictionary that did not use acute accents, my chemical and polytechnical Russian-English dictionary, which does not mark accents at all, nor even ё. —Stephen 17:18, 8 March 2008 (UTC)
I'm starting to believe that acute accents may be suitable for stress.
Some details need to be worked out. Do they belong on Cyrillic headwords? In translations, they don't seem to belong on the Cyrillic which must be linked его́ (jego), but on the transliteration его (jegó). —Mzajac 00:33, 9 March 2008 (UTC)
The only complaints I encounter are due to the lack of the accent in the Cyrillic. People who can read the Cyrillic are bothered by the need to look to the transcription to see the stress. The only reason that I haven’t been putting the acute in all the Cyrillic headwords is the difficulty of inserting it. It is not on any of my keyboards, and the only way I can do it is to copy and paste. A second problem is how it appears in Cyrillic text. On some systems, it looks fine (depending on the fonts installed and probably also the browser)...but on many systems the acute accent rides very high and off to the right. This is how it looks on my screen (using Firefox and WinXP Pro), and it really looks unprofessional. It can be fixed on some systems (hopefully almost all systems) by using the template {{Cyrl|...}}, but this adds more work to creating the entries. If the acute were easier to type, I think it would be best to use it everywhere except in the PAGENAME.
There might be a way of "subst-ing" an entry to produce the desired results automatically, but so far I have not found it.
When I write lengths of Cyrillic text, as in examples, I like to add the accents, but that entails extra trouble for words that are linked: Не прочита́в моего́ письма́, он рассерди́лся. —Stephen 02:01, 9 March 2008 (UTC)[reply]
Big discussion, I haven't read everything, will reread again but my position is to show the rare exceptions as they are pronounced. It really doesn't make much difference, Russian is phonetic in 99% of cases, but providing the information on exceptions and the word stress is useful. BTW both Иванович and Иваныч exist, the former being more standard and more commonly used in writing, especially documents but Iványč is more colloquial and sometimes has a friendlier sound. There is no confusion here - there are two forms.
I'll try to list some exceptions:
Other exceptions, which are not worth any change in transliteration because of consistency:
  • о (o) is reduced (close to "a") in an unaccented position but is still romanised as "o": молоко (molokó), корова (koróva)
  • Endings -ться/-тся are consistently pronounced as /tsə/ but we still romanise -t'sja/-tsja родиться (rodít'sja).
I am strongly against removing word accent from Russian transliterations. --Anatoli 11:51, 17 April 2011 (UTC)

How do we name this case?

три час'а? в два ряд'а? Some call it счётный падеж in Russian; what is the appropriate term in English, if any? Do we need to bother at all, or should we be happy with treating it as just a special case of the genitive? And for час it's not even there.

I don’t think it is ever considered as a different case in English. We just look at it as an irregularity in the genitive case. If there were numerous words, a case name would be helpful, but there are only a few that I can think of. There are many more in what may be called the partitive-genitive case, and of course some in the vocative and some in the locative (as opposed to the prepositional). I have fixed час, it was overlooked before. There might be a better way to do it. —Stephen 15:07, 16 January 2008 (UTC)[reply]

Romanization critique

My observations on the problems and differences between the transliteration chart at Appendix:Russian transliteration and Wiktionary:About Russian#Romanization. I am assuming that both of these are based on the scholarly linguists' system of w:scientific transliteration.

The contradictions and pell-mell changes introduce inconsistencies from any accepted standard, add ambiguities in transcription, and make the system much more complicated to write and to read. They yield a "transliteration" which is neither a rendering of the original orthography nor a phonetic transcription, and the phonetic elements are a mix of central European and English-language equivalents. The other simple chart has some problems, but the method here is a complete mess, mixing elements of scientific transliteration, GOST transliteration, BGN/PCGN transliteration, made-up rules, and even IPA phonetic transcription.

Easily resolvable problems:

  • The simple chart transliterates Cyrillic х as Latin x, while this method offers "x, kh" (both versions are found in different references for scientific transliteration, but there's no reason to recommend two different options). The original system uses only "x".
  • The simple chart offers neutral double quotation marks for the hard sign, while both charts offer typographic (curly) quotation marks for the other soft and hard signs (an obvious mistake, which I'll correct shortly [done —MZ])

Arbitrary additions, exceptions, and borrowings

  • Both charts transliterate Cyrillic ц as "ts", instead of the original "c".
  • Both charts transliterate Cyrillic ё in two ways: "jo, o", instead of the original "ë".
  • Both charts transliterate Cyrillic е in two ways: "je, e", instead of the original "e", with the simple one lacking an explanatory note.
  • Both charts transliterate Cyrillic э as "e", instead of the original "è".
  • This method adds a list of mostly arbitrary phonetic rules under the heading "Additional considerations"
  • This method also adds the Cyrillic dictionary convention of using an acute accent for syllabic stress, which is problematic for several reasons. In English dictionaries stress and syllabification (not hyphenation) are normally indicated in the pronunciation, not in the headword. The use of acute accents this way is unfamiliar to English-language readers, and can be mistaken for normal orthography on Cyrillic characters (cf. Ѓ, Ѐ, Й, Ѝ, Ќ, Ў), and for pronunciation (as in café, etc.) or letter differentiation on Latin characters (several Romanization systems use diacritics to represent different Cyrillic letters).

If this is a guide to pronunciation, then it belongs only in the "pronunciation" section of a page, as a supplement to IPA. If it is a true transliteration representing the written word, then it belongs next to the headword or in the "translations" section. If this chart is a work in progress, it belongs on a talk page, and not presented without comment on the main page about Russian in Wiktionary.

My recommendations:

  1. This chart should be immediately removed from the guideline page, because it is contradictory and seriously flawed.
  2. For orthographic transliteration, a standardized method for w:romanization of Russian should be chosen to be used next to Cyrillic in headwords and "transliteration" sections. This should replace Appendix:Russian transliteration.
    1. The linguists' method of scientific transliteration is appropriate for a dictionary, and could gracefully replace the current practice.
    2. A bibliographic system for English, like ALA-LC or BGN/PCGN would be a suitable alternate.
    3. GOST or ISO 9 may also be suitable, but these are rarely used and unfamiliar.
  3. For phonetic transcription, a standardized method could also be chosen for use as a complement to IPA in "pronunciation" sections. The only thing I've seen in literature is a phonemic transcription using the same conventions as scientific transliteration. Perhaps there is also a suitable convention based on older English-language dictionaries. See w:Pronunciation respelling for English for an outline of 16 systems, and w:Pronunciation spelling. Oxford dictionaries used to have a selection of symbols for some foreign sounds, but apparently none for Slavic languages, e.g. indicating palatalization.[*]
  4. Syllabic stress should not be indicated using acute accents. Stress should only be indicated in "pronunciation" sections according to English dictionary conventions. Possible conventions come from IPA (/jɪˈvo/) or from older English-language dictionaries (jĭ·voʹ). [update: as discussed above, there is precedent for using accents in scholarly transliteration or transcription in slavistics —Mzajac 01:57, 8 March 2008 (UTC)][reply]
  5. We should avoid inventing new methods, unless we can prove that we are more qualified than the scores of linguists, librarians, academics, and publishers who use established standards. If standards can't solve every romanization/transcription problem, then let's at least try to follow a published example.
Mzajac 23:35, 7 March 2008 (UTC)[reply]
The difference between "x" and "kh,x" is due to an evolution in application. Originally, it was "kh" only, but then there was a significant trend toward the use of "χ". Eventually, "χ" was replaced with "x", both for ease of typing and because it’s a common convention. After some time, the "kh" fell into disuse.
The reason for the "straight" double quotes is ease of typing. The curly version is too difficult for the tiny benefit it gives.
I much prefer the use of "ts", but there has been a trend toward the use of "c"; in many cases, the "c" works well enough, but in some words, it is confusing to nonlinguists (e.g., "francúz").
Which phonetic rules is it that you dislike?
In my English dictionary (Random House), stress is indicated in the headword.
The use of the acute accent this way is the only way I, as an American, have ever seen it done for foreign languages.
The pronunciation sections are for IPA or SAMPA. I don’t put IPA or SAMPA onto pages, and, like most Americans, I don’t use them when others put them. But they are certainly welcome if anyone wants to add them. The transcriptions are not suited to the pronunciation sections and don’t go there.
Not just this chart, but this entire page is a work in progress.
Before you mess with our years of work, first you have to take into consideration the audience. Native Russians who want to know what a Russian word means use the Russian Wiktionary. Russian entries in the English Wiktionary are intended for native English speakers. Linguists and other English-speakers who have at least a working knowledge of Russian use the Cyrillic, not the transcription. They only resort to the transcription to check to see where the accent falls. The only people who depend heavily on the transcriptions are casual users who know almost nothing about the language and who are not really interested in learning that much. These casual English-speaking nonlinguist users don’t know Cyrillic, usually don’t know IPA, and don’t really care to learn. They just want an approximate spelling in Roman letters.
For this reason, while ц as "c" is okay before "e" or "i", it is very confusing in other cases. For the same reason, "x" and "j" are problematic. The "x" and "j" have been adopted because there has been a slight consensus for it. There is a similar consensus for "c" instead of "ts", except when the "c" is not followed by "e" or "i".
As far as I can tell, the only differences that you see between this chart and the other is the addition of "x" to one and the addition of a short explanatory note in one. All that needs to be done is add them to the other.
As for choosing a romanization, this is where you have to consider the purpose of the romanization and the readership that will use it. The standard romanizations are intended for linguists, librarians and others who have specialized needs (most of these specialized needs have become moot since the advent of Unicode and the Internet...the people who used to need these special standardizations now have the original Cyrillic script, which is better than any romanization). On the English Wiktionary, the users of the romanizations are neither linguists nor Russian speakers. They are only interested in one or a few words in a rather general way.
We have given considerable thought to the current system, and the system is gradually changing as we sense changes in readership, education and culture.
Except for linguists and wordsmiths, the so-called syllabic accent is not well understood. And since there are similar symbols being used in different ways in different transcriptions, even I never know what they mean unless the word in the original script is one that I know. The syllabic accents are fine for the IPA/SAMPA in the pronunciation sections, but they are confusing and ambiguous in the transcriptions.
The only changes that I think are acceptable are adding "x" to this list (not really important, because this page is not yet in use), and adding "c" in addition to "ts" on both charts. —Stephen 15:16, 8 March 2008 (UTC)[reply]
You're telling me that the system you've devised is superior to over a century of linguistics practice? Enough to justify being completely inconsistent with linguistics papers which are still being published? Is this based strictly on the "changes in readership, education and culture," which you "sense", or can you show me some documentation or discussion to back this up? —Mzajac 00:29, 9 March 2008 (UTC)[reply]
I’m saying that the linguistic practices were propelled by need, and that need is now suddenly past. Linguists do not have to settle for one or another of numerous transliteration schemes anymore. Now they can have the real McCoy. The people who actually need Roman transcription are nonlinguists who do not read Russian or Cyrillic. I come across lots of people who want to know, in a general way, how to write something in Russian in Roman letters, but I have NEVER received such a request from a native Russian or from anyone who spoke Russian. Everyone who knows Russian and who also knows the Roman alphabet can easily produce Roman transliterations at will, in any system they favor. The people who cannot do it are those who do not know Russian or Cyrillic and who don’t care to learn much of it. The people who would conceivably use one of your transliterations do not need your or my help to do it. —Stephen 01:38, 9 March 2008 (UTC)[reply]
Please spare us the personal anecdotes and speculation. There is no "my transliteration". I am defending a century of linguistics practice against the made-up "transliteration" described on this page.
Linguists use transliteration today, in print and electronic publishing. For example, the Slavic and East European Journal prefers transliteration over Cyrillic.
Do you have any documentation of your "years of work"? Is it demonstrably better than current linguistics practice? Can you cite a publication supporting your opinion that the need for transliteration "is now suddenly past?" —Mzajac 08:21, 9 March 2008 (UTC)[reply]
You know very well that "one of your transliterations" means one of the transliteration schemes that you favor. If you want me to "spare" you (and, according to you, everyone else) my ideas and beliefs (or anecdotes, speculation, and whatever other negative terms you come up with to describe it), and deliberately refuse to understand the meaning of things like "one of your transliterations," then there is no point in my trying to continue this conversation with you. I don’t know what you are referring to when you quote "years of work", and you have not been able to understand what I’m saying to you. You have become offensive and you can talk to someone else after this. —Stephen 20:04, 10 March 2008 (UTC)

Starting again

I realize that I've been getting too wound up about this. I apologize to all for my tone. I'd still like to try to clarify my concerns about transliteration, if anyone will listen.

I think there is still a need for orthographic transliteration, as opposed to phonetic transcription, and it is important to be consistent with academic practice. Part of the difficulty in the discussion here is because we can't agree on basic implementation goals. I'm going to give this a rest for a bit here, and work on clarifying some points in the general transliteration policy at Wiktionary talk:Transliteration. —Mzajac 21:07, 10 March 2008 (UTC)[reply]

Transliteration proposal

My understanding is that Appendix:Russian transliteration is the current guideline, and the material on this page is a work in progress. To keep this clear for editors, I'm moving the latter to this talk page, and leaving a clear link and explanation on the guideline page. —Michael Z. 20:48, 12 April 2008 (UTC)[reply]

Romanization

Russian alphabet
А Б В Г Д Е Ё Ж З И Й К Л М Н О П Р С Т У Ф Х Ц Ч Ш Щ Ъ Ы Ь Э Ю Я
а б в г д е ё ж з и й к л м н о п р с т у ф х ц ч ш щ ъ ы ь э ю я
Roman equivalents
а б в г д е ё ж з и й к л м н о п р с т у ф х ц ч ш щ ъ ы ь э ю я
a b v g d e, je* jo, o** ž z i j k l m n o p r s t u f x, kh ts č š šč ” y ’ e ju ja

*The Russian е is transliterated je word-initially, after a vowel, and after ъ or ь (въехать = v”jéxat’); otherwise, just e (берег = béreg).
**Transliterate ё as jo with these exceptions: after ж, ч, ш, and щ, simply write o (чёрт = čort).
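
The chart and its two footnotes can be read as a small set of context rules. The following is only a minimal sketch of that reading, not the actual Wiktionary implementation: the function name `romanize` and the table names are hypothetical, "x" is chosen over "kh" for х, the signs ъ and ь are rendered as ” and ’ as in the example въехать = v”jéxat’, and stress marks and capitalization are ignored.

```python
# Illustrative sketch only; names are hypothetical, not Wiktionary's own code.
# One-to-one letters from the chart ("x" chosen for х; ъ/ь as ” and ’).
BASE = {
    "а": "a", "б": "b", "в": "v", "г": "g", "д": "d", "ж": "ž", "з": "z",
    "и": "i", "й": "j", "к": "k", "л": "l", "м": "m", "н": "n", "о": "o",
    "п": "p", "р": "r", "с": "s", "т": "t", "у": "u", "ф": "f", "х": "x",
    "ц": "ts", "ч": "č", "ш": "š", "щ": "šč", "ъ": "”", "ы": "y", "ь": "’",
    "э": "e", "ю": "ju", "я": "ja",
}
VOWELS = set("аеёиоуыэюя")
SIGNS = set("ъь")
HUSHERS = set("жчшщ")
LETTERS = set(BASE) | {"е", "ё"}


def romanize(text):
    """Romanize unstressed Cyrillic text following the chart's two footnotes."""
    out = []
    prev = ""  # previous Cyrillic character; "" means start of text
    for ch in text.lower():  # assumption: letter case is discarded
        if ch == "е":
            # Footnote *: "je" word-initially, after a vowel, or after ъ/ь.
            iotated = prev not in LETTERS or prev in VOWELS or prev in SIGNS
            out.append("je" if iotated else "e")
        elif ch == "ё":
            # Footnote **: plain "o" after ж, ч, ш, щ; otherwise "jo".
            out.append("o" if prev in HUSHERS else "jo")
        else:
            # Anything outside the chart (spaces, punctuation) passes through.
            out.append(BASE.get(ch, ch))
        prev = ch
    return "".join(out)


print(romanize("въехать"), romanize("берег"), romanize("чёрт"))
```

Keeping the two context-dependent letters out of the lookup table makes it easy to see which parts of the scheme are plain substitution and which are exceptions.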

Additional considerations

  • Indicate the stressed syllable with an acute accent: áéíóúý.
  • In words where г is pronounced x or h, write x (or kh) or h instead of g (бог = box; господи = hóspodi). Where г is pronounced v, write v: того = tovó.
  • When ч is pronounced š, write š (что = što).
  • When е is pronounced hard like э, you can indicate this with ɛ: резюме = rezjumɛ́.
  • When a letter is not pronounced at all, do not write anything: пожалуйста = požálsta or požálusta.
  • In the case of the reflexive verb endings -ться and -тся, use ’ in place of j: одеваться = odevát’s’a; одевается = odevájets’a; одеваются = odevájuts’a.
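
As an illustration of the first point in this list, a stress mark can be added to a romanized vowel by appending a combining acute accent (U+0301) and normalizing to a precomposed letter. This is a sketch under assumptions: the helper name `add_stress` and its character-index argument are hypothetical, and the caller is assumed to know which vowel is stressed.

```python
import unicodedata


def add_stress(romanized, index):
    """Put an acute accent on the character at `index` (hypothetical helper)."""
    # Insert the combining acute accent right after the stressed vowel...
    marked = romanized[:index + 1] + "\u0301" + romanized[index + 1:]
    # ...and compose it into a single code point (á é í ó ú ý) where one exists.
    return unicodedata.normalize("NFC", marked)


print(add_stress("bereg", 1))  # stress on the first "e"
print(add_stress("tovo", 3))   # stress on the final "o"
```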

Doesn't Unicode end the need to agonize over transliteration?

I'll grant the need for transliteration back in the days before unicode, but in this day and age what's the point? For anyone already familiar with the Latin alphabet, learning the cyrillic alphabet is no big deal, and pretty much ends the problem. Learning the genuine alphabet is no harder than nailing down the rules for just one transliteration scheme, let alone juggling multiple schemes, as we realistically must do when we read transliterated material.

If you don't want to russify your computer with a keyboard switcher, you can always go online and bring up a яшерты keyboard, type out a word, and paste it into Google or look it up in an online dictionary. Presto! No more checking what are sometimes tens of variations of transliterated words containing ё я е ы у ю щ э ч й ь ъ ж х or ц! 76.168.50.104 20:32, 25 June 2008 (UTC)

Yes, it only takes a half hour to learn it, but Americans tend to think it’s next to impossible and don’t want to try. As long as there are Americans, there will be a need for transliteration. (They also don’t usually worry about learning transliteration schemes, so any one will do as long as the letters are Roman). —Stephen 21:13, 25 June 2008 (UTC)[reply]
An open dictionary shouldn't require its readers to learn all of the hundreds of alphabets which are in use. For what, to save ourselves a bit of typing?
I don't think transliteration has ever been used for the lack of Unicode. It predates computing by many decades (centuries?). It is used in metal type, including in publications which have the respective non-Latin typefaces available, and in handwriting, inscription, etc.
No need to pick on Americans, since I'm sure that not every one of our readers from Britain, South Africa, the Philippines, China, etc. knows the Russian alphabet. Michael Z. 2008-06-26 01:12 z
I have no idea about the British, South Africans or other English-speakers, I only know Americans. For all I know, the British are as linguistically agile as the Dutch. —Stephen 01:23, 26 June 2008 (UTC)[reply]
Following Stephen, I guess if you're an author, unicode means you can give a rough-and-ready transliteration, followed by the word in proper cyrillic in parentheses. Then there's no need for a picky transliteration system that lets an informed reader reconstruct cyrillic no matter what.
But following Michael, I concede that you need to enable Wiktionary users to enter words in all forms they will encounter "in the wild". Then redirected entries are our friend, so we don't have to write and maintain parallel entries in full detail. LADave 20:06, 26 June 2008 (UTC) (O.P. who neglected to log in above)
Most transliteration systems aren't truly reversible, but transliteration should be rendered consistently, so that a familiar reader will generally recognize the word, and be able to draw parallels between related words in the same and related languages, and to the same transliteration in other publications. (This is also why Wiktionary's exceptions to the linguistic system for Russian make it worse, not better.)
As a rule, we don't create entries for transliterations of words, unless they are attributed English words in their own right (so troika, but not ljubit’), but including them in main entries may also help the reader find them through site search or Google. Michael Z. 2008-06-26 21:10 z
[re: reversibility] I disagree. I think some transliteration schemes do aim for reversibility. IMO that goes past a point where everyone is better off using indigenous systems. Those folks mean well, but they are trying to square the circle, turn lead into gold, and count how many angels can fit on a pin.
Rough-and-ready transliteration is fine as long as people just want to read in translation and get a taste of a foreign tongue without bothering with nitty-gritty details. Consistency is good. Obsessing over systems is bad. LADave 23:44, 26 June 2008 (UTC)[reply]
I'm not sure what you disagree with. Yes, some schemes do aim for reversibility. w:ISO 9:1995 does so, but I have never seen it actually used. The strict version of ALA-LC and BGN/PCGN for Slavic languages do too, but they are rarely used that way in practice. We have chosen to use linguistics systems for many languages in Wiktionary, which is not reversible by machine, but reasonably unambiguous for anyone who's used to it, if one knows which language is being transliterated. I'm not sure what you mean by "rough-and-ready", but Ius͡hc͡henko, Yushchenko, Juščenko, and Juschtchenko just don't mix, so it's best to follow the chosen convention.
I disagreed with "Most transliteration systems aren't truly reversible". IMO, academics tend to favor reversible schemes even if the laity doesn't.
Rough-and-readiness - Well, if старый becomes stary and Невский becomes Nevsky, that qualifies. Unfortunately alternatives are awkward in print. Горбачёв to Gorbachev and Хрущёв to Khrushchev lead us seriously astray. If е is sometimes ye, sometimes e, and э also becomes e, that's rough-and-ready too. On the other hand, я to ya seems perfectly fine. Do we need ia or ja? Probably not. How's this for a definition: If a transliteration system misses distinctions that matter to an adult, native speaker with not more than a high school education, it's rough-and-ready. LADave 07:34, 27 June 2008 (UTC)
In Wiktionary we would use Nevskij, Gorbačov, Xruščjov. Details at Appendix:Russian transliteration. Michael Z. 2008-06-27 14:00 z
As a language learner, I'm far better off concentrating my attention on the writing system native speakers use. If I were a lowest common denominator user, I would not want to bother with special characters. If I were a linguist, I would probably want a universal phonetic platform, not something language-specific. I'm not sure what audience the proposed scheme serves. LADave 19:58, 27 June 2008 (UTC)[reply]
Linguists can read the Cyrillic and don’t need a transliteration. When they publish something on paper, they all have their own favorite transliteration schemes, which they know very well and need no help with. If they have a lot of material that needs transliterations, it is very simple to write a Word macro to make the desired transliterations. The only people who need our transliterations are the casual users who are not linguists and are not students of the language. —Stephen 20:30, 27 June 2008 (UTC)[reply]
Many academics use transliteration, including linguists. When they publish in journals, they are often required to use one or another transliteration scheme. Many journals specify a standardized scheme or a house style, and in either case often provide a reference. Some Word macro won't help a reader digest the scores of languages found in Wiktionary's entries.
  • Folklorica: Journal of the Slavic and East European Folklore Association: “All words and quotations in a language using the Cyrillic alphabet must appear in the Library of Congress transliteration system; Folklorica does not publish Cyrillic text.”[1]
  • Journal of Linguistics: judging from its several examples, it assumes that all foreign-language text is transliterated.[2]
  • Journal of Slavic Linguistics: “Cyrillic examples should be cited in standard "scientific transliteration". Cyrillic may be used if some orthographic point is at issue, or if reference is made to uniquely Cyrillic designations, e.g., dictionary symbols.”[3]
  • Slavic & East European Journal: “ When using transliteration please follow the LC system, except for papers in linguistics and pedagogy, where the international system may be used (see transliteration charts published regularly in the Journal). Whenever possible, please use transliteration instead of Cyrillic, since this broadens the potential readership of the journal and is less expensive to set.”[4]
  • Slavic Review: “Library of Congress transliteration and the Chicago Manual of Style are to be followed”[5]
  • Slavonic and East European Review: “Words in Cyrillic, Greek, Arabic, Hebrew and so on should be underlined and transliterated (unless you are quoting a passage, in which case it is best not to transliterate);” “In linguistics articles, specimen words are underlined and followed by their translations in single quotations marks, for example, izba, ‘hut’.” Some personal and place names must also be transliterated. “All Cyrillic must be transliterated, except in linguistics articles,...” “Names transliterated from Cyrillic must be in the house style transliteration,...” “For linguistics articles, use the system in Table B. [scientific transliteration with х=ch]”[6]
  • Studies in Slavic and General Linguistics: “Bij Russische (Bulgaarse enz.) titels kun je kiezen tussen weergave in het cyrillisch en in translitteratie” (In Russian [Bulgarian etc] titles, you may choose between Cyrillic or transliteration)[7]
I don't see the point of pigeon-holing readers into "casual users" and "students of the language" as if there were a huge unpopulated gap in between, nor do I see any justification for creating an ad hoc transliteration system for Wiktionary. There are hundreds of kinds of readers for this reference and it is useless to pretend we understand all of their needs. The content of this open dictionary should be as standardized and accessible as possible. Michael Z. 2008-06-28 01:46 z
[re: “Many academics...”] Needing transliteration in order to read is one thing. Needing transliteration in order to publish is quite another. Linguists doing a paper on Russian or the Slavic languages do not need to see any transliteration, because they are well versed in Cyrillic. Furthermore, they need no help in converting Cyrillic into their favorite transliteration, and they can do it effortlessly. I repeat, linguists do not need a transliteration of Cyrillic, they can read Cyrillic perfectly well. When they publish something, they have their favorite systems and they need no one’s help to make the transliterations that they prefer. Anyone who is publishing a work on Russian or the Slavic languages, and who needs a lot of material transliterated, can do it very easily and faithfully with a Word macro. —Stephen 02:09, 28 June 2008 (UTC)[reply]
[re: “Folklorica...”] So you are claiming that the writers could not read Cyrillic and had to have everything transliterated for them? Or do you mean that the writers COULD read Cyrillic, but did not know Roman, and so had to have help to make the transliterations? My point is that the writers did not need transliterations because they can read Cyrillic perfectly well, and they did not need outside help with making their transliterations because anyone who knows Cyrillic and Roman can do it. —Stephen 02:15, 28 June 2008 (UTC)
[re: entries] I don't think that goes far enough. If you read in translation, you might have a "Vanya" or a "Sashenka" suddenly popping up. Not words that have found their way into English yet, but if reputable translators are prone to assume too much of their readers, Wiktionary probably should ride to the rescue. The point of having a dictionary is to be able to read over your head. LADave 23:44, 26 June 2008 (UTC)[reply]
If proper names meet WT:CFI by showing up in three English sources, then they may be added to Wiktionary. Michael Z. 2008-06-27 01:40 z
I'm fine with that in principle. Three sources seems like a good number. Having to verify might be a PITA, but less so with electronic texts -- assuming you also have software doing the grunt work. LADave 07:34, 27 June 2008 (UTC)[reply]

Shouldn't diminutives of names have separate entries?

For example under Александр I see the diminutives Саша, Шура. But let's say someone gets called Sashenka in the novel I'm reading and I don't have a clue who they're talking about. It would be good to be able to look it up and be directed to Александр or Aleksandr (which has no entry, by the way).

A great project for a rainy day, I'm sure. Nevertheless do we need a guideline that if you create an entry for an имя, it's good form to create linking entries for its diminutives?

Similarly we have colloquial reductions of patronymics, e.g. Mихайлович -> Михайлич, Иванович -> Иванич and maybe even Ванич. 76.168.50.104 21:07, 25 June 2008 (UTC)[reply]

Yes, they should have individual entries, with links to the base name and related forms. Whenever I make an entry for a name, I include links to diminutives, etc. See for example Иван. —Stephen 21:17, 25 June 2008 (UTC)[reply]
Стюра - you created links but not entries, and the entries actually don't exist yet. So Ваня doesn't go anywhere and Vanya only goes to an unrelated homonym. Ivan oddly doesn't show the Russian instance, although there are entries for other Slavic languages.
Так - Shouldn't we have guidelines encouraging actual entries for diminutives of names in cyrillic (and in transliteration)? They could just redirect to the main имя. A Ваня entry could simply redirect to Ваня under Иван. To help newbies we can create formal definitions of Russian diminutives, endearments and pejoratives and always link to them.
In the same way, we could create redirection entries for patronymics including variations Иванович -> Иванич, Ванич etc.
LADave 19:42, 26 June 2008 (UTC) (O.P. who neglected to log in the first time)[reply]
Yes, I created the base name with redlinks. Red links are an invitation to write. If I didn’t want the entries, I would not have made the links. Ivan doesn’t show the Russian only because no one has begun a translation section there. You may if you wish. John does have the Russian.
Diminutives should not be simply redirects, they have to be full-fledged entries, including declension. —Stephen 18:13, 27 June 2008 (UTC)[reply]
Why not redirect and add inflections to the main имя entry? Так
Саша
becomes
Саша (Сашу, Саши, Саше, Саше, Сашей)
Assuming omitting plural forms isn't a mortal sin when it comes to proper nouns. Better yet, put it in a table with grammatical cases across by diminutive variations down.
The advantage over full-fledged entries for each diminutive would be calling the reader's attention to the имя context in all its rococo glory, while not neglecting inflections. LADave 20:34, 27 June 2008 (UTC)[reply]
That would mean that Иван would have to have twelve declension tables just for the forms that are there right now (not every possible form has been added yet). That would make the page extremely unwieldy and we just don’t do it like that.
The plurals have to be included unless a word has no plural. There are some cases where I have not bothered to add the plural declensions where no plural exists, yet a plural is theoretically possible (as in "two Californias"). These could, and probably should, have the plurals added in. But when plurals actually exist, then they really need to be included.
Either in the etymology of the diminutive or in its definition line (or both), it must be pointed out that it is a diminutive, pejorative, etc., of the linked base name.
Off the top of my head, the only valid redirects that I can think of for Russian are redirects of е spellings to ё spellings (еще redirects to ещё). But if another language, word or form has that spelling, then it can’t be a redirect (мед). —Stephen 20:55, 27 June 2008 (UTC)[reply]
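
The redirect rule described in the comment above can be stated as a simple check. This is a hedged sketch, not how Wiktionary actually decides: `can_redirect` and the `existing_entries` set are hypothetical stand-ins for whatever lookup an editor or bot would really use.

```python
def e_spelling(word):
    """Return the spelling with every ё written as е."""
    return word.replace("ё", "е").replace("Ё", "Е")


def can_redirect(yo_word, existing_entries):
    """True if the е-spelling may be a plain redirect to the ё-spelling.

    `existing_entries` is a hypothetical set of page titles that already
    need their own entry (another language, word, or form).
    """
    plain = e_spelling(yo_word)
    # No ё in the word means there is nothing to redirect from.
    if plain == yo_word:
        return False
    # A title that is a word in its own right has to stay a real entry.
    return plain not in existing_entries


print(can_redirect("ещё", set()))    # еще is free, so a redirect is possible
print(can_redirect("мёд", {"мед"}))  # мед is taken, so no redirect
```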
Plural имени almost seem hypothetical. Apparently it's bad form to give a full имя all by itself, so at least two of someone's names would need to match up before plurals should come into play. But then if it's OK to use diminutives in isolation, does this give rise to populations of plurals like Вани = Ваня-а + Ваня-б +...+ Ваня-н "in the wild"? Methinks the rule of three instances comes into play. If there aren't enough instances, leave it out.
Re tables, I don't think it's actually so unwieldy. One table for neutral diminutives, one table for endearments and one table for deprecatives. It's simpler without plurals, but you could have two rows of inflections each:
     им.   вин.   род.   пред.   дат.    тв.
ед.  Саша  Сашу   Саши   Саше    Саше    Сашей
мн.  Саши  Саш    Саш    Сашах   Сашам   Сашами
ед.  Шура  Шуру   Шуры   Шуре    Шуре    Шурей
мн.  Шуры  Шур    Шур    Шурах   Шурам   Шурами
(poorly formatted, but hopefully clear enough)
Re homonyms, If they exist in other (cyrillic) languages, isn't it kosher to disambiguate and then redirect?
Re redirection, it seems to me there are fresh opportunities to explore in electronic dictionaries. Redirection is a PITA in paper dictionaries, but not here.
Another redirection possibility -- all verbs within the -ходить/-йти paradigm -- such as находить(ся)/найти(сь) -- might be redirected to a unified treatment. My reasoning is that the sooner a reader sees this particular forest and not just the occasional tree, the better. LADave 09:22, 28 June 2008 (UTC)[reply]
I don’t follow you. Plurals are often not easy to figure out, and they are an integral part of the grammar when the plural exists. If the nominative singular exists, it gets the full range of grammar, including forms which may be difficult to find examples of. If you have a group of friends over, there could be several Vanyas and several Sashas. You might ask the Vanyas to run and hide, and then the Sashas to try to find them. You use the plural forms for that.
There cannot be "one table for neutral diminutives, one table for endearments and one table for deprecatives", because different forms can have different declensions. Each form has to have its separate page with its own complete grammar described.
For the formatting, we use {{ru-noun1}} or {{ru-noun}}, according to the editor’s preference. Sometimes a word requires a special table because it has additional forms, such as the vocative...but that’s another story.
If a word exists as an oblique form, or as a word in a different language, redirect cannot be used. That has to be disambiguated (as I pointed out in the case of мед).
Redirecting is a popular strategy on Wikipedia, but here we don’t like to use it much. One reason is that we want to have every word in every language, and a redirect in one language may well be a lemma in another. The second reason we don’t like redirects is because people often do not notice that they were redirected, and they may get the impression that the article they reached is actually for the word that was typed.
Not only do we not redirect the verbs within the -ходить/-йти paradigm—such as находить(ся)/найти(сь), we do not redirect the individual verb forms thereof, such as нахожусь. Each one gets its own separate entry. —Stephen 10:43, 28 June 2008 (UTC)[reply]
Re имя ("first" name) plurals: I was skeptical of Stephen's hide-and-seek example; decided to put this to a Google test. Ну, I found some! For example "Новорожденных в Москве модно называть Настями, Никитами, Машами, Сашами и Дашами, Максимами, Виками и Иванами" [8] . I also found "Иванами" and constructions like "Львом и Дмитрием Ивановичами". Интересно. OK, so plural имя forms are obligatory.
Re tables, in my example Саша and Шура are each declined in six cases (columns) by two numbers (rows). Adding and declining Саня and Алик would give a table of eight rows by six columns (plus captions) for all four neutral diminutives of Александр. Then the five endearing forms would be a table of ten rows by six, the four deprecatives would be eight rows by six, etc.
We don't do things like that here. Each derived form represents a different lemma and must have its inflected forms in its own separate entry. --Ivan Štambuk 22:50, 28 June 2008 (UTC)

Yandex dictionaries

I find Yandex English-Russian/Russian-English dictionary pretty good for advanced users - vocabulary and examples. More suitable for Russians than for English speakers, though (interface and explanations are in Russian). --Anatoli 22:55, 15 February 2010 (UTC)[reply]

Those are dictionaries from Lingvo. --Vahagn Petrosyan 00:09, 16 February 2010 (UTC)[reply]
Yes, then Lingvo would be a better link. --Anatoli 00:26, 16 February 2010 (UTC)[reply]

US War Department Russian Dictionary

I found out that the US War Department created an English-Russian dictionary in 1945. I think it's copyright-free. Can we try a bot-import from it?

http://archive.org/details/dictofspokruss00unitrich

http://books.google.it/books?id=Kxy5AAAAIAAJ&printsec=frontcover#v=onepage&q&f=false

17:16, 13 October 2012 (UTC)

I don’t know anything about the copyright question, but as far as I know, the dictionary is only in print, not a digital file. If you try to OCR it, you’ll get lots of errors. —Stephen (Talk) 21:55, 13 October 2012 (UTC)[reply]

Ubuntu repository has a package with a Russian-English dictionary

Ok, the US govt dictionary is not digitalized, but I found the following:

The ubuntu repository has a package containing a Russian-English dictionary.

package details

This version is already digitalized. Try a bot import from this one? Orbayaapjycja (talk) 12:15, 14 October 2012 (UTC)[reply]

Here is another package: dict-freedict-eng-rus

Function words

Function words and some notable word parts (e.g. suffixes) are not listed. I always believed that -ка and -то are particles, not suffixes. Ignatus (talk) 19:41, 27 December 2012 (UTC)

At the moment, this word is transliterated "(sólnce) n (read: sónce)". Personally, I think that's good, and clearer than e.g. его, which gives its transliteration as "jevó", even though by transliteration (change of letters from one script to another) "г" → "g" and /jɪˈvo/ is merely the word's pronunciation. But which of the two entries should be updated to be like the other? I think его (et него et al) should be changed to be like солнце: I'm not sure how many people realise that "jevó" is not just a typo for "jegó", and I have a hard time imagining that, if солнце is changed to give just "sónce" as its translit, anyone at all will recognise that the missing 'l' isn't a typo but an intentional effort to provide a second, pseudo pronunciation outside of the pronunciation section. - -sche (discuss) 21:17, 3 April 2013 (UTC)[reply]

Providing both "jegó" and "jevó" is not such a bad idea but providing the actual pronunciation is more important than giving letter-to-letter transliteration. If I had to choose between "sólnce" and "sónce", "čto" and "što" ("что") I would choose "sónce" and "što". "sólnce" can be produced by a tool but manually overwritten with "sónce". WT:RU TR says "silent consonants in consonant clusters are transliterated", so "sólnce" is our standard anyway but "г" in the -ого and -его genitive/accusative endings is "v". It's OK if a tool like Lua produces "g"; it can then be manually overwritten with "v". Misreading "г" in adjective endings is common with foreigners (including Slavs and rural Russians in South Russia and Ukraine). Similarly, it doesn't make sense to transliterate чёрный and жёлтый as "čjórnyj" and "žjóltyj", the correct value of "ё" is "ó" ("ё" is stressed in 99% of cases) - "čórnyj" and "žóltyj". The 1917 reform planned but didn't make this change, replacing "ё" with "о" after "ч", "ш", "щ" and "ж" ("чорный" and "жолтый"), but "чёрт" can also be spelled "чорт".
English words like trader and trailer are transliterated into Russian as "трейдер" and "трейлер" (not "традер" / "траилер" or "трэйда" / "трэйла"), a combination of phonetic and spelling-based rendering. Cf. the Japanese particles は and へ (letters "ha" and "he"), which are transliterated as "wa" and "e" when they are used as particles. --Anatoli (обсудить/вклад) 23:25, 3 April 2013 (UTC)[reply]
So if providing both the transliteration jegó and the pronunciation jevó is not such a bad idea, why not put the transliteration under “Transliteration” and the pronunciation under “Pronunciation?” Michael Z. 2013-06-19 04:10 z
If you read further, you'll see, even the first sentence. Transliteration is also used in translation, far away from any pronunciation section and for entries which are red-linked, not created yet.
Michael, instead of continuing to have a go at our Russian transliteration methods, could you create Ukrainian adjective declension tables? We have now at least one person willing to work with Ukrainian - User:Vedac13. I'll make it but it's not going to be too soon. (Thank you for fixing copy-edit typo -будет -> буде in Template:uk-conj-table, I don't agree with the rest of your edits but it's up to whoever is going to use it.) --Anatoli (обсудить/вклад) 04:23, 19 June 2013 (UTC)[reply]

Interesting topic about Russian pronunciation[edit]

If you're interested, I've written a bit about Moscow accent at Talk:лошадь#Pronunciation. --Anatoli (обсудить/вклад) 03:34, 19 June 2013 (UTC)[reply]

Russian transcription[edit]

I came up with a transcription-type transliteration for Russian at w:User:Erutuon/Russian transcription. It shows palatalized consonants, vowel reduction, and hard and soft postalveolars, making it easier to figure out how to pronounce the word.

Examples:

The idea is unlikely to be used on Wikipedia, where there is a strict standard against original research, but perhaps people here will find it interesting. I am not sure if it would be useful in entries. — Eru·tuon 21:40, 15 March 2017 (UTC)[reply]

Thank you[edit]

At my new job I have two coworkers from Ukraine, and with their help I've started learning a bit of Russian. I've been using Wiktionary to look up words and inflections, and I've been impressed with how extensive our coverage of Russian seems to be. I'm sure there's still a lot more to do, but this is already a very useful resource for Russian language learners like me. So, спасибо to the editors who work on our Russian entries, and keep up the good work! —Granger (talk · contribs) 14:00, 21 June 2018 (UTC)[reply]

"low colloquial" vs. "nonliterary"?[edit]

(Notifying Atitarev, Cinemantique, Useigor, Wikitiki89, Stephen G. Brown, Guldrelokk, Fay Freak, Tetromino, Per utramque cavernam): I've generally been using the label "low colloquial" to translate Russian прост. (as found in some dictionaries) and разг. сниж. (as found in other dictionaries). This is somewhat of a literal translation of разг. сниж., and has the advantage that if rendered like this: {{lb|ru|low|_|colloquial}}, it categorizes the term into CAT:Russian colloquialisms. But I'm starting to think that it might be better to use the label "nonliterary". This can be added as a general label, so that it categorizes the term into CAT:Russian nonliterary terms. Questions:

  1. Is "nonliterary" an accurate rendering of this term?
  2. If so, should we use it?
  3. Am I correct in assuming that прост. and разг. сниж. are actually the same thing?

Benwing2 (talk) 04:28, 27 December 2018 (UTC)[reply]

Some examples of "low colloquial" terms: втолка́ть, всова́ть, выла́зить, вломи́ть, долба́ть, долбану́ть, вма́зать "to hit, to slap", сло́пать, схлопота́ть, etc. Benwing2 (talk) 04:56, 27 December 2018 (UTC)[reply]
#3 - no, it depends. I see many examples where разг.-сниж. means more than just nonliterary; it instead refers to a particular rough, low register of colloquial speech. If you use a lot of such words in your speech, it carries the sentiment that you don't much like the thing that you are talking about and might not much like anything else for that matter; that you are a hard person and that you lead a hard life. Tetromino (talk) 05:24, 27 December 2018 (UTC)[reply]
In other words: both личико and харя are colloquial, but the latter is also low. Tetromino (talk) 05:30, 27 December 2018 (UTC)[reply]
Yes, what Tetromino says. Don’t call it low if you don’t know that it is. “просторечное” means just “informal”, or rather “uncouth” if this were a term of lexicography, “inerudite”. “сниж.” supposes that it is a default not to be сни́женный, which is of course not the case. If a word is “just how people talk” it is not “low”. I still find it murky what the distinction between “informal” and “colloquial” is, while we are at it, don’t ask me this question. Fay Freak (talk) 11:32, 27 December 2018 (UTC)[reply]
@Fay Freak I honestly have no idea what your response means, other than that you seem grumpy. BTW просторе́чие (prostoréčije) is defined in this very dictionary as "a variant of colloquial language that is considered to be outside the literary standard (often related to uneducated people)". This is clearly different from informal or colloquial (which are more or less synonyms). Benwing2 (talk) 04:12, 28 December 2018 (UTC)[reply]
@Benwing2 Informal and colloquial language is what is outside the literary standard (barring certain types like dramata). And thus it is everywhere and not generally low. And просторе́чие (prostoréčije) does not mean more than this, in as much as просто́й means ordinary. “Low colloquial” is an overstatement, that’s what I say, or you have to take care that the meaning of “low” is not hollowed out by constant use in a translation of a Russian term. I have already seen terms being termed “low colloquial” here when they weren’t low but the original labelling was probably right. There isn’t a guarantee that Russian usage notes condensed in single words can be mapped upon certain English expressions (the expectation of the technical terms having secure translations is mistaken), so when making the knowledge transfer to English you will either exaggerate or lose information if you don’t know the usage except from having read it from labels in Russian dictionaries, but better lose (in as much as it is better to have no label than a wrong label). Fay Freak (talk) 06:41, 28 December 2018 (UTC)[reply]

Auto-accenting and auto-adding brackets[edit]

(Notifying Atitarev, Cinemantique, Useigor, Wikitiki89, Stephen G. Brown, Guldrelokk, Fay Freak, Tetromino, Per utramque cavernam): I wrote a script to auto-bracket terms inside of usexes and quotes, e.g. женщин -> [[женщина|же́нщин]], using boldface instead where appropriate, e.g. '''женщин''', and just adding the accent outside of usexes and quotes. A few years ago, I ran a simpler version of this script. At the time, User:Wikitiki89 objected to auto-accenting quotes hidden by #*. I personally don't see the value of such a restriction, and Wikitiki hasn't been seen in a while. What does everyone think? Is it OK to add accents and brackets inside of hidden quotes? Benwing2 (talk) 05:37, 1 February 2019 (UTC)[reply]
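
The bracketing behaviour described in the post above can be sketched roughly as follows. This is only an illustration in Python, not the actual script: the `FORMS` table, the function names, and the rule for choosing boldface (the word's lemma is the page's own headword) are assumptions made for the example, and the sketch expects plain text that has not been linked or accented yet.

```python
import re

ACUTE = "\u0301"  # combining acute accent used as the stress mark

# Hypothetical lookup table: unaccented form -> (lemma, accented form).
# A real tool would build this from existing entries.
FORMS = {
    "женщин": ("женщина", "же" + ACUTE + "нщин"),
    "женщина": ("женщина", "же" + ACUTE + "нщина"),
}

# Runs of Cyrillic letters; accents and punctuation end a run.
WORD = re.compile(r"[а-яёА-ЯЁ]+")


def link_word(word, headword):
    """Return wikitext for one word, or the word unchanged if unknown."""
    entry = FORMS.get(word)
    if entry is None:
        return word
    lemma, accented = entry
    if lemma == headword:
        # Assumed rule: forms of the page's own headword are bolded.
        return "'''" + accented + "'''"
    return "[[" + lemma + "|" + accented + "]]"


def annotate(text, headword):
    """Link or bold every known word in an unannotated example sentence."""
    return WORD.sub(lambda m: link_word(m.group(0), headword), text)
```

Calling `annotate("много женщин", "кот")` links the second word to its lemma, while the same call with `"женщина"` as the headword bolds it instead. Words missing from the table are left alone, which mirrors the point that manual accenting stays possible where the automatic pass fails.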

Seems okay, since you do it anyway manually and it would save work, and one can still do it manually if the automatic version fails because the computer is not smart enough. Any detriments to the Lua memory with it? I could imagine that one would like to let the function not run on some pages with much content. Fay Freak (talk) 13:00, 1 February 2019 (UTC)[reply]
I find the links and accents in usage examples very helpful as someone who doesn't speak Russian, and wouldn't mind if they were also added to quotes. Most quotations don't have links, but Chinese quotations formatted with {{zh-x}} do at least. — Eru·tuon 22:45, 1 February 2019 (UTC)[reply]
(Notifying Atitarev, Cinemantique, Useigor, Wikitiki89, Stephen G. Brown, Guldrelokk, Fay Freak, Tetromino, Per utramque cavernam): OK, I'll go ahead and add links/accents/boldface to hidden quotes. They shouldn't have any effect at all on Lua memory usage because no Lua is called to process them. Benwing2 (talk) 18:08, 2 February 2019 (UTC)[reply]
@Benwing2: As you know, I don't object to this and would even encourage it. For foreign learners it's far more important to get the right stress and pronunciation than the graphical similarity with the original. The quotes from literature are systematically used with accents in texts for young readers and foreigners. Pls note that the normalisation will include replacing "е" with "ё" where appropriate, which is not easy. Inserting a stress mark or writing out "ё" (where appropriate) is not considered a "change" to the original text by Russians. I don't object to linking words in usexes to lemmas either. --Anatoli T. (обсудить/вклад) 05:16, 15 February 2019 (UTC)[reply]
@Atitarev: Cool. I already ran the auto-accenter on lemma pages (which accounts for 99.9% of the total set of words missing accents or links). As for changing е to ё, my code currently only does this if there's an explicit entry for the е variant that links to the ё variant. With some work I could probably have it handle most instances even without explicit entries; I do this for adjectives, for example, by recognizing the various adjective endings (e.g. -ыми) and constructing the possible stems from it. Benwing2 (talk) 05:24, 15 February 2019 (UTC)[reply]
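
The adjective case mentioned above (recognising endings such as -ыми, rebuilding the stem, and checking for a ё lemma) might look something like this. It is a minimal sketch under stated assumptions: the ending list covers only a few hard-stem endings, every candidate lemma is assumed to end in -ый, and `restore_yo` and its helpers are hypothetical names, not the real code.

```python
# A few hard-stem adjective endings (illustrative, not exhaustive).
ADJ_ENDINGS = ("ыми", "ого", "ому", "ых", "ым", "ая", "ое", "ые", "ую", "ый")


def yo_variants(word):
    """Spellings obtained by turning exactly one 'е' into 'ё'."""
    return [word[:i] + "ё" + word[i + 1:]
            for i, ch in enumerate(word) if ch == "е"]


def candidate_lemmas(form):
    """Guess -ый lemmas by stripping each matching adjective ending."""
    return [form[:-len(end)] + "ый"
            for end in ADJ_ENDINGS
            if form.endswith(end) and len(form) > len(end)]


def restore_yo(form, known_lemmas):
    """Return the ё spelling of form if it leads to a known lemma, else None."""
    for variant in yo_variants(form):
        for lemma in candidate_lemmas(variant):
            if lemma in known_lemmas:
                return variant
    return None
```

With `known_lemmas = {"чёрный"}`, the form `черными` is restored to `чёрными`, and a form whose guessed lemma is not in the set yields `None`, so nothing is changed on a mere guess.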
He doesn't edit Russian anymore, but I know @Vahagn Petrosyan disagrees with this too. I can't find the discussion, unfortunately. Per utramque cavernam 21:52, 16 February 2019 (UTC) [reply]
I do disagree, but the regular Russian contributors get to decide what is appropriate. I am not a regular Russian contributor anymore. --Vahag (talk) 07:23, 17 February 2019 (UTC)[reply]

Auto-creating pre-1918 forms[edit]

I have noticed that your auto-accenting and auto-bracketing often fails in pre-1918 quotes because the pre-1918 forms are not created. I suggest creating the pre-1918 forms automatically if they are given in “alternative forms” sections with {{ru-PRO}}, since this should be doable and if somebody has added this form he has checked that it exists. If the entries are created by bot this will fruitfully fill the dictionary with the pre-1918 spellings since then we only have to add the “alternative forms” section as compared to creating new pages, towards which there is no excitement. (Notifying Atitarev, Benwing2, Cinemantique, Useigor, Wikitiki89, Stephen G. Brown, Guldrelokk, Tetromino, Per utramque cavernam): . Fay Freak (talk) 12:59, 27 February 2019 (UTC)[reply]

@Fay Freak OK, I can look into this, if others agree. It shouldn't be terribly hard. Note that my auto-accenting bot is smart enough to infer adjective forms even if they don't already exist, but this is harder to do for nouns and verbs because the morphology is more complex and the endings are less obviously identifiable compared with adjectival endings like -ых and -ыми. Benwing2 (talk) 16:44, 27 February 2019 (UTC)[reply]
I have no objections, as long as someone checks the quality. --Anatoli T. (обсудить/вклад) 21:54, 27 February 2019 (UTC)[reply]
@Fay Freak I thought about this some more. My auto-accenting bot can automatically infer the accent and lemma for many old-style spellings; e.g. if it can't find a given word that ends in -ъ, it can check the corresponding word without the -ъ, and if it's found, it can work out the proper lemma by adding -ъ onto nouns that end in a hard consonant. It can also do the same trick when converting -аго to -ого. It can also convert any ѣ to е and check that word to find the accented syllable and the modern-spelling lemma, and derive the old-spelling lemma by checking for a {{ru-PRO}} line in the modern lemma. Even if {{ru-PRO}} isn't present, it may be able to put back the ѣ in the right place in many cases; I'd have to think about this some more though. For all of these tricks, it wouldn't be necessary to use my form-creating bot to create entries for old-style non-lemma forms. There may be some special cases where having the old-style non-lemma form entries is needed; I'll have to think about this.
One thing to note though is that many {{ru-PRO}} links are missing. For example, until I just added them, neither есть (jestʹ, to eat) nor деть (detʹ) had {{ru-PRO}} links. If you could add them to the most common words, it would be very helpful. Benwing2 (talk) 03:17, 28 February 2019 (UTC)[reply]
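
The three fallbacks listed above (dropping final -ъ, rewriting -аго as -ого, and replacing ѣ with е) can be sketched as a single normalisation step. This is a rough Python illustration only: `to_modern` and `find_modern` are hypothetical names, `known` stands in for whatever word list the bot consults, and other pre-reform features are deliberately ignored.

```python
def to_modern(word):
    """Apply the three rewrites from the post to a pre-1918 spelling."""
    w = word.replace("ѣ", "е")      # yat -> е
    if w.endswith("ъ"):             # drop the word-final hard sign
        w = w[:-1]
    if w.endswith("аго"):           # old adjective ending -аго -> -ого
        w = w[:-3] + "ого"
    return w


def find_modern(word, known):
    """Look the word up as written first, then in normalised form."""
    if word in known:
        return word
    guess = to_modern(word)
    return guess if guess in known else None
```

Checking the unmodified word first matters: a modern word that happens to end in -аго, such as благо, would otherwise be rewritten incorrectly. The reverse direction (putting ѣ back in the right place) is not covered by this sketch.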
@Fay Freak I notice you're manually adding {{ru-PRO}} links. If you want, I can probably do this automatically; I just need a text file containing a mapping from modern spelling to pre-reform spelling. You might be able to get one from slavenica.com (click on the СКАЧАТЬ tab). Benwing2 (talk) 23:09, 28 February 2019 (UTC)[reply]
@Benwing2 The file contains modern terms possibly never written like “стереотелевиденіе” though (since it is for spellchecking for hypothetical writing). Though телеви́дѣніе (televíděnije) can be found in journals of Orthodoxes, emigrants and the like. Maybe somebody else knows more appropriate data sources. Fay Freak (talk) 23:32, 28 February 2019 (UTC)[reply]
(Notifying Atitarev, Cinemantique, Useigor, Wikitiki89, Stephen G. Brown, Guldrelokk, Fay Freak, Tetromino, Per utramque cavernam): Thanks, Fay. Anyone have a good reference for mapping between modern and pre-reform spellings? I've been using slavenica.com but I've found a few weirdnesses that make me wonder how accurate it is. For example, when converting весь (vesʹ) to pre-reform spelling, it says this:
※ {вѣсь} — вѣсь день. 
⫽ {весь} — деревня, село.
But this disagrees with our dictionary, which says the pre-reform spelling of весь (vesʹ, every, all) is still весь (vesʹ).
Contrariwise, when converting век (vek) to pre-reform spelling, Slavenica says this:
※ {веко} — часть тела.
⫽ {вѣкъ} — время.
This also disagrees (in the other direction in terms of е vs. ѣ) with our dictionary, which says that ве́ко (véko, eyelid) is вѣ́ко (vě́ko) in pre-reform spelling.
Are there variations in pre-reform spelling that might account for this? Clearly the developers of Slavenica put some work into entering specific glosses like "часть тела", so it's hard for me to see it as just sloppiness. Benwing2 (talk) 23:39, 28 February 2019 (UTC)[reply]
@Benwing2: We will have to carefully check the pre-reform literature and resources. I'll join when I have a moment. Please note that this is very error-prone, it's the actual reason for the 1918 orthography reform. Spelling rules for ѣ/е were considered a nightmare in the Russian Empire, even if it's nothing compared with the English orthography problems, LOL. --Anatoli T. (обсудить/вклад) 23:49, 28 February 2019 (UTC)[reply]
@Atitarev Yeah, I've heard about how much people hated the spelling rules for ѣ/е. But I've wondered a bit about this; English has tons of similar cases where two or more ways of spelling are possible (e.g. sight/site/cite, meet/meat/mete, right/rite/write/wright, etc.), and no one seems to think this is a similar nightmare, you just learn to spell each word appropriately. The situation with 'ea' vs. 'ee', for example, is directly parallel to the situation with ѣ and е, but I've rarely if ever heard people make a big deal about this. Spanish has similar issues with b vs. v and ll vs. y, and in Latin America also s vs. c/z, but I've never heard any serious proposals there either for spelling reform. What was it about the Russian situation that made it so uniquely difficult? Benwing2 (talk) 00:00, 1 March 2019 (UTC)[reply]
@Atitarev BTW I think I can write my auto-accenting/auto-bracketing bot so it correctly handles pre-reform spellings without the need for the {{ru-PRO}} annotations in most cases, which means it would be able to work with whatever spelling of ѣ vs. е is used, even if incorrect. Benwing2 (talk) 00:03, 1 March 2019 (UTC)[reply]
@Benwing2: You're right about {весь} and {вѣко} and Slavenica seems wrong. As for ѣ/е + the final "ъ", it's one thing that made little sense in the otherwise quite phonetic script in Russian but obviously not everyone agrees, see the political rant below. --Anatoli T. (обсудить/вклад) 22:35, 1 March 2019 (UTC)[reply]
People didn’t hate the spelling rules, only the lazy and slow kids. You know how commie teenagers are today? That’s how they have been back then in Russia. The only difference is that back then they actually grabbed power so they could repaint history to their favour, obliterating what was favourable to the old and exaggerating hardships, not to speak of killing everyone who would keep up excellent education, as they always do. Russia was a normal country until the Bolsheviks came.
Like reds don’t understand basic economy they don’t understand that it matters little if it is harder to write when it is actually easier to read, since people have to read texts more than texts have to be written. ѣ gives additional distinctions into the text, there are minimal pairs like лѣчу́ (lěčú, I heal), лечу́ (lečú, I fly) but in general the character stands out and helps you, and і (i) flows well: Nowadays the texts with only и (i) and е (je) are a monotonous mass that strains the eye. The same way it is actually easier to read Fraktur than Antiqua (if you get used to it!). The socialist Hitler did not understand it either, unsurprisingly, and abolished all blackletter printing and writing in 1941 (so-called Normalschrifterlaß). Again in 1996 after the merger of the German Democratic Republic and the Federal Republic of Germany they made a “reform” to make it easier for children to write. The result is that it isn’t easier to write, the new system has several times as many rules and children are unsure how to write. Might it have been a part of the programme of former GDR cadres who have taken seat in the West to level it down to obtain gains for their God of equality? Always be wary. There is no reason to assume of a “reform” that it is intended to be an improvement for the common weal. History does not always progress.
Apparently it isn’t hard to write the old spellings either. I write them automatically. One only needs to read things in the spelling here and there. And even though today – and in Germany! – traditional spelling texts do not pour in upon me it is like this. So much easier they must have had it when there was the spelling all around.
Tl;dr: Pre-1918 Russian spelling is easy and better to read.
BTW I have now added ru-PRO forms to about half a percent of all Russian lemmata, should be quite a mass for you now, Benwing. It concerns some common stems written with ѣ, namely мѣт, вѣт, рѣш, вѣр, брѣт, ѣхать/ѣздить, цѣп, бѣг and some smaller stuff. Fay Freak (talk) 01:52, 1 March 2019 (UTC)[reply]

Middle Russian?[edit]

(Notifying Atitarev, Benwing2, Cinemantique, Useigor, Wikitiki89, Stephen G. Brown, Guldrelokk, Fay Freak, Tetromino): Hello. At Reconstruction talk:Proto-Slavic/želza, Guldrelokk and I have discussed a bit about the possible need for a new language header/language code, which would cover XVth-XVIIth century Russian, as distinct both from Old Russian/Old East Slavic and Modern Russian.

Indeed, at Talk:-ствовать, speaking about a spelling such as нравьство (nravʹstvo), Guldrelokk says that "they were normal as recently as in the 17th century and cannot be called Old East Slavic either – the period overall is uncovered".

If we were to give this period a language header indeed, I think we could call it "Middle Russian":

  • It would have the advantage of being in alignment with "Middle Armenian", "Middle English", "Middle French" and so on;
  • It has some (admittedly very sparse) currency: [9], [10]
  • Imo, it sounds better than the translation of Zaliznyak's preferred denomination: старовеликорусский, "Old Great(er) Russian", which looks a bit pompous in English.

However, this designation also has the drawback of being used to refer to the Central Russian dialects (среднерусские диалекты) as well.

Of course, there's always the option of simply tagging those words/spellings as obsolete.

What do you think?

Also pinging @-sche. Per utramque cavernam 21:08, 16 February 2019 (UTC)[reply]

@Per utramque cavernam I would feel better using an etymology-only language code and making it redirect either to Old East Slavic or modern Russian depending on which one the spellings are more similar to. Benwing2 (talk) 21:21, 16 February 2019 (UTC)[reply]
@Per utramque cavernam In Russia, this language period is often referred to as старорусский, which literally means "Old Russian", but I think a better (and more widely used) English equivalent would be "Early Modern Russian". Tetromino (talk) 21:34, 16 February 2019 (UTC)[reply]

And where are the linguistic arguments? There must be sufficient typological differences like for Middle English, for Middle French, Middle High German and so on. I can think of a lot of features that distinguish Middle High German from Old High German and from New High German and Middle English from Old English and New English, but 17th century Russian is well understandable for someone who knows Modern Russian well, and XVth century Russian is about the limit. To me it seems there are only enough differences to distinguish two stages, not three. This also aligns to what is done with the other Slavic languages: Old Czech, Old Polish, not Middle Czech or Middle Polish. And Old Church Slavonic is a synonym to Old Bulgarian and at the same time on Wiktionary covers all kinds of literary Slavic languages. Middle Bulgarian as on Wikipedia is yet to be shown as worth the distinction. The comparisons here are most far-fetched, as @Per utramque cavernam not only skipped the Slavic analogues but also picked the fastest-changing languages in the Romance and Germanic world, and even in the fast languages there isn’t alignment, since for example Middle French is said to be until 1609 while you know yourself that Modern English was more than a hundred years there at that point, and Spanish and Portuguese do not need a middle, I even doubt if “Old Spanish” and “Old Portuguese” are really necessary, and Middle High German goes until the beginning of the 1400s while Middle Low German until the early 1600s.
This is all because of features in the language: If I see a text I should be able to say if it is Middle High German or New High German or Old High German without knowing the date (the manuscript doesn’t state it always, and editors don’t need to care – example under Schnalle, sense “harlot, hooker, strumpet” where I had to date a text only by the language), but there isn’t much to make you confident in distinguishing three stages of Russian (nor of Polish …). Don’t get duped by the distinctions scholars make in language stages: They also talk of “Middle Arabic” while texts from the period in question are just Arabic, and this concept is also reprehended by other scholars. Scholars construct stages to make certain points, but lexicographically we must ask if the distinction is useful for our purposes: Clearly “Middle Arabic” is to be dismissed for our purposes since sixth-century Arabic is to the greatest extent like today’s literary Arabic and making any cut in the existing material would pose insoluble difficulties. Similarly ask yourself if from the texts you know of you can discern two or three languages. (Probably you don’t know such texts because there isn’t that much literature edited or sold to you, and this is the reason why we have an “uncovered period”). Spellings like нравьство (nravʹstvo) do not arouse in me any desire to have a new language stage, the differences must be greater. So we have Ukrainian and Belarusian because some sound developments and systematic spelling decisions and some ending differences and differences in basic words like conjunctions and pronouns in addition to generally detached development of the lexicon justify treating it under other headers. But what detaches Old Russian from New Russian, and can you find two major lines in it wherein a Middle Russian could find place? Fay Freak (talk) 00:08, 17 February 2019 (UTC)[reply]

Many phonetic and morphological developments took place between the very archaic stage represented by Wiktionary’s normalised ‘Old East Slavic’ and ‘Russian’, which doesn’t go further back than the 18th century in practice. This is well covered in standard reference works.
The question concerns several hundred manuscripts and early printed books. Their material, amounting to hundreds of unique words and wordforms, is distributed among general and specialised dictionaries (e. g. Словарь обиходного русского языка Московской Руси XVI—XVII веков) and is currently hard to categorise in Wiktionary. The significance of such works as A Journey Beyond the Three Seas, Domostroy, princes’ correspondence etc. goes well beyond the fields of linguistics and philology. We also have the first genuine examples of Russian speech from that time, thanks particularly to jurisprudence recording common people’s testimonies.
Importantly, for both Middle Russian and Middle Bulgarian accentological data is available, which is a goldmine for comparative Slavic accentology. Guldrelokk (talk) 17:07, 17 February 2019 (UTC)[reply]
“In practice”. In practice usually Russian lexicography does not even go further back than Pushkin and this is where popular literary history starts. What are the features by which one can distinguish three stages? The way you put it there is just a transitional age: “Many phonetic and morphological developments” – maybe only from Old East Slavic to New Russian. Or it seems that from the corpus Old East Slavic and New Russian coexisted, since we have “common people’s testimonies” and elsewhere Old East Slavic was written. The former being the обиходный русский язык while some texts are written in Old East Slavic? Old East Slavic texts quoting New Russian texts? Like Latin texts quoting Old French speech? But it does not sound like there being three layers. You say there is something difficult to categorize but you make things even more difficult to categorize by introducing three stages. Fay Freak (talk) 13:34, 19 February 2019 (UTC)[reply]

Interfixes -о- and -е-[edit]

(Notifying Atitarev, Cinemantique, Useigor, Wikitiki89, Stephen G. Brown, Guldrelokk, Fay Freak, Tetromino, Per utramque cavernam): For terms like котокафе́ (kotokafɛ́) and глубоково́дный (glubokovódnyj), I've been analyzing them as ко́то- (kóto-) +‎ кафе́ (kafɛ́) and глубоко- (gluboko-) +‎ вода́ (vodá) +‎ -ный (-nyj). This is based on a suggestion from User:Wikitiki89 that forms like кото- and глубоко- should be treated as "combining forms" of the corresponding lemmas кот (kot) and глубо́кий (glubókij). However, most grammar books instead analyze this as кот (kot) +‎ -о- (-o-) +‎ кафе́ (kafɛ́) and глубо́кий (glubókij) +‎ -о- (-o-) +‎ вода́ (vodá) +‎ -ный (-nyj), with a separate interfix morpheme -о- (-o-), found as -е- (-e-) after paired palatalized consonants as well as ш ж ч щ ц (all of which are or were palatalized). I've come to the conclusion that we should adopt the same analysis. I wrote a script to convert existing entries that use the "combining form" analysis to use the "interfix" analysis, and I'm going to run it unless someone has a good reason not to do so. Benwing2 (talk) 00:10, 27 February 2019 (UTC)[reply]
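
As an illustration of the -о-/-е- distribution stated above, here is a small Python sketch. It only encodes the rule as worded in the post; it is not the conversion script. The function names are hypothetical, the caller must say whether the stem ends in a paired palatalized consonant (the spelling of a bare stem does not show it), and exceptions to the rule are not modelled.

```python
# Consonants that take -е- regardless of palatalization in the stem.
ALWAYS_E = set("шжчщц")


def interfix(stem, palatalized=False):
    """Pick the linking vowel for a compound's first stem."""
    if palatalized or stem[-1] in ALWAYS_E:
        return "е"
    return "о"


def compound(stem, second, palatalized=False):
    """Join a first stem and a second component with the interfix."""
    return stem + interfix(stem, palatalized) + second
```

For example, `compound("кот", "кафе")` gives котокафе and `compound("птиц", "фабрика")` gives птицефабрика, while a soft stem has to be flagged by hand, as in `compound("земл", "трясение", palatalized=True)`.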

(Notifying Atitarev, Cinemantique, Useigor, Wikitiki89, Stephen G. Brown, Guldrelokk, Fay Freak, Tetromino, Per utramque cavernam): A related issue is how to handle the reduced deverbal (and sometimes denominal) forms that often appear in the second part of such a compound. There are currently various ways to render this:
What should be the best way of expressing this? There are at least three ways encapsulated above: (1) "a deverbal form of X"; (2) "-X, combining form of Y"; (3) just listing the full form, and relying on the usage note under -о- (-o-) that indicates that verbs (and sometimes nouns?) appearing in the second component are typically reduced to the stem. Benwing2 (talk) 00:18, 27 February 2019 (UTC)[reply]
(responding to your first post) Seems ok to me, though I generally consider adding those interfixes to be needless clutter; I'd personally write глубо́кий (glubókij) +‎ вода́ (vodá) +‎ -ный (-nyj).
I think the question boils down to whether we want/need to document regular phenomena like this in every single entry; in a totally unrelated area, I know User:Victar is against mentioning Brugmann's law in individual Proto-Indo-Iranian entries. Per utramque cavernam 00:21, 27 February 2019 (UTC)[reply]
@Benwing2: I don't have a very strong opinion on this and I don't object to analysing them as interfixes. Mentioning the combining forms as an alternative wouldn't hurt. I also think that we might keep prefixes like ру́сско- (rússko-, Russo-), а́нгло- (ánglo-, Anglo-), even if they can be understood as combining forms. I have restored ру́сско- (rússko-, Russo-), which you have deleted. --Anatoli T. (обсудить/вклад) 11:17, 27 February 2019 (UTC)[reply]
сеноко́с (senokós): се́но (séno) +‎ -о- (-o-) +‎ коси́ть (kosítʹ) like it is done everywhere else – since the trailing -о of се́но is not part of the stem this is no problem. Compare also this week’s Wiktionary:Requests for moves, mergers and splits § Category:Deverbals by language, Category:Denominal verbs by language where it has been found that some of the formats must be done away with. Fay Freak (talk) 12:47, 27 February 2019 (UTC)[reply]
@Atitarev I think ру́сско- (rússko-) is unnecessary since it's transparently ру́сский (rússkij) + -о- (-o-), but I won't object to your recreation of the prefix if you think it belongs. Benwing2 (talk) 16:20, 27 February 2019 (UTC)[reply]
@Fay Freak I have taken to an analysis like this:
Benwing2 (talk) 16:22, 27 February 2019 (UTC)[reply]
Agreed: don't detail predictable outcomes in etymologies. --{{victar|talk}} 19:39, 27 February 2019 (UTC)[reply]

@Benwing2 @Anatoli T A question regarding adjectives now. Are short, comparative, and superlative forms exclusive to qualitative adjective types? Also thank you for guiding me to these "module" pages. I have been looking for the information in them for quite some time. Isaac1901 (talk) 22:53, 26 February 2020 (UTC)[reply]

@Benwing2 Thank you, I had not been aware of these but the "module" pages are very informative. I must ask since I have the chance now. What is the source/method used regarding the categorization of Russian declension and parts of speech? I see linguist Andrey Zaliznyak cited often. Is this the primary author of the Russian categorization used by Wiktionary? Is personal research also used for this? I am a student of Russian and have been using Wiktionary as my primary source of study regarding declension patterns. The information found here is more detailed than other sources I have come by. Is this the appropriate page for questions such as those about declension patterns? This may be a moderator page from the looks of it. Isaac1901 (talk) 23:11, 26 February 2020 (UTC)[reply]

Rhymes

(Notifying Atitarev, Benwing2, Cinemantique, Useigor, Wikitiki89, Stephen G. Brown, Guldrelokk, Fay Freak, Per utramque cavernam): @Wittiami: has recently added some rhyme info, and while the effort is noble, the results are in some cases super wrong (see Wiktionary:Requests for deletion/Others#Russian_non-rhymes). So before people spend more of their time on this, let's take a pause and talk about what rhymes work in Russian. Here follow some observations off the top of my head (which perhaps contradict what you learned about rhyme in English language poetry):

  1. Rhymes ignore word boundaries. Short phrases need rhyme entries just like words.
  2. Rhymes match most strongly on the stressed vowel and preceding consonant. Other sounds towards the end of the word may not matter. Consider Rozenbaum: край бура́новый под охра́ною (kraj buránovyj ) - [ˈranəvɨj] rhymes with [ˈxranəjʊ]. This is especially true of word-final й, whose pronunciation is reduced in casual speech; consider Pushkin: вью́тся ту́чи снег лету́чий (vʹjútsja túči ).
  3. There are "rich rhymes" (where the preceding consonants must coincide) and "poor rhymes" (where they don't need to - as long as there is at least one other consonant in the rhyme). So a word like сла́ва (sláva) should list at least two rhymes: -lava (matching e.g. обла́ва (obláva)) and -ava (matching e.g. пора́ вам (porá vam)); whether you choose "rich" or "poor" depends on your preference as a poet.
    • But even in "rich rhymes", the preceding consonants sometimes might not match exactly. Famous example: Lermontov's ловлю́ на Ю (lovljú ) - [lɐˈvlʲu nɐ‿ˈju] - a palatalized approximant matches a glide for rich rhyme purposes.
  4. Sometimes [o] can rhyme with [ɵ]; наро́д поёт (naród ) [nɐˈrot pɐˈjɵt] works as a rhyme. Same can be true for [i] and [ɨ]; think of си́льный ссы́льный (sílʹnyj ). And sometimes [a] and [æ]: вста́ть опя́ть (vstátʹ ).

Before adding more rhyme info, we need to come up with a structure that accounts for these (and other) features of Russian poetry. Tetromino (talk) 05:39, 27 February 2019 (UTC)[reply]
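
As a rough illustration of points 1 and 3 above, here is a minimal Python sketch that derives a "rich" and a "poor" rhyme key from a stress-marked romanization. The function name `rhyme_keys` and the `STRESSED` table are hypothetical, not anything Wiktionary uses, and the sketch assumes stress is written with an acute accent on a Latin vowel. It compares tails exactly, which is stricter than point 2 allows.

```python
import unicodedata

# Hypothetical table: stressed romanized vowels and their plain forms.
STRESSED = {"á": "a", "é": "e", "í": "i", "ó": "o", "ú": "u", "ý": "y"}


def rhyme_keys(translit: str) -> tuple[str, str]:
    """Return (rich, poor) rhyme keys for a stress-marked romanization.

    Assumptions: the last acute-accented vowel carries the rhyme stress,
    and word boundaries are ignored (point 1 of the list above).
    """
    # Drop spaces so short phrases are treated like single words.
    text = unicodedata.normalize("NFC", translit.replace(" ", "").lower())
    positions = [i for i, ch in enumerate(text) if ch in STRESSED]
    if not positions:
        raise ValueError("no stress mark found")
    stress = positions[-1]
    plain = "".join(STRESSED.get(ch, ch) for ch in text)
    poor = plain[stress:]               # stressed vowel onward
    rich = plain[max(stress - 1, 0):]   # plus one preceding character
    return rich, poor


print(rhyme_keys("sláva"))     # rich and poor keys for сла́ва
print(rhyme_keys("obláva"))    # same keys, so a rich rhyme
print(rhyme_keys("porá vam"))  # poor key carries an extra final consonant
```

With exact comparison, сла́ва and обла́ва share both keys, while пора́ вам only shares the prefix "ava" of its poor key. A real structure would need some looser matching to cover the cases described in points 2 and 4.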

I thought that some of the allophones distinguished in the current rhyme transcription system might actually rhyme.
I wonder if it would be better to render rhymes in a phonemic transcription, in which the allophones [ɵ] and [o] are both transcribed as /o/, [a] and [æ] as /a/, and so on. I guess the transcription would have to show vowel reduction (for instance, rendering unstressed and reduced о as /a/). Using the same symbol for the pairs of allophones would make it clear that they can sometimes rhyme. I suggest this tentatively because the discussions that I was watching on Talk:Russian phonology on Wikipedia gave me the impression that some aspects of Russian phonemic transcription are controversial. — Eru·tuon 06:05, 27 February 2019 (UTC)[reply]
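
The phonemic idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the table `ALLOPHONE_TO_PHONEME` and the helper names are hypothetical, the table lists only the pairs mentioned in this thread, and treating [ɨ] as /i/ is taken from point 4 of the list above rather than from any settled analysis.

```python
# Hypothetical mapping; only the allophone pairs named in this discussion.
ALLOPHONE_TO_PHONEME = {
    "ɵ": "o",  # [ɵ] and [o] -> /o/
    "æ": "a",  # [æ] and [a] -> /a/
    "ɨ": "i",  # assumption based on point 4 above
}

# Vowel symbols this sketch recognizes; not an exhaustive inventory.
VOWELS = set("aeiouɐəɨɵæɛɪʊ")


def stressed_tail(ipa: str) -> str:
    """Return the part of the word from the stressed vowel onward."""
    start = ipa.rfind("ˈ") + 1  # 0 when there is no stress mark
    for i in range(start, len(ipa)):
        if ipa[i] in VOWELS:
            return ipa[i:]
    return ""


def phonemic_rhyme_key(ipa: str) -> str:
    """Collapse allophones in the stressed tail to one symbol each."""
    return "".join(ALLOPHONE_TO_PHONEME.get(ch, ch) for ch in stressed_tail(ipa))


# The two halves of [nɐˈrot pɐˈjɵt] end up with the same key.
print(phonemic_rhyme_key("nɐˈrot"), phonemic_rhyme_key("pɐˈjɵt"))
```

Using one symbol per phoneme means the rhyme page title no longer depends on which allophone the automatic transcription happens to produce.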
Yes, for the rhyme titles themselves. And in the pronunciation line one should give both as phonetic, since [ɵ] and [æ] are not necessary and do not exist in every idiolect as much as they are put here by the automatic transliteration. Just because linguists have observed these realizations does not mean they are generally there. For me it’s [pɐˈjo̞t] or as is given on жжёт (žžot) (a bit closer than my German [ɔ] as in Boss in any case) and not [pɐˈjɵt], same in бьёт (bʹjot) etc. – though definitely both exist, the latter seems “Muscovite”. Fay Freak (talk) 12:42, 27 February 2019 (UTC)[reply]
Assuming that where you say “preceding” you mean “succeeding” (I recommend replacing the wording and removing this notice), this coincides with my perception of what a good rhyme in Russian rap is. So it could be that the declined locative of ге́тто (gétto) rhymes with ве́тер (véter). Fay Freak (talk) 12:42, 27 February 2019 (UTC)[reply]
Good point that succeeding consonant often matters - but if the stressed syllable is the last one in the word, then succeeding consonant isn't needed when preceding matches: пыхти́т в пути́ (pyxt ). Tetromino (talk) 14:28, 27 February 2019 (UTC)[reply]
From the examples you give it seemed and seems to me that what matters is mainly the stressed vowel, the succeeding consonant and the succeeding vowel. You write “Rhymes match most strongly on the stressed vowel and preceding consonant” but give examples where only what succeeds matches, not what precedes. Fay Freak (talk) 15:00, 27 February 2019 (UTC)[reply]
Including preceding or not is "rich" vs. "poor" rhyme. Both options are often used, it's a choice of style for your poem. Tetromino (talk) 15:06, 27 February 2019 (UTC)[reply]

(Notifying Atitarev, Cinemantique, Useigor, Wikitiki89, Stephen G. Brown, Guldrelokk, Fay Freak, Tetromino, Per utramque cavernam): The search box has been updated, and now if you type in a word with е instead of ё, and the word with е doesn't exist (e.g. четырехгранник (četyrexgrannik) instead of четырёхгранник (četyrjóxgrannik)), you'll automatically be brought to the correct word with ё. Also, the autocompletion mechanism suggests words with ё in place of е if you type in the form of the word with е. This suggests that all these soft redirect pages like трехэтажный (trexɛtažnyj) linking to трёхэтажный (trjóxɛtažnyj) aren't so necessary any more. As it is, we have a pretty random collection of soft е-to-ё redirects, hard е-to-ё redirects, and missing redirects, and I'd like to make this more sane. I suggest the following:

  1. Keep soft е-to-ё redirects whenever the form with е is a word in its own right (in any language; e.g. кре́стный (kréstnyj) is a Russian word in its own right as well as a possible spelling of крёстный (krjóstnyj), so we'd keep the soft redirect; likewise, лед (led) is a word in Bulgarian [and in Macedonian and Serbo-Croatian, for that matter] meaning "ice", as well as a possible spelling of Russian лёд (ljod), so again we'd keep the soft redirect).
  2. Use a bot to create the missing soft е-to-ё redirects of the above type.
  3. Delete any soft е-to-ё redirect pages where the sole purpose is to be a soft redirect; this way the search mechanism will automatically redirect.
  4. Delete any hard е-to-ё redirect pages.

Thoughts? Benwing2 (talk) 03:31, 28 February 2019 (UTC)[reply]
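
The four proposed rules can be read as a small decision table. The Python sketch below is one hedged way to write it down; the names `Action`, `ye_spelling` and `decide` are hypothetical, and the two inputs (whether the е-page has an entry of its own, and what kind of redirect it currently holds) are an assumed simplification of the page states described above.

```python
from enum import Enum
from typing import Optional


class Action(Enum):
    KEEP_SOFT_REDIRECT = "keep soft redirect"
    CREATE_SOFT_REDIRECT = "create soft redirect (bot)"
    DELETE_PAGE = "delete page"
    NOTHING = "nothing to do"


def ye_spelling(title: str) -> str:
    """Return the spelling with е in place of ё."""
    return title.replace("ё", "е").replace("Ё", "Е")


def decide(has_own_entry: bool, redirect: Optional[str]) -> Action:
    """Apply proposals 1-4 to the page at the е-spelling.

    has_own_entry: the е-spelling is a word in its own right (any language).
    redirect: "soft", "hard", or None if the page has no redirect to ё.
    """
    if redirect == "hard":
        return Action.DELETE_PAGE           # proposal 4
    if has_own_entry:
        if redirect == "soft":
            return Action.KEEP_SOFT_REDIRECT    # proposal 1
        return Action.CREATE_SOFT_REDIRECT      # proposal 2
    if redirect == "soft":
        return Action.DELETE_PAGE           # proposal 3: search takes over
    return Action.NOTHING


print(ye_spelling("лёд"), decide(True, "soft"))           # лед: kept
print(ye_spelling("трёхэтажный"), decide(False, "soft"))  # redirect only: deleted
```

Writing it this way makes it easy to see that a page with only a soft redirect and a page with a hard redirect end up in the same place, with the search box doing the redirection.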

Seems to be the proper solution. Guldrelokk (talk) 03:42, 28 February 2019 (UTC)[reply]
@Benwing2: You have presented good reasoning but I am still hesitating, thinking how much time and effort was spent on this, not just mine but yours and other people's. More importantly, learners of Russian ask questions about the е/ё discrepancy all the time; it confuses them a lot, and I think it's very important to provide them with this information. If we delete all soft redirects, then we are just reducing our service, even if we also reduce our maintenance efforts. --Anatoli T. (обсудить/вклад) 23:43, 28 February 2019 (UTC)[reply]
@Atitarev I'm not proposing deleting all soft redirects, just ones where the soft redirect is the only entry on the page. With that entry gone, the search page automatically does the redirect, e.g. typing in "еще" would automatically bring up ещё if not for the soft redirect. With the soft redirect present, both pages еще and ещё show up in the autocompletions when typing "еще", which to me suggests that they are two different words, which I find confusing.
What would you suggest as an alternative? Should we use a bot to generate *all* possible soft redirects (either only for lemmas, or also for non-lemmas)? The main thing I'm trying to fix is the current situation where we have a random and incomplete collection of soft and hard redirects, and no clear rules for whether to create a redirect (soft or hard) when creating a new entry with ё in it. Benwing2 (talk) 23:53, 28 February 2019 (UTC)[reply]
@Benwing2: My preference is to have soft redirects for ALL terms with "ё", or only for lemmas if that seems too much. If a person knows what they are looking for, they will select the right term in the search window; if they don't, then it's even more important to have them. --Anatoli T. (обсудить/вклад) 23:57, 28 February 2019 (UTC)[reply]
@Atitarev. OK. Let's see what others have to say, I'd like to find a consensus. Benwing2 (talk) 00:04, 1 March 2019 (UTC)[reply]
@Atitarev One other thing about the soft redirects is that they show up in the relevant categories, e.g. both еще and ещё show up in CAT:Russian adverbs and CAT:Russian lemmas, and similarly both лёгкий and легкий show up in CAT:Russian adjectives and CAT:Russian lemmas. I think this is wrong; unless there are objections, I'll fix this so the soft redirects don't show up in such categories (I'll do this using a new param |noposcat= to |head=, which disables putting the lemma in those categories). That way we can still have soft redirects but not have them polluting the lemma and POS lists. Benwing2 (talk) 00:42, 1 March 2019 (UTC)[reply]
@Benwing2: No objections. Good. --Anatoli T. (обсудить/вклад) 22:26, 1 March 2019 (UTC)[reply]
@Benwing2, Atitarev: Has this (provisional) policy been documented somewhere? I created раздается with the comment “Created for typical printed form of раздаётся (one should be able to look up with cut-and-paste – but is this how we usually do it? No guidance on е for ё at Wiktionary:About Russian!)”, but now I see that this was perhaps not desired. But I had typed “https://en.wiktionary.org/wiki/раздается#Russian” in the browser address line rather than using the search box. In my case it was because I was practising typing Russian, but I feel such URLs could be automatically generated and ought to be redirected, and I believe that the current search mechanism offers no help here. For that reason I think that there should be soft redirects for all these forms, which I suppose a bot ought to be able to generate. PJTraill (talk) 12:40, 5 March 2020 (UTC)[reply]
I have since created a new section sketching the current state of the policy, but it could probably use some more work. PJTraill (talk) 13:44, 5 March 2020 (UTC)[reply]
@PJTraill: We do have the current policy, as I said in my edit summary on diff. I will review your addition Wiktionary:About_Russian#Spellings_with_‘е’_instead_of_‘ё’. I think we don't need to mention how the searches are implemented. "е" and "ё" have been mutually searchable for a long time and it has nothing to do with policies and the search results also depend on what entries already exist. Also calling @Benwing2. --Anatoli T. (обсудить/вклад) 21:12, 5 March 2020 (UTC)[reply]
@Atitarev: Thanks; oddly enough I thought I had edited раздается myself, but perhaps I overlooked a concurrent edit message & cancelled instead of saving! PJTraill (talk) 10:47, 6 March 2020 (UTC)[reply]
@Benwing2, Atitarev: Should we suggest in the guideline that the pronunciation be specified on the soft redirect page? It seems to me like courteous help for the user; of course the pronunciation should be given as on the target (with ё), along with any other possible readings. PJTraill (talk) 21:36, 10 March 2020 (UTC)[reply]
@PJTraill: These are soft-redirects. The transliteration follows (is equal to) the main form, since writing "е" instead of "ё" is considered unaccented but it's still (conceptually) "ё" but unaccented. The pronunciation (IPA) is at the main form. --Anatoli T. (обсудить/вклад) 21:42, 10 March 2020 (UTC)[reply]

Verbal complements: which case, which preposition?

(Notifying Atitarev, Benwing2, Cinemantique, Useigor, Wikitiki89, Stephen G. Brown, Guldrelokk, Fay Freak, Tetromino): Hi. We currently don't have a systematic way of presenting that kind of information, and we do it very sparsely at any rate.

An example: as a native French speaker, I'm tempted to use аплодировать (aplodirovatʹ, to applaud) as its French equivalent, i.e. with a direct object (in the accusative), when that verb actually requires the dative case.

We do have some templates to handle this ({{+obj}}, {{+preo}}, {{+posto}}), but they have never been much used, and I find them ugly; the one I use for French and English is {{indtr}}, but it's not perfect either, in that it can't be used for languages that have a case system.

Thoughts? Suggestions?

On a related note, I've learned that some verbs taking a direct object (in the accusative) as their complement, and which could theoretically undergo passivization, don't actually do so (an example: сопровождать (soprovoždatʹ, to accompany)). Wouldn't it be a good idea to document that too? Per utramque cavernam 18:21, 12 March 2019 (UTC)[reply]

@Per utramque cavernam: I don't have any suggestions regarding the implementation but I welcome the idea of standardising this. The experimental templates are interesting.
As for сопровожда́ть (soprovoždátʹ, to accompany), what do you mean? Passive forms are quite common and easy to find, both the -ся form сопровожда́ться (soprovoždátʹsja) and the present tense passive participle сопровожда́емый (soprovoždájemyj). --Anatoli T. (обсудить/вклад) 21:31, 12 March 2019 (UTC)[reply]

Help: verb conjugation symbols meaning.

I have noticed that on a few verbs with atypical patterns, certain symbols are used at the top of the conjugation box on the word pages. Example: гибнуть is represented with "class 3°a[⑤]⑥". I understand "class 3" as being the Zaliznyak classification type shown on the Appendix:Russian verbs page, while "a" is the stress pattern. What do the circled numbers 5 and 6 represent? What patterns are they? And is there a page on them?

I also have a similar question regarding (-o-) as in танцевать. I am assuming it is a stress shift pattern; however, I would appreciate an explanation of the details. Isaac1901 (talk) 03:40, 25 February 2020 (UTC)[reply]

@Isaac1901: The documentation for the verb conjugation modules is here: Module:ru-verb/documentation. Each class/type/subtype is described but it's better understood by the actual examples and comparison with others. Also calling @Benwing2. --Anatoli T. (обсудить/вклад) 05:00, 25 February 2020 (UTC)[reply]
@Isaac1901 Anatoli is correct in that everything is documented in Module:ru-verb/documentation, but it could stand to be documented better in Appendix:Russian verbs and a link created from the verb title to that page. Benwing2 (talk) 05:06, 25 February 2020 (UTC)[reply]

Stress of class 2a (or b?) Russian verbs

Appendix:Russian verbs notes:

"Stress patterns

Russian verbs have three different stress patterns. These are indicated with Latin letters:

a - the stress is always on the stem.
b - the stress is always on the ending, except when there is no vowel in the ending."

If this is the case, then wouldn't class 2a forms such as рисова́ть actually have stress pattern b? Unless ⟨ов⟩/⟨ев⟩ and their respective inflected counterparts ⟨у⟩/⟨ю⟩ are considered the final part of the stem.

But I usually see, at least on Wiktionary, −⟨овать⟩/−⟨евать⟩ represented alongside endings. Isaac1901 (talk) 22:08, 27 February 2020 (UTC)[reply]

@Isaac1901, Benwing2 Compare the stresses in inflected forms in the subclasses 2a: рисова́ть (risovátʹ) -> рису́ю (risúju) vs 2b: клева́ть (klevátʹ) -> клюю́ (kljujú). --Anatoli T. (обсудить/вклад) 22:46, 27 February 2020 (UTC)[reply]
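
The comparison can be made mechanical with a short Python sketch. It is an illustration only: the function `stress_position` is hypothetical, the caller has to say where the ending starts (which is exactly the analysis in question here), and the sketch assumes stress is written with a combining acute accent (U+0301) as in the forms quoted above. It checks a single form, so it cannot by itself establish a pattern that holds "always".

```python
ACUTE = "\u0301"  # combining acute accent used to mark stress


def stress_position(form: str, ending: str) -> str:
    """Return "stem" or "ending" for one stress-marked inflected form.

    Assumptions: exactly one acute accent in the form, and the caller
    supplies the ending as plain (unaccented) letters.
    """
    plain = form.replace(ACUTE, "")
    if not plain.endswith(ending):
        raise ValueError("form does not end with the given ending")
    mark = form.index(ACUTE)      # raises ValueError if stress is unmarked
    stressed_letter = mark - 1    # the accent follows the stressed vowel
    boundary = len(plain) - len(ending)
    return "ending" if stressed_letter >= boundary else "stem"


# рису́ю with ending -ю: stress falls on the stem, as in pattern a.
print(stress_position("рису\u0301ю", "ю"))
# клюю́ with ending -ю: stress falls on the ending, as in pattern b.
print(stress_position("клюю\u0301", "ю"))
```

Under this reading the infinitive suffix does not decide the pattern; what matters is where the stress lands once the personal ending is separated from the stem.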

Reducible nouns

Is there a page, or a section on a page, dedicated to the declension patterns of nouns with reducible stems? Isaac1901 (talk) 10:34, 5 April 2020 (UTC)[reply]

@Isaac1901: Yes.
This section: Template:ru-noun-table/documentation#Reducible nouns, plural-only nouns, adjectival nouns.
This category: Category:Russian nouns with reducible stem. --Anatoli T. (обсудить/вклад) 13:38, 5 April 2020 (UTC)[reply]