Bad block

I think the length was a bit harsh, considering the last blocks were 2 and 3 days long respectively. I recommend shortening it to one week. Thank you. Zeggazo (talk) 09:40, 7 October 2014 (UTC)

I don't really think that anything less would be appropriate due to the severity of the offense. --Ivan Štambuk (talk) 10:10, 7 October 2014 (UTC)
You have yourself said that it is unreasonable to chastise two established editors with thousands of entries, but since you doubled down on your decision I trust your judgement. Zeggazo (talk) 10:16, 7 October 2014 (UTC)


Hi Ivan,

Re: your post there. I would commit to provide Russian definitions, if you generate Russian entries. I'd prefer to have them a bit simpler than Ukrainian ones, without the split by etymology. --Anatoli T. (обсудить/вклад) 22:16, 16 October 2014 (UTC)

I'll just group them under the same etymology so you can split them if necessary. The biggest difficulty is the issue of complexity - there are too many Russian templates with too many parameters, and I have no desire to waste time pattern matching them, so I would use the plain {{ru-decl-noun}} and {{ru-conj-table}}, so you can substitute them with bots or manually. For adjectives the situation seems a bit simpler. Do you have some preferred list of missing words (by frequency etc.)? I see that there is a pronunciation-generating module that can be used. Any comprehensive online dictionaries that can be added as a reference? --Ivan Štambuk (talk) 23:02, 16 October 2014 (UTC)
Yes, for pronunciation {{ru-IPA}} can be used - irregular pronunciations can be added when definitions are added. [1] is a pretty good dictionary. Perhaps it's better to skip any words with spaces for now. Using {{ru-decl-noun}} is OK for me, a specific template can be changed later. Appendix:Frequency dictionary of the modern Russian language (the Russian National Corpus) is a frequency list I added. Some words are spelled with "e" for "ё", which is also OK, can be addressed later. Appendix:Russian Frequency lists still have red links too (higher number). You probably need a simpler template for verbs as a temp solution as well. Don't load a very large number of words yet, please. --Anatoli T. (обсудить/вклад) 23:20, 16 October 2014 (UTC)
There are 5289 redlinks on the national corpus appendix page. I can start with nouns (that should account for ~50% of entries) which are the easiest, then adjectives and then verbs, ordered by frequency of occurrence. How many do you want? --Ivan Štambuk (talk) 23:29, 16 October 2014 (UTC)
All nouns first :) but I'd like to see some samples first. BTW, that dictionary doesn't provide stresses but ru:wikt does, it's pretty reliable in this, not just for nouns and it redirects to "ё" spellings. Terms with multiple stress patterns, inflections could have two headers. --Anatoli T. (обсудить/вклад) 23:39, 16 October 2014 (UTC)
Don't worry for accents - I have that covered. Can you give examples for nouns with multiple stress patterns? I can also generate definitions if the terms are linked from Wiktionary translation tables as Russian translations, together with a corresponding gloss. --Ivan Štambuk (talk) 23:48, 16 October 2014 (UTC)
Usually, it's when they are both animate and inanimate, like бычо́к (byčók) (here: same headword but two declension tables), such cases can have two headwords for multiple senses in the interim but I'll join them if appropriate. Animacy must be tricky, so just m, f, n or p will do for the moment. --Anatoli T. (обсудить/вклад) 23:53, 16 October 2014 (UTC)
Got it. I'll get back to you in a day or two once it's done. --Ivan Štambuk (talk) 00:11, 17 October 2014 (UTC)
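The plan agreed above (only missing entries, no terms with spaces, nouns first, then adjectives, then verbs, each ordered by frequency, and a small first batch) can be sketched roughly as follows. This is only an illustration: the `Entry` record, the part-of-speech tags, and the sample ranks are assumptions made up for the example, not the actual data or script used.

```python
from typing import List, NamedTuple


class Entry(NamedTuple):
    rank: int     # position in the frequency list, 1 = most frequent (hypothetical field)
    word: str
    pos: str      # part-of-speech tag such as "noun", "adj", "verb" (hypothetical tags)
    exists: bool  # True if the entry already exists, i.e. it is not a red link


# Nouns first, then adjectives, then verbs; other parts of speech are skipped.
POS_ORDER = {"noun": 0, "adj": 1, "verb": 2}


def select_batch(entries: List[Entry], limit: int) -> List[Entry]:
    """Pick at most `limit` missing single-word terms, ordered by POS, then by rank."""
    candidates = [
        e for e in entries
        if not e.exists            # only red links
        and " " not in e.word      # skip terms with spaces for now
        and e.pos in POS_ORDER     # only the parts of speech planned so far
    ]
    candidates.sort(key=lambda e: (POS_ORDER[e.pos], e.rank))
    return candidates[:limit]


# Made-up sample rows for illustration only; the ranks are not real corpus data.
SAMPLE = [
    Entry(1, "и", "conj", True),
    Entry(2, "год", "noun", True),
    Entry(3, "новый", "adj", False),
    Entry(4, "амплуа", "noun", False),
    Entry(5, "железная дорога", "noun", False),
    Entry(6, "бычок", "noun", False),
    Entry(7, "делать", "verb", False),
]

batch = select_batch(SAMPLE, limit=3)
```

Keeping `limit` small matches the request not to load a very large number of words before the samples have been reviewed.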
  • @Atitarev: The list of missing nouns is here, and stubs (not generated yet) are here. A few hundred stubs are missing and should better be manually generated due to various conflicts. If necessary I can add the genitive and nominative plural forms in the headword as well but they seem unnecessary given the presence of complete declension. --Ivan Štambuk (talk) 23:03, 18 October 2014 (UTC)
The stubs look great! You can use them. Please add the genitive and nominative plural forms in the headword as well, if genitive=nominative, the 3rd parameter could be - and no inflection table is necessary, like "амплуа" (indeclinable), don't worry if it complicates things. --Anatoli T. (обсудить/вклад) 23:39, 18 October 2014 (UTC)
125 nouns are missing, I'll add them a bit later since they need special attention (substantivized meanings of adjectives, proper noun conflicts, not in the database and so on). --Ivan Štambuk (talk) 09:06, 19 October 2014 (UTC)
Note: Words with "ё" were created on spellings with "е" by error, even though the entries themselves are spelled with ё, so they have to be moved or redirected to existing entries (if they exist). This wasn't supposed to happen and is apparently due to some behind-the-scenes normalization when uploading. --Ivan Štambuk (talk) 11:02, 19 October 2014 (UTC)
Thanks a lot! As you can see, I have already fixed a lot - added definitions and made other changes. Words with "ё" as "е" are OK. They can and probably should also have entries but I'm not too eager to work with them since they need manual transliteration. They (words with "е") can't take any stress either since a stress mark and dots over "ё" are considered accents. Note that words with "ё" don't need a stress mark (including the declension table) (that's for future loads). --Anatoli T. (обсудить/вклад) 11:20, 19 October 2014 (UTC)
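To illustrate why the stress mark and the dots over "ё" end up being treated alike: under Unicode canonical decomposition (NFD), "ё" becomes "е" plus a combining diaeresis, just as a stressed vowel is a base letter plus a combining acute. A normalizer that drops every combining mark therefore turns "ё" into "е". The sketch below shows that effect using Python's standard `unicodedata` module. Whether this is what actually happened during the upload is an assumption; the function names are made up for the example.

```python
import unicodedata

ACUTE = "\u0301"  # combining acute accent, used as the stress mark


def strip_all_marks(word: str) -> str:
    """Drop every combining mark, as an overly aggressive normalizer might.

    Side effect: "ё" becomes "е" (and "й" would become "и").
    """
    decomposed = unicodedata.normalize("NFD", word)
    kept = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return unicodedata.normalize("NFC", kept)


def strip_stress_only(word: str) -> str:
    """Drop only the acute stress mark and leave "ё" intact."""
    decomposed = unicodedata.normalize("NFD", word)
    return unicodedata.normalize("NFC", decomposed.replace(ACUTE, ""))
```

With these definitions, `strip_all_marks("ёж")` gives "еж", while `strip_stress_only` removes the acute from a stressed form but leaves "ёж" unchanged, which is the behaviour a page-title normalizer would need to keep "ё" spellings apart.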
Dictionaries nevertheless mark accents on "ё" but I'm fine if it's our convention. I'll clean up these entries manually. In the future stubs will have animacy marked in the gender parameter, and use {{ru-decl-noun-unc}} for uncountable nouns. --Ivan Štambuk (talk) 11:25, 19 October 2014 (UTC)
If you can determine animacy, that's great. Then I could edit using a definition wizard in many cases. --Anatoli T. (обсудить/вклад) 11:31, 19 October 2014 (UTC)
  • Extrapolating from the current rate of cleanup, it would take at least 20 days to empty the category. In the meantime I'll focus on other things, and when the category is emptied I'll add the missing nouns, and move to adjectives. --Ivan Štambuk (talk) 16:59, 20 October 2014 (UTC)
Hi Ivan, the nouns are all done (thanks heaps!), except for one word - копа́ (kopá), which we don't know (I might delete it). Not all nouns were loaded from the frequency list but that's OK. Could you generate adjectives and adverbs, please? Please don't add stress marks to "ё". If you find adjective declension tables confusing, then don't create them, if you wish to use them I'll give you a guide. :) CC: @Wanjuscha:. --Anatoli T. (обсудить/вклад) 23:36, 15 January 2015 (UTC)
OK, I'll do the rest of the nouns in a few days, or start adjectives. --Ivan Štambuk (talk) 23:39, 15 January 2015 (UTC)
  • @Atitarev: I've been caught up in meatspace so I didn't have time for this, and won't have until the next weekend. However, I've been thinking, and bots are not really worth the periodic hassle when everything is considered. So, would you prefer a file that can be loaded into GoldenDict (the workflow would be: Ctrl+C for lookup of the selected word, copy the preformatted stub from the popup and paste into the wiki editor; could even generate inflection stubs..) or a webpage where you'd need to type the word and copy the result back? I'd prefer a more permanent solution, and not wasting time on inspecting and sanitizing frequency lists (which is an editor's job). --Ivan Štambuk (talk) 00:37, 25 January 2015 (UTC)
Thank you, Ivan but my preference is the way you did it before with nouns. If it's too time-consuming or hard, don't worry about it. You don't have to do it. Thank you, anyway. :) --Anatoli T. (обсудить/вклад) 11:16, 25 January 2015 (UTC)

Czech definitionless entries

As for diff, I object to your mass creation of Czech definitionless entries, should you consider doing that. --Dan Polansky (talk) 08:42, 19 October 2014 (UTC)

You are free to ignore them, should I create them. --Ivan Štambuk (talk) 08:46, 19 October 2014 (UTC)

Serbo-Croatian-Russian false friends


I found it amusing, although there are a lot of wrong spellings in Serbo-Croatian and some inaccuracies. Что сербу бабушка, то русскому карась. --Anatoli T. (обсудить/вклад) 03:58, 10 November 2014 (UTC)

Mildly interesting :) --Ivan Štambuk (talk) 06:07, 10 November 2014 (UTC)

Pronunciation of Proto-Indo-European laryngeals

I have read enough published articles on Proto-Indo-European to know that the pronunciation of these consonants is still disputed, but I was just citing one possible pronunciation that Don Ringe lists in his book. See Don Ringe, From Proto-Indo-European to Proto-Germanic, Oxford University Press, 2006, page 14. Not only does he give a suggested pronunciation of bʰréh₂tēr as /b̤ráx.tɛːr/ (note the pitch accent), he also gives a pronunciation for h₂éwis (sheep) as [xá.wis] (again note the pitch accent). I assumed that since there was already a reconstructed pronunciation for Proto-Germanic entries, it would be okay to add a reconstructed pronunciation to some of the Proto-Indo-European entries as well, even though the pronunciation is not as settled as with Proto-Germanic.

Have you read what I wrote on the talkpage? It's [] not //, it's pitch accent (or tone, Ringe is silent as to what the acute accent indicates) not stress, and syllabification is not indicated (syllables are supraphonemic so indicating them in phonemic transcription is just wrong). For Proto-Germanic, what is indicated appears to be just a phonemic transcription bijectively respelled from the reconstruction, which is completely redundant. It's wrong to call this pronunciation, because reconstructions are not spellings of words of an actually spoken language. Given the great deal of dispute regarding the phonetic values of PIE segments, as well as notational inconsistencies (*/bʰréh₂tēr/ should really be *bʰráh₂tēr on a formal level), I'd rather that we don't add them before settling on a system first. --Ivan Štambuk (talk) 03:34, 29 December 2014 (UTC)