User talk:Atitarev/2020/2015 (2)
Is this also used to mean "electric light"? A sentence I found in a dictionary translates "выключить электричество" into Veps as "to turn off the light". —CodeCat 18:37, 6 April 2015 (UTC)
- Yes, it does, added. --Anatoli T. (обсудить/вклад) 23:29, 6 April 2015 (UTC)
The page says that "Lua error in Module:languages/templates at line 28: The language code 'zh-ma' is not valid". —umbreon126 04:55, 8 April 2015 (UTC)
- Well, there is no such language code. --Anatoli T. (обсудить/вклад) 04:57, 8 April 2015 (UTC)
Yeah
I'm not comfortable with Russian slang words. It's just that I'm practicing reading something back and forth on the internet as well as reading the news in Russian online. I want to be exposed to a variety of styles of usage. --KoreanQuoter (talk) 12:01, 11 April 2015 (UTC)
- That's OK, Russian slang must not be easy but please don't rush into adding terms you're not 100% sure of. It's OK to try example questions in a class environment, make mistakes and get corrected but editors are expected to add contents mostly error-free. I'm not trying to be hard, it's just the way it is. I encouraged you to start making full-fledged Russian entries, please continue that but you need to check the usage more thoroughly. :) --Anatoli T. (обсудить/вклад) 12:11, 11 April 2015 (UTC)
- Maybe I should make Russian slang entries within my own user page. I'm trying my best not to offend other users, but somehow I'm making a great progress at making awkward situations. :/ Yikes. --KoreanQuoter (talk) 12:34, 11 April 2015 (UTC)
- No, you haven't offended me or anyone else. Maybe I was too harsh. Well, I promised to check your edits and I will but you'll make it easier for me if you use dictionaries, it's harder to do it for slang words, let alone profanities. Unfortunately, things that are obvious to native speakers are not so obvious to learners, e.g. there's a big difference between "нажраться" and "нажрать" and "назариться" (not too common, by the way) was definitely incorrect. Don't get me wrong, I am glad someone still wants to learn Russian when the image of Russia and Russians is dropping sharply around the world (it's unfortunate but the reasons are understandable). --Anatoli T. (обсудить/вклад) 12:49, 11 April 2015 (UTC)
- BTW, it wasn't a slip of the tongue or pen on the RFD page, and people frown upon someone else "correcting" their edits in discussions or on talk pages. --Anatoli T. (обсудить/вклад) 12:52, 11 April 2015 (UTC)
- Now this is something that I have to apologize for. --KoreanQuoter (talk) 12:54, 11 April 2015 (UTC)
- 정말 천만에요! I was just explaining this to you, since you're rather new here. --Anatoli T. (обсудить/вклад) 12:58, 11 April 2015 (UTC)
- Well, I don't think I'm new anymore. Especially since I'm a long-time contributor in the Korean Wiktionary. --KoreanQuoter (talk) 13:19, 11 April 2015 (UTC)
I posted more commentaries in Wiktionary:Requested entries (Korean), even though it could be somewhat complicated. This might help you (and also myself) in the long run. --KoreanQuoter (talk) 11:04, 12 April 2015 (UTC)
- Thanks. That's quite a bit of work. I will still add just words that I come across or I feel the need of adding - like basic, important words or something interesting. Sino-Korean and loanwords from European languages or directly from Japanese are often easy to understand as well. I may work with some from that page, though. I added requests there myself as well. E.g. 솔 - I couldn't find, which sense is pronounced long (if it is), not sure about the etymology of the "brush" sense, if it's transparent. --Anatoli T. (обсудить/вклад) 11:27, 12 April 2015 (UTC)
- Keep adding requests if you need to. I will do my best to explain the difficult words, especially the colloquial ones or historical ones. --KoreanQuoter (talk) 11:31, 12 April 2015 (UTC)
- I will and I already have added requests there - 문어, 솔, not sure if they merit entries: 거야/거예요, not sure about 넉 (in what situations it replaces numeral 넷). They all have my signature. --Anatoli T. (обсудить/вклад) 11:58, 12 April 2015 (UTC)
- 넉 is a weird word. You may look at the explanation. --KoreanQuoter (talk) 12:10, 12 April 2015 (UTC)
I think there's a wee bit of error at the ru-IPA in отзывать. It has a [zz] error. --KoreanQuoter (talk) 14:35, 13 April 2015 (UTC)
- No, it's correct, /t/ is partially assimilated with the following /z/. And you were right in adding |gem=y. --Anatoli T. (обсудить/вклад) 20:47, 13 April 2015 (UTC)
- Do we have to put |gem=y in both of the two ru-IPA entries for отзыв? I'm still kinda cautious about it. (If so, I want to fix the two with your permission.) --KoreanQuoter (talk) 02:18, 14 April 2015 (UTC)
- There's some logic in the pronunciation module that can somewhat predict geminations, thanks to User:Wyang. You only need to add gem=y or gem=n when the result is not what is expected. The IPA in "отзыв" is currently correct. --Anatoli T. (обсудить/вклад) 02:36, 14 April 2015 (UTC)
- I see. Thank you very much, Wyang, and thank you very much, Anatoli. --KoreanQuoter (talk) 02:51, 14 April 2015 (UTC)
Actually, Special:Contributions/175.211.35.41 is me. There was an error that somehow made me logged off temporarily due to my browser's error. Is there a way to merge this contribution back into my account? --KoreanQuoter (talk) 05:26, 23 April 2015 (UTC)
- I don't know how. It happened to me many times. If it's a big concern (exposing your IP address), I can try to hide that revision and regenerate with an account. --Anatoli T. (обсудить/вклад) 05:28, 23 April 2015 (UTC)
- Hiding would be much better since there are commercial online bots that track down IPs in South Korea. The problem is that I get a lot of pop up spams on Korean websites if I suddenly show my IP in any websites. I blame my internet provider for this. --KoreanQuoter (talk) 05:32, 23 April 2015 (UTC)
- Done - I've hidden your IP address from the edit history. --Anatoli T. (обсудить/вклад) 05:36, 23 April 2015 (UTC)
Thank you very much. Fun fact: most of the South Korean private internet forums show IP addresses of posters at the bottom part of the posts. So..... it's sort of dangerous posting something in Korean websites in general. --KoreanQuoter (talk) 05:40, 23 April 2015 (UTC)
- Umm..... Some troll is taking over the Korean Wiktionary. [1]. His username is [2] --KoreanQuoter (talk) 07:34, 23 April 2015 (UTC)
- Case solved. [3] --KoreanQuoter (talk) 07:36, 23 April 2015 (UTC)
Any way to fix the pinyin in the example sentence for 寧可? It's currently the standard Taiwanese pinyin, but the standard Beijing Mandarin form should probably take priority. Typing "{nìngkě}" after the term in the usex template just results in "níngnìngkě". --WikiWinters (talk) 16:45, 12 April 2015 (UTC)
I noticed that you attempted to fix it. However, this unbolds the term. Does it matter? --WikiWinters (talk) 21:39, 12 April 2015 (UTC)
- Easily fixed as well. I don't always bold them, as they stand out as unlinked, anyway. --Anatoli T. (обсудить/вклад) 22:22, 12 April 2015 (UTC)
The template {{R:vep:UVVV}} is for a Russian-to-Veps dictionary, and it links to the Wiktionary entries for each of the Russian words you give to it. But I found that some of the words in that dictionary are not defined in Wiktionary yet (red links). So I let the template categorise entries where this is the case, and created this category. I thought it might be useful for you? —CodeCat 23:23, 12 April 2015 (UTC)
- Yes, thanks, like any red links, which should be filled eventually. --Anatoli T. (обсудить/вклад) 23:27, 12 April 2015 (UTC)
- @CodeCat I have filled some nouns but verbs are more time-consuming. You can add requests to Wiktionary:Requested_entries_(Russian) or wanted entries page, of course. --Anatoli T. (обсудить/вклад) 03:02, 14 April 2015 (UTC)
Would you mind cleaning up the Vietnamese reading here when you get the time? ---> Tooironic (talk) 16:09, 16 April 2015 (UTC)
Hello Anatoli. Per your request, I have not and shall not add {{rfe|lang=ru}} to this entry. However, can you tell me whether the Russian name for the River Alazani derives from the Georgian name (ალაზანი), the other way around, or neither? — I.S.M.E.T.A. 09:17, 19 April 2015 (UTC)
- I am not sure. It sounds Russian. There's a more Georgian-sounding form in Russian: Алаза́ни (Alazáni). It may be of Georgian origin but Russified, which is common with other loanwords. (BTW, you don't have to link it the way you did in the header, just [[Алазань]] is fine. The stress marks are for users to know where the stress is; they are not used in running Russian text. Just making sure you understand.) --Anatoli T. (обсудить/вклад) 09:34, 19 April 2015 (UTC)
- I see. Thanks for getting back to me. My wondering is prompted by guesses about the etymology of the Latin Alāzōn and Lewis & Short's calling the river Alasan. Unless I'm mistaken, Azerbaijan and Georgia were parts of the Russian Empire back in 1879, so it would be unsurprising for Lewis & Short to have used the river's Russian name; however, that doesn't explain the spelling Alasan vs. Alazan… Anyway, I'll keep digging a bit. (Also, thanks for the explanation; I knew the thing about the Russian stress mark (acute accent?). I include it when mentioning Russian terms just like I include macra when mentioning Latin terms. I hope that's OK.) — I.S.M.E.T.A. 12:26, 19 April 2015 (UTC)
Etymology in entries
I could add etymologies to the entries that interest me, but there is no way to track them down, because this is missing:
===Etymology=== {{rfe|lang=ru}}
Right now Category:Russian_entries_needing_etymology contains only 58 pages. —Игорь Тълкачь 23:28, 19 April 2015 (UTC)
- Thanks, Igor. I usually add the etymology myself if I know it and am able (and willing) to do so. The list has few entries because nobody has been adding {{rfe|lang=ru}}. You can add etymologies to new and old entries yourself without a request. --Anatoli T. (обсудить/вклад) 00:32, 20 April 2015 (UTC)
- It's preferable to create pages together with this code, because Category:Russian lemmas lists words both with and without etymologies, and an etymology request doesn't make an entry any worse; besides, someone else may make use of the first category. —Игорь Тълкачь 11:13, 20 April 2015 (UTC)
why did you revert my change on impression?
It should be uncontroversial to add vowels to an Arabic word, and once this is done, the manual translit (which was wrong anyway) isn't needed. Benwing (talk) 09:58, 20 April 2015 (UTC)
- @Benwing I'm sorry, that was an accident - too easy to do on mobile devices where links are too close to each other and keep moving while the page is loading. If you check the history, I have reverted my revert seconds later. Sorry, again :) --Anatoli T. (обсудить/вклад) 11:11, 20 April 2015 (UTC)
- In fact I didn't even see the edit I reverted but I undid it because I accidentally clicked on the wrong link on a different entry.--Anatoli T. (обсудить/вклад) 11:14, 20 April 2015 (UTC)
- No problem!! Benwing (talk) 11:19, 20 April 2015 (UTC)
If this verb could form a past passive participle at all, one would expect *подсы́ланный. But no such form exists, since it is never formed from imperfective verbs that have an aspectual pair (see «Грамматический словарь русского языка», p. 86).--Cinemantique (talk) 00:15, 25 April 2015 (UTC)
- @Cinemantique At the English Wiktionary we have long followed the practice of giving past passive participles for imperfective verbs even when they are formed from perfective verbs, and of giving the variants from both forms, e.g. "деланный, сделанный". If you wish, we can open a topic at Wiktionary:Tea room or Wiktionary:Beer parlour. --Anatoli T. (обсудить/вклад) 00:24, 25 April 2015 (UTC)
- By the way, regarding "never": "деланный", "мазанный", "резанный", "лепленный", "меченный", etc. are formed from imperfective verbs. --Anatoli T. (обсудить/вклад) 00:26, 25 April 2015 (UTC)
- Of course, since they don't have a strict aspectual pair. Well, if you like putting one verb's form under another verb, then by all means; I don't intend to start any discussions.--Cinemantique (talk) 00:31, 25 April 2015 (UTC)
三位一體
The pinyin should be "Sānwèi Yītǐ," but how would you generate this while bypassing the "一" in the source? I don't know how to do this and still have the "Y" be capitalized. --WikiWinters (talk) 21:59, 30 April 2015 (UTC)
- Not sure how, you probably can't do it without module changes. BTW, Pleco dictionary has it as a solid word (no spaces) and in lower case. -Anatoli T. (обсудить/вклад) 22:36, 30 April 2015 (UTC)
- Is Pleco preferable to CC-CEDICT? That's where I saw that it was spaced and lowercased (http://www.mdbg.net/chindict/chindict.php?page=worddict&wdrst=1&wdqb=%E4%B8%89%E4%BD%8D%E4%B8%80%E9%AB%94). --WikiWinters (talk) 22:43, 30 April 2015 (UTC)
- Your source is MDBG, not CEDICT. Pleco is an official dictionary. I got the same spelling in the ABC dictionary that is shipped with Wenlin software. --Anatoli T. (обсудить/вклад) 23:19, 30 April 2015 (UTC)
- Oh, I just thought it was CEDICT because MDBG has a link to it. Although, it does say this for MDBG (third down):
- "Resources used by this website:
- Mandarin voice soundset: http://www.chinese-lessons.com/
- Unihan Database: http://www.unicode.org/charts/unihan.html
- CC-CEDICT Database: http://www.mdbg.net/chindict/chindict.php?page=cc-cedict
- Animated Chinese character GIFs: http://www.ocrat.com/ (currently offline)
- Animated Chinese character flash animations: http://ehaton.blogspot.com/
- Handwriting input: http://www.kiang.org/jordan/software/hanzilookup/
- Character Decomposition: http://commons.wikimedia.org/wiki/Commons:Chinese_characters_decomposition (used with personal authorization)"
- --WikiWinters (talk) 00:14, 1 May 2015 (UTC)
Missing senses
I'm just following the status quo of those who used the vi requests page before me. I stopped counting after a dozen. I assumed you moved mine to the Tea Room and didn't only revert and I assume you're going to move all the others and not just mine. — hippietrail (talk) 01:49, 1 May 2015 (UTC)
- Sorry if I upset you. I have just added a TR topic. Please consider those who try to serve those requests. Missing senses is not for new entries page. I have been doing the same (removing new sense requests) in the Chinese, Russian, Japanese, etc. pages. --Anatoli T. (обсудить/вклад) 01:59, 1 May 2015 (UTC)
- @Hippietrail I can confirm the sense (e.g. "bằng ô tô" - "by car") but not sure about which etymology it belongs to. --Anatoli T. (обсудить/вклад) 02:02, 1 May 2015 (UTC)
- Yep no worries. Your revert seemed a bit abrupt but if this is the current policy I'll start moving the ones I see to the Tea Room as well. I'm just learning Vietnamese and probably won't get very far before I move back to Laos and China so there's a lot more gaps I can point out for our experts than I am confident enough to just add myself. Especially considering the number of mistakes and simplifications in dictionaries and phrasebooks. Such as including classifiers and nominalizers in headwords, so there must be a lot more I wouldn't even know how to spot at this point. Anyway back to today's list... — hippietrail (talk) 02:08, 1 May 2015 (UTC)
Hi. I noticed that you added some more testcases to the module. In light of the six failed tests, I feel we should disable it until it is ready again. DerekWinters (talk) 21:57, 1 May 2015 (UTC)
- I have asked Frank for assistance. I don't think we should disable it because the hard coded transliteration will not override the automatic one. If Frank doesn't respond, we can try asking on the Grease Pit. --Anatoli T. (обсудить/вклад) 01:36, 2 May 2015 (UTC)
I apologize for filling up your talk page with questions, but do you think 土包子 should be type=111 (default) or type=12? I'm not sure if 子 is just used as a suffix or if 包子 is used in the figurative sense here, if at all. Thank you. --WikiWinters (talk) 14:15, 2 May 2015 (UTC)
- It should probably be type=12 but I'm not 100% sure. --Anatoli T. (обсудить/вклад) 14:26, 2 May 2015 (UTC)
How can we fix the abbreviation in the pinyin line in the example sentence? ---> Tooironic (talk) 08:09, 4 May 2015 (UTC)
- I'm not sure what you mean. Which abbreviation? --Anatoli T. (обсудить/вклад) 09:09, 4 May 2015 (UTC)
- Also, are the commas the correct type? --WikiWinters (talk) 11:40, 4 May 2015 (UTC)
- The original ﹑ was correct but I can't fix it. The best I can do is:
- But as you can see, there are extra spaces, and commas are not converted correctly that way. --Anatoli T. (обсудить/вклад) 12:00, 4 May 2015 (UTC)
- I fixed it. You were using the incorrect commas. You should use "、", not "﹑" (difference is barely noticeable, but one of them is slightly lower or higher). --WikiWinters (talk) 12:05, 4 May 2015 (UTC)
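The two commas in this thread look nearly identical on screen but are separate characters. A minimal Python sketch for telling them apart, assuming the recommended comma is U+3001 and the look-alike is U+FE51; the helper names `describe` and `fix_commas` are made up for illustration and are not part of any Wiktionary module:

```python
import unicodedata

IDEOGRAPHIC_COMMA = "\u3001"        # the form recommended in the thread
SMALL_IDEOGRAPHIC_COMMA = "\uFE51"  # the look-alike (assumed code point)


def describe(ch: str) -> str:
    """Return the code point and Unicode name of a single character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"


def fix_commas(text: str) -> str:
    """Replace every small ideographic comma with the regular one."""
    return text.replace(SMALL_IDEOGRAPHIC_COMMA, IDEOGRAPHIC_COMMA)


if __name__ == "__main__":
    print(describe(IDEOGRAPHIC_COMMA))
    print(describe(SMALL_IDEOGRAPHIC_COMMA))
```

Running `fix_commas` over an example sentence before saving it would avoid the mix-up, though whether the usex template treats the two characters differently is only what the thread suggests, not something this sketch verifies.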
Thanks for introducing me to this handy template. :) ‑‑ Eiríkr Útlendi │ Tala við mig 06:40, 5 May 2015 (UTC)
- @Eirikr You're welcome. It has to be used with care when automatic conversion doesn't work or produces undesired results, e.g. 著急/着急 (zháojí) takes a parameter after "/", otherwise, it displays no simplified form: 著急 (zháojí). The documentation for {{zh-l}} shows the usage but not the traps. --Anatoli T. (обсудить/вклад) 06:45, 5 May 2015 (UTC)
- Good to know. Thanks again! ‑‑ Eiríkr Útlendi │ Tala við mig 06:47, 5 May 2015 (UTC)
Thank you so much !
Thank you for explaining :) Adjutor101 (talk) 05:13, 7 May 2015 (UTC)
- Hello! Does including the sense "Character in Shakespeare's play" comply with the rule at Wiktionary:Criteria for inclusion#Fictional universes?--Cinemantique (talk) 06:44, 12 May 2015 (UTC)
- Hello. Strictly speaking, no, but personally I am relaxed about such entries. It could be reworked into an entry about the female given name. I think it would pass {{rfd}} (request for deletion) and be kept, judging by similar cases such as Talk:Snow Queen. --Anatoli T. (обсудить/вклад) 06:51, 12 May 2015 (UTC)
- What makes Darth Vader worse than Snow White? I also think he is worse, but how would one formulate that in a rule?..--Cinemantique (talk) 10:21, 12 May 2015 (UTC)
- I really don't know what makes him worse. Besides the criteria for inclusion, there are also votes on the deletion pages. That is where such cases get decided. For now, votes carry more weight than the rules. --Anatoli T. (обсудить/вклад) 21:46, 12 May 2015 (UTC)
Help needed
Dear Atitarev, I need help with this category: https://en.wiktionary.org/wiki/Category:ps:Tribes Could you please help? Thank you so much! Adjutor101 (talk) 09:42, 15 May 2015 (UTC)
- @Adjutor101 Hi. The generic Category:Tribes doesn't exist. You can use Category:Nationalities or Category:Demonyms. You can create Category:ps:Nationalities or Category:ps:Demonyms just by looking at the English Category:en:Nationalities or Category:en:Demonyms, replacing the language code and moving your entries. Creating brand new categories is not so straightforward, unfortunately. (BTW, just call me Anatoli). --Anatoli T. (обсудить/вклад) 13:59, 15 May 2015 (UTC)
- Oh okay thank you :) The problem is there are like dozens of tribes and various clans and a lot of clan-specific jargon in Pashto, so the word "Demonym", although close to tribes/clans, is still not appropriate to it :( Adjutor101 (talk) 14:20, 15 May 2015 (UTC)
- @Adjutor101 Done. I've added a "tribes" section to make Category:Tribes in Module:category tree/topic cat/data/People in diff. I hope the category tree is correct but someone may fix it if it's not. I didn't add any description of the category either. --Anatoli T. (обсудить/вклад) 14:34, 15 May 2015 (UTC)
- OmG thank you so much !!!
The Linguistic Barnstar
Thank you for all your many and gracious efforts to improve Wiktionary. Adjutor101 (talk) 16:19, 15 May 2015 (UTC)
a or ɑ in pinyin
Hi Atitarev. In pinyin we always use a and not ɑ. Most textbooks I can find use ɑ, and a Chinese teacher I asked told me that ɑ is the correct form. Do you know why we are not using ɑ? The same goes for ā and ɑ̄ and other diacritic forms of a. Kinamand (talk) 18:48, 20 May 2015 (UTC)
- Pinyin uses standard Roman/Latin letters, including "a" with standard macrons ā, ē, ī, ō and ū. Some computers, systems, typewriters don't have standard diacritics and they use something else, also tone numbers instead of tone marks. Using "ɑ" instead of "a" doesn't make any difference in the pronunciation but it should be used in IPA. We only use "a" at Wiktionary, which is more common. Also, to mark the third tone, we use caron or háček (ˇ), e.g. "mǎ", not the rounded breve (˘). In Pinyin#Tones they mention that the use of "ɑ" is not a standard. --Anatoli T. (обсудить/вклад) 00:25, 21 May 2015 (UTC)
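The convention described in the reply above (plain "a" rather than "ɑ", and a caron rather than a breve for the third tone) can be applied mechanically to text copied from other sources. A minimal Python sketch, assuming the input uses U+0251 for "ɑ" and U+0306 for the breve; `normalize_pinyin` is a hypothetical helper, not an existing Wiktionary function:

```python
import unicodedata

LATIN_ALPHA = "\u0251"      # ɑ, the single-storey letter from IPA
COMBINING_BREVE = "\u0306"  # rounded mark sometimes used for the third tone
COMBINING_CARON = "\u030C"  # háček, the standard third-tone mark


def normalize_pinyin(text: str) -> str:
    """Rewrite ɑ as a and breve as caron, returning composed (NFC) text."""
    # Decompose first so precomposed letters such as ă expose their marks.
    decomposed = unicodedata.normalize("NFD", text)
    decomposed = decomposed.replace(LATIN_ALPHA, "a")
    decomposed = decomposed.replace(COMBINING_BREVE, COMBINING_CARON)
    return unicodedata.normalize("NFC", decomposed)


if __name__ == "__main__":
    print(normalize_pinyin("m\u0251\u0306"))  # ɑ + breve
    print(normalize_pinyin("m\u0251\u0304"))  # ɑ + macron
```

The sketch only touches lowercase "ɑ" and the breve; text that is already in the standard form passes through unchanged.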
No consensus about the 'Altaic/Asiatic' etymology. Would you care to leave a comment to the discussion at WT:ES? Thanks. Hirabutor (talk) 20:56, 21 May 2015 (UTC)
When you get the time, can you help me add the traditional characters for 高嶺 (高岭) in the English and French etymology sections? I was having trouble with it. Thanks. ---> Tooironic (talk) 02:20, 22 May 2015 (UTC)
Доброе утро!
Здравствуйте Анатолий!
Ты не можешь объяснить то, что являются различиями (отличиями?) между следующими словами: "должен", "надо", "нужно"? Когда именно можно их употреблять? Есть ли вообще различия, что ли? 82.217.116.224 07:01, 1 June 2015 (UTC)
- до́лжен (dólžen) (masculine form) is used to render the English 1) "must", "to have to" or 2) "to owe". Я должен/должна идти - I must go. See the entry.
- на́до (nádo), more formally ну́жно (núžno) or, even more formally, необходи́мо (neobxodímo) can mean "it's necessary (that)", "(one) needs" but can also mean "must", "have to", "должен" has stronger meanings - "мне нужно идти" is weaker than "я должен/должна идти" - I need to go, I must go. ну́жно (núžno) and необходи́мо (neobxodímo) are also neuter short forms of ну́жный (núžnyj) and необходи́мый (neobxodímyj), so they can mean that something of neuter gender is needed/necessary, e.g. "мне нужно масло" - I need oil/butter. --Anatoli T. (обсудить/вклад) 09:40, 1 June 2015 (UTC)
Thanks, so you say that of the three, должен is the strongest and that the others, including необходимо perhaps, are kind of equal in 'strongness'? By the way, was my Russian any good? I try to make sentences, but I'm never sure whether the words I use or the sentence structure are really Russian. 82.217.116.224 15:41, 1 June 2015 (UTC)
- Hi,
- I'll get back to you later with corrections. The main thing is that I understand what you're asking, although there are some issues with your phrases. --Anatoli T. (обсудить/вклад) 05:28, 2 June 2015 (UTC)
- Yes. That's right about the strength.
- Your Russian is OK, comprehensible. You'll only get better by using more of it. :)
- Here's my corrected/suggested version:
- Здравствуй Анатолий! (without "те", since you use "ты")
- Ты не можешь объяснить (OR: ты не мог бы объяснить), в чём различие/отличие между следующими словами: "должен", "надо", "нужно"? Когда именно можно их употреблять? Есть ли вообще различия, что ли? --Anatoli T. (обсудить/вклад) 10:58, 2 June 2015 (UTC)
Thank you so much! I didn't know you could say в чём like that and as you can probably tell, I was struggling with it. You know, I was also really doubting about using a construction with бы or not and the same thing with что ли. I concluded it would be okay, but apparently it's not. Anyway, for some (wrong) reason, I always say здравствуйте, even in familiar cases. I know it's wrong, I'll make sure to do it properly next time :) At least I didn't butcher your language in every sentence... 82.217.116.224 07:20, 3 June 2015 (UTC)
- You're welcome. No-no, you don't sound like you're butchering the language. "что является различиeм..." (but singular) is also grammatically correct but sounds too formal. Note that I called my version "corrected/suggested" and "что ли" (pronounced "што ли") expresses surprise or doubt, sometimes is equal to the English "or something" or "as if" - "Ты не знаешь, что ли?" - "Don't you know?"; "Пойти домой, что ли?" (asking oneself in doubts) - "Should I go home?" --Anatoli T. (обсудить/вклад) 07:36, 3 June 2015 (UTC)
What happened here? I can't work it out. ---> Tooironic (talk) 10:09, 3 June 2015 (UTC)
- Done --Anatoli T. (обсудить/вклад) 19:23, 3 June 2015 (UTC)
- Thank you! One more thing - any idea how to emphasise 周朝 in the example sentence at 周朝? If I add {Zhōu} it seems to disable it. ---> Tooironic (talk) 01:42, 4 June 2015 (UTC)
While you're out and about solving the world's problems, do you think you could spend a minute to de-WF this entry? Chuck Entz (talk) 14:05, 3 June 2015 (UTC)
- Done. --Anatoli T. (обсудить/вклад) 19:24, 3 June 2015 (UTC)
- большо́е спаси́бо! Chuck Entz (talk) 00:47, 4 June 2015 (UTC)
numbers in the example sentences
Could you help me take a look at the example sentence I added at 持有? I'm not sure how to get the pinyin to display correctly for the 24% bit. ---> Tooironic (talk) 05:40, 6 June 2015 (UTC)
- Done. --Anatoli T. (обсудить/вклад) 12:16, 6 June 2015 (UTC)
- It's still not coming up as pinyin to me. ---> Tooironic (talk) 08:14, 19 June 2015 (UTC)
- Encountered the same issue at 以來. ---> Tooironic (talk) 08:15, 19 June 2015 (UTC)
- Why do you expect numbers to be converted to pinyin?! --Anatoli T. (обсудить/вклад) 08:16, 19 June 2015 (UTC)
- That's how it's always been done whenever I've seen pinyin transcriptions. Is this function not built into the script? ---> Tooironic (talk) 10:13, 22 June 2015 (UTC)
- Of course not. Another module would be required to do this but I don't think it's necessary. Chinese numerals are simple but try doing this for English or some other language. You can spell the numbers out in hanzi (also in brackets), though. I wouldn't call this an "issue" but a feature request. --Anatoli T. (обсудить/вклад) 10:24, 22 June 2015 (UTC)
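To illustrate what such a feature request would involve, here is a rough Python sketch that spells out whole numbers from 0 to 99 in pinyin. It is not how the existing module works (per the reply above, the module has no such function); the name `number_to_pinyin` and the restriction to 0-99 are assumptions made for this example, and percent signs, larger numbers and tone sandhi are deliberately left out:

```python
# Citation-tone pinyin for the digits 0-9.
DIGITS = ["líng", "yī", "èr", "sān", "sì", "wǔ", "liù", "qī", "bā", "jiǔ"]


def number_to_pinyin(n: int) -> str:
    """Spell out an integer from 0 to 99 in pinyin, written as one word."""
    if not 0 <= n <= 99:
        raise ValueError("only 0-99 are handled in this sketch")
    if n < 10:
        return DIGITS[n]
    tens, ones = divmod(n, 10)
    # 10-19 are read "shí", "shíyī", ... without a leading "yī".
    prefix = "" if tens == 1 else DIGITS[tens]
    suffix = "" if ones == 0 else DIGITS[ones]
    return prefix + "shí" + suffix


if __name__ == "__main__":
    print(number_to_pinyin(24))
```

Even this small range shows why the reply calls it a separate feature: reading "24%" correctly would also need rules for the percent sign and for word order, which the sketch does not attempt.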
Questions
Hello! How do you create entries here? Which mechanism is the most convenient to use?--Cinemantique (talk) 17:34, 28 June 2015 (UTC)
- Hello. It depends on the language. Most people use existing entries as a template. For Chinese, Japanese, Korean and Vietnamese there are also templates for accelerated creation. In addition, for Russian I sometimes use the template that creates an entry from an English translation, but it won't help with declension or conjugation. Are you interested in Russian entries? I see that you have already got the hang of the mechanism. --Anatoli T. (обсудить/вклад) 20:28, 28 June 2015 (UTC)
- Thanks, I'll work on Russian. Yes, Entry Creator is an interesting script, but I don't find it very convenient. I'll probably make myself a boilerplate; I just need to work out which templates are used here. Are the existing inflection templates current, or are any of them outdated?--Cinemantique (talk) 20:38, 28 June 2015 (UTC)
- {{ru-verb-1-pf}} and {{ru-verb-1-impf}} are outdated. You should use the templates of Module:ru-verb with a specific conjugation type and subtype (you are already using them). Unfortunately, the templates are poorly documented, but I don't think you will have trouble determining the type and subtype. The templates for the other parts of speech are all current. Verb entries are needed most, as the red links in Appendix:Russian_Frequency_lists/3001-4000 (and higher) and Appendix:Frequency dictionary of the modern Russian language (the Russian National Corpus) show. --Anatoli T. (обсудить/вклад) 20:49, 28 June 2015 (UTC)
- Tell me, please, are the entries for inflected forms created by bots, or are they churned out by hand?--Cinemantique (talk) 20:51, 28 June 2015 (UTC)
- By hand; KoreanQuoter is making them at the moment, I don't make many. The only acceleration (not a bot) is for comparative forms of adverbs. I have to go now; I'll answer when I get back from work if there are questions. --Anatoli T. (обсудить/вклад) 20:54, 28 June 2015 (UTC)
- Please have a look at грант. Is the quotation formatted correctly?--Cinemantique (talk) 13:13, 29 June 2015 (UTC)
- I did it like this, but I'm not the ultimate authority :) --Anatoli T. (обсудить/вклад) 13:29, 29 June 2015 (UTC)
- Anatoli, that's not канонічно. You need to use {{Q}}. --Vahag (talk) 17:25, 29 June 2015 (UTC)
- Thanks, Vahag. Why did you write "канонічно"? --Anatoli T. (обсудить/вклад) 20:49, 29 June 2015 (UTC)
- Thanks. It would be nice if the transliteration were produced automatically too.--Cinemantique (talk) 20:54, 29 June 2015 (UTC)
- It is produced; there are just exceptions there: neobxodímovo, odnovó. I copied the automatic one and corrected g to v. --Anatoli T. (обсудить/вклад) 20:58, 29 June 2015 (UTC)
- Anatoli, see http://lurkmore.to/Канонічно. Russian memes reach Australia more slowly than Armenia :) --Vahag (talk) 21:22, 29 June 2015 (UTC)
- Anatoli, why on earth does ССР "sound" like [ысысэр]? I don't like that at all.--Cinemantique (talk) 21:12, 9 July 2015 (UTC)
- Probably for the same reason that экран sounds like «ыкра́н». --Anatoli T. (обсудить/вклад) 12:09, 11 July 2015 (UTC)
- Here is another excerpt from Avanesov's book. Look at «э → В начале слова в предударных слогах» ("э → word-initially in pretonic syllables"); thus the first sound in эква́тор is the same as the first sound in э́тот, while in экраниза́ция and especially in эмигра́нт this sound can have an и-like offglide. Shall I send you the whole book?--Cinemantique (talk) 13:54, 11 July 2015 (UTC)
- I believe you, although it's strange that reduction isn't allowed. In that case, does "эсэсэ́р" still not fall under this rule, or are there other rules? I'm not strong in Lua, but if the case is unambiguous, add it to the tests (let's switch to the informal "ты"). Could you send the full book, if it's no trouble? I hope someone will be willing to fix the module, but we need clear rules, as with final "е". --Anatoli T. (обсудить/вклад) 14:05, 11 July 2015 (UTC)
- Link to the book. I can bring in an editor, but a bit later; he's busy right now.--Cinemantique (talk) 15:24, 11 July 2015 (UTC)
- Thanks. For help with code you can sometimes ask at the Greasepit (in the current month), if nothing else works. --Anatoli T. (обсудить/вклад) 15:37, 11 July 2015 (UTC)
Proto-Central Eastern Polynesian
Hi, I'd like to add Proto-Central Eastern Polynesian (or just Proto-Central Polynesian) as a reconstructed language. Wehewehe and other Polynesian sites and articles do on occasion provide reconstructions for it. Is there any way you could add it? DerekWinters (talk) 03:57, 30 June 2015 (UTC)
- And also Proto-Eastern Polynesian. Perhaps poz-ple-pro for Eastern, and poz-plc-pro for Central? DerekWinters (talk) 04:03, 30 June 2015 (UTC)
- See this discussion, which is advocating going in the opposite direction. This could get out of hand: we may not have anything for central-eastern or central, but we do have poz-pep-pro for Eastern, and we used to also have pqe-pol-pro- we don't need a third! Chuck Entz (talk) 12:12, 30 June 2015 (UTC)
- What Chuck said. --Anatoli T. (обсудить/вклад) 13:28, 5 July 2015 (UTC)
the 尊稱s
Do we have a category to put the 尊稱s like 師長 etc.? ---> Tooironic (talk) 08:29, 30 June 2015 (UTC)
- Probably not. --Anatoli T. (обсудить/вклад) 13:28, 5 July 2015 (UTC)
Does Chinese even have a word for this concept? ---> Tooironic (talk) 04:46, 5 July 2015 (UTC)
- No idea. --Anatoli T. (обсудить/вклад) 13:28, 5 July 2015 (UTC)
Any idea why the Mandarin translations I entered here didn't come up with the mini-zh-link? ---> Tooironic (talk) 06:37, 7 July 2015 (UTC)
- I think the way it works is that the mini (zh) links appear only if those pages also exist on the zh Wiktionary. —suzukaze (t・c) 06:41, 7 July 2015 (UTC)
- Yeah, Carl, this was done ages ago. The minilink to other wiktionaries is only added when entries there actually exist. That's why it's ALWAYS better to use the JavaScript translation adder tool to add translations, not manually. Then it will automatically determine whether to use {{t}} (without the link) or {{t+}} (with the link). --Anatoli T. (обсудить/вклад) 07:40, 7 July 2015 (UTC)
- Gotcha. Thanks! ---> Tooironic (talk) 05:03, 8 July 2015 (UTC)
The pinyin should be "bùyòng" (bu2 instead of bu4), right? —suzukaze (t・c) 20:36, 7 July 2015 (UTC)
- Yes, "bùyòng" (bu4 yong4). We write nominal tones. In case of 不 and 一, the characters themselves should be used in pronunciation sections. (This also applies to characters with variant mainland China/Taiwan readings). --Anatoli T. (обсудить/вклад)
Hi, Anatoli!
I noticed today that three months ago you added the reading umezake for 梅酒, which had however been removed (as ume-sake at the time) in February 2014 as a fabrication. To my surprise, the Japanese Wikipedia, whose articles usually abound in the rarest and least common readings, does not include it this time. The Japanese-German dictionary (和独辞典) that I use doesn't include this reading either. Did you come across it in some source? Or in a text with furigana? The uſer hight Bogorm converſation 16:43, 8 July 2015 (UTC)
- Hi!
- I no longer remember, but a Google search for "うめざけ" does return results. For example, [4] says that it can also be pronounced "ぱいしゅ", and [5] that it can be pronounced "ばいしゆ". These aren't the best sources, but they are Japanese. It seems to me that four variants are possible, and all of them need to be checked. --Anatoli T. (обсудить/вклад) 20:49, 8 July 2015 (UTC)
Canonicalizing Russian based on the translit and removing redundant translit, and bugs in Module:ru-translit
I've written all this code to add vowels to Arabic text based on the translit and then remove/canonicalize redundant translit, and it occurred to me something similar could be done for Russian. A simple example of this would convert e.g. зонтик with manual translit of zóntik to зо́нтик and remove the manual translit. A slightly more complex case that I will also handle is a word like нету with manual translit nyétu, where ye is recognized as a variant of e and thus нету converted to не́ту and nyétu removed. An even more complex case I will also handle is where the manual translit can't be removed entirely but can be canonicalized, with ye after a consonant converted to e, accents transferred from Cyrillic to Latin, etc. Russian actually appears much simpler than Arabic in this regard (esp. since there are a whole bunch of differing ways of transcribing Arabic that are used in various parts of Wiktionary), and since I already wrote the Arabic code, it was easy to convert it to work on Russian. I wrote the back end of this code and it works, although I haven't finished the front-end bot or done a test run yet on actual pages (i.e. a run that outputs what would be changed without changing anything). In the process I have a few questions:
- Module:ru-translit pays attention to grave accents as well as acute accents. When do grave accents occur?
- Where is tr_adj() in the module used? If this is used, it will add a bit of complexity to the front end.
- Your Russian translit scheme has a special case to handle ё and ю after certain consonants by rendering them without a j. I currently handle this in my code using a post-processing step on the Latin translit that looks like this:
text = rsub(text, u"([žčšŽČŠ])j([ouóú])", u"\\1\\2")
This will canonicalize all instances in the manual translit by removing j in these circumstances. Is this safe? I.e. are there ever cases where the manual translit can legitimately have a j in these circumstances?
- Are there cases where ё (similarly for Ё) can legitimately be rendered in manual translit by (j)o without an acute accent? I presume there are, and so I recognize e.g. jo/yo/o as possible manual translits of ё but don't canonicalize them to have an accent.
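The j-removal step quoted above can be sketched in standalone Python 3 as follows. This is only an illustration under stated assumptions, not the bot's code: `drop_redundant_j` is a hypothetical name, the standard `re.sub` stands in for the bot's `rsub` helper, and the input is normalized to NFC so that "ó" and "ú" are single characters the class can match.

```python
import re
import unicodedata


def drop_redundant_j(translit: str) -> str:
    """Remove a "j" that sits between ž/č/š and o/u (plain or stressed).

    Hypothetical helper for illustration; the real bot may differ.
    """
    # Compose accents first so that "o" + U+0301 becomes the single
    # character "ó" and is matched by the character class below.
    text = unicodedata.normalize("NFC", translit)
    # Raw strings keep \1 and \2 as backreferences, not octal escapes.
    return re.sub(r"([žčšŽČŠ])j([ouóú])", r"\1\2", text)
```

For example, `drop_redundant_j("parašjút")` gives "parašút", while a "j" after any other consonant (as in "ljublju") is left alone.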
I also noticed a few apparent bugs in Module:ru-translit:
- There's apparently a stray % sign in the list of vowels in replace_e().
- The code in export.tr() that makes use of replace_e() recognizes Cyrillic е phrase-initially and after vowels, but not after spaces or dashes, nor after vowels followed by an accent.
NOTE: I may be busy and not able to respond immediately to any responses you post.
Benwing (talk) 09:33, 11 July 2015 (UTC)
- @Benwing. Thank you. I'll be happy if you only fix transliterations of English translations into Russian in the main space.
- First of all, like Arabic, Korean, Hindi, etc. not all manual transliterations should be removed. There are cases described in WT:RU TR.
- I'm not sure why we need grave accents, could you give some examples? If you can't find any, I think they should all be converted to acute accents. Grave accents are used in Bulgarian and by Russian Module:ru-pron to show a secondary accent.
- tr_adj() is used by Russian adjectives, (adjective-like) participles and should be used by some pronouns and numerals where final -ого/-его (with or without accents) are transliterated as -ovo/-(j)evo ("v", not "g") in genitive or accusative (animate) endings, masculine and neuter. We use manual transliterations in headwords and translations when they occur. *Note that there are cases when -ого/-его endings are pronounced regularly in other parts of speech and some cases when these occur in the middle of words in derivations from inflected forms, such as сегодня.
- Not sure about replace_e(), I haven't noticed problems, e.g. на е́ли (na jéli), е́ле-е́ле (jéle-jéle), фра́ер (frájer) work fine.
- Cases when "ё" should be transliterated as jo/o (never yo) without the accent do happen but they are rare, also described in WT:RU TR at #4.
- The most common case when manual transliteration should be retained, apart from -ого/-его, is when "е" doesn't palatalise hard consonants. Only at Wiktionary do we supply such exceptions, which are completely unpredictable. E.g. текст (tekst) and тест (tɛst). Just watch for "ɛ" or "ɛ́".
- Russian multi-syllabic terms still need stress accents, even if they are manually transliterated, except for words with a stressed "ё", which should never get an accent mark. (In adult native running texts "ё" is replaced with "е" but we avoid them in translations, such entries are allowed, though, cf Arabic use of ا instead of أ or إ).
- It's OK to remove "j" from transliterations of "жю" and "шю".
- Please let me know if you have other questions. Your assumption about nyétu is correct. --Anatoli T. (обсудить/вклад) 13:00, 11 July 2015 (UTC)
- I actually changed the handling of "жю" and "шю" to be smarter so that cases like "жйу" will still get rendered as "žju" (not sure any such cases exist).
- However, another question has to do with Latin h. I canonicalize all occurrences of Latin sh ch zh kh to š č ž x in a pre-processing step. This could cause problems if there happen to be any sequences of e.g. сг that are pronounced /sɣ/ and should genuinely be rendered sh, but I assume they don't exist. Am I right?
- As for what's going to get changed, currently it will be instances involving transliterations of Russian terms plus most occurrences of the templates m, l, etc. To do this properly I need a few changes to Module:links, in export.full_link; in particular, I would need categories generated like Category:Terms with manual transliterations/ru, at least for ru, ar, grc and any other languages I'm going to do similar stuff on. (This is similar to the existing categories like Category:Terms with manual transliterations different from the automated ones/grc.) It's probably fine to create these categories for all languages but we can start with those three. I can give you the code to put in that function. Hopefully CodeCat won't object and unilaterally revert.
- The reason why I mention grave accents is because the code in Module:ru-translit has various places where grave accents are handled, in particular in tr_adj() (look for "\204\128", the UTF-8 for a grave accent). Perhaps grave accents can occur in the Russian text in places to indicate secondary stress?
- As for tr_adj(), can you tell me where the module or template code actually calls it? I can't find any such places. The only mention I find is one place where you ask Wikitiki89 (talk • contribs) to include it in Module:ru-adjective, but it doesn't look like it's there currently.
- As for not removing all manual translits, indeed my code is careful about this, it is just like in Arabic where there are various places where manual translit is required.
- As for replace_e(), this appears to work because you have that stray % sign causing %A to match all non-word characters. It doesn't match Latin A as a result, but this only matters if you have mixed Cyrillic/Latin text e.g. Aер (Ajer), which seems unlikely.
- BTW I looked through most of Category:Russian terms with irregular pronunciations to see what was triggering this, and I notice one case where you apparently have unnecessary manual translit: ванная. Also, брошюра and жюри are in this category only because they have been manually put in the category; is this intentional?
- Also, thanks for your help with this. Benwing (talk) 10:47, 12 July 2015 (UTC)
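The pre-processing of Latin sh ch zh kh mentioned above could look roughly like the sketch below. It is an assumption-laden illustration in Python 3: the name `canon_digraphs` and the case handling (an initial capital in the digraph gives a capital result) are not taken from the bot.

```python
import re

# Digraph spellings and their single-letter equivalents, as listed above.
DIGRAPHS = {"sh": "š", "ch": "č", "zh": "ž", "kh": "x"}


def canon_digraphs(translit: str) -> str:
    """Rewrite sh/ch/zh/kh to š/č/ž/x (hypothetical helper)."""

    def repl(match: re.Match) -> str:
        pair = match.group(0)
        single = DIGRAPHS[pair.lower()]
        # Assumption: preserve capitalisation of the first letter.
        return single.upper() if pair[0].isupper() else single

    return re.sub(r"[sczk]h", repl, translit, flags=re.IGNORECASE)
```

With this, "khorosho" becomes "xorošo" and "shchi" becomes "šči". A genuine s + h sequence would be rewritten too, which is exactly the risk raised in the question.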
- "жйу" doesn't occur but the logic is correct.
- I think preprocessing sh ch zh kh to š č ž x should be safe, as a separate "h" in Russian transliterations is extremely rare and happens between vowels.
- Grave accents could potentially be used for secondary accents but it's not the common practice. If it's only one in a word it should be replaced with an acute accent, otherwise I need to see such examples. We do use grave accents in automatic IPA, as I mentioned before.
- tr_adj() are called by templates in Category:Russian adjective inflection-table templates and similar participle templates.
- Yes, please retain the manual translit when it's required. If in doubt, please ask.
- Yes, брошюра, жюри and парашют were added manually, just for clarity to users, even if the module handles this exception.
- Fixed ванная. It doesn't need manual translit.
- Thank you! --Anatoli T. (обсудить/вклад) 11:19, 12 July 2015 (UTC)
- трёхэтажный is a rare case where "ё" is unstressed (as with other words prefixed with "трёх-" and "четырёх-") and shouldn't have "jó", just "jo"; the stress is on "а". --Anatoli T. (обсудить/вклад) 11:25, 12 July 2015 (UTC)
OK, I did a partial test run. It got through about 63,000 pages that contain {{t}} before hitting a programming error (I think there are about 75,000 pages with {{t}}; and about 56,600 with {{t+}}, but with a large overlap with the previous pages; and about 2,900 with {{t+check}}, again with a large overlap; and about 5,600 with {{t-check}} and < 100 with {{t-}}, again with overlap). On these 63,000 pages, it found 88,635 templates with Russian text. Whenever there is both Russian and Latin, it attempted to "match-canonicalize" by comparing the two strings character-by-character, copying non-matching accents and certain other things (e.g. certain non-matching punctuation) from one to the other, and applying additional "self-canonicalizations" to both Russian and Latin (e.g. stripping stray whitespace, converting certain look-alike Latin/Cyrillic characters from one to the other when in the "wrong" charset). When the match-canonicalization failed, the self-canonicalization was still applied. If the resulting Latin exactly matches the auto-transliteration of the Russian, it is removed from the template. Some stats from this run:
- 41,593 templates where match-canonicalization succeeded and the Russian was changed (mostly, accents added)
- 46,400 templates where redundant translit was removed
- 79 templates where match-canonicalization succeeded and the Latin was changed but left in place
- 1,995 templates where match-canonicalization failed
- 21 templates where match-canonicalization failed and the Russian was self-canonicalized to something else
- 510 templates where match-canonicalization failed and the Latin was self-canonicalized to something else
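The character-by-character accent transfer described above can be sketched as follows. This is a simplified Python 3 illustration, not the bot's implementation: the letter table is a deliberately incomplete assumption (it is not Module:ru-translit), all names are hypothetical, and only acute accents are copied from Latin to Russian. A mismatch returns None, corresponding to a "failed" match-canonicalization.

```python
import unicodedata

ACUTE = "\u0301"


def nfc(s):
    return unicodedata.normalize("NFC", s)


# Illustrative, incomplete table (an assumption): accepted Latin spellings
# for each Cyrillic letter, longest first so "je" is tried before "e".
_TABLE = {
    "а": ["a"], "б": ["b"], "в": ["v"], "г": ["g"], "д": ["d"],
    "е": ["je", "ye", "e", "ɛ"], "ё": ["jo", "yo", "o"], "ж": ["ž"],
    "з": ["z"], "и": ["i"], "й": ["j"], "к": ["k"], "л": ["l"],
    "м": ["m"], "н": ["n"], "о": ["o"], "п": ["p"], "р": ["r"],
    "с": ["s"], "т": ["t"], "у": ["u"], "ф": ["f"], "х": ["x"],
    "ц": ["c"], "ч": ["č"], "ш": ["š"], "щ": ["šč"], "ъ": ["ʺ"],
    "ы": ["y"], "ь": ["ʹ"], "э": ["e"], "ю": ["ju", "yu", "u"],
    "я": ["ja", "ya"],
}
CYR_TO_LAT = {nfc(k): [nfc(a) for a in v] for k, v in _TABLE.items()}


def split_stress(text):
    """Split text into (letter, has_acute) pairs, keeping other diacritics."""
    units = []
    for ch in unicodedata.normalize("NFD", text):
        if ch == ACUTE and units:
            units[-1][1] = True          # acute marks the previous letter
        elif unicodedata.combining(ch) and units:
            units[-1][0] += ch           # caron, breve, diaeresis, ...
        else:
            units.append([ch, False])
    return [(nfc(base), stressed) for base, stressed in units]


def transfer_stress(russian, latin):
    """Copy acute accents from the Latin onto the Russian, or return None."""
    lat = split_stress(latin)
    out, pos = [], 0
    for base, stressed in split_stress(russian):
        for alt in CYR_TO_LAT.get(base.lower(), [base.lower()]):
            chunk = lat[pos:pos + len(alt)]
            if "".join(b.lower() for b, _ in chunk) == alt:
                break
        else:
            return None                  # the two strings do not line up
        lat_stressed = any(s for _, s in chunk)
        # "ё" is left without an accent mark, as discussed in this thread.
        mark = (stressed or lat_stressed) and base.lower() != "ё"
        out.append(base + (ACUTE if mark else ""))
        pos += len(alt)
    return nfc("".join(out)) if pos == len(lat) else None
```

Under these assumptions, зонтик with zóntik yields зо́нтик, нету with nyétu yields не́ту, and вздор with vdor returns None, as in the first error of the sample log. Removing the translit when it equals the automatic transliteration would be a separate step.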
The important number here is the 2,000 or so templates where match-canonicalization failed. I did a bunch of work on the code and was able to reduce this to 1,044 cases, but these remaining cases are generally real errors. Here's a sampling of the first few:
Page 63 nonsense: t+.2: NOTE: Unable to match-canon вздор (vdor): Unable to match Russian character з at index 1, Latin character d at index 1: {{t+|ru|вздор|m|tr=vdor}}
Page 69 nonsense: t+.2: NOTE: Unable to match-canon вздор (vdor): Unable to match Russian character з at index 1, Latin character d at index 1: {{t+|ru|вздор|m|tr=vdor}}
Page 111 august: t+.2: NOTE: Unable to match-canon величавый (v'eličáv'yj): Unable to match Russian character ы at index 7, Latin character ' at index 9: {{t+|ru|величавый|tr=v'eličáv'yj}}
Page 344 abandonment: t+.2: NOTE: Unable to match-canon оставле́ние (leaving behind): Unable to match Russian character о at index 0, Latin character l at index 0: {{t+|ru|оставле́ние|n|tr=leaving behind|sc=Cyrl}}
Page 345 abandonment: t+.2: NOTE: Unable to match-canon отка́з (refusal): Unable to match Russian character о at index 0, Latin character r at index 0: {{t+|ru|отка́з|m|tr=refusal|sc=Cyrl}}
Page 346 abandonment: t+.2: NOTE: Unable to match-canon отка́з (denial): Unable to match Russian character о at index 0, Latin character d at index 0: {{t+|ru|отка́з|m|tr=denial|sc=Cyrl}}
Page 355 abash: t+.2: NOTE: Unable to match-canon конфузить (konfúžit’): Unable to match Russian character з at index 5, Latin character ž at index 6: {{t+|ru|конфузить|impf|tr=konfúžit’|sc=Cyrl}}
Page 358 abash: t+.2: NOTE: Unable to match-canon сконфузиться (skonfúžit’sja): Unable to match Russian character з at index 6, Latin character ž at index 7: {{t+|ru|сконфузиться|pf|tr=skonfúžit’sja|sc=Cyrl}}
Page 574 about: t+.2: NOTE: Unable to match-canon об (o / ob): Unable to match Russian character б at index 1, Latin character at index 1: {{t+|ru|об|tr=o / ob}}
Page 577 about: t+.2: NOTE: Unable to match-canon об (o / ob): Unable to match Russian character б at index 1, Latin character at index 1: {{t+|ru|об|tr=o / ob}}
Page 762 absolution: t-check.2: NOTE: Unable to match-canon [[отпущение]] ([[грех|грехов]]) (otpuščénije grexóv): Unable to match Russian character ( at index 14, Latin character g at index 13: {{t-check|ru|[[отпущение]] ([[грех|грехов]])|n|tr=otpuščénije grexóv|sc=Cyrl}}
Page 890 season: t+.2: NOTE: Unable to match-canon специя (spéčija): Unable to match Russian character ц at index 3, Latin character č at index 4: {{t+|ru|специя|f|tr=spéčija|sc=Cyrl}}
Page 1123 parole: t.2: NOTE: Unable to match-canon УДО (u-de-ó): Unable to match Russian character Д at index 1, Latin character - at index 1: {{t|ru|УДО|n|tr=u-de-ó}}
Page 1200 CIA: t+.2: NOTE: Unable to match-canon ЦРУ (ce-er-ú): Unable to match Russian character Р at index 1, Latin character e at index 1: {{t+|ru|ЦРУ|n|tr=ce-er-ú}}
Page 1311 BCE: t.2: NOTE: Unable to match-canon до н. э. (do nášej éry): Unable to match Russian character at index 5, Latin character a at index 4: {{t|ru|до н. э.|tr=do nášej éry}}
Page 1330 point: t+.2: NOTE: Unable to match-canon край (óstrij kraj): Unable to match Russian character к at index 0, Latin character o at index 0: {{t+|ru|край|m|tr=óstrij kraj}}
Page 1337 sun: t+.2: NOTE: Unable to match-canon солнце (sónce): Unable to match Russian character л at index 2, Latin character n at index 3: {{t+|ru|солнце|n|tr=sónce}}
Page 1505 pissed: t+.2: NOTE: Unable to match-canon пьяный (buxój): Unable to match Russian character п at index 0, Latin character b at index 0: {{t+|ru|пьяный|tr=buxój|sc=Cyrl}}
The errors are quite assorted and will need to be fixed by hand. (Note that the code is smart about many things, e.g. in the entry above under "absolution", it is able to correctly handle embedded links, which aren't the issue; rather, there are unmatched parens on the Russian side, although I could probably fix this by transferring the parens to the Latin side.) When I do a full run I'm planning on putting all the errors on a page; you can edit that page (e.g. removing the translit if it's unnecessary, else correcting the Latin and/or Russian) and I will run a bot over the changes on that page to fix up the relevant entries. (Occasionally, more work will be needed on the page, e.g. in the entry above "Page 1330 point", the actual Russian entry has {{t+|ru|острый}} {{t+|ru|край|m|tr=óstrij kraj}}; this should be converted to {{t+|ru|[[острый]] [[край]]|m}} but this will take some work to do automatically and might end up requiring manual fixup.)
A few issues:
- Is it safe to match-canonicalize Latin b to v opposite Cyrillic в? I assume so. This occurs many times, e.g. in раздваивать vs. razdvaibat', and I assume it's always an error.
- When the Latin contains a comma and the Russian doesn't, I currently split the template. For example, {{t+|ru|Катар|m|tr=Kátar, Katár}} becomes {{t+|ru|Катар|m|tr=Kátar}}, {{t+|ru|Катар|m|tr=Katár}} and then {{t+|ru|Ка́тар|m}}, {{t+|ru|Ката́р|m}} after accenting the Russian and removing the now-redundant transliteration. Do you think this is a reasonable thing to do?
- Is it OK to self-canonicalize Latin ɛ to e when not after a consonant, e.g. paciɛ́nt -> paciént or tɛ-ɛs-žé -> tɛ-es-žé? I assume so, because in these circumstances the ɛ is redundant.
- What about multiple acute accents in a word? Currently I refuse to match-canonicalize if the Russian has multiple accents (Блу́мфонте́йн or ла́биодента́льный); the reason is that it doesn't always work that well with the template splitting I mentioned above, e.g. template {{t|ru|Блу́мфонте́йн|m|tr=Blúmfontɛjn, Blumfontɛ́jn}} would be split eventually into {{t|ru|Блу́мфонте́йн|m|tr=Blúmfontɛ́jn}}, {{t|ru|Блу́мфонте́йн|m|tr=Blúmfontɛ́jn}} with the same template twice. But if the Latin has multiple accents (e.g. rývók or zapóminátʹ) I go ahead and match-canon, meaning the Russian will end up with multiple accents and the Latin will probably be removed as redundant. What do you think about this?
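The template splitting described in the list above (one template per comma-separated translit) can be sketched like this. It is a rough Python 3 illustration with a hypothetical function name; it assumes a simple template with no embedded links or nested templates, and it leaves the template untouched when the Russian term itself contains a comma.

```python
import re


def split_on_translit_comma(template: str) -> list:
    """Return one template per comma-separated value of tr= (sketch only)."""
    match = re.search(r"\|tr=([^|}]*)", template)
    if not match or "," not in match.group(1):
        return [template]
    # Assumption: the term is the third pipe-separated field and contains
    # no [[a|b]] links, so a plain split on "|" is good enough here.
    fields = template.strip("{}").split("|")
    if len(fields) > 2 and "," in fields[2]:
        return [template]
    variants = [v.strip() for v in match.group(1).split(",")]
    # Rebuild the template once per variant, swapping only the tr= value.
    return [
        template[:match.start(1)] + variant + template[match.end(1):]
        for variant in variants
    ]
```

Applied to {{t+|ru|Катар|m|tr=Kátar, Katár}} this gives the two templates shown in the example; accenting the Russian and dropping the redundant translit would then be done per template.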
BTW you asked about grave accents; they do occur occasionally, and I convert them to acute if there's no acute accent elsewhere in the word.
Benwing (talk) 07:02, 15 July 2015 (UTC)
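The grave-to-acute conversion mentioned just above ("convert them to acute if there's no acute accent elsewhere in the word") might be sketched as below. This is a hedged Python 3 illustration: the function name is hypothetical, and treating a "word" as a space-separated chunk is an assumption.

```python
import unicodedata

GRAVE, ACUTE = "\u0300", "\u0301"


def grave_to_acute(text: str) -> str:
    """Turn grave accents into acute ones in words that have no acute."""
    # Decompose so that accents are separate combining characters.
    words = unicodedata.normalize("NFD", text).split(" ")
    fixed = [
        word if ACUTE in word else word.replace(GRAVE, ACUTE)
        for word in words
    ]
    return unicodedata.normalize("NFC", " ".join(fixed))
```

A word that already carries an acute accent keeps its grave, on the reading that the grave there marks secondary stress.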
- In my opinion (Anatoli may or may not disagree on a few points):
- It is safe to match virtually any mistake in the transliterated consonant: b for в, ž for з, č for ц, etc. This is always a mistake in the transliteration.
- Yes, that's the right thing to do.
- Yes.
- There are three separate things going on here:
- Блу́мфонте́йн: This is meant to indicate two different stress variants. Ideally, it would be nice to split these to end up with {{t|ru|Блу́мфонтейн|m|tr=Blúmfontɛjn}}, {{t|ru|Блумфонте́йн|m|tr=Blumfontɛ́jn}}. (Incidentally, in this particular case, I find the variant Блу́мфонтейн very dubious. It appears on the Russian Wikipedia entry w:ru:Блуфонтейн, but I find this pronunciation hard to imagine and the couple online dictionaries I just checked give only Блумфонте́йн.)
- ла́биодента́льный: The first stress mark is a secondary stress. I forget if our policy is to indicate secondary stress with a grave accent, with another acute accent, or without any accent, but this can only be fixed manually. So your bot should just canonicalize it as lábiodɛntálʹnyj.
- rývók and zapóminátʹ: Both of these are mistakes in the transliteration and should just be ryvók and zapominátʹ. So in these cases, you should ignore the transliterated stress if the Cyrillic has stress marks, otherwise they need to be fixed manually.
- --WikiTiki89 14:43, 15 July 2015 (UTC)
- Thanks for your comments. I think you're generally right about mistakes being in the translit rather than the Russian but I've found a few exceptions: абстракный (abstráktnyj), сзбить (sbitʹ), сrерео (stéreo), водвигнуть (vozdvígnut' ). Benwing (talk) 15:34, 15 July 2015 (UTC)
- Well mistakes like that would have to be fixed manually anyway, so your bot shouldn't worry about them. --WikiTiki89 15:52, 15 July 2015 (UTC)
- The code is pretty much ready now but a couple more issues:
- About stressed and unstressed ё: Since ё can be unstressed, should we add an accent on it when we know it's stressed (i.e. when the corresponding Latin has a stress mark on the vowel)? Currently I don't do that, but we may be losing information this way.
- Normalizing no-break-space (NBSP) to regular space: I do currently do that, both in the Russian and the Latin, but I wonder whether it's right. NBSP randomly occurs in place of regular space between words in multi-word Russian expressions, and it also occurs fairly often directly before a hyphen in Russian text like "фоо - бар". The alternative is to leave the NBSP, and allow a regular space to match against it (which happens often), and canonicalize the regular space in the Latin to NBSP. What do you think?
Benwing (talk) 08:44, 17 July 2015 (UTC)
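The two NBSP options raised above can be compared with a short sketch. This is an illustrative Python 3 fragment with hypothetical names; the second function assumes the nth space in the Russian corresponds to the nth space in the Latin, and leaves the Latin alone when the counts differ.

```python
NBSP = "\u00a0"


def to_plain_spaces(text: str) -> str:
    """Option 1: normalize every NBSP to a regular space."""
    return text.replace(NBSP, " ")


def mirror_nbsp(russian: str, latin: str) -> str:
    """Option 2: keep NBSP in the Russian and copy it into the Latin."""
    ru_spaces = [ch for ch in russian if ch in (" ", NBSP)]
    chars = list(latin)
    positions = [i for i, ch in enumerate(chars) if ch in (" ", NBSP)]
    if len(ru_spaces) != len(positions):
        return latin                     # cannot pair the spaces up safely
    for space, i in zip(ru_spaces, positions):
        chars[i] = space                 # Latin space takes the Russian kind
    return "".join(chars)
```

Option 1 is what the bot currently does for both strings; option 2 keeps the typographic distinction in running text, which matters if NBSP is to be left alone in usage examples.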
- @Atitarev, Wikitiki89 I haven't heard from either of you in a few days. What do you think about these last two issues? Benwing (talk) 10:07, 20 July 2015 (UTC)
- Here is my opinion (and once again, I cannot guarantee that Anatoli will agree with everything):
- The letter ё always has at least secondary stress, otherwise it would have degenerated to е. The primary stress is almost always on the last stressed syllable in a word (I can't think of any exceptions to that). So this should ideally be handled the same way we handle any other cases of secondary stress.
- The NBSP should not be used in entry names; that would be too confusing. However, in usage examples and other instances of running text, it should be left alone and the transliteration should reflect that.
- --WikiTiki89 13:25, 20 July 2015 (UTC)
- @Benwing, Wikitiki89 I am terribly sorry. This time, the real life doesn't let me participate. I am between contracts, finishing a big project and in a process of securing a new contract. It may be a while like that. I have no problem with Wikitiki89's assessment and I agree with his answers so far. Thanks a lot, you both. We have a few exceptions not covered by WT:RU TR: e.g. words with irregular pronunciations: пациент (paciɛ́nt) and проект (proɛ́kt), where there's no "j" when there should be. These could be alternatively transliterated as "paciént" and "proékt" but the expected "pacijént" and "projékt" would be misleading. Basically "ɛ" is used when Cyrillic "е" is read as "э" when it shouldn't. Please set aside complicated cases, so that they can be transliterated/fixed manually, especially when |tr= was abused and used for explaining senses, not transliterations. Please go ahead with your bot, if you can. --Anatoli T. (обсудить/вклад) 14:07, 20 July 2015 (UTC)
- @Atitarev Good luck with your work! As for paciɛ́nt and proɛ́kt, my bot will convert them to paciént and proékt, as mentioned above; if you don't think this is a good idea, let me know.
- @Wikitiki89 You didn't really answer my question about what to do with ё, which is, should it be converted to ё́ when it is known from the Latin that it has primary stress? This might be overkill since ё is usually stressed. Also, if you think it's really the case that all ё bears either primary or secondary stress, one thing I could do is convert ё to ё̀ when the Latin indicates the word does not bear primary stress (i.e. it's matched with Latin jo -- rather than jó -- and there's primary stress elsewhere in the same word). But maybe we should just leave ё unmarked on the theory that ё is almost always stressed, and if it's not, this is indicated by a primary stress elsewhere in the word. (BTW are you sure that ё always bears secondary stress when it doesn't have primary stress? Looking through the bot output I encountered a word something like двёргну́ть -- not this actual word, but it ended in stressed -ну́ть and had unstressed ё in the preceding syllable, and it was clear it got that way because it was based on a word something like двёргий [again, not this actual word] that did have stress on the ё. It's not obvious to me that a word like двёргну́ть [or whatever it actually was] would have secondary stress on the ё.) Benwing (talk) 11:33, 21 July 2015 (UTC)
- The answer of what to do was to do the same thing that we do for other secondary stresses. Specifically what your bot should do is make a list of them for us to sort out manually (there can't be very many of them, can there?). As for *двёргну́ть, it seems that the stress on the у́ is an error and it should be stressed like дёрнуть (djórnutʹ), but I would have to see the specific word to make a definitive judgment. --WikiTiki89 13:37, 21 July 2015 (UTC)
- Putting a stress mark over "ё" is overkill and not common practice, either; it will transliterate as "(j)ó". In the rare cases when "ё" is not stressed at all or bears the secondary stress, no stress mark is needed but the manual transliteration should have a plain "(j)o", for example, трёхэта́жный (trjoxetážnyj), четырёхуго́льник (četyrjoxugólʹnik). These two words use prefixes. Other words with an unstressed "ё" are very, very rare. So it's better to check whether the stress was placed correctly and whether there was a mistake in the transliteration.
- My preference is that you use paciɛ́nt and proɛ́kt. --Anatoli T. (обсудить/вклад) 12:07, 21 July 2015 (UTC)
- OK, will do. BTW there are occasional words where the manual translit has (j)o instead of (j)ó, and the ё is in fact stressed, it's just that that translit fails to mark the stress anywhere. In these cases the manual translit will get left in place and someone can later remove it manually. Benwing (talk) 12:17, 21 July 2015 (UTC)
- Yes, please put those words aside for later checking (a list, a tracking category or something else). You could also probably find "(j)ó" with the Cyrillic "е", then the Russian term probably needs a "ё" in the dictionary style (it's like adding a missing hamza over/under alifs when they are missing). I have corrected such cases in the past. --Anatoli T. (обсудить/вклад) 12:29, 21 July 2015 (UTC)
- If it makes it any easier, words with alternative stresses should have two or more stress marks, rather than two transliterations: Ка́та́р (Kátár), апо́стро́ф (apóstróf), ката́ка́на (katákána), ща́ве́ль (ščávélʹ), ро́же́ни́ца (róžéníca). This change in practice is relatively recent. --Anatoli T. (обсудить/вклад) 12:52, 21 July 2015 (UTC)
- Secondary stress is a very under-researched topic in Russian and most people would say a syllable is unstressed even if it is secondarily stressed. Also, shouldn't it be трёхэта́жный (trjoxɛtážnyj) because the х is not palatalized? And I'm going to disagree with Anatoli and say that I strongly prefer paciént. There is no confusion, since клие́нт (klijént) has a j in the transliteration and it is consistent with поэ́т (poét). --WikiTiki89 13:37, 21 July 2015 (UTC)
- "ɛ" (a non-standard symbol) is only used for irregular pronunciations of "е", not after the unpalatalising ж, ш, ц or the palatalising ч, щ and й, where it's always "e" regardless of the pronunciation; "э" is also always "e". So "ɛ" is only used when knowledge of Russian orthography and phonology doesn't help and the reading is almost unpredictable. Words like тест (tɛst) and текст (tekst) are both loanwords but "е" has a different value in them; the first is phonetically respelled as "тэст". Some dictionaries and textbooks use this method (phonetic respelling) but the information on whether loanwords with "е" should be pronounced as "е" or "э" is not easily available and not consistent at all. At Wiktionary we attempt to fix this. The alternative to using "ɛ" is marking every case of a palatalised consonant as "'e" or "je" - "p'er'em'éna" or "pjerjemjéna" for переме́на (pereména), which is awkward and disliked (but still used). It's easier to mark a small percentage of irregular words with "ɛ", less than 5% of words. In пациент and проект the reading is unpredictable as well and "ɛ" makes it obvious, IMO. I won't fight over this. --Anatoli T. (обсудить/вклад) 14:02, 21 July 2015 (UTC)
- My position on Блу́мфонте́йн m (Blúmfontɛ́jn) (two stress marks) can be seen at ро́же́ни́ца (róžéníca) with three possible stress patterns. When the basic forms (lemma and inflected) coincide, it's much simpler to fit them all in one. --Anatoli T. (обсудить/вклад) 14:14, 21 July 2015 (UTC)
- I don't mean to argue, but here is what I thought the system was (and what I still think it should be):
- е/э following a non-palatalizable consonant: always e (же → že, жэ → že)
- е/э following a palatalizable consonant: e if it is palatalized, ɛ if it is not (те → te, тэ → tɛ, термин → tɛrmin)
- е/э following a vowel or beginning of a word: je if it is pronounced /je/, e if it is pronounced /ɛ/ (ест → jest, это → eto, поест → pojest, поэт → poet, проект → proekt)
- Whether or not е or э is written in the Cyrillic is irrelevant in the transliteration. There is no reason to have different rules for them. In my opinion, it is very confusing to transliterate мэр and мера both as me- when the м is pronounced differently, but менеджер as mɛ- even though the м is pronounced the same as in мэр. It would be much less confusing to transliterate мэр also as mɛ-. --WikiTiki89 14:34, 21 July 2015 (UTC)
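The three cases listed above can be sketched as a small Python function. This is only an illustration of the proposed rule, not the code of any existing module: the function name, the two flag arguments and the set of consonants with fixed hardness or softness (ж, ш, ц, ч, щ, й, as mentioned earlier in the thread) are assumptions. Whether a consonant is actually palatalized, or whether the vowel is pronounced /je/, cannot be read off the spelling, so the caller has to supply it. The hard and soft signs (ъ, ь) are not covered.

```python
# Illustrative sketch only; names and interface are assumptions.
VOWELS = set("аеёиоуыэюя")
# Consonants whose hardness/softness is fixed (assumed set: ж, ш, ц, ч, щ, й).
FIXED_CONSONANTS = set("жшцчщй")


def translit_e(prev, palatalized=True, iotated=True):
    """Transliterate Cyrillic е/э following the three-way rule proposed above.

    prev        -- the preceding Cyrillic letter, or None at the start of a word
    palatalized -- whether a palatalizable preceding consonant is pronounced soft
    iotated     -- whether the vowel is pronounced /je/ after a vowel or word-initially
    """
    if prev is None or prev.lower() in VOWELS:
        # After a vowel or at the start of a word: "je" or plain "e".
        return "je" if iotated else "e"
    if prev.lower() in FIXED_CONSONANTS:
        # Non-palatalizable (or always-soft) consonant: always "e".
        return "e"
    # Palatalizable consonant: "e" if soft, "ɛ" if hard.
    return "e" if palatalized else "ɛ"
```

With `palatalized` and `iotated` set by hand, this reproduces the examples in the list: те → te, тэ → tɛ, ест → jest, это → eto, проект → proekt.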
- I see. It makes sense too. The reason it was not adopted was because of the dislike of "ɛ" by some editors (who only proposed "standard" transliteration but ignored the issue at hand), so the usage was limited to exceptions only. BTW, the common and standard Russian pronunciation of термин is soft: [ˈtʲermʲɪn], not [ˈtɛrmʲɪn] but some people say [ˈtɛrmʲɪn] (тэ́рмин). --Anatoli T. (обсудить/вклад) 14:55, 21 July 2015 (UTC)
- So do you see why I am saying пациент should be pacient? As for термин, I got my point across and that's what matters. I guess I come from a family that says бассэйн, Одэсса, тэрмин, бизнэсмэн (but спортсмен!). --WikiTiki89 15:04, 21 July 2015 (UTC)
@Wikitiki89 Since there's no agreement over proekt vs. proɛkt, I'm going to go ahead and leave in the code I currently have that canonicalizes these to proekt. If we decide to do it the other way, I can always undo this at a later time, since no information is being lost. I agree with Wikitiki about transliterating мэр as mɛr; we can deal with this later, though. Benwing (talk) 13:52, 22 July 2015 (UTC)
- Yeah, by all means, do "proékt". This scenario is not described in the transliteration page as an exception, anyway; the exception only covers cases where "е" comes after hard consonants and is pronounced (in the recommended or official pronunciation) as "э". Are you also OK with words having two stress patterns? We seemed to have agreed to have two/multiple stress marks on both the Cyrillic and the transliterations (when manual ones are required). You don't have this choice with ску́чный (skúšnyj, skúčnyj), of course. @Tiki. Yes, you got your point across but using individual pronunciations as examples keeps sounding to me as if you're promoting them for use in entries. :) To me, they are not frequent enough; even though the pronunciation of loanwords can have some variations, some of your examples would be confusing for people trying to get the authentic Russian pronunciation. Одэсса and бизнэсмэн would be acceptable variants, though. Going to bed, it's late here. --Anatoli T. (обсудить/вклад) 14:07, 22 July 2015 (UTC)
- I don't mean to promote anything. That was just the first word I thought of like that that started with a т. --WikiTiki89 14:10, 22 July 2015 (UTC)
- I am fine with having words with multiple acute accents. My code currently will transfer all accents from the Latin to the Russian. If the Latin lists two distinct transliterations, e.g. ску́чный (skúšnyj, skúčnyj), it will be split into two templates. (This will happen even if the two could potentially be combined into one multi-accented word.) I can create a list of words where this template splitting happens, so you can check for cases that could be recombined into a multi-accented word. Benwing (talk) 14:24, 22 July 2015 (UTC)
- What Anatoli meant is that to show two stress variants (not multi-accented words), instead of having "творо́г (tvoróg) or тво́рог (tvórog)", we should shorthand it to "тво́ро́г (tvóróg)" like other dictionaries often do. I personally disagree with this because it can be confusing for a number of reasons, but this has been discussed before. --WikiTiki89 14:40, 22 July 2015 (UTC)
- Right, that's what I meant by "multi-accented words", sorry if my terminology was confusing. Benwing (talk) 14:42, 22 July 2015 (UTC)
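The shorthand discussed here, where two stress variants of the same spelling are folded into one form carrying both acute accents, can be sketched roughly as follows. This is a hypothetical illustration, not the bot's actual code: the function names are made up, and the sketch assumes stress is written with the combining acute accent U+0301.

```python
# Illustrative sketch only; function names are assumptions.
ACUTE = "\u0301"  # combining acute accent used as the stress mark


def accent_positions(word):
    """Return (base letters, set of indexes of base letters carrying an acute)."""
    base, stressed = [], set()
    for ch in word:
        if ch == ACUTE:
            if base:
                stressed.add(len(base) - 1)
        else:
            base.append(ch)
    return "".join(base), stressed


def combine_variants(first, second):
    """Merge two stress variants into one multi-accented form.

    Returns None when the two forms differ in more than accents,
    in which case they have to stay as separate entries.
    """
    base1, pos1 = accent_positions(first)
    base2, pos2 = accent_positions(second)
    if base1 != base2:
        return None
    merged = pos1 | pos2
    return "".join(ch + (ACUTE if i in merged else "") for i, ch in enumerate(base1))
```

For творо́г and тво́рог the bases match, so the result carries both accents (тво́ро́г). For a pair like skúšnyj / skúčnyj the bases differ, so the function returns None and the two transliterations stay separate.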
@Wikitiki89 BTW, the word above I was talking about with unstressed ё followed by -ну́ть was actually застёгну́ть "to buckle", which appears to take its ё from застёжка or застёгивать. Benwing (talk) 21:57, 23 July 2015 (UTC)
- @Benwing: Oh. That's a mistake for застегну́ть (zastegnútʹ), although застёжка (zastjóžka), застёгивать (zastjógivatʹ), etc. are correct. --WikiTiki89 22:07, 23 July 2015 (UTC)
It seems that this template doesn't handle well some combinations of kana such as ウェ or ティ. Eg: サイバネティックス, スウェーデン, ノルウェー. ばかFumiko¥talk 03:39, 13 July 2015 (UTC)
- Sorry, Fumiko. I can't fix it. Try Grease pit. --Anatoli T. (обсудить/вклад) 13:40, 13 July 2015 (UTC)
- Who/what is Grease pit? ばかFumiko¥talk 10:05, 14 July 2015 (UTC)
- Fumiko, it's Wiktionary:Grease pit/2015/July. Lua gurus could help if they're interested - Kephir, CodeCat, Benwing, etc. Wyang seems to be busy. Other Japanese editors need to be aware of the issue. This is the place to post about the technical problems, anyway. --Anatoli T. (обсудить/вклад) 10:16, 14 July 2015 (UTC)
- Thanks. ばかFumiko¥talk 10:23, 14 July 2015 (UTC)
- Fixed: [6]. — TAKASUGI Shinji (talk) 11:57, 14 July 2015 (UTC)
example sentence problem
Hi Anatoli, I had trouble with the 'dress code' example sentence at 著裝 and 要求, do you know how to get the simplified to display correctly (it's 着装, right now it comes up as 著装)? ---> Tooironic (talk) 05:55, 14 July 2015 (UTC)
- Done. --Anatoli T. (обсудить/вклад) 06:03, 14 July 2015 (UTC)
- Many thanks. ---> Tooironic (talk) 11:52, 14 July 2015 (UTC)
Besides the standard forms, this verb has anomalous forms with the stem обязу́-: обязуюсь, обязуешься, обязуется, обязуемся, обязуетесь, обязуются, обязуйся, обязуйтесь, обязующийся. They need to be included in the conjugation table somehow.--Cinemantique (talk) 13:22, 17 July 2015 (UTC)
- @Cinemantique. Agreed. I'll try to do it when I can. A small complication is that the irregular conjugation is only used in the reflexive form, but that's solvable. --Anatoli T. (обсудить/вклад) 14:08, 20 July 2015 (UTC)
- @Cinemantique Done. --Anatoli T. (обсудить/вклад) 10:55, 22 July 2015 (UTC)
- Thank you very much, Anatoli. --KoreanQuoter (talk) 11:49, 22 July 2015 (UTC)
- Thanks! Are we going to make a separate template for every anomalous verb? Why are there several templates at all? Couldn't we use a single one with parameters?--Cinemantique (talk) 14:16, 22 July 2015 (UTC)
Finished Russian canonicalization; posted unable-to-match cases, please edit
@Wikitiki89: Hello to both of you. I finished running the Russian canonicalization script. There were about 37,000 pages affected. There were 780 cases where it couldn't match the Cyrillic and Latin; they are posted here: User:Benwing/ru-unable-to-match. On the right are three templates in the form CHANGED <- ORIGINAL (CHANGED) where ORIGINAL is the original template before my program changed it, and CHANGED is the template after it was changed. If you can go through these lines and edit the first template (the first of the two copies of CHANGED) whenever you see a mistake that needs fixing, I will run a script to push all these changes to the correct page. (The reason why there are two copies of CHANGED is so that after you've edited the first copy, the second copy is still there so my script knows what to look for when doing a search-and-replace. The ORIGINAL is there for your reference.) Thanks. Benwing (talk) 06:23, 25 July 2015 (UTC)
- !ممتاز Great job, man. Thank you very much. --Anatoli T. (обсудить/вклад) 12:00, 25 July 2015 (UTC)
- I've started going through them, marking them with {{done}} when I've checked and possibly fixed them (and in one case so far with {{attention}} for extra remarks). --WikiTiki89 13:40, 27 July 2015 (UTC)
- Thanks! If it's faster, just mark the last one you did with {{done}}. Looks also like Anatoli cleaned up a few of the later entries (e.g. for Alabama). Benwing (talk) 22:02, 27 July 2015 (UTC)
- @Wikitiki89 I went through and ran my push-manual-changes script on the changes you made but it looks like Anatoli already got to all of them first. Anatoli, how far did you get? From your changelogs it looks like you got through all the proper nouns and most of the way through the c's of the common nouns. Feel free to edit User:Benwing/ru-unable-to-match instead of doing them manually, if you'd rather. Benwing (talk) 04:47, 31 July 2015 (UTC)
- @Benwing Thanks for this job, sorry for not responding earlier. It's fine, it's probably easier to fix the occurrences in entries. We could use help with multisyllabic terms which lack any stress marks (except for words with ё). Also, Module:ru-pron needs the attention of a Lua developer. There are a few silly failed cases in Module:ru-pron/testcases. --Anatoli T. (обсудить/вклад) 11:49, 2 August 2015 (UTC)
- I'll take a look at Module:ru-pron when I have a chance. When you say you need help with multisyllabic terms, what do you mean exactly? Benwing (talk) 04:20, 3 August 2015 (UTC)
- Thanks! I mean that e.g. демографический (demografičeskij) should be corrected to демографи́ческий (demografíčeskij), especially in translations (there will be too many such cases in the main space). You don't need to know the correct stress but it would be great to have a list of terms needing a word stress. --Anatoli T. (обсудить/вклад) 04:33, 3 August 2015 (UTC)
- Ah, I see. Yes, I can create such a list, I think. Benwing (talk) 04:46, 3 August 2015 (UTC)
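A list like this could be built with a check along the following lines. This is a minimal sketch under stated assumptions, not the script that was actually run: the function names are hypothetical, stress is assumed to be marked with the combining acute accent U+0301, and words containing ё are skipped, as described above.

```python
import re
import unicodedata

# Illustrative sketch only; function names are assumptions.
ACUTE = "\u0301"  # combining acute accent used as the stress mark
VOWELS = "аеёиоуыэюя"


def needs_stress(word):
    """True if a single word is multisyllabic, has no acute accent and no ё."""
    lower = unicodedata.normalize("NFC", word).lower()
    if ACUTE in lower or "ё" in lower:
        return False
    syllables = sum(1 for ch in lower if ch in VOWELS)
    return syllables > 1


def terms_needing_stress(terms):
    """Return the terms in which at least one word still needs a stress mark."""
    result = []
    for term in terms:
        words = re.split(r"[\s\-]+", term)
        if any(needs_stress(w) for w in words if w):
            result.append(term)
    return result
```

Under these rules демографический is flagged, while демографи́ческий, a monosyllable such as мэр, and any word with ё are not.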
@Benwing I think all mismatches in User:Benwing/ru-unable-to-match are now fixed. Thank you again for that list. :)--Anatoli T. (обсудить/вклад) 14:37, 14 August 2015 (UTC)
- Thanks for fixing them all! I think there may be more, actually; at some point I'll do a run that covers everything remaining that has a manual transliteration, which may flush some more out. Benwing (talk) 14:43, 14 August 2015 (UTC)
- Yes, please, mismatches of this kind all need to be fixed. --Anatoli T. (обсудить/вклад) 14:48, 14 August 2015 (UTC)
Declension template
Hi! Vitaly and I are working on a new noun declension template, Template:ru-decl-noun-z. It is about half done so far; what remains is handling words with a fleeting vowel, suppletion (words in -анин and others) and other irregularities. Ideas that are not implemented yet are collected here. Any suggestions, perhaps?--Cinemantique (talk) 01:19, 26 July 2015 (UTC)
- Well done, but what is wrong with the existing templates, the old ones or the newer ones? If something is missing, it can be changed or added. User:Wikitiki89 has been working on the declension module and on declensions. --Anatoli T. (обсудить/вклад) 01:25, 26 July 2015 (UTC)
- I used the old family of templates, and I am not happy that the stems have to be specified several times, in an order that varies, so I had to look at the documentation every time. I have not used Wikitiki89's template: it has no documentation, and it seemed to me that much more could be done programmatically. The point is that, following Zaliznyak's algorithm, all forms of a word can be generated from its base form with stress and its declension index. That is what the new template does. The stem of the word (more precisely, its base form) is specified exactly once in most cases. The Zaliznyak index is split into two main parts (and the stem type, which is determined programmatically, i.e. all those vel, sib and so on, is left out of it); the first part usually coincides with what is specified in the ru-noun template, so I simply duplicate it in the entry skeleton. The stressed word, which is used in ru-noun and ru-IPA, is duplicated as well. So in most cases I only need to specify the stress pattern. To indicate that the stem has a fleeting vowel, I simply put an asterisk as a separate parameter, with no additional stems.--Cinemantique (talk) 02:09, 26 July 2015 (UTC)
- Just FYI: Module:User:Vitalik/inflection/data/uz-noun/testcases/documentation has been showing up in Category:Pages with module errors due to running out of script-execution time. That means it takes very close to the entire 10 seconds to process the page under normal circumstances, with any system slowdown pushing it over the edge. I've only seen that once before, and that was from some horribly, bizarrely complex old template that was being executed by the lua backend to {{documentation}}. I hope the Russian template works better than that. Chuck Entz (talk) 06:11, 26 July 2015 (UTC)
- @Chuck Entz, thanks for detecting this bug. It was an error related to the usages in testcases (one critical variable wasn't cleaned up from test to test). I've fixed that and now the average time of one module execution is about 0.08 seconds. Vitalik (talk) 17:41, 26 July 2015 (UTC)
- The template {{ru-noun-table}} works in almost every case. I haven't written the documentation yet, but you can look at these examples. The basic idea is this: {{ru-noun-table|1|ба́нк|а|ба́нок}}, {{ru-noun-table|2|угл||у́гол|loc=углу́}}. If you have questions, ask me. --WikiTiki89 13:37, 27 July 2015 (UTC)
Revert
Hello, what is the problem with this edit? Ле Лой (talk) 00:10, 27 July 2015 (UTC)
Capitalization in Pinyin
Why was my edit in 上海話/上海话 (shànghǎihuà) reverted? Isn't it a proper noun, which should be capitalized according to Hanyu Pinyin rules? Justinrleung (talk) 06:36, 30 July 2015 (UTC)
- It's not a proper noun, it's a (common) noun. Names are capitalised but not language names or ethnicities. --Anatoli T. (обсудить/вклад) 06:40, 30 July 2015 (UTC)
- I'm not sure I agree, since 6.3.3 of the Basic rules of the Chinese phonetic alphabet orthography has 汉语, 粤语 and 广东话 capitalized. Justinrleung (talk) 07:08, 30 July 2015 (UTC)
- There's no 100% agreement or consistency on capitalisation among dictionary publishers - paper or electronic and this is just a guide. I have tried to convince Chinese editors to have a vote on this but there was little interest. What counts is consistency and I am trying to be consistent with my edits and other people's edits. We can revisit the rules, of course. --Anatoli T. (обсудить/вклад)
- OK, I see. Thanks for clarifying! Justinrleung (talk) 08:19, 30 July 2015 (UTC)
I think this needs to be deleted as well.--Cinemantique (talk) 14:30, 30 July 2015 (UTC)
zh/data
These edits were so that I could create zh-new entries with "御" or "筑" in the title (there is no such thing as "禦好燒"). The Traditional→Simplified pairs (as used by zh-usex) seem to be in a different section of the module. —suzukaze (t・c) 23:24, 31 July 2015 (UTC)
Chinese characters that are only used in compounds
Any thoughts? —suzukaze (t・c) 08:20, 3 August 2015 (UTC)
Russian words needing accents (e.g. translations)
I ran a script to find these and it came up with about 24,500 of them. That's a lot. I can upload them if you want, probably in batches of about 2,000, in the same format as the "unable-to-match" cases, where you can edit the template on the page to add the accent and I will push the changes.
I also wrote a script to look up the word and find the accent from the page, e.g. the headword entry {{ru-proper noun|Бре́жнев|m}} of Брежнев. This should cut down on the number but there will still be a lot. Benwing (talk) 03:11, 5 August 2015 (UTC)
- @Benwing Wow, that's a lot! Are they all in translations? Thank you! That's unmanageable for one person. They should probably be in a tracking hidden category instead. --Anatoli T. (обсудить/вклад) 04:05, 5 August 2015 (UTC)
- Well, it's actually 21,125 after removing duplicates, of which 10,012 are translations. Still a lot. Benwing (talk) 04:33, 5 August 2015 (UTC)
Mandarin pronunciations and non-Mandarin words
Should a Mandarin pronunciation be included in an entry if the characters can technically be pronounced in Mandarin? (like 今次) I ask because User talk:Wyang#幹活兒 seems to indicate that the reverse (adding Cantonese pronunciation to Mandarin words) is okay. —suzukaze (t・c) 06:09, 5 August 2015 (UTC)
- Only words that are used in that lect should be added, even if they are rare or borrowed from other lects. I know Wyang's opinion on this and from that discussion I can't tell that he agreed to simply adding a topolect pronunciation "if the characters can technically be pronounced in Mandarin". "gon3 wut6 ji4" is also used in the formal/literal Cantonese for 幹活兒 and we have a dictionary reference for it, not just because it can be (potentially) converted to Cantonese readings. If 今次 is also used in Mandarin then it's fine; these terms should be marked as "chiefly" Cantonese. To suppress Mandarin in {{zh-new}} you can use |m=-. --Anatoli T. (обсудить/вклад) 07:05, 5 August 2015 (UTC)
- Thanks for the answer. What about if I'm not sure if a word is used in Mandarin? (for example a Google search produces an overwhelming number of Cantonese results and various dictionaries say "only used in Cantonese" but a minority few Google results appear to be Mandarin (the presence of 你們, 了, etc.)) —suzukaze (t・c) 07:39, 5 August 2015 (UTC)
- The idea is that you don't work with a language you don't know. If you do, you need evidence that a word exists and exists in that topolect. {{zh-new}} is used in most cases to produce standard Chinese (=Mandarin) entries. If there is data for Cantonese, Min Nan and Hakka, the readings will be created automatically. It's pretty safe to rely on the dialectal data. However, if various dictionaries say "only used in Cantonese", then you shouldn't make Mandarin entries for those. Google searches may show mentions, not uses. It all depends on what you know about a word. In [7], if you go to "Words", you can select oral Cantonese only. Those words are usually used only in Cantonese, even if they are made of common Chinese compounds. It's even more the case with Min Nan, which differs more from Mandarin. --Anatoli T. (обсудить/вклад) 11:40, 5 August 2015 (UTC)
Sundanese Language
Hello, could you please add Module:su-translit to the Sundanese language. DerekWinters (talk) 22:55, 5 August 2015 (UTC)
- Done. --Anatoli T. (обсудить/вклад) 08:26, 6 August 2015 (UTC)
- Thank you :) DerekWinters (talk) 22:37, 6 August 2015 (UTC)
Can you check a few Russian entries I templatized?
Look in my contribs around Aug 5 02:30, there should be 4, I think. Benwing (talk) 21:10, 7 August 2015 (UTC)
- They all look fine to me, although I made some minor tweaks that were unrelated to your edits. --WikiTiki89 21:17, 7 August 2015 (UTC)
- @Wikitiki89 Thanks!!! Benwing (talk) 21:19, 7 August 2015 (UTC)
Automatically adding accents to Russian words
I wrote a little script to automatically add accents to Russian words. It works by going through all the translations and such, and if it finds a word without an accent, it looks to see if there's a page for that word, and if so it snarfs the accented form out of the page. My question is, is this safe? How often are there cases where there are multiple possible accentuations for a word? Note that if it finds more than one accented form on a given page, it does nothing, e.g. it won't try to accent воды because there are two headwords on the page with different accents. So the only issue is if there are two accentuations but only one of them is given on a page. Do you think this occurs often enough to worry about? Benwing (talk) 21:18, 7 August 2015 (UTC)
- I think it's about 99.9% safe and the remaining 0.1% is not really the worst thing in the world. You can make it a little bit safer by checking parts of speech (if you haven't done that already). Another thing you could do would be to check the pages that link to the word to make sure they don't have alternative accentuations for it (as in declension tables and such). --WikiTiki89 21:28, 7 August 2015 (UTC)
- Yes, I agree, it's pretty safe. Need to watch for cases like писа́ть (pisátʹ, “to write”) and пи́сать (písatʹ, “to pee”) :) --Anatoli T. (обсудить/вклад) 01:00, 8 August 2015 (UTC)
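The lookup described in this thread, including the rule of doing nothing when a page offers more than one accentuation, can be sketched as below. This is a hypothetical illustration rather than the actual script: `headwords_by_page` is an assumed stand-in for whatever fetches the headwords from a page, the function names are made up, and stress is assumed to be written with the combining acute accent U+0301.

```python
# Illustrative sketch only; names and the dict-based lookup are assumptions.
ACUTE = "\u0301"  # combining acute accent used as the stress mark


def strip_accents(text):
    """Remove stress marks so an accented headword can be compared to a page title."""
    return text.replace(ACUTE, "")


def find_accented(word, headwords_by_page):
    """Return the single accented form listed on the page for `word`, else None.

    headwords_by_page maps a page title to the headwords found on that page.
    Headwords that differ from the title in more than accents are ignored,
    and nothing is returned when several distinct accentuations are present.
    """
    candidates = {
        head
        for head in headwords_by_page.get(word, [])
        if ACUTE in head and strip_accents(head) == word
    }
    if len(candidates) == 1:
        return candidates.pop()
    return None
```

With a page for воды that lists two differently accented headwords the function returns None, and the same happens for писать, where писа́ть and пи́сать share one page. A page with a single accented headword, such as Брежнев, yields that form.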
Translations for northern fur seal
Several of us have been enlisted to get things ready for k'oon to be Foreign Word of the Day. DC During has added an entry for the species in question, the northern fur seal, and several of us have been adding translations. I noticed, though, that one language that's very important in the history of this species is only represented by "{{t-needed|ru}}", and no one seems to have asked for help from Russian editors. Would you be so kind as to add the Russian term, and also look at the etymology for seecatch, which claims it's from Russian (I suspect Russian may be just an intermediary)? Thanks! Chuck Entz (talk) 23:53, 8 August 2015 (UTC)
- After a little searching, I notice we have морской котик, which Russian Wikipedia redirects to w:ru:Северный морской котик. That would make sense, since the northern fur seal is the only fur seal in Russian waters, and Wikipedias tend to use only the most precise common name regardless of what's most familiar to the average person. Also, I found this, which I'm guessing is a specialized subsense of a secondary sense. Chuck Entz (talk) 00:35, 9 August 2015 (UTC)
- @Chuck Entz Done. The term морско́й ко́тик (morskój kótik) or simply ко́тик (kótik) means any "seal", сека́ч (sekáč) is an adult male seal (not so common), not specific to "northern fur seal". --Anatoli T. (обсудить/вклад) 06:34, 9 August 2015 (UTC)
- I forgot to thank you for this. Belatedly: Thanks! As for your corrections: I know better than to add such information myself, since there are too many ways to get things wrong when dealing with a language I don't know well, even with good dictionaries. Chuck Entz (talk) 03:03, 12 August 2015 (UTC)
head differs from page title in more than accents, needs fixing
@Wikitiki89 Some warnings from my recent script:
- Page 1 Wheel of Fortune: WARNING: Accented page Колесо уда́чи! changed from Колесо удачи in more than just accents, not changing
- Page 1 dupe: WARNING: Accented page же́ртва обма́н changed from жертва обмана in more than just accents, not changing
- Page 1 obliterate: WARNING: Accented page поко́нчить changed from уничтожить in more than just accents, not changing
- Page 1 patina: WARNING: Accented page па́тина, пати́на changed from патина in more than just accents, not changing
- Page 1 switcheroo: WARNING: Accented page подме́н changed from подмена in more than just accents, not changing
- Page 1 tut tut: WARNING: Accented page ай-ай-а́й! changed from ай-ай-ай in more than just accents, not changing
- Page 1 о-го-го: WARNING: Accented page ого́! changed from ого in more than just accents, not changing
- Page 1 ого: WARNING: Accented page о-го-го́! changed from о-го-го in more than just accents, not changing
- Page 1 удачи: WARNING: Accented page уда́чи! changed from удачи in more than just accents, not changing
- Page 1 балбес: WARNING: Accented page балбѣ́с changed from балбѣсъ in more than just accents, not changing
- Page 1 подруга: WARNING: Accented page друг к дру́га changed from друг в друга in more than just accents, not changing
- Page 1 хеллоуин: WARNING: Accented page Хе́ллоуин, Хеллоуи́н changed from Хеллоуин in more than just accents, not changing
These are cases where the headword is not simply the page title plus accents. Some might not need fixing (e.g. the ones with exclamation points) but the others do. After doing this, could you fix up the mentioned pages (e.g. dupe) to add the accents to the Russian words linked to? Thanks! Benwing (talk) 00:39, 9 August 2015 (UTC)
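A check of the kind that produces these warnings might look like the sketch below. It is written in Python under the assumption that "accents" means only the combining acute and grave marks; the function name is hypothetical and this is not the script's real code.

```python
import unicodedata

STRESS_MARKS = {"\u0301", "\u0300"}  # combining acute and grave


def differs_only_in_accents(page_title, headword):
    """True if `headword` is `page_title` plus stress marks and nothing else."""
    decomposed = unicodedata.normalize("NFD", headword)
    bare = "".join(ch for ch in decomposed if ch not in STRESS_MARKS)
    return (unicodedata.normalize("NFC", bare)
            == unicodedata.normalize("NFC", page_title))
```

Under this test a headword with a trailing exclamation point, a truncated form, or two comma-separated variants all fail the comparison, which is consistent with the mix of harmless and genuinely wrong cases in the warnings.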
Changes to Module:languages/data2
@Wikitiki89 The character ѐ is U+0450 and is a case where there's a precomposed version of a Cyrillic character plus accent. So we need to modify Module:languages/data2 so that the entry_name reads like this:
[see below]
I'd suggest the same changes for Ukrainian and Belarusian. Benwing (talk) 00:39, 9 August 2015 (UTC)
- Don't forget uppercase. —CodeCat 00:42, 9 August 2015 (UTC)
- Thanks. Here we go:
m["ru"] = { ... entry_name = { from = {"ѐ", "Ѐ", GRAVE, ACUTE}, to = {"е", "Е"}} }
Benwing (talk) 01:03, 9 August 2015 (UTC)
- This is one of the reasons I hate Unicode. --WikiTiki89 17:31, 10 August 2015 (UTC)
- This is done. Benwing (talk) 07:56, 12 August 2015 (UTC)
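To illustrate why the precomposed characters need their own entries, here is a small Python sketch of how such a from/to table could be applied. It only mirrors the shape of the `entry_name` table quoted above; the pairing rule (a `from` item with no matching `to` item is simply deleted) and the name `make_entry_name` are assumptions for the example, not a description of the module's actual code.

```python
import unicodedata

GRAVE = "\u0300"  # combining grave accent
ACUTE = "\u0301"  # combining acute accent

# Same shape as the entry_name table above: each "from" item is replaced by
# the "to" item at the same position, or deleted if there is none.
ENTRY_NAME = {
    "from": ["\u0450", "\u0400", GRAVE, ACUTE],  # ѐ, Ѐ, then the bare marks
    "to": ["\u0435", "\u0415"],                  # е, Е
}


def make_entry_name(text, rules=ENTRY_NAME):
    """Turn a displayed form into a page title by applying the rules in order."""
    for index, source in enumerate(rules["from"]):
        target = rules["to"][index] if index < len(rules["to"]) else ""
        text = text.replace(source, target)
    return text
```

The point of the first two entries is that NFC normalization folds е plus a combining grave into the single code point U+0450, so stripping the combining mark alone would no longer find anything to strip.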
Russian pages missing headword templates
@Wikitiki89 See User:Benwing/ru-cant-find-heads. Some of these are existing pages without any Russian entry (e.g. there's only a Serbo-Croatian entry); the rest (probably most) are "old-style" pages using manual formatting in the headword line. There are 216 of them listed. Benwing (talk) 01:01, 9 August 2015 (UTC)
- @Benwing Thanks. It's embarrassing how many entries still lack basic headword templates. There are indeed non-Russian terms among them but many just need to be created. --Anatoli T. (обсудить/вклад) 11:42, 9 August 2015 (UTC)
Pages with Latin-script words masquerading as Russian
@Wikitiki89 Another list of errors. These are pages with links to a word categorized as Russian but containing Latin script in it. I'm not quite sure what to do with entries like WYSIWYG but most need to be translated or Cyrillified, or have the language changed. (NOTE: Some have already been fixed.)
- Page 1 Abkhas: WARNING: Can't find any heads: Abkhas
- Page 1 Абиссиния: WARNING: Can't find any heads: Abyssinia
- Page 1 Cognos: WARNING: Can't find any heads: Cognos
- Page 1 раскрутить: WARNING: Can't find any heads: Google
- Page 1 Intel: WARNING: Can't find any heads: Intel
- Page 1 Microsoft: WARNING: Can't find any heads: Microsoft
- Page 1 TOEFL: WARNING: Can't find any heads: TOEFL
- Page 1 WYSIWYG: WARNING: Can't find any heads: WYSIWYG
- Page 1 мочь: WARNING: Can't find any heads: Zemfira
- Page 1 интерес: WARNING: Can't find any heads: dative
- Page 1 промысел: WARNING: Can't find any heads: divine
- Page 1 следование: WARNING: Can't find any heads: entire
- Page 1 после: WARNING: Can't find any heads: in time
- Page 1 следование: WARNING: Can't find any heads: journey
- Page 1 prostokvasha: WARNING: Can't find any heads: kefir
- Page 1 κουλάκος: WARNING: Can't find any heads: kulak
- Page 1 как по-русски...: WARNING: Can't find any heads: love
- Page 1 чаевые: WARNING: Can't find any heads: money
- Page 1 πογκρόμ: WARNING: Can't find any heads: pogrom
- Page 1 промысел: WARNING: Can't find any heads: providence
- Page 1 раунд: WARNING: Can't find any heads: round
- Page 1 чаевые: WARNING: Can't find any heads: tea
- Page 1 после: WARNING: Can't find any heads: time
- Page 1 туба: WARNING: Can't find any heads: tuba
- Page 1 zeptomole: WARNING: Can't find any heads: zeptomole
Benwing (talk) 01:08, 9 August 2015 (UTC)
- All the ones that needed fixing are fixed. The rest are company names and such and in one case a usage example that quotes English. --WikiTiki89 03:37, 12 August 2015 (UTC)
- Thank you. Benwing (talk) 03:42, 12 August 2015 (UTC)
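A scan for this kind of error could be as simple as the following Python sketch. The `(page, term)` input format and both function names are hypothetical; the character ranges are a rough approximation of "Latin letters", not a full script test.

```python
import re

# Basic Latin letters plus the accented Latin ranges (× and ÷ are skipped).
LATIN_LETTER = re.compile(r"[A-Za-z\u00C0-\u00D6\u00D8-\u00F6\u00F8-\u024F]")


def contains_latin(term):
    """True if `term` contains at least one Latin-script letter."""
    return bool(LATIN_LETTER.search(term))


def flag_latin_terms(links):
    """Keep the (page, term) pairs whose term is tagged Russian but has Latin.

    `links` is a hypothetical list of (page, term) pairs already known to be
    tagged as Russian.
    """
    return [(page, term) for page, term in links if contains_latin(term)]
```

As the replies above note, a hit is not always an error: company names and quoted English in a usage example are flagged too, so the output still needs a human pass.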
The participle of this verb is су́шащий.--Cinemantique (talk) 11:27, 9 August 2015 (UTC)
- @Cinemantique Thanks, done - diff. --Anatoli T. (обсудить/вклад) 11:52, 9 August 2015 (UTC)
- Thank you too.--Cinemantique (talk) 11:56, 9 August 2015 (UTC)
- Umm.... Excuse me. Did I do something wrong? --KoreanQuoter (talk) 14:10, 10 August 2015 (UTC)
- No, you did not. :)--Cinemantique (talk) 14:15, 10 August 2015 (UTC)
- (E/C) No, you didn't. The present active participle is irregular here but it could only be fixed in the module. :) --Anatoli T. (обсудить/вклад) 14:19, 10 August 2015 (UTC)
Any idea what the Russian etymon of this is? Baillie brushkie was popularized by the same author, from белобрюшка (belobrjuška), if that gives any clues to how he tended to transliterate things (laxly, it seems). - -sche (discuss) 18:04, 9 August 2015 (UTC)
- No, sorry, I can't find anything similar, although there are similarities to several Russian words, I have no strong theory about the connection. --Anatoli T. (обсудить/вклад) 23:17, 9 August 2015 (UTC)
- It looks somewhat like цацка. —Stephen (Talk) 01:21, 12 August 2015 (UTC)
Auto-find Russian accents finished
@Wikitiki89 Approx. 12,200 changes. Still 10,873 cases needing stress marks. Benwing (talk) 23:54, 9 August 2015 (UTC)
- Great job! Thank you very much! --Anatoli T. (обсудить/вклад) 23:56, 9 August 2015 (UTC)
Two competing Russian declension modules?
@Atitarev, Wikitiki89, Vitalik, CodeCat I'm concerned about the fact that there are two competing Russian noun declension modules, Module:ru-noun and Module:inflection (with its Russian data in Module:inflection/data/ru-noun) -- on top of the fact that we also have a bunch of old Russian declension templates, which I'm willing to convert using a bot once I know the conversion rules. (BTW I see there's a section above discussing Vitalik's module but unfortunately I can't read Russian.) Ultimately we should have one way of doing things, rather than having to maintain two separate code bases. Are both modules complete currently? Are there situations that one module supports but the other doesn't? I also need to be convinced that the approach of Module:inflection (using a DSL, i.e. domain-specific language, to specify inflection rules) is better than just writing the code directly. I have a feeling a non-programmer is going to be just as daunted by the DSL as by programming directly in Lua, and I wonder about limitations of the DSL -- e.g. I don't see any loops in the DSL, which may work for Russian and Uzbek but definitely won't for e.g. Arabic, where nouns can have lots of plurals. Benwing (talk) 00:23, 12 August 2015 (UTC)
- Yeah, apparently someone created a new noun declension module without finding out if we already have one. I don't know much about it other than from the discussion above, although I do think the stress pattern codes in the newer one are better organized. I have no idea how complete it is. Mine handles most declensions, although it needs occasional overrides in places where the newer one doesn't. Essentially, it may be a better idea to go with the newer one, but I wouldn't be able to help much. Mine also has a parallel module for pre-reform declensions, which I don't think the newer module supports. --WikiTiki89 02:43, 12 August 2015 (UTC)
- (Edit conflict) Module:inflection is new and doesn't seem to have been discussed anywhere but on this page and maybe on another user page or two. If and when it's truly finished and tested, it needs to be discussed somewhere where the community can decide if they want to adopt it- at the bare minimum at the About pages for the languages in which it will be used, and preferably the Grease Pit and the Beer Parlor. After all, this isn't just a new inflection module, but a new way of doing things.
- Hard questions need to be asked about how easily it can be learned by some random IP who has limited computer background and doesn't read English very well, and by all the other rank-and-file contributors out there. Our formatting is already very complex, and if you ever spend much time patrolling Recent Changes, you'll see the messes that often result when people try to work with things they don't understand.
- As for having two competing systems: contributors are confused enough by the challenges of figuring out one system- we don't need to compound it with multiple ones. Chuck Entz (talk) 02:57, 12 August 2015 (UTC)
- (Edit conflict) Yes, we need to concentrate on one module, rather than multiple ones. I wasn't involved much in development of either. Hopefully the guys will cooperate and decide, which module to focus on. --Anatoli T. (обсудить/вклад) 03:01, 12 August 2015 (UTC)
- One thing I notice is that Wikitiki's module has specific ending types, whereas Vitalik's infers it based on the nom. sg. ending and the gender m/f/n. I don't know if there are exceptions that can't be handled Vitalik's way. Vitalik also has a * parameter that automatically handles "reduceables" (where an extra vowel appears in forms with a null ending, either nom. sg., gen. pl. or both), inferring the reduced and non-reduced stem from the nominative sg., whereas Wikitiki's has a "bare stem" parameter; I don't know if there are exceptions that can't be done Vitalik's way. Wikitiki's module supports vocative/partitive/(2nd) locative forms, whereas it looks like Vitalik's does not. Wikitiki's allows for overrides of specific forms, supports nouns that are singular-only or plural-only and allows the addition of a usage note in the declension table; I can't tell if Vitalik's has any of these features. I do see that Vitalik's module supports more stress classes than Wikitiki's, although I wonder how often these weird cases (b', d', f', f'') come up, and I imagine it would be easy to support these in Wikitiki's module if need be. Benwing (talk) 07:55, 12 August 2015 (UTC)
- @Chuck Entz As I said above, I doubt a random user could learn to modify Vitalik's DSL module code much more easily than modifying Wikitiki's module; in fact I'd feel more comfortable editing Wikitiki's code. However, I imagine it's rare that a random user would need to modify either module; more likely they would need to figure out how to use the declension templates. I think Vitalik's system where you specify the nom. sg. and the gender is probably a bit easier to use than Wikitiki's system where you need to learn what the different ending types mean, although I imagine it wouldn't be hard to modify Wikitiki's module to infer the declension type (and allow it to be overridden if need be). Benwing (talk) 07:55, 12 August 2015 (UTC)
- @Benwing: Those weird stress classes usually differ only in one form and can thus be handled with a single override parameter. Also, my module supports showing both animacies at once, rather than needing to have two separate declension tables for a difference in only one case. --WikiTiki89 10:48, 12 August 2015 (UTC)
- "showing both animacies" - yes, it's a good idea and we are going to steal it. :) Actually, our work is not yet finished.--Cinemantique (talk) 12:45, 12 August 2015 (UTC)
- As it was mentioned above the main advantage of the new module consists in [1]: easier usage (we don't need to learn about ending-types or use special "bare stem" parameters) and [2]: amazing algorithm by Zaliznyak for declension of Russian nouns. We are also planning to add to our module other features from Wikitiki's module (vocative/partitive/locative forms, additional usage note, replacement of specific forms etc.), and also continue implementation of Zaliznyak's algorithm (other special cases). So this module isn't fully furnished yet (but nevertheless it can be used for most of Russian nouns already).
- And one more thing: as development of the data module was getting harder and harder, we've abandoned the DSL idea and simply rewritten the unit in Lua: Module:User:Vitalik/inflection/units/ru-noun (but it's in "development" status for now). Vitalik (talk) 10:58, 13 August 2015 (UTC)
- How is it easier to use? The only thing it makes slightly easier is that you don't have to separate the stem from the ending manually. The only real advantage I see is the more thorough stress pattern scheme, but this can easily be added to my module as well. --WikiTiki89 11:32, 13 August 2015 (UTC)
- It could be argued that it's easier for a naive user to insert the nom sg. and the gender than to learn the various declension types (although you could argue against this too). Also arguing against this however is the fact that you can't always infer the declension type from the nom sg. and gender. In practice Vitalik will have to add a declension type argument in any case, although it could be left out in the most common cases. Benwing (talk) 12:00, 13 August 2015 (UTC)
Sorry to bother you again. I partially fixed the conjugation, but the participle should be оттёкший and the adverbial participle оттёкши. Unfortunately, there is no documentation, and I keep forgetting what is done there and how. I'll note that, knowing the exact Zaliznyak index and using his algorithm, one can generate all the forms of a word, even if you don't know yourself what the forms should be.--Cinemantique (talk) 18:33, 13 August 2015 (UTC)
- You did everything correctly, and the table shows the correct participle and adverbial participle, the same way I once reworked стечь. Unfortunately, it wasn't possible to apply Zaliznyak's logic 100% (many subtypes could be merged, given the predictable iotation), especially for the small verb classes, and subtypes with past-tense stress of type /b and /c (per Zaliznyak) often require additional parameters. --Anatoli T. (обсудить/вклад) 00:17, 14 August 2015 (UTC)
воз
The manually-specified declension of воз had accusative plural возо́в, same as genitive, although the noun is inanimate. I assume this is a mistake? Benwing (talk) 17:46, 14 August 2015 (UTC)
- @Benwing Yes, you're right, it's возы́. --Anatoli T. (обсудить/вклад) 01:34, 15 August 2015 (UTC)
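The rule behind this question is that the Russian accusative plural matches the genitive plural for animate nouns and the nominative plural for inanimate ones. A minimal Python sketch of that choice follows; the function name is hypothetical, and it is not taken from either declension module.

```python
def accusative_plural(nominative_plural, genitive_plural, animate):
    """Pick the accusative plural of a Russian noun from its animacy."""
    return genitive_plural if animate else nominative_plural
```

A bot could compare a manually entered accusative plural against this result and flag mismatches such as the one reported for воз.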
Can you check the decl, esp. the nom/acc/gen pl? Formerly it had the poetic form listed only for the nom pl, but I assume it's also acc pl, and the gen pl has irregular stress given. Benwing (talk) 18:43, 15 August 2015 (UTC)
- Checked. --Anatoli T. (обсудить/вклад) 00:12, 16 August 2015 (UTC)
Template:ru-noun-sib-3 and nouns харч, муж
@Wikitiki89 This template uses an unexpected genitive plural муже́й etc. Only two nouns харч, муж use this template. Is this genitive plural ending simply irregular or does it occur with a particular class of nouns? Module:ru-noun doesn't support a class with it, should it? Benwing (talk) 18:54, 15 August 2015 (UTC)
- The ending -ей is regular for masculine nouns whose stem ends in a sibilant.--Cinemantique (talk) 20:49, 15 August 2015 (UTC)
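As a rough illustration of that rule, the Python sketch below chooses between -ей and -ов for a consonant-final masculine stem. It is deliberately narrow and hypothetical: it ignores stress and does not handle stems in й, ь or ц, which take other endings.

```python
HUSHING_SIBILANTS = "шщчж"


def masculine_genitive_plural(stem):
    """Unstressed-spelling genitive plural for a consonant-final masculine stem.

    Only two cases are covered: -ей after ш, щ, ч, ж and -ов otherwise.
    """
    ending = "ей" if stem[-1] in HUSHING_SIBILANTS else "ов"
    return stem + ending
```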
This noun had acc pl подо́в but it's inanimate. Is that a mistake? Benwing (talk) 18:57, 15 August 2015 (UTC)
- Yes, that was a mistake.--Anatoli T. (обсудить/вклад) 00:13, 16 August 2015 (UTC)
Page needs accents
@Wikitiki89, Cinemantique Needs accents: абарга. Can you also check the headword and decl (including esp. the genitive/partitive) of холодок? Benwing (talk) 10:30, 17 August 2015 (UTC)
- I fixed холодок (it does not have an -у partitive; all instances I find of "в холодку" are Ukrainian). I have no idea about абарга. --WikiTiki89 10:48, 17 August 2015 (UTC)
- холодок has partitive. The stress in абарга is unknown.--Cinemantique (talk) 10:54, 17 August 2015 (UTC)
- It seems rare enough not to include it (a better search in the corpus you linked to only finds two instances). The normative form based on sheer numbers seems to be "в холодке". --WikiTiki89 11:04, 17 August 2015 (UTC)
- Well, it looks like the locative, and yes, it's rare. Examples of the partitive:
- ― Таня, ставь-ка, ты самовар да сбери чайку: куманек с холодку-то погреется. [П. И. Мельников-Печерский. В лесах. Книга первая (1871-1874)]
- Солдат, набившихся в барак, уже сыто воротило от кипятка ― хотелось постного холодку. [Олег Павлов. Степная книга (1990-1998)]
- --Cinemantique (talk) 11:13, 17 August 2015 (UTC)
- Oh, I was confusing locative with partitive. But still, it seems "холодка" is more common, although "холодку" definitely exists. --WikiTiki89 11:24, 17 August 2015 (UTC)
- Yes, it is known that we lose partitives more and more. As I said, this one is mentioned in Zaliznyak's dictionary. I prefer to use it precisely.--Cinemantique (talk) 11:41, 17 August 2015 (UTC)
- In Buryat, абарга (abarga, “огромный, колоссальный, исполинский, могучий”) is stressed on the penultimate, аба́рга (abárga). I would expect it to be the same in Russian. —Stephen (Talk) 13:16, 17 August 2015 (UTC)
Please check declensions for trial run of converting to Template:ru-noun-table
@Wikitiki89, Cinemantique I had my bot do one page from each of the existing noun declension templates. Could you check the declensions? Thanks. There are a lot of these templates, so there are about 75 pages to be checked (and lots of possibilities for error ...). Should be the last 75 entries, around 10:30 UTC on Aug 17. Benwing (talk) 10:35, 17 August 2015 (UTC)
- @Benwing Most are good, I am still checking. копьё (kopʹjó) is incorrect, please check against the older version or ru:копьё. I have reverted WingerBot's edit for now. --Anatoli T. (обсудить/вклад) 00:11, 18 August 2015 (UTC)
- вожжа was wrong. I've also made a change to вече (the secondary genitive plural wasn't in the table, so no prob with the bot). --Anatoli T. (обсудить/вклад) 01:12, 18 August 2015 (UTC)
- I fixed those specific cases (diff and diff), but your bot will need to add those templates as exceptions and fix them the way I did. --WikiTiki89 02:06, 18 August 2015 (UTC)
- Thanks to both of you! It looks like запястье and левша were also wrong before. I fixed the module so it knows about these predictable declension variations, which seem to be triggered by the stress pattern and presence of sibilants. Anatoli, can you recheck the following? [moved below]. If all of them are correct then we should be good to go. Benwing (talk) 05:41, 18 August 2015 (UTC)
- левша is still wrong. I have to say that the optional parameter for animacy is a bad idea and a way to make mistakes permanent. I have made such mistakes myself.--Cinemantique (talk) 05:48, 18 August 2015 (UTC)
- Thanks. Should be fixed now; the error was old. I suppose we could make the animacy parameter mandatory, although it seems to mostly work OK as is. Benwing (talk) 05:57, 18 August 2015 (UTC)
- I don't think the animacy parameter should be mandatory. Anyway, this is one of the reasons I made the table display animacy boldly in the table's title, so that if it were wrong, it would be clearly noticeable. Requiring the parameter will not change anything but give people the habit of putting "inanimate" on everything. --WikiTiki89 10:06, 18 August 2015 (UTC)
- Agreed ... generally when I have been editing "irregular" declensions to use the new {{ru-noun-table}}, the big boldface animacy msg is very helpful. Benwing (talk) 10:35, 18 August 2015 (UTC)
moved from above, list of nouns to check, expanded a bit since I did some work on Module:ru-noun: туш, ёж, сторож, овощ, Камбоджа, левша, душа, вожжа, битьё, копьё, запястье, произношение, бытие, калий, Россия, помощь, мышь. Benwing (talk) 10:39, 18 August 2015 (UTC)
- @Benwing I have added a parameter for душа (sg. acc.) (the stress pattern 4 is correct but it's not working) and checked the rest. --Anatoli T. (обсудить/вклад) 11:06, 18 August 2015 (UTC)
- Thank you!!! Benwing (talk) 11:08, 18 August 2015 (UTC)
- Maybe we should have a 4* and a 6* stress pattern for these cases? --WikiTiki89 11:22, 18 August 2015 (UTC)
- @Wikitiki89 We could implement the remainder of Zaliznyak's stress paradigm; see Template:ru-decl-noun-z. This would give us 2', 4', 6', 6'', although the double-prime is hard to type, so maybe 2*, 4*, 6*, 6** or 2a, 4a, 6a, 6b. (While we're at it, I could imagine implementing loc=+ and par=+ to give a regular locative/partitive according to rules (whatever they are). Benwing (talk) 11:50, 18 August 2015 (UTC)
- Yes, I like asterisk version. I would prefer to use two separate symbols for two separate things. How about 4*/6* and 2^/6^ (note that the latter case may actually be fully predictable, seemingly occurring always with feminine -ь words that are stressed on the ending in the singular)? I also like the idea of loc=+ and par=+ (I think the rules are that the partitive is the same as the dative and the locative is the same but always with final stress, but maybe there are exceptions). --WikiTiki89 12:23, 18 August 2015 (UTC)
- What do you think of labeling them 4a, 6a (a = accusative), 2i, 6i (i = instrumental). That way there's a mnemonic beyond simply the arbitrary symbols of * and ^ or a and b. Benwing (talk) 13:24, 18 August 2015 (UTC)
- Hmmm... those happen to be the same exact letters we use for animacy; that might make it even more confusing. I would just do 4* and 6* and leave the other thing up to the ь-f declension class (which I vaguely remember already doing before). --WikiTiki89 13:28, 18 August 2015 (UTC)
- I guess I didn't do it before, but I considered removing the stress mark on the "ью́" and have the module detect that and put the stress on the stem (as well as using the bare stem due to the consonant cluster created by this ending). --WikiTiki89 13:33, 18 August 2015 (UTC)
Declensions to check
@Wikitiki89, Cinemantique ход had nom. pl given as ходы́, хо́ды, хода́ but acc. pl given as only хода́, which I take as a mistake since it's inanimate. Benwing (talk) 13:23, 17 August 2015 (UTC)
- Yes, the accusative should be the same as the nominative. Also @Atitarev, does хо́ды (xódy) really exist? --WikiTiki89 13:29, 17 August 2015 (UTC)
- Yes, it does. Zaliznyak: "...м, 1c//1e, (в игре, в интриге)..."--Cinemantique (talk) 13:43, 17 August 2015 (UTC)
@Wikitiki89, Cinemantique More strangeness ... стан had the locative given as "в ста́не, на стану́". I ignored this and just put in стану́ as the locative. Which leads to
- Is the old entry correct?
- If so, should we include it in the declension table, or perhaps instead in a decl note or usage note? Benwing (talk) 13:40, 17 August 2015 (UTC)
- I think it's something professional today (only for meaning 'camp'). Also, I've found "в стану́" and it looks archaic.--Cinemantique (talk) 14:16, 17 August 2015 (UTC)
X as pinyin
I noticed that, in the translations section of X-ray, X射線 is listed with X-shèxiàn as its pinyin, as well as X光 as X-guāng. Should the X be changed to àikèsī? If we want to match entries such as OK, which has ōukèi, and, even more relevant, 卡拉OK, kǎlā'ōukèi, then perhaps we should make the change to the pinyin in the entries containing X. WikiWinters ☯ 韦安智 19:47, 17 August 2015 (UTC)
@Wyang Any opinion on the matter? WikiWinters ☯ 韦安智 20:07, 21 August 2015 (UTC)
Specifying whether to use на or в or both in the locative
@Wikitiki89, Cinemantique It appears that we used to do this in the manual templates. I've been deleting these prepositions but I wonder if this was the right thing, since Vitalik's module includes this info. What do you think? Should we bother? Benwing2 (talk) 15:04, 21 August 2015 (UTC)
- I think the prepositions should be discussed in usage notes and usage examples. Putting it in the table is insufficient and uninformative. --WikiTiki89 17:27, 21 August 2015 (UTC)
- In fact I think that ideally every entry with a partitive, locative, and/or vocative should have a usage note explaining the circumstances that each is used in. --WikiTiki89 20:18, 21 August 2015 (UTC)
- I agree ... of course, that takes a good deal of work. Benwing2 (talk) 20:53, 21 August 2015 (UTC)
- But my point is that if you don't do that, then adding the prepositions that it is used with is pretty useless. --WikiTiki89 21:35, 21 August 2015 (UTC)
- I'm not sure it's useless. I don't know Russian grammar that well but I think I've heard that some nouns use в and some use на and you kind of just have to memorize which goes with which, so it can be useful to be told that. Benwing2 (talk) 22:06, 21 August 2015 (UTC)
- That's mostly with place names and really has nothing to do with the locative (it applies to the accusative too). Anatoli has been handling this with usage examples. Also, the other preposition can always also be used to emphasize a more literal meaning. --WikiTiki89 22:21, 21 August 2015 (UTC)
Implementing bare param * to auto-construct the unreduced form
@Wikitiki89, Cinemantique One way to harmonize with Vitalik's module is to implement auto-constructing the bare form, locative and/or partitive when * is given instead of an actual form (or we could use some other char like +, whatever). Only issue: I don't know the rules to do so. Can some of you help? I'm guessing something like this: insert an е or о between the last two consonants. Specifically, insert an е if the preceding consonant is soft or used to be soft (presumably that includes all sibilants ш щ ч ж, plus ц and й, which disappears before е); otherwise insert an о. But that rule can't be completely right, because of -ец, which always has е. Maybe if either consonant is soft or used to be soft? Now, if the stress would fall on the ending, then presumably it shifts onto the inserted vowel, and е becomes ё. Is this right?
Sorry about the length, please don't TL;DR it ... Benwing2 (talk) 15:30, 21 August 2015 (UTC)
- Seems more complex, e.g. це́рковь < це́рквь- but бре́день < бре́днь-, and козёл < козл-, and totally irregularly, сестёр < сёстр-. Maybe you can only go from the full form to the reduced (stem) form, not the other way around? What does User:Vitalik do? Benwing2 (talk) 15:55, 21 August 2015 (UTC)
- With -ец, the preceding consonant always "used to be soft", because this -е- is derived from a former -ь-. I don't think it is always predictable which vowel it will be, but you can derive some helpful rules:
- After velars (к, г, х): always -о-.
- After sibilants (ч, ж, ш, щ, ц): -о- if stressed and not followed by a soft consonant or ц, -е- otherwise.
- After other consonants, I don't think it is predictable whether it would be -о- or -е/ё- (but the -ё- vs. -е- distinction follows the same rule as above).
- --WikiTiki89 17:39, 21 August 2015 (UTC)
- Thanks. Benwing2 (talk) 17:58, 21 August 2015 (UTC)
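The partial rules listed above for the inserted vowel can be sketched in a few lines. This is only an illustration in Python, not the code of Module:ru-noun or of Vitalik's module; the function name and its three parameters are hypothetical. It returns nothing for the "other consonants" case, which the rules describe as unpredictable.

```python
from typing import Optional

# Consonant classes named in the rules above.
VELARS = set("кгх")
SIBILANTS = set("чжшщц")


def inserted_vowel(prev_consonant: str, stressed: bool,
                   next_is_soft_or_ts: bool) -> Optional[str]:
    """Guess the vowel inserted after prev_consonant in an unreduced form.

    Hypothetical helper: returns "о" or "е" when the rules above decide
    the vowel, and None when the choice is not predictable.
    """
    if prev_consonant in VELARS:
        # After velars: always -о-.
        return "о"
    if prev_consonant in SIBILANTS:
        # After sibilants: -о- only if stressed and not followed by
        # a soft consonant or ц; otherwise -е-.
        if stressed and not next_is_soft_or_ts:
            return "о"
        return "е"
    # After other consonants the vowel has to be given explicitly.
    return None
```

A caller would fall back to an explicitly supplied form whenever the function returns None, which matches the idea that the reduced form can be derived from the full form more reliably than the reverse.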
Implementing loc=* and par=* to auto-construct the locative/partitive
@Wikitiki89, Cinemantique As for the partitive, is it always the same as the dative?
And as for the locative, it looks like it's always stressed, and it ends in -у́ after hard or ц (or sibilant consonants?), -ю́ after й (which disappears), -и́ after fem nouns in -ь. The only masculine -ь noun I could find is хмель which has locative во хме́лю (is the stress wrong?).
Benwing2 (talk) 15:30, 21 August 2015 (UTC)
- I think it's always the same as the dative. And I also think the locative is always the same as the dative except for the stress. And yes, it should be хмелю́ (see gramota.ru). Russian Wiktionary lists хме́лю as the partitive, but gramota.ru does not mention this. --WikiTiki89 17:51, 21 August 2015 (UTC)
- I forgot to mention that I would much prefer a plus sign here (loc=+). --WikiTiki89 18:03, 21 August 2015 (UTC)
- No problem with me. I use |pl=+ in some of my Arabic declension templates to indicate a sound plural, and it's analogous. 18:06, 21 August 2015 (UTC)
- It's also complementary to (for example) nom_sg=- to omit something. --WikiTiki89 18:46, 21 August 2015 (UTC)
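As a rough illustration of the loc=+ idea, the sketch below expands a plus sign into an auto-built locative, assuming (as suggested in this thread) that the locative equals the dative singular with the stress moved to the ending. It is written in Python for readability and is not the module's actual Lua code; the names end_stress and expand_loc are hypothetical, and the sketch ignores complications such as ё losing its stress. A later comment in this discussion mentions values like 'loc=в +', so the plus sign is replaced wherever it occurs in the parameter.

```python
ACUTE = "\u0301"  # combining acute accent used to mark stress
VOWELS = "аеёиоуыэюяАЕЁИОУЫЭЮЯ"


def end_stress(word: str) -> str:
    """Drop existing stress marks and stress the last vowel of the word."""
    bare = word.replace(ACUTE, "")
    for i in range(len(bare) - 1, -1, -1):
        if bare[i] in VOWELS:
            return bare[:i + 1] + ACUTE + bare[i + 1:]
    return bare  # no vowel found, nothing to stress


def expand_loc(param: str, dative_sg: str) -> str:
    """Replace "+" in a loc= value with the end-stressed dative singular."""
    if "+" in param:
        return param.replace("+", end_stress(dative_sg))
    return param  # an explicit form is used unchanged
```

For example, expand_loc("+", "хме́лю") would give the end-stressed form хмелю́ discussed above, while an explicitly written locative passes through untouched.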
Could you patrol this page? It looks a complete mess, possible vandalism, but the user who made it has a history of credible edits...? ---> Tooironic (talk) 11:31, 23 August 2015 (UTC)
- @Tooironic I think it was done in good faith but I have RFD'ed it. --Anatoli T. (обсудить/вклад) 23:07, 23 August 2015 (UTC)
- Many thanks. ---> Tooironic (talk) 06:08, 24 August 2015 (UTC)
Nouns in -ня
@Cinemantique, Wikitiki89 Vitalik's module has a special case so that reducible nouns in -ня have gen pl in -ен instead of -ень. This appears to apply to most (all?) nouns in -льня and also вишня and песня and башня and пекарня but not деревня or кухня (gen pl in -онь), going by ruwiki. Is there a general rule? More generally, in -я nouns, what are the exact rules for the genitive plural? Does -й always appear after a vowel? When does -ь appear and when does it not? There's currently a bug in Module:ru-noun in handling the gen pl of reducible -я nouns. Benwing2 (talk) 07:46, 26 August 2015 (UTC)
- @Benwing2 Russian inflection is way too complicated and irregular. It's unlike most other European (and other) languages, including Slavic languages. There are various examples even for -ня endings but they are just examples, not comprehensive lists of all possible situations. A. Zaliznyak's reference is by far the most comprehensive reference of the Russian inflections. I have a djvu copy in Russian, not searchable, all in Russian, 15 MB. I can share it with you, so you can have an idea. I can translate bits and pieces of it. --Anatoli T. (обсудить/вклад) 14:06, 26 August 2015 (UTC)
- If you could make that available it would be great; contact me by email. Benwing2 (talk) 08:22, 27 August 2015 (UTC)
- @Benwing, Benwing2 Hi, I've emailed you using your Benwing2 account but got no reply. I need your email address, so that I can send you the file (15 MB) (Wiktionary email functionality doesn't allow attachments). Please email me from Wiktionary and I will reply. Also, you need a DJVU reader installed. --Anatoli T. (обсудить/вклад) 00:40, 28 August 2015 (UTC)
- Done, thanks. Benwing2 (talk) 10:45, 28 August 2015 (UTC)
Hello Atitarev,
As a native French speaker, I don't think avoir une liaison is a good translation for have an affair [8]. We would rather say avoir une liaison extraconjugale. We don't have an expression that is exactly equivalent in French afaik. — Automatik (talk) 16:39, 30 August 2015 (UTC)
- @Automatik Thanks. You can add {{rft}}, {{rfd}} or {{rfv}} to this. I know it's ambiguous but I found some evidence when making the entry that it's also used in the sense of to have an affair but I may be wrong. --Anatoli T. (обсудить/вклад) 23:25, 9 September 2015 (UTC)
- I opened a discussion: Wiktionary:Requests for verification#avoir une liaison. Please add your evidence to it if you still remember. — Automatik (talk) 08:18, 10 September 2015 (UTC)
Traleyka
I found this in a citation for Denali. Is it a Russian word? DTLHS (talk) 03:35, 31 August 2015 (UTC)
- Wikipedia claims (with citations I’ve not checked) that it comes from the Dena’ina name for the mountain, Dghelay Ka’a. Vorziblix (talk) 08:09, 31 August 2015 (UTC)
- @DTLHS Sorry for the delay. I have no idea what Traleyka is. --Anatoli T. (обсудить/вклад) 23:11, 9 September 2015 (UTC)
Looks like some kind of nonsense to me.--Cinemantique (talk) 16:33, 5 September 2015 (UTC)
The imperative needs to be fixed. I can't figure out how to do it. Zaliznyak's index is as simple as can be: св 6a.--Cinemantique (talk) 15:38, 6 September 2015 (UTC)
- @Cinemantique Done. It's not all that simple. Many verbs of subtype 6a have either "и" (e.g. вызови, вырви) or "ь" in the imperative. The function conjugations["6a"] in Module:ru-verb has a fourth parameter for cases where "и" needs to be changed to "ь". --Anatoli T. (обсудить/вклад) 23:09, 9 September 2015 (UTC)
- Perhaps it's worth making the module smarter, and Benwing could help us with that? For 6a the rule is as follows:
- If the verb ends in -ять, the imperative ending is -й.
- If the verb ends in -ать, then
- if it has the stressed prefix вы-, the imperative ending is -и;
- otherwise you need to look at the consonants at the end of the stem:
- if the stem ends in л, н or р, the imperative ending is -ь;
- if the stem ends in б, м or п, the imperative ending is -и, except for the verb сыпать and its derivatives (which have -ь);
- if the stem ends in г, д, з, к, с, т or х (that is, in the remaining cases), then
- if the stem ends in a single consonant, the imperative ending is -ь,
- if it ends in more than one, then -и (in practice this is three verbs and their derivatives: кудахтать, брызгать and рыскать).
--Cinemantique (talk) 00:10, 10 September 2015 (UTC)
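The decision rule above can be sketched as a small function. This is an illustration in Python under stated assumptions, not the implementation in Module:ru-verb (which is written in Lua): the function name is hypothetical, the infinitive is expected without stress marks, and the caller has to say whether the verb carries the stressed prefix вы-, since that cannot be read off the unaccented spelling.

```python
VOWELS = set("аеёиоуыэюя")


def imperative_ending_6a(infinitive: str, stressed_vy_prefix: bool = False) -> str:
    """Return the imperative ending ("й", "и" or "ь") for a type 6a verb.

    Hypothetical helper following the rule listed above; `infinitive`
    is lowercase and unaccented, e.g. "резать".
    """
    if infinitive.endswith("ять"):
        return "й"
    if not infinitive.endswith("ать"):
        raise ValueError("expected an infinitive in -ать or -ять")
    if stressed_vy_prefix:
        return "и"
    stem = infinitive[:-3]  # strip -ать
    last = stem[-1]
    if last in "лнр":
        return "ь"
    if last in "бмп":
        # сыпать and its derivatives are the listed exception.
        return "ь" if stem.endswith("сып") else "и"
    # Remaining consonants: count how many close the stem.
    trailing = 0
    for ch in reversed(stem):
        if ch in VOWELS:
            break
        trailing += 1
    return "ь" if trailing == 1 else "и"
```

With this sketch, резать and плакать get -ь, дремать gets -и, and кудахтать, брызгать and рыскать get -и because their stems end in two consonants. A fourth-parameter override, as in the current module, would still be needed for anything the rule misses.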
- Yes, of course, it can always be improved. The module still lacks logic for the past-tense stress patterns; this is done manually when the stress falls on the ending (/b and /c). And the documentation needs to be written; so far it exists only for the first three types.--Anatoli T. (обсудить/вклад) 00:22, 10 September 2015 (UTC)
Can you help review new declension tables?
@Cinemantique, Wikitiki89 I have created a new version of Module:ru-noun that allows you to specify declensions close to how they appear in Zaliznyak's dictionary, while still maintaining compatibility with the current way of doing things. I also created a new Module:ru-adjective that lets you specify short adjectival forms the Zaliznyak way. I put all of Zaliznyak's test cases into the following locations:
- User:Benwing2/test-ru-noun-m
- User:Benwing2/test-ru-noun-m-2
- User:Benwing2/test-ru-noun-f
- User:Benwing2/test-ru-noun-f-2
- User:Benwing2/test-ru-noun-n
- User:Benwing2/test-ru-adjective (only the short adjective forms need reviewing)
- User:Benwing2/test-ru-adjective-2 (only the short adjective forms need reviewing)
I've reviewed the declensions myself but I might have missed something; can you help review them? I marked the ones I think particularly need checking; if you could at least review those and maybe some of the rest, that would be great. Benwing2 (talk) 13:03, 11 September 2015 (UTC)
- test-ru-adjective: wrong stress in позывны́е, чаевы́е.--Cinemantique (talk) 21:38, 11 September 2015 (UTC)
- @Cinemantique Fixed, thanks. Benwing2 (talk) 01:38, 12 September 2015 (UTC)
- @Benwing2 Checked User:Benwing2/test-ru-noun-m - all good. You must have meant лиша́й (lišáj), not луша́й (lušáj), which doesn't exist. --Anatoli T. (обсудить/вклад) 00:30, 12 September 2015 (UTC)
- Thanks! Indeed, луша́й (lušáj) was a typo. Benwing2 (talk) 00:34, 12 September 2015 (UTC)
- In User:Benwing2/test-ru-noun-m-2 - gen. pl of боти́нок, боти́ночек, чуло́к are incorrect. Still checking... --Anatoli T. (обсудить/вклад)
- @Benwing2 Please change общла́г to обшла́г, сапожо́к misses some stress marks. 2nd variant of gen pl for череви́чек is incorrect (should be череви́чек). Haven't finished yet ...--Anatoli T. (обсудить/вклад) 00:43, 12 September 2015 (UTC)
- @Benwing2 still in User:Benwing2/test-ru-noun-m-2. Was it decided to use prepositions for the locative? Just asking. Not sure I like the idea but probably OK if it's optional. --Anatoli T. (обсудить/вклад) 00:47, 12 September 2015 (UTC)
- @Benwing2 Please fix чёрт, it's irregular. See ru:wiki for the full table. --Anatoli T. (обсудить/вклад) 00:49, 12 September 2015 (UTC)
- Yuck, the errors in боти́нок, боти́ночек, чуло́к, череви́чек were due to a bug fix for щено́к that I made ... surprised this didn't mess up more things. Fixed обшла́г, thought I had checked сапожо́к as I specifically have code for this one but looks like not. Will fix чёрт. I put the prepositions in the locative because Zaliznyak indicates them, but it's certainly not necessary; you can just write 'loc=+' and you'll get the plain locative. If you write 'loc=в +' or whatever you get the version with the preposition. Benwing2 (talk) 01:09, 12 September 2015 (UTC)
- @Benwing2 Done User:Benwing2/test-ru-noun-m-2. Typo: плоскозу́вцы -> плоскозу́бцы. The term кочкари́ actually has a singular form - кочка́рь, it's not pluralia tantum. Thanks! --Anatoli T. (обсудить/вклад) 01:13, 12 September 2015 (UTC)
- OK, сапожок should hopefully be fixed, on the assumption that the stress in the plural is сапо́жки (as with other pattern-d nouns) or сапожки́ (as with pattern-b nouns). Benwing2 (talk) 01:29, 12 September 2015 (UTC)
- Fixed чёрт. Does it have singular forms like черта́ or only чёрта? The current en wiktionary has both forms but the Russian wiktionary only has the latter. Benwing2 (talk) 01:36, 12 September 2015 (UTC)
- As for кочкари, it is indicated as plurale tantum in Zaliznyak -- I wonder why? Benwing2 (talk) 01:39, 12 September 2015 (UTC)
- The form черта́ is presented only in the term ни черта. Presenting it as regular is a bad idea. I don't know where the other forms came from (черту́, черто́м, черте́).--Cinemantique (talk) 04:31, 12 September 2015 (UTC)
- @Benwing2, Cinemantique I think we can allow both stress patterns now that we can add notes. Yes, черта́ is only used in ни черта́ (ni čertá) and in identical expressions with vulgar words and their milder equivalents: ни хуя́ (ni xujá), ни хера́ (ni xerá), ни хрена́ (ni xrená), ни фига́ (ni figá), ни шиша́ (ni šišá) from lemmas хуй (xuj), хер (xer), хрен (xren), фиг (fig), шиш (šiš). Please note that the two gen. sg forms of час (čas) - ча́са, часа́ also differ in usage: бо́льше ча́са - more than an hour, два/три/четыре часа́ - two/three/four hours and the derivation полчаса́.
- As for кочкари, I wouldn't worry too much about it. There could be a pluralia tantum sense we don't know or just Zaliznyak's mistake. Some Russian pluralia tantum do have singular forms but they are either rare or obsolete.
- черту́, черто́м, черте́ should be removed. --Anatoli T. (обсудить/вклад) 05:26, 12 September 2015 (UTC)
I put черта́ as an alternative genitive singular with a note saying it's only used in ни черта́. Benwing2 (talk) 05:52, 12 September 2015 (UTC)
- @Benwing2 I have checked User:Benwing2/test-ru-noun-f with two typo questions on the talk page (цаве́рна->таве́рна, ни́вхка?). Minor: the order of forms for це́рковь should be "церквя́м, церква́м", etc. as in the current entry. "я" in the plural forms is much more current and common. --Anatoli T. (обсудить/вклад) 21:38, 12 September 2015 (UTC)
- User:Benwing2/test-ru-noun-f-2 - checked as well, great :) --Anatoli T. (обсудить/вклад) 21:50, 12 September 2015 (UTC)
- Thanks very much! цаве́рна was actually царе́вна, and ни́вхка is a word that might mean a Nivkh woman or something. Z has the order "церква́м/церквя́м" for some reason but I fixed it. Benwing2 (talk) 23:41, 12 September 2015 (UTC)
- Thanks, царевна is animate, though. --Anatoli T. (обсудить/вклад) 23:49, 12 September 2015 (UTC)
- Fixed. Benwing2 (talk) 00:21, 13 September 2015 (UTC)
@Benwing2 Re: User:Benwing2/test-ru-noun-n
- деревце and деревцо shouldn't share one declension table; they have different entries. The declension is correct but entry деревцо's current table is wrong.
- плечо should have a stress mark in gen. pl for consistency (even if it's one syllable). If we agree to remove them in monosyllabic forms, they should be removed from other terms as well.
- I don't know the word сверлило. Is it really animate? (I haven't checked)
- The rest looks good! --Anatoli T. (обсудить/вклад)
- Сверлило корабельный.--Cinemantique (talk) 11:34, 14 September 2015 (UTC)
- Thanks. That's a masculine, not a neuter then. --Anatoli T. (обсудить/вклад) 11:38, 14 September 2015 (UTC)
- Yes, but it's declined as neuter.--Cinemantique (talk) 11:54, 14 September 2015 (UTC)
- @Atitarev Thanks very much for reviewing these pages! Benwing2 (talk) 10:31, 15 September 2015 (UTC)
What to do about proper names?
@Cinemantique, Wikitiki89 I added support for proper names in my new ru-adjective. This is the old {{ru-adj11}}, which hasn't yet been converted. Per Zaliznyak I only made them masc, fem and plural (no neuter), but {{ru-adj11}} also lists neuter, which it describes as "only for place names" with various caveats about how place names might be declined differently. What do we want to do about these? See User:Benwing2/test-ru-adjective-2 for some examples. Benwing2 (talk) 03:42, 12 September 2015 (UTC)
- BTW in my Arabic declension tables I have a template {{ar-decl-gendered-noun}} specifically for supporting nouns that come in masculine and feminine variants (no neuter in Arabic), although in Arabic this is intended for common nouns referring to people, which are obviously nouns not adjectives; I suppose surnames should be treated similarly but it's not totally obvious to me whether they're nouns or adjectives in Russian. Benwing2 (talk) 03:46, 12 September 2015 (UTC)
- Possibly, I'm a heretic in this case. I think we have three different words – Пушкин (m-an), Пушкина (f-an), and Пушкино (n-in). Because Russian nouns never change the gender.--Cinemantique (talk) 04:40, 12 September 2015 (UTC)
- No, you're not a heretic. It's just easier (?) to combine all three genders in one and you don't have to create a feminine form of the surname if you make a masculine. It's not always easy, though. Neuter proper nouns ending in -ево/-ёво/-ово/-ино (if formed from possessive-like surnames) have a tendency to be indeclinable, it depends on the speaker and how adapted those words are. E.g. Сараево is more likely to be indeclinable than Шереметьево. Stress may also differ, e.g. Ива́ново (Ivánovo) is different from Ивано́в (Ivanóv) and Ивано́ва (Ivanóva). (Ива́нов (Ivánov) is the older Russian pronunciation and modern Bulgarian.). @Benwing2 You can see the linked neuter entries for how they are inflected. --Anatoli T. (обсудить/вклад) 05:15, 12 September 2015 (UTC)
- I'm thinking we should leave the neuters as separate words, since they clearly can have a declension that's independent of the proper name that they're derived from, and only list the masculine and feminine under the surname. Benwing2 (talk) 05:23, 12 September 2015 (UTC)
- A note needs to be added saying that the short forms with one Н are used with a complement in the dative case (Она очень предана ему), while those with НН are used without a complement (Она добра и преданна). Could you help?--Cinemantique (talk) 12:32, 12 September 2015 (UTC)
- @Cinemantique Done. --Anatoli T. (обсудить/вклад) 13:06, 12 September 2015 (UTC)
- @Cinemantique Oh, no, only the note; I don't know how to add the forms. --Anatoli T. (обсудить/вклад) 13:09, 12 September 2015 (UTC)
- Thanks. I moved it to the adjective.--Cinemantique (talk) 13:10, 12 September 2015 (UTC)
created more new terms, please review
@Cinemantique, Wikitiki89 Created the following words (sample reducible words in Zaliznyak). Added defns where I could find them. Please review, thanks!
- кочан
- увалень
- черевичек
- плавни
- дровни
- плоскозубцы
- бубна
- нивхка
- крепостца
- вайя
- путля
- козлы
- мохны
- капли
- полдни
- штанишки
- грабельки
- обойки
- портки
- пяльцы
- брашно
- тягло
- сельцо
- долотцо
- болотце
- жнивьё
- волоконце
- верховье
- озерцо
- сиверко
- народишко
- соловейко
- воронко
- кросна
Benwing2 (talk) 12:21, 24 September 2015 (UTC)
- @Wanjuscha, Wikitiki89 Thanks very much for your edits. Benwing2 (talk) 12:52, 25 September 2015 (UTC)
@Cinemantique, Wikitiki89, Wanjuscha More words:
- цыплёночек
- мышоночек
- мадьяр
- чувяк
- кочкари
- трусики
- помои
- прелиминарии
- зеленя
- торока
- рохля
- растеря
- судия
- бразды
- ладоши
- сатурналии
- мазло
- сверлило
- когтище
- домище
- волчище
- войска
- прения
- письмена
- кострец
Thanks!
Benwing2 (talk) 15:25, 26 September 2015 (UTC)
- @Atitarev, Wanjuscha Thank you. Benwing2 (talk) 20:28, 27 September 2015 (UTC)
a few more nouns
@Wikitiki89, Cinemantique, Wanjuscha
- фурия
- каланча
- маца
- дрога
- ружьецо
- пол-очка
- полкилометра
- кила
- урема
- урёма
- скирда
- мрежа
- кумжа
- кобза NOTE: ruwiki claims accent d/a, Z. claims a/b; which is correct?
- сило
Benwing2 (talk) 06:23, 30 September 2015 (UTC)
@Atitarev, Wikitiki89, Cinemantique, Wanjuscha Thank you! Here's another batch ...
- лития
- корчма
- гумно
- стебло
- гребло
- скребло
- жевело
- питие
- пакибытие
- инобытие
- древние
- домашние
- окружающие
- авторские
- полымя
@Atitarev, Wikitiki89, Cinemantique, Wanjuscha Various words in пол-. Please check the pronunciation carefully; also, some of the words are not indicated in Z. as having oblique cases so I take it they exist only in the nominative and accusative singular. If that isn't right, please let me know and I'll fix them! Benwing2 (talk) 05:35, 4 October 2015 (UTC)
- полмиллиона
- полтысячи
- пол-литра
- полведра
- полдела
- полчасика
- полстолька
- полжизни
- полпорции
- полставки
- полбутылки
- полтаблетки
- полночи
- полдороги
- полпути
- полномера
- Are you sure about the gender of these terms? Z. gives examples like прошло́ полчаса́ and полжи́зни про́жито and полде́ла сде́лано to show that these terms can take neuter singular agreement (also для этого потребуется полгода although this appears not to indicate the gender). They also can take plural agreement, первые полчаса, эти полгода, каждые полкилометра etc. Since plural agreement can still be neuter I'd argue these are neuter rather than whatever the underlying gender is. Benwing2 (talk) 13:59, 4 October 2015 (UTC)
- Yes, you're right, sorry. I used some genders from dictionaries, I had doubts myself but your test is right, "полмиллиона" must be also plural, like "полчаса". For declinable nouns I think we should allow undeclined first part "пол-" as a colloquial form. Sorry, I will be busy today. --Anatoli T. (обсудить/вклад) 20:17, 4 October 2015 (UTC)
more about пол- words
@Atitarev, Wikitiki89, Cinemantique, Wanjuscha Could someone help translate the following? It's a usage note from ruwiki about the agreement characteristics of пол- words. "в норме разговорной речи первая часть пол не изменяется. Согласованные определения и прилагательные в составном сказуемом согласуются во множественном числе; при наличии согласованного определения глагольные формы согласуются во множественном числе, иначе — в единственном числе и среднем роде"
- Thanks, Benwing2 (talk) 05:35, 5 October 2015 (UTC)
Also, Anatoli, a few more issues:
- Can the words полведра, полвека, полгода, полкилометра, пол-литра, полмесяца, полметра, полмиллиона, пол-очка, полсотни, полтысячи, полчаса actually be declined in the plural? All, or some, or none? Currently they're all listed as having plural forms.
- Can you review the pronunciation of the words in Category:Russian words prefixed with пол-?
- The following words still have {{rfdef}} in them: полбутылки, полдела, полжизни, полпорции, полставки, полстолька, полтаблетки, полчасика
Thanks, Benwing2 (talk) 06:14, 5 October 2015 (UTC)
- @Benwing2 Yes, I know there are still words to be checked and translated.
- Translation of the passage: "In the norm of the colloquial speech the first part пол doesn't change. Agreed modifiers (attributes) and adjectives in a compound predicate agree in the plural; if an agreed modifier (attribute) is present verb forms agree in the plural, otherwise - in singular and neuter."
- I can't think how some of the above words can be used in the plural but I'll check them when I have more time. "получа́сы" and "полуго́ды" are rare and sound a bit strange to me but they are attestable. Where did you get some of them? E.g. "полуведра́" as pl nom is incorrect, IMO. It should be "полувёдра", "полувёдер" ... --Anatoli T. (обсудить/вклад) 12:01, 6 October 2015 (UTC)
- On second thought, "полуведро" is another word, different from "полведра". Still interested where you got it from. --Anatoli T. (обсудить/вклад) 12:03, 6 October 2015 (UTC)
- I am of the opinion that all полу- words should be separate lemmas from пол- words. A templated usage note (perhaps at {{U:ru:half}}) can explain that when declension is needed, the полу- lemma is used. --WikiTiki89 14:32, 6 October 2015 (UTC)
- Anatoli -- thanks for the translation. At some point soon I'll add it as a usage note to the пол- words. I fixed полведра, oops. полведра is found in Zaliznyak, in the main section, also it's one of the example words given in the discussion of пол- words on pp. 73-74 (1980 edition). BTW I added one more word I found, полномера.
- Wikitiki -- I'm not sure why they need to be separate lemmas; it is a straightforward case of suppletion (and not even especially suppletive, since the forms are so similar). We don't e.g. consider went a separate lemma from go. It would make more sense to list the полу- forms as non-lemma entries ("genitive singular of X", etc.). Benwing2 (talk) 06:49, 7 October 2015 (UTC)
- Just a quick note that полведра and полуведро are different words, just like полночь and полночи. --Anatoli T. (обсудить/вклад) 07:13, 7 October 2015 (UTC)
- @Benwing2: Because the nominatives/accusatives also exist for the полу- forms. (@Atitarev, your comparison is flawed, полведра is like полночи, but полуведро is like полуночь, not like полночь.) --WikiTiki89 15:12, 7 October 2015 (UTC)
- @Wikitiki89 OK. Zaliznyak writes the полу- forms with a * implying the nom sg doesn't exist, but he may be wrong. Benwing2 (talk) 08:46, 8 October 2015 (UTC)
- @Wikitiki89: It depends what you have in mind when comparing. What I meant was, "полно́чи" means "half a night" and "полведра́" means "half a bucket" but "по́лночь" means "midnight" and "полуведро́" is a type of container, THEY ARE NOT SYNONYMS of "полно́чи" and "полведра́". From this perspective, your example "полу́ночь" is a synonym of "по́лночь", it's not what I meant but you are contrasting prefix usages, not senses.
- BTW, for "по́лночь" the stress pattern "полуно́чи", etc. is also acceptable. @Benwing Yes and no, you see the nominative forms with "полу-" may mean a different thing, e.g. "полша́ра" (half a sphere/ball), "полуша́рие" (hemisphere) or other examples. --Anatoli T. (обсудить/вклад) 09:28, 8 October 2015 (UTC)
- But полушарие has an extra suffix at the end. полшара would have a similar meaning to полушар. --WikiTiki89 14:20, 8 October 2015 (UTC)
What do we do about entries like this which have different classifiers for different senses? ---> Tooironic (talk) 15:38, 2 October 2015 (UTC)
- Just do "classifier: 粒 (lì)" next to a translation of a sense or similar. --Anatoli T. (обсудить/вклад) 11:51, 3 October 2015 (UTC)
- I ended up going with Usage Notes as I think that looks neater. Thanks. ---> Tooironic (talk) 14:40, 3 October 2015 (UTC)
Would you mind checking the Japanese and Korean? Since the Chinese translation given was dodgy, the other translations may be as well. ---> Tooironic (talk) 15:43, 2 October 2015 (UTC)
- @Tooironic Checked. --Anatoli T. (обсудить/вклад) 11:51, 3 October 2015 (UTC)
- Cheers. ---> Tooironic (talk) 14:48, 3 October 2015 (UTC)
Could you look at this? I'm fairly certain that this is a Cantonese word (the Google search in References supports this) and would remove the Mandarin, but the entry was created as a Mandarin one. —suzukaze (t・c) 02:48, 5 October 2015 (UTC)
- I've converted it to both Mandarin and Cantonese. It makes sense that Q and 7 are used synonymously in Mandarin, not in Cantonese. A plain Google search can't be used as a reference. --Anatoli T. (обсудить/вклад) 11:45, 6 October 2015 (UTC)
Best way to translate устар.?
Is this best translated as "obsolete", "archaic" or "dated"? Or does it depend? Benwing2 (talk) 06:19, 5 October 2015 (UTC)
- The word "устаревший" literally translates roughly to "dated" (i.e. "something that became old"), but in Russian dictionaries it is used for terms that we would classify as any of "dated", "archaic", or "obsolete". --WikiTiki89 15:08, 5 October 2015 (UTC)
- "Dated" is a more common translation and it also seems more commonly used than "obsolete", "archaic" is архаи́чный (arxaíčnyj), abbreviation - "арх.". --Anatoli T. (обсудить/вклад) 11:32, 6 October 2015 (UTC)
new words
@Atitarev, Wikitiki89, Cinemantique, Wanjuscha
- дереза
- камка
- держидерево
- трепло
- домохозяин
- квартирохозяин
- сохозяин
- получеловек
- обезьяночеловек
- хазарин
- поднаём
- перенаём
- перёд
- христос
- ботиночек
Thanks, Benwing2 (talk) 07:41, 5 October 2015 (UTC)
- @Benwing2: Note that {{diminutive of}} is a definition-line template, not an etymology template. --WikiTiki89 15:04, 5 October 2015 (UTC)
- OK. So the etymology should spell out the suffix? Benwing2 (talk) 15:28, 5 October 2015 (UTC)
- OK, I see what you mean, as in ботиночек (botinoček). Benwing2 (talk) 15:28, 5 October 2015 (UTC)
- {{diminutive of}} could be used in etymology section, especially if a term acquired a new sense. I don't see it as a big deal either way. Diminutives are formed with various suffixes, plus possible phonetic changes, anyway. --Anatoli T. (обсудить/вклад) 11:36, 6 October 2015 (UTC)
- Exactly, therefore it is better to give the specific suffix than to just say "diminutive of". Anyway, the {{diminutive of}} template is formatted for definition lines, which is incorrect for etymology sections. If you want to mention that it is a diminutive in the etymology section just write out "diminutive of" without a template. --WikiTiki89 14:36, 6 October 2015 (UTC)
- This is what I mean. The term пирожок is diminutive by the etymology, not by the sense. Пирог and пирожок are different dishes. Adding the template in the etymology section will add to the right category. --Anatoli T. (обсудить/вклад) 21:12, 6 October 2015 (UTC)
Another set of words:
@Atitarev, Wikitiki89, Cinemantique, Wanjuscha
- кегля
- ракля
- цикля
- букля
- пукля
- мямля
- семенодоля
- семядоля
- сопля (second entry)
- гуля
- каракуля
- беремя
- льносемя
- полуимя Please verify the secondary stress, I guessed it's on по̀л-
Thanks, Benwing2 (talk) 07:54, 7 October 2015 (UTC)
Модуль транскрипции / Transcription module
- Hi, Anatoli! I've decided to port the transcription module to ruwiki, but I'd like the bugs to be fixed first. It seems that this edit (namely the addition and not match(syl, 'ʹ' .. non_vowels)) did more harm than good. The word тьфу was indeed fixed, but open the previous revision of the module in the editor and preview the test page: five new bugs were absent before that edit. And possibly some others have appeared in entries. Could you explain this problem to Benwing?--Cinemantique (talk) 22:50, 6 October 2015 (UTC)
- I mean words like сельдь. It seems that any syllable of the form "soft consonant (1 or more) + vowel + consonant (1 or more) + ь + consonant" is now processed incorrectly: the vowel is changed, the soft consonant is depalatalized, and odd things also happen in the consonant cluster before ь.--Cinemantique (talk) 23:04, 6 October 2015 (UTC)
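To make the syllable shape described above concrete, here is a rough Python sketch (not the module's Lua code) that flags words containing it. The letter classes and the name `has_risky_syllable` are assumptions made for illustration, and "soft consonant" is approximated as any consonant standing before е, ё, и, ю or я.

```python
import re

# Assumed letter classes, for illustration only.
CONSONANTS = "бвгджзйклмнпрстфхцчшщ"
SOFTENING_VOWELS = "еёиюя"  # vowels that mark the preceding consonant as soft

# consonant(s) + softening vowel + consonant(s) + ь + consonant
RISKY_SYLLABLE = re.compile(
    rf"[{CONSONANTS}]+[{SOFTENING_VOWELS}][{CONSONANTS}]+ь[{CONSONANTS}]"
)


def has_risky_syllable(word):
    """Return True if the word contains the shape C' + V + C + ь + C."""
    # Drop combining acute/grave stress marks before matching.
    plain = word.lower().replace("\u0301", "").replace("\u0300", "")
    return RISKY_SYLLABLE.search(plain) is not None


print(has_risky_syllable("сельдь"))  # shape present
print(has_risky_syllable("тьфу"))    # no vowel before ь, so not flagged
```

A check like this could be run over the test page's word list to see which entries are exposed to the regression; it says nothing about how the module itself should be fixed.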
- @Cinemantique OK. Are you shy of your English? You shouldn't be. :) Could you link me again to Avanesov's book on the Russian pronunciation, please? I've got the book but we might need it for reference.
- @Benwing2 Hi. Cinemantique and I would like to ask you to take a look at Module:ru-pron, if you can, please. I am not sure if Wyang will come back to Wiktionary, he seems to be very busy in real life.
- OK. This diff fixed the pronunciation of тьфу but must have introduced some new errors. A new problem is that every occurrence of a consonant with "ь" followed by another consonant or vowel is not processed correctly, such as missing devoicing, etc.
- Current common or obvious issues.
- Consonant cluster "вств" should no longer be converted to [stv], there are more cases where it's [fstv]. I think I removed that piece of coding but it has come back. Words where it's pronounced [stv] should have |phon= or just skip the first "в" in the parameter.
- Double vowels and cognates shouldn't produce long vowels.
- Secondary stress doesn't work with some vowels, e.g. "и".
- Unstressed "е" and "я" (and "а" after sibilants) should be treated identically. Please let us know if you have any questions. The failed cases seem straightforward. --Anatoli T. (обсудить/вклад) 11:38, 7 October 2015 (UTC)
- Thanks. Here is Avanesov's book. Note that different sources give different information. I've found a table in this textbook:
Phoneme | Stressed: uncovered syllable | Stressed: after hard consonant | Stressed: after soft consonant | Unstressed: uncovered syllable | Unstressed: after hard consonant, 1st-degree reduction | Unstressed: after hard consonant, 2nd-degree reduction | Unstressed: after soft consonant, all syllables except final open | Unstressed: after soft consonant, final open syllable |
---|---|---|---|---|---|---|---|---|
/у/ | у | у | •у | у | у | у | •у | •у |
/и/ | и | ы | и | иэ | ыə | ыə | иэ | иэ |
/э/ | э | э̙ | •э | иэ | ыə | ə | иэ | иэ |
/о/ | о | о | •о | аə | аə | ə | иэ | •ə |
/а/ | а | а | •а | аə | аə | ə | иэ | •ə |
It gives slightly different information from Avanesov's tables.--Cinemantique (talk) 13:30, 7 October 2015 (UTC)
- This code is rather tricky. I fixed some bugs but there are still more. I eliminated reduction of вств but that introduces some additional test failures; I also tried to fix conversion of double vowels to long vowels but the test cases are inconsistent in whether this happens across hyphens. Benwing2 (talk) 08:43, 8 October 2015 (UTC)
- I've removed long vowels from the code in ruwiki. I think Russian doesn't have long vowels. But two vowels [əə] in the middle of a word are pronounced like [əɐ] or [ɐɐ] (there are different points of view).--Cinemantique (talk) 09:41, 8 October 2015 (UTC)
@Cinemantique, Wikitiki89 How is a word like аллилуйя (allilujja) pronounced? The manual transcription was /əlʲːɪˈlujːə/, and the module generates /ɐlʲɪˈlujːə/. Is the double лл pronounced double? Also I assume the manual transcription was incorrect about initial /ə/? Benwing2 (talk) 20:36, 10 October 2015 (UTC)
- @Benwing2 Geminations of consonants are less straightforward. They are normally geminated in positions right after the stressed vowels. In other positions they are normally geminated on the prefix + stem borders. "нн" can normally go both ways in post-tonal positions further away from the stress (ка́менный - /ˈkamʲɪn(ː)ɨj/) (for adjectives and adverbs). For other situations we have gem=y (force geminations) and gem=n (remove geminations). In this case, /ɐlʲɪˈlujːə/ is a more common and natural pronunciation, so automatic pronunciation is correct. --Anatoli T. (обсудить/вклад) 22:55, 10 October 2015 (UTC)
- There are many words that are written with double consonants for etymological or orthographic reasons that are not and never were pronounced with a geminate. These include Росси́я (Rossíja), Белору́ссия (Belorússija), паралле́льный (parallélʹnyj), Ма́йя (Májja). If these are ever pronounced with geminates, it is a rare hypercorrection. As for аллилу́йя (allilújja), the pronunciation at Forvo has no gemination, but I just watched a YouTube video of a Russian priest (is that the right term?) and it seems that there is something extra going on around the /j/, although it seems to me more like an extra syllabic и: /ɐ.lʲɪˈlu.ɪ.jə/, but maybe he was being too emphatic. I guess we could say that the gemination is optional. I can't tell you from personal experience since my family would have pronounced it as /ˌhaləˈluˌjɔ/ and /ˌhaləˈliˌju/, if they ever said it at all. --WikiTiki89 15:13, 11 October 2015 (UTC)
- Benwing, please don't follow Wikitiki's family in pronunciations :). Geminations are only optional in some positions. I forgot to mention that word initial gemination is mostly mandatory. --Anatoli T. (обсудить/вклад) 20:37, 11 October 2015 (UTC)
- I was only joking that they would have said it in Yiddish. Word-initial geminations are also often dropped, such as in ссо́риться (ssóritʹsja). --WikiTiki89 19:44, 12 October 2015 (UTC)
- I think it's more common not to drop the gemination here. Mandatory (in most cases) gemination is - prefix + stem (рассказ is an exception), mostly mandatory - stressed vowel + нн. --Anatoli T. (обсудить/вклад) 20:37, 12 October 2015 (UTC)
A few questions:
- Is ьо always pronounced the same as (possibly unstressed) ьё?
- Should стск be treated like сск or ск?
- Should we ever reduce double vowels to long vowels? Cinemantique recently added a test case that doesn't do this even in the middle of a word.
Thanks. Benwing2 (talk) 05:28, 13 October 2015 (UTC)
- Yes.
- I will get back to you on this. I don't entirely agree with User:Cinemantique on this (сегрегациони́стский). There could be variants but we should come to some agreement. [-sːk-] is not incorrect but [-st͡sk-] is also an acceptable pronunciation (formal and thorough).
- Yes, I agree with this change.
- There are some seemingly straightforward cases, which only fail because of the code complexity - фильм, сельдь, Амударья́ and скамья́ (pre-tonal syllables), та́ять (unstressed я = е, e.g. абиети́н, works for за́яц). --Anatoli T. (обсудить/вклад) 05:39, 13 October 2015 (UTC)
- I agree, сегрегациони́стский should definitely be left as [-st͡sk-]. Even in informal speech, [-sːk-] only occurs when speaking quickly. --WikiTiki89 14:16, 13 October 2015 (UTC)
- @Benwing2 I have changed the module to make "-стск-" as [-st͡sk-], so - one less error in the test cases. I've also made an entry for маркси́стский (marksístskij) where this pronunciation is used. Wanjuscha's previous edits on фашистский and расистский were in line with this, even if he presented [t͡s] as [ts]. I've changed them to use the module and show "t͡s".
- @Cinemantique. What's the deal with Асунсьо́н (Asunsʹón)? I can't visually see the difference between [ɐsʊn⁽ʲ⁾ˈsʲjɵn] and [ɐsʊn⁽ʲ⁾ˈsʲɵn]. --Anatoli T. (обсудить/вклад) 22:34, 13 October 2015 (UTC)
- Yes, you can pronounce [mɐrkˈsʲist͡skʲɪj] or [mɐrkˈsʲisːkʲɪj], both ways are standard (see Avanesov, page 190, and orthoepic dictionaries). How can I add the second way?
- There is a difference between Асунсьон and *Асунсён.--Cinemantique (talk) 04:01, 14 October 2015 (UTC)
- We have to think about it, maybe [-(st͡sk/sːk)-]? IMHO, we could leave it as "st͡sk" as the formal and careful pronunciation for simplicity or use "|phon=...сск..." on a separate line?
- Re: Асунсьон and *Асунсён. Oops, I didn't notice the big "j". --Anatoli T. (обсудить/вклад) 04:07, 14 October 2015 (UTC)
- I fixed the issue with ьо, but there's still a test failure due to different syllabification. I think the test case may be wrong; at least, it's inconsistent with the syllabification of Вьентья́н a little farther down.
- Also, I commented out the code to convert long vowels -> geminate vowels, but this causes increased test case failures due to cases like ааро́новец and авиаотря́д, and it didn't fix нецелесообра́зный, because the actual pronunciation has /əɐ/ while the expected has /ɐɐ/. Can we decide how we actually want to handle these cases? And should we always convert /əɐ/ to /ɐɐ/? Benwing2 (talk) 06:39, 14 October 2015 (UTC)
- @Benwing2 Thanks! The double vowels are now treated correctly! Here's why:
- нецелесообра́зный [nʲɪt͡sɨlʲɪsəɐbˈraznɨj] and "авиаотря́д" [ɐvʲɪəɐtˈrʲæt] - only the second "о" (1st word) and the only "о" (2nd word) is pre-tonal ([ɐ]), the vowel before it, is further away from the word stress ([ə])
- ааро́новец [ɐɐˈronəvʲɪt͡s] and ааро́новщина [ɐɐˈronəfɕːɪnə] - the first "а" is word initial, so the first vowel should be [ɐ]
- Basically, the pronunciation of two non-iotated vowels together should work the same way as if they had a consonant between them. Maybe some sources will claim [ɐɐ] is an alternative pre-tonal pronunciation but [ɐvʲɪəɐtˈrʲæt] is the most natural pronunciation of "авиаотря́д". Cinemantique mentioned once that linguists can't decide if it should be əɐ or ɐɐ. I think it's both depending on the position in relation to the stress and the beginning of the word. --Anatoli T. (обсудить/вклад) 06:55, 14 October 2015 (UTC)
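The rule described above (treat adjacent а/о as if a consonant stood between them) can be sketched in Python. This is only an illustration of Anatoli's description, not the module's code: it looks at unstressed а and о only, expects the stress to be marked with a combining acute (U+0301), and the function name `reduce_a_o` is made up.

```python
ACUTE = "\u0301"          # combining acute accent after the stressed vowel
VOWELS = "аеёиоуыэюя"     # Cyrillic vowel letters
A_O = "\u0430\u043e"      # Cyrillic а and о, the only letters reduced here


def reduce_a_o(word):
    """Return the assumed quality of each unstressed а/о, left to right."""
    chars = word.lower()
    # Collect (letter, string index, is_stressed) for every vowel letter.
    vowels = []
    for i, ch in enumerate(chars):
        if ch in VOWELS:
            stressed = chars[i + 1:i + 2] == ACUTE
            vowels.append((ch, i, stressed))
    # Position of the first stressed vowel among the vowels.
    tonic = next(n for n, v in enumerate(vowels) if v[2])
    result = []
    for n, (ch, i, stressed) in enumerate(vowels):
        if ch not in A_O or stressed:
            continue
        if i == 0 or n == tonic - 1:
            result.append("ɐ")  # word-initial or immediately pre-tonal
        else:
            result.append("ə")  # further away from the stress
    return result


print(reduce_a_o("авиаотря\u0301д"))
print(reduce_a_o("ааро\u0301новец"))
```

Because each vowel is classified by its own position, the sequence in авиаотря́д comes out as [əɐ] and the one in ааро́новец as [ɐɐ], with no special case for double vowels.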
- Re: Вьентья́н and Асунсьо́н. I guess it's too hard to set the exact rules for syllabification. Shall we pass Асунсьо́н as [ɐsʊn⁽ʲ⁾sʲˈjɵn] (just like Вьентья́н [vʲjɪn⁽ʲ⁾tʲˈjæn])? (Note: the current Ukrainian syllabification works much worse). --Anatoli T. (обсудить/вклад) 07:02, 14 October 2015 (UTC)
@Wikitiki89, Cinemantique I fixed the issue with фильм and related words; at least I think the fix is correct. I have a fix for скамья́ -- the module inserts too many syllable breaks (line 260) -- but I'm not sure how it interacts with the code I marked farther down as not understanding well (line 345). What is the purpose of pal=y exactly? Also, Wikitiki, do you understand exactly what the clause in lines 346-357 is doing, and what the reason for line 347 syl = rsub(syl, '^([ʺʹ]?)([äëöü])', '%1j%2') being there is, within the if clause, rather than done unilaterally at an earlier point (like line 260)? Thanks! Benwing2 (talk) 08:05, 14 October 2015 (UTC)
- I have no idea. I could probably figure it out with some time, but not much more easily than you. And I'm pretty busy right now. --WikiTiki89 13:56, 14 October 2015 (UTC)
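For readers not used to Lua patterns, the substitution quoted above from line 347 can be mirrored with Python's `re` module. This is a sketch of that single line only, not of the surrounding `if` clause; the function name `add_initial_j` is made up here, and reading ä, ë, ö, ü as the module's internal symbols for iotated vowels is an assumption.

```python
import re


def add_initial_j(syl):
    """Insert j before a syllable-initial ä/ë/ö/ü, keeping a leading ʺ or ʹ."""
    # Group 1: optional hard/soft sign marker at the start of the syllable.
    # Group 2: the vowel symbol that receives the preceding j.
    return re.sub(r"^([ʺʹ]?)([äëöü])", r"\1j\2", syl)


print(add_initial_j("ä"))   # vowel-initial syllable gets j
print(add_initial_j("ʹä"))  # the sign marker stays in front of the j
print(add_initial_j("mä"))  # consonant-initial syllable is left alone
```

Since the pattern is anchored with `^`, it only fires at the start of a syllable, which may be why its effect depends on where syllable breaks were inserted earlier.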
- About 'н[ндт]ск' -- see Avanesov, page 191. 'ндск' usually loses 'д', but 'нтск' loses 'т' rarely.--Cinemantique (talk) 06:58, 15 October 2015 (UTC)
- @Cinemantique Does the нн always get reduced to н? I can make ндск -> нск and leave нтск unchanged, if you think that's right. Benwing2 (talk) 22:18, 15 October 2015 (UTC)
- Also, what about [сз]ск? The module reduces this to ск, is this correct? Benwing2 (talk) 22:20, 15 October 2015 (UTC)
- No, "нн" gets geminations more frequently than other duplications. Although I agree with Avanesov's frequency distribution re 'н[ндт]ск', it would be strange to treat кока́ндский (kokándskij) (of Kokand) and ташке́нтский (taškéntskij) (of Tashkent) differently. I'd prefer to have optional [t] for both, like this: [kɐˈkan(t)skʲɪj] and [tɐʂˈkʲen(t)skʲɪj], which would work for both very formal and casual pronunciations. Would you agree, Cinemantique?
- Yes, that's correct. [сз][дт]ск is different, as discussed earlier - both pronunciations are acceptable, the latter being informal - [-(st͡sk/sːk)-].
- I like how variant pronunciations are handled in Korean, made by Wyang - see 강의 (gang'ui), which can be pronounced as both "gāngui" and "gāngi" - /ˈka̠ːŋɰi/ ~ /ˈka̠ːŋi/, Phonetic hangeul: 강:의/강:이. --Anatoli T. (обсудить/вклад) 22:40, 15 October 2015 (UTC)
- So ннск gets pronounced with gemination of нн? Benwing2 (talk) 23:20, 15 October 2015 (UTC)
- BTW the first (failed) test case shows optional palatalization before /j/. Does this always happen or only word-initially? Benwing2 (talk) 23:25, 15 October 2015 (UTC)
- No, ннск should actually become нск.
- Hmm, the answer is not so straightforward. I'll give you more details later. From memory, I think it was an attempt to cover various cases. --Anatoli T. (обсудить/вклад) 23:46, 15 October 2015 (UTC)
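The cluster treatments proposed in this thread could be tabulated roughly as follows. This is a Python sketch, not code from Module:ru-pron; the rule table and the name `cluster_variants` are hypothetical, the optional [t] in -ндск-/-нтск- is only Anatoli's suggestion, and within each rule the more careful pronunciation is listed first.

```python
import re

# (pattern over the Cyrillic spelling, proposed IPA variants for the cluster)
CLUSTER_RULES = [
    (re.compile(r"[сз][дт]ск"), ["st͡sk", "sːk"]),  # formal ~ informal
    (re.compile(r"н[дт]ск"), ["ntsk", "nsk"]),      # optional [t]
    (re.compile(r"ннск"), ["nsk"]),                 # нн reduced to н
    (re.compile(r"[сз]ск"), ["sk"]),                # с/з absorbed into ск
]


def cluster_variants(word):
    """Return the proposed renderings of the first matching cluster."""
    plain = word.lower().replace("\u0301", "")  # ignore stress marks
    for pattern, variants in CLUSTER_RULES:
        if pattern.search(plain):
            return variants
    return []


print(cluster_variants("марксистский"))
print(cluster_variants("ташкентский"))
```

Returning a list rather than a single string leaves room for the "variant ~ variant" display used in the Korean entries mentioned above.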
- Anatoli, is this correct transcription? I'm not sure... I still don't like the transcription of -е (in the end of words). Sometimes it sounds like [ə] (солнце, сердце), but more often like [ɪ] (на моро́зе) or [ɨ] (в столи́це). Can you please read Avanesov, page 102, §5.8?--Cinemantique (talk) 03:10, 17 October 2015 (UTC)
- BTW does anyone have a copy of Avanesov? Benwing2 (talk) 05:05, 17 October 2015 (UTC)
- @Benwing2 Repeating the link from above: Avanesov.
- @Cinemantique Fixed макголи by adding geminations, does it look OK to you? The module wasn't designed by me. I only helped Wyang where I could. With final "е" you yourself said that it's not always consistent like with other cases. We could use "и" or "а" in {{ru-IPA}} (e.g. use "гро́мчи" for гро́мче (grómče)) depending on what pronunciation we need to achieve. I suggested creating new rules and new test cases based on Avanesov, do you remember? Hopefully Benwing(2) could implement the rules if they are more or less consistent. If they are not, we'll have to rely on some manual parameters.
- What is your suggestion on -стск- -ндск-/-нтск-? Also, could you help establish better rules for palatalisations in words with ь + iotated vowels? E.g. бьёт, пьянка. Is it possible to define clear rules when palatalisation is optional or when it should and shouldn't happen -[(ʲ)j]? Should we address оо/аа/оа/ао in unstressed positions? I started a discussion topic in Talk:Russian_phonology but not so sure any more. --Anatoli T. (обсудить/вклад) 08:18, 17 October 2015 (UTC)
- макголи looks good. -ндск-/-нтск-: yes, maybe Korean variant will be fine. оо/аа/оа/ао: see Avanesov, §6, pages 106-109 (ʌ = ɐ).
- ь + iotated: I think it's after Б, В, М, П (вьюга, бью, скамья, пью).
- final "е" = [ə] only after Ц and Ь (солнце, здоровье).--Cinemantique (talk) 11:38, 17 October 2015 (UTC)
- I've added one more test, for the secondary-stressed Ё.--Cinemantique (talk) 05:09, 19 October 2015 (UTC)
- Thanks. That's interesting, it didn't occur to me you could have secondary-stressed ё but it shouldn't be hard to fix. Benwing2 (talk) 05:24, 19 October 2015 (UTC)
Hey Anatoli, the POJ in the example sentence here is displaying as Hanyu Pinyin, who should we contact to fix this? ---> Tooironic (talk) 04:28, 8 October 2015 (UTC)
- @Tooironic {{zh-usex}} does not automatically convert to POJ. The POJ was deleted from the entry at some point and I have restored it. —suzukaze (t・c) 04:32, 8 October 2015 (UTC)
- Many thanks! ---> Tooironic (talk) 07:27, 9 October 2015 (UTC)
new words (listed as irregular in Zaliznyak)
@Atitarev, Wikitiki89, Cinemantique, Wanjuscha Going through the words given as irregular ...
- боярышня
- пядень (need some translation help here)
- пария
- кшатрия
- вайшья
- паремья (definitely need some translation help here)
- паремия (here too)
- епитимья
- аллилуйя
- майя
- райя
Thanks! Benwing2 (talk) 20:47, 10 October 2015 (UTC)
Who speaks Cantonese here? We need to check the Cantonese reading here. ---> Tooironic (talk) 10:39, 11 October 2015 (UTC)
- Done. --Anatoli T. (обсудить/вклад) 20:38, 11 October 2015 (UTC)
- Duoxie. ---> Tooironic (talk) 01:50, 13 October 2015 (UTC)
When you have time could you help me fix the formatting of the example sentences here? Specifically, I mean the de-linking of the Latin word, and the capitalisation of the proper names. Thanks. ---> Tooironic (talk) 01:23, 16 October 2015 (UTC)
- Done. --Anatoli T. (обсудить/вклад) 01:46, 16 October 2015 (UTC)
- Xiexie! ---> Tooironic (talk) 02:26, 16 October 2015 (UTC)
Words like космический корабль should have the decl spelled out
I really think multiword expressions like космический корабль should have their decl spelled out rather than using {{ru-decl-noun-see}}. With the work I've done to {{ru-noun-table}}, it's just about as easy to list the full declension as to use {{ru-decl-noun-see}}, and the proper links will be there in the headword. If you think it's a good idea, the title bar of the declension table can also easily be made to have the same links that {{ru-decl-noun-see}} would provide. Benwing2 (talk) 06:15, 16 October 2015 (UTC)
- I agree, now that we have an automatic way of doing this. {{ru-decl-noun-see}} was a better solution when such tables would have had to be written out manually. There are still some downsides related to duplicating information, but the ease of generating the table outweighs them. --WikiTiki89 15:10, 16 October 2015 (UTC)
- I personally don't think we should spell out declensions for compound words, even if we can. The number of inflected forms will grow exponentially with Союза Советских Социалистических Республик, жевательную резинку, Арабских Эмиратов, вышла замуж, etc. No published dictionary would do that. I think we are doing a disservice to learners and it's not just for Russian, consider how many languages have inflections and these inflections are also used in idioms, compound words, set expressions. Shouldn't learners learn to look up individual components and learn the basics of grammar? A dictionary can't cover all possible word combinations and it shouldn't. It's a waste of time and effort, IMO, although I appreciate the efforts. Well, there are people who may like the idea. Maybe it's a topic for Beer parlour? --Anatoli T. (обсудить/вклад) 07:57, 17 October 2015 (UTC)
- We don't have to create articles for all of them. We were only talking about declension tables. Perhaps we could have such tables link to each component form individually. --WikiTiki89 15:20, 19 October 2015 (UTC)
support for manual translit in decls is here
Put the manual translit after a //, e.g. {{ru-noun-table|го́мо са́пиенс//gómo sápiens|a=an}}; same for {{ru-noun+}}. If you think of a better separator, let me know. Benwing2 (talk) 07:33, 16 October 2015 (UTC)
- Great. Thanks a lot! --Anatoli T. (обсудить/вклад) 07:58, 17 October 2015 (UTC)
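One plausible way to read such an argument, sketched in Python rather than the template's actual Lua: split once on the first `//` and treat a missing separator as "no manual translit". The helper name `split_manual_translit` is hypothetical, and how the real module handles edge cases is not known from this thread.

```python
def split_manual_translit(arg):
    """Split 'Cyrillic//translit' into (text, translit or None)."""
    # partition() splits on the first occurrence only, so any later
    # "//" would stay inside the transliteration part.
    text, separator, translit = arg.partition("//")
    return text, (translit if separator else None)


print(split_manual_translit("го\u0301мо са\u0301пиенс//gómo sápiens"))
print(split_manual_translit("кора\u0301бль"))  # no manual translit given
```

A two-character separator like `//` is unlikely to occur in Russian text or in transliteration, which is presumably why it was chosen over a single character.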
FYI, this city's pronounced Jǐníng, as in 濟水. ---> Tooironic (talk) 14:41, 16 October 2015 (UTC)
Question
Hi Anatoli,
could you help me out? I did a presentation in Russian the other day on Apocalypse Now and I wanted to say that Francis F. Coppola thought that the world was doomed, so I said мир обречённый. The teacher corrected me and said обречён was better. Now I don't get this particular instance, because I was taught that the short form indicates some sort of 'temporality'/that the statement is not true always, that it can change according to circumstances. In the case of the world being doomed, I thought it was appropriate to use the long form, because I wanted to emphasize Coppola's dramatic meaning. Could you say something about why I was wrong?
Also, could you indicate when you should use something like спросить у него and when спросить его? Thank you so much, 31.201.151.216 08:44, 17 October 2015 (UTC)
- Hi.
- Your teacher was right and "мир обречён" sounds better than "мир обречённый".
- Permanent/temporary senses only apply to some adjectives - больно́й/бо́лен, здоро́вый/здоро́в. There are a few more examples with subtle differences (some are described here (in Russian)).
- Short adjectives are usually only qualitative adjectives, only used as predicates "the world is doomed", not "the doomed world". The attributive usage is dated, still in use in literature, old expressions - "красна девица".
- The majority of short form adjectives are synonymic with full forms, the short forms being more formal and sound more "educated" (but sometimes too formal), я го́лоден = я голо́дный. With the formal vocabulary, like "обречённый", it's better to use the short forms in predicates.
- The phrase "спросить у него" is almost identical with "спросить его". It's up to you. Use what you're more comfortable with. :) --Anatoli T. (обсудить/вклад) 09:00, 17 October 2015 (UTC)
new words (and do any of them need manual translit)
@Wanjuscha, Wikitiki89, Cinemantique Wanjuscha has already taken care of some of them (thank you!!).
- провод
- доведь
- груздь (Wanjuscha handled)
- перкаль (Wanjuscha handled)
- табель (Wanjuscha handled)
- скобель (Wanjuscha handled)
- дюбель (Wanjuscha handled)
- нагель
- ригель
- мергель (Wanjuscha handled)
- бугель
- педель (Wanjuscha handled)
- вензель (Wanjuscha handled)
- трензель
- шенкель
- стапель
- штемпель
- дупель
- штепсель
- фухтель
- грифель
- шеншель
- кокиль
- дизель
- чизель
- бензель
Benwing2 (talk) 22:47, 17 October 2015 (UTC)
- Words with phonetic respellings, like "ште́мпель", would require manual translit. The rest of the words are OK, though with some of them I'm not 100% sure how to pronounce them. I think they can be automatically transliterated.--Anatoli T. (обсудить/вклад) 23:21, 17 October 2015 (UTC)
- Thanks. BTW I added the definition "cupped hands" to пригоршня based on ruwiki. Is this correct? Or is "hollow of the hand" (from горсть) better? Benwing2 (talk) 13:48, 18 October 2015 (UTC)
- Thanks, I've made some minor changes. --Anatoli T. (обсудить/вклад) 21:36, 18 October 2015 (UTC)
Who speaks Hakka and/or Min Nan here? These translations look... interesting. ---> Tooironic (talk) 06:40, 19 October 2015 (UTC)
- "chit" is a Min Nan cognate without a hanzi equivalent, "tām" is the right reading, "hāi" is wrong. Can't verify Hakka. Let me think about the fix. --Anatoli T. (обсудить/вклад) 07:04, 19 October 2015 (UTC)
more new words (and do any of them need manual translit)
@Cinemantique, Wikitiki89, Wanjuscha More words ending in -ель:
- ракель
- декель
- винкель
- мушкель
- румпель
- ниппель
- лисель
- трисель
- стаксель
- брамсель
- трюмсель
- апсель
- капсель (needs a definition)
- топсель (needs a better definition)
- марсель
- дроссель
- шпатель (partly already defined)
- шпахтель (needs a definition)
- гафель
- муфель
- штихель
Benwing2 (talk) 09:16, 19 October 2015 (UTC)
As you can see the pronunciation of this word is a bit complicated. Is there a way to indicate that the first and second pronunciations are also commonly 輕聲? I ran into trouble before when I tried to add "tl=y". ---> Tooironic (talk) 06:05, 22 October 2015 (UTC)
- @Tooironic done with "tl=y,2tl=y". --Anatoli T. (обсудить/вклад) 06:15, 22 October 2015 (UTC)
- Brilliant! Thanks. ---> Tooironic (talk) 06:21, 22 October 2015 (UTC)
Yes, I do indeed think that your rollback is in error.
I see that you undid my revision on the page 한글. More specifically, you removed the Hanja form of 한글 (韓㐎) that I had added to this entry. May I ask why? I do have my sources, and by sources I mean Wikipedia. See the little box entitled "Hangul", subtitled "South Korean name", on this page VulpesVulpes42 (talk) 15:41, 22 October 2015 (UTC)
- Wikipedia can't be used as a reliable source! I can go now and remove 韓㐎 from that article. 한 is indeed 韓 but the 글 part is a native Korean word and native words are not written in hanja. You can open a WT:Tea room discussion but don't change the entry. --Anatoli T. (обсудить/вклад) 19:17, 22 October 2015 (UTC)
- Wikipedia ranks slightly higher in reliability than Google Translate as a source of dictionary information, but that's not saying much... Chuck Entz (talk) 23:26, 23 October 2015 (UTC)
- FWIW, it looks like that odd hanja only appeared quite recently, on 2015-03-12, when an anon added it in this edit. That IP only has five edits on EN WP, all in March. ‑‑ Eiríkr Útlendi │Tala við mig 23:37, 23 October 2015 (UTC)
- I've opened a topic here Wiktionary:Etymology_scriptorium/2015/October#.ED.95.9C.EA.B8.80.2C_.EC.A1.B0.EC.84.A0.EA.B8.80_-_hangeul. I care little about the Wikipedia but we shouldn't add unsourced info here, at Wiktionary. --Anatoli T. (обсудить/вклад) 01:08, 24 October 2015 (UTC)
some new words ending in -я
@Cinemantique, Wikitiki89, Wanjuscha
Thanks! Benwing2 (talk) 23:02, 22 October 2015 (UTC)
Is there a more proper way to mark the erhua for this entry? It's a bit tricky cuz the 'r' appears after the first character, not the second as is usually the case. ---> Tooironic (talk) 09:26, 23 October 2015 (UTC)
- Not really, that would require some work in the modules. Well, you have already added "xiànrbǐng". You can also add 餡兒餅/馅儿饼 (xiànrbǐng) as an alternative form. --Anatoli T. (обсудить/вклад) 09:34, 23 October 2015 (UTC)
Hello again
Just wanted to show you this:
https://en.wikipedia.org/w/index.php?title=Hangul&action=history
and this:
https://en.wiktionary.org/w/index.php?title=%ED%95%9C%EA%B8%80&action=history
VulpesVulpes42 (talk) 11:46, 23 October 2015 (UTC)
some more words
@Atitarev, Cinemantique, Wikitiki89
- вентерь
- росстань
- полутень
- грабарь
- рыбарь
- сазандарь (don't know what this means)
- бондарь
- лопарь
- слесарь
- лагерь
- пехтерь
- псалтырь
Thanks! Benwing2 (talk) 20:13, 23 October 2015 (UTC)
Adjectives with strange stress patterns
@Cinemantique, Wikitiki89 The following is the full list:
- Page 509 искренний: WARNING: Unrecognized stress: m=и́скренен f=и́скренна n=и́скренне,и́скренно p=и́скренни,и́скренны
- Page 1119 стрёмный: WARNING: Unrecognized stress: m=стрёмен f=стрёмна,стремна́ n=стрёмно p=стрёмны,стремны́
- Page 1483 данный: WARNING: Unrecognized stress: m=да́н f=да́на,дана́ n=да́но,дано́ p=да́ны,даны́
- Page 2205 мутный: WARNING: Unrecognized stress: m=му́тен f=му́тна,мутна́ n=му́тно,мутно́ p=му́тны,мутны́
- Page 2446 видный: WARNING: Unrecognized stress: m=ви́ден f=ви́дна,видна́ n=ви́дно p=ви́дны,видны́
- Page 2526 зрелый: WARNING: Unrecognized stress: m=зре́л f=зре́ла, зрела́ n=зре́ло p=зре́лы, зрелы́
The first two have already been commented on. Could one of you comment on the others -- are these forms real? Are they nonstandard? Thanks! Benwing2 (talk) 00:45, 26 October 2015 (UTC)
- искренний - as Cinemantique suggested, remove и́скренно and и́скренны
- стрёмный - OK as is, IMO. Update: This is slang, so it won't appear in dictionaries, but none of the forms strike me as incorrect.
- данный - ending-stressed only.
- мутный - OK as is, IMO. Update: confirmed, мутна́ - standard, му́тна - colloquial, etc.
- видный - ending-stressed only, except for the neuter
- зрелый - stem-stressed only
--Anatoli T. (обсудить/вклад) 01:44, 26 October 2015 (UTC)
- Thanks, will fix. Benwing2 (talk) 08:17, 26 October 2015 (UTC)
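The warning lines above follow a simple `slot=form,form` layout, so the slots with competing stress variants can be picked out mechanically. A small Python sketch, assuming that layout holds for every line; `parse_short_forms` and `slots_with_variants` are illustrative names, not part of any existing script.

```python
import re

# A slot letter (m, f, n, p), "=", then everything up to the next slot or the end.
SLOT = re.compile(r"([mfnp])=(.*?)(?=\s+[mfnp]=|$)")


def parse_short_forms(spec):
    """Parse 'm=... f=... n=... p=...' into {slot: [forms]}."""
    return {
        slot: [form.strip() for form in forms.split(",")]
        for slot, forms in SLOT.findall(spec)
    }


def slots_with_variants(spec):
    """List the slots that carry more than one form."""
    return [slot for slot, forms in parse_short_forms(spec).items() if len(forms) > 1]


line = ("m=зре\u0301л f=зре\u0301ла, зрела\u0301 "
        "n=зре\u0301ло p=зре\u0301лы, зрелы\u0301")
print(slots_with_variants(line))
```

Stripping each form after the split copes with the optional space after the comma that appears in the зрелый line but not in the others.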
a few new words
@Atitarev, Cinemantique, Wikitiki89, Wanjuscha
Benwing2 (talk) 02:25, 26 October 2015 (UTC)
BTW I created a few new words полвторого and полседьмого and полтретьего. Can you look over them? If they look OK I'll create the remainder. Benwing2 (talk) 10:08, 26 October 2015 (UTC)
- They are good, thnx.--Anatoli T. (обсудить/вклад) 10:14, 26 October 2015 (UTC)
many more new words, many will need pronunciation fixed
@Cinemantique, Wikitiki89, Wanjuscha
- шабер
- шабёр
- груббер
- скруббер
- шибер
- кивер
- кливер
- спрингер
- стрингер
- флюгер
- гренадер
- грейдер
- леер
- фрезер
- триер
- стакер
- буккер
- анкер
- зенкер
- клинкер
- юнкер
- маркёр
- маркер
- швеллер
- колер
- сейнер
- кельнер
- скрепер
- шкипер
- клипер
- миллиампер
- микроампер
- вольт-ампер
- киловольт-ампер
- каупер
Thanks! Benwing2 (talk) 21:43, 26 October 2015 (UTC)
- Checked pronunciation.--Cinemantique (talk) 08:40, 27 October 2015 (UTC)
- Thank you! Benwing2 (talk) 08:48, 27 October 2015 (UTC)
Second round, remaining words in -ер with unusual declensions:
- глиссер
- ватер
- бронекатер
- унтер
- сеттер
- вахтер (check this carefully, appears to have different defn from вахтёр but I'm not totally sure)
- лихтер
- буер
- туер
- шафер
- шофер
- лейб-кучер
- ветфельдшер
Thanks! Benwing2 (talk) 02:30, 27 October 2015 (UTC)
When I said I was matching 卡拉OK, I meant I was changing the pinyin to get rid of individual letters and convert them to standard pinyin. "X" in standard pinyin is áikesi (correct me if I'm wrong), just like how "OK" is "ōukèi." WikiWinters ☯ 韦安智 14:10, 27 October 2015 (UTC)
Hindi IPA
I made a new template (Template:hi-IPA) that automatically generates IPA for Hindi. It's somewhat rudimentary and has a few shortcomings (it automatically adds ə after consonants), but it is still fairly decent. Do you think this is a good idea? There's already a transclusion at आर्यमन. Aryamanarora (talk) 01:47, 28 October 2015 (UTC)
- Thanks, it looks interesting, but I know little about Hindi phonology, and I'm not sure you'll get help elsewhere in the community either, judging by how the transliteration module is no longer supported. I'd prioritise fixing the transliteration module first, though. Good luck, in any case. If schwa-dropping is fixed in the translit module, then it could work for the IPA as well, hopefully. --Anatoli T. (обсудить/вклад) 01:53, 28 October 2015 (UTC)
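For illustration only, the shortcoming mentioned in this thread (ə inserted after every bare consonant, with no schwa-dropping) might look roughly like the Python sketch below. This is not the code behind Template:hi-IPA or the transliteration module; the function name `naive_ipa` and the tiny letter tables are hypothetical and cover only a handful of characters, with simplified IPA values.

```python
# Hypothetical, heavily simplified tables (assumptions for illustration only).
CONSONANTS = {
    "\u0930": "r",  # DEVANAGARI LETTER RA
    "\u092f": "j",  # DEVANAGARI LETTER YA
    "\u092e": "m",  # DEVANAGARI LETTER MA
    "\u0928": "n",  # DEVANAGARI LETTER NA
    "\u0915": "k",  # DEVANAGARI LETTER KA
}
INDEPENDENT_VOWELS = {
    "\u0905": "ə",   # DEVANAGARI LETTER A
    "\u0906": "aː",  # DEVANAGARI LETTER AA
}
MATRAS = {
    "\u093e": "aː",  # DEVANAGARI VOWEL SIGN AA
}
VIRAMA = "\u094d"    # suppresses the inherent vowel


def naive_ipa(word):
    """Convert a Devanagari word to rough IPA, always keeping the inherent schwa."""
    out = []
    i = 0
    while i < len(word):
        ch = word[i]
        nxt = word[i + 1] if i + 1 < len(word) else ""
        if ch in CONSONANTS:
            out.append(CONSONANTS[ch])
            if nxt == VIRAMA:
                i += 1              # consonant cluster: no vowel follows
            elif nxt in MATRAS:
                out.append(MATRAS[nxt])
                i += 1              # explicit vowel sign replaces the schwa
            else:
                out.append("ə")     # inherent schwa, added unconditionally
        elif ch in INDEPENDENT_VOWELS:
            out.append(INDEPENDENT_VOWELS[ch])
        else:
            raise ValueError(f"character not covered by this sketch: {ch!r}")
        i += 1
    return "".join(out)


# The entry mentioned above, spelled out by code point: आ र ् य म न
print(naive_ipa("\u0906\u0930\u094d\u092f\u092e\u0928"))
```

For the entry mentioned above the sketch keeps a word-final ə, which spoken Hindi normally drops. A schwa-deletion step, whether in the transliteration module or in the template, would have to remove it.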
I based the etymology on the one we have at rupee. Could you check it for me? ---> Tooironic (talk) 12:04, 28 October 2015 (UTC)
- Fine by me but I think the term was borrowed from English. --Anatoli T. (обсудить/вклад) 05:13, 29 October 2015 (UTC)
According to the zhuyin, the pinyin for "SIM" is "xin." Is this correct? Is it not pronounced like "SIM" in English? I think we need to clean up the entries in the Chinese mixed script category. WikiWinters ☯ 韦安智 03:12, 29 October 2015 (UTC)
- I am less concerned about this type of entry; they might as well be deleted. It may be "xīn" but I don't know for sure. The pronunciation of foreign abbreviations is unregulated and native speakers might know better. --Anatoli T. (обсудить/вклад) 05:12, 29 October 2015 (UTC)
Hi. This word means "craft", and also "occupation" in the sense of "vocation, work", but not in the sense of "control of another country". This is احتلال. This was probably just an occasional mistake, but try to check twice before you add :-) No offence intended. Kolmiel (talk) 01:11, 31 October 2015 (UTC)
- Thanks. --Anatoli T. (обсудить/вклад) 01:26, 31 October 2015 (UTC)
Russian "brocade"
I added the term брокат (brokat), so feel free to expand it - even if you feel like you forgot to add it in the first place. --Lo Ximiendo (talk) 09:45, 31 October 2015 (UTC)
- P.S. Don't forget василёк (vasiljók) as well. --Lo Ximiendo (talk) 09:48, 31 October 2015 (UTC)
- Done. Why do you think I forgot them? Please add {{rfinfl|lang=ru}} if you make Russian entries without inflections. --Anatoli T. (обсудить/вклад) 09:58, 31 October 2015 (UTC)
Do you have a source for the 'dialectal' tag? AFAIK it's standard Chinese. ---> Tooironic (talk) 05:15, 2 November 2015 (UTC)
- @Tooironic Pleco's main dictionary says it's dialectal (Pleco also includes third-party dictionaries, such as CEDICT). It's usually quite reliable and very good. I am OK with you removing the tag if the term is confirmed as standard Chinese. BTW, I recommend Pleco. It also contains many Cantonese readings, if you have any doubts about Cantonese readings. It's available for iPhones and Androids. The paid version has additional features but the free version is great too. --Anatoli T. (обсудить/вклад) 05:26, 2 November 2015 (UTC)
- Yeah, I know Pleco, of course. But usually I consult C-C dictionaries first. I'm not a fan of the "dialectal" label sense since it is quite ambiguous and C-C dictionaries don't use it, only 口語 and 書面語. If you don't mind I'll remove it for now until we have any better evidence. ---> Tooironic (talk) 05:33, 2 November 2015 (UTC)
- No worries. I use Pleco when the Cantonese is not automatically added and I hate to see "Gwoyeu Romatzyh" in the expanded view. It doesn't have yue for each word but for many, including many in C-C dictionaries. --Anatoli T. (обсудить/вклад) 05:38, 2 November 2015 (UTC)