User talk:Wyang


Super-busy in June-July 2014.



Archive 1 — 2013/01/18 21:12 (UTC) to 2014/05/24 00:43 (UTC)

User talk

Pinyin containing comma being treated as two readings

See 一枝草,一点露, 一枝草,一點露, 既生瑜,何生亮 and 说时迟,那时快. How should we deal with the problem that the comma splits the pinyin into two readings? --kc_kennylau (talk) 15:03, 25 May 2014 (UTC)

Probably should not have used a comma in the first place, but multiple template parameters. By the way, I would suggest renaming the parameters to use language codes of the topolects (i.e. {{zh-pron|cmn=...|cmn2=...|wuu=...|yue=...|yue2=...|...}}). One-letter parameters may be handy to type, but they present some cognitive burden as yet another thing to look up in the documentation and memorise. Also, changing these two things at once would solve the problem of distinguishing parameters in new format from the old. Keφr 16:32, 25 May 2014 (UTC)
Fixed - use ', ' as dictated by Pinyin orthography. I don't think I agree that using language codes is the better option. A user may have to look up the documentation to know that 'w' is the code for 'Wu' (chances are not, when the person sees 'm' is used for 'Mandarin'), but a user definitely has to look up the documentation if 'wuu' is the code for 'Wu'. Wyang (talk) 00:03, 26 May 2014 (UTC)
There's also the matter of lects that don't have language codes. Categorization would probably need to be handled differently, but I'm sure there are some that would be worth incorporating into this framework. Knowing how dialectology and the ISO work, I would be quite surprised if there were no dialects with lexically-significant differences lacking ISO codes. I think the main problem would be deciding which ones not to cover- though lack of sources might solve that problem for us. Chuck Entz (talk) 00:55, 26 May 2014 (UTC)
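
For illustration only, the fix described above could be sketched roughly as follows. This is not the actual module code; it assumes that a bare "," separates readings while ", " (comma plus space) is punctuation inside a single reading, and the function name split_readings is hypothetical.

```python
import re


def split_readings(value: str) -> list[str]:
    """Split a pronunciation parameter into separate readings.

    Assumption: a comma NOT followed by a space separates readings,
    while ", " is ordinary Pinyin punctuation and stays inside one reading.
    """
    # Negative lookahead: match a comma only when the next character is not a space.
    parts = re.split(r",(?! )", value)
    return [part.strip() for part in parts if part.strip()]


if __name__ == "__main__":
    print(split_readings("shuō shí chí, nà shí kuài"))  # stays one reading
    print(split_readings("hǎo,hào"))                    # two readings
```

With this rule an entry title containing a comma keeps a single reading as long as the pinyin is written with ", ".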

links to erhua-ed pinyin

I am not sure whether this has already been discussed before, but should we link to erhua-ed pinyin in entries? --kc_kennylau (talk) 11:10, 27 May 2014 (UTC)

Don't think it has been discussed before. Feel free to do it if you feel so inclined. Wyang (talk) 11:31, 27 May 2014 (UTC)
Because it's currently linked and I want to disable the link. --kc_kennylau (talk) 12:51, 27 May 2014 (UTC)
My preference is to link it, if generated pinyin is a valid pinyin. So, is "wánr" a valid pinyin for 玩, even if it's not written 玩儿? --Anatoli (обсудить/вклад) 12:59, 27 May 2014 (UTC)
The reason I showed the erhua-ed pinyin on the un-erhua-ed page is just to show that writing "兒/儿" is optional. A Beijinger may write "我去玩了", and pronounce it as "我去玩儿了" instead. Wyang (talk) 00:30, 28 May 2014 (UTC)

Wuu tones 2 and 4

How did you know that 小三 is the second tone instead of the fourth? --kc_kennylau (talk) 12:51, 27 May 2014 (UTC)

Let me, a student of Wu, try this one. According to wu-minidict, 小 is 上 and starts with a voiceless initial /ɕ/ and according to Wyang's table, it should be 2. --Anatoli (обсудить/вклад) 12:59, 27 May 2014 (UTC)
Thanks! :) --kc_kennylau (talk) 13:05, 27 May 2014 (UTC)
Yes, Anatoli's right. Also, since you speak Cantonese, this is probably easier:
In general
Cantonese tone         Shanghainese tone   Mandarin tone
1 (not checked - 7)    1                   1
2                      2                   3
3 (not checked - 8)    2                   4
4                      3                   2
5                      3                   3 or 4
6 (not checked - 9)    3                   4
7                      4                   could be anything
8                      4                   could be anything
9                      5                   2 or 4

Wyang (talk) 00:31, 28 May 2014 (UTC)

Thank you. From the above, is Cantonese the driver? Is there a correspondence for Mandarin -> Shanghainese -> Cantonese? Such as Mandarin tone 3 can be 2 or 5 in Cantonese and 2 or 3 in Shanghainese?
How many Cantonese tones are currently supported by Module:yue-pron? I have an impression - only six, as in Hong Kong Cantonese. Number 7 and above don't generate IPA. --Anatoli (обсудить/вклад) 00:45, 28 May 2014 (UTC)
1 and 7, 3 and 8, 6 and 9 occur in complementary distribution. Jyutping merges the two into the former. The former is for non-checked syllables, the latter for checked syllables. Wyang (talk) 00:47, 28 May 2014 (UTC)
I see, I thought so. Is it true that Guangzhou and Hong Kong Cantonese differ in the number of tones - 7 and 6 respectively? Not sure where I read this now. Strangely, it's hard to find numeric values in Wikipedia for the six Cantonese tones: 55 35 33 21 13 22; not sure about the checked tones. This "complementary distribution", does it actually mean different, lower tones for 7, 8 and 9? --Anatoli (обсудить/вклад) 01:04, 28 May 2014 (UTC)
The checked tones are: 5, 3, 2, for 7/8/9 (they are just shorter versions of 1/3/6). Tone 1 is typically 55 in Hong Kong, but can be 55 or 53 in Guangzhou. Some view it as two tones instead, citing characters which are basically only pronounced as 55 or 53, and never the other. At present the two values are largely interchangeable, although reading some 55 characters as 53 might sound weird. Compare : Hong Kong (JustinLam) and Guangzhou (greatharry). Wyang (talk) 01:22, 28 May 2014 (UTC)
Middle Chinese is the driver, Cantonese is the vice-driver, Shanghainese and Mandarin sit in the back seats. Wyang (talk) 00:59, 28 May 2014 (UTC)
Thanks. Very educational. Are there numeric values for 7, 8 and 9? Can Shanghainese and Mandarin tones be mapped to each other similarly or they are completely unpredictable? --Anatoli (обсудить/вклад) 01:04, 28 May 2014 (UTC)
The table above should do - ignore the Cantonese column. It's a lot less regular, but there are some correspondences. Wyang (talk) 01:22, 28 May 2014 (UTC)
Thank you very much! --kc_kennylau (talk) 10:53, 28 May 2014 (UTC)
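
As a rough aid, the correspondence table above can be written down as a lookup. This is only a sketch of the general tendencies given in this thread, not a rule that holds for every character; the names TONE_TABLE, CHECKED_FROM_JYUTPING and shanghainese_tone are made up for the example.

```python
# Cantonese tone -> (expected Shanghainese tone, usual Mandarin tones),
# copied from the table in this thread. An empty tuple stands for
# "could be anything".
TONE_TABLE = {
    1: (1, (1,)),
    2: (2, (3,)),
    3: (2, (4,)),
    4: (3, (2,)),
    5: (3, (3, 4)),
    6: (3, (4,)),
    7: (4, ()),
    8: (4, ()),
    9: (5, (2, 4)),
}

# Jyutping merges the checked tones 7/8/9 into 1/3/6, so a checked
# syllable written with tone 1, 3 or 6 is really tone 7, 8 or 9.
CHECKED_FROM_JYUTPING = {1: 7, 3: 8, 6: 9}


def shanghainese_tone(jyutping_tone: int, checked: bool = False) -> int:
    """Guess the Shanghainese tone from a Jyutping tone number."""
    if checked:
        jyutping_tone = CHECKED_FROM_JYUTPING.get(jyutping_tone, jyutping_tone)
    return TONE_TABLE[jyutping_tone][0]


if __name__ == "__main__":
    # 小 has Cantonese tone 2, which points to Shanghainese tone 2.
    print(shanghainese_tone(2))
    # A checked syllable written with Jyutping tone 6 is tone 9.
    print(shanghainese_tone(6, checked=True))
```

Whether a syllable is checked has to be supplied by the caller, since the Jyutping tone number alone does not show it.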

Some questions about Shanghainese

Hi Frank,

  1. Is this particle 额 standard in Shanghainese - 侬好流利刚英文𠲎?
  2. What is 梢许 - "a little bit"? 还可以,就是梢许忙着点。
  3. Is 阿 a question word? 侬是陈先生? --Anatoli (обсудить/вклад) 23:49, 28 May 2014 (UTC)
1. There is no standard written form for Shanghainese. 额 is the same as 個, equivalent to Mandarin 地. Also, 刚英文 should be 讲英文.
2. 稍许 - a little bit.
3. Yes, it is used in old-style Shanghainese. Wyang (talk) 23:56, 28 May 2014 (UTC)
Thank you. I've added a Shanghainese usage example in /. I noticed there's no standard form for Shanghainese. I have to make adjustments when reading 上海话方言词典 but their audio is very good. --Anatoli (обсудить/вклад) 00:12, 29 May 2014 (UTC)

A Question for One Word

I have a question about a certain word. Is this Chinese word an adjective? If so, what does it mean? I could suggest improving your list of yellow linkable Chinese words by adding parts of speech next to them. --Lo Ximiendo (talk) 09:34, 29 May 2014 (UTC)

Done. Please fix the definition due to improper English by me. --kc_kennylau (talk) 10:57, 29 May 2014 (UTC)
Done. Wyang (talk) 12:03, 29 May 2014 (UTC)

I'm also wondering about the Pinyin reading of this word. --Lo Ximiendo (talk) 07:31, 30 May 2014 (UTC)

{{zh-new}} should handle this correctly. Wyang (talk) 07:32, 30 May 2014 (UTC)

Russian words

Making entries for inflected forms is a bit of a waste of time, IMHO. It's for bots, not humans :) I hope someone may create accelerated methods for making them and/or make a bot for quick creation. You can try something more challenging by making an entry from translations. Adverbs are not inflected, so they are easier to do.

Russian translations from English starting with letter "a" are here: User:Matthias_Buchmeier/en-ru-a (or any letter - change "-a" to other letters, or change ru to cmn to see Chinese translations - User:Matthias_Buchmeier/en-cmn-a). Note: some translations are wrong or SoP. --Anatoli (обсудить/вклад) 00:05, 30 May 2014 (UTC)

Aha, thanks. I will read more on Russian phonology first. I have one question: What does сдалан here mean? Wyang (talk) 00:09, 30 May 2014 (UTC)
That was a typo :) -> сде́лан (made - participle).
Thanks, that explains it. :) Wyang (talk) 00:22, 30 May 2014 (UTC)

The format of {{zh-pron}} usage

@Atitarev, Kephir, Lo Ximiendo: Apparently Lo Ximiendo and I are at war over whether the closing double curly brackets should go on the same line as the category or on a new line. Please express your view here. --kc_kennylau (talk) 07:23, 30 May 2014 (UTC)

It doesn't seem to be a war at all... yet. I prefer new line, because every other parameter gets a new line granted to it. Wyang (talk) 07:31, 30 May 2014 (UTC)
For me, I could go for fewer lines. --Lo Ximiendo (talk) 07:32, 30 May 2014 (UTC)
This is a silly non-issue. As long as it renders the same, I would leave it alone. Personally, though, I prefer them at the beginning of the line. Makes line-based diffs less messy when changing the last item. Keφr 07:37, 30 May 2014 (UTC)

Result: 3 people (Kenny, Wyang, Kephir) favour a new line; 1 person (Lo Ximiendo) favours appending it to the last item.

Classical tag

Please add the "classical" tag into Module:labels/data and link it to w:Classical Chinese. --kc_kennylau (talk) 14:54, 30 May 2014 (UTC)

Can't you have other uses of "Classical"? For example, "Classical Latin"? Wyang (talk) 10:36, 31 May 2014 (UTC)
Then link it to "Classical + lang:getCanonicalName()". --kc_kennylau (talk) 10:48, 31 May 2014 (UTC)
Tag added but unlinked. There is no lang parameter there [1]. Wyang (talk) 11:01, 31 May 2014 (UTC)
Thanks. --kc_kennylau (talk) 11:35, 31 May 2014 (UTC)



Could you make it work, please? It would be easier for me to get expected IPA, using what is currently produced. I've added some descriptions and more test cases. --Anatoli (обсудить/вклад) 12:08, 1 June 2014 (UTC)

It's working now. --kc_kennylau (talk) 15:03, 1 June 2014 (UTC)

学校, 学习, 学堂 in Shanghainese

Hi Frank,

Do these words have different readings? You have corrected my edit in 学校 where I used "5hhoq" from 学习 (your edit). I want to make a Wu reading for 学堂, which seems more common than 学校 - it is used in one of my textbooks. What is the transliteration? "5hhoq daan" or "5hhiaq daan"? --Anatoli (обсудить/вклад) 23:13, 1 June 2014 (UTC)

It's 5hhoq daan. 5hhoq: colloquial; 5hhiaq: literary. Wyang (talk) 23:41, 1 June 2014 (UTC)
Thank you. --Anatoli (обсудить/вклад) 23:44, 1 June 2014 (UTC)

五 in Wu Chinese

Hi Frank, could you fix (Wu translit.), please? Nothing worked for me. --Anatoli (обсудить/вклад) 00:13, 3 June 2014 (UTC)

Added. Wyang (talk) 00:19, 3 June 2014 (UTC)

Templates in the Belarusian adjectives category

Hi, if you're thinking about completing a side-quest, could you consider fixing the Belarusian adjective templates so that they don't show up in Category:Belarusian adjectives? Maybe even other categories, too? --Lo Ximiendo (talk) 06:03, 3 June 2014 (UTC)

@Atitarev: How about you, Anatoli? --Lo Ximiendo (talk) 07:19, 3 June 2014 (UTC)
I'll take a look later tonight and reply re: Russian IPA, will add some stuff. Gotta go. --Anatoli (обсудить/вклад) 07:23, 3 June 2014 (UTC)
Fixed (except the talk page, which can't be removed from the category without removing the test transclusion). Putting a category wrapped in noincludes in a template that's transcluded by other templates only stops the transcluded template from going into the category- not the ones transcluding it. Fortunately, all the transcluding templates had the same code, so it could be deleted from the transcluded template without any effect on the entries. Chuck Entz (talk) 08:19, 3 June 2014 (UTC)
So is it okay or not to delete the talk page? Next stop, Category:Belarusian nouns and Category:Belarusian verbs? --Lo Ximiendo (talk) 08:34, 3 June 2014 (UTC)
Go for the Latin word in the Belarusian nouns category. --Lo Ximiendo (talk) 08:37, 3 June 2014 (UTC)
I didn't delete it because I don't know if it's still needed. The Latin-script noun has been tagged for attention since January. Someone who knows Belarusian needs to create an entry under the correct Cyrillic-script spelling so the Latin-script one can be removed without losing information. Chuck Entz (talk) 08:56, 3 June 2014 (UTC)


Frank, could you check this entry, when you have a moment, please? @Kc kennylau:, Kenny, Cantonese, please as well. I did my best not to make mistakes but it's a big entry. --Anatoli (обсудить/вклад) 05:27, 4 June 2014 (UTC)

It probably needs some formatting checks as well - L3 and L4 for pronunciations and PoS? --Anatoli (обсудить/вклад) 05:28, 4 June 2014 (UTC)

Remaining Mandarin entries

Hi Frank, there are still heaps (multi-character terms) in Special:WhatLinksHere/Template:cmn-noun&limit=500 (and next) with the old-style templates and headers, some lack pronunciation headers. Can they be done programmatically still? --Anatoli (обсудить/вклад) 04:26, 5 June 2014 (UTC)

There are 332 monosyllabic entries there and 908 multisyllabic ones, both intentionally omitted (the bot omits the entry if the title is monosyllabic or content lacks {{Pinyin-IPA}}). There is a complete list of the multisyllabic entries omitted here. I think the multisyllabic ones are automatable, at least semi-automatable. Not sure about the monosyllabic ones. Wyang (talk) 04:33, 5 June 2014 (UTC)
Wow, there's still a lot. --Anatoli (обсудить/вклад) 04:37, 5 June 2014 (UTC)
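
The omission rule described above (skip if the title is monosyllabic or the content lacks {{Pinyin-IPA}}) can be pictured with a short sketch. This is not the bot's real code; should_skip is a hypothetical name, and it assumes that a one-character title means a monosyllabic entry.

```python
def should_skip(title: str, wikitext: str) -> bool:
    """Return True if the conversion would leave this entry untouched.

    Assumptions for this sketch: a title of exactly one character is
    monosyllabic, and the presence of the template is detected by a
    plain substring check on its opening "{{Pinyin-IPA".
    """
    is_monosyllabic = len(title) == 1
    lacks_pinyin_ipa = "{{Pinyin-IPA" not in wikitext
    return is_monosyllabic or lacks_pinyin_ipa


if __name__ == "__main__":
    print(should_skip("学", "{{Pinyin-IPA|...}}"))     # skipped: one character
    print(should_skip("学校", "{{Pinyin-IPA|...}}"))   # processed
    print(should_skip("学校", "no pronunciation"))     # skipped: no template
```

Counting the skipped titles by which of the two conditions fired would give the monosyllabic and multisyllabic totals separately.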

A sample entry with unknown PoS, only reading is available -

In this revision I have added Mandarin and Cantonese readings, removed translingual definition requests and left one for Chinese, the cat= parameter is empty. Can all single character entries be categorised in by default and PINT, Jyutping, etc. if a reading is added? (Definitions and PoS can be added but I'd like to establish a format for definitionless characters, as a sample.) What do you think? --Anatoli (обсудить/вклад) 23:38, 5 June 2014 (UTC)

CC @Bumm13:, since you edited that entry as well. --Anatoli (обсудить/вклад) 23:39, 5 June 2014 (UTC)
Oops, wrong tone number for Wu but I see why it's 3. Thanks for fixing.
Why have all language categories gone from ? It is Chinese and if there is a reading, then shouldn't it also be Mandarin, etc.? Or should all single-character entries just sit in Category:Han characters (that's added by translingual)? --Anatoli (обсудить/вклад) 00:01, 6 June 2014 (UTC)
Because I think Category:Mandarin language should be free from mainspace entries, like Category:English language. In the absence of a suitable category, I changed the code to put it in Category:Chinese hanzi. "Category:Mandarin hanzi" sounds strange to me. Wyang (talk) 00:19, 6 June 2014 (UTC)
Thanks. Agreed. Any categorisation is better than none and "Chinese hanzi" is a good name. --Anatoli (обсудить/вклад) 00:23, 6 June 2014 (UTC)
Could you add single-character entries like WITH definitions to Category:Chinese hanzi as well, pls? --Anatoli (обсудить/вклад) 04:38, 6 June 2014 (UTC)
Added. Wyang (talk) 04:41, 6 June 2014 (UTC)

Template:zh-pron and Wade-Giles romanization

I'm adjusting okay so far to the new unified Chinese formatting for the most part. One thing I'm not seeing in Template:zh-pron is the option for adding the older Wade-Giles romanization for Mandarin. Wade-Giles isn't used as much anymore but is still found frequently in older texts and is still preferred by many Chinese linguistics experts in academia. It'd be really good to have the ability to automatically (or manually) convert from Hanyu pinyin to Wade-Giles. This page is good for showing most of the conversions from Hanyu pinyin to Wade-Giles (doesn't use IPA charts, though). Bumm13 (talk) 03:24, 6 June 2014 (UTC)

It should be fairly easy to do. @kc kennylau: You might be interested in this too. I might get this started if no one takes the lead. Wyang (talk) 03:29, 6 June 2014 (UTC)
Any chance that Wade-Giles will be added to Template:zh-pron anytime soon? It's actually a big reason why I'm not spending more time converting topolect sections to the new "Chinese" formatting. Just curious. Bumm13 (talk) 19:20, 20 June 2014 (UTC)
(E/C) I am neutral on Wade-Giles. Like Gwoyeu Romatzyh, it's just another system to understand and support, making the Chinese pronunciation box larger. Well, we had Wade-Giles in Hanzi headers (for single-character entries) all the time; to avoid being accused of destroying it, we should probably keep it/add it, but perhaps for single-character entries only(?), and perhaps the same for Gwoyeu Romatzyh(?). @Kc kennylau: it must be an easy task for you? @Wyang, same thinking :) --Anatoli (обсудить/вклад) 03:33, 6 June 2014 (UTC)
I have done a draft for the py_wg function at Module:cmn-pron. All the rudimentary monosyllabic testcases work as expected, which I think is fairly sufficient for showing its robustness if it were to be applied solely to Hanzi entries. Please see if anything needs to be improved and enable it when it is deemed trustworthy. Wyang (talk) 05:47, 6 June 2014 (UTC)

@Bumm13: It's now enabled for Hanzi. Wyang (talk) 01:12, 27 June 2014 (UTC)
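
To give a feel for what a monosyllabic conversion involves, here is a tiny table-driven sketch. It is not the py_wg function in Module:cmn-pron; the name py_wg_sketch and the handful of table entries are chosen purely for illustration, and anything outside the table raises an error rather than guessing.

```python
# A few well-known Hanyu Pinyin -> Wade-Giles syllable pairs.
# A real converter needs the full syllable inventory; this table is
# deliberately tiny and only serves as an example.
WG_TABLE = {
    "bei": "pei",
    "jing": "ching",
    "qing": "ch'ing",
    "xi": "hsi",
    "guo": "kuo",
    "dong": "tung",
    "tai": "t'ai",
    "zhou": "chou",
    "mao": "mao",
}


def py_wg_sketch(syllable: str) -> str:
    """Convert one numbered-pinyin syllable (e.g. "bei3") to Wade-Giles.

    The tone digit, if present, is carried over unchanged.
    Raises KeyError for syllables missing from the sample table.
    """
    if syllable[-1:].isdigit():
        base, tone = syllable[:-1], syllable[-1]
    else:
        base, tone = syllable, ""
    return WG_TABLE[base.lower()] + tone


if __name__ == "__main__":
    print(py_wg_sketch("bei3"), py_wg_sketch("jing1"))
    print(py_wg_sketch("tai2"), py_wg_sketch("bei3"))
```

A lookup table is used here instead of initial/final rewriting rules because several Wade-Giles finals depend on the initial, which makes rule-based sketches easy to get wrong.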

Question re: copula in Korean

I'm curious if the copula 이다 (ida) might have undergone "n" deletion at some point in the distant past. Is there any chance that the negative 아니다 (anida) was originally composed of (a) + 니다 (nida), with negative prefix (an) originating as a contraction of 아니 (ani)?

There are interesting suggestions (such as in this slide deck by Bjarke Frellesvig about Old Japanese and earlier) that classical Japanese perfective auxiliary (nu) might have developed from an older copula, and that this might be the root of even modern particles like (ni). That and related discussions about the 未然形 (mizenkei, irrealis) conjugation got me wondering if there were any analogs in Korean, either in the formation of negatives by using a, or in the copula, and hence my question above.

(Incidentally, etyms 3 and 4 at (an) look completely indistinguishable to me... and on a different note, Chinese entries are looking pretty snazzy. :) )

TIA, ‑‑ Eiríkr Útlendi │ Tala við mig 17:24, 6 June 2014 (UTC)

Interesting! I love Bjarke Frellesvig's book "A History of the Japanese Language" and the presentation you linked to is very interesting as well.
There are four negatives in Korean: an ("not", , 아니, 아니다, 않다), mos ("cannot", , 못하다, 모르다), mal ("don't", 말다), and eps ("not have", 없다). I will post a detailed reply tonight when I have access to the Korean etymology books. Wyang (talk) 03:00, 10 June 2014 (UTC)

Hi Eirikr! Sorry for the delay...

Korean differs from Japanese in that its negative constructions cannot be done purely with endings, and require a combination of verbal endings and negative verb/adjective/adverbs (i.e. 생각지 않다 = 思わず), and is hence less agglutinative in morphology. To me, the first Korean negative series of an has the root form of (a)n, and I always thought this must be cognate with the n or z (< n-su) in the negative forms of Japanese verbs. There doesn't seem to be a negative-forming process by attaching a to the positive copula. The positive copula might eventually be a reduced form of 있다 (itda, “there is”), which in Middle Korean was ista ~ isita ~ sita and this might be related to Japanese aru.

There is a very interesting discussion in Lee Namdeuk's book 한국어 어원 연구 IV, Chapter 基礎 語彙의 語源과 比較 考察. He thinks that an and eps negatives in Korean are ultimately from the same source, and that the -n- negative in Korean is related to the Japanese n negative. Here is the original text:

P.S. The second and third Korean negative mos and mal may be distantly related to Appendix:Proto-Sino-Tibetan/ma.

Glad that the collaborative attempts to make entries snazzier seem to be working!

Cheers, Wyang (talk) 02:39, 11 June 2014 (UTC)

  • Thank you so much for the research and details!
So to make sure we're on the same page, it sounds like:
  1. KO ida did not undergo "n" deletion.
  2. By extension, KO itda did not undergo "n" deletion.
    I would be very grateful if you could confirm the above two, as that would help categorically rule out any connection between these and hypothetical JA copula nu (with inflected form ni).
    (As two tangential ideas, do you think KO i- in ida has any relation to JA i- in iru, classically rendered as wiru? I'm not aware of any phonological processes that might explain "w" deletion in Korean, but you certainly know more about that than I do. And do you have any thoughts on the apparent overlap between KO iss- in itda and PIE *h₁es- (to be), among other odd coincidental KO-PIE collisions?)
  3. KO negatives are historically indicated primarily by the consonant /n/, with the vowel /a/ being an incidental (or otherwise not important in conveying negativity), and the vowel /a/ in its negative capacity definitely not having anything to do with verb conjugation patterns.
    This is interesting as a possible point of real divergence, in that the Japanese negative nu could be analyzed as identical with perfective nu, provided one accepts the 未然形 (mizenkei, irrealis form) as a real feature of the language and not an artifact of some sort: [verb in irrealis == action that is incomplete or hasn't happened yet] + nu == [verb that has completed without happening] == [verb hasn't happened == negative of verb]. For the zu forms, Frellesvig postulates that this was a fusion between the 連用形 (ren'yōkei, continuative form) ni of root form nu + apparent adverbial complement su: /ni/ + /su/ > /nsu/ > /zu/. Have a look at slide 34 of the linked deck -- Frellesvig diagrammed this as “*ani-su”, and that ani is what got me thinking about possible KO connections. Ultimately though, I think the “a” there is just intended to convey the mizenkei. This /ni/ + /su/ matches Lee's notes for OJ on page 79, as much as I can read of them (thank you very much for that, though I regret that I can currently only make out some of the text -- I've really got to spend more time studying Korean). JA negative nashi mentioned on that same page by Lee could be analyzed as the mizenkei na of root form nu + adjectival suffix shi.
    I see on the bottom of page 79 and the start of page 80 that Lee equates this JA “n” element with the KO an element, as you mentioned. I'm still chewing on the JA; I have real trouble viewing JA “n” purely as a negative given the prevalence of affirmative meanings that can potentially be ascribed to this same root, such as modern naru, verbal auxiliary -nau, non-negative adjectival suffix -nai (as in 危ない (abunai), 少ない (sukunai), etc.), perfective nu, possibly even particle ni. I might be open to the possibility of two JA “n” roots that converged or collided somehow, but the semantics for such opposite meanings being expressed in the same sound leave me uncertain as to how that would happen. With the mizenkei verb conjugation stem providing necessary context, the overlap between affirmative and negative “n” meanings in JA can be explained. I know that some authors, Frellesvig apparently among them, have advanced the notion that the mizenkei is purely an historical artifact and not an underlying semantic feature of the language, but without reading their arguments, I can't see where that could be the case -- the mizenkei appears to be an integral feature since the earliest writings, and its semantics can explain a number of otherwise-weird constructions.
Anyway, I realize this is a lot, but if you're willing :), I'd greatly appreciate it if you could 1) confirm that I'm restating the numbered items correctly, and 2) share your thoughts on the rest of the above. I'm an incorrigible language geek, especially when it comes to figuring out how things are put together, so if this has exceeded your interest threshold, just let me know.  :) ‑‑ Eiríkr Útlendi │ Tala við mig 17:37, 11 June 2014 (UTC)
I'm an incorrigible language geek too, fortunately and hopelessly. :) I agree with the first two points, please let me get back to you on the rest later. In the meantime it might be useful to have a look at Lee's discussion on J. iru, wiru: Wyang (talk) 00:11, 12 June 2014 (UTC)

My edit to and possible categories bug

It looks like after converting this article to "Chinese", there might be a bug with how categories are being displayed: it's trying to add to "Hakka nouns/verbs" (and Wu) when no Hakka or Wu readings are specified (plus the munged formatting of the phantom Wu nouns/verbs category links in general). Bumm13 (talk) 08:07, 7 June 2014 (UTC)

Thanks, think it's fixed. I have also expanded that entry slightly. Wyang (talk) 01:00, 10 June 2014 (UTC)

Unforeseen naming issue with pronunciation audio files (Template:zh-pron)

After converting the article to using "Chinese" and the new templates, I noticed that the current Mandarin pronunciation Ogg file has a name that it isn't expecting (and thus showing a red link). The expected name is zh-ēn.ogg but the actual Wikimedia Commons filename for that sound is at Zh-en.ogg instead. I expect that this issue will continue to show up for many other such sound files. :\ Bumm13 (talk) 08:25, 7 June 2014 (UTC)

@Bumm13: You can set the parameter ma to the name of the ogg file. --kc_kennylau (talk) 08:26, 7 June 2014 (UTC)
Oh, okay, thankfully that option is available. Thanks! Bumm13 (talk) 08:30, 7 June 2014 (UTC)
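
The mismatch above can be pictured with a small sketch. It is not the template's actual logic; it assumes the default name is built as "zh-" plus the pinyin plus ".ogg" (as in the expected name quoted above), and the names audio_filename and strip_tone_marks are hypothetical.

```python
import unicodedata
from typing import Optional


def strip_tone_marks(pinyin: str) -> str:
    """Remove combining diacritics, e.g. "ēn" -> "en".

    Caveat: this also removes the diaeresis of "ü", so it is only a
    rough way to guess at unaccented file names.
    """
    decomposed = unicodedata.normalize("NFD", pinyin)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def audio_filename(pinyin: str, override: Optional[str] = None) -> str:
    """Return the audio file name to link to.

    An explicit override (the role played by the ma parameter mentioned
    above) always wins; otherwise the assumed default pattern is used.
    """
    if override:
        return override
    return f"zh-{pinyin}.ogg"


if __name__ == "__main__":
    print(audio_filename("ēn"))                         # default guess
    print(audio_filename("ēn", override="Zh-en.ogg"))   # manual override
    print(strip_tone_marks("ēn"))                       # unaccented form
```

Files uploaded under unaccented names will keep showing as red links under the default pattern, so the override remains necessary for them.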

Russian IPA, remaining Mandarin entries

Hi Frank,

I will get back to you on the Russian pronunciation appendix. Sorry for not doing much lately. Is that OK? I hope you won't lose interest :) I may not be able to describe assimilative palatalisation and gemination rules in good detail.

I have checked the remaining multi-syllabic Mandarin entries. All of them either lack pronunciation sections altogether or use some old-style non-standard pronunciation method. You could probably fix them with AWB and your bot. Could you do that when you have time, please? There are too many to do manually. --Anatoli (обсудить/вклад) 01:29, 10 June 2014 (UTC)

No worries - great works are not finished in a day :). I will be on the lookout for the testcase and talk pages, so please add anything that needs to be improved whenever you think of it.
I will modify the code and do the rest, when I have time. Wyang (talk) 03:18, 10 June 2014 (UTC)
I have fixed the last Mandarin noun - , which didn't have Pinyin inside {{cmn-noun}}. Will check other PoS. --Anatoli (обсудить/вклад) 05:34, 10 June 2014 (UTC)

Simplified and traditional scripts

Hi Wyang, I just had a quick question. Is it true that under the current formatting arrangements, there is no categorisation of simplified and traditional scripts? I just realised this may be the case. ---> Tooironic (talk) 07:59, 10 June 2014 (UTC)

Yes, there is no categorisation by script. Wyang (talk) 00:36, 11 June 2014 (UTC)
I personally don't miss this categorisation. I imagine it wouldn't be hard to reintroduce, though without PoS separation. As previously agreed, 中國 and 中国 are now sorted the same way, by numbered pinyin. --Anatoli (обсудить/вклад) 00:42, 11 June 2014 (UTC)
Seems to be a bit of a shame to me. This kind of categorisation and its related data could be useful for both the average user and people who wish to make use of the data. ---> Tooironic (talk) 09:11, 11 June 2014 (UTC)
@Atitarev:, @kc kennylau:, @Jamesjiao: What is your opinion on this? Should we resurrect categorisation by script? It is easy to reliably generate Category:Chinese terms in simplified script, and equally easy to generate Category:Chinese nouns in simplified script (though not as reliably). I don't think it is worthwhile to split the topolect-specific categories. Wyang (talk) 00:02, 12 June 2014 (UTC)
I wouldn't resurrect SoP's split by traditional/simplified, just split ALL TERMS by traditional/simplified. No, just "Chinese", IMO. --Anatoli (обсудить/вклад) 00:05, 12 June 2014 (UTC)
I am still against sorting by pinyin. What makes Mandarin the official dialect? --kc_kennylau (talk) 04:23, 13 June 2014 (UTC)
I am against it too. We should use the sortkeys in Chinese categories. Wyang (talk) 04:25, 13 June 2014 (UTC)
What's the alternative, guys? I see a big issue with sorting, for example in topical categories (they quickly get out of hand, if |sort= is not specified). How are you going to sort Chinese entries, e.g. Category:Chinese nouns, if we drop pinyin sorting? Back to radical sort or by characters themselves? I'm not suggesting that Mandarin should overwhelm topolects but it's better to have some sorting key than nothing. I see that topolect categories are sorted by the appropriate romanisation but what if we decide not to split by Mandarin traditional/simplified, Cantonese traditional/simplified, etc.? --Anatoli (обсудить/вклад) 05:08, 13 June 2014 (UTC)
My preference is to sort Category:Chinese nouns by radical, and the topolects by romanisations. Wyang (talk) 05:24, 13 June 2014 (UTC)
My preference is numbered pinyin, or to have alternative sorting. If radicals are chosen, it would be great to have a radical index on the categories, as a minimum; otherwise finding a word in a list of thousands won't be possible. One could use Category:Mandarin nouns, of course. There's a table at the top of Category:Mandarin nouns in traditional script but it's no longer usable because the rs= value no longer exists. --Anatoli (обсудить/вклад) 05:46, 13 June 2014 (UTC)
The Chinese dictionary I have has an index of radicals at the beginning of the book and under each radical is a list of characters that incorporate the said radical ordered by the number of strokes of the phonetic element. The actual dictionary is ordered by pinyin from A to Z. JamesjiaoTC 22:55, 15 June 2014 (UTC)
It's technically not difficult to sort them automatically by radical - Module:zh/data has a sortkey function specifically for that purpose. Wyang (talk) 06:05, 13 June 2014 (UTC)
  • Query: Is there any way to add back-end routines to {{zh-pron}} that would add categorizations for each reading (i.e. topolect) as it's added? I haven't explored Lua enough to know if it's even possible, but what of code that could parse the page for POSes and pronunciations, and auto-generate the corresponding categories? ‑‑ Eiríkr Útlendi │ Tala við mig 06:34, 13 June 2014 (UTC)
    Sorry I haven't been very responsive lately... I don't quite understand what you meant above. {{zh-pron}} currently operates under that premise (it seems), generating the corresponding categories depending on the readings and PoS parameter value given. Wyang (talk) 04:19, 20 June 2014 (UTC)
Eirikr probably means using zh-pron to make e.g. "Cantonese nouns in traditional script", etc. I think we shouldn't split by topolects and PoS but that's only me. Just "Chinese terms in traditional script" and ...simplified would do but some people will disagree. --Anatoli (обсудить/вклад) 04:26, 20 June 2014 (UTC)
  • Hi, yes, that's more what I was trying to convey. I'm not much for using the Chinese on this site, but I could see some utility in being topolect-specific -- as a user, to find readings for terms in a specific topolect; and as an editor, in order to find those entries that might still need topolect data. ‑‑ Eiríkr Útlendi │ Tala við mig 18:15, 20 June 2014 (UTC)
Category:catboiler with sc might be relevant. —CodeCat 12:10, 21 June 2014 (UTC)


I currently do not see any need for protecting Template:la-conj-3rd-no234, so can you please unprotect the template? Thanks in advance. --kc_kennylau (talk) 05:10, 11 June 2014 (UTC)

No problem, I have unprotected it for a week for you. Please let me know if that is not long enough or if you need to edit other templates. Wyang (talk) 05:14, 11 June 2014 (UTC)
Thanks. --kc_kennylau (talk) 06:19, 11 June 2014 (UTC)

Mandarin homophones

How can we add to the list of homophones for given pinyin readings? E.g. 財務 is missing 才悟 (才思穎敏,領悟力強). ---> Tooironic (talk) 09:09, 11 June 2014 (UTC)

@Tooironic: Template:cmn-pron/hom/cáiwù or click on the edit button on the top right hand corner of the box on 財務. --kc_kennylau (talk) 10:32, 11 June 2014 (UTC)

Mandarin translation not nested under Chinese

Hi Wyang, I've seen you working a lot on Chinese. If you have some spare time you could have a look at Wiktionary:Todo/Mandarin translation not nested under Chinese. Matthias Buchmeier (talk) 17:49, 11 June 2014 (UTC)

Thanks, I will do that. Wyang (talk) 23:45, 11 June 2014 (UTC)

糸 and 絲

Hi Frank,

I got a bit mixed up with the readings there on . The readings were originally using those for , I think. Could you verify, please? --Anatoli (обсудить/вклад) 00:27, 13 June 2014 (UTC)

Please see now. Wyang (talk) 05:53, 13 June 2014 (UTC)
Thank you, looks great. --Anatoli (обсудить/вклад) 06:04, 13 June 2014 (UTC)

Russian pronunciation

Hi Frank,

I have imported Appendix:Russian pronunciation/imported from the Russian Wiktionary. It's not a gospel and we don't have to follow it 100% as there are choices in IPA. Are there any points you would like me to translate for you? --Anatoli (обсудить/вклад) 13:12, 15 June 2014 (UTC)

Thanks, I will have a look. Wyang (talk) 23:53, 15 June 2014 (UTC)
It doesn't answer the question of the assimilative palatalization and gemination, though. Well, we can keep working on it and deal with problematic cases when they arise.
Could you add handling for some prefixes (they do cause gemination) - I will try to maintain the list. One of the currently failed tests: отдохну́ть (prefix: от-). --Anatoli (обсудить/вклад) 23:59, 15 June 2014 (UTC)

Yellow Link Deal

I add Pinyin and Jyutping readings to yellow links such as was one random word, and you or someone else fills in and corrects the blanks. Could that be a deal? --Lo Ximiendo (talk) 16:04, 15 June 2014 (UTC)

Added. Wyang (talk) 23:53, 15 June 2014 (UTC)
I came across fǎnmiàn and dǎomín. Maybe you could delete the words from your list of missing Chinese words? --Lo Ximiendo (talk) 14:14, 16 June 2014 (UTC)
Please update that page if you could. I haven't been providing care for that page for months... Wyang (talk) 00:40, 18 June 2014 (UTC)


L333-L347 contains extraneous lines. What are you trying to do there? --kc_kennylau (talk) 08:49, 17 June 2014 (UTC)

Generate different indentation for quotations not following definitions, following definitions but inline, and following definitions and not inline. See the bottom of Wiktionary:Feedback for an example of in_notes. Wyang (talk) 12:44, 17 June 2014 (UTC)
You have an "if in_notes then" in L343 which overrides the "elseif in_notes then" in L336. By the way, how can I not make it a quotation? --kc_kennylau (talk) 14:16, 17 June 2014 (UTC)
L336 defines other_lines_indent and simp_indent, whereas L343 defines first_line_indent. What do you mean by a non-quotation? in_notes and inline? Wyang (talk) 23:51, 17 June 2014 (UTC)

Speedy deletion

Can you please clear Category:Candidates for speedy deletion? Thank you. --kc_kennylau (talk) 11:51, 17 June 2014 (UTC)

I've deleted all the Category:cmn:-prefixed pages. Wyang (talk) 12:45, 17 June 2014 (UTC)
Please also delete all the pages in Category:cmn:List of topics that have no sub-categories or elements (I'm too lazy to tag them one by one). [Press Ctrl+F and search for "0 c, 0 e" to find them] --kc_kennylau (talk) 14:13, 17 June 2014 (UTC)
I've deleted all the empty categories and supercategories. Wyang (talk) 00:12, 18 June 2014 (UTC)
I've just restored a large number of deleted cmn categories that had subcategories- in fact, most of the category tree for cmn was obliterated, leaving redlinks all over the place.
There are many topical categories that are populated by {{context}}, so the only way to empty all cmn categories would be to get rid of either the context template or the "lang=cmn" parameter. Unless you do that, there will be non-empty cmn categories, which will in turn be categorized in the parent categories set by {{topic cat}} subtemplates.
Just so we're clear: a category that has subcategories is not empty, and shouldn't be deleted. Not every category is designed to directly contain entries- many are just for navigating between sister categories. If we're going to have cmn topical categories, we should have a category tree to link them together.
If you're going to delete a category, first empty and delete all of its subcategories (and sub-subcategories, etc.). Otherwise, leave it alone. Thanks Chuck Entz (talk) 02:55, 20 June 2014 (UTC)
The whole category tree starting from Category:cmn:List of topics has been deleted. Wyang (talk) 03:42, 20 June 2014 (UTC)
That'll work... As much as I hate to see all my work in creating and then restoring the categories just evaporate like that, my only real problem was with deleting non-empty categories. As long as you make sure the categories are empty before you delete them, I can live with it. Sorry for the extra work! Chuck Entz (talk) 05:22, 20 June 2014 (UTC)
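The rule Chuck Entz describes above (delete a category only after all of its subcategories are empty and deleted) amounts to a children-first walk of the category tree. Here is a minimal Python sketch of that idea, assuming the tree and entry counts have already been fetched into plain dictionaries; the function name, the data layout and the category names in the usage note are hypothetical, and nothing here calls the MediaWiki API.

```python
from typing import Dict, List


def deletable_categories(subcats: Dict[str, List[str]],
                         entries: Dict[str, int],
                         root: str) -> List[str]:
    """Return the categories under `root` that are safe to delete, children first.

    `subcats` maps a category to its direct subcategories and `entries`
    maps a category to the number of entries it contains directly.
    Both are assumed inputs for this sketch.
    """
    order: List[str] = []

    def visit(cat: str) -> bool:
        # Visit every subcategory before deciding about the parent.
        # The list is built in full (no short-circuit) so that an empty
        # branch is still collected when one of its siblings is kept.
        results = [visit(child) for child in subcats.get(cat, [])]
        empty = all(results) and entries.get(cat, 0) == 0
        if empty:
            order.append(cat)
        return empty

    visit(root)
    return order
```

With a hypothetical tree in which "cmn:Animals" holds three entries under "cmn:Nature", only the empty sibling branch would be listed, and the root would be kept because one of its subcategories is still populated.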


I really struggled translating this concept into English, was wondering if you had any suggestions? ---> Tooironic (talk) 02:16, 26 June 2014 (UTC)

Audio-visual, intuitive, self-evident? I have expanded that entry. Wyang (talk) 23:51, 26 June 2014 (UTC)
Many thanks. ---> Tooironic (talk) 13:12, 4 July 2014 (UTC)


Requesting deletion of all pages in the Cantonese homophone templates, thank you. --kc_kennylau (talk) 10:06, 27 June 2014 (UTC)

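All deleted. Separately, regarding the discussion of the palatalised series of initials in Guangzhou Cantonese: if there are no objections, could the previous version be restored? Many thanks. Wyang (talk) 10:55, 27 June 2014 (UTC)
Modified according to pages 41 and 42 of the presentation. --kc_kennylau (talk) 12:42, 27 June 2014 (UTC)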

Edit request: Module:ja-headword

Please semi-protect for a day so that I can edit. Thank you in advance. --kc_kennylau (talk) 01:53, 28 June 2014 (UTC)

@Kc kennylau: Unprotected, pls let me or Wyang know when done. --Anatoli (обсудить/вклад) 03:41, 28 June 2014 (UTC)

@Atitarev: Done. Thank you. --kc_kennylau (talk) 14:27, 28 June 2014 (UTC)
Protected again. Thank you for the edit. --Anatoli (обсудить/вклад) 14:39, 28 June 2014 (UTC)
Is full protection really necessary? —CodeCat 14:40, 28 June 2014 (UTC)
Not sure, if you ask me. I only restored the previous protection. --Anatoli (обсудить/вклад) 15:18, 28 June 2014 (UTC)
The page history shows that User:Haplology is the user who fully protected it. Pinging him/her to here. --kc_kennylau (talk) 17:15, 28 June 2014 (UTC)
  • Hey folks, I noticed that entries such as 檳榔, that need both hira and kata specified, wind up getting two romaji listings in the headline. That doesn't seem quite right... ‑‑ Eiríkr Útlendi │ Tala við mig 01:13, 29 June 2014 (UTC)

@Kc kennylau: Unprotected again. Please fix. @Eirikr: thanks for letting us know. --Anatoli (обсудить/вклад) 02:52, 29 June 2014 (UTC)

@Atitarev, Eirikr: Done. --kc_kennylau (talk) 08:38, 29 June 2014 (UTC)
  • Hmm, I just found that now 鮎#Japanese isn't showing any romaji at all -- specifically for the first etym noun sense, where the あゆ reading is supplied as an unnamed positional parameter and the アユ reading is supplied as the named kata= parameter. ‑‑ Eiríkr Útlendi │ Tala við mig 22:56, 14 July 2014 (UTC)
  • I have edited it to make romaji display on . Hopefully it didn't cause romajis to blossom elsewhere! Wyang (talk) 00:00, 15 July 2014 (UTC)
Sorry! How about now? Wyang (talk) 12:08, 15 July 2014 (UTC)
  • Herp-a-derp on my part -- it never occurred to me to look at the katakana string itself as provided to the template. Thank you for that!  :) ‑‑ Eiríkr Útlendi │ Tala við mig 16:40, 15 July 2014 (UTC)
No worries :) Wyang (talk) 23:40, 15 July 2014 (UTC)


This entry apparently got accidentally scrambled by an AWB edit of yours in the heat of the topolect merger, and it's been reverted to the edit previous to that by a contributor. I thought you might want to take a look at it. Chuck Entz (talk) 04:22, 30 June 2014 (UTC)

Fixed. --Anatoli (обсудить/вклад) 04:43, 30 June 2014 (UTC)

Module error

Caused by this edit. Please fix it. —Mr. Granger (talkcontribs) 00:34, 4 July 2014 (UTC)

Actually, no. Kephir's last edit to Module:och-pron did that. The edit you're referring to just switched Old Chinese on so the module could choke on it. Chuck Entz (talk) 00:39, 4 July 2014 (UTC)
Fixed. --kc_kennylau (talk) 00:53, 4 July 2014 (UTC)
Almost. There are still 9 entries with a variation on the same error that don't respond to null edits. Thanks for the other 111 entries, though. Chuck Entz (talk) 01:10, 4 July 2014 (UTC)
I created many irrelevant errors when doing a major edit on Module:yue-pron, which, by the way, is fixed now. --kc_kennylau (talk) 01:40, 4 July 2014 (UTC)
I know about those- they went away after null edits- but these are in och-pron, not yue-pron (in several cases, an entry had both, a few lines apart from each other). Chuck Entz (talk) 02:05, 4 July 2014 (UTC)
Not sure if I had accidentally fixed it in my sorting of Module:zh/data/och_pron, but Category:Pages with module errors is empty now. Wyang (talk) 04:30, 4 July 2014 (UTC)
It's hard to be sure of anything, with all the edits cycling through the edit queue. After I read your comment, I checked and saw 55 entries in the category. In the time it took me to do a quick null edit on one entry, it was empty again. Still, I haven't seen anything that displayed a module error on the page or that survived a null edit, so I think we're out of the woods on this one- for now, anyway. Thanks! Chuck Entz (talk) 04:47, 4 July 2014 (UTC)

Taiwan pronunciation for

If possible could you update the automatic template so that the Taiwan variant pronunciation for 縛 as fú is added? See more at the page for 綁縛 and here too. Thanks. ---> Tooironic (talk) 13:10, 4 July 2014 (UTC)

@Tooironic: Done. See 綁縛绑缚 --Anatoli (обсудить/вклад) 13:47, 4 July 2014 (UTC)
Thanks muchly! ---> Tooironic (talk) 22:02, 6 July 2014 (UTC)


Hi! Can you shed any light on WT:RFV#坉, and on whether or not can mean "water that does not recede and cannot be diverted"? - -sche (discuss) 03:23, 6 July 2014 (UTC)

Kenny has commented there. The sense is easily attested. Wyang (talk) 23:30, 6 July 2014 (UTC)
Thanks, both of you! - -sche (discuss) 03:05, 7 July 2014 (UTC)

Remaining cmn-nouns and other Mandarin PoS

Hi Frank,

Do you still have any tricks for the remaining [2] (and other PoS), multisyllabic, at least? --Anatoli (обсудить/вклад) 03:27, 10 July 2014 (UTC)

Most of the >1-syllable Hanzi words there are done now... Wyang (talk) 07:08, 10 July 2014 (UTC)
Thanks a lot! I was a bit bored converting them manually :) However, there is still a list of verbs, phrases and quite a lot of proverbs and idioms. --Anatoli (обсудить/вклад) 11:06, 10 July 2014 (UTC)
Verbs are done. --Anatoli (обсудить/вклад) 04:50, 11 July 2014 (UTC)
Thanks! I did a number of phrases and idioms yesterday, and will have a look later. Wyang (talk) 09:07, 11 July 2014 (UTC)
Frank, please put finishing those pesky cmn templates on your to-do list, e.g. proper nouns [3], idioms, proverbs, etc. :) It just seems much easier for you with AWB. If any of them are hard to do because of bad formatting, I'll finish manually. I have a question: how would you write IPA for 三Q? Not sure how to convert it to the new format. --Anatoli (обсудить/вклад) 00:43, 15 July 2014 (UTC)
Not sure if you missed my request. Just need to know if I need to continue to do them manually, they are not very interesting but need to be done. :) --Anatoli T. (обсудить/вклад) 07:03, 21 July 2014 (UTC)
Sorry! I missed your message earlier. There is no need to do them manually since time would be better spent on other tasks, although on the other hand I haven't been very free lately... I reckon we should disable Zhuyin, IPA etc. if the entry title contains non-Chinese characters. Wyang (talk) 23:35, 21 July 2014 (UTC)
That's OK, whenever you have time, just wanted to make sure you read my message. Thanks. :) Re: IPA, Zhuyin, 卡拉OK may get hits in Zhuyin, besides, users may want to know how Chinese pronounce those words but it's too hard, well... --Anatoli T. (обсудить/вклад) 23:47, 21 July 2014 (UTC)
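Wyang's suggestion above (disable Zhuyin, IPA etc. when the entry title contains non-Chinese characters) could be tested with a check along these lines. This is only a Python sketch under stated assumptions: the function name is hypothetical, and the character ranges cover just the basic CJK Unified Ideographs block and Extension A, so rarer characters in other extensions would be treated as non-Chinese.

```python
import re

# Basic CJK Unified Ideographs (U+4E00-U+9FFF) plus Extension A
# (U+3400-U+4DBF). Other extension blocks are deliberately left out
# of this sketch.
HAN_ONLY = re.compile(r"[\u3400-\u4dbf\u4e00-\u9fff]+")


def allow_auto_pronunciation(title: str) -> bool:
    """Return True only when every character of the title is a Han character."""
    return HAN_ONLY.fullmatch(title) is not None
```

Titles such as 三Q and 卡拉OK contain Latin letters, so the check would turn automatic readings off for them, which is exactly the trade-off Anatoli raises for 卡拉OK.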

Edit request: Module:ja-pron

Created an entry for 合期. This has a rare alternate reading of gaggo, for which {{ja-pron}} has produced the unlikely IPA of [ga̠k̚g̃o̞]. So far as I know, geminate "g" sounds in Japanese (rare as they are) never manifest this way. Could someone look into this? ‑‑ Eiríkr Útlendi │ Tala við mig 19:14, 10 July 2014 (UTC)

Good point. I thought the Japanese are unable to pronounce voiced geminates and thus bed and bet would end up basically identical (even though written differently), which is why I devoiced the first part of the geminate with no audible release (probably also influenced by the limited distribution of checked tone to voiceless codas in Chinese). I have changed them to truly voiced geminates, and added a voicelessness sign, and removed the optional nasalisation of g in gg. Wyang (talk) 09:13, 11 July 2014 (UTC)


According to the Taiwan dictionary, 吃貨 can also mean 股票術語,指做手於低價時不動聲色的買進股票. Do you know what the English equivalent would be? I'm lost. ---> Tooironic (talk) 23:27, 10 July 2014 (UTC)

The act of quietly accumulating shares of stock by traders when the stock is at a lower price? Would it sound too literal? Wyang (talk) 09:18, 11 July 2014 (UTC)

Middle Chinese

Now that {{zh-pron}} includes Middle Chinese pronunciation info, what should be done with these 275 Middle Chinese entries that were discussed in the BP in January (list)? Can the bizarrely-annotated, half-hidden pronunciation information be removed from the ==Middle Chinese== sections of those entries now, once the entries are made to use {{zh-pron}}? (If the info isn't removed, I'd like to standardize the wording and make it visible, like this.) - -sche (discuss) 02:05, 11 July 2014 (UTC)

I parsed through your list. 41 articles are gone and here is the updated version:
I would just leave them as they are as the Chinese merger is actively ongoing and they would probably be gone in a year. Unless someone wants to decimate them now... Wyang (talk) 09:46, 11 July 2014 (UTC)
OK; I have no problem leaving them as-is for now, as long as something will be done with them in the long term. Cheers, - -sche (discuss) 18:43, 14 July 2014 (UTC)

Minor issues in Template:zh-pron

Hi Wyang,

The unified "Chinese" with Template:zh-pron is working quite nicely for the most part. I have found a few small errors that eventually will need to be fixed regarding romanization readings. For Mandarin (Wade-Giles), the pinyin "gui" is showing as "kui" when it should be showing "kuei". Example: .

There's also two relatively minor Min Nan romanization issues. Going from Peh-oe-ji to Tai-Lo (in both cases), the Tai-Lo -eh (as in "ngeh") is supposed to correspond to POJ -oeh (as in "ngoeh"), at least according to the sources I checked against. The template is changing POJ -oeh to -ueh in Tai-Lo. Example: .

Also, the -o͘ suffix in Peh-oe-ji corresponds to -oo in Tai-Lo but is showing up as if they are the same. Example: . Other than that, everything looks great so far. Keep up the good work! :) Bumm13 (talk) 20:54, 11 July 2014 (UTC)

@Bumm13: Thanks for your kind reminder. However, according to Wikipedia, in Tai-lo ngueh is correct, and in Wade-Giles kui is correct. Can you kindly state your source? Thanks once again. --kc_kennylau (talk) 09:54, 12 July 2014 (UTC)
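For reference, the two Peh-oe-ji to Tai-lo points raised in this thread can be written as plain string substitutions. The Python sketch below assumes untoned syllables and follows the correspondence cited from Wikipedia in the reply above (POJ oe becomes Tai-lo ue, POJ o͘ becomes Tai-lo oo); the function name is hypothetical, it covers none of the other differences between the two systems, and it is not the code used by the template.

```python
def poj_to_tailo(syllable: str) -> str:
    """Convert the two vowel spellings discussed here in an untoned POJ syllable."""
    # POJ "o͘" is the letter o followed by U+0358 (combining dot above right).
    result = syllable.replace("o\u0358", "oo")
    # POJ "oe" corresponds to Tai-lo "ue".
    result = result.replace("oe", "ue")
    return result
```

Toned syllables would need extra care, because a tone mark can sit between the o and the combining dot once the string is normalised.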
For the pinyin "gui" conversion, here are two good sources: [4] and [5]. These are both university library sources (Hong Kong University of Science and Technology and the University of Chicago, respectively). The former is actually in China, while the latter is basically the equivalent of an Ivy League institution in the United States. I'll have to get back to you on the "ngeh" Min Nan issue. My sources for that one are (admittedly somewhat weak) the Open Dictionary Network - Min Nan Dictionary for Peh-oe-ji, compared with the Taiwan Min Nan Common Words Dictionary (Taiwan Ministry of Education) for Tai-Lo. Bumm13 (talk) 17:10, 12 July 2014 (UTC)
I've fixed the kuei conversion. --kc_kennylau (talk) 17:18, 12 July 2014 (UTC)
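The kuei fix comes down to one special case: pinyin -ui is written -uei in Wade-Giles after the velar initials (gui gives kuei, kui gives k'uei) and stays -ui elsewhere. A small Python sketch of that rule follows, limited to untoned syllables and to a handful of initials; the table and function name are illustrative assumptions and are not taken from the module.

```python
# Pinyin initial -> Wade-Giles initial, for a few initials only.
WG_INITIALS = {"g": "k", "k": "k'", "d": "t", "t": "t'", "h": "h"}


def pinyin_ui_to_wade_giles(syllable: str) -> str:
    """Convert an untoned pinyin syllable ending in -ui to Wade-Giles."""
    if not syllable.endswith("ui"):
        raise ValueError("this sketch only handles syllables ending in -ui")
    initial = syllable[:-2]
    if initial not in WG_INITIALS:
        raise ValueError("initial not covered by this sketch: " + initial)
    # After g and k the final is spelled out in full as -uei.
    final = "uei" if initial in ("g", "k") else "ui"
    return WG_INITIALS[initial] + final
```

For example, `pinyin_ui_to_wade_giles("gui")` yields `kuei`, while `pinyin_ui_to_wade_giles("dui")` yields `tui`.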

Reference templates

Can additional characters be "exploded" in Template:R:xcl:AG? Such as “a”, “b”, “f” and the comma. --Vahag (talk) 22:31, 12 July 2014 (UTC)

Like this? Wyang (talk) 22:49, 12 July 2014 (UTC)
Yes, exactly like that, thank you. --Vahag (talk) 22:58, 12 July 2014 (UTC)

Hello again. Is there an expression that can easily convert the parameter {{{vol}}} in Roman numerals (I, II, ..., X) into Arabic numbers (1, 2, ..., 10) in Template:R:xcl:HAB, in the &volume={{{vol}}} part? --Vahag (talk) 08:14, 28 July 2014 (UTC)

Nope. You have to write a module. Keφr 08:17, 28 July 2014 (UTC)
…which I just wrote. "Ungoliant {{#invoke:foreign numerals|from_Roman|MMDCCLXIV}}" gives: "Ungoliant 2764". Keφr 08:36, 28 July 2014 (UTC)
Thanks! Any module to convert User:BD2412 and User:msh210 into human language? --Vahag (talk) 09:24, 28 July 2014 (UTC)
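For readers curious how a Roman-to-Arabic conversion like the one requested above can work, here is a short Python sketch of the usual right-to-left subtractive method. It is not the code of the module mentioned in this thread; the function name `from_roman` merely echoes it, and the sketch does not reject malformed numerals.

```python
ROMAN_VALUES = {"I": 1, "V": 5, "X": 10, "L": 50, "C": 100, "D": 500, "M": 1000}


def from_roman(numeral: str) -> int:
    """Convert a Roman numeral such as 'XIV' to an integer."""
    total = 0
    largest_seen = 0
    # Read from right to left: a symbol smaller than the largest one
    # already seen is subtractive (the I in IV), otherwise it is added.
    for symbol in reversed(numeral.upper()):
        value = ROMAN_VALUES[symbol]
        if value < largest_seen:
            total -= value
        else:
            total += value
            largest_seen = value
    return total
```

`from_roman("MMDCCLXIV")` returns 2764, matching the example quoted above, and the volume numbers I to X map to 1 to 10.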


In Appendix:Proto-Sino-Tibetan/p(r)an/t ~ b(r)an/t, what does the "greater than overlapping less than" symbol signify? - -sche (discuss) 18:43, 14 July 2014 (UTC)

Allofamic variants. Also, the slash between 1 and 2 represents two alternative tone categories for the first allofam, not two reconstructions. Wyang (talk) 23:25, 14 July 2014 (UTC)


We are missing about four extra senses here. Don't suppose you'd be interested in taking a stab? I'm busy with something else at the moment. ---> Tooironic (talk) 11:53, 16 July 2014 (UTC)

Anatoli and I have expanded the entry. Wyang (talk) 00:47, 17 July 2014 (UTC)
Looks fantastic, thanks for this! ---> Tooironic (talk) 09:42, 26 July 2014 (UTC)

Issue with multiple audio files ( article)

I tried adding a second Mandarin pronunciation .ogg file entry to Template:zh-pron in the article and nothing I've tried seems to work in causing the second file's click play button thing to show up in my browser(s). Could you check the article to see if I'm doing something wrong? I have both Mandarin readings in the "m=" parameter, so I would think the audio files would show up without a lot of effort. Bumm13 (talk) 08:41, 21 July 2014 (UTC)

It could be generated by putting ,2a=y in the |m= field, please see what I did. Ideally the pronunciations should be split since they have alternative etymologies, as in or , but for short articles like 教, extra parameters like ,2a=, 3a=, and 4a= are available for use. Wyang (talk) 23:33, 21 July 2014 (UTC)


This looks real, but obviously messy, and I am not sure if this is a brand name, in which case it would need to pass WT:BRAND. (The same IP has also added prigle, for what it may be worth.) Can you take care of this? Keφr 06:38, 26 July 2014 (UTC)

It is the Korean equivalent of uncooked ramen noodles. I've made some changes there. Wyang (talk) 11:58, 27 July 2014 (UTC)


Is there really a Taiwanese variant pronunciation of gémò? This does not seem to be supported by 國語辭典. ---> Tooironic (talk) 09:40, 26 July 2014 (UTC)

It doesn't seem to be consistent - see 橫膈膜. I've changed the tag to "variant in Taiwan". Wyang (talk) 11:43, 27 July 2014 (UTC)

Dzongkha (རྫོང་ཁ) data

Hi Wyang

Thanks for your response in the Beer Parlour. We would like to contribute the data for Dzongkha dictionaries. I think it will need someone familiar with Wikimedia software and something like Python to convert and import this data - unfortunately I don't have those skills. Any help would be appreciated. CFynn (talk) 13:06, 31 July 2014 (UTC)


@CFynn: Hi Chris! It's great to have you here. I wrote a module for transliterating Tibetan/Dzongkha a while ago (Module:bo-translit) and have written a few simple Python scripts for either retrieving or uploading data from/to Wiktionary. I have used the dictionary at a couple of times, and was impressed by how well-organised the website is. For creating an entry of a word in Wiktionary, we need two pieces of information about the word: the definition and part of speech. I am more than glad to help out if you have any questions. Thanks, Wyang (talk) 23:26, 31 July 2014 (UTC)
@Wyang:Hi. We have XDXF (XML) files of the dictionaries which are probably the easiest format to deal with. The Dzongkha-English dictionary has part of speech and English definition and sometimes a Dzongkha synonym. I think this could be used to make the basic entries. There are separate files which list verb forms (past, present future) and honorific forms of words - which might be added on top. The English-Dzongkha dictionary has English word, part of speech, Dzongkha definition(s). The Dzongkha-Dzongkha dictionary is just word+definition with separate field for part of speech - though this information is often embedded within the definition.
The Tibetan-Dzongkha dictionary has Word+Definition with part of speech within square brackets as the first part of the definition which should be easy to extract. Sometimes the square brackets also contain a code indicating the head word is Sanskrit in Tibetan script or an archaic form. The Dzongkha-English and English-Dzongkha dictionaries are clearly going to be the easiest to deal with. The differences in format are due to the fact that these were originally compiled by different people at different times using only a word processor - not even a database. At that time people were only concerned about print publication. 07:17, 5 August 2014 (UTC)
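Pulling a part of speech out of leading square brackets, as described above for the Tibetan-Dzongkha dictionary, could look roughly like the Python sketch below. The exact bracket contents (including the codes for Sanskrit or archaic head words) are not shown in this thread, so the function name and the sample strings in the usage note are assumptions made purely for illustration.

```python
import re
from typing import Optional, Tuple

# A definition that starts with "[...]": capture the bracket contents
# and everything that follows them.
LEADING_BRACKET = re.compile(r"^\s*\[([^\]]*)\]\s*(.*)$", re.DOTALL)


def split_pos(definition: str) -> Tuple[Optional[str], str]:
    """Split '[tag] text' into (tag, text); the tag is None when absent."""
    match = LEADING_BRACKET.match(definition)
    if match is None:
        return None, definition.strip()
    return match.group(1).strip(), match.group(2).strip()
```

For a made-up input such as `"[noun] example definition"` this gives `("noun", "example definition")`, and a definition without a leading bracket comes back unchanged with `None` as the tag.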

In the XDXF files Dzongkha-English entries look like this:
<ar><k>ཀྲུམ་ཀྲུ</k> <def> noun cartilage (པགས་ཀོ་ཧྲབ་ཧྲོབ།) adj. crisp, crunchy, gristle (ཕྲུམ་ཕྲུམ།)</def> </ar>

<ar><k>ཀྲེག</k> <def> <pos>verb</pos>( fut., prs., pst., imp.) scratch, cross out</def> </ar>

<ar><k>ཀྲེག་ཀྲེགཔ</k> <def> <pos>adj.</pos>shaven</def> </ar>

<ar><k>ཀྲེག་ཆས</k> <def> <pos>noun</pos>abrasive, scraper, shaver</def> </ar>

English-Dzongkha like this:
<ar><k>A</k> <pos>n:</pos> ༡ ཨིང་ལིཤ་གི་ཡི་གུ་དང་པ། ༢ སྡེ་ཚན་ཀ་པ། སྡེ་རིམ་ཀ་པ། ༣ དྲག་ཤོས།</ar>

<ar><k>a</k> <pos>ia:</pos> ཅིག ཞིག ཤིག གང༌།</ar>

<ar><k>aardvark</k> <pos>n:</pos> གྱོག་དོམ། གྱོག་མོ་ཟ་མིའི་དོམ།</ar>

<ar><k>aard-wolf</k> <pos>n:</pos> འཕརཝ། ཨ་ཕི་རི་ཀ་ལུ་ཡོད་པའི་འཕརཝ་གི་རིགས་ཅིག</ar>

<ar><k>aback</k> <pos>adv:</pos> དཔྱད་རིག་རྣམ་རྟོག་མེད་པར། དཔྱད་རིག་མེད་སི་སི་སྦེ།</ar>

- CFynn (talk) 07:35, 5 August 2014 (UTC)
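As a rough idea of how the Dzongkha-English entries shown above could be read, here is a Python sketch using the standard library's ElementTree. It assumes each entry is an `<ar>` element with a `<k>` head word and a `<def>` that may contain a `<pos>` child, as in the samples; the `<xdxf>` wrapper around the sample, the function name and the dictionary keys are assumptions of this sketch. The English-Dzongkha layout, where `<pos>` sits directly under `<ar>`, is not handled.

```python
import xml.etree.ElementTree as ET

# Two of the sample entries from above, wrapped in a root element so
# that the fragment is well-formed XML.
SAMPLE = """<xdxf>
<ar><k>ཀྲེག་ཆས</k> <def> <pos>noun</pos>abrasive, scraper, shaver</def> </ar>
<ar><k>ཀྲེག་ཀྲེགཔ</k> <def> <pos>adj.</pos>shaven</def> </ar>
</xdxf>"""


def parse_entries(xml_text):
    """Return a list of dicts with head word, part of speech and gloss."""
    root = ET.fromstring(xml_text)
    entries = []
    for ar in root.iter("ar"):
        headword = (ar.findtext("k") or "").strip()
        definition = ar.find("def")
        if definition is None:
            continue
        pos_element = definition.find("pos")
        pos = None
        if pos_element is not None and pos_element.text:
            pos = pos_element.text.strip()
        # The gloss is the text directly inside <def> plus the text that
        # follows the closing </pos> tag.
        parts = [definition.text or ""]
        if pos_element is not None:
            parts.append(pos_element.tail or "")
        entries.append({"headword": headword, "pos": pos, "gloss": "".join(parts).strip()})
    return entries
```

Entries like the first sample above, where the part of speech is written inline rather than in a `<pos>` tag, come back with `pos` set to `None` and the whole text as the gloss, so they would still need the kind of clean-up discussed in this thread.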

@CFynn: Thanks for the reply. The xml file easiest to use here would be the Dzongkha-English dictionary, as well as the additional files of verb conjugation and honorifics. The English-Dzongkha dictionary can only be used here for adding to translation tables (eg. aardvark), which is less straightforward (multiple translation tables, linking to components in translations). Embedding of part of speech in the definition shouldn't be too much of a problem. Would there be anything to take care of in terms of copyright (referencing) and externally linking to the website? It seems it's not a very difficult task, and we can get started on this soon. Wyang (talk) 23:22, 5 August 2014 (UTC)
OK. I'll get the latest versions of those files together and post a link here to the files. If Wikimedia need an official letter saying the data is released under CC-BY-SA 3.0 + GFDL I can get the Secretary of the DDC to write one and we can fax it or send it by snail mail if you can tell me where and to whom this should be sent. Is there some kind of standard release form? A note or references saying the Dzongkha data is from the DDC and a link to their website would be nice. (BTW PDF versions of all the dictionaries are available on the DDC site.) CFynn (talk) 04:28, 6 August 2014 (UTC)
Thank you. References containing link to the website would be appended to all entries. Here is the Wikipedia policy on donating copyrighted information: w:Wikipedia:Donating copyrighted materials#Granting us permission to copy material already online. Copyright is usually less of a concern at Wiktionary, since the material involved is generally short in length and not of innovative nature. If we want to be safe, we could request that the Secretary send a brief email declaring permission to use DDC data. Wyang (talk) 04:49, 6 August 2014 (UTC)
OK - this may take me a few days as I'm recovering from a minor operation on my foot and it is a little difficult for me to get around. CFynn (talk) 21:00, 6 August 2014 (UTC)
OK, no worries. Wyang (talk) 23:20, 6 August 2014 (UTC)

Dzongkha data

OK I've posted the XDXF dictionary files here:

I forgot that this data was already available at - under CC-BY-SA 3.0 license.

PDF copies of the printed versions of these dictionaries can be found at

CFynn (talk) 17:11, 12 August 2014 (UTC)

BTW you may need to slightly modify your Tibetan transliterating tool for some Dzongkha entries. Dzongkha syllables sometimes contain a second root which does not occur in Tibetan. This mostly happens when the tseg between syllables is dropped to reflect Dzongkha pronunciation. e.g. Tibetan བླ་མ་ (bla ma / Lama) = Dzongkha བླམ་ (blam / Lam). About 12 years ago when I was working on Tibetan & Dzongkha collation I compiled a spreadsheet which shows all the possibilities for a second root in a Dzongkha syllable which might be useful to you. I'll try and find it and post a link. CFynn (talk) 17:29, 12 August 2014 (UTC)

A minor tweak for zh-pron

At , I just fixed an edit where spaces before the commas in the cat= part caused the module not to recognize the POS abbreviations. I found out about it from an entry in Special:WantedCategories for Category:Hakka pron (I would highly recommend regularly checking Special:WantedCategories for non-catastrophic module errors- it updates every 3 days). Is it too much trouble to have the module allow for whitespace in arguments to avoid this in the future? Thanks. Chuck Entz (talk) 18:42, 1 August 2014 (UTC)

Done now (spaces after commas). Wyang (talk) 00:36, 4 August 2014 (UTC)
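The whitespace tolerance requested above is usually a matter of trimming each item after splitting on commas. The module itself is written in Lua; the Python sketch below only illustrates the idea, and the function name and the abbreviations in the usage note are hypothetical.

```python
from typing import List


def split_cat(value: str) -> List[str]:
    """Split a comma-separated cat= value, ignoring spaces around each item."""
    # Trim every piece and drop pieces that are empty after trimming,
    # so "n , v" and "n,v" give the same result.
    return [part.strip() for part in value.split(",") if part.strip()]
```

With this, `split_cat("n , v,adj")` and `split_cat("n,v,adj")` both give `["n", "v", "adj"]`, so spaces on either side of a comma no longer change which abbreviations are recognised.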


Thanks for some great edits recently. My understanding of 吐槽 was it was more like "whinge", what do you think? ---> Tooironic (talk) 03:50, 5 August 2014 (UTC)

Thanks, I have added it. Wyang (talk) 04:06, 5 August 2014 (UTC)
Fantastic. Our Chinese coverage has been improving leaps and bounds recently. ---> Tooironic (talk) 00:27, 6 August 2014 (UTC)


According to my C-C Dictionary, this term can mean both social work and 指以安定人民生活、协调人际关系和维持社会秩序为主要目的的各种为人民大众谋福利的工作. Do you think the latter could be defined as "community service"? ---> Tooironic (talk) 01:19, 7 August 2014 (UTC)

Yes, absolutely. Wyang (talk) 01:20, 7 August 2014 (UTC)

Module errors in cmn-pron

The erhua code you wrote seems to have introduced some module errors in four Chinese entries. Take a look at Category:Pages with module errors.

Benwing (talk) 12:06, 10 August 2014 (UTC)

Fixed - they were using parameters not defined in the original set. Wyang (talk) 23:20, 10 August 2014 (UTC)


Wasn't sure how to word the second meaning here. My C-C dictionary defines it as 因某人在眼前而感到不方便。例如:他俩在说悄悄话,我们呆在这儿很碍眼。 Any ideas? ---> Tooironic (talk) 09:29, 16 August 2014 (UTC)

Hmm, (of someone's presence) to make others feel inconvenient or uncomfortable? Wyang (talk) 23:37, 17 August 2014 (UTC)

Module errors in cdo-pron

fyi, in case you hadn't noticed- 5 entries affected. Chuck Entz (talk)

Thanks, fixed. Wyang (talk) 05:31, 25 August 2014 (UTC)


Do you have a better definition? Thank you in advance. --kc_kennylau (talk) 00:16, 26 August 2014 (UTC)

Not really, the definition summarises the meaning well. Wyang (talk) 00:20, 26 August 2014 (UTC)
I think the definitions in 撞邪, 撞鬼, 見鬼 and 见鬼 are not quite accurate. Do you have a better definition? --kc_kennylau (talk) 11:20, 26 August 2014 (UTC)
I would say "1) to be absurd; preposterous; 2) to be down on one's luck; 3) to go to hell; to hell with ...; 4) damn; damn you; for Christ's sake" for these words. Wyang (talk) 11:31, 26 August 2014 (UTC)

歲數 / 岁数

Just noticed an error in the Mandarin pinyin here. I've fixed it now. Is the Cantonese correct? ---> Tooironic (talk) 12:50, 26 August 2014 (UTC)

@Tooironic: Yep. --kc_kennylau (talk) 13:14, 26 August 2014 (UTC)


Would it be better to pass no parameter to the module and let the module get all the parameters from the parent? --kc_kennylau (talk) 18:01, 26 August 2014 (UTC)

Yes, I was thinking about the same thing when I was adding extra parameters. Wyang (talk) 22:46, 26 August 2014 (UTC)