Pronunciation file

@Wyang What's the right way to indicate the pronunciation file? 歷史 just uses |a=y. Can this be documented, please? --Anatoli (обсудить/вклад) 01:44, 7 April 2014 (UTC)

Hi. Done. Wyang (talk) 01:50, 7 April 2014 (UTC)
|a=y is a bit confusing on its own. What if it's only Mandarin file present, no Cantonese, etc.? --Anatoli (обсудить/вклад) 02:35, 7 April 2014 (UTC)
|a=y is the parameter and argument used in {{Pinyin-IPA}}. For this template, the variety code has to be prefixed to 'a'. Wyang (talk) 02:53, 7 April 2014 (UTC)
I tried to do that (|ma=y) but the audio link disappears, e.g. {{Pinyin-IPA|lìshǐ|ma=y}} in 歷史. --Anatoli (обсудить/вклад) 03:10, 7 April 2014 (UTC)

{{Pinyin-IPA}} is Mandarin-only, hence |a=y or |a=zh-lìshǐ.ogg. {{zh-pron}} is across-topolectal, hence |ma=y or |ma=zh-lìshǐ.ogg. Wyang (talk) 03:23, 7 April 2014 (UTC)

{{ping|Wyang}} Thank you but I'm still confused. See 日本, I had to use {{Pinyin-IPA|Rìběn|a=Zh-ri4ben3.ogg}}. It's |a=, not |ma=. "ma" doesn't work. --Anatoli (обсудить/вклад) 23:37, 7 April 2014 (UTC)
It's {{Pinyin-IPA}} (Mandarin-only), not {{zh-pron}} (across-dialectal), which is why there is no |ma= parameter. Wyang (talk) 23:42, 7 April 2014 (UTC)
I see, thanks. Perhaps I need to see it used more often. :) BTW, I haven't listened to the audio on 日本‎. Was it really bad? --Anatoli (обсудить/вклад) 23:54, 7 April 2014 (UTC)
Here it is if you haven't heard it: [audio]. It's another non-native pronunciation by Peter Isotalo - inaccurate consonants, exaggerated tonal contours. Wyang (talk) 00:00, 8 April 2014 (UTC)
Thanks. I will listen later but I trust your judgement. BTW, many of your templates use {{Hani}} and other script templates but without a language code, they get into Category:Language code missing/scripts/Hani, etc. Could you add language codes, please? "cmn" for now but then it can be replaced with "zh" in some cases. --Anatoli (обсудить/вклад) 00:51, 8 April 2014 (UTC)
When you add "=y" (e.g. |ma=y|ca=y|ga=y|ha=y|ja=y|mna=y|wa=y|xa=y) it adds to " terms with audio links" categories but there is no link to audio. --Anatoli (обсудить/вклад) 01:01, 8 April 2014 (UTC)
They are collapsed. Wyang (talk) 01:43, 8 April 2014 (UTC)
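The parameter naming discussed in this thread can be summarised with a minimal Python sketch. It is only an illustration, not how the templates are implemented: the helper name `audio_param` is hypothetical, and the prefix list is simply copied from the parameters quoted above (|ma=, |ca=, |ga=, |ha=, |ja=, |mna=, |wa=, |xa=).

```python
# Variety prefixes exactly as they appear in the parameters quoted in this thread.
ZH_PRON_PREFIXES = ("m", "c", "g", "h", "j", "mn", "w", "x")


def audio_param(template, prefix="m"):
    """Return the audio parameter name for a template (hypothetical helper)."""
    if template == "Pinyin-IPA":
        # Mandarin-only template: the parameter is a bare "a".
        return "a"
    if template == "zh-pron":
        # Cross-topolectal template: the variety code is prefixed to "a".
        if prefix not in ZH_PRON_PREFIXES:
            raise ValueError(f"unknown variety prefix: {prefix!r}")
        return prefix + "a"
    raise ValueError(f"unknown template: {template!r}")


print(audio_param("Pinyin-IPA"))    # a
print(audio_param("zh-pron", "m"))  # ma
```

So |a=y belongs in {{Pinyin-IPA}} and |ma=y in {{zh-pron}}; mixing the two is what made the audio link disappear.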

Pinyin-IPA to zh-pron

These two templates are out of sync. How do you do erhua and alternative pronunciations? E.g. {{Pinyin-IPA|ēipiān|er=y|py=A-piān}} on A片? --Anatoli (обсудить/вклад) 02:32, 1 May 2014 (UTC)

Replace all '|' with ','


Wyang (talk) 04:02, 1 May 2014 (UTC)
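The example posted with this reply did not survive in this copy of the page, so here is a rough Python sketch of the rule "replace all '|' with ','". Wrapping the result in |m= is an assumption based on the |m= parameter used elsewhere on this page, and the function name is hypothetical. It does not handle the audio parameter, which is a separate parameter in {{zh-pron}}.

```python
def pinyin_ipa_to_zh_pron(call):
    """Rewrite a {{Pinyin-IPA|...}} call as {{zh-pron|m=...}} (illustrative only)."""
    prefix, suffix = "{{Pinyin-IPA|", "}}"
    if not (call.startswith(prefix) and call.endswith(suffix)):
        raise ValueError("not a Pinyin-IPA call")
    # Everything between the template name and the closing braces.
    args = call[len(prefix):-len(suffix)]
    # The rule from the reply above: every '|' becomes ','.
    return "{{zh-pron|m=" + args.replace("|", ",") + "}}"


print(pinyin_ipa_to_zh_pron("{{Pinyin-IPA|ēipiān|er=y|py=A-piān}}"))
```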

A片 is not a noun any more, in any language :( --Anatoli (обсудить/вклад)
It is now. :) Wyang (talk) 05:00, 1 May 2014 (UTC)
Thanks. Why didn't my addition
<includeonly>[[Category:Chinese nouns]]</includeonly>
work? --Anatoli (обсудить/вклад) 05:04, 1 May 2014 (UTC)
It should work, I think. I'm not sure why it is not working. Wyang (talk) 05:58, 1 May 2014 (UTC)


On Hakka pronunciation is not shown in collapsed mode and looks broken in the expanded mode. --Anatoli (обсудить/вклад) 02:30, 2 May 2014 (UTC)

What about IPA for Hakka? --Lo Ximiendo (talk) 22:32, 28 May 2014 (UTC)

Category names

In categories that include the language name, that name and the canonical name for the language code in the language data modules have to match- otherwise, the catboiler templates won't work. Of all the names in Module:zh-pron, Jin seems to be the only one that doesn't match: WT's canonical name is Jinyu, not Jin. That means we have to either change zh-pron to use Jinyu, or go to RFM to get the canonical name changed to Jin. Chuck Entz (talk) 05:26, 2 May 2014 (UTC)

What is RFM? I have already changed to "Jin" in Module:languages/data3/c and moved categories. Wiktionary:Grease_pit/2014/May#cjy_-_Jin_or_Jinyu.3F. --Anatoli (обсудить/вклад) 05:40, 2 May 2014 (UTC)
WT:RFM: Requests for moves, mergers and splits. Even though language codes are no longer templates, that's where we still discuss such things. You really need to get out of the habit of acting first and then thinking about the consequences. Chuck Entz (talk) 05:59, 2 May 2014 (UTC)
@Chuck Entz I posted Wiktionary:Grease_pit/2014/May#cjy_-_Jin_or_Jinyu.3F before acting. I saw that Jinyu categories were empty. What are the possible consequences apart from being told off by you? Are you aware of any active Jin/Jinyu editors? --Anatoli (обсудить/вклад) 06:09, 2 May 2014 (UTC)
As I said, this time it's not a big deal, but it's not a good practice, in general. As for "posting", you first broached the subject at 5:11, got one response at 5:19, said at 5:22 you were going to make the change, then made the change at 5:24- 13 minutes.
We're not all in the same room- it usually takes hours or even days to get people's attention. It just happens that most of the editors active in Chinese happen to be in either Australia or New Zealand, but most of the people who deal with language codes, templates and modules are in North America or Europe.
I'm not accusing you of trying to slip something by anyone- that would be completely out of character. I've never had any reason to question your intentions- just your lack of patience. Chuck Entz (talk) 07:02, 2 May 2014 (UTC)

Another variant pronunciation question

@Wyang How do I add a variant pronunciation at 芥兰 - "jièlán" and "gàilán"? See also Talk:假期 for 期 and Taiwanese variants. --Anatoli (обсудить/вклад) 11:01, 3 May 2014 (UTC)

@Atitarev You can separate the readings by comma. Please see my edit there. Wyang (talk) 11:23, 3 May 2014 (UTC)
You must have changed something because I tried a comma before. Thank you for the fixes. --Anatoli (обсудить/вклад) 11:27, 3 May 2014 (UTC)
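As an illustration of the comma convention, a value such as "jièlán,gàilán" can be thought of as a list of readings, with any key=value items treated as options. The Python sketch below shows one way to split such a value; it is an assumption about the general shape of the syntax, not the actual parsing done in Module:zh-pron, and `split_readings` is a hypothetical name.

```python
def split_readings(value):
    """Split a comma-separated parameter value into readings and options."""
    readings, options = [], {}
    for part in value.split(","):
        part = part.strip()
        if not part:
            continue  # ignore empty items such as a trailing comma
        if "=" in part:
            # Items like "er=y" are options rather than readings.
            key, _, val = part.partition("=")
            options[key] = val
        else:
            readings.append(part)
    return readings, options


print(split_readings("jièlán,gàilán"))
print(split_readings("ēipiān,er=y,py=A-piān"))
```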

Middle Chinese and Old Chinese

@Wyang Apparently there are Category:Middle Chinese language (ltc) and Category:Old Chinese language (och). I think they should get PoS categories as well after they are merged and on any new entry. --Anatoli (обсудить/вклад) 01:36, 21 May 2014 (UTC)

I don't think that's a good idea. These are phonological concepts being applied in an incorrect context. Wyang (talk) 02:07, 21 May 2014 (UTC)
I'm not sure myself. That means we are deleting the two above when the merger is complete. Or they should be moved to Appendices as reconstructed languages are done, e.g. Appendix:Proto-Slavic/voda. What do you suggest - just keeping the pronunciations, without PoS categories? I have created Wiktionary:Requests_for_moves,_mergers_and_splits#Category:Middle_Chinese_language_.28ltc.29_and_Category:Old_Chinese_language_.28och.29.
BTW, please run your AWB, when you can, there are still unconverted multisyllabic Min Nan verbs, etc. --Anatoli (обсудить/вклад) 02:21, 21 May 2014 (UTC)
@Wyang I have another idea. We can categorise terms with transliteration in Category:Middle Chinese and Category:Old Chinese - new categories without PoS info. Just to have a list of terms for which there are Old and Middle Chinese pronunciations. --Anatoli (обсудить/вклад) 00:02, 22 May 2014 (UTC)
Yes, I've made {{zh-pron}} do so. Wyang (talk) 00:24, 22 May 2014 (UTC)

Gwoyeu Romatzyh

I think this addition was an unnecessary burden. --Anatoli (обсудить/вклад) 23:04, 27 May 2014 (UTC)

@Atitarev It may be unnecessary, but how is it a burden? --kc_kennylau (talk) 10:37, 28 May 2014 (UTC)
Because we have to understand and maintain it. It's just my opinion but there are too many transliterations. Why this one, out of all? Even Wade-Giles is better known. (BTW, sorry for accidental reversals today)--Anatoli (обсудить/вклад) 14:42, 28 May 2014 (UTC)
@Atitarev No problem. I wouldn't include Wade-Giles because it is too similar to Pinyin. (Does this argument stand?) --kc_kennylau (talk) 14:56, 28 May 2014 (UTC)
Not really:) --Anatoli (обсудить/вклад) 22:20, 28 May 2014 (UTC)
What about Yale for Mandarin? --Lo Ximiendo (talk) 22:32, 28 May 2014 (UTC)
Apparently the word that is ideal in that system would be 一点儿. :) Wyang (talk) 02:32, 30 May 2014 (UTC)
  • I would also like to see Wade–Giles included. It's not that similar to Pinyin, and older English speakers and people interested in Taiwan may be more familiar with it than with Pinyin. —Aɴɢʀ (talk) 14:10, 21 August 2015 (UTC)
    • I second that. It's simply everywhere in older English-language reference works, often without the Chinese characters. I doubt there are many Chinese speakers that need this, but it would really come in handy for English-speaking casual users who are trying to find out more about words mentioned in those reference works. Chuck Entz (talk) 17:19, 21 August 2015 (UTC)
I'm also perplexed and would like to add my voice of complaint that the Wade-Giles information is being systematically removed. As a Wikipedia editor I routinely resort to somewhat older or public domain references, and they all use Wade-Giles. I don't think it was very nice to systematically overhaul Chinese character pages, which pretty much all used to give the Wade-Giles transliteration, and delete the information. The older template {{cmn-hanzi}} accommodated the wg= parameter, so this one should have as well. --Kiyoweap (talk) 04:55, 26 July 2016 (UTC)
It's in the collapsed view. Wyang (talk) 04:59, 26 July 2016 (UTC)
However, it's only in single character entries. I think we need to display Wade-Giles in all Mandarin entries. — justin(r)leung (t...) | c=› } 16:43, 27 July 2016 (UTC)

Hanzi templates and headers

I have already removed a lot of ===Hanzi=== when merging but I'm having second thoughts. They may contain alternative readings, which are not present in {{zh-pron}} for specific PoS, e.g. a pronunciation only used in a component, a rare reading. Should we keep ===Hanzi=== and {{cmn-hanzi}} (move to {{zh-hanzi}})? --Anatoli (обсудить/вклад) 23:07, 28 May 2014 (UTC)

I think we should merge the definitions into one header named "Definitions", and divide it by MC readings, not by PoS, with the help of additional templates. In that way {{zh-pron}} accounts for all readings and is used only once, whereas the L4 reading templates in Definitions account for multiple readings. Wyang (talk) 02:32, 30 May 2014 (UTC)
I haven't fully accepted your idea about "Definitions" header yet, even if I understand your point, sorry. This approach has pluses and minuses and both approaches are challenging. However, using PoS headers is more common and most people are used to it, you don't have to change anything radically. Besides, this may not be accepted by the community, including Chinese, Vietnamese, etc. editors. It may require another vote. Sorry for not fully supporting you on this one! --Anatoli (обсудить/вклад) 02:41, 30 May 2014 (UTC)

Wu Entry Transliteration Ideas

Could we be able to sort Wu entries by consonants and vowels instead of numerals? --Lo Ximiendo (talk) 00:54, 10 June 2014 (UTC)

Yes, numbers stripped. Wyang (talk) 01:03, 10 June 2014 (UTC)
Maybe we could place the numbers behind the readings instead of before them? --Lo Ximiendo (talk) 01:10, 10 June 2014 (UTC)
I think stripping all numbers would probably be better. There are words following phrasal tone sandhi rules as well, which are currently written with numbers after letters. 儂好 Wyang (talk) 03:04, 10 June 2014 (UTC)
Perhaps the transliteration without any numbers could be adopted in translations, see also's, synonyms, etc., e.g. "non hau", otherwise, complete numbers (for each syllable), e.g. "non33 hau34" should be used, which is error-prone (the only person who could do it error-free would be Wyang :)). I've got a textbook, which ignores tones. It's not perfect but accurate tone numbers could be reserved for Chinese entries. --Anatoli (обсудить/вклад) 03:44, 10 June 2014 (UTC)
How about something like "|w=zoe xiau3"? --Lo Ximiendo (talk) 16:09, 10 June 2014 (UTC)
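For what it's worth, stripping the numbers for sorting can be sketched in a few lines of Python. This is only an illustration of the idea discussed above, not the code used by the module, and `wu_sort_key` is a hypothetical name. It removes every digit wherever it appears, so it works whether the numbers come before or after the letters.

```python
import re


def wu_sort_key(reading):
    """Return a Wu romanisation with all tone numbers removed, for sorting."""
    stripped = re.sub(r"\d+", "", reading)
    # Collapse any whitespace left behind by the removed numbers.
    return " ".join(stripped.split())


print(wu_sort_key("non33 hau34"))  # non hau
print(wu_sort_key("zoe xiau3"))    # zoe xiau
```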

Numbered pinyin, Jyutping, Wade-Giles with superscript?

@Kc kennylau, @Wyang Can numbered pinyin, Jyutping and Wade-Giles (if introduced) use superscript numbers? E.g. gwok3 in 國? I don't see why we need the numbered pinyin hyperlinked; just displaying guo2 in monosyllabic entries is sufficient, IMHO. (There's some problem with the expand button in 國.) --Anatoli (обсудить/вклад) 00:09, 18 June 2014 (UTC)

All seem to be superscripted now. I don't seem to have trouble expanding zh-pron at 國. Wyang (talk) 00:29, 18 June 2014 (UTC)
Thank you. The button seems at a lower than usual position, not at the top but almost the middle of the box. It's not a big deal, though. --Anatoli (обсудить/вклад) 00:33, 18 June 2014 (UTC)
While you're at it, could you remove the hyperlink to the numbered pinyin? They are not maintained and are getting out of sync with toned pinyin. --Anatoli (обсудить/вклад) 00:35, 18 June 2014 (UTC)
Umlaut is turned into ��, as on , , , , etc. However, in the link it's fine. Nibiko (talk) 05:16, 21 February 2015 (UTC)
@Kc kennylau Would you know of a way to fix this without removing the tt syntax? Thanks. Wyang (talk) 11:09, 22 February 2015 (UTC)
@Wyang Fixed. --kc_kennylau (talk) 14:29, 22 February 2015 (UTC)
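A small Python sketch of what "superscripting the tone numbers" amounts to, for readers following this thread. It is not the module's code: the function name is hypothetical, and the use of `<sup>` tags is an assumption about the output markup. The character class includes ü so that syllables with an umlaut are handled like any other.

```python
import re

# Digits (optionally with a changed tone such as "6-2") that directly follow a letter.
TONE_NUMBER = re.compile(r"(?<=[A-Za-zü])(\d+(?:-\d+)*)")


def superscript_tones(romanisation):
    """Wrap trailing tone numbers in <sup> tags (illustrative only)."""
    return TONE_NUMBER.sub(r"<sup>\1</sup>", romanisation)


print(superscript_tones("gwok3"))        # gwok<sup>3</sup>
print(superscript_tones("ji1 jyun6-2"))  # ji<sup>1</sup> jyun<sup>6-2</sup>
```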

β粒子 and other terms written in multiple scripts

What should be the format (pinyin, jyutping) for terms written in multiple scripts, such as β粒子? The module will obviously crash if Latin, Greek, etc. letters are not replaced with standard transliteration. --Anatoli (обсудить/вклад) 03:10, 23 June 2014 (UTC)

For Mandarin, it could be |m=bèitǎ lìzi (贝塔粒子) but there are other words for which pinyin and jyutping may be unknown. --Anatoli (обсудить/вклад) 03:15, 23 June 2014 (UTC)

Template currently broken

The current template requires the following to be displayed at Shanghai:

It should read Lua error in Module:yue-pron at line 101: Please do not capitalize the Jyutping.

or possibly Lua error in Module:yue-pron at line 101: Please do not capitalize the Jyutping.

but both of those currently give "module errors". I'm not sure what in the script could cause it to get so buggy when properly capitalized and hyphenated Cantonese and Shanghainese are included, but whatever it is needs fixing. — LlywelynII 13:06, 7 July 2014 (UTC)

Jyutping does not capitalise proper nouns (see how the article Jyutping treats "jyut6 ping3"). The Wiktionary romanisation of Wu does not capitalise proper nouns either and does not make use of hyphens. For Jyutping, normal numbers are used for tone numbers, since the original Jyutping scheme does not actually make tone numbers superscripts (see the link above). Making them superscripts is a modification of the original scheme adopted by Wiktionary and some other sites. Normal numbers are also easier to type. Wyang (talk) 00:05, 8 July 2014 (UTC)
Capitalisation of Jyutping should also be disabled in zh-usex. --Anatoli (обсудить/вклад) 00:09, 8 July 2014 (UTC)
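To make the rule concrete, here is a hedged Python sketch of the kind of check that produces the error quoted above. It is not the code at line 101 of Module:yue-pron; the function name and the syllable pattern are assumptions, based only on the points made in this thread (lower case, plain tone numbers, no hyphens between syllables) and on Jyutping having six tone numbers.

```python
import re

# Letters followed by a tone number 1-6, optionally with a changed tone ("6-2").
JYUTPING_SYLLABLE = re.compile(r"[a-z]+[1-6](-[1-6])?")


def check_jyutping(text):
    """Validate space-separated Jyutping and return it unchanged (illustrative only)."""
    if text != text.lower():
        raise ValueError("Please do not capitalize the Jyutping.")
    for syllable in text.split():
        if not JYUTPING_SYLLABLE.fullmatch(syllable):
            raise ValueError(f"invalid Jyutping syllable: {syllable!r}")
    return text


print(check_jyutping("jyut6 ping3"))
```

Under this sketch a capitalised or hyphenated input raises an error instead of being silently accepted, which matches the behaviour reported at Shanghai.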

Why does this categorise in part-of-speech categories?

It shouldn't be doing this. The part of speech should be handled by the headword template. —CodeCat 13:12, 7 July 2014 (UTC)

I wonder why you are asking now, when it's been used like that for a long time by a very large number of entries, which have been converted to use {{zh-pron}}. I asked a while ago on GP about sorting in {{zh-noun}} and I thought you knew it all along. All categorisation and sorting are done by this template and its modules. User:Wyang could explain this better - it was his idea and design - but this template contains pronunciations for various Chinese topolects and as soon as a pronunciation is given (transliteration or audio file), it adds the entry to PoS categories for that topolect and they are sorted by the transliteration, e.g. 醫院/医院 (yīyuàn) has 5 topolects and one PoS category. A template like {{zh-noun}} would require some complex logic to do that. Also pinging @Kc kennylau who has been taking an active part in the development and the use. --Anatoli (обсудить/вклад) 23:43, 7 July 2014 (UTC)
I'm asking now because I am adding the lemma categories to {{head}}, but I'm finding that a number of Chinese entries have no part of speech specified at all, which prevents categorisation. I still don't understand why part of speech categories are added in the pronunciation section; what does the PoS have to do with pronunciation at all? Why not use normal headword templates like any other language? —CodeCat 00:10, 8 July 2014 (UTC)
I'll try to explain again. The overwhelming majority of Chinese words use the same characters but have different pronunciations in topolects and dialects, so 醫院 is just a Chinese word for "hospital". "yīyuàn" is Mandarin transliteration, "ji1 jyun6-2" is Cantonese, etc., without pronunciation "ji1 jyun6-2", there is no point in adding 醫院 to Category:Cantonese nouns because it wouldn't contain anything Cantonese. 噉样 is a Cantonese specific term, it's not used in Mandarin, there is no pronunciation for Mandarin, so it's not added to any Mandarin PoS categories. Potentially, "zh" headword templates could be used for Chinese PoS categorisations, which is also handled nicely by this and other PoS templates. --Anatoli (обсудить/вклад) 00:23, 8 July 2014 (UTC)
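The logic described above can be sketched roughly as follows in Python. This is not the make_cat function in Module:zh-pron, only an illustration under stated assumptions: the function name and the input format (a mapping from topolect name to reading) are hypothetical.

```python
def pos_categories(pos_plural, pronunciations):
    """List PoS categories for an entry, one per topolect with a reading."""
    categories = ["Chinese " + pos_plural]
    for topolect, reading in pronunciations.items():
        # A topolect only gets a category if a pronunciation was supplied.
        if reading:
            categories.append(f"{topolect} {pos_plural}")
    return categories


print(pos_categories("nouns", {
    "Mandarin": "yīyuàn",
    "Cantonese": "ji1 jyun6-2",
    "Wu": "",  # no reading given, so no Wu category
}))
```

This is why a Cantonese-only term such as 噉样 never ends up in a Mandarin category: there is no Mandarin reading to trigger it.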
Ok, but why do we even have Category:Cantonese nouns? Wasn't the whole point of the merger to get rid of the more specific categories and have only Category:Chinese nouns? —CodeCat 00:40, 8 July 2014 (UTC)
No, you misunderstood the purpose. How can users find Cantonese pronunciations, usage examples? They can't assume that every Chinese entry will have Cantonese Jyutping, it's not automatic but it's now made easy to add contents in at least 5 topolects + Old and Middle Chinese. Chinese topolects are now thriving with the merger. Cantonese nouns have grown tenfold, with IPA, usage examples and proper transliterations. Wu has grown from nearly nothing to a few hundred. There is work going for Old Chinese and Middle Chinese. Hakka and Min Nan entries are improved and increased. --Anatoli (обсудить/вклад) 00:52, 8 July 2014 (UTC)
But they're really just Chinese entries with a Cantonese transliteration in the pronunciation section. Does that really merit a separate Category:Cantonese nouns? Why not Category:Chinese entries with Cantonese pronunciation? —CodeCat 01:09, 8 July 2014 (UTC)
The exact categorisation and formatting may not have been thoroughly thought through and discussed but it's now accepted by Chinese editors (natives and learners). I personally see no problem with the usual Category:Cantonese nouns, which may contain other topolects as well. Well, only those who supported and understood the merger discussed and took part in it. The opponents didn't suggest anything constructive. --Anatoli (обсудить/вклад) 01:15, 8 July 2014 (UTC)
Pronunciation is not the only Cantonese content on those pages. Wyang (talk) 01:19, 8 July 2014 (UTC)
I don't object to Chinese editors working with it and understanding how it works. But it's a problem when it comes to editors like me who are not familiar with the Chinese practices. It's a real headache. Furthermore, there are a lot of technical difficulties because the way templates and modules are being used deviates so strongly from how the equivalents in other languages work. That's not a problem if the languages' stuff is maintained by its own set of editors, but it's confusing when it comes to points where the language-specific stuff meshes with general templates, like {{head}}, which I am currently working on to allow proper categorisation of all lemmas and non-lemma forms. If Chinese handles part-of-speech categories in a totally different way, then all of that breaks down, and it's a real mess for me to make it work for Chinese. —CodeCat 01:24, 8 July 2014 (UTC)
Could you describe the challenges and Wyang or Kenny, who are technically better than me, can try to help? --Anatoli (обсудить/вклад) 01:29, 8 July 2014 (UTC)
The primary problem is that Module:headword/templates contains a list of recognised parts of speech that I am working on. As part of this, I'm trying to ensure that {{head}} always has a second parameter, so that the template is able to categorise it properly. However, there is currently the template {{zh-pos}} which does not give a POS, and it's used in quite a few entries. Furthermore, because the {{zh-pron}} template is not a headword line template that can use {{head}}, it entirely bypasses this, so Category:Cantonese lemmas will not be populated by it. —CodeCat 01:36, 8 July 2014 (UTC)
This could be easily done by modifying the make_cat function in Module:zh-pron. Done now. Wyang (talk) 01:39, 8 July 2014 (UTC)
Well in that case, there would need to be a separate function in Module:headword that is exported for Module:zh-pron to use, just for categorising into lemma/POS categories. —CodeCat 01:42, 8 July 2014 (UTC)
E/C: What you did now doesn't actually work the way it should. Now, not just lemmas will be categorised, but also non-lemma forms. That is why I am creating the list of POSs in the first place, so that the template knows what parts of speech are lemmas and which aren't. It also seems that it's categorising this talk page, so something is clearly wrong. —CodeCat 01:44, 8 July 2014 (UTC)
Not sure if it matters but Chinese is not an inflected language and every Chinese (also Vietnamese, Thai, Lao, etc.) entry is a lemma. Should phrases, idioms, etc. be broken apart? --Anatoli (обсудить/вклад) 01:54, 8 July 2014 (UTC)
Idiom is not a part of speech in any case. Rather, other parts of speech can be optionally considered idioms. —CodeCat 01:57, 8 July 2014 (UTC)
(E/C) I asked because idioms get entries and have headers, like {{zh-idiom}}. Does my comment answer your question? Every Chinese term that merited an entry is a lemma. --Anatoli (обсудить/вклад) 02:05, 8 July 2014 (UTC)
Why are idioms not lemmas? Chinese idioms are as lemma-like as nouns, verbs, adjectives, ... Wyang (talk) 02:01, 8 July 2014 (UTC)
I was just checking, if idioms, proverbs, phrases in general (not just Chinese) are considered lemmata, sorry if it was a silly question. "Lemma - the canonical form of an inflected word" and phrases (and many idioms) are not words. --Anatoli (обсудить/вклад) 02:05, 8 July 2014 (UTC)
I didn't say they weren't lemmas. I said that idiom is not a part of speech. "Phrase" is, but "idiom" isn't, nor is "proverb". Part of speech relates only to the use of the word in a sentence, to syntax. And idiomatic phrases act like any other phrase, and are therefore not parts of speech in themselves. They are just phrases that happen to be idioms. —CodeCat 02:08, 8 July 2014 (UTC)
Phrase is a part of speech, and is also a lemma because it's not an inflected form of a lemma. But idiom is not a lemma because it's not even a part of speech. —CodeCat 02:11, 8 July 2014 (UTC)
To me, idioms in Chinese (e.g. 大驚小怪) do not behave any differently from nouns, verbs and adjectives. Lemmas are clearly a concept stemming from inflecting languages, as are the headword templates themselves, and the idea that word senses should always be split by part of speech. I'm not sure whether such a distinction of lemmas and non-lemmas is traditionally made for inflecting languages, but personally I think carrying this distinction over to non-inflecting languages would be an unnecessary complication. Wyang (talk) 03:46, 8 July 2014 (UTC)
Based on the definition in the entry you gave, that should be labelled "verb", not "idiom". —CodeCat 11:44, 8 July 2014 (UTC)
It's also a noun, adjective, adverb. Wenlin dictionary (software, based on ABC dictionary) just gives it as f.e - "fixed expression". --Anatoli (обсудить/вклад) 12:20, 8 July 2014 (UTC)
Then why doesn't the entry say that? —CodeCat 12:23, 8 July 2014 (UTC)
I have just added examples of noun, adjective and adverb usages. Why? It's actually endless. Many Chinese words behave that way - they are used in various functions. Dictionaries just make arbitrary choices about parts of speech to make it a bit easier for foreign learners. It's even more complicated with single-character words. That's why our current translingual sections have vague definitions without the part of speech info. --Anatoli (обсудить/вклад) 12:37, 8 July 2014 (UTC)
This kind of Chinese exceptionalism is aggravating me to be honest. Chinese has parts of speech just like other languages, as those concepts are common to all human languages and even wired into our brains. I don't see why Chinese should be treated differently from other languages. In other languages, if words have more than one part of speech, we list them all. The same can easily be done for Chinese as well. —CodeCat 12:42, 8 July 2014 (UTC)
There is no Chinese exceptionalism. Chinese also has parts of speech but they are often ignored or shown only partially in dictionaries. Editors, dictionary publishers make choices but other editors do it differently. I don't know, e.g. why 那邊 is shown as adverb and noun. It's also used as a postposition, Wenlin has it as "place word" and pronoun! Languages, which were originally monosyllabic and completely lack inflections have this in common. If you dig deeper into Vietnamese, Burmese, Thai, Lao, etc. they are very similar in this respect. It's possible to classify them comprehensively but too damn hard. --Anatoli (обсудить/вклад) 13:01, 8 July 2014 (UTC)
Another example - 以后. Two reputable dictionaries list them with different PoS - Oxford Chinese dictionary - as a noun (名), ABC dictionary lists it as an adverb (adv.). And Pleco dictionary simply omits PoS info altogether but gives extensive examples. The choice is arbitrary, whatever suits better in a current situation. Sorry if it's aggravating you. --Anatoli (обсудить/вклад) 13:23, 8 July 2014 (UTC)

nǐhǎo or níhǎo

Correct pronunciation of 你好 is níhǎo but the other form (root tones) is used here on Wiktionary in zh-pron. On the page Wiktionary:About_Chinese#Tone_sandhi there is a description of it. The text was written before the zh-pron template was introduced and is about the inflection template. I think what has happened is that the information from the inflection template has been copied to zh-pron. I think we need to update the info in zh-pron. There are very clear rules about pronunciation so I think a bot can make the update. What do you think? Kinamand (talk) 09:09, 9 September 2014 (UTC)

Sorry but I don't understand what you mean. Wyang (talk) 11:18, 9 September 2014 (UTC)
Have you read the section Tone sandhi on About Chinese which I link to in my question? There are two ways to convert 你好 into pinyin: converted tones (níhǎo) or root tones (nǐhǎo). Notice the different tone on the first syllable. Both ways are used in dictionaries. The first way follows the correct pronunciation. Currently we use the other conversion in zh-pron and I think that is wrong. You can also read about it on wikipedia:[[1]]. Kinamand (talk) 12:30, 9 September 2014 (UTC)
Only original tones are standard in Pinyin orthography. Wyang (talk) 23:51, 9 September 2014 (UTC)
Have you read the section Tone sandhi on About Chinese which I link to in my question? Your link is a personal site made by a guy named Mark Swofford which states some rules without giving any source or reason. On the page I link to they link to two dictionaries. The one which uses the standard I think is most logical is HSK, and HSK is supported by the Ministry of Education of the People's Republic of China, so it must carry much more weight than the personal site you link to. Kinamand (talk) 06:34, 10 September 2014 (UTC)
Mandarin tone sandhi is a common knowledge. One needs to know the expected tone changes but nǐhǎo is the standard pinyin, not níhǎo, which is reflected in most standard dictionaries, including HSK. It is possible to include additionally the phonetic pinyin but that's another story. --Anatoli T. (обсудить/вклад) 06:46, 10 September 2014 (UTC)
Here is the link by the Ministry of Education of the People's Republic of China: Basic rules of the Chinese phonetic alphabet orthography (pg. 14 in the pdf). Wyang (talk) 20:50, 10 September 2014 (UTC)
@Kinamand note that IPA reflects the tone sandhi: /ni²¹⁴⁻³⁵ xɑʊ̯²¹⁴⁻²¹⁽⁴⁾/. --Anatoli T. (обсудить/вклад) 23:55, 9 September 2014 (UTC)
I know that IPA reflects tone sandhi but I have never heard of people learning Chinese pronunciation from IPA. Every textbook about Chinese I have seen uses pinyin. Kinamand (talk) 06:34, 10 September 2014 (UTC)
I have now tried to google nǐhǎo and níhǎo. Nǐhǎo seems to be used far more often than níhǎo. So maybe we should just keep it and write that in the documentation. Do you know if there exists an official standard for pinyin maintained by the Ministry of Education of the People's Republic of China or another big authority? Kinamand (talk) 06:39, 10 September 2014 (UTC)
(edit conflict)Yes, learning pinyin includes learning tone sandhi. If a learner doesn't know how to read pinyin correctly, taking into account tone sandhi, it's a flaw in learning, not in pinyin. --Anatoli T. (обсудить/вклад) 06:46, 10 September 2014 (UTC)
Standard pinyin uses nominal pinyin, not the actual pronunciation. It's basics, taught at HSK Basic level. You can check any HSK references, textbooks or various dictionaries - ABC (Wenlin software), Pleco, CEDIC, Nciku, MDBG, etc. Also, mainland China's and Taiwan's systems coincide on this. --Anatoli T. (обсудить/вклад) 06:50, 10 September 2014 (UTC)
Our "About Chinese" page says: "Some Mandarin dictionaries are inconsistent when it comes to depicting tone sandhi in Pinyin.". Can you correct the text on our "About Chinese" page with your info so that it is clear how we do it here in wiktionary? And many thanks for your answer :-) Kinamand (talk) 07:58, 10 September 2014 (UTC)
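For anyone reading this thread later, the difference between the nominal form (nǐhǎo) and the spoken form (níhǎo) can be illustrated with a small Python sketch of third-tone sandhi on numbered pinyin. It is a deliberate simplification: it changes a third tone to a second tone whenever the next syllable is nominally third tone, and ignores the prosodic grouping that matters in longer runs of third tones. The function name is hypothetical, and nothing here is how {{zh-pron}} generates its IPA.

```python
def apply_third_tone_sandhi(syllables):
    """Turn tone 3 into tone 2 before another tone 3 (simplified rule)."""
    result = list(syllables)
    for i in range(len(syllables) - 1):
        # Compare against the nominal (original) tones, not the changed ones.
        if syllables[i].endswith("3") and syllables[i + 1].endswith("3"):
            result[i] = syllables[i][:-1] + "2"
    return result


print(apply_third_tone_sandhi(["ni3", "hao3"]))  # ['ni2', 'hao3']
print(apply_third_tone_sandhi(["li4", "shi3"]))  # ['li4', 'shi3']
```

The input is the standard (nominal) spelling that the template expects; the output corresponds to the pronunciation already reflected in the IPA.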

Separate languages

Cantonese, Hakka, Mandarin, etc., are separate and different languages. It's meaningless to merge the sections. Please undo the merge. Thanks. —This unsigned comment was added by (talk). 19:18, 5 October 2014 (UTC)

You'll need to make a stronger case if you want to convince all the editors here to undo hundreds of hours of their work. —CodeCat 19:45, 5 October 2014 (UTC)
Spoken Cantonese, Hakka, Mandarin, etc., are indeed separate and different languages, but as written with Han characters, they're dialects of written Chinese. This split in nature between the spoken and written languages means that neither merged nor separate approaches will be without problems, but the current approach is what we arrived at after extensive discussion, and I don't think anyone would want to change it again without really compelling reasons.
Before this, we tried having everything with separate language sections, but most of the non-Mandarin sections were either empty or had exactly the same definitions as the Mandarin sections. This way, we have the writing merged, but can provide information about the differences in pronunciation and grammar, among other things, that make the spoken languages distinct. It's not perfect, but it's much better than it was before. Chuck Entz (talk) 22:28, 5 October 2014 (UTC)
Yes, nobody's going to undo changes, especially after anonymous comments. There's no information loss and Chinese topolects can now be added, including terms specific to topolects. They are treated equally. Languages or dialects is a political topic, we deal with information here. 歷史 is a Chinese word. Mandarin, Cantonese, Hakka, Min Nan, Wu are different ways to pronounce it. --Anatoli T. (обсудить/вклад) 00:13, 6 October 2014 (UTC)

Parameter for Taishanese needed

There seems to be call for it. — I.S.M.E.T.A. 16:47, 2 April 2015 (UTC)

@Wyang, Justinrleung: I tried making a module for Taishanese: link. —suzukaze (tc) 07:32, 29 June 2016 (UTC)
I've noticed, and it's exciting that you've put effort into it. (My paternal grandparents are Taishanese, but my dad doesn't speak it.) There aren't many resources for Taishanese out there, so it might be hard to have much coverage. However, I'd like to see it added to Wiktionary. — justin(r)leung (t...) | c=› } 08:56, 29 June 2016 (UTC)
I second what Justin said. Well done on the module and I look forward to it being incorporated in the template. Wyang (talk) 12:17, 29 June 2016 (UTC)
It's not very scholarly so I thought it might need a bit of review before it goes live, especially regarding romanization. Feedback is welcome of course. —suzukaze (tc) 00:06, 30 June 2016 (UTC)
Xiaoxuetang clearly marks pronunciations as "Taicheng, Taishan, Siyi", unlike other sources (as far as I can tell); should it become the basis for the romanization? —suzukaze (tc) 07:10, 30 June 2016 (UTC)
Yes, it probably should. Taicheng seems to be the standard for Taishanese. — justin(r)leung (t...) | c=› } 07:34, 30 June 2016 (UTC)
Done, and Wiktionary:About Chinese/Cantonese/Taishanese has been provisionally set up (the rows are not in an ideal order but that's not of major concern at the moment...). I don't know what else needs tweaking right now. —suzukaze (tc) 09:21, 2 July 2016 (UTC)
Don't forget about this... Feel free to complete and integrate the module into zh-pron in my absence. @Wyang, Justinrleung. —suzukaze (tc) 01:57, 11 July 2016 (UTC)
No problem - I will try to garner some Taishanese references and replace those bigrams first when free... Wyang (talk) 02:02, 11 July 2016 (UTC)

I added Taishanese-to-IPA in Module:yue-pron. A list of Stephen Li's words is at Module talk:User:Suzukaze-c/04 (I have tidied his original data up quite extensively to produce this page, still there are inconsistencies in the notation). Three things need to be discussed: (1) pronunciation of prenasalised consonants, (2) pronunciation of 'y' (/ʒ/ or /j/), and (3) pronunciation of 'ia/ie' and 'au'. This is a very useful overview on the various Siyi dialects: [2], again written by the legendary Wang Li. Wyang (talk) 09:56, 12 July 2016 (UTC)

Cross referencing the Stephen Li data with http://xiaoxue.iis.sinica.edu.tw/yueyu may also be necessary as it is unclear what dialect Li speaks while Xiaoxuetang has pronunciations marked as Taicheng. —suzukaze (tc) 19:49, 12 July 2016 (UTC)

Hakka Pha̍k-fa-sṳ and Pe̍h-ōe-jī

What's the difference between Pha̍k-fa-sṳ and Pe̍h-ōe-jī in terms of Hakka? Isn't Pe̍h-ōe-jī for Min Nan? If so, why is Pe̍h-ōe-jī listed as one of the romanizations for Hakka? Justinrleung (talk) 05:14, 13 June 2015 (UTC)

w:POJ#Adaptations for other languages or dialects —suzukaze (tc) 06:50, 15 June 2015 (UTC)
I understand, but aren't Pha̍k-fa-sṳ and Pe̍h-ōe-jī the same in the context of Hakka? Why are there two different parameters (pfs and poj)? Justinrleung (talk) 06:58, 15 June 2015 (UTC)
@Wyang: ? —suzukaze (tc) 07:07, 15 June 2015 (UTC)
We have to use pfs for Hakka to make templates work well. [3] has common characters but sometimes you have to search elsewhere. --Anatoli T. (обсудить/вклад) 11:39, 15 June 2015 (UTC)
Maybe I wasn't clear enough before. I know pfs is used for Hakka, but why is poj also given as a valid option for Hakka romanization? Justinrleung (talk) 21:54, 15 June 2015 (UTC)
@Wyang, Atitarev He is stating that POJ and PFS are equivalent (different name for the same system) and should not be separated as two parameters. --kc_kennylau (talk) 06:31, 16 June 2015 (UTC)
PFS and POJ are different systems. POJ should be removed, so should the code "pfs=". Wyang (talk) 01:18, 20 June 2015 (UTC)


There seem to be two problems with the Hakka part of the template:

  1. IPA is not displayed for long words.
  2. There can't be more than one pronunciation displayed.

e.g. 馬來西亞

Justinrleung (talk) 05:26, 13 June 2015 (UTC)

@Justinrleung Thank you. I have modified it to allow IPA to be displayed for long words. --kc_kennylau (talk) 06:39, 16 June 2015 (UTC)
I don't want to be a spoilsport but shouldn't it be
  • IPA(key): /ma²⁴ lo̯i¹¹ ɕi²⁴⁻¹¹ a/, /ma²⁴ lo̯i¹¹ ɕi²⁴ a³¹/
and not
  • IPA(key): /ma²⁴ lo̯i¹¹ ɕi²⁴⁻¹¹ a,ma²⁴ lo̯i¹¹ ɕi²⁴ a³¹/
? —suzukaze (tc) 06:52, 16 June 2015 (UTC)
In fact, it should be IPA(key): /ma²⁴ lo̯i¹¹ ɕi²⁴⁻¹¹ a²⁴/, /ma²⁴ lo̯i¹¹ ɕi²⁴ a³¹/. Justinrleung (talk) 07:00, 16 June 2015 (UTC)

(resolved) —suzukaze (tc) 05:42, 20 March 2016 (UTC)
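The formatting asked for in this thread, with each reading in its own pair of slashes, can be sketched in Python. This is only an illustration and not the template's actual code: `format_ipa` is a hypothetical helper, and it assumes the readings arrive as a single comma-separated string.

```python
def format_ipa(readings: str) -> str:
    """Wrap each comma-separated reading in its own /.../ pair."""
    # Split on commas and drop empty pieces, so "a,b" yields two readings.
    parts = [part.strip() for part in readings.split(",") if part.strip()]
    # Join as "/a/, /b/" instead of the single "/a,b/" shown above.
    return ", ".join(f"/{part}/" for part in parts)


print(format_ipa("ma²⁴ lo̯i¹¹ ɕi²⁴⁻¹¹ a²⁴,ma²⁴ lo̯i¹¹ ɕi²⁴ a³¹"))
```

Keeping the split and the wrapping in one place means the comma never ends up inside the slashes.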

Min Nan

A few problems in Min Nan:

  1. o͘ in POJ does not convert to oo in Tâi-lô
  2. ch in POJ does not convert to ts in Tâi-lô
  3. no IPA when there is more than one word

e.g. 內蒙古自治區／内蒙古自治区 (Nèiměnggǔ Zìzhìqū)

  • Min Nan
    • (Hokkien)
      • Pe̍h-ōe-jī: Lāi-bông-kó͘ Chū-tī-khu
      • Tâi-lô: Lāi-bông-kóo Tsū-tī-khu
      • Phofsit Daibuun: laixbongkor zuxdixqw
      • IPA (Xiamen): /laɪ²²⁻²¹ bɔŋ²⁴⁻²² kɔ⁵³ t͡su²²⁻²¹ ti²²⁻²¹ kʰu⁴⁴/
      • IPA (Quanzhou): /laɪ⁴¹⁻²² bɔŋ²⁴⁻²² kɔ⁵⁵⁴ t͡su⁴¹⁻²² ti⁴¹⁻²² kʰu³³/
      • IPA (Zhangzhou): /laɪ²²⁻²¹ bɔŋ¹³⁻²² kɔ⁵³ t͡su²²⁻²¹ ti²²⁻²¹ kʰu⁴⁴/
      • IPA (Taipei): /laɪ³³⁻¹¹ bɔŋ²⁴⁻¹¹ kɔ⁵³ t͡su³³⁻¹¹ ti³³⁻¹¹ kʰu⁴⁴/
      • IPA (Kaohsiung): /laɪ³³⁻²¹ bɔŋ²³⁻³³ kɔ⁴¹ t͡su³³⁻²¹ ti³³⁻²¹ kʰu⁴⁴/

~ Justinrleung (talk) 21:51, 20 June 2015 (UTC)

All fixed. Wyang (talk) 05:34, 21 June 2015 (UTC)
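For reference, the first two conversions reported above (o͘ to oo, ch to ts) can be sketched in Python. This is not the module's code: `poj_to_tailo_partial` is a hypothetical name, and the sketch deliberately covers only these two rules, leaving out the other POJ and Tâi-lô differences and the IPA step.

```python
import re
import unicodedata

DOT_ABOVE_RIGHT = "\u0358"  # the combining dot in POJ o͘


def poj_to_tailo_partial(poj: str) -> str:
    """Apply only the o͘ -> oo and ch -> ts rules discussed above."""
    # Decompose so tone marks and the dot become separate combining characters.
    text = unicodedata.normalize("NFD", poj)
    # o + (tone marks) + dot  ->  o + (tone marks) + o, so kó͘ becomes kóo.
    text = re.sub(
        "([oO])([\u0300-\u0357\u0359-\u036f]*)" + DOT_ABOVE_RIGHT,
        lambda m: m.group(1) + m.group(2) + "o",
        text,
    )
    # ch -> ts (which also turns chh into tsh), keeping the capitalisation.
    text = re.sub("[cC]h", lambda m: "Ts" if m.group(0)[0] == "C" else "ts", text)
    # Recompose so the result uses precomposed letters where they exist.
    return unicodedata.normalize("NFC", text)


print(poj_to_tailo_partial("Lāi-bông-kó͘ Chū-tī-khu"))
```

Working on the decomposed form keeps the tone mark on the first o, which is where the Tâi-lô spelling in the example above has it.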

Polysyllabic characters

The Pinyin with numbers seems to come out strange for and (shuangxi1 and tushuguan1). —suzukaze (tc) 17:21, 25 June 2015 (UTC)

Hakka tones for ng in IPA

There seems to be something wrong with the tones in IPA for ng in Hakka.

(should be /ŋ̍¹¹/)

(should be /ŋ̍³¹/)

Justinrleung (talk) 04:55, 27 July 2015 (UTC)

Fixed. (ǹg and ńg used to produce /ŋ̍⁵⁵/) Justinrleung (talk) 04:57, 7 October 2015 (UTC)
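One plausible way such a bug arises is a tone lookup that only inspects vowels, so a mark sitting on the n of syllabic ng is missed and the syllable falls through to the unmarked tone. The Python sketch below is an assumption about the mechanism, not the module's code; `pfs_tone` is a hypothetical helper, and it only covers the Sixian tone values that appear on this page.

```python
import unicodedata

# Sixian tone values for the PFS tone marks seen in this discussion.
# Checked (entering-tone) syllables are left out of this sketch.
TONE_BY_MARK = {
    "\u0302": "²⁴",  # circumflex, as in kâ
    "\u0300": "¹¹",  # grave, as in ǹg
    "\u0301": "³¹",  # acute, as in ńg
}
UNMARKED_TONE = "⁵⁵"


def pfs_tone(syllable: str) -> str:
    """Return the tone value for one unchecked PFS syllable."""
    # Decomposing first exposes the mark whether it sits on a vowel or on n.
    for char in unicodedata.normalize("NFD", syllable):
        if char in TONE_BY_MARK:
            return TONE_BY_MARK[char]
    return UNMARKED_TONE


print(pfs_tone("ǹg"), pfs_tone("ńg"))
```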

Template does not function properly in conjunction with template:wikipedia

So you know how this template has a little button in the top right corner that says "Expand"? Well, if Template:zh-pron is used in conjunction with multiple Template:wikipedia, then that button gets misplaced, like on this page. VulpesVulpes42 (talk) 17:34, 5 March 2016 (UTC)

Is that what's causing it? D: —suzukaze (tc) 09:49, 6 March 2016 (UTC)
@suzukaze-c Seems like it. Remove the Wikipedia templates, and the problem is gone. Put them back, and the "Expand" button becomes displaced once again. - VulpesVulpes42 (talk) 14:41, 6 March 2016 (UTC)
@suzukaze-c Oh, and also: the number of Wikipedia templates seems to affect how displaced the "Expand" button gets. On this page, there is only one Wikipedia template, but on this page, there are as many as seven templates. Observe how the "Expand" button is much closer to its intended position on the page that has only one Wikipedia template, compared to the other page. - VulpesVulpes42 (talk) 14:50, 6 March 2016 (UTC)
@suzukaze-c, VulpesVulpes42 I've observed this too, but just didn't bring up the issue. — justin(r)leung (t...) | c=› } 01:56, 7 March 2016 (UTC)
@suzukaze-c, VulpesVulpes42 Actually, I think it occurs with anything on the right. For example, in 中華民國, the image is also causing the Expand button to shift down. — justin(r)leung (t...) | c=› } 19:51, 11 March 2016 (UTC)
@Justinrleung You seem to be right about that! Now, with these observations in mind, is there a possibility for the bug to be fixed? I personally do not have the programming knowledge necessary to do that myself. - VulpesVulpes42 (talk) 08:05, 12 March 2016 (UTC)
@Wyang, Kc kennylau Is there any solution to this? — justin(r)leung (t...) | c=› } 19:56, 12 March 2016 (UTC)
I'm quite bad with formatting :( Wyang (talk) 21:34, 12 March 2016 (UTC)
Dixtosa's application of {{floatright-top}} and {{floatright-top}} to seems to have had an effect, but there must be a better way to avoid this than adding the two templates to every entry. —suzukaze (tc) 22:39, 12 March 2016 (UTC)

Dialectal data

@Wyang Is it possible to do something like MC/OC, where we can choose the pronunciation when there is more than one? For example, in , there are three pronunciation sections, but the dialectal data is showing the two sets of pronunciations under each pronunciation section. — justin(r)leung (t...) | c=› } 06:37, 7 May 2016 (UTC)

Yep certainly, should be implemented now. Wyang (talk) 06:57, 7 May 2016 (UTC)
Thanks! — justin(r)leung (t...) | c=› } 06:59, 7 May 2016 (UTC)

RFDO discussion: June 2016

The following discussion has been moved from Wiktionary:Requests for deletion/Others (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.

The template can become hard to read when there are too many pronunciations listed especially on mobile. Is there any sort of reason that we can't just have each pronunciation listed as a separate subsection on each page?--Prisencolin (talk) 00:59, 9 June 2016 (UTC)

Struck as an invalid reason for deletion. However, I agree with your opinion. —suzukaze (tc) 05:46, 9 June 2016 (UTC)


Thinking about rewriting this atm, to make it more "holistic". Perhaps a single collapsed table for all pron, dial, mc and oc, similar to {{th-pron}}. Also to add: expected Mandarin reading from MC. Wyang (talk) 21:39, 13 June 2016 (UTC)

Take One


  • Mandarin (Beijing)+
    • Pinyin: guójiā
    • Zhuyin: ㄍㄨㄛˊ ㄐㄧㄚ
    • Gwoyeu Romatzyh: gwojia
    • IPA (key): /ku̯ɔ³⁵ t͡ɕi̯a̠⁵⁵/
  • Cantonese (Guangzhou)+
    • Jyutping: gwok3 gaa1
    • Yale: gwok gā
    • Cantonese Pinyin: gwok8 gaa1
    • IPA (key): /kʷɔːk̚³ kɑː⁵⁵/
  • Hakka (Sixian)
    • Pha̍k-fa-sṳ: koet-kâ
    • Hakka RS: gued` ga´
    • IPA: /ku̯et̚² ka²⁴/
  • Min Dong (Fuzhou)
    • Bàng-uâ-cê: guók-gă
    • IPA (key): /kuoʔ²⁴⁻²¹ ka⁵⁵/
  • Min Nan (Hokkien)
    • Pe̍h-ōe-jī: kok-ka
    • Tâi-lô: kok-ka
    • Phofsit Daibuun: kokkaf
    • IPA (Taipei): /kɔk̚³²⁻⁴ ka⁴⁴/
    • IPA (Zhangzhou): /kɔk̚³²⁻¹²¹ ka³⁴/
  • Wu (Shanghai)
    • Wiktionary: koq jia (T4)
    • IPA (key): /kʊʔ³³ t͡ɕiᴀ⁴⁴/

Wyang (talk) 05:51, 14 June 2016 (UTC)

@Wyang Looks great! Could we perhaps collapse by lect (similar to what Suzukaze-c did)? Also, how are we going to deal with multiple readings using this new layout? — justin(r)leung (t...) | c=› } 08:14, 14 June 2016 (UTC)
Similar to my concerns for th-pron, I think that too much whitespace goes unused. —suzukaze (tc) 08:16, 14 June 2016 (UTC)
@suzukaze-c Yeah, I agree that there's too much padding, too. — justin(r)leung (t...) | c=› } 08:18, 14 June 2016 (UTC)

Take Two

  • Mandarin (Beijing)+
    • Pinyin: guójiā
    • Zhuyin: ㄍㄨㄛˊ ㄐㄧㄚ
    • Gwoyeu Romatzyh: gwojia
    • IPA (key): /ku̯ɔ³⁵ t͡ɕi̯a̠⁵⁵/
  • Cantonese (Guangzhou)+
    • Jyutping: gwok3 gaa1
    • Yale: gwok gā
    • Cantonese Pinyin: gwok8 gaa1
    • IPA (key): /kʷɔːk̚³ kɑː⁵⁵/
  • Hakka (Sixian)
    • Pha̍k-fa-sṳ: koet-kâ
    • Hakka RS: gued` ga´
    • IPA: /ku̯et̚² ka²⁴/
  • Min Dong (Fuzhou)
    • Bàng-uâ-cê: guók-gă
    • IPA (key): /kuoʔ²⁴⁻²¹ ka⁵⁵/
  • Min Nan (Hokkien)
    • Pe̍h-ōe-jī: kok-ka
    • Tâi-lô: kok-ka
    • Phofsit Daibuun: kokkaf
    • IPA (Taipei): /kɔk̚³²⁻⁴ ka⁴⁴/
    • IPA (Zhangzhou): /kɔk̚³²⁻¹²¹ ka³⁴/
  • Wu (Shanghai)
    • Wiktionary: koq jia (T4)
    • IPA (key): /kʊʔ³³ t͡ɕiᴀ⁴⁴/

Wyang (talk) 10:12, 14 June 2016 (UTC)

This one looks better, but where does the IPA go? — justin(r)leung (t...) | c=› } 21:35, 14 June 2016 (UTC)
There is a full table if you click on the 'More' button on the top right. The alternative is to use a single table and hide certain lines in the table by default. Wyang (talk) 21:48, 14 June 2016 (UTC)
I think hiding lines is a better option. Switching between the two tables makes it a bit annoying. — justin(r)leung (t...) | c=› } 23:01, 14 June 2016 (UTC)

Take Three


Pronunciations of 國家
  • Mandarin (Beijing)+
    • Pinyin: guójiā
    • Zhuyin: ㄍㄨㄛˊ ㄐㄧㄚ
    • Gwoyeu Romatzyh: gwojia
    • IPA (key): /ku̯ɔ³⁵ t͡ɕi̯a̠⁵⁵/
  • Cantonese (Guangzhou)+
    • Jyutping: gwok3 gaa1
    • Yale: gwok gā
    • Cantonese Pinyin: gwok8 gaa1
    • IPA (key): /kʷɔːk̚³ kɑː⁵⁵/
  • Hakka (Sixian)
    • Pha̍k-fa-sṳ: koet-kâ
    • Hakka RS: gued` ga´
    • IPA: /ku̯et̚² ka²⁴/
  • Min Dong (Fuzhou)
    • Bàng-uâ-cê: guók-gă
    • IPA (key): /kuoʔ²⁴⁻²¹ ka⁵⁵/
  • Min Nan (Hokkien)
    • Pe̍h-ōe-jī: kok-ka
    • Tâi-lô: kok-ka
    • Phofsit Daibuun: kokkaf
    • IPA (Taipei): /kɔk̚³²⁻⁴ ka⁴⁴/
    • IPA (Zhangzhou): /kɔk̚³²⁻¹²¹ ka³⁴/
  • Wu (Shanghai)
    • Wiktionary: koq jia (T4)
    • IPA (key): /kʊʔ³³ t͡ɕiᴀ⁴⁴/

This is perhaps the ideal layout, although I can't seem to selectively use rowspan (enable when expanded and disable when collapsed) or something equivalent...


  • Mandarin (Beijing)+
    • Pinyin: guójiā
    • Zhuyin: ㄍㄨㄛˊ ㄐㄧㄚ
    • Gwoyeu Romatzyh: gwojia
    • IPA (key): /ku̯ɔ³⁵ t͡ɕi̯a̠⁵⁵/
  • Cantonese (Guangzhou)+
    • Jyutping: gwok3 gaa1
    • Yale: gwok gā
    • Cantonese Pinyin: gwok8 gaa1
    • IPA (key): /kʷɔːk̚³ kɑː⁵⁵/
  • Hakka (Sixian)
    • Pha̍k-fa-sṳ: koet-kâ
    • Hakka RS: gued` ga´
    • IPA: /ku̯et̚² ka²⁴/
  • Min Dong (Fuzhou)
    • Bàng-uâ-cê: guók-gă
    • IPA (key): /kuoʔ²⁴⁻²¹ ka⁵⁵/
  • Min Nan (Hokkien)
    • Pe̍h-ōe-jī: kok-ka
    • Tâi-lô: kok-ka
    • Phofsit Daibuun: kokkaf
    • IPA (Taipei): /kɔk̚³²⁻⁴ ka⁴⁴/
    • IPA (Zhangzhou): /kɔk̚³²⁻¹²¹ ka³⁴/
  • Wu (Shanghai)
    • Wiktionary: koq jia (T4)
    • IPA (key): /kʊʔ³³ t͡ɕiᴀ⁴⁴/


  • Mandarin (Beijing)+
    • Pinyin: guójiā
    • Zhuyin: ㄍㄨㄛˊ ㄐㄧㄚ
    • Gwoyeu Romatzyh: gwojia
    • IPA (key): /ku̯ɔ³⁵ t͡ɕi̯a̠⁵⁵/
  • Cantonese (Guangzhou)+
    • Jyutping: gwok3 gaa1
    • Yale: gwok gā
    • Cantonese Pinyin: gwok8 gaa1
    • IPA (key): /kʷɔːk̚³ kɑː⁵⁵/
  • Hakka (Sixian)
    • Pha̍k-fa-sṳ: koet-kâ
    • Hakka RS: gued` ga´
    • IPA: /ku̯et̚² ka²⁴/
  • Min Dong (Fuzhou)
    • Bàng-uâ-cê: guók-gă
    • IPA (key): /kuoʔ²⁴⁻²¹ ka⁵⁵/
  • Min Nan (Hokkien)
    • Pe̍h-ōe-jī: kok-ka
    • Tâi-lô: kok-ka
    • Phofsit Daibuun: kokkaf
    • IPA (Taipei): /kɔk̚³²⁻⁴ ka⁴⁴/
    • IPA (Zhangzhou): /kɔk̚³²⁻¹²¹ ka³⁴/
  • Wu (Shanghai)
    • Wiktionary: koq jia (T4)
    • IPA (key): /kʊʔ³³ t͡ɕiᴀ⁴⁴/

Wyang (talk) 01:24, 15 June 2016 (UTC)

Out of A, B, and C, A is the one I like the most, but I also think the current design has its own merits. —suzukaze (tc) 07:37, 29 June 2016 (UTC)

Pinyin display with cap or py

@Wyang, Kc kennylau: For words like 亞洲 and A型肝炎, could we show the same pinyin in the collapsed and expanded displays? 亞洲 should be capitalized in both, and A型肝炎 should have A instead of ēi in both. — justin(r)leung (t...) | c=› } 10:02, 16 June 2016 (UTC)

@Justinrleung: The code reads |m=ēixíng gānyán,py=A-xíng gānyán, meaning that the behaviour is intended. --kc_kennylau (talk) 10:05, 16 June 2016 (UTC)
@Kc kennylau Really? I thought it was for the conversions into the other systems (zhuyin, etc.). — justin(r)leung (t...) | c=› } 10:08, 16 June 2016 (UTC)
@Kc kennylau, Wyang I don't know about things like A型肝炎, but 亞洲 still needs to be fixed. The capitalization is wonky. I tried to fix it, but I don't understand what this in MOD:cmn-pron (export.str_analysis), which might be the source of the problem, does:
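if conv_type == 'head' or conv_type == 'link' then
	-- use the capitalised readings when the text carries the ', cap—' flag
	if match(text, ', cap—') then
		text = gsub(text, '[一不]', {['一'] = 'Yī', ['不'] = 'Bù'})
	end
	-- any 一/不 still left at this point get the lowercase readings
	text = gsub(text, '[一不]', {['一'] = 'yī', ['不'] = 'bù'})
end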
— justin(r)leung (t...) | c=› } 22:46, 12 September 2016 (UTC)
I have fixed it - there was no capitalisation for strait diff aside from this. I hope I have not broken anything... Module:cmn-pron probably needs a rewrite. Wyang (talk) 01:29, 13 September 2016 (UTC)
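The behaviour discussed in this thread, where the reading drives the conversions while `py=` or `cap` only changes the pinyin that is displayed, can be sketched in Python. Everything here is an assumption for illustration: `parse_mandarin_value` and `displayed_pinyin` are hypothetical helpers, the `cap=y` spelling is a guess based on the `cap` in the heading, and the sketch ignores values that list several readings.

```python
def parse_mandarin_value(value: str):
    """Split 'reading,key=setting,...' into the reading and its modifiers."""
    reading, *modifiers = [part.strip() for part in value.split(",")]
    options = {}
    for modifier in modifiers:
        key, _, setting = modifier.partition("=")
        options[key.strip()] = setting.strip()
    return reading, options


def displayed_pinyin(value: str) -> str:
    """Pick the pinyin to show, the same for collapsed and expanded views."""
    reading, options = parse_mandarin_value(value)
    if "py" in options:
        # An explicit display form wins over the underlying reading.
        return options["py"]
    if options.get("cap") == "y":
        # Capitalise the first letter only, for proper nouns.
        return reading[:1].upper() + reading[1:]
    return reading


print(displayed_pinyin("ēixíng gānyán,py=A-xíng gānyán"))
print(displayed_pinyin("yàzhōu,cap=y"))
```

Routing both views through one function is what would keep the collapsed and expanded displays in agreement.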

Wenzhou dialect

@Wyang, Justinrleung Would it be possible to add Wenzhounese? It seems like User:Mteechan may be able to add pronunciations (diff, diff, diff, diff). —suzukaze (tc) 07:00, 2 October 2016 (UTC)

@Suzukaze-c I think that's a great idea, since Shanghainese and Wenzhounese are quite different. That being said, we would need to have a romanization scheme for Wenzhounese. @Mteechan, do you have any ideas if there are any common romanization schemes out there, or do we need to make our own? (Wikipedia only has 溫州話羅馬字. Is this a good romanization scheme?) Also, I notice that you've been adding some Rui'an pronunciations. Would there be some dialectal variations within Wenzhounese to consider? — justin(r)leung (t...) | c=› } 07:21, 2 October 2016 (UTC)
Minidict has data not only for Shanghainese Wu, but for Wenzhou and other dialects. By default it's 上海 but you can select 温州, 苏州, etc. in the drop-down box. Wyang has already defined Wu transliteration. Perhaps it needs some tweaking for Wenzhou. --Anatoli T. (обсудить/вклад) 07:30, 2 October 2016 (UTC)
Are we ready to tackle the hardest Chinese dialect on Earth? lol. Anyway, I'm all for adding in additional Wu, either Suzhou or Wenzhou, or both. I added some stuff to zh:溫州話 before. It should be possible, and it would have to be a new parameter in zh-pron since Wenzhounese is not inferrable from Shanghainese. We need to decide on what the best way to handle sandhi is, and this will depend on how irregular the tone changes are. Wyang (talk) 07:37, 2 October 2016 (UTC)
"Are we ready to tackle the hardest Chinese dialect on Earth?" Me, certainly not, ha-ha. I'm glad if the method is added, even if it's incomplete (work-in-progress) or only for single syllables. It makes little sense, though when there is no data or very little predictability. --Anatoli T. (обсудить/вклад) 08:00, 2 October 2016 (UTC)
"The hardest Chinese dialect on Earth"? Well I'd suppose Min dialects to be much much harder. About the romanization scheme, I'm for the one that Minidict currently uses. But the problem is Minidict mentions that "禁止以任何形式盗用本站任何内容". So we may not be able to grab the data right from the site. And about the tone sandhi, I've made a sheet of 2-character tone sandhi. However, it's too complicated and not exhaustive for all the irregular ones. Not to mention my dialect is different from the "standard" one. Mteechan (talk) 09:21, 2 October 2016 (UTC)

"Category:Chinese lemmas"[edit]

Currently {{zh-pron}} outputs (for example) [[Category:Chinese lemmas|kai1]] on . However, this sortkey is overridden by {{head}} ({{zh-noun}}, etc.)'s plain [[Category:Chinese lemmas]]. —suzukaze (tc) 09:44, 22 November 2016 (UTC)

This is bad. IMO we should replace the {{head}} part of the headword-line templates with {{lang|zh|{{{head|{{PAGENAME}}}}}}}. Wyang (talk) 09:57, 22 November 2016 (UTC)

Sichuanese pronunciation

Can an entry for Sichuanese be added to this?--Prisencolin (talk) 09:47, 9 December 2016 (UTC)

Sichuanese to be nested

Can and should Sichuanese be nested under Mandarin? E.g.

--Anatoli T. (обсудить/вклад) 23:25, 14 December 2016 (UTC)

@Atitarev: Wiktionary_talk:About_Chinese#Sichuanese. —suzukaze (tc) 23:30, 14 December 2016 (UTC)

"Phonetic" pinyin[edit]

In entries that contain more than one third-tone Chinese character, I found this template generates a claim that there is a "phonetic" pinyin.

Entry: pinyin; "phonetic" pinyin claimed by this template
  • 螞蟻: mǎyǐ; "máyǐ"
  • 鼓舞: gǔwǔ; "gúwǔ"
  • 展覽館: zhǎnlǎnguǎn; "zhánlánguǎn"
  • 紙老虎: zhǐlǎohǔ; "zhíláohǔ"

I don't see reasons why these claims are written in this template. Dokurrat (talk) 14:46, 20 February 2017 (UTC)

@Dokurrat: I'm not understanding what you're disputing. The phonetic pinyin is basically how it would be pronounced with all phonological rules applied. — justin(r)leung (t...) | c=› } 15:26, 20 February 2017 (UTC)
BTW, @Wyang, Tooironic, I feel like for 紙老虎, I would pronounce it as zhǐláohǔ. Is that wrong? — justin(r)leung (t...) | c=› } 15:43, 20 February 2017 (UTC)
  • Please see Standard_Chinese_phonology#Tone_sandhi. ---> Tooironic (talk) 15:48, 20 February 2017 (UTC)
    @Wyang, Tooironic, so it would be pronounced as zhǐláohǔ instead of zhíláohǔ, right? — justin(r)leung (t...) | c=› } 22:25, 20 February 2017 (UTC)
    My theory is it depends on how the word is made up. For example, 纸老虎 is pronounced, after tone sandhi, as zhi3lao2hu3, while 展览品 is pronounced as zhan2lan2pin3. The difference between the two is that the former is made up of a one-character word followed by a two-character word, while it is the other way around for the latter term. If my theory is correct, we will need to change the way tone sandhi is annotated on Wiktionary, as 纸老虎 should not be pronounced as zhi2lao2hu3. ---> Tooironic (talk) 04:45, 21 February 2017 (UTC)
    Yes, zhǐláohǔ is the sandhi pronunciation. Perhaps we could add in a feature to allow 3rd-3rd tone sandhis to be blocked, such as using 'zhǐ/lǎohǔ'. Wyang (talk) 08:33, 21 February 2017 (UTC)
    I don't like the idea of using / since it's already used in other sections to separate pronunciations. (Off the top of my head, what about _?) —suzukaze (tc) 08:55, 21 February 2017 (UTC)
    Perhaps # could be used (it's used in phonology to indicate a word boundary, if that makes any sense in this context). If not, we could stick with Suzukaze-c's idea of using an underscore. I think this is needed in Min Nan as well. The tone sandhi in Hokkien is kind of messed up because it relies on the hyphens and spaces. — justin(r)leung (t...) | c=› } 09:03, 21 February 2017 (UTC)
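The blocking idea proposed in this thread can be sketched in Python on numbered pinyin. This is a sketch under stated assumptions, not the template's implementation: `third_tone_sandhi` is a hypothetical function, the underscore is just the marker floated above, and the rule used is the simple one where a third tone becomes a second tone before another third tone inside the same segment.

```python
import re

BOUNDARY = "_"  # hypothetical marker that blocks sandhi across it


def third_tone_sandhi(numbered_pinyin: str) -> str:
    """Apply 3rd-3rd tone sandhi within each boundary-delimited segment."""
    result = []
    for segment in numbered_pinyin.split(BOUNDARY):
        # Syllables are letters followed by a tone digit, e.g. "zhan3".
        syllables = re.findall(r"[a-zü]+[1-5]", segment)
        for i, syllable in enumerate(syllables):
            # Look at the original tone of the next syllable in this segment.
            next_is_third = i + 1 < len(syllables) and syllables[i + 1].endswith("3")
            if syllable.endswith("3") and next_is_third:
                syllable = syllable[:-1] + "2"
            result.append(syllable)
    return " ".join(result)


print(third_tone_sandhi("zhi3_lao3hu3"))   # boundary after the first syllable
print(third_tone_sandhi("zhan3lan3pin3"))  # no boundary
```

With the marker, zhi3 keeps its third tone because lao3 is in a different segment; without it, every third tone except the last one in the run changes.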

"Mainland vs. Taiwanese Mandarin" note[edit]

Can this be made more obvious? —suzukaze (tc) 04:02, 10 March 2017 (UTC)


Can capitalization be allowed for Jyutping? On the word list compiled by LSHK, they also use capital letters, and spaces are not required. And I believe we should adhere to the official Jyutping system, where tones do not have superscripts, as it was designed this way for easy input and I've not seen any textbooks that use superscripts. Jyutping also does not indicate tone changes; this is something created by an unaffiliated website (see bottom). Littlepenny413 (talk) 12:25, 2 April 2017 (UTC)

@Littlepenny413: You may want to take a look at Module talk:yue-pron#I'm sure. As far as superscripts go, I think it's for aesthetics. As for indicating tone changes, I think we included them for the interest of learners. BTW, we are not following Cantodict conventions, i.e. we use a hyphen rather than an asterisk. — justin(r)leung (t...) | c=› } 20:09, 2 April 2017 (UTC)
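The display convention described in the reply (plain Jyutping as input, superscript tones and a hyphenated tone change as output) can be sketched in Python. This is an illustration only: `superscript_tones` is a hypothetical helper, and `jan4-2` is a made-up input used to show the hyphen notation.

```python
import re

# Map tone digits and the tone-change hyphen to their superscript forms.
SUPERSCRIPTS = str.maketrans("123456-", "¹²³⁴⁵⁶⁻")


def superscript_tones(jyutping: str) -> str:
    """Superscript tone digits, including a changed tone such as 4-2."""
    # A tone is a digit 1-6 directly after a letter, optionally followed
    # by a hyphen and the changed tone.
    return re.sub(
        r"(?<=[a-z])[1-6](?:-[1-6])?",
        lambda m: m.group(0).translate(SUPERSCRIPTS),
        jyutping,
    )


print(superscript_tones("gwok3 gaa1"))
print(superscript_tones("jan4-2"))
```

Because the superscripts are produced at display time, editors can keep typing the official plain-digit form.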

RFC discussion: March–April 2016

The following discussion has been moved from Wiktionary:Requests for cleanup (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.

So far, this template only has coverage of the Mandarin, Cantonese, Wu, Hakka, Min Nan, and Min Dong dialects of Chinese. Is there any way you can include Xiang, Shandong, and other lesser-known dialects? Also, make sure that most (if not all) pages contain these and the existing dialect pronunciations. Thanks in advance. —This unsigned comment was added by Johnny Shiz (talk • contribs). 15:48, 24 March 2016 (UTC)

Xiang (x) and other topolects like Gan (g) and Jin (j) are included. Since they do not have well-known romanizations, they are in IPA. See 水 (shuǐ) for an example. We currently do not support dialects of Mandarin, like Shandong or Sichuanese Mandarin. — justin(r)leung (t...) | c=› } 19:17, 24 March 2016 (UTC)
Please improve the coverage of these dialects and try to make sure most common Han characters have these pronunciations.
We don't have speakers of these varieties, so it may be difficult to have good coverage at the moment. — justin(r)leung (t...) | c=› } 02:07, 27 March 2016 (UTC)
We don't have proper coverage for Gan, Jin and Xiang, and won't in the near future. Not just because of the shortage of native speakers but also because of the lack of other resources. 水 (shuǐ) is probably an exception, which covers 9 Chinese topolects + Middle Chinese and Old Chinese. The infrastructure is there, though. See Category:Gan_lemmas, Category:Jin_lemmas, Category:Xiang_lemmas. --Anatoli T. (обсудить/вклад) 00:43, 1 April 2016 (UTC)
The only online resource for Gan, Jin and Xiang readings that I'm aware of is 小學堂, which has coverage of many characters in many Chinese varieties. I think the readings for 水 come from this website. — justin(r)leung (t...) | c=› } 07:06, 1 April 2016 (UTC)

Module error

W has an error now, perhaps related to this recent edit by @Suzukaze-c? — Eru·tuon 03:09, 20 May 2017 (UTC)

In no way is W valid pinyin. The anonymous editor doesn't seem to care too much about module errors, and has been producing an enormous amount of them since they don't touch-up {{zh-pron}} input appropriately when using {{zh-new}}. —suzukaze (tc) 03:13, 20 May 2017 (UTC)
Hmm, yes, I recall seeing other similar module errors in. Sorry about thinking it was your fault. — Eru·tuon 03:38, 20 May 2017 (UTC)
It's alright. —suzukaze (tc) 03:52, 20 May 2017 (UTC)

Min Bei Pronunciation

Module:mnp-pron ought to be created, just so I could add the following transliteration and others: Dô̤ng-gŏ (for China in Kienning Colloquial Romanized; I encountered it in this external link). --Lo Ximiendo (talk) 08:43, 3 September 2017 (UTC)

(More text: s:mul:Se̿ng-géng —suzukaze (tc) 08:47, 3 September 2017 (UTC))
@Wyang, I wonder what the tone sandhi (or any other sandhi) would be like. --Lo Ximiendo (talk) 10:11, 3 September 2017 (UTC)
It's too exotic. There is too little stuff on this, plus no one speaks this s*** here... so a lot of it will end up being guesswork. Wyang (talk) 10:17, 3 September 2017 (UTC)
I take it that the situation calls for adding transliterations for single characters only? (Such as 國, transliterated as )
Also, I meant a request for a parameter for Min Bei like a parameter for Taishanese was requested. --Lo Ximiendo (talk) 11:35, 3 September 2017 (UTC)
All Chinese varieties have not only transliterations but also pronunciations to match. Some sources had different transliterations, but these have been normalised and standardised here to produce consistent results. While Min Bei may have a few texts transliterated, no-one knows with certainty how to pronounce them or what tone sandhi are used. It's not worth adding a couple of hundred Min Bei transliterations when there is no good resource for this lect. --Anatoli T. (обсудить/вклад) 11:45, 3 September 2017 (UTC)
I do have 建甌方言詞典, but I'll have to look into how the pronunciation actually matches with the romanization. Using Kienning Colloquial Romanized could be problematic, since there have been changes in the phonology of the Jian'ou dialect since the creation of that romanization, including a merger of the 陽平 tone into 陰去. On a good note, I understand that tone sandhi is pretty much nonexistent in the Jian'ou dialect. — justin(r)leung (t...) | c=› } 12:22, 3 September 2017 (UTC)

IPA module

@Wyang, as with Mod:ja-pron, is it possible to use the IPA module? Thanks! —JohnC5 06:13, 10 October 2017 (UTC)

@JohnC5 I'm too lazy to change it... since they are deeply embedded within the zh-pron structure, are behaving well atm, and the IPA module may throw up errors for Chinese IPA. (btw, ping didn't work) Wyang (talk) 07:11, 10 October 2017 (UTC)

Default label for Mandarin pronunciation

@Wyang, Tooironic, Atitarev, Suzukaze-c, do you think we should remove "Beijing" from the default label for Mandarin pronunciations? It's similar to the issue brought up here with {{th-pron}}, and it also makes it seem like it's excluding the Taiwanese standard. — justin(r)leung (t...) | c=› } 02:43, 15 October 2017 (UTC)

Support. Wyang (talk) 02:44, 15 October 2017 (UTC)
Support but a label is probably needed. --Anatoli T. (обсудить/вклад) 12:08, 17 October 2017 (UTC)
@Atitarev: It already has a "Standard Chinese" label in front of Beijing, so that should be fine. — justin(r)leung (t...) | c=› } 12:30, 17 October 2017 (UTC)

Bug report

In entry 三九四零五二, the audio interface covers the IPA. Is it just me, or is it a bug? @Wyang. Dokurrat (talk) 20:56, 13 November 2017 (UTC)

@Dokurrat: It’s been a bug for ages. — justin(r)leung (t...) | c=› } 21:04, 13 November 2017 (UTC)
@Dokurrat Yeah. Perhaps 重金懸賞 (offering a hefty reward) is warranted for this and the floating problem of the button in {{zh-pron}}. :) Wyang (talk) 07:35, 14 November 2017 (UTC)
phab:T130982 —suzukaze (tc) 07:48, 14 November 2017 (UTC)
Thanks. I didn't realise you filed this bug before. Pity it's still unresolved. (Probably would have resorted to a bit of $$$ in real life, but too bad Phabricator doesn't allow this) Wyang (talk) 07:57, 14 November 2017 (UTC)