Ho Chi Minh City Cantonese

Just wondering, did you get data for Ho Chi Minh City Cantonese from 越南芒街市粤方言词汇研究? RcAlex36 (talk) 07:40, 13 August 2020 (UTC)

@RcAlex36: Yup. — justin(r)leung (t...) | c=› } 15:30, 13 August 2020 (UTC)
@Justinrleung: Great. I can add the remaining three 方言點 recorded in the paper (lol). RcAlex36 (talk) 15:37, 13 August 2020 (UTC)
@RcAlex36: Yup! — justin(r)leung (t...) | c=› } 16:05, 13 August 2020 (UTC)
@Justinrleung: Done, plus a couple 方言點 in 廣西北海市粵方言研究. RcAlex36 (talk) 15:52, 15 August 2020 (UTC)


Why did you revert? There were concerns that some of my technical fixes might not be stable longer term? ShakespeareFan00 (talk) 18:45, 13 August 2020 (UTC)

@ShakespeareFan00: The standard formatting for {{CJKV}} is to not put it in a list. Putting it in a list breaks the formatting. — justin(r)leung (t...) | c=› } 18:46, 13 August 2020 (UTC)

Tibetan dictionary

Is there a good Tibetan dictionary online? Asking this because The Tibetan & Himalayan Library is down, and I need a Tibetan dictionary when editing Sinitic etymology. RcAlex36 (talk) 03:29, 14 August 2020 (UTC)

You won't get that kind of quality unless you resort to a print dictionary (of which there are many excellent ones in English and Chinese), but [1] is pretty good. —Μετάknowledgediscuss/deeds 03:52, 14 August 2020 (UTC)
@RcAlex36, Metaknowledge: I believe they've moved the THL dictionaries to here. — justin(r)leung (t...) | c=› } 03:55, 14 August 2020 (UTC)
@Justinrleung: Thanks a lot! RcAlex36 (talk) 05:43, 14 August 2020 (UTC)

Congee in Singaporean Hainanese

To answer your question, if the rice grains are still whole like what you typically see in Hokkien and Teochew porridge, then it's 糜. If the rice grains are mashed up and no longer visible like what you typically see in Cantonese porridge, then it's 粥. That's the distinction. For this term, it's different from the Hainanese in China because in Wenchang itself, 糜 refers to cooked rice, so it did confuse me somewhat. The dog2 (talk) 07:52, 15 August 2020 (UTC)

@The dog2: That doesn't really answer my question. So would Hainanese congee be like Cantonese style or Hokkien/Teochew style? Or do they have both? I don't know if it's a good idea to classify it based on other groups. — justin(r)leung (t...) | c=› } 07:55, 15 August 2020 (UTC)
I don't know about China because I never ate congee during my trip to Hainan, but in Singapore, the style of congee that you find at Hainanese hawker stalls is at some sort of a halfway point between Cantonese and Teochew porridge. See [2] for an example. This style would be called 粥. The dog2 (talk) 08:07, 15 August 2020 (UTC)
@The dog2: I see, thanks! I'm not sure how we should do it best, but I don't think the notes should be describing other groups' styles without describing their own kind of congee. It may be better to have it be based on the actual characteristics of the congee as you had it before. — justin(r)leung (t...) | c=› } 08:13, 15 August 2020 (UTC)
I've adjusted it. Please feel free to improve. The dog2 (talk) 08:37, 15 August 2020 (UTC)


I've created Module:zh/data/dial-syn/瓦匠. Does Thesaurus:瓦工 still need to stay? RcAlex36 (talk) 10:47, 17 August 2020 (UTC)

@RcAlex36: We usually keep thesaurus entries. I'd remove all the dialectal terms because they're covered better with {{zh-dial}}. — justin(r)leung (t...) | c=› } 16:36, 17 August 2020 (UTC)

海豐方言 and 海豐方言辭典

I was wondering if you have heard of these two books by 羅志海 and have e-copies of them. RcAlex36 (talk) 08:53, 18 August 2020 (UTC)

@RcAlex36: I have all of 海豐方言 and the first 100 pages of 海豐方言詞典. I could email them to you if you'd like. — justin(r)leung (t...) | c=› } 08:57, 18 August 2020 (UTC)
@Justinrleung: May I have just 海豐方言? I have the first 100 pages of 海豐方言詞典, but I have no idea why there's only the first 100 pages of it. RcAlex36 (talk) 08:59, 18 August 2020 (UTC)

後字變調 in Teochew

Months ago you talked about 後字變調 in Teochew. There does not seem to be 後字變調 in the Shantou dialect, but 後字變調 is indeed present in Chaozhou, Chenghai and Jieyang (see 廣東閩方言語音研究 p.95-96). The IPA generation system we currently have is based on the Shantou dialect, yet it is also programmed to display 後字變調. RcAlex36 (talk) 17:02, 18 August 2020 (UTC)

@Justinrleung: If we are to do sub-dialects of Teochew one day (I hope that happens eventually), we should perhaps start with Chaozhou, Shantou, Chenghai and Jieyang. RcAlex36 (talk) 17:09, 18 August 2020 (UTC)
@RcAlex36: Yup, it's kind of a mess. It's kind of a mixed version that doesn't distinguish dialects. It'll need to be fixed eventually but it's not that easy to do. — justin(r)leung (t...) | c=› } 17:11, 18 August 2020 (UTC)
@Justinrleung: As a side note, -m and -p are said to be disappearing in Chaozhou, but I guess we will stick to 老派讀音 when sub-dialects get implemented. RcAlex36 (talk) 17:23, 18 August 2020 (UTC)
@RcAlex36: Yup, let's stick to 老派. "Disappearing" means it's not completely gone, so we should record the more "conservative" pronunciation. — justin(r)leung (t...) | c=› } 17:28, 18 August 2020 (UTC)
@RcAlex36, Justinrleung: If I'm not wrong, 後字變調 also exists in Hokkien, not just Teochew, at least as spoken in Singapore. However, the module doesn't seem to incorporate it. The dog2 (talk) 19:07, 19 August 2020 (UTC)

Hainanese pronunciation

Just wondering, when will the module be implemented? I think our number of Hainanese entries has grown, so it will certainly be useful to have pronunciations. The dog2 (talk) 19:05, 19 August 2020 (UTC)

@The dog2: I'm not sure. There are still some uncertainties with tone sandhi. — justin(r)leung (t...) | c=› } 19:09, 19 August 2020 (UTC)


I can't seem to find it online. May I have a copy of it? Thanks a lot! RcAlex36 (talk) 04:28, 21 August 2020 (UTC)

@RcAlex36: It's just the vocabulary list in 海南省志 人口志 方言志 宗教志. — justin(r)leung (t...) | c=› } 04:41, 21 August 2020 (UTC)
@Justinrleung: Oh... Never mind then. RcAlex36 (talk) 04:51, 21 August 2020 (UTC)

Philippine Hokkien

@Justinrleung, Mar vin kaiser This user, Kamkamkamuti, changed two Philippine Hokkien entries. Perhaps you can take a look at them. RcAlex36 (talk) 09:31, 23 August 2020 (UTC)

@RcAlex36: Thanks for the heads-up. I've already looked at it. --Mar vin kaiser (talk) 10:00, 23 August 2020 (UTC)
@Mar vin kaiser: Thanks! They made a few more edits, so make sure you take a look at those as well. — justin(r)leung (t...) | c=› } 18:21, 23 August 2020 (UTC)

Aunt In Hainanese

OK, I guess this isn't proof, but if you look at the video from The Federation of Hainan Association Malaysia, they do have a segment where they talk about how to address your paternal aunts and uncles in Hainanese. The dog2 (talk) 21:37, 23 August 2020 (UTC)

@The dog2: That'd be proof for Malaysian Hainanese, but not Wenchang. — justin(r)leung (t...) | c=› } 00:17, 24 August 2020 (UTC)
That said, most Hainanese in Malaysia trace their ancestry to Wenchang or Qionghai. And the Wenchang and Qionghai dialects are very similar; the Hainanese diverges more when you get to Haikou, and even more when you get to Sanya. So what you will notice is that the Hainanese spoken in Malaysia is quite heavily based on the Wenchang and Qionghai dialects. One interesting observation I have made though is that my Hainanese relatives from Singapore have difficulty understanding the Hainanese in Haikou and Sanya, while those who were actually born and raised in Wenchang can understand the Haikou and Sanya dialects with no problem. The dog2 (talk) 00:30, 24 August 2020 (UTC)
@The dog2: Of course, Hainanese in Malaysia/Singapore has probably diverged less from Wenchang/Qionghai Hainanese than, say, Hokkien has, but there are bound to be differences because the Hainanese in Malaysia/Singapore are surrounded by many other languages, while in Hainan, there's likely influence from Mandarin (and to a lesser extent, Cantonese). There's always a chance of language change along the way. — justin(r)leung (t...) | c=› } 01:50, 24 August 2020 (UTC)


Just to clarify, I thought 伊斯蘭教 is the term used in China, while 回教 is used in Malaysia and Singapore. The dog2 (talk) 18:35, 25 August 2020 (UTC)

@The dog2: Plenty of people still use 回教 in China, though the more "official" term is 伊斯蘭教. 回教 is definitely used in Taiwan and Hong Kong as well. — justin(r)leung (t...) | c=› } 18:37, 25 August 2020 (UTC)

普通话基础方言基本词汇集 词汇卷

Just so you know, there's a book called 普通话基础方言基本词汇集 词汇卷, divided into three volumes. It contains quite a lot of Mandarin 方言點. RcAlex36 (talk) 06:31, 26 August 2020 (UTC)

@RcAlex36: Yup, I'm not sure if you noticed, but I'm already using parts of it. (It's actually 5 volumes if you include the two volumes with the phonologies.) — justin(r)leung (t...) | c=› } 06:33, 26 August 2020 (UTC)
@Justinrleung: Don't know if I'm asking for too much, but may I have the three volumes of 詞彙卷? Thanks a lot! RcAlex36 (talk) 06:35, 26 August 2020 (UTC)
@RcAlex36: I don't have all the pages yet (cuz the system wouldn't let me get all of it yet). Maybe you could get the pages I don't have? — justin(r)leung (t...) | c=› } 06:37, 26 August 2020 (UTC)
@Justinrleung: I have 上: 2001-2400, 中: 2909-3029, 下: 3871-4270. Am I just slower than you in requesting? RcAlex36 (talk) 06:41, 26 August 2020 (UTC)
@RcAlex36: Haha, I think I've just been at it longer. I have 上: 2001-2700, 中: 2909-3658, 下: 3871-4570. — justin(r)leung (t...) | c=› } 06:46, 26 August 2020 (UTC)
@Justinrleung: So you're ahead of me by six days. I don't need it for now then. By the way, I think you can't request too much of a particular book within a month. RcAlex36 (talk) 06:49, 26 August 2020 (UTC)
@RcAlex36: Yup, there's a limit. I think I saw on Zhihu that the limit is something like 80% of the book in 20 days or something. — justin(r)leung (t...) | c=› } 06:56, 26 August 2020 (UTC)
@Justinrleung: I think it's lower than 80% in reality. RcAlex36 (talk) 07:01, 26 August 2020 (UTC)
@RcAlex36: Yeah, it probably depends on the book as well. — justin(r)leung (t...) | c=› } 07:02, 26 August 2020 (UTC)
@Justinrleung: Let's just add the remaining 方言點 some day. RcAlex36 (talk) 07:25, 26 August 2020 (UTC)
@RcAlex36: Yup. We might need to shuffle the order around. I'm not sure if we should go geographically and/or by major classification. Our current order is kind of a mix of both. — justin(r)leung (t...) | c=› } 07:33, 26 August 2020 (UTC)
@Justinrleung: Shuffling the order or inserting new 方言點 between old ones is a pain in the a**, to be honest. Can a bot be written to reorder the 方言點 and add newly added ones in existing dialectal synonyms modules? RcAlex36 (talk) 07:37, 26 August 2020 (UTC)
@Suzukaze-c, I wonder if you could help with this. — justin(r)leung (t...) | c=› } 07:42, 26 August 2020 (UTC)
I've totally thought about it. I've also put zero effort so far into making my thoughts a reality, but I'll start. —Suzukaze-c (talk) 07:43, 26 August 2020 (UTC)
In progress, but what's this about Sabah? Suzukaze-c (talk) 10:21, 26 August 2020 (UTC)
Also please mass revert User:350bot's edits because my script is trash and I failed to consider edge cases. I have to monitor it. —Suzukaze-c (talk) 10:23, 26 August 2020 (UTC)
@Suzukaze-c: I don't know Lua, but why is stuff from other entries showing up? RcAlex36 (talk) 10:26, 26 August 2020 (UTC)
@RcAlex36: Because my code sucks and retained words from previous pages. —Suzukaze-c (talk) 10:39, 26 August 2020 (UTC)
@Suzukaze-c: There are several locations that should be deprecated and replaced with its successors: Sabah → Sabah-B, Sabah-L; Luchuan → Luchuan-LC, Luchuan-DQ; Doumen → Doumen-T, Doumen-S; Huidong → Huidong-PS, Huidong-DL. If there are synonyms for the deprecated locations, you could just leave the synonyms as comments (and maybe ping me in the comments to deal with them?). — justin(r)leung (t...) | c=› } 16:01, 26 August 2020 (UTC)
I've checked 40 of the bot's revisions after improving the code and I haven't seen any problems, so I'll let the bot run unsupervised now since there are >1,000 of these. If anything still happens please shout at me violently.
Deprecated locations: [3] Suzukaze-c (talk) 05:17, 27 August 2020 (UTC)
Done. Suzukaze-c (talk) 06:32, 27 August 2020 (UTC)
@Suzukaze-c: Great, thanks! — justin(r)leung (t...) | c=› } 06:37, 27 August 2020 (UTC)
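
For readers following the reordering discussion above, here is a minimal Python sketch of the idea: put locations into one canonical order, and set aside any synonyms still filed under a deprecated location so that a person can reassign them. It is not 350bot's actual code. The deprecated-to-successor mapping is the one listed in the thread; the function name, the dict-based data layout and the sample order are assumptions made only for illustration.

```python
# Deprecated locations and their successors, as listed in the thread above.
DEPRECATED = {
    "Sabah": ["Sabah-B", "Sabah-L"],
    "Luchuan": ["Luchuan-LC", "Luchuan-DQ"],
    "Doumen": ["Doumen-T", "Doumen-S"],
    "Huidong": ["Huidong-PS", "Huidong-DL"],
}


def reorder_locations(data, order):
    """Reorder a location -> synonyms mapping (hypothetical layout).

    data:  dict mapping a location name to a list of synonyms.
    order: canonical list of location names.
    Returns (reordered, needs_review). needs_review holds synonyms that
    were filed under a deprecated location and need manual reassignment.
    """
    reordered = {}
    # Emit every canonical location in order, adding empty slots for new ones.
    for loc in order:
        if loc not in DEPRECATED:
            reordered[loc] = list(data.get(loc, []))

    needs_review = {}
    for loc, terms in data.items():
        if loc in DEPRECATED:
            # Keep non-empty data aside instead of guessing a successor.
            if terms:
                needs_review[loc] = list(terms)
        elif loc not in reordered:
            # Locations missing from the canonical order go at the end.
            reordered[loc] = list(terms)
    return reordered, needs_review


if __name__ == "__main__":
    sample = {"Sabah": ["x"], "Guangzhou": ["y"], "Doumen": []}
    canonical = ["Beijing", "Guangzhou", "Sabah-B", "Sabah-L"]
    print(reorder_locations(sample, canonical))
```

Setting deprecated data aside rather than copying it to both successors matches the suggestion in the thread to leave such synonyms as comments for someone to deal with by hand.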

左轉 and 右轉

I was just wondering, do we have enough data to create dialect tables for these? @RcAlex36 Please weigh in too. The dog2 (talk) 04:01, 27 August 2020 (UTC)

@The dog2: Probably not, except for our native lects. — justin(r)leung (t...) | c=› } 05:25, 27 August 2020 (UTC)


Hi Justin, may I have a copy of 閩北區三縣市方言研究? Thanks a lot! RcAlex36 (talk) 10:50, 27 August 2020 (UTC)

@RcAlex36: Sure. — justin(r)leung (t...) | c=› } 18:07, 27 August 2020 (UTC)

Wu 方言點

@Justinrleung, Thedarkknightli I've added quite a number of Wu 方言點 from 當代吳語研究. I've also added 舟山 and 義烏 from 舟山方言研究 and 義烏方言研究 respectively. I did not add 金壇西崗鎮 because Wu is no longer spoken there as noted in the book. RcAlex36 (talk) 08:33, 28 August 2020 (UTC)

Sorry for reordering the 方言點 again, but we can always fix that with 350bot some day, perhaps after those Mandarin 方言點 are also added. RcAlex36 (talk) 08:41, 28 August 2020 (UTC)
@RcAlex36: Thanks! I always find it very time consuming to add 方言點, so I really appreciate that you're doing it! — justin(r)leung (t...) | c=› } 17:19, 28 August 2020 (UTC)
@RcAlex36, Thedarkknightli: BTW, it looks like all the Shanghai dialects are in a weird order. I think they should really be put together. — justin(r)leung (t...) | c=› } 17:21, 28 August 2020 (UTC)
@Justinrleung: You can reorder them if you want. I was just trying to follow the order given in 當代吳語研究 but I didn't move Shanghai itself. RcAlex36 (talk) 02:07, 29 August 2020 (UTC)
@RcAlex36, Justinrleung: Guys, I've reordered them per Wu subgroups. --TheDarkKnightLi(STAY HAPPY) 21:37, 31 August 2020 (UTC)

Origin of 仔 suffix in Taiwanese Hakka

The paper 客家方言名词后缀 “子”“崽” 的类型及其演变 suggests diminutive suffixes in mainland Hakka lects ultimately derive from either 子 or 崽. Do you have any source for the etymological origin of 仔 suffix in the various Taiwanese Hakka lects? I suspect it is ultimately from 子, but a source is needed. Thanks! RcAlex36 (talk) 14:39, 28 August 2020 (UTC)

@RcAlex36: The same article mentions Sixian (represented by Miaoli) and Hailu (represented by Hsinchu), the major dialects in Taiwan, which 庄初升 derives from 子. I think Raoping dialect in Taiwan probably got its suffix from contact with these two dialects (both or either). Dabu dialect doesn't have a diminutive suffix, but a diminutive tone (kind of like Cantonese). Zhao'an dialect has /a/, which seems to be from Hokkien, and /tsu/, which is clearly 子. — justin(r)leung (t...) | c=› } 17:09, 28 August 2020 (UTC)


I've created Module:zh/data/dial-syn/點頭. We might as well just delete Thesaurus:頭向下微動. RcAlex36 (talk) 05:09, 29 August 2020 (UTC)

@RcAlex36: I think we could still keep it, but it should be moved to Thesaurus:點頭 or something. It's kind of weird to have 頭向下微動 as the title since it's really a definition more than anything. What do you think, @Tooironic? — justin(r)leung (t...) | c=› } 05:38, 29 August 2020 (UTC)
I gave it that name so it would be clear that we are referring to a literal motion of the head, and not its extended meaning of giving consent. However, if you guys think moving it to Thesaurus:點頭 is a good idea, please go ahead. ---> Tooironic (talk) 06:39, 29 August 2020 (UTC)
@Tooironic: I see what you mean, but it's kind of a clunky title. We've already defined it in English in the thesaurus entry, so it should be fine for distinguishing which sense we're referring to. I'll move it then. — justin(r)leung (t...) | c=› } 06:42, 29 August 2020 (UTC)
No worries. Thanks mate. ---> Tooironic (talk) 11:16, 29 August 2020 (UTC)

陽入 in Taishanese

I bet you noticed this problem long ago, but different sources give different tone numbers for 陽入 in Taishanese. Stephen Li and 台山方音字典 give 32, 广东四邑方言语音特点 and 台山县志 give 21, and 广东四邑方言语法研究 gives 2 but notes there is another 陽入 tone with tone number 21 that is commonly used with nouns and likely produced by morphological tone change. RcAlex36 (talk) 04:27, 31 August 2020 (UTC)

@RcAlex36: Let's stick with Stephen Li and Deng Jun (台山方音字典) since they're native speakers of Taishanese (or a similar dialect). — justin(r)leung (t...) | c=› } 19:16, 31 August 2020 (UTC)


Hiya. Would you mind taking a look at this entry when you get a chance? I can't reconcile the lua error. Thank you. ---> Tooironic (talk) 06:13, 1 September 2020 (UTC)

@Tooironic: It seems like you've inputted the wrong character (U+0261 instead of the normal U+0067). — justin(r)leung (t...) | c=› } 06:57, 1 September 2020 (UTC)
Weird. Thank you. ---> Tooironic (talk) 07:59, 1 September 2020 (UTC)
@Tooironic: No problem :D — justin(r)leung (t...) | c=› } 08:06, 1 September 2020 (UTC)
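
A look-alike character such as U+0261 (LATIN SMALL LETTER SCRIPT G) is hard to spot by eye, so a quick check like the following Python sketch can help. It lists every non-ASCII character in a string with its code point and Unicode name. The function name and the sample input are hypothetical, and this is not part of any Wiktionary module.

```python
import unicodedata


def find_non_ascii_chars(text):
    """Return (index, char, code point, Unicode name) for each non-ASCII char."""
    hits = []
    for index, char in enumerate(text):
        if ord(char) > 0x7F:
            name = unicodedata.name(char, "UNKNOWN")
            hits.append((index, char, f"U+{ord(char):04X}", name))
    return hits


if __name__ == "__main__":
    # Hypothetical romanization typed with U+0261 in place of a plain "g".
    sample = "\u0261ong1"
    for hit in find_non_ascii_chars(sample):
        print(hit)
```

This only works for romanizations that are expected to be pure ASCII, such as Jyutping. For Pinyin or IPA input, the check would have to allow the characters those systems legitimately use.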


I was editing the etymology section of (luàn) and couldn't find the Burmese cognate on SEAlang. I'm not sure if Burmese ပြွမ်း (prwam:) is one of the words listed there. Could you please help? It's on p. 390 of Schuessler. RcAlex36 (talk) 16:45, 1 September 2020 (UTC)

@RcAlex36: (You mean p. 369, right? It's p. 390 in the PDF haha) Sorry, I'm not sure about Burmese myself. @Atitarev, would you be able to help? The cognates are given as broŋᴮ ~ byoŋᴮ ~ prunᴮ ~ runᴮ "tumultuous" (in Paul Benedict's Rhyming Dictionary, brôŋ ~ byôŋ- "tumultuously"; prûn ~ pa-rûn- "tumultuous; disorderly"; rûn "tumultuous"). — justin(r)leung (t...) | c=› } 06:05, 2 September 2020 (UTC)
@Justinrleung, RcAlex36: Sorry for the late reply. I am not sure about the Burmese descendant. It's not easy to find. I have created all Khmer terms mentioned in the entry, plus the Burmese ပြွမ်း (prwam:). @RichardW57, Mahagaja: Can you help there?
Also, @Mahagaja - there seems to be a new case of a missing Burmese initial /pjw-/. ပြွမ်း (prwam:) is pronounced as both /pjʊ́ɴ/ and /pjwáɴ/. I am not able to add the respelling for the alternative reading. --Anatoli T. (обсудить/вклад) 02:12, 3 September 2020 (UTC)
I'm afraid I can't be of any help here. I don't know anything about Sino-Tibetan historical linguistics and don't have any sources on it. As for the initial cluster, I don't know how to edit Module:my-pron to fix it. Unfortunately Wyang seems to have left the project. —Mahāgaja · talk 06:37, 3 September 2020 (UTC)

漢語方言大詞典 (by 許寶華 & 宮田一郎)

Hi. Just wondering whether this is a reliable source. --TheDarkKnightLi(STAY HAPPY) 22:52, 1 September 2020 (UTC)

@Thedarkknightli: I sometimes use it when I'm desperate, but it occasionally quotes some century-old sources so use with caution. RcAlex36 (talk) 03:12, 2 September 2020 (UTC)
@Thedarkknightli: I agree with @RcAlex36. It's an okay source, but it's riddled with errors and includes material from the Qing dynasty (or even before). — justin(r)leung (t...) | c=› } 05:39, 2 September 2020 (UTC)
@RcAlex36, Justinrleung: I see. Thanks for your replies, guys. --TheDarkKnightLi(STAY HAPPY) 05:48, 2 September 2020 (UTC)

Guangzhou Cantonese High Falling Tone

Hey @Justinrleung, I wanna revive this topic. I remember we discussed it before, but I recently watched some videos of Guangzhou people talking, and they clearly still use the High Falling Tone, which seems to be lost in Hong Kong. I checked my 漢英小字典, and the words they say with a high falling tone do correspond to what's written in the dictionary. So I was thinking: given that we list all variant pronunciations of words in Standard Mandarin, shouldn't we also include these pronunciations in Cantonese? Especially since they're still used. --Mar vin kaiser (talk) 04:46, 2 September 2020 (UTC)

@Mar vin kaiser: There are several things that would need to be solved. First, Jyutping doesn't support this distinction, so we'd need additional notation. Second, we might need to split Guangzhou and Hong Kong if we do choose to make this distinction. Third, while this distinction may be clear for single character entries if we have the right sources, it's not as clear with multi-character entries because there is also tone sandhi involved (falling to level). The only 詞典 that distinguishes falling from level that I'm aware of is 廣州方言詞典, which lacks many common words because it's focused on dialectal terms. We'd need to be quite careful about this. @RcAlex36, any thoughts? — justin(r)leung (t...) | c=› } 05:47, 2 September 2020 (UTC)
@Justinrleung, Mar vin kaiser: Can we do something like 1* for the High Falling Tone? I think the major issue is for many words we don't know whether the High Falling Tone should be used. RcAlex36 (talk) 05:55, 2 September 2020 (UTC)
Personally I think that flat and falling should both be marked, with 1 being ambiguous. 1- and 1`? —Suzukaze-c (talk) 07:29, 2 September 2020 (UTC)
@Suzukaze-c: That's a good idea! (also @RcAlex36, Mar vin kaiser:) Do you think we should show it in Jyutping in display or should we just show it in Yale and IPA? — justin(r)leung (t...) | c=› } 07:34, 2 September 2020 (UTC)
@Justinrleung: Jyutping and IPA, and perhaps also Yale (saying this because I seldom pay attention to Yale, lol). RcAlex36 (talk) 08:28, 2 September 2020 (UTC)
@RcAlex36: I asked because Yale can actually distinguish the two tones, but Jyutping can't unless we modify it. — justin(r)leung (t...) | c=› } 08:32, 2 September 2020 (UTC)
@Justinrleung: Wow I didn't know that! If we show it in Jyutping, we may have to call our romanization system modified Jyutping. I'm fine with not showing it in Jyutping. RcAlex36 (talk) 08:36, 2 September 2020 (UTC)
@RcAlex36: Yes, the dictionary I have, 漢英小字典, uses Yale pronunciation so it distinguishes high falling (bìu 標, dìn 顛) and high flat (bīu 錶, dīn 癲), as written in the dictionary. Though it seems to say that most words are high falling (that are usually high flat in the usual dictionaries). --Mar vin kaiser (talk) 10:08, 2 September 2020 (UTC)
@Mar vin kaiser: I'm curious, which usual dictionaries are you talking about? Other dictionaries using Yale? — justin(r)leung (t...) | c=› } 22:57, 2 September 2020 (UTC)
@Justinrleung: Nah, I mean other dictionaries that don't distinguish the two tones, so presumably, they're only talking about high flat. --Mar vin kaiser (talk) 23:27, 2 September 2020 (UTC)
@Mar vin kaiser: Well, they're kind of in free variation, so generally I'd say they're referring to both without distinguishing the two tones. Most younger people in HK would use the flat tone, but there are plenty of people who use both or the falling tone without distinguishing them. — justin(r)leung (t...) | c=› } 23:53, 2 September 2020 (UTC)
@Justinrleung: Do you think it's possible that it's not as much of a free variation in Guangzhou? For example, the character 區, sometimes I hear them say it with a high falling, sometimes with a high rising. Maybe there are characters that they never say with a high falling. --Mar vin kaiser (talk) 01:21, 3 September 2020 (UTC)
@Mar vin kaiser: The distinction seems to persist in Guangzhou, but it's probably a change in progress. For 區, as an example, if you hear both, you still need to consider all the context: Is it in a non-final position that would cause it to be tone sandhi-ed? Is it just read on its own? It's also interesting that even Guangzhou sources only mark it as tone 1 without distinguishing the two forms. — justin(r)leung (t...) | c=› } 03:56, 3 September 2020 (UTC)
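
To make the notation proposal above concrete, here is a rough Python sketch of how input such as "1-" and "1`" might be parsed, with a bare "1" left ambiguous. The thread does not say which mark stands for which contour, so reading "-" as level and "`" as falling is an assumption. The function name and the lowercase-letters-plus-digit syllable pattern are also illustrative and do not describe the existing pronunciation module.

```python
import re

# Lowercase letters, a tone digit 1-6, then an optional contour mark.
SYLLABLE = re.compile(r"^([a-z]+)([1-6])([-`]?)$")

# Assumed reading of the proposed marks (not confirmed in the thread).
CONTOUR = {"-": "level", "`": "falling", "": "ambiguous"}


def parse_syllable(syllable):
    """Split a syllable into (base, tone number, contour or None)."""
    match = SYLLABLE.match(syllable)
    if match is None:
        raise ValueError(f"not a recognised syllable: {syllable!r}")
    base, tone, mark = match.groups()
    if tone != "1":
        # The level/falling split only applies to tone 1.
        if mark:
            raise ValueError(f"contour mark only allowed on tone 1: {syllable!r}")
        return base, int(tone), None
    return base, 1, CONTOUR[mark]


if __name__ == "__main__":
    for example in ["biu1`", "biu1-", "biu1", "wut6"]:
        print(example, parse_syllable(example))
```

Keeping an explicit "ambiguous" value matters because, as noted above, for many words the sources do not say which of the two contours applies.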

Dialectal synonyms table for 出生

I'm thinking of doing this, but we don't seem to have much data, especially for 出生. RcAlex36 (talk) 08:38, 3 September 2020 (UTC)

@RcAlex36: It's probably more data than we have for 出生證 though, right? — justin(r)leung (t...) | c=› } 08:42, 3 September 2020 (UTC)
@Justinrleung: Did a quick search on 現代漢語方言大詞典 and guess what I've got? Nothing. RcAlex36 (talk) 08:44, 3 September 2020 (UTC)
@RcAlex36: Oh well. It could just be skewed towards the dialectal terms. — justin(r)leung (t...) | c=› } 08:47, 3 September 2020 (UTC)
@Justinrleung: Basically the same problem I encountered when doing Module:zh/data/dial-syn/一輩子. RcAlex36 (talk) 08:48, 3 September 2020 (UTC)
@RcAlex36: We've gotta work with what we have :D It's really annoying that these sources often leave out 共同語. — justin(r)leung (t...) | c=› } 08:49, 3 September 2020 (UTC)

-uoi final in the Fuzhou dialect

Is there supposed to be no 鬆緊韻 distinction for the -uoi final in the Fuzhou dialect? Compare (bēi) vs (fēi) (Etymology 4). RcAlex36 (talk) 16:45, 4 September 2020 (UTC)

@Justinrleung: Alright this appears to be the case, as described in 福州方言松紧韵母实验研究 and on the English wiki page on the Fuzhou dialect. RcAlex36 (talk) 16:51, 4 September 2020 (UTC)
@RcAlex36: Sorry, I'm not sure about the details of Fuzhou dialect's phonology/phonetics. We can probably go with 福州方言松紧韵母实验研究, which is probably more reliable than previous sources that have just recorded the dialect based on hearing. — justin(r)leung (t...) | c=› } 22:05, 4 September 2020 (UTC)


I want to create a Module:zh/data/dial-syn/買米, but 買米 is SoP. What should I do? RcAlex36 (talk) 07:10, 6 September 2020 (UTC)

@RcAlex36: I think we could have different modules of 買 depending on the object? I'm not quite sure though. — justin(r)leung (t...) | c=› } 16:27, 6 September 2020 (UTC)
@Justinrleung: 漢語方言詞彙 has 買米 as an entry. 糴米 isn't considered SoP and is already an entry here on Wiktionary. What should we do? RcAlex36 (talk) 16:36, 6 September 2020 (UTC)
@RcAlex36: We could maybe have it at Module:zh/data/dial-syn/買米 but not create 買米 as an entry. I think we can maybe modify the code so that we could link 買 and 米 separately? We can maybe write it as "買//米" in the synonyms module? If so, we'll have to change the main module to allow this kind of syntax. — justin(r)leung (t...) | c=› } 17:43, 6 September 2020 (UTC)
Or maybe create an entry anyway as a "synonym hub", like "translation hubs". —Suzukaze-c (talk) 21:42, 6 September 2020 (UTC)
@Suzukaze-c, RcAlex36: That's an interesting idea, but I'm not sure what other editors would think of this. English can have "translation hubs" because English is the language of this dictionary, but is Chinese justified in having these synonym hubs? Whatever we decide for this would be helpful in the long run because I can think of other entries that could make use of this (like 洗衣服 and 生孩子). — justin(r)leung (t...) | c=› } 22:49, 6 September 2020 (UTC)
It's definitely more odd, but I think it would be superior for discoverability. —Suzukaze-c (talk) 22:51, 6 September 2020 (UTC)
@Suzukaze-c: I'm just wondering if there'd be a need for a vote for such entries to exist. Pinging @Metaknowledge, Atitarev to see what their thoughts are. — justin(r)leung (t...) | c=› } 23:04, 6 September 2020 (UTC)
I am not sure, sorry, but I think it's not a bad idea to have a module for synonyms that links to the components, if they are SoP in Mandarin. --Anatoli T. (обсудить/вклад) 23:25, 6 September 2020 (UTC)
Not sure why I was pinged, but I don't see any good justification for creating SOP Mandarin entries, so I concur with Anatoli. —Μετάknowledgediscuss/deeds 02:02, 7 September 2020 (UTC)
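
The "買//米" idea floated above could work roughly as in the Python sketch below. A term containing the separator gets one link per component, and any other term gets a single link as usual. The "//" separator is only the syntax suggested in this thread, and the function names are hypothetical. The real change would have to be made in the Lua module, which is not shown here.

```python
def split_linked_components(term, separator="//"):
    """Split a term on the proposed separator, dropping empty pieces."""
    return [part for part in term.split(separator) if part]


def format_links(term):
    """Render each component as its own wikilink, joined without spaces."""
    return "".join(f"[[{part}]]" for part in split_linked_components(term))


if __name__ == "__main__":
    print(format_links("買//米"))  # components linked separately
    print(format_links("糴米"))    # no separator, so a single link
```

With this approach a sum-of-parts phrase can appear in a synonyms table without needing an entry of its own, which is the concern raised in the replies above.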

Cantonese pronunciation of 復活/复活

You seem to have added fau6 wut6 back in 2016. Do you know people who actually say that? RcAlex36 (talk) 18:08, 7 September 2020 (UTC)

@RcAlex36: Yup. I know some people in 禮賢會 who say that. I'm not sure if it's common outside of Rhenish/Lutheran circles. — justin(r)leung (t...) | c=› } 18:34, 7 September 2020 (UTC)
@Justinrleung: You mean in Canada? I've never heard fau6 wut6 in Hong Kong myself. Well, back then when I attended a Catholic school, it was always fuk6 wut6. RcAlex36 (talk) 18:36, 7 September 2020 (UTC)
@RcAlex36: No, 禮賢會九龍堂 in 又一村 (close to Festival Walk). I've never heard it outside of that church, though. — justin(r)leung (t...) | c=› } 18:38, 7 September 2020 (UTC)
@Justinrleung: Wow that's specific. Anyway thanks for telling me. RcAlex36 (talk) 18:40, 7 September 2020 (UTC)
@RcAlex36: Haha :D I think it's probably used in other 禮賢會 churches. The fau6 reading is also used in 復生 in the 文言 Apostle's Creed (like in this recording of a service at the Wan Chai church, at around 22:41). — justin(r)leung (t...) | c=› } 18:54, 7 September 2020 (UTC)

Sabah (Bao'an)

Hi, where is data on Sabah (Bao'an) Hakka from? Thanks. RcAlex36 (talk) 11:04, 8 September 2020 (UTC)

@RcAlex36: The only source is our fellow contributor @Qhwans. — justin(r)leung (t...) | c=› } 03:53, 9 September 2020 (UTC)
@Justinrleung: Oh I thought there was a paper or something. Thanks for telling me. RcAlex36 (talk) 03:59, 9 September 2020 (UTC)
@Justinrleung: By the way, I'm not trying to overwhelm the dialectal synonyms table, but we can still add some Hakka 方言點. RcAlex36 (talk) 04:03, 9 September 2020 (UTC)
@RcAlex36: For that matter, we should add more 方言點 for Malaysian Hokkien and Cantonese if we have the resources. Besides Penang, Klang and Kuching are predominantly Hokkien-speaking, while Ipoh and Sandakan are predominantly Cantonese-speaking. We can probably add some for Min Dong (there are none for overseas varieties as of now), and for that there are Sitiawan, Yong Peng, Sibu and Miri in Malaysia. And we don't have one for Malaysian Hainanese, for which we could probably add Kuala Terengganu. The dog2 (talk) 04:20, 9 September 2020 (UTC)
@The dog2: But we don't have the resources (lol). Normally, people aren't really interested in dialects spoken outside Greater China. RcAlex36 (talk) 04:22, 9 September 2020 (UTC)
@RcAlex36: There's this and this for Malaysian Hainanese, but I don't know which city it is based on. The dog2 (talk) 04:29, 9 September 2020 (UTC)
@RcAlex36, The dog2: There is interest in overseas varieties, but it's hard to investigate given other factors. 陈晓锦 has worked on other varieties and there's a 3-volume survey called 东南亚华人社区汉语方言概要 - I'm not sure how much vocab there is though because I don't have access to a copy of it. — justin(r)leung (t...) | c=› } 04:50, 9 September 2020 (UTC)
@RcAlex36: Also, yes, of course, there are lots of things we can still add, but I think there's an obvious imbalance in our tables. There are way too many southern dialects (which is a given because that's the reality in the literature as well). We're hardly covering Mandarin properly, and we're missing important groups like Tuhua and Shaojiang Min. — justin(r)leung (t...) | c=› } 04:54, 9 September 2020 (UTC)
@RcAlex36, Justinrleung: Unfortunately, I don't have resources for this, but perhaps we could post a request on Wikipedia to ask for help from Malaysian Wikipedians. I'm pretty sure there will be some slight differences between the Cantonese in Kuala Lumpur, Ipoh and Sandakan. And I know for sure there are some Seiyap speakers from Malaysia too (my aunt is one of them), though I'm not sure if there are any Malaysian cities where the Chinese community is majority Seiyap speaking. And in Singapore, we have much smaller communities of other dialects too. The Hakka community is actually larger than the Hainanese community, but I just don't have access to a native speaker. The dog2 (talk) 05:00, 9 September 2020 (UTC)
@The dog2: I actually do have some resources on two dialects of Singapore Hakka (Meixian and Dabu). We could probably add those. — justin(r)leung (t...) | c=› } 05:02, 9 September 2020 (UTC)
@Justinrleung, The dog2: I have Kulai Hakka. There's a thesis on it. RcAlex36 (talk) 05:07, 9 September 2020 (UTC)
Justin, what's your source for Singapore Hakka? RcAlex36 (talk) 05:32, 9 September 2020 (UTC)
@RcAlex36: It's 新加坡客家, which has a section by 嚴修鴻. It's not available on that website. I borrowed it a while back from my uni library. — justin(r)leung (t...) | c=› } 06:28, 9 September 2020 (UTC)
@Justinrleung, The dog2: That's why we have to add all those Mandarin 方言點 in 普通话基础方言基本词汇集. By the way, I have 古丈瓦鄉話, a variety of Chinese that branched off earlier than Min but no one talks about. Also, we should probably add all those Gan 方言點 in the two Hakka-Gan studies. Shaojiang Min (many linguists still don't think it's Min) is definitely doable. The Pinghua and Tuhua dialects can be added, perhaps after we have a good way of classifying them. RcAlex36 (talk) 05:09, 9 September 2020 (UTC)
@RcAlex36: I've got 古丈瓦鄉話 as well. Shaojiang Min is treated as part of Min Bei in Ethnologue; I think we can maybe follow them on that like we do by including Leizhou Min and Hainanese in Min Nan? For Pinghua and Tuhua, I think we could either group them together as one or just separate them as Pinghua and Tuhua? I'll have to look into it a little more. — justin(r)leung (t...) | c=› } 06:34, 9 September 2020 (UTC)
@Justinrleung: Shaojiang Min should probably be its own category, since it's conspicuously different from Min Bei and its classification is disputed. RcAlex36 (talk) 06:41, 9 September 2020 (UTC)
@Justinrleung, RcAlex36: For that matter, whether or not the Danzhou dialect should be classified as a dialect of Cantonese is controversial. As for Singapore Hakka, interesting that they would document two separate varieties of Hakka, but that said, as far as Mindong goes, the Fuzhou and Fuqing communities are considered distinct in Singapore, so I guess it's not that surprising then. With regard to China, is there any 方言點 for the Hakka in Hainan? Apparently, there is a Hakka community there too. The dog2 (talk) 15:33, 9 September 2020 (UTC)
@The dog2: I think there is basically zero documentation, except on 語保工程採錄展示平台 that ordinary folks like you and I don't have access to. I actually find dialects in Hainan under-researched. Hakka elsewhere is relatively well-documented though. RcAlex36 (talk) 15:35, 9 September 2020 (UTC)
@RcAlex36: 語保工程採錄展示平台 seems to be available to Chinese citizens upon request? Not sure if people from HK like you could try submitting a form to them to request access. (It seems like the website can't be accessed here in Canada now?) — justin(r)leung (t...) | c=› } 15:47, 9 September 2020 (UTC)
Someone on Zhihu commented earlier that the request function doesn't actually work. RcAlex36 (talk) 15:50, 9 September 2020 (UTC)
@RcAlex36: Really, eh? All this deception... I wonder how that guy on YouTube who posts stuff from the site has access to it then... — justin(r)leung (t...) | c=› } 15:51, 9 September 2020 (UTC)
Perhaps he has connections or something, but there are many interesting 方言點 in his videos. RcAlex36 (talk) 15:55, 9 September 2020 (UTC)
@The dog2: Don't know if you have noticed, we have 方言點 for almost every city and county in Fujian. RcAlex36 (talk) 15:42, 9 September 2020 (UTC)
@RcAlex36: For Hokkien, I've noticed that. And even many locations in Taiwan too. I guess Hokkien is one of the better researched variants. The number of Teochew 方言點 also seems to have expanded, but one city we still don't have is Puning. For Hainanese, it'd be nice if we could have Sanya, since the Hainanese there is noticeably different from the Hainanese in northern Hainan. The dog2 (talk) 15:45, 9 September 2020 (UTC)
@The dog2: Puning and Huilai, to be exact. And to be honest, the vocabulary used in Chaozhou, Chenghai, Shantou and Jieyang is just the same most of the time. RcAlex36 (talk) 16:03, 9 September 2020 (UTC)
@RcAlex36: There are some subtle differences though. 暝昏 means "evening" in Chaozhou and Shantou (and Singaporean Teochew), but means "night" in Jieyang and Shanwei. Speaking of which, we don't have a 方言點 for downtown Shanwei either. The dog2 (talk) 16:15, 9 September 2020 (UTC)
@The dog2: I don't have Shanwei, but I have data on the dialect spoken by fishermen in Shanwei fishing villages. I also have Macau Tanka (水上話). RcAlex36 (talk) 16:21, 9 September 2020 (UTC)
@The dog2, RcAlex36: By the way, should we keep Guangzhou as one entry as it is now, or should we split Guangzhou into 東山口音 and 西關口音? I know there is a distinction between the two (e.g. 呢 vs 依). The dog2 (talk) 17:47, 9 September 2020 (UTC)
@The dog2: We don't have resources that distinguish between the two accents. Yes, they seem to be just accents. RcAlex36 (talk) 17:50, 9 September 2020 (UTC)

Character for Teochew gaoh4

Hey there, what character should we use for this? I've heard it in a number of Teochew videos as an equivalent of 卷. The dog2 (talk) 03:33, 9 September 2020 (UTC)

@The dog2: 臺灣閩南語常用詞辭典 writes it as 𩛩. I'm not sure if there's a better character. — justin(r)leung (t...) | c=› } 03:54, 9 September 2020 (UTC)
OK, we can just use that then. Mogher didn't provide a character, so I wasn't sure which one to use. The dog2 (talk) 04:02, 9 September 2020 (UTC)
@The dog2: If you read the Mogher entry carefully, it does suggest 毂, but I don't know how common that is. — justin(r)leung (t...) | c=› } 04:55, 9 September 2020 (UTC)
No idea. Usually when it comes up, people just write it as 卷. The dog2 (talk) 05:02, 9 September 2020 (UTC)

IP changing stuff

@The dog2 We have a user here changing Hainanese entries. RcAlex36 (talk) 15:48, 9 September 2020 (UTC)

Reverted. — justin(r)leung (t...) | c=› } 15:52, 9 September 2020 (UTC)
Please do keep an eye, because the IP seems argumentative on their talk page. RcAlex36 (talk) 16:23, 9 September 2020 (UTC)
Yup, definitely. (They're not an IP, though, haha). — justin(r)leung (t...) | c=› } 16:31, 9 September 2020 (UTC)
Make sure you send them only that page though, not the entire document. RcAlex36 (talk) 16:42, 9 September 2020 (UTC)
Yeah sure. — justin(r)leung (t...) | c=› } 16:43, 9 September 2020 (UTC)

Hakka 方言點

@The dog2 I've added 大埔(西河), 吉隆坡 (大埔) and 博白 (沙河). I've also moved the Guangxi Hakka 方言點. RcAlex36 (talk) 02:41, 11 September 2020 (UTC)

@RcAlex36 Great, thanks! Just wondering what sources you’re using for these. I think I know for KL, but what are the other ones? — justin(r)leung (t...) | c=› } 04:11, 11 September 2020 (UTC)
Mainland Dabu is from the KL one. Bobai (Shahe) is from 博白县沙河镇客家话研究. RcAlex36 (talk) 04:13, 11 September 2020 (UTC)
@RcAlex36: Nice! — justin(r)leung (t...) | c=› } 04:38, 11 September 2020 (UTC)

@The dog2 There's also material on two Hakka dialects spoken in Bangkok, Meixian and Fengshun, which I will add some day. I guess we don't need to add Bangkok Cantonese because it's probably extinct by now. RcAlex36 (talk) 12:33, 11 September 2020 (UTC)

@RcAlex36: It might still be good to include Bangkok Cantonese, and the same may go for Jintan Wu. We are not just documenting the languages that are spoken now. They have only recently become extinct or moribund, so all the more reason to document these varieties. — justin(r)leung (t...) | c=› } 13:29, 11 September 2020 (UTC)
I think the Ting Kok and Tung Ping Chau dialects are extinct. You can count the inhabitants of Tung Ping Chau with just one hand now. Is the data for Jintan in the book Wu though? RcAlex36 (talk) 13:35, 11 September 2020 (UTC)
@RcAlex36: I'll have to look into it, but it should be. — justin(r)leung (t...) | c=› } 13:45, 11 September 2020 (UTC)
@Justinrleung, RcAlex36: How about Phnom Penh Teochew and Ho Chi Minh City Teochew? I'm not sure if there's any speakers left in Cambodia and Vietnam, but some older people in Australia and the U.S. do speak those. And I wonder if we should document Yangon Hokkien and Yangon Taishanese, as well as Mandalay Mandarin (based on Yunnan Mandarin). The dog2 (talk) 13:51, 11 September 2020 (UTC)
@The dog2: Do we even have data for those? (lol) RcAlex36 (talk) 13:57, 11 September 2020 (UTC)
@The dog2: Wow there's actually 缅甸曼德勒台山话词汇研究. RcAlex36 (talk) 13:59, 11 September 2020 (UTC)
@Justinrleung, RcAlex36: There's material on YouTube teaching Vietnamese Teochew (like this), but I'm not sure if it's from Ho Chi Minh City. The dog2 (talk) 14:06, 11 September 2020 (UTC)
@The dog2, RcAlex36: I don't know about Mandalay Mandarin, but I've found a book on 3 varieties of Yunnan Mandarin spoken in Thailand (泰国的西南官话). — justin(r)leung (t...) | c=› } 17:03, 11 September 2020 (UTC)
@Justinrleung: Well, I'm guessing they are also moribund, given how quickly Thai Chinese are assimilating. RcAlex36 (talk) 17:06, 11 September 2020 (UTC)
@Justinrleung, RcAlex36: 缅甸曼德勒台山话词汇研究 seems to be based on Mandalay though. From what I understand, in modern Myanmar, the Chinese in Yangon are mostly Hokkien and Taishanese, while the Chinese in Mandalay are mostly of Yunnan descent. Speaking of which, expanding beyond Southeast Asia, does anyone know of the dialects of Chinese in India and South Africa? I think the South African Chinese in Johannesburg mainly speak Cantonese, while the Indian-Chinese in Kolkata are predominantly Hakka. Also, I know there is a Chinese community in Cuba, and there's a Havana Chinatown, but I'm not sure if any Chinese dialects are still spoken, or whether the Cuban-Chinese are now all monolingual Spanish speakers. The dog2 (talk) 17:14, 11 September 2020 (UTC)
@The dog2: I doubt there's any published studies on those to date. Given the circumstances in Cuba, I'm quite certain that most Chinese Cubans have either moved to the US or assimilated into mainstream Cuban society. RcAlex36 (talk) 17:22, 11 September 2020 (UTC)
@RcAlex36: For fun: Mini documentary on a Cuban-Chinese(-ish) woman that touches upon it (and a fuller documentary seems to exist as well, called 《古巴花旦》). —Suzukaze-c (talk) 08:48, 12 September 2020 (UTC)

────────────────────────────────────────────────────────────────────────────────────────────────────@RcAlex36, Justinrleung: Based on my understanding, there's some seventh generation Chinese-Indians from Kolkata who speak Hakka, and some fourth generation Chinese-South Africans from Johannesburg who speak Cantonese, so if any of you decide to do a PhD in linguistics, perhaps you could study Indian Hakka or South African Cantonese, or both. It would certainly help our dialect tables. The dog2 (talk) 17:41, 11 September 2020 (UTC)

@The dog2: Well, I'm not even a linguistics major student, so I can't help. RcAlex36 (talk) 17:46, 11 September 2020 (UTC)
@The dog2: I'm gonna be doing stuff on Cantonese in Toronto this year for my MA, but I don't think it'll be relevant to this. I don't plan on going too far to do fieldwork atm. Maybe I'll change my mind down the road. — justin(r)leung (t...) | c=› } 18:12, 11 September 2020 (UTC)

鬆緊韻 distinction for a̤

In Module:cdo-pron there is supposed to be a 鬆緊韻 distinction for a̤. However, it isn't showing ɛ at () for some reason. Is the code broken? RcAlex36 (talk) 12:24, 12 September 2020 (UTC)

It appears that the IP's edits caused the problem. I have left a message on their talk page. Let's see if they respond. RcAlex36 (talk) 14:52, 12 September 2020 (UTC)
@Justinrleung: It has been 24 hours since I posted the message, to which there has been no response. We might have to assume that the IP is not intent on fixing the error in the code and so we have to fix it ourselves. RcAlex36 (talk) 13:23, 13 September 2020 (UTC)
It appears that the module is failing to give 鬆緊韻 distinction for every rime. @Suzukaze-c, are you familiar with the code? Could you please check what's wrong with the code? RcAlex36 (talk) 18:20, 13 September 2020 (UTC)
It's their code from beginning to end. —Suzukaze-c (talk) 04:39, 14 September 2020 (UTC)

Use of Chinese characters in Bai

So I saw your verification request on the article 水. I did some digging and found something called 僰文 (Bowen), which uses Chinese characters in Bai syntax. One of the inscriptions in Bowen, 山花碑, has the character 水 on it, but Bowen is not used now. Henry Wonh (talk) 12:58, 13 September 2020 (UTC)

@Henry Wonh: Thanks! You should put your findings in WT:RFVN. My questions are whether 山花碑 is written in 僰文 and whether 水 in 僰文 is actually pronounced as /ɕy³³/. — justin(r)leung (t...) | c=› } 14:37, 13 September 2020 (UTC)

Medan Hokkien

What sources do we have for Medan Hokkien? RcAlex36 (talk) 15:17, 14 September 2020 (UTC)

@RcAlex36: Not much. I think @Mar vin kaiser has found some videos of it before? — justin(r)leung (t...) | c=› } 16:12, 14 September 2020 (UTC)
@RcAlex36: There are some videos on YouTube that are in Medan Hokkien. There is also a website that tries to list vocabulary in Medan Hokkien using its own crude romanization, but you can make out how they pronounce things, helped by the fact that it's mostly based on Zhangzhou Hokkien. --Mar vin kaiser (talk) 00:18, 15 September 2020 (UTC)

Dialectal synonyms table

@The dog2 I've always wondered if we should add Standard Mandarin to the table. Officially it isn't a dialect, but it's a variety of Mandarin worth documenting, and oftentimes it has many colloquialisms that cannot be considered formal Chinese. The source will be 現代漢語詞典. RcAlex36 (talk) 16:51, 14 September 2020 (UTC)

Plus more and more people seem to be speaking Standard Mandarin as a first language in mainland China, so perhaps we may include it? RcAlex36 (talk) 16:53, 14 September 2020 (UTC)
@RcAlex36: Doesn't "formal" refer to standard Mandarin in mainland China? And I think the "Taiwan" entry covers standard Mandarin in Taiwan. And we have an entry for Singapore, but it might be worth considering separating Singapore into standard and colloquial Mandarin. That said, colloquial Mandarin in Singapore can be a bit tough to define, since it's quite common to codeswitch to Hokkien, Teochew or Cantonese, or in some cases even Malay or English. The dog2 (talk) 16:56, 14 September 2020 (UTC)
@The dog2: Formal is written standard Chinese, according to the module. What about informal terms used in Standard Mandarin? Also, the Taiwan entry covers not only standard Mandarin in Taiwan, but also terms used in Taiwan Mandarin that are considered non-standard Mandarin by official dictionaries, for example 臺 as the classifier for vehicles. RcAlex36 (talk) 16:59, 14 September 2020 (UTC)
@Justinrleung, Mar vin kaiser I would like to hear your opinion on this. RcAlex36 (talk) 02:29, 15 September 2020 (UTC)
@RcAlex36: Do you mean a kind of "general" Standard Mandarin in Mainland China? I think we could consider adding it, but we'd need to have good ways of distinguishing it from "Formal". Also, there are regional varieties of Standard Mandarin in Mainland China, so what would we base this "Standard Mandarin" on? Is it like an idealized form? — justin(r)leung (t...) | c=› } 04:45, 15 September 2020 (UTC)
@Justinrleung: Standard Mandarin in mainland China has a prescriptive standard and is (theoretically) the same in all regions of China. We should probably base this Standard Mandarin on the idealized form. RcAlex36 (talk) 04:51, 15 September 2020 (UTC)
@RcAlex36: Then wouldn't many informal forms still be left out because they may not be deemed standard? — justin(r)leung (t...) | c=› } 05:24, 15 September 2020 (UTC)
Well, 現代漢語詞典 has plenty of erhua words that are used in colloquial speech. Does 現代漢語詞典 not contain many informal words? It has many entries that it labels as 口語, or are they not colloquial enough? RcAlex36 (talk) 05:31, 15 September 2020 (UTC)
Maybe a good example here is the term for maternal grandmother. Years ago, there was an issue with a textbook using "姥姥" instead of "外婆". Some said the former is 普通话 while the latter is not, while others said both are 普通话. In the end, well, both are in 普通话 dictionaries. So they are part of "Standard Mandarin", but in our dialectal synonyms chart, they're only shown in the regional dialect portion. For "Formal Written Standard Chinese", we put 外祖母. So it seems that we should somehow indicate 姥姥 and 外婆 as part of "informal" Standard Mandarin. --Mar vin kaiser (talk) 10:57, 15 September 2020 (UTC)
@Mar vin kaiser: My idea was to include words used in spoken Standard Mandarin that aren't too literary (those will belong to Written Standard Chinese and are labelled <书> in 現代漢語詞典). So yeah, I would include 姥姥, 外婆 and 外祖母. RcAlex36 (talk) 11:39, 15 September 2020 (UTC)
@Mar vin kaiser: It may sometimes get tricky though. 今日 isn't labelled 書面語 in 現代漢語詞典, but that's still too formal to be used in colloquial speech. RcAlex36 (talk) 11:49, 15 September 2020 (UTC)
@RcAlex36: 外婆 is marked as <方> in 現代漢語詞典, which is another issue. It's clearly used in Putonghua, but it may be restricted to the south. I'm not sure how closely we should follow 現代漢語詞典 for this "Standard Mandarin". Another issue is probably neologisms or slang terms that aren't in 現代漢語詞典 (or 現代漢語規範詞典). — justin(r)leung (t...) | c=› } 15:37, 15 September 2020 (UTC)

────────────────────────────────────────────────────────────────────────────────────────────────────@Mar vin kaiser, RcAlex36: Another potential issue here. Some expressions aren't really classical Chinese, but also sound way too stilted even for formal speech, and are only used in formal writing. An example is when telling time. To indicate the hour, 時 is used instead of 點 when you are writing a formal letter or making a formal written announcement. For instance, to indicate 1:30 PM, you'll write it as 下午1時30分. However, this is so stilted that even in news broadcasts, the anchors will revert to using 點. The dog2 (talk) 15:48, 16 September 2020 (UTC)

@The dog2: Yeah, there are definitely different degrees of formality. — justin(r)leung (t...) | c=› } 15:53, 16 September 2020 (UTC)
In my opinion though, at the very least, there should be a way to show that 外婆 and 姥姥 are part of Standard Mandarin (普通话) and can be used in everyday speech, while the other Mandarin dialectal terms in the chart, like 外奶奶, 姥兒, 舅家奶, 家婆, are almost never used in Standard Mandarin to mean "maternal grandmother". --Mar vin kaiser (talk) 03:01, 17 September 2020 (UTC)

Lists by frequencies

Hello Justin, I just discovered Module:User:Justinrleung/char-summary, which is quite impressive. Over the years I collected frequency lists there. I thought it could be handy to you as well. Cai & Brysbaert 2010 made the strongest argument for subtitles-based frequency, confirmed by correlation with well-known reaction times to displayed words/characters. Yug (talk) 18:27, 15 September 2020 (UTC)

@Yug: Thanks, I'll be sure to look into it! — justin(r)leung (t...) | c=› } 01:32, 16 September 2020 (UTC)
I'm also interested to know what your frequency list's sources are. I don't have it yet (but I do exclude frequency lists without academic backing). Yug (talk) 06:40, 16 September 2020 (UTC)
@Yug: It's actually a fork of Wyang's module. I'm not sure where he got it from. — justin(r)leung (t...) | c=› } 06:44, 16 September 2020 (UTC)
A quick Google search doesn't return an academic source, so I may need to do more research on this. In any case, other sources can be used. Cai & Brysbaert 2010 is for spoken frequency, and therefore for today's language teaching materials. Da Jun is for written frequency. If I remember well, there are also some frequency lists by written text type: newspapers, novels, historical books. But that was not my focus at the time, so I didn't keep them in mind. I store the core of this knowledge on en:w:Word lists by frequency. Yug (talk) 08:05, 16 September 2020 (UTC)
@Yug: It may be this one, but I'm not sure. — justin(r)leung (t...) | c=› } 08:18, 16 September 2020 (UTC)
Thank you, I will inspect that. Sorry to also ask this, but while I saved Module:User:Justinrleung/char-summary, I lost track of where its output is displayed. Can you help me? I'm creating a fork for modern characters, so that I can track the mass uploads of Kaishu, Clerical and Songti that I wish to do in 2021. Yug (talk) 08:30, 16 September 2020 (UTC)
@Yug: You can probably have a page like User:Yug/char-summary and put in {{#invoke:User:Yug/char-summary|main}}. — justin(r)leung (t...) | c=› } 08:36, 16 September 2020 (UTC)
Yes! This. invoke. Thank you. Yug (talk) 08:51, 16 September 2020 (UTC)
@Yug: Not a problem! — justin(r)leung (t...) | c=› } 14:56, 16 September 2020 (UTC)

Please delete doenghaemh

Please delete the page. I misspelled the page title.

By the way, 壮语方言研究 p. 609 has doengzhaemh instead of doengxhaemh. I wonder why. RcAlex36 (talk) 16:23, 18 September 2020 (UTC)

@RcAlex36: Deleted. I'm not sure about the discrepancy. Could be a typo or just a dialectal thing. Wuming doesn't necessarily mean Standard Zhuang, as you probably know. — justin(r)leung (t...) | c=› } 17:36, 18 September 2020 (UTC)