User talk:Wyang

Archives

Archive 1 — 2013/01/18 21:12 (UTC) to 2014/05/24 00:43 (UTC)

User talk

Pinyin containing comma being treated as two readings

See 一枝草,一点露, 一枝草,一點露, 既生瑜,何生亮 and 说时迟,那时快. How should we deal with the problem that the comma splits the pinyin into two readings? --kc_kennylau (talk) 15:03, 25 May 2014 (UTC)

Probably should not have used a comma in the first place, but multiple template parameters. By the way, I would suggest renaming the parameters to use language codes of the topolects (i.e. {{zh-pron|cmn=...|cmn2=...|wuu=...|yue=...|yue2=...|...}}). One-letter parameters may be handy to type, but they present some cognitive burden as yet another thing to look up in the documentation and memorise. Also, changing these two things at once would solve the problem of distinguishing parameters in the new format from the old. Keφr 16:32, 25 May 2014 (UTC)
Fixed - use ', ' as dictated by Pinyin orthography. I don't think I agree that using language codes is the better option. A user may have to look up the documentation to know that 'w' is the code for 'Wu' (chances are not, when the person sees 'm' is used for 'Mandarin'), but a user definitely has to look up the documentation if 'wuu' is the code for 'Wu'. Wyang (talk) 00:03, 26 May 2014 (UTC)
There's also the matter of lects that don't have language codes. Categorization would probably need to be handled differently, but I'm sure there are some that would be worth incorporating into this framework. Knowing how dialectology and the ISO work, I would be quite surprised if there were no dialects with lexically-significant differences lacking ISO codes. I think the main problem would be deciding which ones not to cover, though lack of sources might solve that problem for us. Chuck Entz (talk) 00:55, 26 May 2014 (UTC)
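A minimal sketch of the separator convention Wyang describes above, assuming (the thread does not spell this out) that alternative readings are separated by a bare comma, while a comma followed by a space belongs to a single reading. The name `split_readings` is hypothetical, and this is Python for illustration, not the module's actual code.

```python
import re
from typing import List


def split_readings(value: str) -> List[str]:
    """Split a pinyin parameter into alternative readings.

    Assumption: a comma NOT followed by whitespace separates readings;
    ", " (comma plus space) is ordinary Pinyin punctuation inside one reading.
    """
    # Negative lookahead: split only on commas that are not followed by whitespace.
    parts = re.split(r",(?!\s)", value)
    # Drop empty pieces that would result from a trailing or doubled comma.
    return [part for part in parts if part]
```

Under this assumption, "yī zhī cǎo, yī diǎn lù" stays a single reading, while "wán,wánr" is treated as two.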

links to erhua-ed pinyin

I am not sure whether this has already been discussed before, but should we link to erhua-ed pinyin in entries? --kc_kennylau (talk) 11:10, 27 May 2014 (UTC)

Don't think it has been discussed before. Feel free to do it if you feel so inclined. Wyang (talk) 11:31, 27 May 2014 (UTC)
Because it's currently linked and I want to disable the link. --kc_kennylau (talk) 12:51, 27 May 2014 (UTC)
My preference is to link it, if generated pinyin is a valid pinyin. So, is "wánr" a valid pinyin for 玩, even if it's not written 玩儿? --Anatoli (обсудить/вклад) 12:59, 27 May 2014 (UTC)
The reason I showed the erhua-ed pinyin on the un-erhua-ed page is just to show that writing "兒/儿" is optional. A Beijinger may write "我去玩了", and pronounce it as "我去玩儿了" instead. Wyang (talk) 00:30, 28 May 2014 (UTC)

Wuu tones 2 and 4

How did you know that 小三 is the second tone instead of the fourth? --kc_kennylau (talk) 12:51, 27 May 2014 (UTC)

Let me, a student of Wu, try this one. According to wu-minidict, 小 is 上 and starts with a voiceless initial /ɕ/ and according to Wyang's table, it should be 2. --Anatoli (обсудить/вклад) 12:59, 27 May 2014 (UTC)
Thanks! :) --kc_kennylau (talk) 13:05, 27 May 2014 (UTC)
Yes, Anatoli's right. Also, since you speak Cantonese, this is probably easier:
In general:
Cantonese tone        Shanghainese tone   Mandarin tone
--------------------  ------------------  ------------------
1 (not checked - 7)   1                   1
2                     2                   3
3 (not checked - 8)   2                   4
4                     3                   2
5                     3                   3 or 4
6 (not checked - 9)   3                   4
7                     4                   could be anything
8                     4                   could be anything
9                     5                   2 or 4

Wyang (talk) 00:31, 28 May 2014 (UTC)

Thank you. From the above, is Cantonese the driver? Is there a correspondence for Mandarin -> Shanghainese -> Cantonese? Such as Mandarin tone 3 can be 2 or 5 in Cantonese and 2 or 3 in Shanghainese?
How many Cantonese tones are currently supported by Module:yue-pron? I have an impression - only six, as in Hong Kong Cantonese. Number 7 and above don't generate IPA. --Anatoli (обсудить/вклад) 00:45, 28 May 2014 (UTC)
1 and 7, 3 and 8, 6 and 9 occur in complementary distribution. Jyutping merges the two into the former. The former is for non-checked syllables, the latter for checked syllables. Wyang (talk) 00:47, 28 May 2014 (UTC)
I see, I thought so. Is it true that Guangzhou and Hong Kong Cantonese differ in the number of tones - 7 and 6 respectively? Not sure where I read this now. Strangely, it's hard to find numeric values in Wikipedia for Cantonese 6 tones: 55 35 33 21 13 22, not sure about the checked tones. This "complementary distribution", does it actually mean different, lower tones for 7, 8 and 9? --Anatoli (обсудить/вклад) 01:04, 28 May 2014 (UTC)
The checked tones are: 5, 3, 2, for 7/8/9 (they are just shorter versions of 1/3/6). Tone 1 is typically 55 in Hong Kong, but can be 55 or 53 in Guangzhou. Some view it as two tones instead, citing characters which are basically only pronounced as 55 or 53, and never the other. At present the two values are largely interchangeable, although reading some 55 characters as 53 might sound weird. Compare : Hong Kong (JustinLam) and Guangzhou (greatharry). Wyang (talk) 01:22, 28 May 2014 (UTC)
Middle Chinese is the driver, Cantonese is the vice-driver, Shanghainese and Mandarin sit in the back seats. Wyang (talk) 00:59, 28 May 2014 (UTC)
Thanks. Very educational. Are there numeric values for 7, 8 and 9? Can Shanghainese and Mandarin tones be mapped to each other similarly or they are completely unpredictable? --Anatoli (обсудить/вклад) 01:04, 28 May 2014 (UTC)
The table above should do - ignore the Cantonese column. It's a lot less regular, but there are some correspondences. Wyang (talk) 01:22, 28 May 2014 (UTC)
Thank you very much! --kc_kennylau (talk) 10:53, 28 May 2014 (UTC)
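The relationship between the six Jyutping tone numbers and the nine-tone numbering discussed above can be sketched as follows. This is only an illustration: the function name `nine_tone_number` is hypothetical, the input is assumed to be a single Jyutping syllable ending in a tone digit, and checked syllables are assumed to be those ending in -p, -t or -k. The contour values are the ones quoted in the thread.

```python
# Tone contours as quoted in the thread: six Jyutping tones plus the
# shorter checked variants 7/8/9 (counterparts of 1/3/6).
CONTOURS = {1: "55", 2: "35", 3: "33", 4: "21", 5: "13", 6: "22",
            7: "5", 8: "3", 9: "2"}

# Jyutping writes the checked tones 7/8/9 as 1/3/6.
CHECKED_EQUIVALENT = {1: 7, 3: 8, 6: 9}

# Assumption: a checked syllable is one that ends in a stop coda.
CHECKED_CODAS = ("p", "t", "k")


def nine_tone_number(jyutping):
    """Return the nine-tone number for one Jyutping syllable such as 'sik1'."""
    syllable, tone = jyutping[:-1], int(jyutping[-1])
    if syllable.endswith(CHECKED_CODAS):
        # Tones other than 1/3/6 on a checked syllable are left unchanged.
        return CHECKED_EQUIVALENT.get(tone, tone)
    return tone
```

For example, `nine_tone_number("sik1")` gives 7 (contour 5), while `nine_tone_number("si1")` stays 1 (contour 55).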

Some questions about Shanghainese

Hi Frank,

  1. Is this particle 额 standard in Shanghainese - 侬好流利刚英文𠲎?
  2. What is 梢许 - "a little bit"? 还可以,就是梢许忙着点。
  3. Is 阿 a question word? 侬是陈先生? --Anatoli (обсудить/вклад) 23:49, 28 May 2014 (UTC)
1. There is no standard written form for Shanghainese. 额 is the same as 個, equivalent to Mandarin 地. Also, 刚英文 should be 讲英文.
2. 稍许 - a little bit.
3. Yes, it is used in old-style Shanghainese. Wyang (talk) 23:56, 28 May 2014 (UTC)
Thank you. I've added a Shanghainese usage example in /. I noticed there's no standard form for Shanghainese. I have to make adjustments when reading 上海话方言词典 but their audio is very good. --Anatoli (обсудить/вклад) 00:12, 29 May 2014 (UTC)

A Question for One Word

I have a question about a certain word. Is this Chinese word an adjective? If so, what does it mean? I could suggest improving your list of yellow linkable Chinese words by adding parts of speech next to them. --Lo Ximiendo (talk) 09:34, 29 May 2014 (UTC)

Done. Please fix the definition due to improper English by me. --kc_kennylau (talk) 10:57, 29 May 2014 (UTC)
Done. Wyang (talk) 12:03, 29 May 2014 (UTC)

I'm also wondering about the Pinyin reading of this word. --Lo Ximiendo (talk) 07:31, 30 May 2014 (UTC)

{{zh-new}} should handle this correctly. Wyang (talk) 07:32, 30 May 2014 (UTC)

Russian words

Making entries for inflected forms is a bit of a waste of time, IMHO. It's for bots, not humans :) I hope someone may create accelerated methods for making them and/or make a bot for quick creation. You can try something more challenging by making an entry from translations. Adverbs are not inflected, so they are easier to do.

Russian translations from English starting with the letter "a" are at User:Matthias_Buchmeier/en-ru-a (for any other letter, change "-a" to that letter; change ru to cmn to see Chinese translations, e.g. User:Matthias_Buchmeier/en-cmn-a). Note: some translations are wrong or SoP. --Anatoli (обсудить/вклад) 00:05, 30 May 2014 (UTC)

Aha, thanks. I will read more on Russian phonology first. I have one question: What does сдалан here mean? Wyang (talk) 00:09, 30 May 2014 (UTC)
That was a typo :) -> сде́лан (made - participle).
Thanks, that explains it. :) Wyang (talk) 00:22, 30 May 2014 (UTC)

The format of {{zh-pron}} usage

@Atitarev, Kephir, Lo Ximiendo: So apparently Lo Ximiendo and I are in a war over whether the closing double curly brackets should be put on the same line as the category or on a new line. Please express your view here. --kc_kennylau (talk) 07:23, 30 May 2014 (UTC)

It doesn't seem to be a war at all... yet. I prefer new line, because every other parameter gets a new line granted to it. Wyang (talk) 07:31, 30 May 2014 (UTC)
For me, I could go for fewer lines. --Lo Ximiendo (talk) 07:32, 30 May 2014 (UTC)
This is a silly non-issue. As long as it renders the same, I would leave it alone. Personally, though, I prefer them at the beginning of the line. Makes line-based diffs less messy when changing the last item. Keφr 07:37, 30 May 2014 (UTC)

Result: 3 people (Kenny, Wyang, Kephir) favour a new line; 1 person (Lo Ximiendo) favours appending it to the last item.

Classical tag

Please add the "classical" tag into Module:labels/data and link it to w:Classical Chinese. --kc_kennylau (talk) 14:54, 30 May 2014 (UTC)

Can't you have other uses of "Classical"? For example, "Classical Latin"? Wyang (talk) 10:36, 31 May 2014 (UTC)
Then link it to "Classical + lang:getCanonicalName()". --kc_kennylau (talk) 10:48, 31 May 2014 (UTC)
Tag added but unlinked. There is no lang parameter there [1]. Wyang (talk) 11:01, 31 May 2014 (UTC)
Thanks. --kc_kennylau (talk) 11:35, 31 May 2014 (UTC)

Template:ru-pron-auto

Hi,

Could you make it work, please? It would be easier for me to get expected IPA, using what is currently produced. I've added some descriptions and more test cases. --Anatoli (обсудить/вклад) 12:08, 1 June 2014 (UTC)

It's working now. --kc_kennylau (talk) 15:03, 1 June 2014 (UTC)

学校, 学习, 学堂 in Shanghainese

Hi Frank,

Do these words have different readings? You have corrected my edit in 学校 where I used "5hhoq" from 学习 (your edit). I want to make Wu reading for 学堂, which seems more common than 学校 - used in one of my textbooks. What is the transliteration? "5hhoq daan" or "5hhiaq daan"--Anatoli (обсудить/вклад) 23:13, 1 June 2014 (UTC)

It's 5hhoq daan. 5hhoq: colloquial; 5hhiaq: literary. Wyang (talk) 23:41, 1 June 2014 (UTC)
Thank you. --Anatoli (обсудить/вклад) 23:44, 1 June 2014 (UTC)

五 in Wu Chinese

Hi Frank, could you fix (Wu translit.), please? Nothing worked for me. --Anatoli (обсудить/вклад) 00:13, 3 June 2014 (UTC)

Added. Wyang (talk) 00:19, 3 June 2014 (UTC)

Templates in the Belarusian adjectives category

Hi, if you're thinking about completing a side-quest, could you consider fixing the Belarusian adjective templates so that they don't show up in Category:Belarusian adjectives? Maybe even other categories, too? --Lo Ximiendo (talk) 06:03, 3 June 2014 (UTC)

@Atitarev: How about you, Anatoli? --Lo Ximiendo (talk) 07:19, 3 June 2014 (UTC)
I'll take a look later tonight and reply re: Russian IPA, will add some stuff. Gotta go. --Anatoli (обсудить/вклад) 07:23, 3 June 2014 (UTC)
Fixed (except the talk page, which can't be removed from the category without removing the test transclusion). Putting a category wrapped in noincludes in a template that's transcluded by other templates only stops the transcluded template from going into the category, not the ones transcluding it. Fortunately, all the transcluding templates had the same code, so it could be deleted from the transcluded template without any effect on the entries. Chuck Entz (talk) 08:19, 3 June 2014 (UTC)
So is it okay or not to delete the talk page? Next stop, Category:Belarusian nouns and Category:Belarusian verbs? --Lo Ximiendo (talk) 08:34, 3 June 2014 (UTC)
Go for the Latin word in the Belarusian nouns category. --Lo Ximiendo (talk) 08:37, 3 June 2014 (UTC)
I didn't delete it because I don't know if it's still needed. The Latin-script noun has been tagged for attention since January. Someone who knows Belarusian needs to create an entry under the correct Cyrillic-script spelling so the Latin-script one can be removed without losing information. Chuck Entz (talk) 08:56, 3 June 2014 (UTC)

/

Frank, could you check this entry, when you have a moment, please? @Kc kennylau:, Kenny, Cantonese, please as well. I did my best not to make mistakes but it's a big entry. --Anatoli (обсудить/вклад) 05:27, 4 June 2014 (UTC)

It probably needs some formatting checks as well - L3 and L4 for pronunciations and PoS? --Anatoli (обсудить/вклад) 05:28, 4 June 2014 (UTC)

Remaining Mandarin entries

Hi Frank, there are still heaps (multi-character terms) in Special:WhatLinksHere/Template:cmn-noun&limit=500 (and next) with the old-style templates and headers, some lack pronunciation headers. Can they be done programmatically still? --Anatoli (обсудить/вклад) 04:26, 5 June 2014 (UTC)

There are 332 monosyllabic entries there and 908 multisyllabic ones, both intentionally omitted (the bot omits the entry if the title is monosyllabic or content lacks {{Pinyin-IPA}}). There is a complete list of the multisyllabic entries omitted here. I think the multisyllabic ones are automatable, at least semi-automatable. Not sure about the monosyllabic ones. Wyang (talk) 04:33, 5 June 2014 (UTC)
Wow, there's still a lot. --Anatoli (обсудить/вклад) 04:37, 5 June 2014 (UTC)
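The skip rule Wyang describes for the bot can be sketched roughly as below. This is an assumption-laden illustration, not the bot's code: the function name `bot_skips` is hypothetical, a monosyllabic title is taken to mean a single-character title, and the template is detected by a plain substring check.

```python
def bot_skips(title, wikitext):
    """Return True if the conversion bot would leave this entry alone.

    Per the description above, an entry is omitted when its title is
    monosyllabic or when its content lacks {{Pinyin-IPA}}.
    """
    # Assumption: one Han character corresponds to one syllable.
    is_monosyllabic = len(title) == 1
    # Assumption: a literal substring test is enough to detect the template.
    has_pinyin_ipa = "{{Pinyin-IPA" in wikitext
    return is_monosyllabic or not has_pinyin_ipa
```

A multisyllabic entry without the template, such as the 908 mentioned above, would return True here and so would need separate (semi-)automated handling.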

A sample entry with unknown PoS, only reading is available -

In this revision I have added Mandarin and Cantonese readings, removed translingual definition requests and left one for Chinese, the cat= parameter is empty. Can all single character entries be categorised in by default and PINT, Jyutping, etc. if a reading is added? (Definitions and PoS can be added but I'd like to establish a format for definitionless characters, as a sample.) What do you think? --Anatoli (обсудить/вклад) 23:38, 5 June 2014 (UTC)

CC @Bumm13:, since you edited that entry as well. --Anatoli (обсудить/вклад) 23:39, 5 June 2014 (UTC)
Oops, wrong tone number for Wu but I see why it's 3. Thanks for fixing.
Why have all language categories gone from ? It is Chinese and if there is a reading, then shouldn't it also be Mandarin, etc.? Or should all single-character entries just sit in Category:Han characters (that's added by translingual)? --Anatoli (обсудить/вклад) 00:01, 6 June 2014 (UTC)
Because I think Category:Mandarin language should be free from mainspace entries, like Category:English language. In the absence of a suitable category, I changed the code to put it in Category:Chinese hanzi. "Category:Mandarin hanzi" sounds strange to me. Wyang (talk) 00:19, 6 June 2014 (UTC)
Thanks. Agreed. Any categorisation is better than none and "Chinese hanzi" is a good name. --Anatoli (обсудить/вклад) 00:23, 6 June 2014 (UTC)
Could you add single-character entries like WITH definitions to Category:Chinese hanzi as well, pls? --Anatoli (обсудить/вклад) 04:38, 6 June 2014 (UTC)
Added. Wyang (talk) 04:41, 6 June 2014 (UTC)

Template:zh-pron and Wade-Giles romanization

I'm adjusting okay so far to the new unified Chinese formatting for the most part. One thing I'm not seeing in Template:zh-pron is the option for adding the older Wade-Giles romanization for Mandarin. Wade-Giles isn't used as much anymore but is still found frequently in older texts and is still preferred by many Chinese linguistics experts in academia. It'd be really good to have the ability to automatically (or manually) convert from Hanyu pinyin to Wade-Giles. This page is good for showing most of the conversions from Hanyu pinyin to Wade-Giles (doesn't use IPA charts, though). Bumm13 (talk) 03:24, 6 June 2014 (UTC)

It should be fairly easy to do. @kc kennylau: You might be interested in this too. I might get this started if no one takes the lead. Wyang (talk) 03:29, 6 June 2014 (UTC)
Any chance that Wade-Giles will be added to Template:zh-pron anytime soon? It's actually a big reason why I'm not spending more time converting topolect sections to the new "Chinese" formatting. Just curious. Bumm13 (talk) 19:20, 20 June 2014 (UTC)
(E/C)I am neutral on Wade-Giles. Like Gwoyeu Romatzyh, it's just another system to understand and support, making the Chinese pronunciation box larger. Well, we had Wade-Giles in Hanzi headers (for single-character entries) all the time, to avoid being accused of destroying it, we should probably keep it/add it but perhaps for single-character entries only(?), perhaps the same for Gwoyeu Romatzyh(?). @Kc kennylau: it must be an easy task for you? @Wyang, same thinking :)--Anatoli (обсудить/вклад) 03:33, 6 June 2014 (UTC)
I have done a draft for the py_wg function at Module:cmn-pron. All the rudimentary monosyllabic testcases work as expected, which I think is fairly sufficient for showing its robustness if it were to be applied solely to Hanzi entries. Please see if anything needs to be improved and enable it when it is deemed trustworthy. Wyang (talk) 05:47, 6 June 2014 (UTC)

@Bumm13: It's now enabled for Hanzi. Wyang (talk) 01:12, 27 June 2014 (UTC)
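As a rough illustration of the Hanyu Pinyin to Wade-Giles conversion discussed in this thread, here is a sketch that rewrites only the syllable initial. It is not the py_wg function from Module:cmn-pron; the name `wg_initial_only` is hypothetical, the input is assumed to be one toneless lowercase syllable, and finals are left untouched, so syllables whose final also differs between the two systems (for example -ong, -ian, or zi/ci/si) are deliberately out of scope.

```python
# Pinyin initial -> Wade-Giles initial. Two-letter initials come first so
# that "zh" is not mistaken for "z", nor "ch" for "c".
INITIALS = [
    ("zh", "ch"), ("ch", "ch'"), ("sh", "sh"),
    ("b", "p"), ("p", "p'"), ("d", "t"), ("t", "t'"),
    ("g", "k"), ("k", "k'"), ("j", "ch"), ("q", "ch'"),
    ("x", "hs"), ("z", "ts"), ("c", "ts'"), ("r", "j"),
]


def wg_initial_only(syllable):
    """Convert the initial of a toneless pinyin syllable to Wade-Giles.

    Limitation (by design of this sketch): the final is copied unchanged.
    """
    for pinyin_initial, wade_giles_initial in INITIALS:
        if syllable.startswith(pinyin_initial):
            return wade_giles_initial + syllable[len(pinyin_initial):]
    # Initials such as m, n, l, f, h, s and zero-initial syllables are shared.
    return syllable
```

With this, "bei" and "jing" become "pei" and "ching", and "tai" becomes "t'ai". A full converter would also need a table of finals and tone handling.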

Question re: copula in Korean

I'm curious if the copula 이다 (ida) might have undergone "n" deletion at some point in the distant past. Is there any chance that the negative 아니다 (anida) was originally composed of (a) + 니다 (nida), with negative prefix (an) originating as a contraction of 아니 (ani)?

There are interesting suggestions (such as in this slide deck by Bjarke Frellesvig about Old Japanese and earlier) that classical Japanese perfective auxiliary (nu) might have developed from an older copula, and that this might be the root of even modern particles like (ni). That and related discussions about the 未然形 (mizenkei, irrealis) conjugation got me wondering if there were any analogs in Korean, either in the formation of negatives by using a, or in the copula, and hence my question above.

(Incidentally, etyms 3 and 4 at (an) look completely indistinguishable to me... and on a different note, Chinese entries are looking pretty snazzy. :) )

TIA, ‑‑ Eiríkr Útlendi │ Tala við mig 17:24, 6 June 2014 (UTC)

Interesting! I love Bjarke Frellesvig's book "A History of the Japanese Language" and the presentation you linked to is very interesting as well.
There are four negatives in Korean: an ("not", 안, 아니, 아니다, 않다), mos ("cannot", 못, 못하다, 모르다), mal ("don't", 말다), and eps ("not have", 없다). I will post a detailed reply tonight when I have access to the Korean etymology books. Wyang (talk) 03:00, 10 June 2014 (UTC)

Hi Eirikr! Sorry for the delay...

Korean differs from Japanese in that its negative constructions cannot be done purely with endings, and require a combination of verbal endings and negative verb/adjective/adverbs (i.e. 생각지 않다 = 思わず), and is hence less agglutinative in morphology. To me, the first Korean negative series of an has the root form of (a)n, and I always thought this must be cognate with the n or z (< n-su) in the negative forms of Japanese verbs. There doesn't seem to be a negative-forming process by attaching a to the positive copula. The positive copula might eventually be a reduced form of 있다 (itda, “there is”), which in Middle Korean was ista ~ isita ~ sita and this might be related to Japanese aru.

There is a very interesting discussion in Lee Namdeuk's book 한국어 어원 연구 IV, Chapter 基礎 語彙의 語源과 比較 考察. He thinks that an and eps negatives in Korean are ultimately from the same source, and that the -n- negative in Korean is related to the Japanese n negative. Here is the original text: https://www.dropbox.com/s/ok868r50asv2yox/Lee.tar.gz.

P.S. The second and third Korean negative mos and mal may be distantly related to Appendix:Proto-Sino-Tibetan/ma.

Glad that the collaborative attempts to make entries snazzier seem to be working!

Cheers, Wyang (talk) 02:39, 11 June 2014 (UTC)

  • Thank you so much for the research and details!
So to make sure we're on the same page, it sounds like:
  1. KO ida did not undergo "n" deletion.
  2. By extension, KO itda did not undergo "n" deletion.
    I would be very grateful if you could confirm the above two, as that would help categorically rule out any connection between these and hypothetical JA copula nu (with inflected form ni).
    (As two tangential ideas, do you think KO i- in ida has any relation to JA i- in iru, classically rendered as wiru? I'm not aware of any phonological processes that might explain "w" deletion in Korean, but you certainly know more about that than I do. And do you have any thoughts on the apparent overlap between KO iss- in itda and PIE *h₁es- (to be), among other odd coincidental KO-PIE collisions?)
  3. KO negatives are historically indicated primarily by the consonant /n/, with the vowel /a/ being an incidental (or otherwise not important in conveying negativity), and the vowel /a/ in its negative capacity definitely not having anything to do with verb conjugation patterns.
    This is interesting as a possible point of real divergence, in that the Japanese negative nu could be analyzed as identical with perfective nu, provided one accepts the 未然形 (mizenkei, irrealis form) as a real feature of the language and not an artifact of some sort: [verb in irrealis == action that is incomplete or hasn't happened yet] + nu == [verb that has completed without happening] == [verb hasn't happened == negative of verb]. For the zu forms, Frellesvig postulates that this was a fusion between the 連用形 (ren'yōkei, continuative form) ni of root form nu + apparent adverbial complement su: /ni/ + /su/ > /nsu/ > /zu/. Have a look at slide 34 of the linked deck -- Frellesvig diagrammed this as “*ani-su”, and that ani is what got me thinking about possible KO connections. Ultimately though, I think the “a” there is just intended to convey the mizenkei. This /ni/ + /su/ matches Lee's notes for OJ on page 79, as much as I can read of them (thank you very much for that, though I regret that I can currently only make out some of the text -- I've really got to spend more time studying Korean). JA negative nashi mentioned on that same page by Lee could be analyzed as the mizenkei na of root form nu + adjectival suffix shi.
    I see on the bottom of page 79 and the start of page 80 that Lee equates this JA “n” element with the KO an element, as you mentioned. I'm still chewing on the JA; I have real trouble viewing JA “n” purely as a negative given the prevalence of affirmative meanings that can potentially be ascribed to this same root, such as modern naru, verbal auxiliary -nau, non-negative adjectival suffix -nai (as in 危ない (abunai), 少ない (sukunai), etc.), perfective nu, possibly even particle ni. I might be open to the possibility of two JA “n” roots that converged or collided somehow, but the semantics for such opposite meanings being expressed in the same sound leave me uncertain as to how that would happen. With the mizenkei verb conjugation stem providing necessary context, the overlap between affirmative and negative “n” meanings in JA can be explained. I know that some authors, Frellesvig apparently among them, have advanced the notion that the mizenkei is purely an historical artifact and not an underlying semantic feature of the language, but without reading their arguments, I can't see where that could be the case -- the mizenkei appears to be an integral feature since the earliest writings, and its semantics can explain a number of otherwise-weird constructions.
Anyway, I realize this is a lot, but if you're willing :), I'd greatly appreciate it if you could 1) confirm that I'm restating the numbered items correctly, and 2) share your thoughts on the rest of the above. I'm an incorrigible language geek, especially when it comes to figuring out how things are put together, so if this has exceeded your interest threshold, just let me know.  :) ‑‑ Eiríkr Útlendi │ Tala við mig 17:37, 11 June 2014 (UTC)
I'm an incorrigible language geek too, fortunately and hopelessly. :) I agree with the first two points, please let me get back to you on the rest later. In the meantime it might be useful to have a look at Lee's discussion on J. iru, wiru: https://www.dropbox.com/s/ak4xm4i0l42qozn/2014-06-14%2006.57.10.jpg. Wyang (talk) 00:11, 12 June 2014 (UTC)

My edit to and possible categories bug

It looks like after converting this article to "Chinese", there might be a bug with how categories are being displayed (it's trying to add to "Hakka nouns/verbs" (and Wu) when no Hakka or Wu readings are specified (plus the munged formatting of the phantom Wu nouns/verbs category links in general). Bumm13 (talk) 08:07, 7 June 2014 (UTC)

Thanks, think it's fixed. I have also expanded that entry slightly. Wyang (talk) 01:00, 10 June 2014 (UTC)

Unforeseen naming issue with pronunciation audio files (Template:zh-pron)

After converting the article to using "Chinese" and the new templates, I noticed that the current Mandarin pronunciation Ogg file has a name that it isn't expecting (and thus showing a red link). The expected name is zh-ēn.ogg but the actual Wikimedia Commons filename for that sound is at Zh-en.ogg instead. I expect that this issue will continue to show up for many other such sound files. :\ Bumm13 (talk) 08:25, 7 June 2014 (UTC)

@Bumm13: You can set the parameter ma to the name of the ogg file. --kc_kennylau (talk) 08:26, 7 June 2014 (UTC)
Oh, okay, thankfully that option is available. Thanks! Bumm13 (talk) 08:30, 7 June 2014 (UTC)
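The mismatch described above (an expected name with tone marks versus a Commons file without them) suggests one way to guess a candidate value for the ma parameter: strip the diacritics from the pinyin. The sketch below is only that, a guess generator; the function name `plain_audio_name` is hypothetical, and it does not check whether any such file exists on Commons. Note that it also reduces ü to u, which may or may not match how a given file was named.

```python
import unicodedata


def plain_audio_name(pinyin):
    """Build a tone-mark-free file name such as 'zh-en.ogg' from pinyin.

    Assumption: older audio files follow the pattern zh-<toneless pinyin>.ogg.
    """
    # NFD splits each accented letter into a base letter plus combining marks.
    decomposed = unicodedata.normalize("NFD", pinyin)
    # Keep only the base letters, dropping every combining mark.
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return "zh-" + stripped + ".ogg"
```

For the entry discussed here, `plain_audio_name("ēn")` yields "zh-en.ogg", which differs from the Commons name Zh-en.ogg only in the case of the first letter.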

Russian IPA, remaining Mandarin entries

Hi Frank,

I will get back to the Russian pronunciation appendix. Sorry for not doing much lately. Is that OK? I hope you won't lose interest :) I may not be able to describe assimilative palatalisation and gemination rules in good detail.

I have checked the remaining multi-syllabic Mandarin entries. All of them either lack pronunciation sections altogether or use some old-style non-standard pronunciation method. You could probably fix them with AWB and your bot. Could you do that when you have time, please? There are too many to do manually. --Anatoli (обсудить/вклад) 01:29, 10 June 2014 (UTC)

No worries - great works are not finished in a day :). I will be on the lookout for the testcase and talk pages, so please add anything that needs to be improved whenever you think of it.
I will modify the code and do the rest, when I have time. Wyang (talk) 03:18, 10 June 2014 (UTC)
I have fixed the last Mandarin noun - , which didn't have Pinyin inside {{cmn-noun}}. Will check other PoS. --Anatoli (обсудить/вклад) 05:34, 10 June 2014 (UTC)

Simplified and traditional scripts

Hi Wyang, I just had a quick question. Is it true that under the current formatting arrangements, there is no categorisation of simplified and traditional scripts? I just realised this may be the case. ---> Tooironic (talk) 07:59, 10 June 2014 (UTC)

Yes, there is no categorisation by script. Wyang (talk) 00:36, 11 June 2014 (UTC)
I personally don't miss this categorisation but I can imagine it won't be hard to introduce but without PoS separation. As previously agreed, 中國 and 中国 are now sorted the same way, by numbered pinyin. --Anatoli (обсудить/вклад) 00:42, 11 June 2014 (UTC)
Seems to be a bit of a shame to me. This kind of categorisation and its related data could be useful for both the average user and people who wish to make use of the data. ---> Tooironic (talk) 09:11, 11 June 2014 (UTC)
@Atitarev:, @kc kennylau:, @Jamesjiao: What is your opinion on this? Should we resurrect categorisation by script? It is easy to reliably generate Category:Chinese terms in simplified script, and equally easy to generate Category:Chinese nouns in simplified script (though not as reliably). I don't think it is worthwhile to split the topolect-specific categories. Wyang (talk) 00:02, 12 June 2014 (UTC)
I wouldn't resurrect SoP's split by traditional/simplified, just split ALL TERMS by traditional/simplified. No, just "Chinese", IMO. --Anatoli (обсудить/вклад) 00:05, 12 June 2014 (UTC)
I am still against sorting by pinyin. What makes Mandarin the official dialect? --kc_kennylau (talk) 04:23, 13 June 2014 (UTC)
I am against it too. We should use the sortkeys in Chinese categories. Wyang (talk) 04:25, 13 June 2014 (UTC)
What's the alternative, guys? I see a big issue with sorting, for example in topical categories (they quickly get out of hand, if |sort= is not specified). How are you going to sort Chinese entries, e.g. Category:Chinese nouns, if we drop pinyin sorting? Back to radical sort or by characters themselves? I'm not suggesting that Mandarin should overwhelm topolects but it's better to have some sorting key than nothing. I see that topolect categories are sorted by the appropriate romanisation but what if we decide not to split by Mandarin traditional/simplified, Cantonese traditional/simplified, etc.? --Anatoli (обсудить/вклад) 05:08, 13 June 2014 (UTC)
My preference is to sort Category:Chinese nouns by radical, and the topolects by romanisations. Wyang (talk) 05:24, 13 June 2014 (UTC)
My preference is numbered pinyin or to have alternative sorting. If radicals are chosen, it would be great to have a radical index on categories, as a minimum; otherwise finding a word in a list of thousands won't be possible. One could use Category:Mandarin nouns, of course. There's a table at the top of Category:Mandarin nouns in traditional script but it's no longer usable because the rs= value no longer exists. --Anatoli (обсудить/вклад) 05:46, 13 June 2014 (UTC)
The Chinese dictionary I have has an index of radicals at the beginning of the book and under each radical is a list of characters that incorporate the said radical ordered by the number of strokes of the phonetic element. The actual dictionary is ordered by pinyin from A to Z. JamesjiaoTC 22:55, 15 June 2014 (UTC)
It's technically not difficult to sort them automatically by radical - Module:zh/data has a sortkey function specifically for that purpose. Wyang (talk) 06:05, 13 June 2014 (UTC)
  • Query: Is there any way to add back-end routines to {{zh-pron}} that would add categorizations for each reading (i.e. topolect) as it's added? I haven't explored Lua enough to know if it's even possible, but what of code that could parse the page for POSes and pronunciations, and auto-generate the corresponding categories? ‑‑ Eiríkr Útlendi │ Tala við mig 06:34, 13 June 2014 (UTC)
    Sorry I haven't been very responsive lately... I don't quite understand what you meant above. {{zh-pron}} currently operates under that premise (it seems), generating the corresponding categories depending on the readings and PoS parameter value given. Wyang (talk) 04:19, 20 June 2014 (UTC)
Eirikr probably means using zh-pron to make e.g. "Cantonese nouns in traditional script", etc. I think we shouldn't split by topolects and PoS but that's only me. Just "Chinese terms in traditional script" and ...simplified would do but some people will disagree. --Anatoli (обсудить/вклад) 04:26, 20 June 2014 (UTC)
  • Hi, yes, that's more what I was trying to convey. I'm not much for using the Chinese on this site, but I could see some utility in being topolect-specific -- as a user, to find readings for terms in a specific topolect; and as an editor, in order to find those entries that might still need topolect data. ‑‑ Eiríkr Útlendi │ Tala við mig 18:15, 20 June 2014 (UTC)
Category:catboiler with sc might be relevant. —CodeCat 12:10, 21 June 2014 (UTC)

Request

I currently do not see any need for protecting Template:la-conj-3rd-no234, so can you please unprotect the template? Thanks in advance. --kc_kennylau (talk) 05:10, 11 June 2014 (UTC)

No problem, I have unprotected it for a week for you. Please let me know if that is not long enough or if you need to edit other templates. Wyang (talk) 05:14, 11 June 2014 (UTC)
Thanks. --kc_kennylau (talk) 06:19, 11 June 2014 (UTC)

Mandarin homophones

How can we add to the list of homophones for given pinyin readings? E.g. 財務 is missing 才悟 (才思穎敏,領悟力強). ---> Tooironic (talk) 09:09, 11 June 2014 (UTC)

@Tooironic: Template:cmn-pron/hom/cáiwù or click on the edit button on the top right hand corner of the box on 財務. --kc_kennylau (talk) 10:32, 11 June 2014 (UTC)

Mandarin translation not nested under Chinese[edit]

Hi Wyang, I've seen you working a lot on Chinese. If you have some spare time you could have a look at Wiktionary:Todo/Mandarin translation not nested under Chinese. Matthias Buchmeier (talk) 17:49, 11 June 2014 (UTC)

Thanks, I will do that. Wyang (talk) 23:45, 11 June 2014 (UTC)

糸 and 絲[edit]

Hi Frank,

I got a bit mixed up with the readings there on . The readings were originally using those for , I think. Could you verify, please? --Anatoli (обсудить/вклад) 00:27, 13 June 2014 (UTC)

Please see now. Wyang (talk) 05:53, 13 June 2014 (UTC)
Thank you, looks great. --Anatoli (обсудить/вклад) 06:04, 13 June 2014 (UTC)

Russian pronunciation[edit]

Hi Frank,

I have imported Appendix:Russian pronunciation/imported from the Russian Wiktionary. It's not a gospel and we don't have to follow it 100% as there are choices in IPA. Are there any points you would like me to translate for you? --Anatoli (обсудить/вклад) 13:12, 15 June 2014 (UTC)

Thanks, I will have a look. Wyang (talk) 23:53, 15 June 2014 (UTC)
It doesn't answer the question of the assimilative palatalization and gemination, though. Well, we can keep working on it and deal with problematic cases when they arise.
Could you add handling for some prefixes (they do cause gemination) - I will try to maintain the list. One of the currently failed tests: отдохну́ть (prefix: от-). --Anatoli (обсудить/вклад) 23:59, 15 June 2014 (UTC)

Yellow Link Deal[edit]

I add Pinyin and Jyutping readings to yellow links such as was one random word, and you or someone else fills in and corrects the blanks. Could that be a deal? --Lo Ximiendo (talk) 16:04, 15 June 2014 (UTC)

Added. Wyang (talk) 23:53, 15 June 2014 (UTC)
I came across fǎnmiàn and dǎomín. Maybe you could delete the words from your list of missing Chinese words? --Lo Ximiendo (talk) 14:14, 16 June 2014 (UTC)
Please update that page if you could. I haven't been providing care for that page for months... Wyang (talk) 00:40, 18 June 2014 (UTC)

Module:zh-usex[edit]

L333-L347 contains extraneous lines. What are you trying to do there? --kc_kennylau (talk) 08:49, 17 June 2014 (UTC)

Generate different indentation for quotations not following definitions, following definitions but inline, and following definitions and not inline. See the bottom of Wiktionary:Feedback for an example of in_notes. Wyang (talk) 12:44, 17 June 2014 (UTC)
You have an "if in_notes then" in L343 which overrides the "elseif in_notes then" in L336. By the way, how can I not make it a quotation? --kc_kennylau (talk) 14:16, 17 June 2014 (UTC)
L336 defines other_lines_indent and simp_indent, whereas L343 defines first_line_indent. What do you mean by a non-quotation? in_notes and inline? Wyang (talk) 23:51, 17 June 2014 (UTC)

Speedy deletion[edit]

Can you please clear Category:Candidates for speedy deletion thank you --kc_kennylau (talk) 11:51, 17 June 2014 (UTC)

I've deleted all the Category:cmn:-prefixed pages. Wyang (talk) 12:45, 17 June 2014 (UTC)
Please also delete all the pages in Category:cmn:List of topics that have no sub-categories or elements (I'm too lazy to tag them one by one). [Press Ctrl+F and search for "0 c, 0 e" to find them] --kc_kennylau (talk) 14:13, 17 June 2014 (UTC)
I've deleted all the empty categories and supercategories. Wyang (talk) 00:12, 18 June 2014 (UTC)
I've just restored a large number of deleted cmn categories that had subcategories- in fact, most of the category tree for cmn was obliterated, leaving redlinks all over the place.
There are many topical categories that are populated by {{context}}, so the only way to empty all cmn categories would be to get rid of either the context template or the "lang=cmn" parameter. Unless you do that, there will be non-empty cmn categories, which will in turn be categorized in the parent categories set by {{topic cat}} subtemplates.
Just so we're clear: a category that has subcategories is not empty, and shouldn't be deleted. Not every category is designed to directly contain entries- many are just for navigating between sister categories. If we're going to have cmn topical categories, we should have a category tree to link them together.
If you're going to delete a category, first empty and delete all of its subcategories (and sub-subcategories, etc.). Otherwise, leave it alone. Thanks Chuck Entz (talk) 02:55, 20 June 2014 (UTC)
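The rule above (a category may go only once it holds no entries and all of its subcategories are gone) can be sketched as a bottom-up walk. This is only an illustration: the function name, the dictionaries and the category names in the usage note are hypothetical stand-ins rather than real wiki data or any existing tool, and it assumes the category graph is a tree with no cycles.

```python
def deletable_categories(subcats, entry_counts, root):
    """Return the categories under `root` that are safe to delete, children first.

    subcats:      dict mapping a category to the list of its subcategories
    entry_counts: dict mapping a category to the number of entries it holds
    A category qualifies only if it holds no entries and every one of its
    subcategories qualifies as well.
    """
    order = []

    def visit(cat):
        # Visit every child (no short-circuit) so empty leaves are still listed.
        child_results = [visit(child) for child in subcats.get(cat, [])]
        ok = all(child_results) and entry_counts.get(cat, 0) == 0
        if ok:
            order.append(cat)
        return ok

    visit(root)
    return order
```

With a made-up tree in which "cmn:Animals" still holds three entries, only the empty sibling branch is returned; once the entries are removed, the whole tree comes back in an order that deletes children before parents.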
The whole category tree starting from Category:cmn:List of topics has been deleted. Wyang (talk) 03:42, 20 June 2014 (UTC)
That'll work... As much as I hate to see all my work in creating and then restoring the categories just evaporate like that, my only real problem was with deleting non-empty categories. As long as you make sure the categories are empty before you delete them, I can live with it. Sorry for the extra work! Chuck Entz (talk) 05:22, 20 June 2014 (UTC)

直观[edit]

I really struggled translating this concept into English, was wondering if you had any suggestions? ---> Tooironic (talk) 02:16, 26 June 2014 (UTC)

Audio-visual, intuitive, self-evident? I have expanded that entry. Wyang (talk) 23:51, 26 June 2014 (UTC)
Many thanks. ---> Tooironic (talk) 13:12, 4 July 2014 (UTC)

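Deletion request[edit]

Please delete all pages in the Cantonese homophone templates, thanks. --kc_kennylau (talk) 10:06, 27 June 2014 (UTC)

All deleted. Separately, regarding the discussion on the palatalised series of initials in Guangzhou Cantonese: if there are no objections, could the previous version be restored? Thanks. Wyang (talk) 10:55, 27 June 2014 (UTC)
Modified according to pages 41 and 42 of the presentation. --kc_kennylau (talk) 12:42, 27 June 2014 (UTC)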

Edit request: Module:ja-headword[edit]

Please semi-protect for a day so that I can edit. Thank you in advance. --kc_kennylau (talk) 01:53, 28 June 2014 (UTC)

@Kc kennylau: Unprotected, pls let me or Wyang know when done. --Anatoli (обсудить/вклад) 03:41, 28 June 2014 (UTC)

@Atitarev: Done. Thank you. --kc_kennylau (talk) 14:27, 28 June 2014 (UTC)
Protected again. Thank you for the edit. --Anatoli (обсудить/вклад) 14:39, 28 June 2014 (UTC)
Is full protection really necessary? —CodeCat 14:40, 28 June 2014 (UTC)
Not sure, if you ask me. I only restored the previous protection. --Anatoli (обсудить/вклад) 15:18, 28 June 2014 (UTC)
The page history shows that User:Haplology is the user who fully protected it. Pinging him/her to here. --kc_kennylau (talk) 17:15, 28 June 2014 (UTC)
  • Hey folks, I noticed that entries such as 檳榔, that need both hira and kata specified, wind up getting two romaji listings in the headline. That doesn't seem quite right... ‑‑ Eiríkr Útlendi │ Tala við mig 01:13, 29 June 2014 (UTC)

@Kc kennylau: Unprotected again. Please fix. @Eirikr: thanks for letting us know. --Anatoli (обсудить/вклад) 02:52, 29 June 2014 (UTC)

@Atitarev, Eirikr: Done. --kc_kennylau (talk) 08:38, 29 June 2014 (UTC)
  • Hmm, I just found that now 鮎#Japanese isn't showing any romaji at all -- specifically for the first etym noun sense, where the あゆ reading is supplied as an unnamed positional parameter and the アユ reading is supplied as the named kata= parameter. ‑‑ Eiríkr Útlendi │ Tala við mig 22:56, 14 July 2014 (UTC)
  • I have edited it to make romaji display on . Hopefully it didn't cause romajis to blossom elsewhere! Wyang (talk) 00:00, 15 July 2014 (UTC)
Sorry! How about now? Wyang (talk) 12:08, 15 July 2014 (UTC)
  • Herp-a-derp on my part -- it never occurred to me to look at the katakana string itself as provided to the template. Thank you for that!  :) ‑‑ Eiríkr Útlendi │ Tala við mig 16:40, 15 July 2014 (UTC)
No worries :) Wyang (talk) 23:40, 15 July 2014 (UTC)

需要[edit]

This entry apparently got accidentally scrambled by an AWB edit of yours in the heat of the topolect merger, and it's been reverted to the edit previous to that by a contributor. I thought you might want to take a look at it. Chuck Entz (talk) 04:22, 30 June 2014 (UTC)

Fixed. --Anatoli (обсудить/вклад) 04:43, 30 June 2014 (UTC)

Module error[edit]

Caused by this edit. Please fix it. —Mr. Granger (talkcontribs) 00:34, 4 July 2014 (UTC)

Actually, no. Kephir's last edit to Module:och-pron did that. The edit you're referring to just switched Old Chinese on so the module could choke on it. Chuck Entz (talk) 00:39, 4 July 2014 (UTC)
Fixed. --kc_kennylau (talk) 00:53, 4 July 2014 (UTC)
Almost. There are still 9 entries with a variation on the same error that don't respond to null edits. Thanks for the other 111 entries, though. Chuck Entz (talk) 01:10, 4 July 2014 (UTC)
I created many irrelevant errors when doing a major edit on Module:yue-pron, which, by the way, is fixed now. --kc_kennylau (talk) 01:40, 4 July 2014 (UTC)
I know about those- they went away after null edits- but these are in och-pron, not yue-pron (in several cases, an entry had both, a few lines apart from each other). Chuck Entz (talk) 02:05, 4 July 2014 (UTC)
Not sure if I had accidentally fixed it in my sorting of Module:zh/data/och_pron, but Category:Pages with module errors is empty now. Wyang (talk) 04:30, 4 July 2014 (UTC)
It's hard to be sure of anything, with all the edits cycling through the edit queue. After I read your comment, I checked and saw 55 entries in the category. In the time it took me to do a quick null edit on one entry, it was empty again. Still, I haven't seen anything that displayed a module error on the page or that survived a null edit, so I think we're out of the woods on this one- for now, anyway. Thanks! Chuck Entz (talk) 04:47, 4 July 2014 (UTC)

Taiwan pronunciation for 縛[edit]

If possible could you update the automatic template so that the Taiwan variant pronunciation for 縛 as fú is added? See more at the page for 綁縛 and here too. Thanks. ---> Tooironic (talk) 13:10, 4 July 2014 (UTC)

@Tooironic: Done. See 綁縛 / 绑缚 --Anatoli (обсудить/вклад) 13:47, 4 July 2014 (UTC)
Thanks muchly! ---> Tooironic (talk) 22:02, 6 July 2014 (UTC)

坉[edit]

Hi! Can you shed any light on WT:RFV#坉, and on whether or not 坉 can mean "water that does not recede and cannot be diverted"? - -sche (discuss) 03:23, 6 July 2014 (UTC)

Kenny has commented there. The sense is easily attested. Wyang (talk) 23:30, 6 July 2014 (UTC)
Thanks, both of you! - -sche (discuss) 03:05, 7 July 2014 (UTC)

Remaining cmn-nouns and other Mandarin PoS[edit]

Hi Frank,

Do you still have any tricks for the remaining [2] (and other PoS), multisyllabic, at least? --Anatoli (обсудить/вклад) 03:27, 10 July 2014 (UTC)

Most of the >1-syllable Hanzi words there are done now... Wyang (talk) 07:08, 10 July 2014 (UTC)
Thanks a lot! I was a bit bored converting them manually :) However, there is still a list of verbs, phrases and quite a lot of proverbs and idioms. --Anatoli (обсудить/вклад) 11:06, 10 July 2014 (UTC)
Verbs are done. --Anatoli (обсудить/вклад) 04:50, 11 July 2014 (UTC)
Thanks! I did a number of phrases and idioms yesterday, and will have a look later. Wyang (talk) 09:07, 11 July 2014 (UTC)
Frank, pls put on your to-do list finishing those pesky cmn templates, e.g. proper nouns [3], idioms, proverbs, etc. :) It just seems much easier for you with AWB. If any of them are hard to do because of bad formatting, I'll finish manually. I have a question: how would you write IPA for 三Q? Not sure how to convert it to the new format. --Anatoli (обсудить/вклад) 00:43, 15 July 2014 (UTC)
Not sure if you missed my request. Just need to know if I need to continue to do them manually, they are not very interesting but need to be done. :) --Anatoli T. (обсудить/вклад) 07:03, 21 July 2014 (UTC)
Sorry! I missed your message earlier. There is no need to do them manually since time would be better spent on other tasks, although on the other hand I haven't been very free lately... I reckon we should disable Zhuyin, IPA etc. if the entry title contains non-Chinese characters. Wyang (talk) 23:35, 21 July 2014 (UTC)
That's OK, whenever you have time, just wanted to make sure you read my message. Thanks. :) Re: IPA, Zhuyin, 卡拉OK may get hits in Zhuyin, besides, users may want to know how Chinese pronounce those words but it's too hard, well... --Anatoli T. (обсудить/вклад) 23:47, 21 July 2014 (UTC)

Edit request: Module:ja-pron[edit]

Created an entry for 合期. This has a rare alternate reading of gaggo, for which {{ja-pron}} has produced the unlikely IPA of [ga̠k̚g̃o̞]. So far as I know, geminate "g" sounds in Japanese (rare as they are) never manifest this way. Could someone look into this? ‑‑ Eiríkr Útlendi │ Tala við mig 19:14, 10 July 2014 (UTC)

Good point. I thought the Japanese are unable to pronounce voiced geminates and thus bed and bet would end up basically identical (even though written differently), which is why I devoiced the first part of the geminate with no audible release (probably also influenced by the limited distribution of checked tone to voiceless codas in Chinese). I have changed them to truly voiced geminates, and added a voicelessness sign, and removed the optional nasalisation of g in gg. Wyang (talk) 09:13, 11 July 2014 (UTC)

吃貨[edit]

According to the Taiwan dictionary, 吃貨 can also mean 股票術語,指做手於低價時不動聲色的買進股票. Do you know what the English equivalent would be? I'm lost. ---> Tooironic (talk) 23:27, 10 July 2014 (UTC)

The act of quietly accumulating shares of stock by traders when the stock is at a lower price? Would it sound too literal? Wyang (talk) 09:18, 11 July 2014 (UTC)

Middle Chinese[edit]

Now that {{zh-pron}} includes Middle Chinese pronunciation info, what should be done with these 275 Middle Chinese entries that were discussed in the BP in January (list)? Can the bizarrely-annotated, half-hidden pronunciation information be removed from the ==Middle Chinese== sections of those entries now, once the entries are made to use {{zh-pron}}? (If the info isn't removed, I'd like to standardize the wording and make it visible, like this.) - -sche (discuss) 02:05, 11 July 2014 (UTC)

I parsed through your list. 41 articles are gone and here is the updated version:
I would just leave them as they are as the Chinese merger is actively ongoing and they would probably be gone in a year. Unless someone wants to decimate them now... Wyang (talk) 09:46, 11 July 2014 (UTC)
OK; I have no problem leaving them as-is for now, as long as something will be done with them in the long term. Cheers, - -sche (discuss) 18:43, 14 July 2014 (UTC)

Minor issues in Template:zh-pron[edit]

Hi Wyang,

The unified "Chinese" with Template:zh-pron is working quite nicely for the most part. I have found a few small errors that eventually will need to be fixed regarding romanization readings. For Mandarin (Wade-Giles), the pinyin "gui" is showing as "kui" when it should be showing "kuei". Example: .

There are also two relatively minor Min Nan romanization issues. Going from Peh-oe-ji to Tai-Lo (in both cases), the Tai-Lo -eh (as in "ngeh") is supposed to correspond to POJ -oeh (as in "ngoeh"), at least according to the sources I checked against. The template is changing POJ -oeh to -ueh in Tai-Lo. Example: .

Also, the -o͘ suffix in Peh-oe-ji corresponds to -oo in Tai-Lo but is showing up as if they are the same. Example: . Other than that, everything looks great so far. Keep up the good work! :) Bumm13 (talk) 20:54, 11 July 2014 (UTC)

@Bumm13: Thanks for your kind reminder. However, according to Wikipedia, in Tai-lo ngueh is correct, and in Wade-Giles kui is correct. Can you kindly state your source? Thanks once again. --kc_kennylau (talk) 09:54, 12 July 2014 (UTC)
For the pinyin "gui" conversion, here are two good sources: [4] and [5]. These are both university library sources (Hong Kong University of Science and Technology and the University of Chicago, respectively). The former is actually in China, while the latter is basically the equivalent of an Ivy League institution in the United States. I'll have to get back to you on the "ngeh" Min Nan issue. My sources for that one are (admittedly somewhat weak) the Open Dictionary Network - Min Nan Dictionary (kaifangcidian.com) for Peh-oe-ji compared with the Taiwan Min Nan Common Words Dictionary (based in Taiwan at twblg.dict.edu.tw - Taiwan Ministry of Education) for Tai-Lo. Bumm13 (talk) 17:10, 12 July 2014 (UTC)
I've fixed the kuei conversion. --kc_kennylau (talk) 17:18, 12 July 2014 (UTC)
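For readers following the gui/kuei point, here is a minimal sketch of the special case. It is not the code in the module: the function name and the tiny initials table are made up for illustration, tones are ignored, and only a handful of pinyin syllables ending in -ui are covered. It assumes the usual Wade-Giles convention that the final is spelled -uei after the velar initials k and k' and -ui elsewhere.

```python
# Pinyin initial -> Wade-Giles initial (deliberately tiny, illustrative table).
INITIALS = {"g": "k", "k": "k'", "h": "h", "d": "t", "t": "t'"}


def wade_giles_ui(syllable):
    """Convert a toneless pinyin syllable ending in -ui to Wade-Giles."""
    if not syllable.endswith("ui"):
        raise ValueError("only syllables ending in -ui are handled here")
    initial = syllable[:-2]
    if initial not in INITIALS:
        raise ValueError("initial not in the illustrative table: " + initial)
    # Pinyin g/k (Wade-Giles k/k') take -uei; the other initials keep -ui.
    final = "uei" if initial in ("g", "k") else "ui"
    return INITIALS[initial] + final
```

Under these assumptions, wade_giles_ui("gui") gives "kuei" while wade_giles_ui("dui") gives "tui".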

Reference templates[edit]

Can additional characters be "exploded" in Template:R:xcl:AG? Such as “a”, “b”, “f” and the comma. --Vahag (talk) 22:31, 12 July 2014 (UTC)

Like this? Wyang (talk) 22:49, 12 July 2014 (UTC)
Yes, exactly like that, thank you. --Vahag (talk) 22:58, 12 July 2014 (UTC)

Hello again. Is there an expression that can easily convert the parameter {{{vol}}} in Roman numerals (I, II, ..., X) into Arabic numbers (1, 2, ..., 10) in Template:R:xcl:HAB, in the &volume={{{vol}}} part? --Vahag (talk) 08:14, 28 July 2014 (UTC)

Nope. You have to write a module. Keφr 08:17, 28 July 2014 (UTC)
…which I just wrote. "Ungoliant {{#invoke:foreign numerals|from_Roman|MMDCCLXIV}}" gives: "Ungoliant 2764". Keφr 08:36, 28 July 2014 (UTC)
Thanks! Any module to convert User:BD2412 and User:msh210 into human language? --Vahag (talk) 09:24, 28 July 2014 (UTC)
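For anyone curious how such a Roman-to-Arabic conversion works, here is a rough Python sketch of the standard right-to-left approach. It is not the code of the module mentioned above (whose implementation is not shown in this thread); the names are illustrative, and no validation of malformed numerals is attempted.

```python
ROMAN_VALUES = {"I": 1, "V": 5, "X": 10, "L": 50, "C": 100, "D": 500, "M": 1000}


def from_roman(numeral):
    """Convert a well-formed Roman numeral such as 'XIV' to an integer."""
    total = 0
    previous = 0
    # Read right to left: a symbol smaller than its right-hand neighbour
    # is subtracted (the I in IV), anything else is added.
    for symbol in reversed(numeral.upper()):
        value = ROMAN_VALUES[symbol]  # raises KeyError on non-Roman characters
        if value < previous:
            total -= value
        else:
            total += value
        previous = value
    return total
```

Called with the numeral from the example above, from_roman("MMDCCLXIV") returns 2764, and from_roman("X") returns 10, which covers the I to X volume range asked about.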

[edit]

In Appendix:Proto-Sino-Tibetan/p(r)an/t ~ b(r)an/t, what does the "greater than overlapping less than" symbol signify? - -sche (discuss) 18:43, 14 July 2014 (UTC)

Allofamic variants. Also, the slash between 1 and 2 represents two alternative tone categories for the first allofam, not two reconstructions. Wyang (talk) 23:25, 14 July 2014 (UTC)

人事[edit]

We are missing about four extra senses here. Don't suppose you'd be interested in taking a stab? I'm busy with something else at the moment. ---> Tooironic (talk) 11:53, 16 July 2014 (UTC)

Anatoli and I have expanded the entry. Wyang (talk) 00:47, 17 July 2014 (UTC)
Looks fantastic, thanks for this! ---> Tooironic (talk) 09:42, 26 July 2014 (UTC)

Issue with multiple audio files ( article)[edit]

I tried adding a second Mandarin pronunciation .ogg file entry to Template:zh-pron in the article and nothing I've tried seems to work in causing the second file's click play button thing to show up in my browser(s). Could you check the article to see if I'm doing something wrong? I have both Mandarin readings in the "m=" parameter, so I would think the audio files would show up without a lot of effort. Bumm13 (talk) 08:41, 21 July 2014 (UTC)

It could be generated by putting ,2a=y in the |m= field, please see what I did. Ideally the pronunciations should be split since they have alternative etymologies, as in or , but for short articles like 教, extra parameters like ,2a=, 3a=, and 4a= are available for use. Wyang (talk) 23:33, 21 July 2014 (UTC)

뿌셔뿌셔[edit]

This looks real, but obviously messy, and I am not sure if this is a brand name, in which case it would need to pass WT:BRAND. (The same IP has also added prigle, for what it may be worth.) Can you take care of this? Keφr 06:38, 26 July 2014 (UTC)

It is the Korean equivalent of uncooked ramen noodles. I've made some changes there. Wyang (talk) 11:58, 27 July 2014 (UTC)

隔膜[edit]

Is there really a Taiwanese variant pronunciation of gémò? This does not seem to be supported by 國語辭典. ---> Tooironic (talk) 09:40, 26 July 2014 (UTC)

It doesn't seem to be consistent - see 橫膈膜. I've changed the tag to "variant in Taiwan". Wyang (talk) 11:43, 27 July 2014 (UTC)


Dzongkha (རྫོང་ཁ) data[edit]

Hi Wyang

Thanks for the response in the Beer Parlour. We would like to contribute the data for Dzongkha dictionaries. I think it will need someone familiar with Wikimedia software and something like Python to convert and import this data - unfortunately I don't have those skills. Any help would be appreciated. CFynn (talk) 13:06, 31 July 2014 (UTC)

@CFynn: Hi Chris! It's great to have you here. I wrote a module for transliterating Tibetan/Dzongkha a while ago (Module:bo-translit) and have written a few simple Python scripts for either retrieving or uploading data from/to Wiktionary. I have used the dictionary at dzongkha.gov.bt a couple of times, and was impressed by how well-organised the website is. For creating an entry of a word in Wiktionary, we need two pieces of information about the word: the definition and part of speech. I am more than glad to help out if you have any questions. Thanks, Wyang (talk) 23:26, 31 July 2014 (UTC)
@Wyang: Hi. We have XDXF (XML) files of the dictionaries which are probably the easiest format to deal with. The Dzongkha-English dictionary has part of speech and English definition and sometimes a Dzongkha synonym. I think this could be used to make the basic entries. There are separate files which list verb forms (past, present, future) and honorific forms of words - which might be added on top. The English-Dzongkha dictionary has English word, part of speech, Dzongkha definition(s). The Dzongkha-Dzongkha dictionary is just word+definition with separate field for part of speech - though this information is often embedded within the definition.
The Tibetan-Dzongkha dictionary has Word+Definition with part of speech within square brackets as the first part of the definition which should be easy to extract. Sometimes the square brackets also contain a code indicating the head word is Sanskrit in Tibetan script or an archaic form. The Dzongkha-English and English-Dzongkha dictionaries are clearly going to be the easiest to deal with. The differences in format are due to the fact that these were originally compiled by different people at different times using only a word processor - not even a database. At that time people were only concerned about print publication. 07:17, 5 August 2014 (UTC)

In the XDXF files Dzongkha-English entries look like this:
<ar><k>ཀྲུམ་ཀྲུ</k> <def> noun cartilage (པགས་ཀོ་ཧྲབ་ཧྲོབ།) adj. crisp, crunchy, gristle (ཕྲུམ་ཕྲུམ།)</def> </ar>

<ar><k>ཀྲེག</k> <def> <pos>verb</pos>( fut., prs., pst., imp.) scratch, cross out</def> </ar>

<ar><k>ཀྲེག་ཀྲེགཔ</k> <def> <pos>adj.</pos>shaven</def> </ar>

<ar><k>ཀྲེག་ཆས</k> <def> <pos>noun</pos>abrasive, scraper, shaver</def> </ar>

English-Dzongkha like this:
<ar><k>A</k> <pos>n:</pos> ༡ ཨིང་ལིཤ་གི་ཡི་གུ་དང་པ། ༢ སྡེ་ཚན་ཀ་པ། སྡེ་རིམ་ཀ་པ། ༣ དྲག་ཤོས།</ar>

<ar><k>a</k> <pos>ia:</pos> ཅིག ཞིག ཤིག གང༌།</ar>

<ar><k>aardvark</k> <pos>n:</pos> གྱོག་དོམ། གྱོག་མོ་ཟ་མིའི་དོམ།</ar>

<ar><k>aard-wolf</k> <pos>n:</pos> འཕརཝ། ཨ་ཕི་རི་ཀ་ལུ་ཡོད་པའི་འཕརཝ་གི་རིགས་ཅིག</ar>

<ar><k>aback</k> <pos>adv:</pos> དཔྱད་རིག་རྣམ་རྟོག་མེད་པར། དཔྱད་རིག་མེད་སི་སི་སྦེ།</ar>

- CFynn (talk) 07:35, 5 August 2014 (UTC)
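As a rough idea of how the Dzongkha-English records above could be read, here is a small Python sketch using the standard library's xml.etree.ElementTree. The enclosing <xdxf> root element, the function name and the output field names are assumptions made for the example; the real files may be wrapped differently. Records without a <pos> element, like the first sample above, keep the whole <def> text as the gloss.

```python
import xml.etree.ElementTree as ET

# Two records copied from the sample above, wrapped in an assumed root element.
SAMPLE = """<xdxf>
<ar><k>ཀྲེག་ཀྲེགཔ</k> <def> <pos>adj.</pos>shaven</def> </ar>
<ar><k>ཀྲེག་ཆས</k> <def> <pos>noun</pos>abrasive, scraper, shaver</def> </ar>
</xdxf>"""


def parse_entries(xml_text):
    """Return one dict per <ar> record: headword, part of speech, gloss."""
    entries = []
    for ar in ET.fromstring(xml_text).iter("ar"):
        headword = ar.findtext("k", default="").strip()
        definition = ar.find("def")
        if definition is None:
            continue  # e.g. English-Dzongkha records, which have no <def>
        pos_el = definition.find("pos")
        if pos_el is not None:
            pos = (pos_el.text or "").strip()
            gloss = pos_el.tail or ""  # the text that follows </pos>
        else:
            pos = None
            gloss = definition.text or ""
        entries.append({"headword": headword, "pos": pos, "gloss": gloss.strip()})
    return entries
```

Each resulting dict carries the two pieces of information mentioned earlier as necessary for creating an entry: the part of speech and the definition.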

@CFynn: Thanks for the reply. The xml file easiest to use here would be the Dzongkha-English dictionary, as well as the additional files of verb conjugation and honorifics. The English-Dzongkha dictionary can only be used here for adding to translation tables (eg. aardvark), which is less straightforward (multiple translation tables, linking to components in translations). Embedding of part of speech in the definition shouldn't be too much of a problem. Would there be anything to take care of in terms of copyright (referencing) and externally linking to the dzongkha.gov.bt website? It seems it's not a very difficult task, and we can get started on this soon. Wyang (talk) 23:22, 5 August 2014 (UTC)
OK. I'll get the latest versions of those files together and post a link here to the files. If Wikimedia need an official letter saying the data is released under CC-BY-SA 3.0 + GFDL I can get the Secretary of the DDC to write one and we can fax it or send it by snail mail if you can tell me where and to whom this should be sent. Is there some kind of standard release form? A note or references saying the Dzongkha data is from the DDC and a link to their website http://dzongkha.gov.bt/ would be nice. (BTW PDF versions of all the dictionaries are available on the DDC site.) CFynn (talk) 04:28, 6 August 2014 (UTC)
Thank you. References containing link to the website would be appended to all entries. Here is the Wikipedia policy on donating copyrighted information: w:Wikipedia:Donating copyrighted materials#Granting us permission to copy material already online. Copyright is usually less of a concern at Wiktionary, since the material involved is generally short in length and not of innovative nature. If we want to be safe, we could request that the Secretary send a brief email declaring permission to use DDC data. Wyang (talk) 04:49, 6 August 2014 (UTC)
OK - this may take me a few days as I'm recovering from a minor operation on my foot and it is a little difficult for me to get around. CFynn (talk) 21:00, 6 August 2014 (UTC)
OK, no worries. Wyang (talk) 23:20, 6 August 2014 (UTC)

Dzongkha data[edit]

OK I've posted the XDXF dictionary files here: https://drive.google.com/file/d/0B18TCYaFI8CNMVpEaTl3akplMVE/edit?usp=sharing.

I forgot that this data was already available at https://code.google.com/p/dzongkha-dictionaries/source/browse/ - under CC-BY-SA 3.0 license.

PDF copies of the printed versions of these dictionaries can be found at

CFynn (talk) 17:11, 12 August 2014 (UTC)

BTW you may need to slightly modify your Tibetan transliterating tool for some Dzongkha entries. Dzongkha syllables sometimes contain a second root which does not occur in Tibetan. This mostly happens when the tseg between syllables is dropped to reflect Dzongkha pronunciation. e.g. Tibetan བླ་མ་ (bla ma / Lama) = Dzongkha བླམ་ (blam / Lam). About 12 years ago when I was working on Tibetan & Dzongkha collation I compiled a spreadsheet which shows all the possibilities for a second root in a Dzongkha syllable which might be useful to you. I'll try and find it and post a link. CFynn (talk) 17:29, 12 August 2014 (UTC)

A minor tweak for zh-pron[edit]

At , I just fixed an edit where spaces before the commas in the cat= part caused the module not to recognize the POS abbreviations. I found out about it from an entry in Special:WantedCategories for Category:Hakka pron (I would highly recommend regularly checking Special:WantedCategories for non-catastrophic module errors- it updates every 3 days). Is it too much trouble to have the module allow for whitespace in arguments to avoid this in the future? Thanks. Chuck Entz (talk) 18:42, 1 August 2014 (UTC)

Done now (spaces after commas). Wyang (talk) 00:36, 4 August 2014 (UTC)
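The kind of whitespace tolerance requested here amounts to trimming each item after splitting on commas. A minimal Python sketch follows; the function name and the abbreviations in the usage note are hypothetical, and the module's actual handling may differ.

```python
def parse_cat(value):
    """Split a comma-separated cat= value, ignoring spaces around the commas."""
    # strip() removes whitespace on both sides; empty items are dropped.
    return [item.strip() for item in value.split(",") if item.strip()]
```

For instance, parse_cat("n , v,adj") yields ["n", "v", "adj"], so stray spaces on either side of a comma no longer change which abbreviations are recognised.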

吐槽[edit]

Thanks for some great edits recently. My understanding of 吐槽 was it was more like "whinge", what do you think? ---> Tooironic (talk) 03:50, 5 August 2014 (UTC)

Thanks, I have added it. Wyang (talk) 04:06, 5 August 2014 (UTC)
Fantastic. Our Chinese coverage has been improving leaps and bounds recently. ---> Tooironic (talk) 00:27, 6 August 2014 (UTC)

社会工作[edit]

According to my C-C Dictionary, this term can mean both social work and 指以安定人民生活、协调人际关系和维持社会秩序为主要目的的各种为人民大众谋福利的工作. Do you think the latter could be defined as "community service"? ---> Tooironic (talk) 01:19, 7 August 2014 (UTC)

Yes, absolutely. Wyang (talk) 01:20, 7 August 2014 (UTC)

Module errors in cmn-pron[edit]

The erhua code you wrote seems to have introduced some module errors in four Chinese entries. Take a look at Category:Pages with module errors.

Benwing (talk) 12:06, 10 August 2014 (UTC)

Fixed - they were using parameters not defined in the original set. Wyang (talk) 23:20, 10 August 2014 (UTC)

碍眼[edit]

Wasn't sure how to word the second meaning here. My C-C dictionary defines it as 因某人在眼前而感到不方便。例如:他俩在说悄悄话,我们呆在这儿很碍眼。Any ideas? ---> Tooironic (talk) 09:29, 16 August 2014 (UTC)

Hmm, (of someone's presence) to make others feel inconvenient or uncomfortable? Wyang (talk) 23:37, 17 August 2014 (UTC)

Module errors in cdo-pron[edit]

fyi, in case you hadn't noticed- 5 entries affected. Chuck Entz (talk)

Thanks, fixed. Wyang (talk) 05:31, 25 August 2014 (UTC)

條毛/条毛[edit]

Do you have a better definition? Thank you in advance. --kc_kennylau (talk) 00:16, 26 August 2014 (UTC)

Not really, the definition summarises the meaning well. Wyang (talk) 00:20, 26 August 2014 (UTC)
I think the definitions in 撞邪, 撞鬼, 見鬼 and 见鬼 are not quite accurate. Do you have a better definition? --kc_kennylau (talk) 11:20, 26 August 2014 (UTC)
I would say "1) to be absurd; preposterous; 2) to be down on one's luck; 3) to go to hell; to hell with ...; 4) damn; damn you; for Christ's sake" for these words. Wyang (talk) 11:31, 26 August 2014 (UTC)

歲數 / 岁数[edit]

Just noticed an error in the Mandarin pinyin here. I've fixed it now. Is the Cantonese correct? ---> Tooironic (talk) 12:50, 26 August 2014 (UTC)

@Tooironic: Yep. --kc_kennylau (talk) 13:14, 26 August 2014 (UTC)

Template:zh-new[edit]

Would it be better to pass no parameter to the module and let the module get all the parameters from the parent? --kc_kennylau (talk) 18:01, 26 August 2014 (UTC)

Yes, I was thinking about the same thing when I was adding extra parameters. Wyang (talk) 22:46, 26 August 2014 (UTC)

metaevolution revert[edit]

Is there a reason you restored content that was placed in the wrong section, and has been removed in the past via rfc and rfd? Chuck Entz (talk) 21:49, 30 August 2014 (UTC)

Ah, I see you reverted yourself... Never mind. Chuck Entz (talk) 21:52, 30 August 2014 (UTC)
Sorry, please excuse my, err, fatty fingers... Wyang (talk) 10:44, 31 August 2014 (UTC)

Empty categories[edit]

Hi Frank,

Could you add one or two verbs belonging to Category:Korean h-irregular verbs and Category:Korean si-irregular verbs, which are now empty as well, please? :) --Anatoli T. (обсудить/вклад) 03:38, 3 September 2014 (UTC)

On this site, it seems all ㅎ-irregular verbs are ... adjectives. Is that right? --Anatoli T. (обсудить/вклад) 04:12, 3 September 2014 (UTC)
I have expanded the part on si-irregular in the code, and it is non-empty now. Yes, all h-irregular verbs are adjectives. Wyang (talk) 04:27, 3 September 2014 (UTC)
Thanks. I see you have deleted the cat. --Anatoli T. (обсудить/вклад) 04:31, 3 September 2014 (UTC)

About 枯萎[edit]

I saw that you edited the article 枯萎 two weeks ago. How did you do that? I tried to show the Taiwan pronunciation by typing the character on the pronunciation section, but it resulted in a module error. Please explain so that I can do the same thing for other articles, thank you. --Mar vin kaiser (talk) 06:35, 3 September 2014 (UTC)

Hi, Mar vin kaiser. I'll try to answer, as I am also using this method now. It's not so straightforward. The module Module:zh/data contains the following lines:
['萎']={'wěi','wēi'}
['萎']={'Mainland','Taiwan'}
So, for any term containing this character, the |m= parameter should give the character itself in place of its pinyin, e.g. |m=kū萎 in this case. If a character with variant pronunciations (Mainland, Taiwan) is missing from the module, it needs to be added there. BTW, please add a babel to your user page. --Anatoli T. (обсудить/вклад) 06:47, 3 September 2014 (UTC)
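To illustrate the idea (not the module's actual Lua code), here is a Python sketch of how a value such as kū萎 could be expanded into one reading per region. The table layout, the region tuple and the function name are assumptions made for the example; only the readings wěi (Mainland) and wēi (Taiwan) come from the lines quoted above.

```python
# Illustrative stand-in for the variant data: character -> reading per region.
VARIANT_READINGS = {"萎": {"Mainland": "wěi", "Taiwan": "wēi"}}
REGIONS = ("Mainland", "Taiwan")


def expand_regional(m_value):
    """Replace each listed character in m_value with its regional reading."""
    return {
        region: "".join(
            # Characters without an entry (ordinary pinyin letters) pass through.
            VARIANT_READINGS.get(ch, {}).get(region, ch)
            for ch in m_value
        )
        for region in REGIONS
    }
```

With this table, expand_regional("kū萎") produces kūwěi for the Mainland and kūwēi for Taiwan, while a value containing no listed character comes back unchanged for both regions.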

기아 and 기근[edit]

Hi Frank,

What would be the correct format for Korean terms with variant hanja? Also, could you consider adding synonyms, etc. to Korean entry creation templates to match Chinese? --Anatoli T. (обсудить/вклад) 04:59, 9 September 2014 (UTC)

Hi Anatoli. Please see what I did in those entries. I have added syn= and ant= parameters. Wyang (talk) 11:28, 9 September 2014 (UTC)
Thank you! I may need to bug you more about ko/ja/vi templates :) --Anatoli T. (обсудить/вклад) 22:48, 9 September 2014 (UTC)

Module:vi-pron[edit]

That new version looks very nice. There are a few problems with it, though. Can you take care of them? Keφr 14:58, 27 September 2014 (UTC)

Thanks, they are all gone now. Wyang (talk) 11:17, 28 September 2014 (UTC)

Pronunciation[edit]

Hello, is the pronunciation [pəˈtæt] or [pəˈtaɪt]? 138.229.16.219 17:31, 28 September 2014 (UTC)

The latter. Wyang (talk) 23:17, 28 September 2014 (UTC)

Is the pronunciation [tɐɪt] or [teɪt]? 64.18.87.72 18:18, 30 September 2014 (UTC)

Does it sound like Cantonese 低? 64.18.87.72 19:12, 1 October 2014 (UTC)

The former. Somewhat. Wyang (talk) 02:20, 8 October 2014 (UTC)

解调[edit]

Hi Frank,

Is it jiětiáo, jiědiào or both? Various dictionaries give different pronunciations. --Anatoli T. (обсудить/вклад) 22:29, 5 October 2014 (UTC)

Hi Anatoli, it's jiětiáo. Wyang (talk) 02:14, 8 October 2014 (UTC)

Oddities in Template:ja-kanjitab[edit]

I dunno what's changed since I've been away, but {{ja-kanjitab}} doesn't seem to be correctly categorizing readings for the second kanji. Have a look at 海狸#Japanese for one such example -- the cats only show Category:Japanese_terms_spelled_with_狸, but not the expected Category:Japanese_terms_spelled_with_狸_read_as_り or Category:Japanese_terms_spelled_with_狸_read_as_たぬき.

Any ideas? And are you the right person to bring this to?  :) TIA, ‑‑ Eiríkr Útlendi │ Tala við mig 21:10, 7 October 2014 (UTC)

-- For that matter, 狸#Japanese doesn't show the expected cats either. Maybe it's just this particular character being parsed funny?

Hi. ja-kanjitab uses the "read as" for all grade 1-6, jouyou and jinmeiyou kanjis, as well as a proportion of hyougai kanjis (厭昌之芽昌浩智晃淳敦聡晃旭亮糊桂隘阿唖撫鼠阿耘迂寅已伊餡姦闊..., see Module:ja-kanjitab). I have added 狸 there. Not sure what the source of the exempted hyougai kanjis is... Wyang (talk) 02:34, 8 October 2014 (UTC)

輕ㄑ者[edit]

Hello Wyang. Could you verify that I have the right hanzi at poppyzon#References please? I tried to copy the characters from here. — I.S.M.E.T.A. 18:41, 9 October 2014 (UTC)

  • Correct me if I'm wrong, but I believe the second character here is the ditto mark, indicating that the character from the line above should be copied. As such, reading your link source, the term would presumably be fully spelled out as 輕哨者.
If that's correct, a lookup on MDBG suggests that this might mean something more like someone who whistles frivolously, rather than someone who smacks their lips. ‑‑ Eiríkr Útlendi │ Tala við mig 20:48, 9 October 2014 (UTC)
  • Eirikr is right, it is the iteration mark 〱. Thus the translation of poppyzon given in the book is 輕哨者, meaning "one who whistles lightly". Wyang (talk) 22:06, 9 October 2014 (UTC)
  • @Eirikr, Wyang: Thank you both. I'll go with "輕〱者 [sc. 輕哨者]". — I.S.M.E.T.A. 23:18, 9 October 2014 (UTC)

挫男[edit]

I had difficulty translating this term. Let me know if you can think of any equivalent expressions in English. ---> Tooironic (talk) 06:28, 10 October 2014 (UTC)

"Awkward" might be better here. Wyang (talk) 22:07, 12 October 2014 (UTC)
Hmm, I'm not familiar with that meaning. But of course this is a dialectal term - prevalent in the south I guess. Which Chinese term are you referring to? ---> Tooironic (talk) 06:43, 13 October 2014 (UTC)
It's a northern Chinese colloquialism. 挫 literally means "short in stature". From what I see in google:挫男, most of the results in page 1 refer to men who are ugly/behave awkwardly and are therefore unable to attract girls. Wyang (talk) 06:49, 13 October 2014 (UTC)

tempête[edit]

Is this pronounced [tampeɪ̯t] or [tampɐɪ̯t]? 199.59.78.223 23:20, 11 October 2014 (UTC)

Please ask questions like this in the Tea room, thanks. Wyang (talk) 22:08, 12 October 2014 (UTC)

主義, 主义[edit]

I know the old Derived terms formatting was outdated, but at least you could view the list in alphabetical order, now it seems to be all randomised, what's up with that? ---> Tooironic (talk) 01:17, 24 October 2014 (UTC)

@Tooironic: They are positioned in the same order but apparently the template {{der3}} changes the order automatically. An alternative is to use {{der-top}} and {{der-bottom}} around the list.
Would you prefer this format? --Anatoli T. (обсудить/вклад) 01:45, 24 October 2014 (UTC)
I would prefer alphabetical with columns, I think that would be the most user-friendly. ---> Tooironic (talk) 03:41, 24 October 2014 (UTC)
@Tooironic: What about now? The position of {{der-mid3}} may need to change to fix the column height.
--Anatoli T. (обсудить/вклад) 03:50, 24 October 2014 (UTC)
Seems better now. ---> Tooironic (talk) 04:09, 24 October 2014 (UTC)

研究[edit]

I don't understand why you made changes to this entry. The current standard at Wiktionary is to indicate part of speech for all entries. I don't think you can just make up your own headings like "Definitions", etc. Please explain. ---> Tooironic (talk) 12:27, 29 October 2014 (UTC)

Wiktionary:Entry layout explained/POS headers#Other headers in use. I think all the PoS headers for Chinese should go. Wyang (talk) 23:59, 29 October 2014 (UTC)
@Tooironic: More discussions on the topic: Wiktionary:Beer_parlour/2014/May#New_L3_for_Chinese, Template_talk:zh-pron#Why_does_this_categorise_in_part-of-speech_categories.3F and Wiktionary:Beer_parlour/2014/July#.22Definitions.22_header_in_Chinese_entries. I would support "Definitions" header for single-character entries and happy to continue discussions for multi-character entries and unifying Chinese entries under traditional with simpler simplified entries with soft-redirects but there is no consensus on this. --Anatoli T. (обсудить/вклад) 02:37, 30 October 2014 (UTC)
I don't understand why PoS headers for Chinese should go. I think they are very useful, especially considering many dictionaries - both online and paper-based - do not include them. Regardless, I think 研究 should be restored to match how all 词 entries are currently formatted until a consensus is reached. What do you think? ---> Tooironic (talk) 06:33, 30 October 2014 (UTC)
@Tooironic: Please read those links. Wyang's argument is that Chinese has no inflection, so PoS headers have little value. Also,
  1. PoS is not inherent to Chinese words. You can't tell by their form, if they are nouns, verbs, etc. PoS can be determined in the complete phrases, not as stand-alone words.
  2. PoS can change, depending on the usage. Most adjectives can be used as verbs or nouns, verbs can be used as nouns, etc.
  3. Various dictionaries treat various PoS differently. I mentioned these discrepancies. We have to either list all PoS possible or limit to one.
  4. It's too complicated for idioms and 字 words to determine and list all PoS, and adding PoS headers doesn't add any value. Please check . As you mentioned yourself, dictionaries do not always use PoS, only sometimes for clarification, like "protest" (n.)
Change the format of 研究 back if you want, until consensus is reached. Other languages without inflections could be reviewed as well - Vietnamese, Thai, Lao, Khmer, Burmese, etc. Chinese doesn't have to be "exceptional" in this regard. --Anatoli T. (обсудить/вклад) 21:22, 30 October 2014 (UTC)
I don't actually have much of an opinion at this point. Having read through the arguments, I can see why some editors here would like to list the translations as "Definitions", with some part of speech information included there within. Looking at an entry like 保险 I can see how that would work - currently there is a lot of white space, so it's not as user-friendly IMO. But I'm concerned that such a radical change (for 词 I mean, not 字 entries) would be a logistical nightmare, and hard for non-programmer editors like myself to deal with. ---> Tooironic (talk) 06:11, 31 October 2014 (UTC)

thúi/thúy[edit]

Hello, you have made a mistake: thúi is pronounced [tʰuj˧˥], and thúy is pronounced [tʰwi˧˥]. 162.247.124.135 23:57, 8 November 2014 (UTC)

Just so you know, this is Fête/Phung Wilson, who lives in Quebec. He was globally blocked for a long history of bad edits and for endlessly pestering other users with incessant questions. He's especially obsessed with pronunciation in languages other than his native Cantonese, such as Quebec French and Vietnamese. I just blocked him for changing Module:vi-pron, and I also gave it protection so only autoconfirmed users can edit it. He's also the one who's been asking you pronunciation questions previously, but I left him alone because he wasn't doing anything, just asking questions. Chuck Entz (talk) 03:02, 9 November 2014 (UTC)
I don't often agree with Fête's phonetic observations, but in this case they're absolutely right: Module:vi-pron had Thúy (female name) homophonous with thúi (dialectal word for "stinky"). "-uy" should be /wi/ and "-ui" should be /-uj/ or even /-ui/. (In other words, Thúy should be /tʰwɪ/, while thúi should be /tʰuj/.) I've restored their change. – Minh Nguyễn 💬 08:37, 13 November 2014 (UTC)
Good. My actions were based on their making far-reaching unilateral changes without giving enough time for a response to their comment, rather than the substance of those changes (about which I know nothing). Chuck Entz (talk) 14:13, 13 November 2014 (UTC)

vi-new[edit]

Hi Frank,

Template:vi-new is no longer working well. Could you please check? Also, I tried to make etymologies other than Sino-Vietnamese. It failed on this too. Any enhancements - synonyms, alt forms are welcome :) --Anatoli T. (обсудить/вклад) 01:44, 12 November 2014 (UTC)

Hi Anatoli, could you please elaborate? The three examples in the documentation page are working. Wyang (talk) 03:16, 17 November 2014 (UTC)
These revisions didn't work as expected: săn bắn and thuốc lào. --Anatoli T. (обсудить/вклад) 03:24, 17 November 2014 (UTC)
They seem to be working now. I guess one of the templates or modules involved might have been broken at the time. Wyang (talk) 03:31, 17 November 2014 (UTC)
Thanks. Yeah, I saw your edits. It must have been a temporary glitch. --Anatoli T. (обсудить/вклад) 03:37, 17 November 2014 (UTC)

Possible problem with my-translit[edit]

There seems to be something missing from my-translit: ဆယ့် (hcai.) doesn't get automatically transliterated, although most words do. —Aɴɢʀ (talk) 20:40, 15 November 2014 (UTC)

@Angr: Yes, there are a few new cases at Module:my-translit/testcases. It seems "-teen" numerals are all in that boat, pls add them there, so that we have them in one place. Frank, sorry for giving you more work, no-one seems to be able to work with these. :) Lao module also needs attention. For Russian, I have some requests for fixing secondary stress but this can wait as you seem to be busy. --Anatoli T. (обсудить/вклад) 03:33, 17 November 2014 (UTC)
Burmese and Lao ones fixed... Wyang (talk) 11:24, 17 November 2014 (UTC)
Thank you very much :) --Anatoli T. (обсудить/вклад) 11:39, 17 November 2014 (UTC)
Thanks! —Aɴɢʀ (talk) 14:44, 17 November 2014 (UTC)

Chinese frequency lists[edit]

Hi Frank,

I have checked/added all multi-character words in Appendix:Mandarin Frequency lists/1-1000, Appendix:Mandarin Frequency lists/1001-2000 and Appendix:Mandarin Frequency lists/2001-3000, onto the next list. (I know pinyin and translations are off and there are duplications). @Tooironic: - it's a good source for missing words. I have orange-links enabled in preferences, so I can see if there is a term in Japanese but not Chinese. --Anatoli T. (обсудить/вклад) 03:35, 24 November 2014 (UTC)

Thanks for this Anatoli. We are close to getting these all finished! ---> Tooironic (talk) 03:42, 24 November 2014 (UTC)
Yes, we should pat ourselves on the shoulder. I reckon it's better to focus on common words, which are not likely to be rfd'ed. Appendix:HSK list of Mandarin words is also pretty much done up to Intermediate level. --Anatoli T. (обсудить/вклад) 03:54, 24 November 2014 (UTC)
Good work guys! Wyang (talk) 12:34, 24 November 2014 (UTC)

Lua error in Module:och-pron[edit]

When I converted the article to use the unified "Chinese" header and Template:zh-pron, attempting to use the "oc=y" parameter for Old Chinese gave me a red-text "Lua error in Module:och-pron at line 23: attempt to perform arithmetic on global 'codepoint' (a nil value)" error. Could you check the article to make sure my syntax is correct and to spot any possible module issues? It's not urgent but I want to make sure I'm not doing anything wrong. Cheers! Bumm13 (talk) 05:40, 27 November 2014 (UTC)

It's not your edit, it's the module. There are 317 entries in Category:Pages with module errors, and all of the ones I've checked have the same error. Chuck Entz (talk) 05:51, 27 November 2014 (UTC)
I take that back- there's at least one that's caused by bad input to {{ko-pos}}
Thanks, fixed now. Wyang (talk) 10:37, 27 November 2014 (UTC)

Looks like we have a similar issue when trying to use "mc=y" at the article. The error in red text reads: "Lua error in Module:ltc-pron at line 514: attempt to concatenate global 'fanqieB' (a nil value)". Bumm13 (talk) 17:23, 6 December 2014 (UTC)

After further editing, I haven't seen this error show up on any other article lacking a Middle Chinese infobox, just that one. Bumm13 (talk) 19:27, 6 December 2014 (UTC)
Thanks, fixed. Wyang (talk) 23:09, 6 December 2014 (UTC)

Burmese egg[edit]

Appendix:Proto-Sino-Tibetan/wa ~ wu makes no mention of Burmese (u.). Is it not from that root? Is the similarity just coincidental? Does it come from the *d(w)əj mentioned in the See also section? —Aɴɢʀ (talk) 21:19, 8 December 2014 (UTC)

STEDT sets up *ʔu (egg, hatch) for Burmese "egg", and considers it to be closely related to *wa ~ wu. Wyang (talk) 21:34, 8 December 2014 (UTC)

Question about recent anon edits[edit]

I was curious what you think of Special:Contributions/118.6.149.25. I'm currently seeing just edits to JA entry (i), and KO entries (i) and (ga). I'm quite interested in the OKO connection suggested, and whether you know anything more about that? Also, I remember reading from a couple different linguists that (ga) was relatively recent, and was probably derived from JA, which this anon seems to be discounting. Their description of the JA particle in the KO entry is both misplaced and misleading, which raises doubts about their trustworthiness. TIA, ‑‑ Eiríkr Útlendi │ Tala við mig 00:58, 9 December 2014 (UTC)

I find the four edits to (i) very puzzling - I'm not even sure the current etymology is what the IP intended to write. The current etymology there is incorrect, as the archaic Korean form of "ni" was non-existent. I would be curious to know where he/she got the "ni" etymology from. Korean "ga" is quite recent and it was used initially as an alternative emphatic particle to i, only after i/y-ending nouns. More discussion can be found here. Wyang (talk) 06:54, 9 December 2014 (UTC)
  • Thank you for the link. Unfortunately:

You have either reached a page that is unavailable for viewing or reached your viewing limit for this book.

... but I think I might have an earlier edition of this same book at home (with the almost-kelly-green cover, also Cambridge University Press, same cover design as Shibatani's The Languages of Japan from the Cambridge Language Surveys series).
In light of your comments here, I'm reverting the anon's edits to (i). (The edits to (i) were on the mark.) I'll see if I have the book at home, and if I do, read up on (ga) and make a judgment then. Or, feel free to beat me to it and revert/alter the anon's edits to (ga) as you see fit.  :)
Thank you! ‑‑ Eiríkr Útlendi │ Tala við mig 18:27, 9 December 2014 (UTC)
I've reverted the edit to (ga). I didn't realise you couldn't view it - you can use this link if you don't have a copy of that at home. Thanks! Wyang (talk) 20:55, 9 December 2014 (UTC)
Thank you! I'll have a look later.  :) ‑‑ Eiríkr Útlendi │ Tala við mig 23:02, 9 December 2014 (UTC)

Min Nan POJ data[edit]

Frank, are you able to add data for Min Nan from Min Nan Wiktionary? It would be great if readings could be automatically loaded like Cantonese and Hakka, if it's possible. Doesn't have to be now, I know you're busy. BTW, mn_note (and other notes) parameter seems to have stopped working. --Anatoli T. (обсудить/вклад) 04:57, 12 December 2014 (UTC)

No problem. I am getting the pronunciations elsewhere now and will format and upload them when it finishes. Notes at seem to be working well. Wyang (talk) 06:10, 12 December 2014 (UTC)
Thanks. Carl has some problems with the new format, pls see my talk page (財政家 et al). --Anatoli T. (обсудить/вклад) 06:38, 12 December 2014 (UTC)
Done now: Special:Contributions/Wyangbot. Wyang (talk) 23:59, 16 December 2014 (UTC)
Thanks a bunch! Will the bot load all words from the data module? --Anatoli T. (обсудить/вклад) 00:04, 17 December 2014 (UTC)
It should do so. Wyang (talk) 00:06, 17 December 2014 (UTC)

玉兰[edit]

Trying to create a new entry here using "lua", but it's not working. Any ideas? ---> Tooironic (talk) 05:56, 13 December 2014 (UTC)

Same problem with 政体. ---> Tooironic (talk) 06:11, 13 December 2014 (UTC)
@Tooironic: Carl, it seems you need to create fantizi before making jiantizi. Both entries are OK now, I've made fantizi entries, pls. check if they are what you wanted them to be. The problem can be replicated by making jiantizi without corresponding fantizi entries. Ideally even if fantizi doesn't exist, it won't give module errors. I'm sure Frank can fix it. --Anatoli T. (обсудить/вклад) 11:15, 13 December 2014 (UTC)
Oh I see. Is there a way to display the contents at both the simplified and traditional entries? At the moment the user has to click on a redirect, it's not very user-friendly. ---> Tooironic (talk) 13:58, 13 December 2014 (UTC)
Wiktionary:Beer_parlour/2014/December#New_changes_to_Chinese_entries was about centralising the contents in fantizi. I don't think it's possible to show the same content on both entries. --Anatoli T. (обсудить/вклад) 14:10, 13 December 2014 (UTC)
Oh. That's a shame. I was under the impression that we could. Now I feel that users of simplified Chinese are at a disadvantage having to click through to see the contents of most of the entries they look up. ---> Tooironic (talk) 16:03, 13 December 2014 (UTC)
@Tooironic: It's strange that you voted in support, although the topic had clear example entries. Note that no published dictionaries use both scripts equally, one or the other script is always the primary script and the other is provided once. Do you oppose the centralisation of entries under fantizi? --Anatoli T. (обсудить/вклад) 21:35, 14 December 2014 (UTC)
Like I said, I think I misunderstood the nature of the centralisation. I thought that entry content would be viewable on either script. I don't think it's very user-friendly to require any user - whether it be simp or trad form user - to have to click-through to see the content of an entry. ---> Tooironic (talk) 05:29, 15 December 2014 (UTC)

Minor module error at Module:zh/documentation[edit]

I'm not sure why, but there's been a module error at the documentation page for module:zh since your edits to it the other day, which is showing up at the module page, too, via transclusion (it's the first time I've ever seen a module with a module error). The invocations with the error:

  1. {{#invoke:zh|hzbox|光合作用}}
  2. {{#invoke:zh|hzbox|光合作用|22}}
  3. {{#invoke:zh|hzbox|葉綠體}}
  4. {{#invoke:zh|hzbox|葉綠體|21}}

Since it's restricted to just one location, which is outside of mainspace, it's not exactly an emergency, but I thought I'd bring it to your attention, anyway. Chuck Entz (talk) 04:12, 15 December 2014 (UTC)

Fixed. Wyang (talk) 04:14, 15 December 2014 (UTC)
That was certainly very quick - though it took a null edit to the documentation page to clear it completely. Thank you! Chuck Entz (talk) 04:22, 15 December 2014 (UTC)
No worries! Wyang (talk) 04:23, 15 December 2014 (UTC)

[edit]

I've lost all the "derived terms" on but I don't find it easy to convert to traditional or, even better, reformat and use {{zh-l}} with both forms. There's no easy way to do that, is there? That's one concern (entries are out of sync) and a reason for the centralisation (to avoid this). Editors edit one version but not the other. --Anatoli T. (обсудить/вклад) 06:19, 17 December 2014 (UTC)

The best I can do is automatic simp->trad conversion of these Simplified lists of compounds. An example is {{zh-der}}, where surrounding the derived terms list with {{zh-der|...}} syntax and previewing it can give the formatted list. The results must be doublechecked, though. Wyang (talk) 06:45, 17 December 2014 (UTC)
It looks great. I understand that the result of the conversion must be doublechecked. --Anatoli T. (обсудить/вклад) 21:12, 17 December 2014 (UTC)

[edit]

Could you add the Taiwan variant pronunciation of zhú to the entry creator? Thanks. ---> Tooironic (talk) 05:57, 19 December 2014 (UTC)

It's already added and should work. Which entry is it? Wyang (talk) 06:00, 19 December 2014 (UTC)
Seems to be OK now. Many thanks. ---> Tooironic (talk) 02:12, 20 December 2014 (UTC)

宏碁[edit]

Hi Frank,

If you think the entry is worth saving, could you add citations, please? --Anatoli T. (обсудить/вклад) 22:52, 22 December 2014 (UTC)

Hi Anatoli, I'm inclined to not include the Brand name if there are no other meanings. By the way, hong2 qi2 and wang4 kei4 are the correct pronunciations, but they are hardly used in real life. Most people pronounce it as hong2 ji1 and wang4 gei1/hung4 gei1/hung4 kei4. Wyang (talk) 02:30, 23 December 2014 (UTC)

zh-forms documentation[edit]

Hi Wyang. I can see that you have made the template zh-forms and it looks really cool. The documentation [6] is just a list of examples with no explanation. I can figure out what the parameters mean, but where does it get the translations in the table from and how are people supposed to edit them? I have looked at 人民 and it says that means "the people; nationality; citizen" but the entry doesn't have a definition in the Chinese section, only in the character section, which is "people, subjects, citizens". Could you write a few words in the documentation about how it works and how people should use it? Kinamand (talk) 18:47, 30 December 2014 (UTC)

@Kinamand: Wyang must be on leave. The glosses are contained in Module:zh/data/glosses (certainly editable) or, with multi-character words, in those entries. {{zh-forms}} can take override numbered parameters, if you need to give specific senses or shorten the descriptions. --Anatoli T. (обсудить/вклад) 08:22, 31 December 2014 (UTC)
I can see it now. Just not used to look in the modules for things like that. Thanks for the answer. Kinamand (talk) 11:53, 31 December 2014 (UTC)
Explanations added now. Wyang (talk) 22:00, 1 January 2015 (UTC)
Hi Frank, how would you use {{zh-forms}} in entries with commas or punctuation marks, e.g. 和尚打傘,無法無天? The template should ignore them, including in the |type= parameter. There are still quite a few entries with "Mandarin" header and {{cmn-idiom}}, etc.
BTW, Carl had troubles with 阿Q精神 and 阿Q and left them in a bad state. How should "Q" be transliterated, "kyū" or "kǖ" or something else? Pls see User_talk:Atitarev#.E9.98.BFQ_and_.E9.98.BFQ.E7.B2.BE.E7.A5.9E. --Anatoli T. (обсудить/вклад) 23:19, 1 January 2015 (UTC)
Hi Anatoli, please see my changes at those places. Wyang (talk) 09:21, 2 January 2015 (UTC)
Looks great, thank you very much! --Anatoli T. (обсудить/вклад) 10:41, 2 January 2015 (UTC)
I used the same method on 車同軌,書同文, although {{zh-new}} doesn't work well with commas (it needs to convert full-width "," to "," in pinyin). Further trouble is with {{zh-usex}} when there are English words inside. I couldn't force spaces. I think you fixed it a while ago on a Min Nan mixed script usage example. --Anatoli T. (обсудить/вклад) 11:02, 2 January 2015 (UTC)
Yes, the modules Module:zh-usex and Module:zh need to be rewritten under the new framework - too messy! {{zh-new}} one fixed. Wyang (talk) 22:38, 2 January 2015 (UTC)
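The comma conversion mentioned above could be sketched roughly as follows in Python. This is an illustration only, not the code used by {{zh-new}}; the function name `normalise_pinyin_commas` is hypothetical, and the sketch assumes that the only requirement is to turn a full-width or unspaced comma into ", " as Pinyin orthography expects.

```python
import re

# Matches an ASCII comma or a full-width comma (U+FF0C), together with
# any whitespace around it.
_COMMA = re.compile(r"\s*[,\uFF0C]\s*")

def normalise_pinyin_commas(pinyin: str) -> str:
    """Replace every comma variant with a comma followed by one space."""
    return _COMMA.sub(", ", pinyin).strip()

print(normalise_pinyin_commas("chē tóng guǐ\uff0cshū tóng wén"))
```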

Module:zh-see and multiple possible traditional forms[edit]

Hi Wyang, is there any way to add multiple traditional character links to Template:zh-see? Occasionally, a simplified character will have more than one possible traditional equivalent. Thanks Bumm13 (talk) 05:06, 3 January 2015 (UTC)

You can probably do:
{{zh-see|traditional character 1}}
{{zh-see|traditional character 2}}
As in . Frank, maybe the template should take more parameters? --Anatoli T. (обсудить/вклад) 05:18, 3 January 2015 (UTC)
I feel that the template should only take one parameter. 偽 and 僞 are variants of each other, and one of them should be redirected too. In cases where multiple non-variant characters simplify to the same character, it should be separated by Etymology 1/2... like . Wyang (talk) 07:39, 3 January 2015 (UTC)

報販子, 毒販子, 馬販子, 書販子, 魚販子, 戰爭販子[edit]

In these entries, is there a way to get lua to show both readings of 販子 automatically instead of just the "child trafficking" one? ---> Tooironic (talk) 15:41, 5 January 2015 (UTC)

I've reformatted the entry as the "dealer" sense is more common. I can also make the code try to extract another sense from entries, but I'm not sure that will be useful and not too confusing. Wyang (talk) 23:16, 5 January 2015 (UTC)
Thank you muchly. ---> Tooironic (talk) 01:49, 6 January 2015 (UTC)

[edit]

I noticed your bot replaced the page with an error message here. ObsequiousNewt (ἔβαζα|ἐτλέλεσα) 16:50, 5 January 2015 (UTC)

I have tried to correct it but noticed something odd. In the simplified entry 爱 I wrote {{zh-forms|s=爱|t=愛}}, but in the traditional 愛 I only had to write {{zh-forms|s=爱}}. If I did not add the s parameter in the simplified entry, it did not show both versions.

Another thing: Are we also supposed to use zh-forms in the translingual section? Kinamand (talk) 20:21, 5 January 2015 (UTC)

No Chinese hanzi box is supposed to be used in the translingual section. The error message is due to multiple hanzi boxes being used on the page, including one in the translingual section. In the simplified entry it should be {{zh-see|愛}}. Wyang (talk) 23:14, 5 January 2015 (UTC)
Why should simplified entries only use the zh-see template and not have the whole definition [7]? Since simplified characters are used far more often than traditional, it seems odd to me. Has there been a discussion about this? I would like to know the arguments behind that decision. Kinamand (talk) 07:19, 6 January 2015 (UTC)
There was extensive discussion. As I understand it, it's easier technically to convert from traditional to simplified than in the other direction, so the full information is the traditional entry. There's nothing political about it: Wyang is from the Mainland and grew up with the simplified script, so he would have done it otherwise if he could have. Chuck Entz (talk) 07:55, 6 January 2015 (UTC)
What Chuck said. Here is the link to the proposal and discussion: Wiktionary:Beer parlour/2014/December#New changes to Chinese entries. Wyang (talk) 09:31, 6 January 2015 (UTC)
The discussion is not closed and there is a vote running [8] so why have you already started to make the changes? Kinamand (talk) 11:34, 6 January 2015 (UTC)
Not just me, other editors have started to change entries to the new format a while ago. There is no point for a vote when no Chinese-language editor opposes the proposal. The vote is a means for a bunch of utter standers-by to dictate what chores others should do. Wyang (talk) 12:22, 6 January 2015 (UTC)
The problem is that you forget to write documentation for new templates, for example zh-see [9], and that makes it very difficult for others to make contributions. Kinamand (talk) 13:28, 6 January 2015 (UTC)
Done. Wyang (talk) 20:52, 6 January 2015 (UTC)

bot: Hanzi box format change to use zh-forms[edit]

Can your bot transfer or otherwise not delete {{also}} material at the top of the entries it's converting? Hongthay (talk) 00:14, 7 January 2015 (UTC)

I think you are confused by how the bot works (diff and diff) - the previous bot edits on those pages did not remove {{also}} templates. Anyway there needs to be better automatic handling of such correspondence sets, since we are possibly looking at thousands of affected entries. Comprehensive variant lists such as this may be used to compile lists of variant forms, which are then maintained automatically by form-templates. Wyang (talk) 06:17, 7 January 2015 (UTC)
I agree about the auto handling but to illustrate what I'm referring to: diff and diff. It processed the (simplified) Chinese entry, then wiped the {{also}} matter above. Even when there is a Japanese entry on the page: diff. Hongthay (talk) 11:13, 7 January 2015 (UTC)
I've made a start on the automatic handling, by listing all the existing multisyllabic Chinese entries and finding other entries with titles which differ only as variants. The code is Module:User:Wyang/var and the results are at User:Wyang/test. Any thoughts? Wyang (talk) 13:37, 7 January 2015 (UTC)
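The approach described above (finding entry titles that differ only as variants) can be sketched roughly as follows in Python. This is not the code in Module:User:Wyang/var; the names `VARIANT_TO_CANONICAL`, `canonical` and `variant_sets` are hypothetical, the variant table holds a single illustrative pair (偽/僞, mentioned earlier on this page), and the titles in the usage line are made up for the example.

```python
from collections import defaultdict

# Hypothetical variant table: each variant character maps to one
# arbitrarily chosen canonical form.
VARIANT_TO_CANONICAL = {"僞": "偽"}

def canonical(title: str) -> str:
    """Rewrite a title character by character into its canonical form."""
    return "".join(VARIANT_TO_CANONICAL.get(ch, ch) for ch in title)

def variant_sets(titles):
    """Group titles that become identical once variants are normalised."""
    groups = defaultdict(list)
    for title in titles:
        groups[canonical(title)].append(title)
    # Only groups with more than one member are variant sets.
    return [sorted(group) for group in groups.values() if len(group) > 1]

print(variant_sets(["偽造", "僞造", "研究"]))
```

Keying the groups on the normalised title means each title is processed once, so the cost grows linearly with the number of entries rather than with the number of title pairs.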

Module:hi-translit/testcases[edit]

Hi Frank,

Are you taking this on? Me and User:DerekWinters can help with some tests, he probably knows more than me. Hopefully, User:Kephir can also join but he seems currently busy. --Anatoli T. (обсудить/вклад) 00:47, 9 January 2015 (UTC)

I was trying to see if I could make it work. :) There is still one unfixed testcase, which either results from error in the en.wp article or an exception to DerekWinters' rules. More testcases are needed. Wyang (talk) 02:09, 9 January 2015 (UTC)
Thanks, "iṁgliś" is correct for इंगलिश (iṅgliś) (it is a case of "Hinglish" - Hindi English, though) (if we use "ṁ" for anusvāra). --Anatoli T. (обсудить/вклад) 03:26, 9 January 2015 (UTC)
No worries. aṁgrez seems to have the same issue, which might be another rule or a loanword exception. I've replied at Module talk:hi-translit/testcases too. Wyang (talk) 10:56, 9 January 2015 (UTC)
I changed the list of special consonants to remove 'ṇ'. When I first made the list, I was just going off of my own intuition. Now it seems actually that I had made a mistake with 'ṇ'. DerekWinters (talk) 11:20, 9 January 2015 (UTC)
Also, loanwords always are modified to fit the orthography, so don't worry about exceptions. DerekWinters (talk) 11:23, 9 January 2015 (UTC)
Great. What is the rule concerning anusvara as in aṁgrez? Should preceding anusvara be treated as vowel-like in vowel dropping? Wyang (talk) 11:26, 9 January 2015 (UTC)
ṁ in front of a velar (k, kh, g, gh) is ṅ. In front of a palatal letter (c, ch, j, jh) it is ñ. In front of a retroflex (ṭ, ṭh, ḍ, ḍh) it is ṇ. In front of a dental (t, th, d, dh) and all remaining consonants (y, r, l, v, ḷ, ś, ṣ, s, h) it is simply n.
The case of the anusvara is rather strange. Originally, it only indicated a word-final 'm' or a nasalization. Later, there was an orthography reform in which the anusvara took the place of all nasals in cluster-initial position. Thus, 'k' + 'i' + anusvara + 'g' = 'king'. So even though it is a diacritic, it took the place of consonants and thus should be treated as such. DerekWinters (talk) 11:39, 9 January 2015 (UTC)
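The assimilation rule described above can be sketched in Python as follows. This only illustrates the rule as stated in this thread and is not the code in Module:hi-translit; the names `assimilate_anusvara` and `NASAL_BEFORE` are hypothetical. Two assumptions: the sketch looks only at the first letter of the following consonant (so kh counts as k), and since labials and vowels are not covered by the rule as written, ṁ is left unchanged before them and at the end of a word.

```python
import unicodedata

def nfc(text: str) -> str:
    """Normalise to NFC so every letter with a diacritic is one code point."""
    return unicodedata.normalize("NFC", text)

ANUSVARA = nfc("ṁ")

# First letter of the following consonant -> nasal that replaces the anusvara.
NASAL_BEFORE = {}
for nasal, onsets in (
    ("ṅ", "kg"),            # velars: k, kh, g, gh
    ("ñ", "cj"),            # palatals: c, ch, j, jh
    ("ṇ", "ṭḍ"),            # retroflexes: ṭ, ṭh, ḍ, ḍh
    ("n", "tdyrlvḷśṣsh"),   # dentals and the remaining listed consonants
):
    for onset in nfc(onsets):
        NASAL_BEFORE[onset] = nfc(nasal)

def assimilate_anusvara(word: str) -> str:
    """Replace each non-final anusvara by the nasal matching the next letter."""
    word = nfc(word)
    out = []
    for i, ch in enumerate(word):
        if ch == ANUSVARA and i + 1 < len(word):
            # Letters outside the table (labials, vowels) leave ṁ as it is.
            out.append(NASAL_BEFORE.get(word[i + 1], ch))
        else:
            out.append(ch)
    return "".join(out)

print(assimilate_anusvara("aṁgrez"))  # aṅgrez
```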
I understand it now. Should there be another rule in addition to those at User talk:DerekWinters#Bengali transliteration module, which says "XTaCV = XTCV"? What about the other failed testcases? Wyang (talk) 11:49, 9 January 2015 (UTC)
Great. The rule should actually be XTaCV = XTaCV. Also, I must tell you that the rules I gave were only a few. There are sure to be many more.
About the other testcases. व्यवच्छेद (vyavaccheda without any dropped vowels) would get split into vyav|cched. स्वत्वहरण (svatvaharaṇa without any dropped vowels) would get split into svat|va|ha|raṇ. संगमरमर is actually an ambiguous case, so I'll take care of that one. DerekWinters (talk) 12:05, 9 January 2015 (UTC)
What would be the best way to automate these? The code currently has two vowel dropping rules: 1) word-final -CSa is reduced to -CS. 2) the sequence 'VCaCV' is reduced to 'VCCV', applying from right to left. How should the code be modified to account for those reductions or non-reductions? Wyang (talk) 12:19, 9 January 2015 (UTC)
Sorry, but I'm going to have to be offline for the next couple of hours. DerekWinters (talk) 12:15, 9 January 2015 (UTC)
No worries. Wyang (talk) 12:19, 9 January 2015 (UTC)
Alright, I'm back. What can I help you with? DerekWinters (talk) 06:06, 10 January 2015 (UTC)
Hi, please see my reply on 12:19, 9 January 2015 (UTC), thanks. Wyang (talk) 08:39, 10 January 2015 (UTC)
Forgive me, I completely missed it. Ok. Actually, a word-final -CSa should not lose its 'a', however a word-final -CRa should become a -CR.
So, ignore vyavacched. Svatvaharaṇ has an underlying CSaCSaCaCaCa structure, it being a compound of 2 other structures: CSaCSa (which would lose no vowels) and CaCaCa (which would lose its final 'a').
With sangamarmar, and all the other aṁgQQQ forms, I have no clue what to do. I'm getting conflicting pronunciations. I'm consulting online dictionaries, myself, and my cousin, and sometimes we all agree, and sometimes not at all. Sometimes I feel as if I could pronounce it both ways, with and without it. See, disregarding those, everything else is working, so suffice it to say we can easily hardcode them.
With antarrāṣṭrīya and bhārtīya I realized that we need to have a few special rules regarding 'y'. A word final īya, eya, and aiya (the single vowel 'ai' (ै)) maintain their final 'a's. In word-medial position at the end of a syllable, normally the 'a' is dropped (latakaro -> latkaro). However, layakaro would not become laykaro, it would instead maintain its 'a'. DerekWinters (talk) 12:46, 10 January 2015 (UTC)
Sorry, I was supposed to say the first rule was "word-final -a is dropped unless in -CSa". I guess much of the variation comes from compounding. For the moment I only added another non-dropping rule for '-ya', and I'm not sure how to automate the rest. Please tell me if there are other things that you would like me to add or modify. Wyang (talk) 13:58, 10 January 2015 (UTC)
No problem, I know it's a difficult task. You've done ridiculously well so far. Let's see, could you perhaps modify the syncope thing to handle the medial-y non-dropping rule? Also, could you add the bit where 'ṁ' (anusvara) becomes the correct nasal preceding another consonant as I mentioned above? I think everything else is good. We'll just have to hardcode the aṁg words, perhaps with the two alternate pronunciations. DerekWinters (talk) 14:17, 10 January 2015 (UTC)
Also, for the anusvara. If there exists a word-final -aṁ, it becomes -am. If it is word final on any other vowel it nasalizes it to ā̃, ẽ, ĩ, ī̃, ũ, ū̃, etc. DerekWinters (talk) 14:20, 10 January 2015 (UTC)
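The word-final behaviour just described could be sketched like this. It is a hedged illustration only, operating on an already-romanised string; the function name and the use of a combining tilde (U+0303) for nasalisation are assumptions, not the module's actual implementation.

```python
import unicodedata

ANUSVARA = "\u1e41"        # ṁ, as used in the romanisation in this thread
COMBINING_TILDE = "\u0303"  # assumed way of marking a nasalised vowel


def resolve_final_anusvara(word: str) -> str:
    """Word-final -aṁ becomes -am; after any other vowel the vowel is nasalised."""
    word = unicodedata.normalize("NFC", word)
    if not word.endswith(ANUSVARA):
        return word  # nothing to do: no word-final anusvara
    stem = word[:-1]
    if stem.endswith("a"):  # short 'a' only; 'ā' is a different code point in NFC
        return stem + "m"
    return stem + COMBINING_TILDE
```

For example, "svayaṁ" comes out as "svayam", while "hāṁ" comes out as "hā" followed by a combining tilde. An anusvara in the middle of a word is left untouched by this function.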
I've made the following changes: 1) made the 'VyaCV' sequence not become 'VyCV'; 2) added anusvara assimilations (before consonants and after word-final vowels); 3) added a "+" marker for compounding boundaries, such that medial vowel dropping cannot apply across "+", e.g. रंगपटल (raṅgapṭal) would be input as "रंग+पटल", giving raṅgapaṭal. Please check the testcases, thanks! Wyang (talk) 01:00, 11 January 2015 (UTC)
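A rough sketch of how a "+" boundary can block medial vowel dropping: each compound member is processed on its own, so no 'VCaCV' context can span the boundary. This is an illustration in Python on romanised input rather than the module's real code (which takes Devanagari); the function names and vowel inventory are assumed, and consonants are taken to be single characters.

```python
import unicodedata

VOWELS = set("a\u0101i\u012bu\u016beo")  # a ā i ī u ū e o (assumed inventory)


def drop_medial_schwa(part: str) -> str:
    """VCaCV -> VCCV, right to left, within one compound member."""
    chars = list(part)
    i = len(chars) - 3
    while i >= 2:
        v1, c1, a, c2, v2 = chars[i - 2:i + 3]
        if (a == "a" and v1 in VOWELS and v2 in VOWELS
                and c1.isalpha() and c1 not in VOWELS
                and c2.isalpha() and c2 not in VOWELS):
            del chars[i]
        i -= 1
    return "".join(chars)


def transliterate_compound(word: str) -> str:
    """Split on '+', drop schwas inside each member, then rejoin without '+'."""
    word = unicodedata.normalize("NFC", word)
    return "".join(drop_medial_schwa(member) for member in word.split("+"))
```

Without the marker, "raṅgapaṭal" is reduced to "raṅgapṭal"; with it, "raṅga+paṭal" stays "raṅgapaṭal", matching the example above.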
You are an absolute genius. Thank you. I think this is ready for application. This is ready for Hindi. I'm hesitant to make it usable for any other language at the moment. I'll make some more testcases for Marathi and then give it a go. DerekWinters (talk) 11:59, 11 January 2015 (UTC)
I'm joining in the praise. Excellent job again, Frank! It's not the hardest module he has done, though. Korean, Burmese, etc. are much more complicated. Sorry for not helping much in the last few days. @DerekWinters: re: Bengali, Oriya, Gujarati (also Amharic/Tigrinya) modules: the logic for schwa-dropping is the same or almost the same as Hindi but the code is too complicated for me to simply transfer it to other modules. Bengali would be a higher priority after Hindi (it's an official language of a very populous country and we have lots of entries) but we need to make the basic module first. --Anatoli T. (обсудить/вклад) 13:04, 11 January 2015 (UTC)
Thanks guys. Please let me know if I could be of any help. Wyang (talk) 22:19, 11 January 2015 (UTC)
Super sorry, but I forgot that for the labials (p, ph, b, bh, m) the anusvara becomes a 'm'. I already took care of it though, just put it here for future notice. Also, let's make this module official for Hindi and Marathi as of now @Atitarev:. I'll see if any other languages can come under it. Nepali, Newari, Sanskrit, and the Prakrits cannot because they follow different rules under devanagari. DerekWinters (talk) 14:10, 12 January 2015 (UTC)
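The anusvara assimilation before consonants might be sketched as below. Only the labial case (becoming 'm') and the velar case (as in raṅgapaṭal) are confirmed in this thread; the palatal, retroflex and dental rows are the usual homorganic nasals and are included here as an assumption. The table and function name are hypothetical, and before any other consonant the anusvara is simply left unchanged.

```python
import unicodedata

ANUSVARA = "\u1e41"  # ṁ

# First letter of the following consonant -> nasal of the same place of
# articulation. Aspirates (kh, gh, ph, ...) share the first letter, so
# looking at one character is enough for this sketch.
HOMORGANIC_NASAL = {
    "k": "\u1e45", "g": "\u1e45",            # velars     -> ṅ
    "c": "\u00f1", "j": "\u00f1",            # palatals   -> ñ (assumed)
    "\u1e6d": "\u1e47", "\u1e0d": "\u1e47",  # ṭ, ḍ       -> ṇ (assumed)
    "t": "n", "d": "n",                      # dentals    -> n (assumed)
    "p": "m", "b": "m", "m": "m",            # labials    -> m
}


def assimilate_anusvara(word: str) -> str:
    """Replace each non-final ṁ by the nasal matching the next consonant."""
    chars = list(unicodedata.normalize("NFC", word))
    for i in range(len(chars) - 1):
        if chars[i] == ANUSVARA:
            chars[i] = HOMORGANIC_NASAL.get(chars[i + 1], ANUSVARA)
    return "".join(chars)
```

So "raṁgapaṭal" becomes "raṅgapaṭal" and "saṁbandh" becomes "sambandh", while a word-final ṁ is not touched here and would be handled by the separate word-final rule.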
Oh and Wyang, the chandrabindu strictly does nasalisation (ā̃, ẽ, ĩ, ī̃, ũ, ū̃, etc.). Could you add that in, please? DerekWinters (talk) 14:18, 12 January 2015 (UTC)
Chandrabindu now gets converted to a tilde. @DerekWinters: Are you sure that anusvara assimilates to 'm' before 'm' but not to 'ṅ' before 'ṅ', etc.? Wyang (talk) 01:32, 13 January 2015 (UTC)
I'm certain that it becomes an 'm' in front of an 'm'. Truthfully I've never seen it before an 'ṅ'. I tried to see what it would be like and I wasn't able to make any sense out of anusvara + 'ṅ'. I think it's safe to assume we'll never see that combination. DerekWinters (talk) 08:08, 13 January 2015 (UTC)
I have already added the module to Module:languages/data2 for Hindi, Marathi and Nepali (non-mandatory, i.e. overridable with manual translit). Nepali could be taken out if it doesn't work (I judged by examples in my Nepali phrasebook and assumed it's the same as Hindi; Marathi contributors, on the other hand, provide all vowels, ignoring schwa dropping, for some reason). For languages which use multiple scripts, script detection should be used (as for, e.g., Mongolian). @DerekWinters: How do you think nasal diphthongs should be transliterated, e.g. हैं (haĩ) ("h͠ai" is used on the entry)? Also, I think there should be an apostrophe between vowels, e.g. डाउनलोड (ḍāunloḍ) should be "ḍā'unloḍ". What do you think? --Anatoli T. (обсудить/вклад) 21:40, 12 January 2015 (UTC)
Please do not make it for Nepali. Nepali is supposed to work strictly like Sanskrit, with each letter having the 'a' unless a virama is used. Although I guess I'm not 100% sure. Recent examples, including online dictionaries, seem to use Devanagari the same way as Marathi and Hindi, but I can't tell if it's just laziness, a new trend, or something else altogether. Truly, I'm uncertain and I'll try and figure it out. For nasalizations of diphthongs, both vowels should show the tilde, because in Hindi/Marathi it's treated as one vowel (h͠ai). The apostrophe makes it seem like there is a break between the vowels, almost like a glottal stop. I know Google Translate employs the apostrophe, but I personally just don't like the apostrophe. If you feel it's necessary then that's fine. DerekWinters (talk) 08:08, 13 January 2015 (UTC)
I have changed Nepali to use Sanskrit module for now (it's definitely better than nothing). Added a new test case for हैं (haĩ) to use "h͠ai", if it's OK with you guys. Apostrophe would be essential to separate diphthong "ai" () and "au" () (or diacritics) from consecutive "a + i" and "a + u", if they occur. User:Dijan had some question on my talk page: User_talk:Atitarev#hi-translit.
I've started Module:bn-translit and the test module, which is in a poor state but hopefully will save time for you, guys. --Anatoli T. (обсудить/вклад) 08:26, 13 January 2015 (UTC)
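The apostrophe proposal discussed above can be illustrated with a small sketch: if the transliteration is built from one token per written vowel or consonant, a diphthong such as "ai" is a single token, whereas separate "a" + "i" are two adjacent vowel tokens and get an apostrophe between them. The token-based input, the vowel set, and the function name are assumptions made for this example, not how the module is actually structured.

```python
# Vowel tokens, including the diphthongs written with a single sign (assumed set).
VOWEL_TOKENS = {"a", "\u0101", "i", "\u012b", "u", "\u016b", "e", "ai", "o", "au"}


def join_with_apostrophe(tokens: list[str]) -> str:
    """Join transliterated tokens, separating adjacent vowel tokens with "'"."""
    out: list[str] = []
    for tok in tokens:
        # Two vowel tokens in a row are separate vowels, not one diphthong.
        if out and tok in VOWEL_TOKENS and out[-1] in VOWEL_TOKENS:
            out.append("'")
        out.append(tok)
    return "".join(out)
```

Under these assumptions, the tokens of डाउनलोड (ḍ, ā, u, n, l, o, ḍ) join to "ḍā'unloḍ", while h + ai joins to plain "hai".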
I may have gone a little overboard with the extra letters not used in Hindi, but at the time I was envisioning the Pahari languages employing the module as well. It turns out that those languages use some other rules, but there isn't enough literature on them. However, most of the extra letters get used at least once or twice in Hindi.
You are right, I guess the apostrophe is needed.
The new test case is great.
Wyang, can you look at this https://en.wikipedia.org/wiki/Gujarati_phonology#.C9.99-deletion to help with Module:gu-translit? Also, I'm not sure if you've seen this, but I think it might help you. http://delivery.acm.org/10.1145/1630000/1622156/p20-choudhury.pdf?ip=120.62.164.52&id=1622156&acc=OPEN&key=4D4702B0C3E38B35%2E4D4702B0C3E38B35%2E4D4702B0C3E38B35%2E6D218144511F3437&CFID=617888890&CFTOKEN=80959661&__acm__=1421140180_3bccacc1dc24dbb76930524353169b8b DerekWinters (talk) 09:21, 13 January 2015 (UTC)

微 as in 稍微, 略微, etc.

When you get time, could you add 微 as in 稍微, 略微, etc. to the variant pronunciations? In mainland China it is pronounced wēi; in Taiwan it's wéi. Thank you. ---> Tooironic (talk) 02:19, 10 January 2015 (UTC)

It's already added. Please see my changes to those entries. Wyang (talk) 08:39, 10 January 2015 (UTC)
Many thanks! ---> Tooironic (talk) 00:41, 11 January 2015 (UTC)

[edit]

I very rarely edit 字, but I thought I'd have a try with this one. Please add the Cantonese and/or Min Nan reading if you can. Thanks. ---> Tooironic (talk) 15:28, 11 January 2015 (UTC)

No problem; couldn't find the Min Nan reading (presumably the same as ) but the rest is all done. Wyang (talk) 23:46, 11 January 2015 (UTC)
Lovely. 多谢. ---> Tooironic (talk) 02:23, 13 January 2015 (UTC)

ELP!

YIDUINIAOSHI[1HEAPbirdstuf-yisi?[tw-yongfa praps

That means "a lump of bird poo". Wyang (talk) 11:43, 12 January 2015 (UTC)

uniydhanzi-probs

1.ex.juzi i/yut/minetc-ow2ad?2.IPA'dREALYNOThidn:([as intl'standed![st.abc-notatns=altimewastin[FE.C'j'+itsactualsound i/yut/min[resp.POJ/YUTPIN>drivesunutswen kwiklywantocompar:( 3.2left:NO'show[complet]pron.info>acesibility-isue[fe.ivRSI,2MANYBUTN.PUSHES:(

There are better layouts for the information, but they need to be designed and proposed first. :) Wyang (talk) 11:48, 12 January 2015 (UTC)

cmn templates

Hi Frank,

I have converted the remaining idioms using {{cmn-idiom}} to use {{zh-pron}}, "Chinese" header and {{zh-idiom}} but there are still a few hundreds in Category:Mandarin proverbs in traditional script. --Anatoli T. (обсудить/вклад) 06:00, 14 January 2015 (UTC)

Thanks! I will take care of those later. Wyang (talk) 22:39, 14 January 2015 (UTC)
Yes, please, when you have time. There are also multi-character proper nouns in need of {{zh-pron}} and correct templates. --Anatoli T. (обсудить/вклад) 23:11, 15 January 2015 (UTC)

Contribs by Ieay4a (talk • contribs)

Curious about the recent prolific Korean contributions by this user. The few I've looked at seem okay, but they also look like they could use some copy-editing and/or clarification. Cf. this change to the 이 entry as an example. ‑‑ Eiríkr Útlendi │ Tala við mig 22:20, 14 January 2015 (UTC)

Thanks, I think he has done a very good job at the i entry and there is nothing I'd like to change there. My comment is that the content is presented in a very confusing way at i due to Wiktionary's entry formatting. I also had a look at some of the other changes of his; they seem okay to me, although some templates for Korean syllables should have been simplified with Lua. Wyang (talk) 22:36, 14 January 2015 (UTC)

悠閑

Thanks for your edits here. I will use this variant form template from now on. ---> Tooironic (talk) 11:58, 17 January 2015 (UTC)

No worries! Wyang (talk) 07:34, 18 January 2015 (UTC)

榜樣

Hi Frank, is there a way to get the pinyin in the example sentence to display as Léi Fēng rather than LéiFēng? I asked Anatoli but he wasn't sure. Thanks. ---> Tooironic (talk) 02:49, 18 January 2015 (UTC)

You can use "." as documented in Template:zh-usex/documentation. Please see my edit there. Wyang (talk) 07:25, 18 January 2015 (UTC)
Thanks a lot. I will do it that way in the future. ---> Tooironic (talk) 11:19, 18 January 2015 (UTC)

i/caseu=POLYMATH:)

ifu'dstopdaBULYS'roundherefr.pesterin,ofendin+xcludinaDISABLDPERSON[me,t'db.greatlyapreciated..

Thanks, is there a particular example I can help you with? Wyang (talk) 03:55, 20 January 2015 (UTC)

Lapsang souchong

I hope you can find time to have a look at 正山小種, 立山小種, and 拉普山小種. The spelling alternation between 立 and 拉普 kinda makes sense phonetically, but I'm baffled how, why, and when 正 came into the picture, and how it is that the pronunciation is still "lap". I added a note to that effect in the 正山小種#Japanese entry's Etymology section, but that note might need changing. TIA, ‑‑ Eiríkr Útlendi │ Tala við mig 18:51, 19 January 2015 (UTC)

Japanese seems to have borrowed the Cantonese reading of 立山小種 / 立山小种 for 正山小種. I'm also curious what 立山, 拉普山 and 正山 are: some Mount Li/Lapu, which is also called "zeng3 saan1" (正山) in Cantonese? --Anatoli T. (обсудить/вклад) 21:44, 19 January 2015 (UTC)

This really is an area of mystery. From what I found on Google Books and in Chinese sources: the name "Souchong" first appeared in the late 18th century, and it's quite certain that it came from Chinese 小種 / 小种 ("a small (tea) variety; small sort"). The name "Lapsang" is a modifier of "souchong" which started to appear around the 1830s-1840s. I haven't found a definitive etymology of "lapsang" (to me it sounds like 臘腸, lol), and here are the hypotheses I found:

  • From Cantonese "立山" (lap saan). 立山 seems to be a mountain in the Wuyi Mountains, where lapsang souchong is found.
  • From Min Nan "內山" ("inner mountain"), to distinguish it from the "outer mountain small tea varieties" [10][11].
  • From Cantonese "" ("the smoked, fresh and fragile variety") [12]. Unlikely.
  • Simply an invented commercial name [13].
  • Uncertain [14]. :)

拉普山 is a later Mandarin rendering of English "lapsang", a phonetic transcription adopted because of its etymological obscurity. 正山 (the "lineal mountain") or "內山" (the "inner mountain") refers to the tea variety produced in Tongmuguan, Xingcun Village, Chong'an County, Fujian (福建崇安县星村乡桐木关) and is differentiated from "外山" (the "outer mountains"), which are "non-lineal varieties" ([15], "The Chinese Tea Bible" (中国茶经)). Wyang (talk) 05:33, 20 January 2015 (UTC)

zh-pron

Hi Wyang. The template zh-pron sometimes gives a play button which makes it possible to hear the word. The documentation of the template don't say anything about how it works but it works on but not on . The sound file for 好 is here [16]. Why does the template not find it? How can I fix it? Can you update the documentation with info about it? Kinamand (talk) 15:09, 21 January 2015 (UTC)

The documentation already deals with that - Template:zh-pron#Parameters. Wyang (talk) 00:45, 22 January 2015 (UTC)
I have now found and read the documentation and made this edit [17] but I don't understand why it did not work in the first place with ma=y. The filename is zh-PINYIN.ogg so it should work according to the documentation. Kinamand (talk) 11:04, 25 January 2015 (UTC)
I now see that you have edited since I did ask the first time and you fixed an error in the use of zh-pron. I have therefore undone my last edit of . Kinamand (talk) 11:17, 25 January 2015 (UTC)

{{ja-new}}

Hi Frank,

Are you interested in making enhancements to the accelerated {{ja-new}}? It would be great if you could add alt forms, synonyms, antonyms, derivations, more PoS and categorisations (by kana?) similar to {{zh-new}}. suru-verbs and na-adjectives would be nice to have -moved from {{ja new}}. It must be too much work but you are the only one who could be interested and could do it. Please consider, maybe long-term.

There are outstanding things in the Russian pron. and Hindi, Lao translit. modules. I'm just giving you my wish list, which you can ignore, of course. :) --Anatoli T. (обсудить/вклад) 00:31, 23 January 2015 (UTC)

Question about Template:zh-forms

I've noticed that, in the past few days, the definition section (beige/light tan box) that appears when invoking Template:zh-forms isn't showing any definition information but just a simple dash. I see that you edited Module:zh-forms on Jan. 20 and was just curious if this change was intentional. Cheers! Bumm13 (talk) 01:23, 24 January 2015 (UTC)