Wiktionary:About Chinese
This is a Wiktionary policy, guideline or common practices page. This is a draft; the format of entries for Chinese words was modified in the second half of 2006.
The Chinese language group
In the Chinese language group, ISO 639 identifies a number of languages. Entries in Wiktionary use these language names. Within a language, dialects or variations are identified by tagging the pronunciations and the senses (definition lines) with the dialect or region in which they are used. This is the same as with other language groups: Wiktionary uses the level of distinction defined by ISO 639-3.
The languages, with the standard names used in Wiktionary, are:
- Cantonese
- Gan
- Hakka
- Huizhou
- Jinyu
- Mandarin
- Min Bei
- Min Dong
- Min Nan
- Min Zhong
- Pu-Xian
- Wu
- Xiang
Many other languages spoken in China are not dialects of these 13 and have their own ISO 639 codes, from Achang to Hmong to Manchu to Uyghur to Yi to Zhuang, and many in between. These are not addressed here.
Character forms and romanization
Chinese words have at least three common forms:
- Traditional characters, as used in Taiwan,
- Simplified characters, as used in the PRC (these may agree exactly or approximately with the traditional characters), and
- Romanized forms, of which there may be more than one.
Every entry should list and link to all alternative forms (including all romanizations); there are a number of templates listed in #Entry format that can assist with this, of which {{cmn-noun}} is representative.
It appears that entries are duplicated between traditional and simplified forms, so as not to prioritize one form over the other, and these entries link to each other. See #Hanzi form templates below for templates to assist with this.
Headwords that are romanizations point to both the traditional and simplified forms but do not duplicate every entry with that pronunciation. Instead they have a “Pinyin” L3 heading (likewise for other romanizations) that links to the characters with that reading; see ài.
Collation
There does not appear to be a consensus on how to collate (order) entries such as “Derived terms” or “Compounds” in Chinese entries. For romanized entries, usual alphabetical order is correct, but for Chinese characters, one might use either a phonetic ordering or radical-and-stroke sorting, as both are used in Chinese dictionaries.
Entry format
This has been the subject of much discussion in the Beer parlour and on the user pages of individual contributors. For a basic outline of how to create a Chinese entry, see How to Create a Basic Chinese Entry. A preliminary consensus has been reached whereby Chinese entries will have a level two header that matches a specific language in the Chinese family of languages. Chinese entries should follow the format guidelines in Entry layout explained. However, there are a few issues unique to Chinese. Templates have been created to address many of these issues (see: Category:Chinese templates). For example:
- ==Mandarin==
- {{zh-forms|[[曾经沧海难为水]]|[[曾]][[經]][[滄海]][[難]][[為]][[水]]}}
- ===Idiom===
- {{cmn-idiom|t|pin=céng jīng cānghǎi nán wéi shuǐ|pint=ceng2jing1cang1hai3nan2wei2shui3|tra=曾經滄海難為水|sim=曾经沧海难为水|rs=曰08}}
曾經滄海難為水 and 井底之蛙 are both good examples of how Chinese entries should ideally be formatted. Two of the Chinese languages have a full set of templates at this time.
The script template for Chinese characters is {{Hani}}.
Cantonese
Templates:
- {{yue-noun}}
- {{yue-pronoun}}
For other parts of speech, use Template:infl:
{{infl|yue|(pos)|script=Hani|...}}
where (pos) is the part of speech and the remainder is the other forms. See the documentation.
Romanizations
Recall that the standard romanizations of Cantonese are:
Gan
Use Template:infl:
{{infl|gan|(pos)|script=Hani|...}}
where (pos) is the part of speech and the remainder is the other forms. See the documentation.
Hakka
Use Template:infl:
{{infl|hak|(pos)|script=Hani|...}}
where (pos) is the part of speech and the remainder is the other forms. See the documentation.
Huizhou
Use Template:infl:
{{infl|czh|(pos)|script=Hani|...}}
where (pos) is the part of speech and the remainder is the other forms. See the documentation.
Jinyu
Use Template:infl:
{{infl|cjy|(pos)|script=Hani|...}}
where (pos) is the part of speech and the remainder is the other forms. See the documentation.
Mandarin
Templates:
- {{cmn-noun}}
- {{cmn-proper}}
- {{cmn-verb}}
- {{cmn-adj}}
- {{cmn-adv}}
- {{cmn-idiom}}
- {{cmn-proverb}}
- {{cmn-conj}}
- {{cmn-inter}}
- {{cmn-particle}}
- {{cmn-post}}
- {{cmn-prep}}
Romanizations
Recall that the standard romanizations of Mandarin are:
- pinyin, with tones marked either as diacritics or as numerals
- Wade-Giles
- Yale
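The two Pinyin notations mentioned above (diacritics and tone numerals) can be converted mechanically. The following is a minimal Python sketch, not a Wiktionary tool: the function name `mark_syllable` and the table `TONE_MARKS` are illustrative, and the sketch assumes each input syllable is lowercase, contains a vowel, and ends in a single tone digit. It applies the usual placement convention (mark a, else e, else the o of ou, else the last vowel).

```python
# Tone-marked vowels, indexed by tone number minus one.
TONE_MARKS = {
    "a": "āáǎà", "e": "ēéěè", "i": "īíǐì",
    "o": "ōóǒò", "u": "ūúǔù", "ü": "ǖǘǚǜ",
}

def mark_syllable(syllable: str) -> str:
    """Convert one numbered syllable such as 'hai3' to its diacritic form."""
    letters, tone = syllable[:-1], int(syllable[-1])
    if tone not in (1, 2, 3, 4):
        return letters  # neutral tone: leave the syllable unmarked
    if "a" in letters:
        target = letters.index("a")
    elif "e" in letters:
        target = letters.index("e")
    elif "ou" in letters:
        target = letters.index("o")
    else:
        # Otherwise the mark goes on the last vowel (e.g. the i of "shui").
        target = max(i for i, ch in enumerate(letters) if ch in TONE_MARKS)
    marked = TONE_MARKS[letters[target]][tone - 1]
    return letters[:target] + marked + letters[target + 1:]
```

For instance, the numbered syllables in the `pint=` parameter of the 曾經滄海難為水 example can be passed through `mark_syllable` one at a time to obtain the marked syllables used in `pin=`.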
Tone sandhi
Mandarin dictionaries are inconsistent in how they depict tone sandhi in Pinyin. For example, the character 不 (normally bù, fourth tone) changes to second tone (bú) when followed by another fourth-tone syllable. Some dictionaries spell it with the converted tones (búshì in this case, e.g. HSK汉语水平考试词典, ISBN 7561720785), while others use the root tones (bùshì, e.g. 现代汉语词典, ISBN 9620701348). For Wiktionary entries, it is advisable to always use the root tone for syllables when spelling words in Pinyin (bùshì). This is also more consistent with how tone sandhi is handled in other situations in Mandarin. For example, 可以 is spelled kěyǐ, even though the first syllable changes to second tone (pronounced: kéyǐ).
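The distinction between the spelling used in entries (root tones) and the spoken form (after sandhi) can be illustrated with a small Python sketch. It is a simplification, not a complete model of Mandarin tone sandhi: it covers only the two cases mentioned above (不 before a fourth tone, and a third tone before another third tone) and only looks at adjacent pairs. The function names and the (character, syllable) input format are assumptions made for illustration.

```python
def entry_spelling(word):
    """Spelling recommended for entries: root tones, unchanged."""
    return "".join(syllable for _, syllable in word)

def spoken_tones(word):
    """Return the pronounced syllables after two simple sandhi rules.

    `word` is a list of (character, numbered syllable) pairs with root
    tones, e.g. [("不", "bu4"), ("是", "shi4")].
    """
    spoken = [syllable for _, syllable in word]
    for i in range(len(word) - 1):
        char, current = word[i]
        following = word[i + 1][1]
        if char == "不" and current == "bu4" and following.endswith("4"):
            spoken[i] = "bu2"  # 不 before a fourth tone
        elif current.endswith("3") and following.endswith("3"):
            spoken[i] = current[:-1] + "2"  # third tone before third tone
    return spoken
```

With this sketch, `entry_spelling` always returns the root-tone form that belongs in the entry, while `spoken_tones` shows what a dictionary that writes converted tones would print.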
Min Bei
Use Template:infl:
{{infl|mnp|(pos)|script=Hani|...}}
where (pos) is the part of speech and the remainder is the other forms. See the documentation.
Min Dong
Use Template:infl:
{{infl|cdo|(pos)|script=Hani|...}}
where (pos) is the part of speech and the remainder is the other forms. See the documentation.
Min Nan
The Min Nan language family has four main branches. Two of the four may be encountered outside of China. This poses a problem for Wiktionary, since these dialects are not mutually intelligible,[1] and only one L2 header may be used per ISO 639 code. Since Min Nan is the official name for the ISO 639 code of nan, it must be used as the L2 header. To date, virtually all entries for Min Nan have been based on the Amoy dialect, which is widely considered to be a de facto standard. The disposition of other dialects such as Teochew and Qiongwen Hainanese remains undecided at this time. If you are fluent in one of those languages, and have an interest in adding words to Wiktionary, please post a message on Wiktionary:Beer parlour. If there is interest, we will try to work something out.
Templates:
- {{nan-noun}}
- {{nan-proper}}
- {{nan-verb}}
- {{nan-adj}}
- {{nan-adv}}
- {{nan-idiom}}
- {{nan-proverb}}
- {{nan-post}}
- {{nan-prep}}
- {{nan-particle}}
- {{nan-conj}}
- {{nan-inter}}
- {{nan-pronoun}}
Romanization
Recall that the standard romanization of Min Nan is POJ.
Min Zhong
Use Template:infl:
{{infl|czo|(pos)|script=Hani|...}}
where (pos) is the part of speech and the remainder is the other forms. See the documentation.
Pu-Xian
Use Template:infl:
{{infl|cpx|(pos)|script=Hani|...}}
where (pos) is the part of speech and the remainder is the other forms. See the documentation.
Wu
Use Template:infl:
{{infl|wuu|(pos)|script=Hani|...}}
where (pos) is the part of speech and the remainder is the other forms. See the documentation.
Xiang
Use Template:infl:
{{infl|hsn|(pos)|script=Hani|...}}
where (pos) is the part of speech and the remainder is the other forms. See the documentation.
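The {{infl}} lines given for the languages above differ only in the language code. As a rough Python sketch (the helper `infl_headword` and the dictionary `INFL_LANGUAGE_CODES` are illustrative names, not part of Wiktionary), the pattern can be generated from the codes listed on this page. Any additional forms are passed through unchanged, since their exact parameters are described in the template documentation rather than here.

```python
# Language codes exactly as given in the sections above.
INFL_LANGUAGE_CODES = {
    "Cantonese": "yue",
    "Gan": "gan",
    "Hakka": "hak",
    "Huizhou": "czh",
    "Jinyu": "cjy",
    "Min Bei": "mnp",
    "Min Dong": "cdo",
    "Min Zhong": "czo",
    "Pu-Xian": "cpx",
    "Wu": "wuu",
    "Xiang": "hsn",
}

def infl_headword(language: str, pos: str, *other_forms: str) -> str:
    """Build the {{infl}} headword line for one of the languages above."""
    code = INFL_LANGUAGE_CODES[language]
    parts = ["infl", code, pos, "script=Hani", *other_forms]
    return "{{" + "|".join(parts) + "}}"
```

Calling `infl_headword("Wu", "noun")` produces the same line as the Wu section above with `(pos)` filled in and no further forms.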
Other entry sections
Etymology
If possible, it is desirable for entries to have etymologies, showing earlier pronunciations, spellings (if hanzi usage has changed), and semantic change (change in meaning).
For terms or phrases that can be traced back to Literary Chinese, you may wish to use the etymology template in the form {{etyl|lzh|cmn}} (where cmn is the Modern Chinese language in which the term is used).
References
Like other Wikimedia projects, Wiktionary is largely the work of anonymous volunteers. Therefore it is important to cite authoritative reference works such as dictionaries and encyclopedias. The {{pedialite}} template is a good choice if you want to cite a Wikipedia article. If matching articles in Chinese and English can be found on Wikipedia (particularly true for nouns), you can use the {{pedialite}} template in the following manner:
*{{pedialite|剪刀|lang=zh}}
*{{pedialite|scissors}}
For external websites, you can use the {{cite web}} template. Here is an example:
===References===
- "剪刀 (in Mandarin)." Guoyu Cidian On-line Mandarin Dictionary (國語辭典). URL accessed on 2008-04-09.
Reference books
For books, you can use the {{cite book}} template. For your convenience, the filled out templates for some authoritative reference works are provided (click on the blue edit button to copy):
===References===
- Wu, Jingrong (ed.) (1985). The Pinyin CHINESE-ENGLISH DICTIONARY (in Mandarin/English). Beijing, Hong Kong: The Commercial Press. ISBN 0471867969.
- (1994) Dictionary of Modern Chinese (現代漢語詞典) (in Mandarin). Hong Kong: The Commercial Press. ISBN 9620701348.
- (2007) Hanyu Da Cidian 3.0 (in Mandarin). Hong Kong: Commercial Press. ISBN 9789620702778.
- (1998) Hanyu Da Zidian (in Mandarin). Taiwan: Jianhong Publishers. ISBN 9578134789.
- Shao, Jingmin (ed.) (2000). HSK Dictionary (汉语水平考试词典) (in Mandarin/English). Shanghai: Huadong Teachers College Publishers. ISBN 7561720785.
Chinese Categories
There are two sets of categories for the Chinese languages:
- The zh-cn/zh-tw/yue-cn... categories, whose names resemble those normally used only for topics.
- The "Mandarin Languages"/"Mandarin nouns"/"Cantonese language" categories, whose names resemble those of other language categories.
The zh-cn/zh-tw/yue-cn... categories
There are separate categories for various forms of Chinese languages, corresponding to the language and some written form, which are distinguished by language codes.
The base of a language code is an ISO 639-3 code such as nan or yue (for legacy reasons and familiarity, ISO 639 zh is used for Mandarin, instead of the ISO 639-3 cmn), and it is followed by an optional suffix:
- -cn in simplified script, per PRC usage
- -tw in traditional script, per Taiwan usage
- no suffix for the standard romanization (Pinyin, POJ, Jyutping)
In detail, this yields:
- zh-cn = Chinese (Standard Mandarin) in simplified script, per PRC usage
- zh-tw = Chinese (Standard Mandarin) in traditional script, per Taiwan usage
- zh = Chinese (Standard Mandarin) in romanized Pinyin
- nan-cn = Min Nan (Amoy) in simplified script, per PRC usage
- nan-tw = Min Nan (Amoy) in traditional script, per Taiwan usage
- nan = Min Nan (Amoy) in romanized POJ
- yue-cn = Cantonese in simplified script, per PRC usage
- yue-hk = Cantonese in traditional script, per Hong Kong usage
- yue = Cantonese in romanized Jyutping (I don't speak Cantonese, so if this is wrong, please correct)
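The scheme above is regular enough to express as a lookup. The following Python sketch reproduces the nine codes listed; the function name `category_code` and the form labels "simplified", "traditional" and "romanized" are illustrative assumptions, and the Hong Kong suffix for Cantonese simply follows the list above.

```python
# Base codes as listed above (zh is used for Mandarin instead of cmn).
BASE_CODES = {"Mandarin": "zh", "Min Nan": "nan", "Cantonese": "yue"}

def category_code(language: str, form: str) -> str:
    """Return the category language code for a language and written form."""
    base = BASE_CODES[language]
    if form == "romanized":
        return base  # no suffix for the standard romanization
    if form == "simplified":
        return base + "-cn"  # simplified script, per PRC usage
    if form == "traditional":
        # The list above gives Hong Kong usage for Cantonese, Taiwan otherwise.
        return base + ("-hk" if language == "Cantonese" else "-tw")
    raise ValueError("unknown form: " + form)
```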
The templates {{zh-simplified}} and {{zh-traditional}} can be used at the top of a category to indicate which form of characters is used.
The "Mandarin Languages"/"Cantonese language" categories
The zh-cn/zh-tw categories are used extensively in this Wiktionary and support sorting by radical and stroke. The "Mandarin language" categories always contain the Pinyin, simplified and traditional forms of the words together, sorted by Pinyin with tone number. They lack categories such as Mandarin nouns by stroke, Mandarin nouns by pinyin, Cantonese nouns by stroke, etc. They also have no counterpart to the many small zh-cn categories, for example Category:zh-cn:Job titles in Romance of the Three Kingdoms. But since the category name "Mandarin nouns" is much easier for everyone to understand than the equivalent "Category:zh-cn:Nouns", both sets of categories are kept in Wiktionary.
Hanzi form templates
To display various forms of Hanzi, it is recommended that Chinese entries make use of the following templates (in addition to the above):
{{zh-forms}}
| Simplified | 汉字 |
| Traditional | 漢字 |
This template, {{zh-forms}}, should be placed below the language header.
Ideally, the characters should be hyperlinked as follows: link to the entire phrase for different forms, and component words for the given form. In more detail:
Simplified entries
| Simplified | 太极拳 |
| Traditional | 太極拳 |
The simplified characters should be linked according to longest component words. If no compound words exist within the entry, then the individual characters should be hyperlinked.
The entire traditional phrase should be linked so that the user may conveniently navigate to the companion traditional entry (there are cases where simplified and traditional entries contain slightly different information).
Traditional entries
| Simplified | 太极拳 |
| Traditional | 太極拳 |
The traditional characters should be linked according to longest component words. If no compound words exist within the entry, then the individual characters should be hyperlinked.
The entire simplified phrase should be linked so that the user may conveniently navigate to the companion simplified entry (there are cases where simplified and traditional entries contain slightly different information).
Pinyin entries
| Simplified | 太极拳 |
| Traditional | 太極拳 |
The entire simplified phrase and the entire traditional phrase should be hyperlinked to allow for easy navigation to the simplified and traditional entries (which often contain additional information that is lacking in the Pinyin entry).
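The linking rules for the three kinds of entry can be summarised in a short Python sketch. It is illustrative only: the function `zh_forms` and its arguments are assumed names, and the caller is expected to supply the phrase already split into its longest component words, since that segmentation is an editorial judgment.

```python
def link(term: str) -> str:
    """Wrap a term in wiki link brackets."""
    return "[[" + term + "]]"

def zh_forms(simplified_words, traditional_words, entry_form: str) -> str:
    """Build a {{zh-forms}} line following the linking rules above.

    entry_form is "simplified", "traditional" or "pinyin", naming the
    kind of entry the template is being placed in.
    """
    whole_simplified = link("".join(simplified_words))
    whole_traditional = link("".join(traditional_words))
    if entry_form == "simplified":
        # Own form split into component words; companion form linked whole.
        first = "".join(link(w) for w in simplified_words)
        second = whole_traditional
    elif entry_form == "traditional":
        first = whole_simplified
        second = "".join(link(w) for w in traditional_words)
    elif entry_form == "pinyin":
        # Both full phrases linked, for navigation to the Hanzi entries.
        first, second = whole_simplified, whole_traditional
    else:
        raise ValueError("unknown entry form: " + entry_form)
    return "{{zh-forms|" + first + "|" + second + "}}"
```

Called with the component words of 曾經滄海難為水 and `entry_form="traditional"`, this reproduces the {{zh-forms}} line shown in the Entry format example.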
{{zh-hanzi}}
| simpl. and trad. | 功夫 |
This template may be used in cases where the word is the same in both simplified and traditional. The template should be placed in the same location as where {{zh-forms}} would have been.
{{ja-forms}}
| Japanese | 翻訳 |
| Simplified | 翻译 |
| Traditional | 翻譯 |
This template can be used in cases where you would like to illustrate the difference between the Japanese simplified kanji form, the PRC simplified form, and the traditional form. The template would be used in place of {{zh-forms}}.
{{zh-ts}}
This template may be used in places where both the simplified and traditional versions should be placed side by side (e.g. the translations section of English entries).
- For a complete list of Chinese templates: Category:Chinese templates
Additional help
Help from the community
Sometimes, we know there is a problem, but don't know what to do to correct the problem. If you should find a Chinese entry with a problem that you do not know how to correct, there are several ways to approach the situation.
- Mark the page with {{zh-attention}}. This template adds the entry to Category:Chinese words needing attention, where another user can then find and correct the problem. It helps if you include comments on the entry's talk page explaining what the problem is or why you think the page needs attention.
- Raise the issue on Wiktionary talk:About Chinese. Note that this approach is primarily for issues of style, formatting, categorization, and not for specifics of content.
- Mark the page with {{rfc}}. This is a more general cleanup tag, and it allows the user to include reasons or concerns as an argument in the template. Be sure to also add an entry to WT:RFC concerning the word so that other editors will be made aware of the problem.
Other Chinese aids
- Category:Chinese language
- Category:Chinese templates
- Wiktionary:Requested entries:Chinese
- Category:Translations to be checked (Chinese)
- Wikipedia:Chinese language
- Wikipedia:Chinese grammar
- Wikipedia:Written Chinese