Wiktionary:About Chinese

The policies for the Chinese language are changing.
Following Wiktionary:Votes/pl-2014-04/Unified Chinese and the preceding discussions and agreements, the structure of Sinitic or Chinese (Mandarin, Cantonese, Min Nan, Wu, Hakka, etc.) entries is changing. The entries are being merged and new methods to display topolect information are being used.


This is a Wiktionary policy, guideline or common practices page. This is a draft; the format of entries for Chinese words was modified in the second half of 2006.

The Chinese or Sinitic languages form a family of related languages that have very similar written forms, but different grammar, vocabulary and especially pronunciation. On Wiktionary, each language within the family is treated as a distinct language. Thus, there is no "Chinese language" on Wiktionary, and its corresponding language code "zh" is not used.

This page details the common aspects for the various Chinese languages on Wiktionary.

The Chinese language group

In the Chinese language group, each level two language header must correspond to an ISO 639-3 language code for an individual (as opposed to macro) language. The header should use the name that ISO 639-3 gives for that language. For the sake of simplicity, the word “Chinese” should be omitted (for example, “Mandarin” rather than “Mandarin Chinese”). The only exception to this rule is Yue Chinese, which is headed “Cantonese”, since the word Yue is virtually unknown to the average English speaker and Cantonese is widely recognized in English. Within a language, dialects or variations are identified by tagging the pronunciations and the senses (definition lines) with the dialect or region in which they are used, as in other language groups. Archived discussions and decisions on this topic can be found at:

The languages, with the standard names used on Wiktionary, are:

Standard name | Language code | Category | Information page
Cantonese | yue | Category:Cantonese language | Wiktionary:About Cantonese
Gan | gan | Category:Gan language | Wiktionary:About Gan
Hakka | hak | Category:Hakka language | Wiktionary:About Hakka
Huizhou | czh | Category:Huizhou language | Wiktionary:About Huizhou
Jin | cjy | Category:Jin language | Wiktionary:About Jin
Mandarin | cmn | Category:Mandarin language | Wiktionary:About Mandarin
Min Bei | mnp | Category:Min Bei language | Wiktionary:About Min Bei
Min Dong | cdo | Category:Min Dong language | Wiktionary:About Min Dong
Min Nan | nan | Category:Min Nan language | Wiktionary:About Min Nan
Min Zhong | czo | Category:Min Zhong language | Wiktionary:About Min Zhong
Pu-Xian | cpx | Category:Pu-Xian language | Wiktionary:About Pu-Xian
Wu | wuu | Category:Wu language | Wiktionary:About Wu
Xiang | hsn | Category:Xiang language | Wiktionary:About Xiang

Many other languages spoken in China are not dialects of these 13 and have their own ISO 639 codes, from Achang to Hmong to Manchu to Uyghur to Yi to Zhuang, and many in between. These are not addressed here.

Character forms and romanization

Chinese words have at least three common forms:

  • Traditional characters, as used in Taiwan, Hong Kong and Malaysia,
  • Simplified characters, as used in the PRC, Singapore and Malaysia (a simplified form may agree exactly or approximately with the traditional character), and
  • Romanized forms, of which there may be more than one.

Every entry should list and link to all alternative forms (including all romanizations); there are a number of templates listed in #Entry format that can assist with this, of which {{cmn-noun}} is representative.

It appears that entries are duplicated between traditional and simplified forms, so as not to prioritize one form over the other, and these entries link to each other. See #Hanzi form templates below for templates to assist with this.

Headwords that are romanizations point to both the traditional and simplified forms, but do not duplicate all entries with that pronunciation. Instead, they have a “Romanization” L3 heading, and their definition lines simply link to characters with that reading.[1] See yánlì for an example.

Collation

There does not appear to be a consensus on how to collate (order) entries such as “Derived terms” or “Compounds” in Chinese entries. For romanized entries, usual alphabetical order is correct, but for Chinese characters, one might use either a phonetic ordering or radical-and-stroke sorting, as both are used in Chinese dictionaries.

Entry format

This has been the subject of much discussion in the Beer parlour and on the user pages of individual contributors. For a basic outline of how to create a Chinese entry, see How to Create a Basic Chinese Entry. A preliminary consensus has been reached whereby Chinese entries will have a level two header that matches a specific language in the Chinese family of languages. Chinese entries should follow the format guidelines in Entry layout explained. However, there are a few issues unique to Chinese. Templates have been created to address many of these issues (see: Category:Chinese templates). For example:

==Mandarin==
{{Hani-forms|[[曾经沧海难为水]]|[[曾]][[經]][[滄海]][[難]][[為]][[水]]}}
===Idiom===
{{cmn-idiom|t|pin=céng jīng cānghǎi nán wéi shuǐ|pint=ceng2jing1cang1hai3nan2wei2shui3|tra=曾經滄海難為水|sim=曾经沧海难为水|rs=曰08}}

曾經滄海難為水 and 井底之蛙 are both good examples of how Chinese entries should ideally be formatted. Two of the Chinese languages have a full set of templates at this time.

The script code for Chinese characters is Hani.

Cantonese

Templates:

For other parts of speech, use Template:head:

{{head|yue|(pos)|...}} (e.g. 自行車 → {{head|yue|noun|traditional||Jyutping|zi6 hang4 ce1|simplified|自行车}})

where (pos) is the part of speech and the remainder is the other forms. See the documentation.

Romanizations

Recall that the standard romanizations of Cantonese are:

Gan

Use Template:head:

{{head|gan|(pos)|...}}

where (pos) is the part of speech and the remainder is the other forms. See the documentation.
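
As a minimal sketch, assuming a Gan noun entry with no further forms to list, the headword line could be as short as the following. The heading and part of speech are illustrative only:

```wikitext
==Gan==

===Noun===
{{head|gan|noun}}
```

Any other forms would follow as pairs of label and form, in the manner of the Cantonese example above. The same pattern applies to the other languages below that use Template:head, with the language code swapped.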

Hakka

Use Template:head:

{{head|hak|(pos)|...}}

where (pos) is the part of speech and the remainder is the other forms. See the documentation.

Huizhou

Use Template:head:

{{head|czh|(pos)|...}}

where (pos) is the part of speech and the remainder is the other forms. See the documentation.


Jin

Use Template:head:

{{head|cjy|(pos)|...}}

where (pos) is the part of speech and the remainder is the other forms. See the documentation.

Mandarin

Templates:

Romanizations

Recall that the standard romanizations of Mandarin are:

For individual syllables, we have entries in each of these systems, as well as in pinyin with no tones marked at all.[2] For words with multiple syllables, we only have entries for the pinyin romanizations, with tones marked using diacritics.[1]

Mandarin romanization entries only provide a link to Chinese character entries and should look like this (if the traditional and simplified characters are different):

==Mandarin==

===Romanization===
{{cmn-pinyin}}

# {{pinyin reading of|嚴厲|严厉}}
# {{pinyin reading of|妍麗|妍丽}}
# {{pinyin reading of|沿例}}
# {{pinyin reading of|岩櫟|岩栎}}
# {{pinyin reading of|沿歷|沿历}}
...

These romanization entries make it easier to search for Chinese characters. All important information is contained in the Chinese character entries. See the vote Wiktionary:Votes/2011-07/Pinyin entries.

Tone sandhi

Some Mandarin dictionaries are inconsistent when it comes to depicting tone sandhi in Pinyin. For example, the character 不 (normally bù, fourth tone) changes to second tone (bú) when followed by another fourth-tone syllable. Some dictionaries spell it with the converted tones (búshì in this case, e.g. HSK汉语水平考试词典, ISBN 7561720785), while others use the root tones (bùshì, e.g. 现代汉语词典, ISBN 9620701348). For Wiktionary entries, it is advisable to always use the root tone for syllables when spelling words in Pinyin (bùshì). This is also more consistent with how tone sandhi is handled in other situations in Mandarin. For example, 可以 is kěyǐ, even though the first syllable changes to second tone (pronounced: kéyǐ).
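
Under this convention, the romanization entry for 不是 would sit at bùshì rather than búshì. A sketch in the romanization entry format shown earlier, assuming that a single argument suffices because 不是 is written identically in traditional and simplified script:

```wikitext
==Mandarin==

===Romanization===
{{cmn-pinyin}}

# {{pinyin reading of|不是}}
```

Other words that share the reading bùshì would be listed on further definition lines of the same entry.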

Min Bei

Use Template:head:

{{head|mnp|(pos)|...}}

where (pos) is the part of speech and the remainder is the other forms. See the documentation.

Min Dong

Use Template:head:

{{head|cdo|(pos)|...}}

where (pos) is the part of speech and the remainder is the other forms. See the documentation.

Min Nan

The Min Nan language family has four main branches. Two of the four may be encountered outside of China. This poses a problem for Wiktionary, since these dialects are not mutually intelligible,[1] and only one L2 header may be used per ISO 639 code. Since Min Nan is the official name for the ISO 639 code of nan, it must be used as the L2 header. To date, virtually all entries for Min Nan have been based on the Amoy dialect, which is widely considered to be a de facto standard. The disposition of other dialects such as Teochew and Qiongwen Hainanese remains undecided at this time. If you are fluent in one of those languages, and have an interest in adding words to Wiktionary, please post a message on Wiktionary:Beer parlour. If there is interest, we will try to work something out.

Templates:

Romanization

The standard romanization of Min Nan is POJ.

Min Zhong

Use Template:head:

{{head|czo|(pos)|...}}

where (pos) is the part of speech and the remainder is the other forms. See the documentation.

Pu-Xian

Use Template:head:

{{head|cpx|(pos)|...}}

where (pos) is the part of speech and the remainder is the other forms. See the documentation.

Wu

Use Template:head:

{{head|wuu|(pos)|...}}

where (pos) is the part of speech and the remainder is the other forms. See the documentation.

Xiang

Use Template:head:

{{head|hsn|(pos)|...}}

where (pos) is the part of speech and the remainder is the other forms. See the documentation.

Historical languages

Historical Sinitic languages include the spoken languages Middle Chinese (ltc) and Old Chinese (och), the written language Literary Chinese (lzh), and the protolanguage Proto-Sino-Tibetan. Entries for words in these languages are permitted, except for Proto-Sino-Tibetan, which is a protolanguage and is therefore placed in the appendix namespace. These terms can also appear in etymologies for entries in modern Sinitic languages, and in entries for languages that have borrowed from Chinese, notably Japanese, Korean, and Vietnamese.

Finer distinctions are possible, such as Late Middle Chinese and Early Middle Chinese for the spoken language, and Literary Chinese versus earlier Classical Chinese for the written language. These distinctions can be made in the text of etymologies, but these do not have ISO 639 codes, and thus are not used for level 2 headings.

The precise meaning and status of these “languages” is complicated: narrowly speaking “Middle Chinese” and “Old Chinese” refer to various phonological reconstructions, notably based on rime dictionaries, and do not necessarily refer to a specific historical dialect or common language. Nevertheless, they are useful designations for historical periods.

Most modern Sinitic languages descend from Middle Chinese, with the notable exception of Min, which diverged earlier, with Proto-Min also descending from Old Chinese; see branching of modern varieties of Chinese. A notable example of this difference is 茶: English tea comes from the Min pronunciation, while chai comes from other Chinese varieties.

Literary Chinese is significantly different from the spoken languages; this may be compared with Medieval Latin versus Romance languages. Literary Chinese (lzh) is the correct source language for literary terms in modern Sinitic languages, notably chengyu (four-character idioms), and in borrowings such as the corresponding Japanese yojijukugo.

Middle Chinese

As Middle Chinese phonology is not attested (it is only reconstructed), please be sure to mark pronunciations with *.

Old Chinese

As Old Chinese phonology is not attested (it is only reconstructed), please be sure to mark pronunciations with *. As sources differ, please carefully cite specific references (author and year) for any reconstructions.

References for Old Chinese phonology include:

Cognates and stubs

Across Sinitic languages, a single written form is very frequently shared across a long historical period and wide geographical area. Thus cognate entries in different languages appear on the same page; this occurs quite frequently for cognates in closely related languages in other scripts, but to nowhere near the same degree as in Sinitic languages. Due to this, it is generally unhelpful, and possibly incorrect, to create an entry for one Sinitic language simply by copying the heading and definitions for Mandarin. It is unhelpful because this adds no information beyond what a reader could guess for themselves (cognate, so probably the same meaning), and possibly incorrect because words do differ between these languages; blindly copying without a reference is not reliable.

Thus, when creating a new Sinitic entry, please try to add some information distinctive to the particular language, particularly pronunciation, references, or citations.

For etymologies, each entry should include an Etymology section indicating its immediate ancestor term. For native words in modern Sinitic languages this is either Middle Chinese (most) or Proto-Min (thence Old Chinese) for Min languages. Per usual practice (see Wiktionary:Etymology), it is acceptable to include full etymologies back to Proto-Sino-Tibetan in modern entries. However, unless there is something specific to the etymology of a term in a given language, this is tedious to repeat for all modern languages. It is thus preferred (and sufficient) to only include the full history at representative languages, namely Mandarin and Min Nan (most used in each branch), with other languages just indicating the immediate predecessor and having a link reading “more at Mandarin/Min Nan”.

Similarly, it is tedious and not helpful to list contemporary cognate terms unless some particular relationship or contrast is being given. Instead, ancestral relationships can be given both backwards (in the Etymology section), to Middle Chinese, Old Chinese, and Proto-Sino-Tibetan, and forwards (in the Descendants section), from Middle Chinese, Old Chinese, and Proto-Sino-Tibetan to later forms. In these Descendants sections, listing pronunciations of descendant terms along with the spelling allows easy comparison, and avoids the duplication of the same listing in all modern forms. These are more useful than sibling relationships between cognates.

Chinese characters

Chinese characters should not be conflated with Chinese words or morphemes. Information about the characters themselves appears in the Translingual section, which comes before all other sections. See About Chinese characters for discussion of its format.

In general the Translingual section only includes information on the character form (in Etymology and script variations) and the meanings, which are widely shared. It does not include pronunciation information, except when necessary to understand the form. This occurs for example in phono-semantic compounds, where reconstructions of the pronunciations of the compound character and its phonetic are relevant to the form.

Specifically, discussion of the phonetic change of a character over time in Old Chinese, Middle Chinese, and various modern Sinitic languages belongs in the language-specific sections. However, information on when a meaning of a character developed (whether in some Sinitic language or a separate one, such as Japanese) is acceptable in the Translingual section.

Other entry sections

Etymology

If possible, it is desirable for entries to have etymologies, showing earlier pronunciations, spellings (if hanzi usage has changed), and semantic change (change in meaning).

For terms or phrases that can be traced back to Literary Chinese, you may wish to use the etymology template in the form {{etyl|lzh|cmn}} (where cmn is the Modern Chinese language in which the term is used).
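
A minimal sketch of how this might sit in a Mandarin entry; the surrounding wording is an assumption, and only the {{etyl}} call itself comes from the guideline above:

```wikitext
===Etymology===
From {{etyl|lzh|cmn}}.
```

For an entry in another modern language, the second code would change accordingly, for example yue for Cantonese.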

References

Like other Wikimedia projects, Wiktionary is largely the work of anonymous volunteers. Therefore it is important to cite authoritative reference works such as dictionaries and encyclopedias. The {{pedialite}} template is a good choice if you want to cite a Wikipedia article. If matching articles in a Chinese language and English can be found on Wikipedia (particularly true for nouns), you can use the {{pedialite}} template in the following manner (example given for Mandarin):

*{{pedialite|剪刀|lang=cmn}}
*{{pedialite|scissors}}

In the references section, it will look like:

References

For external websites, you can use the {{cite web}} template. Here is an example:

===References===

  • "剪刀" (in Mandarin), Guoyu Cidian On-line Mandarin Dictionary (國語辭典). URL accessed on 2008-04-09.


Reference books

For books, you can use the {{reference-book}} template. For your convenience, the filled out templates for some authoritative reference works are provided (click on the blue edit button to copy):

===References===

Chinese Categories

Mandarin, Cantonese, and Min Nan subdivide their existing topical and part-of-speech-level categories by script. This means that while a given category, such as Category:Min Nan nouns, will contain terms in all scripts, there will be several subcategories that contain only terms in specific scripts, such as Category:Min Nan nouns in simplified script, Category:Min Nan nouns in traditional script, and Category:Min Nan nouns in POJ script.

Hanzi form templates

To display various forms of Hanzi, it is recommended that Chinese entries make use of the following templates (in addition to the above):

{{Hani-forms}}

This template, {{Hani-forms}}, should be placed below the language header.

Ideally, the characters should be hyperlinked as follows: link to the entire phrase for different forms, and component words for the given form. In more detail:

Simplified entries

The simplified characters should be linked according to longest component words. If no compound words exist within the entry, then the individual characters should be hyperlinked.

The entire traditional phrase should be linked so that the user may conveniently navigate to the companion traditional entry (there are cases where simplified and traditional entries contain slightly different information).

Traditional entries

The traditional characters should be linked according to longest component words. If no compound words exist within the entry, then the individual characters should be hyperlinked.

The entire simplified phrase should be linked so that the user may conveniently navigate to the companion simplified entry (there are cases where simplified and traditional entries contain slightly different information).
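
The two linking conventions can be sketched with the idiom from the Entry format section. The traditional-entry line reproduces the call shown there; the simplified-entry line is an assumed mirror image of it, with the parameter order (simplified first, traditional second) inferred from that example rather than from the template documentation:

```wikitext
<!-- In the simplified entry 曾经沧海难为水: component words linked in simplified, whole traditional phrase linked -->
{{Hani-forms|[[曾]][[经]][[沧海]][[难]][[为]][[水]]|[[曾經滄海難為水]]}}

<!-- In the traditional entry 曾經滄海難為水: whole simplified phrase linked, component words linked in traditional -->
{{Hani-forms|[[曾经沧海难为水]]|[[曾]][[經]][[滄海]][[難]][[為]][[水]]}}
```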

{{zh-hanzi-box}}

This template may be used in cases where the word is the same in both simplified and traditional. The template should be placed in the same location as where {{Hani-forms}} would have been.

{{ja-forms}}

shinjitai
simplified: 翻译
traditional: 翻譯

This template can be used in cases where you would like to illustrate the difference between the Japanese simplified kanji form, the PRC simplified form, and the traditional form. The template would be used in place of {{Hani-forms}}.

Additional help

Help from the community

Sometimes, we know there is a problem, but don’t know what to do to correct the problem. If you should find a Chinese entry with a problem that you do not know how to correct, there are several ways to approach the situation.

  1. Mark the page with {{attention}} with a language code. This template adds the entry to the cleanup category for that language (such as Category:Mandarin terms needing attention), where another user can then find and correct the problem. It helps if you include comments on the entry’s talk page explaining what the problem is or why you think the page needs attention.
  2. Raise the issue on Wiktionary talk:About Sinitic languages. Note that this approach is primarily for issues of style, formatting, categorization, and not for specifics of content.
  3. Mark the page with {{rfc}}. This is a more general cleanup tag, and it allows the user to include reasons or concerns as an argument in the template. Be sure to also add an entry to WT:RFC concerning the word so that other editors will be made aware of the problem.
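
As an illustration of options 1 and 3, the tags might be written as follows. The reason text is hypothetical, and where exactly the tags belong within the entry is not specified here, so check each template's documentation before relying on this sketch:

```wikitext
<!-- Option 1: flag the entry for attention, passing the language code -->
{{attention|cmn}}

<!-- Option 3: general cleanup tag, with the concern passed as an argument -->
{{rfc|pinyin does not match the characters in the headword line}}
```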

Translations into Chinese languages/dialects/topolects

  1. All translations into Chinese languages must be grouped under * Chinese. Subdialects can be sub-nested. Regional variations can be flagged with {{qualifier}}.
* Chinese:
*: Mandarin: {{t|cmn|肥皂|tr=féizào|sc=Hani}}
*: Min Nan: {{t|nan|雪文|tr=sat-bûn}} {{qualifier|Zhangzhou}}, {{t|nan|茶塊|tr=tê-kóe}} {{qualifier|Quanzhou}} ...
  2. The traditional version precedes the simplified version if they are different, and the transliteration is provided with the simplified version.
* Chinese:
*: Mandarin: {{t|cmn|心理學|sc=Hani}}, {{t|cmn|心理学|tr=xīnlǐxué|sc=Hani}}
  3. If the translation is the same in simplified and traditional script, only one translation is given.
* Chinese:
*: Mandarin: {{t|cmn|三明治|tr=sānmíngzhì|sc=Hani}}

Other Chinese aids

References

  1. Wiktionary:Votes/2011-07/Pinyin entries
  2. Wiktionary:Votes/pl-2009-12/Treatment of toneless pinyin syllables