The policies for the Chinese language are changing.
Following Wiktionary:Votes/pl-2014-04/Unified Chinese and the preceding discussions and agreements, the structure of Sinitic or Chinese (Mandarin, Cantonese, Min Nan, Wu, Hakka, etc.) entries is changing. The entries are being merged and new methods to display topolect information are being used. The body of this page needs to be updated to explain the new policy.

This is a Wiktionary policy, guideline or common practices page. This is a draft proposal. It is unofficial, and it is unknown whether it is widely accepted by Wiktionary editors.

The Chinese or Sinitic language family includes a number of related lects which have very similar written forms, but different grammar, vocabulary and especially pronunciation. On Wiktionary, these lects are treated under the header ==Chinese== and the language code zh unless they natively use a non-Chinese script.

Key points

  • The various varieties of Chinese are subsumed under the header ==Chinese== and the language code zh (vote).
    The exception is when the variety of Chinese natively uses a non-Chinese script, e.g. Dungan.
  • A Traditional Chinese form of a Chinese word, usually the most commonly used Traditional Chinese form, is chosen as the lemma (vote).
    All other forms of the exact same word should soft-redirect to the lemma form using {{zh-see}}.
  • Terms are defined in relation to Modern Standard Written Chinese.
    Senses limited to the literary language, certain dialects or regions should be marked accordingly using label ({{lb}}, in definitions) and qualifier ({{q}}, elsewhere) tags. For example, 簡訊 uses {{lb|zh|chiefly|Taiwan}} and 煠熟狗頭 uses {{lb|zh|Cantonese}} to show that these terms are mainly used in Taiwan and exclusively Cantonese, respectively.
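
As a rough illustration of the soft-redirect rule above, the page for a simplified form such as 简讯 might contain little more than the following. This is a sketch rather than a copy of the live entry, and it assumes that {{zh-see}} takes the lemma form as its first parameter.

```wikitext
==Chinese==
<!-- Soft redirect from the simplified form 简讯 to the traditional lemma 簡訊.
     Assumption: the lemma is passed as the first positional parameter. -->
{{zh-see|簡訊}}
```

The definitions, pronunciations and labels then live only on the lemma page, so they do not need to be kept in sync across forms.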

Entry format

Chinese entries should follow format guidelines in Entry layout explained. 朋友 is a good example of how Chinese entries should ideally be formatted.

{{zh-new}} can be used to accelerate creation of entries.

Unless the entry is for a variant or simplified spelling, {{zh-forms}} and {{zh-pron}} are almost always obligatory.

Some useful Chinese-specific templates include {{zh-l}}, {{zh-x}}, {{zh-cat}}, and {{zh-compound}}.
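
The following is a minimal sketch of how these templates might fit together in a lemma entry, using 簡訊 as the example. It is illustrative only: the parameter names shown (s=, m=, cat=) and the gloss are assumptions that should be checked against each template's documentation and against a well-formatted entry such as 朋友.

```wikitext
==Chinese==
<!-- Assumption: s= supplies the simplified form of the lemma -->
{{zh-forms|s=简讯}}

===Pronunciation===
<!-- Assumption: m= is the Mandarin pinyin; cat=n sorts the entry into noun categories -->
{{zh-pron
|m=jiǎnxùn
|cat=n
}}

===Noun===
{{head|zh|noun}}

# {{lb|zh|chiefly|Taiwan}} [[text message]]
```

Readings for other lects would be added to {{zh-pron}} in the same way; they are omitted here to keep the sketch short.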

Entries for single characters

Variants and simplified forms use {{zh-see}} to redirect to the standard traditional form.

is a good example of how Chinese character entries should ideally be formatted.

Basic headers

Glyph origin
Describes how the character obtained its current shape. Templates used include {{Han etym}} and {{Han compound}}.
A character is not always of Chinese origin; see for an example of what to do in this case.

Etymology
Describes the origin of the character's pronunciation(s). (Or, used to host a {{zh-see}} box; see documentation for details and for an example.)
Note that Old Chinese and Middle Chinese have been subsumed under "Chinese", so indicating that Mandarin (hēi) is inherited from Old Chinese (/*hmlɯːɡ/) or Middle Chinese (/hək̚/) is redundant. However, indicating that a character is derived from a different character is fine, such as with (from Old Chinese ).
It is preferred that an entry is split by etymology per Old Chinese and Middle Chinese ancestor.

Pronunciation
Hosts {{zh-pron}}; see documentation for details.
The |cat= parameter does the work of sorting entries into categories such as Category:Chinese verbs, Category:Mandarin nouns, and Category:Cantonese chengyu, and should be filled out when reasonable.

Definitions
Hosts definitions. The rationale for using a "Definitions" header instead of "Noun", "Verb", or other more specific part-of-speech headers can be found here.
The {{zh-hanzi}} template is found directly under this header as the headword template. It has little practical value but is currently part of the standard Wiktionary entry format.

Other templates

  • {{zh-forms}}: found either at the top of an entry or under an Etymology header. See documentation for details.
  • {{zh-obsolete}} can be used to mark definitions as being obsolete in Modern Standard Chinese (but not necessarily other modern Chinese lects; see for an example).
Headword-line templates

About specific lects

Language | Lect | Romanisation | Romanisation allowed as | Notes
Mandarin | Standard Chinese (Beijing/Taiwan dialect) | Hanyu Pinyin | Allowed for (votes: 1, 2): monosyllables with diacritics (zhāng); monosyllables with tone numbers (zhang1); monosyllables with no tone mark (zhang); polysyllables with diacritics (yánlì) | Only original tones are indicated in Pinyin, e.g. consecutive third tones are shown as third tones. Phonetic pinyin is shown in the expanded mode on tone sandhi with characters and when the actual tone differs from the nominal, e.g. 一定 (yīdìng) [Phonetic: yídìng], 不過/不过 (bùguò) [Phonetic: búguò]
Mandarin | Sichuanese (Chengdu dialect) | Sichuanese Pinyin (help) | | Indicated by hyphen -.
Cantonese | Standard Cantonese (Guangzhou dialect) | Jyutping | Allowed for monosyllables. | Indicated by hyphen -.
Cantonese | (Taicheng dialect) | Wiktionary | No | See the romanisation page.
Gan | Nanchang dialect | Wiktionary | No |
Hakka | Sixian dialect (north and south) | Pha̍k-fa-sṳ (help) | No |
Hakka | Meixian dialect | Guangdong Romanization (help) | No |
Jin | Taiyuan dialect | Wiktionary | No |
Min Dong | Fuzhou dialect | Bàng-uâ-cê (help) | No |
Min Nan | Hokkien (multiple dialects) | Pe̍h-ōe-jī (help) | Yes (e.g. put-khó-su-gī) |
Min Nan | (multiple dialects) | Guangdong Ministry of Education's Teochew Romanization Scheme (not Gaginang's Peng'im) | No | Only original tones are used.
Wu | Shanghai dialect | Wiktionary | No | See WT:WUU#Tones
Xiang | Changsha dialect | Wiktionary | No |

Chinese characters

Chinese characters should not be conflated with Chinese words or morphemes. Information about the characters themselves appears in the Translingual section, which comes before all other sections. See Wiktionary:About Chinese characters for discussion of its format.

In general the Translingual section only includes information on the character form (in Etymology and script variations) and the meanings, which are widely shared. It does not include pronunciation information, except when necessary to understand the form. This occurs for example in phono-semantic compounds, where reconstructions of the pronunciations of the compound character and its phonetic are relevant to the form.

Specifically, discussion of the phonetic change of a character over time in Old Chinese, Middle Chinese, and various modern Sinitic languages belongs in the language-specific sections. However, information on when a meaning of a character developed (whether in some Sinitic language or a separate one, such as Japanese) is acceptable in the Translingual section.

Other entry sections


If possible, it is desirable for entries to have etymologies, showing earlier pronunciations, spellings (if hanzi usage has changed), and semantic change (change in meaning).

For terms or phrases that can be traced back to Literary Chinese, you may wish to use the etymology template in the form {{etyl|lzh|cmn}} (where cmn is the Modern Chinese language in which the term is used).

Additional help

Help from the community

Sometimes we know there is a problem but do not know how to correct it. If you find a Chinese entry with a problem that you do not know how to fix, there are several ways to approach the situation.

  1. Mark the page with {{attention}} with a language code. This template adds the entry to the cleanup category for that language (such as Category:Mandarin terms needing attention), where another user can then find and correct the problem. It helps if you include comments on the entry’s talk page explaining what the problem is or why you think the page needs attention.
  2. Raise the issue on Wiktionary talk:About Chinese. Note that this approach is primarily for issues of style, formatting, categorization, and not for specifics of content.
  3. Mark the page with {{rfc}}. This is a more general cleanup tag, and it allows the user to include reasons or concerns as an argument in the template. Be sure to also add an entry to WT:RFC concerning the word so that other editors will be made aware of the problem.
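
As a sketch of the two tagging options above, the templates might be placed in the entry roughly as follows. The language code cmn matches the example cleanup category mentioned above; the exact parameter layout and the comment texts are assumptions, so check each template's documentation before use.

```wikitext
<!-- Option 1: request attention for a specific language.
     Assumption: first parameter is the language code, second an optional comment. -->
{{attention|cmn|tone of the second syllable needs checking}}

<!-- Option 3: general cleanup request with a reason (hypothetical wording).
     Remember to list the entry at WT:RFC as well. -->
{{rfc|cmn|definitions need reformatting}}
```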

Translations into Chinese languages/dialects/topolects

  • All translations into Chinese languages must be grouped under * Chinese. Subdialects can be nested beneath their parent variety. Regional variations can be flagged with {{qualifier}}.
* Chinese:
*: Mandarin: {{t|cmn|肥皂|tr=féizào}}
*: Min Nan: {{t|nan|雪文|tr=sat-bûn}} {{qualifier|Zhangzhou}}, {{t|nan|茶塊|tr=tê-kóe}} {{qualifier|Quanzhou}} ...
  • If the traditional and simplified forms differ, the traditional form comes first and the transliteration is given with the simplified form. All Chinese varieties need both traditional and simplified forms.
* Chinese:
*: Mandarin: {{t|cmn|心理學}}, {{t|cmn|心理学|tr=xīnlǐxué}}
  • If the traditional and simplified forms are identical, only one translation is given.
* Chinese:
*: Mandarin: {{t|cmn|三明治|tr=sānmíngzhì}}

Users are encouraged to use MediaWiki talk:Gadget-TranslationAdder.js for adding translations, which automatically determines if template {{t}} or {{t+}} should be used. The latter is used when an interwiki entry exists.

Other Chinese aids

