Wiktionary:Languages

This is a Wiktionary policy, guideline or common practices page. Specifically, it is a policy think tank, working to develop a formal policy.

Wiktionary includes many words in many languages.

To distinguish languages, Wiktionary gives each a unique name and a unique code, which identify it.

See Wiktionary:Dialects and Wiktionary:Families for discussions of dialects and of language families, respectively.

Language names

Wiktionary calls each language by a different name; these language names are used in headers, translation tables, lexical categories, appendices, and some other places. Language names are chosen by consensus. Whenever possible, common English names of languages are used, and diacritics are avoided. Attested names (names which meet CFI) are strongly preferred.

When a single language is known by multiple names, only one is used. For a list of languages which are known by multiple names, see the section "List of languages with multiple names".

When two languages are commonly known by the same name, Wiktionary distinguishes them by using synonyms for one or both, or (rarely) by using appended identifiers. For example, the Ghanaian language commonly called "Buli" is referred to as "Buli (Ghana)" on Wiktionary and represented by the code "bwu"; the Indonesian language commonly called "Buli" is referred to as "Buli (Indonesia)" on Wiktionary and represented by the code "bzq". The Indonesian language commonly called "Maba" is referred to as "Maba" on Wiktionary and represented by the code "mqa"; the Chadian language commonly called "Maba" is referred to on Wiktionary as "Bura Mabang" and represented by the code "mde".
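The uniqueness rule can be sketched as a small check. The following is only an illustration, not part of Wiktionary's software: the table holds the four examples from the paragraph above, and the function name duplicate_names is hypothetical.

```python
# Illustrative data only: the four code/name pairs given as examples above.
LANGUAGES = {
    "bwu": "Buli (Ghana)",
    "bzq": "Buli (Indonesia)",
    "mqa": "Maba",
    "mde": "Bura Mabang",
}


def duplicate_names(languages):
    """Return every name that is assigned to more than one code."""
    codes_by_name = {}
    for code, name in languages.items():
        codes_by_name.setdefault(name, []).append(code)
    return {name: codes for name, codes in codes_by_name.items() if len(codes) > 1}


# With the disambiguated names, no name is shared by two codes.
print(duplicate_names(LANGUAGES))
```

If both Buli languages were simply named "Buli", the check would report that name together with the codes bwu and bzq.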

Language codes

Wiktionary's bots maintain Wiktionary:Index to templates/languages, a list of all used language codes; all can also be found in Category:Language code templates.

Wiktionary has an intricate system for determining which string of letters (code) represents each language and language family, and for determining where (at which URL) that information is stored, that is, where templates look when they are given a code and must display or otherwise use the name it signifies. Language codes are used in naming some categories, and are called by many templates. When a code's template is called directly, the result is the language name: calling {{vot}}, for example, displays Votic, so you can determine a language's name if you know its code. If you know a language's name, you can determine its code by calling {{langrev}} with the name as a parameter: the template returns the code if it can find it. (Typing {{langrev|English}} in the Sandbox or Special:ExpandTemplates, for example, returns "en".)
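The two lookups described above (code to name, and name to code) behave like a dictionary and its reverse search. The sketch below is an analogy in Python, not the template code itself; the function names name_for and code_for are hypothetical, and the table contains only the two pairs mentioned in the text.

```python
# Illustrative table: only the two pairs mentioned in the text above.
CODE_TO_NAME = {"vot": "Votic", "en": "English"}


def name_for(code):
    """Analogue of calling {{code}} directly: return the language name, or None."""
    return CODE_TO_NAME.get(code)


def code_for(name):
    """Analogue of {{langrev|name}}: return the code if it can be found, else None."""
    for code, candidate in CODE_TO_NAME.items():
        if candidate == name:
            return code
    return None


print(name_for("vot"))      # Votic
print(code_for("English"))  # en
```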

Wiktionary also has a simple system for recording which family individual languages belong to, and which scripts they are written in.

Wiktionary represents individual languages as follows:

  1. Languages which were assigned two-letter codes in the international standard ISO 639-1 are generally represented on Wiktionary by those codes. The individual codes are stored in the Template: namespace without any prefix. English, for example, is represented by en, as recorded in Template:en. German is represented by de (Template:de). Esperanto is represented by eo (Template:eo). Wiktionary maintains a list of ISO 639-1 codes.
    1. A few languages are represented on Wiktionary by 639-1 codes the ISO has deprecated. (This is generally the case when the ISO has come to consider a lect a group of languages, but Wiktionary still considers it a single language.) Serbo-Croatian, for example, is represented by sh (Template:sh).
  2. Languages which were not assigned codes by ISO 639-1, but which were assigned three-letter codes (based on Ethnologue codes) in the international standard ISO 639-3, are generally represented on Wiktionary by those codes. Abenaki, for example, is represented by abe (Template:abe). Wiktionary maintains a list of ISO 639-3 codes.
  3. A few languages are represented by other, "exceptional" codes. (A complete list of these is in the section "List of languages with exceptional codes".) Exceptional codes are chosen as follows:
    1. A few are ISO 639-2 codes. (This is the case, for example, for languages which were not assigned specific, single codes by either ISO 639-1 or ISO 639-3.) Nahuatl, for example, is represented by nah (Template:nah).
    2. A few are codes devised by the Wikimedia Foundation Language Committee. (This is the case when a Wikimedia project is begun in a language which was not assigned a code by any ISO standard.) Zamboanga Chavacano, for example, is represented by cbk-zam (Template:cbk-zam). Wiktionary has a list of such codes in its Appendix:Wikimedia language codes.
    3. Any language which does not have an ISO or specially-devised Wikimedia code, but which is to be included in Wiktionary, is given a two-part exceptional code. The first part of this code is a relevant ISO 639-5 family code (see Wiktionary's appendix); after a hyphen, the second part of the code is a series of three lowercase letters which generally approximate the language name. (No digits, uppercase letters, etc. are used: IANA tags allow these and are case-insensitive, but the MediaWiki software is more restrictive.) For example, Samoan Plantation Pidgin is cpe-spp (Template:cpe-spp): "cpe" is the ISO 639-5 code for English-based creoles and pidgins, "spp" abbreviates "Samoan Plantation Pidgin". Gallo is roa-gal (Template:roa-gal): "roa" is the ISO 639-5 code for Romance languages, "gal" abbreviates "Gallo".
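The shapes of the codes described in the list above can be told apart with a few pattern checks. This is a rough sketch under stated assumptions: the function describe_code and its return labels are hypothetical, and shape alone does not establish where a code comes from. A three-letter code may be ISO 639-3 or ISO 639-2, and the Wikimedia code cbk-zam happens to have the same shape as a two-part exceptional code.

```python
import re


def describe_code(code):
    """Classify a language code by its shape only (lowercase letters and hyphens)."""
    if re.fullmatch(r"[a-z]{2}", code):
        return "two-letter"    # shape of ISO 639-1 codes such as en, de, eo, sh
    if re.fullmatch(r"[a-z]{3}", code):
        return "three-letter"  # shape of ISO 639-3 or ISO 639-2 codes such as abe, nah
    if re.fullmatch(r"[a-z]{3}-[a-z]{3}", code):
        return "two-part"      # shape of family code plus three letters, e.g. cpe-spp
    return "other"             # anything else, e.g. nds-nl or a code with capitals


for example in ["en", "abe", "nah", "cpe-spp", "roa-gal", "cbk-zam", "nds-nl"]:
    print(example, describe_code(example))
```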

Constructed languages which are not widely used but which have been assigned ISO 639-3 codes are sometimes accepted by Wiktionary for inclusion in dedicated Appendices. These languages are represented by their ISO 639-3 codes. Láadan, for example, is represented by the ISO 639-3 code ldn. This information is stored in the Template: namespace after a conl: prefix (constructed language): Template:conl:ldn. Some other constructed languages are also included in dedicated Appendices though they do not have ISO 639-3 codes: these languages are given codes which consist of "art-" followed by three letters, and which are stored in the Template: namespace after a conl: prefix.

Reconstructed languages are assigned special codes. Proto-Germanic, for example, is represented by the code gem-pro. This information is stored in the Template: namespace after a proto: prefix: Template:proto:gem-pro.
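The storage locations described so far follow one naming pattern: "Template:", then an optional prefix, then the code. A minimal sketch of that pattern follows; the function template_title and the kind labels are hypothetical names chosen for illustration.

```python
# Prefixes as described in the text: none for ordinary languages, "conl:" for
# appendix-only constructed languages, "proto:" for reconstructed languages.
PREFIXES = {"regular": "", "constructed": "conl:", "reconstructed": "proto:"}


def template_title(code, kind="regular"):
    """Build the title of the template page that stores a language code."""
    return "Template:" + PREFIXES[kind] + code


print(template_title("en"))                        # Template:en
print(template_title("ldn", "constructed"))        # Template:conl:ldn
print(template_title("gem-pro", "reconstructed"))  # Template:proto:gem-pro
```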

Not all lects which have been assigned codes by the ISO are assigned codes or included by Wiktionary.

  1. The ISO has assigned codes to some constructed languages which Wiktionary excludes.
  2. The ISO has assigned codes to some lects which Wiktionary treats as dialects of other languages, and which it therefore represents by those languages' codes. (This is the case, for example, with Moldovan/Moldavian: the ISO assigned the lect the 639-1 code mo, but Wiktionary regards it as a form of Romanian and represents it and Romanian by the same code ro.)
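The second case above amounts to mapping an ISO code onto the code Wiktionary actually uses. A minimal sketch, assuming a lookup table that holds only the one merger named in the text; the names MERGED_CODES and wiktionary_code are hypothetical.

```python
# Only the merger mentioned above: Moldovan (mo) is treated as Romanian (ro).
MERGED_CODES = {"mo": "ro"}


def wiktionary_code(iso_code):
    """Return the code Wiktionary uses for a lect, given its ISO code."""
    return MERGED_CODES.get(iso_code, iso_code)


print(wiktionary_code("mo"))  # ro
print(wiktionary_code("ro"))  # ro
```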

List of languages with exceptional codes

Name Wikipedia article Wiktionary code Comments
ǃKung w:!Kung language khi-kun {{khi-kun}}
  • ǃKung may be considered a group of dialects or related languages.
Ammonite w:Ammonite language sem-amm {{sem-amm}}
Banyumasan w:Banyumasan language map-bms {{map-bms}}
Bunurong w:Bunurong language aus-bun {{aus-bun}}
Crimean Gothic w:Crimean Gothic language gme-cgo {{gme-cgo}}
Dutch Low Saxon w:Dutch Low Saxon nds-nl {{nds-nl}}
  • Wikimedia uses the code nds-nl for Dutch Low Saxon language projects. Contrast German Low German.
Gabi w:Pama-Nyungan languages#Classification and Languages aus-gab {{aus-gab}}
Gallo w:Gallo language roa-gal {{roa-gal}}
  • Gallo may be considered a dialect of French or a separate language.
Gaulish w:Gaulish language cel-gau {{cel-gau}}
German Low German w:Low German nds-de {{nds-de}}
  • Wiktionary uses the exceptional code nds-de because nds is ambiguous and could include Dutch Low Saxon.
Greenlandic Eskimo Pidgin w:Indigenous languages of the Americas#Pidgins, mixed languages and trade languages crp-gep {{crp-gep}}
Guernésiais w:Guernésiais roa-grn {{roa-grn}}
  • Guernésiais may be considered a dialect of Norman, a dialect of French or a separate language.
Gunai w:Gunai language aus-gun {{aus-gun}}
  • Gunai may be considered a group of dialects or related languages.
Gutnish w:Modern Gutnish gmq-gut {{gmq-gut}}
  • Gutnish is the modern version of Old Gutnish.
Jèrriais w:Jèrriais roa-jer {{roa-jer}}
  • Jèrriais may be considered a dialect of Norman, a dialect of French or a separate language.
Leonese w:Leonese language roa-leo {{roa-leo}}
Maroon Spirit Language w:Jamaican Maroon Spirit Possession Language cpe-mar {{cpe-mar}}
Middle Chinese w:Middle Chinese zhx-mid {{zhx-mid}}
Middle Norwegian w:Norwegian language#From Old Norse to distinct Scandinavian languages gmq-mno {{gmq-mno}}
Mingo w:Mingo iro-min {{iro-min}}
Nahuatl w:Nahuatl nah {{nah}}
  • Nahuatl may be considered a group of dialects or related languages.
  • There is the ISO 639-2 or ISO 639-5 code nah for Nahuatl.
  • Wikimedia uses the code nah for Nahuatl language projects.
  • The result of an RFDO discussion was to keep the Category:Nahuatl language category.
Norman w:Norman language roa-nor {{roa-nor}}
  • Norman may be considered a dialect of French or a separate language.
  • Wikimedia uses the code nrm for Norman language projects. This is confusing, because nrm is an ISO 639-3 code for the Narom language.
Old Danish w:Old Danish gmq-oda {{gmq-oda}}
Old Polish w:Old Polish language zlw-opl {{zlw-opl}}
Old Portuguese w:Galician Portuguese roa-ptg {{roa-ptg}}
Old Swedish w:Swedish language#Old Swedish gmq-osw {{gmq-osw}}
Phuthi w:Phuthi language bnt-phu {{bnt-phu}}
Picuris w:Picuris language nai-pic {{nai-pic}}
Pomeranian w:Pomeranian language zlw-pom {{zlw-pom}}
Russenorsk w:Russenorsk crp-rsn {{crp-rsn}}
Samoan Plantation Pidgin w:Samoan Plantation Pidgin cpe-spp {{cpe-spp}}
Serbo-Croatian w:Serbo-Croatian language sh {{sh}}
  • Serbo-Croatian may be considered a group of languages (Bosnian, Croatian, Serbian, and Montenegrin) or an individual language.
  • The ISO 639-1 code sh for Serbo-Croatian is no longer active.
  • There is the ISO 639-3 code hbs for Serbo-Croatian.
  • Wikimedia uses the code sh for Serbo-Croatian language projects.
Slovincian w:Slovincian zlw-slv {{zlw-slv}}
  • Slovincian may be considered a dialect of Kashubian, a dialect of Pomeranian, a dialect of Polish or a separate language.
Syrian Arabic w:Syrian Arabic sem-syr {{sem-syr}}
Taimyr Pidgin Russian crp-tpr {{crp-tpr}}
Tarantino w:Tarantino language roa-tar {{roa-tar}}
Zamboanga Chavacano w:Chavacano language#Zamboangueño cbk-zam {{cbk-zam}}
  • Wikimedia uses the code cbk-zam for Zamboanga Chavacano language projects.

List of appendix-only constructed languages

Name Wikipedia article Wiktionary code Comments
Bolak w:Bolak language art-blk {{conl:art-blk}}
Communicationssprache w:Communicationssprache art-com {{conl:art-com}}
Eloi w:Eloi language art-elo {{conl:art-elo}}
Go'uld w:Go'uld art-gld {{conl:art-gld}}
Klingon w:Klingon language tlh {{conl:tlh}}
Láadan w:Láadan ldn {{conl:ldn}}
Lapine w:Lapine language art-lap {{conl:art-lap}}
Mandalorian w:Mandalorian#Language art-man {{conl:art-man}}
Mundolinco w:Mundolinco art-mun {{conl:art-mun}}
Na'vi w:Na'vi language art-nav {{conl:art-nav}}
Neo w:Neo (constructed language) neu {{conl:neu}}
Noxilo w:Noxilo art-nox {{conl:art-nox}}
Quenya w:Quenya qya {{conl:qya}}
Sindarin w:Sindarin sjn {{conl:sjn}}
Toki Pona w:Toki Pona art-top {{conl:art-top}}

Languages' family and script information

Wiktionary sorts languages into families. Most families group languages that descend from a common ancestor, but a few are merely categories, such as "creoles and pidgins". Wiktionary records which family a language belongs to on a subpage of the language's template, /family. Each family is represented by a code; the family codes are explained in Wiktionary:Families.

  1. English belongs to the family of West Germanic languages; this information is recorded in Template:en/family. German is also a West Germanic language, as recorded in Template:de/family. Serbo-Croatian is a South Slavic language, as recorded in Template:sh/family. Abenaki is an Algonquian language, as recorded in Template:abe/family. Nahuatl is a Nahuan language, as recorded in Template:nah/family.
  2. The widely-used constructed language Esperanto has its membership in the category "Artificial languages" recorded in Template:eo/family.
  3. Zamboanga Chavacano has its membership in the category "Creole or pidgin languages" recorded in Template:cbk-zam/family.
  4. Wiktionary even records information about appendix-only constructed languages in this way: Láadan has its membership in the category "Artificial languages" recorded in Template:conl:ldn/family.

Wiktionary records which script(s) a language uses on another subpage of the language's template, /script. Each script is represented by a code, which is stored in the Template: namespace without any prefix. The script codes are explained in Wiktionary:Scripts.

  1. English is written in the Latin script; this is recorded in Template:en/script. Esperanto is written in the Latin script; this is recorded in Template:eo/script.
  2. Serbo-Croatian is written in both the Latin and the Cyrillic scripts; this is recorded in Template:sh/script.
  3. Wiktionary even records information about appendix-only constructed languages in this way: the information that Láadan is written in the Latin script is recorded in Template:conl:ldn/script.
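The subpage convention above can be sketched as follows. This is an illustration only: the function subpage_title is a hypothetical name, and it simply appends /family or /script to a template title taken from the examples in this section.

```python
# The two subpages described in this section.
SUBPAGE_KINDS = ("family", "script")


def subpage_title(template, kind):
    """Return the title of a language template's /family or /script subpage."""
    if kind not in SUBPAGE_KINDS:
        raise ValueError("unknown subpage kind: " + kind)
    return template + "/" + kind


print(subpage_title("Template:en", "family"))        # Template:en/family
print(subpage_title("Template:sh", "script"))        # Template:sh/script
print(subpage_title("Template:conl:ldn", "script"))  # Template:conl:ldn/script
```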

List of languages with multiple names

Code Language Names Common links
ab Abkhaz Template:ab/names
adj Adioukrou Template:adj/names
ak Akan Template:ak/names
ang Old English Template:ang/names
arr Arara-Karo Template:arr/names
ase American Sign Language Template:ase/names
asf Auslan Template:asf/names
aue ǂKxʼauǁʼein Template:aue/names
aum Abewa Template:aum/names
aus-gun Gunai Template:aus-gun/names
av Avar Template:av/names
axb Abipon Template:axb/names
az Azeri Template:az/names
bal Baluchi Template:bal/names
bau Badanchi Template:bau/names
bcn Bibaali Template:bcn/names
bdh Tara Baka Template:bdh/names
be Belarusian Template:be/names
bey Akuwagel Template:bey/names
bfi British Sign Language Template:bfi/names
bkc Baka Template:bkc/names
bkp Iboko Template:bkp/names
bm Bambara Template:bm/names
bmy Kinyabemba Template:bmy/names
bn Bengali Template:bn/names
bua Buryat Template:bua/names
bzs Brazilian Sign Language Template:bzs/names
bzj Belize Kriol English Template:bzj/names
ca Catalan Template:ca/names
car Galibi Carib Template:car/names
cmn Mandarin Template:cmn/names
cpe-mar Maroon Spirit Language Jamaican Maroon Spirit Possession Language, Maroon Spirit Language
den Slavey Template:den/names
dlm Dalmatian Template:dlm/names
dsb Lower Sorbian Template:dsb/names
dv Dhivehi Template:dv/names
ekl Kolhe Template:ekl/names
el Greek Template:el/names
en English Template:en/names
enm Middle English Template:enm/names
eu Basque Template:eu/names
fa Persian Template:fa/names
fan Pahouin Template:fan/names
fi Finnish Template:fi/names
fr French Template:fr/names
frp Franco-Provençal Template:frp/names
fy West Frisian Template:fy/names
ga Irish Template:ga/names
gd Scottish Gaelic Template:gd/names
gez Ge'ez Template:gez/names
gsw Alemannic German Template:gsw/names
gul Gullah Template:gul/names
gv Manx Template:gv/names
he Hebrew Template:he/names
ho Hiri Motu Template:ho/names
hu Hungarian Template:hu/names
hsb Upper Sorbian Template:hsb/names
ht Haitian Creole Template:ht/names
hub Huambisa Template:hub/names
hvc Haitian Vodoun Culture Language Template:hvc/names
hwc Hawaiian Pidgin Template:hwc/names
hy Armenian Template:hy/names
ik Inupiak Template:ik/names
ja Japanese Template:ja/names
jam Jamaican Creole Template:jam/names
ka Georgian Template:ka/names
kbc Kadiwéu Template:kbc/names
kcn Nubi Template:kcn/names
kcu Kikami Template:kcu/names
kda Worimi Template:kda/names
kea Kabuverdianu Template:kea/names
kg Kongo Template:kg/names
khi-kun ǃKung Template:khi-kun/names
kj Kwanyama Template:kj/names
kl Greenlandic Template:kl/names
km Khmer Template:km/names
kmb Kimbundu Template:kmb/names
ko Korean Template:ko/names
ky Kyrgyz Template:ky/names
lad Ladino Template:lad/names
lg Luganda Template:lg/names
li Limburgish Template:li/names
lkt Lakota Template:lkt/names
lo Lao Template:lo/names
lv Latvian Template:lv/names
mec Mara Template:mec/names
meu Motu Template:meu/names
mnk Mandinka Template:mnk/names
moc Mocoví Template:moc/names
mgs Nyasa Template:mgs/names
mjh Nyanza Template:mjh/names
mrh Mara Chin Template:mrh/names
mwe Mwera Template:mwe/names
my Burmese Template:my/names
mzn Mazanderani Template:mzn/names
na Nauruan Nauru, Nauruan
nb Norwegian Bokmål Bokmål, Norwegian Bokmål
nds Low German Low German, Low Saxon, Modern Low German
nn Norwegian Nynorsk New Norwegian, Norwegian Nynorsk, Nynorsk
ny Chichewa Chichewa, Chicheŵa, Chinyanja, Nyanja
nyr Shinyiha Shinyiha, Nyiha
nys Nyunga Noongar, Nyunga, Nyuunga
ood O'odham O'odham, Papago
os Ossetian Ossete, Ossetian, Ossetic
ota Ottoman Turkish Ottoman, Ottoman Turkish
pa Punjabi Panjabi, Punjabi
pap Papiamentu Papiamento, Papiamentu
pis Pijin Kanaka, Neo-Solomonic, Pijin, Solomons Pidgin
pit Pitta-Pitta Pitta Pitta, Pitta-Pitta
plg Pilagá Pilacá, Pilagá
pot Potawatomi Potawatomi, Pottawatomie
pro Old Provençal Old Occitan, Old Provençal
pt Portuguese Modern Portuguese, Portuguese
pua Purepecha Purepecha, Phorhépecha, Porhé, P'urhépecha, Tarascan, Tarasco
rap Rapa Nui Rapa Nui, Rapanui, Pascuense
rm Romansch Romansch, Romansh, Rumantsch, Romanche
ro Romanian Daco-Romanian, Romanian, Roumanian, Rumanian
roa-grn Guernésiais Dgèrnésiais, Guernésiais, Guernsey French, Guernsey Norman French
roa-jer Jèrriais Jèrriais, Jersey French, Jersey Norman, Jersey Norman French
roa-ptg Old Portuguese Template:roa-ptg/names
rop Kriol Australian Kriol, Kriol
sco Scots Lowland Scots, Scots
sh Serbo-Croatian BCS, Croato-Serbian, Serbo-Croatian, Serbocroatian; Bosnian, Croatian, Montenegrin, Serbian
si Sinhalese Singhalese, Sinhala, Sinhalese
sk Slovak Slovak, Slovakian
sl Slovene Slovene, Slovenian
snq Chango Chango, Sangu
spp Supyire Suppire, Supyire, Supyire Senoufo
sqt Soqotri Socotri, Soqotri
srs Sarcee Sarcee, Sarsi, Tsuu T'ina, Tsuut'ina, Tsu T'ina
ss Swati Swati, Swazi
st Sotho Sesotho, Sotho, Southern Sesotho, Southern Sotho
sth Shelta Cant, Shelta
tcs Torres Strait Creole Big Thap, Blaikman, Brokan, Broken, Broken English, Cape York Creole, Lockhart Creole, Papuan Pidgin English, Torres Strait Brokan, Torres Strait Broken, Torres Strait Creole, Torres Strait Pidgin, Yumplatok
tg Tajik Tadjik, Tadzhik, Tajik, Tajiki, Tajik Persian
tmh Tamashek Tamahaq, Tamajaq, Tamasheq, Tamashek, Tuareg
tn Tswana Setswana, Tswana
tnq Taino Taino, Taíno
tob Toba Chaco Sur, Namqom, Qom, Toba, Toba Qom
tog Chitonga Kitonga, Siska, Sisya, Tonga, Western Nyasa
toi Tonga Chitonga, Plateau Tonga, Tonga, Zambezi
tpi Tok Pisin Melanesian Pidgin English, Neo-Melanesian, New Guinea Pidgin, Tok Pisin
ug Uyghur Uigur, Uighur, Uygur, Uyghur
uln Unserdeutsch Rabaul Creole German, Unserdeutsch
umb Umbundu South Mbundu, Umbundu
vai Vai Template:vai/names
waq Wageman Wageman, Wagiman, Wakiman, Wogeman
xaa Andalusian Arabic Andalusian Arabic, Andalusi Arabic, Moorish Arabic, Spanish Arabic
xcl Old Armenian Classical Armenian, Liturgical Armenian, Old Armenian, Grabar
yue Cantonese Template:yue/names
yun Bena Bena, Binna, Buna, Ebina, Ebuna, Gbinna, Lala (offensive), Purra, Yangeru, Yongor, Yungur

See also