Wiktionary:Languages

This is a Wiktionary policy, guideline or common practices page. Specifically, it is a policy think tank, working to develop a formal policy.

Wiktionary includes many words in many languages.

To distinguish languages, Wiktionary gives each a unique name and a unique code, which identify it.

See Wiktionary:Dialects and Wiktionary:Families for discussions of dialects and of language families, respectively.

Language names

Wiktionary calls each language by a different name; these language names are used in headers, translation tables, lexical categories, appendices, and some other places. Language names are chosen by consensus. Whenever possible, common English names of languages are used, and diacritics are avoided. Attested names (names which meet CFI) are strongly preferred.

When a single language is known by multiple names, Wiktionary uses only one of them. For a list of languages which are known by multiple names, see the section "List of languages with multiple names".

When two languages are commonly known by the same name, Wiktionary distinguishes them by using synonyms for one or both, or (rarely) by using appended identifiers. For example, the Ghanaian language commonly called "Buli" is referred to as "Buli (Ghana)" on Wiktionary and represented by the code "bwu"; the Indonesian language commonly called "Buli" is referred to as "Buli (Indonesia)" on Wiktionary and represented by the code "bzq". The Indonesian language commonly called "Maba" is referred to as "Maba" on Wiktionary and represented by the code "mqa"; the Chadian language commonly called "Maba" is referred to on Wiktionary as "Bura Mabang" and represented by the code "mde".

Language codes

Wiktionary's bots maintain Wiktionary:Index to templates/languages, a list of all used language codes; all can also be found in Category:Language code templates.

Wiktionary has an intricate system for determining which string of letters (code) represents each language and language family, and for determining where (at which URL) that information is stored, that is, where templates look when they are given a code and must display or otherwise use the name it signifies. Language codes are used in naming some categories, and are called by many templates. When a code template is called directly, the result is the language name: calling {{vot}}, for example, displays Votic; in this way, you can determine a language's name if you know its code. If you know its name, you can determine its code by using {{langrev}} with the language's name as a parameter: the template returns the language's code if it can find it. (Type {{langrev|English}}, for example, in the Sandbox or Special:ExpandTemplates, and it will return "en".)
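The two lookups described above can be pictured as a pair of dictionaries. The following is only an illustrative sketch in Python, not how the templates are implemented: the names CODE_TO_NAME, NAME_TO_CODE, name_for and code_for are hypothetical, and the table holds only pairs quoted on this page.

```python
# Hypothetical in-memory stand-in for the code templates; it lists only
# code/name pairs quoted on this page.
CODE_TO_NAME = {
    "en": "English",
    "vot": "Votic",
    "bwu": "Buli (Ghana)",
    "bzq": "Buli (Indonesia)",
    "mqa": "Maba",
    "mde": "Bura Mabang",
}

# The reverse index is well defined because every name and every code is unique.
NAME_TO_CODE = {name: code for code, name in CODE_TO_NAME.items()}


def name_for(code):
    """Mimic calling a code template directly, e.g. {{vot}}."""
    return CODE_TO_NAME.get(code)


def code_for(name):
    """Mimic {{langrev|name}}; returns None when the name is not found."""
    return NAME_TO_CODE.get(name)
```

Because the bare name "Buli" is not used for either language, code_for("Buli") finds nothing in this sketch, while code_for("Buli (Ghana)") gives "bwu".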

Wiktionary also has a simple system for recording which family individual languages belong to, and which scripts they are written in.

Wiktionary represents individual languages as follows:

  1. Languages which were assigned two-letter codes in the international standard ISO 639-1 are generally represented on Wiktionary by those codes. The individual codes are stored in the Template: namespace without any prefix. English, for example, is represented by en, as recorded in Template:en. German is represented by de (Template:de). Esperanto is represented by eo (Template:eo). Wiktionary maintains a list of ISO 639-1 codes.
    1. A few languages are represented on Wiktionary by ISO 639-1 codes that the ISO has deprecated. (This is generally the case when the ISO has come to consider a lect a group of languages, but Wiktionary still considers it a single language.) Serbo-Croatian, for example, is represented by sh (Template:sh).
  2. Languages which were not assigned codes by ISO 639-1, but which were assigned three-letter codes (based on Ethnologue codes) in the international standard ISO 639-3, are generally represented on Wiktionary by those codes. Abenaki, for example, is represented by abe (Template:abe). Wiktionary maintains a list of ISO 639-3 codes.
  3. A few languages are represented by other, "exceptional" codes. (A complete list of these is in the section "List of languages with exceptional codes".) Exceptional codes are chosen as follows:
    1. A few are ISO 639-2 codes. (This is the case, for example, for languages which were not assigned specific, single codes by either ISO 639-1 or ISO 639-3.) Nahuatl, for example, is represented by nah (Template:nah).
    2. A few are codes devised by the Wikimedia Foundation Language Committee. (This is the case when a Wikimedia project is begun in a language which was not assigned a code by any ISO standard.) Zamboanga Chavacano, for example, is represented by cbk-zam (Template:cbk-zam). Wiktionary has a list of such codes in its Appendix:Wikimedia language codes.
    3. Any language which does not have an ISO or specially-devised Wikimedia code, but which is to be included in Wiktionary, is given a two-part exceptional code. The first part of this code is a relevant ISO 639-5 family code (see Wiktionary's appendix); after a hyphen, the second part of the code is a series of three lowercase letters which generally approximate the language name. (No digits, uppercase letters, etc. are used: IANA language tags allow these and are case-insensitive, but the MediaWiki software is more restrictive.) For example, Samoan Plantation Pidgin is cpe-spp (Template:cpe-spp): "cpe" is the ISO 639-5 code for English-based creoles and pidgins, "spp" abbreviates "Samoan Plantation Pidgin". Gallo is roa-gal (Template:roa-gal): "roa" is the ISO 639-5 code for Romance languages, "gal" abbreviates "Gallo".
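The shapes of code listed above can be told apart mechanically. The Python sketch below checks form only; it does not know whether a code is actually in use or which standard assigned it. The function names code_shape and make_exceptional_code are hypothetical, as is the assumption that the family part is always three letters.

```python
import re

# Patterns for the shapes described above (lowercase letters only).
TWO_LETTER = re.compile(r"[a-z]{2}")          # ISO 639-1, e.g. "en"
THREE_LETTER = re.compile(r"[a-z]{3}")        # ISO 639-3, or ISO 639-2 such as "nah"
TWO_PART = re.compile(r"[a-z]{3}-[a-z]{3}")   # family code, hyphen, three letters


def code_shape(code):
    """Classify a code by its form alone."""
    if TWO_LETTER.fullmatch(code):
        return "two-letter"
    if THREE_LETTER.fullmatch(code):
        return "three-letter"
    if TWO_PART.fullmatch(code):
        return "two-part"
    return "other"


def make_exceptional_code(family, abbreviation):
    """Join a family code and a three-letter abbreviation, rejecting
    digits, uppercase letters and anything else outside a-z."""
    code = f"{family}-{abbreviation}"
    if not TWO_PART.fullmatch(code):
        raise ValueError(f"not a valid two-part code: {code!r}")
    return code
```

Form alone cannot separate every case: the Wikimedia code cbk-zam has the same shape as a family-based code such as roa-gal, and a code such as nds-nl falls into "other", so a real check would still need the lists maintained on Wiktionary.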

Constructed languages which are not widely used but which have been assigned ISO 639-3 codes are sometimes accepted by Wiktionary for inclusion in dedicated Appendices. These languages are represented by their ISO 639-3 codes. Láadan, for example, is represented by the ISO 639-3 code ldn. This information is stored in the Template: namespace after a conl: prefix (constructed language): Template:conl:ldn. Some other constructed languages are also included in dedicated Appendices though they do not have ISO 639-3 codes: these languages are given codes which consist of "art-" followed by three letters, and which are stored in the Template: namespace after a conl: prefix.

Reconstructed languages are assigned special codes. Proto-Germanic, for example, is represented by the code gem-pro. This information is stored in the Template: namespace after a proto: prefix: Template:proto:gem-pro.
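The storage locations described so far differ only in the prefix placed before the code. As a rough Python sketch (the function name template_title and the kind labels are hypothetical; the prefixes are the ones named above):

```python
# Prefix used inside the Template: namespace for each kind of language.
PREFIXES = {
    "language": "",              # ordinary languages: no prefix
    "constructed": "conl:",      # appendix-only constructed languages
    "reconstructed": "proto:",   # reconstructed languages
}


def template_title(code, kind="language"):
    """Return the page title under which a code's name is stored."""
    return f"Template:{PREFIXES[kind]}{code}"
```

For example, template_title("ldn", "constructed") gives "Template:conl:ldn" and template_title("gem-pro", "reconstructed") gives "Template:proto:gem-pro".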

Not all lects which have been assigned codes by the ISO are assigned codes or included by Wiktionary.

  1. The ISO has assigned codes to some constructed languages which Wiktionary excludes.
  2. The ISO has assigned codes to some lects which Wiktionary treats as dialects of other languages, and which it therefore represents by those languages' codes. (This is the case, for example, with Moldovan/Moldavian: the ISO assigned the lect the 639-1 code mo, but Wiktionary regards it as a form of Romanian and represents both it and Romanian by the same code, ro.)
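A tool that receives ISO codes from elsewhere might fold such lects into the code Wiktionary uses. The Python sketch below is only an illustration: the names MERGED_INTO and wiktionary_code are hypothetical, and the table contains just the one merger mentioned above, not a complete list.

```python
# ISO code -> code of the language Wiktionary treats the lect as part of.
# Only the example given above is listed here.
MERGED_INTO = {"mo": "ro"}


def wiktionary_code(iso_code):
    """Return the code Wiktionary uses; codes with no recorded merger
    are passed through unchanged."""
    return MERGED_INTO.get(iso_code, iso_code)
```

With this table, wiktionary_code("mo") gives "ro", while a code such as "en" comes back as it went in.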

List of languages with exceptional codes

Name Wikipedia article Wiktionary code Comments
ǃKung w:!Kung language khi-kun {{khi-kun}}
  • ǃKung may be considered a group of dialects or related languages.
Ammonite w:Ammonite language sem-amm {{sem-amm}}
Banyumasan w:Banyumasan language map-bms {{map-bms}}
Bunurong w:Bunurong language aus-bun {{aus-bun}}
Crimean Gothic w:Crimean Gothic language gme-cgo {{gme-cgo}}
Dutch Low Saxon w:Dutch Low Saxon nds-nl {{nds-nl}}
  • Wikimedia uses the code nds-nl for Dutch Low Saxon language projects. Contrast German Low German.
Gabi w:Pama-Nyungan languages#Classification and Languages aus-gab {{aus-gab}}
Gallo w:Gallo language roa-gal {{roa-gal}}
  • Gallo may be considered a dialect of French or a separate language.
Gaulish w:Gaulish language cel-gau {{cel-gau}}
German Low German w:Low German nds-de {{nds-de}}
  • Wiktionary uses the exceptional code nds-de because nds is ambiguous and could include Dutch Low Saxon.
Greenlandic Eskimo Pidgin w:Indigenous languages of the Americas#Pidgins, mixed languages and trade languages crp-gep {{crp-gep}}
Guernésiais w:Guernésiais roa-grn {{roa-grn}}
  • Guernésiais may be considered a dialect of Norman, a dialect of French or a separate language.
Gunai w:Gunai language aus-gun {{aus-gun}}
  • Gunai may be considered a group of dialects or related languages.
Gutnish w:Modern Gutnish gmq-gut {{gmq-gut}}
  • Gutnish is the modern version of Old Gutnish.
Jèrriais w:Jèrriais roa-jer {{roa-jer}}
  • Jèrriais may be considered a dialect of Norman, a dialect of French or a separate language.
Leonese w:Leonese language roa-leo {{roa-leo}}
Maroon Spirit Language w:Jamaican Maroon Spirit Possession Language cpe-mar {{cpe-mar}}
Middle Chinese w:Middle Chinese zhx-mid {{zhx-mid}}
Middle Norwegian w:Norwegian language#From Old Norse to distinct Scandinavian languages gmq-mno {{gmq-mno}}
Mingo w:Mingo iro-min {{iro-min}}
Nahuatl w:Nahuatl nah {{nah}}
  • Nahuatl may be considered a group of dialects or related languages.
  • The code nah is an ISO 639-2 and ISO 639-5 code for Nahuatl.
  • Wikimedia uses the code nah for Nahuatl language projects.
  • The result of an RFDO discussion was to keep the Category:Nahuatl language category.
Norman w:Norman language roa-nor {{roa-nor}}
  • Norman may be considered a dialect of French or a separate language.
  • Wikimedia uses the code nrm for Norman language projects. This is confusing, because nrm is the ISO 639-3 code for the Narom language.
Old Danish w:Old Danish gmq-oda {{gmq-oda}}
Old Polish w:Old Polish language zlw-opl {{zlw-opl}}
Old Portuguese w:Galician Portuguese roa-ptg {{roa-ptg}}
Old Swedish w:Swedish language#Old Swedish gmq-osw {{gmq-osw}}
Phuthi w:Phuthi language bnt-phu {{bnt-phu}}
Picuris w:Picuris language nai-pic {{nai-pic}}
Pomeranian w:Pomeranian language zlw-pom {{zlw-pom}}
Russenorsk w:Russenorsk crp-rsn {{crp-rsn}}
Samoan Plantation Pidgin w:Samoan Plantation Pidgin cpe-spp {{cpe-spp}}
Serbo-Croatian w:Serbo-Croatian language sh {{sh}}
  • Serbo-Croatian may be considered a group of languages (Bosnian, Croatian, Serbian, and Montenegrin) or an individual language.
  • The ISO 639-1 code sh for Serbo-Croatian has been deprecated.
  • Serbo-Croatian also has the ISO 639-3 code hbs.
  • Wikimedia uses the code sh for Serbo-Croatian language projects.
Slovincian w:Slovincian zlw-slv {{zlw-slv}}
  • Slovincian may be considered a dialect of Kashubian, a dialect of Pomeranian, a dialect of Polish or a separate language.
Syrian Arabic w:Syrian Arabic sem-syr {{sem-syr}}
Taimyr Pidgin Russian crp-tpr {{crp-tpr}}
Tarantino w:Tarantino language roa-tar {{roa-tar}}
Zamboanga Chavacano w:Chavacano language#Zamboangueño cbk-zam {{cbk-zam}}
  • Wikimedia uses the code cbk-zam for Zamboanga Chavacano language projects.

List of appendix-only constructed languages

Name Wikipedia article Wiktionary code Comments
Bolak w:Bolak language art-blk {{conl:art-blk}}
Communicationssprache w:Communicationssprache art-com {{conl:art-com}}
Eloi w:Eloi language art-elo {{conl:art-elo}}
Go'uld w:Go'uld art-gld {{conl:art-gld}}
Klingon w:Klingon language tlh {{conl:tlh}}
Láadan w:Láadan ldn {{conl:ldn}}
Lapine w:Lapine language art-lap {{conl:art-lap}}
Mandalorian w:Mandalorian#Language art-man {{conl:art-man}}
Mundolinco w:Mundolinco art-mun {{conl:art-mun}}
Na'vi w:Na'vi language art-nav {{conl:art-nav}}
Neo w:Neo (constructed language) neu {{conl:neu}}
Noxilo w:Noxilo art-nox {{conl:art-nox}}
Quenya w:Quenya qya {{conl:qya}}
Sindarin w:Sindarin sjn {{conl:sjn}}
Toki Pona w:Toki Pona art-top {{conl:art-top}}

Languages' family and script information

Wiktionary sorts languages into families. Most families group languages which descend from a common ancestor, but a few are merely categories, such as "creoles and pidgins". Wiktionary records which family a language belongs to on a subpage of the language's template, /family. Each family is represented by a code; the family codes are explained in Wiktionary:Families.

  1. English belongs to the family of West Germanic languages; this information is recorded in Template:en/family. German is also a West Germanic language, as recorded in Template:de/family. Serbo-Croatian is a South Slavic language, as recorded in Template:sh/family. Abenaki is an Algonquian language, as recorded in Template:abe/family. Nahuatl is a Nahuan language, as recorded in Template:nah/family.
  2. The widely-used constructed language Esperanto has its membership in the category "Artificial languages" recorded in Template:eo/family.
  3. Zamboanga Chavacano has its membership in the category "Creole or pidgin languages" recorded in Template:cbk-zam/family.
  4. Wiktionary even records information about appendix-only constructed languages in this way: Láadan has its membership in the category "Artificial languages" recorded in Template:conl:ldn/family.

Wiktionary records which script(s) a language uses on another subpage of the language's template, /script. Each script is represented by a code, which is stored in the Template: namespace without any prefix. The script codes are explained in Wiktionary:Scripts.

  1. English is written in the Latin script; this is recorded in Template:en/script. Esperanto is written in the Latin script; this is recorded in Template:eo/script.
  2. Serbo-Croatian is written in both the Latin and the Cyrillic scripts; this is recorded in Template:sh/script.
  3. Wiktionary even records information about appendix-only constructed languages in this way: the information that Láadan is written in the Latin script is recorded in Template:conl:ldn/script.
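The family and script subpages follow one naming pattern, which can be sketched in Python as follows. The function name info_page and the sample table EXAMPLES are hypothetical; the table uses plain names rather than the actual family and script codes, which are explained in Wiktionary:Families and Wiktionary:Scripts.

```python
def info_page(template_title, kind):
    """Return the subpage holding a language's family or script record."""
    if kind not in ("family", "script"):
        raise ValueError(f"unknown kind: {kind!r}")
    return f"{template_title}/{kind}"


# What the subpages record for a few of the examples above, written with
# plain names instead of the real codes.
EXAMPLES = {
    info_page("Template:en", "family"): "West Germanic",
    info_page("Template:en", "script"): ["Latin"],
    info_page("Template:sh", "family"): "South Slavic",
    info_page("Template:sh", "script"): ["Latin", "Cyrillic"],
    info_page("Template:conl:ldn", "script"): ["Latin"],
}
```

The script entry is a list because a language such as Serbo-Croatian is written in more than one script.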

List of languages with multiple names

Code Language Names Common links
ab Abkhaz Abkhaz, Abkhazian, Abxazo
adj Adioukrou Adioukrou, Adjukru, Adyoukrou, Adyukru, Ajukru
ak Akan Akan, Twi-Fante
ang Old English Old English, Anglo-Saxon
arr Arara-Karo Arara-Karo, Karo
ase American Sign Language American Sign Language, Ameslan, ASL
asf Auslan Auslan, Australian Sign Language
aue ǂKxʼauǁʼein ǂKxʼauǁʼein, ǁAuǁei, Auen, Kaukau, Koko, Kung-Gobabis, ‡Kx'auǁ'ei, ǂKx'auǁ'ein, ǁX'auǁ'e
aum Abewa Abewa, Asu
aus-gun Gunai Gunai, Gaanay, Ganai, Gunnai', Kurnai, Kurnay
av Avar Avar, Avaric
axb Abipon Abipon, Abipón, Callaga, Kalyaga
az Azeri Azeri, Azerbaijani, Azari, Azeri Turkic, Azerbaijani Turkic
bal Baluchi Baluchi, Balochi
bau Badanchi Badanchi, Bada
bcn Bibaali Bibaali, Bali
bdh Tara Baka Tara Baka, Baka
be Belarusian Belarusian, Belorussian, Belarusan, Bielorussian, Byelorussian, Belarussian, White Russian
bey Akuwagel Akuwagel, Beli
bfi British Sign Language British Sign Language, BSL
bkc Baka Baka
bkp Iboko Iboko, Boko
bm Bambara Bambara, Bamanankan
bmy Kinyabemba Kinyabemba, Bemba
bn Bengali Bengali, Bangla
bua Buryat Buryat, Buriat
bzs Brazilian Sign Language Brazilian Sign Language, LGB, LSB, LSCB, Libras
bzj Belize Kriol English Belize Kriol English, Belizean Creole, Belizean Creole English, Belizean Kriol, Kriol
ca Catalan Catalan, Valencian
car Galibi Carib Galibi Carib, Carib, Caribe, Cariña, Galibi, Galibí, Kalihna, Kali'na, Kalinya
cmn Mandarin Mandarin, Mandarin Chinese, Putonghua, Guoyu, Huayu, Guanhua, Beifanghua, Standard Chinese
cpe-mar Maroon Spirit Language Maroon Spirit Language, Jamaican Maroon Spirit Possession Language
den Slavey Slavey, Slave, Slavé
dlm Dalmatian Dalmatian, Dalmatic
dsb Lower Sorbian Lower Sorbian, Lower Lusatian, Lower Wendish
dv Dhivehi Dhivehi, Divehi, Mahal, Mahl, Maldivian
ekl Kolhe Kolhe, Kol
el Greek Greek, Modern Greek, Neo-Hellenic
en English English, Modern English
enm Middle English Middle English, Medieval English
eu Basque Basque, Euskara
fa Persian Persian, Farsi, New Persian, Modern Persian
fan Pahouin Pahouin, Fang
fi Finnish Finnish, Suomi
fr French French, Modern French
frp Franco-Provençal Franco-Provençal, Arpetan, Arpitan
fy West Frisian West Frisian, Western Frisian
ga Irish Irish, Irish Gaelic
gd Scottish Gaelic Scottish Gaelic, Gàidhlig, Highland Gaelic, Scots Gaelic, Scottish
gez Ge'ez Ge'ez, Ethiopic, Gi'iz
gsw Alemannic German Alemannic German, Swiss German
gul Gullah Gullah, Geechee, Sea Island Creole English
gv Manx Manx, Manx Gaelic
he Hebrew Hebrew, Ivrit
ho Hiri Motu Hiri Motu, Pidgin Motu, Police Motu
hu Hungarian Hungarian, Magyar
hsb Upper Sorbian Upper Sorbian, Upper Lusatian, Upper Wendish
ht Haitian Creole Haitian Creole, Creole, Haitian, Kreyòl
hub Huambisa Huambisa, Huambiza, Wambisa
hvc Haitian Vodoun Culture Language Haitian Vodoun Culture Language, Langaj, Langay
hwc Hawaiian Pidgin Hawaiian Pidgin, Hawaii Creole English, Hawaii Pidgin English, HCE, Pidgin
hy Armenian Armenian, Modern Armenian
ik Inupiak Inupiak, Inupiaq, Iñupiaq, Inupiatun
ja Japanese Japanese, Modern Japanese, Nipponese, Nihongo
jam Jamaican Creole Jamaican Creole, Jamaican, Jamaican Patois, Patois, Patwa
ka Georgian Georgian, Kartvelian
kbc Kadiwéu Kadiwéu, Caduveo, Ediu-Adig, Guaicurú, Kadiweu, Mbayá, Mbayá-Guaycuru, Waikurú
kcn Nubi Nubi, Ki-Nubi
kcu Kikami Kikami, Kami
kda Worimi Worimi, Gadang, Gadhang, Gadjang, Kattang, Kutthung
kea Kabuverdianu Kabuverdianu, Cape Verdean Creole
kg Kongo Kongo, Kikongo
khi-kun ǃKung ǃKung, ǃ'OǃKung
kj Kwanyama Kwanyama, Kuanyama, Oshikwanyama
kl Greenlandic Greenlandic, Kalaallisut
km Khmer Khmer, Cambodian
kmb Kimbundu Kimbundu, North Mbundu
ko Korean Korean, Modern Korean
ky Kyrgyz Kyrgyz, Kirghiz, Kirgiz
lad Ladino Ladino, Judaeo-Spanish, Judæo-Spanish, Judeo-Spanish
lg Luganda Luganda, Ganda
li Limburgish Limburgish, Limburgan, Limburgian, Limburgic
lkt Lakota Lakota, Lakhota
lo Lao Lao, Laotian
lv Latvian Latvian, Lett
mec Mara Mara, Leelawarra, Leelalwarra, Mala, Marra
meu Motu Motu, Pure Motu, True Motu
mnk Mandinka Mandinka, Mandingo
moc Mocoví Mocoví, Mbocobí, Mokoví, Moqoyt
mgs Nyasa Nyasa, Kimanda, Kinyasa, Manda
mjh Nyanza Nyanza, Kinyasa, Mwera, Nyasa
mrh Mara Chin Mara Chin, Chin Mara, Lakher, Mara, Maram, Mira, Zao
mwe Mwera Mwera, Chimwera, Cimwera, Mwela
my Burmese Burmese, Myanmar
mzn Mazanderani Mazanderani, Mazandarani
na Nauruan Nauruan, Nauru
nb Norwegian Bokmål Norwegian Bokmål, Bokmål
nds Low German Low German, Low Saxon, Modern Low German
nn Norwegian Nynorsk Norwegian Nynorsk, New Norwegian, Nynorsk
ny Chichewa Chichewa, Chicheŵa, Chinyanja, Nyanja
nyr Shinyiha Shinyiha, Nyiha
nys Nyunga Nyunga, Noongar, Nyuunga
ood O'odham O'odham, Papago
os Ossetian Ossetian, Ossete, Ossetic
ota Ottoman Turkish Ottoman Turkish, Ottoman
pa Punjabi Punjabi, Panjabi
pap Papiamentu Papiamentu, Papiamento
pis Pijin Pijin, Kanaka, Neo-Solomonic, Solomons Pidgin
pit Pitta-Pitta Pitta-Pitta, Pitta Pitta
plg Pilagá Pilagá, Pilacá
pot Potawatomi Potawatomi, Pottawatomie
pro Old Provençal Old Provençal, Old Occitan
pt Portuguese Portuguese, Modern Portuguese
pua Purepecha Purepecha, Phorhépecha, Porhé, P'urhépecha, Tarascan, Tarasco
rap Rapa Nui Rapa Nui, Rapanui, Pascuense
rm Romansch Romansch, Romansh, Rumantsch, Romanche
ro Romanian Romanian, Daco-Romanian, Roumanian, Rumanian
roa-grn Guernésiais Guernésiais, Dgèrnésiais, Guernsey French, Guernsey Norman French
roa-jer Jèrriais Jèrriais, Jersey French, Jersey Norman, Jersey Norman French
roa-ptg Old Portuguese Old Portuguese, Galician-Portuguese, Galician Portuguese
rop Kriol Kriol, Australian Kriol
sco Scots Scots, Lowland Scots
sh Serbo-Croatian Serbo-Croatian, BCS, Croato-Serbian, Serbocroatian, Bosnian, Croatian, Montenegrin, Serbian
si Sinhalese Sinhalese, Singhalese, Sinhala
sk Slovak Slovak
sl Slovene Slovene, Slovenian
snq Chango Chango, Sangu
spp Supyire Supyire, Suppire, Supyire Senoufo
sqt Soqotri Soqotri, Socotri
srs Sarcee Sarcee, Sarsi, Tsuu T'ina, Tsuut'ina, Tsu T'ina
ss Swati Swati, Swazi
st Sotho Sotho, Sesotho, Southern Sesotho, Southern Sotho
sth Shelta Shelta, Cant
tcs Torres Strait Creole Torres Strait Creole, Big Thap, Blaikman, Brokan, Broken, Broken English, Cape York Creole, Lockhart Creole, Papuan Pidgin English
tg Tajik Tajik, Tadjik, Tadzhik, Tajiki, Tajik Persian
tmh Tamashek Tamashek, Tamahaq, Tamajaq, Tamasheq, Tuareg
tn Tswana Tswana, Setswana
tnq Taino Taino, Taíno
tob Toba Toba, Chaco Sur, Namqom, Qom, Toba Qom
tog Chitonga Chitonga, Kitonga, Siska, Sisya, Tonga, Western Nyasa
toi Tonga Tonga, Chitonga, Plateau Tonga, Zambezi
tpi Tok Pisin Tok Pisin, Melanesian Pidgin English, Neo-Melanesian, New Guinea Pidgin
ug Uyghur Uyghur, Uigur, Uighur, Uygur
uln Unserdeutsch Unserdeutsch, Rabaul Creole German
umb Umbundu Umbundu, South Mbundu
vai Vai Vai, Gallinas, Vy
waq Wageman Wageman, Wagiman, Wakiman, Wogeman
xaa Andalusian Arabic Andalusian Arabic, Andalusi Arabic, Moorish Arabic, Spanish Arabic
xcl Old Armenian Old Armenian, Classical Armenian, Liturgical Armenian, Grabar
yue Cantonese Cantonese, Yue, Yüeh
yun Bena Bena, Binna, Buna, Ebina, Ebuna, Gbinna, Lala, Purra, Yangeru
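Some names in this table appear under more than one code ("Baka", "Kriol", "Tonga" and "Nyasa", for instance), so a lookup from an arbitrary name to a code can have several answers. The Python sketch below shows one way to index the table with that in mind. The names ROWS, build_index and codes_for_name are hypothetical, only a handful of rows are copied in, and the case-insensitive matching is an assumption of the sketch rather than Wiktionary practice.

```python
from collections import defaultdict

# A few rows from the table: code -> (name used by Wiktionary, all listed names).
ROWS = {
    "bdh": ("Tara Baka", ["Tara Baka", "Baka"]),
    "bkc": ("Baka", ["Baka"]),
    "bzj": ("Belize Kriol English",
            ["Belize Kriol English", "Belizean Creole",
             "Belizean Creole English", "Belizean Kriol", "Kriol"]),
    "rop": ("Kriol", ["Kriol", "Australian Kriol"]),
    "fa": ("Persian", ["Persian", "Farsi", "New Persian", "Modern Persian"]),
}


def build_index(rows):
    """Map each listed name (case-folded) to every code it appears under."""
    index = defaultdict(list)
    for code, (_wiktionary_name, names) in rows.items():
        for name in names:
            index[name.casefold()].append(code)
    return index


INDEX = build_index(ROWS)


def codes_for_name(name):
    """Return all candidate codes for a name, sorted; empty if unknown."""
    return sorted(INDEX.get(name.casefold(), []))
```

Here codes_for_name("Farsi") has a single candidate, "fa", whereas codes_for_name("Baka") returns both "bdh" and "bkc", and the name Wiktionary itself uses (the first element of each row) is what settles which language is meant.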

See also