Appendix:Vocabulary lists

Vocabulary list appendices:

Introduction

Welcome to Wiktionary's vocabulary lists series. This series aims to provide representative word lists for all of the world's language families.

  • Purpose: As linguistic lexicographical works, the vocabulary lists are designed with historical-comparative linguistics research goals in mind, such as classifying languages, reconstructing proto-languages, and identifying loanwords. Frequency lists and pedagogical resources are not included.
  • Glosses: Each list maintains original glosses (definitions, meanings) as found in the original sources. Translated glosses are sometimes added as additional columns if the original glosses are not in English. Translations that are not in the original source are noted in the lists, and do not replace the original glosses. Unlike Swadesh lists and other standardized lexicostatistical word lists, the vocabulary lists here do not consist of lists with predetermined glosses. Instead, the vocabulary lists here can serve as "raw building blocks" for compiling Swadesh lists.
  • Content: The lists typically contain 50 to 1,000 lexical entries. Definitions are typically concise and focus on basic vocabulary concepts such as numerals, body parts, and natural phenomena.
  • Scope: Emphasis is placed on divergent language isolates, families, and branches that would likely be crucial for etymological reconstruction and classification. Proto-languages are included whenever possible. Many of these language groups are sparsely documented and/or extinct. As a result, some of these lists may actually be the only extant documentation of a language or even language group.
  • Sources: The word lists are adapted from academic sources published by linguists. Thus, all lists must be properly referenced with adequate notes and metadata. Many of these sources are out of print, with highly limited distribution and accessibility.
  • Digitization: As with Wikisource texts, the lists are individually and painstakingly digitized using a variety of methods, such as optical character recognition (OCR), manual typing, and document conversion.
  • Encoding: Unicode.
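
The "raw building blocks" idea under Glosses can be illustrated with a minimal Python sketch. It is not part of Wiktionary's tooling: the Entry record, the compile_concept_list helper, and the placeholder forms are hypothetical names introduced only for this example. It assumes each entry keeps its original gloss verbatim, with any added English translation stored in a separate field, and it selects entries for a Swadesh-style concept list by exact gloss match after Unicode normalization.

```python
from dataclasses import dataclass
from typing import Optional
import unicodedata


@dataclass(frozen=True)
class Entry:
    """One row of a raw vocabulary list (hypothetical structure)."""
    form: str                       # lexical form as given in the source
    gloss: str                      # original gloss, kept verbatim
    gloss_en: Optional[str] = None  # added English translation, if any


def normalize(text: str) -> str:
    """NFC-normalize, trim, and case-fold a gloss for comparison only."""
    return unicodedata.normalize("NFC", text).strip().casefold()


def compile_concept_list(entries, concepts):
    """Map each target concept to the entries whose gloss matches it exactly.

    The added English translation is preferred when present; otherwise the
    original gloss is used. Concepts with no match map to an empty list.
    """
    wanted = {normalize(c): c for c in concepts}
    result = {c: [] for c in concepts}
    for entry in entries:
        gloss = entry.gloss_en if entry.gloss_en is not None else entry.gloss
        key = normalize(gloss)
        if key in wanted:
            result[wanted[key]].append(entry)
    return result


# Placeholder data, not taken from any real language or source.
raw_list = [
    Entry("form-1", "water"),
    Entry("form-2", "feu", gloss_en="fire"),  # original gloss is French
    Entry("form-3", "river bank"),
    Entry("form-4", "Water "),
]

swadesh_subset = compile_concept_list(raw_list, ["water", "fire", "stone"])
for concept, matches in swadesh_subset.items():
    print(concept, [e.form for e in matches])
```

Exact matching is a deliberately conservative choice here: a gloss such as "river bank" is not silently assigned to a target concept, and the original gloss of every selected entry remains available unchanged. A real compilation would likely need manual review of near matches.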

Open-access online lexical databases that are similar in design, content, and research goals include STEDT, MKED, RefLex, Chirila, and Starling.

Navigational templates

Vocabulary lists of North Eurasian languages

European • Balkan • Hurro-Urartian • Hattic • Sumerian (Swadesh) • Elamite • Etruscan • Burushaski • Uralo-Altaic • Paleosiberian • p-Japanese • p-Ainu • p-Nivkh • p-Chukotko-Kamchatkan • p-Yukaghir • p-Yeniseian

Caucasian

Caucasian • p-Northwest Caucasian • p-Nakh-Daghestanian • p-Kartvelian

Uralic

p-Uralic (stable roots) • Finnic • Saami • Mordvinic • Mari • Permic • Ugric • Samoyedic

"Altaic" linguistic area

Altaic • Turkic • Mongolic • Common Mongolic • Khitan

Indo-European

Germanic • Celtic • Romance • Baltic • East Slavic • West Slavic • South Slavic


Vocabulary lists of African languages

Nilo-Saharan

Nilo-Saharan • p-Nilo-Saharan • p-Nilotic (p-E. Nilotic • p-S. Nilotic) • p-Surmic • NE Sudanic • p-Nubian • Nara • p-Daju • p-Jebel • Temein • Central Sudanic (p-Central Sudanic • p-Sara-Bongo-Bagirmi • Sinyar • Birri • p-Mangbetu) • p-Kuliak (Ik) • Kadu • Berta • Kunama • Gumuz • p-Koman • Gule • Amdang • Mimi-D • Maban • Kanuri • p-Songhay • Tadaksahak

Niger-Congo

p-Niger-Congo • p-Benue-Congo • p-Grassfields • p-Ring • Momo • Tivoid • Ekoid • Beboid • Bendi • p-Bantu (Swadesh list) • p-Jukunoid • p-Plateau (p-Tarokoid) • p-N. Jos • p-Fali • p-Yoruboid • Olukumi • p-Edoid • p-Akokoid • p-Igboid • Akpes • Ayere-Ahan • p-Upper Cross River • p-Lower Cross River • Anaang • p-Ogoni • p-Ukaan • p-Nupoid • Oko • p-Idomoid • p-Ijaw • Defaka • p-Gbe (Fongbe) • p-Potou-Akanic • p-Mumuye • p-Jen • Yendang • Tula-Waja • p-Lakka • p-Bua • Kim • p-Central Togo • p-Guang • p-Gurunsi • p-Oti-Volta (p-E. Oti-Volta • p-C. Oti-Volta) • Tiefo • Natioro • Bariba • p-Gbaya • p-Mande (p-W. Mande • p-Mandekan • p-Niger-Volta • p-S. Mande) • Atlantic (Guinea) • p-Cangin • Bijogo • p-Talodi • p-Heiban • p-Katloid • Rashad • Lafofa

Afroasiatic

p-Afroasiatic • p-Chadic • p-Ron • p-North Bauchi • South Bauchi • p-Central Chadic • p-Masa • Kujarge • p-Cushitic • p-Agaw • p-Omotic • p-Aroid • p-Maji • Mao • p-Semitic

Khoisan

Khoisan • p-Khoe • p-Central Khoisan • p-Tuu • !Kung

Language isolates

Bangime • Jalaa • Laal • Ongota • Shabo • Sandawe • Hadza

Others

p-Niger-Saharan


Vocabulary lists of Southeast Asian languages

Sino-Tibetan

p-Tibeto-Burman • Old Chinese (basic) • p-Southern Min • Greater Bai • p-Tujia • p-Naish • p-Ersuic • Guiqiong • p-Lalo • Akha • Kathu • Gong • p-Karenic • p-Luish • p-Bodo-Garo • Kuki-Chin • Mru • p-W. Tibetan • Zakhring • Tshangla • Kho-Bwa • Mey • p-Puroik • p-Hrusish • Koro • Greater Siangic • Raji-Raute • Dhimalish • Baram-Thangmi • Bhujel • p-Kham • Dura • Bunan • (Nepal)

Austroasiatic

p-Austroasiatic • p-Munda • p-Khasian • p-Palaungic • Quang Lam • p-Khmuic • p-Pakanic • p-Vietic • p-Katuic • p-Bahnaric • p-Pearic • p-Khmeric • p-Monic • p-Aslian • p-Nicobarese

Hmong-Mien

p-Hmong-Mien • Hmong-Mien • p-Hmongic • Pa-Hng • Pana • p-Mienic • Mienic

Kra-Dai

p-Kra • Laha • Qabiao • Gelao • p-Kam-Sui • Kam-Sui (Hunan) • p-Lakkia • p-Tai • p-Be • Jizhao • p-Hlai • Jiamao


Vocabulary lists of Indo-Pacific languages

Papuan

p-Trans-New Guinea • Bayono-Awbono • Paniai Lakes • Kolopom • Bulaka River • Pauwasi • p-South Bougainville • p-Lower Sepik • p-Watam-Awar-Gamay • p-Lakes Plain • p-North Halmahera • p-Timor-Alor-Pantar • p-Alor-Pantar • Tayap • Massep

Australian

North Australian (basic) • p-Nyulnyulan • p-Mirndi • p-Gunwinyguan • Limilngan • Umbugarla • Minkin • p-Pama-Nyungan • p-Arandic • p-Thura-Yura • p-Ngayarda

Others

p-Dravidian • Dravidian • Kusunda • Nihali • Kenaboi • p-Ongan


Vocabulary lists of Amerindian languages

North America

Amerindian • p-Amerind • p-Eskimo • p-Na-Dene • p-Athabaskan • p-Algonquian • Beothuk • p-Iroquoian • p-Siouan • Caddoan • Yuchi • Kutenai • Chinook • Sahaptian • p-Takelman • p-Kalapuyan • Alsea • p-Wintun • Klamath • Molala • Cayuse • Coos • Lower Umpqua • p-Utian • p-Yokuts • p-Maidun • p-Salishan • p-Wakashan • p-Chimakuan • p-Palaihnihan • Chimariko • Shasta • Yana • p-Pomo • Esselen • Salinan • p-Chumash • Waikuri • p-Yuman • p-Yukian • Washo • p-Kiowa-Tanoan • p-Keresan • Coahuilteco • Comecrudo • Cotoname • Karankawa • Tonkawa • Maratino • Quinigua • Naolan • p-Muskogean • Natchez (Swadesh) • Atakapa • Chitimacha • Adai • Timucua

Central America

p-Oto-Manguean • p-Oto-Pamean • p-Central Otomian • p-Otomi • p-Mazatec • p-Chinantec • p-Mixtec • p-Zapotec • p-Uto-Aztecan • p-Aztecan • Purépecha (Swadesh) • Cuitlatec • p-Totozoquean • p-Totonacan • p-Mixe-Zoquean • Highland Chontal • Huamelultec • Tequistlatec • p-Huave • p-Mayan (Swadesh) • Xinca • p-Jicaque • p-Lencan • Lenca • p-Misumalpan

South America

p-Cariban • p-Taranoan • p-Chibchan • p-Barbacoan • Páez • p-Pano-Takanan • p-Panoan • p-Makú • Hupda • p-Tukanoan • p-Arawan • Harákmbut-Katukina • p-Cahuapanan • p-Choco • p-Guahiban • p-Shuar • Candoshi • p-Shuar-Candoshi • Achuar • p-Nambikwaran • Tinigua • Timote • p-Lule-Vilela • Vilela • Chamacoco • Allentiac • Chaná • Arutani–Sape • p-Bora-Muinane • Bora • p-Witotoan • Witoto • p-Macro-Daha • Sáliba • Piaroa • Ticuna • Yuri • Caraballo • Andoque • p-Mataguayo • p-Guaicurú • Guachi • Payagua • Mura • Pirahã • Matanawi • Quechumaran • Quechuan • p-Zaparoan • p-Peba-Yagua • Iquito • p-Chapacuran • Andaqui • Guamo • Betoi • Kamsá • Otomacoan • Jirajaran • Hibito-Cholon • Cholón • Sechura-Catacao • Sechura • Culli • Mochica • Esmeralda • Taushiro • Urarina • Aiwa • Canichana • Guató • Irantxe • Aikanã • Kanoé (Swadesh) • Kwaza • Mato Grosso Arára • Munichi • Omurano • Puinave • Leco • Puquina • Ramanos • Warao • Yaruro • Yuracaré • Yurumangui

South America (NE Brazil)

Katembri • Taruma • Yatê • Xukurú • Natú • Pankararú • Tuxá • Atikum • Kambiwá • Xokó • Baenan • Kaimbé • Tarairiú • Gamela

South America (Arawakan)

p-Arawakan • p-Japurá-Colombia • p-Lokono-Guajiro • Wayuu • p-Mamoré-Guaporé • p-Bolivia • p-Mojeño • p-Purus

South America (Macro-Jê)

p-Macro-Jê • Rikbaktsa • p-Jê • Jeikó • p-Jabuti • p-Kamakã • Kamakã • Maxakali • Chiquitano • Dzubukuá • Oti • p-Puri • p-Bororo

South America (Tupian)

p-Tupian • Puruborá • Karo • p-Tupari • p-Maweti-Guarani • p-Tupi-Guarani • Guaraní

External links

A selection of various comparative lexical databases currently available online:

General
Regional
Southeast Asia
  • STEDT (Sino-Tibetan)
  • MKED (Austroasiatic)
  • ABVD (Austronesian)
  • ACD (Austronesian)