Wiktionary talk:Language treatment

moving dialect codes to the etyl namespace

For cases where we consider the macrolanguage to be the individual language and the subdivisions to be dialects, I think we should move the subdivision language code templates to the etyl: namespace. Similar to language families & other dialects, this is where we house codes that should only be used in Etymologies and not as valid L2 languages. Sound fine? --Bequw → ¢ • τ 21:59, 20 January 2010 (UTC)

Treatment by SIL

I thought it may be interesting to post what SIL's (the Registration Authority for ISO 639-3) criteria are for determining if language varieties are dialects or distinct languages. It can be found on their Change Request Form (page 3).

For this part of ISO 639, judgments regarding when two varieties are considered to be the same or different languages are based on a number of factors, including linguistic similarity, intelligibility, a common literature (traditional or written), a common writing system, the views of users concerning the relationship between language and identity, and other factors. The following basic criteria are followed:
Two related varieties are normally considered varieties of the same language if users of each variety have inherent understanding of the other variety (that is, can understand based on knowledge of their own variety without needing to learn the other variety) at a functional level.

Where intelligibility between varieties is marginal, the existence of a common literature or of a common ethnolinguistic identity with a central variety that both understand can be strong indicators that they should nevertheless be considered varieties of the same language.

Where there is enough intelligibility between varieties to enable communication, the existence of well-established distinct ethnolinguistic identities can be a strong indicator that they should nevertheless be considered to be different languages.

We are of course independent of these, but they may be useful nonetheless. --Bequw → ¢ • τ 21:58, 21 January 2010 (UTC)

Allowing macro and non-standard dialects

Sometimes (as with Latvian and Estonian) we treat the subdivisions of a macrolanguage as individual languages, but we use the macrolanguage name/code in place of the "standard" dialect name/code. I just added this option to the table. Are there other macrolanguages where this is the case (possibly Arabic and Malay)? --Bequw → ¢ • τ 17:09, 23 January 2010 (UTC)

Aramaic

Apparently some have been treating "Jewish Babylonian Aramaic" (aka "Talmudic Aramaic", code=tmr) as a variety of Aramaic. Does anyone know if this is standard, or if this is true of other ISO 639-3 coded Aramaic varieties? --Bequw → τ 18:34, 8 February 2010 (UTC)

"Apparently" should now link to an archive.—msh210℠ 18:52, 15 February 2010 (UTC)

Chinese

I'd like to update the Chinese entry. Is there any way to just write in plain English, without passing through a template? Mglovesfun (talk) 16:41, 25 March 2010 (UTC)

Use the templates, please, because they standardize the possible texts, and standardization is good. Another way to contribute to the page is to type what you need here, so that I can update the table. --Daniel. 14:15, 22 April 2010 (UTC)

I've changed the table to a regular wikitable so that anyone can edit it and so that it can handle more complex situations and the presence of deleted codes. Cheers, - -sche (discuss) 21:05, 23 May 2013 (UTC)

Aramaic redux

Because at least one RFM is ongoing(?), I'll list this here rather than on the main page: oar (Old Aramaic, up to 700 BCE) is not used, as it has been superseded by arc and syc. tmr (Jewish Babylonian Aramaic, circa 200-1200 CE) is not used, as it has been superseded by arc and etyl:tmr. - -sche (discuss) 00:11, 16 July 2013 (UTC)

Montagnais/Innu

Currently, some main-namespace pages use Montagnais/Innu's language code (probably mostly in translations tables) while a few use other Cree dialects' language codes. Innu is different enough from Cree that Innu is regularly considered side-by-side with (rather than subordinated under) Cree; e.g. the Linguistic Atlas of Canada speaks of "different Cree and Innu dialects". OTOH, they're not that different, and splitting them at the L2 level would raise questions of what to do with e.g. Naskapi. I'm curious whether we should (a) allow Innu its own L2, (b) merge it completely into Cree, or (c) leave it subordinated under / merge into cr at the L2 level, but let it keep its code (it currently still has one, as no-one ever deleted it) so that it can be used in translations tables (like the Romani lects' codes). The translations could be nested under Cree/cr, or could be separate, sorted under M or I depending on which name we end up using for the lect. - -sche (discuss) 22:26, 20 July 2013 (UTC)

East Frisian: frs, stq

RFM 1

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits.

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.

Template:frs - Template:stq

This is an old, old mistake in ISO. Both codes refer to the very same language, namely the Frisian dialect spoken in Saterland, which is an Eastern Frisian dialect. I have no idea how that was overlooked, but it means the two codes should be merged somehow. I'd prefer {{frs}}, since that one is in 639-2. -- Liliana • 14:24, 17 October 2011 (UTC)

Should the language name be "East Frisian" or "Saterland Frisian"? I'd prefer to use the code "frs", but the name "Saterland Frisian". - -sche (discuss) 19:08, 17 October 2011 (UTC)

To me it seems like Saterland Frisian is the most common name, so we should probably use that. -- Liliana • 19:14, 17 October 2011 (UTC)

Alright, frs = Saterland Frisian it is, but {{stq}} is in fact widely used — someone will need to replace it by bot. - -sche (discuss) 23:50, 23 October 2011 (UTC)

Or a really bored person like me needs to spend an hour or two. -- Liliana • 00:12, 24 October 2011 (UTC)

But what about etymologies involving Eastern Frisian, at the time it still existed? With no code, how should they be entered? Or even, how should the etymologies that already exist be fixed? —CodeCat 11:23, 24 October 2011 (UTC)

If it warrants a distinction, it should get one of these constructed codes. It isn't covered by the code frs anyway, which ISO classifies as a "living" language, not an extinct one. -- Liliana • 12:13, 24 October 2011 (UTC)

East Frisian isn't really extinct strictly, but the only surviving instance of it is now called Saterland Frisian. —CodeCat 16:11, 24 October 2011 (UTC)

This doesn't explain why ISO assigned two codes to one language. We do not have that for any other language of the world. Using frs for a different language than what ISO intended would set a precedent, and almost certainly require a vote.
Another problem is that the current name "East Frisian" is really confusing, since there's an (unrelated) Low German dialect which is also called East Frisian. So in any case, you would have to sort out the erroneous uses. -- Liliana • 16:15, 24 October 2011 (UTC)

I agree with Liliana, we need a separate code of our own for non-Saterland varieties of East Frisian (or we need to clearly indicate that we are using "frs" to refer to a language other than the one the ISO refers to as "frs"). If a word is derived from a variety of East Frisian other than the one the ISO calls "stq", it cannot be derived from what the ISO calls "frs", because "frs" is living, and the only living East Frisian lect is "stq". - -sche (discuss) 00:34, 26 October 2011 (UTC)

Proposed additions / clarifications

These are all from translation tables, which I will edit to reflect consensus for any of these cases:

Macro languages:

Chinese: dng, ltc, och
Sorbian: dsb, hsb
Apache: apw, apm, apj, apl, apk
Sami: smn, smj, sms, sma, se
Frisian: fy, ofs, frr
Berber: shi
Marquesan: mrq, mrm

Dialects / script group:

sq: als does not exist any more, change to just Tosk
cop: Bohairic, Sahidic, Fayyumic
lt: Aukštaitian
ms: Rumi, Jawa
sc: Nugorese
tly: Asalemi, Anbarani, Masali
sh: Cyrillic, Roman, Arebica, Latin
arc: Hebrew, Syriac
ks: Arabic, Devanagari
cu: Cyrillic, Glagolitic
ro: mo no longer exists; Latin, Cyrillic
os: Digor, Iron
kea: Badiu, São Vicente, ALUPEC, Sotavento, Barlavento, Santo Antão
az: Cyrillic, Roman, Perso-Arabic, Arabic, Persic
avd: Vidari
egy: Archaic Egyptian, Old Egyptian, Middle Egyptian, Late Egyptian
tt: Cyrillic, Roman
lad: Roman, Hebrew, Latin
pa: Gurmukhi, Shahmukhi (has its own code?)
nso: Sepedi
vot: Roman, Cyrillic
rom: table says that rmc, rmf, rml, rmn, rmo, rmw, rmy are deprecated but they still exist in the languages module
kw: Kernewek Kemmyn
be: Cyrillic, Roman, Narkamaŭka, Taraškievica, Tarashkevitsa
tg: Cyrillic, Persic, Roman
ug: Persic, Roman, Cyrillic, Perso-Arabic
uz: Cyrillic, Roman, table says that uzn and uzs are deprecated but they still exist in the languages module
zza: Persic, Roman
ko: South, North
fia: Fadicca, Kenzi
cr: some codes are deprecated but still in languages module
lmo: Eastern, Western, Milanese
ms: Rumi, Jawi, Latin, Arabic
la: New Latin
pi: Burmese, Devanagari, Latin

Other:

ar: xaa, mey
fr: frm, fro
de: ksh, gsw
nds: deprecated but still in languages module, add nds-de, nds-nl
mn: cmg
es: osp
hy: xcl
pnb: pa
id: ace, ban, bjn, bug, jv, mad, mak, min, nia, sas, su
ga: sga, pgl
fy: stq
arc: syc
tt: crh
ko: oko, okm
rom: rmq
pl: zlw-opl

I apologize if this is in an inconvenient format; rearrange it as you like. DTLHS (talk) 00:44, 20 August 2013 (UTC)

Nice. Some additional things that I noticed after a quick read: okm should be under ko, pgl should be under ga, zlw-opl should be under pl, ~~there are tons of missing Arabic sublects that should be under ar, and grc (and possibly some other lects) should be under el~~. —Μετάknowledge^{discuss/deeds} 02:40, 20 August 2013 (UTC)

grc is already under el on the page. What Arabic sublects aren't in my list or the existing table? DTLHS (talk) 03:08, 20 August 2013 (UTC)

Never mind. Only mt, which shouldn't be under ar anyway (well, linguistically it should, but not sociopolitically). —Μετάknowledge^{discuss/deeds} 04:04, 20 August 2013 (UTC)

Use title text for the language names?

A lot of the language codes in the table don't have a name next to them, but if we added the name it would become very hard to see. Would it be useful to turn it into title text, so that the name is shown when you hover the mouse over the code? —CodeCat 19:36, 25 August 2013 (UTC)

Hmm. One downside to that is that it would no longer be possible (would it?) to hit Ctrl+F and search the page for a particular dialect's name. Given that one of the reasons this page exists is so that people can see if the reason we don't have a code is because we've merged it into something else (vs we just haven't added it yet), that's a significant downside. - -sche (discuss) 05:15, 26 February 2014 (UTC)

RFC discussion: May 2013

The following discussion has been moved from Wiktionary:Requests for cleanup.

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.

Wiktionary:Language treatment

This is causing some script errors because some of the codes have since been deleted. I'm not sure what to do about that. —CodeCat 13:34, 23 May 2013 (UTC)

It needs to be redesigned so that the table can contain/mention codes that have been deleted, for the reason you mention and several other reasons. - -sche (discuss) 19:03, 23 May 2013 (UTC)

I've started redoing the table. - -sche (discuss) 19:30, 23 May 2013 (UTC)

List of codes the ISO has retired

This was previously at User:-sche/retired codes, but I think it is useful to have it in the Wiktionary: namespace. - -sche (discuss) 23:53, 21 December 2014 (UTC)

Retired codes which were not used on Wiktionary in February 2014

Codes which were retired from the ISO and which were not used on Wiktionary as of February of 2014. (Since then, several other codes which were retired from the ISO by that date have also been retired on Wiktionary; see the following sections.)

List

nln — Durango Nahuatl - split into [azd] Eastern Durango Nahuatl and [azn] Western Durango Nahuatl
noo — Nootka - split into [dtd] Ditidaht and [nuk] Nootka / Nuu-chah-nulth
unp — Worora - split into Worrorra [wro] and Unggumi [xgu]
wiw — Wirangu - split into Wirangu [wgu] and Nauo [nwo]
aay — Aariya - retired as nonexistent
acc — Cubulco Achí - merged with [acr]
aex — Amerax - retired as nonexistent
agp — Paranan - split into Pahanan Agta [apf] and Paranan [prf] (the lects are extremely similar, but only due to convergence; their different grammar reveals their different origins)
ahe — Ahe - merge into Kendayan / Salako [knx]
aiz — Aari - split into Aari [aiw] (new identifier) and Gayil [gyl]
akn — Amikoana - retired as nonexistent
amd — Amapá Creole - retired as nonexistent
arf — Arafundi - split into three languages: Andai [afd]; Nanubae [afk]; Tapei [afp]
atf — Atuence - retired as nonexistent
auv — Auvergnat - merge into Occitan (post 1500) [oci]
ayx — Ayi (China) - merge into Anong [nun] as duplicate / nonexistent as separate lect
azr — Adzera - split into three languages: Adzera [adz] (new identifier), Sukurum [zsu] and Sarasira [zsa]
baz — Tunen - split into Tunen [tvu] and Nyokon [nvo]
bcx — Pamona - split into Pamona [pmf] (new identifier) and Batui [zbt]
bgh — Bogan - merge into Bugan [bbh] as duplicate
bhk — Albay Bicolano - split into Buhi'non Bikol [ubl]; Libon Bikol [lbl]; Miraya Bikol [rbl]; West Albay Bikol [fbl]
bii — Bisu - split into Bisu [bzi] (new identifier) and Laomian [lwm]
bjq — Southern Betsimisaraka Malagasy (later granted another code which we ourselves retired)
bkb — Finallig
bke — Bengkulu
blu — Hmong Njua
bnh — Banawá
boc — Bakung Kenyah
bqe — Navarro-Labourdin Basque
bsd — Sarawak Bisaya
bsz — Souletin Basque
btb — Beti (Cameroon)
bvs — Belgian Sign Language
bwv — Bahau River Kenyah
bxt — Buxinhua
byu — Buyang
cbm — Yepocapa Southwestern Cakchiquel
ccx — Northern Zhuang
ccy — Southern Zhuang
chs — Chumash - Extinct
cit — Chittagonian
cjr — Chorotega - Extinct
cka — Khumi Awa Chin
ckc — Northern Cakchiquel
ckd — South Central Cakchiquel
cke — Eastern Cakchiquel
ckf — Southern Cakchiquel
cki — Santa María De Jesús Cakchiquel
ckj — Santo Domingo Xenacoj Cakchiquel
ckk — Acatenango Southwestern Cakchiquel
ckw — Western Cakchiquel
cmk — Chimakum - Extinct
cnm — Ixtatán Chuj
cru — Carútana - Extinct
cti — Tila Chol
cun — Cunén Quiché
daf — Dan
dap — Nisi (India)
dat — Darang Deng
dha — Dhanwar (India)
dkl — Kolum So Dogon
drh — Darkhat
drw — Darwazi
dyk — Land Dayak - actually a family
elp — Elpaputih - "Lack of information may have cause Elpaputih to be considered different from [amq] (Amahai) and [plh] (Paulohi) "
eml — Emiliano-Romagnolo - split into Emilian [egl] and Romagnol [rgn]
eni — Enim - merge into [pse] Central Malay
eur — Europanto - constructed - hoax / joke
fiz — Izere
flm — Falam Chin
fri — Western Frisian
gav — Gabutamon - merge into Domung [dev]
gen — Geman Deng - merge into Miju-Mishmi [mxj] as duplicate
ggh — Garreh-Ajuran - split between Borana [gax] and Somali [som] (!)
ggm — Gugu Mini - extinct - nonexistent
gmo — Gamo-Gofa-Dawro - split into three languages: Gamo [gmv], Gofa [gof], and Dawro [dwr]
gsc — Gascon - merge into Occitan (post 1500) [oci]
hsf — Southeastern Huastec
hva — San Luís Potosí Huastec
itu — Itutang
ixi — Nebaj Ixil
ixj — Chajul Ixil
jai — Western Jacalteco - merge with Popti' [jac]
jap — Jaruára - merge into [jaa] Jamamadí
jar — Jarawa (Nigeria) - split into Gwak [jgk] and Bankal [jjr]
kds — Lahu Shi
knh — Kayan River Kenyah
kob — Kohoroxitari
krg — North Korowai
krq — Krui
kxg — Katingan
leg — Lengua
lmm — Lamam
lms — Limousin
lmt — Lematang
lnc — Languedocien
lnt — Lintang
mbg — Northern Nambikuára
mdo — Southwest Gbaya
mhh — Maskoy Pidgin
mhv — Arakanese
miv — Mimi
mja — Mahei
mld — Malakhel
mly — Malay (- language) - actually a family
mms — Southern Mam
mob — Moinba
mof — Mohegan-Montauk-Narragansett - extinct
mol — Moldavian
mpf — Tajumulco Mam
mqd — Madang
mst — Cataelano Mandaya
mtz — Tacanec
muw — Mundari
mvc — Central Mam
mvj — Todos Santos Cuchumatán Mam
myq — Forest Maninka
myt — Sangab Mandaya
mzf — Aiku
nbf — Naxi
nfg — Nyeng
nfk — Shakara
nhj — Tlalitzlipa Nahuatl
nhs — Southeastern Puebla Nahuatl
nky — Khiamniungan Naga
nlr — Ngarla
nxj — Nyadu
occ — Occidental - constructed - merge into Interlingue [ile] as Duplicate
ogn — Ogan - merge into [pse] Central Malay
ope — Old Persian - ancient - merge into Old Persian (ca. 600-400 B.C.) [peo] as duplicate
ork — Orokaiva - split into Orokaiva [okv] (new identifier), Aeka [aez] and Hunjara-Kaina Ke [hkk]
paj — Ipeka-Tapuia
pbz — Palu
pec — Southern Pesisir
pen — Penesak
pgy — Pongyong
plm — Palembang
poa — Eastern Pokomam
pob — Western Pokomchí
poj — Lower Pokomo
pou — Southern Pokomam
ppv — Papavô
prv — Provençal
pun — Pubian
puz — Purum Naga
quj — Joyabaj Quiché
qut — West Central Quiché
quu — Eastern Quiché
qxi — San Andrés Quiché
rae — Ranau
rjb — Rajbanshi
rmr — Caló
rws — Rawas
sap — Sanapaná
sdd — Semendo
sdi — Sindang Kelingi
sgl — Sanglechi-Ishkashimi
sic — Malinguat
skl — Selako
slb — Kahumamahon Saluan
srj — Serawai
stc — Santa Cruz
suf — Tarpia
suh — Suba
sul — Surigaonon
sum — Sumo-Mayangna
suu — Sungkai
szk — Sizaki
tkk — Takpa
tle — Southern Marakwet
tlz — Toala'
tmx — Tomyang
tnf — Tangshewi
tnj — Tanjong
tot — Patla-Chicontla Totonac
ttx — Tutong 1
tzb — Bachajón Tzeltal
tzc — Chamula Tzotzil
tze — Chenalhó Tzotzil
tzs — San Andrés Larrainzar Tzotzil
tzt — Western Tzutujil
tzu — Huixtán Tzotzil
tzz — Zinacantán Tzotzil
ubm — Upper Baram Kenyah
vky — Kayu Agung
vlr — Vatrata
vmo — Muko-Muko
wgw — Wagawaga
wit — Wintu
wre — Ware
xah — Kahayan
xkm — Mahakam Kenyah
xmi — Miarrã
xsk — Sakan - ancient
xst — Silt'e
xuf — Kunfal
yib — Yinglish - merged into English
yio — Dayao Yi
ymj — Muji Yi
ypl — Pula Yi
ypw — Puwa Yi
yus — Chan Santa Cruz Maya - merge with Yucateco [yua]
yuu — Yugh - Yugh [yuu] is a duplicate of Yug [yug]
ywm — Wumeng Yi - merge into Wusa Yi [ywu], renamed Wumeng Nasu (cf. 2007-038)
yym — Yuanjiang-Mojiang Yi - split into Southern Nisu [nsd] and Southwestern Nisu [nsv]
ztc — Lachirioag Zapotec - merge into Yatee Zapotec [zty]
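
The split and merge notes in the list above amount to a mapping from each retired code to the code or codes that replaced it. A minimal Python sketch of that idea follows, using only a handful of entries copied from the list. The names `RETIRED` and `successors` are hypothetical and are not part of Module:languages or any other Wiktionary tooling.

```python
# Hypothetical illustration only: a few retired ISO 639-3 codes from the
# list above, mapped to the codes the list says they were split or merged
# into. An empty tuple means the code was retired as nonexistent.
RETIRED = {
    "nln": ("azd", "azn"),  # Durango Nahuatl, split
    "noo": ("dtd", "nuk"),  # Nootka, split
    "acc": ("acr",),        # Cubulco Achí, merged
    "auv": ("oci",),        # Auvergnat, merged into Occitan
    "gsc": ("oci",),        # Gascon, merged into Occitan
    "aay": (),              # Aariya, retired as nonexistent
}


def successors(code):
    """Return the successor codes for a retired code.

    Codes that are not in the sample table are returned unchanged,
    wrapped in a one-element tuple.
    """
    return RETIRED.get(code, (code,))


if __name__ == "__main__":
    for code in ("nln", "auv", "aay", "fy"):
        print(code, "->", successors(code))
```

A split cannot be resolved mechanically (an entry tagged nln could belong to either azd or azn), so a table like this could at most flag entries for manual review.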

Retired codes which have been discussed since February 2014

Please see Wiktionary:Language treatment/Discussions#Various_code_retirements.2C_part_one (Wiktionary:Beer parlour/2014/February#Codes_the_ISO_has_split_or_merged_.28first_batch.29) and Wiktionary:Language treatment/Discussions#Various_code_retirements.2C_part_two (Wiktionary:Beer parlour/2014/March#Codes_the_ISO_has_split_or_merged_.28second_batch.29).

Retired codes which are still used on Wiktionary

Some codes which were retired from the ISO but which are still used on Wiktionary. (This list is not necessarily comprehensive.) Some codes in the list have been discussed, and these have been intentionally retained: sh "Serbo-Croatian", gio "Gelao", kzh "Dongolawi" / "Kenuzi-Dongola", mnt "Maykulan". Meanwhile, these have not yet been discussed.

List of ISO 639 codes absent from Wiktionary

Most of the 7865 codes present in ISO 639 are present on Wiktionary; most of those which are not are recorded on WT:LT. The only ones which have slipped between those two cracks are these, which should be investigated and discussed in the coming weeks. In many cases, the exclusion is likely nothing more than an oversight; in some cases, it's clearly because a naming conflict prevented importation of the codes back when Wiktionary bot-imported ISO 639 en masse (something we can now solve with disambiguators):

cek — Eastern Khumi Chin - a dialect of cnk (Khumi Chin)
dda — Dadi Dadi
dgw — Daungwurrung
dja — Djadjawurrung
deq — Dendi (Central African Republic) - presumably failed to be included because of the naming conflict with ddn — Dendi (Benin)
dmd — Madhi Madhi (Muthimuthi)
dth — Adithinngithigh - compare rrt, which is said to be a different language
dty — Dotyali
gku — ǂUngkue
gll — Garlali
gpe — Ghanaian Pidgin English - probably to be combined with other African Pidgin English (see RFM)
gwm — Awngthim
gmz — Mgbo
hna — Mina (Cameroon) - presumably failed to be included because of the naming conflict with myi Mina (India), which, however, is spurious
~~ihw — Bidhawal - a dialect of/with unn~~
jan — Jandai
jbi — Badjiri - possibly not even Karnic; cf my notes about ekc above and on User:-sche/retired codes
jbk (Barikewa) and jmw (Mouwase) — varieties of {{mgx}} Omati/Mini, said to be quite divergent from each other: but we should either have mgx or have jbk+jmw, not all three
jbw — Yawijibaya
jgk — Gwak
jjr — Bankal
jms — Mashi (Nigeria)
jog — Jogi
jui — Ngadjuri
kbn — Kare (Central African Republic)
kmf — Kare (Papua New Guinea)
kol — Kol (Papua New Guinea)
myi — Mina (India) (see hna)
nmx — Nama (Papua New Guinea)
npg — Ponyo-Gongwang Naga
nqy — Akyaung Ari Naga
nsf — Northwestern Nisu
ntx — Tangkhul Naga (Myanmar)
nwg — Ngayawung
nxk — Koki Naga
oke — Okpe (Southwestern Edo)
okx — Okpe (Northwestern Edo)
olk — Olkol
orc — Orma
pnl — Paleni
ptq — Pattapu
sfe — Eastern Subanen
sgj — Surgujia - Suraji, Surguja, Surgujia-Chhattisgarhi, Surjugia
sim — Mende (Papua New Guinea)
sng — Sanga (Democratic Republic of Congo)
sox — Swo
spb — Sepa (Indonesia)
tcl — Taman (Myanmar) - (extinct)
tgj — Tagin
tgz — Tagalaka - (extinct)
tjl — Tai Laing
tmn — Taman (Indonesia)
tnz — Tonga (Thailand)
tst — Tondi Songway Kiini
xsn — Sanga (Nigeria)
xud — Umiida - (extinct)
xun — Unggaranggu - (extinct)
xyy — Yorta Yorta
yhs — Yan-nhaŋu Sign Language - signed by 10 people, not that distinct from ygs (exclude?)
ykn — Kua-nsi
yku — Kuamasi
ysg — Sonaga
yxy — Yabula Yabula - (extinct)

(This list is complete as of August 2015, before the 2015 change requests were finalized. Notes and misc.) - -sche (discuss) 15:47, 11 August 2015 (UTC)

Codes in the above list which have been added to Module:languages or WT:LT or otherwise dealt with have been stuck. - -sche (discuss) 03:11, 21 August 2016 (UTC)
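
The list in this section is in effect a set difference: the codes in ISO 639, minus the codes present on Wiktionary, minus the codes recorded on WT:LT. A rough Python sketch of that check follows. The function name and the three tiny input sets are made up for illustration; the real inputs would be the full ISO table, Module:languages and WT:LT, none of which are reproduced here.

```python
def unaccounted_codes(iso_codes, wiktionary_codes, lt_codes):
    """Return, sorted, the ISO codes found neither on Wiktionary nor on WT:LT."""
    return sorted(set(iso_codes) - set(wiktionary_codes) - set(lt_codes))


if __name__ == "__main__":
    # Toy data, assumed for the example only.
    iso = {"cek", "cnk", "dda", "fr", "ihw"}
    wiktionary = {"cnk", "fr"}
    language_treatment = {"ihw"}
    print(unaccounted_codes(iso, wiktionary, language_treatment))
```

Rerunning such a check after each round of ISO change requests would show which codes still need discussion.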

Bidhawal

The ISO added a code for Bidhawal, which we never got around to adding. That seems to be OK; Robert M. W. Dixon says in Australian Languages: Their Nature and Development (2002) that "Bidhawal appears not to constitute a separate language, but rather to be the most eastern dialect of Q, Muk-thang (or Kurnai). The grammatical forms given by Mathews for Bidhawal are almost identical to those for Muk-thang, as are most of the verbs and a good proportion of nouns." - -sche (discuss) 03:02, 21 August 2016 (UTC)

Treatment of reconstructed languages?

We merged Proto-Finno-Ugric and Proto-Finno-Permic into Proto-Uralic, and Proto-Baltic into Proto-Balto-Slavic. The original languages remain as etymology codes. Should this be mentioned here? —CodeCat 18:48, 21 August 2015 (UTC)

Sure. Maybe in a separate table, though? Since those aren't cases where we deprecated, split, or broadened an ISO code, but rather cases where we assigned a code of our own devising and then went "wait, on second thought, nah". - -sche (discuss) 19:10, 21 August 2015 (UTC)

Akan and its subdivisions

As for Akan, we can currently find that both the macrolanguage and its subdivisions are treated as languages, though Category:Fanti language and Category:Twi language were merged previously. It seems that we have to modify the description. How's that? --Eryk Kij (talk) 22:53, 26 May 2016 (UTC)

Like so; thanks for pointing out that this page still needed to be updated. - -sche (discuss) 23:21, 26 May 2016 (UTC)

ISO code changes 2018

Some codes have been merged or retired following the ISO's 2018 code changes; these changes are not necessarily recorded on WT:LT because the codes in question were not just merged or retired by Wiktionary but by the ISO. See Wiktionary:Beer parlour/2019/February#2018_ISO_code_changes for a list. - -sche (discuss) 00:07, 24 February 2019 (UTC)

Scope of the page

I think we need to be more specific in terms of what we mean by language treatment. It should only apply to how languages are treated for entry making, but not anywhere else. For example, we allow many etymology-only languages in etymologies. See Wiktionary talk:About Chinese#Language treatment: Only the macrolanguage is treated as a language? — justin(r)leung _{{ (t...) | c=› }} 04:52, 9 June 2020 (UTC)

Oh my~ got to extirpate the remnants of truth from Wiktionary! This is just about pretending Chinese is a language when we know it's a macrolanguage. Just treat Chinese like any other macrolanguage group. So sad. --Geographyinitiative (talk) 05:05, 9 June 2020 (UTC)

@Geographyinitiative: Many macrolanguages are treated the same way Chinese is. Have you even read the page? — justin(r)leung _{{ (t...) | c=› }} 05:21, 9 June 2020 (UTC)

If I may inquire, what other macrolanguage group is treated like Chinese is treated on Wiktionary? If you can give me a good answer on this, I could be much more convinced that the current system for covering Chinese languages on Wiktionary is not a disaster. --Geographyinitiative (talk) 05:24, 9 June 2020 (UTC)

@Geographyinitiative: Zhuang is one among many. Please see the main page - any macrolanguage in the table that is marked with "Only the macrolanguage is treated as a language" would be the same situation (more or less). — justin(r)leung _{{ (t...) | c=› }} 05:35, 9 June 2020 (UTC)

@Justinrleung: A rhetorical question: How many times should the sad geography troll be spoiled with responses so that he stops treating Wiktionary and its editors as a complete disaster? --Anatoli T. (обсудить/вклад) 07:20, 9 June 2020 (UTC)

If I may ask, what are the different Zhuang languages? Are there any other macrolanguage groups not associated with Chinese characters or influenced by Chinese politics that are not split up by language? I think every language should have its own header on Wiktionary, don't you? Atitarev, please don't hate me man! I am bringing a perspective that represents the opinions of many others and I am trying to make honest inquiries about really important things. There were no Wade Giles or Tongyong Pinyin derived geo terms before I came here, and I helped add an important perspective which was being neglected. I am a 'troll' because I bring an outsider perspective, but I am not a troll because I am actively working and negotiating to make the dictionary better with tangible results. Geographyinitiative (talk) 22:43, 20 June 2020 (UTC)

You don't bring anything new. All valid forms are welcome and nobody blocked any language or any script or dialect or transliteration scheme. Yes, that includes Wade-Giles, Tongyong Pinyin, Min Nan in Chinese characters and Min Nan in POJ. Your conspiracy theories have no grounds at all. Bring away your perspective but don't poison people's minds about the achievements of this site. You don't raise any awareness, everyone is aware of what's out there. You just don't want to see it. You're slinging dirt around, then apologise or start praising people, which I find hypocritical. You talk a lot about your own achievements but nobody does it here, this is called narcissism. If there is not enough coverage for anything, then there were not enough contributors. Languages are somewhat like currencies. If the value of the currency of a small third-world country is low, nobody is interested in it but people of that country have to use it. Even if you do add Wade-Giles, Tongyong Pinyin, you pose it as an opposition of Mandarin and Hanyu Pinyin domination, which you blame this site for, not accepting the reality but it's still someone's fault, isn't it? And you keep blaming someone and no-one in particular for that. Everything is doable and achievable. You want to make the distinction between Min Nan and Mandarin, just do it within the existing infrastructure. Nobody stops you from defining specific senses, usage examples, etc. You want to add alternative English spellings, varieties of Chinese, go ahead, just do it in a positive way. Stop blaming everyone or the site. You just turn away people from your cause. All the work is welcome, if it's not breaking agreed conventions or rules. In short, add your Wade-Giles forms, POJ, Min Nan, whatever but start making sense, stop attacking pinyin, Mandarin, this site, etc.

The Zhuang situation is a good example of a macrolanguage but it's harder to demonstrate at Wiktionary as the Zhuang coverage is very low at Wiktionary. The unified approach for languages other than Chinese is better demonstrated by Serbo-Croatian, which combines two to four different standards, depending on how you count - Croatian, Serbian, Bosnian and Montenegrin, two scripts - Cyrillic and Roman (Latin), two major dialects - Ekavian and Ijekavian (+Kajkavian). I don't want to cause more trolling from you but Serbo-Croatian "unification" had much stronger opposition and hate. You can imagine the passions after the Yugoslav war where language identity was a reason to be shot at or imprisoned. Nevertheless, at Wiktionary, the scientific and technical reasoning prevailed over hate. Don't imagine for a second that Chinese varieties and Serbo-Croatian standards and dialects are comparable. No way. They are not. Chinese varieties are mostly not mutually comprehensible. However, the rationale for the unified approach was presented and it won. You won't achieve anything by whinging and trolling negative messages. Yes, I consider your mentioning that this site may be a complete disaster or similar at every opportunity to be trolling. --Anatoli T. (обсудить/вклад) 23:37, 20 June 2020 (UTC)

Zhuang lects. (Whether these pronunciations actually belong on the raemx entry is a different matter) —Suzukaze-c (talk) 23:39, 20 June 2020 (UTC)Reply

Let me mull it over a little more. However, I think that it would be wildly difficult to reach the conclusion "the Chinese macrolanguage header, including all language in modern China's borders from Cangjie to to-day, should be portrayed as equivalent to the Danish, Norwegian, Sweedish, English, German, French etc headers, implying they are all equally "languages"." I would that in my worst case scenario some kind of further disclaimer should be added automatically to every page that has the "Chinese" header so we know it means "any Chinese characters used in China since the Shang dynasty til today, including numerous unintelligible dialects with independent Wikipedia versions". What an expansive header it is! --Geographyinitiative (talk) 00:20, 22 June 2020 (UTC)Reply

Wiktionary doesn't have to apologise on every page for how it works. The votes on the Chinese and Serbo-Croatian unifications defined the dictionary policies, which are or should be mentioned on the appropriate About pages. If that's too hard to accept, which is understandable, you have two options: 1. get a new vote and win it or 2. leave, which was the case with some unhappy Croatian and Serbian editors. In my memory we have had no precedent of unhappy Chinese editors wanting to reverse the change and leaving; you may be the first. If you decide to stay, my personal advice is to stop complaining at every opportunity in talk pages or edit summaries, bite the bullet and contribute in your favourite area, including enhancing dialectal coverage, and strive to make it work, so that all promises in the vote to adequately cover all Chinese varieties are kept. --Anatoli T. (обсудить/вклад) 00:53, 22 June 2020 (UTC)

Stale but unresolved discussions of languages to add or remove

Because they are so stale, I am unilaterally moving these off the RFM page because that page has grown too massive (800,000 bytes) to be usable; however, because they are unresolved, I don't want to hide them away on WT:LTD... so here they go... - -sche (discuss) 06:52, 28 December 2023 (UTC)Reply

RFM discussion: July 2016–October 2020

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.

Even more languages without ISO codes, part 6

This next batch is of languages from lists other than Ethnologue and LinguistList. As before, I've tried to vet them all beforehand, but I will have doubtlessly made some mistakes. NB if you want to find more: I've avoided dealing with most of the Loloish languages, because all the literature seems to be in Chinese. —Μετάknowledge (discuss/deeds) 04:54, 6 July 2016 (UTC)

Alingpo language (tbq-alp) — perhaps should be named Yiqing
Alo Teqel language (map-alt)
Antequera Zapotec (omq-anz) — hard to say how different it is, but it's extinct, so a finite lexicon
Auteco language (azc-aut)
I have yet to find content in this lect to judge how different it is from our other Nahuatls. - -sche (discuss) 04:32, 24 May 2017 (UTC)Reply
~~Aveteian language (map-ave)~~
Wikipedia (and Lyle Campbell, Anna Belew, Cataloguing the World's Endangered Languages, 2018) says this is dix "Dixon Reef". Is it not? (Or if it is, should the name associated with that code be changed?) - -sche (discuss) 20:10, 1 August 2020 (UTC)Reply
Bantang language (tbq-ban)
Chashan language (tbq-cha)
Damu language (sit-dam)
Daylami language (ira-day)
I've asked some of our editors of Iranian languages for input. - -sche (discuss) 00:17, 2 August 2020 (UTC)Reply
Based on feedback there, not added at this time, although I note that content in the language seems to exist, which suggests we would eventually need to figure out a header to include it under. - -sche (discuss) 20:44, 2 August 2020 (UTC)Reply
Jo language (crp-joo)
Kasabe language (alv-kas)
Kasong language (aav-kas) — questionable whether this is a separate language
Komi-Yazva language (urj-kya)
Perhaps better prm-kya? Also while I am not convinced treating the Komi varieties as separate languages altogether is the best solution, as long as we do so, we might moreover need Old Komi. --Tropylium (talk) 18:44, 11 July 2016 (UTC)Reply
Kurbet language (crp-kur)

Australian languages

Bugurnidja language (aus-bug)
Dyirringany language (aus-dyi)
Gulidjan language (aus-gul)
Gunindiri language (aus-gun)
Kok Thawa language (aus-kth)
Kureinji language (aus-kur)
~~Mirning language (aus-mir)~~
has an ISO code (to be added), see BP - -sche (discuss) 04:17, 14 October 2020 (UTC)Reply
Ngaro language (aus-ngr)
Ngaygungu language (aus-ngg)
~~Ngumbarl language (aus-ngu)~~
has an ISO code (to be added), see BP - -sche (discuss) 04:17, 14 October 2020 (UTC)Reply
Wik Ompom language (aus-wom)
Wik Paach language (aus-wpa)
Yiman language (aus-yim)

Tasmanian and other

Northeastern Tasmanian:

~~Northeastern, Pyemmairre language (aus-pye)~~ Done
alt names/varieties: Plangermaireener, Plangamerina, Cape Portland, Ben Lomond, Pipers River
~~North Midlands, Tyerrernotepanner language (aus-tye)~~ — Bowern considers this a dialect; perhaps we should just trust her
now has an ISO code which should be added instead, see BP shortly - -sche (discuss) 04:27, 14 October 2020 (UTC)Reply
Lhotsky/Blackhouse Tasmanian language (aus-lbt) — the worst name in Bowern's set!
I'm not sure... the very language is "reconstructed" by Bowern on the assumption that three wordlists (of which only two make it into the name) attest the same language, although apparently none of the three bothered to name the language. The chance of someone "would run across [a word in] it and want to know what it means" seems nonexistent. If we wanted to host the wordlists, we could do that in an appendix or on Wikisource. - -sche (discuss) 16:09, 9 August 2016 (UTC)Reply

Bowern's methods are scientific; but I would feel better if more than one scholar was saying there was one language in this set of wordlists, the way that for e.g. Port Sorrell, Dixon & Crowley and Glottolog agree that there is a unit/lect there. - -sche (discuss) 16:55, 4 June 2017 (UTC)Reply

and what of "Norman Tasmanian"? - -sche (discuss)

Here is another language we might need a code for: Ma(') Pnaan (poz-map?), also known by the exonyms Punan Malinau and Punan Segah, a language of Borneo / East Kalimantan, summarized by Antonia Soriente here and elsewhere. Compare the other things listed at Punan language. - -sche (discuss) 05:21, 29 August 2016 (UTC)Reply

RFM discussion: August 2016

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.

Some more missing American languages

Here are a few more North American languages for which we could add codes:

~~Akokisa (nai-ako).~~ WP says it is attested certainly in two words in Spanish records (Yegsa "Spaniard[s]", which Swanton suggests is similar to Atakapa yik "trade" + ica[k] "people"; and the female name Quiselpoo), and possibly in more words in a wordlist by Jean Béranger in 1721 (if the wordlist is not some other language).

Labrador Inuit Pidgin French (crp-lab)

Labrador Inuit Pidgin French, less often called Belle-Isle Pidgin, was spoken in Labrador from the late 1600s (probably since before the 1660s, but first written down in 1694) until at least the mid 1760s, based on Inuktitut, French, Basque, Montagnais, and possibly Spanish and Breton. Louis-Jacques Dorais, An Inuit Pidgin around Belle-isle Strait (1996; with reference to "Clermont - Martijn 1980; Dorais 1980; Bakker 1988"), covers the records:

- Louis Jolliet recorded words at Baie Saint-Louis in 1694, including the 'greeting' thou tcharacou, saying the latter word is "peace", which Dorais says is "corroborated by two other sources, from 1717 (characoua [...]) and 1720 (characo [...]). But a text from 1743 (Privy Council 1927: 3284), written by the French merchant Louis Fornel, gives to characo the meaning 'war'." Thou is probably from tu. The other word could be Basque txarrakoa "bad", thus "are you bad?".
- Le Cour in 1742 records some more words: bons camaras "good comrades", tous camaras "all comrades", capitaine "captain", kellanoré (which Dorais says "seems to be Le Cour's [or the pidgin's?] rendering of Inuktitut kinaunali 'but who is he?'"), the personal name Amargo (a rendering of Amaqqut "Wolves"), rénombek "bead" (probably a loanword), maumek "file" (probably a loanword), monkoumek "knife" (probably a loanword from Montagnais mukuma:n, as spelled in Marguerite Ellen MacKenzie, Towards a Dialectology of Cree-Montagnais-Naskapi).
- Louis Fornel in 1743 recorded more: tout camara "all comrades", troquo balena "let us trade whale" (from French troquons!), non characo "no war" (sic, per Fornel).
- Jens Haven wrote other words in 1764-5: makagua "peace" (perhaps from Basque bake[a] "peace" plus a suffix -koa), kutta (French couteau "knife"), memek "to drink" (from Inuktitut imiq "drinking water").
- Few references discuss the lect and it is difficult to judge whether it is really a language or just something like broken French or like Spanglish (which I think we exclude), but the fact that the Inuit apparently changed the meaning and even part of speech of words in their own language when speaking pidgin suggests it is more on the pidgin-language side of that continuum than the code-switching side.

Algonquian–Basque pidgin (crp-abp). Wikipedia has a sample. The Atlas of Languages of Intercultural Communication, citing Bakker, says it was spoken from at least 1580 (and perhaps as early as 1530s) through 1635, and "only a few phrases and less than 30 words attributable to Basque were written down" (though apparently more words, attributable to other sources, were also recorded).
Guachichil (Cuauchichil, Quauhchichitl, Chichimeca) (nai-gch or, if Guachí is added as sai-gch, perhaps nai-gcl to prevent the two similarly-named lects from being mixed up by only typoing the initial n vs s), apparently sparsely attested.
Concho (nai-cnc). The Handbook of North American Indians, volume 10, says "three words of Concho [...] were recorded in 1581 [and] look like they may be [...] Uto-Aztecan".
Jumano (Humano, Jumana, Xumana, Chouman, Zumana, Zuma, Suma, and Yuma) (nai-jmn). The Handbook says "It has been established that the Jumano and Suma spoke the same language. Three words have been recorded" of it.

and from South America:

Peba / Peva (sai-peb), said by Erben to more properly be called Nijamvo, Nixamvo. Spoken in "the department of Loreto" in Peru. Attested in wordlists by Erben and Castelnau, which Loukotka provides, and which disagree with each other substantially: munyo (Erben) / money (Castelnau) "canoe, small boat"; nero (E) / yuna (C) "demon"; nebi (E) / nemey (C) "jaguar"; teki (E) / tomen-lay (C) "one"; manaxo (E) / nomoira (C) "two"; etc. I would even consider that one might not be the same language as the other... what's with these languages that survive in disparate wordlists? lol.
possibly Saynáwa: fr.Wikt grants a code to this variety of Yaminawá language, described here (see also [1]).

- -sche (discuss) 04:04, 16 August 2016 (UTC)Reply

Support all except possibly Akokisa. I think it's a dialect of Atakapa, and that the wordlist is very likely not being linked correctly. That said, it's so few words, that there's no real reason not to accept it as a separate language, just to be conservative about it. —Μετάknowledge (discuss/deeds) 04:08, 16 August 2016 (UTC)

Good point about Akokisa. (I am reminded that you had mentioned its dialectness earlier; sorry I forgot!) The wordlist, labelled only with a tribal name per WP, is possibly plain Atakapa, but Yegsa is supposedly recorded as specifically Akokisa; OTOH that doesn't rule out that Akokisa is a dialect. Indeed, M. Mithun's Languages of Native North America treats as dialects Akokisa, Eastern ("the most divergent, [...] known from a list of 287 entries") and Western ("the best documented. Gatschet recorded around 2000 words and sentences, as well as texts [...] Swanton recorded a few Western forms", all published in 1932 in a dictionary). I suppose the benefit to treating it as a dialect would be that we could context-label Yegsa and Quiselpoo as {{lb|aqp|Akokisa}} and then Béranger's forms as {{lb|aqp|possibly|Akokisa}} without needing to agonize over which header to put them under. - -sche (discuss) 15:31, 16 August 2016 (UTC)Reply

RFM discussion: April 2017–October 2020

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.

More unattested languages

The following languages have ISO codes, but those codes should be removed, as there is no linguistic material that can be added to Wiktionary. This list is taken from Wikipedia's list of unattested languages, but I have excluded languages which are not definitively extinct (and thus which may have material become available). If there was any reliable source I could find corroborating the WP article's claim of lack of attestation, it is given after the language. —Μετάknowledge (discuss/deeds) 04:15, 4 April 2017 (UTC)

Aguano language [aga]
Unclear if it even existed per The Indigenous Languages of South America: A Comprehensive Guide (Campbell and Grondona).

~~Barbacoas language [bpb]~~ (the Wikipedia article has a discussion of the conflation of this unattested language with Pasto, which needs a code; for clarity, I think this [bpb] should be retired and an exceptional code made explicitly for Pasto)
Retired, following the ISO, see Wiktionary:Beer parlour/2020/October#2019-2020_ISO_code_changes. Content, if needed for migration to a Pasto code, was m["bpb"] = { "Barbacoas", "Q2669202", "sai-bar", otherNames = {"Pasto"}, scripts = Latn, } - -sche (discuss) 06:23, 14 October 2020 (UTC)Reply

Dek language [dek]

Giyug language [giy]
AIATSIS has the following to say: "According to Ian Green (2007 p.c.), this language probably died before the 1920's and neighbouring groups in the Daly claim it was the language of Peron Island which was linguistically and perhaps culturally distinctive from the nearby mainland societies. Black & Walsh (1989) say that this may or may not have been a dialect of Wadiginy N31." —Μετάknowledge
The 1992 International Encyclopedia of Linguistics, v. 1, p. 337, says "Giyug: 2 speakers reported in 1981, in the Peron Islands in Anson Bay, southwest of Darwin." The 2003 edition repeats the claim that "2 speakers remain". Wikipedia says it's extinct and unattested, but Glottolog, although having no resources on it, suggests it's not extinct. Might be best to leave it alone for now. - -sche (discuss) 01:13, 6 August 2020 (UTC)Reply

~~Mawa language (Nigeria) [wma]~~ (We call this "Mawa"; if removed, [mcw] Mahwa (Mawa language (Chad)) can be renamed to the evidently more common spelling "Mawa".)
Removed, and mcw renamed. Glottolog had only one reference to support the existence of Mawa, Temple (1922), which does not even include a section under that header. There may be confusion with the section on the "Marawa", but that does not even mention what language those people speak. (Temple also knows very little about linguistics; while skimming through, I found that Margi (a Chadic language) was said to be similar to the languages of South Africa.) —Μετάknowledge (discuss/deeds) 01:39, 6 August 2020 (UTC)

Nagarchal language [nbg]
Appendix I in The Indo-Aryan Languages records this language as being a subdialect of Dhundari [dhd] and the 1901 Indian Census concurs; this is at odds with its description as an unattested Dravidian language, but the geographical specifications seem to match up.

Ngurmbur language [nrx]
AIATSIS says: "Harvey (PMS 5822) treats Ngomburr as a dialect of Umbukarla N43, but in Harvey (ASEDA 802), it is listed as a separate language." Nicholas Evans confirms in The Non-Pama-Nyungan Languages of Northern Australia that it is unattested.

Tremembé language [tme]

Truká language [tka]

Wakoná language [waf]

Wasu language [was]
Unclassified due to its absence of data per The Indigenous Languages of South America: A Comprehensive Guide (Campbell and Grondona).

RFM discussion: May 2017–October 2020

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.

Some spurious languages to merge or remove, 2

remove Adabe [adb]

Geoffrey Hull, director of research for the Instituto Nacional de Linguística in East Timor, notes (in a 2004 Tetum Reference Grammar, page 228) that "the alleged Atauran Papuan language called 'Adabe' is a case of the mistaken identity of Raklungu," a dialect (along with Rahesuk and Resuk) of Wetarese. He notes (in The Languages of East Timor, Some Basic Facts) that only Wetarese is spoken on the island, and Studies in Languages and Cultures of East Timor likewise says "The three Atauran dialects—with the northernmost of which the dialect of nearby Lirar is mutually intelligible—are unquestionably Wetarese, and not dialects of Galoli, as Fox and Wurm suggest for two of them (n. 32). The same authors refer (ibidem) to a supposedly Papuan language of Atauro, the existence of which appears to be entirely illusory." (The error appears to have originated not with Fox and Wurm but with Antonio de Almeida in 1966.) - -sche (discuss) 01:45, 31 May 2017 (UTC)Reply

We could repurpose the code into one for those three Atauran varieties of Malayo-Polynesian Wetarese, Rahesuk, Resuk, and Raklu Un / Raklungu (the last of which Ethnologue does list as an alt name of adb, despite their erroneous family assignment of it), perhaps under the name "Atauran Wetarese" for clarity. - -sche (discuss) 01:52, 31 May 2017 (UTC)Reply

remove Agaria [agi]

Glottolog makes the case that this is spurious. - -sche (discuss) 07:57, 31 May 2017 (UTC)Reply

~~Arma~~

Arma (aoh) is also said to be "a possible but unattested extinct language"; I am trying to see if that means it is entirely unattested, or if there are personal/ethnic/place names, etc. - -sche (discuss) 09:45, 3 June 2017 (UTC)Reply

Removed, see Wiktionary:Beer_parlour/2020/October#2019-2020_ISO_code_changes. - -sche (discuss) 06:18, 14 October 2020 (UTC)Reply

Aghu language

The VU Amsterdam report linked to here seems to indicate that one lect has been given multiple codes, and that "Jair" at least is spurious. Further research wouldn't hurt. —Μετάknowledge (discuss/deeds) 00:24, 3 October 2019 (UTC)

RFM

Splitting Selkup

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.

{{ping|Surjection|Tropylium|Kaarkemhveel}} After a while of deliberation with Kaarkemhveel and two other future Selkup editors, we have come to the conclusion that it's best to split Selkup into two codes: Northern Selkup (sel-nor) and Southern Selkup (sel-sou) [the exact form of the codes is up for debate], which will both be part of the Selkup family (sel).

These two dialect areas are so different that treating them as a single language would be too bothersome. All subdialects are going to be marked with labels, and provided as languages in descendants sections (much like the two Karelian proper varieties are, or the Zyrian dialects).

The two branches are often named as different: Glottolog splits Selkup into "Kety-Central-Southern Selkup" (Southern) and "Taz-Turukhan" (Northern); The Oxford Guide to the Uralic Languages also shows a split between "Northern Selkup" and "Tomsk region Selkup" (p.778). A few more examples of papers that do this include Wurm (1997), Budzisch (2015), Vorobeva et al. (2017)...

There is precedent for treating these as different languages: ELP splits the family into three full-fledged languages ([2] [3] [4]). On the pages there is the following reasoning for this split: "The three main varieties of Selkup have traditionally been counted as dialects of a single language; their differences are, however, comparable to those between, for instance, Ket, Yug, and Pumpokol".

The Russian institute RAN also splits Selkup into Northern and Southern, as two full-fledged languages.

So, does anyone have an issue with this split? Thadh (talk) 11:04, 19 June 2023 (UTC)Reply

No opposition, as there are clear differences, both lexical and cultural. Tollef Salemann (talk) 11:14, 19 June 2023 (UTC)

The Wikipedia article also mentions a Central Selkup. What are you doing with that one? Does it belong to Southern Selkup? —Mahāgaja · talk 14:03, 19 June 2023 (UTC)Reply

Yes, that one will then be handled as Southern Selkup, just like it is by the above sources. Thadh (talk) 14:12, 19 June 2023 (UTC)Reply

No opposition on this much, Northern Selkup is by now clearly distinct from non-Northern and has its own literary standard. Bridging historical data exists but would be probably better handled in Proto-Selkup entries anyway, about all of it is field records and not direct literary use by the speaker community.

Depending on how work on non-Northern Selkup develops, further division could be eventually meaningful too. The other recent handbook, Routledge's The Uralic Languages, Second Edition discusses things from a primarily tripartite Southern / Central / Northern perspective and notes that, though the sharpest modern boundary is Central vs. Northern, the most taxonomically significant difference is Southern vs. {Central, Northern}. I believe currently Southern is better-documented than Central, but the latter is what still has some attempts at literary usage and revival. --Tropylium (talk) 14:48, 19 June 2023 (UTC)Reply

Done. Cleanup is ongoing. Thadh (talk) 20:01, 28 June 2023 (UTC)Reply

RFM discussion: December 2023–January 2024

The following discussion has been moved from Wiktionary:Requests for moves, mergers and splits (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.

~~Rusyn~~

Pinging (possibly) interested users, as always, feel free to ping more: @Vininn126, Sławobóg, Chernorizets, Mahagaja, -sche, Benwing2, Atitarev.

I propose we split Carpathian and Pannonian Rusyn into two codes (rue and rsk respectively, in line with their ISO 639-3 codes), and then set Old Slovak as the ancestor of Pannonian Rusyn. I have made a list of typical Slavic developments on User:Thadh/Rusyn and given both a Pannonian Rusyn form (from Ramač 1995, Српско-русински речник) and a Carpathian Rusyn form (from Kercha 2012, Словник русько-русинськый). I think this proves beyond much of a doubt that Pannonian Rusyn belongs to the West Slavic group, and specifically to the Slovak dialects, while Carpathian Rusyn is part of the East Slavic group. This is also a view that is supported by many scholars. Thadh (talk) 13:28, 14 December 2023 (UTC)

Support Sławobóg (talk) 13:36, 14 December 2023 (UTC)Reply

@Thadh would it be possible to add an Eastern Slovak column to your tables (presumably the variety of Slovak that Pannonian Rusyn would be closest to) for comparison? I'm not sure how much extra work that would be, but if it's not a huge amount, it would be helpful. Chernorizets (talk) 13:44, 14 December 2023 (UTC)Reply

Unfortunately, I don't have an Eastern Slovak dictionary at hand, but if anyone does, they're encouraged to add the forms! Thadh (talk) 13:59, 14 December 2023 (UTC)Reply

Strong support. The reflexes are clear, there are language codes, and it's the right moment to do this as Rusyn isn't highly developed yet, so splitting will be easier. Vininn126 (talk) 13:49, 14 December 2023 (UTC)Reply

Thanks for pinging me. I don't have enough background in Rusyn to wager a strong opinion here. Benwing2 (talk) 21:39, 14 December 2023 (UTC)Reply

Support @Thadh: Does Pannonian Rusyn completely lack native pleophony (polnoglasie), or are the cases it has all late borrowings? E.g. Pannonian/Carpathian брег (breh) vs бе́рег (béreh) and злато (zlato) vs зо́лото (zóloto). If yes, then it looks like it can't belong to the East Slavic languages. I support tentatively, but I don't have much knowledge of Pannonian. Anatoli T. (обсудить/вклад) 22:31, 14 December 2023 (UTC)

@Atitarev: Yes (which is kind of the point). Similarly reflexes of PS palatals, strong yers, and other things. Everything points to Pannonian being West Slavic and Carpathian being East Slavic. Thadh (talk) 22:34, 14 December 2023 (UTC)Reply

@Thadh: I see, thanks. I have yet to digest other differences.

Pannonian examples гарло (harlo) and дороги (dorohy) kind of contradict the overall differences, no? Anatoli T. (обсудить/вклад) 23:26, 14 December 2023 (UTC)

@Atitarev: The language has been influenced quite intensively by Czech, Ruthenian (> Ukrainian/Rusyn), Hungarian and Serbo-Croatian for the last two hundred years, so some inconsistencies due to borrowings are expected. For гарло, this might be a language-specific innovation (I can imagine grdl- and -rdl- overall not being a very easy cluster, and for this specific example Slovincian also does some simplification). дороги is undoubtedly a borrowing though. Thadh (talk) 23:36, 14 December 2023 (UTC)

@Thadh: I think it's worth addressing possible loanwords for your case (e.g. дороги, etc.). Compare English, which has more Romance words than native words, and Korean, which has more Sinitic words than native ones, yet this doesn't change which language family they belong to. Those languages are well described, though; for Pannonian Rusyn, we need to make it explicit, IMO, in case someone questions it. Anatoli T. (обсудить/вклад) 23:49, 14 December 2023 (UTC)

@Atitarev I think the words chosen are unlikely to have been borrowed. Or at least there are enough that are unlikely to have been borrowed that it's even more unlikely that we chose only borrowed words. Vininn126 (talk) 09:49, 15 December 2023 (UTC)Reply

It's been a month and there's been overall support for this. I'm going to mark this thread as closed and lang codes for Carpathian and Pannonian should be assigned. Vininn126 (talk) 12:49, 14 January 2024 (UTC)Reply
