Wiktionary:Requests for moves, mergers and splits
This page is designed to discuss moves (renaming pages), mergers and splits. Its aim is to take the burden away from the Beer Parlour and Requests for Deletion where these issues were previously listed. Please note that uncontroversial page moves to correct typos, missing characters etc. should not be listed here, but moved directly using the move function.
- Appropriate: Renaming categories, templates, Wiktionary pages, appendices, rhymes and occasionally entries. Merging or splitting temp categories, templates, Wiktionary pages, appendices, rhymes.
- Out of scope: Merging entries which are alternative forms or spellings or synonyms such as color/colour or traveled/travelled. Unlike Wikipedia, we don’t redirect in these sorts of situations. Each spelling gets its own page, often employing the templates {{alternative spelling of}} or {{alternative form of}}.
- Tagging pages: To tag a page, you can use the general template {{rfm}}, as well as one of the more specific templates {{move}}, {{merge}} and {{split}}.
Note that discussions for splitting, merging, and renaming languages are often also held here, and should be archived to WT:LTD when closed.
2014
Khanty words with /ɬ/
Requesting a move of a dozen Khanty words:
- вәԓ (wəḷ) → вәӆ (wəł)
- вәԓты (wəḷty) → вәӆты (wəłty)
- нивӑԓмит (niwăḷmit) → нивӑӆмит (niwăłmit)
- нивԓ (niwḷ) → нивӆ (niwł)
- няԓ (nâḷ) → няӆ (nâł)
- няԓмит (nâḷmit) → няӆмит (nâłmit)
- оԓӑӈмит (oḷăňmit) → оӆӑӈмит (ołăňmit)
- пӑԓ (păḷ) → пӑӆ (păł)
- хәԓмит (xəḷmit) → хәӆмит (xəłmit)
- хәԓум (xəḷum) → хәӆум (xəłum)
- ԓапӑт (ḷapăt) → ӆапӑт (łapăt)
- ԓапӑтмит (ḷapătmit) → ӆапӑтмит (łapătmit)
These have /ɬ/, which is however written ӆ and not ԓ (this is instead, I believe, /ɭ/). Quite a few current entries are sourced from a dictionary (Kononova 2002) which uses a rather ԓ-like but regardless clearly el-with-tail glyph. --Tropylium (talk) 13:24, 19 November 2014 (UTC)
- (Listed here in case anyone wants to argue that ԓ for /ɬ/ is actually a competing dialectal standard that should have precedence. --Tropylium (talk))
- I think you are mostly going to talk to yourself in this section. Move, if Tropylium says so. --Vahag (talk) 14:23, 19 November 2014 (UTC)
- I would say just go ahead and move them yourself. Unless there's a chance that other languages will have terms using the original spellings, the redirects that you leave will actually be useful for those who make the same mistake when searching. Given the similarity of the characters, I have a hunch scannos from online books might be a major source of these. Chuck Entz (talk) 14:38, 19 November 2014 (UTC)
- Update: apparently the normative glyph is in fact ԯ (el with descender). However, this has not been widely available in fonts, so ӆ or ԓ have been used as workaround solutions in some materials. (Can anyone reading this actually see the first glyph?) --Tropylium (talk) 09:42, 12 March 2015 (UTC)
- @Tropylium, do these still need to be moved? - -sche (discuss) 22:55, 29 February 2016 (UTC)
- They do, though we never did settle here if we should move them to use ԯ or ӆ. Since the latter is attestable as well, and seems to render better, I would be okay with it (even if we might be setting ourselves up for replacing these again with alternate-spelling soft-redirects some years down the line). --Tropylium (talk) 01:57, 1 March 2016 (UTC)
- I actually think I prefer ԓ; it's used in this dictionary for instance, and like Tropylium said, it renders better. Thadh (talk) 14:35, 10 March 2022 (UTC)
- So, what needs to be moved is:
- вәԓ (wəḷ) → вәԯ (with redirect from вәӆ (wəł))
- вәԓты (wəḷty) → вәԯты (redirect from вәӆты (wəłty))
- нивӑԓмит (niwăḷmit) → нивӑԯмит (нивӑӆмит (niwăłmit))
- нивԓ (niwḷ) → нивԯ (нивӆ (niwł))
- няԓ (nâḷ) → няԯ (няӆ (nâł))
- няԓмит (nâḷmit) → няԯмит (няӆмит (nâłmit))
- оԓӑӈмит (oḷăňmit) → оԯӑӈмит (оӆӑӈмит (ołăňmit))
- пӑԓ (păḷ) → пӑԯ (пӑӆ (păł))
- хәԓмит (xəḷmit) → хәԯмит (хәӆмит (xəłmit))
- хәԓум (xəḷum) → хәԯум (хәӆум (xəłum))
- ԓапӑт (ḷapăt) → ԯапӑт (ӆапӑт (łapăt))
- ԓапӑтмит (ḷapătmit) → ԯапӑтмит (ӆапӑтмит (łapătmit))
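The retitling listed above is mechanical: each ԓ (U+0513, el with hook) in a title becomes ԯ (U+052F, el with descender), with a redirect left at the ӆ (U+04C6, el with tail) spelling. A minimal Python sketch of that mapping follows. The helper name `plan_move` is hypothetical and not part of any Wiktionary tool; the sketch only computes titles and performs no page moves.

```python
# Hypothetical helper for illustration only; it computes titles, it does not move pages.
EL_WITH_HOOK = "\u0513"       # ԓ, the letter used in the current entry titles
EL_WITH_TAIL = "\u04c6"       # ӆ, the workaround spelling (redirect)
EL_WITH_DESCENDER = "\u052f"  # ԯ, the glyph described above as normative


def plan_move(title: str) -> dict:
    """Return the proposed target title and redirect title for one entry."""
    return {
        "from": title,
        "to": title.replace(EL_WITH_HOOK, EL_WITH_DESCENDER),
        "redirect": title.replace(EL_WITH_HOOK, EL_WITH_TAIL),
    }


if __name__ == "__main__":
    # A few of the titles from the list above.
    for title in ["вәԓ", "няԓмит", "ԓапӑт"]:
        plan = plan_move(title)
        print(f'{plan["from"]} -> {plan["to"]} (redirect from {plan["redirect"]})')
```

Titles that contain no ԓ come back unchanged, so the function can be run over a wider list without special-casing.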
2015
West African Pidgin English varieties
Ethnologue has assigned codes to some but not all of the varieties of West African Pidgin English, and we in turn have incorporated some (e.g. pcm) but not all (e.g. not gpe) of those codes. As WP notes, the "contemporary English-based pidgin and creole languages are so similar that they are sometimes grouped together under the name 'West African Pidgin English'" (a name which also denotes their predecessor which developed in the 1700s). WP's examples are illustrative, particularly in that its Ghanaian and Nigerian Pidgin English examples are identical. I propose to merge at least the following three varieties into wes, renaming it "West African Pidgin English":
- Ghanaian Pidgin English (gpe)
- Nigerian Pidgin English (pcm)
- Cameroonian Pidgin English (wes)
We could also discuss whether or not to merge Sierra Leone Krio (kri, which WP notes is often mistaken for English slang due to its similarity to English, but which has a somewhat distinct alphabet), Pichinglis / Fernando Po Creole (fpe), and Liberian Kreyol / Liberian Pidgin English (lir). - -sche (discuss) 21:11, 11 August 2015 (UTC)
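To make the proposal concrete: in data terms the merger amounts to resolving the retired codes to wes before any lookup. The sketch below assumes hypothetical names (`MERGED_CODES`, `CANONICAL_NAMES`, `resolve_code`) that do not correspond to Wiktionary's actual language-data modules, and it covers only the three varieties listed above.

```python
# Illustrative only: these tables are assumptions, not Wiktionary's real language data.
CANONICAL_NAMES = {"wes": "West African Pidgin English"}

# Codes that would be folded into wes under the proposal.
MERGED_CODES = {"gpe": "wes", "pcm": "wes"}


def resolve_code(code: str) -> str:
    """Map a merged code to its target; leave every other code untouched."""
    return MERGED_CODES.get(code, code)


if __name__ == "__main__":
    for code in ["gpe", "pcm", "wes", "kri"]:
        target = resolve_code(code)
        print(code, "->", target, CANONICAL_NAMES.get(target, "(unchanged)"))
```

Codes still under discussion (kri, fpe, lir) are deliberately left out of the table, so they resolve to themselves.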
- The question is a very complex one. Firstly (but of least importance), scholars are divided on which lects have creolised and which have not, but it is generally agreed upon that at least some of the languages you mentioned are not pidgins, which would make the name "West African Pidgin English" somewhat of a misnomer (the more neutral name "Wes-Kos" has been suggested as an alternative, but even linguists haven't fully adopted it). Secondly, all these lects are remarkably similar on a lexical level, but that's unsurprising; after all, they resulted from separate but very similar language contact events, and then probably modified each other (one scholar posits that Krio and Cameroonian Pidgin English relexified each other to some degree after pidginisation). The similarities are also obscured by the fact that there is nothing close to an agreed orthography for most of these, and pronunciation does differ a bit across West Africa. Linguistically, I'd probably merge them all, but practically that may not be the best decision. I know we have entries in pcm, but probably next to nothing for the rest, and if somebody wants to add them, given how each lect is very neatly assigned to a certain West African country, at least it won't be confusing for them to do so. Conclusion: the literature is schizophrenic, the lects mutually intelligible, and the existing situation remarkably unproblematic. Therefore I abstain. —Μετάknowledgediscuss/deeds 21:19, 16 August 2015 (UTC)
Per Wiktionary:Votes/2011-04/Lexical categories, move:
- Category:en:Exonyms -> Category:English exonyms
- and also all the language variations (Category:es:Exonyms -> Category:Spanish exonyms, etc.)
Rationale: This makes these categories nominally consistent with all other categories that describe the words ("Category:English blablabla") rather than their meanings ("Category:en:blablabla"), such as all categories listed in Category:English terms by etymology.
In fact, I believe Category:English exonyms should be a subcategory of Category:English terms by etymology.
It's interesting to note that Category:English terms by etymology was once called Category:en:Etymology before it was moved multiple times. --Daniel Carrero (talk) 23:22, 11 October 2015 (UTC)
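The rename pattern in this nomination ("Category:en:Exonyms -> Category:English exonyms", and likewise for every language) can be sketched as a title transformation. This is a rough illustration under stated assumptions: `LANGUAGE_NAMES` is a tiny hand-written subset rather than Wiktionary's language data, `lexical_title` is a hypothetical helper, and simply lowercasing the first letter of the label would not suit every category.

```python
import re

# Assumed, illustrative subset of language code -> canonical name.
LANGUAGE_NAMES = {"en": "English", "es": "Spanish"}

# Matches topic-style titles such as "Category:en:Exonyms".
TOPIC_TITLE = re.compile(r"^Category:(?P<code>[a-z-]+):(?P<label>.+)$")


def lexical_title(topic_title: str) -> str:
    """Turn 'Category:en:Exonyms' into 'Category:English exonyms'."""
    match = TOPIC_TITLE.match(topic_title)
    if match is None:
        raise ValueError(f"not a topic-style category title: {topic_title!r}")
    language = LANGUAGE_NAMES[match["code"]]
    label = match["label"]
    # Naive assumption: the label only needs its first letter lowercased.
    return f"Category:{language} {label[0].lower()}{label[1:]}"


if __name__ == "__main__":
    for title in ["Category:en:Exonyms", "Category:es:Exonyms"]:
        print(title, "->", lexical_title(title))
```

A bot run would additionally have to recategorize the member entries, which this sketch does not attempt.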
- Being an exonym is not a matter of how a word was created. In fact, terms often don't start off as exonyms, but become exonyms as the languages diverge and evolve. So it's not appropriate to put it under etymology. —CodeCat 00:11, 12 October 2015 (UTC)
Oppose: Exonyms should remain as a category and English exonyms should be a subcategory of it. Purplebackpack89 20:15, 12 October 2015 (UTC)
- I nominated specifically "Category:en:Exonyms -> Category:English exonyms", you mentioned "English exonyms should be […] ", so I don't see how this would work as an oppose vote to my nomination. I don't suppose you wanted the category to remain named "Category:en:Exonyms", right?
- In any event, the format that other umbrella categories use according to Wiktionary:Votes/2011-04/Lexical categories is "Category:Exonyms by language" -> "Category:English exonyms". Like "Category:Nouns by language" -> "Category:English nouns". --Daniel Carrero (talk) 00:16, 13 October 2015 (UTC)
- Oh, sorry, I missed the "en" in there. Retracting my vote. Purplebackpack89 00:22, 13 October 2015 (UTC)
- No problem, thank you. --Daniel Carrero (talk) 00:26, 13 October 2015 (UTC)
- Support per nom. - excarnateSojourner (talk|contrib) 03:11, 29 October 2021 (UTC)
- This would be a good bot job. - excarnateSojourner (talk | contrib) 00:39, 12 April 2022 (UTC)
- I was going to move these categories as proposed using excarnateSojournerBot, but I discovered that Category:Exonyms's data in Module:category tree/topic cat/data/Places lists "places" as one of its parents, which (because it is a topic cat) makes e.g. Category:en:Places a parent of Category:en:Exonyms. I expect if I move the exonyms code from Module:category tree/topic cat/data/Places to Module:category tree/poscatboiler/data/names it will then try to make e.g. Category:English exonyms a child of Category:English places when we instead want Category:Places to remain a topic cat. So I think changing Category:Exonyms's parent to something like Category:Terms by etymology subcategories by language will be a necessary part of this operation (but see CodeCat's objection above). Category:Places also currently contains improper nouns such as track and fringe, which should not be descendants of Category:Names. — excarnateSojourner (talk · contrib) 06:32, 27 December 2022 (UTC)
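The parent problem described in the comment above can be pictured with a toy model. This is not the actual Lua category-tree module: `TOPIC_PARENTS`, `topic_category` and `parent_categories` are hypothetical names, and the only data point taken from the discussion is that the "exonyms" label lists "places" as a parent.

```python
# Toy model of topic-category parent expansion; an assumption-laden sketch,
# not a port of Module:category tree.
TOPIC_PARENTS = {"exonyms": ["places"]}


def topic_category(lang_code: str, label: str) -> str:
    """Build a topic-style title such as 'Category:en:Places'."""
    return f"Category:{lang_code}:{label[0].upper()}{label[1:]}"


def parent_categories(lang_code: str, label: str) -> list:
    """Expand a label's parents into per-language topic categories."""
    return [topic_category(lang_code, parent) for parent in TOPIC_PARENTS.get(label, [])]


if __name__ == "__main__":
    # Because the parent is resolved per language, en:Exonyms lands under en:Places.
    print(parent_categories("en", "exonyms"))
```

In this model, renaming only the child while keeping "places" as its parent would still tie it to a per-language places category, which is why the comment proposes changing the parent as part of the move.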
- The situation of all of our names categories is complicated, compounded by the unclear scope of some, e.g. the exonyms category seems to only contain place exonyms, not other exonyms like German or Xerxes. And suppose someone attests a foreign exonym of an English-speaking place [e.g. Japanese-derived "Rondon" for "London"] in English, the way e.g. Deutsch#English or google books:"speak Eigo" are attested in English: would that go in the "English exonyms"/"en:Exonyms" category?
It's been suggested that we need to revamp the system more widely, also doing something about e.g. transliterations of foreign names (Pyotr, Putin, Kaifeng, etc); even the question of whether and how Placenames should be a subset of Names has come up before, though I'm having trouble finding the discussion (I think there's more than just the discussion in the section immediately below this one, and Category talk:en:Place names and Category talk:en:Names and WT:Info desk/2013/July, but I can't find it offhand). On balance, names are a lot more like a "POS" category than a "topic" category. I agree they aren't per se terms by etymology, since as noted above, they only sometimes originate as exonyms, sometimes they originate as endonyms and then the speakers of the language get forcibly relocated, or the language evolves into two. (Is Icelandic Rín an exonym for the Rhine? Icelanders do not live near the Rhine, but the name goes back to when their ancestors did...) - -sche (discuss) 16:36, 27 December 2022 (UTC)
Recategorize into Category:Names by language
Pinging some editors from the discussion above: @User:Daniel Carrero, @User:Rua, @User:Purplebackpack89, @User:-sche
As I explained above, it seems infeasible to rename cat:Exonyms (and its subcategories) without also changing what its parent category is. So I propose we remove cat:Exonyms from cat:Places, add it to cat:Names by language, rename it to cat:Exonyms by language, and rename its subcategories to e.g. cat:English exonyms. Exonyms are not places; they are names. I realize this would extend the breadth of cat:English names and its siblings. I think this makes sense, but I would also accept cat:Exonyms by language being under cat:Terms by semantic function by language. — excarnateSojourner (talk · contrib) 03:48, 25 February 2023 (UTC)
Continuation of #Category:en:Names into Category:English names
Reviving the earlier discussion, I'm still bothered by the fact that we have two different categories for names. But the previous discussion also made it clear that it's not as easy as just merging them.
- I think Category:en:Place names should probably be renamed to Category:en:Places, since it's really meant to contain terms for places. That is, since it's a topical/set-type category, the focus should be on the referent of the word, whereas part-of-speech categories like Category:English names focus on the word itself. A word is a name, and it refers to something bearing that name.
- Category:en:Named roads should probably be given some other parent than Category:en:Names; roads are not a subset of names, after all. We already have Category:en:Roads, so removing the names category would be enough.
- Category:en:Transliteration of personal names should probably be renamed and made to fit into Category:English terms transliterated from other languages somehow. Transliteration of a name doesn't seem particularly different from transliteration of any other word, so we might also just decide to get rid of the distinction and merge them entirely.
- Category:en:Demonyms is a bit more problematic and I brought it up before, though I don't remember where. "Demonym", again, is a term focused on the word, not the referent. A word is a demonym. Perhaps this could be renamed to something else? Category:en:Peoples maybe?
- Category:en:Languages could probably just be removed from the category.
- Category:en:Letter names seems like a good candidate to be renamed to Category:English letter names, to fit alongside Category:English letters.
- Category:en:Couple nicknames I don't really know about. I suppose it's thematically quite similar to Category:English female given names? So Category:English couple nicknames?
- Category:English surnames from Japanese, finally, should just be removed from the category, as it has more suitable parents already.
—CodeCat 00:45, 10 November 2015 (UTC)
- FWIW, what I am going to say is somewhat off-topic and maybe I'm in the minority on that, but I would not mind using the naming system "Category:English xxxx" for all topical categories: Category:en:Chess -> English terms related to chess. (or any better name along those lines) --Daniel Carrero (talk) 00:59, 10 November 2015 (UTC)
- "Category:en:Transliteration of personal names" could be renamed to "Category:English names transliterated from other languages", I suppose. What's the matter with the demonyms category? It contains demonyms, as expected. Would it be better titled "English demonyms", on the model of "English phrases"? - -sche (discuss) 06:02, 10 November 2015 (UTC)
- "Category:en:Transliteration of personal names" would be better named "English transliterations of (foreigners') personal names". Notice the existence of e.g. Category:Latvian transliterations of English names. Names of non-English speakers are not English names. I agree with CodeCat that place names belong to topic categories. --Makaokalani (talk) 14:32, 10 November 2015 (UTC)
- Here's the old discussion if anyone wants to read it. - excarnateSojourner (talk | contrib) 15:58, 12 April 2022 (UTC)
- Category:en:Place names was deleted by Equinox in 2017-05 because it was empty. Category:Transliteration of personal names (and its language-specific subcategories) was moved to Category:Foreign personal names in 2021-09 with the help of WingerBot. - excarnateSojourner (talk | contrib) 16:14, 12 April 2022 (UTC)
- Move Category:en:Demonyms to Category:English demonyms. This would be another job for a bot. - excarnateSojourner (talk | contrib) 04:57, 5 October 2022 (UTC)
- cat:en:Demonyms has the same problem as cat:en:Exonyms (as explained in the discussion above): it is a child of cat:en:Places, so moving it will not be straightforward. — excarnateSojourner (talk · contrib) 03:51, 6 February 2023 (UTC)
- @ExcarnateSojourner There being no opposition here, only support (albeit mostly old support), and no opposition or interest when I brought this up in the BP, let's revise whatever needs to be revised to put (at a minimum) all given names and surnames into subcategories of Category:Names by language, instead of some of them being in subcategories of Category:Names. The split is haphazard and arbitrary; I see the intention — put a name that was given within English in one top-level category and a name transliterating a foreign name in a different top-level category — but in practice that's not maintained, since e.g. Alexandra in the context of discussing ancient Greek is transliterating the Ancient Greek name, Sergei has been given to babies born in the Anglosphere (and to characters in English fiction), and we don't maintain such a split with place names. - -sche (discuss) 16:01, 24 April 2023 (UTC)
- It making no sense to have Alexandra (in works about ancient Greece where it's romanizing a Greek name), Alexandra (in fiction about ancient Greece where it's a given name), Alexandra (as borne by British or American people today), Sonya, Vadim and Vladimir divided haphazardly into two different top-level categories, "Names" vs "Names by language", I'm now (attempting) editing the modules to consolidate them into "Names by language" subcategories. - -sche (discuss) 14:37, 5 May 2023 (UTC)
- (Assistance solicited at Module talk:names#en:Russian_male_given_names,_etc.) - -sche (discuss) 14:48, 5 May 2023 (UTC)
Recategorize Category:Demonyms and Category:Ethnonyms
Pinging some editors from the discussion above: @User:Rua, @User:Daniel Carrero
As I explained in the discussion about exonyms above, renaming the language-specific subcategories of cat:Demonyms properly will require removing it from the topic category tree and adding it to the set category tree. We should similarly recategorize cat:Ethnonyms, another child of cat:Names that did not yet exist when this discussion started. I propose recategorizing them into Category:Terms by semantic function subcategories by language, unless someone can find a better place, and renaming them cat:Demonyms by language and cat:Ethnonyms by language. — excarnateSojourner (talk · contrib) 06:55, 25 February 2023 (UTC)
2016
Linear A
Strangely enough we have a language code for Linear A [lab], even though Linear A is a writing system and not a language. I have no idea why it was encoded or why we have it. -- Liliana • 15:01, 5 March 2016 (UTC)
- It's very odd. The script code for Linear A is "Lina"; the language code for Minoan is "omn"; but there's also a language code "lab" for a language called "Linear A". I have no idea what ISO and SIL were thinking, but I'm in favor of deleting "lab" from our modules. —Aɴɢʀ (talk) 17:43, 5 March 2016 (UTC)
- I'll bet their thinking is that the language written in the script may be an unknown language, which would be consistent with w:Linear A. There do seem to be a large number of hypotheses about Linear A, nearly on the same order as the total number of recorded instances of the script. DCDuring TALK 18:33, 5 March 2016 (UTC)
- I see. Reading Minoan language more carefully, I see that it's written in both Cretan hieroglyphs and Linear A, but since neither writing system has been deciphered, it isn't known whether it's the same language in two writing systems or two different languages. So maybe "omn" means Minoan in Cretan hieroglyphs and "lab" means Minoan in Linear A, and they may or may not refer to the same language. Given that the language is unknown and undeciphered, I wonder why we have one Minoan lemma: kuro. How do we know this word was pronounced "kuro" and that it means "total"? —Aɴɢʀ (talk) 07:25, 6 March 2016 (UTC)
- It's in the wrong script anyway (it was added before Unicode covered Linear A), but afaik Linear A can be read simply by using the known values for Linear B syllables, which are visually similar. This word is always found at the end of lists, followed by a number, so the meaning was easy to figure out. -- Liliana • 10:39, 6 March 2016 (UTC)
- I reckon we should indeed delete this language code; we can always change our minds once some decoding happens. @Liliana-60 (or anyone else), can we move kuro#Minoan to a Linear A entry? —Μετάknowledgediscuss/deeds 00:25, 2 April 2016 (UTC)
I see no evidence that Balkan Gagauz Turkish (bgx) exists as a separate language, and move that it be merged with tr. The literature which references it seems to describe the dialect of Turkish which may be spoken by Gagauz people in the Balkan Peninsula. —Μετάknowledgediscuss/deeds 20:17, 3 July 2016 (UTC)
- Wikipedia, citing Ethnologue, insists that Balkan Gagauz Turkish, Gagauz, and Turkish are all separate, and a few sources do seem to take that view, e.g. Cem Keskin, Subject agreement-dependency of accusative case in Turkish, or, Jump-starting grammatical machinery (2009) speaks of "Balkan Gagauz Turkish, Gagauz, Turkish, Iraqi Turkmen, North and South Azerbaijani, Salchuq, Aynallu, Qashqay, Khorasan Turkic, Turkmen, Oghuz Uzbek, Afshar, and possibly Crimean Tatar". Other references speak of Balkan Gagauz Turkish as a variety of Gagauz, e.g. James Minahan's Encyclopedia of the Stateless Nations says "The Gagauz speak a Turkic language [...] also called Balkan Gagauz or Balkan Turkic, [which] is spoken in two major dialects, Central and Southern, with the former the basis of the literary language. Other dialects [include] Maritime Gagauz" (which comports with w:Gagauz's list of its dialects). Matthias Brenzinger's Language Diversity Endangered also treats Balkan Gagauz "or slightly misleading, Balkan Turkic" in his entry on Gagauz, but says that the Balkan "varieties might deserve the status of outlying languages but very little information is available about them." (A few generalist references seem to subsume all gag into tr.) I would leave them all separate, pending more conclusive evidence that they should be merged. - -sche (discuss) 23:58, 3 July 2016 (UTC)
- I think there's some confusion about what exactly we're talking about, and whether it's Gagauz or Turkish. Just because they use the term "Balkan Gagauz Turkish" doesn't mean that they're referring to the language with ISO 639-3 code bgx. When I look at who's citing the references listed for bgx at Glottolog, Manević (the reference for its classification) is cited in papers clearly talking about the dialects of tr. These are the only actual words attributed to this lect that I can find. —Μετάknowledgediscuss/deeds 00:33, 4 July 2016 (UTC)
- @Tropylium, on the subject of Turkic languages spoken in Europe, do you know anything about this one, and about its differences or similarity to Gagauz and standard Turkish? - -sche (discuss) 01:08, 11 May 2017 (UTC)
- I'm not previously familiar with this dispute, but here are a few handbooks on the topic:
- Menges in The Turkic Languages and Peoples has the following slightly complicated quote (p. 11): "The Turkic languages spoken farthest west are the Balkanic dialects of Osman and Gagauz in Bosnia, Bulgaria and Macedonia. These seem to form two groups, one of possibly pre-Osman origin, and a later Osman one. To the former belong the Gaǯaly in Deli-Orman (Eastern Bulgaria), who, according to V. A. Moškov, are descended from the Päčänäg, Uz, and Torci (?), the Surguč, numbering about 7000 people in the district (vilājät) of Edirnä, who call themselves Gagauz. In Moškov's opinion, they, too, go back to the Päčänägs (?) and the Macedonian Gagauz; they number ca. 4000 people in southeastern Macedonia." — It seems clear that some group(s) corresponding to "Balkan Gagauz" is being identified here, but I am not even sure how to parse the sentence structure; e.g. are "Uz" and "Torci" some of the pre-Osman Turkic groups, or some of the alleged ancestors of the Gaǯaly? ("Osman" is, of course, Turkish.)
- Hendrik Boeschoten in a classificatory chapter in Routledge's The Turkic Languages mentions that "a few speakers [of Gagauz] in northern Bulgaria, Romania and Greece, adhere to the Orthodox faith, and have their own history." This again seems to refer to "Balkan Gagauz", but with no indication of being its own language.
- So far I would gather from this that "Balkan Gagauz" is at most a sister language of "non-Balkan Gagauz", and perhaps indeed just a different dialect group (perhaps one whose features are not reflected in written standard Gagauz). But the Manević 1954 paper would be more informative on this topic, if anyone wants to hunt it down. --Tropylium (talk) 11:55, 11 May 2017 (UTC)
- @Allahverdi Verdizade, Crom daba: Here's an old, unresolved issue that could benefit from Turkicist eyes. —Μετάknowledgediscuss/deeds 23:59, 8 September 2018 (UTC)
- I think Balkan Gagauz should be merged with gag, especially since it contains no entries. The few terms that would be specific for Gagauz spoken outside of the traditional Gagauz area in Moldova/Romania/Bulgaria can be dealt with within gag entries. The only thing is that some etymologies of other Turkic languages sometimes refer to Balkan Gagauz instead of Gagauz, because editors didn't know the difference between the two. Otherwise I don't see any problems with merging the two.
- On the other hand, Gagauz should definitely NOT be merged with Turkish, that is pretty obvious to me. Allahverdi Verdizade (talk) 05:09, 9 September 2018 (UTC)
- @Metaknowledge This is a hard question, I can offer only guesswork.
- I can't find any good maps for the distribution of Gagauz and (Muslim) Turks proper in the Balkans, most don't show Balkan Gagauz at all although we know they exist at least in Bulgaria and Macedonia.
- It seems that they are not easily separated geographically from Muslim Turks although they presumably live in different localities. I'm guessing this means that their languages ("Balkan Gagauz Turkish" and "Rumelian Turkish") could be the same, although maybe only the latter call their language "Turkish", so I guess that they (would?) use Standard Turkish in education and administration.
- This would be a good argument to merge Balkan Gagauz into Turkish, except that this paper shows that Balkan Turkic (if this really is a single language) is quite distinct from Anatolian Turkish and perhaps worth considering a different language. Baskakov also considers Balkan Turkish and (Moldovan) Gagauz to form a clade within Oghuz and Anatolian Turkish and Azerbaijani to form another. Crom daba (talk) 21:35, 30 September 2018 (UTC)
- Merge / delete it. The distribution of the name, the way it is “mentioned”, points towards it being a ghost language. The name is not attestable as used by anyone having particular information about it; nobody can add anything under it either in such a situation where it is a content-filled concept for nobody. Its alleged synonyms “Balkan Turkish” and “Rumelian Turkish” show it is just an SOP term for Turkish as spoken in the Balkans or in Rumelia, i.e. by remnant speakers from the time of Ottoman rule. Compare German Balkantürkisch, distinguished from Türkeitürkisch as a regiolect. Fay Freak (talk) 13:38, 2 December 2020 (UTC)
Even more languages without ISO codes, part 6
This next batch is of languages from lists other than Ethnologue and LinguistList. As before, I've tried to vet them all beforehand, but I will have doubtlessly made some mistakes. NB if you want to find more: I've avoided dealing with most of the Loloish languages, because all the literature seems to be in Chinese. —Μετάknowledgediscuss/deeds 04:54, 6 July 2016 (UTC)
- Alingpo language (tbq-alp) — perhaps should be named Yiqing
- Alo Teqel language (map-alt)
- Antequera Zapotec (omq-anz) — hard to say how different it is, but it's extinct, so a finite lexicon
- Auteco language (azc-aut)
- Aveteian language (map-ave)
- Bantang language (tbq-ban)
- Chashan language (tbq-cha)
- Damu language (sit-dam)
- Daylami language (ira-day)
- Jo language (crp-joo)
- Kasabe language (alv-kas)
- Kasong language (aav-kas) — questionable whether this is a separate language
- Komi-Yazva language (urj-kya)
- Kurbet language (crp-kur)
Australian languages
- Bugurnidja language (aus-bug)
- Dyirringany language (aus-dyi)
- Gulidjan language (aus-gul)
- Gunindiri language (aus-gun)
- Kok Thawa language (aus-kth)
- Kureinji language (aus-kur)
- Mirning language (aus-mir)
- Ngaro language (aus-ngr)
- Ngaygungu language (aus-ngg)
- Ngumbarl language (aus-ngu)
- Wik Ompom language (aus-wom)
- Wik Paach language (aus-wpa)
- Yiman language (aus-yim)
Tasmanian and other
- Northeastern Tasmanian:
- Northeastern, Pyemmairre language (aus-pye): Done
- alt names/varieties: Plangermaireener, Plangamerina, Cape Portland, Ben Lomond, Pipers River
- North Midlands, Tyerrernotepanner language (aus-tye) — Bowern considers this a dialect; perhaps we should just trust her
- Lhotsky/Blackhouse Tasmanian language (aus-lbt) — the worst name in Bowern's set!
- I'm not sure... the very language is "reconstructed" by Bowern on the assumption that three wordlists (of which only two make it into the name) attest the same language, although apparently none of the three bothered to name the language. The chance of someone "would run across [a word in] it and want to know what it means" seems nonexistent. If we wanted to host the wordlists, we could do that in an appendix or on Wikisource. - -sche (discuss) 16:09, 9 August 2016 (UTC)
- Here is another language we might need a code for: Ma(') Pnaan (poz-map?), also known by the exonyms Punan Malinau and Punan Segah, a language of Borneo / East Kalimantan, summarized by Antonia Soriente here and elsewhere. Compare the other things listed at Punan language. - -sche (discuss) 05:21, 29 August 2016 (UTC)
Marrithiyel
Maridan [zmd], Maridjabin [zmj], Marimanindji [zmm], Maringarr [zmt], Marithiel [mfr], Mariyedi [zmy], Marti Ke [zmg]: should these be merged? References speak of a singular Marrithiyel language. - -sche (discuss) 21:30, 20 July 2016 (UTC)
Some more missing American languages
Here are a few more North American languages for which we could add codes:
- Akokisa (nai-ako). WP says it is attested certainly in two words in Spanish records (Yegsa "Spaniard[s]", which Swanton suggests is similar to Atakapa yik "trade" + ica[k] "people"; and the female name Quiselpoo), and possibly in more words in a wordlist by Jean Béranger in 1721 (if the wordlist is not some other language).
- Labrador Inuit Pidgin French, less often called Belle-Isle Pidgin, was spoken in Labrador from the late 1600s (probably since before the 1660s, but first written down in 1694) until at least the mid 1760s, based on Inuktitut, French, Basque, Montagnais, and possibly Spanish and Breton. Louis-Jacques Dorais, An Inuit Pidgin around Belle-isle Strait (1996; with reference to "Clermont - Martijn 1980; Dorais 1980; Bakker 1988"), covers the records:
- Louis Jolliet recorded words at Baie Saint-Louis in 1694, including the 'greeting' thou tcharacou, saying the latter word is "peace", which Dorais says is "corroborated by two other sources, from 1717 (characoua [...]) and 1720 (characo [...]). But a text from 1743 (Privy Council 1927: 3284), written by the French merchant Louis Fornel, gives to characo the meaning 'war'." Thou is probably from tu. The other word could be Basque txarrakoa "bad", thus "are you bad?".
- Le Cour in 1742 records some more words: bons camaras "good comrades", tous camaras "all comrades", capitaine "captain", kellanoré (which Dorais says "seems to be Le Cour's [or the pidgin's?] rendering of Inuktitut kinaunali 'but who is he?'?), the personal name Amargo (a rendering of Amaqqut "Wolves"), rénombek "bead" (probably a loanword), maumek "file" (probably a loanword), monkoumek "knife" (probably a loanword from Montagnais mukuma:n, as spelled in Marguerite Ellen MacKenzie Towards a Dialectology of Cree-Montagnais-Naskapi).
- Louis Fornel in 1743 recorded more: tout camara "all comrades", troquo balena "let us trade whale" (from French troquons!), non characo "no war" (sic, per Fornel).
- Jens Haven wrote other words in 1764-5: makagua "peace" (perhaps from Basque bake[a] "peace" plus a suffix -koa), kutta (French couteau "knife"), memek "to drink" (from Inuktitut imiq "drinking water").
- Few references discuss the lect and it is difficult to judge whether it is really a language or just something like broken French or like Spanglish (which I think we exclude), but the fact that the Inuit apparently changed the meaning and even part of speech of words in their own language when speaking pidgin suggests it is more on the pidgin-language side of that continuum than the code-switching side.
- Algonquian–Basque pidgin (crp-abp). Wikipedia has a sample. The Atlas of Languages of Intercultural Communication, citing Bakker, says it was spoken from at least 1580 (and perhaps as early as 1530s) through 1635, and "only a few phrases and less than 30 words attributable to Basque were written down" (though apparently more words, attributable to other sources, were also recorded).
- Guachichil (Cuauchichil, Quauhchichitl, Chichimeca) (nai-gch or, if Guachí is added as sai-gch, perhaps nai-gcl to prevent the two similarly-named lects from being mixed up by only typoing the initial n vs s), apparently sparsely attested.
- Concho (nai-cnc). The Handbook of North American Indians, volume 10, says "three words of Concho [...] were recorded in 1581 [and] look like they may be [...] Uto-Aztecan".
- Jumano (Humano, Jumana, Xumana, Chouman, Zumana, Zuma, Suma, and Yuma) (nai-jmn). The Handbook says "It has been established that the Jumano and Suma spoke the same language. Three words have been recorded" of it.
and from South America:
- Peba / Peva (sai-peb), said by Erben to more properly be called Nijamvo, Nixamvo. Spoken in "the department of Loreto" in Peru. Attested in wordlists by Erben and Castelnau, which Loukotka provides, and which disagree with each other substantially: munyo (Erben) / money (Castelnau) "canoe, small boat"; nero (E) / yuna (C) "demon"; nebi (E) / nemey (C) "jaguar"; teki (E) / tomen-lay (C) "one", manaxo (E) / nomoira (C) "two"; etc. I would even consider that one might not be the same language as the other... what's with these languages that survive in disparate wordlists? lol.
- possibly Saynáwa: fr.Wikt grants a code to this variety of Yaminawá language, described here (see also [1]).
- -sche (discuss) 04:04, 16 August 2016 (UTC)
- Support all except possibly Akokisa. I think it's a dialect of Atakapa, and that the wordlist is very likely not being linked correctly. That said, it's so few words, that there's no real reason not to accept it as a separate language, just to be conservative about it. —Μετάknowledgediscuss/deeds 04:08, 16 August 2016 (UTC)
- Good point about Akokisa. (I am reminded that you had mentioned its dialectness earlier; sorry I forgot!) The wordlist, labelled only with a tribal name per WP, is possibly plain Atakapa, but Yegsa is supposedly recorded as specifically Akokisa; OTOH that doesn't rule out that Akokisa is a dialect. Indeed, M. Mithun's Languages of Native North America treats as dialects Akokisa, Eastern ("the most divergent, [...] known from a list of 287 entries") and Western ("the best documented. Gatschet recorded around 2000 words and sentences, as well as texts [...] Swanton recorded a few Western forms", all published in 1932 in a dictionary). I suppose the benefit to treating it as a dialect would be that we could context-label Yegsa and Quiselpoo as {{lb|aqp|Akokisa}} and then Béranger's forms as {{lb|aqp|possibly|Akokisa}} without needing to agonize over which header to put them under. - -sche (discuss) 15:31, 16 August 2016 (UTC)
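The concern raised above about nai-gch and sai-gch being mixed up by a single mistyped letter can be checked mechanically. The following is only a minimal sketch in Python, not anything used on Wiktionary: the function names are hypothetical, and the list contains just the codes proposed in this thread.

```python
from itertools import combinations


def differ_by_one_char(a: str, b: str) -> bool:
    """True if two codes have equal length and differ in exactly one position."""
    return len(a) == len(b) and sum(x != y for x, y in zip(a, b)) == 1


def confusable_pairs(codes):
    """Return every pair of codes that a single typo could turn into each other."""
    return [
        (a, b)
        for a, b in combinations(sorted(codes), 2)
        if differ_by_one_char(a, b)
    ]


# Codes taken from the proposals in this thread (illustration only).
proposed = ["nai-gch", "sai-gch", "nai-gcl", "nai-cnc", "nai-jmn"]
pairs = confusable_pairs(proposed)
print(pairs)
```

On this list the check flags nai-gch against both sai-gch and nai-gcl, while nai-gcl and sai-gch differ in two positions, which is the point of the alternative code suggested above.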
Nkore-Kiga
As can be seen at w:Nkore-Kiga language, Kiga [cgg] should definitely be merged into Nyankore [nyn]. Unfortunately, this might require a rename to something that is both hyphenated and considerably less common than just plain "Nyankore" (though that is, strictly speaking, merely the name of the main dialect). —Μετάknowledgediscuss/deeds 05:21, 18 September 2016 (UTC)
- I'm not sure. WP suggests the merger was politically motivated, but many reference works do follow it. Ethnologue says there is "Lexical similarity [of] 78%–96% between Nyankore, Nyoro [nyo], and their dialects; 84%–94% with Chiga [cgg], [...and] 81% with Zinza [zin]" (Kiga, meanwhile, is said to be "77% [similar] with Nyoro [nyo]"), as if to suggest nyn is about as similar to cgg as to nyo, and indeed many early references treat Nkore-Nyoro like one language, where later references instead prefer to group Nkore with Kiga. Ethnologue mentions that some authorities merge all three into a "Standardized form of the western varieties (Nyankore-Chiga and Nyoro-Tooro) [...] called Runyakitara [...] taught at the University and used in internet browsing, but [it] is a hybrid language." (For comparison, Ethnologue says English has 60% lexical similarity to German.) - -sche (discuss) 00:16, 2 June 2017 (UTC)
Itneg lects
See w:Itneg language. All the dialects have different codes, but we really should give them a single code and unify them. I came across this problem with the entry balaua, which means "spirit house" (but I can't tell in which specific dialect). It's also known as Tinggian (with various different spellings), and this may be a better name for it than Itneg. —Μετάknowledgediscuss/deeds 02:09, 23 September 2016 (UTC)
What distinguishes these two? —suzukaze (t・c) 03:31, 9 October 2016 (UTC)
- We have Category:Latin script characters and other subcategories of Category:Characters by script.
- To follow suit, maybe we should merge Category:Chinese hanzi and Category:Chinese Han characters into Category:Chinese script characters Category:Han script characters. --Daniel Carrero (talk) 03:35, 9 October 2016 (UTC)
- Oppose merging both into the already existing Category:Han script characters as it is for all hanzi, while the two being discussed here are for Chinese hanzi only. —suzukaze (t・c) 04:00, 9 October 2016 (UTC)
- All right. Apparently, these two categories are for single-character entries. We have Category:Japanese terms written with one Han script character, so these could be merged into Category:Chinese terms written with one Han script character, and we could populate Category:Chinese character counts like Category:Japanese character counts. --Daniel Carrero (talk) 04:04, 9 October 2016 (UTC)
- Oppose this one too since Japanese entries have Category:Japanese Han characters, for any kanji used in the Japanese language (regardless of whether it can be used independently as a word or not), while Category:Japanese terms written with one Han script character is for words that feature only one kanji, such as 好き. —suzukaze (t・c) 04:17, 9 October 2016 (UTC)
If there is no meaningful difference between these, I propose keeping Category:Chinese Han characters as it is managed by {{poscatboiler}} and merging Category:Chinese hanzi into it. —suzukaze (t・c) 04:17, 9 October 2016 (UTC)
- @Wyang, Atitarev, is there a difference between Category:Chinese hanzi and Category:Chinese Han characters, or can Category:Chinese hanzi be merged into Category:Chinese Han characters as suzukaze proposes? - -sche (discuss) 00:27, 28 March 2017 (UTC)
- They can be merged, IMO. --Anatoli T. (обсудить/вклад) 00:52, 28 March 2017 (UTC)
- (reviving this discussion after almost three years) Merge per Suzukaze-c's proposal above. — justin(r)leung { (t...) | c=› } 03:30, 19 January 2020 (UTC)
There seems to be no notable difference between the two categories so they should be merged I guess. Ffffrr (talk) 21:40, 10 December 2021 (UTC)
Update? For reference, it looks like the "Chinese hanzi" category is populated by this code in Module:zh-pron. 70.172.194.25 00:21, 27 May 2022 (UTC)
Paraguayan Guaraní [gug]
I just noticed that we have this for some reason. Guaraní is a dialect continuum that is quite extensive, both in inter-dialect differences and in geography, and certain varieties have been heavily influenced by Spanish or Portuguese. That said, our Guaraní [gn] content is, as far as I can tell, pretty much entirely on Paraguayan Guaraní, which for some reason has a different code, [gug]. My attention was brought to this by User:Guillermo2149 changing L2 headers (I have not reverted his edits, but they do cause header-code mismatch). We could try splitting up the Guaraní dialects, but it would be hard to choose cutoffs and would definitely confuse potential editors, of which we have had more since Duolingo released a Guaraní course. I think the best choice is to merge [gug] into [gn] and mark words extensively for which dialects or countries they are used in. @-sche —Μετάknowledgediscuss/deeds 01:29, 1 November 2016 (UTC)
Support. [gn] and [grn] are the codes of the macrolanguage, while [gug] is the code for the specific dialect spoken in Paraguay. Also, so far I haven't found any [gn] lemma that falls outside [gug]. --Guillermo2149 (talk) 01:52, 1 November 2016 (UTC)
Support. — Ungoliant (falai) 11:00, 1 November 2016 (UTC)
- @Guillermo2149, Ungoliant MMDCCLXIV, -sche, Angr: I see now that there are three more Guaraní dialect codes that we have: Mbyá Guaraní [gun], Chiripá [nhd], and Western Bolivian Guaraní [gnw]. I presume that we should merge these into [gn] as well, but the case is arguably less clear given that in our current state, all our [gn] lemmas are really [gug]. What do you all think? —Μετάknowledgediscuss/deeds 22:51, 14 November 2016 (UTC)
- I stick by my motto, "When in doubt, merge". —Aɴɢʀ (talk) 09:53, 15 November 2016 (UTC)
- I think we should actually merge [gn] into [gug] and not vice versa. By the way, [gn] is the only one that should be merged: [gun] has similar and some identical words but the language is very different, and [nhd] is very close to [gug] but slightly different and always confused with it. --Guillermo2149 (talk) 00:37, 7 December 2016 (UTC)
- Don't forget there's also [gui] and apparently also [tpj]. - -sche (discuss) 04:28, 16 May 2017 (UTC)
2017
to rubber chicken. Other dictionaries in OneLook have rubber chicken, not rubber-chicken dinner. There are abundant other collocations of rubber chicken both as a substantive and in attributive use. One common one is "rubber-chicken circuit". Examples of other nouns following rubber-chicken are lunch, banquet, affair, meal, fundraiser. Substantive use can be found in usages such as: Fortunately we'll spare everyone the rubber chicken and the speeches and simply acknowledge the guidance and vision of the world's best agent/coach/editor.
Rubber chicken is not identical to rubber (“rubbery”) + chicken either, though that is its origin. It specifically refers to the kind of organizational meals-with-speeches that crowd a politician's schedule, but also characterize conventions, off-site meetings, etc. DCDuring TALK 13:39, 13 January 2017 (UTC)
Merger into Scandoromani
I propose that the Para-Romani lects Traveller Norwegian, Traveller Danish and Tavringer Swedish (rmg, rmd and rmu) be merged into Scandoromani. TN, TD and TS are almost identical, mostly differing in spelling (e.g. tjuro (Sweden) vs. kjuro (Norway) meaning 'knife', gräj vs. grei 'horse' etc.). WP treats them as variants of Scandoromani. My langcode proposal could be rom-sca, or maybe we could just use rmg, which already has a category. --176.23.1.95 20:19, 25 January 2017 (UTC)
- I'm supporting it. Traveller Norwegian is sometimes referred to as Tavring, and, to be honest, I've never heard anybody use the term Traveller Norwegian for a language. People rather call it Taterspråk or Fantemål, even though books describe that as a derogatory term. The other problem is that we in fact have two different Norwegian Traveller languages (the Romani-based and the Månsing-based). So it looks like a total mess right now. Tollef Salemann (talk) 07:55, 2 April 2023 (UTC)
- I don't think this makes sense if the orthographies are consistently different, which seems to be the case. Otherwise, we could use the same logic to merge quite a few of the Slavic languages, which obviously doesn't make sense. Theknightwho (talk) 13:43, 2 April 2023 (UTC)
- Ok, but Traveller Norwegian is not quite the right term, because the Romani-based TN has two or more branches, which are quite different from each other, while the main one is almost the same as the Swedish and has often had the same name(s). Meanwhile, there is also a Germanic TN version, unrelated to the Romani-ish TN variations. I mean, we need at least two more L2s in this case, even if we are going to merge TN and Swedish Tavring.
- PS there are also Swedish stuff like Knoparmoj and Loffarspråk and more, and they still have remnants in some rare Swedish/Norwegian sociolects. Maybe they also need their L2? Or can we treat them as sociolects? Tollef Salemann (talk) 13:59, 2 April 2023 (UTC)
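To make the merger proposal above concrete, here is a rough sketch in Python of how entries tagged with the three codes could be folded into one code while keeping a regional label. It is an illustration only: the mapping uses rmg as the target (one of the two options floated in the proposal), the helper name and region labels are hypothetical, and the assignment of gräj to Sweden and grei to Norway is assumed from the order of the examples given.

```python
# Hypothetical merge table: Traveller Danish (rmd) and Tavringer Swedish (rmu)
# are folded into rmg, the code the proposal says already has a category.
MERGE_INTO = {"rmd": "rmg", "rmu": "rmg"}

# Assumed region labels, used only to keep the spelling variants apart.
REGION = {"rmg": "Norway", "rmd": "Denmark", "rmu": "Sweden"}


def merge_entry(term: str, code: str):
    """Return (term, merged_code, region_label) for a single entry."""
    return term, MERGE_INTO.get(code, code), REGION[code]


# Spelling pairs quoted in the proposal; the per-country split is an assumption.
entries = [("tjuro", "rmu"), ("kjuro", "rmg"), ("gräj", "rmu"), ("grei", "rmg")]
merged = [merge_entry(term, code) for term, code in entries]

for term, code, region in merged:
    print(f"{term}: {code} ({region})")
```

Keeping the region label alongside the merged code is one way to address the objection that the orthographies differ consistently, since each spelling stays attributable to its country.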
Chinese Pidgin English (cpi)
This is not a separate language at all, it's just English with different grammar and some loanwords, but other than that it's completely intelligible with standard English. As such, it should be moved to Category:Chinese English. -- Pedrianaplant (talk) 15:19, 8 February 2017 (UTC)
- That's not at all the impression I get from Chinese Pidgin English. It seems to be a distinct language to me, as much as any other English-based pidgin. —Aɴɢʀ (talk) 16:45, 8 February 2017 (UTC)
- We did delete Hawaiian Pidgin English in the past though (see Template talk:hwc). I don't see how this case is any different. -- Pedrianaplant (talk)
- Basically, this is a terminological problem. There may have been a true pidgin in each of these cases, but it has not been recorded. What is called a pidgin in many descriptive works is instead a dialect of English that is very easy to understand, nothing like the real English-based pidgins and creoles that I have studied. If you look at the actual quotations used to support lemmas in Chinese Pidgin English, you find that it is Chinese English. Support merge, but leave [cpi] as an etymology-only code. —Μετάknowledgediscuss/deeds 23:16, 8 February 2017 (UTC)
- At least some texts seem very distinct, to the point of unintelligibility; consider "Joss pidgin man chop chop begin" (Whedon's translator begins chopping things? or "god's businessman begins right away"?). On the other hand, other sentences given by Wikipedia are quite intelligible...and possibly not attestable under the stricter CFI to which English is subject. I'm not sure what to do. (Our short previous discussion also didn't reach a firm resolution.) - -sche (discuss) 17:46, 8 March 2017 (UTC)
- I mean, I use joss and chop chop in English normally (having grown up in a fairly Chinese environment likely has something to do with that)... and I think that was chosen as an especially extreme example. —Μετάknowledgediscuss/deeds 03:32, 25 March 2017 (UTC)
More unattested languages
The following languages have ISO codes, but those codes should be removed, as there is no linguistic material that can be added to Wiktionary. This list is taken from Wikipedia's list of unattested languages, but I have excluded languages which are not definitively extinct (and thus which may have material become available). If there was any reliable source I could find corroborating the WP article's claim of lack of attestation, it is given after the language. —Μετάknowledgediscuss/deeds 04:15, 4 April 2017 (UTC)
- Aguano language [aga]
- Unclear if it even existed per The Indigenous Languages of South America: A Comprehensive Guide (Campbell and Grondona).
- Barbacoas language [bpb] (the Wikipedia article has a discussion of the conflation of this unattested language with Pasto, which needs a code; for clarity, I think this [bpb] should be retired and an exceptional code made explicitly for Pasto)
- Retired, following the ISO, see Wiktionary:Beer parlour/2020/October#2019-2020_ISO_code_changes. Content, if needed for migration to a Pasto code, was m["bpb"] = { "Barbacoas", "Q2669202", "sai-bar", otherNames = {"Pasto"}, scripts = Latn, } - -sche (discuss) 06:23, 14 October 2020 (UTC)
- Dek language [dek]
- Giyug language [giy]
- AIATSIS has the following to say: "According to Ian Green (2007 p.c.), this language probably died before the 1920's and neighbouring groups in the Daly claim it was the language of Peron Island which was linguistically and perhaps culturally distinctive from the nearby mainland societies. Black & Walsh (1989) say that this may or may not have been a dialect of Wadiginy N31." —Μετάknowledge
- The 1992 International Encyclopedia of Linguistics, v. 1, p. 337, says "Giyug: 2 speakers reported in 1981, in the Peron Islands in Anson Bay, southwest of Darwin." The 2003 edition repeats the claim that "2 speakers remain". Wikipedia says it's extinct and unattested, but Glottolog, although having no resources on it, suggests it's not extinct. Might be best to leave it alone for now. - -sche (discuss) 01:13, 6 August 2020 (UTC)
- Mawa language (Nigeria) [wma] (We call this "Mawa"; if removed, [mcw] Mahwa (Mawa language (Chad)) can be renamed to the evidently more common spelling "Mawa".)
- Removed, and mcw renamed. Glottolog had only one reference to support the existence of Mawa, Temple (1922), which does not even include a section under that header. There may be confusion with the section on the "Marawa", but that does not even mention what language those people speak. (Temple also knows very little about linguistics; while skimming through, I found that Margi (a Chadic language) was said to be similar to the languages of South Africa.) —Μετάknowledgediscuss/deeds 01:39, 6 August 2020 (UTC)
- Nagarchal language [nbg]
- Appendix I in The Indo-Aryan Languages records this language as being a subdialect of Dhundari [dhd] and the 1901 Indian Census concurs; this is at odds with its description as an unattested Dravidian language, but the geographical specifications seem to match up.
- Ngurmbur language [nrx]
- AIATSIS says: "Harvey (PMS 5822) treats Ngomburr as a dialect of Umbukarla N43, but in Harvey (ASEDA 802), it is listed as a separate language." Nicholas Evans confirms in The Non-Pama-Nyungan Languages of Northern Australia that it is unattested.
- Tremembé language [tme]
- Truká language [tka]
- Wakoná language [waf]
- Wasu language [was]
- Unclassified due to its absence of data per The Indigenous Languages of South America: A Comprehensive Guide (Campbell and Grondona).
Yenish
The Yenish "language" (which we call Yeniche) was given the ISO code yec, despite being clearly not a separate language from German. Instead, it is a jargon which Wikipedia compares to Cockney (which has never had a code) and Polari (which had a code that we deleted in a mostly off-topic discussion). The case of Gayle, which is similar, is still under deliberation at RFM as of now. Most tellingly, German Wiktionary considers this to be German, and once we delete the code, we should make a dialect label for it and add the contents of de:Kategorie:Jenisch to English Wiktionary. @-sche —Μετάknowledgediscuss/deeds 00:49, 7 April 2017 (UTC)
- I don't see how that's most tellingly; I don't know about the German Wiktionary, but major language works frequently treat things as dialects of their language that outsiders consider separate languages.--Prosfilaes (talk) 03:01, 10 April 2017 (UTC)
- The (linked) English Wikipedia article even says "It is a jargon rather than an actual language; meaning, it consists of a significant number of unique specialized words, but does not have its own grammar or its own basic vocabulary." Despite the citation needed that follows, that sentence is about accurate, as such this should be deleted. -- Pedrianaplant (talk) 10:53, 30 April 2017 (UTC)
- (If kept, it should be renamed.)
There are those who argue that Yenish should have recognition (which it indeed gets, in Switzerland) as a separate language. And it can be quite divergent from Standard German, with forms that are as different as those of some of the regiolects we consider distinct. Many examples from Alemannic or Bavarian-speaking areas are better considered Alemannic or Bavarian than Standard German. But then, that's a sign that it is, as some put it, a cant overlaid onto the local grammar, rather than a language per se. Ehh... - -sche (discuss) 03:22, 9 July 2017 (UTC)
What's the difference? --Barytonesis (talk) 20:19, 17 April 2017 (UTC)
- Apparently (Google n-grams) the term could be used with or without an object. The definition should be somewhat different. An example of use without a direct object is "to rake over the coals of failure". I don't know how to word this in a substitutable way. It seems to mean something like "to belabor (something negative (result, process), obvious from context) as if in reprimand". DCDuring (talk) 15:14, 3 January 2018 (UTC)
Move entries in CAT:Khitan lemmas to a Khitan script
The Khitan wrote using a Siniform script. Are these Chinese transcriptions of Khitan? —suzukaze (t・c) 02:22, 13 August 2016 (UTC)
- I'm a little confused about what's going on here. Are you RFV-ing every entry in this category? Or are you just looking for evidence that Khitan was written using this script? —Mr. Granger (talk • contribs) 12:45, 13 August 2016 (UTC)
- I understand that, but I don't understand what your goal is with this discussion. If you want to RFV every entry in the category, then I'd like to add {{rfv}} tags to alert anyone watching the entries. If you want to discuss what writing systems Khitan used, maybe with the goal of moving all of these entries to different titles, then I'm not sure RFV is the right place for the discussion. (Likewise with the Buyeo section below.) —Mr. Granger (talk • contribs) 17:55, 13 September 2016 (UTC)
Some spurious languages to merge or remove, 2
- remove Adabe [adb]
Geoffrey Hull, director of research for the Instituto Nacional de Linguística in East Timor, notes (in a 2004 Tetum Reference Grammar, page 228) that "the alleged Atauran Papuan language called 'Adabe' is a case of the mistaken identity of Raklungu," a dialect (along with Rahesuk and Resuk) of Wetarese. He notes (in The Languages of East Timor, Some Basic Facts) that only Wetarese is spoken on the island, and Studies in Languages and Cultures of East Timor likewise says "The three Atauran dialects—with the northernmost of which the dialect of nearby Lirar is mutually intelligible—are unquestionably Wetarese, and not dialects of Galoli, as Fox and Wurm suggest for two of them (n. 32). The same authors refer (ibidem) to a supposedly Papuan language of Atauro, the existence of which appears to be entirely illusory." (The error appears to have originated not with Fox and Wurm but with Antonio de Almeida in 1966.) - -sche (discuss) 01:45, 31 May 2017 (UTC)
- We could repurpose the code into one for those three Atauran varieties of Malayo-Polynesian Wetarese, Rahesuk, Resuk, and Raklu Un / Raklungu (the last of which Ethnologue does list as an alt name of adb, despite their erroneous family assignment of it), perhaps under the name "Atauran Wetarese" for clarity. - -sche (discuss) 01:52, 31 May 2017 (UTC)
- remove Agaria [agi]
Glottolog makes the case that this is spurious. - -sche (discuss) 07:57, 31 May 2017 (UTC)
Arma
Arma (aoh) is also said to be "a possible but unattested extinct language"; I am trying to see if that means it is entirely unattested, or if there are personal/ethnic/place names, etc. - -sche (discuss) 09:45, 3 June 2017 (UTC)
- Removed, see Wiktionary:Beer_parlour/2020/October#2019-2020_ISO_code_changes. - -sche (discuss) 06:18, 14 October 2020 (UTC)
The VU Amsterdam report linked to here seems to indicate that one lect has been given multiple codes, and that "Jair" at least is spurious. Further research wouldn't hurt. —Μετάknowledgediscuss/deeds 00:24, 3 October 2019 (UTC)
Categories in Category:Letters
Can we come up with more descriptive names than Category:Aa please? —CodeCat 22:37, 14 May 2017 (UTC)
- IMO they are fine as they are. We could use "Letter Aa", etc, I guess. - excarnateSojourner (talk | contrib) 04:51, 29 April 2022 (UTC)
Apparently this is not a set category, despite its name seeming like one. User:Smuconlaw apparently intended it to be about things related to limbs. I think it should be renamed to more clearly reflect that. —CodeCat 17:35, 17 May 2017 (UTC)
- What is a "set category"? — SMUconlaw (talk) 17:36, 17 May 2017 (UTC)
- A category that contains items belonging to a particular set. See Category:List of sets. A characteristic of set categories is that they have plural names. —CodeCat 17:37, 17 May 2017 (UTC)
- Hmmm, I'm not sure what it's supposed to be. I was just following the example of other categories under "Category:Body" such as "Category:Buttocks", "Category:Face", "Category:Muscles", "Category:Organ systems", "Category:Skeleton", "Category:Skin", and "Category:Teeth". — SMUconlaw (talk) 17:44, 17 May 2017 (UTC)
- I'm currently working with User:-sche on a more permanent solution to issues like this. —CodeCat 19:00, 17 May 2017 (UTC)
- OK, thanks. — SMUconlaw (talk) 22:10, 17 May 2017 (UTC)
- Has this been resolved? - excarnateSojourner (talk | contrib) 23:17, 29 March 2022 (UTC)
- @Rua (CodeCat), @-sche Do you have any idea if this was ever resolved? — excarnateSojourner (talk · contrib) 04:03, 6 February 2023 (UTC)
- Unfortunately, I don't think the issue of how to distinguish set and topic categories has been resolved yet (the basic idea was that they need some more distinct naming convention than just "set categories are plural": something like "Category:en:Set:Foobar" vs "Category:en:Topic:Foobar" or something). However, looking at the contents of this category and the "Muscles" category, it looks like the issue of what should be in a category named "Limbs" (or "Muscles") was resolved by changing its contents to being a "set" category...? - -sche (discuss) 18:04, 6 February 2023 (UTC)
This should be handled with {{liushu}}, since jiajie is one of the six categories (liushu). — justin(r)leung { (t...) | c=› } 18:36, 17 May 2017 (UTC)
- Can both of these templates be renamed to include a language code? —CodeCat 19:01, 17 May 2017 (UTC)
- {{jiajie}} should be merged with {{liushu}}, which could be renamed as {{Han liushu}}, following {{Han compound}} and {{Han etym}}. It might not be a good idea to use a particular language code because these templates are intended for use in multiple languages now. They used to be used under Translingual, but we have decided to move the glyph origin to their respective languages. — justin(r)leung { (t...) | c=› } 20:22, 17 May 2017 (UTC)
- You can use script codes as prefixes too. We have Template:Latn-def, Module:Cans-translit and such. —CodeCat 20:26, 17 May 2017 (UTC)
Entries in CAT:Taos lemmas with curly apostrophes
Many Taos entries use curly apostrophes to represent glottal stops. They should either use the easy-to-type straight apostrophe ' that many other languages use, or the apostrophe letter ʼ that Navajo and a few other languages use. - -sche (discuss) 21:36, 20 May 2017 (UTC)
- I agree. The headword template interprets the curly apostrophe as a punctuation mark (because it is), and automatically links words such as adùbi’íne as adùbi’íne. (Personally, I think the apostrophe letter looks better, but there may be other considerations.) — Eru·tuon 21:45, 20 May 2017 (UTC)
- Oh, and I just learned of the Unicode character ꞌ for the saltillo. But no entries use it, and I am averse to introducing yet another visually-almost-identical symbol to represent the glottal stop, next to the three (counting the curly apostrophe) mentioned above that are already in use, plus the ˀ that some entries use. - -sche (discuss) 02:23, 21 May 2017 (UTC)
- I've moved quite a few of these; about 140 remain to be moved. - -sche (discuss) 04:49, 24 July 2018 (UTC)
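The move discussed above amounts to swapping one code point in each page title. As a rough illustration only, the sketch below assumes the apostrophe letter ʼ (U+02BC) is the chosen target; the function names are hypothetical and do not belong to any existing Wiktionary bot or module.

```python
# Hypothetical helpers for illustration; not part of any Wiktionary tool.
CURLY_APOSTROPHE = "\u2019"   # ’ RIGHT SINGLE QUOTATION MARK (punctuation)
LETTER_APOSTROPHE = "\u02bc"  # ʼ MODIFIER LETTER APOSTROPHE (a letter)


def proposed_title(title, replacement=LETTER_APOSTROPHE):
    """Return the title with every curly apostrophe replaced."""
    return title.replace(CURLY_APOSTROPHE, replacement)


def titles_to_move(titles):
    """Pair each title containing a curly apostrophe with its proposed new title."""
    return [(t, proposed_title(t)) for t in titles if CURLY_APOSTROPHE in t]
```

Passing `replacement="'"` instead would give the straight-apostrophe variant, which is the other option raised in the thread.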
The Category:E language surely has numbers, which would require this category to be used. Other suggestions for the food additive category name would be welcome. Maybe "List of E numbers"? DTLHS (talk) 16:31, 27 May 2017 (UTC)
- If we adopt a systematic naming scheme for topic and set categories as CodeCat and I have been discussing, then I guess it could be "Category:mul:set:E numbers" or "Category:Translingual:set:E numbers". However, independent of whether or not such prefixes ("Translingual:set:") come into use, a more intelligible name like the one you propose, replacing "E" with "European food additive", would be good. Other food-additive numbering schemes in use in Europe could also go in the same category. - -sche (discuss) 18:48, 27 May 2017 (UTC)
- Support. Very good find. —Μετάknowledgediscuss/deeds 03:50, 28 May 2017 (UTC)
- Disagree. They are not called European food additive numbers, they are E numbers. SemperBlotto (talk) 18:05, 28 May 2017 (UTC)
- @SemperBlotto: So what do you want to do about numbers in the E language? —Μετάknowledgediscuss/deeds 18:09, 28 May 2017 (UTC)
- I think you may be implying that the category should be something like mul:E numbers just in case any of our users think E is a language. I wouldn't object to that. SemperBlotto (talk) 18:12, 28 May 2017 (UTC)
- To be clear: E is a language, spoken in China. CAT:E language. (And like CAT:English numbers, it will have a "numbers" category someday when our coverage of it improves.) Perhaps a move should be postponed for a little while, though, while we see if we can come up with a systematic naming scheme for topic and set categories (see my talk page). - -sche (discuss) 18:33, 28 May 2017 (UTC)
- Since there's been no progress towards systematically changing how topic and set categories are named, this one does need to be renamed, because it does conflict with the expected 'numbers' category of the existing E language. Does anyone else want to weigh in on whether the name should be "Category:European food additive numbers" or "Category:mul:E numbers"? - -sche (discuss) 22:34, 18 November 2018 (UTC)
Should perhaps be moved to long story? W3ird N3rd (talk) 06:42, 9 August 2017 (UTC)
- In contrast to long story short, neither seems entryworthy to me. They are quite transparent. Checking long story at OneLook Dictionary Search, one notes that none of those references find it inclusionworthy, whereas long story short at OneLook Dictionary Search shows some coverage. DCDuring (talk) 11:01, 9 August 2017 (UTC)
sense: Noun: "(aviation) A large multi-engined aircraft. The term heavy normally follows the call-sign when used by air traffic controllers."
In the aviation usage AA21 heavy ("American Airline flight 21 heavy") the head of the NP is AA21, heavy being a qualifying adjective indicating a "wide-bodied", ergo "heavy", aircraft.
Move to noun with any adjustments required. DCDuring (talk) 13:19, 24 August 2017 (UTC)
- @DCDuring You're proposing we move from noun to noun? Did you mean from noun to adjective? - excarnateSojourner (talk | contrib) 05:57, 18 October 2022 (UTC)
- I don't know what I meant 5 years ago, but that's what I mean now: move it to adjective. Though it would be good to confirm that there is not sufficient attestation of heavies and/or [DET] heavy. DCDuring (talk) 12:48, 18 October 2022 (UTC)
- I can find the plural in reference to large (sometimes restricted to widebody) commercial aircraft and heavy bombers (sometimes 2-engine, always at least 4-). Also "heavy" motor vehicles (eg. large trucks, esp semis). I'm not entirely sure what heavy refers to when used by the pilot of a Cessna. DCDuring (talk) 12:57, 18 October 2022 (UTC)
Renaming mey
We currently have it as "Hassaniya" (which we used to spell as Hassānīya; those macra were removed along the way, presumably by Liliana, although I don't see any discussion; MG deleted the old category once it was empty). To match the other colloquial Arabic languages, it should be "Hassaniya Arabic". (Note: if Arabic is merged, this will become moot.) —Μετάknowledgediscuss/deeds 07:07, 16 September 2017 (UTC)
- This seems a bit different from most of the other forms of Arabic which are "[Adjective referring to a place] Arabic", where just calling the lect "Libyan" (etc) would be more awkward. Still, I have no objection to a rename, though I don't have time to rename all the categories right now. I also notice that, while Hassaniya is probably still the most common spelling overall, it seems like Hassaniyya started to become more common around 2003. - -sche (discuss) 04:03, 29 December 2017 (UTC)
Categories about country subdivisions to include the country name
This will include at least the following:
- Category:Abkhazia → Category:Abkhazia, Georgia
- Category:Alabama → Category:Alabama, USA
- Category:Alberta → Category:Alberta, Canada
- Category:Andhra Pradesh → Category:Andhra Pradesh, India
- Category:Aomori Prefecture → Category:Aomori Prefecture, Japan
- Category:Arizona → Category:Arizona, USA
- Category:Arkansas → Category:Arkansas, USA
- Category:Barisal Division → Category:Barisal Division, Bangladesh
Categories for certain things that are located within these subdivisions will also be renamed, e.g. Category:Cities in Aomori (Prefecture) → Category:Cities in Aomori Prefecture, Japan. —Rua (mew) 13:07, 16 October 2017 (UTC)
- Support. I oppose the existence of categories with language code like "en:" in the first place, but what is proposed here seems to be an improvement over the status quo. --Daniel Carrero (talk) 20:27, 20 October 2017 (UTC)
- I would have opposed a lot of these, but I was too late on the scene. DonnanZ (talk) 15:51, 12 November 2017 (UTC)
- Support all except Category:Abkhazia, Georgia (for which I abstain as I do not properly understand the political situation explained by User:Palaestrator verborum). - excarnateSojourner (talk|contrib) 03:34, 29 October 2021 (UTC)
- US states were moved by MewBot (talk • contribs) in 2017. - excarnateSojourner (talk | contrib) 22:00, 27 April 2022 (UTC)
The rename has been put on hold until there is a clear consensus either way. Please vote! —Rua (mew) 15:11, 14 November 2017 (UTC)
- @Rua It looks sane to me if politics are left out. But why is Abkhazia in Georgia though it is an independent state, statehood only depending on factual prerequisites and not on diplomatic recognition which has nothing to do with it? Where does the Crimea belong to? (article Sevastopol is only in Category:en:Ukraine because it has not really been edited since 2014.) I can think of two solutions: First possibility: We focus on geographical and cultural constants. Second possibility: We focus on the actual political power. I disprefer the second slightly because it can mean much work in cases of war (i.e. how much the Islamic state holds etc., or say the current factions in Libya). But in neither case Abkhazia is in Georgia. But the first possibility does not even answer what the Crimea belongs to, i.e. I am not sure if it is historically correct to speak of the Crimea as Ukraine. And geographical terms are often fuzzy and subject to editorial decisions. All seems so easy if you start your concepts from the United States, which do not even have a name for the region they are situated in. And even for the USA your idea is questionable because the constituent states of the United States are states in their own right (Teilstaat, Gliedstaat in German), as is also the case for the Federal Republic of Germany and the Russian Federation partially (according to the Russian constitution only those of the 85 subjects are states which are called Republic, not the Oblasti etc.). Is Tatarstan Russia? Not even Russians can agree with such a sentence, as in Russia one sharply distinguishes русские and россияне, Россия and Российская федерация. Technically Ceuta and Melilla are in Morocco because Spain is not in Africa. Also, Kosovo je Srbija, and it would become just a coincidence if a place important in Serbian history is listed as X, Kosovo or X, Serbia. Palaestrator verborum (loquier) 16:06, 14 November 2017 (UTC)
@Rua: Most of these categories like Category:en:Special wards in Tokyo are back on the {{delete}} list. I think these should be removed again for the time being. DonnanZ (talk) 18:02, 14 November 2017 (UTC)
- Starting with the above, I don't know how the Tokyo ward system works, but I imagine it's a subdivision of the city. In England wards are subdivisions in cities, boroughs, local government districts, and possibly counties. "Wards in" is the natural usage.
- Municipalities similarly. For example in Norway there are hundreds of municipalities (kommuner) which are subdivisions within counties (fylker). Some of these can be large, especially in the north, but so are the counties in the north. To me "municipalities in" is the natural wording.
- States and provinces in the USA and Canada: In nearly all cases it is unnecessary to add the country name as the names are unambiguous. The only exception I can think of is Georgia, USA. This could also apply to prefectures in Japan and states in India (is there a Punjab in Pakistan?). DonnanZ (talk) 18:52, 14 November 2017 (UTC)
- Yes, there is, like there is in India. Maybe categorisations should be abundant? Cities can belong to Punjab as well as to Punjab, India, and the Crimea is part of administration of both the Russian Federation and the Republic Ukraine at least for some purposes in the Republic Ukraine. We can make the least thing wrong by adding Sheikh Zuweid (presuming it exists) as well to the Islamic State as to the Arab Republic of Egypt, because we do not want to judge morally and formally states and terror organizations are indistinguishable. On the other hand of course we need sufficient data to relate towns to administrative divisions and ISIS presumably does not publish organigrams. Palaestrator verborum (loquier) 19:44, 14 November 2017 (UTC)
This is a newly created (September 2017) topical category. It should be renamed to something that does not imply that it contains expressions that are directive. It contains terms that relate to direction or, more frequently, terms that can be confused with direction. I recognize that Direction would not be a suitable category name. I don't have any suggestion. It may be that the category is ill-conceived. DCDuring (talk)
- I see nothing wrong with it. If it contained directive expressions, it would be called Category:English directives or similar. We have voted in the past to keep topical category naming distinct from other categories, so the naming scheme is considered indicative of its use/meaning/function. —Rua (mew) 20:37, 22 December 2017 (UTC)
- I'm not surprised that you see nothing wrong, what with the cat scheme being otherwise so perfect.
- I favor keeping topical categories as far away as possible from our other entry categories.
- But, unlike other categories that have names that are plural in form, Category:en:Directives contains neither examples nor names of the referents of its category name, ie of directives. It contains a dog's breakfast of terms that the categorizer, User:51.9.55.214, thought to be connected to some sense of the noun(?) directive. One mistake was to pick as name for a concept/category a de-adjectival noun. Probably the name was made plural to avoid confusion with the adjective.
- If you can make sense of the rationale for the membership in the category of ban, bare minimum, beckoning, behest, besaiel, beseeching, bidding, bill, blacklist, blackmail, bloodlust, blueprint, booty call, boundary, boycott, breve, bribe, and bytecode, you, Gunga Din, are a better man than I. I am at a loss to understand the common element among these terms. Is each supposed to be a type of directive? If no one can come up with a better name for the category, or prune membership rationally, or split it into multiple comprehensible categories, or RfDO it, I will RfDO it. DCDuring (talk) 02:43, 23 December 2017 (UTC)
- Bytecode in the sense of compiler directive! Really pushing it a bit. Equinox ◑ 02:49, 23 December 2017 (UTC)
2018 — January
Is {{list helper 2}} an improved version of {{list helper}}? Can all instances of {{list helper}} be converted to {{list helper 2}}? --Per utramque cavernam (talk) 22:33, 3 January 2018 (UTC)
2018 — February
...keeping the redirect. Or is there a sensible distinction between the two that we want to maintain? - -sche (discuss) 18:43, 19 February 2018 (UTC)
- I was hesitant to recreate CAT:English misconstructions, but labelling evolutionary stable strategy as an "eggcorn" seems like a stretch. --Per utramque cavernam (talk) 18:47, 19 February 2018 (UTC)
- Oh wait, that's not what you're suggesting. --Per utramque cavernam (talk) 18:47, 19 February 2018 (UTC)
- I changed the eggcorn template to categorize into the misconstruction category, emptying Category:English eggcorns and Category:Vietnamese eggcorns, although that should be undone if there is some distinction I am missing that it would be good and feasible to maintain. - -sche (discuss) 18:49, 19 February 2018 (UTC)
- Well, I feel that there's a semantic aspect to eggcorns that isn't really present in evolutionary stable strategy, trompe-d'œil or analysises. --Per utramque cavernam (talk) 18:53, 19 February 2018 (UTC)
- True, but that distinction seems a bit fuzzy; e.g., dominate is labelled an eggcorn (because it's homographic to a valid word?) while unfortunant is labelled a misconstruction. And evolutionary in evolutionary stable strategy is also a word. (But I'm not opposed to making a distinction; I'm just pointing out the issues with it, devil's-advocate-style.) - -sche (discuss) 19:25, 19 February 2018 (UTC)
- @-sche: I agree that the distinction is fuzzy (in fact, I'd even say that the distinction between "misconstructed", "nonstandard" and "proscribed" is fuzzy: compare our treatment of developmentation, abortation and pronounciate). Still, I think it's not entirely without merit, although I would be hard pressed to give you a specific set of criteria.
- I wouldn't call dominate an eggcorn, but without any quotation it's hard to judge anyway. In fact, I'm going to RFV it. not necessary: it's used indeed.
- Another thing: I don't like the way idiosyncratic is used in our def of eggcorn. It seems to be used as a synonym of "odd, strange, peculiar, eccentric", but it shouldn't be. --Per utramque cavernam (talk) 20:02, 19 February 2018 (UTC)
- orange is a result of misconstruction of naranga, isn't it? But orange is certainly not nonstandard. (Other cases of loss of juncture are apron, newt, nickname) Though misconstructions may tend to be nonstandard (for all intensive purposes, at least), they can become standard over time, as with many "errors". DCDuring (talk) 20:09, 19 February 2018 (UTC)
- It's specifically a rebracketing/metanalysis, which you could say is a type of misconstruction. However, I certainly wouldn't want to label orange as a misconstruction; that's true diachronically, but not synchronically. I do want to label it as a rebracketing, though. --Per utramque cavernam (talk) 20:21, 19 February 2018 (UTC)
- It's hard to find references rather than intuition to support classifying terms one way or another, but I suppose the difference between developmentation and pronounciate vs unfortunant and dominate is that I think the first two are intentional (jocular) errors and the second two are unintentional. If we keep the categories separate, should "eggcorns" be a subcategory of "misconstructions" or a "sibling category" on the same level (cross-linked)? - -sche (discuss) 20:39, 19 February 2018 (UTC)
- And then there are entries like firstable which only say they're eggcorns in the etymology, not the definition... - -sche (discuss) 21:06, 26 February 2018 (UTC)
April 2020 duplicate discussion: Template:eggcorn of
To Template:misconstruction of.
- Discussion moved from #Template:eggcorn of.
"Eggcorn" is a lovely term for our own amusement, but it is an inside joke that makes Wiktionary more closed to normal users. I believe that a term like misconstruction is more understandable to normal people and includes all eggcorns, mondegreens, etc. DCDuring (talk) 19:52, 9 April 2020 (UTC)
- A Google search for eggcorn brings up Wikipedia for the first entry. A Google search for misconstruction brings up "is misconstruction a real word" and dictionaries. Eggcorn might be slightly whimsical, but misconstruction is not a word used by normal people.--Prosfilaes (talk) 04:59, 22 April 2020 (UTC)
- Keep as is. It links to the entry eggcorn, so users are never more than a click away from comprehension. If you think it's an inside joke, then the in-group is all of linguistics, and we might apply the same logic to eliminating the word illative from our entries — only linguists know what it means, and why should we use the most exact word when a vaguer one might do? —Μετάknowledgediscuss/deeds 18:17, 23 April 2020 (UTC)
- Keep. I don't like dumbing things down to appeal to the broadest population possible. What about the people who want more precise information, or who want to learn whimsical words to describe things? I'm quite happy with us filling a niche that other dictionaries don't fill, since that's why I use Wiktionary in the first place. Besides, the kind of people who aren't interested in expanding their vocabulary tend not to look up words in the dictionary very much anyway. Andrew Sheedy (talk) 16:02, 24 April 2020 (UTC)
- FWIW I proposed something similar earlier / further up the page, #Template:eggcorn_of_into_Template:misconstruction_of. I think the issue is less that the term is opaque, and more that the distinction is fuzzy/questionable, compare my comments above. - -sche (discuss) 17:35, 3 May 2020 (UTC)
Why is this in the singular? It just looks weird in the case of a title like this. (Somewhat irrelevant, extra issue: the page needs a lede to explain what a shortcut is.) PseudoSkull (talk) 05:23, 21 February 2018 (UTC)
- Support on both counts. —Μετάknowledgediscuss/deeds 19:23, 20 March 2018 (UTC)
- Support per nom. - excarnateSojourner (talk|contrib) 03:42, 29 October 2021 (UTC)
- @PseudoSkull There has been a section explaining what shortcuts are this whole time. It's just not right at the top, which might have been done intentionally to make the table of common shortcuts as quickly accessible as possible. - excarnateSojourner (talk | contrib) 06:10, 18 October 2022 (UTC)
2018 — March
This is extremely trivial, not to mention something that could be found even if it were not categorised. I think that it suits an appendix much better, so I propose that its contents be moved to Appendix:English words ending in -gry. —Μετάknowledgediscuss/deeds 03:23, 15 March 2018 (UTC)
- A benefit to having it as a category is that theoretically it ought to be addable by the headword templates examining the pagename (like "English terms spelled with Œ"), which, if implemented (...if it could be implemented without excessive memory costs), would allow it to be kept up to date automatically. - -sche (discuss) 17:16, 15 March 2018 (UTC)
- That is true, but I don't really think we should be using headword templates to collate trivia. —Μετάknowledgediscuss/deeds 17:47, 15 March 2018 (UTC)
- Delete per proponent. --Per utramque cavernam 18:09, 31 May 2018 (UTC)
- Is there something like Category:English lemmas but sorted from the end, like anger, ranger, hunger, angry, hungry? --幽霊四 (talk) 19:40, 6 February 2021 (UTC)
- At http://tools.wmflabs.org/dixtosa/ you can get a list of all entries in any category that end with any string you like. —Mahāgaja · talk 20:58, 6 February 2021 (UTC)
- Support the proposed move per nom. - excarnateSojourner (talk|contrib) 05:00, 29 October 2021 (UTC)
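For readers wondering what "sorted from the end" and "entries that end with any string" would look like in practice, here is a minimal sketch. It operates on a plain list of words and says nothing about how the linked tool or any headword template is actually implemented; both function names are hypothetical.

```python
# Hypothetical helpers for illustration; they work on an in-memory word list only.
def ends_with(words, suffix):
    """Keep only the words that end in the given string, e.g. 'gry'."""
    return [w for w in words if w.endswith(suffix)]


def sorted_from_end(words):
    """Sort words by their reversed spelling, as in a reverse dictionary."""
    return sorted(words, key=lambda w: w[::-1])
```

Applied to the five words mentioned in the thread, `sorted_from_end` yields anger, ranger, hunger, angry, hungry, and `ends_with(..., "gry")` keeps only angry and hungry.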
2018 — April
I would like to request the move of the content of entries like 茨城県 (Ibaraki-ken, literally “Ibaraki prefecture”) to simply 茨城 (Ibaraki, “Ibaraki”), cf. Daijisen. 県 is not an essential part of the name.
(Notifying Eirikr, Wyang, TAKASUGI Shinji, Nibiko, Atitarev, Dine2016, Poketalker, Cnilep, Britannic124, Fumiko Take, Dine2016): —Suzukaze-c◆◆ 03:19, 19 April 2018 (UTC)
- As a counterargument, Shogakukan's 国語大辞典 entry for 茨城 (Ibaraki) has one sense listed as 「いばらきけん(茨城県)」の略 ("Ibaraki-ken" no ryaku, "short for Ibaraki-ken"), and the 茨城 page on the JA Wikipedia is a disambig pointing to 茨城県 as one possible more-specific entry. ‑‑ Eiríkr Útlendi │Tala við mig 03:52, 19 April 2018 (UTC)
- (edit conflict) It seems like a two-word phrase to me. I am not a native speaker, but I think that if someone asked "水戸市は何県?" ((in) What prefecture is Mito?) then "茨城です。" (It's Ibaraki) would be a correct answer. Entries such as 奈良 and 広島 should have both the city and the prefecture. (I see that 奈良 currently does.) Cnilep (talk) 04:01, 19 April 2018 (UTC)
- 茨城県です would also be correct and probably more common. At least 東京 and 東京都 are clearly distinguished. No one in Izu Ōshima would say he/she is from 東京. — TAKASUGI Shinji (talk) 04:04, 19 April 2018 (UTC)
- Yes, 茨城県 is also correct. And if someone asked どこの出身? (Where are you from?) the answer would probably be 奈良県 rather than 奈良, or else expect a follow-up question. But I don't think that is necessarily a matter of word boundaries. Compare Pittsburgh, Pennsylvania and Pittsburgh, Kansas; the fact that it is usually necessary, and always acceptable to specify the latter doesn't mean that Pittsburgh on its own is not a proper noun. By the same token, I think that 茨城 (et alia) is a word. That's the point I had in mind. I will say nothing about what is more common. I don't even have good intuitions about frequency in my native language. Cnilep (talk) 04:54, 19 April 2018 (UTC)
- I fully agree that 茨城 is a term worthy of inclusion. I also think that 茨城県 is a term worthy of inclusion. We have entries for both New York and New York City, and even New York State. Similarly, I think we should have entries for [PREFECTURE NAME], and also for [PREFECTURE NAME]県 and [PREFECTURE NAME]市 and [PREFECTURE NAME]郡, etc., as appropriate. ‑‑ Eiríkr Útlendi │Tala við mig 05:03, 19 April 2018 (UTC)
- I believe New York is a special case because there is both the state and the city. We have Washington State, but we don't have City of Chicago or State of Oregon. —Suzukaze-c◆◆ 18:40, 19 April 2018 (UTC)
- A lot (maybe all?) of the prefecture names minus the 県 (-ken) suffix are polysemous. Listing a few from the north to the south, limiting just to geographical senses, and just in the same regions at that:
- 青森 (Aomori): a prefecture and a city
- 岩手 (Iwate): a prefecture, a city, and a township
- 秋田 (Akita): a prefecture and a city
- 山形 (Yamagata): a prefecture, a city, and a village
- 宮城 (Miyagi): a prefecture, a county, a township, a rural area (ancient Japan), a village, an island, and a mountain
- 福島 (Fukushima): a prefecture, a city, and a township
- 新潟 (Nīgata): a prefecture, a city, a park, and a village
- 栃木 (Tochigi): a prefecture and a city
- 茨城 (Ibaraki): a prefecture, a county, and a township
- Jumping south a bit to touch on Anatoli's example further below:
- 奈良 (Nara): a prefecture, a city, a township, and a village
- I am consequently in support of including both the bare name, and the qualified name(s), much as we already do for similar situations with English terms. ‑‑ Eiríkr Útlendi │Tala við mig 21:35, 19 April 2018 (UTC)
- They are polysemic because most prefectures were named after their capital city during the abolition of the han system. Exceptions include 埼玉 and 沖縄, where cities are named after their prefecture. — TAKASUGI Shinji (talk) 12:23, 23 April 2018 (UTC)
- Generally support. Less duplication is good, and it is not much different from Chinese etc. for which we generally delemmatise, if not completely hard-redirect, these forms. Wyang (talk) 04:49, 19 April 2018 (UTC)
- Support. For a dictionary, I think we don't need to keep entries with both prefecture name and prefecture, despite the usage but it's always helpful to provide usage notes (e.g. normally used with 県: ~県) and usage examples, e.g. 奈良県 (Nara ken, “Nara (prefecture)”). --Anatoli T. (обсудить/вклад) 05:45, 19 April 2018 (UTC)
Same suffix as in быль (bylʹ), убыль (ubylʹ), прибыль (pribylʹ), отрасль (otraslʹ), поросль (poroslʹ). а belongs to the stem. Guldrelokk (talk) 23:27, 20 April 2018 (UTC)
- @Atitarev, Benwing2, Chignon: Please voice an opinion; if you agree, the couple of entries using this suffix need to be modified. —Μετάknowledgediscuss/deeds 01:52, 16 April 2019 (UTC)
- Agreed. The two entries need a change. --Anatoli T. (обсудить/вклад) 01:57, 16 April 2019 (UTC)
- ruwikt: Категория:Русские слова с суффиксом -ль (Category:Russian words suffixed with -ль). --Anatoli T. (обсудить/вклад) 02:08, 16 April 2019 (UTC)
- @Guldrelokk, Benwing2, Chignon: I have modified entries, the category is orphaned, -ль (-lʹ) still needs to be defined. --Anatoli T. (обсудить/вклад) 03:30, 16 April 2019 (UTC)
- @Atitarev, can you please resolve this? —Μετάknowledgediscuss/deeds 07:41, 6 March 2021 (UTC)
2018 — July
After some discussion on Category talk:Baybayin script (that went a bit off-topic), some of the Indian language editors (@Bhagadatta, Msasag and myself) have agreed that this category should be renamed to Category:Eastern Nagari script, the reasons being (1) several languages other than Bengali use this script, and (2) the Bengali alphabet is just a subset of this script and lacks some of the glyphs used by other Bengali-script languages (most prominently Assamese which has a separate r-glyph). I want to make sure that there are no objections to this by editors who were not in the discussion. —AryamanA (मुझसे बात करें • योगदान) 02:06, 20 July 2018 (UTC)
- google:assamese+site:unicode.org —Suzukaze-c◇◇ 02:16, 20 July 2018 (UTC)
@Asm sultan, Dubomanab Kutchkutch (talk) 05:35, 21 July 2018 (UTC)
Support -- Bhagadatta (talk) 08:38, 21 July 2018 (UTC)
The result of the discussion is RFM-moved to Category:Eastern Nagari script. --Sbb1413 (he) (talk • contribs) 10:58, 27 March 2023 (UTC)
Oppose – I had closed the discussion and renamed the category to Category:Eastern Nagari script, only to find out that there's a separate Category:Assamese script. --Sbb1413 (he) (talk • contribs) 11:16, 27 March 2023 (UTC)
The two verb senses are bad IMHO. The first should be at busy oneself, I think, since it is always reflexive AFAIK. The second one doesn't sound right at all -- "He busied her" isn't something I've heard. Is that real at all? 69.255.250.219 02:36, 29 July 2018 (UTC)
- Support the move of verb sense 1 to busy oneself. Send verb sense 2 to RFV. - excarnateSojourner (talk|contrib) 05:46, 29 October 2021 (UTC)
- It's not purely reflexive, so I oppose the move for sense 1. Examples: "I will […] busy him with my affairs till he forgets his own" [2]; "And what has been busying you?" [3]; " […] he busied you with other chores" [4]. Rarer than I thought, since I've heard e.g. "sorry for busying you" in real life, but it's a thing. Sense 2 I'm unfamiliar with. —Al-Muqanna المقنع (talk) 23:51, 2 December 2022 (UTC)
- [[busy oneself]] might be a good hard redirect to the appropriate sense of busy, which would benefit from {{lb|en|usually reflexive}} and corresponding usage examples. DCDuring (talk) 14:46, 3 December 2022 (UTC)
- The sense at [[busy]] should remain, whether or not there is a separate lemma entry for busy oneself. DCDuring (talk) 14:48, 3 December 2022 (UTC)
- Redirecting busy oneself and a label makes sense, agreed. —Al-Muqanna المقنع (talk) 16:10, 3 December 2022 (UTC)
2018 — August
Nahuatl is sometimes treated as a language, and sometimes as a family of languages. Right now, Wiktionary is treating it as both simultaneously, which doesn't make sense. "Nahuatl" should be removed as a language. --Lvovmauro (talk) 11:55, 30 August 2018 (UTC)
- I agree the current arrangement doesn't make sense; it is a relic of very early days on Wiktionary, and has persisted mostly because it's not entirely clear how intelligible the varieties are and hence whether it's better to lump them all into nah, or retire nah and separate everything. But enough varieties are not intelligible that I agree with retiring nah (or perhaps finally converting it to a family code). - -sche (discuss) 20:34, 31 August 2018 (UTC)
- @Lvovmauro: OK, thanks to you and a few other editors, all words with ==Nahuatl== sections have been given more specific headers. However, as many as a thousand translations remain to be dealt with before the code can be made a family code and Category:Nahuatl language moved on over to Category:Nahuan languages. - -sche (discuss) 06:48, 19 September 2018 (UTC)
- @Lvovmauro: Feel free to remove obvious errors / unattested neologisms. If a high proportion of the translations are bad, it might even be reasonable to start presuming they're bad and just removing them, since they already suffer from the problem of using an overbroad code. - -sche (discuss) 00:28, 21 October 2018 (UTC)
- Someone with more time on their hands than me at the moment will need to delete all the subcategories of Category:Nahuatl language, and then the category itself, in preparation for moving 'nah' from the language-code module to the family-code module so the categories won't be recreated by careless misuse of 'nah' in the labels etc of 'nci' entries. - -sche (discuss) 00:24, 21 October 2018 (UTC)
Mecayapan Nahuatl saltillos
A number of Mecayapan Nahuatl words are currently written with U+0027 APOSTROPHE, which is a punctuation mark and not a letter. And a couple are using U+02BC MODIFIER LETTER APOSTROPHE, which is the wrong shape for this language. They should all be written with U+A78C LATIN SMALL LETTER SALTILLO instead.
- a̱'ti → a̱ꞌti
- babasoti' → babasotiꞌ
- cacahua' → cacahuaꞌ
- ca̱la̱' → ca̱la̱ꞌ
- coyo̱' → coyo̱ꞌ
- epaso̱' → epaso̱ꞌ
- hui̱lo̱' → hui̱lo̱ꞌ
- ichca' → ichcaꞌ
- ilhui' → ilhuiꞌ
- ocoʼ → ocoꞌ
- po̱cho̱' → po̱cho̱ꞌ
- sihua̱' → sihua̱ꞌ
- soqui' → soquiꞌ
- ta̱ga' → ta̱gaꞌ
- tepe̱' → tepe̱ꞌ
- ti̱lti' → ti̱ltiꞌ
- toca' → tocaꞌ
- tomaʼ → tomaꞌ
- to̱ca̱' → to̱ca̱ꞌ
- to̱to̱' → to̱to̱ꞌ
- tzi̱ca' → tzi̱caꞌ
- xo̱chi' → xo̱chiꞌ
--Lvovmauro (talk) 09:48, 31 August 2018 (UTC)
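The proposed moves above amount to a one-character substitution per title, which could be sketched roughly as follows. This is only an illustration: `to_saltillo` and `planned_moves` are hypothetical helper names, not part of any existing Wiktionary bot or gadget, and the sketch assumes every U+0027 or U+02BC in a title is meant as a saltillo.

```python
# Hypothetical helper for illustration; not an existing Wiktionary tool.
APOSTROPHE = "\u0027"           # U+0027 APOSTROPHE (punctuation mark)
MODIFIER_APOSTROPHE = "\u02bc"  # U+02BC MODIFIER LETTER APOSTROPHE
SALTILLO = "\ua78c"             # U+A78C LATIN SMALL LETTER SALTILLO

# Translation table mapping both apostrophe-like characters to the saltillo.
_TO_SALTILLO = str.maketrans({APOSTROPHE: SALTILLO, MODIFIER_APOSTROPHE: SALTILLO})


def to_saltillo(title: str) -> str:
    """Return the title with both apostrophe-like characters replaced by U+A78C."""
    return title.translate(_TO_SALTILLO)


def planned_moves(titles):
    """Yield (old, new) pairs only for titles that would actually change."""
    for old in titles:
        new = to_saltillo(old)
        if new != old:
            yield old, new


if __name__ == "__main__":
    # The third title already uses the saltillo, so it produces no move.
    for old, new in planned_moves(["toca'", "oco\u02bc", "cacahua\ua78c"]):
        print(f"{old} -> {new}")
```

Combining marks such as the macron below (U+0331) are left untouched, since only the two apostrophe codepoints are in the table. If the Modifier Letter Apostrophe were chosen as the target instead, only the `SALTILLO` constant and the table would need to change.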
- Or perhaps they should just be moved to use the Modifier Letter Apostrophe, cf WT:RFM#Entries_in_CAT:Taos_lemmas_with_curly_apostrophes, to avoid over-proliferation of different apostrophe-ish letters. I think we should try to be consistent within the Nahuatl languages, at least, in which codepoint we use. - -sche (discuss) 20:26, 31 August 2018 (UTC)
2018 — September
Arawak and Island Carib
Any objections to me renaming Arawak arw (4 entries) and Island Carib crb (0 entries) to Lokono and Kalhiphona, respectively? Arawak is easily confused with the Arawak/Arawakan proto language and family, and Carib is one of two often confounded languages, the Carib language and the Island Carib language. --Victar (talk) 04:03, 6 September 2018 (UTC)
- No objection to renaming Arawak, but I'm not sure about Kalhiphona, which seems to be quite rare even on a Google web search, and which seems to invite as much possible confusion (in its various spellings) with the various spellings of Garifuna as it avoids with other "Carib"s. - -sche (discuss) 06:56, 19 September 2018 (UTC)
Template:superlative attributive of to Template:da-superlative attributive of
Only used for Danish. —Rua (mew) 17:15, 9 September 2018 (UTC)
It’s not about goon but go-on. Most books on Japanese seem to use kan-on and go-on with a hyphen rather than the correctly Romanized kan’on and goon. — TAKASUGI Shinji (talk) 15:42, 22 September 2018 (UTC)
2018 — October
I propose to rename Category:Korean determiners to Category:Korean adnominals, just like Category:Japanese adnominals. Korean gwanhyeongsa are grammatically almost identical to Japanese rentaishi or adnominals, which may or may not be determiners. Gwanhyeongsa are generally divided into three classes: demonstrative gwanhyeongsa, numeral gwanhyeongsa, and qualifying gwanhyeongsa ([5]). The last ones are not determiners. (pinging @Atitarev, Eirikr, Garam, HappyMidnight, KoreanQuoter) — TAKASUGI Shinji (talk) 23:31, 10 October 2018 (UTC)
- Support. --Garam (talk) 08:21, 12 October 2018 (UTC)
- Tentatively Support. Let's check with User:Wyang who was also involved and had an opinion in a related discussion on the group of words ending in 적 (的, jeok). --Anatoli T. (обсудить/вклад) 02:42, 13 October 2018 (UTC)
- I feel determiner is the more common name for this in English; the different definitions of these terms across languages should not be a concern - e.g. we also use adjective differently for Korean. adnominal may be confused with the -eun, -neun, -eul, -deon forms of Korean verbs and adjectives. Wyang (talk) 03:57, 13 October 2018 (UTC)
- @Wyang: The problem is that Category:Korean determiners contains words other than determiners. It will be all right to have both Category:Korean adnominals and Category:Korean determiners without renaming if you want, just like Category:Japanese adnominals and Category:Japanese determiners. — TAKASUGI Shinji (talk) 10:31, 13 October 2018 (UTC)
@Tibidibi, AG202 —Fish bowl (talk) 11:32, 7 February 2022 (UTC)
ichthyosaur vs. ichthyosaurus, and other terms like these.
I'm in a dispute with an editor over the exact meaning and differences between these two terms - are they the same or must we tell apart the order from the genus? Is there a standard to follow? Дрейгорич (talk) 15:55, 27 October 2018 (UTC)
- The standard is making a survey of contemporary and past usages and using that to inform the definitions. DTLHS (talk) 16:15, 27 October 2018 (UTC)
2018 — November
Language request: Old Cahita
Mayo and Yaqui are mutually intelligible and sometimes considered to be a single language called Cahita. But their speakers apparently consider them to be distinct languages, and they have distinct ISO codes (mfy and yaq) and are currently treated distinctly by Wiktionary.
I'm not requesting that they be merged, but separating them is a problem because an important early source, the Arte de la lengua cahita conforme à las reglas de muchos peritos en ella (published 1737 but written earlier) treats them as a single language, and also includes an extinct dialect called Tehueco. I'd like to add words from the Arte but I can't list them specifically as either Mayo or Yaqui.
One solution would be to treat the language of the Arte as a distinct historical language, "Old Cahita", which would then be the ancestor of Mayo and Yaqui. The downside is there only seems to be one linguist currently using this name. --Lvovmauro (talk) 11:32, 4 November 2018 (UTC)
- On linguistic grounds, it seems like we should merge Yaqui and Mayo. Jacqueline Lindenfeld's 1974 Yaqui Syntax says "Yaqui and Mayo are sufficiently similar to be mutually intelligible", the Handbook of Middle American Indians says "the modern known representatives of Cahitan—Yaqui and Mayo—are mutually intelligible", and various more general references say "Yaqui and Mayo are mutually intelligible dialects of the Cahitan language", "The Yaqui and Mayo speak mutually intelligible dialects of Cahita". (There are political considerations behind the split, which a merger might upset, so adding Old Cahita would also work, but we have tended to be lumpers...) - -sche (discuss) 23:03, 18 November 2018 (UTC)
Cleanup suggestions for some badly attested Semitic languages, needing admin action
- Discussion moved from Wiktionary:Grease_pit/2018/November#Cleanup suggestions for some badly attested Semitic languages, needing admin action.
- Pray somebody add |scripts = {"Narb"} to Module:languages/data3/x after line 1026 for xna. (Otherwise mentions of words in it are shown in slanted letters.)
- Added. DTLHS (talk) 03:17, 14 November 2018 (UTC)
- It seems that even MediaWiki:Common.css needs a new class for Narb added, to get font-style: normal; Sarb is there and has it, Narb is not there. If the mention of a North Arabian word in عَنْكَبُوت (ʕankabūt) works then it is complete. Also I see that in Module:scripts/data Narb does not have direction = "rtl" while Sarb has. Fay Freak (talk) 14:43, 15 November 2018 (UTC)
- Good catch. I've updated Common.css and Mobile.css and set it to display rtl. Sadly, it seems there are no fonts that display it. If you or I could find a good image of what the letters are supposed to look like, I might have time to make a basic font iff the letters don't have to be joined the way they do in Arabic. - -sche (discuss) 22:08, 18 November 2018 (UTC)
- I as an Arch Linux user had a great update three weeks ago that adds display support for Old North Arabian, amongst other things which improved Arabic and Syriac script rendering everywhere. gucharmap calls the font “Noto Sans Old North Arabian”, which I find in the filelist of the noto-fonts package. @-sche Fay Freak (talk) 22:29, 18 November 2018 (UTC)
- I think everything under Category:Old North Arabian script languages should be “Ancient North Arabian” (xna), it is to wonder that Dadanitic (sem-dad), Hismaic (sem-his), Safaitic (sem-saf), Taymanitic (sem-tay), Dumaitic (sem-dum), Hasaitic (sem-has), Thamudic (sem-tha) are separate languages on Wiktionary (some also with no script assigned). (Prolly someone went through some lects and added all he found.) Those lects are at a level of attestation or study where it does not even matter whether they are dialects or languages, and “Thamudic” is even a collective term for any of the Ancient North Arabian lects not further classified. Many inscriptions cannot be classified into more specific lects anyway (you know, people also were nomads and wrote graffiti here and there) and they can only be entered as “Ancient North Arabian”. With words being found randomly and in concise consonantal writing I don’t see why one would pursue separation other than by stating the find spot.
- Also, “Qatabanian” (xqt), “Sabaean” (xsa), “Minaean” (inm), “Harami” (xha, redirects to “Minaean” on Wikipedia), Hadrami (xhd) – likewise otiose distinctions, regarding form and amount of attestation of Epigraphic South Arabian, as the name says only epigraphically attested, without any vowels –, have been unpopular in use already, entries and etymologies use the header “Old South Arabian” (sem-srb). I suggest crossing those out. Etymology-only is possible so one can use those in {{cog}} when in an individual case a word is known to be attested as of one of the dialects. North Arabian epigraphy categorization is more complex and it is better anyway to mention in each etymology where a lexeme has been encountered.
- Himyaritic (sem-him), as an attested language, is rather mythical because the Ḥimyarites wrote Sabaean. Wikipedia mentions “three Himyaritic texts”, at the same time in the Encyclopedia of Arabic Language and Linguistics s.v. we read about two: “It is not even possible to establish whether they were written in the same language. The first text dates from around 100 C.E. and the second from around 300 C.E.” And about the secondary material from Early Medieval Arabs: “It is easy to see that quotations from Himyaritic offer very different readings according to the manuscripts.” Or according to others, mentioned in the EALL, Ḥimyarite is the same as Arabic, only with peculiar features (which might as well derive from Arabicized transmission, or later language fusion or whatever, much that could fool us). It could be grouped with those spurious languages if this category held languages from Antiquity.
- Gurage is according to Wolf Leslau, its most eminent scholar, one language with twelve dialects; others share this view. The material for this language, particularly by Leslau across his works, only lists words as “Gurage”, without qualifying if they are “Inor”, “Mesqan” or some other Gurage, so on Wiktionary one cannot simply give “Gurage” words (which has recently been done in Semitic comparisons by abusing the code of the largest dialect Sebat Bet Gurage, in spite of the source saying “Gurage”). The following dialects I find on en.Wiktionary as languages: Kistane/Soddo (gru), Mesqan (mvz), Sebat Bet Gurage (sgw), Silt'e (stv), Inor (ior), Muher (sem-mhr), Mesmes (mys), Chaha (sem-cha), Wolane (wle), Zay (zwa); some of these are considered subdialects of Sebat Bet Gurage. There are more I don’t find on Wiktionary. It’s perhaps like with the Aramaic dialects of yore or the Low German dialects today. People publish Westphalian dictionaries but it’s still Low German and so treated by Wiktionary. I suspect that instead of holding controversial subdivisions deriving from Ethnologue we should, holding to the sources, keep the Wiktionary-language level higher. The source for a certain word can be further qualified by labels as with Coptic. I mean that with language, unlike with biological taxonomy, one cannot simply assume that distinctiveness of a taxon is ascertained by experiments and then authoritatively published in some reference. As the individual forms are described in this dictionary, one must weigh if the data allows distinction at all. Currently it looks to me that hence Gurage must be lumped; I don’t know if, with new data or emerging different literary standards, separating the lects with separate codes will later be convenient (the increase in language material will be disappointing and unlikely someone will come and add Gurage in thousands of entries anyway, let’s be realistic), but I doubt that it would be comfortable.
See also Why is Old Novgorodian a separate language in Wiktionary? This is the question: Is the difference in data enough to justify separation? The actual language-dialect distinction does not matter, it must be seen functionally, for dictionary purposes. And if linguists publish material as “Gurage” the distinction is probably not good for Wiktionary headers. Isn’t it out of scope of Wiktionary to distinguish lect clusters when they are generally unwritten and chiefly written by and variously lumped and split by linguists? That’s a difficult question. Also I fear that such distinctions might be precisely the cause why nobody comes and pours out his rich Gurage knowledge. An adept would not be sure to distinguish, pendulating between two extremes, not witting if he should split as much as he can by all kinds of criteria or if to standardize and to abstract. To help though first all mentioned codes need the Ge'ez and Latin script both assigned, and the macrolanguage created. Maybe there will be late order from early ambiguity. Though I would perhaps do the order by lumping and labelling by location, were I that certain aficionado.
- The obese Wiktionary:List of languages currently comprising 8055 lects needs cuts however. Fay Freak (talk)
- This discussion really belongs at rfm, because that's where we normally discuss changes to whether or how we recognize a language. The Grease pit is for discussing how to implement something along those lines- not whether it should be implemented. The other option would be at the Beer parlour, but this seems like something that would benefit from the more specialized focus of rfm. Chuck Entz (talk) 03:39, 14 November 2018 (UTC)
- Some prior discussion of Thamudic et al is on Category talk:Hismaic language; IIRC they were separated because literature does mention them as distinct entities, but if they were very similar or often treated as one language, and especially if there's difficulty in assigning specific texts to specific ones due to similarity, that would be an argument for reversing that decision and going back to the conservative approach of treating them all as one language with 'dialect'/'region' labels where appropriate.
(As to the venue, yes, these discussions tend to happen on RFM for quirky historical reasons — originally the discussions entailed actually merging or splitting language templates — although some have proposed the Beer Parlour as a more logical venue. There are minor benefits and drawbacks to either venue; this venue does have the advantage that discussions stay on the page until resolved.) - -sche (discuss) 17:20, 14 November 2018 (UTC)
- Who is likely to have access to resources on Africa's Semitic languages that could help judge what to do with Gurage? User:Metaknowledge, User:Wikitiki89? Wikipedia insists "The Gurage languages do not constitute a coherent linguistic grouping", which seems incompatible with merging them. William A. Shack, in his book on The Gurage, writes that "each Gurage dialect is usually understood only by its own speakers, and there is a rough correlation between the contiguity of dialect groups and the extent to which their dialects are mutually intelligible." (Steven Danver, in his (general-focus) encyclopedia, says "the languages of the different groups of Ethiopian Gurage are seldom mutually intelligible.") Marvin Lionel Bender, in his 1976 Language in Ethiopia, says "Although seventeen varieties of Gurage dialects are listed, mutual intelligibility reduces this to four languages and three dialect clusters as follows (Hetzron classification):
Gogot, Misqan, Muxir, Soddo
East Gurage (Inneqor, Silti, Urbareg, Weleni, Zway)
Central West Gurage (Chaha, Gumer, Gura Izha)
Peripheral West Gurage (Ener, Geto, Indegegn, Innemor)"
However, his very next sentence is: "Gogot, Muxir, Soddo comprise a geographical (non-genetic) grouping of non-mutually-intelligible languages known as 'North Gurage'", all of which seems to suggest that merging all of the Gurages would not be sound. - -sche (discuss) 17:28, 14 November 2018 (UTC)
- The cited grouping of course adds to the confusion. Four languages, but three dialect clusters, not mentioning their intersections? Well, we will not find out how one should see them without deep-diving. But the question is which direction Wiktionary should go: likely the current division is not correct. Should Wiktionary just add all possible splits so they can be cleaned up later when someone would commit himself to add the whole Gurage and judge about which distinctions are most convenient or should we have one macro-code because distinction is hopeless? The reason why I have even mentioned Gurage is that for example Leslau’s Etymological Dictionary of Geʿez which I like to use just gives words as “Gurage”, which sounds like there is a common vocabulary. Fay Freak (talk) 14:43, 15 November 2018 (UTC)
- Perhaps you can deduce from Leslau's literature list which Gurage language he gets his data from? He seems to have written an etymological dictionary of Gurage as well, presumably its foreword could clear things up.
- His own field studies. I had linked his Etymological Dictionary of Gurage (“according to Wolf Leslau” etc.). Fay Freak (talk) 15:23, 17 November 2018 (UTC)
- As a volunteer project (run on fancy), we really have no other choice than to wait for someone to investigate the matter deeply and order the languages in a manner that facilitates their lexicographical work.
- Maybe we need non-genetic language group categories and ways to give forms in unindentified languages belonging to language groups. Crom daba (talk) 15:49, 15 November 2018 (UTC)
- @Fay Freak, -sche: A bit late, but here are my responses to the three outstanding problems (your #2–4):
- It is fairly evident that Ancient North Arabian is not a single language, and I advocate that sem-xna be abolished rather than the specific language codes; read Al-Jallad (2018), "What is Ancient North Arabian?". He sees Safaitic (which he has written a grammar of) and Hismaic as being of the same continuum as Old Arabic, but they are obviously too distinct from Classical Arabic for lexicographical purposes. He supports the distinctness of the others as languages, and of the various "Thamudic" lects. Based on Al-Jallad, I would prefer we split Thamudic B, C, D, etc as necessary; each language will have a very small corpus, but it seems like the most honest way to do it, and if more inscriptions are found, the lettered Thamudic wastebaskets will probably get their own names as the others did.
- Old South Arabian is also not a single language, though Sabaean was the standard that the other lects imitated, and I advocated that sem-srb be abolished as well. Multhoff (2019) in The Semitic Languages makes the case for four distinct languages: Sabaean, Minaean, Qatabanian, and Hadrami. She makes no mention, however, of Harami. Macdonald (2000), "Reflections on the linguistic map of pre-Islamic Arabia" explains that "Harami" is a name given to a few Sabaean texts that seem to have been contaminated by other Semitic languages, which is not at all an unusual feature and not unique to that site, so I suggest we remove that code.
- As for Himyaritic, I now think I was wrong to include it. There are three texts often attributed to it, but see Stein (2008), "The ‘Himyaritic’ Language in Pre-Islamic Yemen", which makes a strong argument to consider these as simply very late examples of Sabaean, which is indisputably the language of the other texts of the region in that script.
- Finally, for Gurage, the chief problem is that some scholars follow Hetzron in saying that Gurage is polyphyletic, in which case lumping would be committing a grave error (and the same charge has been levelled for Aramaic, with perhaps more evidence). Meyer (2011) in the International Handbook does seem to support the unity of Gurage, and treats the lects together, which gives me hope for lumping, but he is unwilling to commit to whether they should be considered dialects or languages. I think your Gurage-adding genius is mythical, so we have to choose which is least bad: many languages with scanty coverage, because their forms may be similar to forms entered under a different L2 header; or one Gurage language with decent coverage, but many forms that are not marked for what dialect they belong to and therefore a poor resource. I hesitantly support merger, given those choices. —Μετάknowledgediscuss/deeds 03:13, 10 August 2020 (UTC)
- An addendum: "Hadrami" is a terrible name for xhd, and invites confusion with Hadrami Arabic. Wikipedia uses "Hadramautic", but N-grams and a quick literature review suggests that "Hadramitic" is more common. @Fay Freak, -sche again (yes, I know I'm pestering, but I don't want to move forward on all this alone, both because I am fallible and because some of these, particularly splitting OSA, would require a bit of work, although in that case there is an online corpus that will help immensely). —Μετάknowledgediscuss/deeds 02:40, 17 August 2020 (UTC)
- Re North Arabian: Many works I browsed through speak of Old North Arabian as a unit with dialects, but also carefully specify what lects (including Thamudic B vs C, etc) words are attested in. Some imply, in their presentations, that a large number of words are identical between dialects, at least in the sample of vocabulary that they're treating (e.g., the pronouns treated in Roger D. Woodard, The Ancient Languages of Syria-Palestine and Arabia (2008), pages 197-198), though this seems to be because the authors are presenting 'normal', normalized and romanized forms, given Al-Jallad's evidence that words (even the supposedly distinctive definite article) varied not just among dialects but even within the writings of individual speakers. The native script also loses many possible differences in pronunciation, but then, we are a written, writing-based dictionary. I find slightly more works speaking of "Ancient North Arabian dialects" than "Ancient North Arabian languages", and the fact that some authors have argued the varieties are the same language not only as each other but even as Arabic itself does suggest a high degree of similarity (or that the scholars in question are lumpers). As we're dealing with small, extinct and apparently clearly delineated corpora, it seems like the conservative approach of treating each under its own L2 could be better, and we could retire xna ... unless we need it as a wastebasket for unsorted things, which Al-Jallad (and Fay Freak, above) suggests we would. (Bah, It's messy business, deciding what's a language and what's a dialect...) I will try to dig into the rest later. - -sche (discuss) 04:10, 19 August 2020 (UTC)
Well myself I have added Sabaean, Minaean, Qatabanian entries meanwhile, understanding and quoting a few inscriptions, although apart from some occasional features I noticed little how such an inscription can be classified as either, other than by provenience or rulers or gods mentioned—but that must be due to my blasé comparative approach that also makes me read Romance without recognizing the individual language. So somehow the volition to a merge is gone, though the lumping codes “Old South Arabian” and “Old North Arabian” must be kept for inscriptions no one has classified. Both are useful.
For Himyaritic, however, nothing is left. As here said already, the three alleged Himyaritic inscriptions don’t even need to be in the same language, and they aren’t even from anything to be called Ḥimyar (there are “Lesser Himyarites” and “Greater Himyarites” and the ethnic identity is fragmentary, too, by the way). In the “Critical Reevaluation” of the Ḥimyaritic language – cited by Wikipedia on Himyaritic language one does not know what for: their “undeciphered-k language” header recently introduced is surely a made-up term, oddly suggesting that these inscriptions are yet another language when those “k-language” inscriptions are exactly those otherwise claimed for Himyaritic, so we see Wikipedia editors had no clue and phantasize together languages due to their disdain for primary sources – helpfully includes a map, also coming to the conclusion “we have no reason to assume the existence of some “non-Ṣayhadic” language in pre-Islamic Yemen that was spoken besides the (Late) Sabaic idiom known from the inscriptions.” That from the fact that “Himyaritic” words typically given from Arabic sources are all also found in Sabaic, and the grammar found in the three inscriptions, including the prefixed instead of postfixed article which is only found in two of them, is too either found in Sabaean or can well be ascribed to their being poetry, which is also the reason for their being poorly understood. Many Arabic poems are also hard to understand and mostly helped by the copious material for the language which is not the case for languages with so limited a corpus, like Old South Arabian. Even in the Digests, Latin prose, not all passages are of discoverable meaning.
What would hinder man though to add understood words with quotes from the ominous inscriptions as Sabaean? Or anything from Arabic sources transmitted as Himyaritic instead of Arabic as Sabaean? For there is no evidence for it being a particular language. You see, from the corpus-based standpoint Wiktionary takes Himyaritic must go. Nothing can get the header “Himyaritic”, it can only be mentioned at Sabaean or Old South Arabian entries that Himyaritic nature is suggested by those who have come to believe in this extraordinary claim for which extraordinary evidence is not provided. Fay Freak (talk) 04:18, 6 August 2021 (UTC)
- I went on and moved our only “Ḥimyaritic” entry after that famous sentence to Yemeni Arabic in which the word طَيِّب (ṭayyib) for “gold” turns out otherwise known, and to be nothing else than Classical Arabic طَيِّب (ṭayyib, “good”) meaning “refined” and therefore gold, while Old South Arabian could not have developed such sense, so it is clear the famous quote one has been so inept to classify is at best only macaronic Sabaean-Yemenite Arabic. It is well put by Marijn van Putten:
- The Arab grammarians were interested in describing correct usage of language of Classical Arabic. It is quite clear that Himyaritic (and by extension Yemeni Arabic) did not fall in the category of 'correct usage'. Within this context, it is of course not surprising that anything that is "wrong" and from Yemen might be denoted as Himyaritic. This would then include both varieties of Yemeni Arabic and some surviving vestiges of Ancient South Arabian. Fay Freak (talk) 04:59, 13 September 2021 (UTC)
- Now also in a new article by Koutchoukali like communis opinio, though his blogs transpire by him stalking Wiktionary: later Muslim historians would refer to anything related to South Arabia’s pre-Islamic history as “Himyaritic,” all memory of its other states having passed away. Fay Freak (talk) 01:01, 18 September 2021 (UTC)
Merging Classical Mongolian into Mongolian
"Classical Mongolian" refers to the literary language of Mongolia used from 17th to 19th century created through a language reform associated with increased Buddhist cultural production (this started in the 16th century, but language standardization took place later). In the 20th century, (outer) Mongolia became independent from China and later adopted a Cyrillic orthography based on the spoken language, while Inner Mongolia kept her Uyghur script.
The literary language of Inner Mongolia continues Classical Mongolian in terms of its orthography as well as most of its grammar (to an extent that Janhunen (?) calls the situation bilingual). Modern varieties, in both Outer and Inner Mongolia, have greatly expanded their lexicons through borrowing of modern terms, but they also both consider all of Classical Mongolian lexicon to be a part of their language, and will put it in their dictionaries, even transcribed into Cyrillic.
The actual problem I have with this division is that when it comes to borrowings from (Classical) Mongolian, we sometimes cannot ascertain whether they precede the 20th century or not, or more common still, we know they precede the 19th century (and post-date the 16th), but they obviously come from a spoken variety and not "Classical Mongolian" as a literary language. Crom daba (talk) 17:14, 15 November 2018 (UTC)
- Yes. I find it also strange that Wiktionary distinguishes Ottoman Turkish from Turkish, it’s like distinguishing pre-1918 Russian from “Russian”, or like one reads about “Ottoman Turks” instead of “Turks”. Also Kazakh and the other Turkic languages do not get extra codes for Arabic spelling, this situation is even more comparable, innit. Kazakhs in China write in Arabic script, Mongols in China in Mongolian script, but the languages are two and not four. Or also it sounds as with Pali. Am I correct to assume that Classical Mongolian texts get reedited in Cyrillic script? Then you could base all on Cyrillic and make Mongolian script soft redirects, because even words that died out before the introduction of Cyrillic can be found in Cyrillic. Fay Freak (talk) 15:23, 17 November 2018 (UTC)
- @Fay Freak, the situation is similar to Turkish, but it creates less problems there since the Arabic script Turkish is obsolete and most relevant loans are pre-Republican.
- In principle it could be possible to collapse all of Mongolian into Cyrillic, but this would be extremely politically incorrect.
- Collapsing everything (potentially even Buryat, Daur and Middle Mongolian) into Uyghur script, like we do with Chinese, would perhaps make more sense, but 1) it's a pain to enter 2) Cyrillic is generally more accessible and useful to our users and (Outer) Mongolians 3) most of my materials are in Cyrillic 4) it corresponds poorly to the spoken forms 5) its Unicode encoding corresponds poorly to its actual form 6) the encoding doesn't correspond that well to the spoken form either. Crom daba (talk) 16:50, 18 November 2018 (UTC)
- This is tricky, because as far as language headers and having entries for terms in the language, it seems like we could often resolve which language a word is in(?) by knowing the date of the texts it's attested in. It is, as you say, etymologies where it's hardest to ascertain dates. (Still, if we merged the lects, we could retain an "etymology only" code for borrowings that were clearly from Classical Mongolian, like is done for Classical Persian, etc.) I'm having a hard time finding any references on the mutual intelligibility of the two stages; most references are concerned with the intelligibility or non-intelligibility of modern Khalkha, Kalmyk, etc. If we kept the stages separate, etymologies could always say something like "from Mongolian foo, or a Classical Mongolian forerunner". - -sche (discuss) 22:50, 18 November 2018 (UTC)
- @-sche, yes, the Persian model would be desirable.
- It doesn't make much sense to speak of intelligibility between Classical and Modern Mongolian, Classical Mongolian is exclusively a written language, its spelling reflects the phonology of 13th-century Mongolian (early Middle Mongolian). The same spelling is used in Modern Mongolian as written in Uyghur script.
- The biggest problem with Classical Mongolian is how redundant it is. For any word that is shared between modern and classical periods, and that is probably most of the lexicon, we would need to make two identical entries in Uyghur script for modern and classical Mongolian. Crom daba (talk) 11:18, 19 November 2018 (UTC)
- That seems not unlike how we handle Serbo-Croatian and Hindi-Urdu. — [ זכריה קהת ] Zack. — 14:25, 30 November 2018 (UTC)
- Indeed. The way we handle them sucks. Crom daba (talk) 12:52, 1 December 2018 (UTC)
- Not exactly; Serbo-Croatian and Hindi-Urdu have redundant entries in different scripts on different pages, while I understand Crom daba's point to be that we would need to have redundant ==Mongolian== and ==Classical Mongolian== entries on the same pages for most Mongolian/Uyghur script words, which would be more like having duplicate Bosnian and Croatian entries on the same pages, not our current system. And Serbo-Croats are testier about their language(s) being lumped than speakers of Classical Mongolian... ;) - -sche (discuss) 17:29, 3 December 2018 (UTC)
- OK, does anyone object to the merge? If not, I can try to do it with AutoWikiBrowser later, or Crom or others could start reheadering our small number of Classical Mongolian entries, fixing any wayward translations, etc. For etymologies of terms that are known to derive from Classical Mongolian, we should be able to just move cmg over to Module:etymology languages/data. - -sche (discuss) 17:29, 3 December 2018 (UTC)
- @Crom daba, Fay Freak I made the few ==Classical Mongolian== entries we had into ==Mongolian== entries (labelled "Classical Mongolian" unless there was already a modern Mongolian section on the same page), but many of the categories still need to be deleted, and one needs to check whether anything else is left that would break before "cmg" is moved from being a language code to being an etymology-only code. - -sche (discuss) 02:46, 27 September 2020 (UTC)
- There's no full correspondence between different Mongolian scripts and none of the scripts is totally phonetic. It's not just the spelling, the phonologies are different but sometimes one script represents the true or historical pronunciation and it's not necessarily Cyrillic, which is strange. There are words that only exist in one or the other, which is quite understandable, cf. modern ᠱᠠᠹᠠ (šafa, “sofa”) in Inner Mongolia (from 沙發/沙发 (shāfā)) and софа (sofa, “sofa”) in outer Mongolia (from софа́ (sofá)). I support the merge, though I am curious if classical Mongolian terms are equally representable in Cyrillic and Arabic scripts. In other words, are there terms in classical Mongolian which are different from modern and for which there's no Cyrillic form? I think I saw them.
- Duplication of entries is a waste. You may think I am biased but I think Mongolian should be presented/lemmatised in Cyrillic (Uyghurjin should also be available in all entries where it can be found) - for which resources are much more accessible. (Serbo-Croatian should be lemmatised on the Roman alphabet, on the other hand, let's finish the senseless duplications of entries)
- Also supporting the Ottoman Turkish/Turkish merge. --Anatoli T. (обсудить/вклад) 03:25, 27 September 2020 (UTC)
- @Atitarev In Mongol khelnii ikh tailbar toli we see the term уйгуржин бичиг is described as ‘монгол бичгийн дундад эртний үеийн хэлбэр’ (‘early form of the Mongolian/Khudam script’). Middle Mongolian in uigurjin with its own rules shall not be equated with the later ‘Classical’-Modern script and orthography. I maintain uigurjin (with its specific glyph forms and spelling rules) shall be treated as a term only for Middle Mongolian.
- Similarly I also object to treating Northern Yuan – Qing (‘Classical’) Mongolian and Modern Mongolian-script Mongolian as one literary language standard. In fact orthographic standardisations and modifications make written Modern Mongolian so different from Classical. Personally I’d like to display a historical feature of this language collectively under ‘Classical Mongolian’, as only this term directly interlinks with an Inner Asian historical and linguistic tradition. LibCae (talk) 16:40, 7 May 2021 (UTC)
2018 — December
Renaming agu
We currently call this "Aguacateca", but "Aguacateco" is much more common. (Wikipedia opts for "Awakatek", which is rapidly becoming more common but is probably not there yet — not that we can't be crystal-ballsy if we want to when it comes to names rather than entries.) —Μετάknowledgediscuss/deeds 05:42, 19 December 2018 (UTC)
- You're right that several modern (and a few older) sources seem to use Awakatek. In turn, historically Aguacatec has been used in the titles of many reference works on it, and seems like it may be the most common name (ngrams), although it's also the name of the people-group. (Others: Awakateko, Awaketec, Qa'yol, Kayol, and various spellings of Chalchitec, sometimes considered a distinct lect.) - -sche (discuss) 04:31, 19 August 2020 (UTC)
2019 — January
"comparative adjectives" > "adjective comparative forms"
Apparently there was a recent vote to remove the ambiguity of comparative and superlative categories. What I don't understand is why the name "comparative adjectives" was chosen, which suggests a lemma category, yet it's now being subcategorised under non-lemmas. Lemma subcategories are named "xxx POSs", as can be seen in Module:category tree/poscatboiler/data/lemmas. Non-lemma subcategories are named "POS xxx forms", visible in Module:category tree/poscatboiler/data/non-lemma forms. Therefore, the obvious place for comparative forms of adjectives is the "adjective comparative forms" category we used to have. The new name, although voted on, stands out as an exception among all of our existing categories and is inconsistent. It should therefore either be renamed back to reflect its non-lemma status, or it should be moved back under its original lemma parent category. —Rua (mew) 23:57, 10 January 2019 (UTC)
@Surjection, Erutuon —Rua (mew) 00:09, 11 January 2019 (UTC)
- The vote was here: Wiktionary:Votes/2018-07/Restructure comparative and superlative categories. — Eru·tuon 00:13, 11 January 2019 (UTC)
- Participles are not lemmas yet they are called "(language) participles", so it's not as if the comparatives/superlatives would exactly be exceptions of some kind. They even have their own "participle forms" categories! The former also applies to gerunds. — surjection ⟨?⟩ 09:13, 11 January 2019 (UTC)
- And to make it clear, "adjective/adverb comparative/superlative forms" categories are to be made obsolete as a direct result of the vote. — surjection ⟨?⟩ 09:16, 11 January 2019 (UTC)
- Yes, and that should be undone, because as I said, the name "comparative adjectives" suggests that they are lemmas because of our existing naming scheme. Participles are non-lemmas by virtue of being participles, but adjectives are lemmas, so "comparative adjectives" are also lemmas. Are you implicitly proposing to rename all non-lemma categories to this new scheme, e.g. "dual adjectives", "plural nouns", "possessive nouns", "feminine adjectives"? If the vote is upheld then I will propose this change to make things consistent again. —Rua (mew) 12:00, 11 January 2019 (UTC)
- I certainly would not assume "comparative adjectives" refer to lemmas in any way as much as "participles" don't. If we go back to "adjective comparative forms", what do you suggest for the name of the category with inflected forms of such? And don't just say "put them in 'Adjective forms'", because that at the very least isn't consistent as I stated below. In the old system, there was no consistency at all - inflected forms of comparatives and superlatives went to either the same category as them or Adjective forms without any sort of rule. — surjection ⟨?⟩ 12:17, 11 January 2019 (UTC)
- I would not even categorise inflected forms of comparatives in a special way. They are just adjective forms. I don't even think comparatives should be categorised separately at all, there is no obvious need to do so. The example of possessive forms is perhaps the best parallel, since they have inflection tables of their own in Northern Sami and many other languages. Do you propose renaming them to "possessive nouns" so that there can be a separate "possessive noun forms" category? —Rua (mew) 12:28, 11 January 2019 (UTC)
- If you feel comparatives too don't need a special category, I'm personally fine with bunching all of them under "adjective forms", but that will too need wider consensus to implement. When it comes to those possessive nouns, I would argue comparatives and superlatives are closer to participles than to those possessive forms, which is why I believe they're not a good parallel and should be considered separately. — surjection ⟨?⟩ 12:40, 11 January 2019 (UTC)
- In fact, unlike this new system which has parallels, I'm fairly sure the old system of having "adjective comparative forms" but then the forms of comparatives under "adjective forms" is more of an exception. — surjection ⟨?⟩ 09:32, 11 January 2019 (UTC)
- Not really. We don't have separate non-lemma categories for everything in Module:category tree/poscatboiler/data/lemmas and in fact we don't need to. Under the old system, all comparative forms could be categorised under "adjective comparative forms", so that includes all case forms of comparatives. There was never any need to separately categorise forms of comparatives. In fact I'm generally opposed to subcategorising non-lemmas, so that's why I moved everything in Dutch to just "adjective forms". We don't need a subcategory for every possible type of non-lemma form. However, if we do have them, then they should be named consistently. —Rua (mew) 12:00, 11 January 2019 (UTC)
- We don't have separate non-lemma categories for the reason that many of them are simply not inflectable in and of themselves. Again, participles have separate categories for the main participle and inflected forms of such - why should this not apply to comparative and superlative adjectives? — surjection ⟨?⟩ 12:17, 11 January 2019 (UTC)
- What I get out of your argument is that you think "POS xxx forms" should become "xxx POSs" when the form has its own inflections. But then what about cases like English, where comparatives don't have their own forms and are simply adjective forms? Or cases like Dutch or Swedish, where there are multiple superlative forms but their inflections are shown on the lemma? How is an editor supposed to know what the name of the category for any particular adjective form is, when some of them are named differently from others? —Rua (mew) 12:28, 11 January 2019 (UTC)
- That is indeed my argument for comparatives and superlatives due to their so far horridly inconsistent handling. In the case of English and all other languages, they will only have "comparative adjectives", no "comparative adjective forms", much like English would have "participles" that too aren't lemmas but would not have "participle forms". In cases like Dutch, Swedish and such where comparative/superlative forms are more numerous, those need to be handled on a language by language basis, ideally to choose one of the forms as the most lemma-esque (such as which form dictionaries primarily use to describe the comparative/superlative of an adjective), and if not one can be decided, it is more of a tricky situation (possibly all into "comparative/superlative adjective forms"?). Editors in turn can rely on other existing entries and eventually remember these entries much like the existing ones are, or use language-specific headword templates. Yes, the new system is by no means perfect, but I would argue it is miles better than what we had before. — surjection ⟨?⟩ 12:38, 11 January 2019 (UTC)
- But again, how is an editor of these languages supposed to know that, while adjective forms normally go in "adjective xxx forms", it is somehow different for comparative and superlative forms? You still haven't answered this. Your argument is based on sublemma-ness, but this differs per language, not all languages treat comparatives and superlatives as sublemmas. The categorisation should allow for both treatments depending on the needs of the individual language, not force a particular treatment on all languages. The fact that you think it makes sense for Finnish doesn't mean it makes sense for English. Now we have Category:English comparative adjectives for an adjective form, but Category:English noun plural forms for a noun form. How is that consistent? —Rua (mew) 12:45, 11 January 2019 (UTC)
- I did already answer that question - read the latter part of my previous response. Many a time has an editor checked an existing entry to see how something is formatted, and I doubt there would be a single editor that has never done that. Many of the languages with comparatives and superlatives set up have language-specific headword templates, and many of those too have ACCEL which can too give the correct headword category autom- oh wait, it can't anymore since someone removed that capability. — surjection ⟨?⟩ 12:49, 11 January 2019 (UTC)
- You have not answered the question. An editor cannot, based on the rule that non-lemma categories are named "adjective xxx forms", guess the correct name of the category for comparative forms, whereas they could before. Instead, there is now a single exception that comparatives are named "comparative adjectives". Where are all the other "xxx POSs" categories for non-lemmas? Again, are you proposing that all non-lemmas be renamed to match this new scheme? If not, what justifies this single exception? —Rua (mew) 12:54, 11 January 2019 (UTC)
Which question exactly have I not answered? The question was "how would an editor of these languages know the correct name for the categories?", which I have now answered not less than twice in my two previous responses. Instead, what it seems you are arguing is that the new scheme creates inconsistency in terms of the category names for non-lemma forms. Indeed, if other derivations are shown to be just like participles or comparative/superlatives, I'm happy to agree to move them under a similar scheme as well, but the possessive forms you brought up above are not an example of such. — surjection ⟨?⟩ 12:58, 11 January 2019 (UTC)
Since it seems that this is the new norm for naming categories, I have proposed to rename all existing categories to match the new naming scheme at WT:BP. —Rua (mew) 13:16, 11 January 2019 (UTC)
- @Rua Given the edits you have made to the templates and modules are still in place, are you willing to revert those yourself or are you asserting that you are overriding the consensus established by the vote? — surjection ⟨?⟩ 21:10, 11 January 2019 (UTC)
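For readers following the thread, the two naming patterns under dispute can be sketched roughly as below. This is only an illustration in Python, not the actual Lua in Module:category tree/poscatboiler; the function names `lemma_style_category` and `non_lemma_style_category` are hypothetical, and the only category names taken from the discussion are the ones quoted by the participants.

```python
# Illustrative sketch only: hypothetical helpers, not Wiktionary's real module code.

def lemma_style_category(language: str, qualifier: str, pos_plural: str) -> str:
    """Build a name in the "xxx POSs" pattern, e.g. "comparative adjectives"."""
    return f"{language} {qualifier} {pos_plural}"


def non_lemma_style_category(language: str, pos: str, qualifier: str) -> str:
    """Build a name in the "POS xxx forms" pattern, e.g. "adjective comparative forms"."""
    return f"{language} {pos} {qualifier} forms"


# Name adopted by the vote for comparatives (lemma-style pattern).
new_name = lemma_style_category("English", "comparative", "adjectives")

# Name the same category would have under the non-lemma pattern.
old_name = non_lemma_style_category("English", "adjective", "comparative")

# An existing non-lemma category cited in the thread for contrast.
plural_name = non_lemma_style_category("English", "noun", "plural")

print(new_name)
print(old_name)
print(plural_name)
```

The point of contention is visible in the two helpers: an editor who only knows the second pattern would derive "English adjective comparative forms" and would not arrive at "English comparative adjectives" without knowing about the exception.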
Reconcile Category:#### terms derived from the shape of letters and Category:#### terms making reference to character shapes
See also Category talk:Terms making reference to character shapes by language.
Perhaps they could be merged, or perhaps both could be kept (Japanese: characters; letters?), but the naming should be consistent, at the least. —Suzukaze-c◇◇ 11:08, 20 January 2019 (UTC)
- Merge, perhaps into Category:Terms derived from character shapes by language (a bit shorter, and inclusive of non-letter characters). - excarnateSojourner (talk | contrib) 04:50, 28 April 2022 (UTC)
2019 — February
These should be merged, I think. Per utramque cavernam 12:39, 2 February 2019 (UTC)
- Yes, IMO, into someone's blood runs cold, with hard redirects from both. DCDuring (talk) 15:43, 2 February 2019 (UTC)
- I would support a hard redirect. Imetsia (talk) 23:34, 1 August 2021 (UTC)
- Formerly entitled Category:Taxonomic eponyms
As above. —Rua (mew) 13:35, 2 February 2019 (UTC)
- As with Category:Specific epithets. DCDuring (talk) 15:41, 2 February 2019 (UTC)
- @Benwing2, Rua, DCDuring: I guess there is nothing to move here and this can be solved by an addition to module data so that we can auto-cat after adding {{cln|langcode|taxonomic eponyms}} in entries. I mean, in order to categorize the {{named-after}} stuff more specifically. Fay Freak (talk) 23:45, 7 November 2020 (UTC)
- I think all of these that are entire taxonomic names must be Translingual, by virtue of being taxonomic names. The ones that are specific epithets would have the same language code for the taxonomic eponyms as for the specific epithet. DCDuring (talk) 01:02, 8 November 2020 (UTC)
- @DCDuring: I am not exactly sure what you mean. I mean that “taxonomic eponyms” can be added to the topical data or to the etymological data (Category:Taxonomic names, the supercategory of Category:Taxonomic eponyms, resides in the former for some reason, but I devise the taxonomic eponym categories as motivated by etymological description, so the latter it should be), whereas Category:Taxonomic eponyms cannot because it cannot generally be applied onto all languages (only to Translingual and perhaps Latin words that also are epithets). @Rua mixed up different issues here, the reasoning “as above” is not comprehensible thus. Fay Freak (talk) 12:00, 9 November 2020 (UTC)
- The question then is whether Translingual appears as "Translingual " or "mul:"? I have thought that "specific epithets" is a category having to do with the usage of the term. Thus the categorization should be the result of a label or of a non-gloss definition. DCDuring (talk) 18:57, 9 November 2020 (UTC)
- Since "Translingual" is a junk supercategory, not comparable to our language categories, based on an attribute of the usage of some terms. The category includes CJKV characters, airport codes, other international abbreviations, symbols, and codes, some non-taxonomic scientific terms, and who-knows-what-else, as well as taxonomic names. The effort to act as if every linguistic entity in Wiktionary fits into a relatively well-defined hierarchy of language families, languages, and dialects comes a-cropper with the entities thrown into Translingual, just as the taxonomic naming system has its troubles with hybridisation and trans-taxon gene transfer (eg, from viruses or from the assimilation of prokaryotes into eukaryotes as organelles).
- Specific epithets have a function within taxonomic terms that has nothing whatsoever to do with the fact that taxonomic names are used translingually, but has everything to do with names in the taxonomic/biological "language". 'Specific epithet' is a grammatical role within certain classes of taxonomic names. DCDuring (talk) 22:02, 9 November 2020 (UTC)
- @Rua, DCDuring, Fay Freak: Heads up that I amended Module:category tree/poscatboiler/data/terms by etymology to standardize these categories and so we now have Category:Taxonomic eponyms by language. I realize this makes the deletion discussion a little more confusing, since the main category has changed, so just giving visibility to the subcats Category:Arabic taxonomic eponyms, Category:English taxonomic eponyms, and Category:Translingual taxonomic eponyms and the fact that the main category under discussion was emptied and deleted for being empty. I've put the notice on the new main category and changed this subheading. —Justin (koavf)❤T☮C☺M☯ 15:54, 13 March 2022 (UTC)
- By which of our definitions of eponym is Anna's hummingbird an eponym? DCDuring (talk) 16:05, 13 March 2022 (UTC)
Seems to be inconsistently integrated in so far as the latter in its name contains “verbs” but the former does not contain “noun”, and the latter gets categorized as Category:Lemmas subcategories by language but the former as Category:Terms by etymology subcategories by language. Outside the category structure we have Category:Taos deverbal nouns which nobody has noticed. I have no tendency towards any gestalt so far, and I can’t decide either. Furthermore somebody will have to make a complement {{denominal}} for {{deverbal}} – so far there is only an Arabic-specific {{ar-denominal verb}}. Fay Freak (talk) 18:31, 25 February 2019 (UTC)
- A lot of this is redundant to our suffix derivation categories. In many cases, the suffix used already determines what something is derived from. For example, -ness always forms deadjectival nouns, it can't really be anything else. —Rua (mew) 18:47, 25 February 2019 (UTC)
- Please see Wiktionary:Etymology_scriptorium/2018/May#основать. Per utramque cavernam 19:13, 25 February 2019 (UTC)
- True, for “a lot”, and if you know the deep intricacies of Wiktionary’s category structure.
- Category:Russian deverbals that contains now 53 entries has only entries the etymology of which consists in just removing the verb ending and using the stem. I see we have for this case Category:Russian words suffixed with -∅ – we just need to implement something like Category:Latin words suffixed with -o that is split by purpose of the suffix, Category:Latin words suffixed with -o (denominative), Category:Latin words suffixed with -o (compound verb) and so on, which is bare laudable. Now you only need to tell people, @Rua, how to create this id stuff, for to me it is a secret thus far.
- However, this does not work with nonconcatenative morphology thus far. You may link the previous discussions on those infix categorization matters here, but even if that pattern collecting is solved, the derived terms listed at صَلِيب (ṣalīb, “cross”), for instance, would only be categorized by pattern, and nothing would imply that the terms are denominal. The point I have made about the categorization and naming of these categories also still stands. But I give you green light in any case if you want to replace all those “[language] deverbals” and “[language] denominal verbs” categorizations by suffixation categories of the format “[language] words suffixed with -∅ [deverbal]”, and likewise for any action towards categorizing terms in languages with nonconcatenative morphology, since your idea of uniformity is correct. Fay Freak (talk) 19:49, 25 February 2019 (UTC)
- Nonconcatenative morphology is still an underexplored part of Wiktionary, which is kind of annoying. But quite often, we simply show the concatenative part as the affix, and then leave a usage note saying what other changes occur when this form of derivation is used. For example on Northern Sami -i and -hit. —Rua (mew) 20:40, 25 February 2019 (UTC)
- How to create an affix category with an id: add the id to the definition line in the affix's entry with {{senseid|language code|id}}, add {{affix|language code|affix|id1=id}} (at minimum) to the etymology section of a term that uses the affix, find the resulting red-linked category and create it with {{auto cat}}. — Eru·tuon 20:51, 25 February 2019 (UTC)
- Thanks, this is easier than I imagined, so it takes the category name from {{senseid}}. I thought it is in some background module data. Now where to document it? Add it to the documentation of {{affix}} under |idN=? This is the main or even only use of this parameter in this template, right? Fay Freak (talk) 21:18, 25 February 2019 (UTC)
- It's not that {{senseid}} has any effect on the category name, but that a category with a parenthesis after it, such as Latin words suffixed with -tus (action noun), expects a matching {{senseid}} in the entry for -tus, in this case {{senseid|la|action noun}}, because the link in the category description points to -tus#Latin-action_noun, which is the format of the anchor created by {{senseid}}. The |id= type parameters, including in {{affix}}, generally create a link of that type. In {{affix}}, the parameter also has the effect of changing the category name. Sorry, I am not sure if I am explaining this clearly. — Eru·tuon 22:36, 25 February 2019 (UTC)
- You explain this clearly. I just rolled it up from that side that I need to choose the name in {{senseid}} that I want to have in the category name so later with affix I will categorize in a reasonably named category because in other cases the id can be arbitrary – not that {{senseid}} has an effect on the category name. Fay Freak (talk) 22:53, 25 February 2019 (UTC)
- Our affix system is not sufficient to handle the morphological derivation we have to deal with (unless you want us to introduce lambdas...). Serbo-Croatian hardly has the intricacy of Arabic conjugation, but there are plenty of nouns that are created from verbal roots through apophony, and this needs to be categorized somehow. Crom daba (talk) 17:24, 2 March 2019 (UTC)
- @Crom daba At least for Indo-European, we do have a system for handling combinations of affixation + ablaut, like on *-os (notice the parentheses showing the root grade) and -ος (-os). Our current system totally fails where there is no affix, though, a case which also exists in Indo-European. For example, there are some Indo-European forms of derivation, called "internal derivation", which are built entirely around changing ablaut