Wiktionary:Beer parlour/2021/September

Quicker deletion/removal of uncited RFVs[edit]

Voting on: Reducing the time period for closing/removing/deleting an RFV which is not cited. To be clear, this will apply to RFV-sense as well as non-English RFV nominations.

Rationale: To quickly close RFVs, which clearly don't exist. For example, if a user sends an entry to RFV, which clearly doesn't exist, like Hindi ओगाज़्म (ogāzm)^{[RFV Discussion]}, but cannot be speedied per the rules. In such a case waiting a month should not be required.

As the wording contains the discussion may be closed, this doesn't mean that if there is an ongoing discussion it has to be stopped. The entry can stay in RFV for a longer time if any important discussion is going on.

Disadvantages and solutions: A possible disadvantage of this is that in less time the chances of the entry going unnoticed is higher. This is not a problem if the entry creator keeps looking at the pages created by them in the past through their watchlist and if the editors of the language keep checking the categories to see which entries are in RFV. Even if this doesn't happen, and a valid entry is deleted before a knowledgeable editor could intervene, its undeletion can be requested in the future with valid citations. The entry creator and/or the editor(s) of that language might also be notified at the time of nominating or before closing.

If both options pass, the one with the larger Support:Oppose ratio will be executed.

Schedule:

Vote starts: 06:12, 1 September 2021 (UTC)
Vote ends: 06:12, 21 September 2021 (UTC)
Vote created: —Svārtava² • 06:12, 1 September 2021 (UTC)[reply]

Option 1: 2 weeks[edit]

Support[edit]

Support as proposer —Svārtava² • 13:50, 1 September 2021 (UTC)[reply]
Support. I guess we're saying this is a speedy-delete option, just as we have for RFD, that can be used from time to time when an entry's unattestability is beyond debate. I support it in that very limited capacity, but this option should be used sparsely to avoid the concerns DCDuring poses (and which I share). Imetsia (talk) 15:27, 1 September 2021 (UTC)[reply]

Oppose[edit]

Oppose Even many English terms need time to allow their being cited, especially during months when few are here to do so. To rush the process will lead to further degradation of the quality of citations, which often do not unambiguously support the definitions they supposedly support. DCDuring (talk) 12:34, 1 September 2021 (UTC)[reply]
Oppose Equinox ◑ 15:08, 1 September 2021 (UTC)[reply]
Oppose To my mind, the reasons offered are not sufficient to warrant making such a change and the timing should be left as is. Geographyinitiative (talk) 16:16, 1 September 2021 (UTC)[reply]
Oppose. We already speedy-delete obvious cases, but the most are not that easy. —Μετάknowledge^{discuss/deeds} 16:51, 1 September 2021 (UTC)[reply]
Oppose. As DCDuring says above. I've been trying to work out what stops me slapping RfV on a thousand oik-deletable Thai senses for an enthusiast like svartava2 to then delete. The best hope would be for me to blocked as being disruptive, but I'm not sure I would be. Even the example above seems a bad one. Treating the nukta as optional, I found 6 examples of the word being used in what looked like Hindi to me. I think the word exists, it's just that the examples didn't seem durably archived. --RichardW57 (talk) 17:38, 2 September 2021 (UTC)[reply]
Oppose. There is no need to rush to delete things (assuming that existing speedy-delete rules for patent rubbish are working successfully). (By the way, on a somewhat different point, did someone comment somewhere that RFV'd entries may be deleted simply because no one has properly attempted to cite them? This is more of a concern to me. Not that I am demanding that there should always be someone available and willing to do this work, just that there should be a mechanism to show when reasonable effort/attention has been applied.) Mihia (talk) 22:00, 3 September 2021 (UTC)[reply]
Oppose. The proposer has many virtues, but patience isn't one of them. Not everyone checks in every day, or even every week, and verification can be quite time-consuming. We shouldn't be increasing the risk of deleting real words just because someone finds it painful to go through all the steps and wait for the results. Chuck Entz (talk) 23:19, 3 September 2021 (UTC)[reply]
Oppose Why is this in the form of a vote? --{{victar|talk}} 18:38, 7 September 2021 (UTC)[reply]

Abstain[edit]

Abstain. ·~ dictátor·mundꟾ 19:50, 2 September 2021 (UTC)[reply]

Option 2: 3 weeks[edit]

Support[edit]

Support —Svārtava² • 13:50, 1 September 2021 (UTC)[reply]
Support - seems balanced enough. Rishabhbhat (talk) 13:53, 1 September 2021 (UTC)[reply]
Support. See above. Although at this point, how much time are we really saving with the deletions? Just seven days? Imetsia (talk) 15:27, 1 September 2021 (UTC)[reply]

Oppose[edit]

Oppose See above. DCDuring (talk) 15:15, 1 September 2021 (UTC)[reply]
Oppose To my mind, the reasons offered are not sufficient to warrant making such a change and the timing should be left as is. Geographyinitiative (talk) 16:16, 1 September 2021 (UTC)[reply]
Oppose. We already speedy-delete obvious cases, but the most are not that easy. —Μετάknowledge^{discuss/deeds} 16:51, 1 September 2021 (UTC)[reply]
Oppose. I still haven't got round to writing up the quotation for RfV'd कदाय (kadāya). Maybe I should just mistype the quotation and leave someone else to do the translation. --RichardW57 (talk) 17:38, 2 September 2021 (UTC)[reply]
Oppose. See no need for this marginal change. Mihia (talk) 22:28, 3 September 2021 (UTC)[reply]
Oppose weakly. This wouldn't make much difference and I don't really see a problem with the status quo. Equinox ◑ 22:30, 3 September 2021 (UTC)[reply]
Oppose per my comments above. Chuck Entz (talk) 23:20, 3 September 2021 (UTC)[reply]
Oppose --{{victar|talk}} 18:38, 7 September 2021 (UTC)[reply]

Abstain[edit]

Decision[edit]

Both options failed.

Option 1: 2–8–1.
Option 2: 3–8–0.

Wiktionary:Votes/2020-07/Removing letter entries except Translingual[edit]

I would like to be able to finally launch this vote (it seems long past due), so if anyone wants to give any last feedback on it, now's your perfect chance. Thadh (talk) 11:40, 1 September 2021 (UTC)[reply]

I am very much

Strongly opposed to this proposal just because letters are words just like anything else, though the second option is better. Moving letters to translingual entries rids the dictionary of important information that the letters entries bring such as:

Knowing the alphabetical order in a language
Connecting the letters to their letter names
Seeing the pronunciations of each letter, whether the phonemic pronunciation or the pronunciation of the letter name

Among other issues. Also, it's important to note the languages that don't follow the typical a~z 26-letter order; those languages having their own entries is important because a translingual entry (if it even exists) wouldn't tell me that gb is a letter in Yorùbá, or that ʻ is a letter in Hawaiian, without some serious modification. Would we lose the mutation tables for Welsh rh, the inflection tables for Hungarian dzs, the important etymological & historical information for Jeju & Korean ᆢ (yaw) (let alone all the other Hangul letter entries), the pronunciation information for Thai ค (kɔɔ), and more? Some of these entries do not belong in translingual, and we'd be deleting so much important information. The proposal seems to be partially related to combatting the Lua memory errors that are happening in a, but I do not think that it's the best solution, and seems very much Latin-script-centric without considering the languages that don't use the script. Even moving the entries to the Appendix would cause issues, as trying to fit them on one alphabetical page would be a nightmare for a bunch of languages. I would seriously make sure to talk with editors from other not-as-represented languages before launching such a proposal. AG202 (talk) 14:24, 1 September 2021 (UTC)[reply]

That's why I'm posting it here! ;) Now, the main question is, are most of these informations something we want in a dictionary? You see, from where I'm standing, alphabetical orders are rarely set, and even when they are, it seems more the domain of Wikipedia, rather than Wiktionary. Sure, the etymological information of certain letters may be interesting, but it's not something we can't place under a Translingual section, and pronunciations aren't often useful, because either the language is phonemic (so the pronunciation will be present wherever, including the About: page and Wikipedia) or it isn't (in which case there'll be a number of different pronunciations, so the pronunciation section will be unusable). I want to make clear though that reducing memory usage isn't the goal of the vote, it's a benefit. Thadh (talk) 16:58, 1 September 2021 (UTC)[reply]

Yes I'd say that that's information that we'd want in a multilingual dictionary if we're truly aiming for "all words in all languages". Etymological information would not work in translingual for issues such as Jeju & Korean ᆢ (yaw), as you can see that the history is different for both Jeju & Korean, unless you really want to have multiple etymology sections for translingual. Putting all these entries in translingual sounds like more of a mess than what we have right now, if I'm being completely honest. I'm not sure what you mean by alphabetical orders are rarely set, as most languages do have an alphabetical order. In regards to Wikipedia, it's lacking a bunch of information that's already listed here, and I'd rather not have a hypothetical of deleting important info here, just because it might appear on Wikipedia. It's also just generally interesting to see how letters have changed and developed between languages, such as seeing how c changes completely between languages, giving users those comparisons up front. Regardless though, that still doesn't address the issues brought up with Welsh & Hungarian about inflection & mutation tables which are important enough to be included with those entries. If the Lua Memory Error problem is not the goal of the vote, then what exactly is the goal? It's really hard to see how much else is worth deleting such important information for a bunch of languages on this website. If anything, Rua's proposal was the best, but I'm still very wary of that one as well. AG202 (talk) 17:28, 1 September 2021 (UTC)[reply]

“letters are words just like anything else” … “gb is a letter in Yorùbá”—essentialist delusions, nothing follows from it. And most is beside the point: that “we'd be deleting (so much) important information” would have to be illustrated since it is intended to move information instead of deleting it.

Maybe we should split the vote to do it for Latin letters first and have the option to go back? Just to see how it works? Would it make it more or less Latin-script-centric? 🤷 Fay Freak (talk) 17:32, 1 September 2021 (UTC)[reply]

@AG202 I've created a set of examples to illustrate a possible implementation of option 2: User:Thadh/Translingual/a, User:Thadh/Appendix:List of languages using the letter "a", User:Thadh/Appendix:List of Afar letters. Any history on the alphabet could be placed at the second appendix, but since orthographical history isn't really my domain, I omitted it for Afar. Mutations? Just as simply. How the appendix will look like is mostly up to the editors that make them (i.e. members of the language's community).

The point of this vote is: Why do we have this information clogging up in the mainspace? We could have over 3624 entries of the letter a alone, and that would mean a lot of entries to sift through just to get to the one you want. Thadh (talk) 20:34, 1 September 2021 (UTC)[reply]

It's hard to see how an Appendix like that would work for etymologies, mutations, inflections, & more complex pronunciation systems, without it turning into normal entries for each letter on the Appendix. It just doesn't cleanly line up like that if it's a more complicated letter system. And re: clogging up the namespace, I think there really needs to be a better solution. Mi has the same Lua memory errors and has more languages listed than o & e individually. If you look at a more closely, most of the entries aren't even about the letter (50 letter entries compared to 134 languages with even more etymologies), with for example Scottish Gaelic a having nine etymologies, none of which are about the letter, so regardless we'd still have that issue of Lua errors to fix (the Scottish Gaelic entries should remain obviously, just an example). 一 managed to fix its error issue, so I think more ideas could be brought up before removing letters in general. Also, I wouldn't consider some of the more pertinent information to be clogging up the mainspace. If the goal is to fix clogged entries and lua memory errors and the like, then a more encompassing solution should be found. What's being proposed right now is a damning short-term solution to a much longer-term problem. AG202 (talk) 21:38, 1 September 2021 (UTC)[reply]

@AG202: See User:Thadh/Appendix:List of Welsh letters: There is every possibility to host any number of information in a dedicated appendix without it having to turn into full-fledged entries. So if your main concern is only that information will be lost, I hope you're satisfyingly convinced it won't with an adequately executed option 2. Thadh (talk) 22:43, 1 September 2021 (UTC)[reply]

Thank you, but I’m still wary about it, especially when it comes to non-Latin script-based letters. If we could have more examples in the vote beforehand, possibly, but I’m iffy about voting for a proposal like that first before we know how it’ll look. I’m still more in favor of fixing the long term issue of Lua errors before deleting all letter entries. After all, there are many reasons why the vote was cancelled originally, let alone the ones I’ve mentioned. I do appreciate the dialogue though. AG202 (talk) 22:57, 1 September 2021 (UTC)[reply]

The vote wasn't ever 'cancelled', I just didn't have the time or motivation to find out how to officially start it XD. I wouldn't oppose more discussion on the implementation beforehand, but I'm not optimistic it's going to happen, since it'll require quite a bit of thought and participation from many community members of the concerned languages, while the result of the vote isn't even known yet! Anyway, I'm happy I've cleared the air a little. Thadh (talk) 23:19, 1 September 2021 (UTC)[reply]

Oops sorry the vote originally put by Metaknowledge was cancelled iirc. But yes regardless, I look forward to more discussion and hopefully more editors can get involved! AG202 (talk) 23:29, 1 September 2021 (UTC)[reply]

@Thadh Quick followup, for cases such as English A where there are derived terms, or usage notes such as in Hungarian y, or references as in Latin v, or multiple meanings as in English Y, how would those be best addressed concisely? The more I look at different language's letter entries, the more information I see that would make it hard to concisely put things on a single appendix page without having aforementioned entries or very long pages. AG202 (talk) 01:32, 2 September 2021 (UTC)[reply]

@AG202: I'll go example by example on this:

The derived terms given at A are actually derivatives of the symbol, and the symbol should of course stay; there's no way of knowing "A" means a letter grade.
The usage examples given at y seem a bit wordy to me, but if we want to keep them, it can be given in an appendix under an L2 depicting this issue, or under an L3 within History or something like that.
Y being an upsilon is a good one, but not something we can't host at the translingual section, since I don't think it is language-specific
Finally, v having references can be resolved by either giving these references at the appendix (since it's monolingual, the length of the references shouldn't be an issue, just like on Wikipedia) or just deleted as unimportant, on the community's discretion.

I hope these solutions are okay with you. Thadh (talk) 09:51, 2 September 2021 (UTC)[reply]

@Thadh Apologies for the late response, but those solutions, while I still think more people should be involved from other communities, especially non-Latin-script-based ones, are alright for now. AG202 (talk) 15:16, 13 September 2021 (UTC)[reply]

And then I also wanted to point out the case of archaic/obsolete letters or letters that aren't used in an alphabet and what the case would be for those, example being at ᆞ for Korean. AG202 (talk) 15:18, 13 September 2021 (UTC)[reply]

Those, too, can be depicted in a history section or something like that. Thadh (talk) 15:20, 13 September 2021 (UTC)[reply]

Alright, I've been going through Category:Letters by language and there's quite a bit that still needs to be addressed especially from those communities, such as sign language letters, braille, language-specific morse code, and more, so I really hope that more people get involved. AG202 (talk) 15:24, 13 September 2021 (UTC)[reply]

If your intent is to foster discussion, then don't call genuine concerns delusions, or else I will not engage further. Gb is a letter in Yoruba, if you knew a single thing about the language. Regarding, "would have to be illustrated since it is intended to move information instead of deleting it.", if you look at the actual vote, one of the options is literally "Option 1: Remove all these entries. Update the CFI to not include these.", which would lead to mass deletion, and then the second option could still lead to the loss of information. AG202 (talk) 21:22, 1 September 2021 (UTC)[reply]

I can't imagine option 1 passing, and I think Thadh should just remove it from the vote altogether. —Μετάknowledge^{discuss/deeds} 00:35, 2 September 2021 (UTC)[reply]

Good point. At this point even I'm not in favour of it, so I'll remove it ~~tomorrow~~ right now. Even so, I think the discussions concerning how the appendices would look like should be left to the individual communities and that we shouldn't wait for these discussions to be finished before voting. Thadh (talk) 01:03, 2 September 2021 (UTC)[reply]

So the merits of the project depend on whither we move the content and how we point to it. Of course we would then have to account for the fact that some letters are two letters, so to say. So just this circumstance and that “letters are words” of course did not evoke the supposition that nothing could or should be done.

Of course I did not assume that “remove” could actually pass separately from “move”. In the end editors would still decide to write something to an appendix or somewhere unless it was specifically excluded, which it wasn’t.

The question arises how to move stuff while 1. not causing new module errors by the copy of content 2. not causing too much work either due to restructuring and reformulating 3. not duplicating Wikipedia by kind of writing articles on writing systems. Creating overview indexes linking to subpages housing letter entries? This sounds lame enough.

Of course, moving the Chinese characters would be a lot of work for no obvious benefit. It is hard to imagine how all the information on 箸 or 頭 could be ported somewhither. Are we intending to separate the meanings of the words described from the history of character usage? I can imagine one can heavily disagree on what goes where and better lives on with the module error because there one knows what works.

In the end, if we can only lump or split else, dynamic content fetching is unavoidable. Perhaps we want a “glyph information” tool that only fetches information requestes for a language. With correct TAB key implementation that would be faster to use than A with 200+ sections. It would be like those other Unicode sites but with language-specific information, and information valid for sundry languages.

(Actually, what we want is Lua more memory. Let’s just create more three-letter entries for obscure languages to show by way of more module errors that the issue is pressing. Tiffs about letters won’t help then. I mean man, instead of creating wahtuh for every language, pick out them short words. Would be so epic you already regret the letter topic. Headline “Wikimedia does not afford enough RAM for the simples!”) Fay Freak (talk) 02:03, 2 September 2021 (UTC)[reply]

@Fay Freak: before this goes outta hand, I want to stress: CJKV CHARACTERS ARE NOT PART OF THE VOTE (just like any hieroglyph). They don't have the header "letter" for the simple reason they aren't one. Now, concerning how the appendices would look like, see my examples posted above. Thadh (talk) 09:51, 2 September 2021 (UTC)[reply]

I note that for English, 'a' includes entries as both a letter ('The first letter of the English alphabet, written in the Latin script') and a noun ('The name of the Latin script letter A/a'). Are both of these to be moved? Basque does much the same, and I didn't look beyond that. Entries like English cee, and Pali ra and rassa escape the removal from mainspace as worded because they are currently recorded as nouns. How great is the risk that they will be reclassified as 'letters' or even 'letter forms'? I very much want a simple search for 'rassa' to find the entry for the case form of a name of a letter. --RichardW57m (talk) 12:17, 2 September 2021 (UTC)[reply]

@RichardW57m: See the discussion page of the vote. These nouns are not part of this vote. Thadh (talk) 13:03, 2 September 2021 (UTC)[reply]

@Thadh: Does "L3-header" include "L4-header" when parts of speech are demoted to level 4 by the inclusion of a grouping L3-header such as "===Etymology 1===="? An example of this is a. Should entries with noun headwords be split from letter 'L3-headers'? Same example. --RichardW57 (talk) 14:33, 3 October 2021 (UTC)[reply]

That's why it's L3 and/or the headword template. I think it's pretty straightforward what the vote's objective is (delete all non-translingual letter entries), we don't need to set everything in stone. Thadh (talk) 15:51, 3 October 2021 (UTC)[reply]

Huh? This one is L4 and {{head|LANG|noun}}, so you don't catch it at all! --RichardW57 (talk) 20:47, 3 October 2021 (UTC)[reply]

The noun isn't part of the vote. The letter has the header "letter". Thadh (talk) 21:01, 3 October 2021 (UTC)[reply]

The nesting of headers and calls of {{head}} is: a, Welsh, Etymology 1, Letter, {{cy-noun}}. There is only that one call of {{head}} or equivalents. As I understand you, you're saying we remove nothing of the letter, because it is all formally serving the the noun entry. I think the item has to be split into letter and noun. --08:05, 4 October 2021 (UTC)

@RichardW57: Nouns have to obide by the nouns' Criteria for inclusion, so at least one (LDL) or three (WDL) use(s) of the letter as a noun denoting the letter, which is not the case with letter entries (since you could use any noun using the letter as a verification). Thadh (talk) 12:35, 4 October 2021 (UTC)[reply]

@Thadh:: That's not an issue in this case, as the letter and noun senses are already separate senses. Demonstrating letters is not as easy as you suggest - the surname Lhuyd doesn't make 'Lh', let alone 'uy', a modern Welsh letter. RichardW57m (talk) 14:05, 4 October 2021 (UTC)[reply]

Pinging the Korean workgroup since this would involve the deletion and move of Hangul entries such as ㄱ (g), ᆞ (aw), ㆄ (f), and more, and KSL entries like 𝠀𝪜, and I'm still wary of putting everything cleanly in an Appendix. (Notifying TAKASUGI Shinji, Atitarev, HappyMidnight, Tibidibi, B2V22BHARAT, Quadmix77, Kaepoong): AG202 (talk) 15:31, 13 September 2021 (UTC)[reply]

The first of them looks as though it should actually be translingual rather than Korean! Sign language may be messier. --RichardW57m (talk) 16:43, 13 September 2021 (UTC)[reply]

To be honest, sign languages is a whole foreign word for me, so I have no idea what the practices are, what a good solution is for them and whether their letters are at all comparable to those of written languages. Thadh (talk) 16:55, 13 September 2021 (UTC)[reply]

We should probably have people from sign languages weigh in and/or exclude them from the policy until they do. @RichardW57m I disagree as a lot of the information there is specific to Korean and may not apply to other languages that have used the script & letter such as Cia-Cia. AG202 (talk) 19:19, 13 September 2021 (UTC)[reply]

New Android app based on Wiktionary - Vedaist[edit]

Hello everyone. I've just released a new wiktionary based english dictionary app for Android users called Vedaist. For those interested, give it a try at the Play store.

The app has a minimal interface compared to wiktionary and IMHO is a better experience on a mobile browser. Currently the app has around 750,000 words with meanings and images where possible. There are also no ads in the app. I did release an iOS version in late June and this is the second platform. The android version currently lacks features like setting personal goals, but those will be added soon.

Thanks again for building wiktionary. If there is any feedback for me, please reach out.

Toucanvs (talk) 05:23, 2 September 2021 (UTC)[reply]

Hey @Kiril kovachev. Wanted to cc you in case you are interested in the android version. Toucanvs (talk) 05:24, 2 September 2021 (UTC)[reply]

You mention Vedaist is powered by Wikipedia sites and sadly not Vedaist is powered by Wiktionary. I guess it's because few people in the real world give a shit about Wiktionary. Want to correct this? TVdinnerless (talk) 00:37, 3 September 2021 (UTC)[reply]

Modifying the page WT:WDL[edit]

I had proposed a cleanup of the arrangement of the languages at Wiktionary talk:Criteria for inclusion/Well documented languages#Request for cleaning up some while ago. Someone might want to edit the page accordingly or continue the discussion. ·~ dictátor·mundꟾ 19:32, 2 September 2021 (UTC)[reply]

@Kutchkutch: Hi. I think no one cares about making minor changes to the list; you can go ahead and change it. ·~ dictátor·mundꟾ 10:45, 8 September 2021 (UTC)[reply]

Treatment of Early Modern Korean?[edit]

(Notifying TAKASUGI Shinji, Atitarev, HappyMidnight, Tibidibi, B2V22BHARAT, Quadmix77, Kaepoong): @LoutK, Mujjingun

What is the best way to deal with the lemmatization of Early Modern Korean? There are some unresolved questions regarding the status quo merger of EMK and Contemporary Korean:

Should Sino-Korean words attested only in EMK be lemmatized in their modern readings, or in the Hangul form of the time?

For instance, 『이언언해/易言言解』 has 긔긔션 (guiguisyeon), which is obviously 기기 선 (機器船, gigiseon). Should this be lemmatized as 긔긔션 (guiguisyeon), or as 기기선 (gigiseon) with 긔긔션 (guiguisyeon) being a soft redirect?

Alternatively, should EMK words attested only in hanja form be lemmatized at their EMK readings at the time, or in the modern reading?

I currently favor using modern Sino-Korean readings for all EMK words.

Should non-SK words with obsolete orthographies be modernized, even if the modernized spelling is not attested, for consistency with modern words?

For example, 졀ᄯᅡ빗 (Yale: cyelstapis) in 『한청문감/漢淸文鑑』 would be 절따빛 (jeolttabit) today, but only 절따말 (jeolttamal) is found in dictionaries of contemporary Korean. Should the lemmatization be consistent with 절따말 (jeolttamal), or faithful to the EMK spelling?

I note that Oxford English Dictionary would artifically modernize Middle English spellings in cases corresponding to this.

Should EMK words whose regular reflexes are now dialectal be redirected to modern dialectal forms, or the standard forms? For example, should 짐츼 (jimchui) be an Early Modern form of 김치 (gimchi) or of 짐치 (jimchi)?

A definitive solution to these issues would be to spin off Early Modern Korean as another L2. The big downside is that this would lead to immense duplication of content across Korean and EMK. Or, more realistically, the Korean entries will have all the relevant information and the EMK entry will have a neglected single-word gloss. We can already see this by comparing French and Middle French entries, e.g. Middle French faire is in quite a pitiful state compared to French faire. Polysemous or tricky EMK words like ᄒᆞ다 might be much better served by just being soft redirects to the comprehensive entry at 하다 (hada).

What should be done?--Tibidibi (talk) 15:34, 3 September 2021 (UTC)[reply]

Edit: in conventional monolingual sources, EMK is not treated as part of the contemporary language.--Tibidibi (talk) 16:01, 3 September 2021 (UTC)[reply]

I'm in

Support of splitting EMK. Since it's said that even educated natives not trained in EMK struggle with the language, it should be split, let alone the high amount of obsolete characters, Hanja, & more used in the language. Also, it'll be easier with connecting etymologies since Modern Korean or even dialectal words that derived from EMK forms will be able to point to a specific entry under the EMK header. EMK words attested only in Hanja form should be lemmatized at the EMK readings imho as that's what was attested, and we shouldn't be putting lemmas at unattested or modernized forms (unlike what Oxford does). AG202 (talk) 23:40, 3 September 2021 (UTC)[reply]

I think Early Modern Korean should be treated as part of "Korean" and there should be redirects to the entry for the Modern 표준어 form whenever possible. If there is no corresponding modern reflex, then one of the variant forms should be chosen arbitrarily and all the variant forms should redirect to that. Also, the original orthography should be preserved, and Hanja terms should not be transcribed into Hangul. English Wiktionary seems to do something similar, redirecting the archaic spelling of speake from Early Modern English to the modern standard form speak.--Mujjingun

@Mujjingun I disagree that "Hanja terms should not be transcribed into Hangul". I'm not sure to what extent you mean this, but while the majority of EMK texts are written in pure Hangul, there are also ones that use Hanja. If I'm understanding you correctly, you mean that terms attested only in the latter should be lemmatized at their Hanja form, while terms attested in the former should be lemmatized at Hangul. Variably lemmatizing according only to the source in which they are found will severely inconvenience the reader, especially when this is the result of inconsistent orthographic practice, not some ulterior logic behind it.

Regarding the original orthography for words with no modern reflex, conventional sources like the 우리말샘 dictionary do not have the problem of consistency because EMK is not actually treated as part of Korean proper; they are treated as 옛말 (yenmal) and do not have full definitions, all being soft redirects. But if we consider EMK to be akin to one of the modern dialects, I do suppose that preserving the original orthography is valuable. For example, it would be dumb to redirect Yukjin Korean 아심탢다 (asimtaenta) to the theoretical cognate *아심찮다 (*asimchanta).

I still believe that all (obvious) EMK Sino-Korean terms should be lemmatized at their present-day readings, not at the actual EMK forms. SK words are particularly likely to be rewritten in modernized form. When works like the 『청구영언/靑丘永言』, which are written in mixed script, are republished, the Sino-Korean words are given with modernized readings. Extending from this principle, 기기선 (gigiseon) would be preferred to 긔긔션 (guiguisyeon).--Tibidibi (talk) 02:06, 4 September 2021 (UTC)[reply]

Etruscan topic[edit]

Is it possible to add to the words of the topics categories the word "Gentes" for the notable families in the Etruscan culture?--BandiniRaffaele2 (talk) 15:46, 3 September 2021 (UTC)[reply]

I need this for my Category:ett:Gentes.--BandiniRaffaele2 (talk) 06:10, 4 September 2021 (UTC)[reply]

We have Category:Latin nomina gentilia; presumably the Etruscan category should follow the same naming scheme? —Μετάknowledge^{discuss/deeds} 07:15, 4 September 2021 (UTC)[reply]

@Metaknowledge: I think yes.--BandiniRaffaele2 (talk) 09:01, 4 September 2021 (UTC)[reply]

Hard redirect: ꝛ and ſ[edit]

Words spelt with the r rotunda and the long s generally get redirected to the normal spelling with r and s, but not always. For example, ‘noꝛ’ does not get redirected to ‘nor’, or ‘Iſrael’ to ’Israel’. Any explanation for this? ·~ dictátor·mundꟾ 16:24, 3 September 2021 (UTC)[reply]

The automatic redirection only works if there is a unique page with the r or s. noꝛ doesn't automatically redirect to nor, because the software doesn't know whether you want nor, Nor, NOR, or what. But noꝛꝛ does automatically redirect to norr, because there is only one page name with those letters. —Mahāgaja · talk 17:39, 3 September 2021 (UTC)[reply]

Oh, that makes sense. But, ideally, a lowercase word should redirect to only the corresponding lowercase word; or in case of ‘Iſrael’, the software should only consider the letter I sans diacritics. I personally think that would be better, though there’s not much advantage of that otherwise. ·~ dictátor·mundꟾ 17:57, 3 September 2021 (UTC)[reply]

This is why I think we should use actual #REDIRECT pages instead of relying on the software to redirect for us. —Mahāgaja · talk 18:05, 3 September 2021 (UTC)[reply]

@Mahagaja: Are we actually allowed to redirect such spellings (of course, if they are attested, as in the KJV, etc.)? If it is noncontroversial, then I myself am willing to do that. ·~ dictátor·mundꟾ 10:23, 5 September 2021 (UTC)[reply]

I am not hugely enthusiastic about obscure- or obsolete-character versions of ordinary words redirecting to the normal-character versions with no explanation. Generally speaking, although I create them myself, I dislike automatic redirects altogether. The reader is thrown to a different entry to what they typed with usually no indication of why, and only a very missable indication that it has happened at all. Mihia (talk) 21:09, 5 September 2021 (UTC)[reply]

We have a template {{obsolete typography of}}. For example, English haue is defined as “obsolete typography of have”. I think we can likewise define againſt as “obsolete typography of against”, and so on and so forth. If someone is willing to go through an heroic effort of creating redirects to normalized spellings, with not too much extra effort they can use this method. It is IMO a bit awkward though when the same obsolete typographic form applies to a word with several valid part-of-speech assignments, an issue also present for misspellings; why is acount not listed as a verb so as to acount for such occurrences as found here? --Lambiam 11:42, 6 September 2021 (UTC)[reply]

@Lambiam: Should we then create entries for words spelt with ꝛ and ſ, using that template? ·~ dictátor·mundꟾ 13:37, 7 September 2021 (UTC)[reply]

My opinion does not carry more weight than that of others, but for the terms I can think of that appears (to me) the best currently available option. It has the additional advantage that it does not stand in the way of regular entries that happen to have the same spelling, such as German haue. --Lambiam 15:13, 7 September 2021 (UTC)[reply]

Another issue to consider on this subject: long ſ is not always equivalent to short s. In old Serbo-Croatian texts one common orthography kept them totally distinct, using ſ for /z/ and s for /s/, and similarly ſc for /ʒ/ and sc for /ʃ/. This also suggests manually-created pages are a better option than automatic software redirects, at least as far as long ſ is concerned. — Vorziblix (talk · contribs) 09:21, 10 September 2021 (UTC)[reply]

What's the name of that orthography? When was it used? The w:Long s link helps a bit more than a link to one random work, as it does point out several minor, historical orthographies that used it.

YILDIZ redirects to Yildiz, not yıldız. Given that Turkish is spoken by 80 million people and uses that spelling today in standard Turkish, I think that's a bigger problem. In either case, but especially in those obscure long-s-using orthographies, I think reflecting the needs of English speakers is more important. If you do have to create an entry with ſ in it, it won't automatically redirect away from it. To create manual redirects for every word used in every European language until about 1800 is crazy levels of work. Looking at the Unix words list, about one-third of English words have an s not in final position, with it alone listing 24000 words that would need redirects manually created.--Prosfilaes (talk) 09:13, 17 September 2021 (UTC)[reply]

Past participles - lemmas or not[edit]

I was wondering whether Macedonian adjectival past participles, which can modify nouns attributively and decline as adjectives, should be listed as adjective lemmas or non-lemma forms of verbs. On the whole, they have the same properties as English participles, which are adjectival in contexts such as "a shattered vase". I see that for shattered, there is an adjective section, but the treatment of such participles seems to be inconsistent across languages:

Russian покрашенный (pokrašennyj) - only non-lemma
Italian dipinto - adjective and non-lemma
Spanish pintado - only non-lemma (the noun lemma is immaterial to the discussion)
Romanian vopsit - only adjective (common practice in Bogdan's entries based on what I've seen so far)
French peint - only non-lemma
German gemalt - only non-lemma
Dutch gemaald - only non-lemma
Hungarian festett - adjective and non-lemma
Bulgarian искан (iskan) - only non-lemma (closest relative of Macedonian)

In all these languages, the form in question are both verbal (e.g. used in compound tenses, except the Hungarian form, which is a simple tense when verbal) and adjectival (i.e. used to modify nouns and declined as plain adjectives like "happy"). Are the differences in treatment due to compliance with different lexicographical traditions in the countries where the languages are spoken? Why are there inconsistencies even among English entries, e.g. repaired has no adjective section although shattered does? Martin123xyz (talk) 09:53, 6 September 2021 (UTC)[reply]

Dutch malen has the more common past participle gemalen, which will always be the form used adjectively (gemalen koffiebonen, not ^✽gemaalde koffiebonen). This is also listed only as a non-lemma. --Lambiam 11:57, 6 September 2021 (UTC)[reply]

Thank you for the clarification Martin123xyz (talk) 12:41, 6 September 2021 (UTC)[reply]

When the term has an adjectival meaning not fully explained by the semantics of the verb whose past participle it is, it should IMO definitely have an adjectival entry. For example, we now list German gewichst only as the past participle of wichsen, but that cannot explain the sense “clever, cunning”.^[1] One test of adjectivality is whether the term can be the complement of the usual copula and can be graded with adverbs like very, or with a comparative and superlative. “The plane has just taken off” is fine; ^✽“The plane is taken off” and ^✽“This plane is very taken off” are not possible. But “He has been very depressed for a long time” makes a perfect sentence. --Lambiam 12:19, 6 September 2021 (UTC)[reply]

I agree that participles should be treated as adjectives when they have some additional meaning that cannot be predicted from the verb. However, the other tests are not so reliable. In Macedonian, "the plane is taken off" is grammatical, and the same goes for many intransitive verbs indicating a change of state. These constructions could arguably be treated as perfect tenses with the copula as an auxiliary of the "Ich bin gekommen" type. However, one could also say "a taken-off airplane" in Macedonian, where the participle is clearly not part of a tense. As for the possibility of using "very", it seems to correlate with the possibility of quantifying the underlying verb. We can say "he saddened him greatly" but not "the plane took off greatly" because taking-off is construed as binary, whereas emotional changes are construed as gradual, regardless of whether a verb or a participle is involved. A more reliable test is whether we can add an explicit agent in a passive construction. "Mary was depressed" cannot normally be expanded into something like "Mary was depressed by John", whereas "Mary was killed" is much more easily expanded into "Mary was killed by John". This naturally poses problems for participles derived from intransitive verbs. Martin123xyz (talk) 12:41, 6 September 2021 (UTC)[reply]

But "Mary was depressed by the death of her brother" is quite natural - 'depress' does not always have a personal agent. --RichardW57m (talk) 16:16, 6 September 2021 (UTC)[reply]

It can also be a matter of convenience. In Pali, the tendency of past participles to have meanings beyond that of the verb plus participles, of which there are several, needing a 48-cell declension table prompted me to treat them as lemmas. --RichardW57m (talk) 16:16, 6 September 2021 (UTC)[reply]

In Arabic there should not be “participle” headers, because there aren’t undeclined participles, the participles aren’t used for periphrastic tempora. Hence they have the headers of adjectives but the definition line is “active participle of …” or “passive participle of …”, in so far as there aren’t additional meaning. These aren’t even present in مولد where only the etymology sections tell that they are active and passive participles. Fay Freak (talk) 16:25, 6 September 2021 (UTC)[reply]

A preliminary conclusion may be that the best practice will differ across languages. If all past participles of some language can practically always be used as adjectives, listing them separately as adjectives is pointless. Compare how German and Turkish adjectives are usually not also listed separately as adverbs (as seen in “es hat gut geschmeckt”), and English adjectives that can apply to people not also as (collective) nouns (as seen in “the unaware may fall for this scam”). --Lambiam 09:01, 7 September 2021 (UTC)[reply]

A participle is of course derived from a verb, but most often not "merely" a derivative of the verb, even without any semantic extension or shift from the verb's original sense. It's possible, but tedious, to explain the word silenced in a sentence such as "You speak for the silenced." purely as a derivation from the verb to silence. Here it's more pragmatic to treat it as an adjective that is used nominally, which is also the conventional treatment in grammar books, I think.

I suspect the inconsistency with English terms is mostly about whether a usage is widespread. I feel there is something less "common" in expressions such as "a repaired car" as compared to ones like "a disgraced politician", although we can surely find examples like "sequencing of repaired DNA damage regions" in technical writing. --Frigoris (talk) 08:31, 18 September 2021 (UTC)[reply]

Using syn template for alternate plurals[edit]

I have inserted {{syn}} at some entries where a word has multiple plurals. E.g. cactus with cacti, cactuses, and cactusses. I didn't want to proceed with this without getting feedback from others: does this seem like a good idea? If not, is there a better way to say, "Sometimes, cactus is pluralized as cacti but sometimes it's cactuses or cactusses"? —Justin (koavf)❤T☮C☺M☯ 18:10, 6 September 2021 (UTC)[reply]

Another way that is used is to list them is as “alternative forms”, as seen e.g. at formulae. I’ve not been able to think of a reason why the use of {{syn}} for this purpose should be ill-advised; these alternative plurals are indeed synonyms in the strict sense of the word. If I had to devise a term for such alternative plurals, I’d suggest the neologism synenic, from συν- (sun-, “same”) +‎ ἑνικός (henikós, “singular”). --Lambiam 09:22, 7 September 2021 (UTC)[reply]

I still know not why we don’t the same template for “alternative forms” as we have for synonyms. Fay Freak (talk) 11:54, 7 September 2021 (UTC)[reply]

Perhaps because you did not create it? Or do we need a Wiktionary:Requested templates page? --Lambiam 16:15, 10 September 2021 (UTC)[reply]

I think syn is definitely wrong for alternate word forms. I think it's better to put them in the pos declaration, if the language your working has accelerators for multiple plurals, such as the English seraph uses {{en-noun|s|seraphim|seraphims}} or Spanish ananás uses {{es-noun|m|ananás|pl2=+}} JeffDoozan (talk) 15:50, 13 September 2021 (UTC)[reply]

Template for original research in reconstructed entries[edit]

I have created {{original research}} to place at the bottom of the reconstruction entries that result from original research. There is consensus in favour of keeping such entries, but it remains unclear to our readers that some of our reconstructed entries are copied from referenced sources, whereas others are novel content produced by Wiktionarians (and whose references may support individual forms or sound changes). Note: this template existed in 2013 with unnecessarily aggressive wording, and was deleted summarily by Rua. —Μετάknowledge^{discuss/deeds} 21:28, 6 September 2021 (UTC)[reply]

This is the warning I have imagined for a few places. Shouldn’t it be, though you aimed at continuing the tradition of a historical title, called {{original reconstruction}}? Regarding namespace conflicts, its title looks like a warning of broader application that could go into mainspace, while being restricted to the reconstruction space in the beginning. Unless of course you relegate the current text to a parameter, so we can use other parameters for other texts (some raw examples: |1=newetymon: The etymon of this entry is original to Wiktionary and has not been proposed previously. |2=neworganism The identification of the organism referred to by this vernacular name is original research.) Fay Freak (talk) 22:07, 6 September 2021 (UTC)[reply]

I imagine this as being restricted to reconstructions. The reason is that all of Wiktionary incorporates original research to a degree; our definitions are based on quotations that we find, and tweaked to fit what we observe. Labelling all of that would be absurd, and unnecessary. Reconstructions are unique because the word itself does not exist, and this template is intended to indicate that not only the details but the headword itself is our work. —Μετάknowledge^{discuss/deeds} 00:29, 7 September 2021 (UTC)[reply]

I'd be all for restricting using this template to situations where the reconstructed headword is something Wiktionarians concocted themselves rather than derived from a reference. A rephrasing of the template for this end may be in order. — Ceso femmuin mbolgaig mbung, mellohi! (投稿) 01:32, 7 September 2021 (UTC)[reply]

That's what I intended, to be honest. How can I rephrase it? Would that eliminate the need to put it on PWG entries? —Μετάknowledge^{discuss/deeds} 01:39, 7 September 2021 (UTC)[reply]

I find this template both silly and unnecessary. @Mellohi! is starting to add it to West Germanic entries but 90% of PWG are reconstructions based on Proto-Germanic reconstructions and aren't directly attestable. --{{victar|talk}} 01:04, 7 September 2021 (UTC)[reply]

(Notifying Rua, Wikitiki89, Benwing2, Mnemosientje, The Editor's Apprentice, Hazarasp): Pinging other Germanic editors and also @Leasnam, Kwékwlos to this discussion. — Ceso femmuin mbolgaig mbung, mellohi! (投稿) 01:32, 7 September 2021 (UTC)[reply]

Though this isn't a issue I feel strongly about, I can't say I'm fond of this idea. Users should be able to implicitly detect entries based on original research, as they'll lack sources; explicitly specifying this is unnecessary, unprofessional, and inconsistent with practice in normal entries. Additionally, the combination of {{reconstructed}} and {{original research}} is unsightly. If people are dead-set on explicitly indicating this, it's better to add a parameter to {{reconstructed}} which adds extra text rather than having a separate {{original research}}. Hazarasp (parlement · werkis) 03:07, 7 September 2021 (UTC)[reply]

@Hazarasp: I actually prefer an extra parameter to {{reconstructed}}, that would modify the message to convey what the OR template is doing right now. — Ceso femmuin mbolgaig mbung, mellohi! (投稿) 03:10, 7 September 2021 (UTC)[reply]

@Victar: Now clearly I was decided that this should not be added to Proto-West Germanic entries and it would be a caricature to add this template, yet @Mellohi! surprises me with this exact application of the template which I dismissed as a crank sport of mine. They aren’t concocted enough. The forms are kind of implicit in published Proto-Germanic reconstructions, under the assumption that Proto-West Germanic was in between, in any case the forms are “derived from the reference” regularly as my man itself distinguishes. When there is a Proto-Slavic reconstruction with a particular ending chosen instead of an other, the others being given as alternative forms, I did not see there enough distinction either. No five entries I have found remarkable enough for a whole banner. Fay Freak (talk) 11:49, 7 September 2021 (UTC)[reply]

The obvious issue with this template is that we don't have a No original research policy like Wikipedia does, and that especially applies to reconstructions. What's the point of this template, to warn readers that this entry is unreliable? If an entry is exceptionally unreliable, it shouldn't exist in the first place, right? I can see this being useful for etymology sections, but perhaps an inline note would be better, like.^{[original research]} --{{victar|talk}} 18:37, 7 September 2021 (UTC)[reply]

It's not about reliability. It's about telling readers we didn't just omit references, but that we have gone beyond them. —Μετάknowledge^{discuss/deeds} 20:19, 7 September 2021 (UTC)[reply]

Even without a No original research policy, our readers should always get an idea where a reconstruction comes from. Some kind of non-stigmatizing flagging that the lack of mention of an external source in an entry is not just negligence is a good idea. For the reconstructed entry itself, the template (ideally called {{original reconstruction}}) looks good. An inline note in etymology sections as proposed by @Victar also makes sense, but only with redlinks. If the reconstructed entry exists, the reader can get the information about the origin of the reconstruction in the entry itself. There is no need to "stigmatize" well-thought (and ideally community-vetted) OR entries against externally-sourced material. –Austronesier (talk) 07:56, 8 September 2021 (UTC)[reply]

I

support Victar’s proposal of inline note so that we can mark specific portions of the entry as original research material. To give an example, I have to reconstruct the definitions of sourced reconstructed Prakrit and reconstructed Ashokan Prakrit terms, and such definitions begotten through original research can be tagged with such a note. ·~ dictátor·mundꟾ 14:13, 8 September 2021 (UTC)[reply]

If all we have is an inline note, where would we put it if the form of the headword itself is OR? I see this in Proto-Japonic, for instance. ‑‑ Eiríkr Útlendi │^{Tala við mig} 18:03, 8 September 2021 (UTC)[reply]

If you really need to, I guess you could put it after {{head}}. --{{victar|talk}} 23:10, 8 September 2021 (UTC)[reply]

It would suffice to add a parameter to {{reconstructed|R=WT:$link2discussion}} for example per one above comment. Add the same for {{der|R=|sign=*}}, that is restricted to etymology sections, and then create a footnote {{rfe}} and/or usage-note much like conjugations have.

Or would tags in edit summaries to policy decision suffice on occasion? Like this thread for PWG?!

It would be nice anyway if {{rfe}} would impliment dedicated links to the discussions as good citation practice, with some automation wizardry from WT:ES-titles to add a marked-up citation if one is found out later. ApisAzuli (talk) 18:17, 15 October 2021 (UTC)[reply]

The 2022 Community Wishlist Survey will happen in January[edit]

Hello everyone,

We hope all of you are as well and safe as possible during these trying times! We wanted to share some news about a change to the Community Wishlist Survey 2022. We would like to hear your opinions as well.

Summary:

We will be running the Community Wishlist Survey 2022 in January 2022. We need more time to work on the 2021 wishes. We also need time to prepare some changes to the Wishlist 2022. In the meantime, you can use a dedicated sandbox to leave early ideas for the 2022 wishes.

Proposing and wish-fulfillment will happen during the same year[edit]

In the past, the Community Tech team has run the Community Wishlist Survey for the following year in November of the prior year. For example, we ran the Wishlist for 2021 in November 2020. That worked well a few years ago. At that time, we used to start working on the Wishlist soon after the results of the voting were published.

However, in 2021, there was a delay between the voting and the time when we could start working on the new wishes. Until July 2021, we were working on wishes from the Wishlist for 2020.

We hope having the Wishlist 2022 in January 2022 will be more intuitive. This will also give us time to fulfill more wishes from the 2021 Wishlist.

Encouraging wider participation from historically excluded communities[edit]

We are thinking how to make the Wishlist easier to participate in. We want to support more translations, and encourage under-resourced communities to be more active. We would like to have some time to make these changes.

A new space to talk to us about priorities and wishes not granted yet[edit]

We will have gone 365 days without a Wishlist. We encourage you to approach us. We hope to hear from you in the talk page, but we also hope to see you at our bi-monthly Talk to Us meetings! These will be hosted at two different times friendly to time zones around the globe.

We will begin our first meeting September 15th at 23:00 UTC. More details about the agenda and format coming soon!

Brainstorm and draft proposals before the proposal phase[edit]

If you have early ideas for wishes, you can use the new Community Wishlist Survey sandbox. This way, you will not forget about these before January 2022. You will be able to come back and refine your ideas. Remember, edits in the sandbox don't count as wishes!

Feedback[edit]

What should we do to improve the Wishlist pages?
How would you like to use our new sandbox?
What, if any, risks do you foresee in our decision to change the date of the Wishlist 2022?
What will help more people participate in the Wishlist 2022?

Answer on the talk page (in any language you prefer) or at our Talk to Us meetings.

SGrabarczuk (WMF) (talk) 00:24, 7 September 2021 (UTC)[reply]

Results for the most contended Wikimedia Foundation Board of Trustees election[edit]

Read in other languages

Thank you to everyone who participated in the 2021 Board election. The Elections Committee has reviewed the votes of the 2021 Wikimedia Foundation Board of Trustees election, organized to select four new trustees. A record 6,873 people from across 214 projects cast their valid votes. The following four candidates received the most support:

Rosie Stephenson-Goodknight
Victoria Doronina
Dariusz Jemielniak
Lorenzo Losa

While these candidates have been ranked through the community vote, they are not yet appointed to the Board of Trustees. They still need to pass a successful background check and meet the qualifications outlined in the Bylaws. The Board has set a tentative date to appoint new trustees at the end of this month.

Read the full announcement here. Xeno (WMF) (talk) 01:54, 9 September 2021 (UTC)[reply]

lol I tried. Equinox ◑ 14:58, 9 September 2021 (UTC)[reply]

Call for Candidates for the Movement Charter Drafting Committee ending 14 September 2021[edit]

Movement Strategy announces the Call for Candidates for the Movement Charter Drafting Committee. The Call opens August 2, 2021 and closes September 14, 2021.

The Committee is expected to represent diversity in the Movement. Diversity includes gender, language, geography, and experience. This comprises participation in projects, affiliates, and the Wikimedia Foundation.

English fluency is not required to become a member. If needed, translation and interpretation support is provided. Members will receive an allowance to offset participation costs. It is US$100 every two months.

We are looking for people who have some of the following skills:

Know how to write collaboratively. (demonstrated experience is a plus)
Are ready to find compromises.
Focus on inclusion and diversity.
Have knowledge of community consultations.
Have intercultural communication experience.
Have governance or organization experience in non-profits or communities.
Have experience negotiating with different parties.

The Committee is expected to start with 15 people. If there are 20 or more candidates, a mixed election and selection process will happen. If there are 19 or fewer candidates, then the process of selection without election takes place.

Will you help move Wikimedia forward in this important role? Submit your candidacy here. Please contact strategy2030wikimedia.org with questions.

This message may have been sent previously - please note that the deadline for candidate submissions was extended and candidacies are still being accepted until 14 September 2021. Xeno (WMF) 17:16, 10 September 2021 (UTC)[reply]

Server switch[edit]

Read this message in another language • Please help translate to your language

The Wikimedia Foundation tests the switch between its first and secondary data centers. This will make sure that Wikipedia and the other Wikimedia wikis can stay online even after a disaster. To make sure everything is working, the Wikimedia Technology department needs to do a planned test. This test will show if they can reliably switch from one data centre to the other. It requires many teams to prepare for the test and to be available to fix any unexpected problems.

They will switch all traffic back to the primary data center on Tuesday, 14 September 2021.

Unfortunately, because of some limitations in MediaWiki, all editing must stop while the switch is made. We apologize for this disruption, and we are working to minimize it in the future.

You will be able to read, but not edit, all wikis for a short period of time.

You will not be able to edit for up to an hour on Tuesday, 14 September 2021. The test will start at 14:00 UTC (07:00 PDT, 10:00 EDT, 15:00 WEST/BST, 16:00 CEST, 19:30 IST, 23:00 JST, and in New Zealand at 02:00 NZST on Wednesday, 15 September).
If you try to edit or save during these times, you will see an error message. We hope that no edits will be lost during these minutes, but we can't guarantee it. If you see the error message, then please wait until everything is back to normal. Then you should be able to save your edit. But, we recommend that you make a copy of your changes first, just in case.

Other effects:

Background jobs will be slower and some may be dropped. Red links might not be updated as quickly as normal. If you create an article that is already linked somewhere else, the link will stay red longer than usual. Some long-running scripts will have to be stopped.
We expect the code deployments to happen as any other week. However, some case-by-case code freezes could punctually happen if the operation require them afterwards.

This project may be postponed if necessary. You can read the schedule at wikitech.wikimedia.org. Any changes will be announced in the schedule. There will be more notifications about this. A banner will be displayed on all wikis 30 minutes before this operation happens. Please share this information with your community.

SGrabarczuk (WMF) (talk) 00:46, 11 September 2021 (UTC)[reply]

Talk to the Community Tech[edit]

Read this message in another language • Please help translate to your language

Hello!

As we have recently announced, we, the team working on the Community Wishlist Survey, would like to invite you to an online meeting with us. It will take place on September 15th, 23:00 UTC on Zoom, and will last an hour. Click here to join.

Agenda

How we prioritize the wishes to be granted
Why we decided to change the date from November 2021 to January 2022
Update on the disambiguation and the real-time preview wishes
Questions and answers

Format

The meeting will not be recorded or streamed. Notes without attribution will be taken and published on Meta-Wiki. The presentation (first three points in the agenda) will be given in English.

We can answer questions asked in English, French, Polish, and Spanish. If you would like to ask questions in advance, add them on the Community Wishlist Survey talk page or send to sgrabarczuk@wikimedia.org.

Natalia Rodriguez (the Community Tech manager) will be hosting this meeting.

Invitation link

Join online
Meeting ID: 898 2861 5390
One tap mobile
- +16465588656,,89828615390# US (New York)
- +16699006833,,89828615390# US (San Jose)
Dial by your location

See you! SGrabarczuk (WMF) (talk) 03:04, 11 September 2021 (UTC)[reply]

Request about a new transliteration of Hainanese[edit]

Why don't we apply Hainanese Transliteration Scheme to the category for Hainanese? Since Hainanese has no available pronunciation module now, I recommend introducing w:Hainanese Transliteration Scheme into Wiktionary. I am the native speaker and able to help the establishment. dia5 dia5!--洗腳盆收購站長 (talk) 11:34, 11 September 2021 (UTC)[reply]

I just found this: Module:nan-pron-Hainan. @Justinrleung Is this module ready to apply on Hainanese articles? --TongcyDai (talk) 11:46, 11 September 2021 (UTC)[reply]

@洗腳盆收購站長, TongcyDai: It's still experimental because it can only handle two character tone sandhi. This is technically the same level that our Taiyuan Jin can handle, so I guess it could be incorporated into {{zh-pron}} soon. 洗腳盆收購站長, are you a speaker of the Wenchang dialect? If so, do you have any insight on multi-character tone sandhi? — justin(r)leung _{{ (t...) | c=› }} 14:40, 11 September 2021 (UTC)[reply]

Definitions for semantically straightforward inflected forms in subsidiary Pali script.[edit]

We have a potential style war for definitions of inflected forms in subsidiary scripts.

I have been writing, for example:

# {{pi-sc|Brah|kaṇṇo}}, ''which is'' {{pi-nr-inflection of|𑀓𑀡𑁆𑀡||nom|s|t=ear}}, which yields

Latin script form of kaṇṇo, which is nominative singular of 𑀓𑀡𑁆𑀡 (kaṇṇa, “ear”)

@Svartava2 prefers:

# {{pi-sc|Brah|kaṇṇo|pos={{pi-nr-inflection of|𑀓𑀡𑁆𑀡||nom|s|t=ear}}}}, which yields

Latin script form of kaṇṇo (nominative singular of 𑀓𑀡𑁆𑀡 (kaṇṇa, “ear”))

I dislike the latter form for several reasons:

It results in nested parentheses, which are generally considered bad style
It looks ugly - even a dash would be a better connective. The aim is to combine simultaneously applicable definitions.
It does not lend itself to more complicated chains, such as case forms of participles of causatives where the generative lexical information beyond the meaning of the finite verb the causative is derived from is at most form of the participle itself.
It abuses the 'part of speech' field of {{pi-sc}}.

What style do editors think we should be using? --RichardW57 (talk) 11:38, 11 September 2021 (UTC)[reply]

@RichardW57: i have no strong opinion on this and it isnt any kind of "war" as you say. i however do not like "which is" but would be ok with a colon or semi-colon. eg {{pi-sc|Brah|kaṇṇo}}: {{pi-nr-inflection of|𑀓𑀡𑁆𑀡||nom|s|t=ear}} or {{pi-sc|Brah|kaṇṇo}}; {{pi-nr-inflection of|𑀓𑀡𑁆𑀡||nom|s|t=ear}}. regarding the abuse/misuse of |pos=, it is very common; technically if shouldn't be there at all in Form-of templates but it is there. i dont know how to but i'd surely like to show you the different uses of pos parameter in these form-of templates... Svārtava² • 13:46, 11 September 2021 (UTC)[reply]

@Svartava2: I was about to revert your change to the Pali entry given as an example above, but then I thought you were likely to change it back again, so I decided some public discussion would improve matters. A semicolon has the wrong semantics in a dictionary. It implies an alternative meaning, whereas these are simultaneous meanings. A colon seems odd. For simple cases like the example above, 'and' might work, and I've taken to using it in more complicated examples, such as ອະພິໂລປິໂຕ (abilopito):

# {{pi-sc|Lao|abhiropito}}, ''which is'' {{pi-nr-inflection of|ອະພິໂລປິຕະ|eqv=abhiropita||nom|s|m}} ''and is'' {{pi-nr-inflection of|ອະພິໂລເປຕິ|eqv=abhiropeti||past|part|t=to concentrate on}}, which yields

Lao script form of abhiropito, which is nominative singular masculine of ອະພິໂລປິຕະ (abilopita ⇨ abhiropita) and is past participle of ອະພິໂລເປຕິ (abilopeti ⇨ abhiropeti, “to concentrate on”)

Note that I've just taken a surplus comma out of that, which shows that varying the punctuation is tricky. Adding an article might improve the flow, but choosing the right one could be difficult. --RichardW57 (talk) 15:41, 11 September 2021 (UTC)[reply]

You're the one who added |pos= to {{pi-sc}}! You added it on 24 May 2021. --RichardW57 (talk) 15:41, 11 September 2021 (UTC)[reply]

@RichardW57 feel free to revert, I won't edit war on this. Svārtava² • 15:45, 11 September 2021 (UTC)[reply]

In context, the first two positional arguments to {{pi-sc}} are usually redundant, though @Octahedron80 would like to use the first to identify the writing system rather than just the script. The second argument gives the usual Roman script form, which is not always the same as the transliteration. These can differ because of the way a nasal before a consonant is written, and some writing systems drop distinctions made in more formal writing systems. --RichardW57 (talk) 11:38, 11 September 2021 (UTC)[reply]

Nominative singulars are being automatically recorded if different from the lemma form, for some dictionaries use the nominal singular as the citation form. --RichardW57 (talk) 11:38, 11 September 2021 (UTC)[reply]

Policy on deletion consensus[edit]

It appears that there is no formal policy on the deletion of RFDs, and what consensus is needed. What is regarded as no consensus^[1], and what is considered to be enough for deleting the nominated page? Thoughts? Svārtava² • 09:13, 12 September 2021 (UTC)[reply]

As I've said before, 3/5 should be a pretty strong "consensus" (lowercase-c) to keep/delete an entry. But I think the matter is more time-sensitive than our other votes. If the rfd has been sitting around for months and months and it doesn't look like 3/5 will be reached, then at that point a simple bare-majority (50% +1) should be enough to keep/delete the entry.

If (a) a lot of users have voted in the deletion request, (b) there is a technical minority/majority, (c) but it's on a thin margin, and (d) a lot of time has gone by; I would close that as RFD-kept by no consensus. Likewise if the vote is cleanly cut-down-the-middle as 50-50.

Lastly, if the vote has been in operation for several months (3 months+ should be a good rule of thumb) and absolutely no one has weighed in on the deletion request, it's safe to close as RFD-deleted by no objection. At that point, no single user has been interested enough to express an opinion on the entry's standing; so the entry can be deleted safely. If someone after the fact is alarmed at that, they can contest the decision a full seven days after the entry has been deleted. Per Chuck, "[e]ntries can always be undeleted, if anyone objects. It's more important to refrain from archiving too fast, so people have a chance to see what's happened and raise an objection, if necessary." Imetsia (talk) 15:03, 12 September 2021 (UTC)[reply]

Imetsia closed that RFD appropriately. Sodhak/Svartava/whatever, you need to stop looking for legislative solutions to personal quarrels. You are squandering whatever meagre goodwill you still have. —Μετάknowledge^{discuss/deeds} 17:32, 12 September 2021 (UTC)[reply]

@Imetsia: thanks for the explanation. @Metaknowledge: may I ask you for what personal quarrel do you think I started this discussion? If you think it was for {{bor+}} then you are wrong. Svārtava² • 05:27, 13 September 2021 (UTC)[reply]

@Metaknowledge: what is "that RFD"? Svārtava² • 05:35, 13 September 2021 (UTC)[reply]

Obviously (see the link you provided above) this. --Lambiam 08:37, 15 September 2021 (UTC)[reply]

anyways, I asked that for t:t-ws, no "personal quarrel". "squandering whatever meagre goodwill you still have" is totally inappropriate comment directed at me for no reason at all. Svārtava² • 06:02, 17 September 2021 (UTC)[reply]

Very late: There is nothing wrong about seeking legislative clarification for something that is not clearly and traceably specified and leads to avoidable quarrels. I see no issue with svartava2's politely opening a Beer parlour discussion for a policy issue; to the contrary, this should be praised as a civil attempt to resolve policy preferences by a common discussion. I find the above response by Metaknowledge to be rude and inappropriate. Incidentally, the above clarification by Imetsia misrepresents the only documentation of the practice that we had: this revision (19 July 2020, current at the time of the discussion) of Wiktionary:Requests for deletion/Header at the time: "If there is no consensus for more than a month, the entry should be kept as a 'no consensus'." The above "at that point a simple bare-majority (50% +1) should be enough to keep/delete the entry" contradicts that. Wiktionary:Requests for deletion/Header was never voted on so one can argue it never properly tracked consensus and it still does not. --Dan Polansky (talk) 10:05, 3 September 2022 (UTC)[reply]

Does anyone strongly support keeping anagrams?[edit]

I'm mostly just a (very happy) user of Wiktionary, but have watched the debate on anagrams with interest. See the Anagrams talk page, where I've tried to add links to all relevant Beer Parlour threads. Anagrams have a vocal advocate in Equinox. Is there anyone else who feels strongly that anagrams should be kept on Wiktionary? Khromegnome (talk) 09:50, 14 September 2021 (UTC)[reply]

Yeah, I strongly agree with keeping anagrams, though I'll say that I've never put any work or thought into this area.

--Geographyinitiative (talk) 14:06, 14 September 2021 (UTC)[reply]

Why do you feel strongly about keeping them, if you haven't thought about them specifically? Do you generally oppose removing content? Khromegnome (talk) 09:52, 15 September 2021 (UTC)[reply]

Strongly? Probably not but moderately, yes. Wiktionary is an all-purpose dictionary, including a rhyming dictionary and an etymological dictionary and a multilingual dictionary (as well as a thesaurus, etc.) A "Scrabble dictionary" is a valid dictionary in my eyes. —Justin (koavf)❤T☮C☺M☯ 04:30, 15 September 2021 (UTC)[reply]

Would you support the addition of other related lists (e.g. a list of all words that can be formed by adding letters, also useful for Scrabble)? Khromegnome (talk) 09:52, 15 September 2021 (UTC)[reply]

At the level of an particular entry, no but in the Appendix namespace? Sure. Virtually anything can go in there, include fairly trivial and fun word lists. —Justin (koavf)❤T☮C☺M☯ 21:11, 15 September 2021 (UTC)[reply]

Not passionately, but I'd be strongly opposed to getting rid of them, mostly because they are useful for word games and it makes little sense to get rid of them when we have so many and it is so easy to add them automatically. Andrew Sheedy (talk) 05:21, 15 September 2021 (UTC)[reply]

They’re useless though for crypto clues such as “anagram of margana”. --Lambiam 08:29, 15 September 2021 (UTC)[reply]

I'm strongly in favor of removing them. They're only useful for 1% of the users and even for them only 1% of the time. Apart from that, they're nothing but browser delay and useless clutter (both while scrolling as well as in the edit history). Additionally, seeing as they can be generated by a program anyway, why should we essentially tabulate the output of a computer program into every article? Oh wait, it's not even every article, it's really a hit or miss; and where's the option to input a random collection of letters and get all anagrams of that? A Wiktionary-based anagram tool seems like a decent idea but it should definitely not be incorporated into the source of the articles. --Fytcha (talk) 17:21, 15 September 2021 (UTC)[reply]

Are those real statistics? I could claim that the audio link causes browser delay and useless clutter while being only useful for 1% of the users 1% of the time. I find anagrams interesting, and don't see much point in proactively removing them; certainly "browser delay" is silly. I seriously doubt that the time difference with and without is noticeable or easily measurable; the variation on browser load times on any one setup would likely be far larger than the difference between the average load times of the pages.--Prosfilaes (talk) 09:22, 17 September 2021 (UTC)[reply]

Most of our entries are useful for fewer than 1% of users. So what? Equinox ◑ 15:54, 19 September 2021 (UTC)[reply]

There is potentially a case for changing anagrams into a real-time search feature, like "find words that begin or end with". Equinox ◑ 23:29, 15 September 2021 (UTC)[reply]

I have never liked the inclusion of anagrams. DonnanZ (talk) 09:48, 1 October 2021 (UTC)[reply]

I found a really weird one this morning: %iles for Elis, which shouldn't happen. TBH, I prefer entries which are anagram-proof. DonnanZ (talk) 09:06, 4 October 2021 (UTC)[reply]

I have no strong feelings about them either way. I don't mind if we have them. — SGconlaw (talk) 11:05, 1 October 2021 (UTC)[reply]

Change to Transliteration of Sinhala[edit]

Has there been any discussion of a change to the transliteration of Sinhalese? Or was today's (Tuesday's) change of Module:si-translit just unilateral vandalism of Sinhalese (si), Pali (pi) and possibly some Sanskrit (sa) transliteration by @Inqilābī? The transliteration of niggahita was changed to be the same as that of the velar nasal. Alerting @InsularAdam, Atitarev.--RichardW57 (talk) 23:21, 14 September 2021 (UTC)[reply]

@RichardW57: see Wiktionary:Information_desk/2021/September#Wrong_pronunciation_with_Sinhalese. Not exactly a broadbased consensus. Chuck Entz (talk) 04:16, 15 September 2021 (UTC)[reply]

My goodness. @RichardW57: I am amazed to see Sinhalese is using the exact transliteration as that of Pali and Sanskrit— unlike other NIA languages. Sinhalese should definitely have its unique module! ·~ dictátor·mundꟾ 18:53, 15 September 2021 (UTC)[reply]

The natural reply to that disruptive proposition is that the only good Inqilābī is a dead Inqilābī. Actually, Sinhalese has some unique consonants, so these get their unique transliterations. Just how many different transliteration modules can a translation page support? I think the limit is about 100, so kindly bound your profligacy. --RichardW57 (talk) 19:33, 15 September 2021 (UTC)[reply]

To be more precise in Sinhala script transliteration, it proceeds in two stages. First the source is transliterated into a common system. Then the outputs in that common system are tweaked to the particular language, though Pali and Sanskrit currently use consistent flavours of IAST. Thus, as "ṁ" was the transliteration chosen for anusvara in the original Sinhalese transliteration scheme, it gets systematically replaced by "ṃ" for Pali and Sanskrit. Similarly, for the syllabic consonants, the module accommodates ring below for Sinhalese and dot below for Sanskrit (and the logic for Pali follows the path for Sanskrit).

Can somebody explain WT:SOP to me?[edit]

To me it seems that in orthographic systems where compounds are written without spaces and hyphens, people are much less likely to call something out as SOP. To give an example: puré de batata has just been nominated for deletion on the grounds of SOP, however nobody would ever dare propose the same for Kartoffelbrei. The only thing separating the two entries is an arbitrary peculiarity of their respective orthographies, apart from that they're identical in every regard. --Fytcha (talk) 17:05, 15 September 2021 (UTC)[reply]

We do have a rule that spaces and hyphens inside compound words affect whether we keep the word. Our policy is influenced by English spelling conventions and the fact that English speakers looking up foreign words are less likely to know where to break unspaced compounds. Can you propose an easily applied rule that would allow us to decide whether to delete Kartoffelbrei as a sum of parts? Vox Sciurorum (talk) 17:17, 15 September 2021 (UTC)[reply]

In some compounding languages, including German, the issue is compounded by the insertion of interfixes in compounds, following somewhat unpredictable rules: Eiweiß = Ei + ∅ + Weiß, but Eierschale = Ei +‎ -er- +‎ Schale; also Dutch kalfsvlees ~ kalf +‎ -s- +‎ vlees, but kalverkop = kalf +‎ -er- +‎ kop. Dutch has the specific issue of the orthographic choice between the interfixes -e- and -en-, which has historically vacillated quite a bit: ruggengraat = rug +‎ -en- +‎ graat (since 1996), but ruggespraak = rug +‎ -e- +‎ spraak. If Wiktionary is meant to be also usable as a reference for the spelling of terms, these should be included even in those cases when their meaning can be understood from the parts. Another issue to consider is ambiguity; for example, valkuil = val +‎ kuil and valkuil = valk +‎ uil coexist in Dutch. --Lambiam 15:28, 16 September 2021 (UTC)[reply]

Kar + Toffel + Brei is edible hiking shoes, which is entirely different from puré + de + batata. In any case, it's an English rule that works well enough for most other languages normally written with spaces that we keep it. If enough people spoke polysynthetic languages we might have to rethink it, but most of those languages are from the Americas or Australia, not good places for a language to be from if it wants to have speakers.--Prosfilaes (talk) 09:47, 17 September 2021 (UTC)[reply]

I think continental Germanic and Indian (not just Indic) languages could give us a flood of terms. If that matters, we are relying on the self-constraint of the German, Sanskrit and Pali editors. I'm not sure what's stopping Thai acting like a polysynthetic language - most short sentences lack any word-separating punctuation, and hyphenation when words are split between lines is far from universal. --RichardW57m (talk) 14:52, 17 September 2021 (UTC)[reply]

Making solutions for future problems often just makes bad solutions that need fixing in the future. Polysynthetic is not about the writing system; it's about how words are composed in the underlying language. We don't use the space-free rules on languages that don't use spaces between words. Frankly, I'd be pushing for more cites earlier; how many German compound words actually have three cites? Even if the answer is still too many, people won't be rushing them into Wiktionary willy-nilly.--Prosfilaes (talk) 06:45, 18 September 2021 (UTC)[reply]

So what rules do we use for Thai? Thai word lists used for linebreaking and spellchecking often look to me as though they have a rule that a single English word translates to a single Thai word. I think we're currently being well-served by the judgement of the initial editors. --RichardW57 (talk) 11:07, 18 September 2021 (UTC)[reply]

‘Edible hiking shoes’??? —Caoimhin ceallach (talk) 03:10, 18 September 2021 (UTC)[reply]

Kar = curved depression in a mountainside; Toffel = slipper; Brei = mash. Maybe an Alpine dish mashed by slippers? You go out and walk in the flowers and bring the slippers in to mash the food?

For an actual example, I was actually pounding my head yesterday over sentebrio in Esperanto; is it sen (without) + *tebrio? sente (feelingly) + *brio? No, obviously it's sent' (feeling with the grammatical marker chopped off) + ebrio (drunkness), but the facts that (a) sen is an incredibly common prefix, (b) sente is a valid Esperanto word and sent isn't, and (c) I wasn't familiar with ebrio, kept me backing up and looking up words that didn't exist in multiple dictionaries. (Note that the rule didn't really matter for Esperanto, since we're missing most of those words anyway.)--Prosfilaes (talk) 06:45, 18 September 2021 (UTC)[reply]

You mean to say that a German compound is in theory analysable in more ways because it's written together and therefore it deserves to be an entry more? —Caoimhin ceallach (talk) 19:00, 18 September 2021 (UTC)[reply]

I'm saying that we should have an entry because users are going to turn to the dictionary to try and figure out the meaning of those phrase, unlike space separated phrases where users are going to look up the (obviously) separate parts separately. "In theory analysable" is irrelevant; it's unlikely a user would actually conclude that it's edible hiking shoes, but could easily spend a lot of time looking up Kar or Kart or Toffelbrei or Offelbrei before they realized where the split was. I was actually a little surprised that Kartoffel wasn't a compound.--Prosfilaes (talk) 02:09, 19 September 2021 (UTC)[reply]

Ok, now I get it. That actually makes sense. Sorry, I was puzzling over your example for ages and couldn't fathom how you could construe that meaning or what you meant to say by it. —Caoimhin ceallach (talk) 12:26, 19 September 2021 (UTC)[reply]

Turkish -ma forms: gerunds or verbal nouns?[edit]

In Turkish one may form a noun by replacing the infinitive suffix -mak with -ma. On the templatized conjugation table this form is called a gerund. In some references it is called a verbal noun. I don't know or care much which it is, but I would like a decision because we have distinct templates {{gerund of}} and {{verbal noun of}}. Category:Turkish gerunds and Category:Turkish verbal nouns both exist. Vox Sciurorum (talk) 20:43, 15 September 2021 (UTC)[reply]

@Allahverdi Verdizade — Fenakhay ^{(تكلم معاي · ما ساهمت)} 20:56, 15 September 2021 (UTC)[reply]

Verbal nouns. See here. Allahverdi Verdizade (talk) 21:22, 15 September 2021 (UTC)[reply]

Since gerunds in various languages are generally verbal nouns, it is not a simple either–or issue. Some Turkish grammar books call these -ma/-me forms “gerunds”,^[2] but Lewis (Turkish Grammar) reserves the term for suffixes that form adverbial clauses, such as repeated -e, -erek and -ken, and this use is widely followed. Thus, using the term “gerund” for -ma/-me forms is not per se wrong but potentially confusing, whereas calling them “verbal nouns” is unambiguous. --Lambiam 14:48, 16 September 2021 (UTC)[reply]

In German, they are commonly known as "Kurzinfinitiv" (short infinitive) and are categorized as a subset of verbal nouns. Some sources: [3], [4], [5], [6], [7] --Fytcha (talk) 15:17, 16 September 2021 (UTC)[reply]

Announcing ilscripto 0.0.1: pure Lua Scribunto engine[edit]

To celebrate the 30th birthday of Linux kernel, I present you ilscripto 0.0.1.

This engine is >85% identical to the php engine, see START.md. As you can see, getContent is powered by local backup, while since mw.language:formatDate is not fully implemented, there're some issues with w:Module:CS1. For wiktionaries, saving files like US, Us and us to Windows may cause issues.

This engine is inspired by bliki and partial-MediaWiki-lua-environment, also pure Lua. I hope this project will attract 100 users globally, and provide new chances for bot editing. Crowley666 (talk) 10:20, 17 September 2021 (UTC)[reply]

By the way, I hope my bot will also be useful. Crowley666 (talk) 10:20, 17 September 2021 (UTC)[reply]

Excellent work. Sounds much more complete than my current local Lua environment. I'll be sure to try it out when I have some time. — Eru·tuon 21:34, 17 September 2021 (UTC)[reply]

User:Donnanz’s etyl clean-up methods[edit]

Previous discussions:

Wiktionary:Beer parlour/2017/August#Category:etyl cleanup - cleanup going backwards?
User talk:Donnanz#French etyl cleanup
User talk:Donnanz#Afrikaans etyl cleanup (in this discussion, Donnanz says: As a matter of policy I never use {{bor}} or {{inh}}, I never voted in favour of them.)

If I said I didn't vote in favour, I didn't. DonnanZ (talk) 12:08, 20 September 2021 (UTC)[reply]

User talk:Donnanz#Precise origin
User talk:Donnanz#etyl cleanup
User talk:Donnanz#Hindi etyl cleanup (in this discussion, he makes this mocking remark: Don't leave it too long, or it will be cleaned up by default.)

There is always a listing for derived terms, by default. DonnanZ (talk) 12:08, 20 September 2021 (UTC)[reply]

User talk:Donnanz#inh vs. der (here, admitting he’s a bot, Donnanz says: That's the automaton in me eliminating {{etyl}} […] )

Nonsense, I don't have a bot. DonnanZ (talk) 12:08, 20 September 2021 (UTC)[reply]

User talk:Donnanz#Just wondering...

We are all familiar with the disruptive edits of Donnanz (talk • contribs). For a few years, Donnanz has been indiscriminately substituting all {{etyl}}s with {{der}}, even though the replacement ought to be done by more specific etymology templates. Not only is this useless, but also harmful: for “there is nothing to signal that many of the uses of {{der}} are inappropriate”. Indeed, this has exacerbated fixing etymologies, because an editor has to be alert about whether the source code has {{etyl}} or a misplaced & invalid {{der}}. Donnanz has taken upon himself to clear up the entire list in Category:etyl cleanup, disregarding the fact that etyl clean-up is meant to be done diligently. All entries using the deprecated template get properly categorized, as it behaves like {{der}}, and as such there is no point in mechanically performing a fake clean-up that does not change categorization. He has been asked before multiple times (see the linked discussions) to abandon this practice, but he has always refused to listen. This is probably the greatest scandal in the history of this project, and Donnanz has somehow managed to continue doing the unwanted edits unchecked.

In the interest of preserving the legitimacy of this project, Donnanz should be legally banned from doing etyl clean-up. How that could be achieved is a different matter, but I urge the community to consider this issue in earnest. Thank you. ·~ dictátor·mundꟾ 17:21, 18 September 2021 (UTC)[reply]

As a point of clarification, Donnanz sometimes resorts to unhelpful and provocative language, so I think that his "automaton" reference is just a rude and inappropriate joke rather than an admission that he is actually using a bot to edit. That said, your proposal seems sound to me. —Justin (koavf)❤T☮C☺M☯ 18:03, 18 September 2021 (UTC)[reply]

Support. "This is probably the greatest scandal in the history of this project" is unnecessarily inflammatory, but yes, I would like Donnanz to stop doing this. P U C – 18:28, 18 September 2021 (UTC)[reply]

I said ‘probably’… ·~ dictátor·mundꟾ 18:31, 18 September 2021 (UTC)[reply]

Ok. Let's not argue about that, it doesn't matter. P U C – 18:35, 18 September 2021 (UTC)[reply]

Why not just create some hidden categories or dumps, like French terms derived but not inherited from Middle French and use those as a tool for cleanup? --{{victar|talk}} 20:29, 18 September 2021 (UTC)[reply]

I did suggest to User:PUC a long time ago that he could clean up French, but judging by the lack of progress, it would appear that he can't be bothered. DonnanZ (talk) 20:59, 18 September 2021 (UTC)[reply]

I have no inkling ’bout your uncanny fascination with etyl clean-up, but how do you expect other editors to clear up etyl usages overnight? You might be a retired person (per your userpage; no offense intended) so you have time aplenty fiddling with etymologies, but such a demand from other editors is so unwholesome. ·~ dictátor·mundꟾ 00:04, 19 September 2021 (UTC)[reply]

As I said, a long time ago. No one is suggesting the job can be done overnight, that is an exaggeration.

I don't see how such a ban is workable, would it prevent me from adding new etymology? DonnanZ (talk) 08:40, 19 September 2021 (UTC)[reply]

I have also seen misuse of {{bor}} (e.g. Piffard). If a French person migrates to another country and keeps their name, they are introducing their name to the country, and thus into the language of that country - it is not a borrowing. As far as I know, there is no template which caters for such introductions, but use of {{bor}} is misleading, and should be avoided. DonnanZ (talk) 09:07, 19 September 2021 (UTC)[reply]

That is a good point. Such words are actually translingulisms, but people do not care much. ·~ dictátor·mundꟾ 14:52, 19 September 2021 (UTC)[reply]

Well, this proposal is about banning the disruptive edits, not about how to deal with the mess. ·~ dictátor·mundꟾ 23:17, 18 September 2021 (UTC)[reply]

What is meant by "disruptive"? DonnanZ (talk) 08:40, 19 September 2021 (UTC)[reply]

Is it still disruptive if there's a easy solution to it? --{{victar|talk}} 18:16, 19 September 2021 (UTC)[reply]

Thanks to Inqilabi for bringing this on BP with relevant discussions listed. Many Hindi etymologies are still suffering from Donnanz's fake etyl cleanup. I wholeheartedly support ban on his etyl cleanup, or also ban on any editing at all until he has cleaned up his fake cleanups: let his karma return. Svārtava² • 07:09, 19 September 2021 (UTC)[reply]
I think banning Donnanz from editing entirely is a step too far; However, I do think they ought to stop - an {{etyl}} template on itself signifies a page up for cleanup, a bare {{der}} doesn't, and is less likely to be found. Thadh (talk) 10:22, 19 September 2021 (UTC)[reply]
editing entirely until he has completely cleaned up his fake etyl cleanup Svārtava² • 11:27, 19 September 2021 (UTC)[reply]
It's still disproportionate. P U C – 11:31, 19 September 2021 (UTC)[reply]

I don't think that idea has been thought through. If I were banned from editing entirely I would be unable to "clean up" your so-called "fakes", even if I wanted to; you would have to do the job yourself to your satisfaction. A lot of fuss about nothing. DonnanZ (talk) 11:47, 19 September 2021 (UTC)[reply]

They meant non-cleanup editing. Anyway, like PUC says, it's disproportionate. Thadh (talk) 12:26, 19 September 2021 (UTC)[reply]

I want to make it very clear that this proposal does not seek any retributive justice. This is just a formal request to ban Donnanz from doing etyl clean-up. Yes, any etyl clean-up, for he is really not interested in using specific etymology templates. That said, I have not made the slightest expression that he would be forbidden from editing the Etymology section altogether. He is even free to add new etymologies indiscriminately using {{der}}, for that matter. Any new proposals should be brought forth elsewhere, not here, because the matter under consideration is only the fake etyl clean-up. ·~ dictátor·mundꟾ 12:23, 19 September 2021 (UTC)[reply]

Well, with that concession (for other editors too) you have unwittingly made a mockery of your own proposal. There are many instances where {{der}} should be used, such as for Proto-Indo-European, an unwritten, reconstructed and probably largely theoretical language. You can't do away with {{der}} as some would like. So it may not be only me contributing to your so-called "mess", there might be other "guilty" parties. I am merely the easiest one to blame. DonnanZ (talk) 14:01, 19 September 2021 (UTC)[reply]

The discussion is not about fixing wrong etymologies abounding the etymology sections; I have already stated clearly what the only objective of this proposal is. But for your information, there are terms in languages that are directly inherited from a PIE (or any other reconstructed language) term, for which {{inh}} should be used, e.g., the descendants of *ph₂tḗr. {{der}} is used for other cases, such as roots. Read the description at Template:inherited. And of course, there are many, many misuses of etymology templates, either through carelessness or wrong conviction. ·~ dictátor·mundꟾ 14:52, 19 September 2021 (UTC)[reply]

You can't blame any editor for having different convictions, as I don't think reputable dictionaries include Proto- languages as etymology. It's all very murky. DonnanZ (talk) 16:04, 19 September 2021 (UTC)[reply]

Not some of them, but the OED sometimes does. For instance, it mentions "Old Germanic *hafjan" in the entry for heave (apparently that's what they reconstructed for Proto-Germanic *habjaną at the time) and "Germanic *swōtja-" in the entry for sweet (adjective and adverb). Merriam-Webster didn't mention proto-languages for these words, but it also had much shorter etymologies. I think on etymology we'd do better to emulate the OED. — Eru·tuon 18:45, 19 September 2021 (UTC)[reply]

This isn't really about proto-languages. Let's look at a few scenarios:

A word is borrowed from some stage of Low German into Old Norse, and eventually ends up as Norwegian Nynorsk
A modern English word that can be traced back to an Old French descendant of a Latin word is borrowed directly into modern Norwegian Bokmål.
A modern English word that can be traced back to a Norman French borrowing from Old Norse is borrowed directly into modern Norwegian Bokmål.
A Old Norse word is present as part of the basic vocabulary and is obviously not borrowed from anywhere. It then is passed down in an unbroken line to Norwegian Nynorsk

These are all plausible scenarios, and you want to be able to distinguish, for instance, between Old Norse that was borrowed into another language and later borrowed back into modern Norwegian and Old Norse that Became Norwegian by way of regular language change. Also, a word that was borrowed into Old Norse should be marked as inherited from Old Norse, but derived from Low German. Chuck Entz (talk) 20:03, 19 September 2021 (UTC)[reply]

Just to show the whole picture: there are benefits to clearing an entire language of {{etyl}}: languages that have no etyl templates can be added to an exclusion list that prevents anyone from adding new etyl templates for that language (Adding it to the list before then would cause a module error in every existing etyl template for that language). It's not just a matter of saving the cleanup for later: as long as there are etyl templates for a language, people will continue to add new ones and the job will get harder. There's an abuse filter that tags edits which increase the number of etyl templates in an entry. It's been in place for 4 years, and already has 8917 hits. Chuck Entz (talk) 20:24, 19 September 2021 (UTC)[reply]

While I'm at it: pinging @Mahagaja, who probably knows as much about the issues involved as anyone. Chuck Entz (talk) 20:35, 19 September 2021 (UTC)[reply]

All I have to say is I too get annoyed when people clean up {{etyl}} by blindly replacing it with {{der}} rather than by paying attention to what kind of derivation it is and using the various templates correctly. I correct these as I encounter them, but of course usually I don't notice them. I have no idea whether Donnanz is the only person who does this or even the person who does it the most. I don't bother checking the page histories of the pages affected. —Mahāgaja · talk 20:42, 19 September 2021 (UTC)[reply]

I think @Ultimateria does it too sometimes. P U C – 21:49, 19 September 2021 (UTC)[reply]

I don't think I've done that in at least two years; as soon as I read the public discourse about it I stopped. However I add new etymologies with {{der}} (unless it's obviously a modern borrowing) because I don't feel qualified to make the distinction, and it's certainly better than having no etymology. Ultimateria (talk) 22:18, 19 September 2021 (UTC)[reply]

@Ultimateria: Sorry for the false accusation then ("accusation" is a bit strong, but you get my point), and yes I agree that adding an etymology with an imprecise template is better than having no etymology at all. P U C – 18:52, 23 September 2021 (UTC)[reply]

Despite your acute observations, I doubt that any more languages will be free of {{etyl}} any time soon. There is a lot of apathy around, which is not helped by User:Inqilābī. There is virtually no improvement in Category:etyl cleanup for today. DonnanZ (talk) 22:34, 19 September 2021 (UTC)[reply]

I think it is actually useful to have a template for signalling an etymological connection that can be used when it is unknown whether, for example, an Old French term is inherited from Latin or a borrowing. --Lambiam 08:19, 20 September 2021 (UTC)[reply]

I have been threatened with a non-specific block by User:Benwing2 on my talk page. This is very heavy-handed so I am reporting it here. DonnanZ (talk) 08:59, 20 September 2021 (UTC)[reply]

There's nothing heavy-handed about it. Several users asked you to stop these edits and you dismissed them all. I don't see anyone who agrees with your edits; you are the only "noisy minority" here. Ultimateria (talk) 15:02, 20 September 2021 (UTC)[reply]

If you were on the receiving end, you would think differently. DonnanZ (talk) 15:14, 20 September 2021 (UTC)[reply]

Support. Donnanz's edits border on the disruptive. {{etyl}} is not bad in itself, insofar as it does the same work as {{der}}; the reason it is subpar and needs to be cleaned up is because it is not as specific as {{bor}}, {{inh}}, and so forth. This lack of specificity is the problem with {{etyl}}. When Donnanz "fixes" {{etyl}} to {{der}}, nothing has been cleaned up. The problem is still there; the only thing that has changed is that the problem is no longer detectible (not being categorized in Category:etyl cleanup), and hence really much worse.--Tibidibi (talk) 12:02, 20 September 2021 (UTC)[reply]

I am washing my hands of Category:etyl cleanup, as satisfying a noisy minority of users is not worth the trouble, and my efforts are not appreciated. I will still monitor the situation, checking on progress or the lack of it, and carry on with other work. Nobody has won, "match abandoned".

Special thanks go to User:Chuck Entz, for content hidden by User:Inqilābī. DonnanZ (talk) 14:28, 20 September 2021 (UTC)[reply]

I do not get what is the point of repeatedly pinging me or other editors, but I hid the content due to its irrelevance, especially when it was in the lead. ·~ dictátor·mundꟾ 12:00, 21 September 2021 (UTC)[reply]

I guess he means he didn't like being overruled - twice. DonnanZ (talk) 12:25, 21 September 2021 (UTC)[reply]

?? ·~ dictátor·mundꟾ 12:30, 21 September 2021 (UTC)[reply]

Does Hawaiian have adjectives?[edit]

Category:Hawaiian_adjectives has been created and deleted repeatedly and now an IP keeps on adding entries such as huluhulu and mahū. I am trying to clear out Special:WantedCategories and want to know what to do with these entries. I see some conflicting information about Hawaiian having adjectives (I even started a Duolingo class on it to see what they said). Hawaiian grammar says that they exist and Wiktionary:About Hawaiian is pretty underdeveloped and does not discuss the topic. Can anyone weigh in on this so and add it to WT:AHAW so that we don't have repeated adding and deleting? —Justin (koavf)❤T☮C☺M☯ 18:01, 18 September 2021 (UTC)[reply]

Those are stative verbs, according to the standard grammars and dictionaries. —Μετάknowledge^{discuss/deeds} 18:05, 18 September 2021 (UTC)[reply]

I added some language to WT:AHAW. —Justin (koavf)❤T☮C☺M☯ 18:17, 18 September 2021 (UTC)[reply]

Out of curiosity, would i ʻula ia mean “it was red” (but ceased to be so) or “it became red”? Or is this an ungrammatical sentence? --Lambiam 07:54, 20 September 2021 (UTC)[reply]

@Lambiam: As far as I can tell, although I'm not very good at Hawaiian, so I'm not sure, what you're looking for is Ua ʻula ia (or simply ʻula ia) - "It became red". I ʻula ia is simply "It was red", it doesn't even say anything about the present state of affairs. Thadh (talk) 15:36, 20 September 2021 (UTC)[reply]

Thanks. I was not wondering how to say “it became red” in Hawai’ian, merely what the interpretation would be of “i ʻula ia”. --Lambiam 21:56, 20 September 2021 (UTC)[reply]

Serious point: it depends on how we decide to define "adjective".

Arguably, Japanese has a category of words that are, strictly speaking, classable as "stative verbs": these words (commonly called "-i adjectives") describe the quality of a thing, and can directly modify a noun or noun phrase, and thus function adjectivally, but they can also form the predicate of a statement, inflect for tense, and include an inherent "to be" sense, and thus function verbally.

For good or ill, English-language materials describing the Japanese language tend to call these "adjectives", outside of academia.

It's been a while since I was working much on my Hawaiian knowledge, but what I recall is that Hawaiian has a similar category of words that have similar qualities: they describe the quality of a thing, can be used to modify a noun or noun phrase, but also take markers for aspect and tense, and can be used predicatively.

Arguably, for an audience of English-language readers, I think a case can be made that these words should be called "adjectives", with the WT:AHAW page providing a fuller explanation of how these are also stative verbs.

Conversely, if a preponderance of existing English-language materials describing the Hawaiian language tend to call these "stative verbs", outside of academia, then presumably we should follow suit.

On the third hand, Hawaiian is somewhat fluid when it comes to parts of speech, a bit similar to the way things work in Chinese, another notably analytic language -- a given word could be used as a verb, noun, adjective, adverb, etc. In the Wehewehe entry for ʻula, for instance, I see that they group the noun, verb, and adjectival senses together. Probably not an approach we would adopt here, but food for thought. ‑‑ Eiríkr Útlendi │^{Tala við mig} 23:04, 24 September 2021 (UTC)[reply]

English prepositions[edit]

It appears so far that there is a disappointingly low level of interest in this, but may I nevertheless encourage anyone who does have a view to participate at Wiktionary:Votes/2021-08/Scope_of_English_prepositions. Mihia (talk) 22:00, 18 September 2021 (UTC)[reply]

I did not even understand a shred what the vote is about. Was it some school grammar book prescription? Sorry, my IQ is kinda low. ·~ dictátor·mundꟾ 23:14, 18 September 2021 (UTC)[reply]

I would not expect anyone to participate in the vote if the issue did not mean anything to them. Regarding this edit, may I ask what you mean by "collapsible option sections"? I did not see any collapsibility, and the heading levels were to my mind logical before your change, and illogical after. Is this some special setting that you have? (Also, may I ask why you, or anyone, would have a signature completely different to your username? I had no idea until I accidentally discovered it just now that " ·~ dictátor·mund" was the same person as "Inqilābī". Isn't this just confusing?) Mihia (talk) 01:15, 19 September 2021 (UTC)[reply]

@Mihia: I thought you would attempt to explain the vote to me, but never mind, your reply was as intriguing as your vote. You should create vote pages with the correct layout; I have fixed the page further— if you have no idea then just check out other vote pages before bickering. Using a level 3 heading causes collapsible sections on mobile. Congratulations on your discovery; that’s nothing to fuss over, but I guess it’s because of less interaction ’twixt us. And you mispelt my signature. Anyway, good luck with your ‘vote’ (if it is really one). ·~ dictátor·mundꟾ 11:56, 19 September 2021 (UTC)[reply]

The heading levels that you have created are illogical. If other votes do the same then they are illogical too. Mihia (talk) 19:24, 19 September 2021 (UTC)[reply]

Logicality is a subjective thing. We act according to the norm. If you dislike the norm, you have to make a proposal first, to change the norm. ·~ dictátor·mundꟾ 19:59, 19 September 2021 (UTC)[reply]

If there is one thing that should not be subjective, it is logicality. Mihia (talk) 22:50, 19 September 2021 (UTC)[reply]

The vote is about the best assignment of a POS descriptor in some murky cases. --Lambiam 07:42, 20 September 2021 (UTC)[reply]

What the vote really means[edit]

The section WT:CFI § Idiomaticity contains the following clause:

In rare cases, a phrase that is arguably unidiomatic may be included by the consensus of the community, based on the determination of editors that inclusion of the term is likely to be useful to readers.

It was added with the argument (in the edit summary)

This is what the vote really means.

Here, “the vote” refers to Wiktionary:Votes/2014-11/Entries which do not meet CFI to be deleted even if there is a consensus to keep, which failed with 7 in favour, 9 against. However, in no way does the discussion around the vote refer to a judgement about a term’s being “useful to readers”; this interpretation appears to have sprouted from the imagination of one editor. (Note that Wiktionary:Idioms that survived RFD also does not refer to “usefulness”.) I also think that the meaning of the vote was that a consensus-to-keep overrides the CFI in general, not specifically for the application of the idiomaticity criterion. Such exceptional overriding applies likewise (IMO) to names of specific entities, or terms originating in fictional universes. Thus, it is misplaced in the section Idiomaticity. But does it need to be mentioned at all? Surely, this is not meant to be a free pass for ignoring the criteria, as some appear inclined to see it. I think it is better if its addition is undone. --Lambiam 07:31, 20 September 2021 (UTC)[reply]

Delete the sentence. Enough people already vote keep in RFD without a valid policy rationale that we don't need to enshrine that "keep just because" attitude in the CFI. P U C – 13:26, 20 September 2021 (UTC)[reply]

Keep something of the sort: Terms like gagana Tokelau and ižoran keeli are useful for the readers (for instance, they portray the standard formula for a language name), but strictly speaking SOP and not subject to the phrasebook. I would be okay with some rewriting though, if that's necessary. Thadh (talk) 13:36, 20 September 2021 (UTC)[reply]

@Thadh: There have been several RFD's about this (Talk:bulgarian kieli, Talk:afrikanų kalba, and others), and the recent consensus has been to delete these combinations (while they were kept in older RFD's). The pattern can be described at keeli and gagana, and I think the entries you've mentioned should be deleted. P U C – 09:49, 22 September 2021 (UTC)[reply]

I don't agree. Moreover, I'm not really sure why language names are SOP anyway, because a language is not equal to its place of origin, and the current rule contradicts our goal of being descriptive, not prescriptive. For English, it's another story, because "English" or "French" are already nouns denoting the language on their own. However, that "Tokelau language" means "Tokelauan", rather than English is not as evident as it seems from our Euroamerican point of view. Thadh (talk) 12:53, 22 September 2021 (UTC)[reply]

Delete because it is deceptive, being included on a policy page, when no vote explicitly endorsed it, AFAICT. If we keep things against our policy, we can continue to do so OR we can have a vote to enshrine some more careful version of the language in question. DCDuring (talk) 15:48, 20 September 2021 (UTC)[reply]

Does this need a vote for it to be removed? It was IMO inserted out of process. The sentence may be a valid observation of voting behaviour at RfD’s, but apart from one’s view on the usefulness of terms like lingua corsa, the sentence is not itself useful, as it does not give any guidance, one way or another. --Lambiam 08:57, 22 September 2021 (UTC)[reply]

I think that it should be removed without a vote, and that if people want to see a form of that argument appear in the CFI, it's that proposal that should be put to the vote: "usefulness as a criterion for keeping" or something. But this is likely to raise some objections by the person who added the sentence.

Moreover, it could be interesting to run the original vote a second time. The editor base has changed quite a bit since 2014, and there have been additions to the CFI (for example the translation hub policy) that could make certain people reconsider their position. Also, the scope of the vote could be broadened a bit to make it less partisan: it's not only that entries which don't mee CFI should be deleted despite a consensus to keep, but also that entries which do meet CFI should be kept even if there's a consensus to delete (even though I think that scenario never happens in practice). I suspect this proposal would fail again, but it would be interesting to see what arguments are raised this time. P U C – 09:42, 22 September 2021 (UTC)[reply]

I think it is problematic to think that the criteria for inclusion of terms can readily shift back and forth based on the changing editor base. With respect to "usefulness" as a criteria, what exactly is the point of adding another dictionary to the world at all, other than usefulness? bd2412 T 21:39, 22 September 2021 (UTC)[reply]

It is not without reason that Wikipedia explicitly mentions “it’s useful” as an argument to avoid in deletion discussions. By adding all material that may be useful to some users, the dictionary as a whole may become less useful to most users. As I see it, the CFI are designed to provide guidance for striking a balance between the incidental and the universal so as to promote general usefulness as a dictionary. The judgement of usefulness on the basis of individual entries is bound to be very subjective and will predictably result in uneven application depending on which small selection of editors happens to weigh in. --Lambiam 08:24, 23 September 2021 (UTC)[reply]

"it is problematic to think that the criteria for inclusion of terms can readily shift back and forth based on the changing editor base": you're putting words in my mouth, though admittedly it wasn't very clear what I had in mind by mentioning the changing editor base. What I meant is that since the players aren't the same, a new vote wouldn't just be a rehash of the old one, but would give us the opportunity to see if the arguments that prevailed at the time are still found to be relevant by current editors; to see if more recent editors, who arrived after the introduction of the translation hub policy, still find CFI too imperfect to consent to its being binding; to perhaps read new arguments, etc.

Also, I don't know what you're getting at with your second sentence, and I don't see how it pertains to the matter at hand. In fact, it is not the first time I have trouble following your argumentation; I'm sorry to say this, but when I read your answers around here, I often feel like I'm being gaslighted: you're kicking in touch by resorting to vague general statements that are impossible to refute. I'd be grateful if you could spell things out a bit more in debates. P U C – 18:38, 23 September 2021 (UTC)[reply]

Control over what words mean. Equinox ◑ 20:04, 23 September 2021 (UTC)[reply]

"Gaslighted" is offensive, and feels like an attack. Reference to a changing editor base certainly sounds like a hope that different people will mean a different outcome. With respect to the addition to the policy, if anything, take out the "based on the determination..." portion, and leave it at "In rare cases, a phrase that is arguably unidiomatic may be included by the consensus of the community". That is the outcome of a discussion clearly rejecting the opposing position. We need something to counter what I can only describe as a creeping parochial prescriptivism, wherein editors ignore the existence of set phrases and various other tests for the utility of a collocation, and push for deletion because it's obvious in their corner of the world that a phrase intends sense 5 of one word and sense 7 of the other, or the like. In a recent discussion, someone asserted that a phrase was SOP by comparing it to "fairies at the bottom of the garden", oblivious to the fact that in American English we don't refer to gardens as having a "bottom", so the phrase is gibberish. bd2412 T 15:40, 24 September 2021 (UTC)[reply]

What you are just conveying is the languagist position that the vulgar Usonian usage you are familiar with is the measure of all things, even though it is well possible that idioms are gibberish and your feelings too. Absent erudition, gibberish becomes laws. Isn’t it more and more fashionable now in the US, that even the laws make no sense and estrange you? Similar it is with common language usage, it might be outright absurdity that rejects systematization. Or with “art” …

So the mere “idiomaticity” or “utility of a collocation” is not the essence of what makes a word-combination demand inclusion, although opinion-makers must know such combinations covered by no dictionary, to manipulate the masses that is.

We might have {{&lit}} linking specific senses via {{senseid}} or usage notes under a {{&lit}} entry.

So delete, we shan’t have statutes contradicting themselves. Fay Freak (talk) 16:25, 24 September 2021 (UTC)[reply]

Perhaps those editors who want a dictionary based exclusively on British English usage (which amounts to 6% of the English-speaking world) should go write one under their own banner. bd2412 T 16:50, 24 September 2021 (UTC)[reply]

Ignoring the unnecessarily inflammatory part, I think the key here is that there's a difference between the practice of how the rules are applied and the rules themselves. The CFI are the codification of community consensus, the rules that the community has decided to follow. The community's occasional practice of setting aside the letter of the rules in rare cases is part of the way Wiktionary works, but should not be part of the rules. An analogy in US law is the principle of jury nullification: the fact that a jury may choose not to convict someone who is guilty according to the letter of the law doesn't mean that there should be a clause in the statutes saying "in rare cases, a jury may decide not to convict".

Besides which, it's not specific to the context where it's mentioned, though there may not have been occasion for it to occur elsewhere. Chuck Entz (talk) 17:02, 24 September 2021 (UTC)[reply]

CAT:English terms containing 'n', CAT:English terms spelled with &[edit]

I think ’twould be better to not have these redundant categories; instead we can relocate all terms in the Derived terms section of the entries 'n' and & to avoid unnecessary duplication. See previous discussion. (I did not take to rfd for I thought ’tis better to have a consensus on this first.) ·~ dictátor·mundꟾ 12:25, 21 September 2021 (UTC)[reply]

How are the categories populated and how will the derived term section be maintained? I'm not convinced that A & E is derived from A, & and E; it is etymologically derived from abbreviating the three words, so it is unnatural to record these abbreviations as derivatives of &. If you rely on normal editor action, it ain't gonna happen. The category seems to rely on a wetware bot by the name of @Wikihistorian. --RichardW57m (talk) 15:25, 21 September 2021 (UTC)[reply]

Of course the cats are added manually to each page, but I am convinced with your suggestion that A & E is not etymologically derived from &. Hence, can we tweak this proposal, that is: delete the Derived terms section at &? ·~ dictátor·mundꟾ 15:40, 21 September 2021 (UTC)[reply]

How to cover a topic[edit]

Has anybody written an essay or guideline for how to cover the terms of a particular topic? One could start by locating the closest topic category (say, en:Electronics) and see which words are already there, and then brainstorm a list of missing entries, adding existing entries to the category and creating missing entries. But maybe there are more structured approaches? If you have done this, how did you do it? --LA2 (talk) 21:34, 21 September 2021 (UTC)[reply]

You could look for existing technical dictionaries for your chosen topic. Or extract words from relevant Wikipedia articles. DTLHS (talk) 21:42, 21 September 2021 (UTC)[reply]

Notability requirement for names of people[edit]

In addition to the regular criteria for inclusion, I propose a notability or popularity requirement for names of people. This will reduce the incentive to argue about whether a name should be treated as belonging to a particular language. And as important, it will reduce clutter. Clutter is bad because even with infinite storage space the crap still gets in the way. (Recall the angst of the librarian in A Fire Upon the Deep over adding another level of indexing to the vast library.) Names these days are tags. Carter is not limited to carters. One of the Dutch editors has a policy of not adding names with less than a thousand name-bearers. That seems like a reasonable threshold, so I propose the following rules for names of people in well-documented languages:

To be included a name must have had at least a thousand name bearers at the same time, including alternative spellings but not variants like diminutives. Each individual only counts towards one language (L2 header).
Exceptions may be made for less common names clearly in widespread use (not merely with three citable uses) such as
- A name of a famous person known by a one word name. So, Prince: A name of men, especially the name of a former musician.
- A well known metaphorical use, like Faustian.
- A translation hub when cognates of a rare English name are common in other languages.

The only value we add to the countless online lists of names or baby names is etymology, so we could make the threshold different for names with and without etymology.

Thoughts? Vox Sciurorum (talk) 13:17, 22 September 2021 (UTC)[reply]

I don't think we should have stage names ~~like "Prince"~~. Oh, it's his real name! Well, he shouldn't have a sense line anyway. He's just an individual person with that name; something for Wikipedia. Equinox ◑ 15:25, 22 September 2021 (UTC)[reply]

We have a history of including notable people as senses. We do not need to continue that tradition if it clutters our pages. Xi is a better example because the real name is not an English word. Vox Sciurorum (talk) 16:23, 22 September 2021 (UTC)[reply]

The required number of bearers should be by percentage of native speakers. A thousand bearers of Hawaiian or Icelandic names would be impossible. Also data is missing from many countries, or the native tongue of name-bearers is not known. Basically I would welcome such rules. Good luck to Vox Sciurorum for setting up a vote. Discussions about given names tend to become very emotional ("This is aimed against MY name or MY mother tongue"). --Makaokalani (talk) 20:07, 24 September 2021 (UTC)[reply]

Hawaiian is not a well-documented language and would not be subject to the rule. For Icelandic, we can say that inclusion in a government's list of acceptable names takes precedence over the notability rule. Vox Sciurorum (talk) 20:21, 24 September 2021 (UTC)[reply]

enwiki transwikis[edit]

Hello Wiktionarians, a discussion on enwiki is open about how to handle cases where one of our articles is actually a definition that likely belongs over here instead. If you have any tips (hopefully a page you could link to?) please feel free to drop us a line at: w:en:Wikipedia:Village_pump_(policy)#Transwiki_to_enwikt Thank you, Xaosflux (talk) 18:57, 24 September 2021 (UTC)[reply]

We don't want transwikis. I have commented there. —Μετάknowledge^{discuss/deeds} 19:04, 24 September 2021 (UTC)[reply]

Issues with rhymes[edit]

User:Surjection implemented automatic categorization of rhymes. This generated maybe 50,000 new categories (over 30,000 from Finnish alone); I've finally cleared them. In the process I noticed a ton of inconsistencies and I'd like some comments on them:

One thing in particular I notice repeatedly are rhymes beginning with a consonant, e.g. Category:Rhymes:Zazaki/ma, Category:Rhymes:Turkish/si, Category:Rhymes:Indonesian/tal, Category:Rhymes:Czech/r̝aːp, Category:Rhymes:French/tif, Category:Rhymes:Nepali/t̪e, Category:Rhymes:Neapolitan/tʃa, Category:Rhymes:Malay/ŋunan, etc. etc. Presumably these are all mistakes? Do any languages actually have rhymes beginning with a consonant? Hungarian in particular has a lot of them: Category:Rhymes:Hungarian/riː, Category:Rhymes:Hungarian/t͡su, Category:Rhymes:Hungarian/kːaː; this seems consistently to be the case for rhymes ending in a vowel. @Adam78, Panda10 Are these correct?
Telugu rhymes like Category:Rhymes:Telugu/స. All Telugu rhymes are like this. Who are the native Telugu speakers at Wiktionary who could help clean this up?
Translingual rhymes: There are lots of them and they appear to use British English pronun. Should Translingual rhymes exist at all? It seems questionable to me.
Issues for English rhymes:
1. English rhymes randomly use /r/ or /ɹ/ (lots of examples; cf. Category:Rhymes:English/ɛdə(r) vs. Category:Rhymes:English/ɒdə(ɹ)), cf. also Category:Rhymes:English/ɪntɚ vs. Category:Rhymes:English/ʌntə(r) vs. Category:Rhymes:English/ʌntə(ɹ). What should be the convention here?
2. English rhymes are inconsistent in schwas vs. syllabic resonants: Category:Rhymes:English/ɪlɪkəl vs. Category:Rhymes:English/ɛlədʒəbl̩.
3. Most English rhymes only use British English pronunciation, including examples like Category:Rhymes:English/ɔː(ɹ)dʒi, Category:Rhymes:English/ɒɡɹəfi, Category:Rhymes:English/əʊ, Category:Rhymes:English/ɪə(ɹ). Shouldn't we include both British and American rhymes, at the least?
4. Pretending Middle English is pronounced according to Modern English rules: Category:Rhymes:Middle English/ɪə(ɹ) for eglatere.
Issues for French rhymes:
1. French rhymes beginning with a semivowel e.g. Category:Rhymes:French/wœʁ, Category:Rhymes:French/jy
2. French rhymes with multiple syllables: Category:Rhymes:French/uldɔɡ
3. French rhymes with schwa at the end: Category:Rhymes:French/ɑ̃dʁə (vs. Category:Rhymes:French/ɑvʁ)
Issues for Spanish rhymes:
1. Spanish rhymes inconsistent about mid vowels: Category:Rhymes:Spanish/mo̞s vs. Category:Rhymes:Spanish/mon.
2. Stray accents in Spanish rhymes e.g. Category:Rhymes:Spanish/édine.
3. Spanish rhymes potentially with unnecessary phonetic info: Category:Rhymes:Spanish/iŋxe, Category:Rhymes:Spanish/iβtiko, Category:Rhymes:Spanish/aʝa; Category:Rhymes:Spanish/undo vs. Category:Rhymes:Spanish/uðiko with the same phoneme.
4. Spanish rhymes inconsistent about how much phonetic info to include and how to represent it: Category:Rhymes:Spanish/aɡtiko vs. Category:Rhymes:Spanish/iɣtiko; Category:Rhymes:Spanish/oiko vs. Category:Rhymes:Spanish/ejko; Category:Rhymes:Spanish/eunja vs. Category:Rhymes:Spanish/au̯la; Category:Rhymes:Spanish/avia (BTW /v/ occurs as neither a phoneme nor allophone in Spanish between vowels) vs. Category:Rhymes:Spanish/eθja; Category:Rhymes:Spanish/ar vs. Category:Rhymes:Spanish/aɾ; Category:Rhymes:Spanish/armako vs. Category:Rhymes:Spanish/eɾma; Category:Rhymes:Spanish/adra, Category:Rhymes:Spanish/ada, Category:Rhymes:Spanish/able vs. Category:Rhymes:Spanish/aða, Category:Rhymes:Spanish/aβulo.
Issues for German rhymes:
1. German rhymes inconsistent about syllabic nasals: Category:Rhymes:German/ɔstn̩ vs. Category:Rhymes:German/ɔstən
2. German rhymes inconsistent about /j/ vs. /i̯/: Category:Rhymes:German/aljə vs. Category:Rhymes:German/aːli̯ən
Issues for Portuguese and Galician rhymes:
1. In Portuguese, both Portugal and Brazilian rhymes (maybe this is OK): cf. Category:Rhymes:Portuguese/ɐmɨ vs. Category:Rhymes:Portuguese/ɛk(i) vs. Category:Rhymes:Portuguese/aʎi (but last two probably inconsistent); Category:Rhymes:Portuguese/azɨ vs. Category:Rhymes:Portuguese/azi
2. In Brazilian Portuguese, rhymes inconsistent about written -te /tʃi/: Category:Rhymes:Portuguese/awtʃi, Category:Rhymes:Portuguese/ɐ̃tʃi vs. Category:Rhymes:Portuguese/ɛt(ʃ)i, Category:Rhymes:Portuguese/et(ʃ)i vs. Category:Rhymes:Portuguese/ɔt͡ʃiku vs. Category:Rhymes:Portuguese/ɔtiku (this latter one from Portugal?)
3. Galician rhymes inconsistent about final written o: Category:Rhymes:Galician/anʊ vs. Category:Rhymes:Galician/aɲo vs. Category:Rhymes:Galician/aθo̝
Icelandic rhymes inconsistent about voiceless resonants (phonetic detail?): Category:Rhymes:Icelandic/aulkʏr vs. Category:Rhymes:Icelandic/aul̥kʏr; Category:Rhymes:Icelandic/ir̥ka vs. Category:Rhymes:Icelandic/irtna
Russian rhymes inconsistent about /l/ vs. /ɫ/; cf. Category:Rhymes:Russian/al vs. Category:Rhymes:Russian/aɫ (the former much more common)
Malay rhymes: same word will have three possible rhymes like Category:Rhymes:Malay/apar, Category:Rhymes:Malay/par, Category:Rhymes:Malay/ar (one beginning with a consonant). What is going on here?
Catalan rhymes like Category:Rhymes:Catalan/oɾ vs. Category:Rhymes:Catalan/o(r) (no consistency on final -r vs. -ɾ and presence or absence of parens).
Czech and other language rhymes inconsistent in semivowels: Category:Rhymes:Czech/ouʃɛk vs. Category:Rhymes:Czech/ou̯ɦiː, Category:Rhymes:Czech/outɛk vs. Category:Rhymes:Czech/ou̯t; Category:Rhymes:Dutch/ɛin vs. Category:Rhymes:Dutch/ɛi̯nər (in Dutch, the latter is much more common).
Inconsistent whether to write /t͡ʃ/ /t͡s/ or /tʃ/ /ts/ (and similar) cross-linguistically and sometimes within a language; cf. Czech Category:Rhymes:Czech/ɛt͡ʃɛr vs. Category:Rhymes:Czech/ɛtʃɛk
Inconsistent whether to write /mː/ or /mm/ (and similar) cross-linguistically (and sometimes within a language?); the former seems more common.
Stray dots in rhymes; lots of examples in Urdu, e.g. Category:Rhymes:Urdu/iː.mɑː, also in other languages like Category:Rhymes:Malay/ə.ŋ̩ah.
Stray hyphens in rhymes like Tagalog Category:Rhymes:Tagalog/-ot and Category:Rhymes:Tagalog/-is.
IPA stress marks in rhymes, e.g. Category:Rhymes:Danish/ˈʌɪ̯lə, Category:Rhymes:English/aɪˌsɪkəl.
Rhymes using regular /g/ instead of IPA /ɡ/, cf. Category:Rhymes:Polish/ɛgas (very few examples).

Benwing2 (talk) 06:02, 26 September 2021 (UTC)[reply]

Malay /-ə.ŋ̩ah/ is an error. The nasal is not syllabic.

Personally, although I speak US English, I'd prefer the rhymes to be rhotic British. That's because RP makes more distinctions than GA. Just as rhymes can be grouped together for those that conflate for non-rhotic speakers, they can be grouped for ppl who make the marry-merry-Mary merger. It doesn't make sense to me to have separate RP and GA rhymes. For ppl who don't speak either RP or GA, they'd have to conflate some of the rhymes anyway. Better IMO to ask the same thing of everyone. kwami (talk) 08:35, 26 September 2021 (UTC)[reply]

4) Probably special-casing English somehow to only have categories with one of the r-signs; then categorize by the rhotic pronunciation and contain links forwarding to the corresponding rhyme without /r/ in each rhyme category page with a rhotic consonant.

Strictly the problem comes from categorizing from IPA in the first place. For English we would have to give the lexical set a vowel belongs to. Editors tend to reflect the north–force merger of course, for example, but it does not give the ideal categorization, as thus we don’t get to know, for example, whether the vowel is [ɒːɹ] or [oːɹ] in conservative Irish English: it is the former in stork but the page does not tell. It would also make sense to distinguish, lacking NURSE mergers, the vowel [ʊːɹ] in dirt, from the /ɛr/ and germ. So what you actually want is an abstraction layer from the actual realizations, which likely needs to be based on the Middle English vowels, that would allow a poet to sort the rhyme lists according to his dialect. Since this is practically different owing to the taciturnity of language material about the lexical sets, or the specialist knowledge demanded by this that only Middle English editors are expected to have, one might have concurrent systems, one of which gives more information than the other. (I did not give this view before, not to completely confuse and unsettle you pending your making a first implementation, Benwing.)

7a) I think it should be Category:Rhymes:German/ɔstən since that is the phonematic level, and the pronunciation aimed at by speakers—syllabic nasals are exaggerated by language students. Even when the majority of speakers in the majority of cases realizes a syllabic nasal, express schwa is by far not fictitious.

7b) /i̯/ is not /j/. There is a difference whether it is a part of a diphthong or already a semiconsonant. In careful speech words which have the former may even have the vowel as a whole syllable, perhaps extra-short. Though ultimately the words which have it can all have /j/ and this sounds authentic at least in Northern Germany.

19) I guess IPA /ɡ/ is logically expected. For if we have to use it in {{IPA}} it is confusing to have the opposite in any other place related to pronunciation sections. Fay Freak (talk) 14:16, 26 September 2021 (UTC)[reply]

@Benwing2, as far as Hungarian is concerned, I may have made a mistake by creating consonant-initial rhyme pages. However, the original concept of rhymes was not workable for Hungarian as it always has the stress on the first syllable, so e.g. we would have had to create Rhymes:Hungarian/ɛkːylømbøstɛthɛtɛtlɛnʃeːɡ for megkülönböztethetetlenség, resulting in an inordinate amount of pages (tens of thousands). For words ending in a vowel, I had practically two choices: either creating very few pages for their last vowel with tens of thousands of entries, or creating (again) tens of thousands of pages for their last -VC(C(C))V sequence with very few entries on most of them. I intended to take the golden mean by creating rhyme pages for their -CV endings. It also had the advantage of lending itself to this table (on the left). If we want to change it, the only alternative I can imagine is splitting them into -VC(C(C))V pages because some -i final rhyme pages are already crowded (e.g. Rhymes:Hungarian/ʃi). Then again, we'd lose the option of having an overview of them all in a single chart. True enough, there are also separate pages for each of the 14 -VC(C(C(C))) rhymes (like Rhymes:Hungarian/ɛ-, available with the "Navigation" bar above). Maybe there's a meaningful way of arranging -VC(C(C))V pages as well. (?) Adam78 (talk) 15:41, 26 September 2021 (UTC)[reply]

What counts as a rhyme in Hungarian poetry? That's the issue here. In English, a rhyme starts with the vowel of the stressed syllable, but we can't assume that's the case for other languages. E.g. French has no stress, so stress rules obviously aren't going to work. kwami (talk) 22:33, 26 September 2021 (UTC)[reply]

But doesn't e muet count for French rimes? >:) --RichardW57m (talk) 14:27, 27 September 2021 (UTC)[reply]

@Kwamikagami I never really thought of it in terms of poetry, but instead in the phonetic sense, like a reverse dictionary (except that the latter gives preference to spelling over pronunciation). In poetry, Fönn az égen ragyogó nap; ¶ Csillanó tükrén a tónak and Királyasszony, néném, ¶ Az egekre kérném may be rhymes, despite the difference in the very last consonant and the consonant (cluster) preceding the last vowel, respectively. (Examples taken from Rím in Hu. WP; you can browse it for other types.)

In fact, I've been wondering for a long time why it's not possible in the English-language Wiktionary to browse categories in the reverse order, as in the Catalan-, German-, Latin-, Polish-, and Russian-language Wiktionaries. I must note that the German version is especially impressive.

Overall, if stress is not accounted for (as in Hungarian or in French), I think rhymes could be taken either (1) as an agreement of the last n sounds (phonemes), with whatever n working best for the arrangement of entries available, or (2) as an agreement of the last one or two syllables, counted from the last or the penultimate vowel. (I can rearrange the vowel-final rhyme pages in Hungarian to include two syllables, from the penultimate vowel.) Adam78 (talk) 14:35, 27 September 2021 (UTC)[reply]

Does anybody use rhymes categories? If so, who and how? Vox Sciurorum (talk) 14:57, 27 September 2021 (UTC)[reply]

I don't use them on Wikt, because IMO they're not complete enough to be useful, but I have used rhyming dictionaries. The reasons were (a) finding a rhyme for poetry and (b) verifying claims of refractory rhymes. AFAIK, (a) is what they were created for.

From above, it sounds like it would be difficult to decide what goes into a particular rhyming category for Hungarian. kwami (talk) 18:29, 27 September 2021 (UTC)[reply]

I've deleted ~50 Spanish categories that had one issue or another, plus some egregious characters that don't belong in other languages. Spanish is mostly standardized now, except @Benwing2 isn't Category:Rhymes:Spanish/aʝa correct? It's the module's default and the most common pronunciation, it seems. I'm not sure how to tackle parentheses in Catalan. Do we care if rhymes are separated by spelling? Probably not, right? If that's the case, I'd categorize e.g. agricultor as rhyming with both -o and -oɾ. Ultimateria (talk) 04:28, 28 September 2021 (UTC)[reply]

As for Catalan see at ca.wiki -oɾ, -o(ɾ) and -o categories. Depending of the dialect, the rhyme is completed with 1, 2 or 3 categories. For a Valencian it is confusing to find agricultor at -o, and for a Central Catalan it is confusing at -oɾ. That's the reason for rhyme -o(ɾ), as in ca:agricultor. Using /r/ or /ɾ/ in a coda is not phonemic. --Vriullop (talk) 08:58, 28 September 2021 (UTC)[reply]

English compounds containing spaces?[edit]

Are English compounds always single or hyphenated words (for example, heavyweight and blue-collar), or can a term consisting of several words separated by spaces (for example, barking mad, ground zero, and prince regent) be considered a compound? "w:Compound (linguistics)" says: "As a member of the Germanic family of languages, English is unusual in that even simple compounds made since the 18th century tend to be written in separate parts. This would be an error in other Germanic languages such as Norwegian, Swedish, Danish, German and Dutch. However, this is merely an orthographic convention: As in other Germanic languages, arbitrary noun phrases, for example "girl scout troop", "city council member", and "cellar door", can be made up on the spot and used as compound nouns in English too." However, @Metaknowledge doubts the correctness of this. (@Inqilābī.) — SGconlaw (talk) 17:29, 27 September 2021 (UTC)[reply]

Note that "Category:English compound words" and its subcategories contain numerous entries that are terms made up of two or more words separated with spaces. — SGconlaw (talk) 17:34, 27 September 2021 (UTC)[reply]

You can generally tell a compound because one of the components loses its stress. E.g. a 'high school' is stressed on 'high', while 'high' + 'school' (a school at a high altitude? on pot?) is stressed on both. Orthography isn't a good guide. kwami (talk) 18:21, 27 September 2021 (UTC)[reply]

@Sgconlaw is misrepresenting what I said, and I do not appreciate him doing so. Here is what I said: "In a comparative Germanic context, I certainly agree with Wikipedia's statement. However, for the purposes of English lexicography, a distinction is generally drawn between spaced and unspaced compounds, with only the latter being considered compound words." There is no reason to include "high + school" as the etymology of high school when both words are already linked in the headword line. —Μετάknowledge^{discuss/deeds} 20:00, 27 September 2021 (UTC)[reply]

@Metaknowledge: sorry, I did not mean to. But then I don't quite understand what you mean – are spaced compounds then not compound words, despite the use of the word compounds? — SGconlaw (talk) 20:13, 27 September 2021 (UTC)[reply]

If you don't know how to summarise someone else's statements, you should simply quote them. The distinction is that traditionally, English lexicography has drawn a division between lemmata and words; we generally refer to them interchangeably on Wiktionary. —Μετάknowledge^{discuss/deeds} 20:25, 27 September 2021 (UTC)[reply]

There’s no reason why English should be receiving a special treatment in a multilingual dictionary. ·~ dictátor·mundꟾ 20:28, 27 September 2021 (UTC)[reply]

Still it’s preferable to categorize open compounds using {{com}}. I do not even understand why individual words should be linked in the headword line in the first place. ·~ dictátor·mundꟾ 20:08, 27 September 2021 (UTC)[reply]

These open compounds are more succinctly dealt with in the current way. The multiword form shows that it is a compound (or phrase), and avoids the need for separate linking. Or are you saying that there is an ambiguity between compound words and phrases? --RichardW57 (talk) 20:54, 28 September 2021 (UTC)[reply]

I would just add that, while Metaknowledge objects to ‘treating multiword terms as compounds in the etymology section’, we actually have a convention of manually categorizing such terms as compound nouns and compound adjectives. Anyways, following the orthography to determine compounds is indeed a backward and ignorant practice, that needs to be eradicated in favour of proper linguistic treatment of the term. ·~ dictátor·mundꟾ 20:02, 27 September 2021 (UTC)[reply]

That is not a standardised convention, but it also a separate proposal from adding these excessive etymologies. —Μετάknowledge^{discuss/deeds} 20:25, 27 September 2021 (UTC)[reply]

It's inefficient to treat multiword terms as compounds in the etymology section, and individually linking the components (which is done at the headword). The prevalent practise now is just using the etymology section to show the literal meaning or anything other than A + B. I think that this current practise is better. Svārtava² • 07:19, 29 September 2021 (UTC)[reply]

┌────────────────────────────────────────────────────────────────────────────────────────────────────┘ Apart from what (if anything) should be indicated in the etymology section, are we quite sure that, say, a noun entry consisting of two or more words with spaces between them is not to be considered a compound noun? Thus, even if {{compound}} is not used in the etymology section, the entry should not be manually added to "Category:English compound nouns"? In other words, is it correct to assume that English compound nouns must either be single words, or words separated only by hyphens? (And if so why? I'd love to know the reason, in the light of what the Wikipedia article says.) — SGconlaw (talk) 17:34, 29 September 2021 (UTC)[reply]

@Sgconlaw: This proposal looks like a vote material, as it involves standardising the way we treat compounds. ·~ dictátor·mundꟾ 04:14, 2 October 2021 (UTC)[reply]

Why `{{inh+}}`/`{{bor+}}`[edit]

Following the suggestion of editors like @Sgconlaw, PUC, Erutuon that the underlying issue regarding the templates needs to be resolved first, I am going to state some serious facts that should encourage everyone to use the templates.

The linguistic terms ‘inherited’ and ‘borrowed’ need to be linked to Glossary because we have a convention of linking terminologies or any technical jargons. For example the grammatical cases, numbers, tense and aspect, labels like dated, obsolete, literary, etc. etc. When some people bring the argument that the link is really unneeded, they at first have to consider the fact that on Wiktionary we link any key terms, and we do not even spare definitions!

The terms ‘inherited’ and ‘borrowed’ are necessary to display because the reader has to check the source code to know if the term is inherited. Because of the prevalence of {{der}} (when in fact more specific etymology templates should be used), we cannot ignore the fact that we have simply failed to give a good presentation of etymology, so using {{inh+}} and {{bor+}} solves the problem.

There are many terms that are so conservative that they look like learned borrowings but are in fact inherited. Using {{inh+}} immediately solves that problem, and also eases the process of clean-up of wrong usages of templates.

None of the opposers of the new templates work in Indo-Aryan languages in general (here I am not counting dealing with Sanskrit etymologies, but am referring to Middle and New Indo-Aryan languages specifically), and yet they are opposed to new templates that the advocates of the new templates find very useful. So I think the opposers have no right to oppose their usage when in fact they do not, or hardly, work in those languages.

Just like other etymology templates like {{lbor}}, {{slbor}}, {{clq}}, {{pclq}}, etc., the new templates help prevent typographical errors. When I was new here, I used to publish edits with typos like borowed, I herited; I am still prone to these but I now avoid doing them by carefully checking the preview. No doubt, new editors would find the templates extremely helpful.

The display of the etymological wording is very, very useful to the uninitiated; most people in the world do not know the discipline of Linguistics, and this display helps any to-be amateur linguists to learn about word origin. Helping spreading knowledge is beneficial given that most people have incorrect conceptions about etymologies. For instance, most Indo-Aryan speakers (including lexicographers) think that inherited words are corruptions of tatsamas, they regard learned loans as the correct form, and have no idea what an inheritance is. People do not even know what a Germanic language is, and thus assume English is a Romance language. Thus, the objective of any Wikimedia project is to make knowledge accessible to the whole population: just because 0.01% of the populace is knowledgeable about linguistics does in no wise suggest that we can assume our readers are all-knowing.

So are the opposers willing to acknowledge that the new templates solve existing problems and do no harm to the project?

Thus:

General recommendation: Both {{inh+}} and {{bor+}} should be freely used in all language entries.
Strong recommendation: The new templates must be used in language entries where the editors favour their usage.

That said, there are of course cases where {{inh+}} and {{bor+}} are unneeded. Protolanguages need not use these new templates. Also, when the etymon is a substrate word, the wording ‘borrowed from’ is unnecessary. ·~ dictátor·mundꟾ 20:57, 27 September 2021 (UTC)[reply]

Some of your arguments are quite unconvincing, and I hope I'm speaking for most opposers of the templates:

"The linguistic terms ‘inherited’ and ‘borrowed’ need to be linked to Glossary because we have a convention of linking terminologies or any technical jargons." - this already implies the terms are needed in the etymology, and thus not an argument if they aren't.
"The terms ‘inherited’ and ‘borrowed’ are necessary to display because the reader has to check the source code to know if the term is inherited." - This implies that readers don't assume inheritance, which I'm pretty sure they do, unless we do our best to resist this.
"[T]he prevalence of {{der}} (when in fact more specific etymology templates should be used), we cannot ignore the fact that we have simply failed to give a good presentation of etymology" - When more specific etymology templates should be used, the etymology is done badly, and should be fixed. We can't do anything about errors in our dictionary, other than fix them.
"[M]ost people in the world do not know the discipline of Linguistics, and this display helps any to-be amateur linguists to learn about word origin." - I know it sounds harsh, but that's simply not our job. If we were to include the terms anyway, I would agree to link them, but I doubt we need to. Furthermore, most of our readers are familiar with linguistics to some degree, and if they aren't, it's doubtful they want and need to know the distinction between inheritance and borrowing
"Just like other etymology templates [...] the new templates help prevent typographical errors." - I doubt that makes much of a difference: We still need L3s and L4s spelled properly, and IIRC, ToilBot (talk • contribs) cleans up such misspellings quite well. And, once more, it implies we need the terms.
"None of the opposers of the new templates work in Indo-Aryan languages in general" - Sure, but my problem doesn't lie in Indo-Aryan languages, but rather in the templates spreading to other communities, which I do edit.

So, all in all, I do not think the templates solve any real problems in mother-daughter relations where inheritance isn't ambiguous, and that we should rather focus on creating an environment where these templates are not needed. Thadh (talk) 21:45, 27 September 2021 (UTC)[reply]

@Thadh: I see that you did not really address the concerns. 1) Many people are opposed to the linking of the words ‘inherited’ and ‘borrowed’, hence I justified & explained why the link should be there. 2) If the creation of new templates makes fixing errors easier, we should definitely have them. 3) Making knowledge accessible to the whole of humanity is our foremost job; I myself I did not know about linguistics before 2019, and Wiktionary helped me to understand the difference between inheritances and borrowings — your assumption about readers is not helpful. 4) Toilbot fixes only headings, not any other typos, I think you are mistaken there (correct me if I am wrong).

All I can understand is that you just personally dislike the new templates. ·~ dictátor·mundꟾ 13:32, 28 September 2021 (UTC)[reply]

@Inqilābī: I feel like we're having the same conversation over and over again. First of all, the creation of the new templates doesn't make fixing errors easier, because "Inherited from Old Hindi, from Sanskrit" is just as 'ambiguous' as a bare "From Sanskrit". Second, the fact you learned through Wiktionary doesn't mean we should be focussing on teaching linguistics on this website - that's not what this wiki is for, you ought to attend classes for that. We use the practices of linguists to write the dictionary, and the fact some people don't know what a noun is doesn't mean we should linkify our headers either! Finally, if misspellings like "b rrowed" or "iherited" are such a big problem, I'm sure @Erutuon wouldn't mind adding these to their code.

And yes, I don't like the templates for multiple reasons. In my opinion, "Inherited from" is cluttering and useless, "Borrowed from" shouldn't be obligatory either and finally, creating these templates after the vote about them didn't pass without any discussion with the opposers and continuing the use of these after multiple pleas to stop is just outright uncollaborative. Thadh (talk) 18:11, 28 September 2021 (UTC)[reply]

using only from is a prevalent bad practise even for {{bor}}. Not all {{bor|hi|sa}}s are properly preceded by Borrowed from. This can really lead to mistakes. one statement I find 100% true is

most Indo-Aryan speakers (including lexicographers) think that inherited words are corruptions of tatsamas, they regard learned loans as the correct form, and have no idea what an inheritance is.

yes, i regarded the inherited words as corruptions. See also User_talk:Bhagadatta/2020#Borrowings,_Learned_borrowings_and_inherited_words. i had not as much linguistic knowledge back then. Strangely I considered "derived" as the corrupted ones/tadbhavas and "borrowed" as tatsamas. From seeing जाल (jāl) (at that time which I believed was a tatsama, as of the similar spelling) I got the ridiculous thought the inherited were actually tatsama! Once I actually used {{inh}} for a tatsama. I can only say had these templates been used then, all this wouldn't happen. These templates are especially beneficial in IA languages for newcomers and readers alike. Svārtava² • 06:41, 28 September 2021 (UTC)[reply]

As someone who works in Indo-Iranian languages, I can say the above is total nonsense with the purpose of creating a false dilemma. Tatsamas are no different than any other borrowings from Latin or Greek. --{{victar|talk}} 06:58, 28 September 2021 (UTC)[reply]

it isn't. Svārtava² • 07:19, 29 September 2021 (UTC)[reply]

@Victar: You deal with protolanguages, and I said there is no need to use those templates for protolanguages— so why do the templates bother you? ·~ dictátor·mundꟾ 13:00, 28 September 2021 (UTC)[reply]

You think because I create proto entries means I don't understand and work in the languages below them? SodhakSH/Svartava is also adding the templates to every page he edits, whether it be Pali, English, French, whathaveyou. --{{victar|talk}} 15:44, 28 September 2021 (UTC)[reply]

There is no problem. As it is I don't regularly edit Pali, English, French. Svārtava² • 03:19, 29 September 2021 (UTC)[reply]

┌────────────────────────────────────────────────────────────────────────────────────────────────────┘ To resolve the underlying issues, I feel what is needed is a discussion about whether we should (1) always use terms like "Borrowed from" and "Inherited from" in etymology sections; (2) never use these terms; or (3) leave it to editors to use their discretion as to whether the terms should be used. Then the {{bor}} and {{inh}} templates can be updated accordingly, and {{bor+}} and {{inh+}} deleted. — SGconlaw (talk) 13:39, 28 September 2021 (UTC)[reply]

@Sgconlaw: In a wiki, we the editors should cater to the needs of both the layperson and the educated — thus what is supposed to be done is to provide the full, detailed information that would satisfy someone who studies that subject/topic, and yet in a manner that the uninitiated is able to the grasp it. Now, in this case, most people in the world do not know about etymologies, so they have a right to see the linguistic terminologies that are used to describe word origin. We are, at the same time, showing the actual information someone interested (whether an amateur or an expert) is looking for. This justifies the implementation of the templets {{inh+}} and {{bor+}} in our etymologies. However the opposers of the templates do not care about all this.

Now, due to the opposition, the best way is to seek a compromise. The templates are favoured by Indo-Aryan editors the most, so the templates should be allowed in Indo-Aryan entries. There are of course editors working in various other families that support the templates as well, but at this stage I am not really sure how the option of ‘leave it to editors to use their discretion […] ’ would really work out, given this malevolent attitude: ‘If you do such replacements, I will revert you (particularly in the languages I work with); I'll also feel free to replace new instances with plain text (particularly in the languages I work with)’. Maybe you would be able to settle the dispute as an uninvolved editor? ·~ dictátor·mundꟾ 16:49, 28 September 2021 (UTC)[reply]

@Inqilābī: the way I see it, these are arguments that should be raised in a discussion about the underlying issue. So long as the underlying issue is not resolved, I don't see how arguing about why {{bor+}} or {{inh+}} should be used is going to change anything. I also don't see how agreeing to the use of the templates in Indo-Aryan entries and no others is in any way principled. — SGconlaw (talk) 18:26, 28 September 2021 (UTC)[reply]

Having both {{inh+}} and {{inh}} is an easily remembered way of choosing to have or not to have the preamble 'inherited from'; not all of us can remember the parameters that we may need - it is hard enough to remember what tailoring capabilities are available. As a consumer, 'from' is irritatingly ambiguous - has the author left out a word or not? There's also the problem that the reader may not agree with out ancestry assignments. If an etymology merely says 'From Sanskrit' for a word in a living language, it must surely mean '[ultimately] borrowed from Sanskrit', for in normal parlance no living language descends from Sanskrit. Nynorsk is another problem area. The downside of having two templates is that some editors hope to understand the raw wikicode, and every extra notation is a further burden on the memory. That is why we try to keep parameter names consistent between templates. --RichardW57m (talk) 16:08, 28 September 2021 (UTC)[reply]

Dude, all due respect, if you falsely believe "no living language descends from Sanskrit", you really shouldn't be making any statements on the language or its family. --{{victar|talk}} 18:23, 28 September 2021 (UTC)[reply]

I would refer the learned gentleman to Ancestor of Middle Indo-Aryan. But rest assured, my belief is correct with regard to the normal meaning of the word 'Sanskrit'. It is Wiktionary that is bending the term. --RichardW57 (talk) 20:18, 28 September 2021 (UTC)[reply]

@RichardW57: We could of course switch over to the name Old Indo-Aryan, and reserve the name ‘Sanskrit’ only for learned loans. But then the way we present our etymology needs to change: ‘From OIA (cf. Sanskrit [Term])’. Would you be happy with such a change? This would especially be helpful seeing as {{inh+}} looks doomed to die. ·~ dictátor·mundꟾ 02:58, 2 October 2021 (UTC)[reply]

I won't be happy. The current one with sa as ancestor of IA is better, even linguists like Turner and McGregor take Sanskrit as IA languages' ancestor. Anyways that's not the issue to discuss here. {{inh+}} isn't doomed to die. Svartava2 (talk) 03:11, 2 October 2021 (UTC)[reply]

Yea, a troll is never happy with anything. ·~ dictátor·mundꟾ 03:27, 2 October 2021 (UTC)[reply]

wow, sweet language. i am one of the most active IA editors, dealing with descendants of Sanskrit every now and then. I wouldn't opt for more complexity, we already have PIA. thanks for the lovely comment Svartava2 (talk) 03:37, 2 October 2021 (UTC)[reply]

You, Svartava2, are coming close to a lie here. They use it as a substitute for the unattested ancestor, because for most words Sanskrit is close enough. To answer Inqilābī, I would be happy with that. What I am distinctly unhappy with is linking to the hindutvin-infested Wikipedia article for 'Sanskrit'. --RichardW57 (talk) 06:12, 2 October 2021 (UTC)[reply]

user talk:Bhagadatta/2020#Sanskrit vs "pre-Sanskrit", @Bhagadatta, @AryamanA Svartava2 (talk) 06:41, 2 October 2021 (UTC)[reply]

@RichardW57: As youre quoting me in that link, obviously I agree with you to some extent. But just as English is descended from various dialects of Old English, so are MIA languages from various dialects of OIA. We could reconstruct Proto-Old English and call that the ancestor of English, but the redundancy makes it a silly notion, and instead we use West Saxon as the default. Its not a perfect analogy, but it gets the point across. --{{victar|talk}} 00:16, 3 October 2021 (UTC)[reply]

@Victar: The imperfections in the analogy are where the differences arrive. The analogy Old English:West Saxon::OIA:Sanskrit breaks down because we generally trace etmologies back to 'Old English' rather than 'West Saxon', and one can easily find find references to explicitly Anglian forms. There's the additional wrinkle that West Saxon does not borrow from Middle English, whereas Sanskrit does borrow from Prakrit. When a reader sees the word 'Sanskrit', he is liable to think that we mean Sanskrit. --RichardW57 (talk) 10:30, 3 October 2021 (UTC)[reply]

The key point for this discussion is that when a Middle or New Indo-Aryan word is described as being 'From Sanskrit', in simple cases different readers will have different interpretations - inheritance, borrowing or non-resolution of the issue. That is not good. --RichardW57 (talk) 11:01, 3 October 2021 (UTC)[reply]

I would also remind people that if we use a glossary to define our terms, our usage of those terms, at least, when linked to the glossary, should conform to the definition in the glossary. --RichardW57m (talk) 16:08, 28 September 2021 (UTC)[reply]

For the record, I am in support of {{bor+}} if: 1. users don't systematically replace {{bor}} with {{bor+}}, and 2. the overkill glossary link is removed. {{inh+}} is a non-starter for me, because I see absolutely no use for it. This was essentially the compromise put forth a month ago, --{{victar|talk}} 18:23, 28 September 2021 (UTC)[reply]

@RichardW57m: my preference is for just {{bor}} and {{inh}} but I have no strong feelings about this if editors prefer to have {{bor+}} and {{inh+}}. The main thing is to find consensus on the use of the phrases "Borrowed from" and "Inherited from" in etymologies instead of arguing about the templates. — SGconlaw (talk) 18:29, 28 September 2021 (UTC)[reply]

True. SodhakSH/Svartava believes all etymologies should explicitly state "Inherited from", including English entries, and is willing to edit war over it. --{{victar|talk}} 18:42, 28 September 2021 (UTC)[reply]

Yeah, even I was thinking about that. Inqilābī also adds {{inh+}}. I don't believe it is a problem if inheritance is stated. And many English entries like this and many French entries like this had "inherited" before the birth of the templates. Svārtava² • 03:18, 29 September 2021 (UTC)[reply]

Victar and Thad seem to think that 'inherited from' is redundant. There is also an opinion around that readers will know what the ancestors of a language are. I believe both of these are unsound assumptions. I therefore favour starting an entry with an explicit 'inherited' or 'borrowed'. Thad raises the issue of chained etymologies, such as Hindi from Old Hindi from Sanskrit. In cases like this, I think we can sensibly rule that readers should read the subsequent modes as being by inheritance, i.e. instruct editors to visibly note intermediate borrowings. (Personally, I dislike chained etymologies because of the risk of multiple repetitions of an incorrect or out-of-fashion etymology.)

An example that would then need to be fixed is:

From {{bor|en|frm|impeccable}}, from {{der|en|la|impeccabilis||not liable to sin}}, from {{m|la|im-||not}} + {{m|la|pecco|peccare|to err, to sin}}.

The Latin word is borrowed by Middle French, not inherited from Middle French. There is no template available to generate the word 'borrowed' for that transmission. --RichardW57m (talk) 15:48, 30 September 2021 (UTC)[reply]

Incidentally, systematic suppression of 'inherited from' or 'borrowed from' as appropriate could be implemented on the basis of the languages if we used {{inh+}} and {{bor+}}. For example we could program that Latin to French is inheritance by default (if that is right) but that Latin to Romanian is borrowing by default. Personally I think such rules would be extremely confusing to the reader. --RichardW57m (talk) 12:09, 29 September 2021 (UTC).[reply]

This isn't the most important issue, but my name is Thadh with a final dh Thadh (talk) 16:13, 29 September 2021 (UTC)[reply]

Sorry @Thadh, that wasn't my only mistake in the post. While Romanian may have more words ultimately borrowed from Latin than inherited from Latin, a lot of them were borrowed via another Romance language, so for words whose 'immediate' source is Latin, inheritance will still be commoner. There may be other significant features, as I gather a lot of the loans entered the language as learned semi-learned borrowings to replace words of Slavonic origin. --RichardW57 (talk) 07:59, 30 September 2021 (UTC)[reply]

Certainly we must assume that the average consumer of Wiktionary knows nothing of linguistics; they're merely using a dictionary. Likewise, all this fuss about linking inherited or borrowed seems absurd; when in doubt, link them.--Prosfilaes (talk) 01:07, 30 September 2021 (UTC)[reply]

Glossary definition of inherited.[edit]

I've improved (I hope) the glossary definition of inherited to read 'through regular or sporadic sound change' rather than 'through regular sound change'. --RichardW57m (talk) 12:24, 29 September 2021 (UTC)[reply]

Wiktionary:Beer parlour/2021/September

Quicker deletion/removal of uncited RFVs[edit]

Option 1: 2 weeks[edit]

Support[edit]

Oppose[edit]

Abstain[edit]

Option 2: 3 weeks[edit]

Support[edit]

Oppose[edit]

Abstain[edit]

Decision[edit]

Wiktionary:Votes/2020-07/Removing letter entries except Translingual[edit]

New Android app based on Wiktionary - Vedaist[edit]

Modifying the page WT:WDL[edit]

Treatment of Early Modern Korean?[edit]

Etruscan topic[edit]

Hard redirect: ꝛ and ſ[edit]

Past participles - lemmas or not[edit]

Using syn template for alternate plurals[edit]

Template for original research in reconstructed entries[edit]

The 2022 Community Wishlist Survey will happen in January[edit]

Proposing and wish-fulfillment will happen during the same year[edit]

Encouraging wider participation from historically excluded communities[edit]

A new space to talk to us about priorities and wishes not granted yet[edit]

Brainstorm and draft proposals before the proposal phase[edit]

Feedback[edit]

Results for the most contended Wikimedia Foundation Board of Trustees election[edit]

Call for Candidates for the Movement Charter Drafting Committee ending 14 September 2021[edit]

Server switch[edit]

Talk to the Community Tech[edit]

Request about a new transliteration of Hainanese[edit]

Definitions for semantically straightforward inflected forms in subsidiary Pali script.[edit]

Policy on deletion consensus[edit]

Does anyone strongly support keeping anagrams?[edit]

Change to Transliteration of Sinhala[edit]

Can somebody explain WT:SOP to me?[edit]

Turkish -ma forms: gerunds or verbal nouns?[edit]

Announcing ilscripto 0.0.1: pure Lua Scribunto engine[edit]

User:Donnanz’s etyl clean-up methods[edit]

Does Hawaiian have adjectives?[edit]

English prepositions[edit]

What the vote really means[edit]

CAT:English terms containing 'n', CAT:English terms spelled with &[edit]

How to cover a topic[edit]

Notability requirement for names of people[edit]

enwiki transwikis[edit]

Issues with rhymes[edit]

English compounds containing spaces?[edit]

Why {{inh+}}/{{bor+}}[edit]

Glossary definition of inherited.[edit]

Navigation menu

Search

Why `{{inh+}}`/`{{bor+}}`[edit]