Wiktionary:Beer parlour/2022/January


Headers ==Chalita Conjugation== and ==Sadhu Conjugation== in Bengali

(Notifying AryamanA, Kutchkutch, Bhagadatta, Inqilābī, Msasag, Svartava2): Sorry to be pinging the Prakrit editors but there is no wgping entry for Bengali. Several Bengali pages have ==Chalita Conjugation== and ==Sadhu Conjugation== headers. Some example pages: অনুবাদ করা (ônubad kôra), উল্কি আঁকা (ulki ãka), চালানো (calano), করা (kôra). I don't know what the difference is but these headers are nonstandard. I propose instead putting both conjugation variants under a single ==Conjugation== header and either preceding them with a header beginning with a semicolon (which generates boldface text), or (probably better) just putting the words "Chalita" and "Sadhu" in the respective conjugation table headers. Thoughts? If people agree, I can make this change by bot. Benwing2 (talk) 01:58, 1 January 2022 (UTC)[reply]

@Benwing2 Sadhu is a standardised version of Middle Bengali (14th-early 19th c). I think it's better if we add Sadhu forms in Middle Bengali rather than the modern one. Sadhu Bengali used to be the official literary form of Bengali during colonization but it wasn't a spoken language, which was the Cholito and other forms. Msasag (talk) 10:42, 1 January 2022 (UTC)[reply]
@Msasag: We already treat Sādhu Bhāṣā as Modern Bengali. You are right that this register is derived from Middle Bengali, but it was employed right during the Modern Bengali period as a standard, literary register. There were even instances of code-switching, with the narration written in Sādhu Bhāṣā and the dialogue in the contemporary colloquial language, as seen in 19th to early 20th century Bengali literature. Some 19th century works were even written in a mixture of the classical and the colloquial form! We already have a precedent for treating a learned, archaizing register as part of the modern language itself: cf. Category:Katharevousa, which is the Greek equivalent of Sādhu Bhāṣā. Hope you understand. ·~ dictátor·mundꟾ 18:13, 1 January 2022 (UTC)
@Inqilābī So Sadhu is an archaized form of Modern Bengali (one that followed or was inspired by Middle Bengali forms) rather than a standardized form of Middle Bengali? Msasag (talk) 01:48, 1 January 2022 (UTC)
@Benwing2: Yea, we should get rid of those nonstandard headers. And it’s enough to specify only Sādhu Bhāṣā in the conjugation table header, because the other one is the same as the contemporary Modern Standard Bengali. ·~ dictátor·mundꟾ 18:13, 1 January 2022 (UTC)[reply]
I agree that it is non-standard and support the proposed changes if the editors involved in Bengali presently viz. Inqilabi and Msasag both agree. -- 𝓑𝓱𝓪𝓰𝓪𝓭𝓪𝓽𝓽𝓪(𝓽𝓪𝓵𝓴) 09:28, 2 January 2022 (UTC)[reply]
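The bot change proposed above could be sketched roughly as follows. This is only an illustration of the first option (one Conjugation header, with each variant introduced by a semicolon line); the function name, the header depth, and the placeholder table templates in the usage note are assumptions, not the actual bot code or the real Bengali templates.

```python
import re

# Matches a wikitext header such as ====Chalita Conjugation==== at any depth.
# [ \t]* (rather than \s*) keeps the match on one line so blank lines survive.
HEADER_RE = re.compile(
    r"^(=+)[ \t]*(Chalita|Sadhu) Conjugation[ \t]*\1[ \t]*$",
    re.MULTILINE,
)


def merge_conjugation_headers(wikitext: str) -> str:
    """Fold the two nonstandard headers into a single Conjugation header.

    The first variant header becomes the standard header followed by a
    ';Variant' line (bold text in wikitext); later ones become just the
    ';Variant' line.
    """
    seen_first = False

    def repl(match: re.Match) -> str:
        nonlocal seen_first
        equals, variant = match.group(1), match.group(2)
        label = f";{variant}"
        if not seen_first:
            seen_first = True
            return f"{equals}Conjugation{equals}\n{label}"
        return label

    return HEADER_RE.sub(repl, wikitext)
```

For instance, a section containing `====Chalita Conjugation====` and `====Sadhu Conjugation====`, each followed by its table, would come out as one `====Conjugation====` section with `;Chalita` and `;Sadhu` lines above the respective tables. Labelling only the Sadhu table, as suggested in the replies, would be a small variation on the same idea.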

Codifying certain rules for RFD

I think it is time for us to set in stone certain rules for RFD. It is one of the most important processes of Wiktionary, deciding the fate of various entries, pages and senses. For now, I'm thinking of a few policies that may be helpful to codify:

  1. Banning IPs from voting.
    This was a matter of discussion at Talk:opinions are like assholes. Also, given our current lack of policy on this matter, any IP, even one without constructive edits, may vote, which isn't always desirable.
  2. Defining the required consensus: 3/5 or 2/3 (per whatever the consensus is).
    Earlier discussions on this include Wiktionary:Beer parlour/2021/September#Policy on deletion consensus, Talk:Real Academia Española, Reconstruction talk:Proto-Indo-European/kr̥snós. This might be the most important requirement, since currently it's totally up to the closer of the RFD to decide how much consensus is required, and we must define the supermajority needed.
  3. Time period for closing an RFD as no-consensus or no-objection: 1 or 2 or 3 months (per whatever the consensus is).
    There have been previous instances of deleting by no-objection and keeping by no-consensus, but like the others, this is also a grey area. "No-consensus" would mean that the required consensus (which would be clearly defined, see the second point) hasn't been reached in a certain time period and the status quo would be maintained; "no-objection" would mean that apart from the nomination, there hasn't been any keep/delete vote. The time period here, in my opinion, should be more than when there have been votes and consensus in either direction — and I support 2 or 3 months — but this needs to be set in stone too, whatever the agreed upon time period.
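To make the difference between the candidate thresholds concrete, here is a minimal sketch of how points 2 and 3 might combine. It rests on several assumptions that are not settled anywhere in this discussion: the nomination itself is not counted as a vote, the same supermajority applies in both directions, and the waiting period is expressed in days. The function name and outcome labels are purely illustrative.

```python
from fractions import Fraction


def rfd_outcome(delete_votes, keep_votes, days_open,
                threshold=Fraction(2, 3), waiting_days=60):
    """Classify an RFD under a fixed supermajority and waiting period.

    Assumption: the nomination is not included in delete_votes.
    """
    total = delete_votes + keep_votes
    if total == 0:
        # Nobody but the nominator has weighed in.
        return "no objection" if days_open >= waiting_days else "open"
    if Fraction(delete_votes, total) >= threshold:
        return "delete"
    if Fraction(keep_votes, total) >= threshold:
        return "keep"
    # Neither side reached the supermajority: status quo once time is up.
    return "no consensus" if days_open >= waiting_days else "open"


# 5 delete vs 3 keep is 62.5%: enough under 3/5, not enough under 2/3.
print(rfd_outcome(5, 3, days_open=30, threshold=Fraction(3, 5)))
print(rfd_outcome(5, 3, days_open=30, threshold=Fraction(2, 3)))
```

A 5 to 3 split is exactly the kind of case where the choice between 3/5 and 2/3 changes the result, which is why the threshold would need to be fixed before the rest can be applied consistently.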

This would be a major change, so it would most likely require a policy vote to be implemented and enforced. All the proposed points would be analogously applicable to undeletion requests as well. I would like to call some other users to analyse this proposal and share their thoughts (Metaknowledge, Surjection, Chuck Entz, PUC, Lambiam, Fytcha, Sgconlaw, DAVilla, Donnanz, Inqilābī, Imetsia, Equinox, PseudoSkull, Lingo Bingo Dingo, BD2412, SemperBlotto, Fay Freak, Vininn126), but anyone should feel free to comment. —Svārtava [tcur] 17:38, 1 January 2022 (UTC)[reply]

In most cases I am not a fan of copying the ways of Wikipedia, but in this case I think we can use some inspiration from how article deletion proposals are handled over there, as described in Wikipedia:Deletion process. The most important thing is that the decision depends on a mysterious phenomenon called consensus, which is not based on a tally of votes, but on reasonable, logical, [Wikipedia] policy-based arguments. If an IP presents a reasonable case here in line with our policies and guidelines, then it should by all means be considered seriously. Conversely, if a veteran editor casts a "vote" Keep here for a blatantly sum-of-parts term without presenting an argument, or merely the argument that the term is common or useful or some other variant of "I like it", this expression of their sentiment in the matter can be duly noted but should in my opinion further be ignored. --Lambiam 18:18, 1 January 2022 (UTC)
I definitely agree with the first proposed change, as it is impossible to tell in some cases whether an IP is a sock of a logged in user or of other IPs who have already voted. While I think it's unlikely anyone would vote that dishonestly here at RFD, if such a thing were ever suspected it would be a huge waste of time to have to sort through evidence to support or contradict the suspicions. It is also not too hard to vote a second time with an IP such that it wouldn't arouse suspicion at all. These are important reasons that formal votes don't allow IPs to cast votes.
I think each of the 3 items should get their own sections in the BP and/or formal votes, because I have no comment on the other 2 at this time. PseudoSkull (talk) 19:55, 1 January 2022 (UTC)[reply]
I oppose this. You have mistaken discussion pages for votes, but they are not votes — they are discussions. We allow for grey areas because we trust closers (usually admins) to be able to use their best judgement in the interests of the dictionary. You may remember back when Dentonius was flooding RFD with keep votes, but you may not have noticed that I simply discounted his vote when closing those RFDs. That is to say, we already function a bit more like what Lambiam described than you realise, and if we were to codify our procedures, it should be in that direction. I concur with PseudoSkull that your first item regarding IPs has merit, but I challenge you to find even a single time that an IP has ever swayed an RFD discussion. If it came to a vote, I would support that item, but it seems like a solution in search of a problem. —Μετάknowledgediscuss/deeds 20:09, 1 January 2022 (UTC)[reply]
@Metaknowledge: If Lambiam's proposed solution is adopted, we would need better and more clearly defined policies, especially those dealing with SOPs. Similarly, there would be many SOPs that would be deleted going by our current policy but have been kept with community consensus. Also, at a discussion, there was disagreement whether hypallage could save a term from being SOP, so I find it a bit confusing which position would be deemed correct (in other words, would the entry be deleted or kept) without others' votes ⇒ comes back to the same thing. There have been other users apart from Dentonius, particularly, I have noticed, SemperBlotto and Donnanz, who have voted keep on multiple occasions without giving any rationale or justification for the term not being SOP; but their votes have been counted. To add, per CFI: “a phrase that is arguably unidiomatic may be included by the consensus of the community, based on the determination of editors that inclusion of the term is likely to be useful to readers” so I don't believe we need to fully adopt Wikipedia's policy and abolish the voting process (again, it's good to argue against the given justification of any user and try to convince them with the counter-arguments, but maybe we can retain the voting process). Regarding banning IP voting, I can't find for now any such instances of IPs swaying RFDs, but I think we need to do it since it's extremely lax and can very easily be exploited (an editor could just switch their network and IP and double-vote). Regarding setting the time period for no-consensus/objection, what are your thoughts? --Svārtava [tcur] 04:43, 2 January 2022 (UTC)
You agree that our system works, and you can produce no examples of IPs ever having swayed an RFD in the entire history of Wiktionary. I think the time period for RFDs is fine as well; we should (and usually do) grant more laxness at RFV, of course. You are still fishing for problems with poorly conceived solutions. —Μετάknowledgediscuss/deeds 06:15, 2 January 2022 (UTC)[reply]
@Metaknowledge: I do not have any problems with the time period for RFDs where there is consensus (in favour of keeping or deleting). However, the time period for closing as no-objection and no-consensus seems unclear. I am proposing that it be fixed and set in stone: be it 1 month, 2 months or 3 months. I think this would require more time, since in "no-consensus" it is okay to wait and see if any consensus is observed in the near future, and similarly with "no-objection" it may be a better option to wait and let some votes be cast and some thoughts expressed. Some might think that a 1 month period is enough in this case also, so if a vote is created regarding this, it would have multiple options. The proposal seeks to codify certain things (which don't have any policy whatsoever), rather than change any existing policy/rule. I'm waiting for a week or so to see where this discussion leads us and where the consensus is; accordingly, after that, this may be voted upon. --Svārtava [tcur] 06:28, 2 January 2022 (UTC)
@Svartava2 You have pinged me. I don't normally spend a lot of time participating in deletion votes but I do think the general concept of consensus is right. Obviously there is a lot of judgment involved but I think it's better than simply using a blanket voting rule, as it e.g. allows admins to ignore people who consistently vote keep without any rationale. Benwing2 (talk) 04:59, 2 January 2022 (UTC)[reply]
@Lambiam, Metaknowledge, Benwing2: I urge the three of you to read Imetsia's argument below: “[E]ven if we put this proposal in place, we could keep discounting unmeritorious votes (such as those by Dentonius or other noted inclusionists). This vote would do nothing to change or challenge this practice.” Our RFD system is primarily based on votes (a fact that can't be ignored), so it's helpful to “solidif[y] the consensus standard” if we count votes at all (which I think we do, even if we ignore some particular votes). --Svārtava [tcur] 06:19, 4 January 2022 (UTC)
@Svartava2 The issue with voting is that in my experience there usually aren't enough participants for the voting to be meaningful. Benwing2 (talk) 06:44, 4 January 2022 (UTC)[reply]
If nothing else, it’s useful to codify the IP ban because it’s a sensible standard that could patch up possible holes in our policies. I’m less concerned about manipulating votes via socks (although this proposal would be helpful in that regard too). My real problem is that anonymous editors are less likely to have a history on Wiktionary, and therefore don’t have the knowledge, experience, or judgment to cast ballots at RFD. Of course there are exceptions (e.g., PUC often votes when not logged in, logged-in editors are sometimes newbies), but this would go a long way in addressing the issue. By the way, the phrase “a solution in search of a problem” has become so routinely used at the BP that it’s lost all meaning.
Solidifying the consensus standard makes sense on the basis of both consistency and precedent. The problem with leaving it up to admin discretion is that it inevitably produces inconsistent results. One admin might count the votes and interpret a consensus to delete, but another would see the same votes and argue there’s no consensus. This makes it so that the existence or deletion of an entry hinges entirely on the arbitrary fact of which user closed the RFD. Secondly, we’ve already pinned down a more precise consensus standard for actual votes, and it makes sense to do so again for this other category of votes. It’s, once again, more consistent.
Note also that, even if we put this proposal in place, we could keep discounting unmeritorious votes (such as those by Dentonius or other noted inclusionists). This vote would do nothing to change or challenge this practice.
Lastly, the proposal to close by no-objection is useful simply because activity at RFD is often stagnant/lacking. Imetsia (talk) 20:29, 2 January 2022 (UTC)[reply]
@Metaknowledge: On IP votes, I would almost agree that it is a solution searching for a problem, except that this is a system that is so easy to exploit as I've pointed out. Take this for example: Each user with an account is easily identifiable as a separate identity because of the list of previous contributions, and in cases of doubt (which I'm pretty sure has existed before even in RFD) it is relatively easy to spot sockpuppeteering and easier to prove. However, in the case of IP accounts, those are rarely kept longer than a day, and some have no contributions at all to speak of, at least for individual IPs not considering ranges. I can definitely see double-voting going completely unnoticed, even by closing administrators, because admins are just people after all, which is why we can't come up with even one example of it (which I admit, is probably going to make this harder to pass as a rule in a formal vote). Disguises are pretty easy to use if they don't stand out as such. Every vote that is counted in an RFD discussion does sway the vote one direction or the other, if the closing admin counts the vote. Of all the years I've been here I didn't even know that IP votes were countable until recently, and it honestly shocked me to find that out. I see it as an easy exploit to a system based on contemporary consensus, which is already inherently far from perfect as it is in determining consensus consistently. PseudoSkull (talk) 10:05, 2 January 2022 (UTC)[reply]
  • I don't personally see a strong need to change a great deal about how RFD works. Yes, some listings do drag on for a very long time, and should ideally be resolved more quickly, but there aren't huge numbers of these. If there was a lot of disruptive or unhelpful activity by unregistered users then I would support curbs on that, but I don't see much of this (unless a lot is deleted very quickly and I never see it). If an unregistered user "votes" a certain way on the basis of a solid argument that is useful for others to see, then that seems fine to me. Mihia (talk) 12:04, 2 January 2022 (UTC)[reply]
@Mihia In your view then, IPs being able to comment or even vote aside, should their votes therefore be counted in the end? PseudoSkull (talk) 15:00, 2 January 2022 (UTC)[reply]
My understanding, as has also been mentioned above, is that a decision at RFD isn't, or shouldn't be, merely a case of totting up votes. It should be based on a judgement of the cases made for retaining or deleting. So, in theory, even if five registered users vote "delete" with no explanation of why, while one unregistered user makes a compelling case why the entry should be kept, then the RFD can be closed as kept, in my understanding. So basically it's the argument that matters, not who makes it. Mihia (talk) 18:03, 2 January 2022 (UTC)[reply]
@PseudoSkull: The matter of IPs vs. accounts isn't quite that simple: as long as you're using the same Internet Service Provider from the same geographical area, your IP address will generally be in the same range, and there are geolocation services that tell what that geographical area probably is. I have created abuse filters that have very effectively kept certain IP editors from making certain types of edits for years at a time. It's also not uncommon for someone with hardwired internet to have the same IP address for years. An account is uniquely identifiable, but has none of the information that's available for IPs- unless you're a checkuser and have cause to run a check. What's more, there's no way for a non-checkuser to tell if two accounts are the same person. You, of all people, should know how easy it is to create a new account. As for evidence of IPs cheating on RFDs: there was a recent case in RFDO where someone voted with their account and with multiple IPs. All the votes were struck and the master account has been blocked permanently. Of course, it was pretty obvious what was going on, and reasonable suspicion of double-voting is clear justification for a checkuser check. Chuck Entz (talk) 17:02, 2 January 2022 (UTC)[reply]
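As a rough illustration of the "same range" point, the check below tests whether two addresses fall in one network block using Python's standard `ipaddress` module. The /24 prefix length and the function name are assumptions chosen for the example (real ISP allocations vary, and this says nothing about how the abuse filters mentioned above are actually written); the addresses used are reserved documentation addresses.

```python
import ipaddress


def same_range(ip_a: str, ip_b: str, prefix_len: int = 24) -> bool:
    """Return True if ip_b lies in the /prefix_len block containing ip_a.

    Assumption: IPv4 addresses and a /24 block; adjust prefix_len as needed.
    """
    # strict=False lets us build the network from a host address.
    network = ipaddress.ip_network(f"{ip_a}/{prefix_len}", strict=False)
    return ipaddress.ip_address(ip_b) in network


print(same_range("192.0.2.10", "192.0.2.200"))   # same /24 block
print(same_range("192.0.2.10", "198.51.100.7"))  # different block
```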
The listings that “drag on for a very long time” are IMO a greater problem than you’re suggesting. There just seem to be so many entries at RFDN that suffer from this. There are essentially two problematic scenarios:
  1. An entry is voted on, a result is clear, but no one closes it for an unreasonably long time.
  2. No one votes on an entry at all, so the entries are stuck at RFD for months and months.
So I think this is a real issue, and the proposals would go a long way toward alleviating it. Imetsia (talk) 20:29, 2 January 2022 (UTC)
Sorry, I should have made it clear that my comments apply only to WT:RFDE. I do not ever participate at WT:RFDN. Mihia (talk) 20:36, 2 January 2022 (UTC)[reply]
I've given it some thought and I'm adding in my two cents, namely on voting criteria. I'm not sure that the problem is whether IPs can vote or not, as that's not something that's come up very often, but rather who can vote at all. The current rule is "the first English wikt vote must have been made 1 week before the given vote, 50 edits in total, and no sockpuppeting". I can't really think of a way to stop sockpuppeting any more than we do now. I also don't believe that votes cast by non-regular editors have really affected the outcomes; however, I do wonder if some of the criteria and procedures should be slightly more standardized. Is a week and 50 votes enough for a given IP or account? Would a better rule be "the first edit must predate the CREATION of the vote by one week", rather than the start time? Is 50 edits enough? Are people with only 50 edits really going to be all that aware of votes and how to even cast them and such?
On a similar note, in light of recent admin votes, we should consider implementing a standard that all votes begin and end at 23:59, and instead of "a month, give or take", perhaps set lengths for types of votes. I propose a tiered system - quick votes (i.e. bot votes) can last 15 days, standard votes 30. Vininn126 (talk) 20:30, 3 January 2022 (UTC)[reply]
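The difference between measuring from the vote's start time and from its creation can be shown with a small sketch. It reads the quoted rule as "the account's first edit must be at least one week older than the reference time, with at least 50 edits in total", which is an interpretation of the comment above rather than the policy text; the function name and the dates are made up for the example.

```python
from datetime import datetime, timedelta


def meets_criteria(first_edit, edit_count, reference_time,
                   min_age=timedelta(weeks=1), min_edits=50):
    """Check the age and edit-count requirements against a reference time."""
    old_enough = first_edit <= reference_time - min_age
    return old_enough and edit_count >= min_edits


# Hypothetical vote: page created on 1 January, voting opens on 8 January.
vote_created = datetime(2022, 1, 1)
vote_started = datetime(2022, 1, 8)

# Hypothetical account whose first edit was on 28 December, with 60 edits.
first_edit = datetime(2021, 12, 28)

print(meets_criteria(first_edit, 60, vote_started))  # measured from the start
print(meets_criteria(first_edit, 60, vote_created))  # measured from creation
```

Under these made-up dates the account qualifies when age is measured from the start of voting but not when it is measured from the creation of the vote page, which is the gap the alternative wording would close.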
@Vininn126: You're clearly mistaken there; this is for WT:RFD, not for actual votes (of which the rules are pretty clearly defined and I don't propose to change that). —Svārtava [tcur] 06:22, 4 January 2022 (UTC)[reply]
oop, brainfart Vininn126 (talk) 07:30, 4 January 2022 (UTC)[reply]
I want to point out that, if a vote fails, it doesn't mean that the negation of the vote's content is implemented but rather that the status quo prior to the vote is maintained. Concretely: If the vote on "IPs/Anonymous editors cannot vote" fails, it merely means that closing admins are not forced to disregard IPs' votes, not that all IPs' votes must be respected (as that's not the status quo). Fytcha (talk) 03:33, 5 January 2022 (UTC)[reply]
I think Wonderfool should be banned from voting too Br00pVain (talk) 14:54, 5 January 2022 (UTC)[reply]
@Br00pVain: Why from RFD? I have anyway not seen him vote much here. --Svārtava [tcur] 15:00, 5 January 2022 (UTC)

User:B2V22BHARAT's Korean entries

(Notifying TAKASUGI Shinji, Atitarev, HappyMidnight, Tibidibi, B2V22BHARAT, Quadmix77, Kaepoong): An IP user has just aired their discontent with the above user's entries (diff), stating that this is a repeating problem. A quick glance at the user's recent contributions reveals that a considerable part has been reverted already. I think manually looking over their entries would be a wise choice. --Fytcha (talk) 03:45, 2 January 2022 (UTC)[reply]

@Fytcha: More checks may be required by native speakers. They haven't been very active lately. User:Tibidibi (or his previous accounts) pointed and corrected some edits in the past. I have RFD'ed one entry - 발음법(發音法) (bareumbeop) as an SoP. --Anatoli T. (обсудить/вклад) 10:11, 2 January 2022 (UTC)[reply]
@Atitarev, @Fytcha, Tibidibi will unfortunately be out until 2023 due to mandatory conscription. I can look into some of the entries (though I'm not a native), and I maybe could ask some folks who aren't as active to chime in as well. Looking into B2V22BHARAT's history and entries, it seems like there's a lot to fix nonetheless, so it'll for sure take time. Also, perhaps they should be removed from the Korean working group for pings? AG202 (talk) 21:38, 2 January 2022 (UTC)[reply]

Administrator intervention requested

I would be grateful if an administrator could please look at Wiktionary:Beer_parlour/2021/November#WT:ATTEST_proposal and either make the suggested change or rule that this has to go to a formal vote. Thank you. Mihia (talk) 18:09, 2 January 2022 (UTC)[reply]

I'm afraid I don't feel competent enough to adjudicate on the further steps to be taken in this case. I do agree however that you should be given guidance on how to proceed seeing that you have garnered community support for your idea. Pinging @Chuck Entz. Fytcha (talk) 17:21, 5 January 2022 (UTC)[reply]

Entry formatting of one-character Chinese entries e.g.

(Notifying Atitarev, Tooironic, Fish bowl, Justinrleung, Mar vin kaiser, RcAlex36, The dog2, Frigoris, 沈澄心, 恨国党非蠢即坏, Michael Ly): I have a script to clean up misindented sections but it currently doesn't work right on single-character Chinese entries like . These have a very strange format with e.g. a ==Definitions== header that promiscuously mixes parts of speech. This one puts ==See also== underneath the Definitions header instead of at the same level as is more normal, but conversely puts Descendants at the same level when it normally would be indented underneath a POS header. It also uses a Compounds header instead of a Derived terms header, and puts that at the same level as Definitions instead of indented underneath it. Questions:

  1. Is this standardized? If so, is there a page documenting the standards?
  2. Does this apply to all Chinese entries or only one-character ones?
  3. Does it apply to any other languages? If so, which ones, and if it applies to a subset of entries (e.g. only one-character entries), what is that subset?

Thanks, Benwing2 (talk) 08:14, 3 January 2022 (UTC)[reply]

@Benwing2: Wiktionary:About_Chinese#Basic_headers_for_single_characters has information.
And {{zh-see}} and {{ja-see}} can be found under ==Chinese==/==Japanese==, ===Etymology=== if there is no actual etymology written, or ===Definitions===.
Otherwise, ===Definitions=== should not be used elsewhere, especially not Japanese single characters.
Fish bowl (talk) 08:17, 3 January 2022 (UTC)[reply]
@Fish bowl Thanks. There isn't info there though about header indentation or whether and how much this applies to multicharacter entries. Benwing2 (talk) 08:20, 3 January 2022 (UTC)[reply]
I don't know concretely about indentation level, but would support indentation for all; HOWEVER for ====Compounds==== sometimes it is a lazy cop-out where there are multiple Etymologies but no one has sorted the words into each Etymology.
As for the naming of ====Compounds====, IIRC it is used to skirt the question of whether a term is actually a derived term or technically the other way around.
Fish bowl (talk) 08:29, 3 January 2022 (UTC)[reply]
@Benwing2 It's not a good idea to mess with header levels in Chinese character entries: The Module Is Watching You. See Module:zh-forms after about line 330 for details... Chuck Entz (talk) 09:41, 3 January 2022 (UTC)[reply]

Words used solely by non-native speakers

Moved to WT:TR#Words used solely by non-native speakers --Fytcha (talk) 23:39, 4 January 2022 (UTC)[reply]

New phrasebook rules

Following up Wiktionary:Beer parlour/2021/October#The_phrasebook_is_in_dire_need_of_rules., I've decided to create a formal vote: Wiktionary:Votes/2022-01/New phrasebook regulations.

Suggestions strongly encouraged! Fytcha (talk) 21:25, 3 January 2022 (UTC)[reply]

Input needed
This discussion needs further input in order to be successfully closed. Please take a look!
I would greatly appreciate some more input before the vote begins. Most importantly regarding the two points raised on the talk page. — Fytcha T | L | C 〉 13:02, 15 January 2022 (UTC)[reply]

ordering of languages

@DTLHS I have a script to correct various misformatting issues that I've been running. I recently added support for reordering languages. This brings up an issue: What should the order of non-ASCII characters in language names be? User:NadandoBot seems to sort strictly by Unicode codepoint, possibly ignoring case; hence on A, Võro comes after Votic, Xârâcùù comes after Xhosa, and Yámana comes after Yoruba. On the other hand, Yámana comes before Yoruba on ala and several other pages not recently touched by User:NadandoBot. From looking at various pages, I see that 'Are'are comes before Acehnese on ma, and ǃKung (which despite appearances does not contain an exclamation point but the Unicode codepoint U+01C3) generally comes at the end, e.g. after Zulu on m. Furthermore, Indonesian comes after Indo-Portuguese on a (a change made by User:NadandoBot; formerly it was the other way around). Rather than sorting strictly by Unicode codepoint, I propose instead to sort by Unicode codepoint but ignore case distinctions and combining diacritics; this would place Võro before Votic, Xârâcùù before Xhosa, and Yámana before Yoruba, but put 'Are'are before Acehnese (apostrophe is not a combining diacritic but a spacing character), Indo-Portuguese before Indonesian (hyphen is likewise not a combining diacritic) and ǃKung after Zulu. Other possibilities are to ignore hyphens and apostrophes (i.e. act as if they aren't present) or even to ignore any character that isn't A through Z, after removing combining diacritics. (The latter would alphabetize ǃKung like Kung, Zo'é like Zoe, and Indo-Portuguese like Indoportuguese.) Benwing2 (talk) 02:28, 5 January 2022 (UTC)
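A minimal sketch of the proposed ordering, for anyone who wants to try it on other language names. The function names are illustrative and this is not the script referred to above; it assumes the apostrophes in the names are plain ASCII apostrophes, and it uses the original name as a tie-breaker, which the proposal does not specify.

```python
import unicodedata


def strip_marks(text: str) -> str:
    """Decompose to NFD and drop characters with a nonzero combining class."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def proposed_key(name: str):
    # Ignore case and combining diacritics; apostrophes, hyphens and spaces
    # still take part in the comparison by codepoint.
    return (strip_marks(name).casefold(), name)


def letters_only_key(name: str):
    # The most aggressive variant: ignore everything except A through Z.
    folded = strip_marks(name).casefold()
    return ("".join(ch for ch in folded if "a" <= ch <= "z"), name)


names = [
    "Votic", "V\u00f5ro",                   # Võro
    "Xhosa", "X\u00e2r\u00e2c\u00f9\u00f9",  # Xârâcùù
    "Yoruba", "Y\u00e1mana",                # Yámana
    "Acehnese", "'Are'are",
    "Indonesian", "Indo-Portuguese",
    "Zulu", "\u01c3Kung",                   # ǃKung, with U+01C3
]
for name in sorted(names, key=proposed_key):
    print(name)
```

With `proposed_key` the accented names sort next to their unaccented neighbours, while 'Are'are stays first and ǃKung stays last; swapping in `letters_only_key` moves ǃKung to where Kung would go.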

@Erutuon, Surjection, This, that and the other who might be interested in this topic. Benwing2 (talk) 02:29, 5 January 2022 (UTC)[reply]
I'm on board with a sort order that ignores diacritics. As an extreme example, I noticed recently that Önge entries are placed at the very bottom of pages, which is definitely illogical.
How do the languages such as 'Are'are and ǃKung sort the special letters in the context of their alphabet? I can't seem to find any info on this. But it might be best to sort them in a place that speakers and scholars of that language would expect to find them. This, that and the other (talk) 02:45, 5 January 2022 (UTC)[reply]
@This, that and the other Thanks. Another possible issue has to do with spaces. E.g. on sa, South Slavey comes after Southern Ndebele (effectively ignoring spaces); this was added by User:Thadh on Dec 19, 2021. Benwing2 (talk) 02:51, 5 January 2022 (UTC)[reply]
That's yet another illogicality (and ironically, one where sorting strictly by Unicode codepoint would result in a better outcome). In any sane sort order, South Zzz would come before Southern Aaa. This, that and the other (talk) 03:01, 5 January 2022 (UTC)[reply]
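The effect of spaces is easy to reproduce. In the sketch below the space-ignoring key is only a guess at what produces the order seen on sa (the actual cause is not stated in this thread); the point is that a plain codepoint comparison puts the space (U+0020) before any letter.

```python
names = ["Southern Ndebele", "South Slavey"]

# Plain codepoint comparison: "South " sorts before "Southe...".
by_codepoint = sorted(names)

# Hypothetical key that drops spaces (and case) before comparing.
ignoring_spaces = sorted(names, key=lambda n: n.replace(" ", "").casefold())

print(by_codepoint)
print(ignoring_spaces)
```

The first ordering puts South Slavey ahead of Southern Ndebele; the second reverses them, matching what is currently on the page.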
I don't think that this is something that we should bother actual editors about, so I support whatever the person who decides to run a script to order languages thinks is best. DTLHS (talk) 03:13, 5 January 2022 (UTC)[reply]
@Benwing2: Don't we just follow WT:STATS' order? Thadh (talk) 08:41, 5 January 2022 (UTC)[reply]
@Thadh This has the issue of putting Önge and Àhàn at the very end, and Záparo after Zuni, which seems contrary to what most people expect; so I am following what DTLHS said above and using the ordering I described above. Benwing2 (talk) 08:54, 5 January 2022 (UTC)[reply]
Oh right, sorry, I misunderstood the issue. Thadh (talk) 09:02, 5 January 2022 (UTC)[reply]
Whichever ordering is the standard (or rather most common) for English should be used. So if the language names were entries on some dictionary, that order should be used. — SURJECTION / T / C / L / 12:10, 5 January 2022 (UTC)[reply]
Another option that should handle the diacriticked letters would be sorting by the decomposed version of the language names (unicodedata.normalize('NFD', language_name) in Python). That would split letters with diacritics into sequences of the base letter and a combining diacritic wherever possible. I'm not sure what we should do for apostrophes or click letters. I imagine a typical English speaker would just ignore them in sorting. — Eru·tuon 15:27, 5 January 2022 (UTC)[reply]
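One detail worth checking with the decomposition approach: NFD moves an accented letter next to its base letter, but the combining mark that follows still has a higher codepoint than any ASCII letter. A small sketch, with illustrative function names, comparing decomposition alone against decomposition plus dropping the marks:

```python
import unicodedata


def nfd_key(name: str) -> str:
    # Decompose only: "õ" becomes "o" followed by U+0303.
    return unicodedata.normalize("NFD", name)


def nfd_stripped_key(name: str) -> str:
    # Decompose, then drop the combining marks entirely.
    return "".join(
        ch for ch in unicodedata.normalize("NFD", name)
        if not unicodedata.combining(ch)
    )


ONGE, ZAPARO, VORO = "\u00d6nge", "Z\u00e1paro", "V\u00f5ro"  # Önge, Záparo, Võro

print(sorted([ONGE, "Zulu"]))                             # raw codepoints
print(sorted([ONGE, "Zulu"], key=nfd_key))                # Önge under O
print(sorted(["Zuni", ZAPARO], key=nfd_key))              # Záparo first
print(sorted(["Votic", VORO], key=nfd_key))               # Votic still first
print(sorted(["Votic", VORO], key=nfd_stripped_key))      # Võro first
```

So decomposition by itself fixes cases like Önge and Záparo, but Võro would still follow Votic unless the combining marks are also skipped in the comparison.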

{{plural of}} redlinks

(Following up the request for speedy deletion of emiratis and envolvers by User:This, that and the other)

What is the community consensus on plural-of stubs without the corresponding lemma? I'm not particularly a fan of them, but at least they provide possible redlinks for article creators. On the other hand, some of these plural entries are ancient; if the lemma has not been added in 15 years, it will never be added.

Pinging also User:Equinox and User:Apisite as two users I see adding plural-only stubs somewhat regularly. Fytcha (talk) 18:20, 5 January 2022 (UTC)[reply]

I am trying not to speedy those which were intentionally created by humans and where the lemma has not failed RFV or RFD. The two entries you mention were pure bot creations and never had a human eye look over them. All the redlinked plurals I've looked at that were created by people like Equinox seem legitimate, and someone with the time could very well create the singular form. I haven't been speedying those. This, that and the other (talk) 21:56, 5 January 2022 (UTC)[reply]

Half collapsed boxes look like garbage

For example: B.1.617. All of these synonyms / coordinate terms / derived terms sections look like shit when they're piled on top of each other. Especially with the title formatting. Please let's just go back to having collapsed boxes like translation sections have. DTLHS (talk) 01:14, 6 January 2022 (UTC)[reply]

Particularly bad with mixture of red and blue links. The pale blue background makes it look extra busy. The overall look draws attention away from the definition, which may be all that many users want and which is needed by other users to confirm that the other material is relevant to their needs. DCDuring (talk) 16:34, 6 January 2022 (UTC)[reply]
I prefer col4 or top4's to collapsible boxes in derived and related terms sections, but I agree that using it in 'nym sections is distracting. Thadh (talk) 17:07, 6 January 2022 (UTC)[reply]

Unattested translations and {{not used}}

(Following up on diff) What should be added in cases where a potential translation does not meet WT:ATTEST but is still clearly and obviously correct and is also the form that's used by any native? Using {{not used}} is misleading: A term not meeting WT:CFI is different from a term not being used in a language (like "the" in many languages). — Fytcha T | L | C 〉 02:45, 7 January 2022 (UTC)[reply]

@Fytcha: If you mean the cases where a valid translation is an SoP, e.g. the Russian translation of time-consuming is {{t|ru|тре́бующий мно́го вре́мени}}, producing тре́бующий мно́го вре́мени (trébujuščij mnógo vrémeni), I have described this case at Wiktionary:About_Russian#Translations_into_Russian.
A translated term may be vague, ambiguous or narrow, or may require additional words to mean exactly the same as the English term. You can use {{qualifier}} for clarifications.
{{not used}} can also be used for abbreviations if the target language doesn't use them. --Anatoli T. (обсудить/вклад) 03:20, 7 January 2022 (UTC)[reply]
@Atitarev: I didn't mean SOP translations, I more meant cases like this. The way I understand it, translating to SOPs is fine in all languages as long as the SOP is the commonly used one (also codified here: Template:t). — Fytcha T | L | C 〉 03:29, 7 January 2022 (UTC)[reply]
@Fytcha: That usage is not expected. I noticed some users link only native words in a multiword translation, the unlinked word being a name. E.g. {{t|fi|Streisand-ilmiö}} in Streisand effect translation into Finnish. Wouldn't "panzerfaust" (lower case?) still be a valid translation into Romanian? You can mark it as {{qualifier|rare}}. --Anatoli T. (обсудить/вклад) 03:57, 7 January 2022 (UTC)[reply]
@Atitarev: Turns out, I was just really bad at searching. It is now added as both a translation as well as entry. Though the point still remains, what should I do if a translation is unattested (per WT:ATTEST) but exists in real language usage? {{no entry}} gives off the wrong signal in my opinion, which it seems you agree on.
Not related to this issue but I'm not sure I agree with how that Finnish translation is handled. If the whole term is attestable, it should be linking to that because it's definitely not a SOP. — Fytcha T | L | C 〉 04:15, 7 January 2022 (UTC)[reply]
@Fytcha Panzerfaust even has an entry in the Romanian Wikipedia: [2]. I think it's very strange to use {{not used}} when translating any non-function word (except maybe an abbreviation). Speakers in a language must be able to refer to a concept in some way, if nothing else by code-switching or using an unadapted borrowing. Rather than {{not used}}, you can add a qualifier explaining what native speakers actually do; e.g. since essentially all Catalan speakers also speak Spanish, they might well use Spanish words to refer to certain concepts when speaking Catalan. (But there are plenty of monolingual Romanian speakers so I can't see this applying to Romanian.) Benwing2 (talk) 05:09, 7 January 2022 (UTC)[reply]
@Benwing2: Yes, that's exactly my point; I agree that using {{not used}} is strange and I'm hereby asking what exactly I am supposed to add. What native speakers would do: Use an easily understood term that fails WT:ATTEST (see Wiktionary:Translations#Sources: "clashing with the fact that words added to translation tables are subject to attestation requirements as well." Emphasis not mine). Hence, I don't think recording the correct term using {{q}} instead of {{t}} is even a true loophole. See also Schläfli symbol, it also has a Romanian Wikipedia entry but that one is actually impossible to attest (surprise me!). — Fytcha T | L | C 〉 05:22, 7 January 2022 (UTC)[reply]
@Fytcha I would say, use {{q}} to provide an explanation (e.g. (found in Wikipedia as simbol Schläfli; otherwise unattested)). {{not used}}, as you added, seems simply wrong. {{not used}} implies an intentional gap, when this is clearly an accidental gap. Benwing2 (talk) 05:41, 7 January 2022 (UTC)[reply]
@Benwing2: What do you think about creating something like {{no attested translation}} in the spirit of {{no equivalent translation}}? — Fytcha T | L | C 〉 04:01, 8 January 2022 (UTC)[reply]
@Fytcha That is fine with me. However, I still think in a case like Schläfli symbol, where we have a living non-LDL language and where there is a translation in a source that does not pass WT:ATTEST, it's worth mentioning in a qualifier. I would not bother doing so for dead languages like Gothic, Old English or Latin, or in an LDL (low-documentation language), because in all these cases the people creating the Wikipedia entries are likely to be non-native speakers. Benwing2 (talk) 02:40, 9 January 2022 (UTC)[reply]

Template:rfv-t

(Following up on User_talk:Fytcha#Template:rfv-t)

I have created this template today and was advised to start a discussion with the wider community. I will share my rationale for creating this template:

Being quite active in translating rather specialized or rare English terms, I have come across a fair share of translations that seemed at least a little iffy to me. Keep in mind that, while translations are not subject to the entirety of WT:CFI (notably, they are exempt from our idiomaticity policy), they are still subject to WT:ATTEST as per WT:TRANS: "[…] clashing with the fact that words added to translation tables are subject to attestation requirements as well." (emphasis again not mine). If, for a WDL, I can offhand only find, say, one valid quotation then there needs to be some kind of action that I can undertake to ensure that the term is properly verified in accordance with WT:ATTEST. For lemmas that have an entry, this would be using {{rfv}}. For lemmas that don't have an entry, there was no infrastructure, hence the template. {{t-check}} is unsatisfactory because there is no time limit so it may as well remain there unchecked for another decade and secondly because the term can not be easily listed on WT:RFVN where it belongs. On the other hand, creating an article for it and immediately RFVing it is also an unsatisfactory solution because it is unnecessarily time-consuming and makes RFVing terms (especially in languages one is not familiar with) unnecessarily complicated. — Fytcha T | L | C 〉 01:56, 8 January 2022 (UTC)[reply]

@Fytcha I'm not really sure about this. I simply remove any unattested translation (if red link), and anyone is free to revert me if they can provide the required citations. In the aforementioned scenario, if there is one valid quotation for a WDL, it should be considered unattested, and hence the translation removed. What's wrong? —Svārtava [tcur] 03:34, 8 January 2022 (UTC)[reply]
@Svartava2: I don't trust my attestation skills in languages I don't speak: I don't know what the inflected forms are and I don't know where to look specifically. The fact that I can't attest a word doesn't mean by a long shot that it is unattestable. Therefore I would never remove such a translation, especially not if I was in fact able to find one attestation. Isn't this the whole reason why we do RFVs in the first place instead of directly deleting entries we can't ourselves find attestations for? — Fytcha T | L | C 〉 03:42, 8 January 2022 (UTC)[reply]
@Fytcha Deleting an entry and removing a mention of it are totally different. Editors may remove the translations when they're sure and it is a translation in a language they know. The criteria for RFV are similar: it's usually that editors send RFVs of their language (some exceptions being if someone tagged it but not listed it) to be verified (ideally when they were not able to). There are tons of red links in translations, and tons of entries without quotations, which is okay IMO; how pedantic it would be if editors of other languages start RFVing it? So that's why the editors of a particular language should only deal with the translations. —Svārtava [tcur] 04:12, 8 January 2022 (UTC)[reply]
@Svartava2, Fytcha I tend to agree with Fytcha here. I don't think it's a good idea to remove translations in languages that aren't your native language, even if you think the term is unattested; you could easily be wrong and then you may have removed good info. Depending on someone else to revert you is likely to fail because there aren't enough editors out there on Wiktionary to properly police everyone's changes. There are of course exceptions; e.g. if the translation is into a dead language and smells funny, or into an LDL language where you don't trust the competence of the person who added it, or in similar circumstances where you have reason to believe the translation is likely to be wrong. Even in that case, unless you're pretty sure the translation is garbage, I would comment it out rather than remove it outright. Benwing2 (talk) 02:48, 9 January 2022 (UTC)[reply]
“listed on WT:RFVN where it belongs” – I thought there was a principle to not RFV entries that have not even been created. Or you would need a separate list for these, or a separate category for these links would suffice: The current category Requests for verification in langname entries hardly applies since there is no entry.
And this new template created after seeking translations for “rather specialized or rare English terms”, won’t it be abused to hunt for or displace words that are correct translations but fail CFI (or CFI fails for them—like internet slang)?
Also the sentence “words added to translation tables are subject to attestation requirements as well” does not mean that the attestation requirements for these words are the same as for words with entries, 😄, it says they are subject to some (unwritten?) attestation requirements, thus the problem “where a potential translation does not meet WT:ATTEST but is still clearly and obviously correct” is a chimera. Just only include good translations, mkay? If it is a complicated Gewissensfrage then you can write a few lines about what corresponds and what is attested how, as man had to do so with some legal terms as tortious interference and negligence per se, and if you told how it is then it can’t be wrong. Fay Freak (talk) 08:37, 8 January 2022 (UTC)[reply]
Thanks for bringing the category stuff to my attention. That certainly needs some kind of change; maybe something like Category:Requests for verification of langname translations will do.
I should have cited the entire sentence because the first part actually directly links to WT:ATTEST; as such, it is clear to me that translations are subject to that specific attestation policy, which is also in use for standalone entries. — Fytcha T | L | C 〉 15:21, 8 January 2022 (UTC)[reply]
I personally support this. I've seen a ton of translations that have made me scratch my head, but with most of them, since they're not in my primary languages, I've just let them be, so having a RFV template would help a ton to be able to have them verified somewhere. However, I am a bit worried about continuously adding more to the non-English RFV, which is still backlogged. AG202 (talk) 05:06, 10 January 2022 (UTC)[reply]
I think it's a great concept, but rather than listing the terms at the backlogged RFV page, it might be better to just let the entries be categorised into language-by-language categories of questioned translations, which editors in that language can go through and check as required, similar to the {{attention}} categories. This, that and the other (talk) 04:16, 12 January 2022 (UTC)[reply]

Transparent "law of X" entries

(Following up on Wiktionary:Requests_for_deletion/English#law_of_conservation_of_energy)

Do we want such entries? The only worthwhile information in these articles is perhaps the translations. German, for one, is not always predictable regarding which term is used for a mathematically proven statement. In other words, there is no mapping between English {"theorem", "law", "lemma", "corollary"} and German {"Theorem", "Satz", "Hilfssatz", "Gesetz", "Lemma", "Korollar"}. I think a BP discussion might be better suited for such a general class of terms, rather than proposing them one by one as I encounter them. — Fytcha T | L | C 〉 04:34, 9 January 2022 (UTC)[reply]

This seems very appendix-y to me. —Justin (koavf)TCM 04:36, 9 January 2022 (UTC)[reply]

Abuse filter to block "Pronunciation 1"

I am thinking of creating an abuse filter to block any addition of ==Pronunciation 1== headers. It has never been agreed to allow them and in practice entries created with them have all sorts of weird formatting issues, because there's no standard for how to handle them. On top of this, it's really not necessary to have such a header at all. Instead, you split by etymology, and if a given etymology section has multiple lemmas in it with different pronunciations (or a single lemma with multiple pronunciations, or whatever), you list the pronunciations in a Pronunciation subsection at the top of the etymology section, appropriately tagging the pronunciations so it's clear which lemma goes with which pronunciation.

Probably I will make an exception for entries with Chinese characters in their title, since the use of Pronunciation 1 headers for Chinese (and Japanese kanji terms) seems to occur fairly frequently. Benwing2 (talk) 08:06, 9 January 2022 (UTC)[reply]
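
To make the proposed check concrete, here is a rough sketch in Python of the matching logic. It is not AbuseFilter syntax, and the function name `should_block`, the regular expressions and the choice of Unicode ranges for "Chinese characters" are all assumptions made for illustration. It also inspects the whole submitted text, whereas a real filter would presumably look only at added lines.

```python
import re

# A wikitext header of any level (== to ======) whose text is "Pronunciation N".
NUMBERED_PRON_HEADER = re.compile(
    r"^(={2,6})[ \t]*Pronunciation \d+[ \t]*\1[ \t]*$",
    re.MULTILINE,
)

# Assumed approximation of "Chinese characters in the title":
# CJK Unified Ideographs plus Extension A.
HAN_CHAR = re.compile(r"[\u3400-\u4dbf\u4e00-\u9fff]")


def should_block(title: str, new_wikitext: str) -> bool:
    """Return True if the edit adds a numbered Pronunciation header
    to a page whose title contains no Han character."""
    if HAN_CHAR.search(title):
        return False  # the proposed exception for Chinese/Japanese kanji titles
    return NUMBERED_PRON_HEADER.search(new_wikitext) is not None


print(should_block("awka", "==Afar==\n\n===Pronunciation 1===\n* IPA"))
print(should_block("awka", "==Afar==\n\n===Pronunciation===\n* IPA"))
```

An unnumbered ===Pronunciation=== header is left alone; only the numbered variants are caught.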

Support
I don't like most usage in Chinese, and usage in Japanese should be eliminated AFAICT. —Fish bowl (talk) 08:10, 9 January 2022 (UTC)[reply]
Strong oppose: See Afar awka: The term obviously has only one etymology, and the only difference between the two senses is the shift in stress to accommodate a different gender. Thadh (talk) 09:27, 9 January 2022 (UTC)[reply]
Perhaps coming up with a more standard formatting would solve this issue. Something like Lang -> Etymology -> pronun1 -> header/definition -> pronun 2 -> header/definition, and so on. Vininn126 (talk) 10:03, 9 January 2022 (UTC)[reply]
@Benwing2: So if I understand you correctly, your solution for entries like German unumgänglich is to merge the pronunciation sections into one while qualifying them with the senses using {{sense}}? — Fytcha T | L | C 〉 11:59, 9 January 2022 (UTC)[reply]
See also îndoi#Etymology_2. If I understand your proposal correctly, it would introduce quite a lot of redundancy into this article. — Fytcha T | L | C 〉 16:26, 9 January 2022 (UTC)[reply]
@Thadh You misunderstand. I am not saying these must be separate etymologies. See diff. (We need some slight changes to {{aa-IPA}} to make this cleaner-looking.) The problem with Pronunciation 1 is there is absolutely no standardization, making editing the pages by bot virtually impossible. Sometimes you find Etymology N under Pronunciation N, sometimes vice-versa, sometimes they are randomly threaded together in a non-nesting fashion. Furthermore, there is no mention in WT:ELE of such headers at all, but many past discussions say that they are undesirable. I can dig these up if you need proof. There is actually a tag {{rfc-pron-n}} added by some past bot indicating that such entries need to be cleaned up. Benwing2 (talk) 23:25, 9 January 2022 (UTC)[reply]
@Benwing2: Perhaps a better example was baxa. In any case, what are we going to do, add all the senses to the pronunciation section? And adding the header (which would actually work for Afar) is also less than ideal, because that makes it harder to distinguish between one word pronounced in different ways and two words having the same etymology.
What about deciding on one standard solution for entries with multiple pronunciations instead? I think using pronunciation sections after etymologies whenever there's more than one seems doable. Thadh (talk) 23:40, 9 January 2022 (UTC)[reply]
@Fytcha I cleaned up îndoi appropriately. I don't see a lot of redundancy and in fact the page got 103 bytes smaller. Benwing2 (talk) 23:29, 9 January 2022 (UTC)[reply]
@Benwing2: It's a bit misleading to say that it got smaller when it getting smaller had nothing to do with getting rid of numbered pronunciation sections and especially when bringing back numbered pronunciation made it 18 bytes smaller. As to redundancy, it obviously is there (the argument of {{s}} must necessarily be a redundant copy of the corresponding senses) which is to be disliked by default. In the end it probably just comes down to whether we as a community have a stronger dislike for the redundancy or the more difficult / less reliable parsing. — Fytcha T | L | C 〉 01:14, 10 January 2022 (UTC)[reply]
@Fytcha You are right about the size difference, my apologies. However, I don't see why this claimed "redundancy" is a big issue; we're talking about on average maybe four or five words. And it's not just "less reliable parsing"; it's that (a) these headers are essentially disallowed by policy (WT:EL); (b) there is absolutely no standard for how to do this. If there are multiple Pronunciation N headings and multiple Etymology N headings, does that mean we end up with L6 headings? And which goes under what? Existing entries that use Pronunciation 1 are all over the place. What happens if the pronunciations cross-cut the etymologies, which often happens e.g. in Tagalog? Editors have tried to interleave the headings in such cases, which just doesn't work. If you really don't like putting all pronunciations in an etymology section in one place, I would much rather see a Pronunciation section placed *under* the corresponding POS header, similar to a Conjugation/Declension header, than introduce Pronunciation N headers. This is easy to follow and does not complexify the entry structure, which is already complex enough with single etymologies vs. multiple etymology sections, different possible places for Alternative forms, nested vs. non-nested Derived terms/Related terms/Descendants/..., etc. Benwing2 (talk) 02:29, 10 January 2022 (UTC)[reply]
I should also add, others besides me have been cleaning up Pronunciation N headers; I have seen User:Jberkel do this, for example. Benwing2 (talk) 23:30, 9 January 2022 (UTC)[reply]
I am okay with removing these headers, never a fan of those, though yesterday I added one case, Baumheide (Eastern Westphalian place names have strange stresses). I’d rather link the different senses or etymologies from a joint pronunciation section, although I don’t know now by which template: we have used {{sense}} for this a few times (mostly Arabic pages where an etymology is just a root and the vocalizations differ for noun and verb or the like) but how to additionally link IDs instead? (Of course we won’t reintroduce {{jump}}.) Fay Freak (talk) 00:16, 10 January 2022 (UTC)[reply]
@Fay Freak IMO having "etymologies" that are just roots is lazy. Properly, different vocalizations of noun vs. verb, form I vs. form II, etc. *are* different etymologies; either they have distinct Proto-Semitic (or Proto-West Semitic/Proto-Central Semitic/etc.) forms, which could be listed, or they are post-Proto-Semitic creations, in which case the etymology could/should specify this. I know that this is difficult in practice since Semitic etymology is such a shambles; but theoretically Arabic is no different from any other language in this regard. Benwing2 (talk) 02:33, 10 January 2022 (UTC)[reply]
Surely, but even if they are separate etymologies then it impedes readers if there are separate etymologies only for that reason, without there being anything ever said as specific etymology. (Or as you often did: put a new reference section under each part of speech section with a Steingass template referring to the same location. The use of space has to look efficient—this is for me more the point to forgo numbered pronunciation sections than your bot logics; these won’t prompt us to split up into etymology sections because “actually” there are etymologies, which can’t be actualized–if the text is old you don’t see whether you have stem I or stem II save for the imperative or verbal noun, so really never, even discounting any possible human work.) Fay Freak (talk) 02:48, 10 January 2022 (UTC)[reply]
@Benwing2: I forgot something I had in my mind and mentioned at various times: If the inflection tables had a “switch” to swap Semitistic transcriptions for IPA transcriptions we would rarely seek pronunciation sections in the first place. Of course in the cases that there are audio files they might have to be added to particular forms with form-specific parameters as we use in Russian tables anyway—currently I even have audio files between POS headers and POS templates not to create noisy pronunciation sections or structure the whole page after pronunciations. Example of an Arabic page: حبن‎. Fay Freak (talk)
Weak oppose per Thadh. A general comment though: I think we seriously need to review our entry layout. I know there was the vote a while back related to prioritizing definitions that unfortunately failed, but I still feel that the etymology-first approach is tougher for languages where etymologies aren't as clear or close to non-existent. I remember when I first started with Yoruba entries, I actually got suggestions to use the Pronunciation headers similar to how Akar & Arabic use them actually, see: User:Smashhoof/Sandbox/bi#Yoruba, though in the end I went with a different approach, see: ọkan, odo, and bi, the last of which I know has the overarching "Pronunciation" header that some folks aren't fans of. Thus, I understand why Thadh and others would use them, as {{sense}} definitely can be messy and unclear at times. AG202 (talk) 05:00, 10 January 2022 (UTC)[reply]
Support If there's a need for this it should be discussed instead of making up non-standard headers. – Jberkel 12:57, 13 January 2022 (UTC)[reply]

Wiki Loves Folklore is back!

Please help translate to your language

You are humbly invited to participate in Wiki Loves Folklore 2022, an international photography contest organized on Wikimedia Commons to document folklore and intangible cultural heritage from different regions, including folk creative activities and many more. It is held every year from the 1st till the 28th of February.

You can help enrich the folklore documentation on Commons from your region by taking photos, audio recordings and videos and submitting them to this Commons contest.

You can also organize a local contest in your country and support us in translating the project pages to help us spread the word in your native language.

Feel free to contact us on our project Talk page if you need any assistance.

Kind regards,

Wiki Loves Folklore International Team

--MediaWiki message delivery (talk) 13:14, 9 January 2022 (UTC)[reply]

Special:Contributions/2A00:23C8:A08:C801::/48's daily Template:etyl -> Template:der substitution

(Following up on @Svartava2's edit)

The above user has been doing daily {{etyl}}->{{der}} substitutions in Serbo-Croatian lemmas for almost 4 months now. They have been on my radar before actually (diff, diff, many more) but I got bored so I stopped cleaning up after them at some point. Evidence of automated scripting is relatively low I guess (https://imgur.com/u1WBj9P) but I haven't spent a lot of time trying to reverse-engineer their distribution. We would need a Serbo-Croatian editor to clean up more thoroughly; I can only decide for the obvious cases like late and direct borrowings from German.

Please discuss what to do; such indiscriminate mass-substitutions are considered bannably disruptive by precedent: [3] — Fytcha T | L | C 〉 15:04, 9 January 2022 (UTC)[reply]

@Fytcha: I suppose we can block them right away, since this substitution is all that they ever do. Another precedent: Donnanz was also banned from etyl-cleanup, see Wiktionary:Beer_parlour/2021/September#User:Donnanz’s_etyl_clean-up_methods. —Svārtava [tcur] 15:36, 9 January 2022 (UTC)[reply]
★ I was never banned, I stopped. As I stated at the time "I am washing my hands of Category:etyl cleanup". Get your facts right, please. DonnanZ (talk) 16:45, 13 January 2022 (UTC)[reply]
@Svartava2: I've made them aware on their talk page. Let's see if it continues. — Fytcha T | L | C 〉 16:10, 9 January 2022 (UTC)[reply]
@Fytcha: this is continuing. I think a block is in order, and also, since they really aren't making any constructive edits, we lose nothing by blocking them. —Svārtava [tcur] 12:22, 10 January 2022 (UTC)[reply]
@Fytcha: Still they're making those edits. Block them for a month to stop it, and as above, “since they really aren't making any constructive edits, we lose nothing by blocking them”. It's an awkward situation because if I don't know the language well and the case is not obvious enough, neither can I revert it boldly nor can I say that it is correct. —Svārtava [tcur] 08:34, 12 January 2022 (UTC)[reply]
@Svartava2: Blocked them for a week; I hope they get the memo this time. The true pity is that they could be really productive if only they replaced {{etyl}} with the right substitute. — Fytcha T | L | C 〉 13:44, 12 January 2022 (UTC)[reply]
I reduced it to a /64 block. All the edits in question are within the same /64 range, so /48 was overkill. Most ISPs (except mobile providers) assign a /64 IPv6 block to each single customer account, so the default should be a /64 block. Chuck Entz (talk) 20:41, 14 January 2022 (UTC)[reply]
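
For readers unfamiliar with the prefix arithmetic, the following sketch uses Python's standard `ipaddress` module to compare the two block sizes. The /64 prefix is assumed to be the one named in the section heading, and the two sample addresses are hypothetical, chosen only to illustrate the difference.

```python
import ipaddress

# strict=False lets the module mask off the bits beyond the prefix length.
block_48 = ipaddress.ip_network("2A00:23C8:A08:C801::/48", strict=False)
# Assumption: the single customer range is the /64 named in the heading.
block_64 = ipaddress.ip_network("2A00:23C8:A08:C801::/64", strict=False)

# The /64 sits entirely inside the /48.
is_inside = block_64.subnet_of(block_48)

# A /48 contains 2**(64 - 48) distinct /64 ranges.
ranges_in_48 = 2 ** (64 - 48)

# Hypothetical addresses: one in the same /64, one elsewhere in the /48.
same_customer = ipaddress.ip_address("2a00:23c8:a08:c801::1")
other_customer = ipaddress.ip_address("2a00:23c8:a08:1::1")

print(is_inside, ranges_in_48)
print(same_customer in block_64, other_customer in block_64)
print(same_customer in block_48, other_customer in block_48)
```

Under the one-/64-per-customer assumption stated above, the wider /48 block would also have covered up to 65,535 other customer ranges.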
See my comment above. The pettiness of this action, targeting some poor user, beggars description. DonnanZ (talk) 16:57, 13 January 2022 (UTC)[reply]
@Donnanz: Why is it petty? — Fytcha T | L | C 〉 01:58, 14 January 2022 (UTC)[reply]
The blocking of that user doesn't actually achieve anything. Since I stopped doing etyl cleanups, the rate of cleanup has dropped to less than a snail's pace, the speed is now glacial. Apart from some insignificant minor languages included in the cleanup but not listed separately, the only languages totally cleaned up in the last few months are German and Japanese. The tally of major languages where cleanup is still outstanding is 23. DonnanZ (talk) 10:32, 14 January 2022 (UTC)[reply]
It achieves keeping them from making disruptive edits. I'm not going to discuss the point whether such edits are disruptive or not. That has been discussed extensively and there's precedent. It is also, frankly, completely obvious. — Fytcha T | L | C 〉 12:53, 14 January 2022 (UTC)[reply]
A world war could begin and end in the time it has taken to clean them up. If Armageddon came... DonnanZ (talk) 14:27, 14 January 2022 (UTC)[reply]
  • @Fytcha, is it possible this London IP is Donnanz editing as an anon? Donnanz has done etyl replacements in Serbo-Croatian entries in the past… I wish my suspicion is wrong, but I think it’s better to check whether it is really Donnanz deceiving us— just see his vocal support in the blocked IP’s defense! ·~ dictátor·mundꟾ 19:48, 14 January 2022 (UTC)[reply]
    @Inqilābī: Good catch, I will look into it and collect behavioral evidence if I find any. — Fytcha T | L | C 〉 20:12, 14 January 2022 (UTC)[reply]
    In case anyone gets the idea of asking me to run checkuser on them: if this actually is Donnanz, it may be sneaky, but it isn't enough to merit a block for abusing multiple accounts, so the checkuser tool shouldn't be used. Chuck Entz (talk) 20:47, 14 January 2022 (UTC)[reply]
Unfortunately these geezers are obsessed about this topic - they are welcome to check my contributions if they don't do that already, but I very much doubt that they will find anything of interest in the last few months apart from new etymology, which doesn't count. DonnanZ (talk) 21:29, 14 January 2022 (UTC)[reply]
@Chuck Entz: Thank you for clarifying this. I must say though that it strikes me as weird that being blocked on an account, logging out and editing on an IP is perma-bannable whereas being blocked on an IP, logging in and editing on an account is totally fine. — Fytcha T | L | C 〉 22:02, 14 January 2022 (UTC)[reply]
My user a/c has never ever been blocked. Moreover, I have absolutely no idea what IP # I would have if I wanted to edit when logged out. DonnanZ (talk) 23:34, 14 January 2022 (UTC)[reply]
What I said (above) about the glacial speed of etyl cleanups seems to have made an impact on a certain user, as there has been some frenzied activity in Romanian. DonnanZ (talk) 10:37, 18 January 2022 (UTC)[reply]

Community Wishlist Survey 2022

The Community Wishlist Survey 2022 is now open!

This survey is the process where communities decide what the Community Tech team should work on over the next year. We encourage everyone to submit proposals until the deadline on 23 January, or comment on other proposals to help make them better. The communities will vote on the proposals between 28 January and 11 February.

The Community Tech team is focused on tools for experienced Wikimedia editors. You can write proposals in any language, and we will translate them for you. Thank you, and we look forward to seeing your proposals! SGrabarczuk (WMF) (talk) 18:10, 10 January 2022 (UTC)[reply]

Adpositional phrase

In 2010, @Ruakh proposed adopting prepositional phrase as a POS header, but thought that "neither 'Adposition' nor 'Postposition' is a standard POS header: so I [Ruakh] see no need to consider 'Adpositional phrase' and 'Postpositional phrase' at this time." However, entries like at the earliest or at the latest show the need for that header to avoid the confusion arising from overgeneralizing prepositions as adpositions. Notably, @DCDuring opined that preposition is already a misnomer, meaning that it's already in established use to refer to adpositions in general and that only a minority of people object to its heterological nature. However, neither our definition here nor that of MW reflects such a convention. I wanted to see the general position in the project. Assem Khidhr (talk) 14:32, 12 January 2022 (UTC)[reply]

Aren't they called prepositional phrases because they contain a preposition, rather than because they act prepositionally (a convention I really hate because it breaks consistency with noun phrase etc.)? By that token, the two examples you've mentioned don't qualify as postpositional phrases, phrases that contain a postposition and its object. — Fytcha T | L | C 〉 14:57, 12 January 2022 (UTC)[reply]
@Fytcha Oh, I think I fell victim to surface analysis. Thank you! Assem Khidhr (talk) 15:15, 12 January 2022 (UTC)[reply]

Deprecating Usenet

As Usenet becomes less relevant, less distributed, and less easy to search, it no longer has the value it did in the early days of Wiktionary. People have rightly complained that its mention in WT:CFI is out of line with our general attitude towards online sources. As it does have some value for recording 20th century usage I propose the following more explicit policy for just that one source:

Usenet posts from 2005 or earlier are considered durable if they are still findable.

I have suggested a more general rule that 20 year old web pages with fixed content should be citable, but nobody seems interested in that. People unhappy with the current rules seem to want the hottest ephemera, not established words.

I'm OK with grandfathering any existing post-2005 Usenet citations. This proposal is not meant to delete any existing entries. Thoughts? Vox Sciurorum (talk) 14:39, 12 January 2022 (UTC)[reply]

I'd love to be able to quote webpages if there was an easy way to create an accessible, archived version of it. I think this isn't a bad idea, a step in the right direction, perhaps. Vininn126 (talk) 14:59, 12 January 2022 (UTC)[reply]
Source for Usenet being harder to search now? I've never found it difficult using Google Groups. And you can even link directly to the post in the quotation, which IMO should be required. 70.172.194.25 22:34, 12 January 2022 (UTC)[reply]
  • Okay, true, Google Groups' search does include a lot of results that are not from Usenet and there does not appear to be any built-in way to filter these out. However, it takes less than a second to look at the group name and tell if it fits the usenet.newsgroup.name.format (it would even be trivial to write a script to filter results in this manner). If you want to be careful, you can also verify that the group really is from Usenet by searching its name in combination with Usenet, or by looking up a newsgroup hierarchy listing.
  • If we're talking about being inaccessible, though, I feel like this doesn't hold a candle to old journal articles behind a paywall or books with no/limited preview on Google Books, which are seemingly still allowed as long as they are durably archived in some format. Meanwhile, a link to a Usenet post via Google Groups poses no barriers to access; it is only the process of finding such quotations that some find challenging (although IMO it is not that hard once you get used to it).
  • The point about Usenet not being as relevant anymore is certainly true, but I don't think we generally require that sources be relevant, only that they be durably archived. There are a lot of works in the Internet Archive, Google Books, or Google Scholar that have probably only ever been read in their entirety by a dozen people and are not culturally relevant, but we still count them as long as they meet the criteria.
  • Overall, I fail to see how adopting this change to policy would benefit the project. It would just make it harder to attest certain slang, jargon, or non-standard forms that actually do exist but don't often appear in print. Are there any examples of words that you think should not be included where post-2005 Usenet postings pushed it just over the edge of CFI? 70.172.194.25 02:02, 13 January 2022 (UTC)[reply]
  • I don't see the point of "deprecating" Usenet. Is it no longer durably archived? I know that it is no longer possible to use Google Groups to get Usenet cites. But if someone, using any means, finds a valid Usenet cite and provides a link thereto, why shouldn't we accept it? DCDuring (talk) 22:49, 12 January 2022 (UTC)[reply]
    I consider it less durable than it was 20 years ago. It's definitely harder to get than it used to be. I don't think it is so superior to Facebook, Twitter, or MySpace to deserve special treatment. It's a worse source than the archives of a major newspaper. It should be handled based on the same policies as electronic sources in general, and as unedited sources in general. (Note that we do not currently distinguish professionally edited documents from keyboard spew, but I think we should.) Vox Sciurorum (talk) 12:12, 13 January 2022 (UTC)[reply]
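The filtering script mentioned above for Google Groups results could look something like the following. This is only a minimal sketch: the function names are hypothetical, and the list of hierarchies is an assumption limited to the "Big Eight" plus alt.*, so a group from a regional or organisational hierarchy would be wrongly rejected unless the set is extended. It checks the shape of the name only and does not confirm that a group really is carried on Usenet.

```python
import re

# Assumption: only the "Big Eight" hierarchies plus alt.* are recognised here.
# Real Usenet has many more (regional, national, organisational); extend as needed.
KNOWN_HIERARCHIES = {
    "alt", "comp", "humanities", "misc", "news", "rec", "sci", "soc", "talk",
}

# Two or more dot-separated components made of lowercase letters, digits, "+", "-" or "_".
NEWSGROUP_RE = re.compile(r"^[a-z0-9+_-]+(\.[a-z0-9+_-]+)+$")


def looks_like_newsgroup(name: str) -> bool:
    """Return True if the name has the usenet.newsgroup.name shape
    and starts with one of the known top-level hierarchies."""
    name = name.strip().lower()
    if not NEWSGROUP_RE.match(name):
        return False
    return name.split(".", 1)[0] in KNOWN_HIERARCHIES


def filter_usenet(group_names):
    """Keep only the group names that look like Usenet newsgroups."""
    return [g for g in group_names if looks_like_newsgroup(g)]
```

A name that passes this check should still be verified against a newsgroup hierarchy listing before the post is cited.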
As a side note, not gonna lie, I'm still a bit confused as to why Usenet has so much power here? I've seen many complaints about how a word widely used on Twitter shouldn't be included because it's not "durably archived", but yet a word can appear in the far reaches of Usenet only three times and be included? It's similar to any other forum nowadays, just that it's much older, so I really wonder if there's anything that can be done with it to actually put it in line with the other guidelines and policies. No reason why there should be this discrepancy. (And this isn't the first time me or anyone else has brought this up.) AG202 (talk) 04:08, 13 January 2022 (UTC)[reply]
  • I'm in favor of downgrading Usenet. As a corpus, it's greatly biased towards a certain type of speaker/language (tech-related, predominately male, etc.) Useful to document the language/slang of that particular group (and time), but not representative of the language community of today. – Jberkel 12:34, 13 January 2022 (UTC)[reply]
    • As a corpus, journal articles are biased towards highly-educated speakers and technical language, but that doesn't seem to be a problem. I'd be in favor of including more sources that we deem as durably archived that encompass a wider range of lects, but I don't see how removing Usenet as a source would help. It's currently the one form of non-print source we deem as acceptable, which is wildly out of line with the trends of the day, and removing it would in essence be relegating ourselves to only dead-tree sources (which admittedly have a longer shelf life than most internet sources; with the notable exception of Usenet!). 70.172.194.25 20:40, 13 January 2022 (UTC)[reply]
      • Don't forget films and music which aren't dead-tree, and a good source for more casual language. Not very searchable though. – Jberkel 21:08, 13 January 2022 (UTC)[reply]
I think maybe the question shouldn't be about deprecating Usenet, but specifically about an ability to upgrade other web citations, which might be controversial. However, there has been talk of using something like the Internet Archive, which would make these sources much more durable. Vininn126 (talk) 11:39, 14 January 2022 (UTC)[reply]

Spanish redundant accent category

Should we have a category for Spanish words with an accent that doesn't affect pronunciation, such as té and mí? Dngweh2s (talk) 00:00, 13 January 2022 (UTC)[reply]

In both examples the acute accent indicates an aspect of pronunciation, namely stress. The corresponding forms te and mi are atonic (and yes, also different words). That this is so is more obvious in cases like que/qué, quien/quién, etc. Nicodene (talk) 01:03, 13 January 2022 (UTC)[reply]
@Nicodene What about solo/sólo and este/éste? Dngweh2s (talk) 01:09, 13 January 2022 (UTC)[reply]
Éste serves to indicate the pronoun, and so a stressed form, whereas the determiner este forms the first part of a noun phrase, where the stress will instead be on the head. It is true that the pronoun may also be spelled without the diacritic (for typographical convenience), but, as far as I am aware, the reverse is not true, because an acute accent would not be appropriate for the atonic form.
With solo/sólo I do not see a clear-cut difference in stress. Perhaps not surprising, then, that the latter spelling has been retired. Nicodene (talk) 01:36, 13 January 2022 (UTC)[reply]
Both pairs (este/éste & solo/sólo) fall under the same guidelines, and the tilde should only be added in cases of ambiguity, per la RAE, sections 3.2.1 & 3.2.3. AG202 (talk) 04:11, 13 January 2022 (UTC)[reply]
If the official practice is now to drop the diacritic in general, but to require it where ambiguity is possible, then the description on our entry for sólo should be changed accordingly, because it simply states the spelling is deprecated.
It seems, in any case, that sólo is a genuine example of what the OP was describing. Nicodene (talk) 05:33, 13 January 2022 (UTC)[reply]

Call for Feedback about the Board of Trustees elections is now Open

You can find this message translated into additional languages on Meta-wiki.

The Call for Feedback: Board of Trustees elections is now open and will close on 7 February 2022.

With this Call for Feedback, the Movement Strategy and Governance team is taking a different approach. This approach incorporates community feedback from 2021. Instead of leading with proposals, the Call is framed around key questions from the Board of Trustees. The key questions came from the feedback about the 2021 Board of Trustees election. The intention is to inspire collective conversation and collaborative proposal development about these key questions.

There are two confirmed questions that will be asked during this Call for Feedback:

  1. What is the best way to ensure more diverse representation among elected candidates? The Board of Trustees noted the importance of selecting candidates who represent the full diversity of the Wikimedia movement. The current processes have favored volunteers from North America and Europe.
  2. What are the expectations for the candidates during the election? Board candidates have traditionally completed applications and answered community questions. How can an election provide appropriate insight into candidates while also appreciating candidates’ status as volunteers?

There is one additional question that may be presented during the Call about selection processes. This question is still under discussion, but the Board wanted to give insight into the confirmed questions as soon as possible. Hopefully if an additional question is going to be asked, it will be ready during the first week of the Call for Feedback.

Join the conversation.

Best,

Movement Strategy and Governance --Mervat (WMF) (talk) 09:29, 13 January 2022 (UTC)[reply]

Allow adding unattested translations?

(Following up on #Unattested_translations_and_{{not_used}})

WT:TRANS clearly states: "[…] words added to translation tables are subject to attestation requirements as well." Where does the community stand on relaxing this requirement? As I've also stated in the above discussion, I see it as a possibility to create a template {{no attested translation}} (in the spirit of and styled equivalently to {{no equivalent translation}}), after which unattested translations are allowed to be added. There needs to be something one can add to such entries. {{not used}} (as seen here) is just not satisfactory because there is a term that is used and readily understood, it just doesn't conform to our specific attestation criteria.

This discussion made me think of Alemannic where there are potentially tens of thousands of commonly used and understood terms (mainly technical or scientific in nature) that can absolutely not be attested (I guess because science is in general not conducted in the L language in a diglossic environment). These terms are usually calques, nativized loans, or loan renderings of the German equivalent.

I do acknowledge that this makes the dictionary less verifiable. What else do we have other than the respective editor's word? — Fytcha T | L | C 〉 12:39, 13 January 2022 (UTC)[reply]

  • If a translation is challenged the supporter should be able to demonstrate that it is used. Personally, I am willing to accept weaker evidence for a translation than an entry but I would still demand some evidence. Others might insist on full citation compliance. I had some translations challenged around late 2020 and I went to the effort of digging up quotations. I do not think the challenger needed to accept my word for it. One of my translations was said to be code switching so I deleted it. Vox Sciurorum (talk) 21:55, 13 January 2022 (UTC)[reply]
    @Vox Sciurorum: I must say, even finding any shred of evidence at all will be hard for the majority of scientific terms in Alemannic. coordinate chart is Alemannic German Charte (likely a loan meaning from German Karte) but this would be absolutely impossible to attest in any way, shape or form. Then again, we probably don't care about scientific Alemannic terms anyway. — Fytcha T | L | C 〉 13:42, 14 January 2022 (UTC)[reply]
    As a general policy, if it hasn't been written down it does not exist as far as Wiktionary is concerned. We are not in the business of researching spoken languages. Vox Sciurorum (talk) 14:30, 14 January 2022 (UTC)[reply]
  • The mention of Alemannic reminds me of hearing technical conversations in foreign languages. English words appear regularly, but they are still English words. We consider them code-switching rather than borrowings. A similar issue has been brought up with Scots and English. Almost any English word can be dropped into a conversation. Vox Sciurorum (talk) 01:41, 14 January 2022 (UTC)[reply]
@Fytcha: Since I add a lot of translations, I find Category:Requests for translations by language both useful and bothersome, since a few users just carelessly throw around {{t-needed}} into rare languages or for terms, which are unlikely to be even known to speakers of a given language. My request to reduce the usage of {{t-needed}} was always rebutted with "all words all languages" motto. I agree with the motto but it has to be reasonable. What are all these requests doing at 610 Office, contact tracing or Doukhobor?
I am not sure if {{no equivalent translation}} is an ideal solution but it's better than nothing or if you want to quickly close an annoying request to reduce the request category. Also pinging @Benwing2: who contributed in the other discussion. --Anatoli T. (обсудить/вклад) 02:11, 14 January 2022 (UTC)[reply]
@Atitarev: I think it would be valuable to distinguish between {{no equivalent translation}} and something like a hypothetical {{no attested translation}}. The first one states that there exists no equivalent, when, in fact, one does exist (it can't be claimed that Romanian mathematicians simply lack the vocabulary to refer to Schläfli symbols, right? Even ro.wiki has an entry: ro:simbol Schläfli) but just doesn't fit our policies (yet). On the other point I agree: is there really a need to request a translation of 610 Office into Armenian and Georgian, considering those languages don't even have a translation for doormat? Request the important stuff first! contact tracing seems more reasonable though. — Fytcha T | L | C 〉 13:35, 14 January 2022 (UTC)[reply]
Is there a mechanical test to find out if a word in spoken language is code switching or not? I honestly think code-switching to High German is rarely the case in Alemannic, though it happens sometimes, e.g. if a High German idiom that has grammatical features that Alemannic lacks is used (then, Swiss High German phonology is usually used accordingly). Apart from that, I'd call them calques / nativizations for morpho-/phonological reasons (the exact categorization is maybe a bit hard; compare also Wiktionary:Etymology_scriptorium/2021/October#Romanian_asexualitate; it may be that Wiktionary currently doesn't document this class of "borrowings" correctly). — Fytcha T | L | C 〉 13:21, 14 January 2022 (UTC)[reply]
We also have the CFI criterion "clearly widespread use", which may be helpful in this case, as Alemannic is rarely written down. But of course you should make sure that it's not codeswitching, like Vox says. Thadh (talk) 09:49, 14 January 2022 (UTC)[reply]
@Fytcha: I don't quite understand this. You have to explain so that people without a good grasp of Romanian make sense of this too. Is jeton nefungibil an SoP? Perhaps it should be broken up like jeton nefungibil? Is "jeton nefungibil" identical to non-fungible token? It's unattestable but how is it a correct translation? Having both "no attested translation in Romanian, but see" and a translation next to it, doesn't make much sense to me. Perhaps splitting into parts like this jeton nefungibil would be sufficient? --Anatoli T. (обсудить/вклад) 23:38, 18 January 2022 (UTC)[reply]
@Atitarev: It is not SOP for the same reason the English term is not SOP: The specific cryptocurrency-related meaning is not deducible from the parts. Therefore, I don't think linking to the parts is the correct way to go about it. It is "obviously correct" because it is the term that is actually used by Romanian media ([4], [5]) and ro.wikipedia ([6]). The reason I didn't add {{not used}} or {{no equivalent translation}} is because there actually is an equivalent term in use, so those templates would be wrong (the latter of the two also requires attestation from my understanding, so that wouldn't fly anyway). The reason I didn't just use {{t}} is because WT:TRANS states that WT:ATTEST applies to translations too. In conclusion, we're in quite a weird spot here from what I can tell, which is why I created this template. Tell me if this makes it clearer! — Fytcha T | L | C 〉 23:58, 19 January 2022 (UTC)[reply]
@Fytcha: I see your point now, thanks. Perhaps the WT:ATTEST criteria are too strict and should include media usage? Or perhaps there should be some distinction between "entry-worthy" and "translation-worthy" ("correct and perhaps the only way but unattested")?
As in the previous discussion (I know you disagreed), I think {{t|fi|Streisand-ilmiö}} is a good Finnish translation of Streisand effect for cases where you don't think it's worth creating the entry (there is a Wikipedia article for it) but still wish to show users how the term is translated into the target language.
Perhaps fully de-linked templatised translations should also be allowed, i.e. just "jeton nefungibil"; you can achieve this with a template call: {{t|ro||n|alt=jeton nefungibil}} -> jeton nefungibil n.
See also how I dealt with the Russian translation of [[boat people]]. «лю́ди с ло́дки» m pl (“ljúdi s lódki”) (i.e. "people from a/the boat") is not idiomatic in Russian (even if attestable) but it's a correct translation, using "quotes" to highlight that's what is used by media as a translation only. So, maybe it would be correct to use "jeton nefungibil" in your case? --Anatoli T. (обсудить/вклад) 00:37, 20 January 2022 (UTC)[reply]
@Fytcha If it's an "obviously correct" translation because it's found in actual usage in the media, it should be included, not hidden behind {{no attested translation}}. If it can't be attested per our attestation guidelines, either we need to revise them, or you should just ignore the attestation rules in this particular case. To me this is a clear case where the letter of the law is contrary to the spirit of the law, in which case the spirit should prevail. (Compare Wikipedia's "ignore all rules" policy.) Benwing2 (talk) 06:04, 20 January 2022 (UTC)[reply]

Trivial English present participles

(See Wiktionary:Votes/2022-01/Excluding_trivial_present_participal_adjectives. I will probably temporarily retract the vote until the exceptional case brought up by AG202 on the talk page has been addressed satisfactorily.)

Where does the wider community stand on trivial (in the sense as defined in the vote as a combination of three criteria) adjectival conversions of present participles? I generally disagree with their separate inclusion (i.e. by using an adjective header in addition to a verb header), see the vote page for the majority of my reasoning and arguments. However, the point has been brought up that it would be potentially confusing to lose the overwhelmingly common second sense of e.g. interesting along with its many good translations. I strongly agree with the translations part so I am currently in search of a good criterion to pick out the trivial but keep-worthy adjectives (like interesting, annoying) while getting rid of the "That VERBS." glossed nonsense (and equivalents, of course) like falling. See Wiktionary_talk:Votes/2022-01/Excluding_trivial_present_participal_adjectives#Present_participles_that_do_act_like_true_adjectives for two ideas on how to potentially discriminate those two cases. Most importantly:

1. Have the community collaboratively define a set of exceptional adjectives (via the BP which also can always be updated with a BP consensus). This is still an improvement compared to the status quo as it shifts the default position from inclusion to deletion, which is how it should be.
2. Try to come up with something along the lines of a WT:THUB criterion, where a present participial adjective may be entered if it has a certain number of interesting translations. What constitutes interesting exactly would still have to be hammered out, but I think "having N translations that are not the present participle equivalents of the respective translated base verb or any verb that is synonymous (in this specific sense)" would be a starting point. What should and shouldn't be counted towards these N would, as is the case for WT:THUB, be subject to appeal; we don't want e.g. the collective of all Arabic lects to be able to unilaterally meet this criterion.
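To make the second idea concrete, here is a rough sketch of how such a count could work. Everything in it is an assumption for illustration: the `Translation` record, the `group` field (a hypothetical way of lumping closely related lects together so that they count only once), and the threshold `n`, which the proposal leaves open. Whether a translation is a mere participle equivalent would still have to be judged by editors.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Translation:
    language: str                   # e.g. a language code
    group: str                      # hypothetical grouping key; related lects share one group
    term: str
    is_participle_equivalent: bool  # editor's judgement, not computed here


def count_interesting_groups(translations):
    """Count distinct groups that have at least one translation which is
    not just the participle equivalent of the translated base verb."""
    return len({t.group for t in translations if not t.is_participle_equivalent})


def qualifies(translations, n):
    """Return True if at least n distinct groups contribute an interesting translation."""
    return count_interesting_groups(translations) >= n
```

Counting groups rather than individual translations is one possible way to stop a cluster of closely related lects from meeting the threshold on its own.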

Pinging @Lambiam, Vininn126, Sgconlaw, AG202, Donnanz, Eirikr as some of the more involved parties in this discussion. — Fytcha T | L | C 〉 21:13, 13 January 2022 (UTC)[reply]

I think that the vast majority of participles should NOT have adjectival definitions except a small handful - i.e. the ones brought up by AG202. I think it will be hard to come up with a hard and fast rule, unfortunately, as most of these adjectival participles are considered such only because they're more idiomatic. Vininn126 (talk) 21:18, 13 January 2022 (UTC)[reply]
I'm not happy with the use of "trivial", it should be replaced with something else. And on the subject of worthy -ing adjectives, I bet most users wouldn't delete f***ing if it's part of their vocabulary. DonnanZ (talk) 21:38, 13 January 2022 (UTC)[reply]
F***ing is a good example actually of a word that doesn't fall under the typical adjective tests and also has a bunch of translations, so @Fytcha we still would probably need more clarity, though I'm not sure where to start with it. AG202 (talk) 22:24, 13 January 2022 (UTC)[reply]
@AG202: That word is not concerned by my vote (just like becoming and eating aren't) because the semantics are not 100% transparent (it doesn't mean "that VERBS"). — Fytcha T | L | C 〉 22:41, 13 January 2022 (UTC)[reply]
Ahhhh alright that makes sense, I wasn't sure if it could be construed to fit under one of the verb meanings at f***. Thanks! AG202 (talk) 00:21, 14 January 2022 (UTC)[reply]
Strike the language "may be deleted by any user on sight and may not be re-entered" and simply say they are not to be included as separate parts of speech. We don't need to say that something not meeting CFI can be deleted or should not be entered and controversial cases will end up in a forum discussion anyway. More specifically, add to Wiktionary:About_English a paragraph "adjectives which are simply present participles used adjectivally in the sense of 'which VERBS' are not included as separate parts of speech; they should be listed as verb forms using {{present participle of}}." Vox Sciurorum (talk) 21:49, 13 January 2022 (UTC)[reply]
I was going to object to that anyway. If that happened, perish the thought, it could be extended (in a nightmare) to SoP terms and other PoS. Definitely not on. But this whole proposed vote deserves to fail. DonnanZ (talk) 22:16, 13 January 2022 (UTC)[reply]
@Vox Sciurorum, Lambiam: I agree with both of your objections regarding the wording so I've updated the wording now. Tell me what you think! — Fytcha T | L | C 〉 12:48, 14 January 2022 (UTC)[reply]
I’m on board with the general idea, even though I have some problems with the current wording of the proposal. Also, I think it can be smoothly extended to past participles. Even more generally, if in some language it is a property of its grammar that terms with some given primary POS assignment can also be used with a second POS role (like adjective → adverb in e.g. German; see German adverbial phrases § Adverbial forms of adjectives on Wikipedia), there needs to be a specific reason beyond such routine use for the inclusion of such a term under that second POS.  --Lambiam 23:41, 13 January 2022 (UTC)[reply]
Completely agree with Lambiam’s extension. MuDavid 栘𩿠 (talk) 01:22, 14 January 2022 (UTC)[reply]
@Lambiam: I agree, I've also mentioned German and Romanian in the vote's rationale (both of which feature trivial adj->adv conversion). I actually started out writing the vote to pertain to all languages but then in the middle of it I thought to myself that I don't want the vote to fail only because of some unforeseen corner case in a language I'm not familiar with, so I've changed it to English. Judging by the languages I know, however, I totally agree with what you're saying: there needs to be some kind of distinguishing feature for inclusion apart from the predictable and trivial conversion. This is already de-facto policy for adj->adv conversions in the two languages I've mentioned. — Fytcha T | L | C 〉 01:24, 14 January 2022 (UTC)[reply]
Another adj → adv example is Turkish grammar, which moreover has several finite verb forms with predictable secondary roles as participles, such as [third-person singular present simple indicative] → [aorist participle] and [third-person singular future] → [future participle].  --Lambiam 10:36, 14 January 2022 (UTC)[reply]
Lambiam's proposed extension to past participles is also flawed, judging by Fytcha's RFD of pressurized. That one has an antonym, unpressurized. DonnanZ (talk) 10:58, 14 January 2022 (UTC)[reply]
I do not see the argument. Are you suggesting that the verb forms abetted, abolished, abraded, abrased, abrogated, absolved, ... merit a second entry as an adjective merely by dint of having a derived term with un-? Are there any past participles of transitive verbs that, in your opinion, do not deserve a separate inclusion as adjective? And then, what of adjectives as nouns (the dispossessed, the strong, and so on and so forth)?  --Lambiam 14:07, 14 January 2022 (UTC)[reply]
The fate of usexes, quotes, and translations included in entries covered by this proposal is also unclear. Deleting those would be vandalism, considering the effort of editors adding them. DonnanZ (talk) 12:11, 14 January 2022 (UTC)[reply]
Agreed, Vininn126 (talk) 12:53, 14 January 2022 (UTC)[reply]
You agree to what? DonnanZ (talk) 14:31, 14 January 2022 (UTC)[reply]
To the comment above mine by Lambiam ;) That's why I replied to that comment. Vininn126 (talk) 14:34, 14 January 2022 (UTC)[reply]
I'm still confused. DonnanZ (talk) 14:47, 14 January 2022 (UTC)[reply]
"I’m on board with the general idea, even though I have some problems with the current wording of the proposal. Also, I think it can be smoothly extended to past participles. Even more generally, if in some language it is a property of its grammar that terms with some given primary POS assignment can also be used with a second POS role (like adjective → adverb in e.g. German; see German adverbial phrases § Adverbial forms of adjectives on Wikipedia), there needs to be a specific reason beyond such routine use for the inclusion of such a term under that second POS. --Lambiam 23:41, 13 January 2022 (UTC)" Vininn126 (talk) 14:52, 14 January 2022 (UTC)[reply]
@Lambiam I don't think I'd support a sweeping proposal like this one. There've already been issues with proposals that don't take into account more minority languages and communities, so I'd really just keep it to English for now. If it should be applied to German and there's no major opposition to it in the German editor community, then it should just be added to Wiktionary:About_German, after some discussion in Beer Parlour. Re: past participles, I still think that it'd be best to just run the typical adjective tests found on Wiktionary:English adjectives as that'd weed out the ones that should be weeded out (ex: opened would not pass but frightened or broken would), and to be honest, the more I think about it, the more I feel that a formal vote on this is becoming less necessary. AG202 (talk) 22:13, 14 January 2022 (UTC)[reply]
  • Could someone explain to me why an English word ending in 'ing' that passes the tests for adjectivity (gradability/comparability/modification by 'very' or 'too', use after copulas other than forms of 'is', meaning distinct from that of a verb it is derived from) should not retain an adjective heading? DCDuring (talk) 14:59, 14 January 2022 (UTC)[reply]
    @DCDuring: No, but that is not the concern of this vote anyway. I only want to get rid of (most; see AG202's point) adjective entries that just mean "that VERBs" such as falling (among other criteria), i.e. ones that have no "meaning distinct from that of a verb [they are] derived from". My arguments for that are on display on the vote's page. — Fytcha T | L | C 〉 15:07, 14 January 2022 (UTC)[reply]
@Fytcha Can you give a larger list of "trivial present participles" besides just falling which currently have adjective entries? Benwing2 (talk) 01:55, 16 January 2022 (UTC)[reply]
@Benwing2 Not sure if I did this right, but if I did, this should be the search for all the English present participles that have an adjective header, so it'd include both the "trivial" entries and the ones that I've mentioned, as a starting point. Also as a side note, I think that a category like "English present participles" would be very helpful to have. I'm not sure exactly why it was deleted, and looking at the RFD "discussion", it seems that it was deleted without direct discussion about it. AG202 (talk) 02:09, 16 January 2022 (UTC)[reply]
@Benwing2: growling, accusing, reigning, curving, quivering, improving, defining, resulting, tickling, contracting, differing, tinkling (sense 1), inducing, wetting, discouraging, thieving, widening, musing, rousing, arousing, all meaning "That VERBs". For some of these, like improving, it could be argued that they are not trivial because they additionally contain some semantics about habituality, which would bar them from my proposal, something I didn't think of before. — Fytcha T | L | C 〉 03:11, 16 January 2022 (UTC)[reply]
@Fytcha Thank you. I would argue that some of these deserve to be adjectives as they can be qualified by words like extremely, somewhat or quite: extremely discouraging, somewhat curving, quite arousing. Benwing2 (talk) 03:21, 16 January 2022 (UTC)[reply]
@Benwing2: Aren't those simply the comparable ones? — Fytcha T | L | C 〉 03:24, 16 January 2022 (UTC)[reply]
@Fytcha My point is that being comparable is IMO one clear test of a term being an adjective rather than just a participle. Another IMO is when a term describes a state rather than a result; this is the point User:Chuck Entz made in his description about pressurized. So it's not enough just to say it can be defined as "that VERBs" because the adjective-y terms have additional semantics, even if not explicitly captured in the definition. Benwing2 (talk) 03:56, 16 January 2022 (UTC)[reply]
@Benwing2, AG202: From what I see, the discussion seems to circle mainly around comparability for the two of you. I want to remind you, however, that comparability is neither a sufficient nor a necessary condition for adjectivality, the former of which I've learned only recently in this discussion. It could be the case, however, that it is indeed sufficient to discriminate within the class of present participles and merely fails for nouns; that has to be shown though.
Even if this proposal of mine leads nowhere (which is what it currently looks like), I at least hope that we can come up with better tests that can be applied razor-sharply. The non-be copula test that AG202 uses could be one. Then again, where do we draw the line? spiring, for instance, has exactly one attested use with become; does this merit an adjective entry now? Once the dust has settled, something should be put into WT:CFI. — Fytcha T | L | C 〉 04:15, 16 January 2022 (UTC)[reply]
I propose a second test - can the given participle be used predicatively without giving the continuous form (i.e. that was exciting). This would give two possible tests. I wonder if there's a participle that would fail both and still be considered an adjective. If yes, then perhaps we need one more test. Vininn126 (talk) 04:17, 16 January 2022 (UTC)[reply]
Out of that list, I'd personally probably keep "discouraging" & "arousing" just based on the same tests I used for "pressurized". AG202 (talk) 03:21, 16 January 2022 (UTC)[reply]

──────────────────────────────────────────────────────────────────────────────────────────────────── I feel quite doubtful about the proposal in general. I worry that it will lead to a lot of wrangling over what is regarded as “trivial” and what isn’t. Moreover, if common -ing forms are not marked as adjectives, are readers supposed to assume that all such words can be used as adjectives? Is this generally a correct assumption? If so, where will readers learn about this? Do we put quotations showing adjectival use interspersed among verb uses, organized chronologically in the usual way – will this be confusing to readers? (Or do we somehow list such quotations in a separate block?) Why should we ignore the lemming principle here (many other major dictionaries mark such words as adjectives)? — SGconlaw (talk) 04:21, 16 January 2022 (UTC)[reply]

The general consensus so far is that a few will be kept. And yes, readers should be expected to understand that participles are adjectives, we're a dictionary, not a grammar. I don't think we should ignore the lemming principle, just because OED lists them doesn't mean we should. Vininn126 (talk) 04:40, 16 January 2022 (UTC)[reply]
@Vininn126: what should we do about adjective quotations then? Mix them together with quotations showing verb uses? — SGconlaw (talk) 05:16, 16 January 2022 (UTC)[reply]
You mean like on growing? Potentially. Or under the participle usage, proving that, indeed, the participle exists, and it is functioning as participles do in English. Vininn126 (talk) 05:21, 16 January 2022 (UTC)[reply]
We may not be a grammar, but word-class membership is essentially a matter of syntax, not semantics. I fundamentally disagree with the proposition that "trivial" wording in a participle's definition is a sufficient reason to delete an adjective PoS section for any English participle. Further, I believe that no semantic consideration is a sufficient reason for deletion. Membership in the adjective word class is governed by syntactic criteria for inclusion: attributive use [a necessary condition], gradability/comparability (eg, modification by too, very) [sufficient], use in a predicate after a copula other than be (eg, seem, become) [sufficient]. Having a distinct definition not immediately transparent from the definitions of the verb from which it is derived is also a sufficient reason for inclusion.
I'm sorry that contributors are looking for shortcuts to deletion of PoS sections they object to instead of doing the work involved in determining whether a word behaves like a member of a word class. Flooding the RfV process with definitions one objects to to achieve deletion is scarcely better. DCDuring (talk) 06:03, 16 January 2022 (UTC)[reply]
@DCDuring: Not sure why you keep going back to that "wording in a participle's definition" line; it is self-evident that my criterion is aimed at the actual semantics of the word, not whether its definition on Wiktionary matches some kind of regex; in my vote I also wrote "a 100% transparent meaning". Even if you misunderstood or I explained poorly, this would be the obvious steelman.
Also, could you please address the concerns @Mihia brought up there? As it stands, it looks like you're saying "New York" is an adjective, for it passes the sufficient condition proposed by you of being gradable/comparable. — Fytcha T | L | C 〉 13:11, 16 January 2022 (UTC)[reply]
New York is no adjective nor is New Zealand, but both are used attributively. The adjective was deleted, see Talk:New York. DonnanZ (talk) 13:55, 16 January 2022 (UTC)[reply]
I am saying that semantics is often irrelevant to word-class membership and is especially so in the case of participles. Not even a completely transparent meaning trumps the syntactic evidence that a word is a member of a specific word class.
As to proper nouns being adjectives, no one says they are worth including as adjectives merely because they are used attributively. In fact, we don't even take as sufficient that a proper noun can be shown to be sometimes forced into adjective-like use with too or very (eg, "That tweet was very Trump"). Virtually any noun can be used attributively. Your reliance on semantics for some kind of rule about word-class membership is completely misguided. DCDuring (talk) 18:42, 16 January 2022 (UTC)[reply]
So we shouldn't keep adjectival definitions of nouns, as it's 100% predictable how they're used, but we should keep adjectival participles such as "growing", because it's... 100% predictable? Huh? Vininn126 (talk) 13:21, 18 January 2022 (UTC)[reply]
@DCDuring: "no one says [proper nouns] are worth including as adjectives" I think you are by presenting a sufficient condition ("gradability/comparability (eg, modification by too, very)") that applies to them as well. Either your sufficient condition is not sufficient or you think these words are adjectives as well and deserve an adjective PoS header. Which one is it? You're free to walk back your claim that gradability/comparability is a sufficient condition, but then I'd again ask you to present sufficient conditions to test adjectivality of a word. — Fytcha T | L | C 〉 13:28, 18 January 2022 (UTC)[reply]
We made the explicit decision not to accept attributive use alone as evidence of adjectivity for any English noun because such attributive use is possible for virtually any English noun, both as a matter of syntax and as a matter of actual usage. For English proper nouns we decided that we wanted more than minimal evidence of adjective-type usage like "It was not a very White House way of communicating" to support adjectivity.
We have some 1,300 entries that are in both Category:English proper nouns and Category:English adjectives. Most of them are demonyms or glossonyms, for which we believe it likely that evidence could be found of true adjectivity. Hardly any have citations supporting their adjectivity. Many of them could use some cleanup, eg, moving derived and related terms to the bottom of the L2. DCDuring (talk) 14:56, 18 January 2022 (UTC)[reply]
The whole concept of removing these adjectives should die a death. I notice that Fytcha has taken down the proposed vote - for now. DonnanZ (talk) 10:50, 16 January 2022 (UTC)[reply]

Non-English entries that don't meet WT:CFI#Numbers,_numerals,_and_ordinals

See Wiktionary:Requests_for_deletion/Non-English#Uzbek_SOP_numbers and Wiktionary:Requests_for_deletion/Non-English#өч_йөз: Would anybody mind if I instagibbed all such entries whenever I encounter them? I would of course take care of proper relinking etc. I ask this because RFD-tagging them, listing them and then revisiting them after a month etc. can be so tiring for such a large number of essentially equivalent entries. — Fytcha T | L | C 〉 22:55, 13 January 2022 (UTC)[reply]

@Fytcha: No I wouldn't mind. And to the contrary, there's precedent for deleting all such entries. Imetsia (talk) 18:22, 17 January 2022 (UTC)[reply]
Seconded. Ultimateria (talk) 21:45, 18 January 2022 (UTC)[reply]

Why the heck is "m*nstr*l" word of the day?

Discussion moved from Wiktionary talk:Word of the day/Nominations.

The following comment was posted at the above location. Minstrel was WOTD on 12 January 2022 but it may be useful to hear some views on it:

"It is offensive to BIPOC. Especially without proper contextualization in the bottom for words relevant to today!" —⁠This unsigned comment was added by 72.76.95.136 (talk) at 08:27, 12 January 2022‎.

I assume that the original poster was referring to sense 2.2: "(US, historical) One of a troupe of entertainers, often a white person who wore black makeup (blackface), to present a so-called minstrel show, being a variety show of banjo music, dance, and song."

Now as far as I can tell, this sense of the word itself is not derogatory, but the practice of people who are not black performing in blackface is nowadays regarded as inappropriate. Should that disentitle an entry from appearing as WOTD? (I feel that if an explanation is needed, then it would probably not be feasible to feature such an entry as a WOTD as the comment line at the bottom of the WOTD is not a very suitable place for a lengthy discourse.) Thoughts? — SGconlaw (talk) 13:01, 14 January 2022 (UTC)[reply]

If the term is already considered offensive by itself, then what about bl*ckf*c*, or sl*v* tr*d* for that matter?  --Lambiam 14:14, 14 January 2022 (UTC)[reply]
Some people are too thin-skinned or deniers. DonnanZ (talk) 14:21, 14 January 2022 (UTC)[reply]
I know right, we allow minstrel but Sgconlaw (talkcontribs) didn't accept my nomination of proctorrhea - what double standards! Br00pVain (talk) 14:24, 14 January 2022 (UTC)[reply]
Oh well, if you all think we should have proctorrhea on the Main Page, please say so now … ha, ha. — SGconlaw (talk) 14:34, 14 January 2022 (UTC)[reply]
Minstrel is definitely more acceptable. SGconlaw is the boss, mate. Hard cheese. DonnanZ (talk) 14:42, 14 January 2022 (UTC)[reply]
The word minstrel is not offensive in and of itself. Buidhe (talk) 11:40, 15 January 2022 (UTC)[reply]
IMO the blurb should have included a note of some sort in sense 2.2 that such shows are considered racist today; cf. the lede in the Wikipedia article minstrel show, which says "The minstrel show, also called minstrelsy, was an American form of racist entertainment developed in the early 19th century." Benwing2 (talk) 01:48, 16 January 2022 (UTC)[reply]
I suppose we could update the definition to reflect this in some way, though the OED does not. — SGconlaw (talk) 04:24, 16 January 2022 (UTC)[reply]
I went ahead and updated the definition and the image caption. — SGconlaw (talk) 05:34, 16 January 2022 (UTC)[reply]
Those who erase history are doomed to repeat it, ain't they. Equinox 17:31, 17 January 2022 (UTC)[reply]
What do people gain from all their laborious study of history? The sun rises and the sun sets, and hurries back to where it rises. What has been will be again, what has been done will be done again; there is nothing new under the sun. :(  --Lambiam 00:30, 18 January 2022 (UTC)[reply]

“Red” links shown black in inflection-tables

The use of class="inflection-table" in inflection tables has the effect that entries in the table with a wikilink that leads nowhere are initially not shown in red but in black, even though the link has action=edit&redlink=1. After following the link, it turns red. Is this intentional? I find it awkward.  --Lambiam 13:44, 15 January 2022 (UTC)[reply]

Why do you find it awkward? Also, having a lot of redlinks really looks bad in most inflection-table templates. Thadh (talk) 13:48, 15 January 2022 (UTC)[reply]
Interesting. I would like to see some examples, especially verb entries incorporating "one's", "someone's" etc. DonnanZ (talk) 14:08, 15 January 2022 (UTC)[reply]

"Adjectival noun" header in Japanese

(Notifying Eirikr, TAKASUGI Shinji, Atitarev, Fish bowl, Poketalker, Cnilep, Marlin Setia1, Huhu9001, 荒巻モロゾフ, 片割れ靴下, Onionbar, Shen233, Alves9, Cpt.Guapo): This is not a standard header but shows up in several pages, e.g. 大人気 (very popular), 誇大 (exaggeration), 小柄 (small build, short stature), 無礼 (impolite, rude). There is a Wikipedia page Adjectival noun (Japanese) that says Adjectival nouns constitute one of several Japanese word classes that can be considered equivalent to adjectives. The last three words link to the Japanese adjectives page, which describes these words as "na-adjectives". They are categorized under Category:Japanese adjectives. All of this makes me think the header should just read "Adjective". Even the two definitions above that are nouns seem suspect; 誇大 as shown in the cited examples seems better glossed as "exaggerated", and 小柄 could easily be glossed as "short of stature" or "of small build". Another point here is the following, from the Wikipedia article: In their attributive function, Japanese adjectival nouns function similarly to English noun adjuncts, as in "chicken soup" or "winter coat" – in these cases, the nouns "chicken" and "winter" modify the nouns "soup" and "coat", respectively. What is being described here is exactly equivalent to what we call "relational adjectives" in Slavic languages (as well as in Latin, Ancient Greek and most Romance languages). Such adjectives are often glossed as nouns in English, but that does not change their status in the source language. Similarly, Japanese したい is translated as "want to do" but is an adjective. Benwing2 (talk) 01:45, 16 January 2022 (UTC)[reply]

These seem to be remnants of an obsolete entry style. Japanese entries used to distinguish "adjective nouns" (-na) from "adjective" (-i), but no longer so now. -- Huhu9001 (talk) 03:03, 16 January 2022 (UTC)[reply]
Right, and I have discussed the issue before with Eirikr. https://en.wiktionary.org/wiki/User_talk:Eirikr/2020#%E9%87%8D%E7%AE%B1%E8%AA%AD%E3%81%BF Shen233 (talk) 18:41, 17 January 2022 (UTC)[reply]
I went ahead and renamed ==Adjectival noun== to ==Adjective== in all Japanese entries. Benwing2 (talk) 02:09, 18 January 2022 (UTC)[reply]
Thank you @Benwing2! As others have noted, this "adjectival noun" terminology was a stale holdover from past style.
One key difference between this class of Japanese adjectives and the English construction of attributive nouns is that a Japanese -na adjective cannot be used as a standalone noun -- it cannot be used as the patient or agent. Some of these words are also nouns, but not many of them. The majority that cannot be used as nouns as-is can be turned into nouns with the addition of the nominalizing suffixes (-sa, -ness, objective degree or amount) or (-mi, -ness, subjective experience).
Due to this distinct non-noun-ness of Japanese -na adjectives, I have cringed whenever I run across an English-language text describing these as "nouns". I am happy that you've cleared out the last of these stale headers.  :) ‑‑ Eiríkr Útlendi │Tala við mig 18:50, 18 January 2022 (UTC)[reply]

Hebrew conjugation accent

I think indicating accent somehow in the Hebrew conjugation tables would be useful, even if it is possible to figure it out without it. Dngweh2s (talk) 16:06, 17 January 2022 (UTC)[reply]

@Dngweh2s I agree, although doing so is non-trivial (to say the least) given the complexity of Hebrew verb conjugation. I am almost done writing a module to do something similar for Italian conjugation (which is currently missing marking of stress and vowel quality), and the module is over 3,000 lines of Lua code. Benwing2 (talk) 20:14, 17 January 2022 (UTC)[reply]
@Benwing2 It would also be useful for the conjugation table to say what subconjugation/pattern it is. This must be possible given that they are generated automatically. Dngweh2s (talk) 20:39, 17 January 2022 (UTC)[reply]

Lojban cleanup

Just FYI (not sure if anyone cares), I have done some work cleaning up Lojban lemmas. I think the main contributor to these lemmas is User:Jawitkien, although certain other users have added entries, e.g. User:DefinitionFanatic, User:Sarefo, User:Brantmeierz, maybe others. User:Kc kennylau did some bot work on these entries. None of these users are currently active AFAICT. The only thing potentially controversial that I did is to replace Lojban headers with English ones, consistent with our handling of other languages. I am not familiar with Lojban but from looking at the Wikipedia article, and from the fact that some headers are compounds of Lojban and English (e.g. ==Gismu | Root word==), I made the following substitutions:

  • Lujvo -> Predicate
  • Brivla -> Predicate
  • Rafsi -> Affix
  • Gismu -> Root
  • Cmavo -> Particle
  • Vlakra -> Etymology (Pronunciation in one case)
  • Noun -> Predicate (in one case), Particle (in one case)

Except for the Lujvo/Brivla merger, this doesn't lose information. In that case, there seemed to be no consistency in whether the terms "Lujvo" or "Brivla" were used as headers. Furthermore, apparently a "Lujvo" is just a "Brivla" that is also a compound word, and generally we don't make a header distinction between compound and non-compound words.
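For illustration only, the substitutions listed above could be sketched in Python roughly as follows. This is not the code that was actually run on the entries; the names `HEADER_MAP`, `replace_header` and `replace_headers` are hypothetical. "Vlakra" and "Noun" are deliberately left out of the mapping because the list says they were resolved case by case, and compound headers such as ==Gismu | Root word== are assumed to be keyed on the part before the bar.

```python
import re

# Header substitutions taken from the list above. "Vlakra" and "Noun" are
# omitted on purpose: they were handled case by case, not mechanically.
HEADER_MAP = {
    "Lujvo": "Predicate",
    "Brivla": "Predicate",
    "Rafsi": "Affix",
    "Gismu": "Root",
    "Cmavo": "Particle",
}

# A wikitext header line: a run of "=", a title, then the same run of "=".
HEADER_RE = re.compile(r"^(=+)\s*(.*?)\s*\1\s*$")


def replace_header(line: str) -> str:
    """Swap a Lojban header name for its English counterpart, if mapped."""
    match = HEADER_RE.match(line)
    if match is None:
        return line  # not a header line
    level, title = match.groups()
    # Assumption: compound headers like "Gismu | Root word" are identified
    # by the Lojban term before the bar.
    key = title.split("|")[0].strip()
    new_title = HEADER_MAP.get(key)
    if new_title is None:
        return line  # unmapped header, leave untouched
    return f"{level}{new_title}{level}"


def replace_headers(wikitext: str) -> str:
    """Apply replace_header to every line of a page's wikitext."""
    return "\n".join(replace_header(line) for line in wikitext.split("\n"))
```

Because both "Lujvo" and "Brivla" map to "Predicate", the mapping is not reversible for those two headers, which matches the one loss of information described above.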

Now, granted, "Predicate" isn't quite a standard header (although we do have "Predicative", which is a part of speech e.g. in Russian). But I think it's a lot easier to make sense of than "Lujvo" or "Brivla". I originally thought of using "Verb" but the definitions of these entities are usually more like nouns than verbs, so "Verb" seemed too confusing. Benwing2 (talk) 00:18, 18 January 2022 (UTC)[reply]

BTW if anyone thinks it's important to include the Lojban-language POS's in the entry, IMO the correct way to do that is in the headword. Since there are already headword templates like {{jbo-lujvo}}, this should be easy to do. Benwing2 (talk) 00:22, 18 January 2022 (UTC)[reply]
I thank you again! A previous discussion about Lojban POS headers went nowhere useful. ‑‑ Eiríkr Útlendi │Tala við mig 19:11, 18 January 2022 (UTC)[reply]

Talk to the Community Tech


Hello

We, the team working on the Community Wishlist Survey, would like to invite you to an online meeting with us. It will take place on 19 January (Wednesday), 18:00 UTC on Zoom, and will last an hour. This external system is not subject to the WMF Privacy Policy. Click here to join.

Agenda

  • Bring drafts of your proposals and talk to a member of the Community Tech Team about your questions on how to improve the proposal

Format

The meeting will not be recorded or streamed. Notes without attribution will be taken and published on Meta-Wiki. The presentation (all points in the agenda except for the questions and answers) will be given in English.

We can answer questions asked in English, French, Polish, Spanish, and German. If you would like to ask questions in advance, add them on the Community Wishlist Survey talk page or send to sgrabarczuk@wikimedia.org.

Natalia Rodriguez (the Community Tech manager) will be hosting this meeting.

Invitation link

We hope to see you! SGrabarczuk (WMF) (talk) 00:21, 18 January 2022 (UTC)[reply]

Nivkh should be split

Nivkh should be split into two languages. The Amur, West Sakhalin, and North Sakhalin varieties are indisputably distinct enough from the East Sakhalin, Central Sakhalin, and (extinct) South Sakhalin dialects to consider them separate languages. The two varieties could be called "Amur Nivkh" and "Sakhalin Nivkh", or "Nivkh" and "Nighvng" (which is "Nivkh" in Sakhalin Nivkh), though the latter is a rarely used distinction. Within these two, we (by which I mean mostly "I") can continue specifying finer dialectal distinctions, like Amur vs. North Sakhalin, and East vs. South Sakhalin. The distinction is often made, it's just often (but not always) made as a dialectal difference. This, however, is simply for the sake of simplicity and tradition.

See the following sources: This talk by Ekaterina Gruzdeva; Austerlitz 1985 ("South-Sakhalin Gilyak... vs. North-Sakhalin and Amur dialects"); Gruzdeva & Janhunen 2018 ("East Sakhalin Nivkh, also known as Nighvng, is spoken in eastern and central Sakhalin"); Gruzdeva 2016 ("...expressing epistemic modality in the Amur (A) and East Sakhalin (ES) dialects of Nivkh"); Gusev 2015 ("Two main dialects of Nivkh are Amur and East-Sakhalin (Krejnovich 1934: 182–183 and Shiraishi 2006: 10–11), cf. the scheme based on Shiraishi (2006: 11): Amur dialect group: Continental Amur, West-Sakhalin, North-Sakhalin; (East-)Sakhalin dialect group: East Sakhalin, Southeastern. Some authors treat poorly documented Nivkh idioms on North Sakhalin (Panfilov 1962: 3) and also on South(eastern) Sakhalin (Gruzdeva 1998: 7) as separate dialectal units"); Shiraishi & Botma 2016 ("In the Amur dialect (incl. West Sakhalin), accent falls..."); Shiraishi 2006 ("Nivkh has two dialect groups, the Amur dialect group and the Sakhalin dialect group. ... Within each group, there are numerous sub-dialects, some of which have not been described or documented to date. The best-described dialects are the dialects spoken on the lower reaches of the Amur River (Kreinovich 1934, 1937, Panfilov 1962, 1965, 1968, etc.) and the dialects on Sakhalin spoken in Nogliki (Gruzdeva 1998, etc.) and Poronaisk (Hattori 1955, 1962a,b, Austerlitz 1956, etc.)").

  • Austerlitz, Robert. 1985. Etymological Frustrations (Gilyak). International Journal of American Linguistics 51(4). 336–39.
  • Gruzdeva, Ekaterina. 2016. Epistemic modality and related categories in Nivkh. Studia Orientalia 117. 171–98.
  • Gruzdeva, Ekaterina & Juha Janhunen. 2018. Revitalization of Nivkh on Sakhalin. In Leanne Hinton, Leena Huss & Gerald Roche (eds.), The Routledge handbook of language revitalization, ch. 45. New York: Routledge.
  • Gusev, Valentin. 2015. Some parallels in grammar between Nivkh and Tungusic languages. Journal of the Center for Northern Humanities 8. 63–75.
  • Shiraishi, Hidetoshi. 2006. Topics in Nivkh phonology. University of Groningen.
  • Shiraishi, Hidetoshi & Bert Botma. 2016. Asymmetric distribution of vowels in Nivkh. Studia Orientalia 117. 39–46.

—⁠This unsigned comment was added by Dylanvt (talkcontribs).

Wikipedia agrees with you, although this does not mean we necessarily need to follow suit. Dialect distinctions could be handled with labels, which categorize appropriately, and the various varieties could be added as etymology-only languages, for example. Benwing2 (talk) 06:07, 18 January 2022 (UTC)[reply]

Subscribe to the This Month in Education newsletter - learn from others and share your stories

Dear community members,

Greetings from the EWOC Newsletter team and the education team at the Wikimedia Foundation. We are very excited to share that the Education Newsletter (This Month in Education) is in its tenth year, and we invite you to join us by subscribing to the newsletter on your talk page or by sharing your activities in upcoming newsletters. The Wikimedia Education newsletter is a monthly newsletter that collects articles written by community members using Wikimedia projects in education around the world, and it is published by the EWOC Newsletter team in collaboration with the Education team. These stories can bring you new ideas to try and valuable insights into the successes and challenges of our community members in running education programs in their context.

If your affiliate/language project is developing its own education initiatives, please remember to take advantage of this newsletter to share your stories with the wider movement that shares your passion for education. You can submit newsletter articles in your own language or submit bilingual articles for the education newsletter. For the month of January, the deadline to submit articles is 20 January. We look forward to reading your stories.

Older versions of this newsletter can be found in the complete archive.

More information about the newsletter can be found at Education/Newsletter/About.

For more information, please contact spatnaik(a)wikimedia.org.


About This Month in Education · Subscribe/Unsubscribe · Global message delivery · For the team: ZI Jony (Talk), Thursday 11:28, 20 January 2022 (UTC)[reply]

Suggestion: Allowing soft-redirected entries of vocalized words of languages spelt in an abjad

At present, searching for a vocalized word such as the Persian "مِسواک" ("toothbrush"), written with the kasra, returns search results. I suggest allowing entries for such vocalized words that are soft-redirected to the main entry (in this case مسواک#Persian), in the same manner as Japanese alternative spellings (example: はブラシ, which is soft-redirected to 歯ブラシ). This applies to all languages spelt in an abjad, where short vowels are not usually spelt out, such as Arabic and Hebrew. This would make Wiktionary more convenient for language learners. Jonashtand (talk) 16:20, 18 January 2022 (UTC)[reply]

Undo informal/colloquial merge

Following earlier discussion over at User talk:Surjection#Colloquial and informal, this topic seeks to reverse the decision to merge "informal" and "colloquial" terms in Wiktionary:Requests for moves, mergers and splits#Category:Colloquialisms by language and Category:Informal terms by language for the simple reason that a distinction between "informal" and "colloquial" (spoken language) does exist in many languages, including Finnish and Welsh, and merging the two is counterproductive. Informal terms may "fly" in higher registers while colloquialisms will not and strictly belong to a lower, more vernacular, register. — SURJECTION / T / C / L / 15:41, 18 January 2022 (UTC)[reply]

Support keeping them separate for the reasons given by Surjection. I think the question "Could this appear (without quotation marks) in the running text of a newspaper?" is a decent (but not definitive) test to discriminate between colloquial and informal. To give an example in another language, I could very well see German den Stecker ziehen being used in the main body of a newspaper article but not Dingsda or behindert (2). — Fytcha T | L | C 〉 16:04, 18 January 2022 (UTC)[reply]
Even in English there is a distinction. Collins COBUILD (1995) gives a full explanation for its labels and specifically makes it quite clear by reducing the number of syllables in a key label from 4 ("colloquial") to 2 ("spoken"). The spoken label indicates "used mainly in speech rather than in writing: e.g. school kids, whoops". In contrast, the written label indicates "used mainly in writing rather than in speech: e.g. animus, bespectacled,". The informal label indicates "used mainly in informal situations, conversations, and personal letters: e.g. decaf, elbow room." In contrast, the formal label indicates "used mainly in official situations, or by political and business organizations, or when speaking or writing to people in authority: e.g. belated, demonstrable." In COBUILD most terms do not have any of these labels. Also they have many specialized labels including legal, literary, technical, journalism, medical, which are narrower than the previously mentioned labels, though sometimes overlapping or marking a subset of them (eg, legal).
Sadly, we lack the full range of the COBUILD corpus and, more importantly the annotation and software they have to support the labels. We can, however, make do with what we have. DCDuring (talk) 16:19, 18 January 2022 (UTC)[reply]
I struggle to perceive much difference between "school kids" and "elbow room", although my perspective as an Australian might colour this a little. This, that and the other (talk) 01:12, 19 January 2022 (UTC)[reply]
Their database was UK based and the quoted material is from a 1995 print edition. I also would put them both in informal. In our ever-more-democratic times it may be that informal and colloquial speech can be used in what were once formal settings. I would have once called "Fuck you" colloquial (as well as derogatory); now I'm not so sure. DCDuring (talk) 01:51, 19 January 2022 (UTC)[reply]
Support, it seems weird we even merged them in the first place. Vininn126 (talk) 16:31, 18 January 2022 (UTC)[reply]
Strongly Support. Our Finnish entries and the sources they draw upon distinguish two informal registers, and I'm sure Finnish is not the only lect suffering from this decision. The issue of merging the two categories is something that should be tackled separately for each language. If the Anglophone editors decide to support the merger for English terms, I won't oppose, but their language-specific decision should not damage the other languages within this project. As such, the site-wide merger should be reverted and further discussion moved to pages like Category talk:English informal terms. brittletheories (talk) 17:23, 18 January 2022 (UTC)[reply]
Support. This is part of why I maintain that decisions like those should keep all language communities in mind, not just the ones with more editors. It's frustrating to see that happen over and over again. AG202 (talk) 17:28, 18 January 2022 (UTC)[reply]
Oppose for English, at least. I think the distinction is too fine, and there isn't a convenient way for editors to analyse whether a term is more commonly used in speech or writing. We will just end up with people slapping on either label pretty much randomly. (If there is consensus for restoring "colloquial", then a very clear explanation of when each label is used must be added at "Appendix:Glossary".) — SGconlaw (talk) 18:08, 18 January 2022 (UTC)[reply]
English to my knowledge does not exhibit a similar form of diglossia, but that's not a reason to merge the two for every other language as well. — SURJECTION / T / C / L / 18:55, 18 January 2022 (UTC)[reply]
If use of a term is almost exclusively found in Google Books and News in dialog or quoted speech (or the sports section?) I think that is good evidence that it is "spoken"/"colloquial". We probably have more trouble finding empirical support for the formal label. In any event I don't see why we couldn't analyze what lemmings say. DCDuring (talk) 00:50, 19 January 2022 (UTC)[reply]
Strongly Oppose a blanket unmerger. I was the one who merged them, based on the fact that for a large number of languages, the distinction is unclear and the terms were (and still are) being used promiscuously. By unmerging them, we'll end up in the same situation as before, where the largest languages will have an artificial distinction made between "informal" and "colloquial" that really doesn't mean anything. Rather than just unmerging, we need a different solution. One possibility is to make the unmerger conditional only in specific languages, but this requires hacking in the label and category code. Another possibility is to come up with a different term for either the "informal" or "colloquial" register. If the idea is that "colloquial" is lower-register than "informal", I'd propose something like "vernacular" in place of "colloquial"/"informal". Russian has a maybe-similar distinction, termed просторе́чный (prostoréčnyj), which we translate as "low colloquial" but "vernacular" (or "popular") would probably work as well. Either way I would prefer seeing terms that are consistent across languages. The nice thing about "vernacular" or "popular" is that neither term is really used to describe English (except AFAIK in the context of AAVE, which is something different altogether), so we are free to define the terms for use with other languages. Benwing2 (talk) 01:47, 19 January 2022 (UTC)[reply]
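As a rough illustration of the "conditional unmerger" option mentioned above, a language-dependent label-to-category lookup might look like the Python sketch below. This is not Wiktionary's actual label module (which is written in Lua); the function name `category_for`, the set `SPLIT_LANGUAGES`, and the category name patterns are all assumptions made for the example, and the choice of languages simply reflects the ones named in this thread.

```python
# Hypothetical: languages whose editors want "colloquial" kept as a
# separate, lower register. Russian is included only as a "maybe-similar"
# case raised in the discussion.
SPLIT_LANGUAGES = {"Finnish", "Russian", "Welsh"}

# Assumed category name patterns, for illustration only.
CATEGORY_PATTERNS = {
    "informal": "{lang} informal terms",
    "colloquial": "{lang} colloquialisms",
}


def category_for(label: str, language: str) -> str:
    """Return the category a register label would sort an entry into."""
    if label not in CATEGORY_PATTERNS:
        raise ValueError(f"unknown register label: {label!r}")
    # In languages without the split, "colloquial" falls back to the
    # merged "informal" category.
    if label == "colloquial" and language not in SPLIT_LANGUAGES:
        label = "informal"
    return CATEGORY_PATTERNS[label].format(lang=language)
```

Under these assumptions, `category_for("colloquial", "Finnish")` would keep a separate category, while `category_for("colloquial", "English")` would land in the same category as "informal". The cost is the per-language special-casing that the comment above describes as a hack.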
@Benwing2 "Either way I would prefer seeing terms that are consistent across languages." As much as we can try, I don't think that there's a solution for labels regarding formality usage across all languages. Some languages have up to a 5-or-more way distinction with politeness/formality when it comes to vocabulary, and those should be shown with our labels (ex: French has up to six registers depending on whom you ask, and currently the registers familier and populaire, even some from jargon and argot for some reason, are all grouped together "informal French", which is borderline inaccurate). Also, just because we have the labels doesn't mean that every language has to use them, while at the moment no one can use them properly. Stuff like that should be left up to the communities of editors rather than a blanket solution that stops everyone. AG202 (talk) 02:14, 19 January 2022 (UTC)[reply]
@AG202 Whatever terms we use, I oppose splitting informal and colloquial the way it is proposed, because time has shown people cannot use these terms consistently. "Familiar", "popular" and "vernacular" are all reasonable ways of showing a register that is considered inappropriate for written language and somewhat nonstandard, if that is what is going on in Finnish (as in Russian). Benwing2 (talk) 02:34, 19 January 2022 (UTC)[reply]
@Benwing2 Time has shown that English contributors can't use the terms right. As such, the solution should be to merge the English categories. And sure, there could be more languages like English that barely experience diglossia, and a merger is needed with those too. Still, editorial freedom should be the default. brittletheories (talk) 07:21, 19 January 2022 (UTC)[reply]
This is not a problem just for Finnish. The entire reason for the merge seems to boil down to "English editors cannot use these consistently so no language can use them". If this discussion dies down, you can be assured that I will simply undo the blanket merge and let editors for each language decide how to use the labels, because this merge, as it was carried out, was destructive, plain and simple. — SURJECTION / T / C / L / 10:09, 19 January 2022 (UTC)[reply]
The distinction clearly does mean something to some lexicographers. I quoted Collins COBUILD above. Note that they use spoken instead of colloquial (which has two definitions in its entry here). A large number of "interjections" would probably deserve the "spoken" label. DCDuring (talk) 03:08, 19 January 2022 (UTC)[reply]
"vernacular" is an ambiguous term, and there is a risk of confusion; what is being meant is our def 2, but people could likewise think it is supposed to mean our def 3. As someone else already pointed out, I'd argue there's less consistency between the uses of formal and literary, which no one has proposed merging, than there was between informal and colloquial, a split which made complete sense in some languages and discarding of which has been a terrible mistake. — SURJECTION / T / C / L / 09:45, 19 January 2022 (UTC)[reply]
@Surjection I know you feel strongly about this but please do *NOT* simply take unilateral action once "this discussion dies down". That in itself would be very destructive unless there is consensus, and I may well undo you on these grounds. I have proposed some alternatives to address your concerns. You seem to have rejected them out of hand in your zeal to implement your preferred solution, but let me repeat them. Specifically: (1) My preferred solution is to adopt other terminology for the lower-register distinction. Any of "spoken" (as User:DCDuring mentions), "vernacular", or "popular" would do as well, possibly also "familiar". The problem with forcing an artificial distinction between "informal" and "colloquial" is exactly that it is artificial: these two terms are essentially synonymous in English, and we are the *ENGLISH* Wiktionary, so we need to choose terms that correspond to the way that speakers of English use them. I gather that in languages like Finnish and Russian where this register distinction exists, it is between an informal register that is acceptable in some writing, and a lower register that is allowed only in speech (or in quoted dialogue) and is considered in some sense outside the standard language. To me, both "popular" and "vernacular" connote this sort of register quite well, whereas "colloquial" does not at all, considering its normal usage in English. (2) Another possible solution is to hack the label code that handles the labels "colloquial" and "informal" and make them categorize differently in certain languages (Finnish, Russian, Welsh, ...), but the same in other languages. That is technically doable but ugly in a way that I'd very much disprefer having.
Let me ask you again: Why are you so wedded to the specific terms "informal" and "colloquial"? Please take it from a native speaker that these terms do not have a clear enough distinction in English. Benwing2 (talk) 05:53, 20 January 2022 (UTC)[reply]
I'm not. It's just that better alternatives simply do not appear to exist. "familiar" is already taken for something else. "vernacular" is ambiguous, as it is also used to refer to a particular group's vernacular, i.e. slang, jargon or any other kind of idiolect. "spoken" is bad because it implies these terms cannot be written down, which is nonsensical because how else would we document them? "colloquial" is the only term I've seen used to describe this kind of register in an English dictionary of any kind. DCDuring's messages seem to suggest that yes, English dictionaries do indeed use these two terms and maintain a distinction between them. If you have a better proposal, I'm all ears - but all of the "alternatives" suggested thus far are simply worse. — SURJECTION / T / C / L / 10:44, 20 January 2022 (UTC)[reply]
Oppose per Benwing2, Sgconlaw. —Svārtava [tcur] 04:40, 19 January 2022 (UTC)[reply]
Support per Surjection. --Rishabhbhat (talk) 07:25, 19 January 2022 (UTC)[reply]
Oppose per Surjection. For the alleged distinction is exactly how I would see the distinction but reversely: Colloquial terms may “fly” in higher registers while informalisms will not and strictly belong to a lower, more vernacular, register. 👏 Therefore I have a few times even labelled “formal colloquial” or intended to do so for colloquial technical terms of jurists, e.g. abheben that they may write in the most formal contexts; it is not informal because it fits the form, and it is not formal because it does not make speech formal, and it will not be believed to be jargon as in “police jargon” (will jurists ever admit to using jargon?). You have also in Finnish “law, colloquial”, todistus. English: dissental. If you can’t lastingly make it clear why one label is wrong but not the other, this shows again that the distinction doesn’t exist or is artificial, the same way the non-existence and artificiality of gods is shown by believers varying their understanding. Can’t pinpoint a signification without ambiguity, then don’t make points from the exact distinction which is under dispute.
The vague uniaxial definition, rejuggling the alleged distribution of the terms’ “informal” and “colloquial” usage between a mere high and low, playing with connotations of other underdefined words like “casual” or “common” has no reliability to satisfy.
I originally suggested to keep the labellings intact to not manipulate the usage as well as make the decision adaptable if a definition comes to light but to merge the categories by reason that without contexts from a bird's-eye view there is no sense in the distinction even if you feel it well enough to distinguish in individual cases of labelling senses or terms. A gloss can be a feeling, a category cannot. Fay Freak (talk) 02:18, 20 January 2022 (UTC)[reply]
@Fay Freak: I think you meant to support Surjection's proposal, not oppose it. Thadh (talk) 02:25, 20 January 2022 (UTC)[reply]
@Thadh: No, I come to the opposite conclusion by swapping the terms within her premises, which then afford an equally plausible distinction, “disproving Surjection per Surjection”. If a distinction and its opposite are both true then it makes no sense. Fay Freak (talk) 03:01, 20 January 2022 (UTC)[reply]
All you did was flip my two terms, which does nothing to disprove the fact that there is a distinction worth documenting here. The rest of the message is a mix of disingenuous argumentation and pointing out individual mistakes in entries (todistus should be informal, not colloquial). Clearly if individual entries get the two mixed up then no distinction may exist, except for the fact that when "colloquial" briefly even displayed as "informal", it instantly made thousands of Finnish entries sound incorrect and misleading. — SURJECTION / T / C / L / 10:44, 20 January 2022 (UTC)[reply]
I think Benwing makes the salient point that even though there are languages with more than one non-formal register, using the English terms "informal" and "colloquial" to denote this is attempting a distinction the English words don't easily make (looking at how various dictionaries define them, they frequently define colloquial as informal). I also question whether Finnish, Welsh, Russian, etc have the same two non-formal registers as each other, and I think the fact that even people who perceive a distinction in the English terms don't agree on what it is—besides being the reason the labels were merged (because it meant whether an entry was categorized as one or the other was haphazard)—calls into question whether applying those English terms to the other languages is the best approach, as opposed to using more language-specific terminology. But I certainly agree we shouldn't gloss over distinctions other languages make just because English doesn't make the same distinctions. I do think, in general, it'd be helpful if there were a way to tell the module to interpret a label differently for some languages, so that e.g. "Doric" can be interpreted differently if language = Greek vs if language = Scots, and likewise for other cases where we've been frustrated by needing the same label-term for different things in different languages. The same functionality could make "colloquial" or "vernacular" categorize as "informal" if lang = en (since people definitely will use {{lb|en|vernacular}} just as haphazardly), but make it categorize separately if lang = fi. (Obvious question is would this use a lot of Lua memory?) - -sche (discuss) 07:01, 20 January 2022 (UTC)[reply]
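The per-language label handling suggested above could look roughly like the following. This is only a minimal Python sketch, not the actual label module (which is written in Lua); the table names, the function `canonical_label`, and the choice of "fi" as the only override are all hypothetical and serve purely as illustration.

```python
# Hypothetical sketch of per-language label handling.
# None of these names come from Wiktionary's real module code.

# Default behaviour: labels that editors use interchangeably collapse
# to a single canonical label (and hence a single category).
DEFAULT_CANONICAL = {
    "informal": "informal",
    "colloquial": "informal",
    "vernacular": "informal",
}

# Per-language overrides: languages whose editors maintain the distinction
# keep the label separate instead of merging it.
LANGUAGE_OVERRIDES = {
    "fi": {"colloquial": "colloquial"},
}


def canonical_label(lang: str, label: str) -> str:
    """Return the label used for categorization in the given language."""
    overrides = LANGUAGE_OVERRIDES.get(lang, {})
    if label in overrides:
        return overrides[label]
    # Fall back to the shared table; unknown labels pass through unchanged.
    return DEFAULT_CANONICAL.get(label, label)
```

Under these assumptions, `canonical_label("en", "colloquial")` yields "informal", while `canonical_label("fi", "colloquial")` stays "colloquial". The same lookup shape would also cover a label like "Doric" meaning different things for Greek and Scots; whether such tables would fit within Lua memory limits is a separate question that this sketch does not address.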

Should we have entries for Turkish predicative forms?

A brief lesson in Turkish grammar. As is well known, Turkish is a very synthetic and highly agglutinative language. Pinker gives the example şehir +‎ -li +‎ -leş +‎ -tir +‎ -eme +‎ -dik +‎ -ler +‎ -imiz +‎ -den +‎ -siniz.[7] While probably constructed for the purpose of illustration, this word is not unnatural. It is what one would expect a Turkish speaker to utter when saying, in Turkish, “you are one of those whom we can’t turn into a town dweller”. The message may be contrived; its expression as a sentence is not. Clearly, we should not attempt to list all possible Turkish words that can be synthesized. Listing those that can be attested would result in a completely haphazard collection.

Turkish has no copular verb like the English be. Any noun phrase or adjectival phrase can be used as a predicate, and then assumes a predicative form (which can be the same as the bare predicate: “makine bozuk” is an acceptable complete sentence for saying “the machine is out of order”). Enclitics added to the predicate indicate the person, but the third person allows a null form. In the film İmparatorluk Geri Dönüyor, the Turkish version of The Empire Strikes Back, Darth Vader says, “Hayır, ben senin babanım” – “No, I am your father”. Here, babanım is the first-person predicative form of baban (“your father”).

We do not have an entry for baba +‎ -n +‎ -ım = “father” + “of you” + “I am”, nor should we; the possible cases are endless, like ben dünyanın en kötü babasıyım - “I am the world's worst father”, which should be parsed as [ben dünyanın en kötü babası ] + -(y)ım, not as [ben dünyanın en kötü] [babasıyım]. Pinker’s şehirlileştiremediklerimizdensiniz is a predicative form. The question now is:

Should we have entries for Turkish predicative forms, and if so, which ones (and why those)?

(The question arose at RfD. For comparison, for Italian we do not list bevimi = bevi +‎ -mi and mangiami = mangia +‎ -mi (and countless other similar imperative + enclitic forms), even though attestable.[8] We do list some, though, like amami, but not amalo,[9] so the selection appears to be haphazard.)  --Lambiam 13:17, 19 January 2022 (UTC)[reply]

Rhymes and hyphenation in affix entries

What are your thoughts about providing rhymes and hyphenations at affix entries (especially prefixes and infixes)? Thadh (talk) 15:47, 19 January 2022 (UTC)[reply]

Anything other than a final suffix that always takes the accent is incompatible with our system. That said, hemidemisemiquaver comes to mind as an example of a word with rhyming prefixes. Chuck Entz (talk) 15:56, 19 January 2022 (UTC)[reply]
What if someone wants to write poetry about prefixes, treating them as if they were words in and of themselves? A bit esoteric I know, but plausible. Vininn126 (talk) 16:07, 19 January 2022 (UTC)[reply]
I’m sure if a poet wanted to do that they wouldn’t need Wiktionary’s help… — SGconlaw (talk) 11:28, 20 January 2022 (UTC)[reply]