Wiktionary:Beer parlour/2022/June: difference between revisions

From Wiktionary, the free dictionary
:::@[[User:Atitarev|Atitarev]] This can easily lead to clutter on single-letter entries. Why do we distinguish between Arabic-script languages and Latin (or Cyrillic, etc.) script ones in this regard, when both scripts make use of orthographic spaces?—[[User:Tibidibi|Tibidibi]] ([[User talk:Tibidibi|talk]]) 09:20, 14 June 2022 (UTC)
::::{{re|Tibidibi}} I understand your idea better now. It may work. It might be difficult to engage all editors for all Arabic script-based languages, though. Perhaps, focusing on one, such as Persian? [[User:Atitarev|Anatoli T.]] <sup>([[User talk:Atitarev|обсудить]]</sup>/<sup>[[Special:Contributions/Atitarev|вклад]])</sup> 23:12, 14 June 2022 (UTC)
:::::{{wgping|fa|u1=Ariamihr|u2=Dijan|u3=Mazsch|u4=Qehath|u5=ZxxZxxZ}}
:::::If there aren’t any responses by this time next week, I will move the relevant entries to the tatwil-ed form.
:::::@[[User:Benwing2|Benwing2]] If this passes, can you modify the relevant code so that {{tl|af|fa|آزاد|ـی}} links to the tatwil-ed form?—[[User:Tibidibi|Tibidibi]] ([[User talk:Tibidibi|talk]]) 09:42, 20 June 2022 (UTC)


== Can we standardize morphophonemic/phonemic/phonetic/etc conventions for Middle Korean, Modern Standard Korean and Jeju? ==

Revision as of 09:42, 20 June 2022


The issue of Large Entries

We have a dilemma and a balance to strike. As an online dictionary, we are able to include much more information than a paper one, information that, no doubt, is useful to some number of readers. However, the vast majority of readers are looking for definitions. When they are presented with a massive wall of text, it might be easy to get lost and not be able to find the information they are looking for. I believe we should make more things collapsible as a solution. Any level of header (L2-L4) should be collapsible entirely, and L2's are already collapsible on mobile. I think the default should be that L2's are open, and so is the L3 with the headword and definition. I also wonder if we should follow the collapsible nyms change and make collocations and usex's collapsible WITHIN the definition line. This is also in reaction to the [https://en.wiktionary.org/wiki/Wiktionary:Votes/2021-08/Prioritizing_definitions recent failed vote] to prioritize definitions and move etymologies to the bottom. We could still keep etys at the top, but they could remain hidden. I am sure there might be some downsides I am overlooking, so I am very interested in feedback. However, again, the upside would be this would enable us to include yet more information valued by some readers without overloading others. Vininn126 (talk) 10:07, 1 June 2022 (UTC)[reply]

P.S. Polling a VERY small number of people (this is honestly probably an issue we should somehow ask the readership about), people seem to prefer having the information just there, even if it's a lot. Granted, this was a tiny pool of people, we should ask more. Vininn126 (talk) 15:58, 1 June 2022 (UTC)[reply]
I don’t know, if descendants lists and reference sections or even windy etymologies are unusually large they are already collapsed. Don’t wanna click all the time for any particular kind of section. In general, I also recommend having a large screen true to colours (not under 222 €), if not two or three, so you don’t perceive to get into this predicament. Polish mice are also very good to scroll large pages like bar. Fay Freak (talk) 20:12, 2 June 2022 (UTC)[reply]
It's less for me and more for readers. I have large screen and ways of navigating, but not everyone. Maybe we could include in the settings what is automatically collapsed or not. But again, it'd be something that we should ask readers about. Vininn126 (talk) 20:22, 2 June 2022 (UTC)[reply]
Can you give an example of an entry that badly needs it and can't be reduced any other way? I feel like a totally collapsed etymology section would look quite odd, so I would like to try some possible layouts first. - Sarilho1 (talk) 12:49, 3 June 2022 (UTC)[reply]
Some of the larger English entries come to mind. Or pages with many L2's. go is very annoying to navigate, even with good tools turned on. Polish behawiorysta is a page with very little semantic information, thus all the other information is obscuring it. Vininn126 (talk) 12:53, 3 June 2022 (UTC)[reply]
Thank you. Though frankly, I think the main problem with the two examples are the Pronunciation sections, in particular the audios. - Sarilho1 (talk) 14:42, 3 June 2022 (UTC)[reply]
So I reckon. They should be collapsed; this becomes more pressing if we include more dialectal pronunciations as we should for English. Using three lines for audio files in the same language is unwarranted in almost any case. Fay Freak (talk) 15:11, 3 June 2022 (UTC)[reply]
Well, imagine a similar situation with any entry. A person might be interested in only one thing, and having a ton of other stuff in the way might be a problem. So having them optionally collapsible might be convenient. Vininn126 (talk) 15:17, 3 June 2022 (UTC)[reply]
psst. — SURJECTION / T / C / L / 22:32, 3 June 2022 (UTC)[reply]
That's exactly the sort of thing I'm talking about. I just wonder if there'd be value in making this a togglable option for everyone. I also wonder if it'd be possible to get it to work for further reading and references. Finally, this would require it not being a json, but the options to say what's automatically collapsed or not. Vininn126 (talk) 09:29, 4 June 2022 (UTC)[reply]

Different categories for older vs newer borrowings from Latin, and use of Vulgar Latin in etymology if lemmas are absent

So is it policy not to include a language in an etymology if you don't have the actual lemma for the term in the etymology? For example, if you know a Portuguese, French or Spanish term comes through Old Portuguese, Old French, Old Spanish but you don't actually have the word in the 'Old' language itself. Are we supposed to just skip it and go straight to Latin until we have it (or create a reconstruction, which I'm not too keen on)? Or in languages like Welsh or Albanian, which absorbed a lot from spoken Latin during Roman rule... are we not supposed to put borrowed from Vulgar Latin if we don't have that precise term? I noticed some of the ones I did that way were reverted. Most of the proto-Brythonic terms that came from Latin were likely taken from Vulgar Latin or Ecclesiastical in some cases (like ciwdod), but this was an organic process and not later medieval scholarly borrowings (those can be differentiated). How do we handle this? I think it's important to distinguish, through category, terms in these languages borrowed in ancient times in this organic fashion versus later scholarly borrowings. That's why I did things the way I did. Should I not do that? Word dewd544 (talk) 21:21, 5 June 2022 (UTC)[reply]

For inherited Romance vocabulary, I don't see how saying, for instance, 'from Old Portuguese' is beneficial. If we don't know the medieval form, we can simply say 'inherited from Latin', which implies the existence of a medieval form.
For Latin words that were, say, borrowed into Old Portuguese and then passed into the modern language, the best way to distinguish them would be to create Old Portuguese entries for them, so that they get categorized under 'Old Portuguese terms borrowed from Latin'. If on the other hand we don't know the Old Portuguese forms in the first place, are we really justified in positing medieval-era borrowings?
For languages such as Welsh, Albanian, etc. I have fiddled around, incidentally, with distinguishing ancient borrowings from modern ones in descendants sections, as in sagitta, contrarius, and Latinus. Nicodene (talk) 22:12, 5 June 2022 (UTC)[reply]
@User:Nicodene I see what you did in the descendants sections and I do like that. But I also meant how to distinguish them in the lemma entries themselves and through potentially different categories. That seems like a trickier endeavor. Like if someone was interested in seeing a list of words in Welsh that specifically came from Latin via Proto-Brythonic vs. a list of later Latin borrowings. Word dewd544 (talk) 14:19, 13 June 2022 (UTC)[reply]
@Word dewd544 I like your approach of adding {{bor|x|VL.|-}}. If I'm not mistaken, it already accomplishes all that we need it to. Is anyone against this specific usage? Nicodene (talk) 19:29, 13 June 2022 (UTC)[reply]
@User:Nicodene Some of the folks here who were against the general policy of not including VL. unless we have an explicit lemma or reconstruction in the etymology I guess. Also I believe @User:Mahagaja reverted most of the ones I did like that for Welsh. I can see where they're coming from but I'm still trying to figure out the best way to handle this. Also, Welsh is special and a bit different from say, Irish, in that it was a language that developed directly under the rule of the Roman empire (as proto-Brythonic), from the Romano-Britons. While Ireland was never ruled by Latin speakers and only got some of its words later, mostly via the church and missionaries. There was a British Latin vernacular apparently, and this is where the Welsh would have incorporated this vocabulary from. That was probably more likely spoken by the elites, who eventually abandoned it as Old English pushed them west and they lost contact with other speakers. In a parallel way, Albanian (probably) borrowed much of their Latin from the local emerging Eastern Romance, although that is contentious because it shows some features closer to Italo-Dalmatian at times rather than Romanian. Word dewd544 (talk) 23:55, 13 June 2022 (UTC)[reply]
Perhaps your reason for doing so did not occur to Mahagaja. Your approach has my vote. Nicodene (talk) 00:08, 14 June 2022 (UTC)[reply]
@User:Word dewd544 It’s possible to do exactly what you’re asking by using a hyphen in place of the lemma. For example: {{der|en|la|-}} renders simply as Latin, but still categorises things correctly. I have never encountered this being a problem (though I don’t know if some people have an issue with it). Theknightwho (talk) 21:29, 9 June 2022 (UTC)[reply]
I disagree and think that we ought to always provide all intermediate steps and remove some intermediate language codes if those languages aren't different enough from their mother or daughter language (which is why I think some codes, like Proto-Nuclear Polynesian, should be removed as too specific; or, alternatively, more pages could be created).
Anyway, as long as we don't have these intermediate steps we're incomplete in my opinion, and I would be happy to see some, in particular Polish and Russian, editors structurally introducing Old Polish and OES to their etymologies. Thadh (talk) 22:27, 9 June 2022 (UTC)[reply]
I do know about using the hyphen in place of a lemma and have done that many times. But as a matter of policy, some people seem to be against that unless we have an explicit lemma in the etymology. I was just thinking it could be useful in terms of categorization regardless. Word dewd544 (talk) 14:19, 13 June 2022 (UTC)[reply]

I created {{uder}} (= "unknown derivation"), which takes the same parameters as {{der}}, {{bor}} and {{inh}} but adds the page to a tracking category, just like for {{etyl}}. It is intended to replace {{etyl}}, to reduce the temptation of certain editors to mechanistically convert {{etyl}} to {{der}}. In fact I went ahead and bot-converted uses of {{etyl}} in French to use {{uder}}, according to the following procedure (note that {{m}} below actually stands for any of {{m}}, {{l}}, {{mention}} or {{link}}):

  1. {{etyl|DEST|SOURCE}} {{m|DEST|...}} becomes {{uder|SOURCE|DEST|...}}.
  2. {{etyl|DEST1|SOURCE}} {{m|DEST2|...}} becomes {{uder|SOURCE|DEST1|...}} if DEST1 is an etymology language that's either a child of DEST2 or an alias of DEST2.
  3. Remaining {{etyl|DEST1|SOURCE}} {{m|DEST2|...}} (i.e. with mismatched destination languages) are left alone, but my bot issues a warning when it runs.
  4. Remaining {{etyl|DEST|SOURCE}} not occurring before {{m|...}} are converted to {{uder|SOURCE|DEST|-}}.
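The four steps above can be sketched roughly as follows. This is not the bot's actual code, only a minimal regex-based illustration of the stated rules; the function name `convert_etyl`, the `ETYM_PARENT` table and its two entries are hypothetical stand-ins for the real language data, and nested templates inside the link are deliberately not handled.

```python
import re

# Hypothetical stand-in for the real language data: maps an etymology
# language code to the full language it is a child or alias of.
ETYM_PARENT = {"VL.": "la", "LL.": "la"}

LINK = r"(?:m|l|mention|link)"

# {{etyl|DEST1|SOURCE}} immediately followed by {{m|DEST2|...}}
ETYL_THEN_LINK = re.compile(
    r"\{\{etyl\|([^|{}]+)\|([^|{}]+)\}\}\s*"
    r"\{\{" + LINK + r"\|([^|{}]+)\|([^{}]*)\}\}"
)

# {{etyl|DEST|SOURCE}} that is NOT followed by a link template
BARE_ETYL = re.compile(
    r"\{\{etyl\|([^|{}]+)\|([^|{}]+)\}\}(?!\s*\{\{" + LINK + r"\|)"
)


def convert_etyl(text):
    """Return (converted_text, warnings) following steps 1-4."""
    warnings = []

    def with_link(match):
        dest1, source, dest2, rest = match.groups()
        # Step 1: same language. Step 2: DEST1 is a child/alias of DEST2.
        if dest1 == dest2 or ETYM_PARENT.get(dest1) == dest2:
            return "{{uder|" + source + "|" + dest1 + "|" + rest + "}}"
        # Step 3: mismatched languages are left alone, with a warning.
        warnings.append("mismatched languages: " + match.group(0))
        return match.group(0)

    text = ETYL_THEN_LINK.sub(with_link, text)
    # Step 4: remaining {{etyl}} with no following link gets a "-" term.
    text = BARE_ETYL.sub(
        lambda m: "{{uder|" + m.group(2) + "|" + m.group(1) + "|-}}", text
    )
    return text, warnings
```

For example, `convert_etyl("From {{etyl|la|fr}} {{m|la|aqua}}.")` yields `From {{uder|fr|la|aqua}}.` with no warnings, while a mismatched pair such as `{{etyl|grc|fr}} {{m|la|x}}` is returned unchanged together with one warning.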

I am going to convert the remaining occurrences of {{etyl}} in other languages according to the same procedure unless someone objects. Thanks to User:Svartava for the original prodding to do this (granted, that was more than a year ago ...). Benwing2 (talk) 01:41, 6 June 2022 (UTC)[reply]

The main advantage to this approach is that it makes it possible to permanently retire {{etyl}} for all languages without losing any information. The main caveat is that it may just move the problem to a new template, with the same tendency for people to copy from other entries that have the wrong template. At least it will stop the people who have been using {{etyl}} because that's what they've always used. Chuck Entz (talk) 01:59, 6 June 2022 (UTC)[reply]
@Chuck Entz My other thought is that people who are abusing {{der}} as a catch-all because they don't want to be bothered to figure out inheritance vs. borrowing can be persuaded to use {{uder}}, which at least makes it explicit that the term needs reinvestigation. There are a lot of existing uses of {{der}} put there by users who either don't understand the {{der}} vs. {{bor}} vs. {{inh}} distinction or (just as often) know better but just can't be bothered. Benwing2 (talk) 02:24, 6 June 2022 (UTC)[reply]
@Benwing2, Chuck Entz: Thanks for creating it, @Benwing2. I don't think it's best to make it a copy of {{etyl}}, but ideas for a derivation template when the editor doesn't know what type of specific derivation it is, have been proposed before at BP discussions (see also Template:der?). I think it would be useful to have this kept even after etyl-cleanup is done. I created the tracking cat LANG undefined derivations, e.g. Category:French undefined derivations for this template. If you think categories like CAT/{{{1}}}/{{{2}}} are needed, feel free to restore them but I didn't think they're needed especially for some particular languages since they're findable from MediaWiki search. —Svārtava (t/u) • 04:01, 6 June 2022 (UTC)[reply]
@Svartava I think it might be best to have it categorize both into the etyl-cleanup categories and the one you created; eventually we can remove the former. Benwing2 (talk) 04:24, 6 June 2022 (UTC)[reply]
@Benwing2: Taking a look at the source, the special categories with both source and destination lang codes such as CAT:etyl cleanup/en/fr were only for English, so I restored that bit. I didn't restore Category:etyl cleanup no target since {{uder}} would generate a module error in case there is no lang code and it'd be picked up from CAT:E. The list of languages done with etyl cleanup is also not needed in this case since we want to allow {{uder}}/{{der?}} for all languages. Lastly, I didn't bring back CAT:etyl cleanup/LANGCODE since ultimately after {{etyl}} -> {{uder}} replacements are done, e.g. CAT:etyl cleanup/fr would be redundant to CAT:French undefined derivations. —Svārtava (t/u) • 05:10, 6 June 2022 (UTC)[reply]
Thanks all. “langname undefined derivations” I deem a very reasonable tracking category name. Probably for clarity that a page does not use {{etyl}}, {{uder}} must not simultaneously categorize under Category:etyl cleanup/langcode/langcode, else you yourself as the category creator get misled about the situation like if you take a wikibreak and then look how things are going and don’t remember by heart what you have implemented or run. But by the way I estimate the remaining etyl cleanup at roughly two-hundred manhours, however partially requiring people with judgement particular to the connoisseur of a language’s morphology and relations. Fay Freak (talk) 05:33, 6 June 2022 (UTC)[reply]
@Fay Freak, Svartava What do you think about "LANG unclassified derivations" instead of "undefined"? This might be clearer in specifying that the issue is that the derivations need to be classified properly as inheritances/borrowings/neither. Benwing2 (talk) 05:46, 6 June 2022 (UTC)[reply]
@Benwing2: I agree either way. Feel free to change that. —Svārtava (t/u) • 05:48, 6 June 2022 (UTC)[reply]
Not sure that it is so, with previous knowledge of the categories’ purpose. If you don’t know what we are about then “unclassified” doesn’t tell more either, and it is longer to type. It is quite clear that the derivation is undefined. If you really want to be clear you call them underspecific derivations. Fay Freak (talk) 05:51, 6 June 2022 (UTC)[reply]
OK, I left it at "undefined" for now, let's see what others think. The issue of typing length shouldn't matter too much with autocompletion. Benwing2 (talk) 06:24, 6 June 2022 (UTC)[reply]
Now it occurs to me that unclassified could also mean that nobody knows it, as in a biological unclassified taxon, so it would look less a maintenance category than it should look. Fay Freak (talk) 06:55, 6 June 2022 (UTC)[reply]
I'm skeptical this will help, but I don't see any harm in it. Thadh (talk) 07:16, 6 June 2022 (UTC)[reply]

Wow, I had the exact same thought in April. Needless to say, I fully support this idea. 70.172.194.25 07:00, 6 June 2022 (UTC)[reply]

I'm also in support of this change (I think I've proposed the same thing back in October) and I agree with Fay Freak's point about preferring undefined over unclassified. Defined (to me) expresses something about us, Wiktionary: "we have not defined it yet"; classified (to me) speaks about the nature of the word itself: "it has not been determined which class this word belongs to". — Fytcha T | L | C 20:22, 9 June 2022 (UTC)[reply]
Whoa, I must have either read that and totally forgotten about it, or we both came up with the same idea independently (GMTA?). 70.172.194.25 03:32, 10 June 2022 (UTC)[reply]
Funniest thing is that we both wanted to call it {{der?}} :) — Fytcha T | L | C 10:10, 10 June 2022 (UTC)[reply]

Old Armenian derivations from "Middle Iranian"

See Category:etyl cleanup/xcl. The remaining 80 entries after my bot attempted cleanup are all or almost all cases where words are derived from "Middle Iranian". In reality there is no such thing as "Middle Iranian"; it's not even a well-defined family. So I have no idea what language the reconstructed forms are supposed to be. They look to all be created by User:Vahagn Petrosyan. Can you comment on where these came from and what language or proto-language they're supposed to represent? I have a hankering to comment them all out. Benwing2 (talk) 07:46, 6 June 2022 (UTC)[reply]

This has been discussed before. It is often impossible to determine from which Iranian language an Armenian term is borrowed — Parthian, Middle Persian, Middle Median or controversially also some other Iranian languages, all from the Middle Iranian period. It is quite common in Iranological and Armenological literature to use "Middle Iranian" in such cases. Why can't you do this? Vahag (talk) 08:32, 6 June 2022 (UTC)[reply]
@Vahagn Petrosyan I can do something like that, e.g. whenever I see 'und' as the language being linked to; but my larger point is that I have a hard time believing that you really can't determine at least to some extent which Middle Iranian language is being referenced. Whether this is standard practice in Iranological/Armenological literature makes no difference to what is the right thing to do. I know for example there is a massive difference in the phonology of Old Persian vs. Avestan, and if you carry that forward it should get even worse. I see for example that Wiktionary has a Proto-Medo-Parthian language, why can't you derive from that? And is there really so little phonological difference between Proto-Medo-Parthian and Middle Persian that you can't tell which is which? At the very least rename "Middle Iranian" to "Middle Western Iranian". Benwing2 (talk) 02:59, 7 June 2022 (UTC)[reply]
Western Iranian languages developed in parallel and often borrowed from each other. It is really often impossible to distinguish the source of the Armenian loanword. Compare Old Armenian հազար (hazar, thousand) ~ Parthian hazār, Middle Persian hazār "thousand", Old Armenian դէմ (dēm, face) ~ Parthian dēm, Middle Persian dēm "face", Old Armenian դրաւշ (drawš, flag) ~ Parthian drafš, Middle Persian drafš "flag". Proto-Medo-Parthian is not helpful, as it does not include Middle Persian, and also because the borrowing from the Parthian or Middle Median may not always be reconstructible to the Proto-Medo-Parthian stage. We can't rename "Middle Iranian" to "Middle Western Iranian", as Category:Terms derived from Middle Iranian also includes terms in languages which borrowed from Eastern Middle Iranian. Vahag (talk) 06:30, 7 June 2022 (UTC)[reply]
@Vahagn Petrosyan If Category:Terms derived from Middle Iranian includes Eastern Middle Iranian, then how can that possibly be of any use to users of Wiktionary? Linguistically it is total nonsense and IMO an embarrassment, just like etymologies that derive from "Native American". It needs to be cleaned up. Also IMO a better approach than the existence of a "Middle Iranian" etymology language, even if just for Western Middle Iranian, is to say something like Borrowed from an {{bor|xcl|ira}} source, compare {{cog|pal|...}} and {{cog|xpr|...}}; at least that way you avoid linguistic nonsense. Also I rewrote everything in Category:etyl cleanup/xcl that had a family (including "Middle Iranian" and "Old Iranian") as the destination and 'und' as the source; can you take a look at the remainder? Benwing2 (talk) 08:03, 7 June 2022 (UTC)[reply]
@Benwing2: You are nonsense “linguistically”. It is only saying two things, “Iranian”, which you find legitimate, and a periodization of the borrowing time, for the Iranian chronolects are marked by common isoglosses (it is questionable how much this word isogloss can be used for other-than-geographic boundaries, but I just do it now, as you know there are features spreading over multiple languages, e.g. the Slavic yer reduction), a defined subset of the Iranian languages. And we thereby avoid circumstantial and stereotypical phrases like “Iranian, compare x and y” with conspicuous linguistic precision, instead variating our wording to something like “borrowed from a Middle Iranian term attested in …” (the attestation uses to be spotty). I say “Iranian borrowing” for Arabic when I am not sure if a word is from the 700s or present around 600 already, but all of Parthian and Middle Persian and Proto-Kurdish as far as we see changed drastically at this time, as did the political situation in the Middle East, together with the fall of antiquity in the West; it is useful to note whether a word spread before or after the occurrences. Fay Freak (talk) 09:23, 7 June 2022 (UTC)[reply]
Distinguishing Old, Middle and New Iranian layers in Nebenüberlieferungen is an important task. For now we often cannot be more precise because of the spotty attestation of Iranian, intra-dialect borrowings and the late writing of the three main Caucasian languages. Thanks for cleaning up Category:etyl cleanup/xcl. I handled the remaining words. Vahag (talk) 09:40, 7 June 2022 (UTC)[reply]

"Proto-Baltic"

@Rua Speaking of linguistic nonsense, we have an etymology language bat-pro that stands for "Proto-Baltic". These etymologies need to be reviewed; probably we need to rename that code to "Proto-East Baltic" and the supposed "Proto-Baltic" ancestors of Old Prussian fixed. Benwing2 (talk) 08:06, 7 June 2022 (UTC)[reply]

Wiktionary:Etymology scriptorium/2019/December § code for Proto-East-Baltic PUC 09:49, 7 June 2022 (UTC)[reply]
Agreed, I also think we're overdue to have PEB reconstructions. Thadh (talk) 14:17, 7 June 2022 (UTC)[reply]

The {{wasei kango}} template is used in the etymology section of Chinese entries, and adds Category:Chinese terms borrowed from Japanese. But it also adds Category:Wasei kango, which is a subcategory of Category:Japanese terms derived from Chinese. Should this latter categorization be changed to reflect the fact that the entries being referred to are Chinese? Most of the entries also have Japanese sections anyway, but the ones listed here do not. 70.172.194.25 13:41, 7 June 2022 (UTC)[reply]

A term can, normally speaking, not be at the same time a borrowing in language X from language Y, and a term in language Y derived from language X. There may be a few exceptions of terms that made the round trip (with a change in meaning), but these should be noted separately as such. In almost all cases there is no round trip, so this should not be triggered automatically by the template.  --Lambiam 09:17, 8 June 2022 (UTC)[reply]
We should make a distinction between Japanese terms that are wasei kango (“written in kanji, but made in Japan”), and Chinese terms that have been borrowed from such Japanese wasei kango. The documentation of the template {{wasei kango}} states that it is meant for use in Chinese entries. I think it is also of interest to mark Japanese wasei kango entries as such, which requires some modifications – either a separate template, or a parameter identifying the language.

Old Tamil in Brahmi Script

Given the changes in the Brahmi script with Unicode 14.0, and the lack of grandfathering of the old encoding for Tamil Brahmi, at some point we ought to convert the entries. The relevant changes are:

  1. Short e and o now have their own characters; they are no longer to be written as e and o plus virama.
  2. Pulli is now encoded distinctly from the original virama. Previously it was formally considered to be a stylistic distinction.

An example of the change would be from Unicode 13 𑀧𑁂𑁆𑀬𑀭𑁆 (peyar) and 𑀯𑁂𑁆𑀫𑁆 (vem) to Unicode 14 𑀧𑁳𑀬𑀭𑁆𑀧𑁳𑀬𑀭𑁰 and 𑀯𑁳𑀫𑁰.
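The two changes listed above amount to a small character-level substitution, which might be sketched as below. This is an illustration only, not an existing Wiktionary script: the function name `to_unicode14` is hypothetical, and the code points are taken from my reading of the Unicode 14.0 Brahmi chart and should be checked against it before any real use. It must only be applied to Old Tamil text, since the original virama remains valid for other Brahmi-script languages.

```python
# Code points below are assumptions based on the Unicode 14.0 Brahmi block.
VIRAMA = "\U00011046"           # BRAHMI VIRAMA (used for pulli before 14.0)
OLD_TAMIL_VIRAMA = "\U00011070" # BRAHMI SIGN OLD TAMIL VIRAMA (pulli)

# Old encoding (e/o plus virama) -> dedicated short-vowel characters.
SHORT_VOWELS = {
    "\U0001100F" + VIRAMA: "\U00011071",  # letter E + virama -> letter short E
    "\U00011011" + VIRAMA: "\U00011072",  # letter O + virama -> letter short O
    "\U00011042" + VIRAMA: "\U00011073",  # vowel sign E + virama -> sign short E
    "\U00011044" + VIRAMA: "\U00011074",  # vowel sign O + virama -> sign short O
}


def to_unicode14(old_tamil_text):
    """Re-encode pre-Unicode-14 Old Tamil Brahmi text in the new encoding."""
    text = old_tamil_text
    # Short vowels first, so their virama is consumed before the pulli step.
    for old, new in SHORT_VOWELS.items():
        text = text.replace(old, new)
    # Every remaining virama in Old Tamil text is a pulli.
    return text.replace(VIRAMA, OLD_TAMIL_VIRAMA)
```

The order matters: replacing viramas with the pulli first would destroy the vowel-plus-virama sequences that mark short e and o.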

The Unicode standard does not grandfather the old encoding. This argues that we should convert the spellings now - which will ensure that they don't render properly! However, should the old encoding be allowed to linger? Most (all?) of us lack a font for the new encoding. I propose that we move the old spelling to the new spelling, and then we turn the hard redirects into soft redirects in slower time. Would oty-preuni14 be a suitable name for a template implementing the soft redirect?

When looking at the task, I found no citations for how the words were written. Could someone please advise on the source of these entries - at the moment I am tempted to RfV the lot, but I fear that would be counterproductive. Also, is Brahmi being used as a transliteration of Vatteluttu? Unfortunately, the original authors were IPs, so it is hard to ask them for citations. @AryamanA, Hk5183, 108.31.52.77, 98.179.127.59 -- RichardW57 (talk) 18:16, 8 June 2022 (UTC)[reply]

Incidentally, do we need to transliterate long and short Old Tamil 'e' and 'o' as ē/ō and e/o rather than e/o and ĕ/ŏ? --RichardW57 (talk) 18:37, 8 June 2022 (UTC)[reply]

In the absence of any response, I have gone ahead and made the change to the transliteration to Roman script. Orthographically long Old Tamil e and o are now marked with a macron, while the short vowels are unmarked. This corresponds to the convention for Tamil. --RichardW57 (talk) 13:30, 12 June 2022 (UTC)[reply]

Flood of audio pronunciation requests

As currently visible at Special:Contributions/Rodrigo5260, this user appears to be adding {{rfap}} to pretty much any and every entry they happen to land on. I happened to see a lot of activity from them on the Japanese pages on my Watchlist, but I see from their contributions that it's pretty indiscriminate with regard to term language.

This strikes me as odd and unhelpful, more as noise than anything. But I'm uncertain if that's just me.  :)

What do others think? Is this kind of blanket request useful, or should I ask them to stop? ‑‑ Eiríkr Útlendi │Tala við mig 00:01, 9 June 2022 (UTC)[reply]

@Eirikr: I personally oppose blanket requests on audio recordings and translations, even though there is no explicit policy on such restrictions. --Anatoli T. (обсудить/вклад) 02:34, 9 June 2022 (UTC)[reply]
Done Equinox 02:42, 9 June 2022 (UTC)[reply]
I couldn't disagree more: this helps us find out where gaps are. Why wouldn't we want pronunciations on all entries? —Justin (koavf)TCM 04:33, 9 June 2022 (UTC)[reply]
@Koavf You're not thinking this through: we know where the gaps are: everywhere. If we look at English, which has much better coverage than most languages, there are 45,115 pages in Category:English terms with audio links out of well over a million total (about 4%). Used properly, {{rfap}} lets us know where people feel a need for audio. Used this way, it only lets us know about this person's browsing habits. It wouldn't be hard to bot-add the template to every entry without audio, but then we would end up flooding the categories with literally millions of requests; if everything is top priority, nothing is. Chuck Entz (talk) 05:47, 9 June 2022 (UTC)[reply]
But how do you decide what is "top priority"? There's no deadline on the dictionary: just keep on adding pronunciations as they come up. If we need some human-curated list, then we can make a request board or have some votes or something. Until then, all of the entries in all of the languages that are spoken could use a pronunciation (and an etymology and rhymes and hyphenation, etc. as applicable). —Justin (koavf)TCM 05:51, 9 June 2022 (UTC)[reply]
Yes, Chuck already said that basically all entries could use a pronunciation. It's pointless tagging all entries with the same tag, which just wastes space and pushes the real content further down. There are too many. Equinox 06:10, 9 June 2022 (UTC)[reply]
Then make it so it doesn't display and just adds entries to a tracking category. —Justin (koavf)TCM 16:44, 9 June 2022 (UTC)[reply]
If this were a social media site, you'd probably see +1 buttons below the requests. Right now it's binary. We could also create a list of requests sorted by number of page views / interwiki links or similar metric. I did something similar a while back while adding German recordings: the category was flooded by one user who added the tag to all the entries they came across. – Jberkel 06:33, 9 June 2022 (UTC)[reply]
Interesting idea. I'd support it if it needed a vote. (Almost any practical attempt to put any set of requests into priority order using something at least plausible seems useful to me. I'd been using just "taxlinks"/redlinks within Wiktionary, but pageviews would be more useful to prioritize entries with higher volumes of pageviews than the bulk of organism names.) DCDuring (talk) 18:35, 9 June 2022 (UTC)[reply]
@Koavf: Given how narrowly our pronunciations are recorded, I'm not sure that we really want 20+ different pronunciations for every Latin script Pali term. --RichardW57 (talk) 07:03, 9 June 2022 (UTC)[reply]
How many do we want? —Justin (koavf)TCM 16:44, 9 June 2022 (UTC)[reply]
    • I'd like more English audio requests. But Wonderfool has made 11,000 since October 2019 (actually that's impressive, but there's bound to be a number of crappy ones in there - they'll be found eventually...) Zumbacool (talk) 18:49, 10 June 2022 (UTC)[reply]

Currencies and place names

I was wondering how best to tackle the proper noun sense of currencies at entries like dollar, pound, franc and so on, which could easily approach 50–100 senses once historical currencies and redenominations are taken into account. Our usual approach is obviously completely unwieldy.

Although we could just hive this stuff off into an appendix, I think a better approach would be to list the major ones (or not, if that would cause too many arguments), and then put the rest (as well as those which usually use a qualifier) in a collapsible box under the definition line. An illustrative example:

  1. the United States dollar
  2. the Canadian dollar
  3. the Australian dollar
  4. Any one of various other currencies:
  5. (historical) Any one of various former currencies:


There's no need for it to look exactly like this, as I've just used one of the templates we already have (combined with some HTML fudging as it's not designed to work like this), but I feel like this is a much better approach to "common" proper nouns.

The other obvious application is place names, and while it would need to be laid out differently (i.e. no columns), the same principle applies, as entries like Kingston or San Antonio are really unwieldy at the moment. Much better to break things down in a more user-friendly manner. Theknightwho (talk) 13:31, 9 June 2022 (UTC)[reply]

I'm not sure there is an objective way to identify which currencies are major and which aren't. How is the Canadian dollar more important than, say, a Belize dollar? Thadh (talk) 13:51, 9 June 2022 (UTC)[reply]
I agree, but it's a side issue. We have this problem with place names, too, where it's difficult to know which ones go under the "various other" bit. The US dollar does definitely count as major, as that's what people assume in countries that don't use the dollar, but it feels a bit uncomfortable to only give special prominence to that.
In any event, the formatting is the main issue here. Theknightwho (talk) 14:05, 9 June 2022 (UTC)[reply]
As we are concerned with word use, the point is: which of the various entities referred to as "[country] dollar" are referred to as "dollar" (and by whom/where). For the most part they are/were referred to as dollar only by English speakers in the country of issue. Dollar is used much more widely to refer to the US Dollar, most tellingly by English speakers in places like India, UK, Ireland, and Singapore, which have currency names that do not include dollar in their names. The Australian dollar may be referred to as dollar by English speakers in, say, PNG. All of the currencies that are referred to as dollar "only" in the issuing country could share a single definition.
If we accept this rationale, then only the US and Australian dollars would get their own definitions under dollar, the rest being defined as members of a class and appearing individually only under derived terms. DCDuring (talk) 19:08, 9 June 2022 (UTC)[reply]
I don't really understand your reasoning. If I'm going to Canada and I want to check how much money we have (so, whether or not I have to change any once I'm there), I'd also ask "How many dollars do we have?", probably omitting the "Canadian" part due to context. And while I can see how the US dollar is universally the prototypical dollar (even though that's a shame, but oh well), I don't see how the Australian dollar is better-known in PNG than, say, Hong Kong dollar in Cantonia, Brunei dollar in the Philippines or the Singaporean dollar in Malaysia. Thadh (talk) 19:35, 9 June 2022 (UTC)[reply]
Thanks - I was completely confused by that reasoning as well.
@DCDuring Almost all of these currencies refer to themselves by the name "dollar" on their notes in English, without including the name of the country (e.g. "five dollars"), including Singapore. Of the modern currencies, the only exceptions are Hong Kong, Brunei, Namibia and Taiwan. I simply don't understand how you could call the rest derived terms, and there is no requirement that senses must be used internationally.
Most of these are English-speaking countries, by the way, and there are plenty of contexts where you might use the term as a shorthand outside of the country in question, given the right circumstances (e.g. academia). I think you're making a hugely over-generalised assumption, and in any event the same could be said of a large number of place names, too.
That's also to say nothing of the fact this isn't even taking redenominations into account. Currencies change all the time, and might still use the name dollar, and your logic completely breaks down when we bring up the peso as none of them dominate. Theknightwho (talk) 19:41, 9 June 2022 (UTC)[reply]
My point is that all except the US dollar (and, probably, the Australian dollar) share the common characteristic of being called dollar only within the jurisdiction of issue, whereas the US dollar (& Aus$) have a different range of usage, the US dollar being the intended referent in most English-speaking countries that don't call their own official currency dollar. Thus, all but those two can share the same English definition: "Any local currency with an official name of, or containing, dollar", with something like {{lb|en|principally in respective jurisdiction of issue}}. (We should use jurisdiction because of Hong Kong dollar. This finesses Taiwan dollar as well.)
For dictionary purposes what appears on the currency is secondary to other usage, just as what appears on Kimberly-Clark products is lexicographically secondary to the way normal speakers use kleenex. In any event, dollar is very much like foot before its standardization in UK, a mere locally standardized unit of measure, or noon, a term whose exact meaning differs now by time zone and was formerly defined astronomically for each location.
I don't understand why all such "[country] dollar" names aren't derived directly from dollar, sometimes via Spanish dollar, sometimes via US dollar. US dollar certainly is derived from early use of dollar in other places (possibly other languages).
I think academics don't need Wiktionary to tell them how to use the term dollar. The referent in academic usage is usually made clear in the document in which the term is used. For a term of wide use by normal people academic usage is largely irrelevant unless the academic usage is with a truly distinct meaning. DCDuring (talk) 14:30, 10 June 2022 (UTC)[reply]
@DCDuring Your argument about the implied referent applies to almost all of our place name entries, as well as the vast majority of our detailed taxonomic entries. It also ignores the fact that currencies change while keeping the same name - something completely disguised by your argument, as well as the use of currencies in neighbouring jurisdictions (which is more common than you seem to realise in many parts of the world). I’m also not sure why inclusion would be a problem, particularly when the point of this discussion is about preventing clutter and not about whether or not we should include these in the first place. It’s frustrating to have this discussion go so wildly off-track. Theknightwho (talk) 19:59, 11 June 2022 (UTC)[reply]
Obviously all the [jurisdiction] dollar names are just normal entries, subject to normal RfD. I was addressing the question of how many definitions we need at [[dollar]] to cover all the usage without wasting screen space in the definition portion of the entry. That currencies are revalued (if that indeed is what you mean when you write "fact that currencies change while keeping the same name") is of no import to the lexical meaning of dollar in any reasonable definition. After all, currencies change in value constantly, even when governments say they don't. It is very easy to prevent clutter by putting all the "[jurisdiction] dollar" terms under derived terms. End of. DCDuring (talk) 23:55, 11 June 2022 (UTC)[reply]
I’m referring to the way that currencies get withdrawn and replaced (e.g. there have been at least 5 Argentine pesos), not their shift in value. In any event, the whole point of this post was to suggest a way we could include both without inaccurately glossing over the definition with a fudge. Putting them all under derived terms is not particularly helpful when they might be mixed with other terms which never get referred to as “dollar” alone. I also don’t see how the same logic doesn’t apply to place names. Theknightwho (talk) 00:13, 12 June 2022 (UTC)[reply]

Translations of Alternative Forms

@Fytcha Alternative forms are not included in the main form's translation box: see diff. Apparently, translation boxes on the pages of alternative forms are rare: see diff. Does this mean that alternative forms don't have translations in other languages? I say they very well may. See Xensi, Boao, Tatung, Taibei, Peking (these four examples were wiped out by Atitarev; please see the new example at CHamoru). Let me know what you all think. --Geographyinitiative (talk) 23:18, 9 June 2022 (UTC) (modified)[reply]

withdrawn
@Geographyinitiative: First of all, I wasn't involved in your interaction with Fytcha. I have only demonstrated again with the edits, what needs to happen. What has made you a competent editor? Your question has been answered since the day the template to be used for that purpose in this revision was created in 2008. If it's still not clear, alternative forms don't get translations. To avoid duplication, the main form usually houses the translations. --Anatoli T. (обсудить/вклад) 02:15, 10 June 2022 (UTC)[reply]
(edit conflict) And if you want to demonstrate a specific revision (you can't expect that entry used in discussion will not be modified, even if the discussion is incomplete), I will show the competent editor: Xensi. --Anatoli T. (обсудить/вклад) 02:23, 10 June 2022 (UTC)[reply]
Your viewpoint means these forms of these words don't have a more specific translation in the target languages, which is manifestly inaccurate. That template from 2008 may very well be a pile of shit, which is why I'm questioning it. I don't need approval from a website that ignored Wade-Giles for 20 years. --Geographyinitiative (talk) 02:19, 10 June 2022 (UTC)[reply]
You have to blame all Chinese Wiktionary editors for being so mean to you and Wade-Giles. I personally have nothing against WG. Something's wrong with you. Go to hell. --Anatoli T. (обсудить/вклад) 02:23, 10 June 2022 (UTC)[reply]
@Atitarev Your reaction is way out of line, and your aggressive tone from the start has been completely unnecessary. Aren't you an admin, too?
You've also completely missed the point, which is that languages other than Chinese may have direct equivalents to these alternative forms. Bundling them into the main translation box is obviously an inferior approach, particularly when you've not even bothered to do it properly. Theknightwho (talk) 03:46, 10 June 2022 (UTC)[reply]
Way out of line, huh? Did you check the original tone? I already answered that the decision on not translating all alternative forms has been made long ago. All alt forms may have different connotations and usages in a given language but that may not apply to translations at all. Even if you think that "Beijing" is fundamentally different from "Peking", in many target languages it's either the same or one is more common and the other is obsolete. In German both Beijing and Peking are translated as Peking, Japanese 北京(ペキン) (Pekin) and Russian Пеки́н (Pekín), any variations can be added with a {{qualifier}}. You can easily see that these translations are based on the original European name for Beijing (Peking) but they are also current. The template {{trans-see}} softly redirects users from the alternative term entry to where all translations are placed. There is no prejudice here or political bias, just centralising the information. --Anatoli T. (обсудить/вклад) 05:03, 10 June 2022 (UTC)[reply]
@Atitarev If you had bothered to read the discussion, you would have seen this is discussed in detail below. Consensus can change - this is not a court of law. Please try to keep up.
I also flatly disagree with your interpretation of the original post - there's nothing rude about it at all. Theknightwho (talk) 05:04, 10 June 2022 (UTC)[reply]
I agree that there's nothing rude about the original post whatsoever. Geographyinitiative's second post in this thread, however, is less polite, using some crass language and possibly expressing a willingness to override consensus, but at least it's not personally targeted like "Something's wrong with you. Go to hell.". I don't see how that response was called for, unless there's some history between the two users I'm missing. Overall, I don't understand why this topic has aroused such heat, since it seems like the kind of thing that can be discussed calmly. 70.172.194.25 05:29, 10 June 2022 (UTC)[reply]
  • I agree that there is some value in a list of Peking-equivalents as opposed to Beijing-equivalents. These words are so different-sounding in English that I'm not even sure I'd call them alternative forms, even though they come from the same Mandarin source and describe the same territory. IMO they're close to the border line between alternative forms and synonyms.
  • To play devil's advocate, though, what do you do when the situation with regards to usage is the opposite of what it is in English? For example, German Peking is the most commonly used form; it seems wrong to only list it in the translation box of Peking and not on Beijing, which someone is more likely to see.
  • We could also consider how we handle non-altform synonyms. For example, entire, complete, and total have more cognate translations than mixing-and-matching even though it may not be technically wrong to translate, e.g., English "complete" as Italian "intero". (Of course, most languages in the world do not even use the Latinate words for these concepts, but the ones that do are among the best-represented on Wiktionary.) 70.172.194.25 04:05, 10 June 2022 (UTC)[reply]
    These are all good points and excellent food for thought. My gut instinct is that we should be trying to capture the equivalent tone and context, rather than the cognate, so moving obsolete translations to the obsolete form might be sensible for languages that have undergone an equivalent shift, whereas that wouldn't be the case for languages which still primarily use the old form (like German with Peking), as the implications carried by English Peking simply aren't there (even aside from the issue of duplication).
    A compromise might be to have a middle ground function in the template, which says something along the lines of "See Beijing, but note the following exceptions:" (I'm sure there's a better way of phrasing it).
    On your final point, I like the way that German editors will frequently define words by listing a bunch of English equivalents so as to encircle the exact concept conveyed by the word. For example, ganz and gesamt are both defined in very similar ways, but the slight differences in word choice and word order are an effective way of demonstrating the difference without getting bogged down. We could do something similar with the translation boxes.
    Theknightwho (talk) 04:35, 10 June 2022 (UTC)[reply]
@70.172.194.25 There is definitely past history involving User:Geographyinitiative, although I haven't been following the specifics of it.
@Theknightwho I took a look at gesamt and I don't much like the definition with five similar English words. I really think this is unhelpful; much better to explicitly indicate the differences with a usage note. When I studied Spanish, for example, I had a textbook that spelled out all the ways to say "become" (hacerse, quedarse, tornarse, llegar a ser, etc.) and explained the differences explicitly. There's no other way I could have sorted out the differences, and Wiktionary currently does a much worse job of this. Benwing2 (talk) 05:50, 10 June 2022 (UTC)[reply]
While I am not that involved in the general outcome of this, I am strongly opposed to any proposal that would entail e.g. German Peking being removed from the translation section in English Beijing, which is what Atitarev expounded on in more detail to which I entirely subscribe.
I also want to clarify a misunderstanding: @Geographyinitiative: In my diff that you're citing, I wrote "alt spellings" which you've paraphrased as "Alternative forms" in the OP. I want to point out that those are not the same: Alternative spellings always share the same pronunciation which is not necessarily true for alternative forms. — Fytcha T | L | C 10:09, 10 June 2022 (UTC)[reply]
@Fytcha I just want to point out that nobody is suggesting that we blindly try to match cognates while ignoring actual use, and I agree that it would be completely wrong to move German Peking to English Peking. Atitarev had pretty obviously not read the rest of the conversation when he responded, because me and 70 had already discussed that exact example and some possible ways forward, and his response that you agree with concerned points which either no-one had made or which had already been addressed. It's a bit frustrating to see the genuine merits of this proposal being ignored, simply because one user has personal problems with the person that proposed it. We're better than that. Theknightwho (talk) 19:02, 10 June 2022 (UTC)[reply]

Ding, dong, Template:etyl is dead

I cleaned up the last few hundred uses that my bot wouldn't touch; these were cases where the source in {{etyl}} mismatched the following {{m}}. Benwing2 (talk) 03:23, 10 June 2022 (UTC)[reply]

Shouldn't it be kept so that old revisions are still readable? 70.172.194.25 03:30, 10 June 2022 (UTC)[reply]
Harrumph, let's see what other people think, I don't particularly want people to be able to continue using it. Benwing2 (talk) 03:56, 10 June 2022 (UTC)[reply]
I'm generally supportive of leaving old templates so that old versions have some level of functionality (and that does come in useful from time to time), but if there's a contingent of hold-outs still using it then it might be better to resurrect it in a few months instead. Theknightwho (talk) 04:06, 10 June 2022 (UTC)[reply]
Some simple version of the template should remain, if possible. IMO, the requirement would be that entry histories would be readable. The long red messages make old versions of entries ugly and intimidating. Perhaps just showing {{temp|etyl}} (ie, no parameters) with a link to Talk:etyl or Documentation:etyl. Those pages would be restored to the last version before they were deleted. DCDuring (talk) 14:50, 10 June 2022 (UTC)[reply]
I propose keeping the template, marking it as obfuscated, and potentially adding an edit filter which flags use (or prevents it). It is very annoying when you go to an old revision and cannot parse what it says. - TheDaveRoss 14:54, 10 June 2022 (UTC)[reply]
I've recreated it in order to be able to view old revisions properly, along the same lines as {{context}}. If it causes issues it can certainly be deleted again. This, that and the other (talk) 03:12, 11 June 2022 (UTC)[reply]
See User:Mglovesfun/-eur for an example of what this looks like. I have no idea why every use of {{etyl}} is on its own line, but honestly it probably doesn't matter - especially if it makes it less likely that people will use the template anew. This, that and the other (talk) 03:17, 11 June 2022 (UTC)[reply]
@This, that and the other {{deprecated code}} was using <div> when it should have been using <span>. I fixed it and now things look more reasonable. Benwing2 (talk) 06:16, 11 June 2022 (UTC)[reply]
I don't really care about old revisions and am not particularly inclined to keep it - lots of templates are deleted all the time, and we don't and definitely shouldn't go keeping all those as deprecated just for old revisions' sake. —Svārtava (t/u) • 15:31, 10 June 2022 (UTC)[reply]

Great job. Link Count still says 566 wikilinks and 15 transclusions, but these are all out of mainspace. Thanks to everyone who helped clean out all of these. —Justin (koavf)TCM 04:35, 11 June 2022 (UTC)[reply]

Stricter attestation criteria for offensive entries

Hi, I would like to raise for discussion a proposed amendment to WT:ATTEST for how offensive entries are dealt with. Examples of such offensive entries include Apefrican, Buttswana, criminigger, cumskinned, faggotface, jaboon, koala fucker, Mexicunt, negro fatigue, nigdar, Norgay, piss drinker, Porntugal, San Fransicko, suspook, teenaper, Turd World, Vladimir Pootin, and West Undies (and this is just what's currently on or was recently on the RFD and RFV pages). Please help to refine the amendment, and comment on whether you feel this is a good idea or not. — Sgconlaw (talk) 14:14, 10 June 2022 (UTC)[reply]


If an entry is offensive to an individual, group of persons, or geographical location, it must have at least three quotations satisfying WT:ATTEST added to it within two weeks [one week?] of the entry being created or being nominated at RFD or RFV, whichever is later, otherwise it may be speedily deleted after that period.

An entry is considered as offensive if it:

  • denigrates a named individual in any way; or
  • denigrates an unnamed individual, group of persons, or geographical location on the basis of ancestry, ethnicity, gender or sex, religion, or sexual orientation.

The speedy deletion of the entry is without prejudice to its re-creation if WT:ATTEST can be satisfied as described above.


The rationales for the proposed amendment are as follows:

  • It is hard to tell whether such entries are genuine or hoaxes.
  • The (usually anonymous) editors who create such entries are essentially pushing the task of verifying these entries to other editors. We are not the Urban Dictionary. The amendment discourages editors from adding offensive entries unless they are willing to put in the effort of ensuring the entries are attested.
  • Due to the dubious nature of these entries, they are rightly challenged at RFV or RFD. However, this clutters up these fora, and uses up the time and effort of editors in discussing and verifying the entries which could be used more productively.
  • Arguably, the reputation of the project as a whole is lowered by the presence of such entries. There is no particular benefit in having many unattested offensive entries; only those which are properly attested within a short period of time deserve to remain.

Discussion

Agreed on all points. Many of these entries are nonce words as well and can be formed arbitrarily (one of the points of WT:SOP as well). — SURJECTION / T / C / L / 14:22, 10 June 2022 (UTC)[reply]
Categorizing single words as "SOP" sets a troubling precedent. Affixes such as anti- and -hood are inherently formulaic, yet we still document the words that can be formed with them. Binarystep (talk) 09:49, 11 June 2022 (UTC)[reply]

I'd like to just add to this by saying that a lot of the time these look to be repeat nonce words, rather than genuine words, too. Theknightwho (talk) 14:27, 10 June 2022 (UTC)[reply]

I believe that the main problem with this proposal is that the meaning of "denigrate" will inevitably be over-extended based on politics. --Geographyinitiative (talk) 14:29, 10 June 2022 (UTC)[reply]

It doesn't matter. At the end of the day, if qualifying quotations can be found, the entry will be kept (or can be recreated). This puts the onus on editors wishing to create the entries to do their homework, and not use a scattergun approach by creating numerous entries and then pushing the verification work to others. — Sgconlaw (talk) 14:32, 10 June 2022 (UTC)[reply]
I support this idea, I would advocate for requiring citations to create the entry to begin with. If someone doesn't want to do that legwork they can create a request for the entry to be created. - TheDaveRoss 14:46, 10 June 2022 (UTC)[reply]
(edit conflict) I agree, but I'd actually raise the bar a little: One week after the creation of the entry regardless of whether it gets an RFV or RFD. I don't see why an entry should sit there a week longer just because an RFV has been filed. That said, it might be a good idea to open a new forum (or make it a subtask of RFV) to ask others for help with finding quotations (especially for WDLs). I can see why it can be frustrating to have to come up with a third quote all by yourself when it's a widespread word. Actually scratch that, Dave makes a good point about using RE for that. Thadh (talk) 14:51, 10 June 2022 (UTC)[reply]
The prob w/ Dave Ross' idea is that this forms an INCREDIBLE barrier to entry for n00bs, which is the exact wrong direction for Wiktionary to go. No new entry w/o cites=gated community. --Geographyinitiative (talk) 11:54, 11 June 2022 (UTC)[reply]
@Geographyinitiative: note that this proposal deals only with denigratory or derogatory entries, not all entries. I feel that a higher standard is required for entries which are essentially used purely for insult, especially when it seems there are editors who deliberately create large numbers of such entries. Frankly, I don't think there's a great loss if n00bs who wish to engage in this sort of behaviour are dissuaded by the policy. — Sgconlaw (talk) 12:20, 11 June 2022 (UTC)[reply]
@Thadh: I proposed two weeks [or one week] after creation or after nomination for RFD or RFV, whichever is later. The latter was to cover the situation where an offensive entry goes unnoticed until after two weeks (or a week) after its creation. So under the proposal it's not necessary to wait till an entry has been nominated for RFD or RFV; an administrator who spots an unverified entry within two weeks (or a week) of its creation can go ahead and nuke it. — Sgconlaw (talk) 16:12, 10 June 2022 (UTC)[reply]
Okay, that wasn't made clear by the wording "whichever is later". Thanks for clearing this up! Thadh (talk) 16:24, 10 June 2022 (UTC)[reply]
A cleaner way to handle this might be to make the requirements explicitly the same as those for recreating an entry that was deleted through rfv, except that they shouldn't be as easy to speedy because deleted entries have a warning that comes up when you edit the deleted page. We probably should add a sentence to the page creation text notifying would-be entry creators. Chuck Entz (talk) 15:33, 10 June 2022 (UTC)[reply]
@Chuck Entz: remind me what these requirements are and where they are noted? — Sgconlaw (talk) 16:12, 10 June 2022 (UTC)[reply]
It seems to be one specific anon (using various IPs) who is mass-creating this kind of thing lately. Equinox 17:21, 10 June 2022 (UTC)[reply]
That might be the case, but my impression is that we’ve had this sort of problem on and off for some years now, so we might as well decide on a way of dealing with it. — Sgconlaw (talk) 17:38, 10 June 2022 (UTC)[reply]
Yes, but this week's anon is different from the anon that triggered the original discussion (Australia vs. US), and I suspect there will be others. Chuck Entz (talk) 19:38, 10 June 2022 (UTC)[reply]

TBH this is not a bad idea at all. If you're going to add rare offensive terms, you should be prepared to back them up with attestation instead of burdening other users with that work. I think it does make sense to apply this specifically to offensive terms since they are more inflammatory, sometimes made up, and are often low-quality entries with just the bare definition provided. And while Geographyinitiative above makes a decent point that "offensive" may end up being interpreted more broadly than intended, I honestly wouldn't mind a wide application of this rule. After all, there's no prejudice against recreating when quotations are found anyway. 70.172.194.25 19:15, 10 June 2022 (UTC)[reply]

The good news is that it is an easy thing to regulate: if someone deletes something which shouldn't be deleted, there is quick and easy recourse. Either add some citations (anyone can do this), or other admins can undelete and create an RFV. I'd rather find out if it is a problem rather than ignore the problem which seems apparent already. - TheDaveRoss 19:19, 10 June 2022 (UTC)[reply]
@TheDaveRoss: yes, that's what I figured. There's very little downside to the proposal. It's not intended to act as a ban on offensive entries. If it is really felt that a particular entry should be included, then the editor(s) merely have to back it up with the required minimum number of quotations and it can be recreated or undeleted. On the other hand, the proposal hopefully dissuades editors who really can't be bothered to properly justify offensive entries from creating numerous ones and wasting the time of other editors who then have to deal with the entries at RFD or RFV. — Sgconlaw (talk) 21:46, 10 June 2022 (UTC)[reply]
There is no other burden than with any term, other than your personally feeling offended despite not being spoken to or about by the mere mention of a word.
Also this proposed rule discriminates against autistic users who have a hard time recognizing offense in the first place, here even more complicatedly only abstractly assumed from the possible uses of a word rather than its actual use which happens in lexicography, which anyone linguistically minded may barely feel.
I also respect those who enter the editorship of this dictionary by filling the gaps they perceive in the coverage of injurious terms. Laxer criteria attract editors—they don’t repel readers, who don’t search for bad entries. Who of the greatest Wiktionarians started there? Closing the gate after twenty years is cheap.
There are a great many things on the internet to be offended by, these terms being systematically entered into Wiktionary aren’t one. Fay Freak (talk) 21:28, 10 June 2022 (UTC)[reply]
Offence is not the operative criterion, though. Denigration is, which is a much more objective benchmark. Theknightwho (talk) 22:50, 10 June 2022 (UTC)[reply]
It is not clear from the formulation that this is exclusive and even if it were the concept of “denigration” hardly has a lesser compass. Lambiam reckons this “definition” likewise unhelpful below. Fay Freak (talk) 23:29, 10 June 2022 (UTC)[reply]
Note also the revealing misstep in wording of assuming entries denigrating. You’ll only act on what you made up in your mind. If you abstract the entries from the objects described by them you should be indifferent to the former. Fay Freak (talk) 23:32, 10 June 2022 (UTC)[reply]
To be clear: "entry" is being used as a shorthand for "term described by an entry".
I'm also confused by your point that people might not understand that a term is denigrating. No specific person in the Wiktionary community needs to be offended for us to recognise that - the point is that the meaning of the word is derogatory in some fashion. Theknightwho (talk) 03:46, 11 June 2022 (UTC)[reply]
You are really scraping the bottom of the barrel looking for reasons to oppose this, discriminating against autistic people... The whole project discriminates against illiterate people too, might as well shut it down. - TheDaveRoss 23:36, 10 June 2022 (UTC)[reply]
@TheDaveRoss: What if it is not the bottom of the barrel but the gorilla in the room? I can’t relate at all to this culture of being offended, but to those who can’t relate and are passed over by those who show much concern. And it has often happened in larger software projects that those codes of conduct or similar have made all too risky to that kind of people that fail sensitivity to those distinctions of social acceptedness—which is completely irrelevant to the objective mission of the project as long as contexts can be caught by rough labels, but even these are controversial (Wiktionary:Requests for verification/English#niggership, Talk:niggership: one showed haphazard application of the label “ethnic slur”, the largest contributor exaggerates the meaning of “slang”, another categorized all vaguely right-wing as “Neo-Nazism” and was rightly reverted by him; soon we will only discuss the interpretation of our rules instead of content if the former accretes on this basis—aye, I really like opposing expansion of rules in general, and this is a good enough example for general reservations; new rules, new problems, nothing of concern solved). Fay Freak (talk) 01:09, 11 June 2022 (UTC)[reply]
I am disappointed that you continue to misinterpret the meaning of "ethnic slur" despite having had it explained to you in depth, and I have absolutely no idea how you came to either of your other conclusions other than the fact that you didn't like the fact that you didn't get your own way. Particularly with neo-Nazism, it's blatantly incorrect to say that anyone "categorized all vaguely right-wing as “Neo-Nazism”", because the terms under discussion are either well-known to be neo-Nazist, or were being considered for removal from the category. You just seem to have an axe to grind. Theknightwho (talk) 03:44, 11 June 2022 (UTC)[reply]
@Theknightwho: If you call repeatedly dropping a reference to the synonymous Wikipedia article, whose definition I had already proven to be practically incomprehensible, an “explanation in depth”. deaf rather than deep is the term you aim at to describe the quality of your answer, that’s why you have “no idea”. The construction of my ideas is completely laid out to be traced. The claim stands that you, and WordAndNerdy, misinterpret the meaning of “ethnic slur”, and so you will phantasize broader meanings of being “offensive” and “denigrating”. “Reference” and “allusion” can be understood in various grades of directness. Currently the fourth gloss of refer you refer to (now reading “To allude to, make a reference or allusion to“) is no real definition and must be replaced for using but itself and a synonym for definition. Fay Freak (talk) 09:26, 11 June 2022 (UTC)[reply]
@User:Fay Freak Nothing about the definition I gave you was incomprehensible - you simply didn’t realise that it is possible to use “refer” to mean “allude”, which means to refer to something indirectly. There is nothing circular about that - it just means the verb “refer” can be direct or indirect. Quite clearly that means that the definition of “ethnic slur” encompasses words which indirectly denigrate. This is not a difficult concept, and your prescriptive rules lawyer approach is not convincing to anyone.
Aside from that, are you seriously making the argument that it is impossible to know when a word is denigrating on a collaborative dictionary of all places? Is denigration some kind of special form of knowledge that is uniquely difficult to determine? How do you think we determine the meaning of any words at all? Theknightwho (talk) 13:10, 11 June 2022 (UTC)[reply]
@Theknightwho: I deny that “refer” means “allude”; even if it does, it is not clear that the Wikipedia article uses it in this unusual way, and rather it uses the stricter sense and e.g. niggership is not an ethnic slur under it. The correct word is apparently connote as opposed to denote, no such thing as “indirect reference”—if you search that you find analytic philosophy books with their usual made-up language; Indirect self-reference uses “indirect” not in the sense of “aside from” but “through a longer path directly”; like “rules lawyer” is a paradox and beside the point that editors are unable to work with the definitions to any advantage. I am not making an argument that it is impossible but editors are incapable or it is unnecessary uncertain and hard though “possible”. Perhaps I can, you don’t and WordyAndNerdy doesn’t and does not want as owned by her below as “subjective lines”, and unknown IPs will perform worse than you all. Note that it is a well-known fact that defining any term of law by a criterion “directness” is always to some degree controversial and vague, but you can’t drop it either and content yourself with offensiveness or denigration discovered over five corners. You will meet cases of doubt “is it denigrating (directly) enough?” even according to your broader framing. Fay Freak (talk) 16:07, 11 June 2022 (UTC)[reply]
@User:Fay Freak The Merriam-Webster dictionary defines “refer” in the intransitive sense to mean “to have a relation or connection”, which is precisely the way it is being used, and does not exclude indirect references. connote would not be a correct gloss in Wikipedia’s definition, because the usage is intransitive (“to refer to X”), and not transitive (“to connote X”).
There is also a pretty extreme irony in you arguing that you can dismiss a definition as “made up” while trying to argue that we should keep words that are themselves made up by those that use them. Rules lawyering is not something to aspire to - it is nonsense born out of working backwards from your conclusion, instead of using reason to work towards one. Do not mistake it for making a cogent argument. It also explains why you would argue against such a plainly common use of the word “refer”, which I and many other native speakers use frequently (whether or not you approve). Your prescriptivism has no place here.
You have also failed to provide any justification for saying that it is particularly difficult to judge that a term is being used in a derogatory way. You have simply expressed doubt, while conflating verbiage with making a substantive point. It’s hard not to see the double standard in your viewpoint, and you have yet to provide any reason for it. That’s aside from the fact that they’re very often added with that exact label or something very similar, which circumvents your entire point. You may call it a mistake, but the operative issue is that the person intended to add a derogatory term, and in the absence of citations we must take it at face value (as we do for the rest of it). This is, after all, a conversation about whether such terms are attestable.
I should also add that you are the only person that has made this about being offended. This has been pointed out to you several times. The issue is actually with phantom terms that are wasting our time, and those happen to more often be derogatory because they’re much more likely to be created by people as a prank (or at the very least without any genuine conviction that they exist or have ever existed in real use). Theknightwho (talk) 17:24, 11 June 2022 (UTC)[reply]
@Theknightwho: No u. You afford verbiage around the fact that you are unable to comprehend the use—mention distinction or English in general without the use of a dictionary. By their having been used they are not made up any more as to be fake, but the dictionaries do not prove that this is not a ghost meaning—the usage examples for the alleged sense are inexactly described with “to allude”. Basic words use to be not well defined but circumscribed, and glossing “to refer” with “to allude” is exactly the kind of no-definition that we have to avert in the long run. These words are not synonymous. You are back in the Middle Ages when the pronunciation of words was illustrated by their being “pronounced like” some other word, abaca was defined as a kind of flax (this exactly happened in Medieval Arabic glossaries regularly) and the like; the Medieval layman state of definition is still there in the dictionaries as their foundation, and some reputable source claiming a word having a certain sense does not absolve use from discerning it in the corpus: use—mention distinction, you have not understood it. Fay Freak (talk) 17:48, 11 June 2022 (UTC)[reply]
@User:Fay Freak It’s the first entry for the intransitive use, and the OED also gives the sense “to mention, allude or make reference to something”. I have also not failed to understand the difference between mention and use - I have simply pointed out that (now two) bodies of experts agree with me. The fact that you have not heard (or more likely, not perceived) a particular use does not mean that it does not exist, particularly when you openly dismiss evidence to the contrary, while failing to understand the difference between transitive and intransitive senses.
At this point, Occam's razor suggests it’s much more likely that you simply just don’t like it because it’s inconvenient to your original bad faith argument that we shouldn’t label a lot of slurs as slurs. I am wholly unconvinced that you are simply trying to be technically correct, because you have presented nothing that supports your position - just unreasonable scepticism in the face of overwhelming evidence.
You are correct that refer and allude are not synonymous, though, because “refer” is general while “allude” more specific. I’m not sure why you think I said otherwise.
Theknightwho (talk) 18:32, 11 June 2022 (UTC)[reply]
@Theknightwho: Then don’t define it that way. My edit to refer was still an improvement, and not a “removal” either, since the single usage example of that definition line which used this term was moved by me to the first definition line (which was expanded). But as you start to see the difference between reference and allusion, or its possible meaning, you see how much room there is to see some vagueness within concepts of “denigration” and “offensiveness”, or at the periphery of the sets of terms to be covered by them.
I don’t think “just don’t like it” can apply since I don’t even remember having added terms of any connected kind nor plan to do so, and also because not liking also has its reason, and I endeavoured to uncover the reasons why I intuitively don’t like it, not having been convinced of a different stance about offensive terms, which as said I can’t relate to. (Somebody made something offensive on the internet, ugh! Yet formally he was right and a scientist.)
BTW, why not, if at all curtailing the Usenet quotery, restrict it to English, since for foreign languages we have too small editor communities altogether and the problem has not arisen there nor is there equivalent potential? For foreign languages we still have very usual slurs to cover. Fay Freak (talk) 21:29, 11 June 2022 (UTC)[reply]
@Fay Freak Your edit to refer was an incoherent mess that conflated the transitive sense (“I refer you to X” = “I bring your attention to X”) with the intransitive sense (“I refer to X” = “I make reference to X”) - they’re completely different things. Either you are not competent enough in English to be editing the entry, or (as I suspect) you were intentionally trying to remove a sense out of process because you didn't like it. There is no excuse for it, particularly when you have native speakers insisting it is correct. The correct venue is WT:RFV/E, which you very well know. Theknightwho (talk) 21:40, 11 June 2022 (UTC)[reply]
@Theknightwho: Transitivity variation does not automatically make a new sense. Still there is no definition and I don’t take your correctness claim in favour of evident nonsense for granted. I know English better than most native speakers and you are obviously on the lower end of them; why not? There is no rule that a native speaker trumps a non-native. It is all about the amount of input of language material, and despite perhaps having read more English than any other language this definition line is no explanation of the alleged sense to me, nor objectively. What is the alleged sense? In the usage example there is no allusion. It is a complete nonce definition. Perhaps define the basic words sensefully before trying to restrict nonces, this would amount to greater reputation of Wiktionary! Fay Freak (talk) 21:49, 11 June 2022 (UTC)[reply]
@Fay Freak Your definition failed to capture either sense accurately, but feel free to take things to WT:RFV/E or WT:RFC if you perceive there to be a problem. Please do remember, though, that your inability (or unwillingness) to comprehend something does not mean that it is incomprehensible. Also, I recommend you write to the OED to inform them that their definition means they must be on the lower end of the English spectrum, too. I’m sure they’d be delighted to have your input. Theknightwho (talk) 22:03, 11 June 2022 (UTC)[reply]
My definition succeeded in capturing either sense accurately. Your failure to comprehend what it captures does not mean that it is incomprehensible. Conversely your claim of having subjectively comprehended a definition does not mean it is comprehensible, maybe you just fancied something together which is not there, or the definition here only fails higher requirements of those who need less vulgar concepts to content themselves with. So feel free, too, to take my version or both versions to WT:RFV/E or WT:RFC if it is all too hard for you. I’m also sure the OED would be delighted to have my input but they would have to pay for better definitions. Fay Freak (talk) 22:18, 11 June 2022 (UTC)[reply]
@Fay Freak You don’t get to remove a sense and then tell other people to take it to RFV, and if you think one sense doesn’t belong then the correct place is WT:RFD/E. This is a consensus project. Theknightwho (talk) 22:28, 11 June 2022 (UTC)[reply]
@Theknightwho: I have not removed any sense but combined definitions. You still have not shown what sense there would be. Will you show it if I RFV it? I will still have to combine it because your interpretation of what the quotes attest will be wrong; since it is impossible to prove a senseless definition, someone has to completely replace it or merge it. This is why it is no RFV matter. You are very detached from the meaning of all procedures which consensus has introduced. Neither RFV nor RFC are for evident nonsense in entries—OED having the same nonsense does not get you to preserve it. You just try to instigate me to abuse process, for the price of possibly keeping senseless definition. Fay Freak (talk) 22:51, 11 June 2022 (UTC)[reply]
Can this conversation please move elsewhere? I'm on the verge of collapsing it into a box, since it's not explicitly related to the main discussion at hand. Also, fyi there is an informal lemma policy where we do look at other dictionaries, especially OED, to determine if a word should be included. AG202 (talk) 22:58, 11 June 2022 (UTC)[reply]
@AG202 Please feel free. @Fay Freak I refer you to WT:RFD/E to make your case. Be sure to refer to this discussion! Theknightwho (talk) 23:05, 11 June 2022 (UTC)[reply]
@AG202: It is related in so far as the controversy about the proposal concerns how vaguely or indirectly a term (an offensive or denigrating term) might make reference and depreciate an “individual, group of persons, or geographical location on the basis of ancestry, ethnicity, gender or sex, religion, or sexual orientation.” A similar term like “(ethnic) slur” was likewise problematic, for comparison. I mean, in which fashion does niggerhood so, while nigger does undoubtedly in some usages? This is just one stupid and easy example, I am anxious about harder ones.
It makes sense to collapse this argument of increasing detail: meseems it can be after the words ”should be indifferent to the former.” Fay Freak (talk) 23:20, 11 June 2022 (UTC)[reply]
I don't think it's accurate to describe triple parentheses and ZOG as merely "vaguely right-wing", given their origins and usage. Binarystep (talk) 09:42, 11 June 2022 (UTC)[reply]
And I didn’t, this is more in the inner ballpark of “right-wing”, yet ))) ((( is not “Nazism”, and Nazism, given a clear historical picture, should be understood as a more clearly defined term than “offensive” and “denigrating”, yet editors even fail that. Fay Freak (talk) 09:57, 11 June 2022 (UTC)[reply]
))) ((( wouldn't belong in Category:en:Nazism, as it wasn't used during WWII, but it would certainly belong in Category:en:Neo-Nazism if such a category existed. Binarystep (talk) 10:07, 11 June 2022 (UTC)[reply]
@Binarystep Such a category now exists. It’s good to separate them, if nothing else to prevent the kind of blatant misrepresentation we both replied to (including the obvious lie that they weren’t referring to triple parentheses or ZOG, which were the only two terms mentioned in the linked discussion). Theknightwho (talk) 19:11, 11 June 2022 (UTC)[reply]
@Fay Freak My arguments have nothing to do with being personally offended. I firmly do not believe that Wiktionary is a better dictionary or lexical resource if we claim that literally any string of characters to which anyone ever has ascribed meaning is automatically part of the language. I get that you don't agree, but you can express your disagreement without doing so on the highest horse you can find. The vast majority of the arguments you have made in this discussion section have nothing to do with the policy question posed, and instead of furthering the discussion they make it much harder to follow, some kind of lexical Gish gallop. - TheDaveRoss 15:39, 13 June 2022 (UTC)[reply]
Many people find at least some terms offensive that are not denigrating. Conversely, some people may consider some clearly denigratory terms not offensive. To avoid some pointless discussion, it may be better to refer to this by “attestation criteria for denigratory entries”. The definition of “denigratory entry” would be the same as now (“This rule applies to entries that denigrate a specific individual in any way, or an individual, group of persons, or geographical location, etcetera, on the basis of ancestry, ethnicity, gender or sex, religion, or sexual orientation.”)  --Lambiam 22:21, 10 June 2022 (UTC)[reply]
@Lambiam: sure, I’ve no objection to that. — Sgconlaw (talk) 22:27, 10 June 2022 (UTC)[reply]
I think it would be nice if it covered some other dubious slang like the recently deleted daddy's carrot, but I'll take the improvement that merely including denigrating terms would bring. - TheDaveRoss 23:35, 10 June 2022 (UTC)[reply]
I agree with the proposal and I would add further that I'm very skeptical of terms that can be attested only on Usenet. I have never heard of Norgay, for example, or any of the other terms given at the top of this entry, yet someone has added 5 cites from 3 different Usenet groups. Usenet is (or more like was) a sort of subculture with its own idiolectal terms. If you search in Google for "Norgay" for example, you get a zillion hits for Tenzing Norgay, and if you search for "norgay" -tenzing you get 96 hits, none of which seem to refer to Norway except for one link to the Urban Dictionary entry and one other to a random website timetoast.com that for all I know made it up independently, as it also has similar terms like "Swedgay", "Dangay", "Germgay". So if such terms get kept due to Usenet cites, I would want them tagged with a "Usenet only" or similar label. Benwing2 (talk) 00:46, 11 June 2022 (UTC)[reply]
This is not particular to Usenet. Somewhere terms must be interconnected in their arisal for lexicalization to be achieved, rather than having been independently coined. But the connection uses to be invisible in the sources. And there is no provision to save use from occasionalisms other than counting attestation which you perceive as a method reduced to absurdity. Fay Freak (talk) 01:21, 11 June 2022 (UTC)[reply]
Reading the Norgay/West Undies/Buttswana/Porntugal discussion, actually one could add an additional criterion of something like, as a rough 3:30 AM negative formulation, that the word must be believed to be not coined independently or occasionally. WT:CFI requires “independence” of terms meaning independence from referring to a particular environment (controversial in the details), yet they also must be dependent in the sense of being back-coupled in the language communities and perhaps still causally “depend” on a single coiner. One could also just implement User:Fay Freak/Wiktionary:ATTEST 2021 and take the wording “live” seriously, as a word living merely on the occasion is even less than living in the familiar circle of thee and thy best friends (which we soothfast even already agree about to be not inclusionworthy but fail to reflect conceptually). Fay Freak (talk) 01:43, 11 June 2022 (UTC)[reply]
Re "Somewhere terms must be interconnected in their arisal for lexicalization to be achieved, rather than having been independently coined": I agree, and that's what I tried to argue at Talk:cowtastrophe: "The problem is that this word is not a real "trend": it's not being picked up by a speaker, then another, etc. We're simply lumping quotes together to fulfil the CFI, but this is artificial". PUC 21:30, 12 June 2022 (UTC)[reply]
I don't think the coinage of a word really matters all that much. Plenty of affixes (such as anti- or -able) can be used in such formulaic ways that some words surely only exist because of different writers independently creating them. Should we restrict them too? Binarystep (talk) 01:53, 13 June 2022 (UTC)[reply]
(They are already restricted, see non-Canadian which I argued to keep) AG202 (talk) 01:58, 13 June 2022 (UTC)[reply]
To my knowledge, they're only restricted if they're hyphenated. I also think non-Canadian should've been kept. Binarystep (talk) 02:27, 13 June 2022 (UTC)[reply]

Overall this seems to be a very good proposal. Vininn126 (talk) 12:34, 11 June 2022 (UTC)[reply]

If the concern is about e.g. some new account just showing up and adding a bunch of racist (or otherwise offensive) terms out of (presumably) racist (or otherwise objectionable) motivations, why not add a requirement that, no matter how well cited a new entry documenting an offensive word is, it can only be added if the user has a history of making numerous other good edits for non-offensive words? Like, maybe only allowed if the proportion of like, offensive words they've added to good positive contributions for non-offensive words, is sufficiently low. So, for example, because my account is new and I haven't made other contributions before this edit, I would not be allowed to create an entry for any offensive words. (If you are wondering why I'm making this suggestion when I haven't made any other edits: someone else mentioned the discussion to me, and I thought of this idea that I thought of as a compromise position, and they asked that I add it, because I thought of it and they didn't want to take claim of my idea. If my participation here is inappropriate, I apologize.) A potential difficulty I see with this idea of mine, is that I'm not sure whether it would be easy enough to measure such ratios... but still, maybe something along these lines could help? --Madaco1 (talk) 01:21, 13 June 2022 (UTC)[reply]

As mentioned in the comment I recently made, I don’t feel like this is implementable. How would they be tracked? Would they just be barred from adding the “offensive” label? Then they could just add the terms and then wait for someone else to add the labels later, and then we’re back to square one. Let alone the issue of new users with genuine intents documenting languages that aren’t covered here. That’s why I came up with this compromise after the multiple discussions that’ve been had. (CC: @Binarystep) AG202 (talk) 01:32, 13 June 2022 (UTC)[reply]
Yes, there's no way to automatically prevent new users from adding "offensive" material because there's no way to automatically identify such terms. Benwing2 (talk) 01:39, 13 June 2022 (UTC)[reply]
I'm aware they can't be automatically prevented, but they can be deleted and their creators can be banned. It'd be treated the same way as any other form of vandalism, which is what this form of trolling effectively is. Binarystep (talk) 01:44, 13 June 2022 (UTC)[reply]
@AG202: You never did respond to my points here, by the way. Binarystep (talk) 23:59, 13 June 2022 (UTC)[reply]
I didn't feel the need to; please don't tag me like that. I don't have to respond to everything, and part of me wishes I hadn't as much. AG202 (talk) 00:04, 14 June 2022 (UTC)[reply]
If your concern is whether new users would be allowed to add terms from LDLs, you could always rewrite my suggestion to only block IPs from creating pages for offensive terms from well-documented languages. I'm not sure this is even a problem to begin with, though. Binarystep (talk) 01:47, 13 June 2022 (UTC)[reply]

──────────────────────────────────────────────────────────────────────────────────────────────────── I have now created a vote at "Wiktionary:Votes/pl-2022-06/Attestation criteria for derogatory terms". — Sgconlaw (talk) 22:06, 13 June 2022 (UTC)[reply]

Attestation period

Assuming people are generally in favour of the proposal, should we stick with two weeks, reduce it to one week, or pick some other period to allow for attestation of such entries? — Sgconlaw (talk) 04:37, 11 June 2022 (UTC)[reply]

One week was my suggestion when I proposed this policy. I'm not set on that exact length of time though. I can see a rationale for allowing newer users a grace period to learn how to find and format CFI-compliant cites. WordyAndNerdy (talk) 05:08, 11 June 2022 (UTC)[reply]
The shorter timing should not start until the entry is identified as offensive. Additionally, it should not start until a putatively adequate set of quotations has been rejected. The word Dutchman strikes me as a good one to start thinking about. Should the offensive meaning 'A male white Afrikaner' be struck immediately under this rule? It lacks three attestations. Even if we strike that meaning, isn't there something offensive about the rider 'or I'm a Dutchman'.
As to the number of attestations, does this requirement also apply to words in less documented languages? I can easily imagine someone innocently adding a Sumerian or Pali word without being aware that it was an offensive term. And if they were aware, would they be required to provide three independent attestations? --RichardW57 (talk) 14:33, 12 June 2022 (UTC)[reply]
Thinking of words that would stand for immediate deletion, I can also think of the term Rupert for an officer. --RichardW57 (talk) 14:45, 12 June 2022 (UTC)[reply]
Is the dog's name Nigger offensive? Apparently it wasn't in the 1950's (source w:Nigger (dog)), and I don't have any quotations to show it's still being bestowed. RichardW57 (talk) 14:45, 12 June 2022 (UTC)[reply]
Would this even be needed as an entry on Wiktionary? Of all the things I've seen folks call encyclopedic here, this seems like a prime example. AG202 (talk) 14:51, 12 June 2022 (UTC)[reply]
For better or worse, it was the name of Lovecraft's cat as well, and was a somewhat common name for pets back then. We certainly shouldn't be referring to any specific animals, though. Theknightwho (talk) 15:55, 12 June 2022 (UTC)[reply]
LDLs only need one cite or reference and I would be very surprised if a person adding the term in Sumerian doesn't have a reference they could give. Thadh (talk) 08:24, 13 June 2022 (UTC)[reply]

More quotations to be required? Disallowance of certain sources?

@Sgconlaw I support this proposal, but I think more could be done. If our concern with the project is focused around having these types of words in general, then the amount of citations required needs to be increased, either in this proposal or one very soon after. "Arguably, the reputation of the project as a whole is lowered by the presence of such entries." Currently, there's nothing stopping anyone from easily finding solely 3 cites from Usenet, often from the most horrid places (see: the cites at Apefrican for example), leading to these low-quality entries being kept in the first place. We talk so much about how we're not Urban Dictionary, yet we give a presence to these one-off terms from vile, racist spaces? At least people know not to take Urban Dictionary seriously. By giving these words this kinda platform, we're only furthering the propagation of them, while they may have initially died out, being used only in those spaces. People take this website seriously (see: Petersonian and how the subreddit for it took its presence on Wiktionary as vilifying the term), so I wish that there were more done to actually increase its quality as a whole. And while I can see this current proposal at least stopping a few for now, I can easily see some editors keeping track of which offensive entries have been created, solely to add citations to them so that they aren't deleted. Overall, WT:ATTEST must be updated. AG202 (talk) 05:49, 11 June 2022 (UTC)[reply]

How do editors feel about increasing the minimum number of quotations required for derogatory entries from three to, say, five? — Sgconlaw (talk) 05:55, 11 June 2022 (UTC)[reply]
For me, I feel like it's less the number, but more the place. I think it'd be better if nonce offensive terms were required to be found in multiple sources. Most of these terms are found solely on Usenet and three vs five doesn't really feel like it'll change much. This would just be adding on to the requirement of having 3 separate authors, and that way, offensive terms that do have currency would be included, while the ones that clearly don't wouldn't be. AG202 (talk) 06:01, 11 June 2022 (UTC)[reply]
@AG202, Sgconlaw I agree with AG202 and I would suggest that as part of this proposal we simply disallow Usenet sources from counting as attestations for such words. That should get rid of most of the nonce words while allowing words like libtard and Rethuglican that do have currency in multiple sources. Benwing2 (talk) 06:06, 11 June 2022 (UTC)[reply]
Hmmm, interesting. Let’s see what others think. What do you feel is the justification for disallowing reference to Usenet in this case, which is generally permitted as a source? — Sgconlaw (talk) 09:10, 11 June 2022 (UTC)[reply]
@Sgconlaw Many of the terms listed above appear to be citable only in Usenet, which seems to be a magnet for people making up derogatory blends. On top of that, we tend not to allow postings on Twitter, Instagram or the like, and Usenet seems more similar to that in its self-curation than to a book, newspaper, magazine, etc. Benwing2 (talk) 17:26, 11 June 2022 (UTC)[reply]
I strongly disagree with this. Wiktionary's goal is allegedly to document "all words in all languages", and arbitrarily restricting our coverage of offensive terms runs counter to that. I'm not going to defend these terms or the racists who use them, but they exist whether we like it or not. If a term is citable and isn't SOP, it should be included. It's not our job to sanitize the English language. Binarystep (talk) 09:34, 11 June 2022 (UTC)[reply]
That's not what's happening. You seem obsessed with the idea we're being prescriptivist. We're not. That is a very different phenomenon. What's happening is we are trying to reduce the amount of vandalism as well as nonce words, i.e. words perceived as created for the situation by both the speaker and listener. With time nonce words might become real, but until then they are one-offs. Stop accusing everything of being prescriptivist. Vininn126 (talk) 12:38, 11 June 2022 (UTC)[reply]
Category:English nonce terms existed for a decade without any issues, so it's not like nonce words are banned or anything. Only a few terms are being targeted, and it's solely because of their offensiveness, which is what I take issue with. It's not Wiktionary's job to decide which words are too offensive to mention, and none of our policies support removing words for that reason. In fact, similar proposals have been overwhelmingly rejected in the past (see here for one example). Binarystep (talk) 15:20, 11 June 2022 (UTC)[reply]
There's a difference between these kinds of nonce words, and it's their frequency. Vininn126 (talk) 15:21, 11 June 2022 (UTC)[reply]
Tbf looking at that discussion, it doesn't look like it was overwhelmingly rejected? There were many posts from both sides and a lot of editors trying to find a middle ground; it's just that the discussion fizzled out without finding a common ground, as Wiktionary discussions tend to do multiple times on end. AG202 (talk) 15:46, 11 June 2022 (UTC)[reply]
I'm a bit late to the party, but I also oppose this: If three different people have used it, it's a word, that's our attestation criterion. I wouldn't be opposed to giving a label like "internet slang" or "Usenet slang", but dismissing the source doesn't seem right. Thadh (talk) 08:30, 13 June 2022 (UTC)[reply]
I think it could be a good compromise to have a label and category for Usenet-exclusive terms. Binarystep (talk) 09:39, 13 June 2022 (UTC)[reply]
@Thadh Once again, this is not a Usenet-exclusive issue. I also don't really feel comfortable assigning "Usenet slang" to other terms, like fandom slang, that are only citable on Usenet, as that could make them have a negative label. I just want that offensive terms be cited in more than one website/source, no matter what it is. Also, CFI is not that loose, it's definitely not as simple as "if three different people have used it, it's a word", otherwise we wouldn't have clarifications about being durably archived, what counts as being independent, the spanning a year requirement, and more that occurs in the RFD/RFV discussions. Like I've mentioned to other folks, it seems like y'all have this image of Wiktionary that's a noble view of what the website should be, but that's not what it is in reality. We definitely have significant standards, and definitely not every word that has been used by three people is included on the website, otherwise we'd be inching much closer to what folks here complain about Urban Dictionary. AG202 (talk) 11:26, 13 June 2022 (UTC)[reply]
Durably archived is an issue for future reference. If Urban Dictionary were durably archived and if it were apparent that the people writing it actually used the words they're describing, and that those words are used by three people or more, yes, we would include it. Thadh (talk) 13:42, 13 June 2022 (UTC)[reply]
@AG202:
I also don't really feel comfortable assigning "Usenet slang" to other terms, like fandom slang, that are only citable on Usenet, as that could make them have a negative label.
How is it negative? We have categories for dialects of English, English used by non-native speakers, Polari slang, thieves' cant, and various other designations that indicate these terms are only used by specific groups of people. Listing a term as Usenet-exclusive isn't derogatory, it's a simple statement of fact.
I just want that offensive terms be cited in more than one website/source, no matter what it is.
Treating Usenet as a single source is factually inaccurate.
Also, CFI is not that loose, it's definitely not as simple as "if three different people have used it, it's a word", otherwise we wouldn't have clarifications about being durably archived, what counts as being independent, the spanning a year requirement, and more that occurs in the RFD/RFV discussions.
CFI doesn't exclude words for moral reasons, though. That's the primary difference between the current status quo and this proposal.
Like I've mentioned to other folks, it seems like y'all have this image of Wiktionary that's a noble view of what the website should be, but that's not what it is in reality.
Should we not try to make the site better?
We definitely have significant standards, and definitely not every word that has been used by three people is included on the website, otherwise we'd be inching much closer to what folks here complain about Urban Dictionary.
Most terms on Urban Dictionary haven't been used by one person, much less three. Binarystep (talk) 23:07, 13 June 2022 (UTC)[reply]
I feel like I'm just going in circles. "Listing a term as Usenet-exclusive isn't derogatory, it's a simple statement of fact." Imho labels like "4chan-slang" do not give a positive light to me and some other folks, and with the rate that Usenet is at rn for me, I'm almost starting to feel the same. "Treating Usenet as a single source is factually inaccurate." this is up to interpretation, just like Twitter & Reddit are treated as one source in many conversations here, Usenet can be as well, especially with its community around these terms (I will not elaborate further please). "Should we not try to make the site better?" I don't find including any nonce derogatory that happens to pop up on Usenet three times making the site better, and that's just a fundamental difference between the two of us. "Most terms on Urban Dictionary haven't been used by one person" tbf, someone making an entry and uploading it to Urban Dictionary with a usage example is one person using it ¯\_(ツ)_/¯ I'm sure if we dug enough into the depths of Twitter, MySpace, and other social media, a lot of the terms that folks mention here from Urban Dictionary would be citable, it's just not worth it to do so. AG202 (talk) 23:13, 13 June 2022 (UTC)[reply]
"Listing a term as Usenet-exclusive isn't derogatory, it's a simple statement of fact." Imho labels like "4chan-slang" do not give a positive light to me and some other folks, and with the rate that Usenet is at rn for me, I'm almost starting to feel the same.
Why is that, though? It feels like you're seeing connotations that aren't there. Saying that a word is only used within a particular community isn't an insult.
"Treating Usenet as a single source is factually inaccurate." this is up to interpretation, just like Twitter & Reddit are treated as one source in many conversations here, Usenet can be as well, especially with its community around these terms (I will not elaborate further please).
Do all Usenet posts have the same author?
"Should we not try to make the site better?" I don't find including any nonce derogatory that happens to pop up on Usenet three times making the site better, and that's just a fundamental difference between the two of us.
My idea of making the site better includes making it as accurate as possible.
"Most terms on Urban Dictionary haven't been used by one person" tbf, someone making an entry and uploading it to Urban Dictionary with a usage example is one person using it ¯\_(ツ)_/¯
That's a definition, but not a use. Per Appendix:English dictionary-only terms, that wouldn't justify its inclusion.
I'm sure if we dug enough into the depths of Twitter, MySpace, and other social media, a lot of the terms that folks mention here from Urban Dictionary would be citable, it's just not worth it to do so.
Those terms are mentioned, not used. Even if we allowed online citations, they'd be treated as dictionary-only terms. I guarantee no one here would be able to find three independent uses of a term like "San Fernando Roulette" on any website, for instance. At best, you might find some "Did you know UD has a page for this?" and "This means this according to UD" comments. Binarystep (talk) 23:20, 13 June 2022 (UTC)[reply]
The connotations are there for me and other folks, as 4chan does not have the best rep, whether or not you see them is another thing. Clearly I am not! saying that Usenet posts all have the same author! Holy hell, I really feel like I'm being interrogated here when we clearly just disagree and I'm not going to change your opinion. "That's a definition, but not a use. Per Appendix:English dictionary-only terms, that wouldn't justify its inclusion." wouldn't justify its inclusion yes, but would still be "one person" using it. And then, idk I've been surprised at RFV efforts here, maybe not that specific word but others. AG202 (talk) 23:25, 13 June 2022 (UTC)[reply]
The connotations are there for me and other folks, as 4chan does not have the best rep, whether or not you see them is another thing.
I don't think we should ignore the history or usage of a word just because mentioning it would reflect badly on the word's users.
Clearly I am not! saying that Usenet posts all have the same author!
Exactly my point. By treating all Usenet posts as the same source, you're holding it to a much higher standard than any other "durably archived" source we use.
"That's a definition, but not a use. Per Appendix:English dictionary-only terms, that wouldn't justify its inclusion." wouldn't justify its inclusion yes, but would still be "one person" using it.
It may be a use in the literal sense, but it wouldn't count as one for the purposes of RFV, which is what I meant.
And then, idk I've been surprised at RFV efforts here, maybe not that specific word but others.
I mean... if a word can be proven to exist, then it should be included. Binarystep (talk) 23:44, 13 June 2022 (UTC)[reply]
🫤 This change would apply to all sources, so all Tweets, Reddit posts, NYT articles (though they usually don't have offensive terms, but are also durably archived) would count as one source respectively as I've already said :-////, so it's not just Usenet :-////. My same feelings as my most recent message far below apply here as well. AG202 (talk) 23:47, 13 June 2022 (UTC)[reply]
🫤 This change would apply to all sources, so all Tweets, Reddit posts, NYT articles (though they usually don't have offensive terms, but are also durably archived) would count as one source respectively as I've already said :-////, so it's not just Usenet :-////
This proposal came about because of Usenet, so I'm mentioning it as an example. My point is that you're lumping potentially dozens of authors together and treating them as the same person, solely because they used the same method of communication. Binarystep (talk) 23:53, 13 June 2022 (UTC)[reply]
I am treating them as one source, as I have said. You don't agree with that, and that's fine. You are going to vote oppose, and that's fine. Unfortunately, this conversation has not been fruitful. I am not arguing this further. AG202 (talk) 23:56, 13 June 2022 (UTC)[reply]

I am opposed to selectively requiring more or higher-quality citations for any one category of terms. Such a framework is just as likely to be employed to gatekeep non-derogatory terms that are viewed as less inclusion-worthy by some (fandom slang, obscure regional slang, social-justice and LGBT terminology, etc.) as it is to keep out nonsense pulled from the bowels of Urban Dictionary. Many terms seldom make it into print because they are rarely used outside specific contexts or communities.

I do agree with the principle of minimizing extremist and fringe material. I would support disallowing certain sources to be used as citations unless they are quoted by acceptable secondary sources (e.g. a white supremacist website quoted in an academic text). I would support updating CFI to allow for quotes deemed particularly offensive or unhelpful to be limited to citations pages. But I oppose proposals to redraw the criteria for inclusion along subjective lines. The lexicographer's job is to document language as it is used. And it isn't always used to noble ends. We cannot pick-and-choose which words we document without undermining our mission. It would also create a huge slippery slope.

That said I do support enforcing a time limit on RfV nominations of offensive terms (or at least terms of abuse against race, gender, religion, sexuality, etc.) but more for the purpose of vandalism reduction. WordyAndNerdy (talk) 10:23, 11 June 2022 (UTC)[reply]

@Binarystep @WordyAndNerdy "Such a framework is just as likely to be employed to gatekeep non-derogatory terms that are viewed as less inclusion-worthy by some (fandom slang, obscure regional slang, social-justice and LGBT terminology, etc.)" Imho, this is a slippery slope. I've been an ardent supporter of keeping those terms in Wiktionary and would strongly oppose any attempt to limit them. That's not my goal here. "We cannot pick-and-choose which words we document without undermining our mission." I don't get this argument. We already pick and choose, otherwise we wouldn't have WT:ATTEST, WT:CFI, WT:RFD, or WT:RFV in the first place. Every dictionary except maybe Urban Dictionary has inclusion criteria. Words used significantly on Twitter are still struggling to be covered here. Like we are not a 100% every-word-must-be-included dictionary. We have criteria and we can update it as we see fit. Giving a space to nonce derogatory words used in the most horrid spaces does not have to be our job, and we can update our guidelines as we see fit. That being said, that's part of why I said it's not the number, but the place where the words are being used. If the words are used elsewhere, then that's fine, they can be included, but if they're only used in white supremacist spaces in Usenet, then there's no reason why we need to give them the space that they're currently given here. For all the talk about us not being Urban Dictionary, these terms often make us look worse than them... AG202 (talk) 14:09, 11 June 2022 (UTC)[reply]
"Such a framework is just as likely to be employed to gatekeep non-derogatory terms that are viewed as less inclusion-worthy by some (fandom slang, obscure regional slang, social-justice and LGBT terminology, etc.)" Imho, this is a slippery slope.
Is it? There's already been a recent attempt to ban certain fandom slang terms for being "too niche".
We already pick and choose, otherwise we wouldn't have WT:ATTEST, WT:CFI, WT:RFD, or WT:RFV in the first place.
The issue here is that we'd be making an exception to our existing rules just to ban a handful of terms for reasons that have nothing to do with attestability or SOP-ness.
Words used significantly on Twitter are still struggling to be covered here.
And that's a problem. The solution isn't to restrict our already limited coverage even further.
Like we are not a 100% every-word-must-be-included dictionary.
Again, that's not a good thing. The more gaps we have in our coverage, the less useful the site becomes.
We have criteria and we can update it as we see fit.
That's true, which makes me wonder why we can't update our criteria in a way that makes our coverage more accurate instead.
That being said, that's part of why I said it's not the number, but the place where the words are being used. If the words are used elsewhere, then that's fine, they can be included, but if they're only used in white supremacist spaces in Usenet, then there's no reason why we need to give them the space that they're currently given here.
As long as Usenet is considered a "durably archived" source, we shouldn't make value judgements about which Usenet-exclusive terms are worth mentioning.
For all the talk about us not being Urban Dictionary, these terms often make us look worse than them...
The problem with Urban Dictionary isn't that it allows stupid terms, it's that it allows (and primarily consists of) terms that have literally never been used by anyone. A recurring nonce word used exclusively by racists is still a real word, unlike, say, "cincinatti ferris wheel". Binarystep (talk) 15:51, 11 June 2022 (UTC)[reply]
@Binarystep
Is it? There's already been a recent attempt to ban certain fandom slang terms for being "too niche".
I specifically defended that term (I was literally the first person to vote keep on it, and it's one RFD that's already heavily leaning on keeping the word), my proposal does not center around them, and I don't want it derailed.
The issue here is that we'd be making an exception to our existing rules just to ban a handful of terms for reasons that have nothing to do with attestability or SOP-ness.
We already make a TON of exceptions based on that? We don't include every celestial body, we don't include every place name, we don't include every number, we don't include pleeeease, we don't include Charizard, we don't include sarcastic usages, we don't include some company names, we don't include all chemical formulae, and the list goes on and on. Those are all words too that we exclude. I don't see why we can't make yet another exception for nonce offensive terms, just so that we don't give space to literally any nonce offensive term that Usenet racists make up. They would literally only need to be cited on 1-2 more sources to be included. It's not a blanket ban on every offensive term either, most of the ones we already have would stay, it'd just combat the recent wave of random nonce horrific offensive terms out of the depths of Usenet, that we can't even be sure if they're even used anymore. Wiktionary should be more inclusive, yes, but this is not one of those ways. We should also be thinking about the everyday user and what platform we're giving to words that would've otherwise never seen the light of day. AG202 (talk) 16:38, 11 June 2022 (UTC)[reply]
Why do Wiktionarians now think though that they are righter about Usenet quoting and offensive terms than fifteen years ago? Fay Freak (talk) 21:29, 11 June 2022 (UTC)[reply]
Wiktionary:Policies and guidelines#How are policies decided? - this link should help. Theknightwho (talk) 22:19, 11 June 2022 (UTC)[reply]
Whereby? You need help. Fay Freak (talk) 22:51, 11 June 2022 (UTC)[reply]
@Fay Freak Perhaps sealion#Verb might be more enlightening. Theknightwho (talk) 03:12, 12 June 2022 (UTC)[reply]
I oppose treating offensive words differently in this regard. I don't really want entries for offensive words to contain more quotations and I think it's awfully arbitrary to exclude Usenet for this but not other things. Andrew Sheedy (talk) 21:17, 12 June 2022 (UTC)[reply]
@Andrew Sheedy, @Benwing2, @Binarystep, @WordyAndNerdy, @Sgconlaw To make it clear, I am not advocating for the exclusion of Usenet here. I would just prefer that offensive terms require more than one website to show usage. Otherwise we will have an infinite amount of derogatory nonce terms that really bring down the quality of the website and continue to give them a platform to spread out more. This is the sixth conversation, at the very least, about this issue, and I've listened and talked with so many people and changed my proposal and approach so many times, but nothing seems to be changing, which is really unfortunate. There was a conversation that I read from two years ago about the image that we want to give our users and fellow editors, and I think that it's something that really needs to be taken into consideration. We have so so so so many policies about which words can and cannot be included at WT:CFI, but when it comes to offensive nonce terms that were made in the pits of the most vile, white supremacist places, but did not make it out of them, we're all of a sudden hesitant to require that they be cited a bit more aggressively, and honestly it hasn't sent the best message. It's truly sad and disappointing to me that there's more energy and time and resources being spent on preserving and debating words like Apefrican and Darky Cuntinent than getting words from actual African languages on here. Our coverage on them is so paltry, though I've been able to get more Yorùbá editors on here and increase coverage significantly, and I wish that instead of lengthy RFD, RFV, and Beer Parlour discussions on preserving words that were only used a few times in the most racist spaces, we could actually spend time on preserving some of our most impacted and endangered languages, which is why I joined this community in the first place. However, the longer I've been here, the less welcome I've felt. AG202 (talk) 21:44, 12 June 2022 (UTC)[reply]
Thanks for the clarification. I had indeed misunderstood you. I do agree. The last thing we want is our documentation of the language to be the cause of obscure racist slurs becoming mainstream. I was concerned that what we allowed or didn't would start to become somewhat arbitrary, but I think what you're describing would prevent that from being much of an issue. Andrew Sheedy (talk) 21:50, 12 June 2022 (UTC)[reply]
@AG202 My suggestion to exclude Usenet for derogatory terms was just one way of trying to cut down on the crap. I'm fine with a more general requirement that at least two different sources be provided. Benwing2 (talk) 23:05, 12 June 2022 (UTC)[reply]
@Sgconlaw I think we should move to a vote fairly soon; given the viewpoints expressed here, I think we will be able to get one that passes with a 2/3 majority. Only a small minority seem categorically opposed to such a thing. Benwing2 (talk) 23:05, 12 June 2022 (UTC)[reply]
To make it clear, I am not advocating for the exclusion of Usenet here. I would just prefer that offensive terms require more than one website to show usage.
You're still proposing that we hold offensive terms to a different standard. Raising the bar to exclude certain words is something I'll never agree with. For comparison, imagine if we added an addendum to WT:FICTION saying we only accepted terms from well-known fictional works, even if a more obscure term didn't violate policy whatsoever.
Otherwise we will have an infinite amount of derogatory nonce terms that really bring down the quality of the website and continue to give them a platform to spread out more.
It's not Wiktionary's job to prevent the spread of offensive terms, and racists will continue to be racist regardless of whether we document their slurs, something which I can personally attest to. Incidentally, every racist term I've been called would still be allowed on the site after this, given that they didn't originate from Usenet.
As for bringing down the quality of the site, I'd argue that refusing to document attestable terms simply because we don't like them (yes, they're objectively vile terms, but that's not a good reason to pretend they don't exist) does that far more than having pages for terms that no one's obligated to read.
We have so so so so many policies about which words can and cannot be included at WT:CFI, but when it comes to offensive nonce terms that were made in the pits of the most vile, white supremacist places, but did not make it out of them, we're all of a sudden hesitant to require that they be cited a bit more aggressively, and honestly it hasn't sent the best message.
We don't have any policies that justify excluding words for moral reasons. Aside from that, our policies are already overly restrictive, and have held us back as a result. Making them even more limiting is a step backwards.
It's truly sad and disappointing to me that there's more energy and time and resources being spent on preserving and debating words like Apefrican and Darky Cuntinent than getting words from actual African languages on here.
Our coverage isn't a zero-sum game. We could delete everything in Category:English ethnic slurs if you wanted, that wouldn't automatically lead to better documentation of LDLs. This argument is ultimately a non sequitur.
Our coverage on them is so paltry, though I've been able to get more Yorùbá editors on here and increase coverage significantly, and I wish that instead of lengthy RFD, RFV, and Beer Parlour discussions on preserving words that were only used a few times in the most racist spaces, we could actually spend time on preserving some of our most impacted and endangered languages, which is why I joined this community in the first place.
These discussions wouldn't be happening if some users weren't more focused on trying to reduce our coverage than expand it. I don't appreciate how you blame your opposition for something they didn't start in the first place. How many Yoruba terms could've been added in the time it took to come up with this proposal? Binarystep (talk) 23:36, 12 June 2022 (UTC)[reply]
I never said I wasn’t going to hold offensive terms to a different standard, that’s been the main point of my proposal. This discussion spawned because certain IPs were spamming offensive nonce terms which happened to be citable on Usenet. I never said I wanted to delete all of the ethnic slurs in the ethnic slur category, this is mainly to limit the creation of random derogatory nonce terms that have never been used elsewhere. I’ve literally said that I would just prefer that terms be cited on more than one website. That’s very far from saying that they should all be deleted, and I don’t appreciate that assumption, nor do I appreciate calling my experiences here a non sequitur as it’s what me and other users working on underrepresented languages have felt. It’s also not “refusing to document because we don’t like them”, otherwise I’d once again advocate for their full deletion, which I am not. I’ve also been one of the targets of one of the biggest slurs of all mankind that didn’t originate on Usenet, yet I’m not calling for its deletion because it’s clearly and evidently cited. And then finally, the part about “how many Yorùbá terms could’ve been added in the time it took to come up with this proposal?” frankly feels insulting and not an argument in good-faith, as I have put an immense amount of effort into Yorùbá coverage here and we’ve increased the lemmas almost tenfold since starting, and I'm the one that spent weeks of my time creating modules and templates, let alone the work I’ve done with Jeju as well, so please don’t use that argument with me again or I will not engage with you further. I can use my time to call out the project for not giving as much support as it could be for underrepresented languages like these with the intent that it’ll make things easier in the long-run. AG202 (talk) 01:01, 13 June 2022 (UTC)[reply]
I never said I wasn’t going to hold offensive terms to a different standard, that’s been the main point of my proposal.
And that's what I fundamentally disagree with. It also implies that a word's inclusion in Wiktionary is synonymous with its endorsement, which is problematic.
This discussion spawned because certain IPs were spamming offensive nonce terms which happened to be citable on Usenet.
There are other solutions, though. For one, we could ban IPs and new users from making pages for offensive terms.
I never said I wanted to delete all of the ethnic slurs in the ethnic slur category, this is mainly to limit the creation of random derogatory nonce terms that have never been used elsewhere.
I didn't say you did? If that's how that came off, I'm sorry about that. My intent was only to say that even the strongest possible approach to reducing coverage of offensive terms wouldn't automatically have a positive effect on other gaps in our coverage.
And then finally, the part about “how many Yorùbá terms could’ve been added in the time it took to come up with this proposal?” frankly feels insulting and not an argument in good-faith, as I have put an immense amount of effort into Yorùbá coverage here and we’ve increased the lemmas almost tenfold since starting, and I'm the one that spent weeks of my time creating modules and templates, let alone the work I’ve done with Jeju as well, so please don’t use that argument with me again or I will not engage with you further.
My intent wasn't to imply that you're not putting effort into your contributions, but rather that this specific proposal doesn't solve the other issue you mentioned. My point was that these two situations have nothing to do with each other. Binarystep (talk) 01:13, 13 June 2022 (UTC)[reply]
And I’m fine with you disagreeing with it, that’s why this discussion is happening in the first place. I’ve tried hard to find a middle ground across the multiple discussions had and this seems to be it as I can’t appeal to everyone unfortunately. I’m sure more folks would object to banning IPs & new users from making offensive terms (also I feel like that’d be less implementable; @Benwing2 can correct me on that). The two issues may not directly impact each other, but they do leave an image about the community. It’s hard for me to convince editors to come and help, as mentioned, I myself feel less welcome when we’re comfortable with giving such a space to those terms that are barely citable, which leads to less enthusiasm on my part and others’ parts to contribute to this project. AG202 (talk) 01:28, 13 June 2022 (UTC)[reply]
I’m sure more folks would object to banning IPs & new users from making offensive terms (also I feel like that’d be less implementable; @Benwing2 can correct me on that)
Why? IPs can't edit the pages for most offensive terms, why should they be allowed to create new ones?
The two issues may not directly impact each other, but they do leave an image about the community.
It shouldn't. Wiktionary is a dictionary, and the inclusion of a word isn't the same thing as endorsement.
It’s hard for me to convince editors to come and help, as mentioned, I myself feel less welcome when we’re comfortable with giving such a space to those terms that are barely citable, which leads to less enthusiasm on my part and others’ parts to contribute to this project.
This is what I'm struggling to understand. From my perspective, Wiktionary choosing to define a word is only saying "this exists and someone said it". It doesn't mean the site approves of the word, its users, or their ideology. It makes more sense to me to go after racist users (hence not allowing IPs to troll the site with new slurs) than racist words. Binarystep (talk) 01:43, 13 June 2022 (UTC)[reply]
If Wiktionary were an “any word can go” website, then I wouldn’t be having this discussion, but we have tons upon tons of standards. We choose to preserve some words but then choose to delete others. I’m having another discussion about why United Nations should be kept, I had to fight tooth and nail to find appropriate cites for Mickey Mouse ring, internalized homophobia got deleted for being SOP (though I still think it’s needed but alas), but vile words that don’t have any coverage at all past one website get to stay? Wiktionary, whether we like it or not, implicitly approves certain words and phrases, and we don’t have to approve those ones automatically. Also as a side note, Wiktionary has led to the approval of terms for groups, Petersonian being sent to RFV started an issue in the related subreddit and they took it as a victory when it was kept. There’ve been callouts about Wiktionary’s reconstructions and coverage in different linguistic forums. There’ve been questions about who actually makes up Wiktionary’s editors. I’ve seen my own entry at yassification be used as justification for the word “existing” and being cited in multiple tweets. As with any dictionary (see: the RAE and elle, the outrage against Le Petit Robert and its inclusion of iel, and Merriam-Webster’s inclusion of the singular they), we do have an impact, and as such, I think that we could be a bit more strict with how we include those words. AG202 (talk) 01:56, 13 June 2022 (UTC)[reply]
If Wiktionary were an “any word can go” website, then I wouldn’t be having this discussion, but we have tons upon tons of standards.
We have consistent standards. Problems arise when you start banning words on an individual basis.
(I'd also argue that Wiktionary should be an "any word can go" website, given that our collaborative format would make it trivially easy for us to become the most accurate dictionary in existence.)
We choose to preserve some words but then choose to delete others.
Assuming they're valid words, that's not a good thing.
I’m having another discussion about why United Nations should be kept, I had to fight tooth and nail to find appropriate cites for Mickey Mouse ring, internalized homophobia got deleted for being SOP (though I still think it’s needed but alas), but vile words that don’t have any coverage at all past one website get to stay?
Again, none of those should be deleted. I'm well aware that we have numerous problems with our coverage, and a long history of deleting perfectly valid terms due to some problem with our CFI. The solution is to put an end to our excessive deletionism, not make it worse.
Also as a side note, Wiktionary has led to the approval of terms for groups, Petersonian being sent to RFV started an issue in the related subreddit and they took it as a victory when it was kept.
So? I don't see why that's our problem. I'm sure our coverage of ((( ))), Holohoax, and bix nood has led to some neo-Nazis feeling proud of themselves, but that doesn't mean we should pretend those terms don't exist just to stick it to them.
Honestly, Petersonian really doesn't feel like the best example, given that it's a completely neutral term whose inclusion doesn't communicate anything beyond "people talk about Jordan Peterson". His fans may as well celebrate the fact that he has a Wikipedia page.
There’ve been callouts about Wiktionary’s reconstructions and coverage in different linguistic forums.
There'll be criticism no matter what we do. I've seen some people say we have too much fandom slang while others say we have too little.
There’ve been questions about who actually makes up Wiktionary’s editors.
What do you mean by that?
I’ve seen my own entry at yassification be used as justification for the word “existing” and being cited in multiple tweets.
I mean, yassification is a word that exists. Whether that's a good thing isn't for us to decide.
As with any dictionary (see: the RAE and elle, the outrage against Le Petit Robert and its inclusion of iel, and Merriam-Webster’s inclusion of the singular they), we do have an impact, and as such, I think that we could be a bit more strict with how we include those words.
I doubt that our impact is that big in this case. Removing obscure slurs from Wiktionary isn't going to make anyone less racist. No one's getting "redpilled" by the dictionary. People will continue to be shitty regardless of whether we document examples of their shittiness. Binarystep (talk) 02:26, 13 June 2022 (UTC)[reply]
I could apply that last argument to a LOT of different issues in society today, but that’d get far too off-topic. Just because some folks will continue to be trash, doesn’t mean we should continue to cover these words without a more strict guideline. And while I agree that those words shouldn’t have been deleted, they were and so, I’m trying to build off of what Wiktionary currently has as its policies unless a very major change occurs. The consensus didn’t agree with me, so that’s what I build off of, hence why I’ve altered this proposal a ton. Also our policies are definitely not consistent. I’ve been confused on multiple occasions about which words fall under which policies or how to go about entries or what counts as “durably archived” for example. We definitely already ban certain terms on an individual basis, otherwise there wouldn’t be multiple sections at WT:CFI or WT:RFD. It feels like there’s a Wiktionary that you’re wanting that’s different from what’s actually going on. I want more coverage as well (minus these terms), but alas, I know that a policy that removes CFI, for example, would not be popular and would fail spectacularly. I’m trying to focus on what’s practical and what could maybe pass after talking about it with folks. Re: the demographics part, there’ve been questions about Wiktionary’s demographics and why we cover certain terms and languages and why we don’t cover others. Re: impact, our impact isn’t as big, but it’s definitely there, so we should be striving for quality and think a bit more about what we put out there. AG202 (talk) 02:50, 13 June 2022 (UTC)[reply]
Just because some folks will continue to be trash, doesn’t mean we should continue to cover these words without a more strict guideline.
No, the fact that these words exist means we should continue to cover them. The fact that racists exist isn't a reason to delete valid entries.
And while I agree that those words shouldn’t have been deleted, they were and so, I’m trying to build off of what Wiktionary currently has as its policies unless a very major change occurs.
How does it benefit anyone to make Wiktionary even more deletionist than it already is? The fact that our policies are flawed doesn't justify making them worse. Unless something changes, the best thing we can do is protect our existing coverage.
Also our policies are definitely not consistent. I’ve been confused on multiple occasions about which words fall under which policies or how to go about entries or what counts as “durably archived” for example.
Can you elaborate?
We definitely already ban certain terms on an individual basis, otherwise there wouldn’t be multiple sections at WT:CFI or WT:RFD.
We usually don't ban CFI-compliant words because we dislike them, which is what this proposal ultimately boils down to. Off the top of my head, I can only think of two comparable cases from RFD: the proposal to delete Kent State Gun Girl for being "non-notable" (which sadly passed despite having zero basis in policy), and the proposal to delete everypony for being "too niche".
I can understand deleting entries for being sum-of-parts, names of individuals, non-lexicalized trademarks, or terms coined in fiction that haven't entered general use. What I don't agree with is deleting terms that'd otherwise be kept, solely because they're offensive. All else being equal, the offensiveness of a term should not be the reason for its removal.
It feels like there’s a Wiktionary that you’re wanting that’s different from what’s actually going on.
Well, yeah. Is that not the case for you as well? Both of us want to see the site change in some way or another.
I want more coverage as well (minus these terms), but alas, I know that a policy that removes CFI, for example, would not be popular and would fail spectacularly.
How is less coverage the solution? Sure, outright abolishing CFI isn't feasible, but gradually improving it definitely is. Consider the recent CFI change allowing online citations on a case-by-case basis, which came after years of failed proposals to do pretty much the same thing. Consensus changes over time, and as traditional media becomes less relevant, our policies will likely change to reflect that. On the other hand, coming up with more reasons to delete valid entries will only lead to us becoming less accurate.
Re: impact, our impact isn’t as big, but it’s definitely there, so we should be striving for quality and think a bit more about what we put out there.
We should be striving for accuracy and completeness, or, in other words, "all words in all languages". Our format gives us the potential to become a far better resource than our stricter counterparts, and creating a more restrictive CFI would only accomplish the exact opposite.
Whatever our impact is, I can't see how it matters here. No one became racist because they read a slur in the dictionary. We're not making the world a worse place by describing the bad things that already exist. Additionally, as I said before, our decision to document a word isn't the same thing as us advocating for its usage. Our job is to describe what exists, not to decide what should exist. Binarystep (talk) 09:37, 13 June 2022 (UTC)[reply]
"The fact that racists exist isn't a reason to delete valid entries." I've addressed this and how I'm explicitly not calling for mass deletions. "Our job is to describe what exists, not to decide what should exist." this is not what happens with Wiktionary in reality though. I've definitely seen many many "CFI-compliant" words be deleted because Wiktionarians do not like them. I generally am considered an "inclusionist" by some other editors here, but even then, I don't feel like these terms are really needed without any major citations. I don't want them bulk-deleted, as I've mentioned, I want them to be cited on more than one website. "Additionally, as I said before, our decision to document a word isn't the same thing as us advocating for its usage." once again, this may be our explicit job, but it's not what happens implicitly. "Can you elaborate?" I've had multiple arguments about what is encyclopediac at RFD, what counts as durably archived (I was told that I would have to see if every newspaper at Mickey Mouse ring at the time had a print version before it could pass RFV), RFD closing guidelines, what should be the entry line for some languages, why we decide to strip diacritics from Yorùbá but not from Vietnamese, and more. Our policies are the opposite of consistent. I've ironically used this argument before "all words in all languages", at Wiktionary:Votes/2020-07/Removing_letter_entries_except_Translingual which was a much more sweeping proposal than this one, but the more I've been on this website, the more I realize that that's not the case. And so, I'm personally fine with requiring more than one website for offensive terms, though I'm very aware that you're not, which is also fine. AG202 (talk) 11:39, 13 June 2022 (UTC)[reply]
"The fact that racists exist isn't a reason to delete valid entries." I've addressed this and how I'm explicitly not calling for mass deletions.
Are you or are you not calling for the mass deletion of offensive terms that can only be cited on Usenet?
"Our job is to describe what exists, not to decide what should exist." this is not what happens with Wiktionary in reality though. I've definitely seen many many "CFI-compliant" words be deleted because Wiktionarians do not like them.
And that's a problem. You're acknowledging what's wrong with our current system, but your response is to accept it as unfixable.
I don't want them bulk-deleted, as I've mentioned, I want them to be cited on more than one website.
Aside from the inaccuracy of referring to Usenet as "one website", what happens if a word can't be cited elsewhere? Not all terms entered widespread usage.
"Additionally, as I said before, our decision to document a word isn't the same thing as us advocating for its usage." once again, this may be our explicit job, but it's not what happens implicitly.
How doesn't it? Because some neo-Nazis might think us having a page for their favorite slur validates their beliefs in some way? We can't control what other people think. Regardless of some people's opinions, our explicit purpose is to document words, not support them. If some people refuse to accept that, that's not our fault.
"Can you elaborate?" I've had multiple arguments about what is encyclopediac at RFD, what counts as durably archived (I was told that I would have to see if every newspaper at Mickey Mouse ring at the time had a print version before it could pass RFV), RFD closing guidelines, what should be the entry line for some languages, why we decide to strip diacritics from Yorùbá but not from Vietnamese, and more. Our policies are the opposite of consistent.
Everything you've described is a major problem, but, again, the solution isn't to move even further in that direction.
I've ironically used this argument before "all words in all languages", at Wiktionary:Votes/2020-07/Removing_letter_entries_except_Translingual which was a much more sweeping proposal than this one, but the more I've been on this website, the more I realize that that's not the case.
I agree that this site has the unfortunate tendency to contradict its mission statement, but that isn't a good reason to continue the trend. Binarystep (talk) 22:59, 13 June 2022 (UTC)[reply]
@Binarystep "Are you or are you not calling for the mass deletion of offensive terms that can only be cited on Usenet?" As I've said, if they are citable they are fine, this doesn't even affect that many words, mainly the ones that have been spammed. If they're not citable, then they'd be subject to RFV like other words are. "Regardless of some people's opinions, our explicit purpose is to document words, not support them. If some people refuse to accept that, that's not our fault." I don't think I will change your mind on this, as it's a very fundamental difference in our experiences, so I will leave it at that. "but your response is to accept it as unfixable." I don't know how long you've been here, but I have 100% tried to fix those issues as I've shown. I don't need to prove that to you further. AG202 (talk) 23:03, 13 June 2022 (UTC)[reply]
"Are you or are you not calling for the mass deletion of offensive tems that can only be cited on Usenet?" As I've said, if they are citable they are fine, this doesn't even affect that many words, mainly the ones that have been spammed. If they're not citable, then they'd be subject to RFV like other words are.
You didn't answer my question. If a derogatory term is only citable on Usenet, it would be deleted, correct?
"Regardless of some people's opinions, our explicit purpose is to document words, not support them. If some people refuse to accept that, that's not our fault." I don't think I will change your mind on this, as it's a very fundamental difference in our experiences, so I will leave it at that.
I don't intend to sound rude, but unless I'm mistaken, your experiences only prove that some people think Wiktionary endorses every word it documents. I'm not denying that some people may believe that, but that doesn't make them correct.
"but your response is to accept it as unfixable." I don't know how long you've been here, but I have 100% tried to fix those issues as I've shown. I don't need to prove that to you further.
I know you've tried to fix those issues in the past, which makes me wonder why you're bringing them up now as proof that the status quo is unchangeable. Binarystep (talk) 23:12, 13 June 2022 (UTC)[reply]
Yes, if they fail RFV they'd be deleted, but that does not mean that they're all going to be deleted with a snap of a finger, which is what "mass deletion" sounds like to me, that many many entries would be deleted, like the letter vote would've implied. "I don't intend to sound rude, but unless I'm mistaken, your experiences only prove that some people think Wiktionary endorses every word it documents." This is exactly what I was talking about with implicit impact, Wiktionary may not explicitly endorse certain terms, but having them here gives them power and people thinking that Wiktionary endorses them can create issues. If you don't agree, then fine, but I don't want to go back and forth on that point anymore either. I mainly brought up the status quo now as a rationale for being more stringent with these nonce offensive terms. If we're already more stringent on a lot of valid terms, then I don't know why we can't be even a bit strict with offensive nonce terms. That's another thing on which we fundamentally disagree, so that's that. AG202 (talk) 23:18, 13 June 2022 (UTC)[reply]
Yes, if they fail RFV they'd be deleted, but that does not mean that they're all going to be deleted with a snap of a finger, which is what "mass deletion" sounds like to me, that many many entries would be deleted, like the letter vote would've implied.
Then my statement is accurate. All offensive terms that can't be cited outside of Usenet would be deleted, which is the definition of mass deletion. The phrase doesn't imply a lack of due process.
"I don't intend to sound rude, but unless I'm mistaken, your experiences only prove that some people think Wiktionary endorses every word it documents." This is exactly what I was talking about with implicit impact, Wiktionary may not explicitly endorse certain terms, but having them here gives them power and people thinking that Wiktionary endorses them can create issues.
It's not Wiktionary's fault or responsibility what people think. I'm sure neo-Nazis feel proud of themselves because of our decision to include terms like ((( ))), 1488, bix nood, chimp out, Holohoax, Holocaustianity, electric Jew, and countless other vile epithets, yet I don't think you'd support deleting them.
I mainly brought up the status quo now as a rationale for being more stringent with these nonce offensive terms. If we're already more stringent on a lot of valid terms, then I don't know why we can't be even a bit strict with offensive nonce terms.
You want to hold offensive terms to a higher standard than everything else, solely because they're offensive. Wiktionary is a lot of things, both good and bad, but it's not censored. Binarystep (talk) 23:27, 13 June 2022 (UTC)[reply]
I never understood the "Wiktionary is not censored" portion, when we have very clear guidelines it is. Your other points have been addressed already in our many exchanges, so I won't rehash them. AG202 (talk) 23:29, 13 June 2022 (UTC)[reply]
I never understood the "Wiktionary is not censored" portion, when we have very clear guidelines it is.
There's a difference between removing a word for being SOP and removing a word for being objectionable. That's not to say that the former is always a good thing, but they're not the same situation.
Your other points have been addressed already in our many exchanges, so I won't rehash them.
They really haven't been addressed, though. Why should Wiktionary delete valid entries simply because of how some people feel about them? Why does that only apply to obscure slurs, but not more popular ones like yard ape? The same people feel validated in both scenarios, yet only one is worth censoring for the common good. Why is that? Binarystep (talk) 23:48, 13 June 2022 (UTC)[reply]
Me and other folks have stated how those nonce offensive terms bring down the quality of the website to us. They don't bring it down for you, and that's fine, we disagree. I've already addressed the other points to my satisfaction multiple times, and unfortunately, I don't think I'll ever be able to explain myself to your own satisfaction, so here as well, I will end my portion here. AG202 (talk) 23:58, 13 June 2022 (UTC)[reply]
Me and other folks have stated how those nonce offensive terms bring down the quality of the website to us. They don't bring it down for you, and that's fine, we disagree.
These terms bring down the quality of the English language. Unfortunately, they're still part of it, and I don't see why we should lie to our readers by pretending otherwise. Binarystep (talk) 00:02, 14 June 2022 (UTC)[reply]

──────────────────────────────────────────────────────────────────────────────────────────────────── From the discussion thus far, it looks like there isn’t a consensus for excluding a specific source like Usenet. Since that raises separate issues, I think the use or otherwise of Usenet as a source shouldn’t be made part of the present proposal. — Sgconlaw (talk) 03:06, 13 June 2022 (UTC)[reply]

@Sgconlaw I’m fine with separating the proposals, but once again, please please please, I’ve said a few times now that I’m not proposing the exclusion of Usenet. I am proposing increasing the number of sources required for offensive terms by 1~2. I don’t want folks once again getting confused based on a proposal that’s not being presented. Also, it does seem like there’s a general consensus on that proposal, minus one or two folks (whose opinions are also valid). AG202 (talk) 03:57, 13 June 2022 (UTC)[reply]
@AG202: thanks. I will figure out how to create a vote soon. I could add an option for requiring that for derogatory entries, the quotations must originate from more than one source (not just Usenet alone), and we could see if there is support for that. Personally, if what is proposed is a greater diversity of sources rather than questioning Usenet itself as a source (which requires a separate discussion) I have no problem with that. — Sgconlaw (talk) 05:24, 13 June 2022 (UTC)[reply]
@Sgconlaw Yes that'd be fine with me. I'd remove language about Usenet, maybe something like "Offensive terms must be cited in more than one source (website, book, television show, etc.) to be included on Wiktionary" with a caveat for LDLs obviously. AG202 (talk) 11:41, 13 June 2022 (UTC)[reply]
@AG202: to avoid doubt, I think it might be necessary to expressly mention that Usenet as a whole is considered as a single source, rather than each conversation therein, or each post with a distinct date, being regarded as a separate source. — Sgconlaw (talk) 11:46, 13 June 2022 (UTC)[reply]
@Sgconlaw That would be fine, maybe we could add more examples like for example "Twitter, Reddit, 4chan, Usenet all count as one source individually" just so that there's less focus on Usenet. AG202 (talk) 15:35, 13 June 2022 (UTC)[reply]
@AG202: sure, though don't we not use some of these websites like Reddit and Twitter because they aren't durably archived? We should only mention durably archived sites. — Sgconlaw (talk) 15:37, 13 June 2022 (UTC)[reply]
@Sgconlaw I do feel that Reddit & Twitter are on their way towards being included especially with the current RFVs shenanigans going on + the passed vote on this issue, so I think that it'd be good to include them now rather than later. AG202 (talk) 15:47, 13 June 2022 (UTC)[reply]
@AG202: OK, then. — Sgconlaw (talk) 16:15, 13 June 2022 (UTC)[reply]
The fact that this proposal gives more weight to a word used in 3 books than a word used in 50 Usenet posts is inherently problematic. Usenet is a medium, not a single source. The very basis of this proposal is an inaccurate oversimplification. Binarystep (talk) 23:01, 13 June 2022 (UTC)[reply]
If you don't see the difference between 3 books and 50 Usenet posts when it comes to the propagation of offensive terms, then I just think that's a different perspective and experience between the two of us, and there's unfortunately little point to try and go back and forth about this anymore. AG202 (talk) 23:04, 13 June 2022 (UTC)[reply]
What's the difference, then? How does the fact that a word made it into print make it more of a word, even if it's only been used three times in documented human history? Usage defines the validity of a word, not the medium it's used in. Binarystep (talk) 23:13, 13 June 2022 (UTC)[reply]
If the medium does not matter then we wouldn't have the durably archived clause. But anyways, yeah, a book feels more established when it comes to words being used, vs a racist community where they're literally just attaching any negative word to the name of an African country, using it three (or fifty, doesn't matter) times, and then bam we got an entry in Wiktionary. As the target of a lot of these new terms, it just doesn't feel right. AG202 (talk) 23:27, 13 June 2022 (UTC)[reply]
If the medium does not matter then we wouldn't have the durably archived clause.
I mean, the "durably archived" rule is keeping us stuck in the past and should probably be abolished, but that's another story.
But anyways, yeah, a book feels more established when it comes to words being used, vs a racist community where they're literally just attaching any negative word to the name of an African country, using it three (or fifty, doesn't matter) times, and then bam we got an entry in Wiktionary.
This isn't the Middle Ages. It's trivially easy for anyone with enough money to get a book published, the barrier of entry isn't nearly as high as people tend to assume. Aside from that, Wiktionary documents terms based on their usage (within "durably archived" sources), not based on whether they were used by a more elite class of people. Usenet is considered a "durably archived" source, exceptions shouldn't be made to get rid of terms we don't like.
As the target of a lot of these new terms, it just doesn't feel right.
Does a term become less offensive because its user had more resources available to them? How does it make a difference to either of us whether a term like dindu nuffin was used in a book or a neo-Nazi website? It's still the same word and it carries the same meaning. Ultimately, Wiktionary pretending a handful of slurs don't exist isn't going to make anyone less racist. Binarystep (talk) 23:38, 13 June 2022 (UTC)[reply]
🫤 and we're back several exchanges once again. I'm not going to change your mind. This issue, as I've mentioned is very near and dear to my heart and as such it's already very taxing to participate in (especially seeing how internet users continue to show their hate for Black people in very very novel ways). And now, my most important and heartfelt message has been utterly drowned out by back and forth exchanges that have led neither of us to shift at all. And so, I will bow out here and see how the vote goes in the end. AG202 (talk) 23:44, 13 June 2022 (UTC)[reply]
🫤 and we're back several exchanges once again. I'm not going to change your mind. This issue, as I've mentioned is very near and dear to my heart and as such it's already very taxing to participate in (especially seeing how internet users continue to show their hate for Black people in very very novel ways).
Censoring Wiktionary isn't going to end racism, and you're blaming it for something it has no involvement in. Twitter user "RaceRealist88" isn't going to plunge the depths of Wiktionary for a slur that was used between 1993 and 2005 on alt.fan.adolf-hitler, he's just going to say the N-word and call it a day. I feel pretty confident in saying that, given my experience dealing with that exact word. Binarystep (talk) 23:58, 13 June 2022 (UTC)[reply]
"and you're blaming it for something it has no involvement in." It has in my own and others' experiences :-/ which is what I've been trying to get at. I'm not going to get into detail about them here, but I have felt its direct impact, so I wish you'd at least respect that even if you disagree with the proposal. AG202 (talk) 00:02, 14 June 2022 (UTC)[reply]

Without looking thoroughly into things, the idea of "Disallowance of certain sources?" sounds very dangerous on its face. Descriptivism means the whole language and everywhere it is used. That doesn't mean "no standards", but it does mean that all sources must be allowed- including Chinese Communist Party mouthpieces, Putin/KGB/FSB, CIA, religious fundamentalists, cults, deviants, and everybody else. --Geographyinitiative (talk) 23:13, 13 June 2022 (UTC)[reply]

@Geographyinitiative For the umpteenth time, please, no source is being disallowed 🙏. I don't know how this conclusion keeps happening, but that's not the goal here. AG202 (talk) 23:19, 13 June 2022 (UTC)[reply]
AG202, I apologize if I have misunderstood anything. But the words "Disallowance of certain sources?" appear above. It causes my spider sense to tingle. I want the Iranian Revolutionary Guard, Juche, Turkmenbashi, the Green Book, whatever else to be allowable if relevant. If these get too extreme or "fringe" then, I would confine them to the Citations page. But use of the word "disallowance" was a mistake if you all mean that "please, no source is being disallowed 🙏". God bless. --Geographyinitiative (talk) 23:57, 13 June 2022 (UTC)[reply]
@Geographyinitiative I was not the one that came up with that header unfortunately, and it was an unfortunate misunderstanding of my original point. I never called for the disallowance of certain sources in this conversation as I had proposed it months back, and it got a rightful pushback. AG202 (talk) 00:00, 14 June 2022 (UTC)[reply]

──────────────────────────────────────────────────────────────────────────────────────────────────── I have now created a vote at "Wiktionary:Votes/pl-2022-06/Attestation criteria for derogatory terms". — Sgconlaw (talk) 22:06, 13 June 2022 (UTC)[reply]

(Replying to the above discussion) A user has tried to add Daily Stormer quotes to random entries in the past. Without firm, clear rules disallowing links to fringe and extremist sites, this is an issue that will metastasize throughout the wiki. We can implement editorial standards like every other dictionary that aims to balance accuracy, reliability, and accessibility, or we can be a digital bathroom stall for provocateurs to graffiti. "Wiktionary is not censored" is not carte blanche for anything to be included in any entry. The entry for head does not include an image of a severed head. The entry for brown doesn't include an image of feces. We make discretionary choices about what to include in entries all the time. That said, I don't believe we should disallow offensive quotations, only that we shouldn't lend visibility to fringe websites by linking them. Usenet or print books don't raise the same misgivings for me. I'm not opposed to the idea of limiting certain quotations to citations pages. But for me, at least, there can be a degree of academic distance in quoting a book or historical Usenet thread, but not in linking to a live website that exists solely to propagate fringe ideas or theories. WordyAndNerdy (talk) 09:32, 15 June 2022 (UTC)[reply]
@WordyAndNerdy A LOT of the offensive Usenet quotes have been from the past 5 years though (see: Apefrican for an example); I don't feel like those are really historical anymore. And if they are still live (@Equinox can correct me on that), then the same issue arises as linking to fringe websites. We link to the specific Google group with the Usenet group name, so imho this issue still applies to it. AG202 (talk) 16:20, 19 June 2022 (UTC)[reply]

Mozarabic: what to do when the sources disagree?

Mozarabic is a long-extinct Romance language attested only in a few dozen compositions, where it is written in a rather haphazard way in Arabic or Hebrew script. That, in addition to centuries of copying errors by non-Mozarabophones, makes interpretation of any given text tricky, hence different scholars often arrive at different results.

As an example, we will consider a quote from kharja A1, which is cited on our entry for Mozarabic دلج (dalji) and which is, in fact, the only thing supporting the existence of that entry.

Jones (1988: 33; cited on that page) transcribes the relevant quote as ⟨yā ?nwāmni? dalji⟩, without giving a translation, and with nwāmni indicated as an uncertain reading.

Corriente (1993: 27–28; also cited on that page) transcribes it as ⟨yā nwāmin dalji⟩, which he translates as 'sweet name'. He takes this to phonetically represent a Mozarabic ya NWÉMNE DÓLČE, where the lowercase word is of Arabic origin, and the uppercase words of Latin origin, per Corriente's system. Judging by that and by his translation, we are dealing, etymologically speaking, with Arabic يا (yā) + Latin nōmine and dulcem.

The problem is that, a decade and a half later, Corriente apparently changed his mind. He decided (2009: 120; see here) that the phrase really says ya ndá min tháljE, which he translates as 'oh you who are fresher than the snow', indicating all of the words as being Arabic in origin (and apparently claiming that Romance contributed a single vowel).

Needless to say, that completely undermines our entry for دلج (dalji), as well as the one for نوامن (nwāmni), which depends on the same quote. Incidentally, I have another reason to doubt that a supposed nwāmni (interpreted by Corriente as phonetically representing nwémne) could really derive from Latin nōmine: it shows diphthongization, as if derived from *nŏmine, with a short tonic vowel.

In any case, the larger question here is: what should we do when the aforementioned sources disagree? Perhaps we should only rely on phrases that all three of them agree on (of which there is, fortunately, a decent number).

It would also be helpful to find an additional modern source that transcribes and translates the Mozarabic kharjas. Post-1988, preferably, as there are serious issues with earlier attempts, which I won't get into right now.

Pinging @Santi2222, @Ser be etre shi, and @Fay Freak.

- Nicodene (talk) 23:46, 10 June 2022 (UTC)[reply]

@Nicodene IMO you hit the nail on the head. There should be agreement among sources (or at least among a substantial number of them) when it comes to what are essentially reconstructions like this. Benwing2 (talk) 00:30, 11 June 2022 (UTC)[reply]
At least for Arabic I require that a reading has little to doubt, as I for pre-printing (1800) Arabic in accordance with European languages which start around 1500 (English, German, Polish, Russian) or even 1615 and 1650 (French, Dutch) I would only seek one occurrence excluding the presence of a ghost word. Here you don’t know the language and don’t get much meaning into the texts, and interestingly we even have Category:Undetermined lemmas but this is not even for this, maybe those shambles of languages aren’t a matter for a dictionary like this but only for the use in specialist fields. Alas, we can’t stop editors from including Mozarabic references and thus also entries, however it is easy to dismiss entries if one does not even know the rough spelling aimed at nor the language roughly. The benefit from deciphering these poems is really low anyway as we know Latin and copious descendants and Arabic so I judge that you really need to gather something exclusive to bother much about these uncertainties. In other words your time may be too valuable to care about those certain words as the yield won’t likely be of great significance in any case!
This “agreement” thing is misleading since it is usually just one author following another sequentially with no direct conclusion for our purposes. Some people learned working with Meroitic that you can’t trust anyone. Fay Freak (talk) 00:41, 11 June 2022 (UTC)[reply]
Regarding Corriente, I have not even mentioned my suspicion that he did not actually change his opinion that much but let his academic gofers write articles published under his name, understood as a kind of brand. In كرزية the three literature loci starting with “Corriente” have three different etymologies of which only the chronologically second one I could even make sense of, others include corrupted or phantasized citations of Iranian words – the secondary literature is often as bad as the medieval manuscripts in transmission but we savvy to filter the popular stuff. (We could open another can of strange “references” cited on Wiktionary but it is 3:54 AM in Germany.) Fay Freak (talk) 01:54, 11 June 2022 (UTC)[reply]
An initial ameliorative step would be to slam {{LDL}} on the dubious entries. The agreement on allowed single sources should then exist, and should allow for challenges. After all, there is a significant reason to suspect scribal errors at various levels. It then comes down to what we do with uncertain words - do we include them with a warning, which is useful in itself, or do we puristically exclude them because we aren't sure. In this particular case, what do people here make of the actual Arabic script text? Or has no-one here tried to read it?
Actually, using Google Books, I see that Jones gives two examples of the word دلج (dlj). Does this solve this particular matter?
Perhaps it would help to have some mechanism for indicating the reliability of quotations - including the level of fabrication. --RichardW57 (talk) 16:07, 12 June 2022 (UTC)[reply]
I don't know what the standard WT practice is in cases like this, but perhaps having an appendix with the kharjas (and plausible reconstructions) could be a work-around for words like the ones mentioned here. There are entries for other badly attested languages with headers like "Word" and definitions with the phrase "the meaning of this term is uncertain", but given that in Mozarabic the difficulties in interpretation are often at text level (and not word level) I would personally favor an appendix-style solution (in case we want to include dubious terms).--Santi2222 (talk) 18:26, 12 June 2022 (UTC)[reply]

Translingual entries and anagrams

If there is an English word or term whose letters can be re-arranged to make a translingual entry should that be a valid anagram for Wiktionary? I think it should because translingual terms are used in English. Others may disagree. Some contributors may think we shouldn't have anagrams at all. John Cross (talk) 11:26, 11 June 2022 (UTC)[reply]

Yes, just as it should for any other language that the term is used in. I have mixed feelings about Translingual entries, because there is an occasional tendency to assume that use in 3 (or sometimes even 2) languages using a term in the same way warrants lumping things together there. But the fact is that there are plenty of terms (primarily symbols and proper nouns) which do deserve entry there, and are clearly used in English. Theknightwho (talk) 19:24, 11 June 2022 (UTC)[reply]
I have a hunch that this might lead to a lot of not-very-English clutter in anagram sections, like taxonomic names which we all secretly know are Latin. Equinox 04:14, 12 June 2022 (UTC)[reply]
Like English altbier, betrail, librate, tablier, triable, trilabe → Translingual alberti.  --Lambiam 11:36, 12 June 2022 (UTC)[reply]
Since Translingual terms are, in principle, used in every language, we could have larger anagram sections for every Latin-script language, based on taxonomic names alone. This doesn't seem very productive. DCDuring (talk) 20:49, 12 June 2022 (UTC)[reply]
No, I don't think this is overly useful. Some people view Translingual terms as belonging to any language. I tend to see them as belonging to no language, but being used within a language. What this means practically is that most people would not consider most Translingual terms English, nor would you be able to use them in word games, which is where Anagrams are most useful. If you're playing Scrabble, it's not useful to know that the combination of letters you have can spell Poecilia. Andrew Sheedy (talk) 21:09, 12 June 2022 (UTC)[reply]
As a point of interest, one can find "poecilias" (lower case) in running English text. Many taxonomic names, both current and obsolete, have corresponding English names. DCDuring (talk) 22:28, 12 June 2022 (UTC)[reply]
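For anyone who wants to gauge how much cross-language clutter this would produce, here is a minimal Python sketch of the usual anagram check: sort the letters of each term and group terms that share the same sorted key. The function names and the small entry list are illustrative assumptions; this is not the code Wiktionary actually uses to generate anagram sections.

```python
from collections import defaultdict


def anagram_key(term: str) -> str:
    """Sort the case-folded letters so that all anagrams share one key."""
    return "".join(sorted(ch for ch in term.casefold() if ch.isalpha()))


def group_anagrams(entries):
    """Group (language, term) pairs by their anagram key."""
    groups = defaultdict(list)
    for lang, term in entries:
        groups[anagram_key(term)].append((lang, term))
    return groups


# Hypothetical sample data, taken from the examples in this thread.
entries = [
    ("English", "altbier"), ("English", "betrail"), ("English", "librate"),
    ("English", "tablier"), ("English", "triable"), ("English", "trilabe"),
    ("Translingual", "alberti"), ("Translingual", "Poecilia"),
]

groups = group_anagrams(entries)
cross_language = groups[anagram_key("alberti")]
for lang, term in cross_language:
    print(lang, term)
```

Filtering each group by language before display would be the place to apply whichever policy is agreed on, for example showing Translingual terms only in a separate sub-list.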

I feel we should adopt a policy with regard to internet quotations and settling. As was stated, there was a clear feeling that we should settle these kinds of issues in RFV. I propose we allow for the ability to create votes within an RFV thread. Vininn126 (talk) 13:25, 11 June 2022 (UTC)[reply]

This is currently also being discussed at WT:RFVE#creeper. @Fytcha @AG202 @WordyAndNerdy This, that and the other (talk) 02:31, 12 June 2022 (UTC)[reply]

Can we settle whether affixes in Arabic-script languages should be lemmatized with or without ـ , e.g. ـی vs. ی?

Currently, it’s a bit in a shambles:

Can we settle this for good? IMO, at least in the case of Persian, it’s better to lemmatize with ـ, since Arabic-script languages use orthographic spaces and hence there’s always an orthographic difference between e.g. Persian چه (če, what), always preceded by a space, and ـچه (-če, diminutive suffix), always written joined. Korean was somewhat recently revised to use hyphens in lemmatization for the same reason.

In addition, some Persian affixes are more commonly written spaced or with zero-width non-joiners (especially in formal writing), e.g. the verbal prefix می (mi). Currently we have no way to tell readers outside a cumbersome Usage Note that e.g. the prefix می (mi-) is usually written spaced or with a ZWNJ while ب (be-) is never spaced, whereas if lemmatization with ـ was consistently implemented, this would be obvious from the very title.

Thoughts?--Tibidibi (talk) 02:26, 12 June 2022 (UTC)[reply]

@Tibidibi Hello! I was told you might be gone for awhile due to army service; good to see you back. I think we should include the tatweel character before the suffix or after the prefix if the affix attaches to the main word without a space or ZWNJ (which is always the case for Arabic at least). You enumerated some reasons why this makes sense for Persian, but IMO it should be done for Arabic as well, if for no other reason than that several characters look noticeably different in their independent vs. joined forms, and the tatweel forces the joined form, which visually helps signal that we're dealing with an affix. Benwing2 (talk) 03:18, 12 June 2022 (UTC)[reply]
@Tibidibi I like what they're doing with Arabic at the moment (compare كَـ (ka-) vs ـكَ (-ka) under ك (k)). It looks like lemmatising with tatweel would make more sense for Persian, but I don't feel the need to do that for Arabic. Sartma (talk) 23:50, 12 June 2022 (UTC)[reply]
@Tibidibi, Sartma: In my opinion it's better to lemmatise affixes in all Arabic based (script) languages without the taṭwīl but use it on the correct side of the word in the headword, if those affixes are spelled together with a corresponding word (no space or ZWNJ), as is the case with the Persian suffix ـچه (-če) (the entry title is at چه). A hyphen in the transliteration should be used for terms written together or with a ZWNJ. So prefix می (mi-) (no taṭwīl but a hyphen in the transliteration), which needs a ZWNJ is also good as it is now. --Anatoli T. (обсудить/вклад) 01:02, 14 June 2022 (UTC)[reply]
@Atitarev This can easily lead to clutter on single-letter entries. Why do we distinguish between Arabic-script languages and Latin (or Cyrillic, etc.) script ones in this regard, when both scripts make use of orthographic spaces?—Tibidibi (talk) 09:20, 14 June 2022 (UTC)[reply]
@Tibidibi: I understand your idea better now. It may work. It might be difficult to engage all editors for all Arabic script-based languages, though. Perhaps, focusing on one, such as Persian? Anatoli T. (обсудить/вклад) 23:12, 14 June 2022 (UTC)[reply]
(Notifying Ariamihr, Dijan, Mazsch, Qehath, ZxxZxxZ):
If there aren’t any responses by this time next week, I will move the relevant entries to the tatwil-ed form.
@Benwing2 If this passes, can you modify the relevant code so that {{af|fa|آزاد|ـی}} links to the tatwil-ed form?—Tibidibi (talk) 09:42, 20 June 2022 (UTC)[reply]
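To make the two lemmatization options concrete, here is a small Python sketch at the code-point level. It is only an illustration: the helper names are hypothetical and this is not the module code behind {{af}}. It assumes that a leading tatweel (U+0640) marks a suffix, a trailing tatweel marks a prefix, and that a spaced prefix is attached with a zero-width non-joiner (U+200C).

```python
TATWEEL = "\u0640"  # ARABIC TATWEEL
ZWNJ = "\u200c"     # ZERO WIDTH NON-JOINER


def affix_kind(form: str) -> str:
    """Classify a headword form by where the tatweel sits (logical order)."""
    if form.startswith(TATWEEL):
        return "suffix"
    if form.endswith(TATWEEL):
        return "prefix"
    return "unmarked"


def entry_title(form: str, lemmatize_with_tatweel: bool) -> str:
    """Page title under either policy: keep the tatweel or strip it."""
    return form if lemmatize_with_tatweel else form.strip(TATWEEL)


def attach_prefix(prefix: str, stem: str, joined: bool) -> str:
    """Attach a prefix directly, or with a ZWNJ when it is written unjoined."""
    return prefix + stem if joined else prefix + ZWNJ + stem


suffix_i = TATWEEL + "\u06cc"                  # the suffix form with tatweel
print(entry_title(suffix_i, True) == suffix_i)  # title keeps the tatweel
print(entry_title(suffix_i, False) == "\u06cc")  # title is the bare letter
print(affix_kind(suffix_i))

mi = "\u0645\u06cc"                             # the verbal prefix mi-
word = attach_prefix(mi, "\u0631\u0648\u0645", joined=False)
print(ZWNJ in word)
```

Under the tatweel policy the affix type is recoverable from the title alone, which is the point made above about می versus ب.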

Can we standardize morphophonemic/phonemic/phonetic/etc conventions for Middle Korean, Modern Standard Korean and Jeju?

Which level of "underlyingness" counts as phonemic seems to be not so well defined, and I often see "phonemic analysis" of Middle Korean and Modern Korean that just seems to follow Modern Korean morphophonemic orthography rules. I find this problematic, because it only allows underlying forms that are possible to write in hangul:

Hangul 짚다 짚어서 짚는
Analysis /cipʰta/ /cipʰʌsʌ/ /cipʰnɯn/
Pronunciation [ʨipt˭a] [ʨipʰʌsʌ] [ʨimnɯn]
Hangul 깁다 기워서 깁는
Analysis /kipta/ /kiwʌsʌ/ /kipnɯn/
Pronunciation [kipt˭a] [kiwʌsʌ] [kimnɯn]

Since they follow the same pattern, it would make more sense to analyze the latter verb as /kiw-/, but to me the fact that it does not means it is very heavily influenced by the orthography, which I think should be avoided.


Current Korean entries also give IPA transcriptions, which I find inadequate. Take the example of 설화: [sʰʌ̹ɾβwa̠]

I don't know who the first person to write intervocalic /hw/ as [β] was, but I keep seeing it and I am tired of it. It seems to be derived from the fact that /hw/ is normally realized as [ɸ], and /h/ normally undergoes voicing intervocalically. The reason /hw/ is fricated is the strengthened airstream caused by the /h/, which makes it easier to fricate in places where it would normally produce an approximant. There's also the fact that initial /w/ is more strongly rounded than in other places, meaning that medial /hw/ will have even less chance of getting fricated. The most common pronunciation in my experience is [sʰʌɾʷa], with the /h/ completely dropped, or [sʰʌɾʱʷa], with the /l/ becoming breathy voiced. There are other problems with it also, like the fact that medial and doubled /l/ is written as [ɭ]. I believe that it is possible, and even recommended, to write an apical lateral approximant as [ɭ], even if it is not strictly a true retroflex. However, given how other sounds are given such specific realizations with all the diacritics, I believe it makes more sense to write it as [l] with an apical diacritic under it. That would indeed fix the problem if it weren't for the fact that coda /l/ also varies greatly even within Seoul Korean. Some people seem to have [ɹ] for final /l/ except before coronals and word-finally. It would be misleading to say that specifically [ɭ] is the pronunciation of coda /l/ in Korean. If the purpose of IPA transcription is to help non-Korean speakers pronounce Korean words, then it does a terrible job at it, because anyone who knows IPA but not Korean will see [sʰʌ̹ɾβwa̠] and read it with a consonant cluster followed by a semivowel.


As I mentioned earlier, what actually counts as being phonemic is not well defined and there are different conventions between different linguists. I see multiple ways Korean is analyzed phonemically with different levels. For example:

Hangul 짚다 짚어서 짚는
Option 1 /cipʰta/ /cipʰʌsʌ/ /cipʰnɯn/
Option 2 /cipta/ /cipʰʌsʌ/ /cipnɯn/
Option 3 /cipt˭a/ /cipʰʌsʌ/ /cimnɯn/

Option 1 is I believe better described as "morphophonemic", because then we can make a distinction between it and the other two options, and it seems to be the most common convention. Morphophonemic analysis uses |pipes|, ||double pipes||, or //double slashes//, instead of /slashes/. Option 2 is basically what you get if you tell a Korean to pronounce something syllable by syllable. It is similar to Option 3, sans assimilation etc. Option 3 is like Option 2, but with assimilation etc rules applied. It assumes that what's pronounced the same, are phonemically the same, and it's basically phonetic hangul in IPA. I think Option 2 is the best option because it closely matches how people perceive pronunciation. Of course it also relies on orthography to some degree, although much less than the aforementioned Korean orthography based analysis.
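As a rough illustration of how the three options relate, here is a toy Python sketch that derives Options 2 and 3 from the Option 1 forms. The rule set is an assumption made for this example: it covers only the three 짚- forms in the table above (neutralization of pʰ before a non-vowel, tensification of /t/ after /p/, and nasal assimilation of /p/ before /n/) and is nowhere near a full phonology of Korean.

```python
import re

VOWELS = "aʌɯi"  # only the vowels that occur in the example forms


def option2(option1: str) -> str:
    """Syllable-by-syllable form: pʰ neutralizes to p unless a vowel follows."""
    return re.sub(rf"pʰ(?![{VOWELS}])", "p", option1)


def option3(option1: str) -> str:
    """Option 2 plus tensification and nasal assimilation (toy rules)."""
    form = option2(option1)
    form = re.sub(r"(?<=p)t", "t˭", form)  # /t/ becomes tense after /p/
    form = re.sub(r"p(?=n)", "m", form)    # /p/ becomes /m/ before /n/
    return form


for form in ["cipʰta", "cipʰʌsʌ", "cipʰnɯn"]:
    print(form, option2(form), option3(form))
```

Keeping the rules as separate ordered steps makes it easy to see that Option 3 is just Option 2 with the assimilation rules applied on top.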


In conclusion, I think we should come up with a standardized and consistent way to transcribe Koreanic words, including Middle Korean and Jeju, using their equivalents of whatever Modern Standard Korean would have. I believe implementing phonemic analysis for Middle Korean would be fairly straightforward, since the orthography is already phonemic, but it might not be a great idea to use IPA, since you might be providing extra information about what is a reconstructed pronunciation. If we do use IPA, native transcription and non-native transcription (e.g. 동국정운 pronunciation) should probably not share the same system, and the latter might be better left untranscribed. I also want to propose a pitch accent analysis system for the Busan dialect, which is more toneme-oriented than the phonetic approach we have right now, and it could possibly make analyzing pitch accent patterns of verbs easier. Jeannebluemonheo (talk) 12:55, 12 June 2022 (UTC)[reply]

Strong Support, we already implemented the morphophonemic change for Jeju a while back, see: 뜬 쉐가 울 넘나 (tteun swega ul neomna), and it would be amazing to finally have a phonemic transcription somewhere. AG202 (talk) 13:19, 12 June 2022 (UTC)[reply]

French pre/post-1990 spellings

@PUC I'm curious to understand why we seem to prefer pre-1990 French spellings but post-1996 German spellings. As an example, the French verb meaning "to know/to recognize (a person)" is lemmatized under the pre-1990 spelling connaître, and the post-1990 spelling connaitre redirects to it. The article Appendix:French spelling reforms of 1990 just says this:

Some [post-1990 spellings] are now more prevalent than the still correct pre-1990 spellings, but many less. On Wiktionary, French words with revised spellings are usually treated as alternative spellings, while the traditional spelling is the main article.

This doesn't give any explanation as to why Wiktionary prefers pre-1990 spellings. Benwing2 (talk) 23:12, 12 June 2022 (UTC)[reply]

I read that as saying that pre-1990 spellings are still commoner than post-1990 spellings, and to keep things simple, we uniformly standardise on the pre-1990 spelling. --RichardW57 (talk) 23:31, 12 June 2022 (UTC)[reply]

CAT:D pages added by User:Fish bowl

Hi. There are > 100 Talk pages in CAT:D added for speedy deletion by User:Fish bowl. I want to make sure these are correctly added. They are all tagged with either "copyright violation" (because someone asked "please translate the following" along with a quote) or "spam". The ones labeled "spam" in particular I'm not sure about. E.g. in Talk:麺, someone asked for a Cantonese pronunciation, which was answered by someone else, who added the pronunciation. Another example is Talk:㓃, which has a couple of topics, one of which asks whether the character is simplified or traditional, and another asks for clarification of the contexts of the various Mandarin readings. These don't seem obviously like spam to me, and I'm not sure why they're tagged. Benwing2 (talk) 00:57, 13 June 2022 (UTC)[reply]

Agreed. Fish bowl, can you give us an example of a copyright violation and the source that is being violated? (Note that I don't see any edits with an edit summary stating this; see https://en.wiktionary.org/w/index.php?title=Special:Contributions/Fish_bowl&offset=&limit=5000&target=Fish+bowl ) —Justin (koavf)TCM 01:32, 13 June 2022 (UTC)[reply]
I did see something tagged that way and the text was Chinese so I left it alone, as I didn't understand. I wonder if this is perhaps the same user who used to create huge numbers of rather useless talk pages saying "can it be added..."? (We have rfp, rfe, etc. templates for this.) Equinox 03:19, 13 June 2022 (UTC)[reply]
It is, yes. — SURJECTION / T / C / L / 14:47, 13 June 2022 (UTC)[reply]
they're spam. [1] —Fish bowl (talk) 19:17, 13 June 2022 (UTC)[reply]
I'm not sure how this link (an edit of yours) supports the claim that someone else's edits are spam. I did note that you added all of these speedy deletion templates after the proposal that they be deleted failed to gain consensus, and that seems like a very bad faith use of the speedy deletion template. - TheDaveRoss 19:22, 13 June 2022 (UTC)[reply]
Every Chinese editor who I've talked to doesn't like this guy. I kind of don't give a fuck anymore about the "keep it 😠" opinions of non-Chinese editors 🤷🤪 —Fish bowl (talk) 20:11, 13 June 2022 (UTC)[reply]
Perhaps best to ignore them rather than create more work for others against consensus. - TheDaveRoss 20:13, 13 June 2022 (UTC)[reply]
I put in futile work answering too many of these in the past. (Did you? Would you like to try?) How hard is it to press "delete" 🤪 —Fish bowl (talk) 20:20, 13 June 2022 (UTC)[reply]
I didn't even mark them all (although I could 😳) This is just a small corner. —Fish bowl (talk) 20:24, 13 June 2022 (UTC)[reply]

Decades

@BD2412 When you go to 1360s, you see "deleted page 1360s Per RfD discussion on Decades". I recently created 1370s and 2160s with cites (1370s has stronger cites). What would the participants of the previous discussion think of a piecemeal creation of decades articles IF they have good cites? Thanks. --Geographyinitiative (talk) 10:20, 13 June 2022 (UTC)[reply]

I don't think this is a good idea. Such numbers are created in an entirely predictable way, so there is no need to have such entries at all, whether or not quotations can be found for them. — Sgconlaw (talk) 11:02, 13 June 2022 (UTC)[reply]
Regarding @Sgconlaw's statement, I am 100% neutral on the issue of whether these decade entries fall within Wiktionary's scope. If you want them, I'll work on them. If you don't want them, I'll delete them. However, I do think that Wiktionary:Criteria_for_inclusion#Numbers,_numerals,_and_ordinals (or somewhere on that page?) should talk about decade entries and reference the relevant discussion (sorry if it's there and I'm not seeing it). The 1370s entry is facially similar to the 1990s article, so there must be something I'm missing. --Geographyinitiative (talk) 11:16, 13 June 2022 (UTC) (modified)[reply]
Category:en:Decades shows extensive coverage of the 18th through 21st centuries, and barely anything else. Anyway, I see no principled reason to treat 1990s differently from how we treat 1370s, except recency bias. 98.170.164.88 15:13, 13 June 2022 (UTC)[reply]
Do you really not see the difference? What if you took it a little further back, say the BCE 278990s? That was a decade that happened (I assume), but since the goal of the project is not to be a calendar but to instead be a dictionary, there is perhaps less value in having "definitions" for highly predictable numeric constructions which are unlikely to be used in any manner other than the most narrow, literal ones. Similar to first being a tremendously useful and often used ordinal, but two-hundred-seventy-eight-thousand-nine-hundred-and-ninetieth being somewhat less so. - TheDaveRoss 15:21, 13 June 2022 (UTC)[reply]
You would not be able to find three independent quotations for BCE 278990s, so the comparison fails. By the way, I said there is not much reason to treat them differently. That doesn't rule out deleting 1770s, 1870s, and 1970s along with 1370s as all being predictable/non-CFI-worthy terms. Their content and usage is entirely analogous. 98.170.164.88 16:11, 13 June 2022 (UTC)[reply]
At this point I am fairly certain that I could find three cites for it on UseNet, it seems like everything which is possible to type has been typed there. But whether or not something is attestable isn't actually relevant to the argument, or the previous discussion. It is very easy to attest "the sky is blue", that isn't a counter-argument to the policy to exclude sum-of-parts terms. - TheDaveRoss 16:27, 13 June 2022 (UTC)[reply]
In all fairness the terms are not SoP, and this issue is dealt with at WT:CFI#Issues to consider. They're also no more formulaic than many other kinds of entry, such as plurals. Theknightwho (talk) 23:29, 13 June 2022 (UTC)[reply]

Here is the RfD discussion on these entries. I would suggest, rather than adding routinely generated entries for decades that are lexicologically unremarkable, we should add content to the ones that are lexicologically remarkable. For the last century, at least, each decade has its own cultural associations — the "roaring" twenties, the 1930s (global depression), 1940s (war and aftermath), 1950s (postwar boom), 1960s (counterculture movement), 1970s (disco and stagflation), 1980s (consumerism), 1990s (grunge vs. synth-pop and post-Cold War), 2000s (war on terror), etc. bd2412 T 16:23, 13 June 2022 (UTC)[reply]

This feels like recency bias, though. It's not like the twentieth century was the only time period to have associated culture or events. By the same token, shouldn't 1776 or 1770s be an entry, since it was the year/decade of the American Revolution? (Even used metaphorically: Alex Jones said "1776 will commence again"; "spirit of the 1770s" has been used) Should 1492 or 1490s be an entry because of the discovery of the Americas ("spirit of 1492")? etc. We could draw a line in the sand and say that only things from 1900 and on are allowed, but that's pretty arbitrary. Maybe you're saying that for each decade we need to separately determine whether there is some significance beyond just referring to the mere time period. I'm not sure how you'd precisely draw that line, though, so maybe you can expand on that. 98.170.164.88 16:51, 13 June 2022 (UTC)[reply]
@BD2412: yeah, not keen on that idea. To me, all the "decade" years are simply SoP. — Sgconlaw (talk) 18:31, 13 June 2022 (UTC)[reply]

Just to guesstimate what we're talking about here, I'm thinking that if we went "full bore", it would be 100 entries per millennium, so if you get all of 2000 BC-AD 3000 (which would be hard?), that will be maximum 500 entries (depending on citations, which will be harder near each end). Then there will be decades outside that range that are the focus of sci-fi or scientific speculation or the focus of archaeology. Anyway, I doubt the whole collection, if confined to that which can be cited, would exceed 400 entries. Again, I am neutral on the issue. --Geographyinitiative (talk) 20:16, 13 June 2022 (UTC)[reply]
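The upper bound above can be checked with a few lines of Python. This is only a back-of-the-envelope sketch: the function name is made up for illustration, and the `languages` multiplier is a hypothetical parameter, since nobody in this thread has given an actual count of languages that would need such entries.

```python
def max_decade_entries(start_bc: int, end_ad: int, languages: int = 1) -> int:
    """Upper bound on decade entries for a span of years, one per decade."""
    span_years = start_bc + end_ad  # e.g. 2000 BC to AD 3000 spans 5000 years
    return span_years // 10 * languages


print(max_decade_entries(2000, 3000))               # single language
print(max_decade_entries(2000, 3000, languages=3))  # hypothetical multiplier
```

The citable subset would of course be smaller than this ceiling, as noted above.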

You should take into account that this amount should be multiplied by the number of WDLs we have. Thadh (talk) 22:48, 13 June 2022 (UTC)[reply]
This is another good question. I don't know how many languages use the letter "s" here. Variants? Etc.? Iran's calendar would have decades of its own, so there would be 300-400 pages of that in Farsi too, I suppose. Again, I am neutral on the issue, but I would find it fun to do cites for these as I ran across them. --Geographyinitiative (talk) 22:55, 13 June 2022 (UTC)[reply]
Based on the cites in 2160s, it does seem SOP. We do seem to accept that several categories of unspaced but formulaic things are SOP, e.g. episode numbers (Talk:S01E01), Latin -que words (Talk:fasque) and Tzotzil -e words (Talk:antse), chemical formulas, and yes, decades (Talk:1700s). So 2160s should probably be deleted per that. A few decades have stronger arguments for inclusion, not because of specific cultural associations per se, but because those associations pull the period referred to as "the XXs" out of the actual period from XXX0 to XXX9. For example, a fair bit has been written about how (in English) the 60s refers to a cultural period from 1963-64 to 1970 and the 90s refers to a cultural period from 1998-99 to the early 2000s ("Blink-182? Didn’t go mainstream until 1999. Shrek? 2001. The Tony Hawk series? Debuted in 1999 and peaked in popularity around 2003. You’ll struggle mightily to find a cultural touchstone of “the 90s” that dates earlier than maybe late 1998"). - -sche (discuss) 21:49, 14 June 2022 (UTC)[reply]

The logic being applied here would justify the deletion of the vast majority of English plural forms. There are a finite number of decades for which this format will see any use, and it’s not that high. Theknightwho (talk) 21:54, 14 June 2022 (UTC)[reply]

So-called "wiki" is secret alt-right hive!!!!

We should probably be preparing some DAMAGE CONTROL... can you imagine what will happen when Twitter, Wired, and Salon find out that we have got 57 variants of the n-word? Especially with the recent IPs who keep adding stupid slurs like Buttswana. Presumably the answer is "well, they are words, and we are volunteers". Right. What are we really going to do? Equinox 13:52, 13 June 2022 (UTC)[reply]

Locking Fay Freak in the shoe cupboard might be a good start. Equinox 13:53, 13 June 2022 (UTC)[reply]
How many variants of the f-word do we have? bd2412 T 17:01, 13 June 2022 (UTC)[reply]
I am probably more right wing than most people on here, but I tell you that some of these words you guys find out there on the intertubes are f'n wild. But I think it really was worthwhile for Wiktionary to document the horrific term "Citations:niggership"- no other dictionary had this evil term, and now we know a little something about its 19th century roots. --Geographyinitiative (talk) 20:27, 13 June 2022 (UTC)[reply]
Collective nostalgie de la boue. – Jberkel 21:09, 13 June 2022 (UTC)[reply]
Meh. As long as we're describing offensive words as offensive, and not giving them undue prominence (e.g. when the "Synonyms" section of Jew was a long list of slurs that was bad), we're a dictionary defining words people have used. Of course, in cases where "words people have used" means "obscure/nonce slurs someone with a few usernames on usenet used in 2001 and 2002", or more recently "4chan op coinages cited via reddit/twitter", we should do better; if we get criticized for falling for some 4chan op invented word, I suppose we deserve it and may it impel us to improve our CFI. If we get criticized for documenting that people have used N-words for a few hundred years, meh. (As to the other point: I do think based on other factors that FF is, like you once suggested Dentonius was, an "entryist", but I'm not sure how many of these entries he was involved in making.) - -sche (discuss) 21:24, 14 June 2022 (UTC)[reply]
@- -sche The issue is surely the number of uses something has, and whether it actually lexicalised as a genuine term. There’s a difference between objecting to nonce words that have been independently coined a few times, and objecting to terms that just happen to only be used on Reddit and Twitter. I’m not sure it’s a good idea to conflate the two, particularly when I’m pretty sure you mean 👌, which has seen pretty widespread use. Theknightwho (talk) 21:38, 14 June 2022 (UTC)[reply]
Oh, no, I'm thinking of the various pukeskin, cumskin, etc type rare/nonce insults, and (as far as "coordinated"/"op" stuff) things like clovergender. I don't know whether the OK emoji is attestable, but I agree the gesture is a genuine white-supremacist signal, flashed by Stephen Miller in the White House etc (as you said at RFV, the "hoax" there is not "the gesture is white supremacist" but rather "the gesture isn't white-supremacist, it's just a joke somehow!"). - -sche (discuss) 22:04, 14 June 2022 (UTC)[reply]
Thanks - sorry for being a little prickly. It’s something about that term in particular. Equinox did mention the idea of having a post-ironic label, which I think would make sense for a bunch of these 4chan coinages (among others). The blurring of humour and sincerity is 4chan’s MO, after all. Theknightwho (talk) 22:19, 14 June 2022 (UTC)[reply]
Post-post-post-whatever is not meaningful. Whether we are serious or joking when we call someone an "XYZ", the word still has its meaning. The sarcasm is something beyond a dictionary. If I get on a video game server and call someone (sorry, I don't play these games, so I dunno) an "epic winner", and I mean they are actually shit, that isn't a new sense of "winner", that's just me taking the piss. There might be a very, very few words that are mostly used sarcastically, and not used honestly, but I'm not sure. That would be a usage note. Equinox 04:18, 17 June 2022 (UTC)[reply]
You all are very rude. It could not take long for the editorship to miss me, for the quality edits I create myself as well attract by new editors often from or for non-Western countries who feel encouraged. That this project has reached higher agreement and refinement in presentation matters without becoming an echo chamber is also the work of my memorious distinctions, presented on many an occasion of possible controversy.
Alt-rights are still strawmen, whom we have barely encountered and whose agenda would barely withstand the hard reality of lexicography. Have I mentioned that extremist groups feel enticed by attempts to exclude them rather than assimilate them? We make all as boring as possible for them as well as for so-called vandals, whatever the distinction may be, and thus they are stripped of their essentials. Fay Freak (talk) 10:32, 16 June 2022 (UTC)[reply]
Every time I think "THE WHITE MAN IS THE REAL OPPRESSION VICTIM" I just check out your posts and I feel okay again. Here's a beer. Equinox 04:19, 17 June 2022 (UTC)[reply]

A content-neutral way to look at the problem with offensive terms

There are a number of ways that nonsense can enter the mainstream, but offensive nonsense gets eliminated fairly quickly. Wiktionary, on the other hand, only eliminates nonsense via processes that take time. That creates an incentive for people to add offensive nonsense here: it may get deleted, but fringe interests get a period of mainstream exposure that they wouldn't get elsewhere.

Our approach should be to neutralize this incentive by removing the conditions that create it. Chuck Entz (talk) 15:09, 13 June 2022 (UTC)[reply]

@Chuck Entz This is exactly what I've been trying to do with my proposals above. Whether we like it or not, we do have an impact. I'm getting frustrated with being told by the same folks that we don't have an impact here, even though I've seen it with my own eyes (and all it takes is a few news articles about this project). I would like to propose once again that offensive terms should require citation on at least two to three sources (websites, books, etc.). This would preserve the majority of offensive terms that we have while limiting the number of nonce terms that would've never seen the light of day otherwise. (CC: @Equinox since this is a similar issue to your point above) AG202 (talk) 15:29, 13 June 2022 (UTC)[reply]
One thing we could do is just flip the attestation requirement to be "up front" for all terms. The benefit there is that everything comes with citations providing evidence of its actual use, the downside is it is a huge barrier to creating new entries, especially for people who are less familiar with the practices and policies here. Anyone who saw any entry or definition about which they were skeptical could delete it and immediately create an RFV asking for evidence of usage. If someone added red and claimed it was a color, my guess is that nobody would feel compelled to delete/RFV it, but if someone added red and claimed it was yet another neo-Nazi word for Jew, well I can imagine lots of people would question the veracity of that definition and ask for further research. I don't think it is a great injustice, or a disservice to freedom-loving Wiktionary mirrors to put a short wait pending verification on less used or fringe terms. - TheDaveRoss 15:31, 13 June 2022 (UTC)[reply]
@TheDaveRoss For all terms? While this is more neutral, I feel that this would not be implementable nor practical unfortunately. The majority of the entries I create do have citations (ex: Jeju ᄒᆞ다 (hawda), Yorùbá ọ̀kan, and English yassification) as I try to focus on quality (and have experience dealing with some ... choice RFVs), so I'm very well-versed in the tasking process of finding and adding quotes. And so, I wouldn't feel comfortable putting that burden on new users, even though I would prefer that there were more citations on the website. It's just too much work and effort for often little return as I've found. AG202 (talk) 15:46, 13 June 2022 (UTC)[reply]
What it would functionally do is give the people with the ability to delete entries more power to do so at their discretion, pending a completed verification process. Since 98% of entries are non-controversial, it wouldn't change anything with those (since hardly anyone would be inclined to delete such entries). Some other number of entries are already deleted on sight, nothing changes there either. The difference would be in the small number of entries which are currently added and then immediately sent to RFV or RFD, it would now be allowable to delete those while sending them to RFV or RFD and then restore them pending successful outcomes. Very little would actually change except that dubious entries would have to wait slightly longer to be visible to the masses. - TheDaveRoss 16:13, 13 June 2022 (UTC)[reply]
I think I'm on board with this. I am certainly the guy Twitter hates, who thinks that "I am offended" is frequently weaponised, but there's no doubt we keep getting a lot of crappy unsubstantiated slurs lately and if the rule is just "you have to cite it if it looks rude" then yeah, um, I could go with that. I'm rather sick of these entries. Equinox 16:14, 13 June 2022 (UTC)[reply]
Within existing rules we can exercise discretion on offensive entries as follows:
  1. Speedily delete poorly formatted offensive entries and poorly worded offensive definitions
  2. RfV offensive definitions as soon as they are seen
  3. Withhold citation effort for offensive entries and definitions
  4. Promptly delete after 30 days
Why is this not sufficient? DCDuring (talk) 16:34, 13 June 2022 (UTC)[reply]
I think the biggest gain is that, in the case of trolls and other bad-faith adders, there isn't the validation of the terms sticking around. Also we wouldn't be propagating them out into the internet at large via the many mirror sites which just copy Wiktionary data directly and present it, sometimes without any context or caveat. - TheDaveRoss 18:17, 13 June 2022 (UTC)[reply]
What about putting offensive entries/definitions "on hold" with respect to inclusion in dumps or whatever APIs the mirror sites use. We could go further and suspend their visibility until they pass. I realize I am talking through my hat about the technical possibilities, but WP holds certain contributions in suspense until review. I am also amazed what kind of things are technically possible. DCDuring (talk) 18:54, 13 June 2022 (UTC)[reply]
That solution would be more difficult and solve half of the problem, not sure why it would be preferred. We don't have much control over what content of ours people choose to take, and without making actual software changes the best control we have is deletion. - TheDaveRoss 18:59, 13 June 2022 (UTC)[reply]
I assume that they take it all from dumps, probably not from the much larger diff files. Content excluded from dumps would probably not be on most mirrors, if my assumption is correct. That would be the objective. I just don't think that arbitrary Ptolemaic epicycles to our rules are better than mostly automated solutions. DCDuring (talk) 20:48, 13 June 2022 (UTC)[reply]
I still think the best solution would be to simply ban new users and IPs from adding offensive terms in the first place. I guarantee it'd filter out all the nonsense we've been getting lately. We already block those users from editing pages for offensive terms, so it seems odd that we'd allow them to add new ones. Binarystep (talk) 06:32, 16 June 2022 (UTC)[reply]
This seems like a good idea if it could be done, but how could it be done? (Manually block any new user who adds offensive terms?) - -sche (discuss) 10:27, 16 June 2022 (UTC)[reply]
That'd probably be the easiest way, though I wonder if some sort of filter would be feasible. Binarystep (talk) 10:33, 16 June 2022 (UTC)[reply]
The main hurdle I see to formalizing an "attestation up-front" requirement for offensive terms as an official rule is...the process of formalizing it; I can see it getting derailed in discussions of what is offensive; nonetheless, I'd support it. Any rule can be rules-lawyered or gamed; well moderated sites' mods have some discretion (to block or delete things for violating the "spirit" of rules even if not the "letter", and to interpret things like "offensive"). We have some discretion inasmuch as "Creative invention or protologism" is a stock deletion rationale, and we could use that more often. I also like the idea that when we delete some offensive protologism (whether under any new rule or as a "Creative invention or protologism" now), if someone challenges that, the RFV can proceed for its usual month while the entry stays deleted until it's actually cited/RFV-passed; there's no reason an entry needs to be live for the RFV process to operate (as long as the definition to be cited is copied over to the RFV thread). - -sche (discuss) 10:27, 16 June 2022 (UTC)[reply]
I prefer putting offensive definitions on hold pending passing RfV rather than devising other special rules. It wouldn't be bad to do so for non-offensive definitions, so the overenthusiastic application of the 'offensive' label would do little harm. I'd also favor suspending existing uncited offensive definitions if some number of contributors greater than three agreed. DCDuring (talk) 12:52, 16 June 2022 (UTC)[reply]
You can talk all month about this, but what you really want to do is ban IPs. Equinox 04:20, 17 June 2022 (UTC)[reply]
P.S. You could also just be me, or SemperBlotto, and block people who look like bad faith, instead of wringing your hands and letting them create 80 entries which we then spend the next entire year putting solicitously through RFV. It worked in the old days. Ask Blotto what he thinks. Best wishes, Equinox 04:28, 17 June 2022 (UTC)[reply]
If you don't like it, you're gonna be terrified when I finally die and you wonder why there is a huge sudden influx of bullshit you have to deal with. Is there a statistician in the house? Equinox 04:29, 17 June 2022 (UTC)[reply]
Actuarially speaking we can expect something like 4 million more edits out of you, get cracking. - TheDaveRoss 17:39, 17 June 2022 (UTC)[reply]

Shitgibbons

In the spirit of the season, I wanted to bring up the topic of shitgibbons. This is the name that was coined a few years ago for those tiresome insults like cockwomble, jizztrumpet, cuntwaffle and wankpuffin that get used as faux-Britishisms by fans of Benedict Cucumber Sandwich, but also covers such delightful words as fucknugget, shitlicker, turd burglar and so on.

Naturally, documenting this is of the highest priority, but unfortunately the evidence for the term is a bit scant. There are some blog posts that do seem to be by genuine linguists [2][3][4] and an opinion piece, as well as a few other bits and pieces scattered around the web, but it would be good to know if there's something a bit more concrete.

So I guess my question is whether (a) anyone knows a more formal term, and (b) whether this is a phenomenon that crops up in other languages, because I genuinely do think it's deserving of a category due to its usefulness in etymologies. At the moment the entries just say things like wank +‎ puffin, which is utterly useless to anyone who wants to know where the word came from, and says nothing of the wider linguistic phenomenon that it developed out of. Theknightwho (talk) 01:58, 14 June 2022 (UTC)[reply]

Tessier & Becker (2018). 98.170.164.88 02:31, 14 June 2022 (UTC)[reply]
Perfect. Thank you. Theknightwho (talk) 02:34, 14 June 2022 (UTC)[reply]
I salute the pair of pissdrinkers who wrote it. Nicodene (talk) 22:40, 18 June 2022 (UTC)[reply]
I think it's worth making a category for. Binarystep (talk) 03:05, 14 June 2022 (UTC)[reply]
I think "3-syllable words" should be a parent category. 98.170.164.88 04:41, 14 June 2022 (UTC)[reply]
If I'm understanding this correctly, the following entries belong in the category, in addition to the ones mentioned above and the ones recently created by WordyAndNerdy:
Not an exhaustive list, but I went through Category:English vulgarities and these were the ones that stood out.
I'm not sure if we count words that have the syllable structure but where there is a meaningful interpretation (e.g. "assmuncher"), so I excluded them in favor of entries where the second component was obviously there for prosody. Other terms I was uncertain about since they don't sound quite like the pattern to me: douche canoe and fuck-knuckle. 98.170.164.88 05:34, 14 June 2022 (UTC)[reply]
From Category:English derogatory terms: assmonkey, cockweasel, dickweasel, shitpuddle, twatwaffle. Possibly (but not vulgar): nutburger, scumbucket, sleazebucket, slimebucket. 98.170.164.88 06:06, 14 June 2022 (UTC)[reply]
It definitely would include assmuncher - turd burglar is similar in that it's also not just total nonsense. Thanks for these - I'll get them added. Even the seemingly milder ones still follow the pattern (e.g. nut is being used to mean "crazy person"). Theknightwho (talk) 11:33, 14 June 2022 (UTC)[reply]
I second this in that I don't think that just any 3-syllable vulgarity that is composed of a 1-syllable and a 2-syllable word is a shitgibbon, which seems to be @Theknightwho's working definition: dick muncher, jizz bucket, nutsucker, dicksucker, dickrider, even piss drinker. My personal opinion is that the second part must not be semantically relevant for it to be a shitgibbon. Pinging @-sche, DCDuring, Benwing2. — Fytcha T | L | C 03:00, 15 June 2022 (UTC)[reply]
No, that isn't my working definition. There are specific requirements:
  • It must be an insult.
  • The stress must be antibacchius.
  • The first word must be an expletive.
  • The second word must be a noun.
Theknightwho (talk) 03:10, 15 June 2022 (UTC)[reply]
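For anyone sorting candidate entries into the category, the four requirements listed above can be written down as a simple check. This is only a rough sketch: the `Compound` record and all of its fields are hypothetical and hand-filled (nothing here comes from a Wiktionary module or API), and the stress pattern is encoded by assumption as one letter per syllable, with "SSu" standing for an antibacchius (two stressed syllables followed by an unstressed one). It does not settle the separate question raised in this thread of whether the second element must be semantically irrelevant.

```python
from dataclasses import dataclass


@dataclass
class Compound:
    """Hypothetical hand-annotated record for a two-part compound."""
    first: str                 # first element, e.g. "shit"
    second: str                # second element, e.g. "gibbon"
    is_insult: bool            # is the whole word used as an insult?
    first_is_expletive: bool   # is the first element an expletive?
    second_is_noun: bool       # is the second element a noun?
    stress: str                # one letter per syllable: "S" stressed, "u" unstressed


# Assumed encoding of an antibacchius: stressed, stressed, unstressed.
ANTIBACCHIUS = "SSu"


def meets_listed_criteria(c: Compound) -> bool:
    """Return True only if all four requirements from the list hold."""
    return (
        c.is_insult
        and c.stress == ANTIBACCHIUS
        and c.first_is_expletive
        and c.second_is_noun
    )


# Hand-filled annotations, for illustration only.
candidate = Compound("shit", "gibbon", True, True, True, "SSu")
non_candidate = Compound("blue", "bottle", False, False, True, "SSu")

print(meets_listed_criteria(candidate))
print(meets_listed_criteria(non_candidate))
```

The annotations still have to be supplied by a human editor; the function only makes explicit that all four conditions must hold at once.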
Note that none of the examples in the original blog post had meaningful second parts as in "dicksucker". And I would tend to agree that "dicksucker", "asskisser", etc. is a different phenomenon than what is going on with "cuntwaffle". 98.170.164.88 03:17, 15 June 2022 (UTC)[reply]
That is true. There is definitely a pattern going on with words consisting of a 1 syllable noun + a 2 syllable agent noun, with an antibacchius stress pattern. In an overwhelming number of cases they're used as insults, and almost all use an expletive or pejorative for the first word. If not shitgibbons, they're very closely related. They're overwhelmingly derogatory, too, which does not apply to true shitgibbons which use innocuous nouns. Theknightwho (talk) 03:36, 15 June 2022 (UTC)[reply]
@Theknightwho: Thank you for changing these words back again; I wholly agree with the contents of Category:English shitgibbons now. — Fytcha T | L | C 10:15, 15 June 2022 (UTC)[reply]
Theknightwho keeps edit-warring at shitgibbon to restore their poorly-formatted version of the etymology section, including a confusing, unhelpful circular explanation of the term's origin. Someone ought to tell them to knock it off. I thought working on "shitgibbon" entries would be a fun distraction from the heavier CFI matters that have arisen lately, but I'm checking out again before this drives me up the wall. WordyAndNerdy (talk) 04:23, 16 June 2022 (UTC)[reply]
To give background to this, the issue is with the etymology beginning "Shitgibbon of shit + gibbon." However, I've already explained at Talk:shitgibbon why this makes sense in the context of the rest of the etymology section. The word "shitgibbon" does not define what it means for a word to be a shitgibbon, and it didn't inspire all of the others. The causal relationship is the other way around: it was coined by linguists precisely because it came about as a shitgibbon in the first place. There's also no issue with separating the senses with bulletpoints, either. We do that in other entries as well. Theknightwho (talk) 04:36, 16 June 2022 (UTC)[reply]
There was a time where I thought Wonderfool had got a girlfriend, bc a number of these stupid words got girl audio in short order. Personally I used to like to do a bit of audio now and then, but I realised that audio is something where AI will beat us in a few years (we can generate perfect speech from the IPA). As humans, we should spend our effort on doing things that only humans can do: writing convincing definitions based on our knowledge of language and novels is good. Although some Harry Potter fuck-wit will ruin them in six months. Equinox 04:24, 17 June 2022 (UTC)[reply]

How to follow up on a move request

From what I understand, only admins can move pages. In April, I added a request to Wiktionary:Requests for moves, mergers and splits, which led to no discussion or other followup: Telchinis → Telchines and Hyadis → Hyades. These pages are currently located at non-lemma forms of the nouns, so moving them shouldn't be controversial ... the only question might be whether the singular Hyas or plural Hyades is better (both the cited dictionaries, L&S and Gaffiot, have the main entry at the nominative plural form). Is the best thing for me to do next making a Beer Parlour topic like this to get an admin's attention? Or just wait? Unlike with RFV, I don't understand what timeline to expect. Urszag (talk) 04:38, 14 June 2022 (UTC)[reply]

@Urszag Non-admins can move pages as well. Normally a redirect gets left behind, but if you have the right privileges (maybe it's the "mover" privilege?), you can disable that and not leave a redirect. I don't see why you shouldn't get that privilege; unless someone objects in the next couple of days, I'll give it to you. Benwing2 (talk) 07:54, 14 June 2022 (UTC)[reply]
Discussion moved to Wiktionary:Requests_for_moves,_mergers_and_splits#Renaming_Category:Disputed_terms_by_language_to_Category:Proscribed_terms_by_language.

Make Template:euphemistic spelling of categorize into a different category than Category:Euphemisms by language

I feel like there is an important difference between bullsh*t and at peace: the former is an attempt by the writer to adhere to the societal rule of not using profane language in certain settings and/or to distance themself from the profane nature of the underlying word while still conveying the exact profane word, whereas the latter is intended to evoke a different, milder feeling in the recipient than a non-euphemistic synonym would. I suggest renaming Template:euphemistic spelling of to Template:censored spelling of and making it categorize to Category:Censored spellings by language but I'm not married to this exact naming scheme; I'd be pleased with any scheme that clearly differentiates between the two. — Fytcha T | L | C 20:12, 14 June 2022 (UTC)[reply]

Good point. I agree these should be differentiated; at a minimum, looking at how {{rare spelling of|en|foobar}} categorizes into Category:English rare forms, I would've expected {{euphemistic spelling of}} to produce "Category:English euphemistic forms" (etc. m.m. for other languages). But "euphemistic" doesn't really seem like an intuitive way of describing bullsh*t, so I agree that something like "censored spelling" or "redacted spelling" (if "censored" is a loaded word) seems better. - -sche (discuss) 20:47, 14 June 2022 (UTC)[reply]
@-sche, Fytcha I agree with this, but should we use 'censored' or 'redacted'? Also there are some entries like fuggheaded and forkhead that are less obviously "censored" or "redacted". Benwing2 (talk) 00:05, 19 June 2022 (UTC)[reply]
@Benwing2: forkhead is much better categorized as a minced oath, no? Unless the claim is that it has the same pronunciation as fuckhead which I find doubtful. fuggheaded is tougher; either put it in a third category or bite the bullet and put it in the censored/redacted one. — Fytcha T | L | C 00:13, 19 June 2022 (UTC)[reply]

🏁 as a translation of chequered flag

Do we want these? I've seen some back and forth on these in another article that I can't remember. Either way, it'd probably be helpful to write down the community consensus in WT:TRANS. Pinging @Equinox, Fay Freak because I think you've been involved in something like this in the past. — Fytcha T | L | C 12:51, 16 June 2022 (UTC)[reply]

I don’t remember taking a stance on them, for I ignore this by reason that it is easy and the harm from these entries is predictably low and they will add them anyway, and it may be an argument that people seek out emojis as translations when remembering their names (although stupidly since this is the kind of thing you should and can search offline much easier with many a setup), probably it is even SEO, so I would rather just do nothing. Fay Freak (talk) 14:03, 16 June 2022 (UTC)[reply]
A translation into what? What language is emoji? Is it the same language as memes or is that a different language? What about animal noises? Facial expressions? Is there an ISO code for Extremely Modern Old Egyptian? - TheDaveRoss 15:31, 16 June 2022 (UTC)[reply]
I agree - it's not a translation. It might be possible to substitute it in various contexts, but that doesn't mean it's being translated into "Translingual". That doesn't make any sense. Theknightwho (talk) 15:33, 16 June 2022 (UTC)[reply]
Previous discussion: Wiktionary:Information desk/2021/June § Translingual translations to emoji. J3133 (talk) 15:40, 16 June 2022 (UTC)[reply]
Definitely not. It's not a translation. Simply ask any professional translator. Equinox 03:36, 17 June 2022 (UTC)[reply]
We also need to punish and lock away the people who regularly add the fuckin ice-cream emoji as a "translation" of the word ice cream. Ummmm yeah what language is that, and what grammar does it use? "Picture of X" ain't the same as "translation of X". Equinox 05:54, 17 June 2022 (UTC)[reply]
Counterpoint: This is useful cross-referencing between entries that's worth including, in much the same way as it is useful to link em dash to —, cotangent to cot, and cat to Felis catus. The problem is that calling it a "Translingual translation" makes us look stupid. A slightly less dumb way to handle this would be to put a more structured list at the beginning of the list of translations:
But that doesn't deal with the fact that we'd still be labelling them "translations". Even smarter would be to put them in a separate section of the entry ("See also" perhaps? Anyone worked out what that's for yet?) This, that and the other (talk) 06:56, 17 June 2022 (UTC)[reply]
“See also” seems a good place. — Sgconlaw (talk) 12:03, 17 June 2022 (UTC)[reply]
I was absolutely gonna say the same thing as Sgconlaw: I believe it's useful to link the entries, but we shouldn't pretend they are synonyms, or translations. "See also" is fine. Equinox 12:17, 17 June 2022 (UTC)[reply]
I did it at chequered flag. This, that and the other (talk) 12:27, 17 June 2022 (UTC)[reply]
@This, that and the other: By the way: "anyone worked out what see also is for?" I think that's exactly what it is: stuff that we know is related, or worth linking, but isn't explicitly a hypernym, hyponym, synonym, meronym, example, or whatever you can make up! There may be a far-future time where we literally connect everything semantically (uh, once we get rid of IPs, and Wonderfool), but this is a good start for now. Equinox 12:20, 17 June 2022 (UTC)[reply]
That's pretty much how I see "See also". (P.S. All "Wiktionary:Entry layout" says about it is this: "The “See also” section is used to link to entries and/or other pages on Wiktionary, including appendices and categories. Don’t use this section to link to external sites such as Wikipedia or other encyclopedias and dictionaries.") — Sgconlaw (talk) 12:52, 17 June 2022 (UTC)[reply]
We should ban Translingual translations with an abuse filter... This, that and the other (talk) 12:28, 17 June 2022 (UTC)[reply]
It would be better to change the translation adder first. Chuck Entz (talk) 20:33, 17 June 2022 (UTC)[reply]
Make me an interface-admin and I'll gladly take care of it 😉 This, that and the other (talk) 08:29, 18 June 2022 (UTC)[reply]
[5] — Fytcha T | L | C 12:31, 17 June 2022 (UTC)[reply]
Yeah, "See also" seems like a better place to put these than "Translations". - -sche (discuss) 20:23, 17 June 2022 (UTC)[reply]
Okay I've got CAT:Translingual translations almost empty. Hopefully someone here can figure out what to do with the last four entries. This, that and the other (talk) 14:18, 18 June 2022 (UTC)[reply]

I tried to import w:User:Abcormal/List of numbers in various languages to Appendix:Numerals in various languages, but an edit filter prevented me from importing it in its entirety, since I got an automated warning message saying that "this action has been identified as harmful." This list was deleted in an English Wikipedia AfD (Articles for Deletion) discussion. Although it may be beyond the scope of Wikipedia, this looks like it would be a very useful Wiktionary appendix. Could anyone help me import it? Many of the Wikipedia templates in there will also need to be removed, converted, and/or imported. Suomitaiga (talk) 22:44, 17 June 2022 (UTC)[reply]

@Suomitaiga I could speculate about what exactly set the abuse filter off, but what I see in the abuse filter logs wouldn't have worked, anyway.
  1. To start with, all the wikilinks are wrong: this is a different project, so you would have to add "w:" to the beginning of all the wikilinks. Either that, or use the {{w}} template instead.
  2. As you mentioned, most of the templates are wrong.
    1. We don't have things like "short description" or "cleanup lang"
    2. Wikipedia uses different templates for different languages and scripts. We use basically two:
      1. {{l|[language code]|[term to link to]|[display form]|[gloss]}} The fourth parameter (the gloss) can be replaced with |t=, and for cases where the correct transliteration isn't automatically provided, use |tr=
      2. {{m}}, which is identical except it displays in italics
      See WT:LOL for the list of language codes we use, and WT:LANGTREAT for explanations. Please note that you have to provide a language code.
    3. We don't have a "note" template
    4. There are differences in the templates that are shared between projects, though I couldn't give you a comprehensive rundown on them.
    5. I'm sure many of the cited references have their own dedicated reference templates. See Category:Reference templates by language
  3. This is a dictionary, so we have entries for almost all of the words. Please wikilink them or use the {{l}} or {{m}} templates I mentioned above. I see HTML markup within words for some languages, which we don't use. That complicates things.
Aside from that, I would recommend putting in the section headers and empty tables first, then adding content to one part at a time. The content is going to need to be reworked anyway, with the type of reworking depending on the language, so even without the abuse filters it's a good idea, anyway.
I should mention that the added templates will also add system overhead, so it may be a good idea to split the appendix into sub-pages. Chuck Entz (talk) 01:49, 18 June 2022 (UTC)[reply]
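The first point above (adding "w:" to the Wikipedia-style wikilinks) is mechanical enough to script before pasting. The following is a minimal sketch under stated assumptions: the function name `prefix_wikipedia_links` is made up for illustration, it only handles plain `[[target]]` and `[[target|label]]` links, and it would also wrongly prefix namespaced links such as files or categories, which would need to be handled by hand. It does nothing about the template conversions described in the later points.

```python
import re

# Matches [[target]] and [[target|label]]; nested brackets are not handled.
WIKILINK = re.compile(r"\[\[([^\[\]|]+)(\|[^\[\]]*)?\]\]")


def prefix_wikipedia_links(wikitext: str) -> str:
    """Rewrite plain wikilinks so they point at Wikipedia via the w: prefix."""

    def repl(match: re.Match) -> str:
        target, label = match.group(1), match.group(2)
        # Leave links alone if they already carry the prefix.
        if target.startswith(("w:", ":w:")):
            return match.group(0)
        if label is None:
            # Keep the displayed text identical to the original link.
            return f"[[w:{target}|{target}]]"
        return f"[[w:{target}{label}]]"

    return WIKILINK.sub(repl, wikitext)


sample = "See [[Basque language|Basque]] and [[Numeral]]."
print(prefix_wikipedia_links(sample))
```

Words that should link to Wiktionary entries still need {{l}} or {{m}} with a language code, so a script like this only covers the links that are meant to stay pointed at Wikipedia.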
I fished the headers and the smallest of the tables out of the abuse logs as a start. We seem to have a different orthography for the languages in question, so they're all redlinks except for pages with sections only for unrelated languages.
Although most languages will yield much better results, this looks like a massive undertaking / time-sink in order to do it even half right. Chuck Entz (talk) 03:37, 18 June 2022 (UTC)[reply]

Sicilian phonemic transcriptions

(Notifying Inqvisitor, Scorpios90, Afc0703, A. T. Galenitis, SignorNic, 151.82.148.85, 151.18.206.223):

Many of our Sicilian entries have phonemic transcriptions that aren't actually phonemic. Some examples are:

/ɐ̠l.fɐ̠bˈbɛː.tu̞/ = alfabbetu
/çɪɾɪ(ɨ)ˈv(ʲ)ɛɖːu/ = ciriveddu
/ˌnku.nʊk.ˈkja.ɾɪ/ = ncununcchiari
/ˈcjum.mʊ/ = chiummu
/ˈpa.ʃɪ/ = paci
/sʊˈʃːaɾɪ/ = susciari
/ˈvɛːk.cju̞/ = vecchiu

These are really just phonetic transcriptions of varying accuracy, as they indicate various allophonic phenomena, such as stressed-open-syllable lengthening (ˈɛː), unstressed vowel reduction (ɪ, ɨ, ʊ), and synchronic palatalization (vʲɛ, cj).

Phonemic transcriptions are only meant to contain distinctive elements or things that are not 'predictable' in a language. The phonemic structure of the above words, for instance, would be something like:

/alfabˈbɛtu/
/tʃiriˈvɛɖɖu/
/nkunukˈkjari/
/ˈkjummu/
/ˈpatʃi/
/suˈʃari/
/ˈvɛkkju/

The accompanying phonetic transcriptions can include as many details as one likes. For the last five words, we would have perhaps something like:

[ˌŋku.nʊc.ˈcjäː.ɾɪ]
[ˈcjum.mʊ]
[ˈpäː.ʃɪ]
[sʊʃ.ˈʃäː.ɾɪ]
[ˈvɛc.cjʊ]

The above is only meant as an example; the important thing is to agree on some way or other of doing things. Ideally this would involve finding a source or two that describes Sicilian phonetics in detail.

Perhaps we could design an automated pronunciation module for Sicilian, such as the one that @Benwing2 has made for Italian, and then have a bot clean up the 1001 Sicilian entries with manually written pronunciations. Nicodene (talk) 01:19, 18 June 2022 (UTC)[reply]
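As a first step toward such a cleanup, a bot could at least list which slash-delimited transcriptions contain symbols that this post treats as allophonic. The sketch below is only an illustration under stated assumptions: the function name `allophonic_marks` and the symbol set are made up here, the set is taken from the examples in this post (length mark, reduced vowels, palatalization superscript) rather than from a sourced Sicilian phoneme inventory, and any combining diacritic is treated as phonetic detail. It does not detect sequences such as cj, which would need a real phonological analysis.

```python
import unicodedata

# Assumption: symbols named in the post as marking allophonic detail.
ALLOPHONIC_CHARS = {"ː", "ɪ", "ɨ", "ʊ", "ʲ"}


def allophonic_marks(transcription: str) -> list:
    """Return the suspect symbols found in a /phonemic/ transcription.

    Symbols are reported once each, in order of first appearance.
    """
    found = []
    for ch in transcription.strip("/"):
        # unicodedata.combining() is nonzero for combining diacritics
        # such as the retraction or lowering marks under a vowel.
        if ch in ALLOPHONIC_CHARS or unicodedata.combining(ch):
            if ch not in found:
                found.append(ch)
    return found


for entry in ["/alfabˈbɛtu/", "/ˈpa.ʃɪ/", "/sʊˈʃːaɾɪ/"]:
    print(entry, allophonic_marks(entry))
```

An empty list does not prove a transcription is phonemic; it only means none of the listed symbols appear, so flagged entries would still need review by someone who knows the language.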

@Nicodene I would agree with this. For some languages such as Russian we include some allophonic phenomena in the phonemic pronunciations but I prefer to separate the phonemic and phonetic variants. Unfortunately I don't know much at all about Sicilian pronunciation or how predictable it is from the spelling. Benwing2 (talk) 01:28, 18 June 2022 (UTC)[reply]
Also ncununcchiari has a pronunciation without the third /n/, and likewise the conjugation table leaves out the third n. I assume there is a mistake somewhere. Benwing2 (talk) 01:30, 18 June 2022 (UTC)[reply]
Yes, not sure what happened with the /n/.
For Russian, it seems we don't have phonemic transcriptions at all, only phonetic ones. Perhaps that was to avoid disagreements over what counts as a phoneme, since e.g. the unstressed vowel mergers complicate things. Also the old controversy over /ɨ/. Nicodene (talk) 02:21, 18 June 2022 (UTC)[reply]

Removal of PWG from etymologies

Some users are doing this [6], essentially removing the Proto-West Germanic step in etymologies from view, and replacing it with a dercat marker which adds it to the category at the bottom of the page. Is this something we want to do? Leasnam (talk) 13:23, 18 June 2022 (UTC)[reply]

No, we would want the exact opposite: PG in the dercat, and PWG in the etymologies. Thadh (talk) 13:25, 18 June 2022 (UTC)[reply]
That's what I was thinking as well. Leasnam (talk) 13:27, 18 June 2022 (UTC)[reply]