Wiktionary:Beer parlour

Welcome to the Beer Parlour! This is the place where many a historic decision has been made, and where important discussions are being held daily. If you have a question about fundamental aspects of Wiktionary—that is, about policies, proposals and other community-wide features—please place it at the bottom of the list below (click on Start a new discussion), and it will be considered. Please keep in mind the rules of discussion: remain civil, don’t make personal attacks, don’t change other people’s posts, and sign your comments with four tildes (~~~~), which produces your name with timestamp. Also keep in mind the purpose of this page and consider before posting here whether one of our other discussion rooms may be a more appropriate venue for your questions or concerns.
Sometimes discussions started here are moved to other pages for further development. In particular, changes to a major policy or guideline may be discussed on the corresponding talk page and “simple votes” (as opposed to drawn-out discussions) can be conducted on our votes page.
Questions and answers typically remain visible on this page for one to two months, but they can always be found in the appropriate monthly archive (based on the date discussion was initiated). While we make a point to preserve all discussions that were started here, talk that is clearly not appropriate for this page may be deleted. Enjoy the Beer parlour!
Global ban proposal for Shāntián Tàiláng
Hello. This is to notify the community that there is an ongoing global ban proposal for User:Shāntián Tàiláng who has been active on this wiki. You are invited to participate at m:Requests for comment/Global ban for Shāntián Tàiláng. Wüstenspringmaus (talk) 12:08, 2 February 2025 (UTC)
Are dictionaries of other languages acceptable as a reference for an entry?
A while back I found that the entry palco has a monolingual Portuguese dictionary as the reference for the Spanish entry. This didn't seem right, so I deleted that reference, but I see it's been reverted. I'm wondering if perhaps it is in fact acceptable to use dictionaries of other (relatively closely related) languages as references for an entry? Laralei (talk) 23:35, 2 February 2025 (UTC)
- I don't see the issue with doing so for etymologies, as the entry in question does, as long as the source in question also mentions the Spanish term. — SURJECTION / T / C / L / 07:05, 3 February 2025 (UTC)
- I won't weigh in on the use of the reference, but since it was my bot that reverted your edit I want to clarify why: your edit removed the <references/> tag without deleting the actual <ref>...</ref> reference. Since <references/> just tells the page where it should display any previously tagged references, after your edit, there was no explicit destination for the references and they ended up displayed at the end of the page after the categories. If there had been another language after Spanish that included <references/>, the reference would have been incorrectly displayed there. The bot detected that the Spanish entry included explicit references without a References section and added one. JeffDoozan (talk) 18:40, 4 February 2025 (UTC)
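The kind of check described in the comment above can be sketched roughly as follows. This is a minimal illustration in Python and not the bot's actual code: the function name `sections_missing_reflist` and the regular expressions are assumptions made for the example, and the sketch only looks for a literal <references/> tag, ignoring any template that might produce one.

```python
import re
from typing import List

# Level-2 headings such as "==Spanish==" delimit language sections.
# The [^=] after the opening "==" keeps "===Noun===" from matching.
L2_HEADING = re.compile(r"^==([^=].*?)==[ \t]*$", re.MULTILINE)
# Matches <ref> and <ref name="...">, but not <references/>.
REF_TAG = re.compile(r"<ref[\s>]", re.IGNORECASE)
# Matches <references/> and <references>.
REFLIST_TAG = re.compile(r"<references\s*/?>", re.IGNORECASE)


def sections_missing_reflist(wikitext: str) -> List[str]:
    """Return names of language sections that use <ref> but have no <references/>."""
    headings = list(L2_HEADING.finditer(wikitext))
    missing = []
    for i, heading in enumerate(headings):
        # A section runs from the end of its heading to the next level-2 heading.
        end = headings[i + 1].start() if i + 1 < len(headings) else len(wikitext)
        body = wikitext[heading.end():end]
        if REF_TAG.search(body) and not REFLIST_TAG.search(body):
            missing.append(heading.group(1).strip())
    return missing
```

Under these assumptions, a page whose Spanish section keeps a <ref> after its <references/> tag was removed would be reported as `["Spanish"]`, which is the situation in which a References section would need to be added back.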
Reminder: first part of the annual UCoC review closes soon
Please help translate to your language.
This is a reminder that the first phase of the annual review period for the Universal Code of Conduct and Enforcement Guidelines will be closing soon. You can make suggestions for changes through the end of day, 3 February 2025. This is the first step of several to be taken for the annual review. Read more information and find a conversation to join on the UCoC page on Meta. After review of the feedback, proposals for updated text will be published on Meta in March for another round of community review.
Please share this information with other members in your community wherever else might be appropriate.
-- In cooperation with the U4C, Keegan (WMF) (talk) 00:49, 3 February 2025 (UTC)
English verb conjugation tables
Occasionally I come across conjugation tables for English verbs, such as at clarify#Conjugation. Do we have a policy about including or not including these? {{en-conj}} says "In general, it should only be used for verbs with archaic forms", but doesn't every English verb (potentially) have archaic forms in "-est" or "-eth" at least? Even verbs invented in modern times can be cast in this archaic style for effect. Mihia (talk) 18:53, 4 February 2025 (UTC)
- @Mihia: in theory, yes, but I always check to see if at least one of the archaic forms is attestable before adding the conjugation table, and I don't create entries for forms which I can't find uses of. I think it is useful to include conjugation tables, otherwise there really isn't anything linking the lemma to the archaic forms. — Sgconlaw (talk) 19:49, 4 February 2025 (UTC)
- For clarity, should we change "should only be used for verbs with archaic forms" to "should only be used for verbs with attestable archaic forms"? Mihia (talk) 19:55, 4 February 2025 (UTC)
- @Mihia: I guess that would be OK with me. We should clarify, though, whether this means (1) the same standards of verifiability as lemmas (I think not, since as far as I can tell we don't require other inflections of verbs to be separately attestable with at least three quotations); and (2) whether unattestable archaic forms need to be omitted from the conjugation table altogether, or whether they can remain in the table as red or green links (I would say yes since, for example, in an Irish verb entry the mutation table notes that "[c]ertain mutated forms of some words can never occur in standard Modern Irish. All possible mutated forms are displayed for convenience."). — Sgconlaw (talk) 20:26, 4 February 2025 (UTC)
- I agree that there should be linkage to the archaic forms. I wonder, though, whether the conjugation table makes more of a meal of this than is necessary. Apart from vanishingly unusual special cases, such as perhaps the "be" verb, are there many (any?) verbs with "unexpected" entries in any of these conjugation fields, beyond what is already listed in the head anyway (notably irregular past tense / participle)? Is the conjugation table in fact making things look more complicated than they really are -- giving the impression, for instance, that certain verbs have an irregular imperative or subjunctive, or a (modern) past tense varying with person/number, etc.? Mihia (talk) 18:03, 5 February 2025 (UTC)
- I would say that it should be restricted to verbs which existed back when the forms were common, or modern verbs that use the archaic forms. (Maybe not even those, since only certain archaic verb forms are used a lot these days.) CitationsFreak (talk) 20:38, 4 February 2025 (UTC)
- I agree. I always thought conjugations were sort of exempted from the WDL rules. I mean, they're conjugations. MedK1 (talk) 20:42, 6 February 2025 (UTC)
- IMO English verb conjugation tables are overkill in the vast majority of cases. I've removed several of them. I think the only situation where such tables are needed is (as mentioned by @CitationsFreak) verbs that existed <= 1650 AD (when the archaic forms were still common), and even then maybe only if the forms aren't completely predictable. Otherwise it just adds noise to the entries. Benwing2 (talk) 00:27, 10 February 2025 (UTC)
- I have been in favour of conjugation tables in English; I find it weird that we (the English Wiktionary) have big tables showing all kinds of obscure or obsolete French and German verb and noun forms, but many English verb entries don't link to their less-common inflected forms at all, or acknowledge that they existed. I concede that the average user may not be interested in such forms. Perhaps we might alternatively have a little button at the end of the forms that are listed in the headword, that said "show archaic and obsolete forms", that would make them appear, but then, that's not all that different from having a collapsed table with its button to make the forms appear. I don't know, perhaps someone can think of some other way of linking from lemmas to their inflected forms. - -sche (discuss) 03:45, 10 February 2025 (UTC)
- A feature to "show archaic and obsolete forms", which showed just those forms, is a better idea, in my opinion. The present "conjugation table" gives the impression that any of those fields might be irregular in modern English, over and above what is already displayed in the head. For example, that English verbs routinely might have an irregular or unpredictable subjunctive form, or whatever. While archaic forms are shown too, there is no indication that the purpose of the table is to show these, and not deal generally with conjugation of the verb, including modern forms. It makes English verbs, which in modern use overwhelmingly have only the forms listed in the head, look more complicated than is necessary. A "show archaic and obsolete forms" feature would avoid that. Mihia (talk) 10:00, 10 February 2025 (UTC)
- I support this idea. I have the same concerns as @Mihia about having a big conjugation table that gives a misleading impression about English verbs, and I think something that only shows the archaic and obsolete forms, or (maybe) shows both but clearly indicates which ones are archaic vs. normal and is only present when there are archaic forms, would be the best. It is a bit similar to the way that I've segregated archaic/obsolete/literary/etc. forms for Italian verbs like essere, facere and avere into separate tables; similarly, Russian pre-reform conjugations get their own tables. The difference here is that English verbs are simple enough that the headword is able to show *ALL* non-archaic/obsolete info, so there's no need for the "regular" table at all. Benwing2 (talk) 10:14, 10 February 2025 (UTC)
- Worth noting that I tend to agree with -sche on this and would likely oppose the vote. I see value in, say, come#Conjugation; sure, we could present this in a textual form ("archaic second-person singular forms camest and camedst are attested for this verb"), but a table is much clearer way of presenting that info. Whether the imperative and subjunctive need to be included is a different question to removing the table entirely. This, that and the other (talk) 01:52, 14 February 2025 (UTC)
- @This, that and the other What exactly do you oppose/support? Do you want verb tables on *all* verbs (which I would very strongly oppose) or only on verbs with attested archaic forms? Are you opposed to labeling the verb table "(including archaic and obsolete forms)" to emphasize that that is the main point of the table? Please note that even in come, the verb table presented is misleading, because (as can be seen by clicking on it) the form camedst labeled "archaic or obsolete" is not a genuine archaism but a rare, hypercorrect pseudo-archaism, and the form comen labeled "rare" is in fact archaic. I would suspect quite a lot of the tables have basic mistakes in them like this, because many of the people who are inclined to add such tables are not doing it to impart actual information but because it's "kewl". I would be inclined in fact to rename the table to {{en-conj-archaic}} and remove the support for not setting |old=1, so it's obvious that it's only intended for verbs in existence <= 1650 AD and would not make any sense for e.g. "to text" or "to download". Benwing2 (talk) 03:41, 14 February 2025 (UTC)
- @Benwing2 My understanding is that {{en-conj}} is only to be used on verbs where at least one form not listed in the headword line is attested. I don't support use beyond that. As such I'd definitely support removing the old= parameter and would not be opposed to renaming the template. I could support the "(including archaic and obsolete forms)" caption as well. All very good ideas!
- Honestly I think the template needs to be written in Lua; it would have to be one of the most complex pure-wikitext templates that we still have. This could make it easier for people to specify proper labels using <q:> syntax. You could even reformat the cells as
- came
- come (nonstandard)
- camedst (archaic, rare, hypercorrect)
- to avoid the proliferation of footnotes. This, that and the other (talk) 03:53, 14 February 2025 (UTC)
- Yeah I completely agree about rewriting it in Lua. Benwing2 (talk) 03:58, 14 February 2025 (UTC)
- @Benwing2, CitationsFreak, Mihia: I don't object if the template is rewritten so that it is only for displaying the archaic forms, and would just make the following comments on that:
- It would be easier to understand the table if it compares the modern forms with the archaic forms. For example, if on the 3rd-person singular present row the table showed the cells for both clarifies and clarifieth, this would be helpful to readers as the label "3rd-person singular" alone may not be understandable by readers unfamiliar with grammar.
- Should the table also allow for modern variant inflected forms to be specified if, say, there are two or more of such forms, so they do not clutter up the headword line? (Offhand, I can't think of any specific examples.)
- — Sgconlaw (talk) 12:47, 15 February 2025 (UTC)
- I created a draft vote on the usage of en-conj at Wiktionary:Votes/2025-02/Retiring_the_English_verb_conjugation_table. Please put comments specifically on the wording of the vote at the discussion page there. Mihia (talk) 09:26, 14 February 2025 (UTC)
- FYI: @Mihia, @-sche, @CitationsFreak, @MedK1, @Sgconlaw, @This, that and the other: Looks like the vote has started. (I did not tag Benwing since he's already been made aware on Discord) AG202 (talk) 23:43, 13 March 2025 (UTC)
- I actually asked @Mihia on the vote talk page if the vote had actually started or not, but they haven't responded. Looks like it has now started, though. — Sgconlaw (talk) 11:14, 14 March 2025 (UTC)
Lutuv vs. Lautu Chin for language name
A majority of recent academia on Lutuv (language code: clt), a Kuki-Chin language of the Maraic branch, uses the name Lutuv (as used by its speakers) rather than the term "Lautu Chin," as it is currently set on here. "Lautu" is an exonym used by the Hakha Chin adopted in the earliest scholarship that mentions the language & people, much of which did not work directly with Lutuv people or their language, & all recent works either exclusively use the name Lutuv, or list "Lautu" merely as an alternative name. For an example of this among recent working papers on the language, see here, & for an example among papers working with the community in a non-linguistic context, see here. One can also find community centers like churches, both in the native region & among diaspora communities, preferring the "Lutuv" spelling.
So, this is my longwinded way of asking- could we change the name of "Lautu Chin" on Wiktionary to reflect the preferred name? CedarForest14 (talk) 01:38, 5 February 2025 (UTC)
- Pinging @-sche who seems to know a lot about obscure languages :) ... Benwing2 (talk) 00:24, 10 February 2025 (UTC)
- Support, "Lutuv" indeed seems to be more common these days. ("Lautu Chin" should be retained as an alt name; bare "Lautu" and "Lutuv Chin" also occur in a few papers and could be added as alt names for findability.) Glottolog's bibliography, which can sometimes (for languages that are written about more often) give indications of what names are commonly used to refer to certain languages, is no help here as it has only a forty-year-old French-language ethnography of the people, the ISO code request form, and two linguistic works that are more concerned with Proto-Chin than anything modern, and don't mention this language in their titles AFAICT. But on Google Scholar I indeed find more papers about "Lutuv" than "Lautu"/"Lautu Chin". - -sche (discuss) 02:12, 10 February 2025 (UTC)
- Thanks @-sche! If there are no objections, in a couple of days I'll switch the language name and move the lemmas, categories, etc. to use the new name. Benwing2 (talk) 10:15, 10 February 2025 (UTC)
- Thank you both very much, @-sche & @Benwing2, for your input- I'm hoping to add more entries for the language from existing publications soon, so I appreciate the help, & eagerly await the change! CedarForest14 (talk) 19:07, 15 February 2025 (UTC)
Switching all Turkish verbs to tr-conj-table template?
It is formatted more accurately (the 6 big blocks with 4 sub-blocks each is very systematic and how verbs are formed) and has more information than tr-conj, so I think we should. If not, then I'll edit tr-conj to make it better, but that would be a waste of time considering the work has already been done in tr-conj-table. Zbutie3.14 (talk) 02:41, 5 February 2025 (UTC)
- @Zbutie3.14 I support this, although the {{tr-conj-table}} template should be renamed {{tr-conj}} and the existing {{tr-conj}} table deleted. How would I bot-convert from the old templates to the new one? I don't know much of anything about Turkish so I don't know what parameters need to be passed to {{tr-conj-table}}. Benwing2 (talk) 00:23, 10 February 2025 (UTC)
- @Trimpulot Zbutie3.14 (talk) 00:27, 10 February 2025 (UTC)
- @Benwing2 {{tr-conj-table}} only requires at most 1 parameter in the form of any of the vowels <ı, i, u, ü> or the letter <d>: the vowel is required when the verb's stem is monosyllabic and its aorist is expressed with a -V4r suffix (current tables always have a parameter that is the aorist vowel), while the <d> is required only for the verbs etmek, gitmek and gütmek, plus their compounds (i.e. covering all the verbs currently covered by {{tr-conj-*tmek}}).
- In all other cases, no parameters are needed.
- Trimpulot (talk) 06:21, 10 February 2025 (UTC)
- If I understand this correctly, a current {{tr-conj|stem|V1|aorist|V2|t-or-d}} should be replaced by {{tr-conj-table|V2}} if stem is monosyllabic (contains only one occurrence of an ⟨a⟩, ⟨e⟩, ⟨ı⟩, ⟨i⟩, ⟨o⟩, ⟨ö⟩, ⟨u⟩ or ⟨ü⟩) and V2 is one of ⟨ı⟩, ⟨i⟩, ⟨u⟩ and ⟨ü⟩. Otherwise, it should be replaced by a simple {{tr-conj-table}}.
- For example:
- at almak: {{tr-conj|al|a|alır|ı|d}} → {{tr-conj-table|ı}};
- at yanmak: {{tr-conj|yan|ı|yanar|a|d}} → {{tr-conj-table}} (V2 is ⟨a⟩);
- at edilmek: {{tr-conj|edil|i|edilir|i|d}} → {{tr-conj-table}} (edil is polysyllabic).
- Moreover, a current {{tr-conj-*tmek|⟨letters⟩}} should be replaced by {{tr-conj-table|⟨letters⟩t|d}}.
- For example:
- at affetmek: {{tr-conj-*tmek|affe}} → {{tr-conj-table|affet|d}}.
- --Lambiam 07:55, 10 February 2025 (UTC)
- @Lambiam Slight correction: {{tr-conj-table|d}} is enough for any {{tr-conj-*tmek}}, so affetmek would only need {{tr-conj-table|d}}. Trimpulot (talk) 08:39, 10 February 2025 (UTC)
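The conversion rule discussed above, including the correction that {{tr-conj-table|d}} is enough for any {{tr-conj-*tmek}} call, could be sketched for a bot roughly as follows. This is only an illustration in Python under stated assumptions: the function names `count_vowels` and `convert_call` are hypothetical, the input is assumed to be a single, un-nested template call with positional parameters, and only the eight vowels named in the discussion are counted.

```python
# The eight vowels listed in the discussion; stems are assumed to be lowercase.
VOWELS = set("aeıioöuü")
# Vowels that must be passed on when the stem is monosyllabic.
HIGH_VOWELS = set("ıiuü")


def count_vowels(stem: str) -> int:
    """Count vowel letters in a stem as a stand-in for its syllable count."""
    return sum(1 for ch in stem if ch in VOWELS)


def convert_call(call: str) -> str:
    """Convert one old-style template call to the proposed new form."""
    text = call.strip()
    if not (text.startswith("{{") and text.endswith("}}")):
        raise ValueError("not a template call: " + call)
    parts = [p.strip() for p in text[2:-2].split("|")]
    name, args = parts[0], parts[1:]

    if name == "tr-conj-*tmek":
        # Per the correction, no stem needs to be passed, only "d".
        return "{{tr-conj-table|d}}"

    if name == "tr-conj":
        if len(args) < 4:
            raise ValueError("expected stem|V1|aorist|V2: " + call)
        stem, aorist_vowel = args[0], args[3]
        if count_vowels(stem) == 1 and aorist_vowel in HIGH_VOWELS:
            return "{{tr-conj-table|" + aorist_vowel + "}}"
        return "{{tr-conj-table}}"

    raise ValueError("unhandled template: " + name)
```

Run against the examples given in the thread, this sketch maps the almak call to {{tr-conj-table|ı}}, the yanmak and edilmek calls to a bare {{tr-conj-table}}, and the affetmek call to {{tr-conj-table|d}}. Calls with malformed parameters raise an error instead of being guessed at, so they can be left for conversion by hand.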
- I agree that ultimately the new template should replace the old one. To avoid switch-over problems, I think this is best done in phases.
- Phase 1A: Move page {{tr-conj}} to page {{tr-conj-obs}}, leaving a redirect behind.
- Phase 1B: Edit all uses (transclusions) of {{tr-conj}} to use {{tr-conj-obs}} instead.
- Phase 2A: Move page {{tr-conj-table}} to page {{tr-conj}}, overwriting the redirect.
- Phase 2B: Transform all uses of {{tr-conj-obs}} into uses of the new {{tr-conj}}.
- Phase 2C: Transform the uses of {{tr-conj-*tmek}} into uses of {{tr-conj}}.
- Phase 3: The now unused templates {{tr-conj-obs}} and {{tr-conj-*tmek}} (double check) may be relegated to the dustbin of unused bits.
- --Lambiam 08:16, 10 February 2025 (UTC)
- @Zbutie3.14 Of course I agree with this decision, but we should also hear what the other Turkish editors have to say.
- @Kakaeater, Keleci, Lagrium, Lambiam, Moonpulsar, Orexan, Whitekiko.
- Trimpulot (talk) 06:31, 10 February 2025 (UTC)
- Fine with me. I don’t think anyone will miss forms like “almaz mıymışsın? ” and their ilk. --Lambiam 08:26, 10 February 2025 (UTC)
- This sounds good. What you describe about moving the old template out of the way is exactly what I've done in similar situations; essentially {{tr-conj}} -> {{tr-conj/old}}, then {{tr-conj-table}} is moved to overwrite {{tr-conj}}, and then the bot does its thing, and finally you delete {{tr-conj/old}} and the other old templates. If there are no objections, I'll do this in a couple of days. Benwing2 (talk) 10:08, 10 February 2025 (UTC)
- @Trimpulot @Lambiam I'm doing a bot run now to convert the calls. There's still ~ 300 verbs using {{tr-conj-v}} as well as a few using {{tr-conj-mi}}, {{tr-demek-yemek}} and {{tr-conj-aux-bil}} to convert, plus some miscellaneous leftover stuff to delete like {{tr-conj-exp}}. Benwing2 (talk) 08:42, 14 February 2025 (UTC)
- Also the module can't handle the suffixes -akalmak and -ekalmak and throws an error; please fix, thanks! Benwing2 (talk) 08:46, 14 February 2025 (UTC)
- @Benwing2 tr-conj-mi should stay for now, whilst tr-conj-aux-bil verbs are already treated in the template in their non-bil form.
- I will add support for suffixes right away. Trimpulot (talk) 08:48, 14 February 2025 (UTC)
- @Trimpulot Thanks! What about {{tr-conj-v}}, how should they be converted? Also there are three verbs still using {{tr-conj/old}} that my module refused to convert because there was something wrong with the params. Can you convert them by hand? Then I can remove {{tr-conj/old}}. Benwing2 (talk) 09:48, 14 February 2025 (UTC)
- @Benwing2 tr-conj-v shouldn't need any params to be converted. As for those three pages, I'll take care of them. Trimpulot (talk) 10:46, 14 February 2025 (UTC)
- @Trimpulot I converted {{tr-conj-v}}. There are still three terms that use {{tr-conj-head}}. Can you add the requisite support for them (bilmek using {{tr-conj-aux-bil}} and dayak yemek and kazık yemek using {{tr-demek-yemek}}) to {{tr-conj}} and convert them to {{tr-conj}}? Also can you add support to {{tr-conj}} for {{tr-conj-mi}} and convert mı to use {{tr-conj}}? The overall appearance of the new {{tr-conj}} templates is radically different from the old ones and it would be best to have all Turkish verbs use the same (new) look and feel. Once you convert all the templates, you should convert the colors in {{tr-conj}} so they support dark mode; I can help you with that. Benwing2 (talk) 07:25, 15 February 2025 (UTC)
- I've fixed dayak yemek and kazık yemek, and I'll get to work on implementing a table for mi. As for suffixed bilmek, I'm not sure we should even give it a table of its own anymore, since it's included in the standard verb table. I'll greatly appreciate your help to make the table dark-mode friendly since I have no idea how to do that. Trimpulot (talk) 09:50, 15 February 2025 (UTC)
- @Trimpulot The only thing left now preventing deletion of {{tr-conj-head}} is {{tr-conj-aux-bil}}. The text says that it is suppletive as an auxiliary; how should we handle this? Can you format the bilmek entry appropriately? As for dark mode, it would be good to restructure Module:tr-conj to avoid duplication and use CSS classes rather than raw inline styles for the colors, along with a separate style.css file containing the CSS class definitions. For an example, see Module:is-noun/style.css and its use in the make_table() function of Module:is-noun (starting on line 3829). Benwing2 (talk) 00:57, 16 February 2025 (UTC)
- @Benwing2 I'll add a special case for if the entry ends in "-bilmek", so that {{tr-conj-aux-bil}} will need to be replaced with {{tr-conj|pot}}, though I think other entries in -bilmek aside -bilmek itself shouldn't exist. I'll try to see if I can make the css work, just where can I find all the --wikt-palette colours?
- @Trimpulot see MediaWiki:Gadget-Palette/table. If there's a color you need that isn't in the palette, I may be able to add it. Benwing2 (talk) 07:12, 16 February 2025 (UTC)
- @Benwing2 Thanks! I'll get to it as soon as I can.
- Trimpulot (talk) 07:14, 16 February 2025 (UTC)
- @Benwing2 It's done. Trimpulot (talk) 11:29, 16 February 2025 (UTC)
French pronunciation
Why is there shown no primary stress mark on the last syllable of French words in the pronunciation sections although, in speech, it can very easily be heard to be there – maybe because it practically always falls to be situated just there? 2001:14BB:112:8152:0:0:340:5F01 17:54, 5 February 2025 (UTC)
- Our automatic transcriptions for French are phonemic, and stress is not phonemic in French. Nicodene (talk) 04:22, 6 February 2025 (UTC)
- @Nicodene: why is that? I’m curious. — Sgconlaw (talk) 04:33, 6 February 2025 (UTC)
- Why our transcriptions are phonemic, or why stress in French is not? Nicodene (talk) 05:39, 6 February 2025 (UTC)
- Perhaps it could be possible to create a template for phonetic transcriptions, also? 2001:14BB:AF:7F8A:0:0:5A68:101 08:23, 6 February 2025 (UTC)
- French doesn't always have stress on the last syllable of a word. That is how English speakers hear the pronunciation of French words in isolation, but when words are connected in phrases, this stress normally appears only on the last syllable of every phrase, not on the last syllable of each word in every phrase. Wikipedia gives the example "la.pə.tit.mɛ.zɔ̃ dɑ̃.la.pʁɛ.ˈʁi" for "La Petite Maison dans la prairie". That is another good reason not to transcribe it, aside from the fact that stress does not distinguish different words from each other in French and so, as Nicodene said, is not a phonemic property of French words.--Urszag (talk) 10:11, 6 February 2025 (UTC)
- Yeah French is well-known for glomming together multiple words with various transformations applied, such as elision, liaison, enchaînement (= resyllabification?) and schwa-dropping. Hence Doukipudonktan = D'où (est-ce) qu'il pue donc tant at the beginning of Zazie dans le Métro. (FWIW similar things happen in spoken Egyptian Arabic, and maybe also in Icelandic.) Benwing2 (talk) 00:33, 10 February 2025 (UTC)
- Why is it assumed that every reader knows how isolated words are pronounced in French? Because it is a UN language? If a Chinese learner comes in here what will s/he say? mérsi -ending up in English- or mersí? Why aren't there small explanatory notes for us the ignorantes? (plus a nice audio with isolated words and a whole phrase) Notes for basic things, like 'this language lemmatises verbs in the infinitive' 'that language lemmatises verbs in 1st person present' etc... En.wikt prides itself on accuracy and documentation (: ‑‑Sarri.greek ♫ I 02:15, 10 February 2025 (UTC)
- Phonemic transcriptions by their nature do not show non-phonemic information. Their purpose is to abstract away from details that are not significant in a language (which can be distracting or irrelevant) and just show the aspects of pronunciation that are unique to a word, not phonetic features shared across the entire language. The latter would be discussed in a phonetic description of the language (if looking for a scholarly treatment), or a general guide to pronunciation (if looking for a practical resource for language learners). For some languages, we show phonetic transcriptions, but this is often more complicated to do. It's normal for a word to have many phonetic variants that could potentially be transcribed differently depending on the transcriber's preferences and what the transcriber wants to emphasize. Furthermore, audio recordings of French can easily be found, so it's not clear to me how much value it adds to attempt to include detailed, narrow phonetic transcriptions of French words.
- For comparison, pitch is a phonetic aspect of the pronunciation of words in any language. But we do not include pitch in our pronunciation information for English words, because pitch is not phonemic in English and pitch patterns are not consistently associated with particular words (but instead depend on the intonation of phrases). If somebody wants to understand how pitch is used as a feature of English pronunciation, the way to do it is not to look up the pronunciations of individual words: instead, it's better to read descriptions of English prosody and listen to audio of English sentences. Likewise, French prosody is not communicated effectively by putting stress marks on the entries for individual French words.--Urszag (talk) 10:48, 10 February 2025 (UTC)
Wiktionary:Audio whitelist
Hello,
I am considering a petition to create a project page titled Wiktionary:Audio whitelist. Users can consider major audio contributors for autopatrolled audio rights, and an administrator approves. Trusted audio contributors' sound files are presumed high-quality, thus do not need to be manually patrolled.
Thank you Flame, not lame (Don't talk to me.) 19:40, 5 February 2025 (UTC)
- If we can get the whitelisters' audios automatically added to en.wikt after creation on Lingualibre, that'd be cool - save me some time. Father of minus 2 (talk) 22:17, 5 February 2025 (UTC)
- psst! the project page Wiktionary:Approved Lingua Libre users already exists! Juwan (talk) 19:54, 6 February 2025 (UTC)
- We want Flamey on the list! Father of minus 2 (talk) 11:06, 7 February 2025 (UTC)
- You do not know how much that means to me. Flame, not lame (Don't talk to me.) 16:05, 9 February 2025 (UTC)
- I had no idea. Flame, not lame (Don't talk to me.) 16:04, 9 February 2025 (UTC)
- @Flame, not lame I've added you to the list. Please note that the only bot that imports LL files is currently blocked and wasn't very active before the block. I think I'll post in the GP soon asking if anyone else can do the job. Ultimateria (talk) 01:19, 18 February 2025 (UTC)
tr table template to fix proto turkic pages
This is what I have so far. https://en.wiktionary.org/wiki/User_talk:Zbutie3.14/trtable#example
The descendant list on here https://en.wiktionary.org/wiki/Wiktionary:About_Proto-Turkic#Descendants needs to be fixed and then I can finish it.
@BurakD53 @Allahverdi Verdizade @Yorınçga573 @Blueskies006 @Ardahan Karabağ @Bartanaqa @Samiollah1357 @Zbutie3.14 @Rttle1 @AmaçsızBirKişi Zbutie3.14 (talk) 02:05, 6 February 2025 (UTC)
Support, though your pings did not work. AmaçsızBirKişi (talk) 10:35, 6 February 2025 (UTC)
- If I used desctree for the loanwords wouldn't it get pretty messy? Like if I used desctree for یوغورت (yogurt) then wouldn't there be way too many? Zbutie3.14 (talk) 14:10, 6 February 2025 (UTC)
- Maybe for such entries you can adopt the Indo-European reconstruction pages' model of just writing "(see there for more)", if that is applicable here. That would also reduce the work needed to keep the pages consistent with each other. AmaçsızBirKişi (talk) 15:40, 6 February 2025 (UTC)
- @BurakD53 @Allahverdi Verdizade @Yorınçga573 @Blueskies006 @Ardahan Karabağ @Bartanaqa @Samiollah1357 @Zbutie3.14 @Rttle1 Zbutie3.14 (talk) 14:06, 6 February 2025 (UTC)
- Languages without descendants should be set to not appear in the table. However, if a language, such as Turkish, exists but is not attested in its ancestor languages, the ancestor languages should still appear in the table as empty fields. The table should be arranged to reflect this structure. BurakD53 (talk) 08:41, 7 February 2025 (UTC)
Support Yorınçga573 (talk) 17:35, 6 February 2025 (UTC)
Support BurakD53 (talk) 08:30, 7 February 2025 (UTC)
- @AmaçsızBirKişi https://en.wiktionary.org/wiki/User:Zbutie3.14/trtable Can you look at the descendant structure I have now? I added Oghuric but I still need the [*xbo-dnb] and [*xbo-vol] language codes. I also need the Orkhon Turkic [otk-ork] and Ajem-Turkic codes. For Siberian Turkic I followed the classification on here: https://en.wikipedia.org/wiki/Siberian_Turkic_languages Zbutie3.14 (talk) 01:04, 13 February 2025 (UTC)
Should Category:Japanese terms spelled with jukujikun and Category:Japanese terms read with jukujikun be merged?
It looks like the former category is given to any entries which use the t:ja-jukujikun template, and the latter to any entries with the yomi= field in t:ja-kanjitab set to juku, but it's unclear to me if these are supposed to be separate categories, and if so, what the intended difference between them is supposed to be. Horse Battery (talk) 02:13, 6 February 2025 (UTC)
- Also pinging @Eirikr (I'm not sure who else to add). Horse Battery (talk) 14:03, 12 February 2025 (UTC)
Nasalization of diphthongs
If I wanted to specify the phonetic realization of a word such as time or no, should both vowels of the diphthong be nasalized? [tãɪ̯̃m], [nõʊ̯̃]
I don't know if this varies between languages, or there's a kinda universal phonetic rule.
Additionally, how about triphthongs in Received Pronunciation?
JMGN (talk) 10:42, 6 February 2025 (UTC)
- IMO nasalization in the pronunciation of English words is TMI and not necessary except for a few words like uh-uh (where it's either quasi-phonemic) or can't (where the nasal that triggered the nasalization is clearly deleted). Benwing2 (talk) 00:19, 10 February 2025 (UTC)
Entries needing images
in order to improve Wiktionary per the think tank policy I've written, I wish to add images to most entries that would warrant them. as my interests focus on subcultures, which often lack good documentation and completely lack free images, I request a template and maintenance category for pages to publicly keep track of them. Juwan (talk) 19:50, 6 February 2025 (UTC)
I think that a category tree of requested images by language is appropriate and second the proposal. Since so many entries are lacking images, it would be a little too unwieldy to list them on a page like Wiktionary:Requested entries, but I could be persuaded that this is the better option than a category. —Justin (koavf)❤T☮C☺M☯ 19:59, 6 February 2025 (UTC)
- {{rfi}} Vininn126 (talk) 20:02, 6 February 2025 (UTC)
- And Category:Requests for images by language. —Justin (koavf)❤T☮C☺M☯ 20:07, 6 February 2025 (UTC)
- I will dig a hole to bury myself in, anyone wanna join me? this was in front of my face. Juwan (talk) 20:11, 6 February 2025 (UTC)
- BTW, {{rfi}}, unlike the most common request templates like {{rfe}}, {{rfd}} and {{rfv}}, doesn't play well with right-hand-side table of contents, with images, or with project boxes like {{wikipedia}}. It leaves a lot of white space at the top of the screen. (See Bovidae for a basic example.) DCDuring (talk) 22:32, 6 February 2025 (UTC)
Sense IDs resulting in dead links
- I apologize for posting this question here. I tried to convert the Belarusian entry каса (kasa) to senseid/senseno via this diff, but it somehow only results in dead links. Additionally, where exactly does the senseid need to be placed in the image description? In the beginning of it? Right after the bolded term? In the end? --Ssvb (talk) 07:39, 8 February 2025 (UTC)
- How important is it to use {{senseid}}/{{senseno}} anyway? It seems to require less efforts to just keep images and senses in sync manually. --Ssvb (talk) 07:56, 8 February 2025 (UTC)
- @Ssvb: it's not mandatory to use those templates, but it does reduce the work required if someone adds or rearranges the senses in an entry. — Sgconlaw (talk) 10:37, 8 February 2025 (UTC)
- @Theknightwho: any idea why {{senseid}} and {{senseno}} are not working properly for the Belarusian entry mentioned above? I have not had any problems with these templates when using them with English entries. — Sgconlaw (talk) 10:36, 8 February 2025 (UTC)
- @Sgconlaw I'll have a look.
- @Ssvb It's a lot more work to do it manually, because entries go out of sync over time, which is really annoying. Theknightwho (talk) 10:51, 8 February 2025 (UTC)
- @Ssvb I've fixed these. Part of the problem here was because you put things like {{lang|be|(etymology 1 {{senseno|be|hair}}) ...}} in the image caption, which isn't right, because the words "(etymology 1 sense X)" aren't Belarusian. The {{lang}} template should only be for the parts of a sentence which are in a given language.
- There does look to be some underlying issue which I'll look into now, as it's not clear why {{senseno}} breaks if it's inside {{lang}}, but I also can't think of any reasons why it would ever need to be in the first place, which I suspect is why this bug hasn't come up before. Theknightwho (talk) 10:59, 8 February 2025 (UTC)
- @Theknightwho: Thanks a lot for your help! As for being in the middle of {{lang}}, the stock car entry that is given as an example, puts it immediately after the term: A '''stock car''' ({{senseno|en|sports}}) in a race. The output of the {{senseno}} template is indeed not localized, and probably rightfully so, in the same manner how the "etymology 1" section header and the description label isn't localized either. This makes putting the "(etymology 1 sense 1)" label before or after the image description reasonable. Such label is also a bit longish and would disrupt the image description text if placed in the middle of it. --Ssvb (talk) 13:15, 8 February 2025 (UTC)
- @Ssvb That's a good point. I've fixed the underlying issue, anyway, which was down to the fact that {{senseno}} exposed an underlying bug in Module:links, as it produces links like [[#Belarusian:_hair|sense 1]], which weren't being handled properly. If the link module sees a #, it knows the target page is everything before the #, but if there's nothing before # then the target page is supposed to be the current page, which it wasn't accounting for. Instead, it thought the target was nothing, and assumed the # must be the start of the intended page name (i.e. the start of an unsupported title), which is why it was creating the link to the page we'd use if the term was #Belarusian:_hair. Theknightwho (talk) 13:36, 8 February 2025 (UTC)
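The fragment-only link rule described in this thread can be illustrated with a small sketch. This is not the actual Module:links code (that module is written in Lua); it is a hypothetical Python illustration, and the function name `split_link_target` and its `current_page` parameter are assumptions made for this example. It only shows the rule that an empty page part before `#` should resolve to the current page, and does not model unsupported titles.

```python
from typing import Optional, Tuple


def split_link_target(target: str, current_page: str) -> Tuple[str, Optional[str]]:
    """Split a wikilink target into (page, fragment).

    Hypothetical helper for illustration only; not part of Module:links.
    """
    page, sep, fragment = target.partition("#")
    if not sep:
        # No "#" at all: the whole target is the page name.
        return target, None
    if page == "":
        # Nothing before "#": the link points at a section of the current page.
        page = current_page
    return page, fragment


# A link such as [[#Belarusian:_hair|sense 1]] placed on the page "каса":
print(split_link_target("#Belarusian:_hair", "каса"))
# An ordinary link to a section of another page:
print(split_link_target("stock car#English", "каса"))
```

In this sketch the bug described above would correspond to skipping the `page == ""` branch, so that the empty page part is never replaced by the current page.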
Move all reference and quotation templates with appropriate language codes
several of the reference and quotation templates (those that start with R: and RQ:, respectively), mostly those for English, lack the appropriate language code in the template title. this is not great for searching and confusing for when you want to add several of them and they inconsistently implement it. templates that refer to multiple languages should likely have redirects with different codes that all point to one. Juwan (talk) 20:31, 8 February 2025 (UTC)
- I support this and it's not only English templates that lack the language code. I think there is already consensus for this (there was a previous BP discussion a year or two ago on this topic) as long as the reference or quotation templates refer to only one language; if a work covers several languages, it's less obvious what to do and may make sense to leave off the language code. In any case, at various times I've renamed reference templates appropriately. Benwing2 (talk) 00:13, 10 February 2025 (UTC)
- Oh I see you mentioned the case of multiple languages. IMO this should be a separate discussion as the consensus isn't so obvious there. Benwing2 (talk) 00:15, 10 February 2025 (UTC)
- @Benwing2 if you can generate a list of the templates that need work, it would be nice to then see what editing communities to reach out for comment. Juwan (talk) 21:17, 10 February 2025 (UTC)
- @JnpoJuwan I can try but the list might be huge ... Benwing2 (talk) 21:37, 10 February 2025 (UTC)
Living languages hapax
Does it make sense to use the "hapax" label for living languages? I am asking this because I added the label to two entries in Nheengatu that I created, misa-pituna and gandú (both were only recorded in a 19th-century vocabulary published by Gonçalves Dias). However, since Nheengatu is a living language, it is hard to say for sure if any speaker, wanting to revive a so-called purer version of their language, might end up using these (archaic, perhaps obsolete) words, thus making them no longer hapax. Nheengatu is spoken by a few thousand people, so its presence on the internet is not very large. Therefore, if this were to happen, I would not know... Pinging Trooper57, who also contributes to Nheengatu entries. Opinions from other users, who are not familiar with the language, are of course welcome too; otherwise, I would not have started this topic. RodRabelo7 (talk) 04:45, 9 February 2025 (UTC)
- Yes, but not for this one. It is more of a thing if the corpus is somehow exposed, by a sizable literary tradition and subsequent digitization. You still don’t usually know whether something is a hapax however, the concept is somehow transferred from classical philology where dictionary authors historically read all texts.
- Since I engaged in Arabic in the same way I acquired Latin and other philologists also imitated classical studies when foraging Arabic, by searching academic treatments I could make some guesses referring more to known than actually retained occurrence of a word, but you see even for this supposedly “well-documented” language we “cut corners”. Nor is it possible for Syriac which has editions running, unlike Latin which after the nineteenth century has few editions enriching its wordhoard with statistical significance (i.e. not that it would even increase our number of recognized hapax legomena, actually the obliteration of ghost words is more likely).
- Only where an area’s documents have been completely studied we can actually claim, and apply the label of hapax without analogy to allow for the situation, e.g. Andalusi شُلُنْبَر (šulunbar), because Romancists appear to have complete dictionaries of Andalusi Arabic ({{R:xaa:ELA|II}} the latest, not widely different from {{R:xaa:Corriente}}, {{R:xaa:Corriente-Additions}} but there is some insanity in it), the rest of Category:Arabic hapax legomena is al-Qurʔān, the oldest prose source and excessively discussed in every culture employing the language, so there is cultural knowledge whether a word has not been recorded elsewhere; actually I did not implement any of the guesses except بِرْجار (birjār), though I used to believe that كُذِينَق (kuḏīnaq) and عَمْرُوسَة (ʕamrūsa) are unique as well: maybe it is just our collective attention.
- What about the cases where a use is hapax but it is joined by a few mentions? I remember to have had this, especially for terms beginning with ك (k) or ل (l), the only letters covered by {{R:ar:WKAS}}. This is a distinction introduced by Wiktionary. Technically the Latinists and Grecists didn’t care, mentions in wordlists of antiquity always counted and also counted on Wiktionary but due to CFI we have a core belief in their bastardhood; for Classical Arabic which is not declared a well-attested language I preferred one use to be certain of a reading and reach the threshold of one rather than 0.75, as the words collected by old wordlist compilers have various reasons for mutilation, in both forms and meanings given.
- Then there are those terms only introduced by conjecture into our texts, it broke the system at lausa vs. *lausa, attested in multiple states of existence, a schizolegomenon … Fay Freak (talk) 06:09, 9 February 2025 (UTC)
- It doesn't make much sense now that I think about it. We have this with Arabic, but their hapaxes are a thousand years old, while the oldest Nheengatu recording is less than 300. Also, is it still truly a hapax when it's mentioned in Avila's work? He never really claimed it was a historical dictionary, and some of the archaic terms seem to be attested in Amazonas. Trooper57 (talk) 16:13, 9 February 2025 (UTC)
Voting majorities and supermajorities
Sorry to raise this issue again. It was a little while ago, and a fresh look may be beneficial.
I have just created Wiktionary:Votes/2025-02/Deletion of "Tennis player test", where we vote "support" to delete the test, and "oppose" to keep the test.
Suppose that 7 people want to delete the test, and therefore vote "support", while 5 people want to keep the test, and therefore vote "oppose". According to Wiktionary:Votes/2019-03/Defining a supermajority for passing votes, this results in "no consensus", i.e. no consensus to change, i.e. we keep the test.
On the other hand, suppose I had worded the vote so that people voted "support" to keep the test, and "oppose" to remove it. This time 5 people vote support and 7 people vote oppose, so the vote fails, and -- what?
What should happen?
Or should votes worded so that "support" supports the status quo be disallowed? Mihia (talk) 21:37, 9 February 2025 (UTC)
- @Mihia: Yes, this is why votes to "affirm" the status quo don't really work and imho should be avoided. AG202 (talk) 21:43, 9 February 2025 (UTC)
- I would say that votes that have no consensus retain the status quo. Although the best way to approach this is to make it a rule that "support" means "change status quo" and "oppose" means "don't change". CitationsFreak (talk) 23:07, 9 February 2025 (UTC)
- In reply to your first sentence, "no consensus" is presently defined as at least 50% but less than 2/3 in support. There is presently no concept of "no consensus to fail", and no apparent reason why a "fail" majority of any margin should not result in the "fail" mandate being carried out. Mihia (talk) 23:54, 9 February 2025 (UTC)
- @Mihia: Seeing the corresponding BP discussion, I think that this particular vote might be easily getting 2/3+ support for removing the tennis player test, so it wouldn't matter much either way here.
- I initially considered suggesting that this could be kept as a bare-majority-passing vote since there was no formal vote when this test was added to the page. See also Wiktionary:Votes/2022-01/Label for lower register as another example of a vote that was designed to pass by simple majority instead of the standard 2/3. But the argument against doing this would be that controversial tests are allowed to stay on that page even if their application is not universal, e.g. Wiktionary:Votes/pl-2018-12/Lemming principle into CFI and WT:LEMMING paragraph.
- So IMO it makes sense for this to be a regular vote that would need 2/3 or more supports to pass. In any case, it would always be possible to continue this discussion after the voting has completed. Svārtava (tɕ) 12:04, 13 February 2025 (UTC)
- Although I raised this (again) using the "Tennis player test" vote as an example, it is not an issue specifically about that vote. It is a general point that the "supermajority" rule, as presently worded, is fatally illogical UNLESS there is also a stipulation that votes must be worded so that "support" is to change the status quo and not preserve it. Mihia (talk) 14:54, 13 February 2025 (UTC)
- @Mihia: Even if not written down explicitly, it is true that in practice we don't have any confirmation votes (where "support" would be a vote to preserve the status quo), so yes, a "support" vote would indeed be to change the status quo in votes. Svārtava (tɕ) 15:16, 13 February 2025 (UTC)
- It would be as well to make it explicit. I'm pretty sure that last time I looked at this I found that we DID have such a vote, and, guess what, nobody noticed. If I'd worded my vote so that "support" supported the "tennis player" principle, would anyone have noticed any problem? I wouldn't bank on it. Mihia (talk) 15:23, 13 February 2025 (UTC)
- @Mihia: I'm pretty sure that it would be easily noticed. The only way this could be possible is if you first removed it successfully (i.e. without anyone objecting or reverting for doing it without a vote) without a vote and then started a vote on whether the test should be on the page or not such that the status quo is not having it.
- Also, is the vote you mentioned one of your own votes like Wiktionary:Votes/2021-08/Scope of English prepositions? Svārtava (tɕ) 15:31, 13 February 2025 (UTC)
- Unfortunately I don't remember now which vote(s) I noticed last time as being "wrong polarity". I would need to trawl through all the posts again. Anyway, it will not hurt to make it explicit. What is the downside? I think I may try again to get some wording agreed. Mihia (talk) 15:37, 13 February 2025 (UTC)
- @Mihia: Sure, there is no downside in making it explicit. I remember seeing Wiktionary:Votes/2021-03/Polarity of voting proposals and application of supermajority rule but that failed mainly due to the attempt to codify cases with an unclear status quo, which could be left out for now in favour of prioritizing making the part “Voting proposals must be worded so that a "support" vote is a vote to change the status quo, while an "oppose" vote is a vote to leave things unchanged.” explicit. Svārtava (tɕ) 15:45, 13 February 2025 (UTC)
- In Wiktionary:Votes/2019-03/Defining a supermajority for passing votes, “supports and opposes” implies “supports for the proposed change and opposes to the proposed change”, even if you word your proposal as supports maintaining the status quo, which would only result in supports under such a proposal being opposes under the rule and opposes being supports under the rule. It does not explicitly disallow such motions, since there is no way to game this. Fay Freak (talk) 12:52, 13 February 2025 (UTC)
- There is no mention whatsoever in the wording to indicate that "supports" and "opposes" would be reversed in the way that you describe. Mihia (talk) 14:58, 13 February 2025 (UTC)
- Yeah, but a bidding can always be formulated as a forbidding, an action as an omission of an omission, and there is always context that is not inside the formulation. This is how language works, that there are things indicated by other means than “mention”; if you think a lot about it, I will think a lot about it. And if you make an effort making up cases it will also take us effort to apply the rules within their very scopes of application. The language is but typical. Fay Freak (talk) 17:52, 13 February 2025 (UTC)
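The arithmetic behind the 7-versus-5 example in this thread can be sketched as follows. This is only an illustration of the thresholds as quoted in the discussion (2/3 or more in support passes; at least 50% but less than 2/3 is "no consensus"). The function name `classify_vote` is hypothetical, and counting only supports and opposes (ignoring abstentions) is an assumption.

```python
def classify_vote(supports: int, opposes: int) -> str:
    """Classify a vote by its share of supports among supports + opposes.

    Hypothetical helper for illustration; thresholds as quoted in the thread.
    """
    if supports < 0 or opposes < 0 or supports + opposes == 0:
        raise ValueError("need non-negative counts and at least one vote")
    total = supports + opposes
    # Integer comparisons avoid floating-point rounding at exactly 2/3.
    if 3 * supports >= 2 * total:
        return "pass"
    if 2 * supports >= total:
        return "no consensus"
    return "fail"


# "Support" = delete the test: 7 want deletion, 5 want to keep it.
print(classify_vote(7, 5))
# Same opinions, but "support" = keep the test.
print(classify_vote(5, 7))
```

The same twelve opinions land in different categories depending only on which side is labelled "support", which is the polarity problem raised above; what each category should mean for the status quo is left open here, as it is in the discussion.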
These are used in Swedish dictionaries. Would it be okay to create articles with a Swedish language heading for them? kwami (talk) 00:06, 10 February 2025 (UTC)
- To clarify, the question is whether they should be created as Translingual or Swedish entries. In general there doesn't seem to be consensus on how to handle single-character symbols and such. Benwing2 (talk) 00:10, 10 February 2025 (UTC)
- Yes. In this case this is, AFAICT, a specifically Swedish convention, so IMO it would be odd to claim, without supporting evidence, that it's translingual. But there have been objections to creating language-specific entries for Unicode characters. kwami (talk) 01:42, 10 February 2025 (UTC)
- For Swedish, the entry will need the possibility of being given three independent, durably archived quotes indicating usage. For translingual, the fact that it is described in one work is enough.
- So I guess the question to Kwami here is, are you prepared to add three quotes to the Swedish entry, correctly formatted, or are you not. In the latter case, it would be best to create a translingual entry instead.
- I would also like to make sure whether it is, indeed, only used in Swedish, or also Elfdalian, early Finnish, Danish, and Norwegian, as these languages could be influenced by the Swedish orthographical practices. If it is used there, as well, I would definitely not create a Swedish entry, just a Translingual one. Thadh (talk) 10:13, 10 February 2025 (UTC)
- So translingual use is assumed to be true unless proven otherwise?
- I'm only aware of use in Swedish. These symbols are a dictionary convention, not an orthographic practice in the usual sense. It's possible they're only used in one line of Swedish dictionaries, Norstedts. If that's the case, there cannot be 3 independent quotations. We're then in the bizarre situation that a demonstrable use in Swedish cannot be added to Wiktionary, but that it is acceptable to falsely claim translingual use. kwami (talk) 18:32, 10 February 2025 (UTC)
- I'll go ahead and create a translingual entry. It can always be converted to Swedish if ppl think that's justified. kwami (talk) 18:52, 10 February 2025 (UTC)
- Done. I don't know how to suppress the request for translation, which isn't appropriate in this case. kwami (talk) 19:19, 10 February 2025 (UTC)
- Well, characters are characters, they don't by themselves necessarily belong to one language or another. I can write a nonsensical string "hekegob|{•×fje|[™ x_:&" and this will not be any language, but every single character there is now attested once (assuming I've printed this in a book/durably archived media). Translingual is the L2 we use for this. Thadh (talk) 10:10, 11 February 2025 (UTC)
- But you could do the same for any word as well. I can jumble together every word from every language on Wiktionary. By your argument, that makes every word translingual.
- The common-sense meaning of 'translingual' is that it's used across languages, not just that we imagine that it could be. What happened to the idea that entries on Wikt need to be attested in the senses and languages that we claim for them? kwami (talk) 10:24, 11 February 2025 (UTC)
- @Kwamikagami: No, because words have meanings. You cannot have a word in a language without a meaning. You can however have a symbol without a meaning. This is why symbols are not language-specific when meaningless, but are when meaningful. Basically, a string of letters in Translingual would be SOP, a Sum of Parts. Thadh (talk) 11:36, 12 February 2025 (UTC)
- But if the symbol is meaningless, we wouldn't provide it with a definition. What you're saying is not only that the sense doesn't need to be verified, but that there doesn't need to be a sense at all. How is that appropriate for a dictionary? kwami (talk) 18:24, 12 February 2025 (UTC)
- Oof, a thorny question; I'm unsure of the best approach/answer, and (I think) can understand the arguments for both positions. FWIW, my gut reaction is that if we can only find something occurring in Swedish, we enter it as Swedish, and then if people later find it also attested in e.g. Finnish, the entry can be changed (to make it Translingual, or to add Finnish, etc).
Regarding hekegob|{•×fje|[™ x_:&, if three authors were to use hekegob|{•×fje|[™ x_:& in (let's say) Swedish texts, AFAICT we would consider it to be ==Swedish== and its gibberishness would belong on the (non-gloss) definition line, in the same way くぁwせdrftgyふじこlp is Japanese (or asdfghjkl is English). If the authors used the symbol in texts which were entirely devoid of L2-having-language or meaning, I don't think the symbol would be included at all. Can anyone think of counterexamples, where a text belonging to no language(s) and consisting only of meaningless characters is the sole basis for an entry? And if the authors used the symbol in texts which were meaningful language, and the language simply couldn't be identified, it might still get included but AFAICT it'd be as Undetermined, not Translingual... and that doesn't seem relevant to this situation, where the language of the texts using this symbol is identifiable as Swedish. - -sche (discuss) 14:10, 12 February 2025 (UTC)
- @-sche: My example wasn't for three quotations of hekegob|{•×fje|[™ x_:&, but rather three instances of any of these characters in a similar manner. For instance, "hekegob|{•×fje|[™ x_:&", "lrngkaowkm38($?#?{`°¥=®" and "wopalf|§{¥©{[¢]?" being three independent quotes for the character { existing. Now, regardless of whether we know what this character is used for, whether it has meaning in this case or could have in some other case, as far as I know, we include these characters simply because readers that find this character in some text or another, would want to know what it is, not necessarily what it means. If { is then used in Swedish in a specific meaning not found elsewhere and attested three times, then indeed it should also be included as a Swedish entry. However, the inherent characteristic of { being a character is imo not a feature of Swedish. Thadh (talk) 17:21, 12 February 2025 (UTC)
- What you seem to be advocating is that Unicode characters are inherently notable as Unicode characters. Indeed, it's easy enough to find probably any character in a string of gibberish where one of the non-Unicode conventions for Chinese is spuriously converted to Unicode, for example on Gbooks. But by that argument, we should have an article for every Unicode character, including every emoji, as a couple other wiktionaries do but which by consensus we do not.
- Instead, Wk-en has the very nice feature that if you click on a red link for a character, you'll see the info box for that character, giving its Unicode definition. I use that feature all the time, but in the past when people created articles for characters, where the definition was nothing more than the Unicode name, those pages were deleted. kwami (talk) 18:34, 12 February 2025 (UTC)
English or Translingual for constellations?
I notice that And, Cass and probably others are defined weirdly:
# {{lb|mul|astronomy}} {{abbreviation of|en|Cassiopeia|nodot=1}} {{n-g|or its genitive form {{m|mul|Cassiopeiae}}.}}
This is the doing of User:Moverton. @Chuck Entz corrected the corresponding definition of And to use mul in the first abbreviation and Moverton undid this later. Indeed, the genitive forms Cassiopeiae and Andromedae are Translingual but the nominatives are not. This seems strange, but I dunno how Translingual is supposed to work. Either: (a) we need to create Translingual entries for the nominatives or (b) we need to split the abbreviations into English and Translingual entries (or (c) remove the nominative as a possible abbreviation). Also pinging @DCDuring, @Thadh who may have a better understanding of what counts as "Translingual" than I do. Benwing2 (talk) 21:57, 10 February 2025 (UTC)
- I don't think we have anything but the general idea that, for a term (or letter, symbol, etc) to be Translingual it should be used (attestably?) in multiple languages, presumably with the same meaning (and pronunciation?). That fits with proper names that are mostly used in writing (pronunciation being secondary), especially when regulated by some multinational body, like the taxonomy, astronomy, and chemical bodies. We include the CJKV characters and taxonomic names systematically and, I suppose, many symbols. I know that at least some of the names of astronomical entities are treated as translingual. I don't know why chemical names other than abbreviations are not so treated, given the role of IUPAC. DCDuring (talk) 03:59, 11 February 2025 (UTC)
- Chemical names commonly have slightly different spellings in different languages, I think, don't they? (E.g. varying in the presence or absence of terminal -e, or of inflectional suffixes. It seems English "Barrelene" = German "Barrelen".) The IUPAC's 1993 Introduction makes some mention of language-specific aspects of spelling: "In this guide, efforts have been made to systematize the style (spelling, position of locants, typography, punctuation, italicization, etc.) of the names of organic compounds according to the IUPAC English style. As usual, IUPAC recognizes the needs of other languages to introduce their own modifications". In contrast, forms that originate as Latin genitive forms are probably less likely to be modified for language-specific spelling or inflection conventions, which could be a reason to have Translingual entries for the genitive forms of constellation names but not for their nominative versions (if the latter are not in fact widely used unchanged across languages). I see that the German Wikipedia article on Cassiopeia uses the spelling "Kassiopeia" to refer to the constellation directly but uses "β Cassiopeiae" (rather than e.g. "β Kassiopeiä").--Urszag (talk) 09:43, 11 February 2025 (UTC)
- Names of constellations are definitely not translingual, as various languages have traditional names for these (and languages with different scripts have their own form). Abbreviations however, are a different thing, as they may be used in scientific research worldwide regardless of the main body's text. I'm not an astronomer though, so I wouldn't know if in this case that is true. I would change the definition to simply "Abbreviation of the constellation Cassiopeia" without the template (which should be used for within-language abbreviations) or the mentioning of the genitive (as the abbreviation can probably be used for absolutely any case form). Translingual doesn't have grammar. Thadh (talk) 10:06, 11 February 2025 (UTC)
- We wouldn't be talking about "vernacular" names of astronomical entities, just whatever standardized names astronomers use, just as kangaroo isn't a taxonomic name. There are many abbreviations (of, eg, asteroids) that seem to be used a lot in different languages, even outside technical literature, and seem obviously translingual. DCDuring (talk) 14:58, 11 February 2025 (UTC)
- @DCDuring: "Standardised" astronomical entries afaik are still country- and language-specific, unlike taxonomic names. On the abbreviations I've elaborated above. Thadh (talk) 11:33, 12 February 2025 (UTC)
- See my ignored comment below. DCDuring (talk) 14:32, 12 February 2025 (UTC)
- The entries seem to suggest that abbreviations such as And, Cass are used specifically or particularly in star names constructed according to the formula "Greek letter + Latin genitive form", rather than being used willy-nilly as a replacement for the name of the constellation Cassiopeia in any context. I don't know if that's true, but if so, it would mean that this type of abbreviation is not necessarily "used for absolutely any case form". For example, the abbreviations appear in this table under the column "Const.", but the immediately preceding column, labeled "ID", provides the Greek letter designation (e.g. ξ And), so it might make sense to interpret the abbreviations in this context as being implicitly short for Andromedae, Cassiopeiae, etc.--Urszag (talk) 16:00, 11 February 2025 (UTC)
- Yeah I would prefer this approach to having a vague untemplated "Abbreviation of Foo" definition. In the case where an abbreviation is translingual but is based on a specific language, it should probably use {{abbrev}} in the Etymology section and rely on the explicit language code support that I'm about to add (if it's not already there), so Translingual IOP could use {{abbrev|mul|en:[[independent]] [[Olympic]] [[participant]]s}} or similar, and have a definition that says "a neutral designation for athletes competing in the Olympic games not under a specific country's flag" or similar. Benwing2 (talk) 00:08, 12 February 2025 (UTC)
- These 3-letter abbreviations do seem to be officially defined by the IAU, as mentioned by some articles related to the one DCDuring posted below; maybe a template specifically for these could be created that would automatically generate wording like what I have now put at And. Adjusting the "abbreviation" template to enable examples like the one you mention also seems like a good idea in general.--Urszag (talk) 15:26, 12 February 2025 (UTC)
- Per WP, in May 2016 the International Astronomical Union ("IAU") established the IAU Working Group on Star Names ("WGSN"). As of June 2018 they had approved 330 names, often enshrining traditional or historical names. Per WP there are astronomical naming conventions for stars, constellations, galaxies, comets, novas, pulsars, black holes and geological or geographical features of some of these. I would think that, if attestable, these should be Translingual by default, but subject to challenge as to their use in multiple languages. I would think that Latinate forms, such as Cassiopeiae, could be treated as Latin inflected forms and as Translingual 'adjective' lemmas, as specific epithets are now, albeit very incompletely and unsystematically. DCDuring (talk) 15:28, 11 February 2025 (UTC)
- The current state of IAU naming policy can be found here. An 8/14/2022 list of 24,254 IAU names can be found here. For star names there is a downloadable list that includes "proper name", "designation", "constellation", etymological information, "reference", one or two star catalog designations, etc. This seems ripe for a template for the ~500 star names. Analogous templates might be worthwhile for several other classes of astronomical entities. DCDuring (talk) 15:59, 12 February 2025 (UTC)
- For the original question, the three-letter IAU abbreviations are international. In German, for example, Cassiopeia is Kassiopeia, but the abbreviation is still Cas with a 'C'.
- The international forms of the full names are Latin. I don't know if those should be listed as Latin or as translingual, but 'Cas' is definitely translingual.
- Note that there are also language-specific abbreviations, such as Cass in English. kwami (talk) 18:44, 12 February 2025 (UTC)
- What kwami says here is what I understood to be true. But I'm interested in what others think. The whole Translingual thing always feels a bit awkward since it isn't used by other dictionaries. Mike (talk) 05:41, 17 February 2025 (UTC)
New records
According to stats.wikimedia.org, last month was our biggest month ever with...
- 224,806,388 page views (previous record: 221,640,467 in September 2024)
- 227,144 user edits (previous record: 215,498 in August 2012, but that was probably bot activity)
- 971 active editors (previous record: 957 in December 2024)
Looks like we're doing something right! Ioaxxere (talk) 03:02, 12 February 2025 (UTC)
- 🍻 —Quercus solaris (talk) 04:57, 12 February 2025 (UTC)
- Yes, just to note that the 224.8 m figure includes everything, while human page views (as far as can be detected ... I don't know how reliable this can be) were 91.5 m. Mihia (talk) 15:46, 13 February 2025 (UTC)
- Any stats on number of added bits? Vininn126 (talk) 16:29, 13 February 2025 (UTC)
- WT:STATS includes the recent month. Fay Freak (talk) 17:54, 13 February 2025 (UTC)
It coincides with Wonderfool being unemployed and single. In fact: of the 227,144 user edits, 69,069 were WF, of the 971 active editors 343 are WF. So it's not that impressive. Father of minus 2 (talk) 21:27, 16 February 2025 (UTC)
- Okay dude. Vininn126 (talk) 21:30, 16 February 2025 (UTC)
Derived terms from a different part of speech we do not have
We had Walmart only as a verb, with firebomb a Walmart under “Related terms”, but @WordyAndNerdy changed it to “Derived terms” (before adding the proper noun) with the edit summary, “WT:ELE: "List terms in the same language that are morphological derivatives. For example, the noun driver is derived, by addition of the suffix -er, from the verb to drive." All these terms derive from the name of the store. That we don't have a noun sense for the store is a byproduct of WT:CFI/WT:BRAND. It doesn't change the derivation of these terms.” How should these situations be handled? This issue also came up with Dothraki, where Dothrakian is not derived from the proper noun (the language). Perhaps the “Derived terms” section should be in the entry as a level-3 heading instead of a subsection of the wrong part of speech? J3133 (talk) 10:05, 12 February 2025 (UTC)
As another example, @LunaEatsTuna mentioned in the RfD of FedEx that, were the proper noun deleted (leaving the verb), the “Derived terms” section (with FedExer and FedEx quest), would be changed to “Related terms”. J3133 (talk) 10:19, 12 February 2025 (UTC)
- This seems like a solution in search of a problem. Mallwart and firebomb a Walmart derive from the company name Walmart (i.e., the retail store chain) regardless of whether we have a dedicated entry or proper-noun definition for Walmart. Gaps in Wiktionary's coverage – whether rooted in policy or oversight – shouldn't shape how we document the relationships of words. Any company/brand/fictional concept that is linguistically productive enough to have multiple derived terms (such as Facebook, Twitter, etc.) may warrant its own entry/definition. Anyway, Dothrakian is also used as a non-standard synonym of the conlang, so there's now a one-for-one relationship between it and Dothraki. WordyAndNerdy (talk) 01:52, 13 February 2025 (UTC)
- "may warrant its own entry/definition" is the key thing here: should we implement a policy that would enable disallowed terms, such as corporations, fictional locations, political parties etc. to be allowed to have entries if they have a certain number of derived terms (perhaps two or three)? It might appear odd for us to have entries for websites like Mumsnet (which has three derived/related terms on here) but not more popular ones like Bilibili, Canva or xHamster. Would an “etymology hub” (allowing these entries similarly to THUBs) be a terrible idea? It would enable the more convenient categorisation of terms derived from the same source while simultaneously letting editors know that said entries should not be RfD'ed (and, it would not enable a massive flood of entries for websites and corporations either). Pinging @This, that and the other who had proposed a similar idea two months ago. LunaEatsTuna (talk) 05:14, 13 February 2025 (UTC)
- @LunaEatsTuna: The question is whether they should currently be under “Related terms” or “Derived terms”. From your mention in the RfD, I assume that you support the former. J3133 (talk) 06:02, 13 February 2025 (UTC)
- Yes—that is the logical option IMO if there is no proper noun sense listed. Otherwise including a “Derived terms” header under the incorrect word (like a verb) could be misleading to readers. LunaEatsTuna (talk) 06:04, 13 February 2025 (UTC)
- I also wrote that it is misleading, but WordyAndNerdy claimed that it is irrelevant because it “meets definition of a "derived term" laid out in WT:ELE” and “the word morphology is the same”. (I included one edit summary above, but you can see the history of the Dothraki entry). J3133 (talk) 06:18, 13 February 2025 (UTC)
- WT:ELE provides the only policy guidance on the distinction between "derived terms" and "related terms" of which I'm aware. The way it reads is that a "derived term" is one that directly derives from another. For example, Sherlockian derives from Sherlock through the addition of the -ian suffix, Tescoization through the addition of -ization, etc. Whereas a "related term" is one that shares a common/parallel etymology with another word but isn't directly derived from it. Examples would be broligarch and broligarchy. Broligarchy didn't derive from broligarch. Both words were formed by blending bro with oligarch/oligarchy. Whereas the hypothetical broligarchical would be a "derived term" of broligarchy since it would be formed by combining the latter with the -ical suffix. In the absence of other clear, codified policy guidance, this is the framework we should use. WordyAndNerdy (talk) 06:43, 13 February 2025 (UTC)
- I am aware of our policy, but we sort derived terms by the part of speech they derive from. It is misleading to state that firebomb a Walmart is derived from the verb Walmart. J3133 (talk) 06:47, 13 February 2025 (UTC)
- I'd hazard that most people looking at the previous version of the Walmart entry would have intuitively concluded that firebomb a Walmart derives from the name of the store. They wouldn't conclude it derived specifically from the verb sense of Walmart because most readers don't know – much less care – about inside-baseball considerations like header levels. This is a solution in search of a problem in the truest sense. WordyAndNerdy (talk) 07:10, 13 February 2025 (UTC)
- The solution is not to claim that firebomb a Walmart is derived from “To shop at Walmart” or “To outcompete, […]”. We used this solution but you decided that this is incorrect. J3133 (talk) 07:27, 13 February 2025 (UTC)
- Well, that might not be true in every case; the policy itself is still misleading in my view, and I do not entirely see why it must remain so. I would presume that “Derived terms” are terms actually derived from the entry or sense the subheading appears on. Changing it to “Related terms” is more correct as it removes any potential ambiguity or misinformation. LunaEatsTuna (talk) 07:32, 13 February 2025 (UTC)
- The problem is that the distinction you're making here is an entirely circumstantial one. It's based on the (former) absence of a proper-noun sense at Walmart rather than the inherent properties of firebomb a Walmart. The derivation of a term/phrase doesn't change simply because Wiktionary doesn't have an entry for its source. Firebomb a Walmart is a derived term as outlined by WT:ELE because it combines the store name with a verb in a fashion similar to prefixing, suffixing, or blending. However, I'm not opposed to J3133's suggestion of resolving such edge cases by having a level-3 "derived terms" section floating unattached to any POS sub-heading. Seems more constructive than applying "related terms" in a way that doesn't align with WT:ELE. WordyAndNerdy (talk) 08:06, 13 February 2025 (UTC)
- I don't see a need to remake the wheel as far as policy goes. WT:BRAND, WT:COMPANY, and WT:FICTION allow for the inclusion of otherwise disallowed terms as long as narrow (usually idiomatic) use is documented. We have had entries for Facebook, McDonald's, Darth Vader without issue for over a decade. I do like Luna's "etymology hub" idea though. Would allow for the inclusion of less-obvious productive proper nouns like Mumsnet. (The proliferation of Mumsnet-related terms is explained by the site's context in UK politics.) I agree that two or three derived terms would be a good threshold for such "hub" entries. WordyAndNerdy (talk) 07:33, 13 February 2025 (UTC)
- @LunaEatsTuna @WordyAndNerdy I called it "derived terms hub", because the intent is to provide a central place for all those derivations to be listed together and showcased. The "etymology hub" aspect (whereby we avoid having to repeat the proper noun's etymology in umpteen places) is also important, but, to me, secondary.
- Honestly WT:COMPANY could do with being rewritten from scratch while we're at it. It currently says that company names can only be included if they're not company names, which might have been useful advice for Wiktionarians of 20 years ago, but is tautological by today's standards. This, that and the other (talk) 08:07, 13 February 2025 (UTC)
- I made a similar argument in the RfD nomination of Minecraft a while back. WordyAndNerdy (talk) 08:12, 13 February 2025 (UTC)
- @WordyAndNerdy ah, I knew I had seen that somewhere - it included the nice turn of phrase "linguistically productive".
- For your or anyone else's interest, I drafted some text at User:This, that and the other/WT:COMPANY. One needs to be acutely aware of the failed 2022 vote on this topic, which I opposed as too prescriptive. This, that and the other (talk) 08:50, 13 February 2025 (UTC)
- Pinging @AG202, Mihia, Polomo47, Svartava for This, that and the other's proposal above since I know they might be interested. LunaEatsTuna (talk) 09:05, 13 February 2025 (UTC)
- I strongly agree with what WordyAndNerdy said on the RFD for Minecraft. I sort of lean towards including single-word brand/company names if they even satisfy a requirement of just having one inclusion-worthy derived term or sense (such as a verb or common noun having the same spelling). Svārtava (tɕ) 09:27, 13 February 2025 (UTC)
- I would support WordyAndNerdy's suggestion. I find WT:COMPANY rather confusing as is. AG202 (talk) 14:55, 13 February 2025 (UTC)
Babel rework
@-sche, benwing2 I've reworked the Babel template to a module, MOD:User:Saph/Babel, which allows default messages and has a parameter for disabling categorisation; it's also completely (I think) back-compatible with the current template. All that's missing now is for translations of the default message to be added, but I'm hesitant to do that before it's out of my user space so that there aren't a ton of data subpages that need to be moved. - saph ^_^⠀talk⠀ 20:28, 12 February 2025 (UTC)
- You can find examples in my sandbox. - saph ^_^⠀talk⠀ 20:30, 12 February 2025 (UTC)
- Just a heads-up that your template seems not to print the "This user cannot read or write any languages. Assistance is required." message correctly when no parameters are specified. Lunabunn (talk) 23:23, 12 February 2025 (UTC)
- Fixed, thanks. - saph ^_^⠀talk⠀ 02:38, 13 February 2025 (UTC)
- Thank you. As another heads-up — I have already brought this up on Discord, but note that, on top of parsing JSON files from the extension repo, we also need to crawl user templates manually for some languages we have added on-site that aren't covered by the extension. We then have to decide how we will reconcile conflicts between the two sources; benwing has suggested that extension data should be prioritized.
- (Tangentially, I hope you don't mind the signature plagiarism ;-)) 🌙🐇 ⠀talk⠀ ⠀contribs⠀ 10:34, 13 February 2025 (UTC)
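As a rough illustration of the reconciliation rule suggested above (extension data wins on conflict, on-site data fills the gaps), here is a minimal sketch. It is written in Python purely for brevity, whereas the actual module is Lua; the function name, the language code "xx" and the sample messages are all hypothetical and not taken from the extension or from any user template.

```python
def merge_babel_messages(extension_data, onsite_data):
    """Combine two {language code: message} mappings.

    On-site entries fill gaps; extension entries win on conflict.
    Also returns the conflicting codes so they can be reviewed by hand.
    """
    merged = dict(onsite_data)
    merged.update(extension_data)  # extension data takes priority
    conflicts = sorted(
        code
        for code in extension_data
        if code in onsite_data and onsite_data[code] != extension_data[code]
    )
    return merged, conflicts


# Hypothetical sample data, for illustration only.
extension = {
    "de": "Dieser Benutzer spricht Deutsch.",
    "fr": "Cet utilisateur parle français.",
}
onsite = {
    "de": "Dieser Benutzer beherrscht Deutsch.",
    "xx": "Placeholder message for an on-site-only language.",
}
merged, conflicts = merge_babel_messages(extension, onsite)
```

Returning the list of conflicting codes alongside the merged data would let someone check by hand whether the on-site wording was deliberately different before it is discarded.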
- I've moved the module to mainspace, Module:Babel. - saph ^_^⠀talk⠀ 13:51, 13 February 2025 (UTC)
- (Notifying Kc kennylau, Ruakh, Erutuon, Jberkel, Benwing2, RichardW57, Theknightwho): Since the data modules are more than halfway done, I am again asking for input. Are there any criticisms of the module, or is anyone opposed to a switch? - saph ^_^⠀talk⠀ 13:41, 27 February 2025 (UTC)
- No objections from me; I've always thought this was necessary and a good idea. Once everything is done we need to bot-convert existing uses of the Babel parser extension (which should hopefully be easy), disable the use of the Babel parser extension (through an abuse filter?) and then set nocat=1 on anyone whose last contribution is more than 2 years ago. Benwing2 (talk) 06:47, 28 February 2025 (UTC)
- Is there a reason we would disallow use of the extension but still keep it installed on the wiki? - saph ^_^⠀talk⠀ 16:42, 28 February 2025 (UTC)
- If there's an easy way of uninstalling it that's even better, but that might take some significant effort (a Phabricator ticket that may take quite awhile to get carried out, etc.). Benwing2 (talk) 20:25, 28 February 2025 (UTC)
- I should add the final data module is done now and this is ready to deploy. - saph ^_^⠀talk⠀ 01:27, 1 March 2025 (UTC)
- Great! Is it 100% compatible with the old Babel templates? If so let's go ahead and deploy it. If not, what needs to be done to convert the old templates? Benwing2 (talk) 01:37, 1 March 2025 (UTC)
- Yes, it should be. It might be a little difficult to monitor for errors it causes since they'll all be in CAT:Pages with module errors/hidden; do you know how I can check if there are any new additions to the category? - saph ^_^⠀talk⠀ 01:50, 1 March 2025 (UTC)
- WT:Babel will also need to be updated with new information on how to add a new language. - saph ^_^⠀talk⠀ 01:55, 1 March 2025 (UTC)
- Has anyone started updating it yet? — SURJECTION / T / C / L / 22:12, 1 March 2025 (UTC)
- No, I'm currently trying to implement ! and -. If you want to do it that'd be great. - saph ^_^⠀talk⠀ 22:13, 1 March 2025 (UTC)
- Off the top of my head, I don't know of any simple way to check for new additions to a category (although it might well be possible using the new-ish additions to the watchlist mechanism; I think you can filter for category changes). What I'd recommend though is to change the code of Module:Babel so that before it throws an error, it adds the page to some tracking category (using Module:debug/track). That way all errors related to Module:Babel will go into the tracking category and you can easily find them. Benwing2 (talk) 02:01, 1 March 2025 (UTC)
- OK, added tracking. - saph ^_^⠀talk⠀ 02:11, 1 March 2025 (UTC)
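For readers following along, the "track before throwing" idea suggested above can be sketched roughly as follows. This is an illustration in Python rather than the module's actual Lua; `track`, `BabelError`, `fail`, `lookup_language` and the tracking keys are all hypothetical names, and the in-memory list only stands in for whatever Module:debug/track really does.

```python
# Hypothetical stand-in for a tracking mechanism such as Module:debug/track;
# here it only records keys in memory so the idea can be tested.
tracked_keys = []


def track(key):
    """Record a tracking key (assumed behaviour, for illustration only)."""
    tracked_keys.append(key)


class BabelError(Exception):
    """Error raised by this sketch; not a real Module:Babel construct."""


def fail(message, tracking_key="babel/error"):
    """Record a tracking key first, then raise, so failures can be found later."""
    track(tracking_key)
    raise BabelError(message)


def lookup_language(code, known):
    """Return the language name for a code, tracking and raising if unknown."""
    if code not in known:
        fail("invalid language code: %r" % code, "babel/invalid-code")
    return known[code]
```

The point of the design is only the ordering: the tracking call happens before the error is raised, so every failing page ends up findable in one place.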
- @Benwing2 I'm ready to watch for errors, now, if you'll switch the template over. Afterwards, what do we do with the user lang templates? - saph ^_^⠀talk⠀ 19:21, 1 March 2025 (UTC)
- @Saph I have switched the template over. Do you know if there are any users who directly call the user lang templates on their page? If not let's just delete them all. Benwing2 (talk) 21:19, 1 March 2025 (UTC)
- @Saph Just FYI it appears there are some users who do weird things with their {{Babel}} call, which is now triggering errors; see for example User:Hanno the Navigator. I'm not sure how many of them there are; ideally they should all be edited to not do these things. For now either we can just ignore the issue and expect users to fix the problem, or you can maybe change Module:Babel to ignore such cases. Benwing2 (talk) 21:24, 1 March 2025 (UTC)
- How do you mean having Module:Babel ignore it? As in, ignoring template calls? I would agree with the former, though, that this should be fixed by the users; this is not how the template should be used with or without the module. - saph ^_^⠀talk⠀ 21:27, 1 March 2025 (UTC)
- You'd have to check for stuff like 'UNIQ' occurring in the language name; it would be a major hack. Benwing2 (talk) 21:29, 1 March 2025 (UTC)
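To make the 'UNIQ' hack concrete, here is a rough sketch in Python (not the module's Lua). It assumes that an argument produced by unexpanded markup reaches the module as a placeholder string containing "UNIQ"; the helper names and the example placeholder are illustrative assumptions, not taken from Module:Babel.

```python
# Substring the hack would look for in an argument (assumption based on the
# comment above; the real placeholder format is not specified in this thread).
PLACEHOLDER_HINT = "UNIQ"


def looks_like_placeholder(arg):
    """Return True if the argument appears to contain a parser placeholder."""
    return PLACEHOLDER_HINT in arg


def usable_codes(args):
    """Keep only arguments that do not look like parser placeholders."""
    return [a for a in args if not looks_like_placeholder(a)]
```

As noted above this would be a hack: it silently drops arguments instead of reporting them, which is one reason to prefer having users fix their pages.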
- I can do a search later for pages which directly call the templates. - saph ^_^⠀talk⠀ 21:28, 1 March 2025 (UTC)
- Sounds good. Benwing2 (talk) 21:29, 1 March 2025 (UTC)
- @Benwing2 There are currently 701 user pages (see this Petscan query) which directly call them; I'm not sure if the way to handle this is to add an entry point in the module for individual userboxes (and then convert existing uses to use a single template for these) or to convert these to just use {{babel}}. I would tend towards the latter but I don't know if people will get fussed over it. Either way the wikitext userboxes should be deleted. - saph ^_^⠀talk⠀ 20:36, 2 March 2025 (UTC)
- I think we should just convert them and not worry about users complaining; most of the uses are old anyway. BTW the Petscan query seems to have some false positives, e.g. it includes User:Jóhann Heiðar Árnason and User:Lennart.larsen who are just using regular {{Babel}}. Benwing2 (talk) 20:47, 2 March 2025 (UTC)
- BTW there is some precedent here; I see User:-sche did an AWB run in 2013 that touched e.g. User:Lennart.larsen's page, converting zh-1 to cmn-1. Benwing2 (talk) 20:48, 2 March 2025 (UTC)
- OK, I'll convert these with AWB later, then, unless you'd like to do a bot run; I assume a bot run would be faster. - saph ^_^⠀talk⠀ 20:52, 2 March 2025 (UTC)
- Go ahead; the bot run probably won't be any less effort than using AWB as there are different formats that people are using when manually calling the user templates. Benwing2 (talk) 20:53, 2 March 2025 (UTC)
- @Benwing2 Can we also convert and delete these old redirects [1]? Babel-1, 2, etc. I think we should only keep {{babel}} out of all of these. - saph ^_^⠀talk⠀ 21:55, 2 March 2025 (UTC)
- yup i'll take care of them. Benwing2 (talk) 22:05, 2 March 2025 (UTC)
- @Saph OK they're all converted and deleted. Benwing2 (talk) 22:27, 2 March 2025 (UTC)
- Thanks! - saph ^_^⠀talk⠀ 22:28, 2 March 2025 (UTC)
- FYI there's also {{babels}} and {{babels2}}; I dunno what they are but they're ancient and unused; they're gonna get deleted. Benwing2 (talk) 22:29, 2 March 2025 (UTC)
- Yeah, those were used on WT:Babel before I updated it. They're not needed anymore. - saph ^_^⠀talk⠀ 22:31, 2 March 2025 (UTC)
- All of the current invalid language code errors appear to just be bad use of the template. - saph ^_^⠀talk⠀ 21:32, 1 March 2025 (UTC)
- All right, I don't think that is such a big deal. Users shouldn't be supplying such codes anyway. Benwing2 (talk) 21:35, 1 March 2025 (UTC)
- Some of the invalid language codes are calls to userspace templates, like at User:Wikitiki89; I'm not sure if we should support these. - saph ^_^⠀talk⠀ 21:42, 1 March 2025 (UTC)
- Not sure either. BTW I remember this user ... right when I started he was very active, but has since disappeared. Benwing2 (talk) 21:44, 1 March 2025 (UTC)
- No objections from me; I've always thought this was necessary and a good idea. Once everything is done we need to bot-convert existing uses of the Babel parser extension (which should hopefully be easy), disable the use of the Babel parser extension (through an abuse filter?) and then set nocat=1 on anyone whose last contribution is more than 2 years ago. Benwing2 (talk) 06:47, 28 February 2025 (UTC)
Arabic root links from category "terms derived from the Arabic root" broken
https://en.wiktionary.org/wiki/Category:Swahili_terms_derived_from_the_Arabic_root_%D9%84_%D8%AD_%D9%82 says "Swahili terms that originate ultimately from the Arabic root ل ح ق (l ḥ q).". But the link to the root page is broken, although the root page exists (Appendix:Arabic_roots/ل_ح_ق). Same for https://en.wiktionary.org/wiki/Category:Swahili_terms_derived_from_the_Arabic_root_%D8%AD_%D8%B6_%D8%B1 and Appendix:Arabic_roots/ح_ض_ر.
These links used to work before.
CC @Fenakhay tbm (talk) 04:34, 13 February 2025 (UTC)
- Should be fixed (although note that bug reports of this nature should go to the WT:Grease pit rather than the WT:Beer parlour). Benwing2 (talk) 09:30, 13 February 2025 (UTC)
- @Benwing2 thanks, I can confirm it's fixed. Doh, I wanted to report it in Grease pit. I didn't notice I opened the wrong page. Thanks again! tbm (talk) 03:25, 14 February 2025 (UTC)
Extended Mover Request: User:Lunabunn
I would like to request WT:Extended mover rights for easier cleanup of Middle (okm) and Old (oko) Korean entries. We have recently settled a new consensus on lemmatization policy, leaving us with several entries to be moved and many others to be reviewed. For context see WT:Beer parlour/2024/December#Rethinking Middle Korean verb lemmatization, WT:About Middle Korean#Lemmatizations. 🌙🐇 ⠀talk⠀ ⠀contribs⠀ 10:52, 13 February 2025 (UTC)
- All right, I have granted this. It's been 7 days, no one has specifically objected and this user seems responsible based on their prior edits to Korean pages and pronunciation modules and their participation in various discussions online and in Discord. Benwing2 (talk) 09:06, 21 February 2025 (UTC)
- Thank you always! I will pick up past editors' great work and see that Koreanic gets the housekeeping attention it needs. 🌙🐇 ⠀talk⠀ ⠀contribs⠀ 10:03, 21 February 2025 (UTC)
proposed new set or POS category: Category:Postal abbreviations?
English and other languages have lots of postal abbreviations such as AZ for Arizona in the US, and Wilts for Wiltshire in the UK. I think we should have a set (or POS) category for this. I notice we have Category:Geographic abbreviations outside the category tree, with only 3 entries, so someone else had the same idea and was (half-assedly) trying to implement it. I was thinking the postal abbreviations category would contain abbreviations both for toponyms and other types of postal abbreviations (St. = "street", Dr. = "drive"; COD = "cash on delivery"), but maybe it makes more sense to separate out the ones referring to toponyms. If so, we could call the category postal toponym abbreviations or maybe just toponym abbreviations, to incorporate things like Calif. for California; and because at least in the US, postal abbreviations like AZ have expanded their use beyond the mail system; and also because of ISO 3166-2, which establishes standardized abbreviations for first-level political subdivisions of countries with broader application than just postal services. (In the case of the US at least, the abbreviations look like US-AZ for Arizona, i.e. they recycle US postal abbreviations.) (FWIW they also establish codes for some "lower-level divisions" in weird cases of de-facto countries that ISO doesn't consider countries; case in point, Taiwan, which is listed as "Taiwan, Province of China" [grrrr] because the UN seems to see things this way on China's behest, but where Taiwanese counties, independent cities and special municipalities still get codes).
Next question: Should this be a set category like Category:en:Postal abbreviations or Category:en:Toponym abbreviations etc., or a POS category like Category:English postal abbreviations or Category:English toponym abbreviations etc.?
@-sche, @Ioaxxere who have commented on past proposals for new categories and helped separate out the boundary between set and POS categories. Benwing2 (talk) 07:46, 15 February 2025 (UTC)
- @Benwing2: Since we have Category:ISO 3166-1 alpha-2, why don’t you add Category:ISO 3166-2 alpha-2? I realize there may be unofficial codes, but this would be a reason to rename both categories to something more generic. Fay Freak (talk) 08:31, 15 February 2025 (UTC)
- Okay, you think of “other types of postal abbreviations”, which has the potential to become an unorganized wastebasket from what I can see, the country subdivisions however are not language-specific, you print it on package labels sent from one EU country to another. (Just sent some designer drip from DE to IT.) Fay Freak (talk) 08:35, 15 February 2025 (UTC)
- To me, "toponym abbreviations" seems more maintainable (perhaps more useful?) than "postal toponym abbreviations", as it seems difficult to determine what constitutes a "postal" vs "non-postal" abbreviation: if you address a letter to "Willcox City Hall, 101 Sou. Railroad Av., Willcox, Ariz." rather than "...S. Railroad Ave., Willcox, AZ", or address a letter to "Nola City Hall, 1300 Perdido Str., NOLA" rather than "...Perdido St., New Orleans, LA", the postal service will still deliver it, so which of those are "postal" abbreviations? And while some historical postal abbreviations (e.g. official USSR or DDR ones) surely count, it seems likely there are cases where it's unclear whether a country in the past or present uses particular abbreviation(s) officially. (I found things like Chs. in the English Dialect Dictionary and other dictionaries and have no idea whether it's a Royal-Mail-recognised abbreviation or not.) But if there are official lists like "ISO 3166-2 alpha-2" (which is well-defined) that someone wants to categorize, that seems fine, and if someone wants to make a case for why postal and nonpostal toponym abbreviations should be in separate categories, please do!
Abbreviations like "St." and "COD" do not seem restricted to postal use(?) nor do they seem to have anything in particular in common except that some post offices use both, but don't post offices also use e.g. "i.e." or "e.g." in some publications? So I'd want to see more explanation of why "St." and "COD" should be together in one category (besides the overall "abbreviations" category), and how to decide what else should or shouldn't be in that category. It is possible we could assemble enough "postal service terminology" to merit a category (nutting truck comes to mind).
Regarding what type of category it should be, I guess it should be a subcategory (like Category:English case citation abbreviations) of, and thus the same type of category as, Category:English abbreviations...? - -sche (discuss) 19:25, 15 February 2025 (UTC)
- OK this makes sense. Should it be "toponym abbreviations" or "geographic abbreviations"? Some might argue that the latter uses a more familiar term, but I personally prefer "toponym abbreviations" because someone could argue that Mt. or mtn. = mountain is a "geographic abbreviation". Benwing2 (talk) 22:14, 15 February 2025 (UTC)
@AG202 @Chom.kwoy
I am currently working on a complete rewrite of all Koreanic translit and pron modules/templates (see my user page) that I hope to roll out gradually in coming months. This will hopefully bring easy maintenance and consistency by replacing huge data tables and disjointed code with shared, modular, imperative functions.
That aside, @Solarkoid and I have found this opportunity to provide impetus for simplifying {{ko-IPA}}'s parameter interface. This will also be inherited by {{jje-IPA}}.
I propose the following:
- Add
  - |alt=alternative pron; allow alternative pronunciations to be specified without having to retype the headword
- Strong Remove
  - |ui=; in the standard pronunciation, ui becoming i in non-word-initial position is completely regular and does not/should not need to be specified
  - |uie=; this is for one word, -의 (-ui). why?? manually specify
  - |svar=; this is for two words, 멋있다 (meositda) and 맛있다 (masitda). manually specify
- Remove
  - |iot=; this is very seldom used & can be manually specified
  - |nobc=; this is seldom used & can be manually specified
- Modify
  - |com= |tense=; specify the syllable being tensed, e.g. |tense=1 in 사이트 (saiteu), not the previous syllable, e.g. |com=0
- Keep
  - |cap=
  - |l=
  - |bcred=
  - |nn=, |ni=; internally, one can be an alias of the other, but n-insertion and nl > nn are semantically distinct, so both can be kept for intuitiveness
🌙🐇 ⠀talk⠀ ⠀contribs⠀ 10:14, 15 February 2025 (UTC)
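As a small illustration of the proposed |com= to |tense= change: if |com= names the syllable before the tensed one and |tense= names the tensed syllable itself (as in the 사이트 example, where |com=0 corresponds to |tense=1), converting existing values is a shift by one. The sketch below is Python for illustration only; the function name and the assumption that multiple positions are comma-separated are hypothetical and not taken from {{ko-IPA}}.

```python
def com_to_tense(com):
    """Convert a legacy |com= value to the proposed |tense= value.

    Assumes (hypothetically) that the value is one or more comma-separated
    non-negative integers, each naming the syllable *before* the tensed one.
    """
    return ",".join(str(int(part) + 1) for part in com.split(","))
```

A bot run converting existing uses could apply a mapping of this kind to each |com= value before renaming the parameter.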
- Strong Support. AG202 (talk) 15:56, 17 February 2025 (UTC)
CFI edit request
At Wiktionary:Criteria_for_inclusion#Idiomaticity, after the sentence "Idiomaticity rules apply to hyphenated compounds, including hyphenated prefixed words, in the same way as to spaced phrases", could someone with permission please add a reference linking to the vote at Wiktionary:Votes/2019-10/Application of idiomaticity rules to hyphenated compounds. Many votes are linked, but this one seems to have been overlooked. Thanks. Mihia (talk) 12:41, 16 February 2025 (UTC)
- @Mihia: Added. J3133 (talk) 12:45, 16 February 2025 (UTC)
- Great, thanks. Mihia (talk) 12:46, 16 February 2025 (UTC)
Allowing technically SoP entries involving highly polysemic words
There is already a "get-out clause" at WT:SOP ("In rare cases ... etc. etc."), but I have for some time thought that we should make specific allowance for inclusion of phrases involving words with very many meanings, where the phrase almost invariably has one specific meaning that is obtained by choosing the correct sense of each of the components, and where it may be unreasonable to expect readers to be able to readily do this. One recent example that comes to mind is patch file, though I don't want to focus particularly on whether that would or would not qualify, just on opinions about the general idea. Does anyone have any views? Mihia (talk) 12:55, 16 February 2025 (UTC)
- I assume we are just talking about two-part compounds. It could be desirable, where both terms in the compound used non-obvious tertiary senses for highly polysemic (including multi-etymology) terms. Unfortunately, our ability to efficiently come to a conclusion about this kind of thing (either a policy or individual definitions) has proven to be insufficient to prevent low-quality compounds from remaining in Wiktionary for years. I can hope that having well-defined criteria for inclusion would also mean well-defined criteria for exclusion, which would make it easier to remove some of the dreck. Unfortunately we seem to have an inclusionist bias, so that hope is probably unjustified. DCDuring (talk) 23:20, 16 February 2025 (UTC)
harmonizing families and proto-languages, and other proto-language warnings
We have a whole host of warnings (17) issued concerning mismatches between proto-languages and families:
- Proto-Central Togo (alv-gtm-pro) does not have the expected name "Proto-Ghana-Togo Mountain", even though it is the proto-language of the Ghana-Togo Mountain languages (alv-gtm).
- Proto-Arawa (auf-pro) does not have the expected name "Proto-Arauan", even though it is the proto-language of the Arauan languages (auf).
- Proto-Arawak (awd-pro) does not have the expected name "Proto-Arawakan", even though it is the proto-language of the Arawakan languages (awd). [harmonize under Arawak]
- Proto-Ta-Arawak (awd-taa-pro) does not have the expected name "Proto-Ta-Arawakan", even though it is the proto-language of the Ta-Arawakan languages (awd-taa). [harmonize under Ta-Arawak]
- Proto-Basque (euq-pro) does not have the expected name "Proto-Vasconic", even though it is the proto-language of the Vasconic languages (euq). [keep as-is]
- Proto-Norse (gmq-pro) does not have the expected name "Proto-North Germanic", even though it is the proto-language of the North Germanic languages (gmq). [keep as-is but rename gmq-pro to non-pro]
- Proto-Kamta (inc-krn-pro) does not have the expected name "Proto-KRNB lects", even though it is the proto-language of the KRNB lects (inc-krn). [rename family to KRDS languages, keep proto-language as-is]
- Proto-Chumash (nai-chu-pro) does not have the expected name "Proto-Chumashan", even though it is the proto-language of the Chumashan languages (nai-chu).
- Proto-Maidun (nai-mdu-pro) does not have the expected name "Proto-Maiduan", even though it is the proto-language of the Maiduan languages (nai-mdu).
- Proto-Mixe-Zoque (nai-miz-pro) does not have the expected name "Proto-Mixe-Zoquean", even though it is the proto-language of the Mixe-Zoquean languages (nai-miz).
- Proto-Pomo (nai-pom-pro) does not have the expected name "Proto-Pomoan", even though it is the proto-language of the Pomoan languages (nai-pom).
- Proto-Mazatec (omq-maz-pro) does not have the expected name "Proto-Mazatecan", even though it is the proto-language of the Mazatecan languages (omq-maz).
- Proto-North Sarawak (poz-swa-pro) does not have the expected name "Proto-North Sarawakan", even though it is the proto-language of the North Sarawakan languages (poz-swa).
- Proto-Salish (sal-pro) does not have the expected name "Proto-Salishan", even though it is the proto-language of the Salishan languages (sal). [harmonize under Salish]
- Proto-Samic (smi-pro) does not have the expected name "Proto-Sami", even though it is the proto-language of the Sami languages (smi).
- Proto-Kuki-Chin (tbq-kuk-pro) does not have the expected name "Proto-Kukish", even though it is the proto-language of the Kukish languages (tbq-kuk). [harmonize under Kuki-Chin]
- Proto-Saka (xsc-sak-pro) does not have the expected name "Proto-Sakan", even though it is the proto-language of the Sakan languages (xsc-sak).
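For context, the check behind these warnings can be approximated as follows. This is a Python sketch, not the actual data-consistency code; it assumes the rule is simply "a code ending in -pro should be named Proto- plus the canonical name of the family whose code is the part before -pro", and the function names and sample data are illustrative.

```python
def expected_proto_name(family_name):
    """Name a proto-language is expected to have, given its family's name."""
    return "Proto-" + family_name


def name_mismatch_warnings(families, proto_languages):
    """List warnings for proto-languages whose name differs from the expected one.

    families: maps family code -> canonical family name
    proto_languages: maps proto-language code -> canonical name
    """
    suffix = "-pro"
    warnings = []
    for proto_code in sorted(proto_languages):
        if not proto_code.endswith(suffix):
            continue
        family_code = proto_code[: -len(suffix)]
        family_name = families.get(family_code)
        if family_name is None:
            continue  # a missing family is a separate kind of warning
        proto_name = proto_languages[proto_code]
        expected = expected_proto_name(family_name)
        if proto_name != expected:
            warnings.append(
                '%s (%s) does not have the expected name "%s"'
                % (proto_name, proto_code, expected)
            )
    return warnings
```

Renaming either side of a pair (the family or the proto-language) makes the corresponding warning disappear, which is why each mismatch can be resolved in two ways.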
We also have four warnings about proto-languages without associated families:
- Proto-Amuesha-Chamicuro (awd-amc-pro) has a proto-language code associated with the invalid code "awd-amc".
- Proto-Kampa (awd-kmp-pro) has a proto-language code associated with the invalid code "awd-kmp".
- Proto-Paresi-Waura (awd-prw-pro) has a proto-language code associated with the invalid code "awd-prw".
- Proto-Puroik (sit-khp-pro) has a proto-language code associated with the invalid code "sit-khp".
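The second kind of warning can be sketched the same way: strip -pro from each proto-language code and see whether what remains is a known family code. Again this is Python for illustration, with a hypothetical function name and sample data, not the code that actually emits the warnings.

```python
def orphan_proto_languages(family_codes, proto_codes):
    """Return (proto_code, missing_family_code) pairs with no matching family."""
    suffix = "-pro"
    orphans = []
    for code in proto_codes:
        if not code.endswith(suffix):
            continue
        parent = code[: -len(suffix)]
        if parent not in family_codes:
            orphans.append((code, parent))
    return orphans
```

Each orphan can be resolved either by creating the missing family or by moving the proto-language to a code whose prefix is an existing family.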
We also have two weird miscellaneous warnings:
- Proto-Rukai (dru-pro) has a proto-language code associated with Rukai (dru), which is not a family.
- Kelantan Peranakan Hokkien (mis-hkl) has its canonical name ("Kelantan Peranakan Hokkien") repeated in the table of aliases.
I can look into the second miscellaneous warning, but for the others, I mostly don't have enough context. Proto-Norse being the ancestor of the North Germanic languages is a special case because it's attested, but for the other mismatches, I imagine a lot of them are unintentional due to the existence of multiple names for the same family. It should be possible in many cases to rename either the family or proto-language to avoid the mismatch. Pinging @-sche and @Theknightwho who might know something about this; please feel free to ping others. Benwing2 (talk) 04:09, 19 February 2025 (UTC)
- In some cases, I think the family uses a different name to avoid having the same exact name as a (non-proto) language (as described in WT:FAM). For example, "Proto-Vasconic" gets only 13 Google Books hits (that actually use that term; the subsequent pages upon pages of results that Google returns don't use the term or sometimes even have any particular relevance; who knows why Google returns them), whereas I find 10+ pages [of ten uses each] of "Proto-Basque", so "Proto-Basque" is clearly the more common name for the language ... but without even checking whether "Basque languages" or "Vasconic languages" is more common for the family, I can see that one benefit to calling them "Vasconic languages" is that if they were called "Basque languages", then things like {{der|en|euq|-}} would display identically to {{der|en|eu|-}}. (That might not matter that much in that particular case, but for larger families it'd be confusing. However, {{der|en|qwm|-}} and {{der|en|trk-kip|-}} do display identically... so maybe we need to rename one of those, or find some way of solving this "same name" issue...)
In some cases, the proto-language and family might really have different common names.
In the case of Salish, it looks like the family could be renamed "Salish" to match the proto-language; "Proto-Salish" gets 11 pages of relevant Google Books results vs only 9 pages for "Proto-Salishan", and "Salish languages" is apparently also more common. - -sche (discuss) 05:04, 19 February 2025 (UTC)
- "Ta-Arawak" seems to be marginally more common than "Ta-Arawakan", if we wanted to synchronize that pair: on Google Scholar, "Ta-Arawak" gets 40 hits, "Ta-Arawakan" 26; on Google Books, each one gets about 14 hits (discounting a few which are not in English and are only using ta as a particle while mentioning the Arawak/an languages). "Proto-Ta-Arawakan" gets 1 GBooks hit and "Proto-Ta-Arawak" gets none; "Ta-Arawakan languages" returns 2 copies of 1 book, "Ta-Arawak languages" returns 1 book. On Google , "Ta-Arawakan languages" returns 0 hits while "Ta-Arawak languages" returns 7 (of which 3 are duplicates of a single work). - -sche (discuss) 18:31, 19 February 2025 (UTC)
- @-sche What about Proto-Arawak vs. Arawakan? Wikipedia has w:Arawakan languages and w:Ta-Arawakan languages (although the w:Arawakan languages article uses "Ta-Arawak" in reference to the family). Since Ta-Arawakan is a subfamily of Arawakan, it seems we should be consistent in the names of these two families. (Meanwhile, confusingly, Category:Arauan languages is an apparently unrelated family; Wikipedia's article is at w:Arawan languages, which looks more "modern".) Benwing2 (talk) 00:59, 21 February 2025 (UTC)
- Although both names seem to be common enough that the Google (Books) Ngram Viewer should be able to plot them (both seem to get well over 40 hits), it doesn't like the hyphens, so this claims no results, and I can't be sure whether this is actually a graph of "Proto-Arawak" or instead of how many books have "Proto" minus "Arawak". Nonetheless it seems like "Arawak" is more common, if we wanted to standardize everything on that. (Google Scholar also claims to find slightly more results for "Proto-Arawak" than "Proto-Arawakan", and significantly more for "Arawak" than "Arawakan".) - -sche (discuss) 18:32, 22 February 2025 (UTC)
- For Kamta, I notice there's the added oddity that the language family/category is named "... lects" rather than "... languages", even though the languages in the category are named "Category: ... language". AFAICT, that part of the name should be regularized (from "lects" to "languages"). For the name itself, google books:"KRNB" languages Kamta turns up zilch (and I spy only three Google Scholar hits), but "Kamta languages" also turns up zilch (and if the family were renamed "Kamta" to match the proto-language, we would run into the Kipchak issue where {{der}} etc would return the same name whether the family or the [non-proto] language that's already called "Kamta" was called). Wikipedia uses a third name, "KRDS", which I can find a couple of Google Books and a couple of Google Scholar hits using. There are a couple Google Books and Scholar hits for "proto-Kamta", and none for "Proto-KRNB" or "Proto-KRDS", so maybe we leave the proto-language name as "Proto-Kamta" but change the family from "KRNB lects" to "KRDS languages"? Or maybe some Indian-language editors have better knowledge/ideas: pinging User:AryamanA who created Category:Rajbanshi language (and you already pinged TKW, who created Category:Surjapuri language). - -sche (discuss) 18:32, 22 February 2025 (UTC)
- In general, I'd follow the literature; if they generally use a different name for the proto-language vs. the group by which the proto-language is reconstructed, so be it. If it's an even split between multiple names: sure, harmonize it for convenience. However, I have a few suggestions.
  - Rename "Kukish" to "Kuki-Chin" (Kuki-Chin is more common)
  - Change the code of Proto-Norse from gmq-pro to non-pro but keep the "Proto-Norse" name (since that's what the literature calls it). It doesn't really make sense for Old Norse to be non but Proto-Norse to have "gmq" instead. Ceso femmuin mbolgaig mbung, mellohi! (投稿) 17:06, 24 February 2025 (UTC)
- Definitely, in cases where one name is more common for the proto-language and another for the group, I agree it's fine for them not to match. - -sche (discuss) 17:50, 25 February 2025 (UTC)
@-sche, Mellohi! I added the results so far in bold. There's a trend here in that so far generally the name of the proto-language has remained and the name of the family changed. I don't know if that applies to the remainder, though. Benwing2 (talk) 20:30, 24 February 2025 (UTC)
Turkish IPA module proposal
(Notifying İtidal, Fytcha, Vox Sciurorum, Lambiam, Whitekiko, Ardahan Karabağ, Orexan, Moonpulsar, Lagrium): I'd like to propose the possibility of introducing a pronunciation module to tidy up the mess that is currently on the pronunciation section of Turkish pages. I have already made a prototype of such module on Module:User:Trimpulot/tr-IPA-test. It is currently capable of guessing the pronunciation of most words:
- it understands that â, î and û cause palatalization and are usually long in open syllables
- in other cases the palatalization of consonants needs to be marked with capital letters (K, G or L), and the length of vowels with a following colon (:)
- stress must be marked with an apostrophe (') preceding the stressed syllable, unless it's the last one
- if a word's last vowel becomes long before a suffix beginning in a vowel, the word must be followed by a plus sign and the accusative vowel; this plus sign should be replaced by a dash if the final consonant also undergoes lenition
- if the word's pronunciation is perfectly understandable from its spelling, the template needs no parameters, except for a potential +V or -V, which can stand on its own (e.g. on mahbup, the only needed parameter is |-u)
Trimpulot (talk) 09:06, 19 February 2025 (UTC)
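To make the respelling notation above easier to picture, here is a minimal sketch of how the inline markers could be read. It is Python rather than the prototype's Lua, it is not based on the code in Module:User:Trimpulot/tr-IPA-test, and it covers only three of the listed conventions: capital K, G, L for palatalization, a colon for length, and an apostrophe before the stressed syllable. The circumflex rule and the +V / -V suffix parameter are left out, and all names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Capital letters that mark a palatalized consonant in the respelling.
PALATALIZED = {"K": "k", "G": "g", "L": "l"}


@dataclass
class Segment:
    letter: str
    palatalized: bool = False
    long: bool = False


@dataclass
class Respelling:
    segments: List[Segment] = field(default_factory=list)
    # Index of the first segment of the stressed syllable; None means no
    # apostrophe was given, i.e. stress falls on the last syllable by default.
    stress_at: Optional[int] = None


def parse_respelling(text):
    """Read the palatalization, length and stress markers of a respelling."""
    result = Respelling()
    for ch in text:
        if ch == "'":
            # The apostrophe precedes the stressed syllable.
            result.stress_at = len(result.segments)
        elif ch == ":":
            # The colon lengthens the segment immediately before it.
            if not result.segments:
                raise ValueError("':' must follow another segment")
            result.segments[-1].long = True
        elif ch in PALATALIZED:
            result.segments.append(Segment(PALATALIZED[ch], palatalized=True))
        else:
            result.segments.append(Segment(ch))
    return result
```

For example, `parse_respelling("Ka:r")` yields a palatalized k, a long a and a plain r, with `stress_at` left as `None` so that the default final stress applies.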
- Judging from the example, the "dash" is an ordinary hyphen (-)
- Some entries give only a phonemic pronunciation (e.g. abi: “IPA(key): /aːbi/”), some give only a phonetic pronunciation (e.g. açım: “IPA(key): [ɑˈtʃɯm]”), and some have both phonemic and phonetic ones (e.g. kar: “IPA(key): /ˈkaɾ/ [ˈkʰɑɾ̞̊]”). How will the new module handle this?
- How is the module invoked? ({{#invoke:User:Trimpulot/tr-IPA-test|???|...|???}}). The documentation should include some examples. For example, how should mal and hal be handled? --Lambiam 12:43, 19 February 2025 (UTC)
- @Lambiam As of now, the module is designed to only give a phonemic pronunciation (which I deem to be sufficient). I will add a documentation on Template:User:Trimpulot/tr-IPA-test to better explain how it works. Trimpulot (talk) 15:44, 19 February 2025 (UTC)
- I agree with only giving a phonemic pronunciation; see my argument at Wiktionary:Tea room/2021/March#Turkish pronunciation. But others may disagree, seeing how much work appears to have been put in these narrow transcriptions. In particular, User:Science boy 30 writes on his talk page: “This user is against broad transcription.” His latest contribution, at gâvur, has been to replace [ɟɑˈβ̞uɾ̞̊] by [ɟɑˈβ̞ʊɾ̞̊]. So it may be wisest to at least allow room for narrow, phonetic transcriptions and use it to retain existing ones. I also suggest testing the new module by comparing its results with currently given phonemic pronunciations. --Lambiam 17:34, 19 February 2025 (UTC)
- @Trimpulot @Lambiam My personal view is that we should provide a "lightly phonetic" transcription that includes aspects of pronunciation that may not be phonemic but which significantly impact the actual pronunciation and may be non-obvious to language learners. An example is Spanish voiced stops /b d g/, which become approximants [β̞ ð̞ ɣ̞] in certain positions (e.g. between vowels). The pronunciation as approximants is very salient and audible, and pronouncing them as stops marks you as a foreigner with a bad accent. OTOH the exact quality of Spanish mid vowels /e o/ is less important and probably doesn't need to be indicated. I don't know Turkish well but it seems to me that a "lightly phonetic" rendition would include palatalization of /k g l/ whenever it occurs but not necessarily things like aspiration of voiceless stops or the other details found in a transcription like [ɟɑˈβ̞ʊɾ̞̊] (which seems too detailed). Overall though I'm strongly in favor of having a pronunciation module; manually generated pronunciations always end up messy and inconsistent. Benwing2 (talk) 22:16, 19 February 2025 (UTC)
- @Trimpulot One other thing ... you should probably come up with a different way of marking palatalization than capital letters, because capital K G L will clash with proper names that happen to have capital letters in them that aren't palatalized. You could for example use an apostrophe to indicate palatalization (k' g' l') and switch to using an acute accent to mark stress (á é ...), or mark one of them with an apostrophe and the other with a double quote ("). Actually, apostrophes might not be so good if there are Turkish words that have apostrophes in their normal spelling (I don't know if that's the case, but you don't want people to be forced to provide respelling of words that happen to have capital letters, apostrophes, etc. in them). Also your module should be able to handle multi-word terms correctly; I can help you come up with a syntax for this, as I've written several pronunciation modules. Benwing2 (talk) 22:27, 19 February 2025 (UTC)
- @Benwing2 Capital letters in proper names do not constitute a problem, since titles are converted to lowercase before being analysed (of course, if a proper name needs to be manually transcribed, the editor should not use capital letters for anything but palatalization). As for the stress, using accent marks would make it difficult to stress special characters such as ü, ö and ı, in turn making it necessary to use an apostrophe for stress. Furthermore, I don't see a valid reason to switch to an apostrophe-double quote system, since that would require more characters to transcribe what it can already handle well. Multi-word terms are already handled, I only need to figure out where it's best to automatically place the stress. As for phonetic transcriptions, I might add a way to add them manually. Trimpulot (talk) 09:22, 20 February 2025 (UTC)
- If a proper name needs to be respelled, it's IMO going to be very awkward to require that lowercase letters be used in place of capital letters in respellings so that capital letters can be used for palatalization. You're likely to have bad output coming from editors who forget they need to lowercase all capital letters in respelling. It also makes the substitution notation (see e.g. {{ca-IPA}}, {{fr-IPA}}, {{cs-IPA}}, {{pt-IPA}} for examples of this in action) significantly more awkward. Trust me that it would be better to use something other than capital letters for palatalization. Benwing2 (talk) 09:27, 20 February 2025 (UTC)
- I can't tell when the impact on the actual pronunciation should be considered significant. The minimal pair kar – kâr shows the distinction is not always purely phonetic, but I don't know any comparable /ɡ/ – /ɟ/ pairs. I tend to indicate palatalization of /ɡ/ in entries I create, mainly for consistency with how native Turkish editors tend to handle this (e.g. using /bɛlˈɟe/ for belge). Other native Turkish editors, however, may disagree (as shown by a preference for the broad /t͡ʃiˈzel.ɡe/). Some anomalous pronunciations, such as the common monosyllabic pronunciation [diːl] for değil, are IMO also worth recording. So I think we should allow one or more narrow transcriptions in conjunction with the broad one. (We will still miss colloquial sandhi phenomena like [nɑˈbæɾ] for ne haber and [nɑpˈtɯn] for ne yaptın.) --Lambiam 09:00, 20 February 2025 (UTC)
- @Benwing2 @Lambiam I have updated the module. It can now handle manual narrow transcriptions and qualifiers for pronunciations. I've also switched the capital letters for a following asterisk to indicate palatalization. You can find all the relevant information on the documentation page for Template:User:Trimpulot/tr-IPA-test.
- Trimpulot (talk) 21:07, 28 February 2025 (UTC)
- @Trimpulot Sounds good, I'll take a look. Benwing2 (talk) 21:59, 28 February 2025 (UTC)
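The respelling convention discussed in this thread (lowercase the input first, then read a following asterisk as a palatalization mark) can be illustrated with a minimal sketch. This is not the code of the module under discussion: the function names `turkish_lower` and `mark_palatals` are hypothetical, only k and g are covered, and the output is a plain symbol substitution rather than a full IPA transcription. The lowercasing step handles the Turkish I/ı and İ/i pairs explicitly, since Python's default `lower()` does not.

```python
# Hypothetical illustration only; not the actual tr-IPA module logic.
PALATAL = {"k": "c", "g": "ɟ"}  # assumed symbols for palatalized k and g


def turkish_lower(text):
    # Turkish pairs I with ı and İ with i, so map those before the generic lower().
    return text.replace("I", "ı").replace("İ", "i").lower()


def mark_palatals(respelling):
    # Replace a consonant followed by "*" with its palatal counterpart.
    out = []
    for ch in turkish_lower(respelling):
        if ch == "*" and out and out[-1] in PALATAL:
            out[-1] = PALATAL[out[-1]]
        else:
            out.append(ch)
    return "".join(out)
```

Because capitals are folded away before the asterisk is interpreted, a proper name written with its normal capital letter would not be mistaken for a palatalization mark in this scheme.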
FYI: "About [Language] pages" are being moved to "[Language] Entry Guidelines"
Per Wiktionary:Requests for moves, mergers and splits § Wiktionary:English entry guidelines vs "About (language)" in every other language, all the About Language pages, like WT:About Jeju, are being moved to Language entry guidelines, such as WT:Jeju entry guidelines, by @ExcarnateSojourner. This change blindsided me a bit considering what I've been used to, and looking at the discussion, I don't feel that there was enough participation (and it should've been mentioned here). Nonetheless, this is more so a message out there for other folks so that they're not confused as well. AG202 (talk) 06:00, 20 February 2025 (UTC)
- We should also consider renaming Cat:Wiktionary language considerations and replacing references to "language considerations pages" to something like "Wiktionary language guidelines"/"language guidelines pages". ("Wiktionary language-specific entry guidelines"/"language-specific entry guidelines pages" is too much of a mouthful.) This, that and the other (talk) 10:23, 20 February 2025 (UTC)
- @AG202 Thanks for the feedback, and sorry to have caught you off guard. Counting RFM discussions there were twelve participants, which is a lot for an RFM (though I get that this is a particularly large change). — excarnateSojourner (ta·co) 20:46, 20 February 2025 (UTC)
- @excarnateSojourner: This change was also a surprise to me. There may have been twelve participants, but if I found the correct discussion, most of them do not edit anymore, and the discussion has been stale for years. I think this should have been discussed anew before implementing. Thadh (talk) 15:37, 28 February 2025 (UTC)
- @Thadh I see now that I was thinking too bureaucratically (as opposed to practically) in implementing stale consensus. My apologies. — excarnateSojourner (ta·co) 19:28, 7 March 2025 (UTC)
Social media account
@Chuck Entz, CitationsFreak, DCDuring, Ioaxxere, Thadh, Theknightwho, Vininn126: there was a previous discussion in March last year about whether it would be a good idea to start one or more social media accounts to publicize the English Wiktionary, and maybe also to interact with people (though I'm slightly sceptical about that). Having tried out Bluesky for a while now, I wonder if we want to experiment by setting up an account which can be accessed by a few trusted users. I can put up a daily Word of the Day post, and maybe someone can do one for the Foreign Word of the Day too. Maybe others would like to highlight other entries which are relevant to current affairs, or talk about how they improve the dictionary. To register an account on the main Bluesky Social platform, we'd need to put down an e-mail address (one that the trusted users can access, I suppose), a password and a "birth date" (the date when the dictionary launched??). A possible account name is @en.wiktionary.
Alternatively, "Bluesky is an open network where you can choose your hosting provider. If you're a developer, you can host your own server." See https://atproto.com/guides/self-hosting. Not sure if that's better, but someone else would have to set up and maintain this.
Thoughts? — Sgconlaw (talk) 11:37, 20 February 2025 (UTC)
- I made a Bluesky account a few days ago. Vininn126 (talk) 11:43, 20 February 2025 (UTC)
- @Vininn126: for yourself or for the English Wiktionary? — Sgconlaw (talk) 11:44, 20 February 2025 (UTC)
- For English Wiktionary. Haven't done much to set it up, but it exists. As it stands, access is generally limited to admins. Setting up some code or something to automatically post (F)WOTD's would be nice. Vininn126 (talk) 11:47, 20 February 2025 (UTC)
- @Vininn126: great! Well, as I mentioned, I’m happy to post WOTDs. No idea if this can be automated, but it might be nice to do it manually as I can add interesting comments about the etymology or meaning, as well as an image from the Commons. I could start, say, on 1 March 2025. — Sgconlaw (talk) 11:58, 20 February 2025 (UTC)
- I think experimenting is a good idea. Starting on a platform that doesn't have a vast number of users seems wise. But we wouldn't get a lot of new users without going big (FB, etc.). OTOH, going big scares me. DCDuring (talk) 17:19, 20 February 2025 (UTC)
- @Vininn126, do you know if we will be able to use the @wiktionary.org domain as Wikipedia does? It would be nice if we could have at least one "verified" account, lol — BABR・talk 08:06, 27 February 2025 (UTC)
- Our current accounts are the result of working with phabricator IT techs for a solution. So this is somewhat what WMF has decided to give us. Vininn126 (talk) 09:43, 27 February 2025 (UTC)
- Speaking of which, Sgconlaw, your latest WOTD picks have been absolutely popping. ―K(ə)tom (talk) 18:12, 20 February 2025 (UTC)
- @Ktom: ha ha, thanks! The holoalphabetic month has been fun to work on. — Sgconlaw (talk) 19:40, 20 February 2025 (UTC)
@Vininn126: so can we start using the Bluesky account? Feel free to e-mail me directly. — Sgconlaw (talk) 19:10, 12 March 2025 (UTC)
- The account is reserved for admins right now. Perhaps if others agree, this can change. Otherwise, more concrete actions (setting something up to share WOTD and FWOTD) and specific posts can be suggested. Vininn126 (talk) 19:24, 12 March 2025 (UTC)
- @Vininn126: I'm an admin … — Sgconlaw (talk) 19:25, 12 March 2025 (UTC)
- Sorry, brain fart. I knew that... I'll email you the passwords. Vininn126 (talk) 19:27, 12 March 2025 (UTC)
The continuity of Foreign Word of the Day
This is a notice that you will soon need someone new to prepare Foreign Words of the Day if you desire to continue it. I shall not detail why, but I am no longer available, neither for setting them nor mentoring somebody else nor for standing by.
All slots to the end of March have been filled.
I wish good luck to the next person in charge. General advice is to learn the WDL and LDL rules fast, always feature one definition with a quotation at the least (if applicable), and to look at older examples and copy them when in doubt (in particular for Chinese and Egyptian). ←₰-→ Lingo Bingo Dingo (talk) 17:40, 20 February 2025 (UTC)
- @Lingo Bingo Dingo: thanks for all your hard work! — Sgconlaw (talk) 17:42, 20 February 2025 (UTC)
- You will be missed. Thank you for all you've done! Vininn126 (talk) 17:44, 20 February 2025 (UTC)
- @Lingo Bingo Dingo: I'm also sorry to see you go. Your userpage picture is beautiful; I hope it stays. I also hope you won't object to the restoration of your talk page and its archives, for the sake of retaining accessible and searchable discussion records. Thank you for all your work. Whatever has happened, is happening, or will happen in your life, I wish you the best. 0DF (talk) 18:23, 20 February 2025 (UTC)
- Thanks for your work on FWOTD. I may be interested in filling in the role a little bit. (New around here but I have 3½ years experience posting words every day) Hftf (talk) 22:29, 20 February 2025 (UTC)
- Wonderfool is also leaving. Father of minus 2 (talk) 22:33, 20 February 2025 (UTC)
- When Equinox left, Wonderfool said he was leaving too. That was a year or so ago. Benwing2 (talk) 00:48, 21 February 2025 (UTC)
- See you, mister. Polomo47 (talk) 23:20, 20 February 2025 (UTC)
- Working with you has been a treat. Flame, not lame (Don't talk to me.) 16:27, 22 February 2025 (UTC)
- @Lingo Bingo Dingo: Thank you for all the work, and I am sad to see you go :( Thadh (talk) 16:30, 22 February 2025 (UTC)
- @LBD: Thank you for all your hard work maintaining FWOTD for so long! - -sche (discuss) 18:01, 22 February 2025 (UTC)
- @everyone: when someone isn't available to set an English WOTD, the system recycles last year's word, whereas with FWOTD (unless this has changed in the time since I was familiar with it) the system fails. If no-one is able to step up and maintain FWOTD, two ideas that'd allow for a smaller workload are (1) switch to a fallback system like WOTD, and/or (2) reduce the frequency, e.g. make it "f. word of the week". (Of course, if someone has time to maintain the current system, great!) - -sche (discuss) 18:01, 22 February 2025 (UTC)
- In the Information Desk post complaining about a lack of entries, I mentioned how I would be interested in adding words. Also, it doesn't have to be a single person — say, everyone with the autopatroller role should be able to edit FWotD. Polomo47 (talk) 02:14, 23 February 2025 (UTC)
- We should let AI choose and set up FWOTD. Or what Polomo47 says, sometimes one can (excelling editors can) shortcut it and dump some quoted terms into FWOTD because it spares braincells of one who would maintain setting FWOTD from nominations; I never understood numerological criteria, to be frank.
- As a middle ground, I fancy a “FWOTD adder” equivalent to the translation adder where we can just drop ready lemmas and some computer program will arrange it according to which languages have been too recently featured and are stocked. Fay Freak (talk) 04:47, 23 February 2025 (UTC)
- @Polomo47: if you are interested in taking on the FWOTD, you should go for it and try it out. I also think it's a good idea to set up a fallback system for the FWOTD like the one used for the WOTD. (Actually, the WOTD's fallback isn't fully implemented yet. I've been (slowly) adding permanent fallbacks for various days of the year now and then, but I'm sure there are some present fallbacks that are incorrect because they refer to movable holidays in past years.) — Sgconlaw (talk) 22:20, 23 February 2025 (UTC)
- Well, FWotD, like WotD, is (was?) locked from editing — I found that out at the start of the year. Not sure how one gets access... But yes, I'd like to try it out! Polomo47 (talk) 22:22, 23 February 2025 (UTC)
- @Polomo47: I assume you have been around long enough to be autoconfirmed? I see you've been editing since last year. I'm not very sure how the FWOTDs are protected; perhaps @Chuck Entz can advise on this. — Sgconlaw (talk) 22:27, 23 February 2025 (UTC)
- It seems back when I tried I was not autopatrolled; tried it now and it worked. That settles it, then. Polomo47 (talk) 22:31, 23 February 2025 (UTC)
- @Polomo47 What page were you trying to edit? I just looked at Wiktionary:Foreign Word of the Day/2025/January 8 and it has no protection at all, not even autoconfirmed. I was able to edit it logged out. Benwing2 (talk) 22:33, 23 February 2025 (UTC)
- It wasn't protection, per se, but an abuse filter. I tried to add a FWotD on January 1st but got hit with what I now know is Special:AbuseFilter/119. That might've been because I tried to edit it on the day itself, and I might not've been autopatrolled back then. Polomo47 (talk) 22:41, 23 February 2025 (UTC)
- @Lingo Bingo Dingo: Really appreciate your work, especially FWOTD! The care and effort you take to maintain this for so long is really admirable. All the best on your endeavours! (Note for whoever takes over: Feel free to bug me for/about more Chinese entries if relevant.) — justin(r)leung { (t...) | c=› } 01:21, 23 February 2025 (UTC)
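The fallback idea raised above by -sche (recycling an earlier year's word when nothing has been set for the day) could look roughly like the following. This is a minimal sketch under stated assumptions, not how the WOTD or FWOTD templates are actually implemented: the function name `word_for`, the `scheduled` mapping and the `max_years_back` limit are all hypothetical.

```python
from datetime import date


def word_for(day, scheduled, max_years_back=5):
    """Return the word set for `day`, else the same month/day from the
    most recent earlier year that has one (hypothetical fallback rule)."""
    for years_back in range(max_years_back + 1):
        try:
            candidate = day.replace(year=day.year - years_back)
        except ValueError:
            # 29 February has no counterpart in a non-leap year; skip that year.
            continue
        if candidate in scheduled:
            return scheduled[candidate]
    return None  # nothing set and nothing to recycle
```

As Sgconlaw notes, a purely date-based fallback can pick up words tied to movable holidays in past years, so a curated list of permanent fallbacks may still be needed on top of a rule like this.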
Transliteration of Ethiopic ቐ
Currently the transliteration norms listed at Wiktionary:Ethiopic transliteration give the transliteration of ቐ (and other glyphs with the same consonant) as <ḳʰ>; I feel this is misleading, as this consonant does not indicate an aspirated stop in Tigrinya (the only language widely using this glyph), but rather indicates an ejective fricative [x']~[χ'].
I feel that <x̣> would be a more representative and internally consistent transliteration, as it ties its transliteration to ኸ <x> (another velar stop commonly spirantized post-vocalically in Tigrinya) and continues to follow the established norm of using the underdot to indicate its emphatic articulation, as well as more clearly showing its actual phonetic realization. Rsmit274 (talk) 00:36, 21 February 2025 (UTC)
- Then is the Wikipedia page for Geʽez script wrong in giving ቐ as "qʰ [q]" ? Exarchus (talk) 15:18, 21 February 2025 (UTC)
- Thanks for flagging that; it is indeed wrong. If there are any scholarly sources that identify the phone as [q], I'd certainly be interested to hear about it; however, on a quick (informal) literature review, it looks like Maria Bulakh, Niguss Mehari and Rainer Voigt identify it as [χ'], and Tsehaye Teferra, Colleen Fitzgerald and Wolf Leslau identify it as [x'], with none identifying [q]. Rsmit274 (talk) 19:30, 21 February 2025 (UTC)
- I was looking for the ISO standard for Geʽez transliteration, but it doesn't appear to exist.
- ቐ does seem to be used for [q] in the Awngi language, so that might explain the statement on Wikipedia. But given the greater importance of Tigrinya (we only have one Awngi lemma), using <x̣> for ቐ seems a good idea. Exarchus (talk) 20:52, 21 February 2025 (UTC)
- I took the liberty of making the proposed move from 〈ḳʰ〉 to 〈x̣〉 in the relevant modules and pages. It would still be possible to have a different transliteration for Awngi if needed. Exarchus (talk) 11:23, 22 February 2025 (UTC)
Entries with no Etymology headers
Is there, or could there be a category or other tool to track down entries which contain lemmas without Etymology section? Saumache (talk) 20:29, 21 February 2025 (UTC)
- The usefulness of such a category might even still be minimal as there are plenty of things such as alt forms or English multiword phrases where the etymology is clear/unnecessary. Vininn126 (talk) 20:31, 21 February 2025 (UTC)
- In what cases would it be so clear as to be unnecessary? Even doghouse has an etymology and I really cannot imagine the scenario where someone is confused as to how that word came about. —Justin (koavf)❤T☮C☺M☯ 21:00, 21 February 2025 (UTC)
- I listed two such cases. I'm kind of wondering how you missed those. Vininn126 (talk) 21:02, 21 February 2025 (UTC)
- But how is a multi word phrase more obvious than the compound word "doghouse"? Is "doghouse" somehow less clear than "dog house"? The only reason why a multi-word phrase may not need an etymology is because the header template is likely to just link to each word individually, making it redundant in that regard. That said, there are clearly plenty of multiword phrases where how it was coined or why it exists as a phrase is actually far more obscure than "this is a house for a dog, so it's called a 'doghouse'", so an etymology would be helpful. I will grant that I don't know of any alternative forms that have separate etymologies (e.g. only one etymology at color, not colour or yogurt but not yoghurt), so that may be a case where we don't in practice have them, but that doesn't mean they shouldn't. Has there been discussion on this? —Justin (koavf)❤T☮C☺M☯ 21:12, 21 February 2025 (UTC)
- The space shows where the gap is between words. Without it, we can't tell "psycho- + therapist" from "psycho- + the + rapist". 2A00:23C5:FE1C:3701:DCF2:CDF7:FC1F:D3F 21:14, 21 February 2025 (UTC)
- Okay, but who would think that "doghouse" is "do-+gho+-use"? It's a house for a dog, so it's a "doghouse". Is there anyone who is confused by this? —Justin (koavf)❤T☮C☺M☯ 21:16, 21 February 2025 (UTC)
- Some foreign learners might look up a complex word, with little knowledge of its constituents. (I do this with Finnish.) It's good to be consistent and include these things anyway (where there is any ambiguity, i.e. no spaces), for example to allow machine parsing. 2A00:23C5:FE1C:3701:DCF2:CDF7:FC1F:D3F 21:18, 21 February 2025 (UTC)
- I'm not sure what your point is IP EQ. Vininn126 (talk) 21:20, 21 February 2025 (UTC)
- I think he means that some people might be a tad confused, and think that "doghouse" has a different etymology than "dog+house" (e.g. coming from Spanish *perrocasa). It also helps etymology bots, by telling them that "dog" and "house" are the origins of "doghouse", leading to stuff like etymology trees and the like. CitationsFreak (talk) 01:42, 22 February 2025 (UTC)
- Given terms like anethole and cathode, it may not be immediately obvious to a non-native speaker that cathole is not the spelled form of a word pronounced /ˈkæθ.oʊl/. ‑‑Lambiam 19:28, 23 February 2025 (UTC)
- @Koavf I would say the main thing to consider is that dog house has links to dog and house in the headword title, and doghouse doesn't (aside from the etymology section). But at the end of the day a blanket rule is easier than trying to figure out what's "obvious" enough. For example I doubt most people could point out that haphazard is made up of hap + hazard even though in principle it's equivalent to doghouse. Ioaxxere (talk) 22:02, 21 February 2025 (UTC)
- My sister for years thought that "misled" was pronounced like a past tense (MAI-zuld), as she had only seen it in print. 2A00:23C5:FE1C:3701:DCF2:CDF7:FC1F:D3F 22:03, 21 February 2025 (UTC)
- I also made this mistake as a child. Theknightwho (talk) 02:56, 27 February 2025 (UTC)
- I'm not saying all multiword entries shouldn't have etymologies. Vininn126 (talk) 21:15, 21 February 2025 (UTC)
- Granted, but I'm asking what is the difference between what is an apparently sufficiently clear multiword phrase and a sufficiently clear compound like "doghouse"? If we have an etymology at one, why not the other? You also seemed to not see my other questions. —Justin (koavf)❤T☮C☺M☯ 21:17, 21 February 2025 (UTC)
- I do not think you understand my point and are accidentally making a strawman. I said that it's a numbers game, that the number of such entries not needing a section could easily be larger than those needing it. Reread my first comment. Vininn126 (talk) 21:19, 21 February 2025 (UTC)
- Apart from it being sufficiently clear, it may also be that we do not know any wording or formatting that could make the matter more clear than it is, which is why we only include {{ar-rootbox}}, {{syc-rootbox}}, {{he-rootbox}}, {{aii-root}} when a word has a native root with a transfix serving vague purposes. (Just found out that {{shi-rootbox}} exists for two months but has not had success in deployment yet.) If you can only be superficial you can just as well leave it at the linked constituents. Fay Freak (talk) 22:22, 21 February 2025 (UTC)
- Eh, not sure what this would do that {{rfe}} doesn't already cover. Also, to be quite honest, for some languages, there simply isn't anything to mention etymology-wise. It's not unclear, and putting {{unk}} everywhere doesn't feel right. AG202 (talk) 21:47, 21 February 2025 (UTC)
- If it isn't unclear, why not put down the etymology for the languages? If it's unknown, then I could see the argument. CitationsFreak (talk) 01:46, 22 February 2025 (UTC)
- It is unknown. For a lot of underrepresented languages' base morphemes, there hasn't been much major research into their etymologies and they have no written ancestors. Other than possible cognates, there's really nothing to add. Ex: Yoruba bùn and most other Yoruba monosyllabic verbs; the etymology sections are empty, and if the header "Etymology" exists, it's only to separate out lemmas per our entry layout. Reconstructions have only been made for certain words and outside of those words, there's quite literally no information out there. AG202 (talk) 05:50, 22 February 2025 (UTC)
- You could use -insource:/\=Etymology/ to eliminate pages with etymology headers and incategory:"English lemmas" in Special:Search to find English lemmas without etymology sections anywhere on the page, but you would want to narrow it down further, and it would take some tweaking to keep the search from timing out. Chuck Entz (talk) 21:56, 21 February 2025 (UTC)
- @Chuck Entz Thanks! I wasn't narrowly thinking of English entries in making my query and most of the comments are far off what I intended to do with such a tool. The idea is that, apart from the fact I deem them mandatory, entries lacking Etymology headers (and that should have one, most of these lemmas simply being of affixational origin) are more or less all stubs, old entries that need some clean up and/or added content. I keep stumbling upon these randomly and wanted to really address the issue. Saumache (talk) 22:49, 21 February 2025 (UTC)
- And, by the way, where do I find documentation on search box "templates"? Saumache (talk) 22:59, 21 February 2025 (UTC)
- @Saumache the Help button in the top-right of Special:Search takes you to the documentation at mw:Help:CirrusSearch. There is also the advanced search dropdown at Special:Search. This, that and the other (talk) 00:25, 22 February 2025 (UTC)
- Please don't add "etymology" sections to taxonomic species names. It is more useful to make sure that there are entries with etymologies for genera and for specific epithets. DCDuring (talk) 16:14, 22 February 2025 (UTC)
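The search Chuck Entz describes above can also be assembled as a URL for scripted use. The following is a minimal sketch that only builds the request and does not send it; it assumes the standard MediaWiki search API parameters (`action=query`, `list=search`, `srsearch`, `srlimit`), and the helper name `build_search_url` and the default limit are illustrative choices, not anything established in this discussion.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

API = "https://en.wiktionary.org/w/api.php"


def build_search_url(category="English lemmas", limit=50):
    # Hypothetical helper: combine the category filter with the negated
    # insource regex quoted in the thread, then URL-encode the result.
    srsearch = 'incategory:"' + category + '" -insource:/\\=Etymology/'
    params = {
        "action": "query",
        "list": "search",
        "srsearch": srsearch,
        "srlimit": limit,
        "format": "json",
    }
    return API + "?" + urlencode(params)


if __name__ == "__main__":
    print(build_search_url())
```

Passing a different category name, for a language other than English, would address Saumache's point that the query need not be limited to English entries; as noted in the thread, the regex search may still time out unless it is narrowed further.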
Upcoming Language Community Meeting (Feb 28th, 14:00 UTC) and Newsletter
Hello everyone!

We’re excited to announce that the next Language Community Meeting is happening soon, February 28th at 14:00 UTC! If you’d like to join, simply sign up on the wiki page.
This is a participant-driven meeting where we share updates on language-related projects, discuss technical challenges in language wikis, and collaborate on solutions. In our last meeting, we covered topics like developing language keyboards, creating the Moore Wikipedia, and updates from the language support track at Wiki Indaba.
Got a topic to share? Whether it’s a technical update from your project, a challenge you need help with, or a request for interpretation support, we’d love to hear from you! Feel free to reply to this message or add agenda items to the document here.
Also, we wanted to highlight that the sixth edition of the Language & Internationalization newsletter (January 2025) is available here: Wikimedia Language and Product Localization/Newsletter/2025/January. This newsletter provides updates from the October–December 2024 quarter on new feature development, improvements in various language-related technical projects and support efforts, details about community meetings, and ideas for contributing to projects. To stay updated, you can subscribe to the newsletter on its wiki page: Wikimedia Language and Product Localization/Newsletter.
We look forward to your ideas and participation at the language community meeting, see you there!
MediaWiki message delivery 08:30, 22 February 2025 (UTC)
Transliteration of Bactrian υ /h/
Would it be an idea to transliterate Bactrian υ (Greek script) as 'h'? I noticed that Bactrian φ is already transliterated differently (viz. as 'f') than Greek ('ph'). Exarchus (talk) 14:05, 22 February 2025 (UTC)
- Anyone? Could transliterating υ as 'h' be a problem if people were looking for Bactrian terms in Latin script? But I'd think our romanisations are often different from the scientific standard. Exarchus (talk) 19:31, 2 March 2025 (UTC)
Inaccurate label and usage notes on non-standard English verb forms
English conjugations like knowed and swimmed are marked as mistakes typically made by non-native speakers or children, but these forms are extremely common in the South. From visiting my kin in Kentucky, I have heard "knowed" from native speakers probably more often than "knew". Conjugating irregular verbs in the past tense as tho they are standard is more the rule than the exception in these dialects, particularly in years/decades/centuries past. I want to be conservative about removing the labels and usage notes as they are or modifying them, so I wanted to get some validation here that these are not merely or even primarily mistakes made by someone who doesn't know better, but a perfectly normal part of some American dialects. —Justin (koavf)❤T☮C☺M☯ 00:21, 23 February 2025 (UTC)
- The quotes for knowed show it is common in dialectal English. nonstandard is the normal label for this, and indeed these terms have this label. You could expand on this by writing something like {{lb|en|nonstandard|;|dialectal|or|non-native speaker error}} (which displays as (nonstandard; dialectal or non-native speaker error)) and remove the usage note. Benwing2 (talk) 01:32, 23 February 2025 (UTC)
- Generally agreed on all of the thinking above. One nuance that can be added is the concept of "nonstandard in most dialects but a standard alternative form in some." Thus I would word the label more like "...|nonstandard in most dialects|..." rather than "...|nonstandard|;|dialectal|...", for full accuracy. The examples that leap to my mind for that aspect are come, run, and seen as preterite inflections (in addition to being the past participle inflection), which can fairly be said to have been traditionally standard (i.e., alternative but not-nonstandard) in working-class sociolects of AmE in the 19th and 20th centuries, and still today for plenty of people. The only reason it was taught in schools that they were "wrong" is the theme that "if you want to participate in 'upper-class' discussions, you must shed those forms from your usage." The difference is conflating upper-class usage with the only usage that can be standard in a language, versus the linguistically accurate understanding that each lect can have some standards that are different from those of other lects. It is interesting how in the 21st century there is more room, culturally, for people to properly understand how working-class sociolects are not inherently "backward" (just different), whereas in the 19th and 20th centuries there was no room for admitting that. A complex topic of course. Quercus solaris (talk) 17:46, 24 February 2025 (UTC)
Hokkien (or Southern Min) as a separate language again
I'm a frequent Wiktionary user frustrated by this enough to look up ways to open a discussion here. This appeal will be for Hokkien, but the same also applies to at least Cantonese and Hakka. This is effectively an appeal to reverse Wiktionary:Votes/pl-2014-04/Unified Chinese.
There is no dictionary other than Wiktionary that treats Hokkien as Unified Chinese. For a learner trying to look up words in Hokkien, the experience has them looking for their language in just a random "Etymology 2" entry. For example, try looking for the Hokkien definition of 阮.
In the Unified Chinese vote from 2014, it was stated:
the reason for the marginalisation of other varieties is that it is practically troublesome and unnecessary to have to duplicate everything (...) except the pronunciation for all 17 ISO-coded Chinese topolects.
This is not the reason for the marginalization. That runs much deeper; cf. me trying to make a case here that the language shouldn't be relegated to a subsection of the macrolanguage.
The vote was also rooted in an incorrect understanding of Chinese languages. They are not simply different pronunciations of the same language; this remains true even if one thinks of them as dialects of Chinese. A reminder that Northern Thai gets to have its own section, while the non-mutually-intelligible language/dialects Cantonese, Mandarin, Hokkien have to use the same section.
The perceived de-duplication also has no such effect, as other unquestionably-non-dialect languages that use Han/Chinese Characters still necessitate separate templates, etymologies, and definitions.
I would also like to note that Southern Min is the only case in the current language treatment policies (Wiktionary:Language treatment) that is a subdivision that's also treated as a language family. For that matter, Chinese is also the only language family with this Unique Treatment, and it hurts English Wiktionary as a whole. Kisaragi Hiu (talk) 06:59, 23 February 2025 (UTC)
- It's an interesting piece of Wiktionary history that the user who was (to my recollection, at least) most responsible for Sinitic languages being merged under one header (and for traditional Chinese being lemmatized rather than the modernly-more-common simplified Chinese!), and was doing the work of implementing and maintaining that system, Wyang, subsequently became quite argumentative, edit- and wheel-warring with people, and ultimately left the project (and thus stopped doing that work), but now it'd be a lot of work to undo or modify either of the changes. You're not the first person to suggest this, and I'm glad it's being discussed. There are benefits and drawbacks to either approach, merging or splitting; the current approach indeed makes it harder / less intuitive to find content on a specific lect, or tell whether or not a given (unlabelled) definition exists in a given lect or not, but it's more compact. - -sche (discuss) 08:53, 23 February 2025 (UTC)
- I will add that in the past I've seen people argue that splitting would result in less coverage of smaller lects, but I don't see how: surely all the information we currently have on them should be preserved in any split, and for that matter, I don't see why we couldn't retain any "unified" infrastructure (e.g. dialect maps) that editors found useful to maintain in a unified way; and any claim that it's easier to enter Hokkien [etc] information under the current system is [citation needed]. - -sche (discuss) 18:30, 23 February 2025 (UTC)
Strong support - shouldn't have been merged. Chihunglu83 (talk) 09:11, 23 February 2025 (UTC)
- I actually saw some problems in the current "Unified Chinese" representation:
- The "Traditional Han script" vs. "Simplified Han script" part didn't respect different Han simplification standards/facts - for example, "個"=>"个" is the Han simplification in Mandarin standard while "个"=>"个", "個"=>"個" (unmerged) is the Han simplification standard in Hakka, Hokkien, Wu.
- The current "Unified Chinese" implementation does not make clear whether a word is used only in Mandarin or merely lacks pronunciations in other Sinitic languages - this is the case for most entries with only Mainland Chinese Mandarin/Taiwanese Mandarin pronunciation written in the "Pronunciation" section.
- -- 2402:7500:586:3B29:0:0:34C5:81A6 11:42, 23 February 2025 (UTC)
- I agree with the view that the current treatment of Chinese is flawed (there have been multiple posts and discussions on this in the past years), and certainly it needs improvement. I should also note that the original 2014 vote is deeply flawed in its rationale, assuming that the main differences are in vocabulary and sometimes (quote: 1%) in vocabular (and later the proposer asserts in the discussions that there are zero grammatical differences between Sinitic languages, when in fact there are many).
- There are two ways to approach the problem, splitting or merging.
- Splitting Chinese up might seem straightforward, but there are outstanding problems on how the grouping should be done (it's known that the traditional or ISO groupings are problematic in certain parts, and often omit minor dialect groups e.g. She), and how deep we want to go in splitting up (e.g. should Southern Min be a macro-L2? Or should Hokkien, Teochew, Leizhou, and Hainanese each be an L2? What about marginal dialects that don't really fall under a proper grouping?).
- On the other hand, I'm not opposed to putting the entirety of "Chinese" under one L2 (if done properly) – but the current approach clearly doesn't work (arguably this is caused by Wyang having created the Chinese L2 by merging other lects into Mandarin). At the minimum we should distinguish between senses that are pan-Sinitic, or "MSC" (i.e. put {{lb}} onto every definition no matter what), and split classical/literary Chinese off. – wpi (talk) 16:38, 23 February 2025 (UTC)
- In my personal opinion Chinese shouldn't be an L2 at all, and all Chinese languages should be split into individual ones (by whatever classification seems best; for instance, a separate Dungan L2 not being poorly linked to a [China] Mandarin L2 would maybe be a good idea). However, I can understand why that would be a problem for the editors of Sinitic languages on Wiktionary, since that likely means years of work carefully splitting up the definitions and re-designing the entire infrastructure.
- So the main question in my opinion should be: Are our Sinitic editors (e.g. @wpi, Justinrleung, TongcyDai and others, forgive me if I've forgotten to ping anyone else, I'm not too familiar with our editor base) prepared to put in the work right now, or not? And in which domains? Thadh (talk) 17:08, 23 February 2025 (UTC)
- My opinion hasn't really changed much from what I have said in Wiktionary:Beer parlour/2022/March#Why are all Chinese varieties stuffed under one Chinese?. I do think there are trade-offs with either approach. I may be less opposed to splitting Chinese up than before, but I still think the value of the current infrastructure allows us to worry less about the fuzziness of boundaries among varieties and focus on the lexical items one by one. I guess this can be too much of an editor-centric convenience and really make it less useable for users. If we are to continue with the current format, labelling is definitely an issue that needs to be dealt with, especially with single-character entries. Another issue of the current format is the problem of Mandarin/"mainstream Chinese"-centric writing standards applied to other varieties, as pointed out above, rather than respecting regional variation. This is partially the problem of overusing {{zh-see}}, which often forces us to pick a "standard" form, even though sometimes this is a rather arbitrary process. — justin(r)leung { (t...) | c=› } 19:08, 23 February 2025 (UTC)
- When it comes to phonetic loan words into English, I treat the Cantonese, Hokkien and Mandarin derived words as if those varieties are the languages of origin. (I'd like to see y'all try to reverse that!) That is, there are no phonetic loans from "Chinese", only semantic loans from Chinese. I support division of the Chinese header. It is an inevitability that it will be divided, so I don't need to really push too hard. Geographyinitiative (talk) 19:45, 23 February 2025 (UTC)
- I'd like to raise several practical concerns.
- The first major question is the granularity of division. Even within Southern Min, we face complex decisions: should Teochew be in the same L2 as Hokkien? What about Longyan, which currently shares pronunciation module with Hokkien despite their limited mutual intelligibility? Similar questions arise for other varieties - Northern Wu alone could potentially be split into at least three L2s. Each decision to split one variety could create precedent for further divisions, potentially leading to a very large number of L2s with substantially duplicated content.
- This leads to the scale of the proposed changes. Given that you mentioned this would apply "at least" to Cantonese and Hakka as well, we're looking at restructuring over 300k entries (90k for Hokkien, 180k for Cantonese, and 32k for Hakka, among others). Before we could even begin such restructuring, we should ensure every definition is properly labeled with its variety (as wpi just mentioned) - a substantial task in itself. Do you have specific plans for managing such a large-scale reorganization? Additionally, how would we handle the numerous synonym templates that currently work across varieties? These modules are still of considerable linguistic/dialectological value even though they span multiple unintelligible variants, and splitting these could make them significantly more fragmented and harder to maintain.
- As volunteer editors, we need to be mindful of the long-term maintenance burden of any major structural changes. While the current system has multiple drawbacks, it provides a workable framework for handling the fuzzy boundaries between varieties and focusing on lexical items individually. TongcyDai (talk) 20:25, 23 February 2025 (UTC)
- I have strong concerns about splitting on the same lines as @TongcyDai. For some data points:
- We were unable to merge North and South Levantine Arabic (respectively 310 and 2,872 lemmas) due to the enormity of the task despite the fact that ISO merged them and that we had a specific request from the instigator of the ISO merge process to merge them here; he initially offered to help but then vanished once the scope of work was realized.
- @Theknightwho instigated a split of Min Nan maybe 2 years ago (?), which is still far from complete and currently stalled (and this didn't involve major reorganization of the infrastructure since all the resulting lects still sit under the Unified Chinese umbrella).
- I tried to propose a split and reorganization of the Yue lects along the lines of what we did with Min Nan but it stalled due to disagreements among the various Chinese editors over how to partition the Yue space into languages and general lack of will to carry out the resulting work.
- When @Vininn126 decided to re-merge Masurian (c. 750 lemmas) into Polish, it was decided easier to delete the entire language and start from scratch rather than try to merge the existing lemmas.
- It is true that splits, in my experience, are generally easier than merges, but in the one case where I was able to carry out a large split (Kurdish, with about 4,000 lemmas), it was helped enormously by the fact that Northern Kurdish and Central Kurdish generally use different scripts. In this case, the macro-language we're talking about has orders of magnitude more lemmas (c. 300,000) and everything is written in the same script. If we were unable to finish a much smaller split (the case of Min Nan) and couldn't even agree on how to split a subfamily of Chinese (the case of Yue), how are we going to have a prayer of carrying out such a task as splitting Chinese? This is even apart from the major concerns I have about potential duplication of data across potentially dozens or even hundreds of Chinese varieties (depending on how many separate L2's we end up with).
- I would instead suggest identifying the main pain points of the current organization and seeing how we can resolve them without throwing away the baby with the bathwater. Some examples:
- Links to Mandarin, Hokkien, etc. currently show up yellow because the corresponding pages usually only have a Chinese header, not a Mandarin or Hokkien header. We can fix that in Module:links with a system that, for example, redirects links for any Chinese lect that is written in Chinese characters to the Chinese header. (We can also consider a system where we actually check the page to see whether a specific lect header exists, but I have concerns about running up against memory or expensive-call limits. Maybe this is overblown though; @Theknightwho can comment more.)
- @JnpoJuwan complained that all the Chinese lect labels are under the zh code and don't work with any other code. I am already about to add family-level categories and I have considered family-level labels, which could solve this issue. We already have support for label handlers to display labels in a smart fashion as well as a Chinese-specific label handler that removes duplication when multiple labels of the same subfamily are given, and we can extend this so that e.g. the "Taiwanese Hokkien" label displays "Taiwanese Hokkien" when the language is zh but just "Taiwanese" when the language is Hokkien.
- We already have ad-hoc "lect" codes for several dozen written Chinese lects for use with {{zh-x}}. I have an existing proposal to replace these with proper etymology codes, but it stalled due to some disagreements about how to handle some of the edge cases. If we can resolve these disagreements, we can scrap the ad-hoc codes in favor of standard codes, which should simplify etymologies for terms borrowed into other languages and similar such things.
- We (meaning mostly TKW and I) have been gradually deprecating some of the Chinese-specific infrastructure in favor of using the language-independent infrastructure, which is generally more robust, more featureful and easier to maintain. We did this with {{zh-syn-saurus}}, {{zh-syn-list}} and mostly with {{zh-der}}; the next target is probably {{zh-abbrev}}. This can be continued.
- Benwing2 (talk) 21:08, 24 February 2025 (UTC)
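For illustration only, the context-dependent label display described in the list above could be sketched roughly as follows. This is a minimal sketch in Python rather than the Lua the actual modules use; the LABEL_PARENT table, the display_label function and the example labels other than "Taiwanese Hokkien" are hypothetical and are not taken from any existing Wiktionary module.

```python
# Hypothetical mapping from a lect label to the language it belongs to.
# Illustrative only; not taken from any real Wiktionary data module.
LABEL_PARENT = {
    "Taiwanese Hokkien": "Hokkien",
    "Singapore Hokkien": "Hokkien",
    "Hong Kong Cantonese": "Cantonese",
}


def display_label(label: str, section_language: str) -> str:
    """Return the text to show for `label` inside a `section_language` section."""
    parent = LABEL_PARENT.get(label)
    suffix = " " + parent if parent is not None else None
    if suffix and parent == section_language and label.endswith(suffix):
        # The section header already names the language, so drop the
        # redundant trailing language name from the label.
        return label[: -len(suffix)]
    # Under any other header (e.g. the unified "Chinese" one) keep the
    # full label so the reader can still tell which lect is meant.
    return label


if __name__ == "__main__":
    print(display_label("Taiwanese Hokkien", "Chinese"))
    print(display_label("Taiwanese Hokkien", "Hokkien"))
```

Under these assumptions the same label data could serve both a unified Chinese header and a separate Hokkien header, which is the point of the suggestion; how the real label handler would be extended is not specified in the discussion.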
- I will add that one of the biggest reasons was orthography and also the source used, which covered two neighboring (but very different) dialects. Vininn126 (talk) 21:10, 24 February 2025 (UTC)
- If we are allowed to just start splitting off entries without actively being merged into Chinese no matter the script, then I am willing to just start gradually doing that. Kisaragi Hiu (talk) 06:39, 19 March 2025 (UTC)
- No, please do not do that. You need to get consensus before going against the current policies, and there does not appear to be any such consensus at the present time. Benwing2 (talk) 07:07, 19 March 2025 (UTC)
- I've been of the opinion that Chinese shouldn't have been merged, but alas, trying to split it now would be way too daunting of a task. I do have two main thoughts though:
- I do believe that historical lects should be split out, as @Wpi brought up. The way that it's set up now is a mess when it comes to descendants, as I've mentioned since 2022. Chinese 筆 / 笔 (bǐ) and 白菜 (báicài) are some of the main culprits. The former is entirely unclear as to what descendants come from what historical lect, and uninformed readers could assume that everything under "others" comes from Modern Chinese! Similar thing with 白菜 (báicài), it doesn't make clear which entries come from anything other than Sino-Xenic & Early Mandarin. The English descendants make it even more clear: bok choy comes from Cantonese 白菜 (baak6 coi3), pechay comes from Hokkien 白菜 (pe̍h-chhài), baicai from Mandarin 白菜 (báicài), it's not clear at all, and is fairly misleading when compared to the etymology sections of the descendants. I also feel that it obscures inter-lect borrowing when the term is spelled the same. I don't believe that there are only 9 Cantonese terms borrowed from Mandarin. Same with the weird way we handle Chinese 麥當勞 / 麦当劳 (Màidāngláo) and its etymology and descendant Cantonese 牡丹樓 / 牡丹楼 (maau5 daan1 lau4). It says that the latter is borrowed from the former in Mandarin, which in turn is from Cantonese, but 麥當勞 / 麦当劳 (Màidāngláo) does not make this clear at all, and doesn't even list the descendant. Something needs to be done, as it's harming the way we present information. CC: @Benwing2
- Additionally, I am a bit concerned about the discrepancy in the number of usage examples & quotations and overall coverage between Chinese lects, as wpi and @Justinrleung brought up. With merges like this, the "main" lect, for lack of a better term, tends to almost completely eclipse the other lect when it comes to usage examples, since they could be seen as nonstandard or almost unworthy of usage example creation. This is made even more evident in the case where the vast majority of terms are spelled the same way across lects. Ex: Hakka only has 60 terms with usage examples, with many, if not most, of them being only found at Hakka-specific senses. Imho having separate L2s for Chinese lects could incentivize more dedicated coverage to the smaller ones, if there are editors willing to work on the effort. It's worked very well for Jeju, as the coverage we have now would not have been possible if not for it being a separate L2. (That being said, the typical language vs dialect issue still applies, I'm not saying that an L2 should be made for every dialect out there) Maybe the macro-L2 idea could work.
- That being said, I don't speak any Chinese lect, but I'd be willing to help out if needed, since I do think that this would be a net benefit for users in the long run. AG202 (talk) 06:49, 25 February 2025 (UTC)
- @AG202 What was your ping in reference to? Can you expand? As for the issue concerning discrepancy of usage examples and quotations, I think that's inevitable when you have one dominant lect among many. Compare Arabic, which is handled in exactly the opposite fashion (one L2 for every lect), and where almost all lects other than MSA and Maltese are sorely lacking in every way. (In fact I would use Arabic as a good cautionary tale of what happens when you have too many splits.) As for historical Chinese lects, I was a bit surprised myself to see them merged under the Chinese header; possibly they could be split out, but that would be a lot of work and would need a really well-thought-out and fleshed-out plan of action before we proceed. (Min Nan didn't have that which is part of the reason it's sitting in a stalled half-split stage.) Benwing2 (talk) 07:01, 25 February 2025 (UTC)
- Sorry, the ping was specifically in reference to the historical lects section. And as for Arabic, yeah I've seen that and I do think that a middle ground could be found between the two extremes. AG202 (talk) 07:24, 25 February 2025 (UTC)
- @AG202 I don’t have any strong feelings on whether the contemporary varieties of Chinese should be split or not, but splitting the historical forms of Chinese is neither feasible nor desirable.
- My understanding is that Old Chinese and Middle Chinese are essentially phonological constructs that do not correspond one-to-one with any attested written language. The written language itself existed as a spectrum between the exemplary classical Warring States models and the dominant vernacular of the period, so that all “Old Chinese” structures and lexemes, even if obsolete in the spoken language, could be used in writing in the right context. There are plenty of late imperial texts that partly imitate even the style of the Shijing, from 700 BCE or earlier. And what about texts that are in a perfect mix of vernacular Early Modern Mandarin and Literary Chinese, or texts that are mostly Literary but use vernacular terms for effect?
- The best way to deal with this is to use {{datedef}} more extensively, not to split the languages. —Saranamd (talk) 09:07, 25 February 2025 (UTC)
- @Saranamd: Technically the "conventional" way of handling Old Chinese would be to transport it to the reconstruction mainspace as a purely phonological reconstruction of the attested Sinitic varieties. I think that is best if we split, but if we don't, it's pretty worthless. But essentially, this is what it is, a Proto-Sinitic reconstruction that happens to be attested in a logographic script. Thadh (talk) 11:58, 25 February 2025 (UTC)
- @Saranamd: Unfortunately, {{datedef}} does not solve the problems I mentioned. And if Old Chinese & Middle Chinese can't be separated out, could we at least separate out Classical & Literary Chinese? AG202 (talk) 19:27, 25 February 2025 (UTC)
- @AG202 I think the descendants section is honestly the least important section of a well-attested language, since it pertains entirely to other languages. What matters most for a dictionary is the definitions section, the quality of which will be severely impaired by splitting Literary Chinese and Standard Written Chinese because the two written languages even now exist in a continuum. Even today, virtually any Literary Chinese term can be used in SWC in the right (historical or literary) context, and of course splitting the two would be even more impossible for older written forms of Mandarin. Any split would lead to massive duplication of definitions.—Saranamd (talk) 05:42, 26 February 2025 (UTC)
- Okay but we have the Descendants section, so something needs to be done about the major confusion that exists currently. We can’t just hand-wave it away. Otherwise there’s no point in having Descendants sections in the first place. AG202 (talk) 06:36, 26 February 2025 (UTC)
- @Saranamd: While OC and MC are phonological constructs, I strongly disagree that Classical Chinese can't be split out. (and if we do treat Classical Chinese separately, OC and MC should be placed under it due to the time period)
- Although Classical vocabulary can still be used within modern texts and dialects, the grammatical structure of Classical is fossilized and cannot be altered (this often also applies to non-grammar words, which creates fossilized idioms i.e. Category:Chinese four-character idioms), and some constructs like anastrophe and 互文 no longer work.
- There is also a very clear dividing line (New Culture Movement and May Fourth Movement) which marked the change from Classical Chinese to (early) MSC. – wpi (talk) 05:42, 26 February 2025 (UTC)
- @Wpi What about Baihuawen texts that incorporate classical constructions extensively—are they Mandarin or Literary Chinese? Do we say that a single chapter in the same novel alternates freely between two different Wiktionary languages? What about Baihuawen or mixed Baihuawen-Wenyanwen texts written in Korea which were always read out as Sino-Korean, or Tang-era vernacular texts that really cannot be called Mandarin? My impression is that the clear dividing line only looks clear from the vantage point of today, and when we get down to the historical sources it’s much less clear.
- Furthermore, as a dictionary and not a grammar, the lexicon is what matters most. There are many languages where the literary and colloquial varieties differ in important grammatical structures but much less in vocabulary, and where the colloquial variety can borrow freely from the literary variety. Splitting harms the functionality of the dictionary when there is no lexical dividing line between the two varieties.—Saranamd (talk) 05:51, 26 February 2025 (UTC)
- And even from a lexical viewpoint, given that the death of Literary Chinese was not spontaneous, there are plenty of texts that use late nineteenth- and early twentieth-century neologisms in a mostly Classical grammatical framework. So even words like 自由 (zìyóu, “liberty”) or 民主主義 / 民主主义 (mínzhǔzhǔyì, “democracy”) could be said to be “Literary Chinese” words.—Saranamd (talk) 05:58, 26 February 2025 (UTC)
- @AG202: Fully agree with both points. Regarding point #1, I think it would definitely be helpful to list all descendants from OC (including internal ones if appropriate). (previous failed discussion). As for usage examples, I agree the uxes and quotations are heavily focused on MSC, but I'm also concerned about duplication of collocations, for example sense 2.4 of 落 repeats 落車 twice (and if more collocations are added, there will only be more duplication, which arguably is the thing that "we" originally tried to avoid).
- (Category:Cantonese terms borrowed from Mandarin should only include phonological borrowings – there's probably quite a bit more, but I reckon it's less than 100, perhaps maybe in the low hundreds (?), so not super far off) – wpi (talk) 16:32, 25 February 2025 (UTC)
- Spitballing an idea for testing the feasibility of a split: (bot-)duplicate Chinese entries to subpages of some project or userspace page (e.g. WT:Chinese split demo/天, WT:Chinese split demo/馬, etc), making whatever tweaks are needed to let our modules also function in that new non-mainspace place, and then apply whatever tactics would be used to split Chinese, to those pages: e.g. if someone is prepared to write a bot to go through and split out separate Mandarin and Hokkien L2s for all the pages that have Mandarin and Hokkien pronunciations, then have the bot do that to the project/userspace pages. Start manually (or automatedly) adding Hokkien usexes. Etc. (Or, instead of duplicating all entries and then starting to modify them, only duplicate entries when modifying them, e.g. only duplicate 天 into the user-/project-space at such a time as you're splitting it up into different L2s.) If it proves feasible to split the pages in user-/project-space, then either the same techniques can be used to split the mainspace pages, or the project/user pages can be moved to mainspace. If the project proves infeasible and gets abandoned, the pages can be (bot-)deleted en masse. - -sche (discuss) 18:22, 25 February 2025 (UTC)
- @-sche Although I respect your judgment greatly, I'm a bit concerned that you're suggesting something like this in a "spitballing" kind of way. Splitting Chinese would be an enormous task, and before even beginning on something like this, particularly splitting the main body of modern-lect definitions into separate L2's, we'd need (a) buy-in from a large majority of Chinese editors, (b) a detailed plan about how to proceed with some estimates of how much work this would involve. Simply spitballing a proof-of-concept like this without either buy-in or a plan would, in the best case, waste a lot of someone's effort once it gets abandoned, or in the worst case, create a major fork in Wiktionary, essentially splitting the Chinese editing community, with resulting mutual animosity, and forcing people to either choose to contribute to one or other fork or double their effort by contributing to both. Something like this would likely take several person-years of effort at least, meaning we'd potentially have a long-lasting fork hanging around causing innumerable problems. If we're really serious about splitting of some sort (which I don't at this point see the buy-in for), it would be more practical to split out certain smaller chunks (e.g. historical lects, although @Saranamd has several issues with this) rather than trying to split the whole thing at once. Benwing2 (talk) 05:57, 26 February 2025 (UTC)
- Oh, I certainly don't mean to suggest that one person should unilaterally do this right now! I mean to bring the idea up for discussion here to see if anyone thinks it'd be a good idea. My rationale is that a fair few people seem to agree that merging all of Chinese was inappropriate, but a fair few people also agree that splitting Chinese has the potential to create a mess while the split is in progress, so this struck me as an idea for a possible way to determine / demonstrate the feasibility (or infeasibility) of a split, iff enough users want to try it. It does occur to me that, in the vein of my second idea (only duplicating entries as they're modified, instead of duplicating everything and then modifying it), duplicating just a random thousand Chinese entries might provide a sufficient testbed for people to try splitting techniques on. (Or perhaps people don't even need to actually modify entries but can just post what their code would do.) I'm trying to think, since many people think "splitting has the potential to create a mess while it's in progress, and it might not finish" is a blocker to trying to split, of ways people could determine / demonstrate the feasibility of splitting. - -sche (discuss) 17:45, 26 February 2025 (UTC)
- But the problems are more fundamental than just "splitting has the potential to create a mess while it's in progress, and it might not finish". First of all it's not at all clear to me there's even consensus to split, and secondly no one has even remotely come up with a feasible plan for what a split-Chinese system would look like that would be demonstrably better than what we have. Plenty of people have complained about the deficiencies of the current system, but no one has proposed any workable alternatives, particularly concerning splitting the modern lects. The only proposal I see so far is coming from @AG202, about historical lects only. I would be strongly opposed to a split that put every "mutually incomprehensible" lect (however we define that) under its own L2; this is effectively what we did with Arabic, following the ISO splits almost to the letter, and the result IMO is absolutely not any better than the current Chinese system. I don't want to be a party pooper but I really think people are both underestimating the magnitude of the task and failing to appreciate the serious problems that are likely to ensue if a split is begun in a willy-nilly fashion, without a detailed plan of operation that has strong consensus behind it. It's kind of like we're jumping right into talking about doing open-heart surgery before we've seriously considered all the less-invasive options, or even enumerated what the actual problems are. Benwing2 (talk) 22:09, 26 February 2025 (UTC)
- Question: how do other Wiktionaries handle Chinese? (Do any split it?) zh:天 seems to merge everything under one Chinese L2, and so does fr:天, despite fr.Wikt splitting a lot of other lects (especially any that ISO assigned separate codes). de:天 doesn't even have a Chinese section. (Some of those wikis are borderline-unusable in dark mode, as an aside.) - -sche (discuss) 18:22, 25 February 2025 (UTC)
- @-sche: It seems like fr.wikt does have separate headers (at least for Cantonese) as in fr:德國. AG202 (talk) 19:30, 25 February 2025 (UTC)
- In my experience zhwiktionary tends to follow us quite closely, so it's no surprise to see it merging Chinese like we do.
- As for dewiktionary, they only have 356 Chinese entries and their coverage of the language seems to be in a rudimentary state (de:Project:Chinesisch opens with the truism "Das Chinesische umfasst verschiedene Varietäten").
- As for other large Wiktionaries, ruwiktionary does appear to split Chinese (see ru:烏鴉 for instance) but the coverage of non-Mandarin lects is so limited it's difficult to know what direction they have chosen, and jawiktionary seems to merge Chinese like us. This, that and the other (talk) 12:54, 26 February 2025 (UTC)
- Regarding the valid point above that some aspects of our Chinese infrastructure (like synonym templates) are "of considerable linguistic/dialectological value even though they span multiple unintelligible variants, and splitting these could make them significantly more fragmented and harder to maintain": iff people think such things are useful, and iff people also want to split Chinese, couldn't we just keep those templates and not split them, even if we give lects their own L2s? Make whatever tweaks are needed to let those templates/modules handle full language codes (not just dialect codes) and link to different L2s, but if we think that "showing synonyms from (certain) other, mutually-unintelligible lects" is useful, why not just keep (the infrastructure that is) doing it? I don't see why a split would necessarily have to throw out any usefully-centralized infrastructure. Sure, it might make Chinese lects a somewhat special case if they linked to each other despite having different L2s, but it's already a quite special case (merging varieties under one L2 despite them being mutually unintelligible), it can't get any specialer. (I've (rarely) put links to different languages/L2s in the "see also" sections of a few entries. If it's useful, why not do it?)
If the issue is that some information is not being stored centrally, but is being input on each entry, and might have to be input in each L2 section if we were to split Chinese, then let's evaluate whether there are feasible ways of centralizing, whether that's putting a "see list at X" notice in each L2 other than one — say, the alphabetically first, or the Standard Mandarin Chinese entry — or whether it's moving the lists to one centrally-editable and transcludable place, the way we have e.g. usage notes that are transcluded across multiple entries. But I don't see why "linking between different varieties is useful" would block splitting, iff (again: iff) people want to split, or at least want to play out what splitting would look like and entail. - -sche (discuss) 17:50, 26 February 2025 (UTC)
- There is precedent in using {{dial syn}} across multiple L2s (e.g. Yoruba), and there is precedent in having modules being shared between related languages (e.g. Module:Jpan-headword). I think the point about throwing out existing infrastructure is a non-issue. (Some minor changes will definitely be needed but those will not be significant) – wpi (talk) 19:04, 26 February 2025 (UTC)
- Also with Koreanic as well, ex: at Korean 가위 (gawi), which is why we renamed the title to "Historical and regional synonyms". AG202 (talk) 21:25, 26 February 2025 (UTC)
- I strongly oppose having a "see list at X" notice, which is just equivalent to merging but worse. Theknightwho (talk) 20:55, 26 February 2025 (UTC)
Adding and organizing disparate definitions
Please disregard this, I didn't look closely enough to realize that they were split due to perfective and imperfective uses. This answers my question.
I was going to add a very common definition for the Ukrainian word виходити (vyxodyty). This word is more commonly used to describe the act of getting married than it's used for several of the other definitions present. What I'm not sure about is whether it should be appended to one of the four separate definition divisions present now, or a fifth one should be added. I don't understand what the logic is for dividing the verb in the first place since they all have identical spellings, stress, and (I believe) etymology.
If it's currently divided this way simply for readability, then the first definition should probably be divided into at least two definitions (various emerging definitions, and definitions surrounding conclusion).
I would love some advice about this entry specifically, as well as general discussion about the ethos of when to divide these kinds of words. Proudlyuseless (talk) 22:35, 27 February 2025 (UTC)
street shitter needs protecting
2A00:23C5:FE1C:3701:183C:98AD:BA8B:4A49 19:49, 1 March 2025 (UTC)
(See here for previous discussions.)
In light of ongoing doubts about what ‘surface analysis’ actually means, I propose replacing the template with {{af+}} with the text ‘derivable from X + Y’.
Reasons for the phrasing ‘derivable from’:
- It’s simple to understand
- Avoids scientific jargon like ‘synchronic’ or ‘morphologically’ and Wiktionary jargon like ‘surface analysis’
- It’s unambiguous
- To say that English boldly is derivable from English bold and -ly is to say that those elements are combinable synchronically (that is, in English) to produce boldly.
- Meanwhile, ‘equivalent to’ is vague enough that people use it both for synchronic combinations, such as bold + -ly, and for fanciful long-range correspondences, such as ‘[month is] equivalent to moon + -th’. The latter is incorrect both synchronically (the combination would make *moonth, and -th doesn’t combine with nouns like moon anyway) and, as it happens, etymologically as well (the ending of month is not actually cognate to -th).
Thoughts? Nicodene (talk) 01:59, 2 March 2025 (UTC)
- "Derivable" is as much jargon as "surface analysis", whose parts do not appear to be Wiktionary jargon at all – "surface" is a term I've encountered in various linguistics courses (and the same ones that taught me "synchronic"), and "derivable" not that I remember. "Derivable" is also just as ambiguous to me for any of the usages under the second bullet. I have less issue with the phrase "synchronically derivable" (but again more jargon). Any term we choose will link to a glossary for further information and eventually confuse some reader, and be applied in fuzzy or fanciful situations by some user, so I don't see what any of this truly achieves that can't be helped by slightly stronger documentation and fixing the small number of mistakes. (Do we really get anywhere by claiming terms are Wiktionarisms?) The status quo seems fine, no support. Hftf (talk) 02:44, 2 March 2025 (UTC)
"Derivable" is as much jargon as "surface analysis", whose parts do not appear to be Wiktionary jargon at all – "surface" is a term I've encountered in various linguistics courses (and the same ones that taught me "synchronic"), and "derivable" not that I remember.
- Anyone who has finished school can understand derivable as it is intended here. Surface analysis is so opaque that there is constant confusion about what it’s supposed to mean (cf. the linked threads).
whose parts do not appear to be Wiktionary jargon at all
- There is no such concept in linguistics as ‘surface analysis’. There is such a thing as word derivation.
"Derivable" is also just as ambiguous to me for any of the usages under the second bullet
- No, montage is not ‘derivable’ from mount + -age (> *mountage) in any sense.
Any term we choose will link to a glossary for further information
- Derivable does not need a glossary entry at all. That is actually a point in its favour that I hadn’t mentioned.
- Nicodene (talk) 03:35, 2 March 2025 (UTC)
- Here are some entries to prove the inconsistency: singer, action, and clarity. I was expecting a "surface analysis" in the 1st and 2nd ones, but only the 2nd contains it. The 2nd and the 3rd contain "Equivalent to", tho the former is an obvious combination of sing and -er, and the latter derives from clear and -ity, despite not being *clearity, because:
- Therefore, we can deem clar- an allomorph of clear- in the noun (as well as in clarify). Alternatively, if this derivation is unclear (pun intended), I suggest using "Semantically" or something like it, which is what the current etymology of clarify gives. Davi6596 (talk) 02:51, 2 March 2025 (UTC)
- Singer and action seem straightforward enough. Clarity < clar- + -ity would require adding a rule for the combining form clar- like ‘only appears with latinate suffixes’ (cf. clear + -ness > clearness, not *clarness). That seems doable. Alternatively, we could just list clarify, clarity, claritude as related terms under clear, and vice-versa. Nicodene (talk) 04:54, 2 March 2025 (UTC)
- Strong oppose. Derivable is even more opaque and jargony than the status quo; derivable how? What does that mean or imply? "Analyzable" is much clearer in that regard because it doesn't awkwardly skew the relationship (the lemma already exists and is analyzable in some way synchronically; nothing is derived synchronically), but at that point, why is "surface analysis" any less clear? Calling that Wiktionary jargon is quite silly, really; as Hftf commented above, this usage of "surface" or "surface form" is nothing atypical. Anyone familiar with literature would understand "surface analysis" to mean by analysis of the (synchronic) surface form. This is not to say I am not open to other viewpoints, but you happen markedly to be the only person I have seen consistently taking issue with this. 🌙🐇 ⠀talk⠀ ⠀contribs⠀ 07:32, 2 March 2025 (UTC)
Derivable is even more opaque and jargony than the status quo; derivable how? What does that mean or imply?
- It means boldly can be derived from bold by suffixing the latter with -ly. What’s not to understand?
Calling that Wiktionary jargon is quite silly, really; as Hftf commented above, this usage of "surface" or "surface form" is nothing atypical.
- ‘Surface analysis’ is, again, not a concept in linguistics.
"Analyzable" is much clearer
- ‘Analyzable’ has the same problem as described above for ‘equivalent to’, namely that its vagueness leads people to use it for both synchronically valid combinations and longer-range etymological connections.
because it doesn't awkwardly skew the relationship (the lemma already exists and is analyzable in some way synchronically; nothing is derived synchronically)
- ‘synchronically derived’: Google Scholar, Google Books
why is "surface analysis" any less clear? Anyone familiar with literature would understand "surface analysis" to mean by analysis of the (synchronic) surface form.
- The fact of the matter is that people keep finding it confusing. Perhaps part of the reason is that familiarity with the literature does not translate into familiarity with a term that does not exist in the literature.
you happen markedly to be the only person I have seen consistently taking issue with this.
- The previous discussions show plenty of others in favour of one alternative or another to ‘surface analysis’.
- Nicodene (talk) 08:44, 2 March 2025 (UTC)
- Nevermind, I have managed to find actual examples of ‘surface analysis’ being used like this. I stand by the rest of my points but am no longer very inclined towards change. Nicodene (talk) 10:07, 2 March 2025 (UTC)
"montage is not ‘derivable’ from mount + -age": that’s why we just call it a surface analysis or equivalent to it. Though there be absent a strict concept in linguistics, for pedagogic concern this is perfectly valid in language acquisition and language science acquisition: human memory may give structure to itself like this, and even to the resulting descriptive language—due to its linearity—, evening out the finicky detail that montage is actually > *mountage. Besides I point out that your argument withsaid itself by once insisting on linguistic usage and then eschewing its jargon; such recommendations confuse writers even more than they do readers. Fay Freak (talk) 11:39, 2 March 2025 (UTC)
- I’m OK with the status quo or “analysable as”. Not keen on “derived from” because we also use that phrasing in the main part of the etymology sometimes. — Sgconlaw (talk) 14:25, 2 March 2025 (UTC)
- My understanding is that "surface analysis" says that the claim is being made without reference, but because the division is self-evident. It serves a useful purpose in that readers and editors are made aware that it reflects the opinion of the editor, which might often be right, but is worthy of further research if there's any doubt. Proudlyuseless (talk) 21:07, 8 March 2025 (UTC)
- Well, there's a substantial Venn overlap: synchrony does not equal folk etymology, but folk etymology is often synchronic. The synchronic view of earthen or biology as being built from affixation is not a hypothesis that might be wrong but rather the duck instead of the rabbit (the synchronic viewpoint versus the diachronic viewpoint); neither the duck nor the rabbit is false, and neither is conjectural. But you're right that the overlap with folk etymology is a good point, though, because speakers of natural languages rely on synchrony in a crucial way. Fluency doesn't come so much from diachronic trivia knowledge as from synchronic analysis on the fly. Regarding the topic of this thread, namely, how to dumbmaxx it all (i.e., how to dumb it all down to the truly ultimate degree), there's no final One Right Answer but rather the sound of one hand clapping or a tree falling where there's no one to hear it. Wiktionary's current word choice for this aspect is as pedagogically helpful as any of its alternatives are, so it may as well remain. Quercus solaris (talk) 22:07, 8 March 2025 (UTC)
- EDIT: I'm realizing I'm in way over my head in this discussion, I haven't been editing for very long. I'll leave this to the editors who have a better understanding of both linguistics and wiktionary.
- I'm not sure exactly what you're referring to when you're talking about folk etymology. Folk etymology isn't a form of analysis, it's a description of a process in the evolution of language. What I was talking about is a tool that we use often in the Ukrainian corner of the dictionary, where we use the surface analysis template for words whose origin is often self-evident due to frequent use of affixes. Because those often drastically shift the meaning of a word, it is essential to put in a link to the stem in the etymology, but the transformation is so elementary that there is no point in finding a reference for the affix, if one actually exists. It could be done with a more generic combining template, but indicating that it was by surface analysis assures it's not mistaken for gospel. Proudlyuseless (talk) 03:53, 9 March 2025 (UTC)
- Yeah no. I found surface analysis to be perfectly transparent and easily understandable. I really don't think "derivable" is an improvement. MedK1 (talk) 12:49, 25 March 2025 (UTC)
Protectedpagetext
Wikipedia and other Wikimedia wikis show the protection level when editing a protected page. However, Wiktionary uses the default message, which does not tell you what level it is (semi, auto patrol, admin). Heyaaaaalol (talk) 23:51, 2 March 2025 (UTC)
- I'm not sure that I understand the problem. What would you like to be different? —Justin (koavf)❤T☮C☺M☯ 17:26, 3 March 2025 (UTC)
- Discussion moved to WT:RFDO.
Coincident verbs
(Or so Google AI Overview tells me that the specific name is for this type of irregular verb.) I'm referring to those whose simple past tense and/or past participle are the same as the infinitive, such as hit, put, cost. Apparently the Wiktionary rule is that entries for such verbs are not to include an extra section defining the past-tense or past-participle use separately from the infinitive; i.e., we never put a section that looks like
Verb
Is that indeed the rule? I can see why such a section could be considered superfluous – after all, the simple past and pp are easy to find, right there in boldface at the top of the definition list for the infinitive. But omitting it seems a little inconsistent, given that every derived form that differs from its root even by appending a single letter (e.g., puts) has its own full-fledged entry – even though these forms are also easy to find in their respective root-word entries. — HelpMyUnbelief (talk) 22:18, 5 March 2025 (UTC)
- Nobody calls them Coincident verbs. Your AI failed. Lfellet (talk) 23:29, 5 March 2025 (UTC)
- In general we don't tend to put a "form-of" entry alongside the lemma if the forms coincide in spelling. Form-of entries are purely a navigational aid - if you've made it to the put entry, you've got to where all the info is. This, that and the other (talk) 03:24, 6 March 2025 (UTC)
- As TTO says, such "form-of" entries are often omitted for English, though one finds exceptions, e.g. read (perhaps because the pronunciations differ and the etymologies are also distinguishable? ... but we aren't consistent, because I think I've seen entries where e.g. the plural was pronounced differently but didn't have its own section). Other languages (de facto) handle things differently, e.g. Latin entries often have such sections when the inflected forms have different macrons, even though that could just be shown via the declension table and via notes in the pronunciation section. (E.g. aquaria#Latin has "Pronunciation 1" and "Pronunciation 2"; mascula#Latin has only one pronunciation section with notes, but separate adjective sections...) - -sche (discuss) 17:01, 6 March 2025 (UTC)
what goes in a Template:place category?
I'm doing a bunch of {{place}} cleanup and we need to nail down what belongs in e.g. Category:en:States of the United States or Category:ru:Countries in Europe and what doesn't. To this end I just added a |nocat=1 param to {{place}} and {{tcl}} so you can get the description without the category. Obviously the canonical form of a polity or subpolity (state, country, etc.) belongs, but there are lots of other forms whose meaning is the same. My general thinking is that any variant in common, current use that refers to a given polity or subpolity belongs, but things that are rare, dated, archaic, obsolete, etc. don't, nor do former entities that no longer exist. More specifically:
1. Former polities and subpolities DO NOT BELONG (e.g. Yugoslavia, Czechoslovakia, West Germany, the Soviet Union, the Kingdom of Serbs, Croats and Slovenes, etc.). We have separate categories for such things. This is important and keeps the primary categories from becoming a sloppy mess of current and former entities, esp. in places like Europe where borders and names have changed frequently.
2. Abbreviations in current use probably do belong; hence e.g. AZ and Ariz. go into Category:en:States of the United States. Same goes for clipped forms like Cal and Cali for California. An alternative is to segregate them into something like Category:en:Abbreviations of states of the United States.
3. If there is a shorter (elliptical) and longer form, both belong. Hence, Washington and Washington, D.C. both go in Category:en:National capitals.
4. Alternative forms still in use belong. Hence, Latin Carolina Australis, Carolina Meridiana and Carolina Meridionalis all appear to be valid ways of saying "South Carolina", so all three go in Category:la:States of the United States.
5. Forms in alternative scripts probably do belong, as long as the script is still in use by speakers of the language. Hence, Azerbaijani Cyrillic Австрија "Austria" goes into Category:az:Countries in Europe along with the more common Latin-script Avstriya. However, the Japanese ateji spelling 墺太利 for Austria does NOT belong in Category:ja:Countries in Europe since it's specifically indicated as obsolete; it seems Japanese speakers no longer use such forms, preferring katakana forms like オーストリア. This would logically mean that Vietnamese Han forms like 比 for Belgium should not belong in Category:vi:Countries in Europe since Vietnamese Han forms are no longer in use. Korean Hanja forms are more of a gray area; they are passing out of use but my instinct is to still include them for now. Thoughts?
6. Romanizations such as Monako in Japanese for モナコ do NOT belong because they are not commonly used by Japanese speakers themselves, only by foreigners.
7. Consistent with the above principles, terms written in superseded spelling systems do NOT belong unless the superseded spelling system is still in use because the new system hasn't been completely accepted.
   - Examples of the former kind (superseded which don't belong): Russian pre-1917 forms (Австрія instead of Австрия "Austria"); any pre-any-reform Portuguese form that is superseded everywhere (Belgica instead of Bélgica "Belgium"; this can be tricky because some spelling reforms caused some forms to be superseded only in certain Portuguese-speaking countries); any pre-1996-reform German spelling; Indonesian pre-1945 or pre-1972 spellings like Djerman in place of Jerman "Germany".
   - Examples of the latter kind (superseded which do belong): pre-1990 French forms, since the 1990 spelling reform has not been universally accepted (and even in Wiktionary we lemmatize at the pre-1990 forms); probably also pre-2007 Tagalog forms like Pinlandya in place of Pinlandiya "Finland"; here I don't know for sure, but 2007 seems pretty recent for a spelling reform to have been universally accepted. Similarly for any spelling reform promulgated 2010 or later.
8. Clearly dated forms probably do NOT belong, but this is a bit of a gray area. For example, Pennsylvanien is indicated as a dated-to-archaic German variant of Pennsylvania and hence doesn't belong, but for Farsi نمسا "Austria", which is merely indicated as "dated", I'm not sure. My instinct is to not include, but this may be wrong. Part of the problem here is that "dated" can mean different things for different people and languages; e.g. the Japanese ateji form 仏蘭西 "France" is given as "dated" but yet the equivalent form for Austria is given as "obsolete"; I suspect there isn't actually a significant usage difference here, more just different editors labeling things differently.
9. Things still in use in a restricted non-archaic register (such as poetic or elevated speech) probably DO belong, but I'm not sure. Example: Chinese 法蘭西 "France", listed as {{lb|zh|attributive|dated|or|poetic}} and with quotes from 2022 and 2024. Here, the "poetic" label suggests it's still in use in elevated speech (although "attributive" gives me pause, as we don't normally include adjectives meaning "French" in "Countries in Europe" categories).
10. Official names of countries still count as countries for categorization purposes, I think; hence United Kingdom of Great Britain and Northern Ireland goes in Category:en:Countries in Europe. An alternative is to segregate them into a different category, such as Category:en:Official names of countries in Europe.
11. Nicknames probably do NOT belong; e.g. Big Apple for New York City, Bel Paese for Italy.
12. Derogatory terms, being nicknames, likewise probably do NOT belong. Example: English Oklahomo for Oklahoma (in truth I've never ever heard this term, but it's in Wiktionary ...) or Chiraq for Chicago (the name of a Spike Lee film).
13. Misspellings generally do NOT belong. Example: Dutch Belgie, misspelling of België "Belgium".
14. Inflected forms do NOT belong, only the base form. This comes up especially with Albanian and Aromanian, where the definite forms often use {{place}}. Example: Bosnja dhe Hercegovina, definite form of Bosnjë dhe Hercegovinë.
I'm sure there's something I've forgotten.
Pinging some random users who have participated in previous similar discussions on-wiki and in Discord: @-sche, @Chuck Entz, @Theknightwho, @LunaEatsTuna, @AG202, @CitationsFreak. Hopefully this is less controversial than my previous post Wiktionary:Beer_parlour/2025/January#what_counts_as_a_"country"? because there is less of a political element here. Benwing2 (talk) 03:49, 6 March 2025 (UTC)
- @Benwing2: Support on all points, except maybe 11, but there's little way to have 11 without having 12, so I'm fine with both being excluded. AG202 (talk) 02:27, 7 March 2025 (UTC)
- Support on all points except 11. It seems to me there is a very fine line separating 11 from 2, 3, and 4. Is the "City" a nickname for or an elliptical form of the City of London? 🌙🐇 ⠀talk⠀ ⠀contribs⠀ 23:54, 7 March 2025 (UTC)
- Yes, there are cases where it may be hard to distinguish abbreviations or elliptical forms from nicknames, but it feels like these are more edge cases than the norm. "the City" in particular can be used to refer to quite a number of major cities in the right context, and feels like something that doesn't belong for this reason. Benwing2 (talk) 00:18, 8 March 2025 (UTC)
Nicknames listed as synonyms
I'm new to Wiktionary, so it's possible I'm beating a dead horse or posting in the wrong location. However, I'm curious about how we reconcile listing (often derogatory) nicknames as synonyms for names. For example, Trump has Cheetolini listed as a synonym among a dozen or so other nicknames. Meanwhile Zelenskyy has green goblin listed as a slang synonym. I do not see this for Obama, although Obummer is listed as a "derived term". Is listing these as synonyms, especially with no indication of them being nicknames or slang (in some cases), not a violation of WT:NPOV? Alxeedo (talk) 03:21, 7 March 2025 (UTC)
- Well, they are synonyms. They should ideally be marked with qualifiers, though. I don't see how NPOV plays into this, as this kind of thing happens all the time for regular words, too. Just because we have arse as a synonym of buttocks doesn't mean we have a biased point of view against butts. 🌙🐇 ⠀talk⠀ ⠀contribs⠀ 23:58, 7 March 2025 (UTC)
- This issue has come up before with e.g. педераст (pederast) and содомит (sodomit) and such listed as synonyms of гей (gej) (originally without qualifiers). It's extremely important any time slang, derogatory or obsolete synonyms are listed (or more generally, any terms whose register is not the same as that of the lemma) that appropriate qualifiers or labels are supplied to make clear the register distinction, but this doesn't always happen. Some non-core editors are just itching to add synonyms. We had one guy, for example, who would list 25 unqualified synonyms of each Latin verb, based on obscure, rare usages of the purportedly synonymous verbs, sometimes even listed under incorrect senses. Same goes for the definitions themselves; Sanskrit lemmas, for example, are dumping grounds of meanings from different registers and time periods, without proper qualifiers or labels. I would personally rather have no synonyms than non-neutral-register synonyms given without proper qualifiers. (And conversely: slang terms should not have neutral-register synonyms given without appropriate labels, but this is less of a ticking time bomb.) In the Latin examples, I just delete many of the synonym lists because untangling them will take way more time than I have, and the editor who added them has not been willing to add the proper qualifiers despite numerous requests. Benwing2 (talk) 00:14, 8 March 2025 (UTC)
- Agreed. Also, when there are a very large number of derogatory, obsolete, etc "synonyms" of a non-derogatory, non-obsolete word, they can be offloaded to Thesaurus pages. I did this with the "synonyms" of Jew, inspired by someone else having done it at Muslim. Trump has enough synonyms that he could have a thesaurus page, IMO. - -sche (discuss) 07:47, 10 March 2025 (UTC)
Universal Code of Conduct annual review: proposed changes are available for comment
Please help translate to your language.
I am writing to you to let you know that proposed changes to the Universal Code of Conduct (UCoC) Enforcement Guidelines and Universal Code of Conduct Coordinating Committee (U4C) Charter are open for review. You can provide feedback on suggested changes through the end of day on Tuesday, 18 March 2025. This is the second step in the annual review process; the final step will be community voting on the proposed changes. Read more information and find relevant links about the process on the UCoC annual review page on Meta.
The Universal Code of Conduct Coordinating Committee (U4C) is a global group dedicated to providing an equitable and consistent implementation of the UCoC. This annual review was planned and implemented by the U4C. For more information and the responsibilities of the U4C, you may review the U4C Charter.
Please share this information with other members in your community wherever else might be appropriate.
-- In cooperation with the U4C, Keegan (WMF) 18:52, 7 March 2025 (UTC)
Etymons
@Surjection @Vininn126 i feel like the current discord discussion about etymons should probably be moved on-wiki for transparency's sake, right? Froglegseternal (talk) 22:16, 7 March 2025 (UTC)
- Agreed.
- Based on Wiktionary:Beer parlour/2024/July#Moratorium on editing other languages' etymology sections for the purpose of English etymology trees and Wiktionary:Votes/2024-04/Allowing etymology trees on entries, I feel that the application of {{etymon}} as at least an ID template and at most more is probably fine. I'm not sure what harm there is adding ID's to things. Vininn126 (talk) 22:18, 7 March 2025 (UTC)
- (For context, this conversation is about whether it's okay to add etymon IDs to any language; this is not about trees or generated text)
- Anyways, I agree with Vinn. I'm not sure how we would even be able to add etymons if we weren't allowed to add ID's to other languages. Like at қибтӣ, I only turned on the tree for languages that I know allow it, but if I couldn't add an ID to every language then I probably wouldn't be able to make a tree at all (at least not to the same extent). I don't think there's anything harmful about adding an invisible ID to other languages to fetch its etymology, that's kind of the point of etymon imo. — BABR・talk 22:29, 7 March 2025 (UTC)
- Vin Vininn126 (talk) 22:31, 7 March 2025 (UTC)
- show me the wiktionary policy that says I can't spell it Vinn /j — BABR・talk 23:42, 7 March 2025 (UTC)
- User:Vininn126#Vininn126, which is cannon. Vininn126 (talk) 23:44, 7 March 2025 (UTC)
- canon—but I digress. 🌙🐇 ⠀talk⠀ ⠀contribs⠀ 23:47, 7 March 2025 (UTC)
- The curse of two n's Vininn126 (talk) 12:53, 9 March 2025 (UTC)
- Basically what happened (for context for anyone not in the discord) is that yesterday, I made an edit to *karhu in an attempt to address the fact that on the karu#Votic entry, the category "Votic entries referencing etymons with invalid IDs" was occurring. This was soon after reverted by User:Surjection, after which I made a comment on their talk page, asking what I had done wrong. In the short time before they responded (rather quickly, I must add, thank you for the quick response!) I asked in the discord what was wrong with what I had done. This.... proceeded to start an argument between Surjection and User:Vininn126 about where and when it's appropriate to add the Template:etymon to an entry, with Surjection arguing against widespread adoption and Vininn saying there was nothing wrong with the actions I had taken in adding them. This quickly devolved into accusations of bad-faith, and for full transparency, I am making this thread.
- I just want to add that I am a relatively new editor who was just trying to fix maintenance issues, not take pre-emptive action on something which (apparently) has not yet reached consensus, nor start an argument between two admins. As such, I will try to refrain from commenting any further on this matter, though I unfortunately don't have the best self-control at times so please don't try and say I "went against my word" if I do end up making another comment. Froglegseternal (talk) 22:25, 7 March 2025 (UTC)
- I don't like mass-addition of etymology trees, and I particularly don't like the addition of these in short chains of etymologies or instead of etymologies. For instance, Ingrian etymologies, especially for inherited terms, are short by design. Thadh (talk) 22:53, 7 March 2025 (UTC)
- Well, the dispute was not about trees. Vininn126 (talk) 22:55, 7 March 2025 (UTC)
- Agreed, while I do want to add that if one looks at my edit history they will see that trees have been expanded due to my actions, again I am a new editor who was not aware this was contentious. Froglegseternal (talk) 22:59, 7 March 2025 (UTC)
- Also I added a singular new one, by accident, which I quickly removed. I was not trying to make any visual changes, just trying to make it so that IDs weren't linking to nonexistent IDs. Froglegseternal (talk) 23:00, 7 March 2025 (UTC)
- What other ultimate use do we have for etymon? Trees and descendant trees. If the etymology chain is short, you don't need either. Thadh (talk) 00:30, 8 March 2025 (UTC)
- @Thadh, in this case we are talking about if it's okay to add an ID (no tree or generated text) to another language so you can use etymon. Like, if I'm editing English, is it okay for me to add an id to a Latin entry, without generating a tree or text, so that I can generate a tree for English?
It doesn't seem like something that needs a discussion imo, but it came up because Surjection reverted someone for adding ID's and a minor dispute started in the discord about whether it was the right thing to do. I honestly didn't fully understand Surjection's view of why it wouldn't be okay, so I think it's better he explain it himself (if he wants to). — BABR・talk 01:34, 8 March 2025 (UTC)
- Except in this case there was no language that would need to generate a tree based on the etymology. There are no trees that use the Ingrian or Votic etymon ID. I think adding IDs just in case it could potentially be needed in the future is not a good idea. Thadh (talk) 01:40, 8 March 2025 (UTC)
- I added an ID because a descendant called an invalid parameter. that's it. i had no knowledge of anything linguistic, i was approaching this from a coding perspective. there was an error, i saw a way to solve the error, i acted. that's it, that's all, no more than that. so, no, i didn't add it because it was 'potentially' needed. i added it because it was causing a (minor) error. Froglegseternal (talk) 01:42, 8 March 2025 (UTC)
- You can use {{etymid}} to add etymology IDs. — SURJECTION / T / C / L / 09:47, 8 March 2025 (UTC)
- {{etymon}} also adds dercats at the moment. Vininn126 (talk) 09:49, 8 March 2025 (UTC)
- It is confusing for me to understand the technical aspects of this. It seems this involves the "text" feature of etymon, currently described as "[EXPERIMENTAL]" in its documentation page? I have reservations about cases where etymon is used completely invisibly, as some kind of segregated supplement to the etymology presented visibly on a page. It's harder in such cases to make sure the etymology information is properly vetted and any mistakes are fixed: you don't get as many eyes on such hidden etymologies, and if they do eventually become visible to readers on trees in downstream entries, readers of those entries may not find it obvious how to edit mistakes to fix them. But if I understand correctly, etymon was formerly being used at karu as a way of visibly displaying the etymology, just in text form (not as a tree). And it sounds like the template at karu was causing an error because there was no etymon template at *karhu, which Froglegseternal noticed and then fixed by means of adding an etymon template to the latter entry. Then Surjection reverted that edit and also removed etymon from karu because of some objection to etymon's use. Surjection would you be able to explain the reason for that here? What exactly is the reason to prefer using "inh" rather than "etymon"?--Urszag (talk) 02:05, 8 March 2025 (UTC)
- To repeat the comment from my talk page: "There is no consensus to mass-adopt etymon (there is in fact somewhat of a consensus against mass-adopting it), despite what some editors think." Some editors have apparently gotten the impression that they should start using and adding {{etymon}} to all entries, when the practice has always been to discuss it with the appropriate language community. Discussions like this clearly show that the community is not OK with this kind of indiscriminate behavior. My impression of {{etymon}} in general is that it is fundamentally experimental and incomplete in some ways, and that it is likely that it will be redesigned at least once in the near future. But when it comes to the text feature, which is even clearly marked as experimental, I cannot understand why an editor would look at it and think it is a good idea to start adding it to entries in basically every language they can think of. Any mass-adoption needs consensus, and there is none. — SURJECTION / T / C / L / 08:07, 8 March 2025 (UTC)
- As an ID to allow other words pointing to it as necessary. Vininn126 (talk) 09:19, 8 March 2025 (UTC)
- @Froglegseternal: The text of Wiktionary:Votes/2024-04/Allowing etymology trees on entries *explicitly* allows the edit you were trying to make so I am somewhat baffled at @Surjection's actions in this case. Adding IDs has no visible effect on the entry whatsoever.
- @Urszag as for the discussion about |text=, which I think is unrelated to this case, I describe the functionality as "experimental" as I had implemented the functionality somewhat extemporaneously last year, and I think in some cases the output is not very good (although it works pretty well for simple linear etymologies). But I intentionally took a hard line against it in the vote on etymology trees so that we could save it for a separate discussion or vote which I don't think ever actually happened. Ioaxxere (talk) 02:39, 9 March 2025 (UTC)
- @Ioaxxere: "Therefore, they may be used site-wide whenever necessary." allows that they are added when necessary. In this case, they are not. Thadh (talk) 13:01, 9 March 2025 (UTC)
- Ah, from reading the discussion more closely it looks like Surjection was objecting to the template being used on karu *as well as* *karhu, which is more reasonable. In that case I apologize for jumping to conclusions. Ioaxxere (talk) 17:42, 9 March 2025 (UTC)
- I think the use of |text= here is relevant because if the use of that parameter is not allowed in mainspace (which the 2024-04 vote seems to establish) then the edits by JnpoJuwan that originally added the etymon template to karu violated the current policy about its use. Even though the vote does allow "silent" use of the etymon template, as I mentioned, I see that as more of an anti-feature than a feature, and if etymon IDs have no effect on any visible content in any entry (which I think is currently equivalent to "are not needed for a tree"?), I think they shouldn't be included in entries. In this case, there seems to be no tree, so I think I agree with Surjection's decision to remove the template rather than convert it to a "silent" version.--Urszag (talk) 13:11, 9 March 2025 (UTC)
- @Urszag thanks for pinging me in this discussion (if anyone else could ping me in etymon discussions, I would appreciate it). my reasoning for using etymons is as follows: this is a powerful tool that has the capability of helping editors with the redundancy of language derivation tasks, as copying the longest etymology to every other language is hard manual work (this is especially true with internationalisms, what I deal with a lot, but also for inheritances and borrowings, mostly as they tend to be linear). it also automatically categorises words with all points of that etymology, which is convenient. the Portuguese-language community (and Tupi, I suppose, from the work of @Trooper57) has been using etymons for those reasons, even its experimental features like the |text= parameter. I understand that the programmers may see that the code is not great, but for my work, the quality of the tool provided has been good.
- for these small, linear etymologies, there is no harm in adding etymons with the |text= parameter in my opinion. past me added the etymon to Votic (and other IE and Uralic entries) due to the automatic work that etymon pulls up (categorisation and etymology detail beyond Estonian) and the text is comparable in quality, with only slight pushback due to my oversights, to which I have apologised and do apologise for.
- in the case of adding etymons in other languages "silently", in order to add data (tree, text, etc) to one descendant that does allow it, I don't see what the problem with that is from reading this discussion. while editing, I have tried to add them where the additional information would be useful as I have stated above. Juwan (talk) 18:24, 9 March 2025 (UTC)
Ukrainian - authoritative reference for perfective verbs
Recently someone corrected a mistake I made in citing a more common perfective verb as the counterpart to an imperfective, rather than a verb that is considered the standard perfective. The user referenced SUM-20 as their source, but I don't understand what this is. Can someone explain so I can refer to this in the future? Proudlyuseless (talk) 21:03, 8 March 2025 (UTC)
- @Proudlyuseless: Editors, in their contribution summaries, sometimes refer to references by their abbreviations in the template namespace, which you need to select or prefix in your search in Special:Search to find them. Here I link the template for you: {{R:uk:SUM-20}}. Fay Freak (talk) 13:23, 9 March 2025 (UTC)
How can I add an example to the first meaning only with this type of template? JMGN (talk) 13:37, 9 March 2025 (UTC)
- I don’t think it’s possible, but adding usage examples to non-lemma forms is, in my eyes, a poor practice and a mark of crusty old entries anyway. Add them to the main entry if you need to. ―K(ə)tom (talk) 14:09, 9 March 2025 (UTC)
- Then those templates should be automatically replaced by a different type that does allow it. JMGN (talk) 20:34, 9 March 2025 (UTC)
- Yeah I agree with @Ktom that you shouldn't be adding usexes to non-lemma forms except in very limited circumstances (maybe possibly with suppletive forms, but not with something like huelga). If you really had to do that, there are ways of invoking {{es-verb form of}} and telling it to only output the definition for a single form, so you could use this to put the separate definitions on separate lines using separate {{es-verb form of}} calls and put the usex in between, but I'd strongly advise against that. Benwing2 (talk) 07:11, 10 March 2025 (UTC)
- @Benwing2: huelga decir
- This belongs under a "Derived terms" header. I put it under one at huelga (it was already present at holgar). This, that and the other (talk) 06:40, 14 March 2025 (UTC)
FL entries: glosses without useful hypernyms
In a few Slavic language entries I found the following "definition": "man on horseback, yellow knight (Tricholoma equestre)". Though I've run across each term and therefore recalled that a mushroom was the referent, it struck me that not too many others would. But should we require a user to search for the relevant definition by serially examining the links in each definition in the PoS section?
In an English entry the norm is to have a hypernym as a major part of each of a noun's definitions. In FL entries we require only a gloss, so generally no hypernym is present. Moreover, usually no topical label is to be found either. Also, in the desire to find a single-word definition, many FL entries seem to use rather obscure English words that would mystify most English readers and auditors.
Isn't this a glaring shortcoming for our FL entries, making them not so useful for normal users' needs, whatever its adequacy for translators? DCDuring (talk) 16:36, 9 March 2025 (UTC)
- "should we require a user to search for the relevant definition by serially examining the links"? Yes, otherwise this duplicates and over time desynchronizes content. This weighs more than the interest of users unlike you in such entries: either people are above-average likely to recall organism names, or at least interested in them enough to click them while not recalling them, or they have little real-life incentives to seek out these entries in the first place. The imbalance can also be mitigated by a template fetching or at least highlighting certain definitions from English, or even Wikidata, {{transclude}}, though I have never used these mechanisms, mostly seeing them in Hebrew entries like גֶּרֶב (“sock”) for whatever reason: editors also need to spare their working memory when creating definitions, I think you yourself warned against overtemplatization making the creation and editing of entries feasible only for techies. There is nothing to be depressed about here. Fay Freak (talk) 18:55, 9 March 2025 (UTC)
- You don't address my main concern, which I may have buried under too much other text.
- Could you explain how the definition I provided, which is duplicated in 4-5 entries, doesn't waste people's time by failing to provide either a hypernym in the gloss/definition or a topical label (BTW, which I dislike for other reasons)? DCDuring (talk) 19:22, 9 March 2025 (UTC)
- Your concern is that the gloss doesn't make it clear that this is a type of fungus, then? I think that's a fair point. In exceptional cases such as this where the common name does not at all make clear what type of organism it is, could you write "(the fungus Tricholoma equestre)" instead of simply "(Tricholoma equestre)"? This, that and the other (talk) 00:03, 10 March 2025 (UTC)
- Well I do that, too, for example voskovka, and DCDuring is right in general to raise awareness about this issue, which I have worked out below. I fear there is something obsessive-compulsive about the formatting in Polish entries particularly, though, preventing people from doing this right thing. Fay Freak (talk) 00:09, 10 March 2025 (UTC)
- They should add an image so we know what it is about? It involves visiting another website, just as thinking about the appropriate hypernym does since it involves comparison. I see why you liked high-effort entries from my side of that style, like оман (at which age by which share are American or English youngsters aware what elecampane is?). There are no completely satisfying alternatives. It appears like an exaggeration to speak of a waste of time then, but I see that you indeed deem it somehow inappropriate to force people to click through to even understand an entry, but again I deem it a theoretical concern due to the selection bias of people who even visit such entries, being luxurious in their use of leisure time, and sufficiently interested. Fay Freak (talk) 00:05, 10 March 2025 (UTC)
- How do we establish norms for the basic Wiktionary function of defining terms? At which and how many levels should they operate? Should any be strictly enforced? How do we communicate them?
- I don't think we even have agreement on the overall goals of definition. At the very least, we seem to accept truly poor definitions, such as the example given, which suggest that our FL entries are only for translators and people who like clicking links (Let's call them "browsers".). If someone other than a translator or browser comes to a polysemic entry, especially FL, should that user have to click to another entry (or entries!!!) to find out that the definition is NOT one that fits the use they are deciphering? If we could agree that we should use simple means to avoid such situations, then we could perhaps somewhere enshrine that goal and direct contributors to it. We could go further and prescribe a defining vocabulary, at least for stem-lemmas, ie those not trivially derived morphologically from other lemmas.
- Without some agreement on principles of this kind, I find it difficult to imagine norms other than formatting goals for definitions and the feel-good counsel of perfection: "all words in all languages using definitions and descriptions in English", which provides no guidance for contributors, only license. DCDuring (talk) 18:12, 10 March 2025 (UTC)
- My personal take is that all FL definitions should eventually include both a translation (what I believe you are calling a gloss) and a longer gloss, encased in {{gloss}}. Often, these are only provided to differentiate ambiguities, but I prefer them to include at least a semi-complete definition, for the following reasons:
- It avoids ambiguities arising in the future if new definitions are added at the target entry;
- It makes better sense of long lists of synonyms in some entries, encapsulating the general meaning in a single definition;
- It provides a way of noting shades of meaning that would otherwise be lost (for instance, rivière would be missing important information if it was simply glossed as "river");
- It saves users from needing to click onto the English entry if they don't understand the translation any better than the word they are looking up.
- Andrew Sheedy (talk) 01:33, 11 March 2025 (UTC)
Policy clarification regarding the placement of usexes
The policy on usexes says that {{ux}} templates should "be placed immediately after the applicable numbered definition". However, the policy on synonyms (and other semantic relation templates) conflictingly says that {{syn}} templates should also be placed immediately after the definition line.
In practice, the vast majority of semantic relation templates are placed before the usexes, so here is my proposal to amend the usex policy:
Example sentences should: [...]
- be placed after the applicable numbered definition; before any quotations associated with that specific definition, but after any associated semantic relation templates (like {{syn}}).
What do you think?
Tc14Hd (aka Marc) (talk) 18:36, 9 March 2025 (UTC)
- Yes, something like that. This is already practice and probably rule. The votes introducing {{syn}} postdate the sentence about usexes you quote, hence the lex posterior rule resolves it, which is of course not transparent to newer users. Fay Freak (talk) 18:59, 9 March 2025 (UTC)
- So we don't need an extra vote to change the policy? Just one admin that changes it for us? Tc14Hd (aka Marc) (talk) 19:36, 9 March 2025 (UTC)
- Yep, the practice is that nyms come before usexes. I thought that was the rule. I'm not sure that this needs a formal vote – after all, it's a glaring and unintentional inconsistency in the policy. Let's see if anyone objects to a change before making it. This, that and the other (talk) 00:05, 10 March 2025 (UTC)
- Yes, I support this theme too (i.e., to fix the documentation to codify that which is already the norm/standard). Quercus solaris (talk) 20:44, 10 March 2025 (UTC)
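As a rough illustration of the ordering proposed above (definition line, then semantic relation templates, then usexes, then quotations), here is a minimal Python sketch of a checker. It assumes the usual wikitext layout in which definitions start with `# `, nyms and usexes with `#: ` and quotations with `#* `. The function names and the rank table are hypothetical, not an existing tool, and only {{syn}}, {{ant}} and {{ux}} are recognized.

```python
import re

# Hypothetical rank table: lower rank must come first under a definition.
# Assumption: nyms are "#: {{syn|...}}", usexes "#: {{ux|...}}", quotations "#* ...".
RANKS = [
    (re.compile(r"^#: \{\{(syn|ant)\|"), 1),  # semantic relation templates
    (re.compile(r"^#: \{\{ux\|"), 2),         # usage examples
    (re.compile(r"^#\* "), 3),                # quotations
]


def line_rank(line):
    """Return the rank of a sub-definition line, or None if unrecognized."""
    for pattern, rank in RANKS:
        if pattern.match(line):
            return rank
    return None


def check_definition_block(lines):
    """Return True if nyms precede usexes and usexes precede quotations."""
    highest_seen = 0
    for line in lines:
        if line.startswith("# "):  # a new definition resets the ordering
            highest_seen = 0
            continue
        rank = line_rank(line)
        if rank is None:
            continue
        if rank < highest_seen:
            return False
        highest_seen = rank
    return True


good = [
    "# A large natural stream of water.",
    "#: {{syn|en|stream}}",
    "#: {{ux|en|The river flows to the sea.}}",
    "#* {{quote-book|en|year=1900|passage=...}}",
]
bad = [good[0], good[2], good[1]]  # usex placed before the synonym line

print(check_definition_block(good))  # True
print(check_definition_block(bad))   # False
```

A check like this could only flag candidate entries for review; it says nothing about whether the policy text itself has been amended.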
the limits of Limited Documentation Languages
Yes, LDLs accept terms with only one cite. But do hapax locations that can't be precisely identified count? I ask because of this:
- Occitan Le Tousquirat defined as an "Unidentified location near Massat in Ariège, France." and there's a citation on the Citations page. At least the approximate location is specified, but I question whether this passes CFI except on a rather liberal reading.
- Some of the stuff in Category:twf:Place names. I'd like to entirely get rid of Category:Place names and its only subcat Category:twf:Place names, but we have entries like:
- wę̀puopʼôto defined as "Pine-Near-Water (place name)", no citation;
- kònkʼə́obo defined as "placename for a place where a buffalo turned into stone in ancient times about 0.5 miles east of Taos";
- tə̂obo defined as "to the village, toward the village, Taos pueblo";
- tə̂otho defined as "in the village, at the village, Taos pueblo";
- kwę́ʼogą defined as "among the Mexicans (i.e. Taos city, Taos village)".
At least these last three, maybe all of them, appear to be just SOP descriptions that only contextually refer to specific locations. It seems similar to telling a story where I said "I went back into town" where "into town" happens to refer to say Kalamazoo, Michigan in the context, but could be anywhere. OTOH maybe given the centrality of Taos Pueblo to Taos language speakers, it's a bit similar to "the City" referring to New York, London or various other core cities in large metro areas, for which we do have definitions (defns #1, #3, #4 under City); but all three of these are locative particles, not nouns, so they can't really qualify as placenames. @-sche? Sorry to ping you once again but from the edit history of these pages (FWIW created by User:Ishwar), you seem to know something about Taos. Benwing2 (talk) 07:07, 10 March 2025 (UTC)
- I don't see a problem with including unlocatable placenames per se. Even in WDLs there are words for places that can't be precisely located, e.g. Gomorrah and the names (in various languages) of a lot of these places; we also have entries for non-placename words that can't be (precisely, or at all) defined, e.g. あしひきの, בדולח, ᚆᚉᚉᚃᚃᚓᚃᚃ, 𒇷 and various pages that use Template:def-uncertain. There is usually more information of lexicographic interest available about placenames than solely "where is it?", after all, like etymology (showing a given root survived into the language, for example).
Of course, if the placename has no citations, it should be RFVed. And if the definition needs to be fixed, like someone defined rūrī as "Italy" or "in Italy" when it should be "locative of rus ("country")" or the like, that should be fixed. (I will see what I can find out about the Taos terms.) And if the categorization needs to be changed, e.g. to put things into the placename equivalent of Category:Terms with uncertain meaning by language or dump them into Category:en:Places (along with things like James Shoal) or something else, absolutely, let's improve the categorization. - -sche (discuss) 08:55, 10 March 2025 (UTC)
- Thanks, this makes sense. Any help you can give with the Taos terms would be greatly appreciated; I'm doing a big cleanup/revamp of the {{place}} architecture and I'm trying to get rid of leftover categories like Category:Place names. BTW probably nothing (or at least no specific toponym) should go directly in Category:Places; maybe the only thing I could think of going there is fictional locations with no specific referent like East Bumfuck (hmm, we have East Bumfuck, Kansas) or Cockaigne, but even then we have Category:Fictional locations as well as Category:Mythological locations (IMO they should probably be merged into Category:Mythological and fictional locations as the distinction between the two is often fuzzy). Benwing2 (talk) 09:06, 10 March 2025 (UTC)
- I've moved the Taos entries from CAT:Place_names over to (subcategories of) CAT:Places. Most of them look like (proper) nouns; AFAICT the non-noun-y parts of the definition (which had led the creator to reclassify them as particles) are etymological (literal morpheme-by-morpheme translations of the name). In a few cases I was unsure whether the term was really a placename/noun or was truly a particle, so I left it as a particle with an {{attn}} tag. - -sche (discuss) 22:05, 16 March 2025 (UTC)
- I want to echo this. Even though I'm not big on adding proper nouns (a view not shared by the vast, vast majority of editors), I know this is going to end up being an issue when I (eventually?) get around to adding Old Polish place names. Vininn126 (talk) 09:11, 10 March 2025 (UTC)
categorizing demonyms
How should we best categorize demonyms? Mostly they're just dumped into e.g. Category:en:Demonyms (with 1,438 members) but we have some subcategories like Category:en:British demonyms, Category:en:Belarusian demonyms, Category:en:Demonyms for Americans and Category:en:Demonyms for Australians that are manually added and not yet properly handled by {{auto cat}}. I ask because I renamed some {{place}} categories (e.g. Category:fr:Normandy -> Category:fr:Normandy, France), which has flushed out a bunch of manually-categorized demonyms in the old category. If we want to subcategorize demonyms, we should probably try to leverage the existing {{place}} mechanisms, but that would mean the names would have to be e.g. Category:en:Demonyms for people from Australia or Category:fr:Demonyms for people from Normandy, France, since {{place}} doesn't know about terms like "British" and "Australian" (and I'd have to modify {{demonym-noun}} and {{demonym-adj}} to somehow hook into Module:place). Do these names sound OK or are there better ones? If we have demonym subcategories, how far down the place hierarchy should we go? {{place}} knows about Category:Hubei, China and Category:Oriental, Morocco and lots of other first-level subdivisions (and sometimes even second-level subdivisions, in the case of e.g. counties of England such as Category:Herefordshire, England). Should there be a Category:Demonyms for people from Herefordshire, England or should we have some limits, e.g. only countries (or maybe also country-like divisions like England and Greenland)? Or should we follow the practice of the Norman demonyms and just dump them all into the bare category like Category:Normandy, France or Category:Herefordshire, England? It could be argued that it's sufficient to double-categorize something like French cherbourgeois (“from Cherbourg”) into e.g. Category:fr:Demonyms and Category:fr:Normandy, France, because an intersection search can easily pull up the terms in both categories to get the demonyms for Normans; this was the argument used to eliminate categories like Category:Female scientists in favor of Category:Female people and Category:Scientists. Benwing2 (talk) 20:51, 10 March 2025 (UTC)
- OK tentatively I have decided to go with the double-categorization approach as described at the end; French cherbourgeois will go in Category:Demonyms and Category:Normandy, France. This will happen through an expanded syntax for {{demonym-adj}} or {{demonym-noun}}, looking something like this:
# {{demonym-adj|fr|[[Cherbourg]], a <<town>> in <<dept/Manche>>, <<r/Normandy>>, <<c/France>>|g=m}}
- The only thing different here from a regular {{demonym-adj}} call is the use of <<..>>, which is borrowed from {{place}}. Essentially, any use of <<..>> will cause the whole expression to be parsed and displayed like a new-style {{place}} definition, and it will categorize according to the lowest-level recognized division, in a bare category. Here, this is Category:Normandy, France, but if I end up teaching {{place}} about the 101 French departments and having categories for them, it will automatically be re-categorized into Category:Manche, France (or maybe Category:Manche, Normandy, France depending on the naming scheme chosen). The beauty of this is that we can change our minds later on about how we categorize demonyms without having to manually (or by bot) change a zillion individual entries. Tentatively I'm thinking it will only categorize in Category:Normandy, France and not also Category:France to avoid spamming the latter category, consistent with the idea that you shouldn't usually double-categorize at different depths along the same branch. But maybe if we create department-level categories it might end up making sense to categorize demonyms both at the department and regional level, depending on how many there are. The other thing to note is that, unlike for {{place}}, you can leave out the entry placetype (the <<town>> in the above example) if it makes sense to do so. So for example, French Ariégeois might use
# {{demonym-noun|fr|the <<dept:pref/Ariège>>, <<r/Occitania>>, <<c/France>>|g=m}}
- which would display something like "native or inhabitant of the department of Ariège, Occitania, France (masculine or unspecified gender)" and would categorize into Category:Demonyms and Category:Occitania, France.
Benwing2 (talk) 06:15, 11 March 2025 (UTC)
- OK, I have implemented this as described above. Benwing2 (talk) 07:28, 13 March 2025 (UTC)
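For readers who want to see the "lowest-level recognized division" rule spelled out, here is a minimal Python sketch. It is not the Module:place implementation: the set of known categories, the function names and the "Name, Country" naming rule are all assumptions made for illustration, and language prefixes such as `fr:` are left out.

```python
import re

# Hypothetical set of categories the place machinery "knows about".
# Assumption: departments such as Manche are not (yet) recognized.
KNOWN_CATEGORIES = {"France", "Normandy, France", "Occitania, France"}

SEGMENT = re.compile(r"<<([^<>]+)>>")


def parse_holonyms(text):
    """Extract (placetype, name) pairs from <<placetype/Name>> segments.

    Segments without a slash (e.g. <<town>>) are entry placetypes and are
    skipped; modifiers such as ":pref" are stripped from the placetype.
    """
    holonyms = []
    for segment in SEGMENT.findall(text):
        if "/" not in segment:
            continue
        placetype, name = segment.split("/", 1)
        holonyms.append((placetype.split(":", 1)[0], name))
    return holonyms


def demonym_categories(text):
    """Return the bare demonym category plus the lowest recognized division."""
    holonyms = parse_holonyms(text)
    country = next((name for ptype, name in holonyms if ptype == "c"), None)
    for ptype, name in holonyms:  # assumed to be written lowest-level first
        candidate = name if ptype == "c" or country is None else f"{name}, {country}"
        if candidate in KNOWN_CATEGORIES:
            return ["Demonyms", candidate]
    return ["Demonyms"]


cherbourg = "[[Cherbourg]], a <<town>> in <<dept/Manche>>, <<r/Normandy>>, <<c/France>>"
ariege = "the <<dept:pref/Ariège>>, <<r/Occitania>>, <<c/France>>"

print(demonym_categories(cherbourg))  # ['Demonyms', 'Normandy, France']
print(demonym_categories(ariege))     # ['Demonyms', 'Occitania, France']
```

In this sketch, adding "Manche, France" to the known set would move the Cherbourg demonym to that category without touching the entry text, which is the property described above.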
"a language is a dialect with an army and a navy"
Let's please merge Category:Languages and Category:Dialects into maybe Category:Languages and dialects (or Category:Languages and language varieties or just Category:Lects?). As we all know, the distinction is nebulous, and the current system clearly isn't working as we have 3,280 members of Category:en:Languages and only 169 of Category:en:Dialects; evidently editors are loath to categorize language varieties as "dialects". Ultimately, we should also split the resulting category, maybe along continental region lines (i.e. pretty much the same regions used for "Countries in X" categories: North America, Central America, South America, Europe, Asia, Africa, Polynesia, Melanesia, Micronesia with the addition of Australia and New Guinea); but such a split is a non-trivial task. Benwing2 (talk) 23:12, 11 March 2025 (UTC)
- And while we're at it, let's create a language is a dialect with an army and a navy as an entry. Purplebackpack89 01:01, 12 March 2025 (UTC)
- Let's not and say we did. Benwing2 (talk) 01:05, 12 March 2025 (UTC)
Support. Category:Languages and language varieties; I can foresee Category:Lects being deemed jargony. Fay Freak (talk) 01:23, 12 March 2025 (UTC)
- Although there is indeed no clear, objective line between language and dialect, I’m not sure it follows that removing the category difference is a good idea. I think that, no matter what theoretical models we subscribe to, we can agree on the logic of having, say, American English and British English under a grouping that does not include Swahili.
- Whichever way we go with this, I support dropping the usage of dialect (with its undesirable connotations) in favour of variety. Nicodene (talk) 03:18, 12 March 2025 (UTC)
- Yeah I agree, it'd feel weird to have those in the same category. AG202 (talk) 03:25, 12 March 2025 (UTC)
- Definitely I prefer "language varieties" over "dialects" but there are so many gray areas that keeping both categories is going to be a real headache IMO. What motivated this was trying to clean up French demonyms, and in the process I discovered things like French champenois defined as
# Champenois (Romance language or dialect)
Similarly English Bourguignon:
# The Romance Burgundian language or dialect.
Making such a distinction forces us into a sometimes arbitrary choice of language vs. dialect, which may be obvious for things like Swahili vs. British English but rapidly gets nebulous as hell when you're dealing with less familiar language varieties. Note also that categories like Category:Languages of the United States already group languages and "dialects", as can be seen by looking under A in this category. Benwing2 (talk) 07:36, 13 March 2025 (UTC)
- Given that we treat Champenois as a language, we might as well categorize it as such for consistency. I wonder whether some form of automatization is possible. Nicodene (talk) 09:21, 14 March 2025 (UTC)
Oppose. After a quick look through the dialect category, I don't think those are the same kinds of things we want in the language category. I'm fine with renaming the former, but I think the lower number of dialects probably comes down to the fact that people are more interested in languages, not dialects, and dialects are less likely to have their own distinct names (even looking at what's already in the English subcategory, I see things like "European Portuguese," which is linguistically less interesting than "Portuguese" itself). Andrew Sheedy (talk) 05:04, 12 March 2025 (UTC)
Oppose "the current system clearly isn't working as we have 3,280 members of Category:en:Languages and only 169 of Category:en:Dialects [...] evidently editors are loath to categorize language varieties as "dialects" " – I don't think that's the right conclusion: at this moment there are exactly 0 dialects for Dutch, this doesn't mean there are no Dutch dialects, or that editors would classify 'Antwerps', 'Leuvens', etc. as languages. People simply haven't bothered (and at Antwerps, the category hasn't been added). Exarchus (talk) 12:55, 13 March 2025 (UTC)
- Correction: there are 3 dialects at Category:nl:Dialects (I was looking at the 'n' in the list). Though 'tussentaal' is not what is properly called a dialect, so a change to 'language variety' is an idea.
- And now I noticed that you counted Category:en:Dialects and not Category:Dialects. Well, obviously people are not likely to start adding English translations of Markizaats, Kempenlands, Getelands, Aalsters, Utrechts-Alblasserwaards, Kennemerlands, and so on and so forth.
- Btw, I changed Brabantian from 'language' to 'dialect' (though 'dialect group' might be more accurate, so yeah, a change to 'language variety' sounds fine), as I don't think it is seriously considered a language by anyone. Exarchus (talk) 13:15, 13 March 2025 (UTC)
- I wonder what normal people care about and how they interpret "language", "lect", "dialect", and "language variety". Anybody have any data or conjectures? Does anybody here care? DCDuring (talk) 13:44, 13 March 2025 (UTC)
- My educated conjectures include that I agree with an earlier comment that the word "dialect" has a certain mild cultural baggage, slight but not nothing, whereby to some laypersons it connotes themes such as "niche", "ethnic", "slang", "nonstandard", and "less than" (regardless of whether it ought to), and thus saying "language variety" is better, because (I like to hope) they will have a hard time managing to trample that one with the treadmill. Not that such trampling can't be done, but it's a steeper climb that will give more of a workout. (Ask your doctor before starting an exercise regimen. Ask your doctor whether once-daily DumbDownMaxx is right for you.) Quercus solaris (talk) 03:51, 14 March 2025 (UTC)
- I agree that language variety is probably sufficiently transparent for most normal users, but only if we have hovertext and a Wiktionary:Appendix entry (and WP link?) for language variety that refers to the word dialect for those users looking for some kind of explanation. DCDuring (talk) 12:36, 14 March 2025 (UTC)
- Great point — I heartily agree. I volunteer to add "variety" to the glossary (with cross-ref link to "dialect") if anyone will take so much pity as to bestow autopatroller status on me, which is (since recently) needed to edit that page. (The first forty-seven thousand good-faith edits with a 99.8% retention rate are the hardest when it comes to hard-earning one's autopatroller status.) Quercus solaris (talk) 14:21, 14 March 2025 (UTC)
- @Fenakhay why was the Glossary protection increased? I'm strongly minded to lower it back to the previous level. This, that and the other (talk) 01:32, 16 March 2025 (UTC)
- @This, that and the other I agree and I put it back to autoconfirmed. I think Fenakhay may have raised it based on someone using something related to Nazis as an example, but that seemed to be only one case. We can raise it again if it becomes a vandalism or controversy target. Benwing2 (talk) 02:00, 16 March 2025 (UTC)
- Indeed. And in any event, the Glossary is on the watchlist of more than 50 active users, so any sporadic mischief will not go unnoticed for long. This, that and the other (talk) 02:04, 16 March 2025 (UTC)
- I have to object to the "navy" part. Please understand that mountain races don't care for the sea or sailing. Doesn't mean we don't speak languages. Vahag (talk) 15:32, 13 March 2025 (UTC)
- Some mountain folk have a navy and encompass multiple official languages. Can navyless mountain folk make that claim? DCDuring (talk) 12:36, 14 March 2025 (UTC)
- Do all those yachts not count? Los Angeles has a coastline. Nicodene (talk) 17:13, 14 March 2025 (UTC)
- Armenia has its own Sailing Sport Federation, not to mention Bolivian Navy. On the other hand, some countries have no army. Tollef Salemann (talk) 08:26, 16 March 2025 (UTC)
- Indeed, but are any countries without an army the (or a) primary homeland of one and only one language? DCDuring (talk) 18:17, 16 March 2025 (UTC)
- @DCDuring: “Iceland maintains no standing army …” – Icelandish men desirous to practice military service have to work with Norway. Fay Freak (talk) 18:26, 16 March 2025 (UTC)
- Thanks for indulging my idle curiosity. DCDuring (talk) 01:05, 17 March 2025 (UTC)
Majorca/Minorca or Mallorca/Menorca? Is there a general policy?
I'm sure there are other similar pairs (for example, we switched from Kiev to Kyiv a couple of years ago but still use Odessa rather than Odesa). Mallorca and Menorca are the spellings used in Catalan and Spanish, and are what Wikipedia uses, but Majorca and Minorca are what we currently use and are closer to the original Latin forms. Google Ngrams shows Mallorca overtaking Majorca in the 1990s in English but indicates that Minorca is still twice as popular as Menorca. In general, how do we balance the following competing desires?
- Use the autochthonous form (i.e. the form used in the language(s) spoken there);
- Use the most common form per Google Ngrams or some other corpus-based source;
- Follow the style guides of major newspapers;
- Do whatever Wikipedia does, since they've typically already hashed out these issues (although in general they lean towards the autochthonous form even when it's not the most common one).
Benwing2 (talk) 01:11, 14 March 2025 (UTC)
- Well, I moved the canonical page from Majorca to Mallorca per Ngrams but left Minorca for now. Benwing2 (talk) 02:19, 14 March 2025 (UTC)
- I would say follow what NGrams and the newspapers use, with preference to the newspapers. Doing this reflects how the words are used in real life, especially in terms of what is now used. CitationsFreak (talk) 17:09, 14 March 2025 (UTC)
- In English we are lucky in having Google Ngrams available. I would think that a form's greater frequency over the past 20 years makes for a rebuttable presumption that such form ought to be the main one. The other data mostly seems useful to attempt to rebut the presumption, along with crude counts of hits at Google News and Google Search and the practice of other dictionaries, as can be conveniently looked at via OneLook (compare “Minorca” and “Menorca” in OneLook Dictionary Search, checking the links individually; these suggest that Minorca is preferred by OneLook references). WP principles are too prescriptivist for a descriptive dictionary like us. DCDuring (talk) 17:14, 14 March 2025 (UTC)
- Largely in agreement with CitationsFreak and DCDuring, I'd prefer to lemmatize whichever form is most common, particularly in more recent texts (and when it comes to changed names, in texts from after the change: are they following or ignoring it?). I would judge commonness by as many reliable metrics as people are able to bring to bear. If, for some obscure term, someone points out that form X is more common per Ngrams than Y, and no-one cares to go looking for other evidence, then it seems reasonable to rename it on that basis (after all, if the situation changes we can always rename it again), but if for these islands someone wants to look into not just the Ngrams numbers but e.g. Google Scholar numbers (supposedly 22,900 for Minorca and 101,000 for Menorca, but this seems to include many irrelevant non-English results, since there's only 6,470 for "of Minorca" vs 4,290 for "of Menorca", and 3,340 for "in Minorca" vs 3,980 for "in Menorca"), newspaper data, etc, then it seems reasonable to consider the totality of available evidence. (In this case, the Google Scholar data seems inconclusive.) If two forms are about equally common, or if (as DCDuring notes) some languages lack easy ways of ascertaining commonness, then we have to make decisions based on other factors. - -sche (discuss) 19:08, 14 March 2025 (UTC)
- Thanks! I'll leave Minorca as-is for now. I suppose we could argue that since we've renamed Majorca to Mallorca and Minorca~Menorca are about equally common, we should prefer Menorca for consistency, but I dunno if I really buy that. For example, Pennsylvanien is archaic in German but Kalifornien is still standard. Benwing2 (talk) 07:31, 15 March 2025 (UTC)
- As an FYI, what I'd consider the gold standard for common English usage is "What do EasyJet and Ryanair call them"? (For the non-Europeans, these are the low-cost airlines stereotypically associated with tourist flights to the Balearics.) EasyJet goes for Majorca, while Ryanair plumps for Mallorca, but both picked Menorca. So in conclusion, ¯\_(ツ)_/¯. Smurrayinchester (talk) 08:56, 18 March 2025 (UTC)
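The "totality of available evidence" idea from this thread can be made concrete with a small tally. The Python sketch below uses only the Google Scholar phrase counts quoted by -sche above; the rule that a form must lead on every metric is an assumption chosen for illustration, not a policy anyone proposed.

```python
# Phrase counts quoted in the discussion above (Google Scholar, as reported there).
evidence = {
    "of X": {"Minorca": 6470, "Menorca": 4290},
    "in X": {"Minorca": 3340, "Menorca": 3980},
}


def leaders(evidence):
    """Map each metric to the form with the highest count."""
    return {metric: max(counts, key=counts.get) for metric, counts in evidence.items()}


def verdict(evidence):
    """Return the form that leads on every metric, or None if the metrics disagree."""
    winners = set(leaders(evidence).values())
    return winners.pop() if len(winners) == 1 else None


print(leaders(evidence))  # {'of X': 'Minorca', 'in X': 'Menorca'}
print(verdict(evidence))  # None
```

The `None` result matches the observation that the Google Scholar data is inconclusive here; further metrics (Ngrams, newspapers) would be added as extra keys in the same table.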
Error message: "Lua error in Module:transclude at line 326: Couldn't find the template {{senseid|en|Q804}} within entry Panama." Panda10 (talk) 19:51, 14 March 2025 (UTC)
- This happens if you preview a partial page. The error goes away when it's actually saved. Benwing2 (talk) 19:56, 14 March 2025 (UTC)
But after publishing the changes, the error disappears. Panda10 (talk) 19:53, 14 March 2025 (UTC)
- Yeah it's because it's trying to look for the English text on the same page, and if you preview just a part of the page not including the English text, you'll get that error. @Theknightwho is there a way of being smarter about this? Benwing2 (talk) 19:57, 14 March 2025 (UTC)
- @Benwing2 I'll have a look, but it might be tricky. Theknightwho (talk) 20:06, 14 March 2025 (UTC)
- One idea: just add something to the effect of "If you see this error when previewing only one section of a page, check whether it still shows up when you preview or save the whole page" (ideally worded better, more concisely) to the end of the error message. - -sche (discuss) 06:06, 15 March 2025 (UTC)
- I have implemented this. If you're in preview mode and the {{tcl}} call references the same page and an error occurs, you get this: Lua error in Module:transclude at line 295: Couldn't find the template {{senseid|en|Q804}} within entry Panama. NOTE: You are in preview mode. If you're previewing only part of the page, try previewing the full page, as the error may go away. Benwing2 (talk) 07:23, 15 March 2025 (UTC)
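The behaviour described above amounts to appending a hint only when two conditions hold. A minimal Python sketch of that logic follows; the real module is written in Lua, and the function name and both flags here are hypothetical stand-ins, since the discussion does not show how preview mode or a same-page reference is detected.

```python
PREVIEW_NOTE = (
    "NOTE: You are in preview mode. If you're previewing only part of the page, "
    "try previewing the full page, as the error may go away."
)


def format_error(message, *, is_preview, same_page):
    """Append the preview hint only when it can actually explain the error.

    Both flags are hypothetical inputs; how the real module detects preview
    mode or a same-page reference is not shown in the discussion.
    """
    if is_preview and same_page:
        return f"{message} {PREVIEW_NOTE}"
    return message


base = "Couldn't find the template {{senseid|en|Q804}} within entry Panama."
print(format_error(base, is_preview=True, same_page=True))
print(format_error(base, is_preview=False, same_page=True))
```

Restricting the hint to same-page references keeps the message short in the cases where a partial preview cannot be the cause.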
Redirects from deprecated text encodings
After falling down a small rabbit hole, I wish to update Wiktionary pages with this. Some pages seem to contain deprecated or non-preferred spellings, so I wish to update these and redirect them to the preferred ones. These can be found in this page, which graciously compiled all chart sources from Unicode. It would affect the following entries, some of which I have already started updating and some of which I don't have the rights to overwrite.
Juwan (talk) 18:29, 15 March 2025 (UTC)
Should affix categories be added transitively?
For context, see Module_talk:etymon#Bad_categorization.
Let's say there's a word XY that's derived from X- + -Y. We would categorize that word under the X- and the -Y categories. But if there's a word XYZ, formed from XY + Z, it would not be added to those categories.
To give a concrete example, both cleanliness and preattentively are not currently categorized under Category:English terms suffixed with -ly. It's not clear to me whether this is due to the limitations of our templates, or because the community has specifically decided to exclude them.
According to @Svartava, this is the status quo and most editors would oppose categorizing the words I listed above. But if that's the case, maybe we should document that somewhere, since it's a pretty far-reaching decision that affects nearly every language we have. Ioaxxere (talk) 05:13, 17 March 2025 (UTC)
- NOTE: the categorization of preattentively by -ly suffix is not the case I had in mind and is not something I object to or find much problematic as it is also analyzable as preattentive + -ly. The cases like cleanliness by -ly, effortlessly by -less, evildoer by -er (assuming "evildo" isn't an attested verb) etc. are the ones I find problematic. Svārtava (tɕ) 05:32, 17 March 2025 (UTC)
- @Ioaxxere: what does “transitively” mean in this context? I’m guessing that preattentively is from pre- + attentively, so if an editor had indicated this as {{affix|en|pre-|attentively}} the entry wouldn’t be automatically put in “Category:English terms suffixed with -ly”. We’d have to add “({{affix|en|attentive|-ly}})” to effect this. Personally I don’t mind this. I’ve also wondered whether it’s correct to add a prefix or suffix category to an entry when the affixation didn’t occur in modern English. I’ve sometimes done so when it seems clear that an English-form affix was added to an earlier etymon, for example, when an English word is derived from a Greek word ending in -izein but appears in English suffixed with -ize. (I didn’t know that cleanliness has a -ly in it!) — Sgconlaw (talk) 05:26, 17 March 2025 (UTC)
- “transitively” here would be referring to the math/set theory sense. Svārtava (tɕ) 05:38, 17 March 2025 (UTC)
I’ve also wondered whether it’s correct to add a prefix or suffix category to an entry when the affixation didn’t occur in modern English.
- Regarding this point, I just want to point out that {{surf}} does add cats. I've personally taken that to mean that yes, words not derived in the language also take affix/compound cats if they are synchronically analyzable as such, which would mean that "cleanliness" should not have the "-ly" cat but "preattentively" should have the "pre-" cat. 🌙🐇 ⠀talk⠀ ⠀contribs⠀ 19:26, 20 March 2025 (UTC)
- I think only the affix forming the word should categorize the word. Previous affixes used to make the word should not be categorized. Vininn126 (talk) 05:28, 17 March 2025 (UTC)
- I'm against putting words like "cleanliness" in categories for terms suffixed with -ly. The suffix was added during the formation of "cleanly"; including all words derived from "cleanly" is unnecessary and bloats the category. I also agree that the status quo is not to include these kinds of further derived words.--Urszag (talk) 15:43, 17 March 2025 (UTC)
- Here is a tip that people here probably already know anyway, but here goes. One can slap a nocat parameter on things like {{surf|en|pancreatico-|duoden-|-ectomy|nocat=1}} to yank all the cat inclusions away and then add back just the initial and terminal ones via "[[Category:English terms prefixed with pancreatico-]]" and "[[Category:English terms suffixed with -ectomy]]". I have sometimes failed to cross all my t's on that aspect, but any misses can be fixed as we reencounter them. Quercus solaris (talk) 02:54, 18 March 2025 (UTC)
- I agree with Vininn126 and Urszag. Ultimateria (talk) 01:30, 19 March 2025 (UTC)
- So far I have taken ‘terms suffixed with X’ to mean ‘words created via suffixation with X’. By that interpretation, cleanliness is suffixed with -ness (cleanly + -ness > cleanliness) but not with -ly (cleanness + -ly > *cleannessly, not **clean-ly-ness). It seems you have taken it to mean ‘words containing X, which is a suffix’.
- I have no particular intuition on the matter of which might make for a more useful category.
- Incidentally, preattentively has to be preattentive + -ly, since the sense is ‘in a preattentive way’ and not *‘before attentively’ (pre- + attentively). Also worth noting that pre- does not seem to attach to adverbs (**pre-now, **pre-soon). I wonder whether there exists any case in English where two different orders of affixation are equally likely. Nicodene (talk) 07:48, 19 March 2025 (UTC)
- @Nicodene: By far the most interesting example of what you mentioned is at unseatable, where the different orders of affixation result in completely different senses (plus a bonus third sense!). Ioaxxere (talk) 13:57, 19 March 2025 (UTC)
- True. I suppose I should have specified ‘for a given sense’. This entry is split by etymology and sense the way I’d expect it to be. Nicodene (talk) 16:35, 19 March 2025 (UTC)
- I'd agree. It's technically two etymologies. Vininn126 (talk) 16:50, 19 March 2025 (UTC)
Audio pronunciations for historical languages
- For a relevant previous discussion, see User talk:Nicodene/2024#Special:Diff/78171703.
AFAICT, between 21:28 and 22:03 (UTC) on the 15th of February 2025 (but starting at least as early as the 12th), Theknightwho cleared out Category:Ancient Greek terms with audio pronunciation, Category:Latin terms with audio pronunciation, and Category:Old English terms with audio pronunciation by culling the audio pronunciations from their then-members' entries. Does the invoked justification, “No audios from non-native speakers.”, reflect stated policy and/or community consensus? And does that principle still apply even in the case of languages which have no native speakers and/or in the case of historical languages which, by dint of being historical, logically can't have native speakers? 0DF (talk) 15:31, 17 March 2025 (UTC)
- @0DF: Were recordings of Ecclesiastical Latin, which is still actively spoken, removed as well? But in the case of ancient languages, I agree with removing pronunciations in cases where editors have nothing to go off of besides academic reconstructions, because then the audio contains no more information than the IPA on its own (admittedly, it is more accessible to casual readers, but I would say that it's not Wiktionary's place to make IPA more "fun"). I don't think we should remove audios from conlangs like Toki Pona, because the pronunciation used by active speakers is the canonical pronunciation, even though they're not native. Ioaxxere (talk) 17:04, 17 March 2025 (UTC)
- @Ioaxxere: Of those Latin audios removed by Theknightwho during 21:28–22:03, 15 February 2025 (UTC), those marked as Ecclesiastical were abecedarium, absens, Achaicus, audio, crypton, Cupido, cupido, delphinus, diabolus, Euboea, Kyrie eleison, and neon. 0DF (talk) 18:18, 17 March 2025 (UTC)
- @0DF I am open to the idea of having Ecclesiastical Latin audios, by virtue of the fact it has never had native speakers, but I completely agree with @Nicodene's previous assessment that audios of historical languages that were spoken natively are essentially just conlanging. Theknightwho (talk) 19:49, 17 March 2025 (UTC)
- Second this. Vininn126 (talk) 20:25, 17 March 2025 (UTC)
- For an extinct language such as Latin, as long as a given recording is clearly labelled as showing a modern convention of pronunciation - and actually follows that convention - I am fairly neutral about it. Nicodene (talk) 20:32, 17 March 2025 (UTC)
"synecdoche" vs "metonym"
We have both Category:English synecdoches and Category:English metonyms, and at the moment there doesn't seem to be any clear distinction between them on Wiktionary. As I understand it, a metonym refers to something by a name closely related to it, and a synecdoche is a specific type of metonym that refers to something by the name of part of it, but the distinction isn't very clear - Wikipedia says that "The White House" meaning the US government is synecdoche, when I'd consider it just a metonym (since the White House building is not literally part of the US government), for example. Certainly, the metonyms category is packed with terms I'd consider synecdoches (butt like "get your butt over here", face like "the familiar faces", safe pair of hands), and a couple in the synecdoche category seem more like general metonyms (desktop meaning computer wallpaper, tribe meaning tribal nation). No matter what, these categories need some clean up (not least to remove things that are metaphors, not metonyms), but my more general question is: is there any value to maintaining this poorly-defined distinction, or should we treat synecdoche as a subset or synonym of metonym?
If we do keep them, can we define some kind of border between them? For example, are metonyms related to clothing or tools (boots on the ground, suit, hired gun, virtuoso violin) synecdoches? Are the many government seats like White House, 10 Downing Street, Élysée synecdoches?
I notice that at tongue we have both a metonymic sense (a language) and a synecdochical sense (a speaker of a language), but apart from that, there's nowhere AFAICT where we use both metonym and synecdoche, so a merger of the terms would be minimally disruptive to the current organisation. Smurrayinchester (talk) 17:15, 17 March 2025 (UTC)
- I often find it hard mental work to remember and maintain the distinction. (This is a problem with other "rhetorical" terms, too.) Nevertheless, I think we should try. Lanham's A Handlist of Rhetorical Terms (1991) has:
- four types of metonymy: cause for effect, effect for cause; proper name for an associated quality, a quality for an associated proper name
- four types of synecdoche: substitution of part for whole, of whole for part, genus (hypernym?) for species (really hyponym?), species for genus.
- I note that White House doesn't exactly fit into any of the eight types that Lanham has.
- I further note that Lanham does not think these are synonyms or that one is a kind of the other.
- Lanham has 'see also's for these referring to each other and to other terms (like metaphor). DCDuring (talk) 20:37, 17 March 2025 (UTC)
- Thanks. Lanham's synecdoche definition feels pretty good, but the metonymy one seems rather different from regular usage. As you say, White House doesn't seem to fit there, but a lot of common terms would be metonyms - growth as in "a benign growth" or "new growth", for instance (using the cause to name the effect). Basically every sense where our definition starts "The result of..." or "An instance of..." Smurrayinchester (talk) 06:13, 18 March 2025 (UTC)
- Both MWCD and AHD show a theme where their def for synecdoche invokes the specific themes (e.g., "the part for the whole, the whole for a part, the specific for the general, the general for the specific, or the material for the thing made from it") whereas their def for metonymy is shorter and invokes the theme of "closely associated". To my reading, this means that the two words overlap substantially in denotation but that metonymy has an additional get-out-of-jail-free card that synecdoche doesn't have: the added category of "and the rest", that is, "closely associated [in some other way not strictly meronymous, holonymous, hypernymous, or hyponymous]", which is to say, misc, &c, or handwave etc. Lol. I readily admit that it would be hard to get the rest of the world (outside of MWCD and AHD) to uphold this nuance of differentiation. As for whether Wiktionary should [do], one option could be to suffer the labels to say "metonymy" for any of these relations (the strict ones or the loosey-goosey one) and then explain at the glossary (for when the user clicks on the label) that these two words largely overlap in denotation and that when you see "metonymy" in a Wiktionary label you can be sure that "synecdoche" probably applies as well. Quercus solaris (talk) 03:14, 18 March 2025 (UTC)
- We have to follow usage in our definitions in principal namespace, but not in Wiktionary:Glossary and how we use the words. Even though we have some freedom, we should not do violence to common usage and we should remind users of the overlap and/or confusion. DCDuring (talk) 04:51, 18 March 2025 (UTC)
- Oh yes, "the material for the thing made from it". That's another type of synecdoche/metonym that as far as I can tell isn't included in either Wiktionary category currently. Would putting a (synecdochically) label on, say, the golf club senses at wood and iron or the food container senses at tin be too pedantic and confusing for readers? Smurrayinchester (talk) 06:20, 18 March 2025 (UTC)
- That's an excellent question. I suppose it prompts me to ask myself: are metonymy and synecdoche so pervasive in natural language that such a {{label}} would feel too repetitive if one were to express it everywhere that it truly applies? Hmm. I'll have to ponder that one. My first reaction is "no, it's fine," but is it though? How many such cases have I never even thought to label but they truly could have that label? Quercus solaris (talk) 06:41, 18 March 2025 (UTC)
- I'm skeptical about the value of displayed labels for synecdoche and metonymy. If the meaning confuses some of us, the terms probably shouldn't be used in principal namespace, except in etymology and usage notes. Metaphor, synecdoche, and metonymy do seem to account for a very large share of polysemy. The use of figuratively seems more than sufficient, often unnecessary. 13:43, 18 March 2025 (UTC)
Standardizing Ulch lemmas to Cyrillic
Unlike e.g. Oroqen, which is spoken in China and thus has not been blessed with a pleasant writing tradition, Ulch is spoken in Russia and has an official Cyrillic script, much like the neighboring Nanai. However, currently, many entries are lemmatized under bespoke romanizations from scholarly works. These entries should be moved to Cyrillic titles; I have plans to create a translit module soon, which should help smooth the transition. A good first step in the meantime would be to populate Ulch terms by script, which I have gone ahead and created but is currently empty. 🌙🐇 ⠀talk⠀ ⠀contribs⠀ 07:56, 18 March 2025 (UTC)
- @Lunabunn: I agree with lemmatising at Cyrillic titles, but it's probably a good idea to retain Romanisation entries for the terms à la Gothic and Japanese. 0DF (talk) 12:57, 18 March 2025 (UTC)
- @0DF I'm not fundamentally opposed to romanised redirects, but in this instance I find the premise falls short. I don't think the current romanised pages should be kept, because they use a bespoke romanisation system(s); but if we were to use the "standard" romanisation system from pre-Cyrillic, a la what MOD:gld-translit currently uses, we find that it uses characters like ŋ, ə, and ʒ that make it pretty useless for search anyways. So in this case, especially considering the shortage of Tungusic editors able to perform cleanup, what benefit would these redirects bring? 🌙🐇 ⠀talk⠀ ⠀contribs⠀ 20:42, 18 March 2025 (UTC)
- @Lunabunn: IMO all those Latin-script forms should be moved to the Cyrillic. We don't have Latin Russian either, even though it is very frequent in historical linguistic works. Thadh (talk) 21:17, 18 March 2025 (UTC)
- Yup, agreed. 🌙🐇 ⠀talk⠀ ⠀contribs⠀ 21:35, 18 March 2025 (UTC)
Amharic Entries requiring word separator <፡>
Wiktionary:Amharic entry guidelines puts forward a requirement to use word separators (i.e. ፡) between words. The usage is outdated, and seldom used in modern written Amharic. As an example, even the Ethiopian Government's own website does not make use of word separators, and just uses whitespace.
Barring objection from other editors, I'd like to propose making the word separator optional; it could be used if the editor desires, but would not be required as it is under the existing guidelines. CatchingCots (talk) 15:51, 19 March 2025 (UTC)
- I think what we definitely should not do is allow editors to choose one or the other of their own accord - this will only bring inconsistency.
- My interpretation of the separators, based on my very small exposure to Amharic in the wild, was that their omission on the internet or in spontaneous writing is more an 'informal' thing, much like writing <е> for <ё> in Russian. Which is why I thought it best to include it. If that is not the case, and speakers consciously choose not to write it and consider writing it outdated or even wrong, then we should definitely change it, but in that case we should go all in. Thadh (talk) 16:39, 19 March 2025 (UTC)
- Not opposed to striking out word separators altogether; if there's no opposition I can go through the old entries and take out any that I might find in there.
- To answer your question about its omission, it's actually not limited to just informal communications & spontaneous writings; in essentially all spheres of written Amharic, it hasn't been used virtually at all since the advent of computer typesetting. Most published books & newspapers these days do not use them, and reputable news organizations like BBC do not make use of them either. Of the 2 options, I'd definitely lean towards not using the word separators at all. CatchingCots (talk) 17:28, 19 March 2025 (UTC)
- FWIW, I find Amharic word boundaries are more immediately apparent when they're marked with ⟨፡⟩ rather than just whitespace. For that reason of immediate visual apprehensibility, I would favour retaining the guideline favouring the use of ⟨፡⟩. 0DF (talk) 14:47, 25 March 2025 (UTC)
dew-Jew merger
In some English entries we give pronunciations for mergers, e.g. the "pour-poor merger" in poor or tour.
Can we introduce the "dew-Jew merger" for words like dew, due, tune or educate? 46.112.116.223 19:29, 19 March 2025 (UTC)
- Wiktionary only invokes the ones that have establishment in linguistics literature (e.g., cot–caught merger, Mary–marry–merry merger). So it wouldn't introduce a novel one. Regarding yod-coalescence and yod-dropping, those terms would be the ones that it would use. Quercus solaris (talk) 21:28, 19 March 2025 (UTC)
Request for AWB Whitelist
May I request to be added to the AWB Whitelist, mainly to perform various template cleanup and migration tasks? Thanks. 🌙🐇 ⠀talk⠀ ⠀contribs⠀ 19:20, 20 March 2025 (UTC)
- @Lunabunn You have been added. Let me know if you have any issues. Benwing2 (talk) 07:39, 21 March 2025 (UTC)
- Thanks! Have confirmed access to JWB. 🌙🐇 ⠀talk⠀ ⠀contribs⠀ 08:11, 22 March 2025 (UTC)
Thoughts about how to categorize Chinese districts, prefecture-level cities and the like
@Geographyinitiative Can you help me figure out the best way to categorize Chinese prefecture-level cities and districts and such? Normally, a "district" of a "city" is a neighborhood, and so that's how Module:place categorizes them. However, this appears to make zero sense whatsoever in China, where for example the Pudong District of Shanghai covers 467.3 sq miles (1,210.4 sq km) and houses 5.7 million people. I need help trying to make sense of the best way to categorize things like "prefecture-level cities", "county-level cities", "subprovincial cities", "subprefectural cities", "districts", "subdistricts", etc. Are subdistricts small enough to count as neighborhoods or should we just categorize them as "subdistricts"? I get that a prefecture-level city is something like a prefecture with a central city, all rolled up in one, but the system of cities within cities is thoroughly confusing. w:Prefecture-level_divisions_of_China seems to indicate that there are 339 prefecture-level divisions in China, most of which are "cities" but some are autonomous prefectures. I guess we could treat prefecture-level cities and autonomous prefectures the same for categorization; it would be a bit awkward to list all 339 of them in Module:place/shared-data and categorize e.g. Category:en:Districts in Ma'anshan, Anhui, China, but maybe we could pick the 50 or 100 or so biggest ones (by population or whatever) and categorize districts and such under them, while categorizing the remainder of the districts at the province level. This is similar to how we categorize neighborhoods of major cities under the city e.g. Category:en:Neighborhoods of San Diego, but neighborhoods of smaller cities go into e.g. Category:Neighborhoods in California, USA. It depends on how many districts and subdistricts we have entered into Wiktionary; I suppose there are a lot. Thoughts? Benwing2 (talk) 06:54, 22 March 2025 (UTC)
- (1) Yes, I endorse making categories for the top 50 to 100 (or fewer?) biggest prefecture-level divisions and leaving the rest in province-level categories. One example might be: Category:en:Districts in Wuhan, Hubei, China, all the English language names of the districts have entries on Wiktionary; some are somewhat cited.
- (2) "Are subdistricts small enough to count as neighborhoods or should we just categorize them as "subdistricts"?" No, even subdistricts are likely not small enough in the cases I know of. I understand that a 街道 (subdistrict) can have a hundred thousand people or multiple hundreds with numerous residential communities- see Guanshan Subdistrict, Luonan Subdistrict and Shizishan Subdistrict, Wuhan. See also: Category:Subdistricts of the People's Republic of China. An urban "neighborhood" equivalent would be either a 社区 (residential community), like the rural 村 (cūn) (village), which is on the fifth level of administrative divisions, or areas smaller than residential communities. Here is an example of a somewhat cited entry for a residential community called 'Jianqiao Chuntian' or literally 'Cambridge Spring' (there are universities in this area): Talk:劍橋春天. See also Talk:陽光. But see also Tiantongyuan, a huge 'community'.
- (3) As for "subprovincial cities" and "subprefectural cities", I understand that subprovincial cities are always also prefecture-level cities simultaneously, however, I have not fully explored the issue. As for 'cities within cities', the key is: is the administrative division on the 2nd level, 3rd level, 4th level or 5th level of administrative divisions? If you can imagine that a prefecture can include a city, then can you imagine that a prefecture-level city (2nd level) can include a county-level city (3rd level)? --Geographyinitiative (talk) 08:58, 22 March 2025 (UTC)
- Each province has on average around 100-ish county-level divisions, including districts, county-level cities and counties and other miscellaneous divisions. I agree with GI that there should only be additional categories for the larger prefecture-level divisions, but that way the provincial-level categories for counties might still be too large. (and perhaps districts, e.g. Guangdong has 65 districts, even after removing the 11 in Guangzhou, 9 in Shenzhen, 6 in Shantou, 5 in Foshan, you'll still get 34 districts in one category)
- Note that "districts" can also be on the sub-county level historically (there are still a few such "districts" currently – see for example sense 3 of 南山); we might need additional categories for them as well.
- "Subdistricts" (街道) are on the same level as "towns" (鎮) or "townships" (鄉); "neighbourhoods" (社區) and "villages" (村) is what GI has said. I don't think there will be that many such entries for the time being, so maybe we can categorise these as "subdistricts", "towns", "townships", "neighbourhoods", "villages" etc. under provinces or selected prefecture-level cities.
- "Subprovincial cities" are basically prefecture-level cities with special status. We can just ignore them for categorisation purposes. There are no official "subprefectural cities".
- – wpi (talk) 10:03, 22 March 2025 (UTC)
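The categorization rule discussed above (dedicated district categories only for the largest prefecture-level divisions, province-level categories for the rest) can be sketched as follows. This is a rough illustration only, not how Module:place actually works: the function names, the province-level category name, and the ranking figures are all assumptions made up for the example.

```python
def pick_major(rank_figures, n):
    """Return the n prefecture-level divisions with the highest figure."""
    return set(sorted(rank_figures, key=rank_figures.get, reverse=True)[:n])


def district_category(prefecture, province, major, lang="en"):
    """Choose a category for a district of the given prefecture-level division."""
    if prefecture in major:
        # Large division: districts get their own category.
        return f"Category:{lang}:Districts in {prefecture}, {province}, China"
    # Smaller division: fall back to a (hypothetical) province-level category.
    return f"Category:{lang}:Districts in {province}, China"


# Placeholder ranking figures, NOT real population data.
rank_figures = {"Wuhan": 3, "Guangzhou": 2, "Ma'anshan": 1}
major = pick_major(rank_figures, n=2)

print(district_category("Wuhan", "Hubei", major))
print(district_category("Ma'anshan", "Anhui", major))
```

Keeping the cutoff `n` as a parameter leaves open whether the threshold ends up at 50, 100, or some smaller number, since that point is still under discussion.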
pinyin entries link to hanzi which don't acknowledge them
I keep coming across pinyin entries that say they are the Hanyu Pinyin reading of some hanzi which, however, doesn't acknowledge that pinyin at all and only admits having some other pronunciation and pinyin instead. For example, hǎ says it's a "Hanyu Pinyin reading of 奛", but 奛 gives its Hanyu Pinyin as huǎng instead and makes no mention of hǎ. Were our entries/modules updated to give different pinyin for those hanzi sometime after the initial pinyin entries were created (all the way back in 2006, in this case)? Or are the hanzi entries incomplete, can 奛 in fact be hǎ and the entry is just incomplete? Or what? Does someone need to run a bot to "refresh" our pinyin entries, like has been done periodically for anagrams? - -sche (discuss) 22:32, 23 March 2025 (UTC)
- @-sche: My understanding is that the earlier entries were often created automatically based on Unihan data, which used to contain many errors; a lot of these were corrected on the database, but our entries were never updated accordingly. Indeed this appears to be the case for 奛, where it previously had three readings hǎ, tǎi, xiǎng according to zi.tools.
- There might be a need for a bot to update the pinyin entries according to the Unihan database, but the implementation probably needs some proper thought in order to avoid removing useful material that has been added over the past years. – wpi (talk) 16:03, 25 March 2025 (UTC)
- @Wpi: Thanks for the explanation! What about asking someone with a bot to update the pinyin entries based not on Unihan data, but on Wiktionary's own hanzi entries (so we're only saying "X is a pinyin reading of 丫" in situations where our entry on 丫 acknowledges X)? Or are there still lots of correct pinyin readings in Unihan which are lacking from Wiktionary entries? In that case, does anyone track those somewhere? Maybe someone could compare
- "cases where a certain hanzi entry on Wiktionary gives a certain Hanyu pinyin reading",
- "cases where a certain hanzi is given a certain Hanyu pinyin reading by Unihan", and
- "cases where a certain Hanyu pinyin entry on Wiktionary says it's a reading of a certain hanzi",
- and then make lists of the disconnects, i.e.
- "cases where a certain Hanyu pinyin entry on Wiktionary says it's a reading of a certain hanzi, but neither Wiktionary's hanzi entry nor Unihan acknowledges that pinyin as a reading of that hanzi" and we probably need to update our pinyin entry;
- "cases where Unihan gives a certain pinyin reading but Wiktionary doesn't" and someone knowledgeable needs to evaluate whether Unihan is wrong or Wiktionary is missing something;
- "cases where Wiktionary's hanzi gives a certain pinyin reading but Unihan doesn't" and, if our hanzi entry is correct, we might amicably let Unihan know they're missing something;
- etc.
- ? - -sche (discuss) 18:48, 25 March 2025 (UTC)
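The three-way comparison proposed above can be sketched with plain set operations. This is a minimal illustration, assuming each source has already been reduced to a set of (hanzi, pinyin) pairs; the function name, the dictionary keys, and the sample data (including the Unihan value) are hypothetical, and nothing here reads real Wiktionary or Unihan data.

```python
def find_disconnects(wikt_hanzi, unihan, wikt_pinyin):
    """Compare three sets of (hanzi, pinyin) pairs and list the mismatches.

    wikt_hanzi  -- readings given by Wiktionary's hanzi entries
    unihan      -- readings given by Unihan
    wikt_pinyin -- readings claimed by Wiktionary's pinyin entries
    """
    return {
        # Pinyin entry claims a reading that neither source acknowledges:
        # the pinyin entry probably needs updating.
        "update_pinyin_entry": wikt_pinyin - wikt_hanzi - unihan,
        # Unihan has it but the hanzi entry does not: needs human review.
        "review_unihan_only": unihan - wikt_hanzi,
        # The hanzi entry has it but Unihan does not: possibly report upstream.
        "report_to_unihan": wikt_hanzi - unihan,
    }


# Hypothetical sample data modelled on the example in this thread.
wikt_hanzi = {("奛", "huǎng")}
unihan = {("奛", "huǎng")}  # assumed value, for illustration only
wikt_pinyin = {("奛", "hǎ"), ("奛", "huǎng")}

for label, pairs in find_disconnects(wikt_hanzi, unihan, wikt_pinyin).items():
    print(label, sorted(pairs))
```

A bot built on something like this would presumably only generate lists for human review rather than edit entries directly, which fits the concern about not removing useful material.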
Having template editor would make it much easier for me to go through Special:WhatLinksHere/Wiktionary:Tracking/debug/track/invalid key and clean it up, since I currently can't edit (and thus can't even preview changes with) most of the modules of origin. - saph ^_^⠀talk⠀ 04:15, 24 March 2025 (UTC)
- Nominated at WT:WL. Svārtava (tɕ) 04:43, 24 March 2025 (UTC)
Purplebackpack89
I am going to propose an indefinite or long-term block for Purplebackpack89 (talk • contribs). They have been here for a long time and have repeatedly been found to have conduct and civility issues.
- They also have a tendency to create drama about things and overall lead to a toxic atmosphere being developed on the site. The essay Wikipedia:Disruptive_editing#Failure_or_refusal_to_"get_the_point" seems to apply well to them as they do not acknowledge their own faults and errors.
- I will link the previous discussions Wiktionary:Beer_parlour/2024/June#User:Purplebackpack89, Wiktionary:Beer_parlour/2024/July#Block_of_User:Purplebackpack89_by_User:Theknightwho (where finally a consensus to block them emerged but the time span was unspecified) and also mention that they have repeatedly been blocked on here, English Wikipedia and Simple Wikipedia (where they were finally given a permablock and their appeals were rejected).
- In the past, they may also have played a role in driving away productive contributors such as Mglovesfun and Equinox from the site, which is extremely obstructive towards the goal of dictionary-building.
They were last blocked for 3 months and the block ended on 30 October 2024 but it is apparent that the issues are still relevant and they have not learned anything from the block. User_talk:BD2412#Need_an_edit_hidden.
- Very recently, at User_talk:BD2412#Interaction_ban?, they start making the ridiculous claim that they thought there was an "interaction ban" in place between them and Theknightwho despite there clearly being no consensus on the application of "interaction ban" in the above linked BP discussions.
- They start attacking Theknightwho, mindlessly claiming that TKW labelled them a "vandal" in this deletion summary, while in reality the word "vandal" appeared due to it being a part of the page's content before its deletion. This is consistent with their usual tendency to cause drama.
- They also seem to like indulging in Wikipedia-style wikilawyering, e.g. saying "As an admin, you need to follow all policies and guidelines. That means assuming good faith in me whether you want to or not", which doesn't strike me as a very good comment, as "assume good faith" is not (and can never be) a policy and there is no compulsion (contrary to their comment) to assume good faith if there are sufficient details that suggest otherwise.
- After Benwing deleted some categories they had created, they write a dramatic bureaucracy-inclined message at User_talk:Benwing2#Pasadena_categories: "I consider your depopulation, deletion and creation-protection in error and have started an RFDO to have the categories restored", instead of asking / clarifying the reason for deletion from the deleting admin, which is generally a more collaborative and good-faith way of doing things.
- They then start making claims of "OUTing" and make a (now rejected) request of hiding that edit, which is another example of overdramatizing things along with unnecessarily pinging admins there instead of trying to be cooperative. This shows that they themselves do not "assume good faith" even from very well-established and respected editors.
This, in all, showcases the recurring pattern of toxic behaviour and unacceptable conduct, and this is not the first time either. Svārtava (tɕ) 09:44, 25 March 2025 (UTC)
- I do not particularly trust this user and all interactions I have had with them have been confrontational and seem to assume bad faith on everyone else's part. Vininn126 (talk) 11:29, 25 March 2025 (UTC)
- Note: I have since reconsidered and accepted the revdel request. It is best to proceed with caution in addressing personal privacy matters. bd2412 T 19:43, 25 March 2025 (UTC)
- For the record, I wouldn't consider the request accepted as only the edit summary of the edit was hidden, and the edit summary contained nothing more than the name of the heading of the discussion in which the comment was posted. Svārtava (tɕ) 19:48, 25 March 2025 (UTC)
- The request was to delete the edit summary. I pointed out the discrepancy with the text still existing myself, but I can see the argument for the words on the talk page (which will presumably eventually be archived somewhere) being more ephemeral than an edit summary in the page history. bd2412 T 02:35, 26 March 2025 (UTC)
- I'm still completely unsure what that hiding achieved - if anything it would only attract more attention.
- I'll also bring forward some points by Equinox that are relevant and stand out:
- Special:Diff/84378714/84378733:
If your issue was really with your privacy, you'd have long ago asked Wikipedia to delete other details (what college you attend etc.) from your Wikipedia user page hsitory. But we know it's not about that. It's petty revenge as usual on whoever disagrees with you.
- Special:Diff/84383328/84386060:
you linked your full name YouTube channel from your old user page on Wikipedia, but apparently it's OUTING when anybody sneezes near you.
- Also, PBP89 seemingly has concerns regarding privacy but they can apparently baselessly request checkuser investigation - a real and unwarranted privacy intrusion without any evidence of abuse (the claims about "harassing" are just nonsense, and even then completely unrelated to checkuser investigation). Svārtava (tɕ) 21:19, 26 March 2025 (UTC)
- Suggest speedy close This is a joke, right? I can't think of more flimsy evidence for a block. Among other things, asking admins to assume good faith is unacceptable now? It's unacceptable to start an RFDO after a category has been speedy deleted? But it IS acceptable for an admin to shout out where he thinks you're from even though you've NEVER associated yourself with that city publicly? Purplebackpack89 11:24, 25 March 2025 (UTC)
- Assuming good faith is only applicable if assumptions need to be made, such as when we don’t know whether an editor is acting in good or bad faith. We’ve known you for a long time now, so assumptions are no longer necessary. MuDavid 栘𩿠 (talk) 00:40, 26 March 2025 (UTC)
- I don't think it is fine for you to declare "speedy close" in bold; I am striking that part. As for the "evidence for a block", I can say that you are repeating those things you were blocked for among the things I listed above. It doesn't seem that you have correctly argued against my points. Svārtava (tɕ) 12:29, 25 March 2025 (UTC)
- That wasn't a close, that was a vote. I've reworded it. Please don't refactor other people's comments Purplebackpack89 12:32, 25 March 2025 (UTC)
- I won't examine the question of PBP89's conduct here. I tend to think he unnecessarily personalizes disputes. On the other hand I strongly object to Svārtava's selective framing of Equinox's history. While undoubtedly a skilled, productive editor, Equinox was also a habitual bully and provocateur. His conduct fostered a hostile atmosphere that drove away productive editors like me. I don't know why he chose to hang his hat after nearly two decades of contributing. But I found it curious that his farewell came just minutes after I called him out for a misogynistic comment he made about me on the Discord. The general temperature on Wiktionary has gone down noticeably since his departure. Sometimes a single problematic actor can have an outsize influence within a community. Remove the gravitational pull toward hostility and many will float back to gentler orbits. WordyAndNerdy (talk) 05:49, 26 March 2025 (UTC)
- "Sometimes a single problematic actor can have an outsize influence within a community." Yes! The "single problematic actor" in this case is PBP, and he's had a 10+ year history in inciting unnecessary, highly antagonizing fights. I would suggest we go forward with an indef ban. -- 𝘗𝘶𝘭𝘪𝘮𝘢𝘪𝘺𝘪(𝘵𝘢𝘭𝘬) 06:05, 26 March 2025 (UTC)
- From what I've seen, Purplebackpack89 has been aggressive with many other editors, so I don't think this can be described as a situation where there was only a single problematic actor. (I agree that Equinox also had problematic behavior.) It seems like a lot of Purplebackpack89's energy is put into these kinds of fights relative to more productive editing. A block seems like a good idea to me.--Urszag (talk) 06:32, 26 March 2025 (UTC)
- Equinox was a problem admin who pulled other contributors into his orbit. The systemic tolerance of his actions enabled similar conduct in others, I think. With his influence in decline, though, I think others may have levelled off. Indef bans weren't considered as a first-line remedy for Equinox or any other problem admin. I think PBP89 deserves the same grace: a chance to turn off battle mode. WordyAndNerdy (talk) 06:52, 26 March 2025 (UTC)
- He has already been given lots of chances, just see his block log. Svārtava (tɕ) 07:20, 26 March 2025 (UTC)
- If Equinox is going to return for the purpose of harassing me, then he, too, needs to be indeffed. His absence is the only thing that's allowed me to sporadically contribute over the last few months. WordyAndNerdy (talk) 10:55, 26 March 2025 (UTC)
- I am in the midst of a family emergency so I can't make a super lengthy post. But in my opinion, rollback and autopatrol rights only belong in the hands of trusted users (rollback rights because they allow you to quickly undo lots of changes and autopatrol rights because many sensitive modules and templates are protected at the autopatroller level), and I don't trust PBP89. Fundamentally:
- PBP89 per statements they've made doesn't believe in Wiktionary's CFI, and what's worse, they act on their beliefs by unilaterally creating clearly non-CFI-worthy entries and then getting upset and bureaucratic when those entries are speedily deleted.
- PBP89 also doesn't believe in the category tree/{{auto cat}} system for handling categories but believes all categories should be manually curated (an impossible task given Wiktionary organization), and again, they act on it by creating manual categories that either don't belong at all or should be created in the category tree system. This happened recently with the Pasadena-related categories (see my talk page); I deleted the categories, PBP re-created them, I deleted them again and create-protected them so they wouldn't be re-created again, and PBP's response was to immediately go bureaucratic (a much better response would have been to have a discussion with me *first* about whether these pages belong and why I deleted them, and only go to WT:RFDO if a satisfactory outcome could not be achieved that way).
- PBP89 demands that admins and bureaucrats "assume good faith" on their part but clearly does not assume good faith on the part of admins and bureaucrats; this is obvious from their repeated assertions that there is a "cabal" and "old guard" that is secretly out to get PBP89.
- In general PBP89 personalizes all disagreements and turns them into claims that they are being unfairly targeted and persecuted regardless of the merits or lack thereof of their point of view.
- I have seen no improvement in this behavior since PBP's 3-month block expired, so they obviously didn't learn anything from the block.
- In general, dealing with anything that PBP89 has touched is exhausting and distracts from the main goal of building a better Wiktionary because of their tendencies to (a) personalize all disputes through bad-faith allegations of persecution; (b) Wikilawyer before having a more informal discussion (something that is highly discouraged at Wiktionary). More than one user has expressed to me privately that they've avoided RFV'ing, RFD'ing or speedy-deleting a bad PBP89-created entry due to the inevitable drama that will ensue. This both diminishes the quality of the dictionary and leads to a highly toxic environment.
- As for removal of rollback and autopatrol rights due to a Discord discussion among admins, although I strongly agree with the removal of these rights, I also agree with others that the optics of this are quite bad and there should have been a BP post prior to removal of the rights.
- I would support an indefinite ban at this point for the reasons I enumerated above: PBP has shown no improvement in their behavior after the 3-month ban, is creating a toxic environment by their continued bad faith allegations and excessive Wikilawyering, and continues to not respect the basic tenets of the dictionary, meaning their edits are overall leading to an average worsening of the quality of the dictionary, and correcting the problematic edits on the part of admins is sucking time away from other pressing issues.
- NOTE: I may not be able to respond to any responses to this post for 12-24 hours due to the ongoing RL issues. Benwing2 (talk) 06:25, 26 March 2025 (UTC)
- Benny, those are ridiculous reasons for wanting to block somebody. Indeffing me for the reasons you've outlined are tantamount to saying:
- BOLD doesn't exist
- Only a tiny cadre of people who code are allowed to create categories
- No difference of opinion on CFI is permissible at all
- If Benny deletes something, that's final and it can't be discussed at all
- Any discussion is WikiLawyering, we should just give Benny his way
- You and Svārtava are allowed to make unsubstantiated assertions about private conversations about me
- People like you and Knight can harass me without repercussions, and if I clap back, I get indeffed
- Lunacy. Utter lunacy. Purplebackpack89 11:52, 26 March 2025 (UTC)
- All of this is not really part of the argument and an exaggeration of what's happening. And a good demonstration of what I said earlier: much of what you argue does not assume good faith. Vininn126 (talk) 11:54, 26 March 2025 (UTC)
- Most of this response written by PBP, apart from being rude, is quite low-effort and typical of their "I didn't hear that" behaviour. Svārtava (tɕ) 20:46, 26 March 2025 (UTC)
Support.
- Edit: a permablock, to be clear. Nicodene (talk) 07:48, 26 March 2025 (UTC)
- What I see plainly here is there are two sorts of parties, one that invariably deflects to other irrelevant (and/or ancient) drama and "I didn't hear that", and the rest who are beyond tired of that shtick. The Simple Wikipedia talk page going 15 years back is eye-opening and gives little confidence in reforming of seriously problematic ways. Hftf (talk) 17:21, 26 March 2025 (UTC)
Support a block. PUC – 17:29, 26 March 2025 (UTC)
Support blocking. Purplebackpack89 seems to be a run-of-the-mill jerk rather than anything more serious, the fact this has gone on for 11 years is reason for more than a slap on the wrist. If the user started editing as a teenager, it surprises me that they still get into this amount of trouble as an adult — it seems more than a third of their edits is in discussion pages, so we're not losing a big contributor. Also, the amount of question marks in their messages astounds me. Polomo47 (talk) 18:30, 26 March 2025 (UTC)
- I don't see how "it seems more than a third of their edits is in discussion pages" means that someone isn't "a big contributor". Talk page discussions are very useful for collaborating and if someone made a billion edits total and a third of them were in discussion pages, that would still be a "big contributor" as far as I'm concerned. This seems like a distraction to me. Someone should be blocked not based on volume of edits but quality and if that person is actively harming the project. Someone who edits a few times a year and provides genuine value is welcomed. —Justin (koavf)❤T☮C☺M☯ 18:34, 26 March 2025 (UTC)
- I brought up that statistic because it made it clearer to me how this user really is prone to conflict. Having more edits in discussion pages is odd, but not inherently an issue. Looking through a few pages of their contributions to the Wiktionary: mainspace, however, more than half of these are in topics about themselves or topics of their creation criticizing other users.
- I do not want to welcome an editor with such a large part of their contributions consisting of drama. Polomo47 (talk) 18:50, 26 March 2025 (UTC)
- It's not damning in and of itself, but it can certainly be indicative. Especially when those talkpage edits are filled with conflict, i.e. a large portion of this user's edits are not dedicated to the betterment of the project, but to conflict. Vininn126 (talk) 19:00, 26 March 2025 (UTC)
Support indef block. Mainspace edits seem good, but too prone to unproductive conflict with seemingly no will or ability to de-escalate. JeffDoozan (talk) 20:10, 26 March 2025 (UTC)
Support, though not indef. I expected one could just ignore him and continue to delete his contributions for previously stated reasons, and was concerned with removing his access to the Wikipedia library, though his contributions evince little scholarly depth and the reason fall away if he is blocked anyhow.
- The observations of other editors are convincing; I wasn’t previously aware of the “I don’t hear you profile”, which he engrossed to the point of lying when declaring, on the present page, something he knows to have taken place "unsubstantiated assertions", and similarly that I "STILL haven't provided a single diff of a bad edit […]" on User_talk:BD2412#Interaction_ban?, when I mentioned a specific instance of me calling him out he needs remembers, now archived on the talk page of Sticks Nix Hick Pix, a page he created in response to the treatment of DONT TREAD ON ME.
- And the shares of his edits to various namespaces have been summarized again, with the striking remark about the amount of question marks, which are unfit to express “difference of opinion”, because if you have an opinion you naturally use assertive sentences to an extent others do.
- We attach value on not banning someone for difference of opinion, or expression of it; there I have and others have outlined patterns that are designed to lead to oneself being banned, rather than promoting any opinion; a site where Purplebackpack89 were to reign would be a site whereof he would be the sole reader. Fay Freak (talk) 21:53, 26 March 2025 (UTC)
Oppose - 1. This is not a fair or proportionate response to PBP's missteps. It's not based on the severity of PBP's conduct, but on the severity of sentiment against him. The collective decision to ignore Equinox's prolonged harassment of me (including in this very thread) makes that abundantly clear. Wiktionary thinks it can pick and choose who is subject to policy enforcement. The whipping boy gets the stick while the favoured few receive infinite carrots and free rein to be as toxic and combative as they please. 2. PBP was recently subject to a systematic failure of due process. His rollback rights were stripped out of process as a result of seemingly grudge-driven backchannel collusion. Let anyone with grievances (however valid or invalid) air them publicly and in a fully transparent manner. Until then I feel like PBP's recent reactivity is a natural human response to the provocation of having his rollback rights stripped out of process. Wiktionary has forgiven worse trespasses from its esteemed elite – up to and including nuking the main page in a tantrum. WordyAndNerdy (talk) 22:30, 26 March 2025 (UTC)
Support an indefinite block. In all honesty, I don't want to say this, but I've watched this cycle continue for years now and it just appears as though we're stuck in a loop. Purplebackpack89 has been allowed more freedom than most, yet things never truly change. It's always the same routine: causing strife, blaming others, and never once taking the time to reflect upon how they're contributing to the toxicity. It's draining. People are here to develop a dictionary, not to have to continually deal with drama. Somewhere we need to ask whether or not having someone present is helping or hurting the project, and in this case I think the response's pretty clear. — Fenakhay (حيطي · مساهماتي) 23:21, 26 March 2025 (UTC)
Support an indefinite block. Vininn126 (talk) 08:38, 27 March 2025 (UTC)
Currently a lot of sign language production templates are just for ASL, and because of this 𝡝𝪟𝤫𝪤 has had template redlinks for the past 4 years where they're definitely necessary; so I put together {{sign prod}} (which uses Module:sign prod) and it should work for every sign language once I've collected the text of all the production templates. (cc @JnpoJuwan, MedK1; I'm not sure of anyone else who has shown interest in sign languages) - saph ^_^⠀talk⠀ 10:48, 25 March 2025 (UTC)
- @Saph thank you for pinging me, love the work you started! I haven't noticed that, because I don't edit sign language entries as I don't speak any. Juwan (talk) 12:17, 25 March 2025 (UTC)
- I'm in the same boat unfortunately. Thanks a lot for the ping; great initiative fr. Here's praying you can find someone who actually knows any sign languages to help you out! MedK1 (talk) 12:44, 25 March 2025 (UTC)
- @Rhanese - pinging since you seem to be knowledgeable on sign languages. Do you have any feedback on the template? - saph ^_^⠀talk⠀ 13:52, 25 March 2025 (UTC)
- I think it's a good idea. Because the characteristics of signing are very different in some sign languages (from my experience), and if there is a template for any language, new users can contribute efficiently to sign language. RhAnese[discuss] 01:08, 27 March 2025 (UTC)
- I support the idea, certainly, of making infrastructure that works for all sign languages. I'm not sure offhand how to judge whether this particular implementation is lacking anything (I don't edit sign languages much on here, though I have limited familiarity with a few), but if no information is being lost then it seems like this could only be an improvement...? - -sche (discuss) 01:42, 26 March 2025 (UTC)
Restoration of Purplebackpack89's rollback rights
Since we're on the topic of me, yesterday Svartava took away my rollback rights, claiming he had consensus to do so but not demonstrating a public discussion on the matter. I believe that that was unnecessary and would like them restored. Purplebackpack89 11:24, 25 March 2025 (UTC)
- Rollback rights are granted at the discretion of admins (usually through WT:WL - but I can't even find the nomination in your case), and can be revoked just as easily. You don't have any activity related to patrolling, and there is a lack of trust. Svārtava (tɕ) 12:29, 25 March 2025 (UTC)
- Your edit summary did mention some discussion with other admins. Where was this discussion? —Justin (koavf)❤T☮C☺M☯ 13:38, 25 March 2025 (UTC)
- This was at WT:Discord. Svārtava (tɕ) 14:11, 25 March 2025 (UTC)
- I have no stake in this, but I should say the Discord is not an off-wiki decision-making body. IMO you should have brought this up on-wiki first. - saph ^_^⠀talk⠀ 14:50, 25 March 2025 (UTC)
- Rights can be taken away as easily as they are granted. Taking away rights does not necessarily require a formal discussion (unless it is contested by some other admin, for example). Consensus between a few admins is sufficient to take action. Svārtava (tɕ) 14:55, 25 March 2025 (UTC)
- @Svārtava: Yes, but even so, the expression of that consensus, however informal, should be accessible on-wiki, so I agree with Saph on this issue. 0DF (talk) 15:02, 25 March 2025 (UTC)
- "the expression of that consensus, however informal, should be accessible on-wiki": I don't think that is a hard requirement. There have been revocations performed in the past without that. In this case the rollback was given even without WT:WL nomination and approval by 1 admin each, i.e. out of process. Svārtava (tɕ) 15:09, 25 March 2025 (UTC)
- @Svārtava: IMO, it should be a hard requirement, even if it isn't one already. But since, as you write, "the rollback was given even without WT:WL nomination and approval by 1 admin each, i.e. out of process", you would have been well within your rights to revoke Purplebackpack89's rollback rights, citing the justification that they were granted without due process in the first place. I and, AFAICT, Saph are just making procedural points. 0DF (talk) 16:20, 25 March 2025 (UTC)
- FWIW, I had held rollback so long (longer than Svārtava has been editing, FWIW), that the procedures have likely changed since it was awarded to me. Saying that rollback wasn't awarded by 2025 procedures when it was awarded years ago is NOT a reason for revocation. Purplebackpack89 16:43, 25 March 2025 (UTC)
- The process for granting rights through WT:Whitelist has long been well established; you can see nominations for rollbackership like [3], [4]. I wouldn't really care about having "held" rollback so long when the total number of rollbacks is just around 10. Svārtava (tɕ) 17:52, 25 March 2025 (UTC)
- Did you discuss this with the awarding admin before removing, @Svartava? Purplebackpack89 18:23, 25 March 2025 (UTC)
- The awarding admins (Stephen for rollback and Metaknowledge for autopatrol) are not active currently. From how much I knew Metaknowledge, I do not think he would be siding with you in this matter as admins more lenient than him have also expressed concerns with you possessing these rights. Svārtava (tɕ) 18:50, 25 March 2025 (UTC)
- If such "lenient admins" really wanted my rights taken away, they'd either have done it themselves or expressed why publicly. They've done neither. Purplebackpack89 19:08, 25 March 2025 (UTC)
- Okay, but "any topics for which a decision may have wider consequences, such as to active Wiktionary editors outside this server or to all Wiktionary users, are strongly encouraged to be discussed on-wiki with the wider community before making any decisions. The purpose of the Discord server is to facilitate communication, not to act as an off-wiki decision-making body." If you are appealing to a Discord discussion, then it seems like you are directly contradicting the spirit and letter of the law, which is that it's an informal chat site that can sometimes make it easy to discuss things which can then be proposed to the community or which may help individuals with their editing, not a bypass to having an actual on-wiki discussion. —Justin (koavf)❤T☮C☺M☯ 19:11, 25 March 2025 (UTC)
- I'll reply to this with what I said above and below: minor rights (such as the ones granted by WT:WL) do not need explicit discussion for removal and can be revoked by the admin's discretion. In this case, it was brought up that PBP had autopatrol and rollback rights along with the concern that they potentially lack the trust to have them - a few others seconded it and I decided to act on it. Svārtava (tɕ) 19:36, 25 March 2025 (UTC)
- You're not listening to Justin...even if that was true, those admins should have done that PUBLICLY, not in the shadows of a private Discord chat. Purplebackpack89 19:38, 25 March 2025 (UTC)
- Why have you consistently employed the passive voice throughout this thread? "Consensus between a few admins is sufficient to take action." Which admins? How many? Where? "It was brought up that PBP had autopatrol and rollback rights." Words don't materialize out of nowhere. Who said this, and where? "The autopatroller right was removed as it was opined that PBP's edits need checking." Again, what specific concerns did these conspicuously unnamed, pronounless parties express, and where? And why did you accept their assertion that PBP89 having rollback rights is a matter of concern, when, as you yourself have pointed out, PBP89 seldom uses this tool and thus cannot be said to be actively misusing it?
- Surely, if multiple admins have (perhaps entirely valid) concerns about PBP89's wiki-conduct, it shouldn't be an issue for them to voice them openly, with their usernames – and the corresponding weight of authority – attached? Otherwise, this looks less like a straw-poll consensus on how to handle a problematic editor, and more like backroom collusion between a handful of people with an axe to grind. WordyAndNerdy (talk) 04:31, 26 March 2025 (UTC)
- Enjoying inventing any scenario where "how many admins" actually matters. Oh it was seven, not six! 2A00:23C5:FE1C:3701:FD1C:68CD:8DB8:EAC0 10:25, 26 March 2025 (UTC)
- Policy decisions need to be accessible to all Wiktionarians in the interest of transparency and due process. Consensus is only consensus if every user can review the process by which it emerged. This thread basically confirms personal suspicions I've had about the Discord for at least two years. WordyAndNerdy (talk) 01:08, 26 March 2025 (UTC)
- Back to IRC then! (At least it's not proprietary.) 2A00:23C5:FE1C:3701:FD1C:68CD:8DB8:EAC0 10:26, 26 March 2025 (UTC)
- I am in complete agreement with the proposition that, absent crazy circumstances such as threats of imminent violence, user rights should never be diminished without a transparent open onsite discussion. bd2412 T 21:46, 26 March 2025 (UTC)
Restoration of Purplebackpack89's autopatrol rights
I also request that my autopatrol right be restored. Reasons are the same as above; they were not removed transparently. Svartava hasn't provided any diffs of why it was necessary. Purplebackpack89 18:23, 25 March 2025 (UTC)
- I'm disinclined to unilaterally add or remove user rights and I believe Svartava that there was some discussion, but having discussion on Discord about anything meaningful is obviously a bad idea. I only think that user rights should be removed if someone abuses that right in particular or as part of some kind of larger issue like the user dying or otherwise explicitly retiring. Since it's not obvious that PB89 has actually abused rollback or autopatrol somehow, I think they should be restored and if there's consensus to do so here, I would be willing to add them back. —Justin (koavf)❤T☮C☺M☯ 18:30, 25 March 2025 (UTC)
- The autopatroller right was removed as it was opined that PBP's edits need checking and they cannot be blanketly trusted enough to have each edit marked as patrolled automatically. In any case, autopatroller status provides the owner very little extra rights anyways and is mostly designed to help administrators in patrolling-related tasks. Svārtava (tɕ) 18:53, 25 March 2025 (UTC)
- To be clear, I'm not suggesting that PBP89 should not have it removed in principle: it may well be the case that said user's edits should be reviewed. I'm just suggesting that either unilateral removal or discussion somewhere off-wiki with whomever is not optimal. I will generally defer to other admins and I generally trust your judgement, but this is not how things should be done, which should be an obvious and uncontroversial statement. Would you support me taking away user rights from someone who did not abuse them because I exchanged some emails with someone? —Justin (koavf)❤T☮C☺M☯ 19:03, 25 March 2025 (UTC)
- "Opined"? Where? Who? Publicly?
- Diffs, diffs! If there really was an abuse issue, you'd be able to provide diffs but you haven't. And since you haven't, it's hard to take this removal seriously. Purplebackpack89 19:10, 25 March 2025 (UTC)
- OK, I personally would trust the judgement of the revoking administrator in case of rights, even if they did it by themselves without (off-wiki or on-wiki) discussion, especially for rights like autopatroller which do not impact the user as much as they impact the patrolling admin. There were concerns about their edits, and I would see why, e.g. some of the categories they created (see Special:DeletedContributions/Purplebackpack89) had module error. Svārtava (tɕ) 19:25, 25 March 2025 (UTC)
- This should have been done through the whitelist so there would be an on-wiki record. Doing things based on offwiki discussions looks bad, regardless of the actual motivation. As for the removal of the right: if it hasn't been used for years, it's quite reasonable to remove it. Chuck Entz (talk) 14:25, 26 March 2025 (UTC)
- I will surely keep that in mind and I realize that it could have helped save some drama.
- Now that some admins have commented in the above section, it should hopefully address some of the concerns people had here. Svārtava (tɕ) 20:34, 26 March 2025 (UTC)
- I am very suspicious of this sort of request. Being an autopatroller makes little difference to what you can do here. There are a few page protections and filters so it's not completely useless. As for rollback, I don't have rollback rights and I can still revert changes. The log message is different depending on rollbacker status. So I ask myself, is this request merely seeking to collect privileges as an ego boost or is it a plan to evade oversight by admins? Vox Sciurorum (talk) 20:14, 26 March 2025 (UTC)
Is Wonderfool trusted to close discussions?
I just blocked his latest identified sockpuppet from Wiktionary project space after a slate of RfD and RfV closes, on the understanding that sockpuppets should not be closing discussions at all, as this requires an editor trusted by the community. I am bemused here. Where does the community stand with respect to this? bd2412 T 22:41, 25 March 2025 (UTC)
- Undisclosed alternate accounts shouldn't exist, let alone be participating in votes and similar community conversations. —Justin (koavf)❤T☮C☺M☯ 22:47, 25 March 2025 (UTC)
- We have, on numerous occasions, crossed out (as to invalidate) WF's votes. Don't see why we shouldn't do the same for vote closes or any other administrative action. 🌙🐇 ⠀talk⠀ ⠀contribs⠀ 22:49, 25 March 2025 (UTC)
- I don’t have particular objections against multiple accounts as long as they aren't used abusively, since WF alts are easily identifiable and do not seem to be attempts to hide. However, some of his closures are sometimes problematic. Svārtava (tɕ) 06:27, 26 March 2025 (UTC)
- How can you possibly know which accounts are sockpuppets of Wonderfool or not? By definition, the ones that are particularly sneaky are ones that you don't identify... The fact that several hundred have been more-or-less trivially identified in no way means that hundreds of others weren't. —Justin (koavf)❤T☮C☺M☯ 17:17, 26 March 2025 (UTC)
- I meant that WF's accounts that participate in community discussions are usually easily identified and I don't commonly encounter suspicious user accounts appearing in or closing RFs. Svārtava (tɕ) 17:37, 26 March 2025 (UTC)
- Sure, but the point I am making is that you can only easily identify the ones that are easily identifiable by definition. You don't know the ones that aren't because they aren't. It is common to assume "Oh, I know which editors are just Wonderfool sockpuppets" but that is not obvious and we should have a clear policy to at least explicitly state that you should not use undisclosed alternate accounts. —Justin (koavf)❤T☮C☺M☯ 17:47, 26 March 2025 (UTC)
- I haven’t been bothered myself. Wrong closures don’t appear to be any more frequently made by Mr. ’Fool rather than by anyone else; and if they are, they can always be contested. I think it’s great that Wonderfool has gone back to close older discussions, because they were otherwise forgotten. Polomo47 (talk) 15:38, 26 March 2025 (UTC)
Related to this: he also inappropriately tagged Army of Northern Virginia as having failed RfD when it's never even been to RfD. I was under the impression that sockpuppetry was a fairly bright line that usually got you indeffed, and WF trolls on top of it. Honestly, just three days was too lenient... Purplebackpack89 00:42, 26 March 2025 (UTC)
- Using an alternate account once you are blocked is one of the examples of behavior that is explicitly listed for a subsequent block (if not, no block would serve any purpose at all). The main Wonderfool account was unblocked per a vote and then request to the stewards a couple of years back. We have no local policy about multiple accounts, but a standard assumption is that users will use one account. There has been no consensus to codify in any way what is acceptable or unacceptable use of multiple accounts locally. In a reasonable world, it would be expected that someone would edit from a single account and then only use explicitly identified alternates with some cause (e.g. traveling, using a proxy for some politically sensitive reason, having a secondary account for bots or bot-like actions that still require human discrimination). —Justin (koavf)❤T☮C☺M☯ 00:52, 26 March 2025 (UTC)
- I would prefer for the alternate-account hopping to stop. The recent slightly trollish closures have lowered my trust from acceptable previously to not really trustable at this point in time. That said, WF's work appears to improve the dictionary, and I don't know that a block for this is completely necessary. Hftf (talk) 01:03, 26 March 2025 (UTC)
- We do not need to have a sitewide rule in place on this to institute a community restriction with respect to a specific editor. However, we probably should have a sitewide rule restricting discussion closes to main accounts, even where secondary accounts are not strictly prohibited from editing. bd2412 T 21:49, 26 March 2025 (UTC)
gay agenda discussion
- Gay agenda is a semi-recent WF close that I strongly feel bears re-examination. WordyAndNerdy (talk) 06:22, 26 March 2025 (UTC)
- @WordyAndNerdy: I took a look at the discussion to which you linked. I read seven votes to delete the entry (PUC, LunaEatsTuna, Sgconlaw, ScribeYearling, Mihia, Fay Freak, Polomo47), one vote to keep it (you), one comment that could reasonably be interpreted as a vote to keep the entry (-sche: “I'm on the fence, leaning towards keep.”), and one comment that could be inferred as being in favour of the entry's deletion (Ultimateria: “I will also delete transgender agenda with the reasonable expectation that it would have failed alongside gay agenda and its synonyms.”), but which probably shouldn't be counted as a vote at all. That's seven votes to delete and one or two votes to keep; in percentage terms, that's either 87.5% or 77.7% of votes being in favour of the entry's deletion. Whilst you might reasonably maintain that your argument that “entirely notional concept[s require] clear, accurate definition[s]” was insufficiently-well addressed, it is not reasonable to assert that Wonderfool (p.p. Father of minus 2) was incorrect in his judgment that the entry failed that RFD. 0DF (talk) 09:39, 26 March 2025 (UTC)
- Forming a consensus doesn't mean simply tallying votes. It involves weighing the relative merit of arguments presented. If 80% of the votes argue "SOP, delete", but 20% point out that WT:COALMINE applies, or the term has an idiomatic/regional/dated/etc. secondary sense, or there are nuances not covered by an inadequate primary definition, then the minority opinion is the correct one. I'd hoped to codify the "Wikipedia test" during the gay agenda discussion but had to prematurely disengage for reasons outlined on BD2412's talk page. WordyAndNerdy (talk) 10:42, 26 March 2025 (UTC)
- @WordyAndNerdy: I am highly sympathetic to that view (i.e. that it is the best argument that should prevail, not the most popular). I encourage you to codify the Wikipedia test in a Beer-parlour discussion. It may thereby become a principle guiding inclusion, à la WT:COALMINE. 0DF (talk) 12:42, 26 March 2025 (UTC)
Moving away from whether a specific user should or should not be closing discussions relating to deletion, verification, etc., I think the general rule should be that only users who have the appropriate user rights, and make the effort, to do all the consequential cleanup work upon closing a discussion should do so. For example, it will just cause confusion if a user who is unable to delete an entry which has failed RFD or RFV closes a discussion because the entry will remain in existence, and may then be overlooked by other editors who think the closing user has already done what is necessary to remove the entry. — Sgconlaw (talk) 16:19, 26 March 2025 (UTC)
- Others can do it with |fail=1, which makes entries liable to speedy deletion, and then if an admin deletes it we even had four-eyes about the consensus, since as WordyAndNerdy pointed out it is about the community attitudes towards the entry based on variously weighted factors, so indeed I assume that in general one has to be a regular editor to convincingly close; it probably is not outrageous but a shared belief that IPs can’t close deletion or verification motions on the same grounds, which itself would be irking, though due to COALMINE or perfectly durable cites they can. Fay Freak (talk) 16:41, 26 March 2025 (UTC)
Old Korean revamp and cleanup
@Saranamd @Ydaraishy @Solarkoid @Chom.kwoy I am thinking of doing significant cleanup and revamping of Old Korean entries, with probably a complete rewrite of WT:About Old Korean at some point to separate it from AKO. I am working on something similar for Middle Korean.
That being said, one change I want to make is to start creating reconstruction entries (Reconstruction:Old Korean:...) for forms that currently do not meet CFI, most importantly terms attested only in the Jīlín lèishì. These will be standardized under Yale romanization spellings. Toward this end, I would also like to create a data module and module to create automatic tables containing reconstructions of JLLS entries by various sources. I'd probably then create nonlemma entries in mainspace with the JLLS spelling as a soft redirect to the reconstructed form.
I'm creating this thread now to hopefully serve as some spitballing before I implement anything. I don't expect much since I don't think there are actually any active editors for OKO at the moment, but FWIW. Please let me know if you would prefer not to be pinged in the future for OKO-related things (especially Saranamd, I understand you are busier lately with Persian and other things). 🌙🐇 ⠀talk⠀ ⠀contribs⠀ 22:58, 25 March 2025 (UTC)
Label inconsistency
(This discussion was moved from Wiktionary:Tea room/2025/March)
Can Module:labels/data be modified so that the youth slang label applied to a language's term automatically puts that term in a page called Category:«Language» youth slang, a child of Category:«Language» slang, similar to how the prison slang, internet slang, military slang etc. labels work? If so, anyone with appropriate permission to edit that page, please modify it. ZapciulSlovelor (talk) 11:27, 23 March 2025 (UTC)
- I see no reason not to, we have subcategories far more specific than that. ―K(ə)tom (talk) 20:34, 23 March 2025 (UTC)
- By the way, you should have posted this to WT:Beer parlour. ―K(ə)tom (talk) 20:36, 23 March 2025 (UTC)
- "Youth slang" seems like an inherently ephemeral label: the people who were using youth slang when Wiktionary started 23 years ago are correspondingly older now, and if they're still using the same words, then those words are no longer "youth slang", are they? And today's youth have some new slang, but if they keep using it, they'll grow out of being youth, too. But I suppose that's no reason not to categorize, as long as we have the label; I suppose a category indeed makes it easier for people to go through the entries periodically and check if they're still youth slang. And I suppose in the long term we already have to check labels to see if they need to be updated, anyway (e.g. to see if things have become dated, or if, over nearly a quarter century since Wiktionary began, some dated things have become archaic, or archaic things have become obsolete). OK, I can change how the label categorizes soon, if no-one objects or wants to beat me to it; ping me in a week if I forget. - -sche (discuss) 21:28, 26 March 2025 (UTC)
- @User:-sche See WT:Tea room#withcall for a discussion of why dated, archaic, and obsolete may not be a simple grade, notwithstanding what Appendix:Glossary says. DCDuring (talk) 22:15, 26 March 2025 (UTC)
- In my experience if a youth slang term has permeated various milieus the users will grow out of it due to its supposed markedness that could stigmatize the speaker as belonging to a certain age group, and it will be a dated term fashionable once in the youth and then more broadly because of natural “how do you do, fellow kids?” tendencies in man and then no more, while other terms either just die out as belonging to an era – dated or archaic youth slang – or they stay low-key because they still mark the milieu, though then one will always ask whether it is youth slang or some other slang. But for the diachronic perspective it is not excluded that something is correctly labelled youth slang. Fay Freak (talk) 22:27, 26 March 2025 (UTC)
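For illustration only, the categorization requested at the top of this thread could be sketched roughly as follows. This is written in Python rather than the Lua that Module:labels/data actually uses, and the table layout and function names (LABELS, categories_for, parent_category) are hypothetical; they are not the module's real schema, just a way to show the intended label-to-category behaviour.

```python
# Hypothetical label table: each label names the category it should add
# and the broader category that category sits under.
# This is NOT the real Module:labels/data structure, only an illustration.
LABELS = {
    "prison slang": {"category": "prison slang", "parent": "slang"},
    "internet slang": {"category": "internet slang", "parent": "slang"},
    "military slang": {"category": "military slang", "parent": "slang"},
    # The entry the thread asks for, modelled on the ones above.
    "youth slang": {"category": "youth slang", "parent": "slang"},
}


def categories_for(language, label):
    """Return the category pages a labelled term would be placed in."""
    entry = LABELS.get(label)
    if entry is None:
        # Unknown labels add no category in this sketch.
        return []
    return [f"Category:{language} {entry['category']}"]


def parent_category(language, label):
    """Return the broader category that the label's category is a child of."""
    entry = LABELS.get(label)
    if entry is None:
        return None
    return f"Category:{language} {entry['parent']}"


if __name__ == "__main__":
    print(categories_for("French", "youth slang"))
    print(parent_category("French", "youth slang"))
```

Under these assumptions, a French term labelled youth slang would land in Category:French youth slang, whose parent would be Category:French slang, which is the same pattern the existing slang labels are described as following.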