Wiktionary:Grease pit/2024/June

how do screen readers deal with how we present audio?

How do screen reader programs for the blind handle/speak our links to audio files, which seem to be just buttons that play the file when clicked? I notice that audio links are also just buttons over on Wikipedia (e.g. at Ankara the audio link is preceded by some also-linked text, but it links somewhere else, to the IPA page and not to the audio file, for which the only link "text" seems to be the button icon), so I hope this means this is something the WMF has thought about, and that screen readers are e.g. fed the filename to read. I'm wondering because I've read that the best practice for links (to other pages) is to make the link text descriptive or (semi-)unique (like dog being a link to the page dog), to avoid "click here for X, and this is the link to Y, click here for Z, and this is Q", because especially after someone has listened to the full page once, they may just have the screen reader read them the list of links (which lacks surrounding text not part of the link) and it's hard to remember and command the program to click on the ninth link that just says "this" vs the tenth link that just says "this": I'd hate to think screen readers are just reading something like "button icon" repeatedly on pages where lots of language sections have lots of audio files. So I'm wondering if anyone knows. (If that is what's happening, I wonder if there's anything we could do to make it more accessible.) - -sche (discuss) 18:41, 2 June 2024 (UTC)[reply]

@-sche The Wikipedia admin @Graham87 is, in my experience, always happy to lend advice on matters relating to low-vision readers and editors. The category Cat:English terms with audio links contains the entries in question - I'm not sure why we call that category "audio links" when the audio pronunciations are embedded in the entry, not "linked" in the usual sense. This, that and the other (talk) 10:07, 3 June 2024 (UTC)[reply]

@-sche They read the play button as "Play audio", which is fine. Graham87 (talk) 13:11, 3 June 2024 (UTC)

"ux" template and unwanted line breaks

For example, at who, we have

##: {{ux|en|That's the man '''who''' I saw earlier.}} (defining)
##: {{ux|en|My brother, '''who''' you met the other day, is coming to stay for the weekend.}} (non-defining)

which generates

That's the man who I saw earlier.

(defining)

My brother, who you met the other day, is coming to stay for the weekend.

(non-defining)

In other words, unwanted and unsightly line breaks are inserted. The only way around this seems to be to put comments, qualifications etc. within the "ux" template, which may or may not seem desirable. Or in fact is it desirable? Or is there another recommended solution? Mihia (talk) 20:53, 3 June 2024 (UTC)[reply]

You can use the |q= parameter mentioned at Template:ux:

{{ux|en|this is a test|q=not real}} ⇒

this is a test (not real)

It's not ideal that the qualifier is shown in italics, just like the usex itself. Perhaps we should de-italicise the qualifier for Latin-script examples only. This, that and the other (talk) 00:06, 4 June 2024 (UTC)[reply]

Right, thanks. Yes, I agree that the qualifier should probably not be in italics, to distinguish it more clearly from the example itself. The brackets aren't italicised, but this is not tremendously visible. I have noticed that you can put e.g. q=''defining'' to "undo" the italics, but then if I do this now, and italics are later switched off, then my comments would become italic, right? So it's a bit of a quandary. I wonder how many manually de-italicised ux qualifiers there are out there already. Could we just switch off the italics now and take a chance that there aren't large numbers? Mihia (talk) 08:20, 4 June 2024 (UTC)[reply]

@Benwing2 looks to have added this functionality; thoughts? This, that and the other (talk) 10:09, 4 June 2024 (UTC)[reply]

I don't see any difference, still italics. Mihia (talk) 13:07, 4 June 2024 (UTC)

Sorry, I wasn't clear - Benwing is the one who added the |q= parameter to {{ux}} in the first place. This, that and the other (talk) 02:49, 5 June 2024 (UTC)[reply]

@Mihia @This, that and the other Apologies for the delayed response. It's not hard to turn off italics for qualifiers, and I could also have it check for and ignore (but track) double apostrophes placed around the entire qualifier. It didn't occur to me there might be an issue with italicized qualifiers here, because qualifiers are normally italicized. Benwing2 (talk) 19:43, 9 June 2024 (UTC)[reply]

Is there any way to assess how many instances there are of people undoing the italics? Unless this is a widespread practice, my feeling would be to make the qualifier plain text and have italics (''...'') work just as usual. If a few cases exist that would be turned back into italics then it's not a disaster -- no worse than what we see now by default. Mihia (talk) 21:17, 9 June 2024 (UTC)[reply]

@Mihia See User:Benwing2/ux-q-double-apostrophe. This is based on searching through the June 1 dump and came up with (at most) 24 uses. Benwing2 (talk) 04:30, 10 June 2024 (UTC)[reply]

OK, well, an actual list is even better. If you're happy to change the workings so that plain text is by default, I'll go through and take the italics off the ones in that list. Mihia (talk) 09:15, 10 June 2024 (UTC)[reply]
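
For anyone wanting to repeat that kind of search on a later dump, here is a minimal sketch of a regular expression that picks out {{ux}} calls whose |q= value is wrapped entirely in double apostrophes. The function name `manually_deitalicised` and the pattern itself are illustrative assumptions, not what was actually run to build the list above, and the pattern ignores nested templates and other usex templates.

```python
import re

# Assumption: the qualifier is the whole value of |q= and is wrapped in exactly
# two apostrophes on each side; nested templates inside {{ux}} are not handled.
UX_Q_ITALIC = re.compile(r"\{\{ux\|[^{}]*?\|q=''([^'|{}]+)''\s*(?:\||\}\})")

def manually_deitalicised(wikitext):
    """Return the qualifiers that an editor wrapped in ''...'' inside {{ux}}."""
    return [match.group(1) for match in UX_Q_ITALIC.finditer(wikitext)]

sample = (
    "##: {{ux|en|That's the man '''who''' I saw earlier.|q=''defining''}}\n"
    "##: {{ux|en|this is a test|q=not real}}\n"
)
print(manually_deitalicised(sample))
```

Only the first sample line is reported, because the second uses a plain qualifier.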

Wiktionary:Etymology_scriptorium#forest

Why is this discussion not showing up in Wiktionary:Etymology_scriptorium/2024/June? I'm not sure if I should just manually move it over... I am not allowed to reply to it. —Caoimhin ceallach (talk) 17:36, 4 June 2024 (UTC)[reply]

@Caoimhin ceallach: I've moved it to Wiktionary:Etymology_scriptorium/2024/June. — Fenakhay ^{(حيطي · مساهماتي)} 17:40, 4 June 2024 (UTC)[reply]

Default fonts per language

Why do Cyrillic and Greek (Modern Greek, not Ancient Greek) use different fonts when the user may have Noto fonts installed and set as default? Cyrillic uses Arial and Greek uses Gentium, which is bad for the cohesiveness of the design. Relatedly, enabling the Default Styles gadget breaks Ancient Greek as well.

(originally posted to #grease-pit on the Wiktionary official Discord server). Juwan (talk) 18:36, 4 June 2024 (UTC)[reply]

Thank you, M @JnpoJuwan. For Modern Greek (also other Greek: ancient, medieval etc): why do we view smaller fonts, inferior to the default for other languages? considering that guests do not download any special fonts and do not alter anything at Preferences? But not only guests, but us editors too. I do not click things in Preferences, because I trust that programmers of wiktionary design the default look as the best view. Why these miserable fonts? Also cf Wiktionary:Beer_parlour/2024/May#Default_font_size_for_polytonic_Greek. Thank you. ‑‑Sarri.greek ^♫ I 23:26, 4 June 2024 (UTC)[reply]

RFVE very slow

Wiktionary:Requests_for_verification/English is almost unusably slow for me to open, navigate or edit. I wonder whether anything can be done. Mihia (talk) 20:56, 4 June 2024 (UTC)[reply]

This page is 700,000 kb, which is completely ridiculous. Some other users have no problem with these requests being open for years, but I think that if they haven't been verified by four months, that is plenty of time. So the solution is to archive the vast majority now and going forward, not allow them to be open so long. It's ridiculous. —Justin (koavf)❤T☮C☺M☯ 23:44, 4 June 2024 (UTC)[reply]

It's in fact only 700,000 bytes (and now down to 670k), which is still large, but far from ludicrous. The page began to be slow to open for me today all of a sudden, which is probably a JavaScript-related issue.

The problem is that many entries listed at RFV may well be attestable, but it requires a real investment of time to properly verify them - especially the rfv-sense listings. Only a scant few users participate in this important yet difficult work. Anyone who prematurely deletes senses, or entire entries, without taking the time to double-check that that is not the wrong thing to do, is doing our readers and the project a disservice. This, that and the other (talk) 02:48, 5 June 2024 (UTC)[reply]

Sure, but making a perfect dictionary is difficult, too. If no one is motivated to do a particular task after 28 months, it's probably just going to have to go on the back-burner. There are a bunch of other tasks that can and should be done and that someone else may plausibly do. —Justin (koavf)❤T☮C☺M☯ 02:55, 5 June 2024 (UTC)[reply]

I guess one of the problems is that it is hard to prove a negative. How many people saying "couldn't find anything" do we need before we fail an RFV? I hope that people who do look at these RFVs are always reporting negative findings, even if it seems to be just repeating what someone else has said. The blurb on the page says that "After a discussion has sat for more than a month without being “cited” [...] the discussion may be closed." Does this mean "closed as failed, therefore the entry is deleted"? Despite complaining about load times, I disagree with this, if that's what it means. We should not be deleting entries unless experienced/competent editors have made a reasonable effort to cite them (unless patently ridiculous, I guess). Mihia (talk) 09:10, 5 June 2024 (UTC)[reply]

Sorry guys, most of the RFV requests were mine. I'll slow down... Denazz (talk) 22:23, 9 June 2024 (UTC)

Backslang

The main entry is backslang and the glossary page uses Appendix:Glossary#backslang. Can the label displayed by {{lb|<lang>|back slang}} or {{lb|<lang>|backslang}} be changed to "backslang" for consistency? Ysrael214 (talk) 00:53, 5 June 2024 (UTC)

Tool to find not-very-visible characters?

The following are normally (virtually) impossible to detect visually:

Zero-width Unicode characters.
Look-alike characters from different alphabets, e.g. certain Cyrillic characters that appear identical to Latin/Roman ones.

Some other characters can just be tedious to locate, eg, tabs.

Are there WM tools that can locate such interloper characters? Are there free apps? Ideally, one might want to specify multiple alphabets/charactersets (I'd be happy with one for starts.) that would be acceptable. DCDuring (talk) 21:52, 5 June 2024 (UTC)[reply]

If you gave a little more context on why/how you need to find these characters, that may help. One option re: the Cyrillic and Latin characters that look similar is very different fonts for both. E.g. you could have a serif font in your CSS for Latin characters and sans serif for Cyrillic or change the color somewhat (black versus dark grey) or something else that makes it visually distinct for you. —Justin (koavf)❤T☮C☺M☯ 22:47, 5 June 2024 (UTC)[reply]

One instance was a filter telling me that there was a tab character in an entry I was working on. I think it wouldn't let me save my work while I looked for a tool such as what I'm looking for now. There have been other instances that, in retrospect, may have this as a cause: searchbox searches that fail to find an entry I find eventually only by indirect means. I recall mention being made in some discussion of a Cyrillic look-alike character, possibly in a headword. DCDuring (talk) 01:59, 6 June 2024 (UTC)[reply]

Do we have filter for the most noxious characters? DCDuring (talk) 21:52, 5 June 2024 (UTC)[reply]

I think we could reasonably have an Abuse filter that Tags (or potentially even Warns against) various undesirable characters in pagenames or contents. One character which I periodically make WT:TODO cleanup sweeps for (and always find entries using, sometimes even in their titles!) is the soft hyphen. (Another is the one mentioned in the TR discussion that started this.) - -sche (discuss) 22:18, 5 June 2024 (UTC)[reply]

w:en:WP:AWB can clear away these characters easily and replace them, as I recall. Doing runs with that or incorporating some of its character detection into a bot could be a solution to whatever problem you're trying to fix —Justin (koavf)❤T☮C☺M☯ 22:48, 5 June 2024 (UTC)[reply]

For small-scale problems this online tool, that User:Mnemosientje directed me to, looks like it would have helped in my tab-filter problem.

Would a sweep of our headwords help? Especially of those without incoming links!!! I don't think I would want to have any more public discussion about this. DCDuring (talk) 01:59, 6 June 2024 (UTC)[reply]

There seem to be six Unicode punctuation characters with English names beginning with "invisible" or "zero-width", but there is at least one other character (U+2060, "word joiner") that seems to have those attributes without such a name. DCDuring (talk) 13:38, 6 June 2024 (UTC)
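
For small-scale checks, a rough sketch using only Python's standard `unicodedata` module is shown here. It reports tabs and characters in the Unicode "format" category (Cf), which covers the soft hyphen, the zero-width characters and the word joiner, and it lists which of Latin, Cyrillic and Greek occur in a word. The function names are made up for illustration, and the script check relies on the first word of each character's Unicode name, which is a heuristic. It only reports; it removes nothing, since some of these characters are legitimate in certain languages.

```python
import unicodedata

def suspicious_chars(text):
    """Return (index, code point, name) for tabs and Unicode format (Cf) characters."""
    found = []
    for index, ch in enumerate(text):
        if ch == "\t" or unicodedata.category(ch) == "Cf":
            # Control characters such as tab have no Unicode name, hence the default.
            name = unicodedata.name(ch, "<unnamed>")
            found.append((index, f"U+{ord(ch):04X}", name))
    return found

def scripts_used(word):
    """Return which of LATIN, CYRILLIC, GREEK appear among the letters of word."""
    scripts = set()
    for ch in word:
        if ch.isalpha():
            first = unicodedata.name(ch, "").split(" ")[0]
            if first in {"LATIN", "CYRILLIC", "GREEK"}:
                scripts.add(first)
    return scripts

# A soft hyphen and a trailing tab hidden in "coop":
print(suspicious_chars("co\u00adop\t"))
# "page" typed with a Cyrillic "а" (U+0430) in place of the Latin "a":
print(scripts_used("p\u0430ge"))
```

A title for which `scripts_used` returns more than one script is a candidate for manual review, not necessarily an error.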

@DCDuring @Koavf Wholesale removing such characters isn’t always desirable, and may mess up links in certain languages, so I don’t think it’s a good idea to simply mass-delete them like this; particularly given the languages affected are ones that neither of you edit, such as Malayalam. Theknightwho (talk) 13:44, 10 June 2024 (UTC)[reply]

@User:Theknightwho Presumably any automated deletion effort someone would undertake would be greatly circumscribed. I am presently interested in simply identifying whether such characters contribute to the 5,000+ Translingual entries that are orphaned (have no incoming links from principal namespace other than self-transclusion). The same problem might occur with any language, but pages with English L2s should not have such characters. Identification of any such entries would probably require a modest bit of review to determine whether the entry had good English content, whether there was another entry page of the same apparent name not containing the characters, and whether there was non-English content to be evaluated. DCDuring (talk) 14:02, 10 June 2024 (UTC)[reply]

I never wrote anything about mass deletion. —Justin (koavf)❤T☮C☺M☯ 18:11, 10 June 2024 (UTC)

problem with missing articles in dumps?

Hello. I'm working on a project that involves dealing with the XML dumps of wiktionary. On May 30, I got the latest dump from: https://dumps.wikimedia.org/enwiktionary/latest/enwiktionary-latest-pages-meta-current.xml.bz2 This worked fine, but there was something strange. A number of articles seem to have been nuked. For example, here's the article for "below":

<page>

 <title>below</title>
 <ns>0</ns>
 <id>51479</id>
 <revision>
   <id>79233682</id>
   <parentid>78813002</parentid>
   <timestamp>2024-05-13T08:01:11Z</timestamp>
   <contributor>
     <username>P. Sovjunk</username>
     <id>4102084</id>
   </contributor>
   <model>wikitext</model>
   <format>text/x-wiki</format>
   <text bytes="13093" />
   <sha1>ga4830ugrai5hkottafbvpup9m15lxj</sha1>
 </revision>

</page>

Of course, this isn't the real article. This was the only result of the command

rg -A 100 -B 5 "<title>below<" WIKTIONARY_FULL_24.05.30.xml

so I think that somehow the "below" article just doesn't show up in this dump. This also happened for "-ier":

<page>

 <title>-ier</title>
 <ns>0</ns>
 <id>79624</id>
 <revision>
   <id>79212250</id>
   <parentid>79212188</parentid>
   <timestamp>2024-05-11T13:10:32Z</timestamp>
   <contributor>
     <username>Jberkel</username>
     <id>1580588</id>
   </contributor>
   <comment>/* Further reading */</comment>
   <model>wikitext</model>
   <format>text/x-wiki</format>
   <text bytes="4242" />
   <sha1>ly9lx1angt2on9stnywb3wxvcattzpk</sha1>
 </revision>

</page>

And I believe a number of other articles. What's going on here? It's possible I'm confused somehow. E.g.: 1. I haven't looked in any standard wiktionary help pages--sorry, I don't know where to look. 2. Maybe this is intended behavior. 3. Maybe the "latest" XML dumps often contain weird stuff, and I'm supposed to stick to the bi-monthly dumps. 4. Something else.

Thoughts? T039mwftulnm0l (talk) 01:20, 6 June 2024 (UTC)[reply]

Note that currently those articles are normal, and they are also normal in at least one earlier dump that I had. T039mwftulnm0l (talk) 01:21, 6 June 2024 (UTC)[reply]

It looks like you have a record of the latest revision (before the dump was put together) of those entries. Could you also have the full entries elsewhere in the dump? DCDuring (talk) 02:12, 6 June 2024 (UTC)[reply]

Dumps are currently broken. The last valid dump was the 20240501 (1 May) dump. The 20 May dump failed due to a bug and was deleted from the server (which is why the link in your post is a 404 error). The 1 June dump is only just getting underway. See phab:T365155. This, that and the other (talk) 03:17, 6 June 2024 (UTC)[reply]
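
To check whether a given dump file is affected, one option is to scan it for pages whose <text> element declares a byte count but has no content, as in the "below" and "-ier" excerpts above. This is a minimal sketch using the standard library's streaming XML parser. The names `find_stub_pages` and `SAMPLE` are invented for the example, and the sample is a cut-down imitation of the dump format, not real dump data.

```python
import io
import xml.etree.ElementTree as ET

def local_name(tag):
    """Strip any '{namespace}' prefix that ElementTree puts on tag names."""
    return tag.rsplit("}", 1)[-1]

def find_stub_pages(source):
    """Return (title, declared_bytes) for pages whose <text> is empty but bytes > 0."""
    stubs = []
    title = None
    for _event, elem in ET.iterparse(source, events=("end",)):
        name = local_name(elem.tag)
        if name == "title":
            title = elem.text
        elif name == "text":
            declared = int(elem.get("bytes", "0"))
            if declared > 0 and not (elem.text or "").strip():
                stubs.append((title, declared))
        elif name == "page":
            elem.clear()  # drop the finished page's contents to limit memory use
    return stubs

SAMPLE = b"""<mediawiki>
  <page><title>below</title><revision><text bytes="13093" /></revision></page>
  <page><title>dog</title><revision><text bytes="9">==Noun==
</text></revision></page>
</mediawiki>"""

print(find_stub_pages(io.BytesIO(SAMPLE)))
```

For a real dump, the file object returned by `bz2.open(path, "rb")` can be passed in place of the `BytesIO` object, so the archive does not need to be decompressed first.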

@T039mwftulnm0l This, that and the other (talk) 03:18, 6 June 2024 (UTC)

Ok, makes sense, thank you very much. T039mwftulnm0l (talk) 04:19, 6 June 2024 (UTC)

Final Old Polish quotation templates

Could someone send a bot to check for any redlinked Old Polish templates as well as any commented out? I should have all the information I need to fix those. Vininn126 (talk) 12:22, 6 June 2024 (UTC)[reply]

Okay, I found "Wanted templates"; still might need help with any commented out. Vininn126 (talk) 12:44, 6 June 2024 (UTC)[reply]

self-transclusion has almost completely emptied "Orphaned pages"

There are currently three pages in Special:LonelyPages: one has been deleted, the other two are candidates for speedy deletion.

Orphaned pages are often an indication of misplaced priorities in new-entry contributions. The good news about the demise of the default Orphaned pages listing is that there is more justification for more narrowly focused lists of the same kind, generated more-or-less on-demand, but relatively infrequently. DCDuring (talk) 14:07, 6 June 2024 (UTC)[reply]

Hyphen in "near-synonyms"

Could someone please change the "near synonyms ⯆" tab (that appears next to the definition) to include a hyphen ("near-synonyms ⯆") in entries where {{parasynonyms}} is used? I have no idea where it should be changed. Thanks, Einstein2 (talk) 21:25, 6 June 2024 (UTC)[reply]

I've yet to figure out what page to update; MediaWiki:Gadget-defaultVisibilityToggles.js lists the relation classes as 'near-synonym', 'imperfective', 'perfective', 'alternative-form', and the "alternative form" toggle (which also has a hyphen in the .js list) displays with a space, so maybe the .js is assuming all hyphenated singular terms should be displayed spaced as well as pluralized? - -sche (discuss) 05:02, 8 June 2024 (UTC)[reply]

@-sche: That seems to be the case: see line 333, return $(e).data('relationClass').replace('-', ' ') +. J3133 (talk) 05:46, 8 June 2024 (UTC)[reply]

Can we replace the "hyphen that should be a space" with something else like & or _ (or, gasp, a space) and safely convert that (instead of the hyphen) to a space, since no item in the list should have that symbol and want to keep it? - -sche (discuss) 04:34, 19 June 2024 (UTC)[reply]
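
To make the proposal concrete, here is a small Python model of the two behaviours. It is not the gadget's code: the gadget is JavaScript, the function names are invented, the trailing "s" is an assumption based on the guess above that labels are pluralised, and `count=1` imitates the fact that JavaScript's `replace` with a string pattern only changes the first match.

```python
def label_current(relation_class):
    """Model of the present behaviour: the first hyphen becomes a space."""
    return relation_class.replace("-", " ", 1) + "s"  # plural "s" is an assumption

def label_proposed(relation_class):
    """Model of the proposal: only underscores become spaces; hyphens survive."""
    return relation_class.replace("_", " ") + "s"  # plural "s" is an assumption

print(label_current("near-synonym"))        # near synonyms
print(label_proposed("near-synonym"))       # near-synonyms
print(label_proposed("alternative_form"))   # alternative forms
```

Under this scheme the list entry for alternative forms would have to be renamed to use the underscore, while 'near-synonym' could stay as it is.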

aWa stopped working in Vector 2010

The aWa archiving tool isn't loading for me on RFC, RFV, RFD, etc (and it hasn't been for several hours). This is the case in both Chrome and Firefox, even after I turn the gadget off and back on, or log out and back in. In the past, it has sporadically failed to load for me like this, either on specific large pages or temporarily on all pages, only to show up again after a while... so I don't know if this is a diagnosable issue (e.g. changes causing it to no longer work in a certain skin; I use Vector legacy [2010]) or just the impenetrable vagaries of our gadgets/javascript sometimes not working. (It was working for me at least as recently as 5 June.) - -sche (discuss) 05:24, 8 June 2024 (UTC)

It stopped working for me months ago. First it would appear for sections near the top of a page and then stop appearing for sections lower down, and now it doesn't appear at all. I'm using the current skin in Firefox. Hope it can be fixed. — Sgconlaw (talk) 05:41, 8 June 2024 (UTC)[reply]

Aha, I can confirm that if I switch to Vector 2022, it shows up again and works (I just archived an RFC discussion to test), and then if I switch back to Vector 2010, it's gone again. So it appears to be a skin issue. Pinging @This, that and the other who solved the last certain-skins-break-gadgets issue, in case you have any ideas about this one. - -sche (discuss) 20:00, 8 June 2024 (UTC)[reply]

@-sche for some years now I have found that I often need to refresh the page many times to get aWa links to appear. I never bothered to debug it to be honest. I'll look at it if I have time.

Keep in mind that as a workaround you can always temporarily view a page in a different skin by adding ?useskin=... to the end of the URL, for instance: https://en.wiktionary.org/wiki/WT:RFM?useskin=vector-2022 This, that and the other (talk) 08:11, 9 June 2024 (UTC)[reply]
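
If you want to generate such links in bulk (say, for a list of discussion pages), the query parameter can be added programmatically. This is a small sketch with Python's standard `urllib.parse`; the helper name `with_skin` is invented, and it replaces any existing useskin value and does not append a second one.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

def with_skin(url, skin):
    """Return url with its useskin query parameter set to skin."""
    parts = urlsplit(url)
    # Keep every existing parameter except a previous useskin value.
    query = [(key, value)
             for key, value in parse_qsl(parts.query, keep_blank_values=True)
             if key != "useskin"]
    query.append(("useskin", skin))
    return urlunsplit(parts._replace(query=urlencode(query)))

print(with_skin("https://en.wiktionary.org/wiki/WT:RFM", "vector-2022"))
```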

I also just noticed it not working for me, with Vector 2010 (and likewise, switching to Vector 2022 made it appear again for me).--Urszag (talk) 06:03, 9 June 2024 (UTC)[reply]

(See also discussion of Ajax Edit breaking in Vector 2010 but not Vector 2022, a few sections down.) - -sche (discuss) 14:48, 10 June 2024 (UTC)[reply]

Props to Pious Eterino, who has been archiving RFV; I can't (or don't) archive anything anymore because switching skins is too much faff. - -sche (discuss) 20:34, 27 June 2024 (UTC)[reply]

"Twice-borrowed" is not quite dead

According to this search, there are still two modules that generate the "twice-borrowed" category names. I discovered this when I found "twice-borrowed" categories in Special:WantedCategories. The ordering is different, so the fix wasn't simple enough for my limited Lua skills. Chuck Entz (talk) 21:15, 8 June 2024 (UTC)[reply]

Something gone wrong with editing project pages??

At https://en.wiktionary.org/wiki/Wiktionary:Tea_room, and also here at https://en.wiktionary.org/wiki/Wiktionary:Grease_pit, I no longer see the "Edit" links next to headings. Has something gone wrong? Mihia (talk) 21:23, 8 June 2024 (UTC)[reply]

... but I notice, after having added this with the "New Topic" tool, that I DO still see the edit links at https://en.wiktionary.org/wiki/Wiktionary:Grease_pit/2024/June. Mihia (talk) 21:26, 8 June 2024 (UTC)[reply]

... and also I see them at https://en.wiktionary.org/wiki/Wiktionary:Tea_room/2024/June Mihia (talk)

This probably has to do with the fact that the page is protected. As an admin I still see the section edit links, but when I log out I don't see them. I actually wonder why they were ever visible to logged-out users at all. This, that and the other (talk) 08:08, 9 June 2024 (UTC)[reply]

Perhaps not, but now I don't see them when I am logged in. I wonder if I could request that someone please look into this. Something has changed which means that I cannot now edit Tea Room discussions at https://en.wiktionary.org/wiki/Wiktionary:Tea_room, but have to navigate to the actual underlying monthly page. The "Reply" links still work, by the way. Mihia (talk) 08:56, 9 June 2024 (UTC)[reply]

@Mihia I see what happened here: the protection level to all the discussion venues was raised by Fenakhay. This seems to have been a busy period for raising protection levels, with Surjection being responsible for most. Was this in response to a vandal attack? In any event, having the root discussion pages fully-protected gets in the way of non-admins and I think it should be removed. We already use CSS to hide the "edit" tab and JS to redirect the "new section" tab to the correct monthly page. This, that and the other (talk) 09:15, 9 June 2024 (UTC)[reply]

And yet still people occasionally end up posting their messages to the root discussion page, which should never happen. — SURJECTION ^{/ T / C / L /} 11:17, 9 June 2024 (UTC)[reply]

@Surjection this should be dealt with by an abuse filter rather than page protection. This, that and the other (talk) 00:28, 10 June 2024 (UTC)[reply]

Thanks very much for looking into this. Unless it is deemed essential for system security, I would like to request that this change is reverted, so that section "Edit" links go back to being visible on the main pages for non-admins. Thanks. Mihia (talk) 14:16, 9 June 2024 (UTC)[reply]

@Mihia I have unprotected all the pages. Special:AbuseFilter/43 continues to prevent most users from editing the root discussion pages, and I created Special:AbuseFilter/177 to extend the warning to autopatrollers and sysops (but this filter does not prevent them from proceeding with the edit if they need to). This, that and the other (talk) 00:40, 10 June 2024 (UTC)[reply]

While we're at it, should we perhaps warn users against creating Wiktionary talk:Grease pit/, Wiktionary talk:Beer parlour/ (etc) pages? Every few months someone tries it, because it seems logical to them that if the place for a discussion about an entry is the talk page, the place for a discussion in this case is also the talk page (e.g. just recently [1]). Alternatively we could add "redirect the talk pages to the main pages" to the list of tasks that the bot/script that creates new monthly subpages does. - -sche (discuss) 01:39, 10 June 2024 (UTC)

I think redirect would solve this problem almost all of the time. —Justin (koavf)❤T☮C☺M☯ 01:43, 10 June 2024 (UTC)[reply]

Great, thanks very much for doing that. Mihia (talk) 09:08, 10 June 2024 (UTC)

Mass Import of Lorentz's Slovincian dictionary

I might be able to work on a spreadsheet that could give a bot all the information it needs to import all of Lorentz's words. I'm wondering if 1) I should pursue this 2) If someone could help me once it's done 3) Any tips 4) If I could point to a Polish entry for a definition, to save me time.

It might also be worth it to develop a more robust Slovincian IPA module (I could supply {{{1}}}). Otherwise it might be worth it to just skip syllabification for now. Vininn126 (talk) 19:46, 9 June 2024 (UTC)[reply]

AjaxEdit edit-summary section-linking is broken

Sometime in the past week, AjaxEdit stopped producing working edited-section links in its edit summaries.

This diff from 2 June shows correct behavior (note the presence, in the edit summary, of a link showing which section was edited and linking directly to said section).

In contrast, this diff from 8 June shows broken behavior (now, where the edited-section link should be in the edit summary, there's instead just a bare colon with no link).

I assume this is not intended behavior. Whoop whoop pull up ^{Bitching Betty ⚧️ Averted crashes} 04:43, 10 June 2024 (UTC)[reply]

On a hunch, I checked and found that while it's broken if I use the Vector 2010 skin, it works if I switch to the Vector 2022 skin. Comparing the discussion about aWa no longer working in Vector 2010 but working in Vector 2022 (a few sections above this), I conclude that some MediaWiki dev somewhere changed something in a way that has caused a bunch of gadgets to fail. Pinging User:This, that and the other just so you're aware that this is another skin-specific breakage which has happened. I don't know if this is widespread enough that we should ask the folks at Phabricator to help us figure out what broke everything and how to fix it...? - -sche (discuss) 14:45, 10 June 2024 (UTC)

Bleh, Vector 2022... Whoop whoop pull up ^{Bitching Betty ⚧️ Averted crashes} 20:45, 10 June 2024 (UTC)[reply]

Hmm, now I see the "Ædit" links again (even in Vector 2010). This puts me in mind of TTO's point about having to refresh pages a bunch to get aWa to show back up (although I have not yet managed to get aWa to show back up). Are we running more javascript on discussion pages than we used to, are the gadgets' scripts often running out of time to run and thus not running sometimes, or what would explain such intermittency? ... - -sche (discuss) 23:36, 10 June 2024 (UTC)[reply]

It's a bit of a mystery to me. One would imagine some kind of timeout that kills the scripts after a certain elapsed time might be at work, but based on my knowledge of JavaScript, I feel that would (at least historically) have been very difficult or impossible for the MediaWiki devs to implement. I'll look into it some day. This, that and the other (talk) 09:35, 11 June 2024 (UTC)[reply]

might be something with changing HTML structures, at least in the Monobook skin; https://en.wiktionary.org/w/index.php?title=User%3AFish_bowl%2FAjaxEdit.js&diff=80099181&oldid=80098573 —Fish bowl (talk) 01:24, 13 June 2024 (UTC)[reply]

I'm having this issue with Vector 2010, tho, so it's presumably not something Monobook-specific (diff from a few minutes ago to demonstrate that the issue's still extant). Whoop whoop pull up ^{Bitching Betty ⚧️ Averted crashes} 06:52, 14 June 2024 (UTC)[reply]

This is the result of mw:Heading HTML changes; to gadget maintainers: please see the instructions there. JWBTH (talk) 00:01, 18 June 2024 (UTC)[reply]

@Erutuon JWBTH (talk) 00:06, 18 June 2024 (UTC)

Any ETA on a fix? AjaxEdit's edited-section links're still borked. Whoop whoop pull up ^{Bitching Betty ⚧️ Averted crashes} 03:44, 6 July 2024 (UTC)[reply]

@Whoop whoop pull up: Ping me if no one fixes this in the next few days and I'll fork the code. Unfortunately I can't edit the gadget directly, but you'll be able to import the script into your personal common.js page. Ioaxxere (talk) 21:14, 8 July 2024 (UTC)[reply]

Will do if it's still borked Friday afternoon! Whoop whoop pull up ^{Bitching Betty ⚧️ Averted crashes} 22:49, 8 July 2024 (UTC)[reply]

Listen, we have several scripts for section editing right now. Another popular one is w:User:BrandonXLF/QuickEdit. You can install it in Wiktionary too. All of the scripts aren't being actively developed, but at least QuickEdit has critical errors fixed, and AjaxEdit seems to be a purely Wiktionary gadget which isn't cool – there is too much overhead to maintain it.
Can I ask, is there functionality in AjaxEdit that you are missing in QuickEdit? You can ask the author to add it, and if he doesn't, then we may think this through. JWBTH (talk) 00:07, 9 July 2024 (UTC)[reply]

AjaxEdit can be enabled by ticking a box in your user preferences; all the others require manually importing scripts into your common.js. If a gadget's prominent and widely-used enough to have gained this sort of semi-official endorsement, like AjaxEdit, it stands to reason that it should at least be kept working properly. Whoop whoop pull up ^{Bitching Betty ⚧️ Averted crashes} 00:53, 9 July 2024 (UTC)[reply]

Adding a script to the gadget list is trivial (or even replacing one script with another if their functionality matches and the community decides so); maintaining and developing a script isn't. If a script is unmaintained and maintaining it poses an issue, like seen here, maybe it should just be dropped in favor of a maintained one. A lot of effort is spent in wiki projects on developing technical solutions that duplicate each other, often just because people don't know of an existing solution. JWBTH (talk) 01:14, 9 July 2024 (UTC)[reply]

I did not know that! In that case, maybe time to consider replacing AjaxEdit with QuickEdit in the gadget list? Whoop whoop pull up ^{Bitching Betty ⚧️ Averted crashes} 02:52, 9 July 2024 (UTC)[reply]

@JWBTH: Do you know if an alternative exists to aWa? That one's been broken as well. Ioaxxere (talk) 13:40, 9 July 2024 (UTC)[reply]

It does what, moves discussion topic from a page to an archive? Then I think I know one alternative, hehe, c:User:JWBTH/CD. It can archive a topic by moving it to an archive page. I can even add some "archive" option to that dialog if you provide a list of archive paths in archivePaths config property or use some template that would encode this data in a uniform way across different wikis (which in practice likely means "as some popular enwiki bot-archivation template encodes them").

JWBTH (talk) 17:13, 9 July 2024 (UTC)[reply]

It's worth noting that aWa works in skins other than Vector 2010, e.g. https://en.wiktionary.org/wiki/Wiktionary:RFVE?useskin=vector-2022. It just needs to be updated to account for recent changes to MediaWiki heading syntax - which were apparently signalled in advance on Tech News, but too few people have WT:Wikimedia Tech News/2024 watchlisted. (I had one of the previous year's pages watchlisted, but nobody does/has done the page move trick to duplicate the watchlist entries.) This, that and the other (talk) 23:19, 9 July 2024 (UTC)

From what -sche said earlier, the "works, except on Vector 2010" issue you describe for aWa seems to also be the case for the current issue with AjaxEdit. Whoop whoop pull up ^{Bitching Betty ⚧️ Averted crashes} 02:58, 10 July 2024 (UTC)[reply]

something weird with page saving

I was trying to make changes to always and MediaWiki appears to be losing changes after they've been saved. Anyone else seen this? Benwing2 (talk) 09:53, 10 June 2024 (UTC)[reply]

I just edited it and reverted without issue. —Justin (koavf)❤T☮C☺M☯ 10:04, 10 June 2024 (UTC)[reply]

I did experience some MW server lag at the same time as Benwing made this post. There was a bold notice in the page footer next to "This page was last edited on ..." that said something along the lines of "The latest updates to this page may not be shown". Seems to have cleared up now. This, that and the other (talk) 11:42, 10 June 2024 (UTC)[reply]

It happened to me earlier as well—an edit which I made seemed to "disappear", then when I repeated it I was told there was an edit conflict with my earlier edit. I assumed it was related to the server lag, as there was a message displayed on the screen. — Sgconlaw (talk) 12:00, 10 June 2024 (UTC)[reply]

updates to labels and params of pronunciation templates

I've made some feature additions to labels and pronunciation templates. For labels, you can now use double angle-bracket notation to more easily mix labels with non-label text, similarly to double angle-bracket notation in {{place}}. For example, world now has this:

{{homophones|en|whirled|aa1=with both the <<wine-whine>> and <<fern-fir-fur>>|;|whorled|aa2=with both mergers, <<RP>> only}}

which renders as this:

Homophones: whirled (with both the wine–whine merger and fern–fir–fur merger); whorled (with both mergers, Received Pronunciation only)

With the previous work I did on merging accent qualifiers and labels, the accent qualifiers in the |aa1= and |aa2= parameters are basically just labels, and the double angle brackets surround actual labels, while the remainder of the text appears as-is. This also shows another new feature of pronunciation templates (so far, {{IPA}}, {{homophones}}/{{hmp}} and {{rhymes}}/{{rhyme}}), which is that you can put a bare semicolon as an argument to change the separator to a semicolon instead of a comma. (This works the same way as in the {{syn}}, {{alt}} and {{desc}} templates.)

Another example is as found in more:

{{homophones|en|maw<aa:<<non-rhotic,horse-hoarse>> (most of <<England>>, <<Australia>>, <<New Zealand>>, <<New York>>)>}}

which renders as:

Homophone: maw (non-rhotic, horse–hoarse merger (most of England, General Australian, New Zealand, New York))

This shows that you can use inline modifiers in the pronunciation templates and can put multiple comma-separated labels inside of double angle brackets.

Double angle brackets work when there is more than one label, as in the following example from portion:

{{IPA|en|/ˈpoəɹʃən/|/ˈpoːɹʃən/|/ˈpoɹʃən/|a=Scotland,Ireland,other varieties <<non-horse-hoarse>>}}

which renders as:

(Scotland, Ireland, other varieties without the horse–hoarse merger) IPA^(key): /ˈpoəɹʃən/, /ˈpoːɹʃən/, /ˈpoɹʃən/

Another new feature of labels is as shown in Rhymes:English/ɒn:

{{l|en|on}} {{a|en|except <<Southern US!Southern>> and <<Midland US>> <<non-cot-caught>>}}

which renders as:

on (except Southern and Midland US without the cot–caught merger)

Here, the Southern US!Southern label format means to reference the Southern US label but display it as Southern. It is somewhat similar to the format with an initial exclamation point, which says "display the label as-is rather than converting an alias to its canonical form", as in:

{{homophones|en|Watt|watt|wot|aa=all only in <<!British>>, <<!Australian>>, <<NZ>>, <<NYC>> accents with the <<wine-whine>>}}

which renders as:

Homophones: Watt, watt, wot (all only in British, Australian, New Zealand, New York City accents with the wine–whine merger)

Here, the ! forces the label Australian to appear as such instead of in its canonical form Australia, and is equivalent to writing Australia!Australian. Note that both types of ! notation work in regular labels; they don't need to be inside of double angle brackets.

Finally, although all of the above examples are given in terms of accent qualifiers, the double angle brackets also work in regular {{lb}} labels, so you can write e.g.

# {{lb|en|<<dated>> except in <<Scotland>>}} ...

which renders as

(dated except in Scotland) ...

This way you don't have to use awkward combinations of multiple labels with _ and such and remember which qualifier labels suppress the preceding or following comma.

The documentation isn't yet up-to-date concerning all of these changes, but will be soon.

Benwing2 (talk) 04:16, 12 June 2024 (UTC)[reply]
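As a rough illustration of the double angle-bracket notation described above, here is a minimal JavaScript sketch (not the actual Lua implementation). The `LABEL_DISPLAY` table and the function names are hypothetical; the table contains only display forms that are visible in the examples in this post, and the real label data lives in the label modules.

```javascript
// Hypothetical label table: label or alias -> displayed form.
// Only entries visible in the examples above are included.
const LABEL_DISPLAY = new Map([
  ["wine-whine", "wine–whine merger"],
  ["fern-fir-fur", "fern–fir–fur merger"],
  ["non-cot-caught", "without the cot–caught merger"],
  ["RP", "Received Pronunciation"],
  ["NZ", "New Zealand"],
  ["NYC", "New York City"],
  ["Australian", "Australia"],
]);

// Resolve one label spec to the text that should be displayed.
function displayLabel(spec) {
  const label = spec.trim();
  const bang = label.indexOf("!");
  if (bang === 0) return label.slice(1);       // "!British": display as written
  if (bang > 0) return label.slice(bang + 1);  // "Southern US!Southern": display the part after "!"
  return LABEL_DISPLAY.has(label) ? LABEL_DISPLAY.get(label) : label;
}

// Replace each <<...>> group with its comma-separated labels, resolved;
// text outside the brackets is left as-is.
function expandLabels(text) {
  return text.replace(/<<(.*?)>>/g, (_, inner) =>
    inner.split(",").map(displayLabel).join(", "));
}

console.log(expandLabels("with both the <<wine-whine>> and <<fern-fir-fur>>"));
```

The sketch only covers the text substitution; categorization, linking and comma suppression, which the real labels also handle, are left out.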

@Benwing2: An asterisk for reconstruction could be supported in a similar way to the exclamation mark: a dialect label is reconstructed at زُوم (zūm), and a whole sense was unattested but inferable at ܐܘܣܬܢܐ (ʾustānā). Fay Freak (talk) 08:34, 12 June 2024 (UTC)[reply]

Edit request: MediaWiki:Gadget-VisibilityToggles.js

@Erutuon, Surjection, or any other interface admins: Please edit Line 152 of that page from:

			lowCodepoint + Math.floor(Math.random() * (highCodepoint - lowCodepoint)) - 1)

to:

			lowCodepoint + Math.floor(Math.random() * (highCodepoint - lowCodepoint)))

More context: the above snippet is used to generate a random css identifier for toggle boxes like this one, and it is supposed to generate a random lowercase letter from a to z; but because of the -1, it actually generates from ` to y; and if it does generate `, then it causes an error and the whole script is aborted.

It is used when the provided css identifier contains no valid characters, such as when it is in Chinese, as demonstrated in this example.

Currently, if you refresh that page a few times (to refresh the randomly generated identifiers), sometimes the script bugs out and the "more" texts are not rendered, and the console displays the error that there is a syntax error with the generated css identifier with the backtick.

--kc_kennylau (talk) 14:08, 12 June 2024 (UTC)[reply]

Done — SURJECTION ^{/ T / C / L /} 14:14, 12 June 2024 (UTC)[reply]
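The off-by-one described in this request can be reproduced in isolation with the sketch below. The bounds are assumptions chosen to match the described behaviour (`lowCodepoint` is "a" and `highCodepoint` is one past "z"); the function names and the injectable `random` parameter are hypothetical and exist only so the edge cases can be checked deterministically.

```javascript
// Assumed bounds: low is "a"; high is one past "z", so the range has 26 values.
const lowCodepoint = "a".codePointAt(0);       // 97
const highCodepoint = "z".codePointAt(0) + 1;  // 123

// Old form: the trailing "- 1" shifts the whole range down to "`".."y".
function randomLetterOld(random = Math.random) {
  return String.fromCodePoint(
    lowCodepoint + Math.floor(random() * (highCodepoint - lowCodepoint)) - 1);
}

// Requested form: without the "- 1" the result is always in "a".."z".
function randomLetterNew(random = Math.random) {
  return String.fromCodePoint(
    lowCodepoint + Math.floor(random() * (highCodepoint - lowCodepoint)));
}

// Force the smallest possible random value to show the difference.
console.log(randomLetterOld(() => 0)); // "`"
console.log(randomLetterNew(() => 0)); // "a"
```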

crap-tag-stic

In the entry cascade, there's the non-wikified tag (''[[chemistry#English|chemistry]]''), which should be {{lb|en|chemistry}}. I've seen this other times, and manually corrected them, but I suspect this is common enough to warrant a cleanup list (oh how I love them!) of other entries containing # (''[[foo]]''). Can someone rustle up such a list? @Erutuon, @Benwing2, @This, that and the other normally know how to do this kind of thing. Denazz (talk) 08:05, 15 June 2024 (UTC)[reply]

@Denazz I generated Wiktionary:Todo/manually crafted labels last year. I'm not sure if anyone did any cleanup on it. It's kind of annoying that it's formatted without any headings - perhaps I'll look at regenerating it. This, that and the other (talk) 12:06, 15 June 2024 (UTC)[reply]

Dividing by language would be a start. Splitting long lists within a language would be great. It would make it easier to strike off items to keep track of progress between updates. DCDuring (talk) 15:37, 15 June 2024 (UTC)[reply]

It's because of this user, see Wiktionary:Requests_for_cleanup#isomer_label_(stereopure,_dextrorotatory,_etc). - -sche (discuss) 16:54, 15 June 2024 (UTC)[reply]

Template:IPA updates part 2

I made some further additions to {{IPA}}. Most notably you can now specify a combined phonemic-phonetic spec in a single argument of the form /.../ [...] rather than have to put them as two arguments and suffer a comma between them. You can also add a gloss (using <t:...> or <gloss:...>, or separate params |tN= or |glossN=) and/or a part of speech (using <pos:...> or separate param |posN=) to a pronunciation, to help in cases where the pronunciation is split by meaning or part of speech. An example that uses both, for ruler:

{{User:Benwing2/IPA|en|/ˈɹuː.lə/ [ˈɹʉwlə]<t:measuring device>|/ˈɹuːl.ə/ [ˈɹʊwɫə]<t:one who rules>|a=UK,goose split}}

which yields

(UK, goose split) IPA^(key): (“measuring device”) /ˈɹuː.lə/ [ˈɹʉwlə], (“one who rules”) /ˈɹuːl.ə/ [ˈɹʊwɫə]

I should also add that I changed the handling of the case where both accent and regular qualifiers are specified for a given argument and side, so that they both end up inside the same set of parens, comma separated, rather than generating two sets of parens. So e.g.

{{User:Benwing2/IPA|en|/ˈɹuː.lə/ [ˈɹʉwlə]<t:measuring device><q:left regular qualifier><a:left accent qualifier>}}

yields

IPA^(key): (left accent qualifier, left regular qualifier, “measuring device”) /ˈɹuː.lə/ [ˈɹʉwlə]

This also shows that glosses and parts of speech behave currently like qualifiers and end up in the same set of parens as well. Benwing2 (talk) 09:07, 15 June 2024 (UTC)[reply]

Module:number list/data/el how to add Greek numerals

I'm in the process of adding Greek numerals to Module:number list/data/el. Ideally, these probably ought to be handled by Module:foreign numerals, but the coding is somewhat beyond my technical level at the moment. So in the interim I'm manually adding them to Module:number list/data/el directly. The snag is that the way I'm doing it results in the Greek symbols being transliterated, e.g. Γʹ (Gʹ), which is inappropriate because these are symbols, not words. Is there a way to suppress the transliteration? I tried adding {{l|el|Γʹ|tr=-}} but that doesn't play nicely with the module. Grateful for any advice. Voltaigne (talk) 10:56, 15 June 2024 (UTC)[reply]

P.S. I see that Module:number_list/data/grc uses the numeral = key, but this doesn't provide linking to the corresponding entry. Voltaigne (talk) 11:43, 15 June 2024 (UTC)[reply]

"Pacific Northwest" transcluding to "Northwestern US"

When creating an entry I discovered that the context label "Pacific Northwest" is automatically replaced with "Northwestern US", which is flatly incorrect given that every definition of "Pacific Northwest" includes all or part of the Canadian province of British Columbia. (Definitions that encompass Alaska also tend to include the Yukon.)

"Pacific Northwest" needs to be an independent label or the transclusion text should be "British Columbia and Northwestern US." WordyAndNerdy (talk) 20:34, 15 June 2024 (UTC)[reply]

Interesting. We probably want to check all existing uses to see whether they're US-only or both-US-and-Canada. How often do we have labels that group together only parts of different countries (but not the entirety of either country) that have different national dialects? My initial reaction is that it seems "safer" (less liable to be misinterpreted either more broadly or more narrowly than intended) to require "this is used in region X of country Y, and region A of country B" to be spelled out, i.e. require people to input "Northwest US, British Columbia" if that's what they mean (or at the very least, make this what the lumper label displays), rather than relying on people to guess the scope of "Pacific Northwest" (and potentially having different people use it with different scopes, some intending to convey that both Americans and Canadians used the term, and some not realizing "Pacific Northwest" includes Canadians and thus just using it to label terms that Pacific Northwestern US-ers use whether Canadians also used them or not, etc). - -sche (discuss) 23:42, 15 June 2024 (UTC)[reply]

@-sche: it looks like the only usages are: crummy, jojo, Junuary, saltchuck and spront. The rest of Category:Northwestern US English have the label "Northwestern US". Apparently the dialectal picture is fairly complex (see Pacific Northwest English on Wikipedia), but my impression is that the isoglosses don't align all that closely with the US/Canadian border. There are no geographic barriers along the border, and the border itself is pretty open. Unlike the border with Mexico, there are no language barriers. Influences like the Chinook jargon don't stop at the border, either. Chuck Entz (talk) 03:19, 16 June 2024 (UTC)[reply]

I agree that it's preferable to specify precise regions of use wherever possible. I just feel that "Pacific Northwest" shouldn't transclude to "Northwestern US" by default. It's one of those unintentionally America-centric things that annoys me. There are some terms that are exclusive to BC (gym strip, for example). But in general I'd say there isn't an extreme difference between Pacific Northwest English as spoken in BC versus Washington/Oregon. At least not in terms of vocabulary. (Pronunciation/accents are a different matter.) The Pacific Northwest is cut off from the rest of North America geographically and to some extent culturally. If I wanted to generalise, I'd say that Vancouverites are as likely to see themselves as residents of the Pacific Northwest as Seattlites and Portlanders, in a way that Nova Scotians wouldn't identify as New Englanders (nor Mainers as residents of the Maritimes). WordyAndNerdy (talk) 04:38, 16 June 2024 (UTC)[reply]

@WordyAndNerdy Maybe we should do the opposite and redirect "Northwestern US" to "Pacific Northwest"? From what Chuck and you are saying, it sounds like there may not be very many Northwestern US terms that don't also apply to BC. There are only 14 terms in CAT:Northwestern US English; if you're familiar with this dialect, can you let me know if these terms are also found in Canada? BTW in general you can find all the places using a given label (even if it's an alias, or being used in an accent qualifier or elsewhere than {{lb}}) by looking at e.g. Special:WhatLinksHere/Wiktionary:Tracking/labels/label/Pacific Northwest. Benwing2 (talk) 08:24, 16 June 2024 (UTC)[reply]

Handy tip! The problem is that familiarity with regional terms can depend on one's experience, age, etc. I'd hazard that I'm more familiar with BC English than people from outside the province. But I've never heard crummy because I've never worked in the timber industry (nor do I have close family in that line of work). One way to assess the regional scope of a term is to check newspaper usages. With Junuary, I found several local paper uses from BC, two from Oregon, and one from Washington. Woodbug (BC synonym for pill bug) is another one I'd like to attest. But it's difficult finding unambiguous cites for it. WordyAndNerdy (talk) 08:46, 16 June 2024 (UTC)[reply]

Converting elements template to use list/data

{{ordinalbox}} and {{cardinalbox}} duplicate a lot of information. {{number box}} solved this more elegantly by moving data to "number list/data".

I wonder if someone is interested in converting {{elements}} to a similar system. tbm (talk) 05:37, 16 June 2024 (UTC)[reply]

Splitting the largest entries on technical grounds?

a and 人 are currently so large and template-heavy that they consistently return timeout errors. Splitting the pages in some fashion would solve this. I've brought this up on the Discord, and Wiktionary:Beer_parlour/2024/May#Should we split up multi-language pages? was brought up as a prior discussion on the matter, but this is less of a herculean task.

That is, unless a better solution to the timeout errors can be had? Maybe the modules on the pages in question need to be rewritten to be faster? Akaibu (talk) 14:41, 16 June 2024 (UTC)[reply]

人 is fine at the moment and doesn't really time out. a might need to be split in some way, as it's consistently proving a problem. Theknightwho (talk) 14:49, 16 June 2024 (UTC)[reply]

One possibility is splitting off the alphabet-related parts of the single-character entries to a subpage or even a new namespace. We use a lot of resources to list and link to the letters of the alphabet in different languages and to state the position of the letter in the alphabetic order. There's a difference between the determiner a, which has definitions and syntactic roles, and the letter a, which has very little meaning aside from its use to spell things. There are some edge cases where the letter is used as a letter to mean things, such as "x" to indicate "adult" content, or as a shape, but we can sort those out. Chuck Entz (talk) 15:28, 16 June 2024 (UTC)[reply]

ACCEL Lua error in Module:languages/errorGetBy

I tried to create a plural based on this singular (using a perfectly normal {{en-noun}}) with ACCEL, and I got this error message

"An error occurred while generating the entry:
Lua error in Module:languages/errorGetBy at line 16: Please specify a language code in the parameter "lang"; the value "en-Latn" is not valid (see Wiktionary:List of languages)."

- -sche (discuss) 22:32, 16 June 2024 (UTC)[reply]

For what it's worth, I reproduced the problem. :/ —Justin (koavf)❤T☮C☺M☯ 22:49, 16 June 2024 (UTC)[reply]

@Theknightwho It was due to this change of yours to Module:script utilities: [2] I reverted it but I'm not sure what it was trying to do originally. Benwing2 (talk) 23:03, 16 June 2024 (UTC)[reply]

Success. —Justin (koavf)❤T☮C☺M☯ 23:04, 16 June 2024 (UTC)[reply]

@Benwing2 I had a feeling this might break something somewhere, but I think this is a pretty major omission in our script handling, because it's the only way to force a browser to use a specific script. This is necessary in the cases where it won't have automatic support; something that applies to a very large number of smaller languages.

However, this issue is also affecting major languages like Chinese: at least on Chrome, it displays everything as though it's simplified Chinese, which is a problem for cases where trad/simp use the same codepoint but different display forms. One example is 火 (huǒ, “fire”), which is extremely common in compounds: e.g. 火車／火车 (huǒchē, “train”) should display as 火車 (traditional) / 火车 (simplified), but that's not the case at the moment. Theknightwho (talk) 23:10, 16 June 2024 (UTC)[reply]

Why is it that CSS can't handle this? —Justin (koavf)❤T☮C☺M☯ 23:17, 16 June 2024 (UTC)[reply]

@Theknightwho If it's possible to use the font-language-override property as User:Koavf suggests, that might be best; otherwise we will need to fix the code here [3] to remove script codes from the CSS lang attribute. This has to be done carefully because some lang codes have hyphens in them, and some (esp. etym langs) have capitalized components after the hyphen. Benwing2 (talk) 23:32, 16 June 2024 (UTC)[reply]

@Benwing2 @Koavf There are three big issues that I can see with font-language-override:

It's only supported by Firefox at the moment.
It's a language override, not a script one, so it effectively tells the browser to display text as though it were some other language. It's not clear to me how this would work in the context of our infrastructure.
It uses OpenType language tags, which aren't the same as ISO language codes, and don't cover a large number of smaller languages.

You're right that we need to be careful in how this is handled, though, so here are my thoughts:

In terms of scripts, we probably want to create a :getAttributeCode() function in Module:scripts that always gives an ISO-valid script code. In most cases this is just the conventional script code, but the two kinds of special cases I can think of are:
1. Codes like Polyt and Music, where we'll need to manually specify an ISO-compliant code.
2. Codes with hyphens, like fa-Arab and mnc-Mong. In most cases the correct ISO code is just the bit after the hyphen (i.e. mnc-Mong becomes Mong), but there may be cases we want to handle them like the first type; for example, ur-Arab might want to use Aran (the code for Nastaliq).
In terms of languages, Module:script utilities always puts the full language code in attributes, which avoids the issue of etym-only languages rarely having their own valid codes, but it's probably sensible to have an equivalent :getAttributeCode() function as well, for two reasons:
1. There are some etym-only languages that do have valid ISO langcodes, and there's no reason not to use those in lang attributes (e.g. Anglo-Norman has xno). For a given etym-only language, it should check upwards through successive parents until it finds a lang with a valid ISO code. In cases where an etym-only language has a country code part, it may be appropriate to include that in the attribute as well (e.g. zh-Hant-TW is given as an example attribute here). This shouldn't be too hard to handle.
2. There are some full languages which don't have valid ISO codes, and it's probably not a good idea to be putting our custom codes in HTML attributes, as it could result in undefined behaviour. In those cases, we should be using mis (reserved for unencoded languages).

Theknightwho (talk) 00:04, 17 June 2024 (UTC)[reply]
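The script-side rule proposed above (manual overrides for special codes, otherwise take the part after the hyphen) can be sketched as follows. This is JavaScript for illustration only; the proposal itself is for a Lua method in Module:scripts. The function name and the `SCRIPT_OVERRIDES` table are hypothetical: the Polyt and Music values follow mappings given elsewhere in this thread, and the ur-Arab entry is only the possibility floated above, not a settled decision.

```javascript
// Hypothetical manual overrides for codes that need an explicit ISO 15924 value.
const SCRIPT_OVERRIDES = new Map([
  ["Polyt", "Grek"],   // polytonic Greek
  ["Music", "Zsym"],   // symbols
  ["ur-Arab", "Aran"], // assumption: Nastaliq for Urdu, floated as a possibility only
]);

// Return a script code suitable for use in an HTML lang attribute.
function getScriptAttributeCode(code) {
  if (SCRIPT_OVERRIDES.has(code)) return SCRIPT_OVERRIDES.get(code);
  // Hyphenated codes such as "mnc-Mong": keep only the part after the last hyphen.
  const hyphen = code.lastIndexOf("-");
  return hyphen === -1 ? code : code.slice(hyphen + 1);
}

console.log(getScriptAttributeCode("mnc-Mong")); // "Mong"
```

Checking the override table first means a hyphenated code can still be special-cased before the generic "part after the hyphen" rule applies.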

@Theknightwho This all sounds good. The thing about doing this, though, is that MediaWiki:Gadget-TranslationAdder.js currently pulls out the lang CSS attribute to get the language code for acceleration, which won't work if you do complicated transformations of existing language codes into something ISO-compliant. It sounds like we need to store two language codes, the ISO-compliant one and the Wiktionary one (which IMO should include all etym-only codes so they can potentially have their own custom accelerator code etc.); e.g. in lang and wikt-lang respectively. As an optimization maybe we can omit the Wiktionary one if it's the same as the ISO-compliant one modulo removing the script code, and have the code in MediaWiki:Gadget-TranslationAdder.js check for wikt-lang and fall back to lang + script code removal. You will need to implement this and test it prior to pushing any changes to the lang code to production. Benwing2 (talk) 07:55, 17 June 2024 (UTC)[reply]

@Benwing2 It should be doable without storing any language codes in most cases, so long as the 2/3 letter codes are all ISO compliant. The logic for a given lang/etym lang would be:

If it has a 2/3 letter code, use that.
If it’s an etym lang, iterate up through parents until one has a 2/3 letter code, and use that.
If it's a full lang (or an etym lang that has reached its full lang parent) and there's no 2/3 letter code, use mis. There are some instances where we treat a langcode as a family (e.g. Southern Min), so those cases might need special handling, but there aren't very many of those.

We could probably make it more sophisticated to account for country codes in certain etym langs, but that's not particularly complicated to manage: the solution is probably to have a mandatory script parameter, where the language object grabs the script object's ISO-compliant code, and which can be interpolated into the attribute code where appropriate. The output would be the complete language attribute code.

For scripts, we probably will need to manually specify them in some cases, but most won’t need it, so it can just be an extra key in the script data. Theknightwho (talk) 13:58, 17 June 2024 (UTC)[reply]
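The three-step logic listed above can be sketched like this (JavaScript for illustration; the real code would be Lua in the language modules). The `LANGS` table, its `type`/`parent` fields and the function names are hypothetical, and the parent chain for hak-hui-MY is simplified to a single hop.

```javascript
// Hypothetical language data: code -> { type, parent }.
const LANGS = {
  "en": { type: "full" },
  "en-AU": { type: "etym", parent: "en" },
  "hak": { type: "full" },
  "hak-hui-MY": { type: "etym", parent: "hak" }, // simplified parent chain
  "nds-nl": { type: "full" },                    // full language without a 2/3 letter code
};

// Assumption from the post above: 2/3 letter codes are all ISO compliant.
const isTwoOrThreeLetterCode = (code) => /^[a-z]{2,3}$/.test(code);

function getLangAttributeCode(code, langs) {
  let current = code;
  const seen = new Set(); // guards against accidental parent cycles
  while (current !== undefined && !seen.has(current)) {
    if (isTwoOrThreeLetterCode(current)) return current; // step 1 (and end of step 2)
    seen.add(current);
    const entry = langs[current];
    if (!entry || entry.type !== "etym") break; // reached a full language
    current = entry.parent;                     // step 2: move up to the parent
  }
  return "mis"; // step 3: nothing usable found
}

console.log(getLangAttributeCode("en-AU", LANGS));  // "en"
console.log(getLangAttributeCode("nds-nl", LANGS)); // "mis"
```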

Re-reading, I realise I misunderstood your point about two language codes: if you meant we need to have another HTML attribute for the translation-adder then unfortunately we won't be able to use a custom one like wikt-lang, as the wikitext parser sanitises HTML by stripping any non-whitelisted attributes for security reasons. There are a bunch of unused attributes that we could use, but I don't know which is the most suitable: probably one from the RDFa spec (about, property, resource, datatype, typeof) or the Microdata spec (itemid, itemprop, itemref, itemscope, itemtype). Theknightwho (talk) 14:57, 17 June 2024 (UTC)[reply]

It turns out attributes starting data- don't get stripped, so I suggest we use data-lang for the wikt-internal langcode. I agree that it's probably best to only include it if it's different, which cuts down the post-expand include size a little. Theknightwho (talk) 15:03, 17 June 2024 (UTC)[reply]

@Theknightwho Yes, that's right and data-lang sounds fine to me. Benwing2 (talk) 19:03, 17 June 2024 (UTC)[reply]
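A minimal sketch of the lookup agreed on here, as it might look on the gadget side: prefer data-lang, otherwise fall back to lang with the script code removed. The function name is hypothetical, and the attributes are passed as a plain object so that the sketch runs outside a browser; with a real element one would read them via `element.getAttribute(...)`. Detecting the script subtag as a four-letter title-case component is an assumption and would misfire on any language code that happens to contain such a component.

```javascript
// Return the Wiktionary-internal language code for an element's attributes.
// `attrs` is a plain object such as { lang: "en-Latn", "data-lang": "en-AU" }.
function getWiktLangCode(attrs) {
  // Prefer the explicit Wiktionary code when it was emitted.
  if (attrs["data-lang"]) return attrs["data-lang"];
  const lang = attrs.lang || "";
  // Fall back to lang minus any script subtag (assumed shape: "Latn", "Hant", ...).
  // The first component is never treated as a script code.
  return lang
    .split("-")
    .filter((part, index) => index === 0 || !/^[A-Z][a-z]{3}$/.test(part))
    .join("-");
}

console.log(getWiktLangCode({ lang: "en-Latn" }));                         // "en"
console.log(getWiktLangCode({ lang: "hak-MY", "data-lang": "hak-hui-MY" })); // "hak-hui-MY"
```

Filtering by subtag shape rather than splitting on the first hyphen keeps hyphenated language codes and capitalized region components intact, which is the care Benwing2 asked for above.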

@Benwing2 So I've looked into this a bit more, as it's all defined in the BCP 47 convention for language tags. It's pretty complex, but here are the types of subtag we probably want to care about:

The language subtag, which generally must be from one of the ISO 639 standards. One thing I hadn't realised is that this includes ISO 639-5 (family tags), so I think those should be used as a fallback in the event that a full language doesn't have a valid ISO code of its own. Following the same principle used for etym-langs, a language should iterate upwards until it finds a family that has a valid ISO code (e.g. Old Galician-Portuguese would use roa; Romance). There are still some instances where primary language families aren't covered by the ISO, and of course we have a bunch of unclassified languages as well, so mis should always be the ultimate fallback. There are a handful of cases where we've decided not to recognise ISO families as true families in the data (e.g. Caucasian; cau), so we might want to manually specify these as fallbacks in the data for their value as areal groupings, but it's probably best to do it case-by-case. Additionally, there are a handful of non-ISO codes which have been registered as language subtags, but most of these are now redundant; I'll trawl through them to look for any we might want to use, though. (Edit: the only 3 which aren't redundant are i-default (default language), i-mingo (Mingo) and i-enochian (Enochian). "Default language" is not relevant and Enochian is a conlang we don't cover, but we do have Mingo, so we should handle that with the ietf_subtag field.)
The script subtag - we've essentially hashed all this out above. I've added various ietf_subtag values to our script data: Polyt uses Grek, Ipach (IPA) uses Latn, Morse, Music & Semap (flag semaphore) use Zsym (symbols), ruminumerals (Rumin) uses Arab, the two ancient Iberian scripts use Zzzz (unencoded), and Image & None use Zyyy (unspecified).
The region subtag. This can either be a 2-letter ISO 3166-1 alpha-2 country code (e.g. GB), or a 3-digit UN M49 region code. Since numeric codes make for bad language codes, I've added an ietf_subtag field to the etym-only language data so that UN M49 regions can be manually specified, though I've only found one case: North American English (en-NNN), which should use en-021 (note: 021 is for "Northern America", which excludes Mexico, Central America and the Caribbean). Another one which uses this field is Korean Classical Chinese (lzh-KO), because KO isn't a real country code, and unfortunately there's no UN M49 region that specifically covers Korea as a whole, so I've set the ietf_subtag value to lzh-KR, as it's better than nothing (KR is for South Korea). However, see my comments on Unicode locale subtags below, as that might provide a solution.
The variant subtag. This is a bit more complicated, as they're essentially just a hodge-podge of things which have been registered as part of the IANA language subtag registry with the type "variant", some of which seem much more useful than others. From a brief look, it's everything from language periods (1606nict: "Late Middle French (to 1606)"), dialects (scouse: "Scouse" (Liverpool English)), romanisations (wadegile: "Wade-Giles romanization"), written standards (1996: "German orthography of 1996") and transcription systems (fonupa: "Uralic Phonetic Alphabet"). In some cases, they can only come after certain (combinations of) other tags (e.g. 1996 above can only be used with de), which is why many have codes that only make sense in that context. In at least one case, the same subtag represents different things depending on the preceding tags (pinyin is Hanyu Pinyin with zh-Latn and Tibetan pinyin with bo-Latn, and - counterintuitively - can't be used with cmn-Latn (Mandarin)). I think we should make use of these on a case-by-case basis.
The extension subtag, which is followed by any number of extended subtags. There are only two of these, and both are relevant for our purposes, but I think we should probably hold off on adding them until we've got the main tagging system established, so I'll just give a broad overview:
- t represents transformed content (i.e. translations, transliterations and transcriptions etc.), and is defined in BCP 47 Extension T. It would make sense to include these for glosses/transliterations/transcriptions in our link/headword template outputs, I think.
- u represents the "Unicode locale", and is defined in BCP 47 Extension U. According to Wikipedia, this includes "country subdivisions, calendar and time zone data, collation order, currency, number system, and keyboard identification". Most of these aren't relevant, but country subdivisions and (potentially) number systems both seem useful for our purposes, and there may be other useful things in there as well.

Theknightwho (talk) 17:17, 18 June 2024 (UTC)[reply]

@Theknightwho OK, I read the whole thing you wrote and it sounds like you've thought this out pretty well. I would definitely implement this incrementally as there's a lot of stuff here. Benwing2 (talk) 18:59, 18 June 2024 (UTC)[reply]

@Benwing2 I think we can probably start off with the 3 basic ones: languages, scripts and regions, as they're pretty straightforward. With regions, I was thinking of using the same principle of inheritance used for language tags; i.e. a variety uses the one it has, or otherwise inherits the first one it encounters while iterating through parents. For example, Malaysian Huiyang Hakka (hak-hui-MY) uses MY, as it's part of its code, then iterates through parents until it finds one with a valid ISO code; in this case, Hakka (hak). This gives the subtags hak-MY, even though that isn't a code used for any particular variety. On the other hand, Australian Aboriginal English will use the region subtag AU, since the first one it'll encounter will be its parent, Australian English (en-AU).

There is an edge-case where etym-langs with valid ISO codes are set as a (sub)variety of something that has a region code, which comes up occasionally (e.g. Achterhoeks (act) is a variety of Dutch Low Saxon (nds-nl)). My inclination is that iterating through parents should stop once a valid lang/family subtag is found, with the region being an optional bonus picked up along the way, since act-NL feels redundant and it keeps the code simpler. Theknightwho (talk) 19:38, 18 June 2024 (UTC)[reply]
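The region inheritance and the stopping rule described above can be sketched as follows (JavaScript for illustration only). The `PARENTS` table and the function name are hypothetical: the intermediate code hak-hui is assumed, "x-aboriginal-en" is a made-up placeholder for Australian Aboriginal English, and reading the region from a trailing two-letter uppercase component is a simplification of what an ietf_subtag field would provide.

```javascript
// Hypothetical parent links: code -> parent code.
const PARENTS = new Map([
  ["hak-hui-MY", "hak-hui"],     // intermediate parent is an assumption
  ["hak-hui", "hak"],
  ["x-aboriginal-en", "en-AU"],  // placeholder code, not a real Wiktionary code
  ["en-AU", "en"],
  ["act", "nds-nl"],
]);

function getLangRegionTag(code, parents) {
  let region = null;
  let current = code;
  const seen = new Set(); // guards against accidental parent cycles
  while (current !== undefined && !seen.has(current)) {
    seen.add(current);
    // Stop as soon as a valid language subtag is found; the region is optional.
    if (/^[a-z]{2,3}$/.test(current)) {
      return region ? `${current}-${region}` : current;
    }
    // Keep the first region encountered on the way up.
    if (region === null) {
      const match = /-([A-Z]{2})$/.exec(current);
      if (match) region = match[1];
    }
    current = parents.get(current);
  }
  return "mis"; // no valid language subtag anywhere in the chain
}

console.log(getLangRegionTag("hak-hui-MY", PARENTS)); // "hak-MY"
console.log(getLangRegionTag("act", PARENTS));        // "act"
```

Because the loop returns at the first valid language subtag, Achterhoeks never reaches its parent and stays plain act rather than becoming act-NL.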

@Theknightwho Sounds good. With something like this, given the variety and inconsistency in our etym language codes, it seems we'll probably have to iterate until we come up with something that works well. In the process if you discover any etym language codes that need renaming, let me know and I can help. Benwing2 (talk) 19:48, 18 June 2024 (UTC)[reply]

@Benwing2 Thanks - will do. So far, the only thing is whether we want to change nds-de and nds-nl to nds-DE and nds-NL for consistency, since they long-predate your recent standardisation of etym-languages, but they're a special case since they're full languages. I've added ietf_subtag fields to them for now, at least. Theknightwho (talk) 20:01, 18 June 2024 (UTC)[reply]

@Theknightwho Cool. I think it would probably be a good idea to make that change (on the principle that we want to merge the handling of full and etym-only languages as much as possible), but since they're full languages we'd need to get consensus, esp. from whichever editors regularly work in these languages. Benwing2 (talk) 20:24, 18 June 2024 (UTC)[reply]

Yep, agreed. Sounds good. Theknightwho (talk) 22:16, 18 June 2024 (UTC)[reply]

@Benwing2 One other thing is that we probably want to change langcodes that start und- to mis-, since und represents "undetermined" while mis represents "unencoded". It doesn't really impact much, since it's just a Wiktionary convention, but I'd support changing it for the same reason you standardised our etym-lang codes, as it keeps things in line with the ISO standard as much as possible. Theknightwho (talk) 19:53, 20 June 2024 (UTC)[reply]

@Theknightwho Sounds fine with me. Looks like there are only six of them and I can't imagine they see much use. Benwing2 (talk) 21:26, 20 June 2024 (UTC)[reply]

@Benwing2 I'm going to add modules which contain ISO 639-1 (2-letter), 639-3 (3-letter) and 639-5 (family) codes, which can be used as a way to cross-check against our linguistic data. I'll make it very clear on the documentation that they're not to be used in mainspace (or for any other purpose), and they should be kept updated in-line with any changes to the standards; realistically, ISO 639-3 is the only one which we'll see any changes from, and from the change log they seem to do batch updates once a year or less.

Two issues I can already spot are Bihari (bh) and Serbo-Croatian (sh), as both those codes have been deprecated by ISO 639-1. In the case of Serbo-Croatian, I've no objection to using sh internally, but the tags should use the 3-letter code hbs. With Bihari, we probably want to convert it into a family, since the code was (rightly, imo) deprecated from the ISO for political reasons, after India formally recognised its various constituents as separate languages in their own right. We only have 5 lemmas for it (compared to 366 Bhojpuri lemmas alone), and we currently list all of its constituents as its descendants, which is the same issue we had with Southern Min before that got split. Theknightwho (talk) 22:04, 20 June 2024 (UTC)[reply]

@Theknightwho I already created Module:ISO 639 a while ago; you should augment this module if possible rather than creating yet another duplicative source of language code info. As for Bihari, I have no objection to splitting it into its constituent languages. For Serbo-Croatian I agree we should stick with sh because there is a ton of infrastructure and entries that already use this code (and in any case there's no telling what might happen to hbs in the future given the contorted language politics of the former Yugoslavian countries). Benwing2 (talk) 01:02, 21 June 2024 (UTC)[reply]

can i search for all English nouns that begin with a capital letter?

ideally i'd like to exclude entries that are only used as proper nouns. e.g. Tabasco qualifies for what i want because it can be used, even with a capital T, to denote the pepper sauce and therefore is a common noun. i dont know how to do this, if it's possible at all. I want Category:English nouns but the sorting isnt case-sensitive. Thanks, —Soap— 20:26, 17 June 2024 (UTC)[reply]

Maybe not a perfectly efficient solution, but you could use w:en:WP:AWB to make a list from Category:English nouns and then copy and paste into a spreadsheet (LibreOffice Calc, Google Sheets, Microsoft Excel) and remove entries with that. —Justin (koavf)❤T☮C☺M☯ 21:43, 17 June 2024 (UTC)[reply]

The following searchbar search would let you sip from the firehose, assuming you want only English: "hastemplate:"en-noun" intitle:/Ta[a-z]+/ prefix:Ta". This quickly yields entries such as you have specified beginning with "Ta", including Tabasco with 294 others. Experimenting with similar searches might help you adjust your efforts. DCDuring (talk) 21:53, 17 June 2024 (UTC)[reply]

@Soap Best way is to use Petscan. Set it to en.wiktionary and type "English nouns" in the Categories box. On the "Output" tab, select "Plain text" and sort by "title". This sorts capital letters before lowercase letters! This, that and the other (talk) 23:33, 17 June 2024 (UTC)[reply]

@Soap it can be done in Special:Search using incategory:"English nouns" and intitle:/[A-Z]/, but the regex needs tweaking to keep it from timing out and I'm not very good with regexes. Also, any common noun that contains a proper noun, like "American eagle", is going to be capitalized. Chuck Entz (talk) 05:02, 18 June 2024 (UTC)[reply]

@Chuck Entz: American eagle does not contain a proper noun, because American is not a proper noun, but a common noun or adjective derived from a proper noun. J3133 (talk) 05:20, 18 June 2024 (UTC)[reply]

@J3133 Yes, it does: "America", as part of the adjective, "American". The presence of a proper noun anywhere in the term- even as a part of a part- is enough to force capitalization. Chuck Entz (talk) 05:40, 18 June 2024 (UTC)[reply]

@Chuck Entz: I am aware of proper adjectives (“an adjective derived from a proper noun”), which are usually capitalized; however, there is a difference between a term containing a proper noun (e.g., Arkansas toothpick) and a term derived from a proper noun (e.g., herostratic/Herostratic fame, Alexandrine). J3133 (talk) 05:48, 18 June 2024 (UTC)[reply]

Thank you to all. I'm still interested in this, but it's true that so far, all of the query strings people have given me both here and on Discord have either timed out or (sipping at the trough) returned only very few results. The on-wiki search help for whatever reason recommended using deepcategory: which didnt work at all. Maybe we can change that or specify the difference between it and incategory:. Like I said Im still interested in this, and will try to work something out on my own if I can figure out how the processor runs ... for example would adding an exclusion category speed the process up or slow it down? Thanks, —Soap— 06:32, 18 June 2024 (UTC)[reply]

@Soap did Petscan not work for you? Or is it not what you are looking for? This, that and the other (talk) 08:03, 18 June 2024 (UTC)[reply]

i apologize, actually. it seemed that it was limited to 10K results at first, but i must not have pressed the buttons properly. it seems it's still not working quite how i expected, in that i can't exclude Category:English proper nouns, even by putting it into the negative category ... but that wasnt what i asked for in the beginning. i just want it now that i realize just how huge the category is. even though it would exclude Tabasco which is both a proper noun and a common noun. i will keep working with this, though, and if necessary perhaps i could do two searches and then run through Excel to sift out the ones that appear in both columns. Still, is "negative category" intended as a way to do this, or is it something else? Thanks, —Soap— 13:07, 18 June 2024 (UTC)[reply]

to give an idea of what this is all for, i found Terramycin just now which is a brand name which we had categorized as a common noun. i think it should be a proper noun. i know i've seen others, and figure that sometimes someone who creates a page just uses the generic noun template when it should be proper. —Soap— 13:19, 18 June 2024 (UTC)[reply]

@Soap I've tried to use the Petscan negative category feature in the past without much luck - it may be buggy. Excel can be a useful tool for managing complex combinations of word lists in an interactive way (without writing code). This, that and the other (talk) 11:56, 19 June 2024 (UTC)[reply]
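For anyone who would rather script the two-list comparison than do it in a spreadsheet, here is a rough Python sketch. It assumes the titles of Category:English nouns and Category:English proper nouns have already been exported (e.g. as plain text from Petscan) into two lists; the function name and the sample titles are only illustrative.

```python
def capitalized_common_nouns(nouns, proper_nouns):
    """Return capitalised noun titles that are not also listed as proper nouns."""
    proper = set(proper_nouns)
    return sorted(
        title for title in nouns
        if title[:1].isupper() and title not in proper
    )


# Illustrative input only; real lists would come from the category exports.
nouns = ["Tabasco", "dog", "Terramycin", "American eagle"]
proper_nouns = ["Tabasco"]
result = capitalized_common_nouns(nouns, proper_nouns)
```

As noted in the thread, subtracting the proper-noun list also drops entries such as Tabasco that sit in both categories, so this narrows the list rather than answering the original question exactly.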

Changes to `{{alt}}`

@Benwing2, I don't remember {{alt}} linking to Appendix:Glossary, i.e. shoppe (obsolete). Is that a recent change? -- Sokkjō 01:03, 19 June 2024 (UTC)[reply]

@Sokkjo Yes, over the last few months I merged "dialect tags" (which is what {{alt}} used) and "accent qualifiers" (which is what {{a}} used) with labels. There have been a number of Beer Parlour and Grease Pit topics where I have posted about this. I can dig them up if you're interested, but the basic idea is that there used to be all sorts of duplication and fragmentation of dialect and such info and I have tried to consolidate it. One of the effects of this is that, since the tags in {{alt}} that follow a blank parameter are now just labels, things like obsolete link to the glossary just like labels do. There are now two primary places where dialect information can be found; one is in the language-specific label data and the other is the etym-only language info in Module:etymology languages/data. I haven't tried to merge these two for various reasons. Benwing2 (talk) 01:48, 19 June 2024 (UTC)[reply]

Is there a particular discussion related to my query you can direct me to? -- Sokkjō 05:07, 19 June 2024 (UTC)[reply]

template:synonyms and its cousins have suddenly stopped respecting the semicolon parameter?

I know it was working until recently, and now it is not working. The documentation for template:synonyms says, "It is suggested to use semicolons to separate logical groups of synonyms." I agree; and I suspect that the recent change in behavior was unintentional, as I tried a cursory effort at finding discussion of it and did not find any. Quercus solaris (talk) 02:04, 20 June 2024 (UTC)[reply]

@Quercus solaris Thanks for the bug report. I have made some code changes here and I must have broken this. Benwing2 (talk) 04:30, 20 June 2024 (UTC)[reply]

@Quercus solaris Should be fixed; let me know if you see any other issues. Benwing2 (talk) 04:40, 20 June 2024 (UTC)[reply]

Diff colours

FYI for anyone who doesn't like the new richer colours that added and removed things in diffs have, several ways of changing back to the old colours are laid out on Phabricator and in Wikipedia's Village Pump - Technical. - -sche (discuss) 01:27, 21 June 2024 (UTC)[reply]

@-sche: oh, good. I'm not opposed in principle to having colours, but I find that the dark purple colour makes it impossible for me to tell when I'm highlighting text. — Sgconlaw (talk) 02:31, 21 June 2024 (UTC)[reply]

Etymology tree

I saw an etymology tree in English biology and want to use that for other indexes. How can I do that? Thank you!

Plus, I see a lot of Templates and do not know where to find their usage guide. I've browsed through the Template page but nothing. So how do I find information about a template? Thank you. Duchuyfootball (talk) 02:58, 21 June 2024 (UTC)[reply]

Information about the etymology tree template is at Template:etymon. Templates are supposed to have documentation at pages with these kinds of names.--Urszag (talk) 03:22, 21 June 2024 (UTC)[reply]

Thank you! I've tried the template on advance but can't figure out a way for it to display the third etymon. Could you take a quick look to see what's wrong? Thank you. — This unsigned comment was added by Duchuyfootball (talk • contribs) at 03:47, 22 June 2024.

@Duchuyfootball, you're a Vietnamese editor, why would you try creating a language tree for an English word? -- Sokkjō 03:59, 22 June 2024 (UTC)[reply]

All of the ids in the etymologies (placed after the last >) have to match the ids on the page for the relevant sense. E.g. at avauncen you need to have "fro>avancier>move forward" where "move forward" matches the id parameter at avancier.--Urszag (talk) 04:39, 22 June 2024 (UTC)[reply]
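To make the id-matching rule concrete, here is a small Python sketch. It assumes only the `lang>term>id` shape quoted above (language code before the first `>`, id after the last `>`); the function names are invented for illustration and the real {{etymon}} syntax may allow more than this.

```python
def parse_etymon_arg(arg):
    """Split 'lang>term>id' into (lang, term, id); id is None if absent."""
    lang, sep, rest = arg.partition(">")
    if not sep:
        raise ValueError(f"expected 'lang>term' or 'lang>term>id', got {arg!r}")
    term, sep2, ident = rest.rpartition(">")
    if not sep2:
        # No second '>' means no id was given.
        return lang, rest, None
    return lang, term, ident


def id_matches(arg, ids_on_target_page):
    """True if the id in the argument is one of the ids defined on the target page."""
    _, _, ident = parse_etymon_arg(arg)
    return ident is not None and ident in ids_on_target_page
```

In the example from the thread, `id_matches("fro>avancier>move forward", {"move forward"})` holds only if avancier really defines the id "move forward".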

Thank you, I will look into that! Duchuyfootball (talk) 08:31, 22 June 2024 (UTC)[reply]

@Sokkjo I love creating a language tree for any language, and I want to experiment with this technology before possibly finding a way to try it out on Vietnamese (since there is no example for Vietnamese, and I figure whoever uses this template would be most familiar with the English language). That said, once I manage to successfully complete an etymon tree for this English word, I will move on to the next one.

In addition, I am not aware of any rules banning a foreign editor from improving English pages, or more particularly, from creating etymon trees for English words. I am, however, not oblivious of the possibility that someone might actually overdo it and ruin the entire page. In my case, I think I am not ruining the page whatsoever.

Is it because you fear I might make mistakes when creating the etymon tree? If there are any particular rules or reasons that you don't want me to add etymon trees for English words, please kindly inform me. Thank you. Duchuyfootball (talk) 08:29, 22 June 2024 (UTC)[reply]

@Duchuyfootball, creating etymologies for languages you're not adeptly familiar with is a quick way to make mistakes and find yourself in hot water. Even in your very limited foray, you managed to mislabel the descent path from French to Middle English the other day. Creating etymology trees should really be left to those who understand the full chain of descent and can even fix errors. -- Sokkjō 14:49, 22 June 2024 (UTC)[reply]

I'm not sure what exactly the criteria are for "being familiar with a language". I've been studying and using English for 12 years. My English proficiency level is C2. I frequently asked for improvements on the etymology of English and Latin words. I do minor tweaks to English words from time to time (and so far no complaints). So, am I qualified for the language?

I think it would be unfair to put that mislabeling mistake of mine down to unfamiliarity with the language. In my defense, it was totally because of my inability to understand how the template works. Now that I have understood how it works, I can confidently do it with a minimal level of inaccuracy.

Could you elaborate on the "understand the full chain of descent and can even fix errors" part? I don't think it's rocket science - the etymology given for each word is quite straightforward, meticulously formatted so that readers can easily understand, is it not? Regarding errors, what type of errors are we talking about here?

Another reason I'm doing this is because so far I don't see a lot of entries with an etymon tree, so I decided to help. Duchuyfootball (talk) 16:03, 22 June 2024 (UTC)[reply]

If I come across as stubborn or something, I'm terribly sorry. I understand your concerns. If you wish the English entries to be exclusive to you and your fellow editors, I will respect that. Duchuyfootball (talk) 16:06, 22 June 2024 (UTC)[reply]

While waiting for your response, I decided to try the template out on deposit, melanotubule, glacer. Duchuyfootball (talk) 16:09, 22 June 2024 (UTC)[reply]

(edit conflict) @Duchuyfootball, you can be a native speaker of English and still not understand what Proto-Germanic form a word descends from, let alone Proto-Indo-European. {{etymon}} doesn't need help badly enough for people who don't work in etymologies to be creating etymological trees. It isn't a game, and though it isn't "rocket science", linguistics is a science. -- Sokkjō 16:12, 22 June 2024 (UTC)[reply]

Is it not often stated clearly in the etymology section (the Proto-Germanic form a word descends from)? Duchuyfootball (talk) 16:17, 22 June 2024 (UTC)[reply]

Now your response baffles me: aren't etymology trees visual representations of what is already given in the etymology section? Duchuyfootball (talk) 16:20, 22 June 2024 (UTC)[reply]

Of course as a sane editor, I know better than trying to create an etymology tree when there are too many vague factors. Duchuyfootball (talk) 16:28, 22 June 2024 (UTC)[reply]

Victar just dislikes the template and conflates "synching entries" with "spreading false information". Were he aware of how people use the template, he'd be aware that most of the time this misinformation that he's afraid of is already present on entries, people are just bringing it to light. But what can you expect from a contrarian who feeds on disagreement. Vininn126 (talk) 16:33, 22 June 2024 (UTC)[reply]

Can't have a Wiki unless you have toxic users like Vininn126. 🤷 -- Sokkjō 16:38, 22 June 2024 (UTC)[reply]

Says the user who brings up needless confrontation at every opportunity and has driven away tons of other editors. Don't pin your toxicity on me. Vininn126 (talk) 16:41, 22 June 2024 (UTC)[reply]

Thanks for the info. It seems like I can continue my tree-growing journey now. Yay. Duchuyfootball (talk) 16:43, 22 June 2024 (UTC)[reply]

The issue is that you will come across incorrect etymologies, either through misunderstanding, or simple error, and if you don't have an understanding of that tree of descent, copying it to other pages is just propagating misinformation. Even with {{etymon}} out of the picture, it's not a good idea to copy etymologies if you don't understand what you're copying. -- Sokkjō 16:46, 22 June 2024 (UTC)[reply]

Then... fix the wrong etymology? Is it a better idea to feed people incorrect etymologies and not bother with a second look? Duchuyfootball (talk) 16:50, 22 June 2024 (UTC)[reply]

So then people who connect them are just bringing them to light. Therefore uncovering the misinformation we already have. Vininn126 (talk) 16:54, 22 June 2024 (UTC)[reply]

If your realm of expertise is Vietnamese, why not create Vietnamese trees? -- Sokkjō 16:46, 22 June 2024 (UTC)[reply]

Because it is poorly researched so there are few things to do. Duchuyfootball (talk) 16:51, 22 June 2024 (UTC)[reply]

There isn't really anything special about English that makes it more difficult to add an etymology tree if you're able to read the etymologies given on the existing pages. As Vininn126 said, if you accurately represent the existing etymologies, the worst-case scenario is that you just put some existing incorrect information in front of more eyes, which increases the likelihood of it being fixed. Having the knowledge to make those sorts of fixes yourself would be a useful bonus, but isn't an absolute prerequisite.--Urszag (talk) 16:58, 22 June 2024 (UTC)[reply]

Thank you. Duchuyfootball (talk) 16:59, 22 June 2024 (UTC)[reply]

@Duchuyfootball: Since now you're simply being flippant, I will have to warn you that carelessly creating wrong etymology trees may lead to a block, and "I was just copying it from here" is not a valid excuse.
Pinging @Mahagaja, DCDuring for their general awareness. -- Sokkjō 17:00, 22 June 2024 (UTC)[reply]

This is already a problem we face. That etymology must be dealt with. This is contrarianism for contrarianism's sake. I'm tired of your attitude. Vininn126 (talk) 17:01, 22 June 2024 (UTC)[reply]

Like Vininn126, I find it strange to hear you call another editor a "toxic user" while in the same conversation being extremely aggressive and threatening towards a good-faith editor who's just trying to get experience working with an unfamiliar new template. Plenty of people edit etymology sections on this wiki every day: of course they should be careful and try to avoid inaccuracy, but it isn't some sacred domain that you need to gatekeep and discourage newcomers from participating in. And Duchuyfootball is not even a new editor to the overall project.--Urszag (talk) 17:06, 22 June 2024 (UTC)[reply]

@Urszag, Good intentions are not an excuse for making bad edits in an area you are unfamiliar with. Creating a full stack tree really should not be embarked upon unless you have a grasp of the languages. I wouldn't dare create a tree for, say, an Austroasiatic language, because it's simply out of my wheelhouse. If you find yourself editing a Proto-Indo-European entry when you've never done so before, you should really question your actions. -- Sokkjō 17:46, 22 June 2024 (UTC)[reply]

This really ignores most of the conversation. 99% of the time it's about connecting existing entries. Vininn126 (talk) 17:54, 22 June 2024 (UTC)[reply]

Vininn126, I wasn't replying to you and really could care less about what you think about anything. -- Sokkjō 18:07, 22 June 2024 (UTC)[reply]

That's the non-toxic, cooperative attitude you were boasting earlier! No hypocrisy here at all. Vininn126 (talk) 18:09, 22 June 2024 (UTC)[reply]

None of your comments above have been made in good faith -- you just came to troll -- which I'm not going to waste my time on. -- Sokkjō 18:17, 22 June 2024 (UTC)[reply]

This is objectively untrue. I called out the problems of what you have said and also the merits of people scouring these pages. If that's not "good faith", then you have a problem, which I'm very convinced of. Vininn126 (talk) 18:19, 22 June 2024 (UTC)[reply]

@Sokkjo Is this going to end with you storming off in a huff for a year like the last time we got a new template you didn't like? Theknightwho (talk) 20:37, 23 June 2024 (UTC)[reply]

LMAO, and playing Troll #2 this evening, ladies and gentlemen, Theknightwho. -- Sokkjō 23:10, 23 June 2024 (UTC)[reply]

@Sokkjo I'll take that as a yes, then. Naturally, we're all just trolls for pointing out your shitty behaviour, right? Theknightwho (talk) 00:29, 24 June 2024 (UTC)[reply]

🍿👄. -- Sokkjō 01:17, 24 June 2024 (UTC)[reply]

I'm refusing to take you seriously after writing paragraphs to receive some sentences. You lost my un-flippancy. But thanks for the warning. I will be extra careful👌. Duchuyfootball (talk) 17:12, 22 June 2024 (UTC)[reply]

@Duchuyfootball: I will say that showing the tree for {{etymon}} is supposed to be based on community consensus per its creation vote, and the main language that has that consensus is English. I'd avoid adding the template to French, Latin, and especially Old French for now, until a lot of the quirks have been worked out. AG202 (talk) 00:37, 24 June 2024 (UTC)[reply]

I'm not sure I understand. So can I still add the template to these languages, but not showing the tree? Duchuyfootball (talk) 05:32, 24 June 2024 (UTC)[reply]

Yes. Vininn126 (talk) 05:55, 24 June 2024 (UTC)[reply]

Thank you! Duchuyfootball (talk) 05:57, 24 June 2024 (UTC)[reply]

Unifying collapsible boxes

Currently there are two styles of collapsible box available for use in entries.

One style is clickable along its entire width when collapsed (the so-called NavFrame):

{{collapse-top}} ⇒

More information

(why is this box yellow inside? shouldn't that be reserved for {{trans-top}}? and why does the box title have a random indent?)

{{box-top}} ⇒

More information

contents

{{der-top}} ⇒

Derived terms

contents

The other style has a totally different look, and only the [Collapse]/[Expand] toggle is clickable:

{{collapse|contents}} ⇒

More information

contents

The second style of box is much less common and, in my opinion, less user-friendly and out of step with Wiktionary's look. Does anyone have views on the idea of migrating templates with the second style across to the first style? (Note, to be clear, I am not proposing any change to the parameters of {{collapse}}.) This, that and the other (talk) 09:31, 21 June 2024 (UTC)[reply]

{{collapse}} Looks less heavy and more integrated into the text e.g. in an etymology section such as , rugiada, excessive alt forms at potârniche. Otherwise reasonably used only in Korean entries, also automatically generated via {{ko-IPA}}, 방 (bang) at multiple places. Fay Freak (talk) 19:26, 21 June 2024 (UTC)[reply]

I think it's a good idea to use the first style. I'm not sure why the second style is green, unless the intention is for it only to be used on discussion pages to draw attention to things. (If that is thought to be useful, perhaps just add a parameter to the first style allowing a different colour to be specified.) — Sgconlaw (talk) 20:51, 21 June 2024 (UTC)[reply]

@Fay Freak I take it that you are talking about the formatting of the box contents? I agree that, when the contents of the box is regular text (as opposed to a list), it makes sense to keep the same text format as the rest of the entry. However, I still think it would be better to at least unify the appearance of the bar along the top of the box.

In cases where the collapsible box contains text, I'm inclined to combine the NavFrame bar with the larger font of {{collapse}}'s interior. Something like this:

More information

contents

Not sure whether it needs a background color like {{box-top}} or double border like {{collapse}}. Thoughts? This, that and the other (talk) 04:41, 23 June 2024 (UTC)[reply]

@This, that and the other: colourwise, maybe we should just follow the format of {{col3}}? — Sgconlaw (talk) 13:54, 23 June 2024 (UTC)[reply]

Category:Pages with broken file links

Of the 27 pages currently in this category, 23 are there solely due to templates that routinely link to every possible filename.

The first, {{tok-sitelen}}, checks for the files of both the "sitelen pona" and "sitelen sitelen" versions and only displays the ones that exist. Is there any way to keep checks for existence in template code from putting pages into this category?
There are also a number of Han character entries where {{Han etyl}} has preloaded lists of expected glyph files, only some of which actually exist. If you click "more" on the boxes, you're treated to tons of helpfully labeled file redlinks. Is there any way to just display the ones that actually exist? Pinging @Justinrleung who would know more about the reasons the template is set up this way. Chuck Entz (talk) 19:13, 21 June 2024 (UTC)[reply]

@Chuck Entz: I suspect there was something wrong with the uploads a while back. It was Wyang who dealt with the upload and this module at the time, so I'm not sure what to do with it right now other than to remove the appropriate file names from the subpages of Module:zh/data/glyph-data. — justin(r)leung { (t...) | c=› } 21:26, 21 June 2024 (UTC)[reply]

Orange links should not be links

I recently changed a link in an article, because it went nowhere (technically, the link pointed to a nonexistent section on the same page, such that clicking the link does nothing). This change was later reverted, with the editor pointing out the value of keeping orange links.

I am grateful to the editor for explaining the value of this (I'm mostly a WP editor so I was not familiar with this concept), and I understand that orange links are useful for editors. But they are not good for readers. Having a link that does nothing when clicked just leads to confusion for most readers, and as far as I can tell it doesn't show up as orange unless you have a specific plugin installed, which casual readers will almost never have.

There's got to be some way of maintaining the benefit of orange links for editors, while not creating confusion for readers. (For example, maybe someone could update the der template -- which created the link in question in the edits discussed above -- so that when something would be an orange link, it instead creates no link and just adds that term to a category or list somewhere?) 85.200.213.110 05:38, 22 June 2024 (UTC)[reply]

You aren’t supposed to link any foreign languages outside of templates at all. I support making orange links default for unregistered editors though. People are misled in like fashion with links on the same page and I suspect that ambiguous links are a reason why Wiktionary seemingly lacks popularity in the world of dictionary users. Fay Freak (talk) 05:43, 22 June 2024 (UTC)[reply]

It would be nice to have orange links on by default; this was my first thought when undoing that edit. We might should consider checking for any foreign words linked using bare links. Vininn126 (talk) 07:47, 22 June 2024 (UTC)[reply]

I've noticed that orange links don't work properly when the link is to the same page, which is annoying. Theknightwho (talk) 20:25, 27 June 2024 (UTC)[reply]

I'm down with turning orange links on by default. - -sche (discuss) 20:32, 27 June 2024 (UTC)[reply]

CJK Unified Ideographs Extension I

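Category:CJK Unified Ideographs Extension I block. Even with a supporting font and browser installed, the characters do not seem to be displayed. Several months have already passed, but MediaWiki still does not support Unicode 15.1. The font is here. -- Charidri (talk) 14:45, 22 June 2024 (UTC)[reply]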

Editing ebony: "hu t+" nonsense

While trying to remove redundant script codes for ebony, my action got automatically identified as "harmful", with the auto description mentioning "hu t+". I don't know why, since my earlier edit to fix English entries with topic categories using raw markup has somehow passed through. oldid 80501138 --> (removed redundant script codes, fixed 'hu t+' error):

45,46c45,46
< * Armenian: {{t+|hy|եբենոս|sc=Armn}}, {{t+|hy|աբանոս|sc=Armn}}
< * Bengali: {{t|bn|আবলুস|sc=Beng}}
---
> * Armenian: {{t+|hy|եբենոս}}, {{t+|hy|աբանոս}}
> * Bengali: {{t|bn|আবলুস}}
62,64c62,64
< * Gujarati: {{t|gu|અબનૂસ|sc=Gujr}}
< * Hindi: {{t+|hi|आबनूस|sc=Deva}}
< * Hungarian: {{t+|hu|ében}}, {{t+|hu|ébenfa}}
---
> * Gujarati: {{t|gu|અબનૂસ}}
> * Hindi: {{t+|hi|आबनूस}}
> * Hungarian: {{t|hu|ében}}, {{t|hu|ébenfa}}
68c68
< * Japanese: {{t|ja|コクタン材|tr=kokutanzai|sc=Jpan}}
---
> * Japanese: {{t|ja|コクタン材|tr=kokutanzai}}
73c73
< * Malayalam: {{t+|ml|കരിമരം|sc=Mlym}}, {{t|ml|കരിന്താളി|sc=Mlym}}
---
> * Malayalam: {{t+|ml|കരിമരം}}, {{t|ml|കരിന്താളി}}
107c107
< * Hungarian: {{t+|hu|ében}}, {{t+|hu|ébenfa}}
---
> * Hungarian: {{t|hu|ében}}, {{t|hu|ébenfa}}
140c140
< * Hungarian: {{t+|hu|ében}}, {{t+|hu|ébenfekete}}
---
> * Hungarian: {{t|hu|ében}}, {{t|hu|ébenfekete}}

Thank you, 83.28.217.24 09:05, 24 June 2024 (UTC)[reply]
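The script-code part of the edit above could be sketched in Python as follows. This is only an illustration: the regular expressions and the function name are assumptions, and the sketch does not check whether a given `sc=` value really is the language's default script, which is what makes the code redundant in the first place.

```python
import re

# Matches simple {{t|...}} and {{t+|...}} calls with no nested templates.
TRANSLATION_TEMPLATE = re.compile(r"\{\{t\+?\|[^{}]*\}\}")
# Matches a |sc=Xxxx parameter that is followed by another parameter or the closing braces.
SCRIPT_PARAM = re.compile(r"\|sc=[A-Za-z]{4}(?=[|}])")


def strip_script_codes(wikitext):
    """Remove |sc=... parameters from {{t}} and {{t+}} calls, leaving other text alone."""
    return TRANSLATION_TEMPLATE.sub(
        lambda match: SCRIPT_PARAM.sub("", match.group(0)),
        wikitext,
    )
```

Applied to the Japanese line from the diff, `strip_script_codes` keeps the `tr=` parameter and removes only `|sc=Jpan`.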

User:Surjection This looks like an edit filter of yours, can you comment? Maybe you can insert a comment into the edit filter itself explaining what abusive behavior it's trying to catch, because it's not obvious from looking at the code. The change in question by this IP is just removing some redundant script codes. Benwing2 (talk) 19:55, 24 June 2024 (UTC)[reply]

There used to be a Hungarian IP (or maybe Austrian, but they were adding Hungarian translations at least, IIRC) editing from various ranges that would often indiscriminately replace {{t}} with {{t+}}, when the entry didn't exist in the other Wiktionary edition (if that edition existed at all). I've disabled the filter for now, since it seems to only result in false positives now. — SURJECTION / T / C / L / 19:58, 24 June 2024 (UTC)[reply]

Edit request to MediaWiki:Mobile.css

Line 1: add margin-bottom: 0; and display: table; (results in a more efficient use of space within NavFrames)
Line 31: delete this whole block. I asked on User:Yair rand's talk page what he meant by this and never got a reply. As far as I can tell it does nothing aside from breaking the table layout.
All styles related to phab:T316670 can be removed once the issue is resolved, probably in the next few days (also applies to MediaWiki:Common.css).

Ioaxxere (talk) 16:14, 24 June 2024 (UTC)[reply]

@Ioaxxere Done. margin-bottom: 0; was already present in the block on line 1. I didn't remove the phab:T316670-related block (there's only one that I could identify), let me know when this patch goes through. Benwing2 (talk) 20:00, 24 June 2024 (UTC)[reply]

@Benwing2: Thank you! Sorry, in the first point I meant to have written margin-top: 0;. Ioaxxere (talk) 20:20, 24 June 2024 (UTC)[reply]

@Ioaxxere Like this? Benwing2 (talk) 20:22, 24 June 2024 (UTC)[reply]

Yep, that looks good. Ioaxxere (talk) 20:35, 24 June 2024 (UTC)[reply]

@Ioaxxere: Apologies for the delayed reply. I might be misremembering, but I think that that line was because we used to have collapsible tables work differently, and the tbodys were hidden by default and not expandable on mobile? In any case, it's presumably not needed anymore. --Yair rand (talk) 20:29, 14 July 2024 (UTC)[reply]

@Yair rand Thanks for the reply! Benwing2 (talk) 20:46, 14 July 2024 (UTC)[reply]

Collapsible content not working

I don't know if this is just me or if it's a wider problem, but collapsible content isn't behaving properly today. Quotes indented with #* under definition lines have disappeared completely, while inflection tables are stuck in either the open or the closed condition with no way to open the closed ones or close the open ones. Any ideas? —Mahāgaja · talk 11:05, 25 June 2024 (UTC)[reply]

Update: This only happens in the new Vector. When I switch back to Vector 2010, the problem goes away. —Mahāgaja · talk 11:27, 25 June 2024 (UTC)[reply]

@Mahagaja Is this mobile only (in which case it might very vaguely be related to the changes made yesterday in the preceding topic), or mobile + desktop, and which browser/OS? Does it happen if you switch browsers? User:Ioaxxere, any ideas? I know the Wikimedia developers are in the process of making some HTML changes. Benwing2 (talk) 20:44, 25 June 2024 (UTC)[reply]

Unfortunately I am not able to reproduce this on any device. But it shouldn't be possible for the quotations to disappear completely... maybe something in Vector 2022 is interfering with MediaWiki:Gadget-defaultVisibilityToggles.js. Ioaxxere (talk) 03:56, 26 June 2024 (UTC)[reply]

@Benwing2, Ioaxxere: Whatever the problem was, it's been resolved. Everything's back to normal today. For reference, it was happening on my desktop (I didn't even check mobile), using Firefox on Windows 10. Thanks for your help! —Mahāgaja · talk 06:32, 26 June 2024 (UTC)[reply]

Sections expand with setting off

It has always been this way for me that when I go to a Wiktionary page for a word, all sections (languages) expand, and I have to manually close them all until I reach the one I need. The "Expand All Sections" setting is OFF for me. It still does that. On my computer too. Is it possible to fix this or at least add a "Collapse all" button? Thanks. Stavats (talk) 09:33, 26 June 2024 (UTC)[reply]

@Stavats Hmmm, what skin are you using and what browser/OS? Also when you say "sections" are you referring to collapsible content like quotations, or the language sections themselves? In the latter case they are always open under the desktop; not sure about mobile. Maybe User:Ioaxxere has a comment, since they seem to be our resident HTML/CSS expert. As for collapsible content, for me under Vector 2010, there's a bunch of links in the left rail, and under the Tools section there's an "Expand all" link that opens all the collapsible content, and when you click on it, it changes to "Collapse all", which closes all the collapsible content. There's also a "Visibility" section that lets you separately expand or collapse certain sorts of collapsible content. Benwing2 (talk) 19:07, 26 June 2024 (UTC)[reply]

I was referring to the latter – the languages themselves. Is there any way for them to start off closed? It's really an unnecessary hassle to close them all Stavats (talk) 19:55, 26 June 2024 (UTC)[reply]

I'm using Bromite on GrapheneOS and Waterfox on a Macbook Air (MacOS) Stavats (talk) 19:56, 26 June 2024 (UTC)[reply]

Thank you for the kind words @Benwing2! To answer @Stavats's question, Wiktionary on mobile is specifically configured to expand all sections since phab:T63447. One can override this with some code in your Mobile.js:

// Close all L2 sections.
Array.from(document.getElementsByClassName("open-block")).forEach(L2 => L2.click());

However, all this is part of a larger issue of poor usability on mobile which {{minitoc}} aims to solve. Ioaxxere (talk) 22:43, 26 June 2024 (UTC)[reply]

I see. Thank you! So, for a noob, what is it specifically that I can do? How do I edit this Mobile.js file? Stavats (talk) 07:05, 27 June 2024 (UTC)[reply]

@Stavats Click on the 'Mobile.js' link above and it will bring you to your own user version of Mobile.js. Paste the above code into it. Should work. Benwing2 (talk) 07:23, 27 June 2024 (UTC)[reply]

It's not working, but maybe I did it wrong. Would you kindly check? Stavats (talk) 17:54, 27 June 2024 (UTC)[reply]

@Stavats I think you need to put it in User:Stavats/common.js. @Ioaxxere there doesn't appear to be a user-specific Mobile.js, and I have no idea how to conditionalize code in common.js to be mobile-only. Benwing2 (talk) 19:28, 27 June 2024 (UTC)[reply]
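
On making code in common.js mobile-only, one possible approach is to check the active skin. This is a minimal sketch, assuming the mobile site is served with the Minerva skin and that `mw.config.get("skin")` reports it as "minerva"; `isMobileSkin` is a hypothetical helper name, not something from this thread.

```javascript
// Hypothetical helper: decide from the skin name whether we are on the mobile site.
// Assumption: the mobile site uses the "minerva" skin.
function isMobileSkin(skinName) {
  return skinName === "minerva";
}

// On the wiki, `mw` is the MediaWiki global; outside a wiki page it is undefined,
// so the guard below simply does nothing there.
if (typeof mw !== "undefined" && isMobileSkin(mw.config.get("skin"))) {
  // Mobile-only code would go here.
}
```

Keeping the skin test in a small function makes it easy to change if the mobile skin name turns out to differ.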

@Stavats: Sorry, unfortunately there was a lot wrong with my initial code. Here is a solution which I can confirm does work: https://en.wiktionary.org/w/index.php?title=User:Ioaxxere/common.js&oldid=80529281 (ignore the two lines at the top). Ioaxxere (talk) 21:32, 27 June 2024 (UTC)[reply]

Thanks a lot! It works a little too well actually – hyperlinks no longer penetrate sections, so for example I had to find this thread by scrolling even after clicking the notification, and main page featured words don't go to their featured language. You get the idea Stavats (talk) 08:57, 28 June 2024 (UTC)[reply]

@Ioaxxere Can you fix this? Do you have access to the fragment/HTML anchor in JavaScript? If so you should be able to keep that L2 section open while the remainder get closed. Benwing2 (talk) 09:06, 28 June 2024 (UTC)[reply]

@Stavats, Benwing2: Try it now: https://en.wiktionary.org/w/index.php?title=User:Ioaxxere/common.js&oldid=80538933 Ioaxxere (talk) 18:03, 28 June 2024 (UTC)[reply]

This must have been a lot of work. Thank you. It works well with a non-bothersome side effect that whenever no # is specified, the first section opens automatically. I can live with that Stavats (talk) 18:59, 28 June 2024 (UTC)[reply]

Whoops, just noticed something else: When I go to a many-sectioned page like a, all of the text in some of the sections is replaced with "The time allocated for running scripts has expired". Starts at #Sassarese Stavats (talk) 19:05, 28 June 2024 (UTC)[reply]

@Stavats: Yes, I added that intentionally. The comments say: "If the URL contains a valid anchor, don't close the L2 containing that ID. Otherwise, don't close the first L2." Although, now that I've tested it I'm not sure that it's that useful (maybe it should only do that if the first L2 is "English"?). If you know some JavaScript you can try playing around with the code.

As for the issues on a, I can tell you that that page has been a problem for a *long* time. There's a discussion right above! #Splitting the largest entries on technical grounds? Ioaxxere (talk) 19:30, 28 June 2024 (UTC)[reply]
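
The rule quoted from the script comments above ("If the URL contains a valid anchor, don't close the L2 containing that ID. Otherwise, don't close the first L2.") could be sketched as a pure function. This is only an illustration and not the code from the linked revisions; `sectionsToClose`, `l2Ids` and `idsInSection` are hypothetical names, and collecting the IDs from the page is left out.

```javascript
// Hypothetical helper: given the IDs of a page's L2 (language) sections in page
// order and the URL fragment, return the IDs of the sections to collapse.
// `idsInSection` maps an L2 ID to the anchor IDs found inside that section.
function sectionsToClose(l2Ids, idsInSection, fragment) {
  let target = fragment.replace(/^#/, "");
  try {
    target = decodeURIComponent(target);
  } catch (e) {
    // Malformed percent-encoding: fall back to the raw fragment text.
  }
  // Find the L2 that is, or contains, the anchor named by the fragment.
  let keepOpen = l2Ids.find(
    (id) => id === target || (idsInSection[id] || []).includes(target)
  );
  // No valid anchor: keep the first L2 open instead.
  if (keepOpen === undefined) keepOpen = l2Ids[0];
  return l2Ids.filter((id) => id !== keepOpen);
}
```

In a browser the fragment would come from `location.hash`; restricting the fallback to the case where the first L2 is "English", as suggested above, would be a one-line change to the `keepOpen` fallback.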

Showing large numbers of regional alt forms, continued

I've been slowly taking notes on the various Polish dialects and subdialects (or rather dialect groups and dialects?), noting features, gathering dictionaries etc. One thing that I've been wondering about is how to show large numbers of alternative forms, since there are something like 30 odd subdialects.

I think a great way to show this is to create something akin to {{dialect synonyms}} (which is for synonyms) called {{dialect alternative forms}}. Basically I think we could have a map with the dialectal divisions drawn in, and then specific dialects and an alt form could be supplied, i.e. Kociewie=FOO|Kurpie=BAR and those respective forms would appear over their given region. I think we could even model this on {{picturedictionary}} somewhat. I might be able to scrape something together for Polish, but I'm assuming this template might be useful for other langs. One thing to consider is when dialects cross borders, i.e. w:Northern Borderlands dialect or Brazilian Polish. Vininn126 (talk) 12:07, 27 June 2024 (UTC)[reply]

It might also be possible to show dialect groups with this. Vininn126 (talk) 12:22, 27 June 2024 (UTC)[reply]

@Vininn126: Before re-inventing the wheel, look in a few Chinese entries to see what they use. Chuck Entz (talk) 14:58, 27 June 2024 (UTC)[reply]

@Chuck Entz I have. It's very impressive; it could potentially be serviceable for what I am aiming to do, but I still think something like this might be better, at least for Polish. Vininn126 (talk) 15:04, 27 June 2024 (UTC)[reply]

I think if you're going to use a map, you should also include the alt forms in list format, for consistency with other languages. Don't get me wrong, I think the map is a great idea, but certain kinds of comparisons are easier to make with a list. Andrew Sheedy (talk) 17:37, 27 June 2024 (UTC)[reply]

In theory generatable alongside. The map is better for Polish imo as many isoglosses can span multiple cardinal directions, so seeing it visually might make more sense for a 2d view, as opposed to a 1d view like for Chinese. Vininn126 (talk) 17:47, 27 June 2024 (UTC)[reply]

Perhaps you could have a toggle between map view and list view (unless that's too complicated). Andrew Sheedy (talk) 20:01, 27 June 2024 (UTC)[reply]

Also worth considering. Vininn126 (talk) 20:12, 27 June 2024 (UTC)[reply]

I can probably at least use this as a base for showing a list, as suggested below. Vininn126 (talk) 07:44, 28 June 2024 (UTC)[reply]

Another thing about this is that orthographical variation and such is smaller in Chinese. Vininn126 (talk) 14:05, 28 June 2024 (UTC)[reply]

Bot Request

Before we add parameter checking to any more templates, would someone please do a bot run to remove all pipes not followed by positional parameters? While I don't mind fixing the occasional stray error, I'm getting tired of constantly removing empty pipes by hand to fix "Parameter X is not used by this template" errors. Chuck Entz (talk) 14:55, 27 June 2024 (UTC)[reply]

@Chuck Entz This could potentially be done but if done in a blanket run for all templates I'd worry about templates (weird old ones, not implemented using Lua) where blank params are treated differently from missing params. And if done only for certain templates it would somewhat defeat the purpose. Benwing2 (talk) 00:51, 28 June 2024 (UTC)[reply]
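
For illustration, the text transformation such a bot run would apply might look like the sketch below. It is a rough, regex-based approximation meant for entry wikitext only (it assumes no `{{{1|}}}`-style parameter defaults and no wikitables inside template arguments), and `stripTrailingEmptyPipes` is a hypothetical name. It does not address the concern above about templates that treat blank and missing parameters differently; a real run would need a skip list for those.

```javascript
// Hypothetical helper: drop pipes that sit right before "}}" with nothing (or
// only whitespace) after them, e.g. "{{m|en|dog|}}" -> "{{m|en|dog}}".
// Empty parameters in the middle ("{{m|en||dog}}") are left alone, because
// removing them would renumber the later positional parameters.
function stripTrailingEmptyPipes(wikitext) {
  return wikitext.replace(/(?:\|\s*)+\}\}/g, "}}");
}
```

Named parameters with empty values, such as `|b=`, are not touched, since the pipe there is followed by a parameter name rather than by the closing braces.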

Edit Request to Module:languages/data/2

For the Cornish language entry (kw), can <ò> (o with a grave accent) be added to its entry_name? The SWF Specification does not use it for any reason other than marking pronunciation, as detailed below (emphasis added):

"The short equivalent of <oo> may be spelt as <ò> in dictionaries and teaching materials for learners to show that it is pronounced differently from the short equivalent of <o>."

For words written in SWF, it would make sense to automatically remove the grave accent from <o>. Kernewek Kemmyn opts to use <oe> for both the long and short <oo> sound in SWF, and Kernowek Standard uses <ù> (u with a grave accent) for a different purpose (see pùb) - this should be excluded. Người mang giấm (talk) 22:21, 27 June 2024 (UTC)[reply]
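
The requested behaviour (strip the grave accent from <ò> but leave <ù> alone) can be illustrated outside the module system with a small sketch. The real change would be made in the Lua language data, not in JavaScript; `cornishEntryName` is a hypothetical name used only here.

```javascript
// Hypothetical helper: compute a page name from a displayed Cornish (SWF) form
// by dropping the grave accent from <ò> only. Other accented letters, such as
// the <ù> used in Kernowek Standard, are left untouched.
function cornishEntryName(text) {
  return text
    .normalize("NFC") // fold "o" + combining grave into one code point first
    .replace(/\u00F2/g, "o") // ò -> o
    .replace(/\u00D2/g, "O"); // Ò -> O
}
```

Normalizing first means the decomposed form (o followed by U+0300) is handled the same way as the precomposed letter.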

A "newline" was added automatically to my edit and I didn't want it. (Mobile Editing)

A "newline" was added automatically to my edit and I didn't want it. How can I prevent this from happening again? It happened when I was editing with a mobile phone. Thanks! --Geographyinitiative (talk) 22:36, 28 June 2024 (UTC)[reply]

Were you using Visual Editor (not sure if that lets you edit templates), and were you editing the whole page or only the top unnamed section (not sure if the latter is possible in the mobile site)? — Eru·tuon 06:12, 29 June 2024 (UTC)[reply]

Thanks for your questions. I was using "Source editing", not "Visual editing" when I made my edit via mobile phone. I clicked the edit button at the top of the entry with the hope that I would be allowed to edit the whole entry, but I was presented only with the top unnamed section. At the moments immediately before and exactly when I pressed "Publish changes", there was no space. But the Wiktionary system seems to have added a "newline", and I didn't want it to. --Geographyinitiative (talk) 09:07, 29 June 2024 (UTC)[reply]

Category:Entries referencing etymons with invalid IDs

I've created a new maintenance category to track and hopefully fix uses of {{etymon}} which reference a nonexistent ID. If the category gets impractically large then we should look at splitting by the etymon language and/or other information (attested/reconstructed, whether the L2 exists at all, etc.). Ioaxxere (talk) 21:10, 29 June 2024 (UTC)[reply]

edit `kne` in Module:languages/data/3/k

Currently, it is only:

m["kne"] = {
	"Kankanaey",
	18753329,
	"phi",
	"Latn",
}

I request it to be edited to:

m["kne"] = {
	"Kankanaey",
	18753329,
	"phi",
	"Latn",
	entry_name = {
		Latn = {
			remove_diacritics = c.grave .. c.acute .. c.circ .. c.diaer,
		}
	},
	sort_key = {
		Latn = "tl-sortkey",
	},
	standardChars = { 
		Latn = "AaBbKkDdEeGgHhIiLlMmNnOoPpRrSsTtUuWwYy" .. c.punc, 
	},
}

Thank you! — 🍕 Yivan000 ^view_talk 13:08, 30 June 2024 (UTC)[reply]

Additionally, I also request to edit Module:headword/data, in data.hyphen_not_multiword_sep, to add this:

"kne", -- Kankanaey; hyphens for mid-word glottal stops

— 🍕 Yivan000 ^view_talk 13:23, 30 June 2024 (UTC)[reply]

Additionally, I request to edit Module:category tree/poscatboiler/data/lang-specific, to add this:

	["kne"] = true,

Without this, {{autocat}} doesn't work on this category. — 🍕 Yivan000 ^view_talk 03:43, 2 July 2024 (UTC)[reply]

Done. We're really not so good at doing edit requests on this wiki. Sorry about that. This, that and the other (talk) 04:08, 9 July 2024 (UTC)[reply]

using sort= in Template:affix

I added sort=flamen to sufflāmen, but it still shows up under S in Category:Latin terms prefixed with sub-. Is it not working the way it should, or am I missing something? Urszag (talk) 13:49, 30 June 2024 (UTC)[reply]

`{{etymon}}` creating PIE categories

@Ioaxxere, can you fix {{etymon}} on RC:Proto-Celtic/karants and RC:Proto-Celtic/karāyeti so it isn't adding it to categories CAT:Proto-Celtic terms derived from the Proto-Indo-European root *keh₂- (to desire) and CAT:Proto-Celtic terms derived from the Proto-Indo-European root *kéh₂-ro-s (loved one) like it is now? @Qwertygiy. --{{victar|talk}} 19:31, 30 June 2024 (UTC)[reply]

Noticing that it shows up on any page that has any PIE in the tree -- the tree I added to greenfinch is generating the category Category:English terms derived from the Proto-Indo-European root *(s)ping- (small bird) for example. Qwertygiy (talk) 23:35, 30 June 2024 (UTC)[reply]

@Qwertygiy Just FYI you should not use {{lb}} next to IPA pronunciations, because it results in a term like greenfinch getting added to 'American English', 'British English', 'Australian English', etc. Use {{a}} or better yet use the |a= parameter of {{IPA}}. Benwing2 (talk) 00:12, 1 July 2024 (UTC)[reply]

@Ioaxxere I see this is a result of you adding PIE categorization to {{etymon}}. This should be disabled and left to {{root}} to curate. @Benwing2, Mahagaja --{{victar|talk}} 00:05, 1 July 2024 (UTC)[reply]

@Victar: I've cleaned up a couple entries so that Category:Proto-Celtic terms derived from the Proto-Indo-European root *kéh₂-ro-s (loved one) is no longer added. But could you explain why you don't think the two entries in question should be in the category for *keh₂-? Ioaxxere (talk) 00:21, 1 July 2024 (UTC)[reply]

@Ioaxxere: They already are, where you'll find them in CAT:Proto-Celtic terms derived from the Proto-Indo-European root *keh₂-. It seems like what you're missing is that we only add definitions to root category names when we need to disambiguate. --{{victar|talk}} 00:57, 1 July 2024 (UTC)[reply]

@Victar: "We only add definitions to root category names when we need to disambiguate." My question is: is this a policy or just a matter of convenience? Since {{etymon}} already works with the IDs of each term, there's no extra effort in adding it to the category name. Ioaxxere (talk) 01:08, 1 July 2024 (UTC)[reply]

@Ioaxxere Victar is right that general practice is only to include parenthetical ID tags in the category name when there's more than one possible tag. It would cause a lot of headache to require that ID's be specified everywhere that a root is mentioned even if there's only one possible ID, and people would likely not be consistent, which would lead to fragmentation of the categories. Benwing2 (talk) 01:19, 1 July 2024 (UTC)[reply]
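
The naming practice described above (a parenthetical ID only when a root has more than one possible ID) can be written down as a small sketch. `rootCategoryName` and the `knownIds` list are hypothetical; where the list of IDs for a root would come from is not settled in this discussion.

```javascript
// Hypothetical helper: build a root category name, adding the parenthetical ID
// only when the root has more than one known ID (i.e. needs disambiguation).
function rootCategoryName(langName, root, id, knownIds) {
  const base = `${langName} terms derived from the Proto-Indo-European root ${root}`;
  return knownIds.length > 1 && id ? `${base} (${id})` : base;
}
```

With a single known ID, the Proto-Celtic entries discussed here would land in the plain *keh₂- category rather than in a separate "(to desire)" one.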

@Benwing2: Our categories are already inconsistent. Consider Category:English terms suffixed with -er, which is only about half-sorted by ID. I don't think we need to force people to use IDs but rather gradually transition towards IDs through manual and automated conversion of templates. Ioaxxere (talk) 01:29, 1 July 2024 (UTC)[reply]

@Ioaxxere That's not what I meant by inconsistent. What I meant was having different ID's with the same semantics. It's difficult enough as it is for people to know what are the correct ID's and use them. If every affix category required an ID even when there was no need for disambiguation, it would end up a big mess. Benwing2 (talk) 01:34, 1 July 2024 (UTC)[reply]

@Ioaxxere Some hypothetical roots don't even have a definition, and often root meanings are adjusted over time. It's not one size fits all but instead, as I put it, needs curation. --{{victar|talk}} 01:40, 1 July 2024 (UTC)[reply]

I'm not sure it's a bad thing to add a base category like CAT:Proto-Celtic terms derived from the Proto-Indo-European root *keh₂- automatically using something like {{etymon}}; or at least I need clear reasoning why it should need to be duplicately specified using {{root}}. It's rather the automatic adding of ID tags that I am objecting to. Benwing2 (talk) 01:46, 1 July 2024 (UTC)[reply]

"If every affix category required an ID even when there was no need for disambiguation, it would end up a big mess" is actually an issue I have more generally with how the etymon template works: I get that we need to avoid conflating unrelated etymons, but making an id mandatory in all circumstances is kind of a pain and it's often un-obvious what it should be. It's easier to use the suffix id system where they're only used for disambiguation and don't necessarily have to gloss the meaning.--Urszag (talk) 01:46, 1 July 2024 (UTC)[reply]

@Benwing2: What do you mean by "different ID's with the same semantics"? Each etymology ID is by definition a different root. Something has gone wrong if there are different IDs representing the same concept.

@Victar: IDs don't have to correspond to the definition, and it is very easy to change an ID name by bot across many entries.

@Urszag: The reason why it's good to specify IDs for single-etymology entries: we can't guarantee that the entry will always have a single etymology section forever. There shouldn't be a situation where someone adds a new etymology section to a random root and causes an unknown amount of descendant entries to break. In my opinion, robustness always beats convenience.

Ioaxxere (talk) 01:52, 1 July 2024 (UTC)[reply]

(edit conflict) It also would need to be able to distinguish between what is a root and what is a prefix. Right now, it can't even tell the difference between a root and a suffix, and is adding RC:Proto-Celtic/karants to CAT:Proto-Celtic terms derived from the Proto-Indo-European root *-rós (adjectival). --{{victar|talk}} 01:54, 1 July 2024 (UTC)[reply]

FWIW {{affix}} does that using ^ before something that looks like a prefix to signal that it's not a prefix. Possibly {{etymon}} could just "know" that PIE things that look like prefixes aren't, although that would be an issue if there are genuine PIE prefixes. Benwing2 (talk) 02:02, 1 July 2024 (UTC)[reply]

@Victar: I was discussing this issue on Discord just now. In your view, are Category:English terms derived from the Proto-Indo-European root *-trom and Category:English terms derived from the Proto-Indo-European root *-tus mistakes? Those categories have existed for years so I wasn't even sure. There are probably many others of this sort as well. Ioaxxere (talk) 02:04, 1 July 2024 (UTC)[reply]

I'm not Victar but IMO yes. These are suffixes not roots. Benwing2 (talk) 02:05, 1 July 2024 (UTC)[reply]

@Benwing2: In that case you might want to delete all of these. Also, I agree with you about Wikidata IDs: they're meaningless to humans and thus useless for (say) naming a category. Ioaxxere (talk) 02:16, 1 July 2024 (UTC)[reply]

Yes, those are mistakes. --{{victar|talk}} 02:07, 1 July 2024 (UTC)[reply]

Incidentally, {{root}} should probably also throw an error for anything that isn't like ^%*[pie_chars]%-$. --{{victar|talk}} 02:48, 1 July 2024 (UTC)[reply]
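
As a rough illustration of the shape check suggested above, here is the same idea as a JavaScript regular expression. The character class is an assumption: "pie_chars" is not defined in this thread, so it is approximated as "one or more characters that are neither whitespace nor hyphens". `looksLikeRoot` is a hypothetical name.

```javascript
// Assumption: "pie_chars" is approximated as any run of characters that are
// neither whitespace nor hyphens. A root then looks like "*" + chars + "-".
const ROOT_SHAPE = /^\*[^\s-]+-$/u;

// Hypothetical helper: true when the term has the shape of a root (leading
// asterisk, trailing hyphen, no other hyphens), false for suffixes and words.
function looksLikeRoot(term) {
  return ROOT_SHAPE.test(term);
}
```

Under this approximation *keh₂- and *(s)ping- pass, while *-rós, *-trom and *kéh₂-ro-s are rejected.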

But see Category:Terms by etymology subcategories by language, which has categories for non-PIE roots as well. -BRAINULATOR9 (TALK) 21:47, 1 July 2024 (UTC)[reply]

I've been manually removing suffixes and prefixes from {{root}} on entries and've found the vast majority were done by User:The cool numel. 😑 --{{victar|talk}} 22:39, 1 July 2024 (UTC)[reply]

@Ioaxxere I am with User:Urszag here. I don't think ID's should be required because there's no standard anywhere that defines what these ID's are. (Contrast {{tcl}}, which makes use of Wikidata ID's for the most part, which are pre-defined. Maybe there are Wikidata ID's for the different meanings of a given affix but I wouldn't count on it, and in any case Wikidata ID's are annoying in their own right because they're arbitrary numbers.) Benwing2 (talk) 02:04, 1 July 2024 (UTC)[reply]

The main problem I have with {{etymon}} in general is that it gives people an incentive to spell things out in black and white to the smallest detail even when the reality is rather gray and blurry. It's one thing to say that a term is derived from Proto-Indo-European. It's another to say that it derives from a specific root, when it's not entirely certain which root, or there's disagreement as to the form of the root. Then there are all the etymologies that say "ultimately from...", with several possible intermediary steps. You can have all the disclaimers and explanations you want, but a graphic illustration seems more real and more certain. Chuck Entz (talk) 02:31, 1 July 2024 (UTC)[reply]

Yes, I agree. I like the template, but I think certain users don't realise their own limitations when they're adding it to entries. You've got to know what you're doing, and if you only have the knowledge to add part of the chain, then only add part of the chain, instead of trying to go all the way up the earliest PIE roots by rote copying. Theknightwho (talk) 20:09, 1 July 2024 (UTC)[reply]

@Ioaxxere We need to figure out what to do about all these uncreated categories like CAT:English terms derived from the Proto-Indo-European root *-tós (verbal adjectives) that are filling up Special:WantedCategories. This particular category is quadruply problematic: (a) it wrongly says "root" instead of "suffix"; (b) it's not clear we want the ID tag; (c) the ID tag itself is wrong (it should say "verbal adjective" if anything, not the plural); (d) the category is full of randomness like fetish and creepypasta (do we really want a category for this?). Benwing2 (talk) 05:57, 1 July 2024 (UTC)[reply]

BTW for now I've told my bot script that auto-creates such categories to skip all categories containing "Proto-Indo-European root", but this is not a sustainable solution. Benwing2 (talk) 05:58, 1 July 2024 (UTC)[reply]

@Victar, Benwing2: I've updated the template in line with your comments. Benwing, you made some other points which are worth discussing but I'd prefer having these kinds of conversations over Discord where multiple people are able to chat in real time. Ioaxxere (talk) 17:30, 1 July 2024 (UTC)[reply]

@Ioaxxere: It's still creating unwanted categories, in the form of LANG terms derived from the Proto-Indo-European word TERM. Can you just disable that? Some editors don't even like using {{PIE word}}.

Did you tackle the prefix vs. root issue, and if so, how? --{{victar|talk}} 20:00, 1 July 2024 (UTC)[reply]

RC:Proto-Indo-European/priHyós is getting added to CAT:Proto-Indo-European terms prefixed with Reconstruction:Proto-Indo-European/preyH-. 🧐 I strongly think all categorization from {{etymon}} should be hidden while it's being worked out. @Benwing2 --{{victar|talk}} 20:50, 1 July 2024 (UTC)[reply]

@Victar: Why shouldn't Reconstruction:Proto-Indo-European/h₂yéwHō be in that category? Isn't it in fact a "Proto-Indo-European term derived from the Proto-Indo-European word *h₂óyu"?

Also, I have not tackled the prefix vs root issue. I don't think I should have to. It's only our crude category system that forces the template to care about whether an entry is a "root" or a "prefix" as if that should make any difference. We don't have categories for Category:English terms inherited from Middle English nouns or Category:French terms derived from Ancient Greek transitive verbs, but apparently PIE is a very special language where it is vitally important to separate Category:English terms by Proto-Indo-European root, Category:English terms by Proto-Indo-European word, and English terms derived from PIE prefixes and suffixes (they don't get a category for some reason). I'm starting to feel that trying to automate our current PIE categories is not worth the effort. Ioaxxere (talk) 06:23, 2 July 2024 (UTC)[reply]

@Ioaxxere CAT:Proto-Indo-European terms prefixed with Reconstruction:Proto-Indo-European/preyH- is a garbled category name. Also 'Foo terms inherited/derived from Bar' is fundamentally different from 'Foo terms derived from the Proto-Indo-European root/word *bar-' because the former derives only from a language while the latter derives from a specific root or word. Since the choice was made to do things this way, you can't simply make {{etymon}} categorize in an incompatible way. IMO if you don't want to be troubled to figure out how to do it right, don't do it at all. Benwing2 (talk) 06:48, 2 July 2024 (UTC)[reply]

@Ioaxxere I am not currently on Discord. Which other people are you wanting to participate in a real-time discussion? IMO real-time discussions aren't necessarily better than forums like this, because they don't allow people time to compose their thoughts properly. Benwing2 (talk) 06:50, 2 July 2024 (UTC)[reply]

@Benwing2: I've had conversations with several editors (I won't ping them all here) over Discord about how the template should work (some of which later transitioned into the BP), but it would be helpful for you to be able to give your input since you've designed so many templates in the past. By the way, I like the ^ idea for distinguishing roots from prefixes, although there's still the issue of handling the exotic affix types: infix, interfix, circumfix, simulfix; I think that's all of them. I'm holding off on categorizing affixes until I figure out what the syntax should be there. Ioaxxere (talk) 18:09, 2 July 2024 (UTC)[reply]

@Ioaxxere Let me see about getting on Discord. As for the "exotic" affix types: (1) in general we should keep things synchronized between {{affix}} and {{etymon}}; (2) circumfixes are handled using the syntax de- -en or whatever and shouldn't be an issue; (3) currently, {{affix}} treats anything written like -o- as an interfix and doesn't have a special syntax for infixes, although I was thinking of adding something like ^ma^ or ~ma~ (in place of -ma-); (4) what's a simulfix? BTW the dictionaries on Tagalog (which includes a lot of infixes) tend to use the syntax <ma> (with small less than/greater than signs) but this is confusable with spelled-as notation. Benwing2 (talk) 22:12, 2 July 2024 (UTC)[reply]
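
The hyphen conventions mentioned in this thread (a leading ^ to mark something that only looks like an affix, "de- -en" for circumfixes, "-o-" for interfixes) can be summarized in a small classifier. This is a sketch of those conventions as described here, not the actual logic of {{affix}} or {{etymon}}; `classifyAffix` is a hypothetical name, and infixes are not covered because no syntax for them has been settled.

```javascript
// Hypothetical helper: classify a term by its hyphen pattern.
function classifyAffix(term) {
  // A leading ^ marks something that only looks like an affix.
  if (term.startsWith("^")) return "non-affix";
  // Two hyphenated halves separated by a space, e.g. "de- -en".
  if (/^\S+- -\S+$/.test(term)) return "circumfix";
  const startsWithHyphen = term.startsWith("-");
  const endsWithHyphen = term.endsWith("-");
  if (startsWithHyphen && endsWithHyphen) return "interfix"; // e.g. "-o-"
  if (endsWithHyphen) return "prefix";
  if (startsWithHyphen) return "suffix";
  return "non-affix";
}
```

Note that a PIE root such as *keh₂- ends in a hyphen and so would be classified as a prefix unless it is escaped, which is the root-versus-prefix problem raised earlier in this section.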

Template:etymon for nonexistent entries added to other entries

I've run across a couple of entries where there are two instances of {{etymon}} in the same entry: one for the entry itself, and one for an etymological parent or child entry that hasn't been created yet: for instance, in an Irish entry, there's the one with the language code "ga" for Irish, and one with the language code "mga" for Middle Irish, or in an Old Irish entry, there's one with "sga" for Old Irish, and one with "mga" for Middle Irish. I assume this is so that both nodes will be shown on the tree.

Although it may seem harmless enough to have a template for an entry that will eventually be created on the same page because it shares the same spelling, I would like to ask that we not do that. It's not because it adds false positives to Wiktionary:Todo/Lists/Derivation category does not match entry language (mildly annoying, but I can live with it). The main problem is that it adds the page to categories as if the entry already exists, which is misleading: someone who is looking for, say, Middle Irish entries to create will see the page in Category:Middle Irish terms inherited from Old Irish and assume that the page has already been created. Also, someone looking in such a category who clicks on the link won't find an entry for that language on the destination page; the link is basically a lie.

What do others think? Chuck Entz (talk) 02:12, 1 July 2024 (UTC)[reply]

@Chuck Entz: This is very much an unintended way to use the template. You can remove all of the mismatched {{etymon}} uses. @Qwertygiy (who edited those entries): Please don't do this—if the term existed in Middle/Old Irish then just create the L2 and put the template under that. Ioaxxere (talk) 02:27, 1 July 2024 (UTC)[reply]
