Wiktionary:Grease pit

From Wiktionary, the free dictionary

Welcome to the Grease pit!

This is an area to complement the Beer parlour and Tea room. Its purpose is specifically for discussing the future development of the English Wiktionary, both as a dictionary and thesaurus and as a website.

The Grease pit is a place to discuss technical issues such as templates, Lua modules, CSS, JavaScript, the MediaWiki software, extensions to it, Toolforge, etc. It is also a place to think in non-technical ways about how to make the best free and open online dictionary of “all words in all languages”.

Others have understood this page to explain the “how” of things, while the Beer parlour addresses the “why”.

Permanent notice

  • Tips and tricks about customization or personalization of CSS and JS files are listed at WT:CUSTOM.
  • Other tips and tricks are at WT:TAT.
  • Find information and helpful links about modules, Lua in general, and the Scribunto extension at WT:LUA.
  • Everyone is encouraged to expand these pages, or to come up with more such stuff. Other known pages with “tips-n-tricks” are to be listed here as well.


July 2023

Bug: Links to Punjabi words with ݨ (U+0768) go to page with ن (U+0646)

I tried to make a link to the Punjabi word دشمݨ (duśmaṇ), but when I use a template it goes to دشمن (duśman). I've used a workaround here by quoting the word between [[ ]]. Then it works. But as you can see at the Punjabi page for ਦੁਸ਼ਮਣ (Gurmukhi script), the link to the Shahmukhi spelling doesn't work.

For some reason, links to Saraiki words with ݨ seem to work just fine (for example the link here to Shahmukhi spelling). Exarchus (talk) 14:15, 1 July 2023 (UTC)

@Exarchus There's evidently an entry in the Punjabi-specific language data that maps ݨ to ن. In general this sort of mapping happens for things like macrons (e.g. in Latin and Old English), stress marks (in Russian, Ukrainian and Belarusian), vowel diacritics (in Hebrew and Arabic), tone marks (in Serbo-Croatian and Slovenian), etc.; specifically, for extra marks that are commonly found in dictionaries and contain useful pronunciation information, but which aren't normally found in the spelling of the language as naturally used by native speakers. This must have been added intentionally for Punjabi, but that doesn't necessarily mean it's correct. Unfortunately I don't know anything about Punjabi, so I can't say whether we should remove this mapping. User:Theknightwho or User:RichardW57, do you happen to know? Benwing2 (talk) 00:31, 4 July 2023 (UTC)Reply[reply]
@Benwing2, Exarchus This was done based on what the native speaker User:نعم البدل (formerly User:Taimoorahmed11) said at User_talk:BajookThug#Punjabi_lemmas. Benwing's description is accurate:
ARABIC SMALL HIGH TAH ( ؕ ) U+0615 is an extra mark on ن /n/ to represent the retroflex nasal consonant /ɳ/. It isn't normally found in the Shahmukhi spelling of the language as naturally used by native speakers. Thus, page titles should contain ن instead of ݨ . The spelling with ݨ should be in the |head= parameter of the headword template and in {{m}} & {{l}} to indicate the retroflexion of the nasal consonant in the Shahmukhi script.
Kutchkutch (talk) 01:17, 4 July 2023 (UTC)Reply[reply]
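A minimal sketch of the mapping described above, written in Python rather than the Lua that the language data actually uses. The names `PUNJABI_ENTRY_NAME_MAP` and `make_entry_name` are hypothetical, and the table holds only the single substitution discussed in this thread; the real Punjabi data may well contain more.

```python
# Hypothetical entry-name table: character as displayed -> character in the page title.
PUNJABI_ENTRY_NAME_MAP = {
    "\u0768": "\u0646",  # ARABIC LETTER NOON WITH SMALL TAH -> ARABIC LETTER NOON
}


def make_entry_name(display_form: str) -> str:
    """Return the page title that a link showing `display_form` should target."""
    return "".join(PUNJABI_ENTRY_NAME_MAP.get(ch, ch) for ch in display_form)


# The link text keeps the retroflex marking; only the link target changes.
display = "\u062f\u0634\u0645\u0768"  # the spelling with U+0768, as shown to the reader
target = make_entry_name(display)      # the spelling with U+0646, used as the page title
```

This is why a link template lands on the ن page while a bare [[ ]] link, which skips the mapping, lands on the ݨ page.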
@Kutchkutch. Thanks. In that case the term دشمݨ should be moved to دشمن. Benwing2 (talk) 01:37, 4 July 2023 (UTC)Reply[reply]
@Benwing2: Moved that particular term. There are perhaps more instances of Shahmukhi Punjabi page titles that contain ݨ instead of ن such as پاݨی and کھاݨا. Is there a way to search for all such instances? Kutchkutch (talk) 02:58, 4 July 2023 (UTC)Reply[reply]
@Kutchkutch I think this search works: [1] Benwing2 (talk) 03:08, 4 July 2023 (UTC)Reply[reply]
@Kutchkutch I also did a search through CAT:Punjabi lemmas and CAT:Punjabi non-lemma forms and found these:
Benwing2 (talk) 04:02, 4 July 2023 (UTC)Reply[reply]
@Benwing2 Thanks! Kutchkutch (talk) 04:17, 4 July 2023 (UTC)Reply[reply]
Ok, thanks.
I had encountered a somewhat similar problem with Malayalam chillu letters, where ർ (U+0D7C) was encoded as a combination of characters (ര്‍). I had moved a few such entries, but thanks to your search query I can look for the others. Exarchus (talk) 10:27, 4 July 2023 (UTC)Reply[reply]
@Exarchus: I don't know what you've been doing, so I may well be telling you what you already know. Remember that the multi-character sequences for chillus remain valid even though now dispreferred. The entries should be moved or merged, but the hard redirects should be kept. (Keeping is the default for ordinary mortals at least.) I think we should also make that the policy for the multi-character representations even for the newer chillus - there are fonts that support them, so text using the combinations may well be being created even now. So far as I am aware, our search functions don't recognise the equivalence. --RichardW57m (talk) 17:37, 5 July 2023 (UTC)
You may find Category:Pages using discouraged character sequences useful; characters with the multicodepoint chillu encodings should wind up in it, courtesy of @Theknightwho. RichardW57m (talk) 17:43, 5 July 2023 (UTC)Reply[reply]
The problem was that links to the multi-character chillus didn't work. So I moved the multi-character pages to single-character pages, with redirects on the old ones. (And when there were already two pages, I merged them.) Exarchus (talk) 17:48, 5 July 2023 (UTC)Reply[reply]
@Exarchus That's correct - if you put multi-character chillus into {{l}} (or any other link template), it will automatically correct them to a single-character chillu whenever possible. That means we shouldn't ever have actual entries that use multi-character chillus.
The reason it's set up that way is because we can't easily stop random editors from putting multi-character chillus in links, and the most important thing is that those links go to the right places. Otherwise we'd be forced to create tons of redirects, and newbie editors would keep creating pages for them when they see redlinks. Theknightwho (talk) 18:00, 5 July 2023 (UTC)Reply[reply]
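The correction that link templates apply can be sketched roughly as follows. This is an illustration in Python, not the module code: `CHILLU` and `normalize_chillus` are hypothetical names, and the table covers only five of the older chillus, each written as base consonant + virama (U+0D4D) + zero-width joiner (U+200D). The newer, rarer chillus are left out.

```python
VIRAMA = "\u0d4d"  # MALAYALAM SIGN VIRAMA
ZWJ = "\u200d"     # ZERO WIDTH JOINER

# Base consonant -> atomic chillu code point (older chillus only).
CHILLU = {
    "\u0d23": "\u0d7a",  # NNA -> CHILLU NN
    "\u0d28": "\u0d7b",  # NA  -> CHILLU N
    "\u0d30": "\u0d7c",  # RA  -> CHILLU RR
    "\u0d32": "\u0d7d",  # LA  -> CHILLU L
    "\u0d33": "\u0d7e",  # LLA -> CHILLU LL
}


def normalize_chillus(text: str) -> str:
    """Replace each consonant + virama + ZWJ sequence with its atomic chillu."""
    for base, chillu in CHILLU.items():
        text = text.replace(base + VIRAMA + ZWJ, chillu)
    return text
```

A consonant followed by a virama without the joiner is left alone, since that is an ordinary conjunct-forming sequence rather than a chillu.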
@Theknightwho, Exarchus But do we not need tons of hard redirects, so that a search will find the entry whichever way the URL is typed? Unicode does recommend that the two encodings for the older chillus should be recognised as equivalent, so permanent hard redirects should be fine. --RichardW57m (talk) 14:27, 6 July 2023 (UTC)
@RichardW57m They won't hurt, but people tend not to type hard URLs. @Erutuon @Surjection Is there a way to do this with an automatic redirect? Theknightwho (talk) 14:29, 6 July 2023 (UTC)Reply[reply]
@Theknightwho, Exarchus, Erutuon, Surjection: We also need the mechanism to work for searches. --RichardW57m (talk) 14:34, 6 July 2023 (UTC)
@Theknightwho, @RichardW57m I found another issue with broken links: this time it was a page whose title seemed to have a zero-width no-break space at the front.
I see there are some Malayalam pages in Category:Pages using discouraged character sequences, but I don't immediately see what the problem is there. Exarchus (talk) 12:26, 7 July 2023 (UTC)
@Exarchus It might be worth us putting in an edit filter to prevent characters like that from being in page titles, as they're almost always due to people copy + pasting without realising.
The pages in that category all have links that contain discouraged character sequences (many of which are multi-character Malayalam chillus). They're probably in translation sections (for the English entries), etymology sections (for other Indic language entries), or could be anywhere in a Malayalam entry. Theknightwho (talk) 13:34, 7 July 2023 (UTC)Reply[reply]
@Exarchus: There were almost unmentionable pages like ആണ്‍ that were in the categories, with the multi-character chillu-encodings. (One can't easily access them with {{link}}, {{mention}}. Enclosing the name in double square brackets, and clicking the 'redirected from' breadcrumb on the resulting page gets one to them.) It seems they drop out of the category when they become redirects. I saved a list at User:RichardW57/discouraged and started fixing them by merging any content and making the pages with validly discouraged names into hard links. My criterion for 'validly discouraged' is almost that Unicode says treat them the same, but I also treat the multicharacter encodings of the newer (and much rarer) chillus as being equivalent to the atomic chillus, even though Unicode does not call for that. Basically, don't assume that only Malayalam uses the Malayalam script - I don't think Syriac and Sanskrit are the only other languages that use it. I found several pages with multi-character chillus that had more extensive content than the pages with atomic chillus. --RichardW57m (talk) 13:57, 7 July 2023 (UTC)
As @Theknightwho mentioned while I was typing, there are other links that promote inclusion in the category, which apart from translations are quite difficult to track down. --RichardW57m (talk) 13:57, 7 July 2023 (UTC)Reply[reply]
@RichardW57m I'll see if there's a way to subcategorise them by type without adding too much Lua strain, as it would make them much easier to find. Theknightwho (talk) 14:01, 7 July 2023 (UTC)Reply[reply]
In terms of validity, I think my original reasoning was that it's best for us to be consistent, as it obviates situations where we end up with duplicate entries. There is actually support for treating equivalent chillus as identical in MediaWiki's Malayalam-language version, in the same way it merges NFD page titles into NFC, though it isn't enabled by default (even if the wiki's language is set to Malayalam). It might be worth seeing if we can get it enabled for Wiktionary, as there are obvious benefits to it (e.g. redirects, automatic search support etc). Theknightwho (talk) 14:06, 7 July 2023 (UTC)Reply[reply]
@Exarchus: Initial and final ZWSP in page names are no-nos. We need to move and request expungement of the resulting hard link. Do we have a tool to convert ZWSP to in text? We need it as an easy sheep dip for scriptio-continua languages. --RichardW57m (talk) 14:29, 7 July 2023 (UTC)Reply[reply]
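The check that an edit filter of the kind suggested above would perform can be sketched like this. It is a Python illustration only: the names `INVISIBLE_EDGE_CHARS`, `has_invisible_edge` and `strip_invisible_edges` are hypothetical, and the character set is an assumption. ZWJ and ZWNJ are deliberately left out, because a legitimate title can end in one (for example a multi-character chillu sequence).

```python
# Assumed set of invisible characters that should never start or end a page title.
INVISIBLE_EDGE_CHARS = (
    "\u200b"  # ZERO WIDTH SPACE
    "\u2060"  # WORD JOINER
    "\ufeff"  # ZERO WIDTH NO-BREAK SPACE (byte order mark)
)


def has_invisible_edge(title: str) -> bool:
    """True if the title begins or ends with one of the invisible characters."""
    return bool(title) and (
        title[0] in INVISIBLE_EDGE_CHARS or title[-1] in INVISIBLE_EDGE_CHARS
    )


def strip_invisible_edges(title: str) -> str:
    """Suggested move target: the same title without the invisible edge characters."""
    return title.strip(INVISIBLE_EDGE_CHARS)
```

Invisible characters inside a title are not touched, since a ZWSP between words can be meaningful in scriptio-continua languages.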

Updating MOD:pl-IPA

@Catonif has graciously written the bulk of a new module for Polish IPA, as the current one is janky, inefficient, and sometimes weird to use. There are still a few steps to do, e.g. adding a more complete list of affixes or adding qualifiers for certain transcriptions, but it's more or less ready, and testcases can be seen here. One thing I would need help with is updating many of the respellings: anything with |fs=1 is going to need to be switched to |^= (I believe the exact markup is going to be {{pl-p|^}}, please correct me if I'm wrong). A big one is going to be using the new respelling system:

  1. Most morphemes have been "taught" to the module, so pages like przeświadczyć won't need any respelling (they do now). If the printed IPA is the same with and without the respelling, the respelling should be removed. This will also be true for words ending in -istka and -ystka and any of their declined forms.
  2. Some pages might still need a respelling, but it will be different, e.g. pochwycić, which currently needs po'chwy.cić but can now be respelled as po.chwycić.
  3. Multiword terms with prepositions (all listed in the module) will automatically have them cliticize; in the past we used - to cliticize them.
  4. On that note, we will still be using - to break palatalization, and those hyphens should be left alone.
  5. Some words will still have ' for forced stress, so, as above, if the IPA is different without the respelling then the respelling should stay. Vininn126 (talk) 19:58, 1 July 2023 (UTC)

I'd ultimately like to add a dialectal pronunciation, namely the Northern Borderlands dialect, however the exact implementation should be discussed between us Polish editors first and should be able to be done even after implementing the new module. @Benwing2 and (Notifying Hergilei, Tweenk, Shumkichi, Wrzodek, KamiruPL, BigDom, Hythonia, Tashi): . Vininn126 (talk) 19:58, 1 July 2023 (UTC)Reply[reply]

@Vininn126 (1) is not hard to implement, but for (2) I need specific instructions as to how to rewrite the respellings (or alternatively I can flag all pages that still have respellings after (1) is implemented, for you to fix, but there might be a lot of them). Benwing2 (talk) 21:04, 1 July 2023 (UTC)Reply[reply]
Also, it turns out we're gonna be tweaking pl-p, too. As to 2, it's mostly going to be changing chains of .' to a single ., particularly after the affixes listed in the module. Vininn126 (talk) 21:32, 1 July 2023 (UTC)Reply[reply]
@Vininn126 Still not quite sure what you mean. Maybe if you gave several examples it would help. Benwing2 (talk) 21:44, 1 July 2023 (UTC)Reply[reply]
@Benwing2 Basically there will be many words starting with u-, po- and o- that have respellings and we'd just need to check if there's a syllable breaker after them, replace it with . and remove any other syllable breakers and check if the IPA is the same. Vininn126 (talk) 21:49, 1 July 2023 (UTC)Reply[reply]
@Vininn126 Examples would really help. I am not at all familiar with how {{pl-p}} currently works or how the new template works. Whenever you do a change like this there are always a bunch of edge cases that have to be dealt with, so the more examples you can give, the better it will enable me to figure those out. Benwing2 (talk) 21:51, 1 July 2023 (UTC)
@Benwing2 uchwycić, pochwycić, and ochwycić are currently all respelled as u'chwy.cić, po'chwy.cić, and o'chwy.cić. Basically, this is needed when the next syllable has a consonant cluster and all three syllables are separate, as the module won't print syllable breaks otherwise. Ideally, with the new module, it would be just u.chwycić, po.chwycić, and o.chwycić, with one breaker at the morpheme boundary. Vininn126 (talk) 21:53, 1 July 2023 (UTC)
@Vininn126 Thanks. What about if there are more or less than three syllables? Do all syllable breaks except those after u-, o-, po- go away? Are there other prefixes that I need to pay attention to? Benwing2 (talk) 21:59, 1 July 2023 (UTC)Reply[reply]
@Benwing2 The number of syllables does not matter. It could be two or more. As for changing the respelling, I believe these should be the only affixes we are worried about. Vininn126 (talk) 22:02, 1 July 2023 (UTC)
@Benwing2 Actually, also na-. Vininn126 (talk) 10:59, 2 July 2023 (UTC)Reply[reply]
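The rewrite described in this exchange could be sketched as below. This is a Python illustration of the rule as stated in the thread, not bot code anyone has run: `PREFIXES` and `simplify_respelling` are hypothetical names, the prefix list is just the four mentioned here (u-, po-, o-, na-), and the sketch also drops any `'` stress markers in the stem, so every result would still need the "is the IPA the same?" check before saving.

```python
# Prefixes named in the thread; longer ones first so that "po" wins over "o".
PREFIXES = ("po", "na", "u", "o")
BREAKERS = ".'"  # syllable breakers used in the old respellings


def simplify_respelling(respelling: str) -> str:
    """Keep one '.' at the prefix boundary and drop all other breakers.

    Respellings that do not start with a listed prefix followed by a
    breaker are returned unchanged.
    """
    for prefix in PREFIXES:
        rest = respelling[len(prefix):]
        if respelling.startswith(prefix) and rest and rest[0] in BREAKERS:
            stem = rest[1:].replace(".", "").replace("'", "")
            return prefix + "." + stem
    return respelling
```

Hyphens are not touched, since `-` is still used to break palatalization under the new system.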
@Vininn126 Also what about (3) above? I need more info, thanks. Benwing2 (talk) 22:00, 1 July 2023 (UTC)Reply[reply]
@Benwing2 In multiword entries, prepositions will have a - after them, this should be replaced with a space. A list of prepositions can be found in the module. Vininn126 (talk) 22:02, 1 July 2023 (UTC)Reply[reply]
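A rough Python sketch of that replacement follows. The preposition set here is a hypothetical stand-in for the list kept in the module, and `decliticize` is an invented name. To avoid touching hyphens that break palatalization, the sketch only rewrites a hyphen when the page title has more than one word and the text before the hyphen is itself one of the title's words.

```python
import re

# Hypothetical subset; the authoritative list of prepositions lives in the module.
PREPOSITIONS = {"w", "z", "na", "do", "od"}


def decliticize(pagename: str, respelling: str) -> str:
    """Replace 'preposition-' with 'preposition ' in multiword respellings."""
    title_words = pagename.lower().split()
    if len(title_words) < 2:
        return respelling  # single-word entries are left alone

    def replace(match: re.Match) -> str:
        word = match.group(1)
        if word.lower() in PREPOSITIONS and word.lower() in title_words:
            return word + " "
        return match.group(0)

    # Only consider a hyphen that follows the first run of letters in a token.
    return re.sub(r"(?<!\S)(\w+)-", replace, respelling)
```

For example, `decliticize("w domu", "w-domu")` gives `"w domu"`, while a single-word entry respelled with a hyphen is returned as it was.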
@Vininn126 You said you'd need help with words with |fs=1. Can you give an example which uses that parameter as I can't think of anything right now. Btw. Can't a bot just change them all? Tashi (talk) 21:42, 1 July 2023 (UTC)Reply[reply]
@Tashi I pinged all the Polish editors to inform them of the change, a bot would replace fs=1. Vininn126 (talk) 21:49, 1 July 2023 (UTC)Reply[reply]
Hi @Benwing2. The update is not actually as ready as you may have been made to believe. For now what's more or less in place is the transcription into IPA, but the actual implementation of that into a working template with hyphenation and parameters like qualifiers, references, etc. is at its earliest stages. These things are traditionally handled by MOD:pl-pronunciation, for which I notify @Surjection as the original author, though there are some things which should probably be ported into the transcription module. For example, MOD:pl-pronunciation handles hyphenation, but some things which before required a respelling, like for example przechwycić because of its prefix, now are handled automatically by the IPA module through affix recognition, which means that the hyphenation may be better treated together with the transcription. Another example is the stress with -yka / -ika suffix (to see how it would visibly work, see the example gramatyka), handled by MOD:pl-pronunciation (this happening after Surj's original), though this shouldn't be the case, since a multiword term can contain an yka-suffixed word, and this should be handled accordingly. This would mean that qualifiers as well might need to be taken into consideration already in the transcription module. I'm hesitant on what is the best way to address these problems, especially since this would involve heavy changes to MOD:pl-pronunciation, which is still all Greek to me. I express some problems more thoroughly in the comments of the module. I'm not used to handling this kind of thing, so I'd be thankful for some help in the code when it comes to the heavily technical part, since I assume it won't be too different from the other languages' modules, so experience would seem to play a big role, and I wouldn't want to just cause a bigger mess for future editors to then untangle. Catonif (talk) 16:50, 3 July 2023 (UTC)
@Catonif I implemented something similar for {{es-pr}} and {{it-pr}}. It is handled all in one module in Module:es-pronunc and Module:it-pronunciation. You might want to take a look at the latter; the former is more complex due to handling multiple dialectal pronunciations. Essentially there's a function to generate the pronunciation itself, which is wrapped by code to handle the argument parsing, hyphenation/syllabification and display. By putting it together in one module, you can share things like the list of affixes. I was also able to share some of the hyphenation code since both the IPA generation and hyphenation generation have to do this; but there are some differences because one operates directly on the spelling and the other operates more or less on the IPA. I can help you with some of the coding although I seem to have a lot on my plate so I'm not sure how fast I can get to it. Benwing2 (talk) 20:05, 3 July 2023 (UTC)Reply[reply]
I mean, to be honest, if we can fix the hyphenation to recognize affixes, then we could in theory use this code just for transcriptions. Vininn126 (talk) 20:36, 3 July 2023 (UTC)Reply[reply]
@Vininn126 It's not quite that simple, e.g. you mentioned changing the handling of Cr (and Cl?) combinations; we'd need to make that change in the hyphenation code as well, along with any other changes in the IPA code that affect hyphenation. Benwing2 (talk) 00:21, 4 July 2023 (UTC)
@Benwing2 Yes, I forgot to include that, I meant "update the hyphenation". Vininn126 (talk) 10:19, 4 July 2023 (UTC)
@Benwing2 One of the last major things for this module is to convert the generated IPA strings into tables, which would allow for labels. Would you be able to take a look? There are a few other minor things but those should be easily handlable. Vininn126 (talk) 18:41, 10 July 2023 (UTC)
@Vininn126 Can you add a line to User:Benwing2/todo? I know some things on that list are years old but I'll make it a priority to take a look at the module. Benwing2 (talk) 21:09, 10 July 2023 (UTC)
@Benwing2 Done. Vininn126 (talk) 21:18, 10 July 2023 (UTC)
@Benwing2 I hate to be that guy, but have you had a moment? I'm very anxious and eager to deploy this as it should be much more efficient. Vininn126 (talk) 08:35, 18 July 2023 (UTC)
@Vininn126 I did take a look but it will require some significant work given that it has to work with {{pl-p}} or equivalent. I'll try to work on this over the next few days but it's a nontrivial task. Benwing2 (talk) 07:22, 19 July 2023 (UTC)
@Benwing2 we are likely replacing pl-p. Vininn126 (talk) 07:36, 19 July 2023 (UTC)
@Vininn126 Yes, what I mean is we have to write that replacement as part of this work. Benwing2 (talk) 07:37, 19 July 2023 (UTC)
I will be borrowing code from {{es-pr}} or {{it-pr}} but all of this takes effort. Benwing2 (talk) 07:38, 19 July 2023 (UTC)
@Benwing2 understood. Once that is done a few other smaller tasks can be handled and then I'll make a template and I will be asking for help replacing the old template, I've been thinking about how exactly to do this. Vininn126 (talk) 07:42, 19 July 2023 (UTC)
@Benwing2 ah, got it. Thanks for informing of the size of the task, that helps. I understand this is a huge project but I think it will be worth it, you've seen the mess that is Polish code. Vininn126 (talk) 07:39, 19 July 2023 (UTC)
@Vininn126 Yes. BTW I think as a first pass we should forget about Northern Borderlands or other dialects and just focus on the standard. We can then add dialectal pronunciations afterwards. Benwing2 (talk) 07:42, 19 July 2023 (UTC)
@Benwing2 yes, those are not the focus at the moment, but in terms of handling the standard I think most things are handled. NBD should be a simple task and I'm not worried about getting it taken care of right away, plus I think I'd want to add SBD as well. Vininn126 (talk) 07:45, 19 July 2023 (UTC)
@Benwing2 also just to be clear the main issue is conversion of strings to tables. Vininn126 (talk) 08:01, 19 July 2023 (UTC)
@Benwing2 Update: the replacement should be easier than we expected, because we've decided to remove syllable breaks except stress markers from the transcriptions, meaning that there should be many more pages where the only thing we care about is the placement of the stress marker. Do you have an estimate when you'll be able to look at it? Again no rush, just trying to figure out logistics. Vininn126 (talk) 15:23, 6 August 2023 (UTC)
@Benwing2 Actually, @Catonif says it might be better to keep the transcriptions as strings after all, so once a few more small changes are implemented, it might be ready to deploy. Vininn126 (talk) 09:20, 7 August 2023 (UTC)Reply[reply]
@Vininn126 OK. I won't be able to get to this immediately, maybe in a couple of weeks. Benwing2 (talk) 09:34, 7 August 2023 (UTC)Reply[reply]
@Benwing2 Okay. The biggest thing would be just replacing {{pl-p}} with the new template. Vininn126 (talk) 09:36, 7 August 2023 (UTC)Reply[reply]

Can we suppress the redundant second message generated on pages like instar?

The decl table at instar#Declension generates the message "Not declined; used only in the nominative and accusative singular, singular only." The relevant template code is {{la-ndecl|īnstar<indecl>}}. I wonder if it's three messages pieced together, of which we could suppress the last whenever it appears with the preceding message. I wasn't able to find anything in the code that seemed obvious, however. Thanks, Soap 09:15, 2 July 2023 (UTC)

Sometimes people add error text to entries, like this: Special:Contributions/2804:30C:1364:4E00:653E:A55:AB1E:7907. They seem to be doing it via Gadget-TranslationAdder-Data.js, which does not check whether a language code is valid or not. So this should be fixed so that it only allows valid codes (as determined from a JSON list that we have somewhere, apparently). Equinox 12:39, 2 July 2023 (UTC)Reply[reply]

To add to this, I think I've made this same mistake twice, once with Swedish (when I assumed it was se) and once with Old English (when I assumed it was oe). The first time, even though I could see the error message on the screen, I assumed that it would go away when I added the same word with the correct language code. But apparently it not only doesn't delete, it "chokes" on its own error message and that leads to a bigger and much messier error message. Essentially there is no undo function ... maybe that's just a limitation of the software, but if we could stop the text from being added in the first place, it wouldn't need to be undone later. (As for why I made the same mistake a second time? I just forgot about what happened the first time. I'm that way sometimes.) Soap 14:35, 2 July 2023 (UTC)
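The check being asked for amounts to validating the code before any wikitext is generated. The gadget itself is JavaScript; the sketch below is Python and only shows the logic. `KNOWN_LANGUAGES` is a hypothetical four-entry stand-in for the real JSON list of codes, and `check_language_code` is an invented name.

```python
# Hypothetical stand-in for the full JSON list of language codes.
KNOWN_LANGUAGES = {
    "sv": "Swedish",
    "ang": "Old English",
    "pa": "Punjabi",
    "pl": "Polish",
}


def check_language_code(code: str) -> str:
    """Return the language name for a known code, or refuse the edit."""
    code = code.strip()
    if code not in KNOWN_LANGUAGES:
        raise ValueError(f"unknown language code {code!r}; nothing was added")
    return KNOWN_LANGUAGES[code]
```

Rejecting the input up front means there is no error text in the entry to undo afterwards.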

Adding {{senseid}} functionality into {{lb}}?

Would it make sense to add the functionality of {{senseid}} into {{lb}} so that if the latter is already in use in an entry we could just type "{{lb|en|botany|id=botany}}" instead of "{{senseid|en|botany}}{{lb|en|botany}}"? — Sgconlaw (talk) 19:02, 2 July 2023 (UTC)Reply[reply]

I think the pros are outweighed by the cons.
Pros:
  1. Quicker to type.
  2. May encourage greater predictability of IDs.
Cons:
  1. More maintenance for template editors and template documentors.
  2. More difficult to search for senseids in large entries. Instead of just searching for 'senseid', one also has to look for '|id=' and then decide whether it applies to the sense or to a term linked to.
  3. (debatable) Counterintuitive. |id= normally refines targets, rather than defining them.
In short, I think the maintenance and use costs outweigh the quick gains when creating new sets of definitions. --RichardW57m (talk) 12:23, 3 July 2023 (UTC)Reply[reply]
Thanks. I don't have strong feelings either way, but thought it might be useful to add that functionality to {{lb}}. If |id= is thought to be a confusing parameter name, we could use |senseid= instead. Happy to hear other editors' thoughts on this. — Sgconlaw (talk) 14:24, 3 July 2023 (UTC)Reply[reply]
I like the idea (with |senseid= or |sense_id=). In many cases the senses will already have labels when linking. Maybe the label itself could (optionally) become the id, so you don't have to write {{lb|en|music genre|sense_id=music genre}} ({{lb|en|music genre|sense_id==}} ?). Jberkel 15:17, 3 July 2023 (UTC)Reply[reply]
@Jberkel: I'm thinking there might be situations where it makes sense, or one needs, to specify a different value for |senseid=—for example, if more than one sense has the same label. But we could certainly use the (first?) label as a default ID if, say, |senseid=1 is specified. — Sgconlaw (talk) 17:56, 3 July 2023 (UTC)Reply[reply]
This would also accelerate adding IDs, which is something I often do, especially since I like to use {{transclude sense}}. Vininn126 (talk) 15:20, 3 July 2023 (UTC)
"senseid" as the parameter would work. "sense_id" would not, because a simple search for "senseid" would not find it. I wasn't thinking of |id= as confusing, but just more work. But I suppose having more tasks to do is confusing in itself.
Does using labels as IDs work well for languages other than English? Some IDs for Pali looked bad when exposed in category names.
It would be good to hear maintainers' thoughts on this. --RichardW57m (talk) 17:14, 3 July 2023 (UTC)Reply[reply]
My first thought is that {{lb}} and {{senseid}} are logically two separate things so it makes sense to have two templates for them. However, if it's common enough to have identical labels and sense ID's next to each other, maybe we could create a combined template {{lbsenseid}} or something that creates a label and sense ID from the same tag. IMO if the sense ID tag is different from the label, we should just write {{senseid|en|FOO}}{{lb|en|BAR}}; clearer that way. Benwing2 (talk) 00:18, 4 July 2023 (UTC)Reply[reply]
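For what it's worth, the defaulting rule floated earlier in this thread (an explicit |senseid= value wins, and |senseid=1 falls back to the first label) is simple to state in code. The sketch below is Python and purely illustrative: `resolve_sense_id` is a hypothetical helper, and nothing like it exists in {{lb}} today.

```python
from typing import Optional, Sequence


def resolve_sense_id(labels: Sequence[str], senseid: Optional[str] = None) -> Optional[str]:
    """Pick the sense ID for a label line under the proposed rule.

    - no senseid parameter: no ID is generated
    - senseid == "1":       use the first label as the ID
    - anything else:        use the given value verbatim
    """
    if not senseid:
        return None
    if senseid == "1":
        return labels[0] if labels else None
    return senseid
```

So `{{lb|en|music genre|senseid=1}}` would behave like `{{senseid|en|music genre}}{{lb|en|music genre}}`, while two senses sharing a label could still be given distinct explicit values.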

Parameter nocat for {{causative of}}

@Benwing2: |nocat=1 appears to work for {{causative of}}. Could we please have its use sanctioned by being documented? I couldn't work out how to add the parameter to the template's documentation. For future use, if you just go and add it, please tell us how you did it. --RichardW57 (talk) 21:14, 2 July 2023 (UTC)

A literal definition for pācayant (having someone cooked) such as

  1. {{inflection of|pi|pācayati||present|participle}}, which is {{inflection of|pi|pacati||causative|t=to cook}}

appears to confess that pācayati is an inflection of rather than a causative of pacati (to cook), so I think I should replace it by

  1. {{inflection of|pi|pācayati||present|participle}}, which is {{causative of|pi|pacati|nocat=1|t=to cook}}

Without |nocat=1, the new definition would categorise pācayant as a causative verb. --RichardW57 (talk) 21:14, 2 July 2023 (UTC)Reply[reply]

@RichardW57 There are three entry points into Module:form of/templates. form_of_t is for templates that display arbitrary text before the lemma(s); this includes {{form of}} itself, as well as more specific versions like {{obsolete typography of}}. tagged_form_of_t is for templates that display a fixed set of one or more inflection tags before the lemma(s); this includes things like {{causative of}}. inflection_of_t is for templates that display a user-specified set of inflection tags before the lemma(s); this includes {{inflection of}} and certain variants of it like {{participle of}}. The latter two always accept |nocat=, because there may be categories generated internally by the inflection tags. The first one only accepts |nocat= if the |cat= invocation argument is given, which adds categories. The documentation for these is generated by {{form of/infldoc}}, which is well-documented; but it is missing support for the |nocat= param. (It is implemented by Module:form of doc; you can see around lines 174-177 where it adds the |nodot= and |nocap= params, but nowhere does it add |nocat=.) I need to add this. Benwing2 (talk) 21:47, 2 July 2023 (UTC)
@RichardW57 I should add, just today I added support to {{inflection of}} so you can add language-specific "base lemma" params, such as |comp-of= (for inflections of comparatives), |sup-of= (for inflections of superlatives) or (in this case) |causative-of=. So we can add |causative-of= as a base lemma param for Pali, and then you could write this:
{{inflection of|pi|pācayati||present|participle|causative-of=pacati<t:to cook>}}
and it will display similarly to above. Benwing2 (talk) 02:39, 3 July 2023 (UTC)Reply[reply]
@RichardW57 I went ahead and added support for |caus-of= for causatives for Pali. See Module:form of/lang-data/pi. Hence:
{{inflection of|pi|pācayati||pres|part|caus-of=pacati<t:to cook>}}
displays
present participle of pācayati, the causative of pacati (to cook)
Benwing2 (talk) 03:26, 3 July 2023 (UTC)Reply[reply]
I'm not sure that that display is correct, because of what may be alternative forms:
  1. Causative verbs come in pairs, in -eti and -ayati. I declare them lemmas, but treat them as alternative forms of one another - it seems that only the latter comes in the middle (ex-)voice. It can be argued that they are just a single verb with a multiplicity of forms. (For example, both forms form their own present active participles.)
  2. There can be other differences. For example, the first vowel could have been short, and there are causatives where both short- and long-vowelled forms exist. According to the PTS PED, bhindati (to break) has two synonymous second causatives bhindāpeti (to cause to be broken) and bhedāpeti - one from the present stem and one directly from the root. Overall, I prefer 'a causative' to 'the causative', but hesitate because 'a' implies there are others, and the first pair might be one verb rather than two. It gets worse with the past participle, whose voice depends on the semantics, which may have unpredictable multiple forms, far more often than English.--RichardW57 (talk) 08:03, 3 July 2023 (UTC)
Anyway, I look forward to the enhanced documentation, for which, thank you. --RichardW57 (talk) 08:03, 3 July 2023 (UTC)
How confident are you that the reader won't misread that it is the term, namely pācayant, which is the causative, rather than pācayati? Definitions should not be comprehension tests. I use 'which is' a lot in such definitions so as to dispel the interpretation of sameness of reference. I'm treading an awkward compromise between keeping the number of clicks low and the maintenance costs of maintaining duplicated (or worse, transformed) sets of definitions. Contemplate an entry for the Sinhala script dative singular of the term, which is what we actually record a quotation for, sitting on the page for the contracted causative පාචෙති (pāceti). The previous word in the quotation is the corresponding form from the simple, non-causative verb. (In this context, 'cooking' appears to actually refer to boiling alive in oil, though I haven't found the quotations for that.) RichardW57m (talk) 11:40, 3 July 2023 (UTC)Reply[reply]
@Benwing2 How are we to find documentation of language-specific tags for {{inflection of}}?--RichardW57m (talk) 15:57, 3 July 2023 (UTC)
It would be nice if you could add support for non-Devanagari Sanskrit and similar cases (e.g. Devanagari Prakrit) with links to both the same-script form (just in case, and for naturalness), and to the Devanagari form (for complete information, as the main lemma). With luck, this should make {{pi-nr-inflection of}} redundant, though it's not inconceivable that Pali could be harder as a degenerate case and also possessing multiple writing systems for several scripts. Khmer script examples could test a few things, such as different transliterations; it has such gems as potentially ambiguous gemination below repha, which seems fairly widespread in epigraphic Sanskrit and in Bengali-script Sanskrit of the Bengal Presidency.--RichardW57m (talk) 15:57, 3 July 2023 (UTC)Reply[reply]
Any immediate thoughts on how to handle different senseids for the term's script and the language's main script? I presume the usual case would be that they would be the same (or both non-existent), but that won't always be so. Perhaps |idmain= for the main script if different, with a parameter value of '-' encoding non-existence? --RichardW57m (talk) 15:57, 3 July 2023 (UTC)Reply[reply]
@RichardW57 You have written a lot of things here and I'm not sure I understand them all. Currently with |caus-of= and similar parameters, you can put multiple comma-separated base lemmas, each of which can have its own inline modifiers (which includes <id:...> for specifying the link ID). So you could write |caus-of=bhindāpeti,bhedāpeti or even |caus-of=bhindāpeti<t:to cause to be broken><id:some_id>,bhedāpeti<t:some other gloss><id:some_other_id>, etc. If there are multiple such base lemmas, they are separated using serialCommaJoin() in Module:table, which displays "FOO and BAR" if there are two, and "FOO, BAR, BAZ and BAT" etc. if there are more than two. The wording of the article preceding the tag can easily be changed from "the" to something else, or even made customizable if that would help. Documentation for language-specific tags of {{inflection of}} is still to come. As for the paragraph beginning "It would be nice if you could add support for non-Devanagari Sanskrit and similar cases ...", I don't understand this paragraph; it would be nice if you can give some examples and/or suggestions for {{inflection of}} syntax to support your use case. Benwing2 (talk) 21:44, 3 July 2023 (UTC)Reply[reply]
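The comma-separated base lemmas and the joining rule described above can be sketched as follows. This is only an illustrative Python model, not the actual code of Module:table or {{inflection of}}: the function names `split_lemmas`, `parse_lemma` and `serial_comma_join` are hypothetical, and the `<key:value>` modifier grammar is assumed from the examples in the post.

```python
import re

# Assumed modifier shape, taken from the examples above: <t:gloss>, <id:some_id>
MODIFIER = re.compile(r"<([^:<>]+):([^<>]*)>")

def split_lemmas(value):
    """Split a parameter value on commas that are not inside <...> modifiers."""
    parts, depth, current = [], 0, []
    for ch in value:
        if ch == "<":
            depth += 1
        elif ch == ">":
            depth = max(depth - 1, 0)
        if ch == "," and depth == 0:
            parts.append("".join(current))
            current = []
        else:
            current.append(ch)
    parts.append("".join(current))
    return [p.strip() for p in parts if p.strip()]

def parse_lemma(spec):
    """Return (term, modifiers) for one base-lemma spec."""
    term = MODIFIER.sub("", spec).strip()
    mods = dict(MODIFIER.findall(spec))
    return term, mods

def serial_comma_join(items, conj="and"):
    """Join as 'FOO and BAR' or 'FOO, BAR, BAZ and BAT', as described above."""
    if len(items) <= 1:
        return "".join(items)
    return ", ".join(items[:-1]) + f" {conj} " + items[-1]

specs = split_lemmas(
    "bhindāpeti<t:to cause to be broken><id:some_id>,"
    "bhedāpeti<t:some other gloss><id:some_other_id>"
)
terms = [parse_lemma(s)[0] for s in specs]
print(serial_comma_join(terms))
```

Tracking the bracket depth while splitting means that a comma inside a gloss such as `<t:to break, to split>` does not start a new lemma.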
You've got it the wrong way round. To stick to the closely linked pair of causatives for the example, pācayati and pāceti, commonly described as the uncontracted and contracted forms, what you're suggesting would be to give pācayant the definition
{{tl:inflection of|pi|pācayati//pāceti||present|participle|caus-of=pacati<t:to cook>}}
currently yielding
present participle of pācayatipāceti, the causative of pacati (“to cook”)
It's slightly jarring to claim 'pācayant' as the present participle of pāceti when it has its own synonymous participle pācent, but it gets even worse with pairs such as the synonymous double causatives bhindāpeti and bhedāpeti of bhindati, each of which has its corresponding uncontracted form in -ayati. However, with this construction for the present participle of some causative of pacati, 'the' works. Perhaps 'some' works generally in the place of the article!
A cleaner case of synonymous forms is causatives sãreti and sarãpeti of sarati (to flow), though the second is morphologically indistinguishable from a double causative, which it may be semantically for sarati (to remember).
How do I customise the article? --RichardW57 (talk) 05:19, 4 July 2023 (UTC)
@Benwing2:An example of the non-Devanagari Sanskrit is given by the entry for វុទ្ធាយ (vuddhāya), the noun definition is given as
{{sa-sc|pos={{inflection of|sa|बुद्ध|វុទ្ធ|dat|s|t=Buddha}}|term=बुद्धाय}}
yielding
Khmer script form of बुद्धाय (dative singular of វុទ្ធ (vuddha, “Buddha”))
There is no direct link to the Khmer form of the stem in this, only a manual indirect link via the alternative scripts section of the Devanagari lemma; |alt= is used by position. Additionally, there is no transliteration of the standard Devanagari form that is given, though I suppose that could be fixed via an edit to {{sa-sc}}. Also pinging (Notifying AryamanA, Bhagadatta, Svartava, JohnC5, Kutchkutch, Inqilābī, Getsnoopy, Rishabhbhat): . --RichardW57 (talk) 05:57, 4 July 2023 (UTC)
@RichardW57 pācayant is clearly the participle only of pācayati, so IMO pāceti or similar alternative forms shouldn't be mentioned at all in this inflection line. This is not a Pali-specific issue; many languages have alternative forms for verbs (and other parts of speech). Also the issue of having multiple scripts for a given language is not specific to Sanskrit or Pali; e.g. Serbo-Croatian can be written in either Latin or Cyrillic. However in both these cases I'm still having trouble understanding exactly what you *WANT* to have happen; you're complaining about potential issues without presenting solutions. If you give me the expected outcome I can help you figure out how to get there. Benwing2 (talk) 06:04, 4 July 2023 (UTC)Reply[reply]
Also you have written the code for {{pi-verb}} to create non-standardly-named categories like CAT:First conjugation Pali verbs and (even worse) CAT:irregular conjugation Pali verbs with lowercase initial letter. They should be named CAT:Pali first-conjugation verbs and CAT:Pali irregular verbs; the language name always needs to be first. Benwing2 (talk) 06:09, 4 July 2023 (UTC)Reply[reply]
Where is this word order mandated? --RichardW57m (talk) 11:24, 4 July 2023 (UTC)
I've almost renamed them and the old categories have emptied. The hyphen felt so unnatural that I accidentally omitted it. I've left the 'Pali irregular conjugation verbs' because one would naïvely think that Pali irregular verbs listed the irregular verbs. I'm wondering if I should entirely eliminate 'Pali irregular conjugation verbs' and put its sole member atthi in the first conjugation, as Warder does[1]. It fits the definition I gave for the first conjugation. Apart from stray remnants, it has the only athematic stem left that ends in a consonant. The concept of Pali 'conjugation' refers to how the present stem is formed from the root, and not how verb forms of the present system are formed from the present stem. It's far less relevant than the Sanskrit verb classes, and the division into seven conjugations is not universal. We don't get discussion of the classification for forms that are unclear on the surface, such as gaṇhati, which historically is fifth conjugation but as far as I am aware functions as third conjugation. --RichardW57m (talk) 11:24, 4 July 2023 (UTC)
If I can hazard a suggestion, I think you want to write something like
dative singular of វុទ្ធ (vuddha), the Khmer script form of बुद्धाय (buddhāya)
Then you mention the Devanagari form of the term itself under ==Alternative forms== rather than in the definition line. This sort of thing can be accomplished already by adding a base lemma param something like |khmer-of= that displays "Khmer script form" as the tag. This could potentially be automated so that rather than having 12 (or however many) base lemma params, one per script, you could have a single |alt-of= param whose tag displays the current page's script (or the lemma's script, which should be the same). That would require adding the ability for tags to be defined using arbitrary functions, which look up properties of the lemma. I can implement this if it would be helpful. Benwing2 (talk) 06:18, 4 July 2023 (UTC)Reply[reply]
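A tag computed by a function that looks up a property of the lemma, as proposed above, might look roughly like this. It is a Python sketch under stated assumptions: `script_of` and `alt_script_tag` are hypothetical names, and the script is guessed from the Unicode character name of the first letter, which is not how Wiktionary's script detection works.

```python
import unicodedata

def script_of(term):
    """Guess a script name from the Unicode name of the first letter.

    Assumption: the first word of the character name is the script,
    e.g. 'KHMER LETTER VO' gives 'Khmer'. Scripts with multi-word names
    (such as Tai Tham) would be truncated by this simple rule.
    """
    for ch in term:
        if unicodedata.category(ch).startswith("L"):
            return unicodedata.name(ch).split()[0].capitalize()
    return None

def alt_script_tag(term):
    """Build the displayed tag for a single |alt-of=-style parameter."""
    script = script_of(term)
    return f"{script} script form" if script else "alternative script form"

print(alt_script_tag("វុទ្ធ"))
print(alt_script_tag("बुद्ध"))
```

A lookup based on code points cannot separate writing systems that share a Unicode block, so any distinction of that kind would still have to be declared by hand.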
Why do you keep adding posts at the end of the topic, rather than after the post you're replying to, as automated by clicking on reply? Your practice is very confusing when the topic forks. You suggested giving an example or suggestions, so I gave an example and got some more sleep. --RichardW57m (talk) 12:50, 4 July 2023 (UTC)
What I had in mind for the solution was a layout conveying:
Khmer script form of बुद्धाय (buddhāya), dative singular of វុទ្ធ (vuddha), which is a Khmer script form of बुद्ध (buddha)
I would envisage this being done with a structure of {{sa-sc}}, {{inflection of}}. One has to be wary of commas, as in definitions they normally join adjacent meanings. It's very easy to go down a 'garden path'.
Now, for Pali I was usually able to combine a transliteration and 'Roman script form of' as a linked transliteration, e.g. ທັມມະ (damma, trainable); when different, I use an arrow to point to the Roman script form, with a basic glossed link format as in ທັມມະ (damma ⇨ dhamma, “dharma”). Similarly, one wouldn't want to give the transliteration twice when, as is usually the case, both the Devanagari and relevant Khmer script form of the stem or whatever were the same. If your suggested |alt-of= refers to the lemma being inflected, that just needs a few features to be addressed:
  1. Normally being derived automatically from the inflected lemma e.g. by |alt-of==
  2. Having its transliteration suppressed if it is the same as that of the inflected lemma after override, and we should have the facility to override it manually just in case. (Or usually? - what is the rule for showing the accent in the transliteration of Devanagari Sanskrit?)
  3. Applying it as an inline qualifier with the likes of |caus-of= - or is this too recursive?
  4. Luxury feature: selecting the article.
Should we try to make it snappier? Should we make |alt-of== be the default?
Note that for Pali we have a further simplification to make in the presentation because the transliteration and Roman script form usually coincide, even for Lao-repertoire Lao script Pali, i.e. when rejecting the Buddhist Institute's revivals/concoctions added in Unicode 12.0.
Note that Serbo-Croat is not a good example of multiple scripts - cannot one work entirely in Roman script or entirely in Cyrillic script? That is not so in other cases. --RichardW57m (talk) 12:50, 4 July 2023 (UTC)Reply[reply]
@RichardW57: Thank you for the layout example above. I still think it's better to put the first clause (Khmer script form of बुद्धाय (buddhāya)) somewhere else. Usually such information is put in the headword. This is what is done in Hindi and Urdu, for example. This will simplify the definition and avoid some of the "garden path" effects you mention.
As for the "few features to be addressed", I need examples of the features you're proposing. For example I don't understand what |alt-of== is supposed to do (#1), and I don't understand what "Applying it as an inline qualifier with the likes of |caus-of=" (#3) means. #2 (suppressing redundant transliterations) should be possible although I need to think about it. #4 (selecting the article) is very easy to implement, although I need to know whether you want this done in the actual {{inflection of}} spec (which means it needs to be done on a case-by-case basis) or you want it automated through some logic in the language-specific data.
You also ask about showing accents in Sanskrit. It seems the current practice is to put the accents in the translit when possible, although this requires manual transliteration in many cases. I know that Devanagari itself has the ability to add accent marks in it, so I'm not sure why we don't just put the accents that way and have them automatically transliterated, similarly to what's done for Russian, Ancient Greek, etc.
Finally, yes of course in Serbo-Croatian you can work purely in one script or another but I don't know why that isn't the case in Sanskrit and Pali as well. AFAIK, all (or most of?) the various scripts have the ability to completely represent the phonology of Sanskrit and Pali, just like Devanagari does, so you could theoretically work entirely in one script and ignore the others. Benwing2 (talk) 21:05, 4 July 2023 (UTC)Reply[reply]
Pali has about 18 writing systems that we acknowledge in the list of alternative forms, and Sanskrit about 28 scripts. Duplicating definitions across all these would be a maintenance nightmare until we moved to a database system. It would be even worse if we demanded confirmation for each sense in each writing system. And if we decided that a Sri Lankan sense should not be recorded in a writing system of Burma (Burmese, Burmese Mon, Thai Mon, Old Shan, New Shan, Tai Khuen, ...), it could get even worse.
The reply on phonology is rather long, but can be summarised as 'Don't be so sure', and the writing systems do not all have the same capability.
Last time I looked, standard-compliant Unicode Devanagari can only represent student-level Sanskrit phrases if all of yy, ll and vv are written as vertical stacks - it can't distinguish candrabindu applied to vowels and applied to consonants - you have to resort to the Latin script and use U+0310 COMBINING CANDRABINDU. Microsoft Unicode Devanagari can cope, but I don't know if HarfBuzz (which Microsoft Edge now uses!) yet supports the character sequences.
I'm not sure about the pitch marks that are beginning to show up in Roman script chanting books. Our local temple encodes them with IPA tone letters and then uses a special font. As far as I am aware, there is nothing corresponding to them in the non-Roman scripts. I've never seen the Thai script handle Vedic accents.
Notoriously, Lao-repertoire Lao script Pali, which is a real thing, only consistently represents Pali phonology that maps onto Lao phonology - it can't distinguish -ss- and -cch-, and it can't distinguish voiced and 'aspirated' voiced stops - it collapses each pair to a voiceless aspirate or sibilant with tone realisation rules to distinguish them from the voiceless aspirates and sibilants of the Pali of two thousand years ago. (I transliterate Lao script Pali: I don't transcribe it.) It's argued that no writing system completely captures the phonology of Pali of over two thousand years ago, which accounts for some of the vagaries of the writing of canonical Pali.
I've also found Thai-script Pali which does not transliterate easily - see attested kat'añjalin (waiing).
For Sanskrit, I don't know which scripts support jihvamuliya and upadhmaniya - IAST doesn't!
Notoriously, Bengali script Sanskrit does not distinguish <b> and <v>, and as you should have noticed today, Angkorian Sanskrit usually (but not always) wrote <v> for <b>. The latter is apparently because Malay (think w:Sri Vijaya) didn't have the appropriate contrast. --RichardW57 (talk) 22:54, 4 July 2023 (UTC)Reply[reply]
@RichardW57 Thank you for all this info. This is a tangent though and doesn't change what I said above about needing examples and proposing to put the Devanagari form in the headword. Benwing2 (talk) 23:12, 4 July 2023 (UTC)Reply[reply]
@Benwing2: One last tangent to address before I get to the main topic. Moving the script classification from {{pi-sc}} or {{sa-sc}} to the headword line should only be done with community agreement. For Pali, that is a bit of a problem, because we have very little communication and {{wgping}} is formally unfinished. I suggest we proceed on the basis that the statement is likely to be moved.
There are some complications with {{pi-sc}} that could cause complications with automatic movement for Pali:
  1. The declaration may declare writing system rather than script, e.g. ภาโว (bhāvo). The only automatic detection of writing system I am aware of is embedded with Module:pi-translit, and is intrinsically unreliable; transliteration in inflection tables relies on being told the writing system when it matters for transliteration. @Octahedron80 was keen on recording writing system, but
    1. I felt it more important to get the correct script automatically detected and declared, rather than being subject to copy, paste and forget to edit errors. Thank you, @Svartava, for automating script detection and transliteration in {{pi-sc}}.
    2. The different writing systems generally lack names rather than individuals' dubbings.
  2. The senses may legitimately declare different writing systems; again, see ภาโว (bhāvo) for an example.
Automatic change for {{sa-sc}} could hit a similar problem with Assamese 'script' (code as_Beng) v. Bengali 'script'. These Wiktionary-scripts are not easily distinguishable. Now, one can declare the script in both the headword invocation and {{sa-sc}}, so with luck the only problem will be headword and {{sa-sc}} being inconsistent. I think I tried putting both soft redirects under the same headword, found that the categories looked wrong, and split the entry, e.g. at অন্ধো (andho).
There's currently the issue that for Pali the headword line often simply uses {{head}} directly; often, using the apparently appropriate Pali headword line template would be deleterious. I'm currently fixing much of that as part of the process of enabling the disabling of the suppression of transliteration other than by override.--RichardW57m (talk) 11:37, 5 July 2023 (UTC)Reply[reply]
On second thoughts, I don't think we need to keep implicitly stating that the term is in the Khmer script. So, what I would suggest should do the job for this is:
{{inflection of|sa|វុទ្ធ|dat|s|alt-of=बुद्ध}}
yielding
dative singular of វុទ្ធ (vuddha) ⇨ बुद्ध (buddha)
The symbol '⇨' can be read as ', which in the main spelling for Wiktionary is', but without the parsing issues of lengthy English. There might be a better dingbat, conveying 'look here for full script-independent information'.
Now, for where the spellings in the two scripts are the 'same', we would want to simplify it, so I suggest
{{inflection of|sa|สุขินฺ||nom|s|alt-of==}}
yielding
nominative singular of สุขินฺ ⇨ सुखिन् (sukhin)
In 'alt-of==', the second '=' means the transliteration of the form being inflected into Wiktionary's main writing system for the script. The transliteration does not have to pivot through the Roman script.
I think it looks better with the sole example of the transliteration in final place.
For a more complicated system, but still just a one-step chain, for ទត្តាយ (dattāya, to the given one), we would somewhere have the information currently given by {{sa-sc}}, namely 'Khmer script form of दत्ताय', and the definition could go:
{{inflection of|sa|ទត្ត||dat|s|alt-of==|pastpasspart-of=ទទាតិ<t=to give><alt-of==><art=the>}}
generating
dative singular of ទត្ត ⇨ दत्त (datta), which is the past passive participle of ទទាតិ ⇨ ददाति (dadāti, to give)
I think the gloss should naturally get associated with the main script form, though the glosses for the form in the entry script and the main script could both be displayed.
In a first build, the entry script to main script can often be pivoted through Roman script with only occasional losses; these can be manually corrected via the value for |alt-of=. --RichardW57m (talk) 14:15, 5 July 2023 (UTC)Reply[reply]
The Thai-script example above is for entry สุขี (sukhī). The final example, for ទត្តាយ (dattāya), is entirely concocted to give a straightforward example of complexity. --RichardW57m (talk) 15:08, 5 July 2023 (UTC)Reply[reply]
What I want for article selection is customisation by language, so one can do a good fit by the language's tendency to have multiple forms at each derivation, and then overridable at each invocation of the template, as shown in the example with inline qualifier art. --RichardW57m (talk) 15:19, 5 July 2023 (UTC)Reply[reply]
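The two-level article selection requested here (a per-language default that each invocation can override) could be modelled as below. This is a hedged Python sketch: the table `LANG_ARTICLES`, the choice of "a" for Pali, and the function names are illustrative assumptions, not existing Wiktionary data or code.

```python
DEFAULT_ARTICLE = "the"

# Hypothetical per-language defaults; 'pi' -> 'a' is only an example value.
LANG_ARTICLES = {"pi": "a"}

def resolve_article(lang, override=None):
    """An explicit override (like the inline art qualifier) wins;
    otherwise use the language default, then the global default."""
    if override:
        return override
    return LANG_ARTICLES.get(lang, DEFAULT_ARTICLE)

def describe(lang, tag, lemma, override=None):
    """Render text such as 'the causative of pacati'."""
    return f"{resolve_article(lang, override)} {tag} of {lemma}"

print(describe("pi", "causative", "pacati"))
print(describe("pi", "causative", "pacati", override="the"))
```

The sketch does not switch between "a" and "an" for tags beginning with a vowel; a real implementation would need to handle that.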
@RichardW57m I really don't think the use of an arrow ⇨ is a good idea. On first glance I'd have no idea what that means. Better to spell it out in words. Benwing2 (talk) 19:37, 5 July 2023 (UTC)Reply[reply]
@Benwing2: As I suggested above, there are dingbats that would be better. How about ☞ or 🖛, possibly with a tool tip, such as 'see entry to the right for fuller script-independent information'. The obligatory commas break things up, whereas the notation should be a binding. --RichardW57m (talk) 08:15, 6 July 2023 (UTC)Reply[reply]
@RichardW57: I have the same concerns with all such dingbats, and tooltips are not a good solution because they don't work on mobile or for people with pop-up blockers. The text can read ", which is a Khmer-script form of" or just ", which is an alternative-script form of" or ", for which the main-script form is" or whatever. This is more verbose but much clearer. Benwing2 (talk) 08:20, 6 July 2023 (UTC)
@Benwing2: The complex example above then expands to:
dative singular of ទត្ត, for which the main script form is दत्त (datta), which is the past passive participle of ទទាតិ, for which the main script form is ददाति (dadāti, “to give”)
From that, I don't get any feeling that the transliteration applies to both forms, which is an immediate loss. The example with different spellings expands to:
dative singular of វុទ្ធ (vuddha), for which the main script form is बुद्ध (buddha)
and that is tolerable.
On second thoughts, perhaps 'entry' would be better than 'script form'; 'main script form' might be construed as hate speech. We can also condense 'for which the' to 'whose', so shortening the more complicated form to:
dative singular of ទត្ត, whose main entry is दत्त (datta), which is the past passive participle of ទទាតិ, whose main entry is ददाति (dadāti, “to give”)
It makes it clearer that the Devanagari is being referenced because that is how Wiktionary is organised. It would also work for languages in which Wiktionary's main script for a lemma is not predictable, such as Old Khmer. (Its crumbling morphology might not be suitable for avoiding duplication of senses down to derivatives.) --RichardW57m (talk) 09:01, 6 July 2023 (UTC)
@RichardW57 What about using brackets? Something like this:
dative singular of ទត្ត [main entry दत्त (datta)], which is the past passive participle of ទទាតិ [main entry ददाति (dadāti, “to give”)]
Benwing2 (talk) 01:21, 8 July 2023 (UTC)
@Benwing2: Better still. --RichardW57 (talk) 07:52, 8 July 2023 (UTC)
@RichardW57: All right, I'll see if I can add support for this. Basically, certain places that are now hard-coded need to be replaceable with an arbitrary function, which can implement the relevant logic. Benwing2 (talk) 18:48, 8 July 2023 (UTC)Reply[reply]
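The bracketed layout agreed on above, together with the earlier request to suppress a redundant transliteration, can be sketched as a small formatting function. This is a Python illustration only: `format_term` and `format_with_main` are hypothetical names, and the rule of dropping the entry-script transliteration when it equals the main-script one is an assumption drawn from the examples in this thread.

```python
def format_term(form, translit=None, gloss=None):
    """Render 'form (translit, “gloss”)', omitting whatever is missing."""
    pieces = [translit, f"“{gloss}”" if gloss else None]
    inner = ", ".join(p for p in pieces if p)
    return f"{form} ({inner})" if inner else form

def format_with_main(entry_form, entry_tr, main_form, main_tr, gloss=None):
    """Render an entry-script form followed by its bracketed main entry."""
    if main_form is None or main_form == entry_form:
        return format_term(entry_form, entry_tr, gloss)
    # Show the entry-script transliteration only when it differs.
    shown_tr = None if entry_tr == main_tr else entry_tr
    entry = format_term(entry_form, shown_tr)
    main = format_term(main_form, main_tr, gloss)
    return f"{entry} [main entry {main}]"

print(format_with_main("ទត្ត", "datta", "दत्त", "datta"))
print(format_with_main("វុទ្ធ", "vuddha", "बुद्ध", "buddha"))
```

Attaching the gloss to the main-script form follows the preference stated earlier in the thread that the gloss naturally goes with the main entry.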

──────────────────────────────────────────────────────────────────────────────────────────────────── @RichardW57 I have added handling of categories like CAT:Pali first conjugation verbs to Module:category tree/poscatboiler/data/lang-specific/pi. This allows you to just use {{auto cat}} in the definition and adds breadcrumbs, table of contents when the number of entries exceeds 200, and other nice features. This only works when the language name is first (as all other such categories are). I would recommend eliminating CAT:Pali irregular conjugation verbs one way or another. Note also that Sanskrit uses categories with names like CAT:Sanskrit class 3 verbs instead of CAT:Pali third conjugation verbs. Probably these should be harmonized. Benwing2 (talk) 07:54, 5 July 2023 (UTC)Reply[reply]

Harmonisation might cause confusion, as some works classify Pali verbs by the corresponding Sanskrit classes. The problems are usually alleviated by using Roman numerals for the Sanskrit classes, which I thought was the normal system when numbering them rather than naming them from representative roots. I think I can use Dhtm 103 to complete the justification of shoe-horning atthi into the irregulars. I also want to have at least a maintenance category for unclassified verbs, as a piece of editor-friendliness, rather than force guessing. --RichardW57m (talk) 10:11, 5 July 2023 (UTC)Reply[reply]
@Benwing2 Correction: 'into the first conjugation', not 'into the irregulars'. --RichardW57m (talk) 12:37, 5 July 2023 (UTC)
@Benwing2: I've opened up a more detailed discussion on Pali verb categories at category_talk:Pali verbs. RichardW57 (talk) 07:26, 6 July 2023 (UTC)

References

  1. ^ Warder A.K. (2001) Introduction to Pali, Oxford: The Pali Text Society, page 375

Fixing the Lua error

Hello, I tried to fix the Lua error in and create a separate page for the compounds of etymology 5 of this Chinese character, but Wiktionary said it's very harmful, even though creating separate pages to fix Lua errors isn't harmful. Greetings, Frozen Bok (talk) 12:42, 4 July 2023 (UTC)

RQ:Du Bois Souls of Black Folk 2nd ed

I can't determine what's malfunctioning with my attempt at {{RQ:Du Bois Souls of Black Folk 2nd ed}}.

There is a use at ceil#Verb that isn't displaying correctly at the moment.

Note: the Documentation page includes the chapter titles, and ideally the template would display the appropriate chapter title. The new template is designed to work with the scan-backed copy of the 2nd edition being proofread at Wikisource, and which is currently about 70% done. --EncycloPetey (talk) 19:53, 4 July 2023 (UTC)Reply[reply]

@EncycloPetey You're almost certainly missing a close brace or two. Benwing2 (talk) 20:46, 4 July 2023 (UTC)
That's gotten the basic functions to work (Thanks!) but it's still not linking to the chapter, and as I say, it would be nice to set things up to switch in the chapter name for display, but that's outside what I know how to do. --EncycloPetey (talk) 20:52, 4 July 2023 (UTC)Reply[reply]
Maybe User:Sgconlaw can help you with this; they are the expert here on quote templates. Benwing2 (talk) 21:06, 4 July 2023 (UTC)
@EncycloPetey: let me have a look at it later. By the way, can we merge this into {{RQ:Du Bois Souls of Black Folk}}, to be indicated using |edition=2nd? — Sgconlaw (talk) 22:20, 4 July 2023 (UTC)Reply[reply]
Perhaps, but the first edition does not exist at Wikisource; only the second edition does. So the manner of, and targets for, linking will be completely different for the first and second editions. One will involve wikilinks for both book and chapter, and using the page number following the hashtag to reach the appropriate page, while the other will involve pointing to an adjusted page from a PDF scan displayed at the Internet Archive, and linking the book title to its Wikipedia article instead of a work at Wikisource. Combining the two disparate templates would be overly, and unnecessarily, complicated. --EncycloPetey (talk) 22:26, 4 July 2023 (UTC)Reply[reply]
@EncycloPetey I agree with User:Sgconlaw here; unless the external interface needs to be significantly different, it would be better to combine the two templates. You can just put an {{#if:}} clause in the template code. Benwing2 (talk) 22:38, 4 July 2023 (UTC)Reply[reply]
As I just said, the external interface will be very different. --EncycloPetey (talk) 22:41, 4 July 2023 (UTC)
@EncycloPetey You mention wikilinks differing; this is part of the implementation, not the external interface. Here by external interface I mean the parameters to the template calls, which appear to be exactly the same for the two templates. Benwing2 (talk) 23:14, 4 July 2023 (UTC)Reply[reply]
Parameters aren't external nor are they interface. The external interface is the connection between what's happening inside the template and its relation to the space external to the template. That interface is completely different between the two editions, as is what the parameters would be required to do in terms of linking as part of that external interface. --EncycloPetey (talk) 23:20, 4 July 2023 (UTC)Reply[reply]
@EncycloPetey Regardless of your definition of "external interface", the params are the same so the templates should be combined. Benwing2 (talk) 00:44, 5 July 2023 (UTC)
And please ping me in your responses, thanks. Benwing2 (talk) 00:44, 5 July 2023 (UTC)
@EncycloPetey: If the same template can handle 2 different Chaucer manuscripts in facsimile and a 19th-century print edition, as is true for at least one Canterbury Tales template I was looking at recently, a single template can handle this. Chuck Entz (talk) 00:47, 5 July 2023 (UTC)Reply[reply]
@Chuck Entz I would be interested to see that template, so that I can determine whether this is an analogous case. Thus far, I do not believe it is, because the template would need to do completely different things with the parameters in each situation. --EncycloPetey (talk) 01:01, 5 July 2023 (UTC)Reply[reply]
The template is {{RQ:Chaucer Canterbury Tales}}. I misremembered: there was a manuscript, an early printed edition rather than a second manuscript, and the 19th-century edition- but that doesn't affect what I wrote.
The current situation is not analogous. The Canterbury Tales template is only linking to a page, and only to scans. The desired template here would link to chapters and pages at Wikisource, and the first edition scans could not link to chapters unless a set of interpretive values were added for that scan that first interpreted the chapter values using a preset table for comparison before outputting a link target. The Wikisource copy simply needs the chapter number dropped into a fixed structure to generate the link. That is, all Wikisource chapters are in the form s:The Souls of Black Folk (2nd ed)/Chapter N where N is a value from 1 to 14.
The Canterbury Tales template links only to Wikipedia articles for each item: title, chapter, etc., with just the page number linking to the source for the quote. I am proposing the chapter text generated by the Souls of Black Folk template link to the chapter at Wikisource, as well as having the page number link to the page, because doing so would be a relatively simple matter because of the standard formatting at Wikisource. That wouldn't be possible for linking to the first edition, which is a scan, and would involve a completely different set of code unique to scan linking. --EncycloPetey (talk) 02:18, 5 July 2023 (UTC)Reply[reply]
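The fixed Wikisource structure described above, with the chapter number dropped into a standard title and the page number following the hash, can be sketched like this. It is a Python illustration rather than template code: `chapter_target` is a hypothetical helper, and the exact anchor format after `#` is an assumption based on the description in this thread.

```python
BASE = "s:The Souls of Black Folk (2nd ed)"

def chapter_target(chapter, page=None):
    """Build a link target of the form '<BASE>/Chapter N', optionally with
    '#page' appended (anchor format assumed, see the lead-in)."""
    if not 1 <= chapter <= 14:
        raise ValueError("chapter must be between 1 and 14")
    target = f"{BASE}/Chapter {chapter}"
    if page is not None:
        target += f"#{page}"
    return target

print(chapter_target(1))
print(chapter_target(1, page=11))
```

Because the target is derived from the chapter number alone, no lookup table is needed for this edition, which is the contrast being drawn with linking into a page scan.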
@EncycloPetey We're talking about a single top-level conditional. If you think that is too complicated, it mostly just indicates you aren't comfortable with template coding. Benwing2 (talk) 03:44, 5 July 2023 (UTC)Reply[reply]
If the multiple and disparate editions are codeable, then by all means. If that will be the case, then I recommend a look at s:The Souls of Black Folk, which lists several of the early editions, with pub. dates and links to scans where I've been able to find them. --EncycloPetey (talk) 03:49, 5 July 2023 (UTC)Reply[reply]
  • @Benwing2 @Sgconlaw Looking at the quotations that use the first edition, I would say scrap the first edition function entirely, and use only the second edition at Wikisource. Our copy is cleaner than the scan linked for the first edition. See for example the quote used on oasis#Noun, where the first edition scan is missing a significant portion of the bottom left-hand page 11 that is used in the quotation. Given the poor quality of the scan used for the first edition, with missing and distorted text, it would make little sense to juggle two different sets of parameter implementation. A single template, using only the second edition, which has clean text at Wikisource, and which has a better quality scan backing it, would make the most sense. Both editions were published in the same year, just months apart, and from the same publisher, so it makes little difference for the date of a quote, and several of the essays in the book had been previously published elsewhere. --EncycloPetey (talk) 01:14, 5 July 2023 (UTC)
  • Note: The second edition is now fully transcribed on Wikisource. --EncycloPetey (talk) 21:23, 9 July 2023 (UTC)
    @EncycloPetey, Benwing2, Chuck Entz: I have updated {{RQ:Du Bois Souls of Black Folk}} so that it can link to both the Internet Archive and English Wikisource versions of the 2nd edition of the work. To quote from the English Wikisource version, specify |edition=2nd and |version=WS. Please try it out. (Regarding the use of the quotation template at oasis, the scan of page 11 of the 1st edition is a little distorted but there is no missing text.) — Sgconlaw (talk) 15:28, 11 July 2023 (UTC)Reply[reply]
    @Sgconlaw: Uh, yes, parts of words are missing in the first edition scan. Please look again at what is quoted, and then read that portion from the bottom of page 11. Part of the quotation is not there because the left hand gutter has devoured some of the text. --EncycloPetey (talk) 15:34, 11 July 2023 (UTC)Reply[reply]
    Also, the template is not doing anything that was part of the discussion. It is not displaying the chapter number with title given the chapter number, nor is it linking to the WS chapter and page where the quotation comes from. The Chapter should link to the WS chapter, and should give both number and title of the chapter. The Page numbers should link to the transcribed page in the Mainspace like this, not to the copy in the Page namespace. The Page namespace is used for proofreading, and can be accessed from the transcribed copy by clicking on a page number, but it is the working space of Wikisource, not the primary space. Someone accessing a quote that spans pages will only get part of the quote in the Page namespace, but will have access to the full quote in the Mainspace. --EncycloPetey (talk) 15:40, 11 July 2023 (UTC)
    @EncycloPetey:
    • Regarding the original version of the 1st edition, I see what you mean. I was looking at page 12, not page 11. The good news is that I have found a much better version of the 1st edition at the HathiTrust Digital Library and will upload it to the Internet Archive for use with the quotation template.
    • I have updated the page link as requested. (I originally linked to the Djvu version because that is the URL which shows up when you click on any of the page number links at Wikisource.) However, I do not think we need to display both the chapter number and the chapter name. Generally, if a chapter name is available we just display that, and if the chapters are not named we just display the chapter number.
    Sgconlaw (talk) 16:33, 11 July 2023 (UTC)Reply[reply]
    I've tested it, but the page link is not working. It's now giving an error message with {{{chapter}}} in the link. --EncycloPetey (talk) 16:38, 11 July 2023 (UTC)
    @EncycloPetey: ah, I forgot that |chapter= is now mandatory with the updated page link. OK, the template now generates an error message if the chapter is not stated:
    Sgconlaw (talk) 16:52, 11 July 2023 (UTC)Reply[reply]
    Now the link isn't working because it's generating https:/ twice at the start of the link. Please check that it works before pinging me again. --EncycloPetey (talk) 16:59, 11 July 2023 (UTC)Reply[reply]
    @EncycloPetey why so testy? It is working now. — Sgconlaw (talk) 17:14, 11 July 2023 (UTC)Reply[reply]

It's working now, thanks! One final question: would the reader expect the book title to point to the book itself, or an article about the book? For the Wikisource edition, it is possible to link the book title to the main page of the book on Wikisource. --EncycloPetey (talk) 17:20, 11 July 2023 (UTC)Reply[reply]

@EncycloPetey: we generally link the book title to a Wikipedia article if one exists. I don't think it's necessary to provide a link to the main page of the book source (whether the Internet Archive or Wikisource)—after all the page number (or, for the Wikisource version, the chapter name if the page number is omitted) already links to the source. If you're fine with the updated version of {{RQ:Du Bois Souls of Black Folk}}, I'll delete {{RQ:Du Bois Souls of Black Folk 2nd ed}}. — Sgconlaw (talk) 17:28, 11 July 2023 (UTC)Reply[reply]
I am fine with it, but I do think that, where a Wikisource copy of a text can be linked from the title, it should. The main Wikisource page will link to the corresponding WP article, if the reader needs additional context. As a reader, I expect to arrive at the work whose title I click, rather than a secondary description of that work; just as I would expect to arrive at the entry for a word I clicked on Wiktionary. Linking the author to their WP article makes sense to me, but not linking to a WP article about a text when it is the text of the work that is relevant for the quotation. The page link at the end (after an OCLC link) will not be the reader's first guess at where to arrive at the work being cited. --EncycloPetey (talk) 17:46, 11 July 2023 (UTC)
I've taken care of the deletion myself. --EncycloPetey (talk) 18:06, 11 July 2023 (UTC)Reply[reply]
@EncycloPetey: OK, great. My point about the title is that it seems redundant to have two external links to the same place, one at the title and the other at the page number or chapter name. Since there's already a link at the page number or chapter name, the title is more usefully linked to a Wikipedia article about the work as readers may want to learn more about it. — Sgconlaw (talk) 18:36, 11 July 2023 (UTC)Reply[reply]
I understand that a reader might want to know more about the work, and that's why Wikisource links to the WP article from the top page of its works. My point is the Principle of Least Surprise. If I pull something off the shelf labelled Moby Dick, I expect it to be the novel Moby Dick, and not an article about the novel. A reader clicking the link would reasonably expect to be taken to the novel, and finding instead an article about the book, might be disappointed. I believe that it is more reasonable for a reader to expect to be taken to the quoted work, rather than an article about the work. --EncycloPetey (talk) 18:45, 11 July 2023 (UTC)Reply[reply]

Trying to add an example

Hello, in i tried add important example needs to translate to the English and the example usage "白水着で泳いでみた" (i tried to swim in white swimsuit), it's not harmful. Frozen Bok (talk) 08:03, 5 July 2023 (UTC)Reply[reply]

@Frozen Bok You should be able to see the abuse filters you triggered here: [2] It seems the edit summary you used had keywords of a sort typically used by vandals. For the post a little ways up that triggered the abuse filter, it was because there were formatting issues in the page you created. Benwing2 (talk) 01:11, 6 July 2023 (UTC)Reply[reply]

heliotrope transcludes itself???

First among "Templates used in this preview:" is heliotrope itself.

This showed up on my watchlist. It is not the only entry on my watchlist bearing the message "this page is included within other pages".

How does that happen? Is it harmful or wasteful, in actuality or potentially? If it isn't, why is such a message made to appear? DCDuring (talk) 21:50, 5 July 2023 (UTC)Reply[reply]

Perhaps you mean "Special:WhatLinksHere/heliotrope"? I, too, wondered why that is happening. As far as I can tell, there is no link in the form "[[heliotrope]]" or "{{l|en|heliotrope}}" in the entry. — Sgconlaw (talk) 22:11, 5 July 2023 (UTC)Reply[reply]
Maybe some under-the-hood lua page parsing? For {{senseno}}? – Jberkel 22:27, 5 July 2023 (UTC)Reply[reply]
It's not new. I had noticed it for months, maybe years. It shows up in both "what links here" and the listing of templates used at the bottom of edit preview. It doesn't always show up in my watchlist. I have looked at a few words from non-Roman scripts and the self-transclusion seems to occur there as well.
I suppose a question is Why does it show up in my watchlist sometimes, but not others. DCDuring (talk) 23:59, 5 July 2023 (UTC)Reply[reply]
I notice that for Chinese entries there can be multiple transclusions of entries, which I have yet to see in other languages. DCDuring (talk) 00:03, 6 July 2023 (UTC)Reply[reply]
I also note that self-transclusion occurs at Wikipedia. Also, it seems to show up regularly for entries that are on my watchlist because they are in a category on my watchlist. So the question is:
  • How can "this page is included within other pages" be removed from my watchlist??? DCDuring (talk) 00:38, 6 July 2023 (UTC)Reply[reply]
@DCDuring If you preview word, you'll see listed under "Templates used in this preview" word itself as well as a smattering of Chinese pages and a few others. I suspect the pages listed here are those for which content is fetched by Lua code on the page. Currently, for example, if you link to a Chinese-language term, the transliteration module looks up the page contents of the term to fetch its transliteration. Presumably there is some code, run by most or all pages, that is fetching the page contents of the page itself. This sounds like something User:Theknightwho might have added, or they might know what is going on. Benwing2 (talk) 01:03, 6 July 2023 (UTC)
@DCDuring @Sgconlaw Tons of pages will show this, because viewing the raw contents of a page via a module counts as a transclusion of that page - even if none of it ends up being actually displayed. The headword module does various checks on the raw text of any page it's on (which makes it straightforward to add pages with issues to maintenance categories), so any page with a headword "transcludes" itself, even if there aren't any problems with it.
@Benwing2 Your suspicion is correct, as mw.getCurrentTitle():getContent() will trigger this, which is run by Module:headword/data to check for various things like manual uses of {{DEFAULTSORT:}}, as it messes around with automatic sorting and sometimes causes a (non-Lua) error to display if it overrides automatic sorting. It's in the "data" module so that it's only done once for the whole page. See Category:Pages with DEFAULTSORT conflicts, which shows that almost all of them are Japanese entries (as it used to get added by {{ja-new}} until recently). Theknightwho (talk) 01:21, 6 July 2023 (UTC)Reply[reply]
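To illustrate the kind of raw-text check described above, here is a minimal sketch in JavaScript rather than Lua. The function name and the regular expression are hypothetical and are not taken from Module:headword/data. The point is only that the check needs the page's own source text, and reading that source is what MediaWiki records as the page "transcluding" itself.

```javascript
// Hypothetical helper (not the real module code): look for a manual
// {{DEFAULTSORT:...}} in raw wikitext. In the Lua module the text would come
// from mw.getCurrentTitle():getContent(); here it is just a string argument.
function findManualDefaultSort(wikitext) {
  const match = /\{\{\s*DEFAULTSORT\s*:([^}]*)\}\}/.exec(wikitext);
  // Return the sort key that was set by hand, or null if there is none.
  return match ? match[1].trim() : null;
}
```

A caller could add the page to a maintenance category whenever the result is not null, without displaying any of the fetched text.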
Whatever the cause, however unavoidable the underlying phenomenon, the message on the watchlist is just cruft. How can we (by which I probably mean you) get rid of it? DCDuring (talk) 11:38, 6 July 2023 (UTC)Reply[reply]
[[ping|DCDuring}} Where does it say this? I looked on my watchlist and I don't see that message anywhere. Benwing2 (talk) 20:18, 6 July 2023 (UTC)Reply[reply]
@DCDuring Oops. Benwing2 (talk) 20:19, 6 July 2023 (UTC)Reply[reply]
It certainly doesn't consistently appear. Since I've been looking, all entries transclude themselves, but only sometimes does the message appear on my watchlist. Can't something be done using CSS to suppress it? I don't want to have to wait for the root cause to be solved next year. DCDuring (talk) 21:24, 6 July 2023 (UTC)Reply[reply]
I've never seen this before either. Can you post a link to a screenshot? This, that and the other (talk) 00:28, 7 July 2023 (UTC)Reply[reply]
What's a good place to upload a screenshot to? DCDuring (talk) 01:46, 7 July 2023 (UTC)Reply[reply]
Here's some examples from the six currently appearing on my watchlist:
  1. Category:Hot words newer than a year‎ 03:17 ‎Ioaxxere talk contribs block‎ (enshittification removed from category, this page is included within other pages)
  2. Category:Entries missing English vernacular names of taxa‎ 19:27 ‎Fay Freak talk contribs block‎ (عندم added to category, this page is included within other pages)
  3. Category:Entries using missing taxonomic name (species)‎ 19:24 ‎Fay Freak talk contribs block‎ (أي added to category)
  4. Category:English adjective-noun compound nouns‎ 19:14 ‎Sgconlaw talk contribs block‎ (evil eye added to category, this page is included within other pages)
Note that example 3 does not have the message, but the line is the result of the category being on my watchlist. DCDuring (talk) 01:57, 7 July 2023 (UTC)Reply[reply]
@DCDuring I couldn’t get it to show up for me, so I’m not sure how to diagnose the issue. I Googled “this page is included within other pages” and got a few results relating to wikis, but nothing I could make much sense of as it’s all buried in random code dumps. Theknightwho (talk) 02:37, 7 July 2023 (UTC)Reply[reply]
Oh I see, you have to turn on "Category changes" from the filter list to see it.
I suspect the thinking behind it is, if page A (think: a template) is transcluded into other pages, then those other pages will all be categorised into the category that page A was categorised into, therefore, the watchlist entry is incomplete, because more than just page A was added to the category. Hence the link to WhatLinksHere.
Not sure if there is anything we can do about it, to be honest... This, that and the other (talk) 04:14, 7 July 2023 (UTC)Reply[reply]
I don't get the logic. If many entries are transcluded into themselves, then those many pages will be (redundantly?) categorized into the same categories.
I had searched for "this page is included within other pages" in template, module, and mediawiki namespaces without result.
Isn't there CSS magic to suppress such things? DCDuring (talk) 12:28, 7 July 2023 (UTC)Reply[reply]
For a naive user like me it is not clear whether "this page" refers to the Category page or the page being categorized or decategorized. DCDuring (talk) 13:02, 7 July 2023 (UTC)Reply[reply]
@DCDuring They're not actually transcluded by any normal definition of the word, so there's no redundant categorisation going on - they're scanning their own page contents for certain problems, but none of it actually ends up getting displayed. Maybe the logic behind it that @This, that and the other suggests is simply outdated, as they never anticipated situations like this being counted as transclusion.
I suppose the reason it's useful for the software to keep track of it is because it signifies that a change to page X will affect page Y, or in the case of transcluding itself, that a change to the page might affect other things on the page in ways that are non-obvious. However, that doesn't change the fact that it's obviously not helpful for someone simply viewing their watchlist like you, as it's misleading. They should probably call it something else. Theknightwho (talk) 17:45, 7 July 2023 (UTC)
So, back to the main issue: how does one get this cruft eliminated from the watchlist of people like me, whatever the rationale for the misnomer?
For that matter, why not get rid of an entry's name on the list of
  1. "Templates used on this page" on bottom of the edit preview display
  2. items linked by transclusion to itself?
if an entry is not properly called a template or is not truly "used" by 'transclusion'?
Where in wikiworld does one go with such a complaint? DCDuring (talk) 18:51, 7 July 2023 (UTC)
@DCDuring Turning off category changes would obviously do it, but there's a clear downside to that. Unfortunately the things you suggest are determined by MediaWiki's software, and I don't think we have any control over them. You can make a feature request at the Phabricator (as I expect they wouldn't see this as a bug since it's technically doing what it's supposed to, even if it's not actually very helpful). Theknightwho (talk) 19:10, 7 July 2023 (UTC)Reply[reply]
Filed on Phabricator, but wouldn't CSS, if possible, be quicker. Who is a good CSS guy? DCDuring (talk) 14:00, 8 July 2023 (UTC)Reply[reply]
@DCDuring I am not a CSS expert but I don't think this is possible purely with CSS; it would have to be done with JavaScript (probably a one or two-line JavaScript action would do it but I am not familiar with how to do this). Benwing2 (talk) 21:04, 8 July 2023 (UTC)Reply[reply]
That's good. We apparently have more people claiming expertise in JS than in CSS. DCDuring (talk) 21:58, 8 July 2023 (UTC)Reply[reply]
Perhaps I wasn't clear. The logic that displays "this page is included within other pages" was clearly intended for when templates, modules, etc. are added to categories, as such categorisation changes will generally affect other pages besides the page that was actually edited. It's apparent enough that the logic was not intended to be activated for pages that are only transcluded by themselves, which is the situation we are facing here. And yes, the logic is in the MediaWiki software itself, which is why the text is not visible when searching this wiki. This, that and the other (talk) 02:30, 8 July 2023 (UTC)Reply[reply]
To be clear, it only shows up when the 'watch category' "Filter" is enabled on the watchlist, and not for any other watchlist item to which the same underlying logic applies. That makes it spurious where it appears. Whether the term transclusion is erroneously applied or simply misleading (possibly only to me) is secondary to my concern, which is merely with decluttering my watchlist and making watching categories better for all. DCDuring (talk) 14:07, 8 July 2023 (UTC)
Within the source for the watchlist items with the offending text is the following:
<a href="/wiki/Special:WhatLinksHere/neoracism" title="">this page is included within other pages</a>
HTH DCDuring (talk) 22:27, 9 July 2023 (UTC)Reply[reply]
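A rough sketch of the JavaScript approach suggested above, for a personal common.js. It assumes that the note is always an `<a>` element linking to Special:WhatLinksHere with exactly the text quoted in the HTML above, and that it is preceded by a comma in a plain text node. Neither assumption has been checked against the MediaWiki source, and both function names are made up for this example.

```javascript
// The note text is copied from the watchlist HTML quoted above.
const NOTE = 'this page is included within other pages';

// Pure helper: drop the note (and the comma before it) from a summary string.
function stripInclusionNote(summary) {
  return summary.replace(', ' + NOTE, '').replace(NOTE, '');
}

// Browser-only part: remove the matching links from the rendered page.
function hideInclusionNotes(root) {
  const links = root.querySelectorAll('a[href*="Special:WhatLinksHere/"]');
  links.forEach(function (link) {
    if (link.textContent.trim() !== NOTE) return;
    const prev = link.previousSibling;
    // Trim the ", " left at the end of the preceding text node, if any.
    if (prev && prev.nodeType === 3) {
      prev.nodeValue = prev.nodeValue.replace(/,\s*$/, '');
    }
    link.remove();
  });
}

// Only touch the DOM when one exists (i.e. when running in a browser).
if (typeof document !== 'undefined') {
  hideInclusionNotes(document);
}
```

This runs once when the script loads, so entries that the watchlist adds later (for example through live updates) would not be cleaned up without re-running `hideInclusionNotes`.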

Hidden Content Displayed

On Wenzhou in the Translation box, we see: "Wu: 溫州/温州 (1un-tseu-- !--)". This is because on 溫州's pronunciation box, we see "|w=1un tseu  !--|wz = ʔy33-1 tɕiɤu33-11 --". The obvious fix is to delete the hidden content. But what's more interesting or important is: can you prevent the Translation box from grabbing hidden content? Thanks! --Geographyinitiative (talk) 10:13, 6 July 2023 (UTC)Reply[reply]

This has to do with Module:zh-translit @Theknightwho. – Wpi (talk) 10:22, 6 July 2023 (UTC)Reply[reply]
@Geographyinitiative @Wpi Fixed.
  • It's because there's a pipe in the hidden comment, and Module:zh-translit uses pipes to know where the end of the parameter is.
  • There was also a minor bug with Wu romanisation where starting and trailing spaces would get converted to hyphens in addition to medial ones, so I've added a bit to remove any padding.
  • There may be a few instances where a page has reading XXX specified multiple times but one is entered as (e.g.) "XXX ", which would have been treated as different readings (causing a translit fail). Removing padding means that's no longer the case.
Theknightwho (talk) 11:14, 6 July 2023 (UTC)Reply[reply]
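The three points above can be sketched as follows. This is a JavaScript illustration only, not the Lua code in Module:zh-translit. The function names are hypothetical, and the sample string in the usage note is reconstructed from the Wenzhou example on the assumption that the hidden content was wrapped in `<!--` and `-->`.

```javascript
// Hypothetical stand-in for the parameter lookup.
function readParam(templateText, name) {
  // 1. Remove HTML comments first, so a "|" inside one cannot end a value early.
  const visible = templateText.replace(/<!--[\s\S]*?-->/g, '');
  // 2. Split on pipes and look for "name=value".
  for (const part of visible.split('|')) {
    const eq = part.indexOf('=');
    if (eq !== -1 && part.slice(0, eq).trim() === name) {
      // 3. Strip padding so "XXX " and "XXX" count as the same reading.
      return part.slice(eq + 1).trim();
    }
  }
  return null;
}

// Convert only medial spaces to hyphens; leading and trailing ones are dropped.
function hyphenateWu(reading) {
  return reading.trim().replace(/\s+/g, '-');
}
```

For example, `hyphenateWu(readParam('|w=1un tseu <!--|wz = ʔy33-1 tɕiɤu33-11 -->', 'w'))` gives `1un-tseu` with no stray hyphens. The sketch ignores nested templates and piped links, which a real parser would have to handle.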
@Theknightwho Thanks, but did this change cause the unusual change in Weidu's etymology section? Check it out- "Lua error in Module:zh-translit at line 42: attempt to index a nil value." --Geographyinitiative (talk) 11:46, 6 July 2023 (UTC)Reply[reply]
@Geographyinitiative Yep, my oversight - fixed. Theknightwho (talk) 12:06, 6 July 2023 (UTC)Reply[reply]

Lua global variable global_frame

@Theknightwho recently deliberately added the said global variable, global by design and used in Module:scripts. Its use floods the logs generated by Module:log globals (@Erutuon), and I suspect it will make the use of require("strict") impossible. How do we resolve this? --RichardW57m (talk) 13:13, 6 July 2023 (UTC)Reply[reply]

@RichardW57m I've removed this, as there are drawbacks to using globals and probably a better way to achieve the same goal. Theknightwho (talk) 13:17, 6 July 2023 (UTC)Reply[reply]

Delete accesskey pages

Hello! Since gerrit:930625 was merged a couple of weeks ago and has now been deployed, the following pages are no longer necessary – the MediaWiki software will still have the "c" access key even if these pages don't exist. In other words, they can safely be deleted without any functionality changing. The pages are:

Jon Harald Søby (talk) 13:27, 6 July 2023 (UTC)Reply[reply]

 Done This, that and the other (talk) 08:51, 7 July 2023 (UTC)Reply[reply]

Missing entries list

Hi. How easy would it be to generate a list of the most wanted red-links in the Appendix:Moby Thesaurus II subpages? @Erutuon, @This, that and the other @DTLHS. Would anyone like to do this for me? Not the famouse (talk) 09:33, 8 July 2023 (UTC)Reply[reply]

@Not the famouse First 5000 here: User:This, that and the other/Moby Thesaurus wanted/1 Most are SOP. This, that and the other (talk) 10:47, 8 July 2023 (UTC)Reply[reply]
Awesome! There are plenty of Wiktionary-worthy entries mixed up in there too. I just made a couple of dozen. This en.wiktionary seems like a fine place to be Not the famouse (talk) 12:26, 8 July 2023 (UTC)

"list" article broken on mobile site due to CSS shenanigans

As described on the talk page of the list article and the Reddit post where this bug was discovered, the "list" article is broken on the mobile site. That's to say the layout is broken and scrolling is impossible. The reason is that Wiktionary includes the name of the article in the name of an HTML class, and that name, in the case of the "list" article, collides with a pre-existing CSS class. Alpatron (talk) 18:09, 9 July 2023 (UTC)Reply[reply]

There's nothing we can do to fix it (it affects all Wiktionaries). It has to be reported to the WMF devs, but don't mention Wiktionary or they'll probably close the ticket as "won't fix" with immediate effect. — SURJECTION / T / C / L / 19:46, 9 July 2023 (UTC)Reply[reply]
It seems there already is a ticket. As expected, it's been mostly ignored. — SURJECTION / T / C / L / 20:57, 9 July 2023 (UTC)Reply[reply]
Any ideas why Wiktionary issues are ignored by WMF devs? ‑‑ Eiríkr Útlendi │Tala við mig 21:29, 18 July 2023 (UTC)Reply[reply]
We don't have enough page visits? DCDuring (talk) 22:22, 18 July 2023 (UTC)Reply[reply]
Most MediaWiki-related tasks (bugs) in Phabricator get ignored; I don't think it is especially connected to Wiktionary. There are simply insufficient WMF and volunteer resources to address them all. Perhaps what we lack relative to other projects, like Wikipedia, are local volunteers with knowledge of the MediaWiki codebase and the time to help fix the bugs ourselves. I have the former but not the latter. This, that and the other (talk) 00:56, 19 July 2023 (UTC)Reply[reply]
I put a local workaround in place for now. This, that and the other (talk) 07:25, 11 July 2023 (UTC)Reply[reply]

Wiki page for 핛

I was making a wiki page for a Korean syllable; please help me. QTReader64058355 (talk) 02:12, 10 July 2023 (UTC)

@QTReader64058355 What is the issue? Benwing2 (talk) 19:39, 11 July 2023 (UTC)Reply[reply]

Request for auto-cat-ization of Japanese jōyō kanji cats

Would it be possible to auto-cat-ise Cat:Japanese terms spelled with jōyō kanji, Cat:Okinawan terms spelled with non-jōyō kanji‎, etc? These hierarchies are apparently required for all Japonic-script languages. This, that and the other (talk) 09:25, 10 July 2023 (UTC)Reply[reply]

Yes, this can be fixed by adding to Module:category tree/poscatboiler/data/lang-specific/jpx. Let me see what I can do. Benwing2 (talk) 23:32, 10 July 2023 (UTC)Reply[reply]
I have implemented the underlying code and changed the relevant Japanese categories to use {{auto cat}}. The same needs to be done for the other languages. Benwing2 (talk) 00:26, 11 July 2023 (UTC)Reply[reply]
@Benwing2 you're a legend! Thank you. This, that and the other (talk) 01:45, 11 July 2023 (UTC)Reply[reply]

Heading 1 on the entry for constructor currently displays as "function Object() { [native code] }". Voltaigne (talk) 18:25, 10 July 2023 (UTC)Reply[reply]

@Erutuon, This, that and the other I still see this. It's at the very top, where the title ought to be. Some JavaScript code is changing the title but I don't see any recent changes to our JavaScript code so I wonder if this is a MediaWiki bug. Benwing2 (talk) 22:20, 10 July 2023 (UTC)Reply[reply]
That's hilarious! Hopefully I fixed this in [3], but we will need to wait a few minutes to see the change. This, that and the other (talk) 01:59, 11 July 2023 (UTC)Reply[reply]
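The thread does not say what the fix was, but the string "function Object() { [native code] }" is what JavaScript typically produces when a plain object used as a lookup table is indexed with the key "constructor", since every plain object inherits that property. A minimal sketch of that pitfall, with a hypothetical override table and function names invented for the example:

```javascript
// Hypothetical table of display-title overrides, keyed by page name.
const overrides = { 'iphone': 'iPhone' };

// Buggy: for page "constructor" this finds the inherited Object constructor
// function rather than "no entry", and stringifies it.
function unsafeTitle(page) {
  return String(overrides[page] || page);
}

// Safer: only accept keys that the table itself defines.
function safeTitle(page) {
  return Object.prototype.hasOwnProperty.call(overrides, page)
    ? overrides[page]
    : page;
}
```

Using a `Map`, or an object created with `Object.create(null)`, avoids the inherited keys altogether.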

Auto-handling prefix and suffix alt forms

Many languages have alternative forms of prefixes and suffixes ("variants" per User:RichardW57). Currently the {{affix}}/{{prefix}}/{{suffix}} handlers aren't very smart about this, and so e.g. if a Finnish term ends in -käs instead of -kas, the etymology specified using {{affix}} or the like either needs to write {{af|fi|foo|-kas}} (despite the term's spelling) or it needs a piped link something like {{af|fi|foo|[[-kas|-käs]]}}, or a display form something like {{af|fi|foo|-kas|alt=-käs}}. I'm thinking of making this smarter; this would entail adding language-specific data modules to Module:compound. There is precedent for this, since we have language-specific data modules for {{lb}} and {{infl of}}. This would also allow e.g. the category Category:Finnish terms suffixed with -kas to auto-display the alternative forms in the category description. Thoughts? Any other ideas? E.g. it would be nice to reduce the burden of specifying |id=, but I'm not quite sure how to do it. Benwing2 (talk) 20:02, 11 July 2023 (UTC)Reply[reply]

For editors, it would be nice to have a documented option for displaying {{senseid}}, {{etymid}} and |id=. It would be off by default, and information on how to add its capability to templates and functions would not be restricted to the cognoscenti. (I suggest documentation at, or linked to from, the documentation of {{senseid}}.) As I edit with inflection tables displayed, I can't easily check that it is working in simple cases, as |id= doesn't work with collapsed tables expanded, at least not on Firefox. --RichardW57 (talk) 05:48, 12 July 2023 (UTC)
@RichardW57 As is unfortunately often the case, I don't quite understand your ask here. Can you clarify what you mean and give some examples? Benwing2 (talk) 05:53, 12 July 2023 (UTC)Reply[reply]
Users can customise their displays. If the option is enabled, {{senseid|cy|frog}} could, at its simplest, display as 'senseid=frog. '. For the display of |id=, an elegant method would be to display the ID in the same way as |tr= and |ts=, perhaps resorting to the visible prefix 'id='. An adequate method for |id=toad would be to resort to adding ' (id=toad)' after the link. I believe all this might use a class in HTML whose layout in the stylesheet depended on the user's settings. Now, it may be that an implementation in Module:links will actually enable the display of ID in all templates, but if not, I would want other implementations (perhaps using square braces and a knowledge of the fragment coding convention) to be able to add conditionally hidden text, with display controlled by the same switch. Adding it should not depend on your being around (and willing) to help. --RichardW57 (talk) 06:26, 12 July 2023 (UTC)
@RichardW57 Can you add an entry to User:Benwing2/todo? Benwing2 (talk) 06:31, 12 July 2023 (UTC)Reply[reply]
Looks like we can display the raw ID with the CSS code (add to Special:MyPage/common.css to test):
.senseid::before {
  content: '[' attr(id) '] ';
}
But that shows the sense ID with the language name, colon, and underscores. There would need to be an additional attribute (say data-senseid="sense ID without language name and with spaces") generated by the template to get a nice display. The text around the sense ID can be changed in the CSS. — Eru·tuon 15:55, 12 July 2023 (UTC)Reply[reply]
Read the earlier post again and I am not sure how to display the |id= attribute in links. It's more complicated because there could be a parenthesized set of annotations, no annotations, or no HTML at all ({{ll}}, though that's rarely used). — Eru·tuon 16:26, 12 July 2023 (UTC)Reply[reply]
I've added the data-senseid="" attribute to the HTML generated by {{senseid}}, so you can display the sense ID with the following CSS in Special:MyPage/common.css:
.senseid::before {
  content: '[' attr(data-senseid) '] ';
}
I might be jumping the gun on this, but at least it is unlikely to be harmful. — Eru·tuon 16:43, 12 July 2023 (UTC)Reply[reply]
This solution might be too rigid. For example, one might want to link to vowel harmony rules or the application of vowel modification rules such as umlaut or ablaut which affect the stem. Or have you already taken these ideas on board? --RichardW57 (talk) 06:33, 12 July 2023 (UTC)Reply[reply]
Not yet. I need some examples of how these rules might work. Benwing2 (talk) 06:34, 12 July 2023 (UTC)Reply[reply]
This seems like the best way to go about this, to have a template/module just know that if a Finnish term is input as having the suffix -käs it should categorize as -kas, since IMO people are unlikely to stop finding it intuitive to input whatever suffix a term visually has (like -käs), unless perhaps we started adding an explicit indication of the vowel change as a separate step (e.g. foobar + -kas + vowel harmony change of a to ä, which is way wordier than foobar + [[-kas|-käs]]). The main problems I can think of are a) already problems at present, and b) at least as much issues of editor decision-making as of template functionality. Namely: 1) need to still handle cases where there are two suffixes -foo, and one is an alt form but the other is the main or only form (or even just in general, cases where there are two suffixes; look at the state of Category:English terms suffixed with -n where forgiven and Arizonan are lumped together: there is nominally a subcategory forgiven is supposed to be in, but it is nigh-unused), which is a "users don't know / can't be arsed to use more specific links/parameters" problem more than any issue with the templates per se, and 2) deciding what to combine (e.g. how many, if any, of -ian, -an, -n, and -ean — as seen in e.g. Bangkok+ian vs Abu Dhabi+an vs Saudi Arabia+n, Java+n, Ecuador+ean vs Achebe+an vs Althea+n vs ?Achillean, ?Antillean — should the module consider alt/variant forms of the same underlying suffix?). - -sche (discuss) 19:44, 12 July 2023 (UTC)Reply[reply]
@-sche Thanks. Yeah it's not always clear where to draw the line with alt forms. My general thinking is that changes that are primarily phonologically motivated should count but other sorts of variants shouldn't, e.g. in -an vs. -ian both can occur with the same word (Arizonan or Arizonian) so this wouldn't count, although there are gray areas e.g. Latinate -al vs. -ar where the latter usually occurs with stems containing an l, hence regular vs. general, but there are exceptions like filial, as well as familiar and familial (with different meanings). Benwing2 (talk) 19:58, 12 July 2023 (UTC)Reply[reply]
FYI I am planning on renaming Module:compound and dependencies to Module:affix, since it relates much more to affixes than compounds. Any objections, e.g. User:Theknightwho User:Surjection User:Erutuon? Benwing2 (talk) 02:25, 13 July 2023 (UTC)Reply[reply]
Sounds good. Theknightwho (talk) 05:25, 13 July 2023 (UTC)Reply[reply]
@Surjection, -sche This support is available now. Currently only Finnish mappings are available; see Module:affix/lang-data/fi. Feel free to add more and expand the language support. The mappings aren't applied if there's a separate display form set with |altN= or the <alt:...> inline modifier, or if embedded (including piped) links are found in the affix. Benwing2 (talk) 20:20, 16 July 2023 (UTC)Reply[reply]
BTW I will be modifying the category-handling code to display the mapping variants on the page for the canonical affix, e.g. CAT:Finnish terms suffixed with -kas will indicate that it also includes terms suffixed with -käs. Benwing2 (talk) 20:23, 16 July 2023 (UTC)Reply[reply]
@Surjection, Fenakhay This is done; categories like CAT:Finnish terms suffixed with -kas and CAT:Turkish terms suffixed with -ci should show the variant suffixes. Benwing2 (talk) 07:40, 19 July 2023 (UTC)Reply[reply]
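As a rough illustration of the behaviour described in this thread, the sketch below maps a variant suffix to its canonical form for linking and categorisation while keeping the spelling as typed for display. It is written in JavaScript rather than Lua. The table layout and the function name are hypothetical, and only the -käs/-kas pair comes from the discussion.

```javascript
// Hypothetical variant table in the spirit of Module:affix/lang-data/fi.
const VARIANTS = new Map([
  ['fi', new Map([['-käs', '-kas']])],
]);

// Decide what to link/categorise under and what to show.
// `alt` stands in for a display form given via |altN= or <alt:...>.
function resolveAffix(lang, affix, alt) {
  const table = VARIANTS.get(lang);
  const canonical = table ? table.get(affix) : undefined;
  // Per the thread, no mapping is applied when a display form is given
  // or when the affix already contains an embedded (possibly piped) link.
  const skip = Boolean(alt) || affix.includes('[[') || canonical === undefined;
  return skip
    ? { link: affix, display: alt || affix }
    : { link: canonical, display: affix };
}
```

With this, `resolveAffix('fi', '-käs')` links to -kas while still displaying -käs, which is what {{af|fi|foo|-käs}} would need in order to categorise under "Finnish terms suffixed with -kas".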

lemmas not separating by first letter

Normally lemmas are listed in separate groups under their first letter when you look at a category list such as Category:Ket_lemmas. However, in that Ket list all words beginning with к are followed by all beginning with ӄ, without a big capital ӄ header. The two groups seem to be correctly sorted - there are all the к words in correct order then all the ӄ words, and a proper name Кънӄоʼ is correctly sorted ignoring case. It is just missing the big capital letter dividing them.

So I wondered if it was a peculiarity of ӄ, and what other language might use it? I guessed Uzbek. Curiouser and curiouser. In Category:Uzbek_lemmas if you page through to roman Z (sorry, I don't know how to link to that) you then get a list for o' and g' and sh and ch - it knows these digraphs are supposed to be separate letters (and capital Sh etc are correctly ordered within them), but it doesn't put letter headers for them. Then it goes through Cyrillic, gets to Я, and various other letters such as ў and қ then get listed without their own header. (And I learnt that ӄ and қ are different letters.) -- Hiztegilari (talk) 21:37, 11 July 2023 (UTC)Reply[reply]

@Hiztegilari The sort key for Ket in Module:languages/data/3/k intentionally sorts ӄ together with but after к. The reason for this is that Unicode ӄ (code 1220 = 0x4c4) doesn't come directly after Unicode к (code 1082 = 0x43a), so by default they won't be grouped together, so this is being done to ensure that they are ordered correctly. The lack of a separate header is a side effect of this; User:Theknightwho can comment more on this but I don't think it's possible given the way the MediaWiki software works to ensure that ӄ comes after к but ends up with its own header. Benwing2 (talk) 23:18, 11 July 2023 (UTC)Reply[reply]
@Benwing2 It is (probably) possible, but we would need to get MediaWiki to enable sorting by the Unicode Collation Algorithm by default. There will still be edge-cases where a specific language will need further changes (which would still have the issue with headers), but it would solve a large majority of manual sorting we’re forced to do at the moment.
It’s just a matter of getting $wgCategoryCollation changed in LocalSettings.php for the site, so if we go to the Phabricator they should make the change. I suspect they’ll want to see consensus for it, but I don’t think a vote is necessary unless they ask for it. Theknightwho (talk) 00:01, 12 July 2023 (UTC)Reply[reply]
@Theknightwho I think we had a discussion about this before, and there were concerns about what would break or become backward-incompatible if we do this. If you're looking for consensus on a change like this, you should lay out what those concerns are and how to alleviate them. Benwing2 (talk) 00:10, 12 July 2023 (UTC)Reply[reply]
@Benwing2 The main concern is that we’d end up with a bunch of sortkeys double-compensating, so the easiest thing to do would be to switch makeSortKey in Module:languages off for a few days while we fix the sortkeys in the data modules up to be compatible with the new default.
There would still be double-compensation issues with manual sorting, but it’s rarely used anyway and the issue isn’t that big of a problem in the first place. Theknightwho (talk) 00:36, 12 July 2023 (UTC)Reply[reply]
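
To illustrate the point made above about why ӄ sorts correctly but gets no header of its own, here is a minimal Python sketch. It is not the code in Module:languages/data/3/k: the key scheme (folding ӄ to к plus a high placeholder code point), the helper names, and the sample strings are all assumptions made up for the illustration, and the header rule is a simplification of what MediaWiki does with its default collation.

```python
# Sketch: why a language-specific sort key fixes the order but hides the header.
# The key scheme below is an assumption for illustration, not the real module code.

HIGH = "\uffff"  # placeholder code point that sorts after every Cyrillic letter


def ket_sort_key(word: str) -> str:
    """Fold case, then file ӄ under к but after every plain к sequence."""
    return word.lower().replace("ӄ", "к" + HIGH)


def header(word: str) -> str:
    """Simplified rule: the category header is the first character of the sort key."""
    return ket_sort_key(word)[0].upper()


# Made-up letter sequences, not real Ket lemmas.
words = ["ӄа", "ла", "кя", "ка"]

by_code_point = sorted(words)                  # ӄ (0x4C4) lands after л (0x43B)
by_sort_key = sorted(words, key=ket_sort_key)  # ӄ lands directly after the к group

print(by_code_point)
print(by_sort_key)
print(header("ӄа"))  # same header as a plain к word
```

Because the key for a ӄ word starts with к, the ordering comes out as wanted, but the software sees the same first character for both groups and so prints a single header.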

Lua errors: back to the bad old days

CAT:E has filled up with the usual suspects again. Worse, a exceeds the template include size - so even the non-Lua templates are failing towards the end of the page!

I think we seriously should consider splitting our very longest entries into multiple pages. Not only are we fighting a losing battle against two technical foes (Lua memory and template include size), but the entries are now so long that the user experience is very poor, especially on mobile (try navigating to the Scottish Gaelic entry for a on your phone).

See User:This, that and the other/a for a prototype I came up with some time ago. Ironically, the extra code in {{l}} etc to handle this may well push other pages over the limit. But it's worth a shot, surely. This, that and the other (talk) 06:47, 12 July 2023 (UTC)Reply[reply]

@This, that and the other Before we go down this road I'd like to hear from User:Theknightwho and the status of their pre-parser. They seem to have had some great luck radically reducing both template include size and memory on certain heavy pages, so we might not need to go the radical step of splitting pages by language (at least not yet). Benwing2 (talk) 07:37, 12 July 2023 (UTC)Reply[reply]
Is this something like {{multitrans}} but for an entire page, or sections thereof? That's a great idea, of course - but it doesn't address the underlying usability issues with such long pages. This, that and the other (talk) 08:02, 12 July 2023 (UTC)Reply[reply]
@This, that and the other That’s correct - it’s not quite fast enough yet (as it times out on a about 75% of the way through), but on the part it did load it used about 25MB even after converting all the lite templates to normal ones. On teacher, memory use goes down from 49MB to about 13.5MB.
There are still a bunch of edge-cases that need to be worked out, and I’m currently refactoring it to try to increase the speed, since the current design centres around maximising memory savings and seems to be overkill. Theknightwho (talk) 12:28, 12 July 2023 (UTC)Reply[reply]
Just to add, by the way: there's a third limit a is very close to as well (the 10 second load time). The lite templates are a major contribution to that, as they're much more intensive for the parser (even though they don't call into Lua). Plus, certain things cause multipliers to be applied to text when calculating the post-expand include size limit - notably, parser functions - and these can compound. That means the lite templates are responsible for a butting up against that limit as well. If we can solve the Lua memory problems with the parser I'm designing, it should also help with the other two limits as well since it means we could ditch the lite templates. Theknightwho (talk) 12:45, 12 July 2023 (UTC)Reply[reply]
@This, that and the other Can you expand on the usability issues for mobile? Is there a way to mitigate them other than splitting the page? I don't normally use the mobile site but can you not just search for Scottish Gaelic, or use the table of contents? Benwing2 (talk) 18:26, 12 July 2023 (UTC)Reply[reply]
Mobile doesn't have a TOC. But yes, you could search the page for the heading, this is a good point. It actually works quite well on my phone from a performance standpoint! However, you do have to move through a couple of results (a descendant) before you get to the right place. I would still prefer a world where a "find" wasn't necessary, but perhaps there isn't as much to worry about as I thought. This, that and the other (talk) 02:03, 13 July 2023 (UTC)Reply[reply]
I did a bit of testing, and it seems the post-expand inclusion limit can be avoided by replacing {{q-lite}} (and {{sense-lite}}, which is based on {{q-lite}}) with an even-more-lite template that only supports one argument. This only allows a few more sections though, up to Etymology 2 of a#Yola, and of course the Lua error messages are still there for the non-lite templates. I reckon that someone with more knowledge of how the lite templates work could probably figure out a way to bring down both the memory usage and the post-expand inclusion size (for example, it might be beneficial to add |langname= where needed in order to avoid loading {{langname-lite}}, which is stupidly large, or to remove the entries in {{langname-lite}} that are rarely used).
Obviously, this would be a futile effort that requires a lot of effort for very little gain, and would soon require more work as the page grows longer. Ideally we would want to move away from the current practices that are inefficient and bring an end to the lite-pocalypse - TKW's parser seems like a promising start on such work to me. – Wpi (talk) 19:29, 12 July 2023 (UTC)Reply[reply]

Tried to add a citation to the word "singlet", got flagged as harmful

I decided to look up some old usages of the word singlet (sense 1.3 "a person who does not have a form of multiplicity"), found a Usenet citation from 1999, and tried to add it to the citations page with the appropriate template. However, my edit was flagged as harmful under "abuse 23", whatever that was. (I don't see any way to check what exactly it was.)
I've been told that if I wanted to dispute the decision I should go to this (sub)forum, so here I am. I am fairly sure that adding a citation is a constructive action, and I'm surprised that it got flagged as harmful. What exactly did I do wrong, and what could I do to fix it? 2.52.7.195 15:14, 12 July 2023 (UTC)Reply[reply]

I see it now: it was because the citation template introduced an external link, and since my IP did not have previous edits, that flagged me as a possible spammer. Still not sure on the "what I could do to fix it" side, though. 2.52.7.195 15:51, 12 July 2023 (UTC)Reply[reply]
You should be able to add it now. As it's a non-secret filter, I can tell you that it merely blocks users and IPs with one edit or no edits at all. Theknightwho (talk) 16:30, 12 July 2023 (UTC)Reply[reply]
I tried and it still didn't work. I've saved the edit text on my laptop for now. Maybe I should have changed the text somehow? Not today, though. 2.52.7.195 19:15, 12 July 2023 (UTC)Reply[reply]
It's triggering a global MediaWiki filter (metawiki:Special:AbuseFilter/214) that's not visible even to admins like me. I don't know what the purpose of the filter is, and I've copied your edit out of the abuse filter log and done it myself. Apparently the filter can be bypassed by admins. — Eru·tuon 21:09, 12 July 2023 (UTC)Reply[reply]
I think the reason @Theknightwho got confused is that the 23 in the "abuse 23" message isn't the number of the filter (that's 214 in MediaWiki's global abuse filter list), but a reference that only those who work with those filters know the meaning of. Our abuse filter 23 is designed to stop spambots that create bogus new user pages, so it doesn't apply to mainspace edits at all. Chuck Entz (talk) 03:49, 13 July 2023 (UTC)Reply[reply]
Interesting, and I just noticed that having no prior edits didn't stop a different IP from adding someone's LinkedIn link to cumslut, which means global filter 214 can't just be stopping new users from adding links (which would've been my first guess and seems like a smart enough thing to do!); it must be something other than the mere addition of a link ... perhaps some long-term vandal spams links to that specific newsgroup, or perhaps the filter doesn't like the mention of dissociation or doctors plus a link (as if maybe you might be linking to quack cures), or perhaps the very gibberishy / spam-link-esque URL 5L4WQOpUWU/m/gyzds5osXwIJ is what's triggering it. Who knows. - -sche (discuss) 05:09, 13 July 2023 (UTC)Reply[reply]
@Chuck Entz So in the abuse log the IP above triggered filter 26 (which is for stopping new users adding external links), but I see right above it that the global filter you mention was also triggered. Theknightwho (talk) 05:21, 13 July 2023 (UTC)Reply[reply]

Comma in "initialism of"

For some reason, the serial comma template {{,}} generates an "and" string when used inside templates such as {{initialism of}} (see e.g. TMA, MMR, DEI). Einstein2 (talk) 22:52, 12 July 2023 (UTC)Reply[reply]

@Einstein2 This is because of recent changes I made to the form-of templates to support multiple comma-separated lemmas. It expects actual embedded commas to be followed by a space, which isn't the case here. I can hack around this but ideally such templates should be formatted like this: {{initialism of|en|tense,mood,aspect}} instead of the hacky way of using the serial comma template. Benwing2 (talk) 00:11, 13 July 2023 (UTC)Reply[reply]
Actually not sure about that; the comma-separated lemmas are intended to express the case where a given term is simultaneously the initialism (or whatever) of multiple lemmas (e.g. multiple lemmas each of which spells out TMA), which isn't the use case here. Benwing2 (talk) 00:14, 13 July 2023 (UTC)Reply[reply]
If I understand your second comment correctly, it's neat to know that functionality exists. Reading your first comment, I had been about to say that expecting people to type {{initialism of|en|tense,mood,aspect}} (without spaces) whenever they want "Initialism of tense, mood, aspect" (with spaces) would be very unintuitive, but now I take it you mean the CSV approach is for the relatively few cases like WOC where it's equally the acronym of the singular woman of color and the plural women of color (and not that "Initialism of tense, mood, aspect or trimethoxyamphetamine or transmisogyny-affected" should be lumped together). - -sche (discuss) 07:22, 13 July 2023 (UTC)Reply[reply]
@-sche Yes, that's basically correct. I added the ability to have multiple lemmas in form-of templates mostly for {{infl of}}, so that e.g. Czech jimi can be written {{infl of|oni,ony,ona||ins|p}} (these are respectively, the masculine animate plural, feminine + masculine inanimate plural, and neuter plural third-person pronouns in Czech) rather than having to list the three lemmas separately. It ends up applying to all form-of lemmas, but it's more useful for some than others. I'm not suggesting it makes sense or is required to group multiple unrelated abbreviations together just because the functionality is there, but it may be useful for related terms, as you suggest. Benwing2 (talk) 18:41, 13 July 2023 (UTC)Reply[reply]
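
The comma rule described in this thread (a comma followed by a space is literal text, a comma with no space after it separates lemmas) can be sketched in a few lines of Python. This is only an illustration of the rule as stated above; `split_lemmas` is a made-up name and the actual form-of module may handle further cases.

```python
import re


def split_lemmas(arg: str) -> list:
    """Split on commas that are NOT followed by whitespace.

    "a,b"  -> two lemmas
    "a, b" -> one lemma containing a literal comma
    """
    return re.split(r",(?!\s)", arg)


print(split_lemmas("oni,ony,ona"))          # three separate lemmas
print(split_lemmas("tense, mood, aspect"))  # one lemma, commas kept
```

Under this rule a bare comma produced by a template such as {{,}} (with no space after it) would be read as a lemma separator, which matches the "and" output reported at the top of the thread.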

A lot of redirected categories are defined with {{category redirect}} or {{movecat}}. Why weren't these categories just deleted? They generally consist of cases where a language has been renamed, or are misspellings (e.g. 'adjetives' in place of 'adjectives', missing 'the' in place name categories, etc.). Sometimes the redirects themselves are broken. I'm thinking we should just delete all such categories. Thoughts? Benwing2 (talk) 02:10, 13 July 2023 (UTC)Reply[reply]

@Benwing2 It's because pages added to redirected categories don't get added to the main category, so this makes it easier to check. It comes from Wikipedia, and is mostly pointless for us as we tend not to add pages to raw categories. Theknightwho (talk) 05:06, 13 July 2023 (UTC)Reply[reply]
@Theknightwho: When you wrote 'main category', did you mean 'new category'? --RichardW57m (talk) 09:22, 14 July 2023 (UTC)Reply[reply]
@RichardW57m I mean the target of the redirect. Theknightwho (talk) 09:23, 14 July 2023 (UTC)Reply[reply]
Thank you. --RichardW57m (talk) 09:26, 14 July 2023 (UTC)Reply[reply]
Some of these could be useful, especially in cases where automated tools might continue to generate new inclusions in these categories. It's also worth keeping in mind that third-party sites might link to English Wiktionary category pages, and it would be rude to break their links altogether. I'm inclined to delete the misspelled and other obvious junk categories and see what's left behind. This, that and the other (talk) 09:37, 21 July 2023 (UTC)Reply[reply]

'Add languages'

MediaWiki:Vector-no-language-button-label has no sense here on Wiktionary, since it isn't connected to Wikidata. I think that it should be reformulated 'Not in other languages'. Àncilu (talk) 10:05, 13 July 2023 (UTC)Reply[reply]

That's true for pages in our main namespace, but not for pages in, say, the Wiktionary: namespace. A separate message for the two different scenarios should be provided by Wikidata developers. This, that and the other (talk) 11:17, 13 July 2023 (UTC)Reply[reply]
@This, that and the other (Beautiful nickname :-P) : Yes, but with {{#switch}} the problem can be resolved: the default is 'Add languages' and add namespace 0 as exception. Àncilu (talk) 11:34, 13 July 2023 (UTC)Reply[reply]
Ah, good point (although Wikibase should also fix this). I was able to confirm that it worked on Beta Wiktionary, so I made the change here too. This, that and the other (talk) 03:11, 14 July 2023 (UTC)Reply[reply]

Bug involving language labels

When Template:label is used like {{label|en|MLE}}, currently this is what is produced: (MLE)

Notice that the MLE link is directed to w:MLE rather than w:Multicultural London English.

I'm not at all familiar with how Module:labels works, but the data at Module:labels/data/lang/en seems to be correct, and the bug appears to be caused by the main Module code. Wpstatus (talk) 23:01, 14 July 2023 (UTC)Reply[reply]

@Wpstatus I've fixed this. The issue was caused by the fact that if Wikipedia is merely set to true, the main module prioritises what's in the display field over the label name. It was easy to fix by simply setting Wikipedia to "Multicultural London English" instead.
I have to admit that I didn't find this very intuitive either, and had to check the main module logic to see what was going on. Theknightwho (talk) 01:25, 15 July 2023 (UTC)Reply[reply]
@Theknightwho After digging a bit deeper into this, I think this is actually a broader problem with how aliases are handled.

The label variable declared at Module:labels#L-61 is just the second template argument. This variable is then used at Module:labels#L-137, which sets the Wikipedia_entry variable to the value of label.
This means that the value of Wikipedia_entry is determined directly by the second argument to the template. This creates a problem when an alias is used.
For example:
  • When you do {{label|en|Multicultural Toronto English}}, you get the expected result: (MTE)
  • In contrast, when you use the alias like {{label|en|MTE}}, you get this: (MTE). Notice that it links to w:MTE. This is because the logic which I described above sets Wikipedia_entry to "MTE"

I think the correct fix is to update the finalize_data function (Module:labels#L-251) - this is the function that sets up all the aliases. When the aliases are being finalized, the information about the original label name should be included in the entry.
Actually, after looking through the code a bit more, I think that's exactly what the alias_of field is for (Module:labels#L-92), except that it's not set properly. The fix might be as simple as just setting alias_of correctly within finalize_data. Wpstatus (talk) 02:44, 15 July 2023 (UTC)Reply[reply]
@Wpstatus I've changed this so that alias_of gets automatically added if Wikipedia or glossary are set to true for a given label. alias_of is a bit of a misnomer, because finalize_data actually just makes all the relevant keys point to the exact same table (so they're all aliases of each other), which means it's not possible to know what the main key is supposed to be without doing it like this. Theknightwho (talk) 03:12, 15 July 2023 (UTC)Reply[reply]
Thanks! Wpstatus (talk) 03:18, 15 July 2023 (UTC)Reply[reply]
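
For readers following the alias discussion, here is a rough Python model of the problem and the fix. It is a simplified sketch, not a port of Module:labels: the field name `canonical` stands in for alias_of, the data layout is assumed, and the real module also consults the display field.

```python
def finalize_data(labels: dict) -> dict:
    """Make every alias key point at the SAME entry as its canonical label.

    Because all keys share one entry object, the entry must record its own
    canonical name; otherwise a lookup via an alias cannot recover it.
    """
    table = {}
    for canonical, data in labels.items():
        entry = {k: v for k, v in data.items() if k != "aliases"}
        if entry.get("Wikipedia") is True:
            entry["canonical"] = canonical  # hypothetical stand-in for alias_of
        table[canonical] = entry
        for alias in data.get("aliases", []):
            table[alias] = entry  # same object, not a copy
    return table


def wikipedia_target(table: dict, label: str) -> str:
    """Return the Wikipedia article to link for a label or one of its aliases."""
    entry = table[label]
    target = entry.get("Wikipedia")
    if target is True:
        # Fall back to the typed label only if no canonical name was recorded.
        return entry.get("canonical", label)
    return target


table = finalize_data({
    "Multicultural Toronto English": {
        "aliases": ["MTE"],
        "display": "MTE",
        "Wikipedia": True,
    },
})
print(wikipedia_target(table, "MTE"))
```

Without the recorded canonical name, the lookup via "MTE" would fall back to the typed label and link to w:MTE, which is the behaviour reported above.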

Is there a way to link to a specific meaning?

A reference can only be narrowed down to the part of speech, it seems (e.g. example#Noun). But there can sometimes be dozens of senses listed under that. How to link to one of those by number? Elfth (talk) 10:57, 15 July 2023 (UTC)Reply[reply]

@Elfth: Yes, but it needs to be set up on both sides of the link. See {{senseid}}. So at example#Noun you could (e.g.) add {{senseid|en|representative}} in the top definition, typically after the # at the start of the line in the code, and then link to it with {{l|en|example|id=representative}}. —Al-Muqanna المقنع (talk) 11:12, 15 July 2023 (UTC)Reply[reply]
Thanks. Elfth (talk) 12:09, 15 July 2023 (UTC)Reply[reply]
You should prefer giving glosses instead if possible. Having links to specific definitions from definitions or even most etymologies is not usually ideal. — SURJECTION / T / C / L / 11:57, 15 July 2023 (UTC)Reply[reply]
That's a fair point, lexicographically speaking, but Al-Muqanna's solution is not feasible for the end user who needs to cite Wiktionary via a simple URL. It would also be nice to have a unilateral way of linking to senses and not clutter the source. A graphical solution (like clicking on the sense's number to get its URL) wouldn't be perfect, but would be very convenient for the end user. Elfth (talk) 12:20, 15 July 2023 (UTC)Reply[reply]
I think there is a feature that allows full links to highlight a particular fragment from a page, but it is a very obscure feature that few people even know exist (if it does at all, I might just be misremembering things). — SURJECTION / T / C / L / 13:50, 15 July 2023 (UTC)Reply[reply]
Actually, it's an HTML feature, not a MediaWiki one. Doesn't seem to work for me, though. — SURJECTION / T / C / L / 13:55, 15 July 2023 (UTC)Reply[reply]
I'm aware of that feature, and it works fine for me, but there's a trivial limitation. It highlights only the first instance of the text appended to the link (example#:~:text=Something that serves won't highlight the third sense), so the link has to be made longer for instances further down in the webpage. This works but isn't neat, especially if there are non-ASCII characters in the link, which get percent-encoded into an illegible mess. Elfth (talk) 14:49, 15 July 2023 (UTC)Reply[reply]
@Elfth: I agree it's opaque, though other than the highlighting URL hack I'm not sure it's practicable to do it automatically with the existing software—as it stands, senses are just entries in a numbered list, and even their numbering will change as they get added, removed, and reshuffled. To generate reliable links automatically we'd really need to be treating every sense as an object with a fixed id, which implies a more advanced data structure than we've got right now. For now {{senseid}} is manual but has the advantage that the id is unlikely to ever change once it's set. —Al-Muqanna المقنع (talk) 20:00, 15 July 2023 (UTC)Reply[reply]
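
As a small aid for anyone who wants to use the text-fragment workaround mentioned above, the following Python sketch builds such a link. The function name is made up, and the extra escaping of hyphens reflects my understanding that "-" has special meaning inside a text directive; treat it as an assumption rather than a reference implementation.

```python
from urllib.parse import quote


def text_fragment_url(page_url: str, snippet: str) -> str:
    """Append a #:~:text= fragment that asks the browser to highlight `snippet`.

    Everything outside the unreserved set is percent-encoded; hyphens are
    escaped as well because they act as syntax inside a text directive.
    """
    encoded = quote(snippet, safe="").replace("-", "%2D")
    return f"{page_url}#:~:text={encoded}"


print(text_fragment_url("https://en.wiktionary.org/wiki/example",
                        "Something that serves"))
```

The browser highlights only the first match, so for a sense further down the page the snippet has to be long enough to be unique, and non-ASCII snippets expand into long percent-encoded strings, as noted above.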

Module:Quotations:425: bad argument #1 to 'concat' (table expected, got string)

^ That is what appears for:
* {{Q|grc|Arist.|Pol.|1300|b|19|thru=33|refn=<sup>[[:el:s:Πολιτικά/Δ#p1300b|source]]</sup>|quote=ἔστι δὲ τὸν ἀριθμὸν ὀκτώ, ἓν μὲν εὐθυντικόν, ἄλλο δὲ εἴ τίς τι τῶν κοινῶν ἀδικεῖ, ἕτερον ὅσα εἰς τὴν πολιτείαν φέρει, τέταρτον καὶ ἄρχουσι καὶ ἰδιώταις ὅσα περὶ ζημιώσεων ἀμφισβητοῦσιν, πέμπτον τὸ περὶ τῶν ἰδίων συναλλαγμάτων καὶ ἐχόντων μέγεθος, καὶ παρὰ ταῦτα τό τε φονικὸν καὶ τὸ ξενικόν (φονικοῦ μὲν οὖν εἴδη, ἄν τ’ ἐν τοῖς αὐτοῖς δικασταῖς ἄν τ’ ἐν ἄλλοις, περί τε τῶν ἐκ προνοίας καὶ περὶ τῶν ἀκουσίων, καὶ ὅσα ὁμολογεῖται μέν, ἀμφισβητεῖται δὲ περὶ τοῦ δικαίου, τέταρτον δὲ ὅσα τοῖς φεύγουσι φόνου ἐπὶ καθόδῳ ἐπιφέρεται, οἷον Ἀθήνησι λέγεται καὶ τὸ ἐν '''Φρεαττοῖ''' δικαστήριον· συμβαίνει δὲ τὰ τοιαῦτα ἐν τῷ παντὶ χρόνῳ ὀλίγα καὶ ἐν ταῖς μεγάλαις πόλεσιν· τοῦ δὲ ξενικοῦ ἓν μὲν ξένοις πρὸς ξένους, ἄλλο δὲ ξένοις πρὸς ἀστούς), ἔτι δὲ παρὰ πάντα ταῦτα περὶ τῶν μικρῶν συναλλαγμάτων, ὅσα δραχμιαῖα καὶ πεντάδραχμα καὶ μικρῷ πλείονος.}}
on the preview screen when I edit Φρεαττώ, whereas the published page just shows a bare bullet. Note that the other two transclusions of T:Q in the entry function properly. Line 425 of Module:Quotations reads:
return table.concat(values, separator)
but I'm otherwise at a loss to understand what the problem is. Could someone more knowledgeable than me explain/correct this problem, please? 0D foam (talk) 22:59, 15 July 2023 (UTC)

@JoeyChen, I think your edit broke something. 0D foam (talk) 23:58, 15 July 2023 (UTC)Reply[reply]

@0D foam: it's a bit more complicated: @JoeyChen also made massive changes to Module:Quotations/grc/data at the same time. Previewing Φρεαττώ (Phreattṓ) from this version works fine, but the next edit broke everything and the following edit fixed most of the errors- but left it with this error. I don't know Lua well enough to easily spot the bug in the code, but I'm pretty confident that this is when and where it was introduced. Chuck Entz (talk) 06:19, 16 July 2023 (UTC)Reply[reply]
@Chuck Entz: Thanks for your response. I tried fiddling around with it, but nothing I tried fixed Φρεαττώ on preview. Hopefully, @JoeyChen will be able to find and fix the problem. 0D foam (talk) 12:04, 16 July 2023 (UTC)Reply[reply]
The 'separ' function introduced by JoeyChen as a wrapper for table.concat at Module:Quotations is only used (anywhere) by the Aristotle handling and is the source of the error. Each of the rlFormats for Aristotle calls .separ along the lines of {'.separ', {'.ref1', '.ref2', {'.digits', 2, '.ref3'}}, '.'}. Not a Lua expert but from the Module:Quotations docs the issue might be that separ isn't actually being given a table as a parameter since in rlFormats tables not beginning with a function are interpreted as nested variable addresses? Should probably revert the changes to Aristotle until a fix is found anyway. —Al-Muqanna المقنع (talk) 12:47, 16 July 2023 (UTC)Reply[reply]
I'm not familiar with Module:Quotations, but it seems massively over-complicated for what it's doing. Table concatenation is a really basic function, so why does it need a whole new method in the main module? Theknightwho (talk) 14:07, 16 July 2023 (UTC)Reply[reply]
@Theknightwho Yeah I tried looking at Module:Quotations; IMO the basic idea is good but I had a hard time understanding how it works. Benwing2 (talk) 20:10, 16 July 2023 (UTC)Reply[reply]
I've reverted the Aristotle stuff to the last good version. —Al-Muqanna المقنع (talk) 09:49, 17 July 2023 (UTC)Reply[reply]

----hole: not a suffix

J3133 (talk) 12:33, 16 July 2023 (UTC)Reply[reply]

@J3133 Fixed. I introduced |nosuffix= to disable this check, but also added a check for things beginning with more than one hyphen, which aren't suffixes in any case. Benwing2 (talk) 20:09, 16 July 2023 (UTC)Reply[reply]

Best way to express vowel quality that’s not normally shown in regular writing

In Norwegian, we have two sets of vowels, “open” and “narrow”. The latter stem from Old Norse long vowels (í, é, ú etc.), and sound different from the “open” vowels (i, e, u etc.), so a word like ON hof became [hɞ̞ːʋ], while hófr became [hu̞ːʋ]. However, while the vowels are clearly distinguished in normalised Old Norse writing, Norwegian generally doesn’t do this, and the pronunciation may be quite ambiguous if you don’t know the spoken form. Because of this, Norwegian dictionaries tend to add “ò” or “ó” in brackets after the word to show which vowel quality it has. This is essential to know how to say a word you don’t know.

A solution to this is to add the pronunciation in a new section, which I generally do anyway, but this needs proficiency in IPA, which not all editors have. Then there’s the fact that spellings like hòv and hóv aren’t actually prohibited. It’s an optional feature in spelling as well as just a way to show the pronunciation in dictionaries, so I think the ideal way to express it would be to show this optional accent, as opposed to only the IPA transcription. How could this best be done?

My ideas would be to either add the accent to the head form itself, or in brackets to the right:

hòv n
hov (ò) n

Alternatively, if other languages have solutions for a similar problem, I would like to know. --Eiliv / ᛅᛁᛚᛁᚠᛦ (talk) 12:42, 16 July 2023 (UTC)Reply[reply]

@Eiliv Support for adding extra accents to headwords themselves is already available and used for many languages (e.g. macrons in Latin and Old English to indicate length, tone marks in Latvian, stress+pitch accent marks in Lithuanian, stress marks in East Slavic languages, stress+tone marks in Slovenian and Serbo-Croatian, etc.). So we could definitely add this to Norwegian. It would just be a change to the Nynorsk and Bokmål entries in Module:languages/data/2 to tell the link-processing code to strip off acute and grave accents. Benwing2 (talk) 20:14, 16 July 2023 (UTC)
Splendid! This would only be needed in Nynorsk though, as Bokmål doesn’t use the system I described. Eiliv / ᛅᛁᛚᛁᚠᛦ (talk) 21:08, 16 July 2023 (UTC)Reply[reply]
See Serbo-Croatian example at slovo. So it is still searchable as slovo, while you can see the accent marks at the entry. (That is, the word can be looked up by typing it into the search field without diacritics, while the diacritics still appear in the entry itself. This has been done, for example, with Serbo-Croatian slovo and Russian слово (slovo).) Tollef Salemann (talk) 21:35, 16 July 2023 (UTC)
@Eiliv I went ahead and added stripping of acute and grave accents from Nynorsk links. A side effect of this is that headwords with acutes and graves in them can't be so easily linked to, so hopefully there aren't any or they can be moved. I would add circumflex as well per Norwegian orthography but it seems we have some existing Nynorsk entries with circumflexes, e.g. vêr. Benwing2 (talk) 22:05, 16 July 2023 (UTC)
Let me know if this causes any issues. The idea is that Nynorsk headwords would use the |head= parameter or similar to specify the version with accents in it. Benwing2 (talk) 22:06, 16 July 2023 (UTC)Reply[reply]
This seems good, thank you! For now, I think the circumflex should stay untouched as it’s frequently used in regular writing. Eiliv / ᛅᛁᛚᛁᚠᛦ (talk) 22:40, 16 July 2023 (UTC)Reply[reply]
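
To make the effect of the change concrete, here is a minimal Python sketch of accent stripping for link targets. It only illustrates the idea discussed above (remove acute and grave accents, leave the circumflex alone); the function name is invented and the real entry in Module:languages/data/2 is written differently.

```python
import unicodedata

# Combining marks to strip from the link target: grave (U+0300) and acute (U+0301).
# The circumflex (U+0302) is deliberately left alone, as agreed in the thread.
STRIPPED_MARKS = {"\u0300", "\u0301"}


def nynorsk_link_target(display_form: str) -> str:
    """Turn an accented display form such as 'hòv' into the page name 'hov'."""
    decomposed = unicodedata.normalize("NFD", display_form)
    kept = "".join(ch for ch in decomposed if ch not in STRIPPED_MARKS)
    return unicodedata.normalize("NFC", kept)


for form in ("hòv", "hóv", "vêr"):
    print(form, "->", nynorsk_link_target(form))
```

With this in place, a link written with the optional accent still points at the unaccented page name, while the accented form is what gets displayed.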

Using words as quotations for letters

Moved here from WT:Information desk/2023/July#Using words as quotations for letters. --RichardW57 (talk) 03:11, 17 July 2023 (UTC)Reply[reply]

As there has been some recent disbelief as to the existence of some characters in communication, I have started using words recorded on Wiktionary as evidence of such use. For the quotations, I was using mentions with {{m+}}. As part of the emboldening, I want to embolden both the minimal rendering portion of the word containing the character and the minimal representation in its transliteration. Now, I had worked out a series of methods of doing that:

  1. Embolden as small a region of the character in the word in script as I can and still get a reasonable display.
  2. If the marked up word won't transliterate properly, supply an appropriately emboldened manual transliteration.
  3. If too much of the automatic transliteration is emboldened, then supply a manual transliteration.

However, I hit a problem with a choice of Burmese word to illustrate the Burmese use of with the intrinsic vowel overridden. I chose ရာဇာ (raja, king), for which I want an emboldened transliteration 'raja'. At step 1, I ended up with {{m|my|ရာဇာ}}, which yields ရာဇာ (raja). Step 2 isn't applicable. So for Step 3, I try to override the transliteration, {{m+|my|ရာဇာ|t=king|tr=raja}}, but the override is ignored. What should I do?

My current best idea is to fake it. I can write

{{m+|my|ရာဇာ|tr=-}} (raja, “king”)

though I know I need to replace the double quotes with the appropriate template invocations to honour user preferences. I can't induce {{quote}} to produce a display on one line with a link to the word being quoted, even on a desktop. --RichardW57 (talk) 15:21, 16 July 2023 (UTC)Reply[reply]

@RichardW57 This is a Grease Pit discussion and doesn't belong here (I for one don't regularly follow the Information Desk, but I do check the Grease Pit often). IMO using a manual translit isn't the right approach, and the reason it's ignored is that evidently Burmese is one of the languages where this is being done intentionally (there's a setting that controls this). The reason this happens in the first place, as you probably know, is that single quote chars are evidently passed unchanged through the translit process. The correct approach is to modify the Burmese transliteration module so that you can use a special character of some sort to indicate that you want only the consonant, not the consonant + vowel, to be boldfaced. E.g. maybe you can use a % sign before the consonant to indicate this. Benwing2 (talk) 01:22, 17 July 2023 (UTC)Reply[reply]
@Benwing2: So, I feed in a sequence that looks like '''ရာ'''ဇာ but has an invisible character after the final triple apostrophe that forces the final triple apostrophe one character to the left when passed through the transliteration module. Oh, and I also need the sequence of triple apostrophe and invisible character not to cause the transliteration to be done in parts. That feels horribly hacky and vulnerable to later breakage.
I suppose I should try one of INVISIBLE TIMES/SEPARATOR/PLUS. I hope no-one is feeling possessive about the transliterator. --RichardW57 (talk) 02:35, 17 July 2023 (UTC)Reply[reply]
@RichardW57 That's not what I said. I suggested something like {{m|my|%ရာဇာ|t=king}} where the % sign causes the transliterator to boldface the following single Latin char that emerges. This is a bit hacky but IMO it's better than totally faking it the way you do above. Also if you could, please move this discussion to the Grease Pit so others can contribute. Benwing2 (talk) 02:58, 17 July 2023 (UTC)Reply[reply]
@Benwing2: I'm now confused. {{m|my|%ရာဇာ|t=king}} produces %ရာဇာ (%raja, king) with direct display of the '%' in the Burmese script, which we do not want, and no emboldening of any portion of the Burmese script word. (Incidentally, we should be thinking about something like {{m|my|ရာဇာ|%ရာဇာ|t=king}} so that it links to the Burmese word.) Were you thinking of having an additional step ('exercise for the reader' to use Module:string) to strip the '%' from the display? So I'd need a template so as to avoid invoking a module directly from a mainspace page! --RichardW57 (talk) 03:36, 17 July 2023 (UTC)Reply[reply]
By trying to work out how @Benwing2's solution could work even with changes to the transliteration module, I found the basis of a moderately readable solution.
I can use
{{#invoke:string|replace|{{m+|my|'''ရာ'''ဇာ|t=king}}|a'''|'''a|count=1}}
which renders
Burmese ရာဇာ (raja, king)
and even links to the right place. This solution can even handle what I see as discontiguous transliterations - I see stacking as usually stripping the vowel from the subscripted consonant, not the one that remains unmoved. It also applies to rules such as the initial consonant determining the tone or register, though we then hit the Latin script problem that accents can't be emboldened in isolation.
The only problem is that I need to create a template to wrap the module invocation - the tooling for Module:string doesn't provide wrappers for its functions to be invoked directly from articles - unless the rules have been relaxed since I started toiling here. --RichardW57m (talk) 08:22, 17 July 2023 (UTC)Reply[reply]
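The single replacement above can be tried outside MediaWiki. A minimal Python sketch, assuming the link template emits the transliteration `'''ra'''ja`; the function name `shift_bold_close` is hypothetical, and Python's `str.replace` only stands in for the `replace` function of Module:string called with `count=1`.

```python
def shift_bold_close(rendered: str) -> str:
    """Move the first closing ''' one character to the left, past an 'a'.

    Mirrors the replace call with pattern a''' and replacement '''a:
    only the first occurrence is rewritten (count=1).
    """
    return rendered.replace("a'''", "'''a", 1)


if __name__ == "__main__":
    # Assumed transliteration output for the bolded Burmese input.
    print(shift_bold_close("'''ra'''ja"))
```

Because the pattern is the Latin letter `a` followed by the bold markup, the Burmese-script part of the output is left alone: its closing `'''` follows a Burmese vowel sign, not `a`.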

RichardW57 (talk) 03:11, 17 July 2023 (UTC)Reply[reply]

@RichardW57 As I said in my first post, you need to modify the Burmese transliteration module to support this. It looks like the code is in Module:my-pron. I don't know how difficult this is but it shouldn't be so hard; essentially, pass the % sign unchanged through the transliteration process, then as a postprocessing step, convert sequences of % followed by a Latin character to a boldfaced Latin char. Benwing2 (talk) 03:46, 17 July 2023 (UTC)Reply[reply]
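The postprocessing step described here might look roughly like the following. This is a sketch in Python rather than the Lua of Module:my-pron, and it assumes the transliterator has already passed the `%` through unchanged; the function name `bold_marked_letters` is hypothetical.

```python
import re


def bold_marked_letters(translit: str) -> str:
    """Wrap each Latin letter that directly follows '%' in wiki bold markup.

    The '%' marker itself is consumed, so it never reaches the display.
    """
    return re.sub(r"%([A-Za-z])", r"'''\1'''", translit)


if __name__ == "__main__":
    # Assumed transliterator output for an input beginning with '%'.
    print(bold_marked_letters("%raja"))
```

This covers only the transliteration side; it does nothing about the `%` shown in the Burmese-script display text.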
User:Theknightwho if you have a better idea please let me know. Benwing2 (talk) 03:48, 17 July 2023 (UTC)Reply[reply]
@Benwing2 I'd prefer we not use the percent character for this because it's already used by the Japanese templates for rubytext, meaning that we probably want to reserve % in the main link templates for that purpose to enable the smooth integration of {{ja-r}} down the line.
@RichardW57 Is this something that's likely to come up often? It seems incredibly niche. Theknightwho (talk) 04:20, 17 July 2023 (UTC)Reply[reply]
@Theknightwho Sure, any char would work, or we could just "fake it" like RichardW57 suggests. I don't know if as an alternative it makes sense to allow a way of overriding the override_translit flag by including some signal in the manual translit. Benwing2 (talk) 04:24, 17 July 2023 (UTC)Reply[reply]
@Benwing2 Hmm - I'd be keen to avoid an override flag if possible, to avoid it being abused. One thing that has come up before is whether we should have a way to send optional flags to the transliteration module, which would avoid having to specify a whole manual transliteration but would allow for regular variations (like with е (je) in Russian). Perhaps we could have a standardised way of inputting these that can be interpreted by the language as necessary? (e.g. {{m|my|ရ<!>ာဇာ|t=king}}, where ! is some flag that the language transliteration module knows how to interpret. In Russian, it could be something like {{m|ru|фэ́нте<э>зи}} to give фэ́нтези (fɛ́ntɛzi). Possibly even something like {{m|ru|фэ́нт<э>зи}} would be workable, where <э> acts as a stand-in. Theknightwho (talk) 04:45, 17 July 2023 (UTC)Reply[reply]
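One way to read the `е<э>` idea: the bracketed text is dropped from the display form but replaces the preceding character before transliteration. A Python sketch under that assumption; the syntax handling, the function names and the tiny letter table are all hypothetical and are not taken from the real Russian module.

```python
import re

# Toy table with just enough letters for the example; not the real module's data.
TOY_RU = {"ф": "f", "э": "ɛ", "н": "n", "т": "t", "е": "e", "з": "z", "и": "i"}

# One character followed by an inline override in angle brackets.
OVERRIDE = re.compile(r"(.)<([^<>]+)>")


def split_display_and_source(text: str) -> tuple[str, str]:
    """Return (display text, text to transliterate) for input with overrides."""
    display = OVERRIDE.sub(r"\1", text)   # keep the written letter
    source = OVERRIDE.sub(r"\2", text)    # substitute the override
    return display, source


def toy_translit(text: str) -> str:
    """Map letters via the toy table; anything else (e.g. U+0301) passes through."""
    return "".join(TOY_RU.get(ch, ch) for ch in text)


if __name__ == "__main__":
    display, source = split_display_and_source("фэ\u0301нте<э>зи")
    print(display, toy_translit(source))
```

The entry title and display stay in the normal spelling, while only the transliteration sees the override.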
@Theknightwho Yup, I've long wanted this. It's especially useful in Arabic with the tā' marbūṭa in multiword expressions, which needs to be transliterated as either nothing or as t depending on the syntax. Since the transliteration module can't reasonably work out the syntax of the expression, it renders it as (t), which is less than ideal and leads to the need for a whole lot of (often long) manual transliterations. Benwing2 (talk) 04:55, 17 July 2023 (UTC)Reply[reply]
@Benwing2: You're still overlooking the need to 'embolden' part of the Burmese script text, not just its transliteration. --RichardW57m (talk) 08:02, 17 July 2023 (UTC)Reply[reply]
@Theknightwho: The particular application is likely to come up around thirty times in the next month or so, and then go back to rare. The particular application is highlighting the correct bit of Burmese text when it does not correspond to akshara boundaries.
A more related problem is that of highlighting whole words in quotations when the word boundaries occur inside aksharas. That happens for me with Pali about twice a quotation. In Indic scripts, formatting boundaries have to occur at akshara boundaries. (Sometimes, as in the Thai script, breaks can occur between spacing mark and consonant.) As I am using quotations fairly intensely, selecting most words, it is quite regular. The workaround is to use a manual transliteration for the quotation; its generation can be automated. The problem with transferring this trick is that Burmese does not allow manual overrides for transliteration. --RichardW57m (talk) 08:02, 17 July 2023 (UTC)
@RichardW57 There's a display_text field on languages that lets you apply substitutions to the raw text prior to display. User:Theknightwho can comment on whether the raw text or display text gets sent for transliteration but if it's the raw text, all you need to do is add a substitution in the Burmese display_text field to boldface the Burmese character after $ or whatever special char you choose. Benwing2 (talk) 09:22, 17 July 2023 (UTC)Reply[reply]
Actually, it's boldfacing the akshara after the control character, or else one will get a dotted circle, but it's doable once I get the editing privilege. For my purposes, it's a lot easier to apply substitutions at need - I can be as flexible as I need if the transliterations are now stable - cheap and cheerful! --RichardW57m (talk) 16:11, 17 July 2023 (UTC)
@RichardW57 I see you found another solution that doesn't require changing the language definition. The restriction on not directly invoking modules is for the mainspace; if you're only using this in discussion forums, I don't think it matters. Benwing2 (talk) 09:25, 17 July 2023 (UTC)Reply[reply]
Alas, it's for a quotation supporting Translingual , so I can't even apply the solution for now! --RichardW57m (talk) 15:35, 17 July 2023 (UTC)Reply[reply]
@RichardW57 In that case wrapping the module invocation in a template seems the way to go. Benwing2 (talk) 21:36, 17 July 2023 (UTC)Reply[reply]
@Benwing2: Yes, though I'm surprised there isn't one already. I think {{replace}} will be a good name. The issue is the moratorium you urged on editing such entries as (letter). Notifying @Kwamikagami. RichardW57 (talk) 04:13, 18 July 2023 (UTC)Reply[reply]

substrate codes are renamed

For any of you who regularly work with substrates (e.g. the pre-Greek substrate, the Balkan substrate, the BMAC substrate), I renamed the substrate codes to begin with sub- instead of qfa-sub- (the exceptional qfa-pyg for the Pygmy substrate is now sub-pyg). This was done to shorten the names, particularly so that I could eliminate the nonstandard pregrc alias for the pre-Greek substrate in favor of sub-grc, which is only one character longer; the old code qfa-sub-grc was 5 chars longer, which made it significantly more annoying to type and could have justified keeping the alias. Benwing2 (talk) 06:35, 17 July 2023 (UTC)Reply[reply]
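For anyone updating lists of codes by script, the renaming rule can be written down as a small function. This is a Python sketch of the rule as stated in the post, not the code actually used for the migration; the function name `new_substrate_code` is hypothetical.

```python
def new_substrate_code(old: str) -> str:
    """Map an old substrate code to its renamed form; other codes pass through."""
    if old == "qfa-pyg":              # the one exceptional code
        return "sub-pyg"
    prefix = "qfa-sub-"
    if old.startswith(prefix):
        return "sub-" + old[len(prefix):]
    return old


if __name__ == "__main__":
    for code in ("qfa-sub-grc", "qfa-pyg", "gem-pro"):
        print(code, "->", new_substrate_code(code))
```

The length argument in the post also checks out: `sub-grc` is one character longer than `pregrc`, and `qfa-sub-grc` is five longer.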

Thanks for this. It would be good to take this a bit further by eliminating any other 10 & 11-character codes wherever possible (xx-xxx-xxx and xxx-xxx-xxx), as they're really unwieldy. Theknightwho (talk) 08:12, 17 July 2023 (UTC)Reply[reply]
This probably makes some browsers / screenreaders / etc interpret them as varieties of Suku (ISO code sub), but I suppose it doesn't matter from anything but a technical-correctness standpoint, since fonts should be set by our own CSS that uses our codes, we don't normally tag text with any of these codes AFAIK (the way {{der|en|de|foo}} tags text as de) anyway, and I can't imagine anything that interprets these codes as signalling varieties of Suku will handle them any worse than if it interpreted them as private-use-range (q..) codes. However, I am not sold on the idea of updating all the xxx-xxx-xxxs, because if we start having a lot of cases where foo-bar is sometimes a variety of foo (like la-vul is a variety of la, etc) and sometimes completely unrelated but we just shortened our exception code to the valid ISO code foo + hyphen + three more letters even though it's not the language ISO code foo corresponds to, that strikes me as bad. I think most of our xxx-xxx-xxx are also proto-languages where it helps that the codes are just "family code" + "-pro", and breaking that system also strikes me as bad. I suppose we could try to rename all the xxx-xxx family codes to things in the limited ISO private-use q.. range, and then rename their corresponding protolanguage codes to qXX-pro, but it seems like that would reduce the intelligibility of the codes, since we have a lot of family codes. So I think that idea requires more thought. - -sche (discuss) 00:45, 18 July 2023 (UTC)Reply[reply]
@-sche Hmmm, I didn't think about the conflict with language code sub at the time. If this is an issue, we could always rename the substrate codes to something that won't clash; there weren't too many uses and I'm tracking the new substrate code uses. One possibility is something like qsub-foo; this is guaranteed not to clash with any ISO codes since they're all 3 letters. I agree that renaming all of the 10/11-char codes needs more thought. Benwing2 (talk) 00:49, 18 July 2023 (UTC)Reply[reply]
Ah, yes, renaming them to something that's outside the ISO schema entirely could work. I guess we could rename all the long codes TKW mentions, and their family codes, to non-ISO-style codes, too, as a possibility ... but if we rename things to (codes that start with) four-letter codes, we should probably be careful that they don't look like / conflict with four-letter ISO 15924 script codes (for which the reserved range IIRC is Qaaa–Qabx, again quite small, not allowing for many intelligible custom codes); in theory the ISO could assign Qsub as the script code of some script and then qsub-foo might be interpreted as the foo-language version of that script(?). I suppose we could do something like, assign the substrates and our xxx-xxx-xxx-sized codes five letter codes (and just make sure they don't conflict with our five-letter script codes like Polyt)? Or six letter codes? IDK; it's an idea to ponder. - -sche (discuss) 01:10, 18 July 2023 (UTC)Reply[reply]
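The clash concern can be phrased as a check on the shape of a code's first segment. A rough Python sketch; the function name and the category labels are hypothetical, and it deliberately ignores letter case for four-letter prefixes as the cautious assumption. The qaa to qtz range is the ISO 639 private-use range.

```python
def classify_prefix(code: str) -> str:
    """Say what the first hyphen-separated segment of a code could be mistaken for."""
    prefix = code.split("-", 1)[0]
    if len(prefix) == 3 and prefix.islower():
        if "qaa" <= prefix <= "qtz":      # ISO 639 private-use range
            return "private-use language"
        return "language-shaped"          # could be a real ISO 639 code
    if len(prefix) == 4 and prefix.isalpha():
        return "script-shaped"            # same length as ISO 15924 codes
    return "no clash"


if __name__ == "__main__":
    for code in ("sub-grc", "qfa-sub-grc", "qsub-grc", "subst-grc"):
        print(code, "->", classify_prefix(code))
```

Under this check only prefixes of five or more letters (or private-use three-letter ones) avoid looking like an ISO code, which matches the five- or six-letter suggestion above.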
@-sche, Theknightwho Do either of you know if script codes directly correlate with CSS classes? Looking through the module code for 'polytonic', I see various places that reference a CSS class polytonic. Does this need to change to Polyt as well? Benwing2 (talk) 05:02, 18 July 2023 (UTC)Reply[reply]
If I understand your question correctly, then yes. If you're talking about something like this with class="polytonic" lang="grc", it definitely seems to be assuming "polytonic" is the script code, so if we've updated the script from "polytonic" to "Polyt", that instance should also be updated. - -sche (discuss) 07:32, 23 July 2023 (UTC)Reply[reply]

other xxx-xxx-xxx codes

Spitballing an idea for shortening other xxx-xxx-xxx codes as requested above:

  • To keep a system where proto-languages' abc-def-pro codes and families' abc-def codes are derivable from each other, could we rename the 'second part' of the codes to "two letters + f for family or p for proto-language", i.e. rename aav-khs, aav-khs-pro → aav-khf, aav-khp?
  • What other 11-letter codes are there? Is it just the four qfa-xgx-... ones? (Anything else should use the nearest ISO i.e. three-letter family code, right? Like we have bnt-bal, not bnt-bbo-bal.) We could make an exception-to-the-qfa-exception and shorten qfa-xgx → qpm (still in the ISO private-use range, like qfa) and shorten qfa-xgx-tuh → qpm-tuh etc. Or we could make a general prefix qla- for ISO-family-code-less exceptional languages, like qfa- for families, and rename qfa-xgx-tuh → qla-tuh.

- -sche (discuss) 09:27, 23 July 2023 (UTC)Reply[reply]
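To see how the "two letters + f or p" idea would behave, here is a small Python sketch. It assumes the two letters are simply the first two of the existing second part (as in khs giving kh); that choice and both function names are hypothetical, and two families whose second parts share their first two letters would collide under it.

```python
def shorten_code(code: str) -> str:
    """Apply the proposed scheme to abc-def and abc-def-pro codes only."""
    parts = code.split("-")
    if len(parts) == 3 and parts[2] == "pro":
        return f"{parts[0]}-{parts[1][:2]}p"      # proto-language
    if len(parts) == 2 and parts[1] != "pro":
        return f"{parts[0]}-{parts[1][:2]}f"      # family
    return code                                   # e.g. gem-pro is left alone


def proto_of(family_code: str) -> str:
    """Derive the proto-language code from a shortened family code."""
    if not family_code.endswith("f"):
        raise ValueError(f"not a shortened family code: {family_code}")
    return family_code[:-1] + "p"


if __name__ == "__main__":
    print(shorten_code("aav-khs"), shorten_code("aav-khs-pro"))
```

The two codes stay derivable from each other, which is the property the proposal is trying to keep.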

@-sche qpm would make sense as a family code, and would carry through to the language codes by extension. This will only ever come up in situations where we need to have wholly new family codes because there is no parent superfamily, which doesn't happen very often. In all other situations, new family codes can simply be derived from their superfamilies. Theknightwho (talk) 11:54, 25 July 2023 (UTC)Reply[reply]
@-sche, Theknightwho My main concern with aav-khp is that things with -pro are instantly identifiable as proto-languages, while aav-khp isn't so obviously identifiable. But maybe we could use four-letter codes for proto-languages? E.g. aav-khsp, or gemp for Proto-Germanic, ine-bslp for Proto-Balto-Slavic in place of gem-pro, ine-bsl-pro. I think you've said there's a theoretical possibility of clash with four-letter script codes, but how realistic is that? For one thing, script codes begin with a capital letter (although I'm not sure whether CSS classes are case-sensitive). If this is an issue, we could use a special character to denote the proto-language, e.g. gem+, ine-bsl+ or gem-, ine-bsl- or something. Benwing2 (talk) 06:25, 26 July 2023 (UTC)
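The four-letter variant amounts to folding the `-pro` suffix into a trailing `p`. A Python sketch of that rule and of how such a code could be told apart from a script-suffixed one by case; both function names are hypothetical, and the recognizer is only a heuristic (any lowercase four-letter final segment ending in p would pass).

```python
def compact_proto(code: str) -> str:
    """Fold a trailing -pro into a final 'p' (gem-pro -> gemp)."""
    if code.endswith("-pro"):
        return code[: -len("-pro")] + "p"
    return code


def looks_like_compact_proto(code: str) -> bool:
    """Heuristic: last segment is four lowercase letters ending in 'p'."""
    last = code.rsplit("-", 1)[-1]
    return len(last) == 4 and last.islower() and last.endswith("p")


if __name__ == "__main__":
    for code in ("gem-pro", "ine-bsl-pro", "aav-khs-pro"):
        print(code, "->", compact_proto(code))
```

A code such as fa-Arab fails the check because its last segment starts with a capital letter, which is the distinction the comment relies on.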
I agree it would be good to keep the proto-language codes recognizable. Testing, it seems like script codes are indeed case-sensitive such that "foo-arab" will not clash with "foo-Arab" — colour me surprised, since if you coloured me ff0000, that is not case-sensitive. I suppose that means we could indeed use aav-khsp or aav-khsP or something. But to back up for a moment, I suppose the obvious question we might should check is, how broad is the demand to shorten these? since renaming every proto-language will affect a lot of editing communities. (Creating qpm would take care of the para-Mongolian long codes independent of changes to proto-language codes.) - -sche (discuss) 23:17, 27 July 2023 (UTC)Reply[reply]

Blocking specific items on watchlist

My watchlist has become cluttered with certain topics. Is there a way that I can block items from a specific section of a specific page, say the German section of a specific entry? There are many pages, mostly user talk and WT discussion pages, I'd like to watch without having other items buried by too-frequent postings to a topic I do not much care about.

  • Is there some custom CSS or JS that I could insert that would suppress a specific section of a specific page or even just any section of any page that had a specific section name or a specific word in a section name?
  • Could I filter out contributions from certain users using similar means?
DCDuring (talk) 16:08, 17 July 2023 (UTC)Reply[reply]
@DCDuring I don't actually use Watchlists because they get too cluttered. With all the filters added, it's better but still not good enough IMO. However, you can subscribe or unsubscribe to specific topics in discussion pages; I don't know if that helps you. As for custom CSS e.g. to suppress entries related to specific users or specific keywords, User:This, that and the other or User:Erutuon any ideas here? It would seem that such functionality should be present in the Watchlist filters (e.g. filters on AWS consoles have these sorts of things and it's apparently part of a general widget library that AWS uses; not sure if it's Amazon-specific or also available in open source libraries like Bootstrap, but I'm sure it's been asked about a bunch of times on the Phabricator). Benwing2 (talk) 21:35, 17 July 2023 (UTC)Reply[reply]
I don't see how I could not watch BP, GP, ID, TR, RfVE, RfDE, RfDO. When I am watching the whole page, how would I unsubscribe to a specific section? That certainly isn't a visible option. DCDuring (talk) 22:03, 17 July 2023 (UTC)Reply[reply]
In Preferences, under Recent changes, but not under Watchlist, appears a checkbox "Group changes by page in recent changes and watchlist". Checking this box reduces clutter, but at a cost in performance (it must be a large JS script). I don't know whether it will prove adequate over time. DCDuring (talk) 03:07, 18 July 2023 (UTC)
(This is one of many reasons that a reorganization of our data into language-specific sub-pages might be worth the trouble of refactoring...) ‑‑ Eiríkr Útlendi │Tala við mig 21:34, 18 July 2023 (UTC)Reply[reply]
My problem is not with languages, but with discussions not about items in mainspace. DCDuring (talk) 22:20, 18 July 2023 (UTC)Reply[reply]

Miyako and other Ryukyuan romanization modules

(Notifying Eirikr, TAKASUGI Shinji, Atitarev, Fish bowl, Poketalker, Cnilep, Marlin Setia1, Huhu9001, 荒巻モロゾフ, 片割れ靴下, Onionbar, Shen233, Alves9, Cpt.Guapo, Sartma, Lugria, LittleWhole, Kwékwlos, Mellohi!): Where are those romaji modules for those languages? I need to change the transcription. Chuterix (talk) 19:54, 18 July 2023 (UTC)Reply[reply]

@Chuterix We need to have a proper discussion about how to romanise the Ryukyuan languages, because the transliteration module made by Huhu9001 doesn't work in a conventional way. Unless you're very confident with Lua, I'd advise that you explain what changes you want to make first. Theknightwho (talk) 21:12, 18 July 2023 (UTC)Reply[reply]
(Notifying Eirikr, TAKASUGI Shinji, Atitarev, Fish bowl, Poketalker, Cnilep, Marlin Setia1, Huhu9001, 荒巻モロゾフ, 片割れ靴下, Onionbar, Shen233, Alves9, Cpt.Guapo, Sartma, Lugria, LittleWhole, Kwékwlos, Mellohi!): Transcribe つ /tss/ as tsï, す /ss/ as , etc. Chuterix (talk) 21:29, 18 July 2023 (UTC)Reply[reply]
@Chuterix, as previously talked about, the various Ryukyuan languages do not have a romanization standard. If you are proposing to invent one, we need to have a lot more discussion about that first, before we encode that in a module. ‑‑ Eiríkr Útlendi │Tala við mig 21:38, 18 July 2023 (UTC)Reply[reply]
@Eirikr Ask @Kwékwlos Chuterix (talk) 21:39, 18 July 2023 (UTC)Reply[reply]
@Chuterix, why?
You are the one asking to encode an undefined non-standard romanization into our infrastructure modules -- not Kwékwlos. Why should I ask Kwékwlos anything? This is a non sequitur. ‑‑ Eiríkr Útlendi │Tala við mig 21:43, 18 July 2023 (UTC)Reply[reply]
@Eirikr: Proto-Ryukyuan (PR) *kuti, where @Kwékwlos transcribes futsï, but Japanese sites (sources?) transcribe /mya:kufutss/ as みゃーくふつ. Also examples are PR *moti (to hold) where @Kwékwlos transcribes 持つぃ (mutsï) and in PR *posi (star) where I believe @Kwékwlos transcribed fusï (previously fusu). Chuterix (talk) 21:50, 18 July 2023 (UTC)Reply[reply]
@Chuterix: Those you mentioned are manual transliterations. They are not generated by modules, nor will modifying modules affect them. -- Huhu9001 (talk) 01:20, 19 July 2023 (UTC)Reply[reply]
The problem I'm talking about is in Miyako. @Huhu9001 Chuterix (talk) 01:34, 19 July 2023 (UTC)
@Chuterix: Are you sure it should be kuzï instead of kudzï? I remember there is a /z/~/dz/ contrast in Miyako, as they use さ行 + handakuten to represent the former. -- Huhu9001 (talk) 01:46, 19 July 2023 (UTC)Reply[reply]
@Huhu9001 I believe so. (wait...) Chuterix (talk) 01:51, 19 July 2023 (UTC)Reply[reply]
@Chuterix: What is "so" and wait what? See this one: https://cir.nii.ac.jp/crid/1010576118584460173
I don't know who is correct. -- Huhu9001 (talk) 02:16, 19 July 2023 (UTC)Reply[reply]
Speaking of a standard, there are at least two major systems to write Ryukyuan pronunciation in kana other than classical Okinawan (Omoro) spelling, as far as I know.
  • A system used in 現代日本語方言大辞典 (1992): This system covers all Japonic dialects, but the spelling system depends on the font. The laryngeal sounds and unvoiced nasal are written the same as their normal equivalents, distinguished only by font type; I think that is user-unfriendly for the web environment.
  • Systems fixed by Okinawa prefecture (2022): Those systems are officially fixed by the local government for the 5 areas within the prefecture. They lack an orthography for the dialects of Amami, since those belong to Kagoshima prefecture rather than Okinawa.
--荒巻モロゾフ (talk) 02:06, 19 July 2023 (UTC)Reply[reply]
I also believe Omoro Soshi uses hiragana-only spelling, rather than okurigana like what JLect and @Kwékwlos are doing. @荒巻モロゾフ Chuterix (talk) 02:09, 19 July 2023 (UTC)
I also realized that Gendai Nihongo Hogen Daijiten is by the same author as the Ryukyuan dialects source that I mentioned in Wiktionary:About Proto-Japonic/references, Hirayama Teruo. Chuterix (talk) 02:11, 19 July 2023 (UTC)
@Eirikr Although it's undocumented, Huhu9001 has designed the Japanese transliteration module in a way that makes it very easy to modify it as needed for the Ryukyuan languages, which is really helpful. I'm oversimplifying, but the essence of it is that you only need to specify the differences from standard Japanese transliteration. This is done in a systematic way, so it means you can (for example) specify how certain letters should behave when geminated without having to do special logic for it. However, that does mean that we need to come up with standardised ways for transliterating the various Ryukyuan languages in the first place. Theknightwho (talk) 21:58, 18 July 2023 (UTC)Reply[reply]
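The "only specify the differences" design can be pictured as a per-language table layered over a shared base table. A Python sketch of that idea, not of the actual Lua in Module:ja-translit; the table names and contents are hypothetical toys, and the つ → tsï entry is only the romanization proposed in this thread, not an agreed standard.

```python
from collections import ChainMap

# Toy base table standing in for standard Japanese romanization.
BASE = {"つ": "tsu", "す": "su", "く": "ku"}

# Per-language data lists only what differs from the base.
MIYAKO_DIFF = {"つ": "tsï"}   # proposed in this thread, not a standard

# Lookups try the language-specific table first, then fall back to the base.
MIYAKO = ChainMap(MIYAKO_DIFF, BASE)


def toy_translit(text: str, table) -> str:
    """Romanize kana one character at a time; unknown characters pass through."""
    return "".join(table.get(ch, ch) for ch in text)


if __name__ == "__main__":
    print(toy_translit("くつ", BASE), toy_translit("くつ", MIYAKO))
```

With this layout a new language needs no logic of its own, only a short table, which is why agreeing on the romanization has to come before any module edits.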
(Notifying Eirikr, TAKASUGI Shinji, Atitarev, Fish bowl, Poketalker, Cnilep, Marlin Setia1, Huhu9001, 荒巻モロゾフ, 片割れ靴下, Onionbar, Shen233, Alves9, Cpt.Guapo, Sartma, Lugria, LittleWhole, Kwékwlos, Mellohi!): check Module:ja-translit/data/mvi and ; why is it not transcribing correctly? it's supposed to be kuzï not kuzi. Chuterix (talk) 22:26, 18 July 2023 (UTC)Reply[reply]
@Chuterix I would stop pinging people so much - you've pinged some people 3 times in this discussion. We need to have a full discussion before we can answer specifics like this. Theknightwho (talk) 22:56, 18 July 2023 (UTC)Reply[reply]
@Chuterix Also User talk:Chuterix#Block Chuterix (talk) 22:57, 18 July 2023 (UTC)Reply[reply]
@Chuterix I deleted Module:ja-translit/data/ams; you created it prematurely and it was causing errors. As a general rule, please don't make premature changes like this; discuss first. Benwing2 (talk) 23:06, 18 July 2023 (UTC)Reply[reply]
I will take this as a sign that Chuterix isn't confident enough with Lua to be making changes to that module. Theknightwho (talk) 23:08, 18 July 2023 (UTC)Reply[reply]
If I'm reading the timestamps correctly, Chuterix created the faulty Module:ja-translit/data/ams (which Benwing2 later deleted) almost one hour after my comment above about how “we need to have a lot more discussion about that [how to romanize] first, before we encode that in a module.” This again points to headstrong editing that deliberately ignores what other editors are saying. And this is after I applied a one-day block to Chuterix to try to get them to pay attention and slow down -- and stop pinging so much -- as discussed at User_talk:Chuterix#Block. @Chuterix, please take heed of what other editors are telling you. Be advised that any editor who engages in disruptive behavior, and who does not change that behavior after being advised about it, is subject to being blocked. ‑‑ Eiríkr Útlendi │Tala við mig 17:58, 19 July 2023 (UTC)Reply[reply]
the module only had a comment about orthography for amami southern. @Eirikr Chuterix (talk) 18:19, 19 July 2023 (UTC)Reply[reply]
Yes, but it caused a bunch of errors to happen, because the main module checks whether it exists. Creating a blank module like you did meant it registered as existing, but didn't have any of the things the main module actually needed for it to work. Theknightwho (talk) 19:20, 19 July 2023 (UTC)Reply[reply]
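The failure mode described here, a data page that exists but holds nothing usable, can be shown with a toy loader. This is a Python sketch of the general pattern only; the registry, the function name and the required key `kana_map` are hypothetical and do not reflect the real module's structure.

```python
# Hypothetical field the main module would expect in every data page.
REQUIRED_KEYS = ("kana_map",)


def load_language_data(registry: dict, code: str):
    """Return language data, None if no page exists, or raise if it is unusable."""
    data = registry.get(code)
    if data is None:
        return None                      # no page: fall back to defaults
    missing = [key for key in REQUIRED_KEYS if key not in data]
    if missing:
        # The page "exists", so the fallback is skipped, but it cannot be used.
        raise ValueError(f"data for {code} lacks {missing}")
    return data


if __name__ == "__main__":
    print(load_language_data({}, "ams"))             # no page at all
    try:
        load_language_data({"ams": {}}, "ams")       # blank page
    except ValueError as err:
        print("error:", err)
```

A missing page is harmless, whereas a blank one is treated as real data and fails, which is why a comment-only module broke things.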
@Chuterix: Sorry, I can barely understand what you are talking about. I suggest you first write a descriptive essay on the transliteration system of the language whose romaji you want to change, including what kana it uses, what sounds they represent and how they are written in Latin letters. It is important for us to grasp the whole picture so that we can have the work done with fewer problems. -- Huhu9001 (talk) 00:55, 19 July 2023 (UTC)

Creating an entry for Wish dot com

I've attested attributive usage of Wish.com (see Citations:Wish.com) but am unable to create an entry for it because the creation of pages containing ".com" is blocked. I presume this was necessarily implemented as an anti-spam measure. But there are always edge cases such as this. Could someone with the relevant know-how create an entry for Wish.com? I presume this would be done through Appendix:Unsupported titles but I don't want to try blindly mucking around there myself. WordyAndNerdy (talk) 14:51, 19 July 2023 (UTC)

I feel this entry should not be created. PUC – 14:54, 19 July 2023 (UTC)Reply[reply]
Technically any admin can create it, but I agree with PUC that this one is probably not worth creating. - TheDaveRoss 14:57, 19 July 2023 (UTC)Reply[reply]
I think it's fine - compare Tesco Value. Theknightwho (talk) 15:05, 19 July 2023 (UTC)Reply[reply]
Wish.com has been created as a blank page. Theknightwho (talk) 15:06, 19 July 2023 (UTC)Reply[reply]
Judging from the citations it would seem to meet CFI, given a reasonable definition, possibly non-gloss. DCDuring (talk) 15:32, 19 July 2023 (UTC)Reply[reply]
I believe it's roughly equivalent to figurative uses of pound-shop or dollar-store. I'm not set on any particular definition. The one on the Citations page can be treated as a rough draft or placeholder. I'd be fine with a non-gloss def. WordyAndNerdy (talk) 15:54, 19 July 2023 (UTC)Reply[reply]

There are now three entries for terms derived from this constructed language (dracarys, valar dohaeris, and valar morghulis). Would it be possible to add High Valyrian to the language database for the purposes of categorizing these entries? This was done with Dothraki (codified as "art-dtk") several years ago. WordyAndNerdy (talk) 19:37, 19 July 2023 (UTC)Reply[reply]

@WordyAndNerdy WT:CFI is rather vague about what the criteria are for including Appendix-only constructed languages; it just says "at the community's discretion". Given that we seem to include other languages related to well-known fantasy series (e.g. CAT:Na'vi language‎) it is probably fine to include this as well. Benwing2 (talk) 22:54, 19 July 2023 (UTC)Reply[reply]
Speaking of constructed languages, WT:CFI says only Esperanto, Ido, Volapük and Interlingua are allowed in the mainspace, but we also have CAT:Eskayan language. @-sche Do you know anything about this language? Presumably it should be moved to the Appendix? Benwing2 (talk) 23:38, 19 July 2023 (UTC)Reply[reply]
All three of the entries referenced above were created (and attested) as English terms. I'm not proposing adding HV entries to mainspace other than those that can be attested as English or another natural language. I'm suggesting HV be added to the language database so a category such as Category:English terms derived from Dothraki may be created. WordyAndNerdy (talk) 00:05, 20 July 2023 (UTC)
@WordyAndNerdy added with the code art-vlh. You can create the appendix Appendix:High Valyrian along the lines of Appendix:Dothraki if you like. This, that and the other (talk) 00:14, 20 July 2023 (UTC)Reply[reply]
Thanks! I see now that someone's created appendix entries for fiction-only Dothraki terms. Arakh is probably attestible as an English term at this point. There doesn't seem to be a single accepted term for sickle-like swords in fantasy media. Also, would Daenerys belong in Category:English terms derived from High Valyrian? This is established within the series as a given name derived from High Valyrian, but there isn't a canonical meaning/translation assigned to it, AFAIK. WordyAndNerdy (talk) 04:40, 20 July 2023 (UTC)Reply[reply]
Yes, I can't see why not. This, that and the other (talk) 05:43, 20 July 2023 (UTC)Reply[reply]
Can it accurately be described as having "derived from High Valyrian" when it hasn't been given a clear etymology in the real-world constructed language of High Valyrian? Its status as a High Valyrian name is in-universe information from the fictional world of ASOIAF/GOT (members of the Targaryen dynasty have names originating in Valyria with few exceptions). Not really sure how this particular case would fit under the WT:FICTION framework. WordyAndNerdy (talk) 06:18, 20 July 2023 (UTC)Reply[reply]
Hmm... IMO, if the books/show/conlang materials affirmatively say Daenerys is a High Valyrian-language name, then it seems correct to say it's "derived from High Valyrian Daenerys", even if there's no further "meaning" of the High Valyrian term we can relay beyond "it's a name"; that seems like a common situation with names. Lee says it's derived from Chinese 李 (Lǐ); it doesn't (probably could, but doesn't need to) say anything about what "李" means beyond being a surname. OTOH, if the books/etc don't actually say Daenerys is a Valyrian-language name, and we'd just be assuming based on Targaryen names usually being Valyrian, then ... we probably shouldn't assume. - -sche (discuss) 07:09, 20 July 2023 (UTC)Reply[reply]
@Benwing, re Eskayan: I added it as outlined at Wiktionary:Beer parlour/2015/August#Eskayan; I have no strong feelings about it but intuitively it feels different to me (having longstanding use by a particular people-group like a natural language) from everybody's dog's failed proposal for a world language, or recent commercial/media-franchise inventions like Dothraki. But that prior discussion was just three people, so likely we should just have a new discussion. - -sche (discuss) 07:09, 20 July 2023 (UTC)Reply[reply]
@-sche Probably so. I also don't feel strongly about this but IMO either we should move it to the Appendix or update the section of WT:CFI that says only Esperanto, Ido, Volapük and Interlingua get to go in the mainspace. (Personally I would rather move all conlangs other than Esperanto to the Appendix; I'm making an exception for Esperanto not because I have any partiality towards it but because it seems qualitatively different in its reach vs. all the others. But if we're allowing more than Esperanto in the mainspace I won't object to Eskayan being there if those in the know feel it belongs there.) Benwing2 (talk) 21:06, 20 July 2023 (UTC)Reply[reply]

Verb -ing accelerator is broken

It's creating "Participle" entries instead of the usual "Verb", and also putting them in a category that was deleted by vote a long time ago. See Category:English present participles for the list that will require fixing. Equinox 22:31, 20 July 2023 (UTC)Reply[reply]

@Equinox I changed this cross-linguistically to accord with how most languages work. Didn't realize English behaves differently; what POS category should it go in? Benwing2 (talk) 23:49, 20 July 2023 (UTC)Reply[reply]
Until now it has been like this (===Verb===, "English verb forms", rather than this). - -sche (discuss) 23:57, 20 July 2023 (UTC)Reply[reply]
The new "pattern" seems okay. Thanks. If I was supposed to see a message about this, I didn't. But I'm happy now. Cf. The Smiths. Equinox 04:54, 26 July 2023 (UTC)Reply[reply]
@Equinox Cool. Sorry I should have pinged you after I fixed the accelerator code and cleaned up the entries that got stuck into Category:English present participles. Benwing2 (talk) 06:52, 26 July 2023 (UTC)Reply[reply]
@Benwing2: I note that the Accelerator has also changed in what it just did with backsawn, creating a Participle header instead of Verb. Equinox 20:18, 11 August 2023 (UTC)Reply[reply]
@Equinox That was intentional; for most languages we prefer to mark participles with a Participle header to match the POS of the {{head}} template, and since backsawn is only a past participle (rather than a combination ed-form and past participle), I think it probably makes sense here. Benwing2 (talk) 21:52, 11 August 2023 (UTC)Reply[reply]

Script codes harmonized

I finished renaming the following script codes:

  • polytonic -> Polyt
  • Latinx -> Latnx
  • musical -> Music
  • Ruminumerals -> Rumin
  • IPAchar -> Ipach

With this change, all script codes are either of the form Xxxx (for ISO 15924 script codes) or Xxxxx (for Wiktionary-invented codes), or have a language prefix added to one of these codes (e.g. fa-Arab). Currently there are seven Wiktionary-invented codes of the Xxxxx format: the five above as well as Morse (= Morse code) and Semap (= flag semaphore). For the moment, the old script codes still work in that Module:scripts will accept the old names and automatically convert to the new ones, and MediaWiki:Common.css and MediaWiki:Mobile.css recognize both old and new names as CSS classes. I have tracking set up for any uses of the old names that go through Module:scripts (which means unfortunately that uses in the *-lite templates won't get tracked). After a while I will change Module:scripts to throw an error if it sees the old names, suggesting that the appropriate new names be used instead, and sometime after that (maybe well after) I will remove that special-handling code so you just get an "unrecognized script" error upon using the old names.

The first two on the list (Polyt and Latnx) are the biggies that are used in many places; the remainder are hardly used. If you see any new font-related weirdness esp. related to polytonic Greek characters or Latin characters with diacritics on them, please let me know. Benwing2 (talk) 07:23, 21 July 2023 (UTC)Reply[reply]
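The transitional behaviour described above (old names accepted and converted, with their uses tracked) can be sketched roughly as follows. This is only an illustration in Python, not the actual Lua code in Module:scripts; the function name and the `tracked` list are hypothetical, and only the old/new code pairs come from the list in this thread.

```python
# Old-to-new script code pairs, taken from the list in this thread.
RENAMED_SCRIPT_CODES = {
    "polytonic": "Polyt",
    "Latinx": "Latnx",
    "musical": "Music",
    "Ruminumerals": "Rumin",
    "IPAchar": "Ipach",
}


def normalize_script_code(code, tracked=None):
    """Return the new code for an old name; any other code passes through unchanged.

    `tracked` is a hypothetical list standing in for the tracking of old-name
    uses mentioned above; each old name that is seen gets appended to it.
    """
    new_code = RENAMED_SCRIPT_CODES.get(code)
    if new_code is None:
        return code  # already a current code (or unknown); leave it alone
    if tracked is not None:
        tracked.append(code)
    return new_code
```

In a later phase, the same lookup could raise an error that names the replacement code instead of converting silently.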

Chapter translator in {{quote-book}}

Something I'm surprised isn't already possible (given all the options for other stuff!) is the ability to designate a translator, or indeed an editor, for a specific chapter in {{quote-book}}. This is relevant in edited volumes where a specific contribution might have its own translator independently of the book as a whole—and is reasonably important since the translator in that case is the one who's actually responsible for the quoted text. As it stands, specifying a chapter and a translator will get you something like "2000, Joe Bloggs, 'A chapter', in Mary Bloggs, transl., Book". With a chapter-specific translator this should be "2000, Joe Bloggs, Mary Bloggs, transl., ...". Would this be worth implementing? Alternatively maybe the second format should just be presumed anyway, since it applies just as well where the translator's rendered the entire book? —Al-Muqanna المقنع (talk) 16:30, 21 July 2023 (UTC)Reply[reply]

@Al-Muqanna I support this. There was also a request (by User:GianWiki?) to support the ability to add per-chapter transliterations and such. IMO, however, if we do this we should rename the relevant parameters to be more standard. For example, there is currently a |trans= param for the book-level translator, but a |trans-chapter= param for a translation (i.e. gloss) of the chapter name, so we can't call the chapter translator |trans-chapter=, as you might expect. IMO the chapter gloss should be something like |chapter-t= or |chapter-gloss=, the chapter translit should be |chapter-tr= and the chapter translator should be maybe |chapter-trans=. Technically |chapter-trans= doesn't conflict with |trans-chapter= but having both with different meanings would be very confusing so I'd recommend renaming the existing uses of |trans-chapter=. Maybe User:Sgconlaw can comment, as they are the expert on these templates. Benwing2 (talk) 02:00, 22 July 2023 (UTC)Reply[reply]
@Benwing2, Al-Muqanna: I don't have any objection either for a parameter for the translator for a chapter, but am wondering whether it is a good idea to name it |trans-chapter= and then rename the parameter for translations of chapter names to something else, if that's what the proposal is. If we did do that, then we'd have to think about renaming the parameter for translations of titles as well, as it's currently |trans-title= (or |trans-journal= for journals). — Sgconlaw (talk) 05:38, 22 July 2023 (UTC)Reply[reply]
@Sgconlaw So my plan is to name it |chapter-trans= (in that order) for the chapter translator and rename |trans-chapter= to avoid confusion. In addition I'd like to support glosses and transliterations of titles and journal names, with more standard names like |title-t=/|title-gloss= and |title-tr= (or |journal-t=/|journal-gloss= and |journal-tr=). Presumably it doesn't make sense to have a translator for titles but it might for journals if they're part of a series. Alternatively we could rename |trans= to something else like |tlat= or |tlator=. My main concern is the confusion caused by having |trans= mean two different things in different param names as well as the fact that |trans= is not used anywhere else and is ambiguous between "translator", "translation" and "transliteration", so it might make sense to adopt |tlator= anyway. Then we'd have something like the following:
  1. |tlator= for the overall translator (or something shorter e.g. |tlr=?)
  2. |t= for the gloss/translation of the text/passage in question
  3. |tr= for the manual transliteration of the text/passage in question
  4. |1= for the lang code of the text/passage in question, if it's not in English, so it can be auto-transliterated
  5. |chapter-tlator= for the chapter translator (or maybe shortened to |chapter-tlr=)
  6. |chapter-t= for the gloss/translation of the chapter name
  7. |chapter-tr= for the manual transliteration of the chapter name
  8. |chapter-lang= for the lang code of the chapter name, if it's not in English, so it can be auto-transliterated
  9. |section-tlator= for a section translator (or maybe shortened to |section-tlr=)
  10. |section-t= for the gloss/translation of the section name
  11. |section-tr= for the manual transliteration of the section name
  12. |section-lang= for the lang code of the section name, if it's not in English, so it can be auto-transliterated
  13. |title-t= for the gloss/translation of the book title
  14. |title-tr= for the manual transliteration of the book title
  15. |title-lang= for the lang code of the book title, if it's not in English, so it can be auto-transliterated
Generally I want to keep the params as consistent as possible both with each other and with the way params work in other templates. Benwing2 (talk) 05:56, 22 July 2023 (UTC)Reply[reply]
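The regularity of the naming scheme proposed in the numbered list can be made explicit with a small sketch. It is an illustration only: the helper name is hypothetical, it uses the shorter "tlr" spelling floated as an alternative in the list, and it is not part of any existing module.

```python
# Suffixes and their meanings, following the numbered list above.
SUFFIXES = {
    "tlr": "translator",
    "t": "gloss/translation",
    "tr": "manual transliteration",
    "lang": "language code (enables auto-transliteration)",
}


def param_name(element, suffix):
    """Build a parameter name such as 'chapter-tr'.

    An empty `element` stands for the quoted passage itself, whose parameters
    are bare (|t=, |tr=, |tlr=) and whose language code is positional (|1=).
    Note that the list above proposes no translator parameter for titles.
    """
    if suffix not in SUFFIXES:
        raise ValueError(f"unknown suffix: {suffix}")
    if not element:
        return "1" if suffix == "lang" else suffix
    return f"{element}-{suffix}"
```

For example, `param_name("chapter", "tr")` gives `chapter-tr`, and `param_name("", "lang")` gives `1`.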
@Benwing2: that looks good. Personally, I prefer "tlr" to "tlator". — Sgconlaw (talk) 21:39, 22 July 2023 (UTC)Reply[reply]
@Sgconlaw, Al-Muqanna, GianWiki, RichardW57 I have implemented foreign-script handling for authors, titles and chapters in my sandbox module. See User:Benwing2/test-quote for some examples. You can specify the language by prefixing the author, title or chapter with a language code followed by a colon. If you do this, you get proper script handling and transliteration. If you don't do this, it still tries to figure out the correct script and do the right thing with that script, but it may not get certain language-specific handling correct (e.g. it won't be able to detect Persian vs. Arabic and use the special Persian version of the Arabic script), and it won't do auto-transliteration. The relevant params are named e.g. |title= (the text itself); |title-tr= (the translit); |title-ts= (the transcription); |title-sc= (the script, if not detected properly); |title-t= or |title-gloss= or (for compatibility) |trans-title= (the gloss/translation); and |title-lang= (an alternative to prefixing the title with the lang code). I am not completely sure of the correct formatting when both translit and gloss are present; currently they aren't clearly distinguished and both show up within the brackets following the text. Note that |first= and |last= are still supported as an alternative to |author= but don't come with any foreign-script support; it is better to use |author=. Only |title=, |author= and |chapter= currently have foreign-script support, but it's not hard to add it to other params (let me know what other params are so deserving). Chapter translator support (|chapter-tlr=) is not there yet but is coming. Benwing2 (talk) 07:50, 28 July 2023 (UTC)Reply[reply]
They may be handled automatically, but |title2=, |2ndauthor= and |chapter2= would also need handling. |section= will need the same support as |chapter=. |author2= etc. need the same treatment as |author=, and I don't think editors should be overlooked either. --RichardW57 (talk) 22:29, 28 July 2023 (UTC)Reply[reply]
@RichardW57 Thanks. I am adding this support now. I wasn't sure about |section= but it does look like it needs handling this way. I am also doing this for translators, editors, quotees, coauthors, etc. What about |other=, |others=, |quoted_in=, |publisher=, |city=, |location=, |original=, |by=? I assume yes but I'm not sure what all of these are for. What about |laysummary= and |laysource= (I have no idea what the purpose of these is)? How about |genre=, |format=, |edition=, |volume=, |volume_plain=, |series=, |seriesvolume=? Maybe no on these latter ones? I also need to add some hacks to handle more special-purpose params like |journal= (which don't occur in Module:quote but do occur in the template wikicode of specific {{quote-*}} templates). Benwing2 (talk) 22:44, 28 July 2023 (UTC)Reply[reply]
BTW now that I think about it, it may make more sense to implement this exclusively using inline modifiers; it's going to be painful to handle all the stuff like |blog=, |site= and |work= (synonyms for |title= used specifically in {{quote-web}}) any other way. Benwing2 (talk) 22:51, 28 July 2023 (UTC)Reply[reply]
For the latter ones, I would say that |page= had a higher claim. Not all of us read Burmese numbers fluently. There are some nasty subtleties - not all of us twigged that Thai ฉบับปรับปรุงครั้งที่ did not mean 'first edition', but first revision, i.e. second edition. --RichardW57 (talk) 23:01, 28 July 2023 (UTC)
@Al-Muqanna, RichardW57, GianWiki I switched title, chapter, author to use inline modifiers (although as a special case, |trans-title=, |trans-chapter= and |trans-author= are still supported for backward compatibility) and added the same treatment for all the other params mentioned above except the "latter" ones. I also added a chapter translator under |chapter_tlr=; if there is a "republished in" second set of params, the chapter translator for the republished/etc. book is |chapter_tlr2=. There's an example of both |chapter_tlr= and |chapter_tlr2= in User:Benwing2/test-quote along with examples using inline modifiers. I'd appreciate it if people could create some complex examples using non-English params, so we can test that everything is working before I push this live. Also, User:Sgconlaw I guessed about the significance of the |other= and |laysource=/|laysummary=/|laydate= params, which aren't documented anywhere that I can see; maybe you can document what they're supposed to do so I have a better idea of whether they can contain foreign text. Benwing2 (talk) 07:53, 29 July 2023 (UTC)Reply[reply]
Thank you very much for all your work —— GianWiki (talk) 12:19, 29 July 2023 (UTC)Reply[reply]
@Benwing2: Where are the 'inline modifiers' documented? I couldn't find anything by looking at the display of {{quote-book}}. RichardW57m (talk) 10:30, 11 August 2023 (UTC)Reply[reply]
@RichardW57: I'm working on this; just last night e.g. I was doing a bunch of work rewriting the {{quote-*}} documentation. It's a bit more work than it would normally be because there are 12 such templates and I don't want to have to manually copy the text to all 12 doc pages. In the meantime, the modifiers currently supported are <t:...>, <gloss:...> (alias of <t:...>), <tr:...>, <ts:...> and <sc:...>. Benwing2 (talk) 21:55, 11 August 2023 (UTC)Reply[reply]
@Benwing2: Thank you; I think the set was begging for an easier to maintain unified set of documentation. I feared you thought you had provisionally finished the task. I had suspected that the language of the section name or whatever might be one of the parameters.
As |gloss= as a synonym of |t= is deprecated, should you be adding it to something new like inline parameters? --RichardW57 (talk) 09:13, 12 August 2023 (UTC)Reply[reply]
@RichardW57 Hmm, I will remove <gloss:...>. The language of the text in question *IS* specifiable, you just prefix the text with the language code, like this: ru:Баллада о королевском бутерброде<t:Ballad of the King's Bread>. Sorry, forgot to mention this. Benwing2 (talk) 20:20, 12 August 2023 (UTC)Reply[reply]
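For readers unfamiliar with the syntax, here is a rough sketch of how a value with a language prefix and inline modifiers could be split apart. It is an assumption-laden illustration, not the parser used by Module:quote: the regular expressions, the function name and the shape of a valid language code are all guesses, and `<gloss:...>` is left out since it is slated for removal.

```python
import re

# Modifiers named in this thread, minus <gloss:...>.
KNOWN_MODIFIERS = {"t", "tr", "ts", "sc"}

# <key:value> with no nested angle brackets (a simplifying assumption).
MODIFIER_RE = re.compile(r"<([a-z]+):([^<>]*)>")
# A leading language code such as "ru:"; the exact shape of valid codes is
# assumed, and a following space is rejected to avoid matching ordinary prose.
LANG_PREFIX_RE = re.compile(r"^([a-z]{2,3}(?:-[a-z]{3})*):(?! )")


def parse_inline(value):
    """Split 'lang:text<mod:value>...' into language, text and modifiers."""
    modifiers = {}
    for key, val in MODIFIER_RE.findall(value):
        if key not in KNOWN_MODIFIERS:
            raise ValueError(f"unknown inline modifier: {key}")
        modifiers[key] = val
    text = MODIFIER_RE.sub("", value).strip()
    lang = None
    match = LANG_PREFIX_RE.match(text)
    if match:
        lang = match.group(1)
        text = text[match.end():]
    return {"lang": lang, "text": text, "modifiers": modifiers}
```

Applied to the example above, this yields the language code `ru`, the Russian title as the text, and a single `t` modifier holding the English gloss.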
@Benwing2 The (partially?) consolidated documentation seems a lot easier to use. Thank you.
You have picked up some outdated documentation for |1=. On its own, it does not prompt the generation of "(in LANG)", which rather pertains to |worklang=. I think there was a brief time, prior to the introduction of |worklang=, when that documentation was true. Of course, it might be that you are holding fire on fixing this. I remember the documented logic led to a brief time when we were claiming that Michael Everson had written some Unicode proposals in Pali! --RichardW57m (talk) 10:05, 18 August 2023 (UTC)Reply[reply]
@RichardW57 Yes, the documentation isn't done. I still need to document inline modifiers, for example, and change all the templates to use the new Module:quote doc; so far only {{quote-book}} and {{quote-journal}} are using it. As for |1=, I will fix things so that if all three of |worklang=, |termlang= and |1= are different, it displays both |worklang= and |1=, and document accordingly. Benwing2 (talk) 20:18, 18 August 2023 (UTC)Reply[reply]
@Benwing2: Here are my interpretations of some of the parameters. I'd held off hoping someone more knowledgeable would comment.
|genre= is surely meant to be English, as documented forms include 'fiction' and 'non-fiction', though we may see some barely adopted terms.
|format= is almost surely English. We have examples of 'paperback' and 'hardback', but I think I've seen 'PDF' used.
|series= is surely going to be like |title=; it is after all, the title of a series of books.
As |edition= takes text and documentation gives "3rd corrected and revised", we should expect foreign language text to show up. In fact, I should probably include the Thai language edition identification in {{R:nod:MFL}} to dispel doubt as to what issue the page numbers refer to. (And some entries appear on significantly separated pages!) I do give a date, but 'when all else fails, read the manual' applies.
|volume= expects a number, but the earlier remarks on page apply. The same applies to |plain_volume= and |series volume=, where I think additional inline qualifiers start to look attractive and the law of diminishing returns is cutting in.
I think |city= is an obsolete synonym of |location=; it's not documented for {{quote-book}}. I think we could do with a check on the validity of parameters; one can't tell why a parameter doesn't affect the display - dismissed as excess detail, or a mistyped parameter name?
|original= seems well-documented to me, and mutatis mutandis should take the same immediate variants or supplementary parameters as |title=. However, |original2= seems an oxymoron.
|others= is documented for {{quote-book}} under |quoted in=, and looks like designer fatigue. It may need to be translated, but I've been conveying the information using |newversion=, which parameter I believe should itself take English text. I'd be inclined to discontinue support for it.
This is definitely something that has just growed. --RichardW57 (talk) 09:50, 29 July 2023 (UTC)Reply[reply]
@Al-Muqanna, RichardW57, GianWiki I pushed all these changes live (including support for normalization in {{quote-*}} templates). Benwing2 (talk) 05:31, 1 August 2023 (UTC)Reply[reply]
Thanks. Now to consolidate - most securely done by reverting a change by @Fenakhay and then further edits. --RichardW57 (talk) 07:42, 1 August 2023 (UTC)Reply[reply]

(Notifying Eirikr, TAKASUGI Shinji, Atitarev, Fish bowl, Poketalker, Cnilep, Marlin Setia1, Huhu9001, 荒巻モロゾフ, 片割れ靴下, Onionbar, Shen233, Alves9, Cpt.Guapo, Sartma, Lugria, LittleWhole, Kwékwlos, Mellohi!): Why are the categories broken? Chuterix (talk) 03:31, 23 July 2023 (UTC)Reply[reply]

@Chuterix: Why ping all the Japanese editors about a technical issue that has nothing to do with Japanese? There was nothing wrong with the categories, but something about your template syntax was messing up the transclusion. I gave the template some dummy parameters wrapped in noincludes and wrapped the real parameters in includeonlys as a preliminary, but that unexpectedly seems to have fixed the problem all by itself. I don't know enough about template syntax and transclusion to say for sure what happened; it may even have had nothing to do with my edit at all, except as the equivalent of a null edit. Chuck Entz (talk) 05:24, 23 July 2023 (UTC)
On further thought, perhaps your moving the template and its documentation page without editing the template meant that the template was still transcluding the old location of the documentation subpage, which had been replaced by a redirect when you moved it. My edit would have forced the system to update all the internal references. Chuck Entz (talk) 05:36, 23 July 2023 (UTC)Reply[reply]
@Chuterix You have been asked multiple times not to do these mass pings, as they're likely to just annoy people, and also mean it's harder to tell what's important and what isn't. Please stop. Theknightwho (talk) 16:12, 23 July 2023 (UTC)Reply[reply]
OK. Chuterix (talk) 16:13, 23 July 2023 (UTC)Reply[reply]

Multiword terms and senseid

Do these work together? I can't figure out how to do it. The word in question is Northern Thai ย้อนว่า (because), for which my quotation source has line-breaking between the constituent morphemes. For the headword, {{head|nod|conjunction|[[ย้อน]][[ว่า]]}} gives separate links to the two constituents, but I have distinguished the two etymologies for the first word by etymids. I want clicking the first part to definitely initially take one to the correct etymology.

Also, am I missing a trick for showing the line-breaking opportunity? The quotation source carefully marks up acceptable linebreaks, something that is fairly rare in the Thai script. A naive Northern Thai syllable boundary detector knowing only phonology could wrongly find the split ย้อ|นว่า. I can't use {{compound}} because this is not a single word, but an idiomatic phrase. --RichardW57 (talk) 13:31, 23 July 2023 (UTC)Reply[reply]

Marked up display text with <nowiki> to show actual text! --RichardW57 (talk) 13:40, 23 July 2023 (UTC)Reply[reply]
@RichardW57 --
If you're working with entire etym sections, use {{etymid}} instead of {{senseid}} -- the latter is best suited to single senses, rather than etym sections.
For instance, Japanese terms spelled in kanji often have multiple different readings (pronunciations), themselves with separate etymologies and other lexical details, so I make use of {{etymid}} with some frequency. I added a couple just a bit earlier to the 愚者 entry, which covers three terms -- gusha (derived from Chinese), oremono (from Japonic roots, borderline obsolete), and orokamono (from Japonic roots, still current).
In your example, I just added two etymids to the Thai ย้อน (yɔ́ɔn) entry, so to leverage these in your link to make sure that the first term in the compound links through to the first etym section, I'd use this syntax:
  • {{head|nod|conjunction|[[ย้อน#Thai:_return|ย้อน]][[ว่า]]}}
Note that it is strongly advised not to use numbers for etymids -- etym sections move around as the entries are edited, so some relatively obvious non-numeric identifier is preferable. For example, for Japanese entry etymids, I'll use the reading in romanization for kanji headwords, and the main gloss for kana headwords -- I used the gloss just now for that Thai entry for purposes of this illustration (please adjust as appropriate).
HTH! ‑‑ Eiríkr Útlendi │Tala við mig 21:47, 25 July 2023 (UTC)Reply[reply]
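The link in the example above follows a "Language:_id" pattern for the fragment. A minimal sketch of building such a link is given here; the pattern is inferred from that single example rather than from documentation, so treat it as an assumption, and the function name is hypothetical.

```python
def etymid_link(page, language_name, etymid, display=None):
    """Build a wikilink targeting an etymid anchor on `page`.

    Assumes the fragment has the form 'Language: id' with spaces turned into
    underscores, as in the example link above; the convention may differ or change.
    """
    fragment = f"{language_name}: {etymid}".replace(" ", "_")
    return f"[[{page}#{fragment}|{display or page}]]"
```

With the values from the example, `etymid_link("ย้อน", "Thai", "return")` reproduces the first link inside the {{head}} call.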
@Eirikr: Thanks. I had contemplated using that structure, but was worried that @Theknightwho might reasonably think that that undocumented naming convention could freely be changed. As you probably saw, I have been using {{etymid}}; I said {{senseid}} because that's what most of the documentation talks about. I also think etymids will be more stable than senseids, especially for new or sketchy entries, where there can be a lot of later refinement. (Well, I hope so.) --RichardW57 (talk) 23:05, 25 July 2023 (UTC)
@Eirikr, Benwing2, Theknightwho Actually, it looks as though the convention for fragment ID's might be fixed by the documentation for argument id of function language_link exported from Module:links were it correct, so my worry may be misplaced. (The bold text should also be taken as a bug report on Module:links, for misreporting the behaviour of function anchor exported from undocumented Module:senseid.) Alternatively, we need a template to follow any changes in convention - does one already exist? --RichardW57m (talk) 09:12, 26 July 2023 (UTC)Reply[reply]

Recent changes by content language

Special:RecentChanges is quite advanced, with lots of filtering options for namespaces, tags etc. Is it also possible to get a version of RecentChanges for only Swedish words, i.e. for pages containing "==Swedish=="? (Or could tags be used for this?) -- LA2 (talk) 13:56, 23 July 2023 (UTC)Reply[reply]

You can use the "Show changes on pages linked from" box and enter any category, e.g. the one for Swedish entries. Vininn126 (talk) 14:00, 23 July 2023 (UTC)
Thank you! Yes, of course, I should have known this. --LA2 (talk) 15:43, 23 July 2023 (UTC)Reply[reply]
No worries, the website is big and complicated, glad to help. Happy editing! Vininn126 (talk) 15:44, 23 July 2023 (UTC)Reply[reply]
Of course, this doesn't show many Swedish entries that have problems with headword templates, so it's not complete. It's better than nothing, though. Chuck Entz (talk) 15:51, 23 July 2023 (UTC)

NFC v. SoP

(Wrong forum - moved to Beer Parlour.)

pedocon

I've tried to create an entry for "pedocon" (Internet slang for conservatives whom the speaker suspects of being pedophiles, or at least of tacitly supporting pedophiles), but I keep running into edit filters (either anti-vandalism or anti-spam). What do I do? 93.72.49.123 13:36, 24 July 2023 (UTC)Reply[reply]

If you have three valid (i.e. other than Reddit and Twitter) citations you can provide I'll create it for you. - TheDaveRoss 13:46, 24 July 2023 (UTC)Reply[reply]
Why are Reddit and Twitter invalid? 93.72.49.123 13:50, 24 July 2023 (UTC)Reply[reply]
WT:CFI, specifically the section on attestation. Twitter and Reddit have not been agreed upon as acceptable, durably archived media. And based on recent events with each of those their durability is more in question than ever. - TheDaveRoss 13:54, 24 July 2023 (UTC)Reply[reply]

Request to adjust protection on discussion pages

Can an admin please adjust the move permissions on the newly-created August discussion pages, specifically Wiktionary:Grease pit/2023/August, Wiktionary:Tea room/2023/August, and Wiktionary:Information desk/2023/August? I wrote a script to automatically generate future discussion pages near the end of each month, but I'd prefer to run it from my bot account instead of my user account. If it's possible to add "bots" to the allowed users, that would be ideal. If not, can we either drop the move protection or, if the protection is important, grant Template editor permission to the bot account User:AutoDooz? Thanks! JeffDoozan (talk) 12:53, 25 July 2023 (UTC)

@JeffDoozan I added the extended mover permission to AutoDooz, does that enable moving these semi-protected pages? - TheDaveRoss 17:22, 25 July 2023 (UTC)Reply[reply]
@TheDaveRoss: No, unfortunately, I still get a "page locked" error when trying to use the bot to move the protected pages. JeffDoozan (talk) 17:27, 25 July 2023 (UTC)Reply[reply]
@JeffDoozan I gave your bot Template Editor permission, which is also what my bot has (so I can edit templates and modules with the bot). Benwing2 (talk) 06:30, 26 July 2023 (UTC)Reply[reply]
Thanks, Ben. The bot's scheduled to create the discussion pages on the 25th of every month, which should give human editors time to notice if something goes wrong and the pages need to be manually created, while also being close enough to the end of the month to give editors time to add or remove themselves from the current watchlist. Previously, all the pages were created en masse once a year, so everyone watching the page in December was added to the watchlist of all of next year's pages, but new users had to manually add themselves to the watchlist every month for the rest of the year. JeffDoozan (talk) 14:56, 26 July 2023 (UTC)

SIM карта

Hi, I've just created the Bulgarian entry for SIM card - SIM карта (SIM karta). I got a warning telling me that I'm mixing Latin and Cyrillic characters, which - while true - is not wrong in this case. There is a small number of Bulgarian compound words where one of the constituents is written in the Latin alphabet. For example:

The warning suggested that I discuss my edit on here. Personally, given the rarity of this use case, and the fact that the warning is non-blocking, I'm fine with the status quo. What I don't like are the bogus categories that got created because of the "SIM" part, e.g. Bulgarian terms spelled with I. Is there a way to suppress those?

Thanks,

Chernorizets (talk) 03:57, 26 July 2023 (UTC)Reply[reply]

@Chernorizets I don't think you need to worry about this - the warning tells people to come here because people might not realise that Latin and Cyrillic letters that look identical are encoded differently.
I'm not sure I understand your last point - Category:Bulgarian terms spelled with I is a completely valid category. When you say "there is a small number of Bulgarian compound words where one of the constituents is written in the Latin alphabet", that's the kind of term that categories like that are designed to contain. Theknightwho (talk) 04:14, 26 July 2023 (UTC)Reply[reply]
@Theknightwho prior to me adding SIM карта, the only other "spelled with" subcategories under the parent Category:Bulgarian terms by their individual characters were for a couple of obsolete letters and a couple of non-alphabetic characters (as well as "E", which looks like a mistake I'll go fix). Is the intent of such categories to cover terms with "non-standard" letters w.r.t. the official alphabet of the language? If so - great, although IMO that's not obvious from the names of those categories, or from their descriptions. Chernorizets (talk) 04:31, 26 July 2023 (UTC)Reply[reply]
@Chernorizets Yes, that's exactly right. Compare Category:Russian terms by their individual characters, which has a spread of Latin and obsolete Cyrillic letters. Theknightwho (talk) 04:34, 26 July 2023 (UTC)Reply[reply]
Thanks @Theknightwho! I'd recommend that those category descriptions, in general, make it more explicit that they're intended solely for characters not in the language's current standard alphabet. Otherwise, I can see this question popping up again down the line.
Cheers,
Chernorizets (talk) 04:39, 26 July 2023 (UTC)Reply[reply]

{quote-foo} syntax changed?

I've just had a look at a couple of entries (pin curl, geezer teaser) on which I used {{quote-web}} and {{quote-journal}} without named parameters. E.g.:

  • {{quote-journal|[lang]|[year]|[author]|[title]|[journal]|[url]|[page]|[text]|[translation]}}

These worked (or seemed to) yesterday, but now the quoted text doesn't appear. When I added text= on pin curl, the quotations showed up as before. Is this a bug or an intended change that hasn't been documented yet? Cnilep (talk) 05:42, 26 July 2023 (UTC)Reply[reply]

@Cnilep This is a bug. I made some changes to the underlying code and seem to have broken this. Let me take a look. Benwing2 (talk) 06:28, 26 July 2023 (UTC)Reply[reply]
Can we avoid/deprecate these long chains of unnamed parameters? They're bound to break eventually, and I've seen many half-displayed instances which needed fixing. Jberkel 17:34, 26 July 2023 (UTC)Reply[reply]
@Jberkel Yes, I agree. At some point I'm going to rename the translator-related parameters in the quote templates and I may see about renaming the numeric params at the same time. Benwing2 (talk) 19:23, 26 July 2023 (UTC)Reply[reply]
While that is not obviously the wrong thing to do, it is not obviously the right thing to do, either. Pro: Avoiding unnamed parameters makes this sort of problem less likely. Con: Human-readable parameters make the code a bit heavier, and more of a pain to add by hand; arbitrary (less human-readable) parameters could be shorter, but more difficult to use, especially for newer editors. Both removing and keeping unnamed parameters have merit, and neither is obviously superior. Since it's a matter of preference, discussion toward achieving consensus seems in order. Cnilep (talk) 01:11, 27 July 2023 (UTC)
One suggestion (or set of them):
  • Keep the positional params to avoid breaking older instances of template use in the wikicode.
  • Add names for those positional params.
    • Provide longer-form names that are human-readable and obvious -- this supports newbies and anyone worried about wikicode legibility.
    • Also provide aliased short-form names that are guessable enough for experienced editors -- this supports power users and makes for more-compact code that is less onerous to type in by hand.
‑‑ Eiríkr Útlendi │Tala við mig 17:06, 27 July 2023 (UTC)Reply[reply]
@Eirikr, Cnilep This is already the case. I wrote a script (but haven't run it yet) to rename numbered params, and e.g. for {{quote-book}} the mapping is as follows:
      move_params([
        ("8", "t"), ("7", "text"), ("6", "page"), ("5", "url"), ("4", "title"), ("3", "author"), ("2", "year")
      ])
The longest named param is author at only 6 chars. For the other quote templates, it's similar. The longest name occurs in {{quote-video game}} where |5= is named |platform=. The only case that has several relatively long such params is {{quote-hansard}}:
      move_params([
        ("10", "t"), ("9", "text"), ("8", "column"), ("7", "page"), ("6", "url"), ("5", "report"), ("4", "debate"), ("3", "speaker"), ("2", "year")
      ])
However, having 10 numbered params is incredibly error-prone, esp. since the significance of the numbers varies from template to template, and there are 12 templates. Few people are going to be able to keep this straight, and there are in fact tons of existing errors due to numbered params. We could provide shorter aliases of params over say 5 chars if this would help, but I strongly believe we should not encourage people to use these numbered params and that in fact we should deprecate them. Note that most quote templates have several params and take significant work to enter, and in comparison the work required to type the names is very small (and while they may slightly increase the size of the wikicode, this is negligible). Benwing2 (talk) 21:45, 27 July 2023 (UTC)Reply[reply]
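To make the renaming concrete, here is a minimal sketch of what applying such a mapping to one template call's parameters might look like. It is not the script mentioned above (whose `move_params` internals are not shown in this thread); the function name and the conflict check are assumptions, and only the {{quote-book}} mapping itself is taken from the post.

```python
# Mapping for {{quote-book}}, copied from the post above.
QUOTE_BOOK_MAPPING = [
    ("8", "t"), ("7", "text"), ("6", "page"), ("5", "url"),
    ("4", "title"), ("3", "author"), ("2", "year"),
]


def rename_numbered_params(params, mapping):
    """Return a copy of `params` with numbered keys replaced by their names.

    `params` maps parameter names (as strings) to values for one template call.
    Raising on a clash between a numbered and a named param is an assumed
    safety check, so that no value is silently overwritten.
    """
    renamed = dict(params)
    for old, new in mapping:
        if old not in renamed:
            continue
        if new in renamed:
            raise ValueError(f"both |{old}= and |{new}= are present")
        renamed[new] = renamed.pop(old)
    return renamed
```

Parameters not listed in the mapping, such as |1= for the language code, are left untouched.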
Excellent, thank you for the explanation. Over the years, I've seen some unfortunate changes to templates in the JA space that involved aggressively short parameters with no documentation and sometimes confusing usage (such as one template using named parameter ko to indicate the "kan'on" reading type and another template using that same parameter name to indicate a combined "kun + on" reading). Very happy to read that usability and longer-term maintainability are considerations here. 😄 ‑‑ Eiríkr Útlendi │Tala við mig 23:15, 27 July 2023 (UTC)Reply[reply]

Better internal permalinks

Here is a "permalink" deep down in some page,

Well it won't be so permanent once one day a new section is added above it.

And yes, it was never intended to be permanent in the first place.

So some permalink scheme should be attempted.

Maybe like #Spanish_Suffix . Jidanni (talk) 10:18, 26 July 2023 (UTC)Reply[reply]

@Jidanni: Use {{etymid}} or {{senseid}}. —Al-Muqanna المقنع (talk) 10:43, 26 July 2023 (UTC)Reply[reply]
Thanks. But I am talking about people who want a permanent link to something, without having to edit anything to get it. Jidanni (talk) 00:08, 27 July 2023 (UTC)Reply[reply]
@Jidanni This is hard because there may be more than one header of the same name for a given language, and most other aspects of the wikicode are liable to change over time. Benwing2 (talk) 00:17, 27 July 2023 (UTC)Reply[reply]
@Jidanni, ya, as @Benwing2 describes, there is no clean means of automatically and algorithmically generating unique link anchors that are also human-readable.
We could use some kind of infrastructure (modules and templates, perhaps) to automatically generate globally unique identifiers (GUIDs) for anything and everything that someone might want to link to. However, these will consist of unwieldy, long strings of randomized alphanumerics and hyphens, things like 50025ae1-ef80-4542-a061-418e53aea6e5. Humans (at least, neuro-typical ones) will never be able to remember these, so they will have to edit the target page (or inspect in a browser's dev tools) to find the anchor string, before they can use it in a link from some other page.
Numeric positioning is unstable, as you commented -- as soon as anyone adds or removes any of various parts of the wikicode, [[-ia#Suffix_21]] suddenly becomes [[-ia#Suffix_19]] or [[-ia#Suffix_25]], and all the existing links to [[-ia#Suffix_21]] are now pointing at the wrong thing.
Even your proposed #Spanish_Suffix is prone to breakage. What if there is more than one such suffix? What if the part of speech is later changed?
Any approach that is human-readable and understandable, and reasonably stable, will require that humans edit the targeted page to add such anchors. ‑‑ Eiríkr Útlendi │Tala við mig 17:16, 27 July 2023 (UTC)Reply[reply]
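The instability of position-based anchors described above can be illustrated with a small sketch. It assumes that repeated heading names are disambiguated by order of appearance ("Suffix", "Suffix_2", "Suffix_3", ...); the function names and the sample section lists are hypothetical and are not taken from any Wiktionary module.

```javascript
// Sketch only: assign anchors to headings, numbering repeated titles by
// their order of appearance (assumed scheme: Suffix, Suffix_2, Suffix_3, ...).
function assignAnchors(sections) {
  const seen = new Map();
  return sections.map(({ lang, title }) => {
    const base = title.replace(/ /g, "_");
    const n = (seen.get(base) || 0) + 1;
    seen.set(base, n);
    return { lang, anchor: n === 1 ? base : `${base}_${n}` };
  });
}

// Look up the anchor that the given language's section ends up with.
function anchorOf(sections, lang) {
  const hit = assignAnchors(sections).find((s) => s.lang === lang);
  return hit ? hit.anchor : null;
}

// Hypothetical page before and after someone adds an Italian section on top.
const before = [
  { lang: "Latin", title: "Suffix" },
  { lang: "Spanish", title: "Suffix" },
];
const after = [{ lang: "Italian", title: "Suffix" }, ...before];

console.log(anchorOf(before, "Spanish")); // Suffix_2
console.log(anchorOf(after, "Spanish"));  // Suffix_3
```

Any existing link to the old Spanish anchor now silently points at the Latin section, which is the breakage described in the comment above.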

Category:Île-de-France French

Trying to make the page sandwich grec show up in the aforementioned category. Can somebody help? Synotia (talk) 16:48, 26 July 2023 (UTC)Reply[reply]

@Synotia: Have added it to Module:labels/data/lang/fr so you can use {{lb|fr|Île-de-France French}}, and changed sandwich grec appropriately. —Al-Muqanna المقنع (talk) 16:56, 26 July 2023 (UTC)Reply[reply]

BP brokenness

Something is wrong with the top-level Beer Parlour display; it's not correctly transcluding the last two months. I don't see any recent changes to the top-level BP page or the transcluded pages. Benwing2 (talk) 04:45, 27 July 2023 (UTC)Reply[reply]

Should be a Category:Pages where post-expand include size is exceeded problem. – Wpi (talk) 05:02, 27 July 2023 (UTC)Reply[reply]
@Wpi Aha, thanks. Benwing2 (talk) 05:42, 27 July 2023 (UTC)Reply[reply]
@Benwing2 @Wpi I've created Module:discussion which now implements {{discussion recent months}}, which has reduced the post-expand include size by about 20%, but it's still not enough for the BP at the moment. Not much else we can do. Theknightwho (talk) 02:32, 30 July 2023 (UTC)Reply[reply]
By my count, the July BP page currently has 1,272 edits for a current size of 590,154 bytes, which averages out to something like 464 bytes per edit. This has been an epically verbose, high-stress month, and we've still got a day or two to go. Don't mind me, I think I'll go cower and whimper in the corner... Chuck Entz (talk) 03:26, 30 July 2023 (UTC)
It's still not appearing properly for me when viewing WT:BP, even though I assume it's transcluding only July & August now and that August is nearly empty. It might be a cache issue, but there's no way to tell: the usual hard refresh keystrokes didn't work, and neither does viewing it on another device which hasn't viewed the Beer Parlour lately (if ever). Soap 14:37, 1 August 2023 (UTC)
It seems @Theknightwho's change to {{discussion recent months}} has made it worse. I tried previewing with the version with parser functions, and that seems to work. (at least for now - the post-expand include size is 1.6/2MB, so it will likely break after a week or two) – Wpi (talk) 15:10, 1 August 2023 (UTC)Reply[reply]
@Wpi Yes - it seems the change I had to make to ensure section editing worked has (more than) cancelled out the benefit we were originally seeing. Unless we can find some way to get section edit links to work with the first method, we'll probably have to call this a failed experiment. Theknightwho (talk) 15:25, 1 August 2023 (UTC)Reply[reply]
@Benwing2 @Wpi @Chuck Entz @Soap I've come up with a solution to this that seems to work, which is a slightly extended version of the solution someone else came up with on StackExchange ([4]): Module:User:Theknightwho/discussion. The short version is that the module grabs the page contents, iterates over the section headers and replaces them with (e.g.) <h2 data-source="palabra" data-section="38">Spanish</h2>. A simple JS script (as seen in User:Theknightwho/common.js) is able to convert this into a section edit link for every heading when rendering the page. This would be an extremely useful thing to have, because it also means we can parse pages in a single invoke without breaking section editing.
Currently it can only handle whole page transclusions (which is mostly what we need it for), but it wouldn't be difficult to extend it to parts of pages. However, it does support transcluding multiple pages, because the source pagename has to be included in each header tag. I also suspect the reply gadget won't work, but (a) I don't think that's a major problem, and (b) it may be fixable anyway.
The one assumption it makes (which I suspect is unlikely to change) is that the section edit links are numbered sequentially from 1 to n (i.e. the 38th heading is linked to with &action=edit&section=38). I don't think this is a documented feature. Theknightwho (talk) 17:36, 1 August 2023 (UTC)Reply[reply]
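A rough sketch of the approach described above, written in JavaScript for both halves (the actual module does the header rewriting in Lua, which is not reproduced here). The function names, the regular expression and the `/w/index.php` path are assumptions made for illustration; only the `data-source` / `data-section` attributes and the `&action=edit&section=N` numbering come from the description.

```javascript
// Escape the two characters that would break a double-quoted attribute.
function escapeAttr(value) {
  return value.replace(/&/g, "&amp;").replace(/"/g, "&quot;");
}

// Replace every wikitext heading (== .. == up to ====== .. ======) with an
// HTML heading tagged with its source page and its sequential section number.
// Assumes headings are numbered 1..n in document order, as noted above.
// The heading text is passed through unescaped, since it may contain wikitext.
function tagHeadings(wikitext, source) {
  let section = 0;
  return wikitext.replace(
    /^(={1,6})[ \t]*(.+?)[ \t]*\1[ \t]*$/gm,
    (match, equals, title) => {
      section += 1;
      const level = equals.length;
      return `<h${level} data-source="${escapeAttr(source)}" ` +
        `data-section="${section}">${title}</h${level}>`;
    }
  );
}

// Build the edit URL a client-side script could attach to each tagged heading.
// The /w/index.php path is an assumption about the wiki's URL layout.
function editUrl(source, section) {
  return `/w/index.php?title=${encodeURIComponent(source)}` +
    `&action=edit&section=${section}`;
}

console.log(tagHeadings("==Spanish==\ntext\n===Noun===\n", "palabra"));
console.log(editUrl("palabra", 38));
```

A client-side script would then read the two data attributes from each heading and append a link built with something like `editUrl`. Because the source page name travels with every heading, headings from several transcluded pages can coexist on one page.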

What is phoneticExtraction in Module:links?

@Theknightwho, RichardW57 Maybe one of you two knows this. There's a special phoneticExtraction table in Module:links, where if a language is in this table (currently only Thai and Khmer), getTranslit is called in Module:th or Module:km in place of the regular transliteration mechanism. What the hell is going on here? Why does this exist, and why, if it's important, is it not integrated into the regular transliteration mechanism? I should add, transliteration ought to be as simple as calling lang:transliterate() but in fact Module:links and Module:headword both do significant (and different) logic related to transliteration. I'm asking about this because I'm implementing language tagging, transliteration and transcription for titles, chapters, etc. in Module:quote and it feels like I'm doing a lot more logic than I should need to. Benwing2 (talk) 04:06, 28 July 2023 (UTC)

@Benwing2 This pre-dates any of the work I did, and I completely agree it should be integrated: it seems like a horrible hack. From what I can tell, Wyang was terrible at integrating things into the core modules, so anything they worked on (e.g. Module:th and Module:km) is either not integrated at all, or uses special-case crap like this. Another case-in-point are all the Chinese modules. Theknightwho (talk) 04:17, 28 July 2023 (UTC)Reply[reply]
@Theknightwho Oh yeah, now I remember, there was a wheel war between Wyang and Rua over this; take a look at the history of Module:links between June and August 2016. I think both got desysopped for awhile as a result. Do you know what the getTranslit code is actually doing for these two languages? I don't know anything about Thai or Khmer transliteration but I know it's complicated, and Richard has complained about multiword Thai expressions not getting transliterated correctly. If it "works" better for links than headwords, this is the cause. Benwing2 (talk) 04:28, 28 July 2023 (UTC)Reply[reply]
One semi-related question: What are the categories that get returned as the third return value of lang:transliterate()? Do I need to worry about them when generating transliteration of titles, chapters, etc.? (I would use full_link() but the existing formatting of titles/chapters/etc. and their glosses is somewhat different from what full_link() generates, and I'm preserving that difference while incorporating translits and transcriptions, so I have to roll my own annotations.) Also what are the values of tr_fail, and what does it mean if both the translit and tr_fail are nil? And for the 25th time, can you actually document this shit so I don't have to keep asking you? Benwing2 (talk) 04:36, 28 July 2023 (UTC)Reply[reply]
@Benwing2 I'm not sure beyond a basic understanding, but I think what it does is scrape the pages to see if there's a phonetic transcription available to use as the transliteration, and then it calls the translit module as a back-up if there isn't. It's a neat idea, but should be packaged up into the relevant translit modules. I have no idea why Wyang insisted on doing it like this, and to be honest I'm getting really sick of clearing up their spaghetti code. Theknightwho (talk) 04:51, 28 July 2023 (UTC)Reply[reply]
It was decided to do transliteration as a form of transcription, and so it got bundled up with working out pronunciations. The translit module works off a scraper and the core of the pronunciation module. Unfortunately, it only addressed the immediate needs, so doesn't attempt phrases. There's also a strong antipathy towards Internet standards for marking word boundaries, which have to be marked for a scraper to handle phrases. It thus doesn't integrate with {{quote}} or {{quote-*}}, and also not with [a template that failed to render here, showing instead the error "Module:languages:1284: The function getByCode expects a string as its first argument, but received nil."]. The code doesn't like to concede that European letters and numbers end up in Thai texts (e.g. email addresses and acronyms such as VDO, as well as telephone numbers). We've therefore had to wrestle with obvious things like Western Arabic numerals (the use of Thai digits is generally a sign of hostility to foreigners) being rejected, resulting inter alia in the falsification of quotations.
I don't think I have anything else to add to the description of @Theknightwho. I think the code has a decent chance of surviving surgery. I think @Octahedron80 has actually worked on the code. As I say, its antipathy to non-Thai characters needs to be worked on, but remember that transliterating the Baht symbol (which is in the Thai block) would involve needless work. --RichardW57m (talk) 14:42, 28 July 2023 (UTC)Reply[reply]
You don't need to worry about the categories. I don't think that value is actually returned by anything at the moment, since we decided to turn it off for Chinese. When I've got some time, I will see if there's a better way of implementing it. Also the second value (an explicit fail) could probably be replaced by simply returning false (as opposed to nil). (Two earlier, since-corrected answers on what it means when the translit and tr_fail are both nil: first, that the module has intentionally decided not to return anything (e.g. Arabic); then, that the transliteration was an accidental fail (e.g. it was only partially completed), whereas an explicit fail means the module intentionally returned nothing (e.g. the Chinese module had nothing to scrape).) Given I remembered it the wrong way around, I'll make sure to document it. I've now done the documentation, and both previous edits were partially wrong: if tr_fail is true then it means that maintenance action could be required (as it was an accidental fail), while if it's false then it means the expected output was nil (usually because the input was "-"). Theknightwho (talk) 04:57, 28 July 2023 (UTC)
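The final description of tr_fail above can be summarised as a small decision function. This is a sketch in JavaScript rather than the Lua used by the modules; the function name and the returned labels are hypothetical, `null` stands in for Lua's nil, and the case where both values are nil is left as "unspecified" because the final answer does not spell it out.

```javascript
// Sketch only: interpret the (translit, tr_fail) pair as described above.
// `translit` is a string or null; `trFail` is true, false or null.
function describeTranslitResult(translit, trFail) {
  if (translit !== null && translit !== undefined) {
    return "ok"; // a transliteration was produced
  }
  if (trFail === true) {
    return "accidental-fail"; // maintenance action could be required
  }
  if (trFail === false) {
    return "expected-empty"; // nil was the expected output, e.g. input "-"
  }
  return "unspecified"; // both nil: not covered by the description
}

console.log(describeTranslitResult("duśmaṇ", null)); // ok
console.log(describeTranslitResult(null, true));      // accidental-fail
console.log(describeTranslitResult(null, false));     // expected-empty
```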
Ironically, Thai transliteration is one of the few that actually tries to return some categorisation when it fails, but I think that currently doesn't make it to the display. It would make sense as an error reporting mechanism - one needs it when only a small subset of the permutations of characters have any reading. I think a combination of a failed transliteration category - to be monitored by a keen human, so probably per language - and an error message masquerading as an uncreated category would be helpful. --RichardW57m (talk) 14:13, 28 July 2023 (UTC)Reply[reply]
@Benwing2: Remember that Thai transliteration is technically impossible for an algorithm to get right, and that Thailand is important enough that therefore software rather than users take the strain of line-breaking between words. You may have to ask people to mark up Thai titles with word divisions, for which I recommend '<wbr>'. Some of us are horribly familiar with using ZWSP and WJ. You may also face the grief that Thai personal names are excluded from Wiktionary by policy, but then you don't really want the Wiktionary transcription in a source's references. It will look horribly amateur - and some Romanisations were bestowed by Rama VI along with the surname. Think of all the Thai names ending in 'porn' (e.g. Pittayaporn); that would generally come out as 'phɔɔn'. Part of the effort in using a Thai work is deciding on the Romanisation of the author's name! --RichardW57m (talk) 15:33, 28 July 2023 (UTC)Reply[reply]
Correction: It comes out as 'pɔɔn'.
@Benwing2: "Don't use" isn't very helpful advice, even if it be the soundest. If you must do Thai transliterations as part of references, I think selecting RTGS is the most defensible option. Doing Thai automatic transliteration should generate some sort of warning; manual is almost always better. --RichardW57 (talk) 16:36, 28 July 2023 (UTC)
@RichardW57 OK I think as a first step we should just move the scraping stuff into the actual Thai transliteration algorithm, same for Khmer. Other fixes can come later. Benwing2 (talk) 19:04, 28 July 2023 (UTC)Reply[reply]
@Benwing2: Please note that Northern Thai, Northeastern Thai and Southern Thai are all possible future users of this logic, at least for the Thai script. Their pronunciation modules are at least potentially all different, for at the very least they have different vowel and tone systems. If it's written and is truly Southern Thai, the Tak Bai dialect of Southern Thai is likely to merit a different pronunciation module, as its tone system is different yet again. RichardW57 (talk) 21:50, 28 July 2023 (UTC)Reply[reply]
(A little late to this party, but hey...)
Ya, what I recall from the 2016 kerfuffle was that Thai-script spellings apparently don't align well with Latin-script romanizations, and some of the argument between Wyang and Rua / CodeCat was about how to deal with this discrepancy and the difficulties in any algorithmic approach to extracting pronunciation information out of Thai-script strings.
Mind you, I've never really dealt with Thai (or Khmer) -- this is purely based on my own spotty and subjective memory of discussions at that time. ‑‑ Eiríkr Útlendi │Tala við mig 21:00, 28 July 2023 (UTC)Reply[reply]
Looking back, I get the impression that CodeCat's point wasn't communicated very well, as the proponents of the current approach (Wyang and Metaknowledge) didn't seem to realise that no-one was opposing the mechanism, but merely the fact that it was bolted onto Module:links in a nonstandard way that isn't necessary and creates a maintenance headache. Module:zh-translit also scrapes translits, but it's all inside the module. Theknightwho (talk) 21:01, 30 July 2023 (UTC)Reply[reply]

brackets=on not working in quote-web

In this diff, I add a quote that I feel should be in brackets since it is an alternative form of the entry term. Normally I put this kind of thing in brackets using brackets=on, but today, brackets=on does not work for quote-web. It seems to be working for quote-book. Thanks for looking at this. --Geographyinitiative (talk) 09:26, 28 July 2023 (UTC)Reply[reply]

@Geographyinitiative Blah, I thought I fixed that, let me take a look. Benwing2 (talk) 19:04, 28 July 2023 (UTC)Reply[reply]

About citations tab at the top and Visibility title on the left

Hi!

I am an interface editor from Turkish Wiktionary. I made some changes that work with the new Vector skin. For people who use the new Vector skin, the Citations tab is not aligned with the Entry or Discussions tabs. Also, the mobile version of Wiktionary does not have this tab. So I made some changes to this gadget, and now on Turkish Wiktionary we have a properly aligned Citations tab on the new Vector skin and also on mobile. You can check it now: tr:göz. This does not mean the Citations tab stops working on the old Vector skin; it still works there. You can view the same page on mobile, and the Citations tab is there.

For the Visibility options, as we know, new vector skin has a second sidebar on right which has "Tools" in it. Previously these tools were on left. On this wiki, the Visibility title is not seen with new vector skin's style. You can see again, in tr:göz, we have those options on the right.

I can provide these changes for this wiki too. Someone just needs to contact me and ask for the code, because I cannot edit the gadget or MediaWiki pages here.

Cheers. ~ Z (m) 15:22, 28 July 2023 (UTC)Reply[reply]

@HastaLaVi2 This sounds like a good idea to me, pending review. User:Erutuon and User:This, that and the other, I'm wondering, can one of you take a look at this and let me know what you think? I don't feel completely qualified myself to make a judgment here. Benwing2 (talk) 19:25, 28 July 2023 (UTC)Reply[reply]
Yes, it should happen. @HastaLaVi2 if you would be so kind as to advise what changes are required? This, that and the other (talk) 04:22, 30 July 2023 (UTC)Reply[reply]
@This, that and the other, ok. I am listing the codes. You only need to copy/paste these. I already tested them from my common.js, so everything is ok.
1- The gadget for adding Citations and Documentation tabs. Copy these codes, and replace MediaWiki:Gadget-DocTabs.js page entirely with them:
2- Adding Citations tab in mobile version. Copy these codes, and paste them in MediaWiki:Mobile.js without deleting any other codes on that page:
3- Moving the visibility options to the right (for new vector only) with adaptable styles (for old and the new vector). Copy these codes, and replace MediaWiki:Gadget-VisibilityToggles.js page entirely with them:
4- Inserting the feedback inside the sidebar, rather than outside (currently it is on the outside). Copy these codes, and find "WT:FEED" in MediaWiki:Common.js. And replace the codes under "WT:FEED" with these:
I hope I explained everything and all is ok. If anything goes wrong, you can ping me anytime. Thanks! ~ Z (m) 21:30, 30 July 2023 (UTC)Reply[reply]
Just noting that I didn't forget about this. It seems that our versions of these scripts (in 1 and 3) are quite different from those you are suggesting here, so I'd need to do some testing before I publish these changes, to make sure we don't lose any local functionality. I will do it at some stage. This, that and the other (talk) 10:09, 17 August 2023 (UTC)Reply[reply]
Thanks! Benwing2 (talk) 18:57, 17 August 2023 (UTC)Reply[reply]
Of course! And if there is anything wrong with the new codes, I will try to adjust them to the old ones. I want to see the new changes on this wiki. The current display of the Citations tab and visibility toggles really annoys me. ~ Z (m) 22:30, 18 August 2023 (UTC)

Beer Parlour page suddenly disappeared

The subpages still work, but viewing the main WT:BP page just gives me a link to Template:discussion recent months. Is it just me? Thanks, Soap 03:28, 29 July 2023 (UTC)Reply[reply]

@Soap: See Wiktionary:Grease pit/2023/July#BP brokenness. Fenakhay (حيطي · مساهماتي) 03:31, 29 July 2023 (UTC)
Sorry, I've been a bit careless lately, and keep missing things. So it should be back to normal in a few days? Good to know, thanks. Soap 03:33, 29 July 2023 (UTC)
I propose a vote to ban sloppiness Seoovslfmo (talk) 23:41, 29 July 2023 (UTC)Reply[reply]

sloppiness inside quote templates

Is there a way we can generate a list of all uses of incorrect params inside quote templates? To find crap like this. Deliberately mistyping a few common parameters brought up a handful of errors, so there's bound to be a bunch. 280, I predict... Seoovslfmo (talk) 23:43, 29 July 2023 (UTC)Reply[reply]

Hi, WF, I am working on this now. See User talk:JeffDoozan. Benwing2 (talk) 02:52, 30 July 2023 (UTC)Reply[reply]
See User:Benwing2/quote-templates-bad-params-warnings-1, User:Benwing2/quote-templates-bad-params-warnings-2 and User:Benwing2/quote-templates-bad-params-warnings-3. I split the warnings into 3 parts because there are a lot of them and the pages would be too big otherwise. Benwing2 (talk) 03:59, 30 July 2023 (UTC)Reply[reply]

@Benwing2: A few easy changes; you've probably caught some of them already. Fuzzy warm feeling (talk) 08:45, 30 July 2023 (UTC)

  1. place= can be changed to location=
  2. track= can be added to Template:quote-song under note=track
  3. retrieved= to accessdate=
  4. link= to url=
  5. co-authors= to coauthors=
  6. publication-year= to year_published=
  7. mainauthors= to mainauthor=
  8. archive-url= to archiveurl=
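The straightforward renames in the list above could be expressed as a lookup table. The following is a minimal sketch, not the bot code actually used: the function name and the conflict handling are assumptions, and item 2 (track=) is left out because it is not a plain rename.

```javascript
// Old parameter name -> preferred name, taken from the list above
// (item 2, track=, is omitted because it is not a one-to-one rename).
const RENAMES = new Map([
  ["place", "location"],
  ["retrieved", "accessdate"],
  ["link", "url"],
  ["co-authors", "coauthors"],
  ["publication-year", "year_published"],
  ["mainauthors", "mainauthor"],
  ["archive-url", "archiveurl"],
]);

// Return a copy of `params` with known bad names renamed. If the target
// name is already present, the old name is kept and reported as a
// conflict so that a human can review it (an assumed policy).
function renameParams(params) {
  const has = (name) => Object.prototype.hasOwnProperty.call(params, name);
  const renamed = {};
  const conflicts = [];
  for (const [name, value] of Object.entries(params)) {
    const target = RENAMES.get(name);
    if (target === undefined) {
      renamed[name] = value;
    } else if (has(target)) {
      renamed[name] = value;
      conflicts.push(name);
    } else {
      renamed[target] = value;
    }
  }
  return { renamed, conflicts };
}

console.log(renameParams({ place: "London", title: "Example" }));
console.log(renameParams({ link: "a", url: "b" }));
```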
I would like to note that "author1=" was my mistake for "author2=" here: [5]. --Geographyinitiative (talk) 11:55, 30 July 2023 (UTC)Reply[reply]
Thanks, WF, I have already renamed place -> location, co-authors= to coauthors= and archive-url= -> archiveurl=. Benwing2 (talk) 16:50, 30 July 2023 (UTC)Reply[reply]
@Geographyinitiative Blah, looks like I'm gonna have to review those author1 changes manually. The code checks that there's no author= before renaming to author= but I didn't think about authors specified using first/last=. Benwing2 (talk) 16:53, 30 July 2023 (UTC)Reply[reply]
Yeah, I have seen people input "last=Doe|first=John|author2=Richard Roe" or "last1=Doe|first1=John|author2=Richard Roe" and found it very weird. It seems like they should be entered all consistently using one format. - -sche (discuss) 17:20, 30 July 2023 (UTC)Reply[reply]
OK, I reviewed all the |author1= changes. All were fine except for a couple that User:Geographyinitiative already fixed. Benwing2 (talk) 18:59, 30 July 2023 (UTC)Reply[reply]
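The stricter check implied by this sub-thread (do not rename author1= to author= when the first author is already given some other way) might look like the sketch below. The function name and the exact list of parameters treated as "already names the first author" are assumptions drawn from the examples quoted above, not the code that was actually run.

```javascript
// Parameters that, if present, already identify the first author
// (assumed list, based on the examples quoted in this thread).
const FIRST_AUTHOR_PARAMS = ["author", "last", "first", "last1", "first1"];

// True only if author1= is present and nothing else names the first
// author, so that renaming author1= to author= cannot clash.
function canRenameAuthor1(params) {
  const has = (name) => Object.prototype.hasOwnProperty.call(params, name);
  if (!has("author1")) {
    return false;
  }
  return !FIRST_AUTHOR_PARAMS.some(has);
}

console.log(canRenameAuthor1({ author1: "Jane Doe" })); // true
console.log(canRenameAuthor1({ last: "Doe", first: "John", author1: "Richard Roe" })); // false
```

In the second call, author1= is more likely a slip for author2=, which is the kind of case that needed manual review.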

Another weirdness

From the main Grease Pit page, you can't edit a section; you get "Section editing not supported". Benwing2 (talk) 03:59, 30 July 2023 (UTC)Reply[reply]

User:Theknightwho This was due to your change to {{discussion recent months}}. Benwing2 (talk) 04:02, 30 July 2023 (UTC)Reply[reply]

Acknowledging translations with quote-* templates

When the translation displayed in {{quote-book}} is licensed under CC-BY-SA, and the attribution is short, how do I acknowledge the translation? If I use |translator=, it looks as though the book itself is based on the translations being acknowledged. You can see an example at Pali ယက္ခ (yakkha). The translations are stored centrally in Module:RQ:pi:Shan Paritta, but I'm not confident that the acknowledgement on that page suffices. --RichardW57 (talk) 04:19, 30 July 2023 (UTC)Reply[reply]

Adding Proto-Eastern Algonquian

Can we add Proto-Eastern Algonquian to Module:languages/data/exceptional. @-sche

m["alg-eas-pro"] = {
	"Proto-Eastern Algonquian",
	2257525,
	"alg-eas",
	"Latn",
	type = "reconstructed",
}

--{{victar|talk}} 04:31, 30 July 2023 (UTC)Reply[reply]

Are you planning to split PEA content off to its own Reconstruction pages separate from PA? My main hesitation, and the reason I haven't bothered adding PEA before, is that I wonder whether it's (un)helpful to readers to split things up too finely vs having all the roots of Algonquian terms in one place; in the extreme, that leads to things like Malayo-Polynesian and other Austronesian content being divided between the myriad nigh-identical Proto-Foo-Bar, Proto-Foo, Proto-Western-Foo, Proto-Southwestern-Foo entries for every stage of the Proto-Austronesian tree which Tropylium and some other editors tried to prune a while ago. The changes from PA to PEA were relatively modest, the loss of some vowel length distinctions and a few other changes, like *e to *ə and changes in the circumstances under which certain linking vowels were used in negatives, etc. OTOH, I see there has been more research into PEA even in just the last few years, and I suppose it is more like distinguishing just one level (PWGmc but not separate pages for Weser-Rhine, etc) from PGmc rather than umpteen levels of Austronesian. I'm ambivalent; @MiltonLibraryAssistant, Hk5183, do you have any opinions on whether PEA entries should be split off from PA? - -sche (discuss) 18:16, 30 July 2023 (UTC)
Unlike Proto-Algonquian, there isn't a publicly-accessible dictionary for PEA online. Also, I can't seem to find much literature on the subject, let alone any attempts at a PEA reconstruction. Take a look at this Proto-Algonquian entry: *nepyi. We don't need to create separate entries for Proto-Plains Algonquian, Central Algonquian, Eastern Algonquian, etc. It would be too convoluted, and that's assuming that we already have a reliable source and standard for those proto-language reconstructions. Against. MiltonLibraryAssistant (talk) 04:44, 31 July 2023 (UTC)
So there are some PEA reconstructions in "Algonquian linguistic change and reconstruction" (I. Goddard, 1991), but only for very specific words. There's not a lot to go by, I think we should just leave it as is, until we can gather a list of standardised reconstructions for more general words (e.g. "water", "land", etc.) . MiltonLibraryAssistant (talk) 05:18, 31 July 2023 (UTC)Reply[reply]
That may have been once the case, but things have much progressed since then, beyond Goddard, from Costa (2007) to Cunningham (2022a), and the schema for reconstructing PEA is very well established (more than PWG TBH).
The issue that I'm running into is reconstructing PAlg with only PEA descendants, which is problematic because 1) I have to make dodgy guesses as to what the proto form was, e.g. *xk or *θk, and 2) many terms likely didn't even exist in PAlg at all, but are instead PEA constructions.
The whole situation is very reminiscent of PGmc and PWG, as you pointed out, and the morphological changes are just as significant. In summation, not reconstructing PEA is academically outdated, and both linguistically and chronologically problematic. --{{victar|talk}} 08:27, 31 July 2023 (UTC)Reply[reply]
@MiltonLibraryAssistant: Sidenote, Plains and Central Algonquian are areal groups, not genetic groups, like Eastern Algonquian, making them a moot point to this discussion. --{{victar|talk}} 08:41, 31 July 2023 (UTC)Reply[reply]
Everything makes sense now. MiltonLibraryAssistant (talk) 09:46, 31 July 2023 (UTC)Reply[reply]
If PEA reconstruction is more well-established than PWG as you claimed then I can't be against this anymore. MiltonLibraryAssistant (talk) 09:51, 31 July 2023 (UTC)Reply[reply]
Sorry for the late response. I would be interested to know if there are any particularly reliable sources for PEA orthography. I've come across quite variable spellings for the same phonemes (not that this is specific to PEA and not PA).
While I know that there are many quite firmly established PEA reconstructions, I am a bit concerned that there is often a rather fuzzy line between PA and PEA reconstructions.
Of course, my foremost concern is that without any formal style-guide or norms for adding PEA terms, they are likely to be added in a piecemeal manner which may not greatly improve upon existing PA pages.
From a personal point of view, while I believe that PEA should eventually be added, effort would likely be better spent adding new (and improving the quality of existing) PA entries. Hk5183 (talk) 22:44, 2 August 2023 (UTC)Reply[reply]

grunduz

The decl table on Reconstruction:Proto-Germanic/grunduz is not for grunduz but rather for grumþuz. Since we list this as an alternate form, I suspect this mismatch is on purpose, but I think we should explain it. If it is in fact a template bug, I have no idea how to fix it or even identify the problem, since the template call on that page is a simple {{gem-decl-noun}} with no parameters whatsoever. By contrast, Reconstruction:Proto-Germanic/handuz appears as normal. Soap 08:10, 31 July 2023 (UTC)Reply[reply]

It's coming from this module code. It was added by Mellohi! in 2020.
As a general point, it seems unwise and unnecessary to house inflection table data for sui generis words in a Lua module. The inflections should just be ordinary parameters in the entry itself. This, that and the other (talk) 12:09, 31 July 2023 (UTC)Reply[reply]
@This, that and the other It can sometimes be useful, because it helps with the inflections of derived terms. The Latin modules make use of this quite a bit. Theknightwho (talk) 14:47, 31 July 2023 (UTC)Reply[reply]
The nominative singular was intentional indeed. From Kroonen's EDPG intro page xxxii:

When Proto-Germanic still had a mobile accent, these ti- and tu-stems probably had root-stress in the nominative, and suffix-stress in the genitive, e.g. nom. *gʰrḿ-tu-s, gen. *gʰrḿ-té/ó-us. After the Germanic sound shifts, the nominative developed into *grumfþuz, whence G Cimb. grumf, while the genitive *grundauz ultimately served as the basis for Go. grundus and the aforementioned West Germanic forms. ON grunnr, on the other hand, goes back to *grunþuz, and appears to be a secondary variant with analogical n or þ. The fact that this analogy was possible proves that the paradigmatic Verner alternation must have remained intact until after the breaking up of Proto-Germanic and survived into Proto-Norse.

At the time I hardcoded the paradigm since I was too inexperienced in Lua to do otherwise. But the hardcoding can be minimized later, yes. — Ceso femmuin mbolgaig mbung, mellohi! (投稿) 16:38, 31 July 2023 (UTC)Reply[reply]
Just FYI I agree with User:This, that and the other that we should avoid hard-coding individual paradigms in Lua if possible. I do do this for Romance verbs e.g. Italian fare but that is because they often show up in collocations and with prefixes, so it makes sense to put the conjugation info in one place rather than on each page using it. Benwing2 (talk) 18:46, 31 July 2023 (UTC)Reply[reply]
I think this is what User:Theknightwho is saying though. Benwing2 (talk) 18:46, 31 July 2023 (UTC)Reply[reply]
Update: I have replaced many hardcoded Proto-Germanic declensions (namely irregular n-stems and a-stems) with a couple of multi-parameter declension routines. More will come. — Ceso femmuin mbolgaig mbung, mellohi! (投稿) 08:20, 1 August 2023 (UTC)Reply[reply]
@Benwing2 Need your help. I've tried to add a subroutine to Module:gem-decl-noun/data to deal with the ablaut in *burþiz (oblique stem burd-) but the second parameter I added refuses to work. — Ceso femmuin mbolgaig mbung, mellohi! (投稿) 21:04, 1 August 2023 (UTC)Reply[reply]
@Mellohi! It's never calling your tis-f routine. If you put an error() statement right before where it calls the declension routine, you'll see that the decl_type is i-mf. Presumably there's a bug in the code in lines 74-84 that is supposed to be taking the declension from the |stem= argument. Benwing2 (talk) 21:26, 1 August 2023 (UTC)Reply[reply]

Template:trans-top-also error when adding translation

The following error is thrown when I try to add a translation to back#Translations-furniture_backrest:

Could not find translation table for 'be:спі́нка'. Glosses should be unique

I just converted that box from {{trans-top}} to {{trans-top-also}}. Is there a technical problem or did I do it wrong (or both)?

See also seemingly relevant recent discussions at:

Thanks Voltaigne (talk) 21:42, 31 July 2023 (UTC)Reply[reply]

Hyphenation/syllabification for East Asian languages (Japanese and Korean) romanization

Hello, can East Asian languages (Japanese and Korean) use hyphenation/syllabification for romanized words (e.g. 鉄道, tetsudō and 만화, manhwa)? Yuliadhi (talk) 23:54, 31 July 2023 (UTC)

@Yuliadhi Seems fine to me, but I'm not sure if others have any ideas on this. I don't see why not though. Kiril kovachev (talkcontribs) 19:11, 2 August 2023 (UTC)Reply[reply]
So long as it's implemented correctly. There are Korean terms with romanized medial -nh- as in 만화 (manhwa), and it isn't always algorithmically obvious where the "H" belongs. And for Japanese, we'd need to pay attention to Japanese morphophonemic rules; in the 鉄道 (tetsudō) example, hyphenating as tet-sudō would be incorrect, as the medial tsu is a single integral phonemic unit in Japanese. ‑‑ Eiríkr Útlendi │Tala við mig 18:18, 14 August 2023 (UTC)Reply[reply]
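To make the tetsudō point concrete, here is a deliberately simple sketch of splitting Hepburn-style romaji so that digraph onsets such as ts, sh and ch stay attached to their vowel. The regular expression and the function name are assumptions for illustration; the sketch does not attempt the ambiguous cases (for example a moraic n before a vowel) and does not handle Korean at all.

```javascript
// One chunk = optional onset (digraphs tried first) + one vowel,
// or a stand-alone "n", or any other single character as a fallback.
const ROMAJI_CHUNK =
  /(?:ch|sh|ts|[kgnhbpmr]y|[kgsztdnhbpmyrwfj])?[aiueoāīūēō]|n|./g;

// Split a romanized Japanese word into hyphenation chunks (sketch only).
function splitRomaji(word) {
  return word.normalize("NFC").toLowerCase().match(ROMAJI_CHUNK) || [];
}

console.log(splitRomaji("tetsudō").join("-")); // te-tsu-dō
console.log(splitRomaji("tsunami").join("-")); // tsu-na-mi
```

Because "ts" is tried before the single-letter onsets, the split can never produce tet-sudō, which is the incorrect hyphenation mentioned above.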

August 2023

Abenaki ô/ȣ/8

@-sche, Hk5183, Abenaki entries are using ô, ȣ, and 8 all for the same character. Can we convert them to ô or ȣ with a bot? --{{victar|talk}} 00:46, 1 August 2023 (UTC)Reply[reply]

@-sche, Hk5183, victar We could, but we shouldn't. According to Wikipedia, they belong to different writing systems. --RichardW57m (talk) 13:04, 1 August 2023 (UTC)
Ummm, yep, thanks, that's why we should choose one. Many NA languages have multiple transcription systems. --{{victar|talk}} 15:41, 1 August 2023 (UTC)
No way. We can prioritise one, and treat the others as alternative forms, but we shouldn't collapse them into a single entry. Moreover, what you were proposing would create forms that do not belong in either (or possibly any) orthography. Given the number of Abenaki lemmas we have, it looks like a manual job for someone who understands the writing systems. --RichardW57m (talk) 16:08, 1 August 2023 (UTC)
Woof. @-sche, please take over whatever 🫲 *this* 🫱 is. --{{victar|talk}} 20:11, 1 August 2023 (UTC)
The symbols I've seen the most in modern books and on the web are ô (used by the Nulhegan Abenaki tribe's website, various modern books, and also Laurent's old native dictionary) and 8 (used by the online Western Abenaki Dictionary and the Cowasuck tribe's site); I have not seen ȣ much (and mostly, but not exclusively, in older works). Since 8 is obviously hacky, my instinct would be to standardize on ô (unless you want to make a case for ȣ?), and this indeed seems to be the most-used symbol on Wiktionary, too.
It would be reasonable to let people create "alternative spelling of..." soft-redirects from the other spellings to the main spellings (at least in cases where they are actually attested, like we have some soft redirects for actually-attested non-normalized Norse). And some older works may use spellings that require more normalization than just "replace ȣ with ô". I think those two things are what Richard was pointing out.(?) Fortunately, we seem to only have two entries with ȣ in their titles and ten that use it in their bodies, and none of them seem to require any other changes apart from ô/ȣ/8. I will try to look over the entries which use 8; the 17 which use it in their titles seem like they could be moved with no other issues.
We could also consider systematically invisibly displaying the other orthographies (perhaps in the headwords of the lemmatized entries?) so that searches for those other spellings find the relevant entries, in the manner discussed for Arabic here (Ctrl-F "non-displaying span"). - -sche (discuss) 03:08, 2 August 2023 (UTC)
@-sche: ô is fine with me. There are a few handfuls of Abenaki links with 8 and some ȣ that need converting too, which is why some swift bot-action would be nice. --{{victar|talk}} 03:45, 2 August 2023 (UTC)
@victar, -sche: Those were the two things of concern. A third, which I realised later, is that not only would any merely obvious bot action obliterate the existing spelling, it would also falsify the references. Does victar not care about this? I did notice that many entries by -sche lack any backing from references or quotation; one could clean up by eliminating them via {{rfv}}, but I think that would be disruptive and actually leave Wiktionary in a worse state. One would hope that {{rfv}} would induce -sche to supply the references, but it is not something I would want to rely on. --RichardW57 (talk) 08:04, 2 August 2023 (UTC)

@victar, -sche, RichardW57: This was interesting to look into. Chief Sozap Lolô's "Abenakis Alphabet" can be seen here (unexpectedly, ô's upper-case equivalent is not Ô, but O’). Henry Lorne Masta uses 8 throughout his 1932 Abenaki Indian Legends, Grammar and Place Names; his Abenaki Indian Grammar starts here. I also found Fr Sebastian Rasles' post-1691 A Dictionary of the Abnaki Language, in North America, which uses Ȣ and ȣ (including, notably, ȣ̑); see here for John Pickering's analysis of Rasles' orthography (from Memoirs of the American Academy of Arts and Sciences, second series, volume I (1833), pages 370–574). Finally, Philippe Charland says here that "[t]he sign '8' is a vowel sound that…occurs a lot in the language, and is sometimes written as 'ô'". My guess is that 8 is used because it's easier to input than ȣ (or ô, for that matter). 0D foam (talk) 12:04, 2 August 2023 (UTC)

Ah, but one can claim that the upper case is Ô, but it just looks like O’ in appropriate fonts. Seriously, Unicode acknowledges a lot of distortion in the diacritics on upper case letters. --RichardW57m (talk) 16:11, 2 August 2023 (UTC)
Yes, I am sure that the usage of 8 for ȣ is because, outside of Ivy League schools (and to some extent the church) the Byzantine Greek ȣ ligature was (and is) a character that is rare and is unavailable for printing or typing. I think it may be a good idea to use a bot to default entries to ô, including a softlinked alternative spelling using ȣ or even 8. (I don't believe there are any lemmas using 8 or ȣ with references which would be falsified.) Hk5183 (talk) 23:08, 2 August 2023 (UTC)
Are you talking about font-licensing issues? On recent machines, that open-topped character seems to be reasonably well supported for display in the guise of LATIN CAPITAL/SMALL LETTER OU. For typing on Windows, the owner of the machine may have to agree to the use of Abenaki. --RichardW57m (talk) 09:31, 3 August 2023 (UTC)

I'm going through and updating entries which use the other characters. Only a few have turned out to need other changes besides just to this character. It would probably be a good idea to have an edit filter that tracks (a) the addition of ==Abenaki== L2s to [new or existing] pages with ȣ or 8 in the title (if we could exclude cases where the contents include {{altspell}} / {{altform}} so we don't catch creations of valid soft redirects, all the better), or (b) the addition of new links to Abenaki words (e.g. via {{l|abe}}, {{m|abe}}, or {{t|abe}} or variants thereof) spelled with ȣ or 8.
Some links in other languages such as Loup continue to use ȣ. - -sche (discuss) 01:56, 3 August 2023 (UTC)
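
As a rough illustration of the two checks -sche proposes, here is a minimal Python sketch. It is not AbuseFilter syntax, it inspects a page's whole wikitext rather than the diff an edit filter would actually see, and the function names and regular expressions are assumptions made up for this example.

```python
import re

# Characters from the non-preferred orthographies (ȣ, its capital Ȣ, and 8).
OTHER_ORTHOGRAPHY = re.compile(r"[ȣȢ8]")
# Hypothetical matcher for {{l|abe|...}}, {{m|abe|...}}, {{t|abe|...}}, {{t+|abe|...}}.
LINK_TEMPLATE = re.compile(r"\{\{(?:l|m|t\+?)\|abe\|([^|}]*)")
# Soft redirects via {{altspell}} or {{altform}} are allowed.
SOFT_REDIRECT = re.compile(r"\{\{alt(?:spell|form)\|")


def flags_abenaki_section(title, wikitext):
    """Check (a): an ==Abenaki== section on a page whose title uses ȣ or 8,
    unless the page is a soft redirect."""
    if not OTHER_ORTHOGRAPHY.search(title):
        return False
    if "==Abenaki==" not in wikitext:
        return False
    return SOFT_REDIRECT.search(wikitext) is None


def flagged_abenaki_links(wikitext):
    """Check (b): Abenaki link targets spelled with ȣ or 8."""
    return [t for t in LINK_TEMPLATE.findall(wikitext) if OTHER_ORTHOGRAPHY.search(t)]
```

A real filter would also need to handle the other template variants and to compare the old and new revisions, which this sketch leaves out.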

@-sche: FWIW, what I've done on some Munsee entries was have the entry at the common spelling, chiikhiikan, the stressed form in {{head|head=chiikhíikan}}, and the alternative (academic) spelling in {{head|tr=či·khí·kan}}. If kept this way, I should probably create a custom {{head}} template to automate it. --{{victar|talk}} 06:11, 3 August 2023 (UTC)

Colored boxes

@Quercus solaris. Various entries include a list of "colored boxes", e.g. purple box. This list has been copied manually from entry to entry. I have just created orange box and do not want to copy it everywhere. Could somebody please simplify things with a reusable template? Equinox 13:02, 1 August 2023 (UTC)

Hi, I have never made a template before, but I should try to learn. Will plan to start. If anyone else beats me to it on this instance (colored boxes), godspeed and thank you. Also I acknowledge here (to all) that Wiktionary doesn't necessarily have to show (as coordinate terms) a contrast set of cardinal notions of colored boxes, but I think it's worthwhile on balance. Thanks all. Quercus solaris (talk) 14:47, 1 August 2023 (UTC)
I was going to suggest moving {{colored boxes}} to the "Template:list:en:" naming scheme, but I notice it's not the only English list template that just has a generic name. My inclination would be to rename all the outliers. BTW Template:en-compass weirdly displays both English and Catalan everywhere it's used. - -sche (discuss) 03:15, 2 August 2023 (UTC)
@-sche I agree that we should rename all the non-conforming templates. All of them appear to be old (created 10+ years ago by User:Visviva, User:Doremítzwr or User:Daniel Carrero, none of whom are active currently). Benwing2 (talk) 23:26, 2 August 2023 (UTC)
@-sche I renamed all the non-conforming templates. I deleted {{en-compass}} in favor of existing {{list:compass points/en}}. Benwing2 (talk) 03:49, 4 August 2023 (UTC)

Smarter timeline templates?

{{timeline}} and {{en-timeline}} (and probably other timeline templates) are not smart enough to give good displays. Citations:voluntell illustrates the low value of the display at present. The years are not even actually displayed. Subintervals are stacked up in a crude summary of the more useful bold years starting the citation lines themselves. Not a good use of graphics. We may as well not have the timeline template at all in such cases.

I don't think a box-and-whiskers plot display is suitable for our citation-date data, at least not for cases with fewer than 10 citations. In addition it may be a bit much for our users. But a smarter template would adjust the width of the display intervals and the first and last intervals to what was actually present. It would also display all the dates. {{en-timeline}} appears on 14,184 pages in Citations space. DCDuring (talk) 16:22, 1 August 2023 (UTC)

I feel like this should be a default feature of citation pages, to be honest... Vininn126 (talk) 16:34, 1 August 2023 (UTC)
It's not worth displaying a timeline in cases where:
  1. we don't have all our citations for the definitions being displayed on the citations page OR
  2. there are, say, three or fewer citations to be displayed.
DCDuring (talk) 16:39, 1 August 2023 (UTC)

Issues regarding the Inuit languages

(moved to Wiktionary:Beer parlour/2023/August)

{{quote-av}}'s |time= parameter is borked

The |time= parameter of Template:quote-av is currently broken; instead of displaying the full time entered, it displays only the seconds portion of said time. Example of this behavior: the quote for verb sense 1 of dodge, where "|time=26:48" but the rendered template shows "48 from the start" instead of "26:48 from the start". Whoop whoop pull up Bitching Betty ⚧️ Averted crashes 08:47, 2 August 2023 (UTC)

@Whoop whoop pull up Should be fixed. Benwing2 (talk) 23:09, 2 August 2023 (UTC)
Thanx! Any idea what begat this bit of borkage? Whoop whoop pull up Bitching Betty ⚧️ Averted crashes 00:11, 3 August 2023 (UTC)
@Whoop whoop pull up It's a bit complicated. I implemented language prefixes in various params such as |title=, where you can now say e.g. |title=ar:عُنْوَان<t:Title> to indicate that the title is in Arabic (and in this case has a translation "Title" in English). This processing is also done on the |section= param, and currently {{quote-av}} sticks the value of |time= into the underlying |section= param. Formerly, the code to parse language prefixes in Module:parse utilities would wrongly (a) include numbers in the pattern checking for prefixes and (b) silently ignore and truncate bad language prefixes (rather than e.g. including them in the text itself). I fixed the code that checks for language prefixes to only recognize things that actually look like language codes (otherwise leaving the prefix untouched), and to throw an error if the language code is unrecognized. Benwing2 (talk) 00:40, 3 August 2023 (UTC)
Ahhhh, that makes sense. Whoop whoop pull up Bitching Betty ⚧️ Averted crashes 00:47, 3 August 2023 (UTC)
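
The failure mode described above can be reproduced in a few lines. The sketch below is in Python rather than Lua and does not reproduce the actual code in Module:parse utilities; the function names, the regular expressions, and the tiny set of known codes are all assumptions made for illustration.

```python
import re

# Hypothetical stand-in for the real table of language codes.
KNOWN_CODES = {"ar", "bg", "en", "la"}


def split_prefix_old(value):
    """Old behaviour as described: digits count as a possible prefix, and an
    unrecognised prefix is silently dropped, truncating the text."""
    m = re.match(r"^([0-9A-Za-z-]+):(.*)$", value)
    if not m:
        return None, value
    prefix, rest = m.groups()
    if prefix in KNOWN_CODES:
        return prefix, rest
    return None, rest  # bad prefix thrown away


def split_prefix_new(value):
    """New behaviour as described: only strings shaped like a language code
    are treated as prefixes, and an unknown code raises an error."""
    m = re.match(r"^([a-z]{2,3}(?:-[a-z]+)*):(.*)$", value)
    if not m:
        return None, value  # not a prefix at all; text left untouched
    prefix, rest = m.groups()
    if prefix not in KNOWN_CODES:
        raise ValueError(f"unrecognized language code: {prefix}")
    return prefix, rest
```

Under these assumptions, split_prefix_old("26:48") treats "26" as a bad language prefix and returns only "48", which matches the symptom reported for |time=, while split_prefix_new("26:48") returns the full "26:48".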

Bulgarian accent problem

Hello, on и#Bulgarian, I went and added ѝ as a homophone, but our current treatment (somewhere in the pipeline) of the diacritic on this letter results in the link getting emboldened, presumably to suggest it's trying to link to и instead. Although ѝ is not a separate letter in Bulgarian, this single word is the only case in which it's distinguished from the no-diacritic version. Where can we change it so that this works as expected? Kiril kovachev (talkcontribs) 19:10, 2 August 2023 (UTC)

Done. 0D foam (talk) 22:01, 2 August 2023 (UTC)
@Kiril kovachev The data in Module:languages/data/2 for Bulgarian (line 250, under entry_name) causes acute and grave accents to be removed from entry names, which is why you're seeing this. It's probably possible to make an exception for the entry ѝ specifically. This stuff has been changed a lot in the last few months by User:Theknightwho, can you answer how hard it is to make an exception for ѝ? Does this require that we have a special entry_name module, or is there a simple setting for it? Benwing2 (talk) 23:14, 2 August 2023 (UTC)
@Benwing2 Very easy - there's a remove_exceptions field designed for precisely this purpose. It is documented, but it's probably easier to just take a look at Serbo-Croatian to see how it works. Theknightwho (talk) 23:39, 2 August 2023 (UTC)
Actually, that being said, it's not possible to exclude individual entries. We might have to do something special for it in that case. Theknightwho (talk) 23:42, 2 August 2023 (UTC)
@Theknightwho Yeah that's what I figured; we'd have to create a special module for this purpose. Benwing2 (talk) 23:43, 2 August 2023 (UTC)
@Benwing2 Actually, I think it could be done with a bit of modification to the logic. remove_exceptions doesn't support patterns at the moment, but ideally we could put "^ѝ$" to exclude only this entry. Theknightwho (talk) 23:48, 2 August 2023 (UTC)
@Benwing2 @Kiril kovachev This was actually very straightforward to do (and was also the impetus for refactoring the remove_exceptions code, which is now a lot more efficient). Links to ѝ (ì) now work as intended, but in any other contexts ѝ will have the diacritic stripped from the link (e.g. ѝѝѝ (ììì)). Theknightwho (talk) 00:24, 3 August 2023 (UTC)
Great, thanks! Benwing2 (talk) 00:58, 3 August 2023 (UTC)
@0D foam @Theknightwho @Benwing2: thanks a lot, this is great. I do want to correct what I said earlier, though, which is that the word "ѝ" itself is distinguished from "и", which means if there's a multiword phrase or something like that which were to feature this word, the link would still be broken. E.g., in книгата ѝ (knigata ì, her book), this should link to книгата ѝ and not книгата и. Sorry that I didn't explain that very well originally. Fortunately I think there aren't any such cases in our entries, and I couldn't find any entry like this in the dictionaries, so we should be fine for now. I believe a more suitable pattern (in regex) would be \bѝ\b, the only issue being I don't think Lua supports this. Kiril kovachev (talkcontribs) 10:01, 3 August 2023 (UTC)
@Kiril kovachev Unfortunately Lua doesn’t support regex, but the equivalent would be %f[^%z%s]ѝ%f[%z%s]. Theknightwho (talk) 13:11, 3 August 2023 (UTC)
@Theknightwho That looks good, would it be possible to substitute that in place of the current pattern? Kiril kovachev (talkcontribs) 13:19, 3 August 2023 (UTC)
@Kiril kovachev Yep - done! Was on my phone when I posted the earlier comment, and it's awkward to edit Lua modules on mobile. Theknightwho (talk) 13:32, 3 August 2023 (UTC)
Thanks, nice one! книгата ѝ (knigata ì) @Theknightwho Kiril kovachev (talkcontribs) 13:34, 3 August 2023 (UTC)
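
For readers more at home with regex than with Lua frontier patterns, the behaviour agreed on above can be sketched in Python. This is an illustration only, not the code in Module:languages: the function name and the placeholder trick are assumptions, and the lookarounds (?<!\S) and (?!\S) are only an approximate counterpart of %f[^%z%s] and %f[%z%s] (they ignore the NUL case).

```python
import re
import unicodedata

ACUTE = "\u0301"
GRAVE = "\u0300"
# ѝ standing alone as a whitespace-delimited word.
STANDALONE = re.compile(r"(?<!\S)ѝ(?!\S)")
# Private-use character used to shield the exception; assumed absent from input.
PLACEHOLDER = "\ue000"


def bulgarian_entry_name(text):
    """Strip acute and grave accents, except on the standalone word ѝ."""
    text = unicodedata.normalize("NFC", text)
    shielded = STANDALONE.sub(PLACEHOLDER, text)
    decomposed = unicodedata.normalize("NFD", shielded)
    stripped = decomposed.replace(ACUTE, "").replace(GRAVE, "")
    # Recompose so that letters such as й keep their precomposed form.
    return unicodedata.normalize("NFC", stripped).replace(PLACEHOLDER, "ѝ")
```

With this sketch, a link to ѝ or to книгата ѝ keeps the accented word, while ѝѝѝ is reduced to иии, which is the outcome described in the thread.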

WOTD corrections possibly

WOTD for 8/3/23; two misspellings. Rigamarole was spelled rigmarole (which is correct in the UK, but not the U.S. as per https://www.oxfordlearnersdictionaries.com/us/definition/english/rigamarole#:~:text=%2F%CB%88r%C9%AA%C9%A1%C9%99m%C9%99r%C9%99%CA%8Al%2F-,%2F%CB%88r%C9%AA%C9%A1%C9%99m%C9%99r%C9%99%CA%8Al%2F,is%20annoying%20and%20seems%20unnecessary), and conman should be con man (in the U.S., as per https://www.merriam-webster.com/dictionary/con%20man).

Thanks! :) Tanscribingmama (talk) 16:08, 3 August 2023 (UTC)

@Tanscribingmama: The Oxford Learners Dictionary entry you linked states that rigmarole is both British and North American, and it's about 3x more common than rigamarole in the New York Times archive. The Merriam-Webster link doesn't say anything about conman; it's less common than con man but looks well-attested too. —Al-Muqanna المقنع (talk) 16:17, 3 August 2023 (UTC)
Thank you for this! Everywhere I've seen it spelled has had it spelled rigamarole. I could've just seen it written strangely of course, but here are a few more sources for everyone to decide on. "Rigamarole." Vocabulary.com Dictionary, Vocabulary.com, https://www.vocabulary.com/dictionary/rigamarole. Accessed 03 Aug. 2023. Interestingly, I found this article which states slightly different definitions depending on spelling, though it says both are correct: https://thecontentauthority.com/blog/rigamarole-vs-rigmarole.
Con man seems more correct simply because you call it a "con." I see documented instances of both as well, but here is Cambridge's answer (which you are correct, simply states both are fine): https://dictionary.cambridge.org/us/dictionary/english/con-man Tanscribingmama (talk) 16:33, 3 August 2023 (UTC)

{{{alts}}} not printing qualifiers in {{desc}}

I distinctly remember that {{{alts}}} used to grab whatever qualifiers were in {{alt}} and print them in {{desc}}; did something change? (An example of a lack of this would be the Belarusian descendant at Reconstruction:Proto-Slavic/slimakъ.) Vininn126 (talk) 17:01, 3 August 2023 (UTC)

@Vininn126 Are you sure about this? I rewrote this code last November, but I checked the code from 2020 and it ignored dialect tags (i.e. anything after a double pipe), just like it does now. I can implement this, but how should they display? For ideas, see Template:synonyms/documentation#Dialect tags; {{syn}} etc. supports both overall and term-specific dialect tags, displaying the former after an en-dash (should be an em-dash BTW) and the latter in brackets. Benwing2 (talk) 00:14, 4 August 2023 (UTC)
@Benwing2 I'm just curious why, I suppose. I personally feel it would be useful to have e.g. "Middle Polish" displayed among the forms. Vininn126 (talk) 07:07, 4 August 2023 (UTC)
@Vininn126 Should it display in brackets similar to dialect tags for synonyms? Benwing2 (talk) 07:08, 4 August 2023 (UTC)
@Benwing2 Yeah, I think so. Vininn126 (talk) 07:14, 4 August 2023 (UTC)
@Benwing2: I don't think the brackets look good, nor is it consistent with how we normally format qualifiers, with parentheses, per |q= and |qq=. --{{victar|talk}} 03:52, 5 August 2023 (UTC)
@Victar Blah. The brackets are also what is done with {{syn}} et al.; see the documentation there. I did this because with parens you get two sets of parens when there's translit (as in the case that User:Vininn126 gave), which IMO looks worse than brackets. Benwing2 (talk) 04:03, 5 August 2023 (UTC)
@Benwing2: then put it before, as in |q=. We did it that way for the same reason before {{desc}}. --{{victar|talk}} 04:10, 5 August 2023 (UTC)
I'm going to wait to see if there are other opinions. Benwing2 (talk) 04:13, 5 August 2023 (UTC)
The current approach looks spammy. See Reconstruction:Proto-Kartvelian/ḳerčx- for instance. The dialect tag should be printed once before the terms, I see no point in repeating the tag for each alt. კვარია (talk) 08:19, 5 August 2023 (UTC)
@კვარია Let me see what I can do; I did this because of the way it is combining multiple sets of terms, where the dialect tags apply to only some of them, but I see your point. Benwing2 (talk) 04:27, 7 August 2023 (UTC)
Cool. Also, a small inconsistency: if the first alt's tags/qualifiers don't match that of the main entry, it should be separated with a semicolon rather than a comma. კვარია (talk) 06:37, 7 August 2023 (UTC)
I would not wish for that. --{{victar|talk}} 06:51, 7 August 2023 (UTC)
All commas then? Currently it's not consistent.
main-term COMMA alt-term-list [TAG] SEMICOLON alt-term-list [TAG] SEMICOLON ... etc
Though I'm sure nobody actually pays attention to that except for weirdos such as myself. კვარია (talk) 09:12, 7 August 2023 (UTC)
All commas would be my preference, with the qualifier in front. --{{victar|talk}} 15:38, 7 August 2023 (UTC)
@Benwing2, it's been a week. Thoughts? --{{victar|talk}} 18:09, 15 August 2023 (UTC)
@Victar @კვარია Sorry, I started on fixing this and got bogged down by the fact that {{desc}} already supports both overall and per-term dialect tags, with per-term dialect tags displayed before the term as a qualifier, and the overall tags displayed after all terms in brackets. Since I don't think this support is documented, I may just remove it since I suspect it's not used. In any case I will work on this tonight. My plan was to use semicolons as necessary for grouping in cases where you have several terms, some of which have a dialect tag and some don't. IMO there's no way around using semicolons unless you want to tag every term with its associated dialect tags, which I agree is overkill. Benwing2 (talk) 18:22, 15 August 2023 (UTC)
@Benwing2: We've gotten around using brackets and semicolons by putting the qualifier in front, indicating everything beyond it falls under that qualifier, ex. English: term1, (dialectal) term2, term3, (archaic) term4. --{{victar|talk}} 19:01, 15 August 2023 (UTC)
@Benwing2: In the meantime, can we turn it off? Right now, it's super janky, making it harder to read. --{{victar|talk}} 00:25, 17 August 2023 (UTC)
@Victar Yeah I will turn it off. I've gotten at least 50 messages in the last 24 hours and it's a lot of work responding to them all, so I don't always have time to work on fixing stuff like this. I should add though in terms of comma-only, that (a) this is ambiguous, (b) it doesn't work at all if e.g. term4 should not have the dialectal tag but term2 and term3 should. Benwing2 (talk) 02:21, 17 August 2023 (UTC)
Pace victar, I also consider the comma-only requirement problematic. 0DF (talk) 10:46, 17 August 2023 (UTC)
For the record, |alts= never grabbed qualifiers from {{alt}}. While you're there, can you get {{desc}} to also search for the header ====Alternative reconstructions====? --{{victar|talk}} 08:14, 4 August 2023 (UTC)
@Victar Sure, doesn't sound hard. Benwing2 (talk) 19:22, 4 August 2023 (UTC)
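
The disagreement in this thread between semicolon grouping and comma-only output with the qualifier in front can be made concrete with a small Python sketch. Neither function is the actual {{desc}} code; the names, the input shape (a list of term and tag pairs), and the exact punctuation are assumptions based on the formats quoted in the discussion.

```python
from itertools import groupby


def format_semicolon_groups(terms):
    """Group consecutive terms sharing a tag; print the tag once, in
    brackets, after its group; separate groups with semicolons."""
    groups = []
    for tag, items in groupby(terms, key=lambda pair: pair[1]):
        text = ", ".join(term for term, _ in items)
        groups.append(f"{text} [{tag}]" if tag else text)
    return "; ".join(groups)


def format_comma_only(terms):
    """Put the qualifier in front of the first term it applies to and
    separate everything with commas."""
    parts = []
    previous = None
    for term, tag in terms:
        if tag and tag != previous:
            parts.append(f"({tag}) {term}")
        else:
            parts.append(term)
        previous = tag
    return ", ".join(parts)
```

Given term1 and term4 untagged and term2 and term3 tagged "dialectal", the first function produces "term1; term2, term3 [dialectal]; term4", while the second produces "term1, (dialectal) term2, term3, term4", where nothing shows that term4 is outside the dialectal group. That is the case Benwing2 raises in point (b).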

Separate pronunciations in la-IPA

@Benwing2: Quick idea independent of the current discussion about what to label "ecclesiastical" pronunciation. It would be nice to be able to manually specify separate pronunciations in {{la-IPA}} for the two versions presented atm. So far I've just used the janky workaround of separate |eccl=no and |classical=no lines, as at -cumque recently. —Al-Muqanna المقنع (talk) 14:53, 4 August 2023 (UTC)

@Al-Muqanna Yup this is a good idea. It's a bit similar to how we can specify separate pronunciations for different Portuguese dialects (esp. Brazilian vs. European). Benwing2 (talk) 19:23, 4 August 2023 (UTC)

Add the word "dictionary" somewhere in every page

The lack of the word "dictionary" anywhere in the definition pages causes Wiktionary to rank very low when searched on Google and other search engines. See example https://www.google.com/search?q=cat+dictionary. 85.246.37.185 01:37, 6 August 2023 (UTC)

I suggest to change “random entry” in the sidebar to “random dictionary entry”, if this is not too long. Fay Freak (talk) 02:45, 6 August 2023 (UTC)
I think it would be better to add a tag like <meta name="title" content="cat – Wiktionary, the free dictionary"> in the HTML <head> tag. However, doing so requires asking the developers, who are not interested in this project. — SURJECTION / T / C / L / 08:13, 6 August 2023 (UTC)
Or alternatively make something like that the new <title> and use JavaScript to replace it with the current format where the title is only followed by - Wiktionary. — SURJECTION / T / C / L / 08:17, 6 August 2023 (UTC)
In fact, we can control the page title ourselves without troubling developers, using MediaWiki:Pagetitle. (Moreover, search engines probably don't pick up JavaScript-based page title changes.) @Surjection how about we change it to "$1 - Wiktionary, the free dictionary"?
We could also add the word "dictionary" to MediaWiki:Wikimedia-copyright, which appears in the footer of each page. The prefix "Text is available..." could be altered to "The text of Wiktionary, the free dictionary, is available..."
These very small and unobtrusive SEO tweaks might have a real impact, especially on non-Google search engines. This, that and the other (talk) 11:31, 6 August 2023 (UTC)
I thought of another idea. Wikipedia displays MediaWiki:Tagline for logged out users under the page title. Maybe we could do the same. — SURJECTION / T / C / L / 11:57, 6 August 2023 (UTC)
Not an option, it seems - Google indexes the mobile versions of the pages, which never display the text. — SURJECTION / T / C / L / 13:50, 6 August 2023 (UTC)
The appropriate message is MediaWiki:Pagetitle. This is the title shown by Google. See w:ca:MediaWiki:Pagetitle after the same discussion on cawiki. Vriullop (talk) 06:15, 7 August 2023 (UTC)
I've changed MediaWiki:Pagetitle to "$1 - Wiktionary, the free dictionary" just to see what happens. Happy to revert it if there are objections. This, that and the other (talk) 00:35, 8 August 2023 (UTC)
Hi, that is better but can you change it to "- Definition from Wiktionary, the free dictionary"? Like how it already is in a div that is not displayed (which has no effect on SEO if I'm not mistaken) <div id="siteSub" class="noprint">. I swear the only reason people are using other online dictionaries instead of Wiktionary is because of Wiktionary's poor ranking; I swear the definitions are better and more expansive in Wiktionary. --ModernDaySlavery (talk) 03:18, 8 August 2023 (UTC)
We shouldn't have "definition from" in the page title. On the contrary, we should remove "definition" from the tagline, which I have now done. — SURJECTION / T / C / L / 04:13, 8 August 2023 (UTC)
But the keywords "define" and "definition" are used even more than "dictionary". --ModernDaySlavery (talk) 05:50, 8 August 2023 (UTC)
Agreed. "Define" and/or "definition" should be included somehow. Andrew Sheedy (talk) 02:02, 18 August 2023 (UTC)
Why "$1"? User:CitationsFreak (talk) 04:06, 18 August 2023 (UTC)
@CitationsFreak: That's how MediaWiki does variables in system messages. Without it, the title for this page would be just " - Wiktionary, the free dictionary" instead of "Wiktionary:Grease pit/2023/August - Wiktionary, the free dictionary" Chuck Entz (talk) 04:40, 18 August 2023 (UTC)
Ah. cf (talk) 05:03, 18 August 2023 (UTC)
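
To make Chuck Entz's point concrete, here is a toy Python sketch of positional placeholder substitution. It is not MediaWiki's implementation; the function name and the handling of out-of-range placeholders are assumptions chosen for the example.

```python
import re


def render_message(template, *params):
    """Replace $1, $2, ... with the corresponding positional parameter.
    Placeholders without a matching parameter are left as they are."""
    def substitute(match):
        index = int(match.group(1))
        if 1 <= index <= len(params):
            return params[index - 1]
        return match.group(0)
    return re.sub(r"\$(\d+)", substitute, template)
```

Here render_message("$1 - Wiktionary, the free dictionary", "cat") gives "cat - Wiktionary, the free dictionary", whereas a template without $1 has nowhere to put the page name and every page gets the same title.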
Maybe we could also add "define" and "definition" somewhere too. I one hundred percent agree that the main reason people use dictionary.com, Merriam-Webster, Cambridge Dictionary and Collins rather than Wiktionary is that Wiktionary regularly ranks lower than those in Google searches. --ModernDaySlavery (talk) 22:12, 6 August 2023 (UTC)
Regardless of where we add it, we have to add it somewhere; it's the golden rule of SEO that if a word doesn't exist in a page, it won't be shown in the results. How do we contact the developers, User:Surjection? --39.109.192.209 10:53, 6 August 2023 (UTC)
The developers would be contacted through Phabricator. — SURJECTION / T / C / L / 14:08, 6 August 2023 (UTC)
Our logo image says "Wiktionary, the free dictionary", so just give it the appropriate alt text, which it should have in any case for accessibility for the blind. Equinox 13:28, 6 August 2023 (UTC)
Tim Starling, a long standing developer, has stated that the site logo is purely decorative and should not carry alt text. This, that and the other (talk) 00:33, 8 August 2023 (UTC)
Which seems like an ill-advised opinion with respect (or in this case, disrespect) to accessibility. -_-
- -sche (discuss) 01:06, 8 August 2023 (UTC)
I support making changes to this effect, such as the one TTO made to MediaWiki:Pagetitle; IMO it would be good to also have the word appear in the body text somewhere, not just the title. For desktop users, we could also change the "About Wiktionary" link in the footer to "About Wiktionary, the free dictionary" (and maybe update that page: I have gotten feedback from a couple people over the years that when they were looking at a Wiktionary entry, and wanted to find out more about our 'editorial stance' and what we included vs didn't include, that's the link they clicked on). But that text / link doesn't seem to be loaded on mobile. There is, however, a bare "Wiktionary" by itself at the bottom of every mobile page, in between the "Last edited N ago by X" and the "Content is available under..." : could we expand that to say "Wiktionary, the free dictionary" too? - -sche (discuss) 01:06, 8 August 2023 (UTC)
I also support this. We could ask for the WikiSEO extension to be installed, which is designed for exactly this kind of thing. It would enable the {{#seo}} parser function, which would allow us to add a bunch of things to pages. This could probably be done via the headword module in Lua. Theknightwho (talk) 04:40, 8 August 2023 (UTC)
I would support this. What's the process for installing an extension? Ioaxxere (talk) 15:48, 8 August 2023 (UTC)
@Ioaxxere Our best bet is to have a vote on it, and then go to the Phabricator requesting it. WikiSEO does seem to be well-maintained, which makes it more likely we'll get approval.
@Surjection Are you familiar with the process at all? Theknightwho (talk) 20:55, 8 August 2023 (UTC)
@Theknightwho here's the vote. Ioaxxere (talk) 21:12, 9 August 2023 (UTC)
FWIW, this prompted me to search for google:"Wiktionary defines" vs. google:"Merriam-Webster defines" to see how we compare in who cites us and for what: results are here. Probably more informative is that in Google Books' Ngram Viewer, "Wiktionary defines" has climbed to about 1/5th as common as "Merriam-Webster defines", which is not bad; M-W is itself about 1/5th as common as "the OED defines". Having even a bare minimum of SEO now should help, since we do in my experience turn up dead last in searches for "definition of foobar", except when foobar is such an obscure word other dictionaries don't define it. - -sche (discuss) 12:45, 8 August 2023 (UTC)
How about adding "definitions" to the copyright messages on both desktop and mobile? e.g. changing desktop "Text is available..." to "Definitions and other text is available..." and mobile "Content is available" to "Definitions and other content..." — SURJECTION / T / C / L / 14:11, 8 August 2023 (UTC)
("Definitions and other text are available!") Equinox 14:37, 8 August 2023 (UTC)
Should've proofread my own comments... — SURJECTION / T / C / L / 14:54, 8 August 2023 (UTC)
Sounds like a fine idea. (I wonder why the desktop version has "Text is available..." while the mobile version has "Content is available..."; just different people writing them at different times?) - -sche (discuss) 15:19, 8 August 2023 (UTC)
If anyone else wants to institute this (as in if another interface admin agrees with this idea), the desktop text is at MediaWiki:Wikimedia-copyright and the mobile one at MediaWiki:Mobile-frontend-copyright. — SURJECTION / T / C / L / 18:58, 8 August 2023 (UTC)
(I wonder if those support namespace-specific wording?...) — SURJECTION / T / C / L / 19:47, 8 August 2023 (UTC)
Special:Permalink/75586872 (MediaWiki), Special:Permalink/75586716 (last two lines to add to Common.css) appears to work. — SURJECTION / T / C / L / 20:40, 8 August 2023 (UTC)
I've added the simple version; if someone else can double-check that the code for the NS-specific version looks good, we could add that. Is MediaWiki:Aboutsite the page that makes the "About Wiktionary" text at the bottom of the (desktop) page in between "Privacy policy" and "Disclaimers"? I'm wondering if it might help to change it to "About Wiktionary, the free dictionary" so that "dictionary" is on every page and not just in the title. Alternatively, we could stick "dictionary" into the copyright notice too, as TTO suggested; that way it would be in the body (not just the header) of pages even on the mobile site, which someone said above is the one Google indexes. - -sche (discuss) 22:27, 8 August 2023 (UTC)
It doesn't necessarily have to be in the body of the page if it's already in the page title. — SURJECTION / T / C / L / 11:08, 9 August 2023 (UTC)Reply[reply]
BTW Wiktionary is even outranked by sites that copy it. See for example https://www.google.com/search?q=%22But+patent+applications+are+increasingly+accompanied+by+volumes+and+volumes+of+data+on+DVD%2C+which+taxes+the+resources+of+the+patent+office%22&sca_esv=555356848&ei=JGnUZOnKGZW5qtsP_6WwgA4&ved=0ahUKEwjp7N-KotGAAxWVnGoFHf8SDOAQ4dUDCBA&uact=5&oq=%22But+patent+applications+are+increasingly+accompanied+by+volumes+and+volumes+of+data+on+DVD%2C+which+taxes+the+resources+of+the+patent+office%22&gs_lp=Egxnd3Mtd2l6LXNlcnAijAEiQnV0IHBhdGVudCBhcHBsaWNhdGlvbnMgYXJlIGluY3JlYXNpbmdseSBhY2NvbXBhbmllZCBieSB2b2x1bWVzIGFuZCB2b2x1bWVzIG9mIGRhdGEgb24gRFZELCB3aGljaCB0YXhlcyB0aGUgcmVzb3VyY2VzIG9mIHRoZSBwYXRlbnQgb2ZmaWNlIkgAUABYAHAAeAGQAQCYAQCgAQCqAQC4AQPIAQD4AQHiAwQYACBB&sclient=gws-wiz-serp with a quote from an IBM blog that is no longer online; Wiktionary ranks 3rd behind two sites that copy it. Benwing2 (talk) 04:38, 10 August 2023 (UTC)Reply[reply]
Here is an example [6] where a Wiktionary copy site appears first and Wiktionary doesn't appear at all. Benwing2 (talk) 07:45, 10 August 2023 (UTC)Reply[reply]
I see Wiktionary as #2 on the search today. DCDuring (talk) 01:50, 18 August 2023 (UTC)Reply[reply]
@Ioaxxere I'm late to the party, and let me know if you'd rather I ask this on the vote page you created. How high should Wiktionary entries rank in search results? I understand the passion and dedication of Wiktionary editors, and I'm trying to be a helpful one myself, but do we have any quality metrics around how good we are, overall? I'm not well-versed in SEO, so I don't know how much of a boost the proposed use of WikiSEO would give us, but if our results were to bubble up next to the Merriam-Websters of the world (to pick an example), would that be a "good thing"? Chernorizets (talk) 06:45, 17 August 2023 (UTC)Reply[reply]
There is zero chance of being able to determine exactly where we fall in Google rankings. Given that a large number of Wiktionary English entries are unambiguously better than their M-W counterparts I wouldn't care about it anyway. —Al-Muqanna المقنع (talk) 08:29, 17 August 2023 (UTC)Reply[reply]
For what it's worth, our French equivalent, Wiktionnaire, seems to rank pretty high when I look for words and is much more widely used within the French-speaking world than Wiktionary is in the English-speaking world. Their entries are higher quality in some ways, but overall, our quality is pretty comparable. If we ranked higher, we'd also attract more editors, who would help to resolve some of the quality issues. Personally, I find Wiktionary better (clearer layout, no distracting ads, often better definitions) and more comprehensive than any other free online dictionary. Andrew Sheedy (talk) 19:41, 17 August 2023 (UTC)Reply[reply]
@Andrew Sheedy I like the idea of higher visibility attracting more editors. Both you and @Al-Muqanna have made statements about Wiktionary's quality compared to other resources available online - I don't have data to disagree, but I also do believe it's fundamentally a data-driven question. I think it would be hugely beneficial to Wiktionary to periodically perform well-scoped, well-defined studies to qualitatively and quantitatively determine how well we're doing compared to professional lexicographers. The Wikipedia article about Wiktionary mentions its use in academia, so some high-level quality indicators might already be available from projects and institutions. Chernorizets (talk) 20:59, 17 August 2023 (UTC)Reply[reply]
In my experience the quality of Wiktionnaire's French entries is just on another level relative to our English entries. They have rich collections of usage examples, collocations and quotations for every sense of common words. On the other hand, many of our English entries are languishing in Webster-1913-land and don't have anything beyond a definition, and maybe a dodgy usage example or synonym if you're lucky. Still, some of our more obscure English entries rank highly in Google search results, as do (to take one example) our Latin entries. This, that and the other (talk) 23:13, 17 August 2023 (UTC)Reply[reply]
For common words, it’s true, but we are catching up, and English Wiktionary has provided more value all the time because of its seamless coverage of the most fringe registers of a language, even if it has been achieved by dodgy transitions and redundancies of senses: underrated, whereas common words have inherent difficulties and one edits them randomly and slowly in connection with specialized vocabulary, for any great ideas should invite caution of the editor who would desire the big picture, and of the reader who understands that ideal conditions for the bulk of commonplace words cannot be expected to be found in an online dictionary yet.
And even in that we have acutely become better by luring in the internet through accurate coverage of recent internet coinage on which no reliable or reputable source reports. The ordinary has less potential to excite and thus make a dictionary stand out from the crowd, though it is not to be denied from an objective standpoint that for everyday terms it must also be helpful.
It’s also an old schism of taste between dictionarians, deeply rooted in personalities engendering diverse work approaches. Some have a penchant to recover the rareties and by their help perhaps make novel comparison points; others pursue romantic ideas of basic vocabularies, like Swadesh lists, and in them find copious distinction by gradually further digressing into the realm of idioms and subtlety; both efforts are of equal productivity and ultimately connect to each other, Nöldeke’s attestation dictionary for the most sought words and Ullmann’s dictionary of the classical Arabic language as practical complement each other while leaving out about as much, then again the less controlled ones like Wehr and Freytag are indispensable, because no man can have first-hand introspection of a whole culture language’s vocabulary; wrong are only those who are blithely unaware of frequencies and other-than-glossed language description altogether and treat languages as doculects. Ultimately Wiktionary will be better than all because all checks and considerations of a nation of intellectuals, philologic and linguistic and other scientific and artisanal thought collectives, are combined. Meanwhile, English Wiktionary has been most impressive and least ridiculous, although certainly also as a function of the domination of the English language on the internet, in science and in programming, which created a peculiar network effect. On French Wiktionary, I can barely rely that they identify a term as French correctly, rather than say a French transcription, and a hapax legomenon, like trasciner, where they were disinclined to recognize its singular context, without some secondary source evidently supporting them, so they will always stay secondary. Who is on which level? It’s like these two dictionaries levelled their skill-trees varyingly. Fay Freak (talk) 00:44, 18 August 2023 (UTC)Reply[reply]
  • This link to an archived discussion thread refers to one or two different site indexes which do not seem to be done automatically, but which Google (and others?) uses somehow. Are these being done for en.wikt? DCDuring (talk) 01:46, 18 August 2023 (UTC)Reply[reply]
    @DCDuring Probably not, as the lack of indexing is a serious issue. Some of my (comprehensive, well-organized) entries have yet to be shown by Google even months after their creation. After we install the extension, we need to make an inquiry on Phabricator to figure out what's going on. Ioaxxere (talk) 04:38, 18 August 2023 (UTC)Reply[reply]
    As might be expected there is some discussion at Phabricator. It even reached a wiki-tech mailing list, which is how I heard. DCDuring (talk) 12:20, 18 August 2023 (UTC)Reply[reply]

acquaintanced is not a word, yet wikipedia accepts it as one.

Can we change it so that the page for this word refers to the actual word, "acquainted"? 185.80.221.235 06:20, 6 August 2023 (UTC)Reply[reply]

Wikipedia is irrelevant. This is Wiktionary. Wiktionary is a descriptive dictionary based on usage, so if enough people use it in English in running text to convey meaning, we have an English entry covering it. That makes it a word, as far as Wiktionary is concerned. We may label it as rare, proscribed or nonstandard, but we don't close our eyes and pretend it doesn't exist. Chuck Entz (talk) 06:34, 6 August 2023 (UTC)Reply[reply]
I've RfVed it. DCDuring (talk) 14:21, 6 August 2023 (UTC)Reply[reply]

Trouble with T:outdent in forked discussions

Template {{outdent}} and the '[reply]' button don't play nicely. Put at its simplest, the code implementing the reply button doesn't understand {{outdent}}, with strange to disruptive effects in long, forked discussions. --RichardW57m (talk) 08:41, 8 August 2023 (UTC)Reply[reply]

Normalization with Latf (Fraktur)

It'd be nice to get normalizations for quotes in Latf (Fraktur) as Latn (normal Latin script), either by adding a |normsc= parameter or doing so automatically. {{lang}} isn't really an option, because it causes different script classes to be nested, and that cannot be nicely dealt with in CSS when adding fonts. — SURJECTION / T / C / L / 15:30, 8 August 2023 (UTC)

@Surjection There is in fact a |normsc= param. It might not yet be supported for {{quote-book}} specifically, but all the other quote-* templates support it. Let me know and I'll get {{quote-book}} to support it. Benwing2 (talk) 07:44, 9 August 2023 (UTC)Reply[reply]

links

Is there a way I can see the meaning of a word without clicking on it? 130.43.71.251 14:40, 9 August 2023 (UTC)

I would imagine there would be a problem with links generating without an ID. If there's a linked ID, it might be possible. Vininn126 (talk) 16:54, 9 August 2023 (UTC)Reply[reply]
I suppose the IP is referring to tooltips? Wikipedia seems to have functionality where if you hover over a link, it shows a snippet of the page in question. Anyone know how this works? User:Vininn126 what do you mean without an ID? I imagine the JavaScript code that implements the tooltip could parse the {{m}} or {{l}} call, call full_link() and fetch text from the right section of the resulting page. Benwing2 (talk) 18:40, 9 August 2023 (UTC)Reply[reply]
@Benwing2 I'm just wondering what text would be previewed from a link using only [[]]'s with multiple definitions or even L2's. Vininn126 (talk) 19:16, 9 August 2023 (UTC)Reply[reply]
@Vininn126, Benwing2 My ideas—
For a bare link with multiple languages, list the first few languages.
[[box]]

box has definitions in English, Czech, and 12 other languages.

Alternatively, treat [[box]] as [[box#English]] or whatever the first L2 header is and follow the example shown below.
For a bare link with a single language or a link to a language section, list the first few definitions.
{{l|en|box}}

English definitions of box:

  1. Senses relating to a three-dimensional object or space.
    1. A cuboid space; a cuboid container, often with a hinged lid.
    2. A cuboid container and its contents; as much as fills such a container. (and 47 other definitions)
Or, use a more sophisticated algorithm that prioritizes covering as many sections as possible.

English definitions of box:

  1. A cuboid space; a cuboid container, often with a hinged lid.
  2. To place inside a box; to pack in one or more boxes.
  3. Any of various evergreen shrubs or trees of genus Buxus, especially common box, European box, or boxwood (Buxus sempervirens) which is often used for making hedges and topiary. (and 47 other definitions)
And with the same idea for linking to sections within an entry.
[[box#Etymology 3|box]]

English definitions of box at § Etymology 3:

  1. A blow with the fist.
  2. To strike with the fists; to punch.
  3. To fight against (a person) in a boxing match. (and 1 other definition)
I don't think we need to worry about confusion from mixing parts of speech, as the part of speech should be obvious from the definition.
Ioaxxere (talk) 23:56, 9 August 2023 (UTC)Reply[reply]
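The "first few definitions" format proposed above could be sketched roughly as follows. This is only an illustration in Python, not gadget code: the function name `summarize_definitions` and its parameters are hypothetical, and it assumes the definitions have already been extracted from the entry as plain strings.

```python
def summarize_definitions(term, language, definitions, limit=3, section=None):
    """Build tooltip text listing the first `limit` definitions of a term.

    `definitions` is assumed to be a list of plain-text definition strings
    that some earlier (not shown) step has extracted from the entry.
    """
    where = f" at § {section}" if section else ""
    lines = [f"{language} definitions of {term}{where}:"]

    shown = definitions[:limit]
    for number, text in enumerate(shown, start=1):
        lines.append(f"  {number}. {text}")

    # Mention how many definitions were left out, if any.
    remaining = len(definitions) - len(shown)
    if remaining > 0:
        noun = "definition" if remaining == 1 else "definitions"
        lines[-1] += f" (and {remaining} other {noun})"

    return "\n".join(lines)
```

The harder part, which this sketch leaves out entirely, is fetching the right language section and pulling the definition lines out of it.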
@Ioaxxere, Vininn126 Yes, this seems solvable. I think a bare link should show only the English section if one exists, since most bare links show up in definitions and are intended for English words. Benwing2 (talk) 00:42, 10 August 2023 (UTC)Reply[reply]
Seems reasonable. Vininn126 (talk) 09:39, 10 August 2023 (UTC)Reply[reply]
@Benwing2 @Ioaxxere I think we need to be careful in assuming that. While it's probably true in a general sense, there are quite a lot of small languages where most/all of their entries have bare links due to being edited by one person for a few months who didn't really know what they were doing (from a technical perspective). While we should obviously aim to clear them up, many of them have sat like that for years. Theknightwho (talk) 22:02, 15 August 2023 (UTC)Reply[reply]
@Theknightwho Most of these will have no English section, and will fall back to displaying all languages. If they do happen to have an English section, they probably have a lot of other languages, too, so showing all languages is unlikely to be helpful. We can try to make this smarter by (for example) assuming that bare links in lists in certain sections (Related terms, Derived terms, etc.) are more likely to be of the language of the section they're in, rather than English; in fact I have a script fix_links.py that I've run various times on various languages that makes more or less exactly these assumptions in order to clean up bare links. But in this case the perfect is clearly the enemy of the good; have you taken a look at the JavaScript gadget linked below by User:-sche (in particular, all 7,285 lines of it)? Benwing2 (talk) 06:03, 16 August 2023 (UTC)Reply[reply]
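A rough sketch of the fallback rule discussed here, in Python for illustration only. The names `preview_languages` and `LIST_SECTIONS` are made up, and the set of list sections is an assumption based on the two examples mentioned above; this is not how fix_links.py or the popups gadget is actually written.

```python
# Sections in which a bare link is assumed to point at the entry's own
# language rather than English (hypothetical, based on the examples above).
LIST_SECTIONS = {"Related terms", "Derived terms"}


def preview_languages(target_languages, source_language=None, source_section=None):
    """Pick which language sections of the target page a preview should show.

    `target_languages` lists the L2 headers found on the linked page;
    `source_language` and `source_section` describe where the bare link sits.
    """
    # Inside a list section, prefer the language of the surrounding entry.
    if source_section in LIST_SECTIONS and source_language in target_languages:
        return [source_language]
    # Otherwise show English if the target page has an English section.
    if "English" in target_languages:
        return ["English"]
    # No English section: fall back to displaying all languages.
    return list(target_languages)
```

For example, `preview_languages(["Czech", "Welsh"])` falls back to both languages, while a bare link in a Czech "Derived terms" list resolves to Czech even if the target page also has an English section.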
@Ioaxxere: The first idea feels racist! What have you got against the Welsh? --RichardW57m (talk) 10:26, 10 August 2023 (UTC)Reply[reply]
You can turn on MediaWiki:Gadget-popups.js (currently called "navigation popups", historically called "Lupin popups") in your Preferences; it is also what Wikipedia uses. Some other users wrote some code I implemented last year, so it now shows more useful content more often (but for some pages, fails to show anything), but it would be great if someone could either figure out how to make it pull the language and definitions as proposed above consistently, or code a new popup/tooltip to do that. I agree with Benwing that a [[bare]] link should show some definitions (whether English-only or more), since we intend for people to use such links in definitions to link to English words. Ideally we should retain the useful functionality that you can currently also hover over a "diff" link in e.g. your watchlist and see the changes. - -sche (discuss) 11:30, 10 August 2023 (UTC)Reply[reply]

Treatment of roots as lemmas in Semitic languages

Currently, roots are categorised as lemmas in all Semitic languages, which is completely false. I understand that in some languages (like Proto-Indo-European) words are lemmatised at the root, but that shouldn't be the general behaviour. Is it possible to modify the infrastructure to handle roots as lemmas or non-lemmas depending on the language? @Benwing2, Theknightwho - Fenakhay (حيطي · مساهماتي) 16:33, 9 August 2023 (UTC)

Does it really make sense to group roots in with inflections? It might be worth splitting the category called "non-lemmas" up, which may address some of the other issues we've had with (e.g.) alternative forms. Theknightwho (talk) 17:30, 9 August 2023 (UTC)Reply[reply]
@Benwing2, Theknightwho, Fenakhay, Fay Freak: In general, referring a root to a single lemma will not be adequate, and it generally makes sense to see the root as more basic than the word-level lemma. (Not all word-level citation forms are words, though.) --RichardW57m (talk) 09:32, 10 August 2023 (UTC)Reply[reply]
I needed to think for a few minutes about why Fenakhay reckons it wrong: apparently it is because the “roots” aren’t the citation forms of anything, while in Sanskrit, for example, they apparently are. This may exemplify another layer of polysemy of the linguistic term “root”: we thus use it in POS headers in two meanings. Fay Freak (talk) 18:27, 9 August 2023 (UTC)
Yeah either splitting "root" into two terms (although I don't know what it'd be called) or (more radically) splitting non-lemmas is possible (although again I don't know what the resulting groups would be). Benwing2 (talk) 18:29, 9 August 2023 (UTC)Reply[reply]
I think you'll find it's the Sanskrit verbs which show a tendency to lemmatise at the root, with a competing tendency to lemmatise at a 3s of the present tense. At Wiktionary, the Sanskrit present tense forms seem to have more senses than the Sanskrit roots. The problem is that there may be multiple stems for the present tense, with a potential for different stems to have different meanings. That sort of thing has been called out for Classical Greek perfect tenses (ablaut v. -ka). That sort of split has even been claimed for English past tenses with alternative forms in -t and -ed. Calling on the Sanskrit editors for comments - (Notifying AryamanA, Bhagadatta, Svartava, JohnC5, Kutchkutch, Inqilābī, Getsnoopy, Rishabhbhat): . With Sanskrit, not all finite verb forms are made from a present stem, increasing the tendency to refer meanings to the root rather than present tense citation form.
I'm not sure that there is a systematic difference between the types of verbal roots - it's more what groups of editors have chosen to do, probably largely based on the dictionaries they're copying. --RichardW57m (talk) 10:18, 10 August 2023 (UTC)Reply[reply]

Invocation and template params

What are these? The terms turn up in inline documentation of form-of modules. Where should I have found the definitions? My first thought is that invocation params are ones that are supplied when invoking a template and that the 'template params' are the ones in the module invocation that define the functionality of one template as opposed to another, but the descriptions then don't make sense. Pinging @Rua, Benwing2, Erutuon, Theknightwho as the likeliest to know or supply good mnemonics. It might be the other way round, as there are warning messages around that a page needs an inflection template, when what it usually needs is the invocation of a pre-existing inflection template. --RichardW57m (talk) 14:46, 11 August 2023 (UTC)Reply[reply]

@RichardW57 Where is this mentioned exactly? I'll clean this up. "Invocation params" are the params passed in when you call {{#invoke:...}} and "template params" are the params of the template whose definition involves calling {{#invoke:...}}. Invocation params are retrieved using frame.args and template params are retrieved using frame:getParent().args. Benwing2 (talk) 21:58, 11 August 2023 (UTC)Reply[reply]
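As a mnemonic, the two kinds of params can be modelled with a toy frame object. The sketch below is plain Python, not Scribunto: the `Frame` class, `get_parent` and the example argument values are all hypothetical stand-ins that only mimic the relationship between frame.args and frame:getParent().args described above.

```python
class Frame:
    """Toy stand-in for a Scribunto frame (not the real Lua API)."""

    def __init__(self, args, parent=None):
        self.args = dict(args)   # params passed at this call level
        self._parent = parent    # the frame of the calling template, if any

    def get_parent(self):
        return self._parent


def describe(frame):
    """Return invocation params and template params as seen from a module."""
    parent = frame.get_parent()
    return {
        "invocation": frame.args,                       # from {{#invoke:...}}
        "template": parent.args if parent else {},      # from the template call
    }


# An editor writes a template call with these (made-up) params on a page...
template_frame = Frame({"1": "en", "2": "box"})
# ...and the template's own definition contains an #invoke with these params.
invoke_frame = Frame({"1": "some fixed setting"}, parent=template_frame)
```

Here `describe(invoke_frame)` keeps the two sets apart: what the template author hard-coded in the `{{#invoke:...}}` call versus what the editor typed when using the template.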
Thank you. The terms are found in Module:form_of/templates. --RichardW57 (talk) 08:37, 12 August 2023 (UTC)Reply[reply]

Testing categorisation

Is there a good way of setting up a permanent test of a template intended to perform categorisation, e.g. in a 'testcases' module file? The problem I see is that some categorisation templates seem only to work when invoked from a page in main space. A typical example would be invoking {{causative of}}. --RichardW57m (talk) 14:54, 11 August 2023 (UTC)

@RichardW57 Many modules have a force_cat setting at the top of the module for this purpose; set it to true (without saving the module) and preview the testcases page. Module:form of has this, right at the top of the module. (The template editor protection on the module might be an issue; let me know if this is the case.) Alternatively, sometimes the template itself has a |force_cat= param; the advantage of this is you don't have to do the module editing business, but the disadvantage is you need to add |force_cat=1 to every template in your testcases page. It doesn't look like the form-of templates have this method set up. Benwing2 (talk) 22:05, 11 August 2023 (UTC)Reply[reply]
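The force_cat idea amounts to a guard like the one below. This is an illustrative Python model, not the Lua in Module:form of; the function name `format_categories`, the category name in the usage note and the assumption that categories are emitted only in the main namespace are all simplifications made for the sketch.

```python
# Module-level switch: flip to True while previewing a testcases page,
# then discard the change without saving (as described above).
FORCE_CAT = False

MAIN_NAMESPACE = ""  # the main namespace has an empty name in MediaWiki


def format_categories(categories, namespace, force_cat=False):
    """Return category wikitext, suppressed outside mainspace unless forced.

    `force_cat` models a per-template |force_cat=1 param; FORCE_CAT models
    the setting at the top of a module.
    """
    forced = force_cat or FORCE_CAT
    if namespace != MAIN_NAMESPACE and not forced:
        return ""
    return "".join(f"[[Category:{name}]]" for name in categories)
```

With this model, `format_categories(["Example category"], "Module")` yields nothing on a testcases page, while passing `force_cat=True` makes the same call emit the category link, so the output can be checked.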

template:examples on mobile view isn't really that great

it seems that the Minerva skin (default mobile skin) ignores most of the CSS formatting and just puts the examples above the definitions, as if they were definitions themselves. it also adds a header that doesn't close, so if anything it looks like the definitions are part of the examples section. not the biggest issue to worry about right now, since it still is perfectly legible and all that, but i wonder if it could at least have a box around it so it stands out from the definitions more. Thanks, Soap 17:10, 11 August 2023 (UTC)

@This, that and the other Wonder if you can help? This template adds a toccolours CSS class but I don't see it referenced in MediaWiki:Common.css. Maybe the box is coming from the -moz-box-sizing: content-box; inline setting; is there a mobile (or non-Mozilla) equivalent of this? Benwing2 (talk) 22:10, 11 August 2023 (UTC)Reply[reply]
Styling templates should better be done using mw:Extension:TemplateStyles, see the manual. MediaWiki:Common.css is extraordinarily large (and all of that is loaded to every visitor), 5 times larger than w:MediaWiki:Common.css. A lot of that could be moved to per-template stylesheets.
You could easily add rules for small screens there as well, e.g. with a rule
@media (max-width: 36em) {
	/* rules */
}
(I have been working at styles for {{uxi}} on mobiles, will post soon.) JWBTH (talk) 03:58, 12 August 2023 (UTC)Reply[reply]
This wiki seems to have had a real shortage of users with frontend CSS skills, which is part of the reason why our Common.css languishes in such a state and mobile view is missing so much formatting...
In this case, however, there is little motivation for mucking around with TemplateStyles, since the template will as a rule appear no more than once per page. I'll try and fix the formatting on mobile using inline styles. This, that and the other (talk) 11:57, 12 August 2023 (UTC)
Need to do something with {{wikipedia}} too, see the screenshot. You can also see the effect of bigger image margins at https://en.m.wiktionary.org/wiki/proper_noun. The reason is the default browser style for <figure> (MediaWiki started using these tags for images a month ago). JWBTH (talk) 03:44, 12 August 2023 (UTC)
I fixed it to be slightly less terrible. But the "proper noun" entry is a great showcase for all that is wrong with our mobile site styling. The font sizes and box paddings are all over the shop. Plus, on a larger device (e.g. tablet), having all the elements in sequence, rather than floating to the side like they do on desktop, really kills the flow of the page. This, that and the other (talk) 13:31, 12 August 2023 (UTC)Reply[reply]
@This, that and the other Thanks! Yeah I really don't know shit about CSS and I think that's the case with a lot of the tech people here; I have always been a backend coder. Benwing2 (talk) 21:07, 12 August 2023 (UTC)Reply[reply]

Language family trees could use more collapsing.

Many pages for our languages use trees, such as Proto-Indo-European, to show the family origin. However, I can't collapse individual branches. This is a pain if I just wanna look at a single branch or two, like only the Germanic and Italic branches for PIE. I think we should have the collapse button for every single element in all the language trees. cf (talk) 03:32, 13 August 2023 (UTC)

Fixing misnamed Nyunga templates

The LDL language Nyunga has a handful of templates intended for use inside the "References" section, but they're all named "RQ:" instead of "R:". Is it better to rename them to "R:" or "R:nys:"? JeffDoozan (talk) 14:18, 13 August 2023 (UTC)Reply[reply]

They may have been intended as a source for quotations, in which case keep 'RQ:'. To make the judgement, one needs to look at the book. If you can't access the book, stick with what we have - winding up with a chain of redirects is bad. --RichardW57m (talk) 12:50, 14 August 2023 (UTC)Reply[reply]
@JeffDoozan Contrary to Richard they do appear to be References templates, not quotation templates, in which case they should begin with R:nys:. Benwing2 (talk) 21:01, 14 August 2023 (UTC)Reply[reply]

Problem in zh-see

If you look at 鸭绿, you'll see "For pronunciation and definitions of 鸭绿 – see 鴨綠 (“Yalu [[river”)." How is this corrected? --Geographyinitiative (talk) 16:24, 13 August 2023 (UTC)Reply[reply]

I don't think {{zh-see}} supports {{place}}. — Fenakhay (حيطي · مساهماتي) 16:31, 13 August 2023 (UTC)Reply[reply]
Not solved --Geographyinitiative (talk) 00:26, 21 August 2023 (UTC)Reply[reply]

Broken “alternative form of” in ό,τι κι αν

In ό,τι κι αν (ó,ti ki an) the template “alternative form of” is broken. This entry is supposed to be an alternative form of ό,τι και να (ó,ti kai na), but due to the comma, the phrase is split into two parts *ό (ó) and *τι και να (ti kai na). OosakaNoOusama (talk) 04:44, 14 August 2023 (UTC)Reply[reply]

@OosakaNoOusama: I found a workaround, but there should be a more straightforward way to get it to ignore the comma. Chuck Entz (talk) 05:33, 14 August 2023 (UTC)Reply[reply]
@Chuck Entz The comma is ignored if there's a space following it. This is a strange term without a following space. @OosakaNoOusama Is it common to have Modern Greek terms with embedded commas without a following space? Benwing2 (talk) 07:39, 14 August 2023 (UTC)Reply[reply]
As far as I can tell, the comma without a space is only used in the term ό,τι (ó,ti), where the comma is supposed to represent the hypodiastole. I have not yet discovered any other such terms. OosakaNoOusama (talk) 07:50, 14 August 2023 (UTC)Reply[reply]
@Benwing2: Whatever, there is a lot of broken documentation. Or are editors expected to read and understand the source code of Module:links? --RichardW57m (talk) 10:06, 14 August 2023 (UTC)
@RichardW57 Please no snark here. We are doing the best we can, there is a lot to document. The code that handles the comma here and allows for multiple lemmas (and inline modifiers) is actually in Module:form of/templates. I documented this in Template:inflection of/documentation#Multiple lemmas and inline modifiers but it needs to be added to the general form-of documentation in Module:form of doc so it shows for all form-of templates. Benwing2 (talk) 21:27, 14 August 2023 (UTC)Reply[reply]
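The comma behaviour described in this thread (a comma separates lemmas unless a space follows it) can be modelled in a few lines. This is a Python sketch of the described behaviour only; the real logic is Lua in Module:form of/templates and may differ in detail, and the function name `split_lemmas` is hypothetical.

```python
import re


def split_lemmas(param):
    """Split a lemma param on commas that are NOT followed by whitespace.

    A comma followed by a space is treated as part of the term itself,
    so ordinary phrases containing ", " stay in one piece.
    """
    return re.split(r",(?!\s)", param)
```

Under this rule `split_lemmas("ό,τι και να")` breaks the phrase into "ό" and "τι και να", which is the breakage reported above, whereas a term such as "foo, bar" is left whole.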

Displaying Language of Quotations for Translingual

How should one display the language of a quotation supporting a translingual sense? For example, I am minded to add some Thai quotes to support senses for the comma, so I would most likely use {{quote-book}} plus {{th-usex}}, but how does one state to the reader that the language was Thai as opposed to Pali or Northern Khmer? --RichardW57m (talk) 10:52, 14 August 2023 (UTC)Reply[reply]

@RichardW57 I don't quite understand your question. The quote-* templates support three different concepts of language: The language(s) of the quote itself (specified using |1= and can be a comma-separated list), the language(s) of the work the quote is within (specified using |worklang= and can be a comma-separated list) and the language of the term being illustrated (specified using |termlang= and should be a single language). Generally you only use the latter two if they differ from |1=. Benwing2 (talk) 19:49, 14 August 2023 (UTC)Reply[reply]
@Benwing2: As far as I am aware, |1= and |termlang= don't display, and the latter would be 'Translingual' anyway. Now, in this case, |worklang=th would half work because I would probably take the Thai quote from a work in Thai, but I want to say that the quotation is in Thai. It would work by the reader misinterpreting the display! It wouldn't work for a Pali quotation from a work in Thai showing a translingual term, such as exhibiting a sentence-terminating full stop in Thai-script Pali, or a sentence-terminating comma in (Tai) Tham script Pali in a work in pre-reform Lao-script Lao. Perhaps @Al-Muqanna has some thoughts on what he would hope to see once the moratorium on editing such pages is lifted. --RichardW57 (talk) 21:12, 14 August 2023 (UTC)Reply[reply]
@RichardW57 Once again I have no idea what you're saying. Can you rephrase it with an example, and state what your desired outcome is? The algorithm for handling |worklang= and |termlang= is as follows:
  1. If |worklang= is given, display "(in WORKLANG)".
  2. Otherwise, if |termlang= is given and is different from |1=, display "(in LANG)" using the LANG of |1= (the quotation language).
  3. Otherwise display nothing.
There's a comment that gives the example text "(in German; quote in Nauruan)", which suggests that the intention in case all three are different is to display both the work lang and quote lang. That doesn't currently happen but I can fix this if that will help you. Benwing2 (talk) 21:20, 14 August 2023 (UTC)Reply[reply]
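The three-step algorithm above, written out as a small Python model for anyone who wants to try cases by hand. It is not the module code: the function name `language_annotation` is made up, and it prints raw language codes where the templates would show language names.

```python
def language_annotation(quote_lang, worklang=None, termlang=None):
    """Return the "(in ...)" note shown for a quotation, or "" for none.

    quote_lang models |1=, worklang models |worklang= and termlang models
    |termlang=. Codes are echoed as-is instead of being converted to names.
    """
    # 1. If |worklang= is given, display the work language.
    if worklang:
        return f"(in {worklang})"
    # 2. Otherwise, if |termlang= differs from |1=, display the quote language.
    if termlang and termlang != quote_lang:
        return f"(in {quote_lang})"
    # 3. Otherwise display nothing.
    return ""
```

For the 3-language case (`quote_lang="th"`, `worklang="fr"`, `termlang="mul"`) this model returns only "(in fr)", which is the gap under discussion: the quotation language is not shown when all three differ.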
As for the moratorium, you're referring to single-character entries? I had hoped the issues would be resolved by now. What is the current state? I know the issue is mainly between you and Kwami, and Kwami said awhile back he had some real-life issues to deal with and might be away from editing for a little while, but I don't know where things currently stand. Benwing2 (talk) 21:23, 14 August 2023 (UTC)Reply[reply]
@Kwamikagami Benwing2 (talk) 21:23, 14 August 2023 (UTC)Reply[reply]
I'm all for adding attestation and usage to translingual entries. My reason for RfD was because many entries contained no information apart from the Unicode def, which could easily be wrong or misleading if we do not have confirmation. In many cases, I would search for info, including from the docs submitted to Unicode to justify the character, and if I couldn't find a language that used it, to be able to create a functional definition myself, I would RfD because we were pretending we had an article when effectively we didn't. In some cases Richard was able to provide attestation for the character, which was great.
My problem with many of Richard's edits was what I took to be a Fascist promotion of certain languages (mostly Burmese, but also potentially Thai) over others, because the Burmese Army can shoot people in the head, which makes Burmese superior to other languages. If he claims that the translingual pronunciation is Burmese (or Thai), or similarly narrows any other aspect down to usage by a particular language -- that is, if he claims that Burmese, Thai etc. are "translingual" rather than Burmese or Thai, then I would argue that his addition is prejudicial and should be moved to a Burmese or Thai section.
I rather doubt Richard intends to create Burmese and Thai sections for the usage of letters or symbols in those languages, as he still seems to be pushing for Burmese domination. But if he really is willing to add translingual attestation to a translingual section, then I have no objection to him doing so. kwami (talk) 22:00, 14 August 2023 (UTC)Reply[reply]
@Kwamikagami: You still seem not to understand that with one exception I was simply reverting impermissible deletions; I confess I restricted myself to restoring defensible entries. The exception was when I started to tackle the misleading implication that Burmese pronunciations not clearly labelled as such were representative of the whole range of use, and realised that our labelling policy did not work for pronunciation differences by language rather than geographical area. Look at the history to see who added Burmese pronunciations to translingual letters.
The relevant feature of the Burmese army was that they regularly defeated other armies, even if it was concluded that it was cheaper to pay tribute to the Chinese than to slaughter their forces. Thus, foreigners ended up dealing with the Burmese, and Mon reportedly (a Mon admission) became inadequate for talking about politics. --RichardW57m (talk) 12:35, 15 August 2023 (UTC)Reply[reply]
@Kwamikagami I tend to give the benefit of the doubt when it comes to accusations of racism. In this case unfortunately I don't understand the essence of what Richard is saying, or why who paid tribute to who has anything to do with creating a dictionary. In any case I disagree that the Unicode definition of a given character is irrelevant or useless information and that we should delete characters whose only definition is based on Unicode. I'm pretty sure this view is the consensus. Benwing2 (talk) 00:37, 16 August 2023 (UTC)Reply[reply]
The history is why, if we ignore the Unicode catalogue of characters, descriptions of the Burmese alphabet provide reasonable definitions of the Burmese script characters that are in the Burmese alphabet. (I'm not sure though that the proposition that and are different letters of the Burmese alphabet is accepted widely enough to use.) --RichardW57 (talk) 07:47, 16 August 2023 (UTC)Reply[reply]
@Benwing2: We're currently waiting for me to beef up the text for อ‍ย so that we can pipe clean the use of {{rfm}} to change the language from translingual, which should properly be raised by @Kwamikagami - unless he's happy that it is translingual. There is a high risk that only @Octahedron80, Noktonissian and I have the relevant books for the letter; possibly only I have them. The only usage I have evidence for it shows it being used in what may be seen as a transliteration (as opposed to transcription) of Tai Tham Northern Thai into the Thai script in books whose language is Siamese. Being a ligature, rather than a single character, it is not subject to the moratorium. In particular, I need to work out/remember how to make usable and lawful images of the letter's various glyphs. (Printing in black and transparent as opposed to black and white is not as easy as one might think.) --RichardW57m (talk) 12:06, 15 August 2023 (UTC)Reply[reply]
A mock-up of a simple 3-language case is
{{quote-book|th|termlang=mul|title=Rhubarb|worklang=fr|year=2023}}
{{th-x|หมา , แมว|dog, cat}}
yielding
2023, Rhubarb (quotation in Thai; overall work in French):
หมา, แมว
mǎa , · mɛɛo
dog, cat
I don't entirely trust {{th-x}} to embolden the comma when on the page, but emboldening punctuation is unreliable anyway. Now, if I omit |worklang=fr, I get
2023, Rhubarb (in Thai):
หมา, แมว
mǎa , · mɛɛo
dog, cat
The only problem with this is that it claims that the language of the cited work is Thai.
I was actually expecting an answer such as "just stick emboldened 'Thai: ' in front of the quotation", yielding for example:
2023, Rhubarb (quotation in Thai; overall work in French):
Thai: หมา, แมว
mǎa , · mɛɛo
dog, cat
This is, after all, rather an unusual case. --RichardW57m (talk) 11:38, 15 August 2023 (UTC)Reply[reply]
I think the work language should always be displayed if given explicitly. [Clarified.]
If the quote language (|1=) differs from the term language (|termlang=), I think the quote language should also be given.
There may be different considerations if the term language is not the language of the language section - the documentation of |brackets= suggests that this may be possible in some esoteric cases. I think the logic would be easier if we knew the language of the language section. RichardW57m (talk) 13:05, 15 August 2023 (UTC)Reply[reply]
@RichardW57 All right. As usual I have difficulties making sense of what you've said (you're almost as impenetrable as Fay Freak) but I think you're asking for both the quote language and work language to be displayed explicitly if they are different from each other and the term language, which I can implement. Benwing2 (talk) 20:50, 16 August 2023 (UTC)Reply[reply]
I'll restate the corrected garbled first sentence - "If the quote language (|1=) differs from the term language (|termlang=), I think the quote language should also be displayed."
I think the quote language should be displayed if different from the term language.
In your understanding, do work language and term language always exist, or do they only exist if respectively parameters |worklang= and |termlang= are supplied? I think I need to draw up a logic table (@RichardW57). The user will normally expect the term language to be the language of the language section, but this expectation may be wrong for bracketed 'quotations'. --RichardW57m (talk) 09:17, 17 August 2023 (UTC)Reply[reply]
@RichardW57m: See what I've done with the German and Latin quotations supporting Ancient Greek terms at Citations:Φρεαττώ. Is that the kind of solution you're looking for? 0DF (talk) 10:47, 17 August 2023 (UTC)Reply[reply]
@0DF: These do argue that one doesn't need to display the quote language separately from the work language if they're the same. A better example along these lines would be a Latin statement about a Greek word inside a German book. --11:53, 17 August 2023 (UTC)
Incidentally, I think you should have |brackets=on for these quotations, because they're mentions, not uses. RichardW57m (talk) 11:53, 17 August 2023 (UTC)Reply[reply]
@RichardW57m: Done. Could you explain why the |worklang= matters, please? I'm not seeing the point of it. 0DF (talk) 12:06, 18 August 2023 (UTC)Reply[reply]
Generally it's of secondary importance; it is a relatively recent addition. I see it as a guide to the reader of the difficulty of navigating the work and understanding the context. For example, I found what at first looked like some mentions of weird Tai Tham-script Pali spellings in a Thai-language book on Northern Thai. I had to study the text hard to decide that they were probably borrowed terms rather than actual Pali. If I had accepted them as Pali, someone who can't read Thai at all shouldn't even bother to try to verify that I had interpreted the text correctly. The book seems to be out-of-print, so getting hold of a copy may be hard. (It seemed widely available 15 years ago - I even saw it on sale at a service station.) Note that an inscription is a valid quotation for CFI even if verification requires someone to visit it in order to verify it. --RichardW57m (talk) 13:04, 18 August 2023 (UTC)Reply[reply]
@RichardW57m: OK, yeah; that makes sense. 0DF (talk) 23:38, 18 August 2023 (UTC)Reply[reply]
@RichardW57 I'm not sure what you mean by "do they always exist". There is always a work language (i.e. the work is always in some language), which is assumed to be the same as the quote language in |1= if not specified; same for the term language. Benwing2 (talk) 18:34, 17 August 2023 (UTC)Reply[reply]
@Benwing2: One might distinguish between having |worklang= specified and not having it specified. The same goes for |termlang=. Your answer means that presence and absence of the parameter are not distinguished, but rather that the code works with 'resolved' values. I will draw up the table with that in mind. --RichardW57m (talk) 09:47, 18 August 2023 (UTC)Reply[reply]
@RichardW57 I changed the algorithm for handling |worklang= and |termlang=. See the bottom of User:Benwing2/test-quote for some examples and let me know what you think. Benwing2 (talk) 05:48, 19 August 2023 (UTC)Reply[reply]
@Benwing2: A quibble: Such cases should show "quotation in [language]", rather than "quote in [language]", IMO. 0DF (talk) 12:03, 19 August 2023 (UTC)Reply[reply]
@Benwing2: They work. --RichardW57 (talk) 20:34, 19 August 2023 (UTC)Reply[reply]
I've finally put a logic table together, at User:RichardW57/quote-lang. The examples have highlighted an issue with what may be legacy templates from before the advent of |worklang=. There are English-language works being quoted from which need to be tagged as such. It's a shame the quotations will then look worse. --RichardW57 (talk) 20:34, 19 August 2023 (UTC)Reply[reply]
I haven't addressed what should be done when |termlang= is not the language addressed by the language section. --RichardW57 (talk) 20:34, 19 August 2023 (UTC)Reply[reply]
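The display rule discussed in this thread can be sketched roughly as follows. This is a minimal Python illustration, not the actual quotation module (which is written in Lua and whose algorithm is not shown here). The names `resolve_languages`, `language_note` and `LANG_NAMES` are hypothetical, and the rule itself is an assumption drawn from the comments above: an unspecified |worklang= or |termlang= falls back to the quote language in |1=, and a language note is shown only when the resolved languages differ.

```python
# Hypothetical code-to-name table, just enough for the examples in this thread.
LANG_NAMES = {"th": "Thai", "fr": "French", "en": "English", "mul": "Translingual"}


def resolve_languages(quote_lang, worklang=None, termlang=None):
    """Return (quote, work, term), defaulting work and term to the quote language."""
    work = worklang or quote_lang
    term = termlang or quote_lang
    return quote_lang, work, term


def language_note(quote_lang, worklang=None, termlang=None):
    """Build the parenthesised language note under the assumed rule."""
    quote, work, term = resolve_languages(quote_lang, worklang, termlang)
    if quote == work == term:
        # Everything matches the term language: nothing needs saying.
        return ""
    if quote == work:
        # Quote and work agree but differ from the term language.
        return f"in {LANG_NAMES[quote]}"
    # Quote and work differ, so both are named explicitly.
    return (f"quotation in {LANG_NAMES[quote]}; "
            f"overall work in {LANG_NAMES[work]}")


print(language_note("th", worklang="fr", termlang="mul"))
print(language_note("th", termlang="mul"))
```

Working with resolved values, as here, means the code cannot tell an omitted |worklang= from one that merely repeats |1=, which is the distinction raised above when drawing up the logic table.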

Wikifunctions

The Wikifunctions project has recently gone live.

Wikifunctions is a Wikimedia project for everyone to collaboratively create and maintain a library of code functions to support the Wikimedia projects and beyond, in the world's natural and programming languages.
A "function" is a sequence of programming instructions that makes a calculation based on data you provide. Functions can answer questions, such as how many days have passed between two dates, or the distance between two cities.

Is there any possible application for these functions on Wiktionary? Ioaxxere (talk) 03:57, 15 August 2023 (UTC)Reply[reply]

It sounds like what we are currently achieving using modules like “Module:string” and “Module:roman numerals”. — Sgconlaw (talk) 05:55, 15 August 2023 (UTC)Reply[reply]
@Ioaxxere @Sgconlaw The purview of such a project is extremely broad, and as a result who knows what it will end up amounting to. As Sgconlaw points out, we already have modules to do lots of things, and a lot of the modules we have or need are fairly specialized to Wiktionary so won't be found in Wikifunctions (for example, we could definitely do with a generic headword-processing module; it's something I've thought of writing at various points). I think the only possible benefit is if you're looking for an implementation of some well-known algorithm like Levenshtein distance; in most such cases, however, you can already find implementations in Wikipedia or Stack Overflow, and if not, I wouldn't necessarily trust Wikifunctions to have high-quality code. Benwing2 (talk) 20:05, 15 August 2023 (UTC)Reply[reply]
That's broadly where I stand as well. We need to generalise where it makes sense for us to generalise, but still tailor things for Wiktionary's specialist needs wherever possible. Theknightwho (talk) 21:57, 15 August 2023 (UTC)Reply[reply]

Auto cat and the Roman Empire

@Theknightwho, Benwing2, J3133 and anyone else who knows: Why does {{auto cat}} work on Category:Places in the Roman Empire and its language-specific subcats but not on Category:Cities in the Roman Empire and its subcats? —Mahāgaja · talk 08:33, 15 August 2023 (UTC)Reply[reply]

I'm not them, but the answer can seemingly be found in point 2 under the "Polities" heading at Module:place/shared-data:
[…] former states such as Persia, East Germany, the Soviet Union and the Roman Empire should have their cities, towns, rivers and such listed under the current entities occupying the same area.
But this makes me wonder why we have these former-place cats in the first place. What is meant to be in them? This, that and the other (talk) 11:20, 15 August 2023 (UTC)Reply[reply]
@Mahagaja @This, that and the other The problem here is we don't have a clear policy about how to handle such situations. For example, Moscow is a "city in the Soviet Union" but I don't think classifying it this way is helpful. We could potentially classify a city like Stalingrad that existed only in the Soviet Union as such but I think it's better to define it as a former name of Volgograd, and put it as a city in Russia. When creating such a policy, we have several cases to consider:
  1. Larger entities like provinces don't make sense outside their particular polity, so should be classified e.g. in CAT:Provinces of the Roman Empire.
  2. For cities and towns, there are at least three cases: (1) Cities in recently extinct entities (East Germany, Czechoslovakia, the Soviet Union, etc.) which generally have continuity with modern cities, but possibly different names. (2) Cities in ancient entities, where the city still exists (e.g. Rome, London, Antioch); arguably there is continuity with the modern entity but sometimes it is fuzzy. (3) Cities in ancient entities that no longer exist. I'm not quite sure what the best policy is for each case, but note that we have Category:Historical settlements, which is defined as "names of cities, towns and villages that no longer exist or have been merged or reclassified". This is perhaps sufficient for (3) and for the cases under (2) that don't have historical continuity. Note that if you use <<former city>>, <<ancient city>>, <<historical city>> or the like, you get classification into Category:Historical settlements automatically.
  3. Rivers in ancient polities generally still exist, possibly with different names, so should be treated as case (2) above for cities and towns. The only tricky case is where the ancient definition of a river covers a different stretch of water than the modern definition, but I don't think this warrants a category like Category:Rivers in the Roman Empire.
Note that this still doesn't answer the question of why there's a category Category:Places in the Roman Empire that contains cities (cf. Italian Costantinopoli, which is defined using <<historical capital>> and gets classified automatically into Category:it:Historical capitals, Category:it:Historical settlements and Category:it:Places in the Roman Empire). Probably this shouldn't happen, and it's even more dicey with a category like Category:Places in Czechoslovakia (which contains only two terms, English Prague and Pardubický kraj, both of which still exist with the same name in the modern Czech Republic). Benwing2 (talk) 23:28, 15 August 2023 (UTC)
The Roman Empire categories (except provinces) are I think not obviously helpful and probably should not exist. My understanding has been that places should be categorised into the present-day country, and settlements that no longer exist should likewise be categorised into the present-day country but marked as historical settlements. There are in any case, for example, only 7 entries in Category:la:Places in the Roman Empire (compare 258 in Category:la:Places in Italy alone). That was my understanding from existing practice, and I've not felt it to be unclear until this came up. With regard to provinces, I agree those pertain to polity and need to be treated accordingly. —Al-Muqanna المقنع (talk) 23:58, 15 August 2023 (UTC)Reply[reply]
@Al-Muqanna Yeah I agree with you and I think I'm making it more complex than it needs to be; what you're describing is indeed the existing practice that I codified into Module:place except for the Category:Places in FORMER POLITY, which I agree should not exist. Benwing2 (talk) 00:06, 16 August 2023 (UTC)
Hmm - I’m not sure I fully agree with the idea that historical settlements should be categorised by the modern country. It feels much more natural to say that Londinium and Aquae Sulis were historical settlements in Britannia than it does to say they’re historical settlements in the United Kingdom, for example. The continuity between modern cities and their ancient counterparts is often very loose, to the point where they have really quite distinct identities. This particularly applies when historical records became very sparse for hundreds of years, as they did in Britain. Theknightwho (talk) 01:20, 16 August 2023 (UTC)Reply[reply]
@Theknightwho They are not categorized as "historical settlements in the United Kingdom" (or "... England") but merely as Category:Historical settlements. Currently whether they end up as Category:Places in the Roman Empire or Category:Places in England depends on the definition using {{place}}. You're touching on case #2 above for historical settlements, where sometimes the continuity is questionable, but I would argue that Londinium is still a place in England, because that's where it's physically located. Benwing2 (talk) 01:34, 16 August 2023 (UTC)Reply[reply]
Like Königsberg is a place in Russia? Also, Londinium is or Londinium was? --RichardW57 (talk) 07:58, 16 August 2023 (UTC)Reply[reply]
My usual practice has been to write both in the entry itself, but the categorisation should follow the modern country—otherwise it would quickly become very tedious to use. (I don't remember one I did off the top of my head, but it's the same sort of thing as at Complutum.) Also in practice you would probably put historical settlement in modern England, which seems fairly natural to me, not just the UK: see Category:la:Cities in England. —Al-Muqanna المقنع (talk) 08:20, 16 August 2023 (UTC)Reply[reply]
Usefulness of categorisation may vary by language. I'm sure it would be more useful to categorise Alcalá de Henares in CAT:es:Places in Spain, but I'm also sure that it would be more useful to categorise Complūtum in CAT:la:Places in the Roman Empire. 0DF (talk) 12:22, 16 August 2023 (UTC)Reply[reply]
Er, why would it be more useful to put it in different categories per language? Bear in mind, too, that Latin place names for settlements continued to be used in Medieval Latin and New Latin. Then we have cities like Dura-Europus that changed hands regularly between various polities, including the Roman Empire, in ancient times—categorising as a place in Syria makes sense, as a place in the Roman Empire, not so much. Categorising such names separately and pedantically into every polity their referents were ever part of would be completely pointless, and if I want to know, say, Latin names for places in Spain or Italy, all of them being dumped into a Roman Empire category would also be totally unhelpful. The existing system works well. —Al-Muqanna المقنع (talk) 12:52, 16 August 2023 (UTC)Reply[reply]
Yes, I'm inclined to agree (although I still think there's a certain difference between modern cases like Königsberg/Kaliningrad and ancient cases like Complutum/Alcalá de Henares). The alternative, for example, would be to try and categorize Königsberg as a place in "Germany" when the "Germany" that it was a part of no longer exists and a different Germany now has the same name. We could potentially add a category such as Category:de:Former names of settlements for former names; maybe that would help. Benwing2 (talk) 20:47, 16 August 2023 (UTC)Reply[reply]
@Al-Muqanna, Benwing2: Why not categorise both ways? And if "Places in the Roman Empire" is too broad (yeah, probably), why not also CAT:la:Places in the Hispania Tarraconensis? As for Königsberg, what about Category:de:Places in Prussia? 0DF (talk) 00:07, 17 August 2023 (UTC)Reply[reply]
We're a dictionary, not a historical gazetteer. I'm not convinced that this kind of detailed historical categorisation is of any lexicographical benefit. This, that and the other (talk) 00:27, 17 August 2023 (UTC)Reply[reply]
@0DF, This, that and the other: Königsberg would also belong in cat:Places in Prussia. The advantage of detailed historical categorisation lies in the hope that it will include the categorisations of interest. The disadvantage is the burden imposed by shifting borders. --RichardW57m (talk) 09:53, 17 August 2023 (UTC)Reply[reply]
@This, that and the other: What is the intended lexicographical benefit of any of these "Places in [bigger place]" categories? @RichardW57m: And I'd think that any problems with proliferating categorisation would be pretty minimal, given that the category names are all at the bottom of pages. 0DF (talk) 10:52, 17 August 2023 (UTC)Reply[reply]
The problems are not minimal since they would need to be both maintained and consistent. Categorisation by fixed geographical regions corresponding to the modern borders benefits ordinary readers who'd e.g. want to see what Latin names are available for places they might be familiar with, and for diachronic linguistics it's also somewhat useful to be able to track placenames by fixed region. In general it seems obvious to me that listing "Roma" as the Latin name for a city in Italy is helpful in a way that categorising it as a city in the Roman Kingdom, a city in the Roman Republic (classical), a city in the Roman Empire, a city in the Exarchate of Ravenna, a city in the Papal States, a city in the Roman Republic of 1848, a city in the Kingdom of Italy, etc., is not. —Al-Muqanna المقنع (talk) 11:04, 17 August 2023 (UTC)Reply[reply]

I've opened an RFD at Wiktionary:Requests for deletion/Others#Roman Empire toponym categories. —Al-Muqanna المقنع (talk) 11:35, 17 August 2023 (UTC)

@Al-Muqanna: OK, I've responded there. 0DF (talk) 12:27, 18 August 2023 (UTC)Reply[reply]

What is the possibility of creating a -lite template for template:label?

Having worked on a page for the last few hours, I've noticed that template:label seems to be by far the most common template that does not have a corresponding -lite version for dealing with Lua memory errors. Is this due to a technical limitation of some sort? QuakDucc (talk) 21:08, 16 August 2023 (UTC)

It's not realistic to make a -lite version of this template. The categorisation, linking, etc. functionality is too complex to replicate outside Lua. I suppose we could create special -lite versions of the absolute most common labels, like (transitive) and (obsolete), if needed. And of course, sui-generis labels like (of a person) can be created using {{q-lite}}. This, that and the other (talk) 00:22, 17 August 2023 (UTC)Reply[reply]

achnąć's category

Also pinging @Hythonia. There's a problem with achnąć: the term was used by thieves, but the label "thieves cant" links and refers specifically to English thieves' cant. What can we do? Vininn126 (talk) 13:49, 17 August 2023 (UTC)

Note there's a pending split request to divide thieves' cant per language. In the literature I can see uses of terms like "Russian thieves' cant" so that might be one solution, though the decision there has apparently been to use "criminal slang" for other languages, which also works. —Al-Muqanna المقنع (talk) 13:59, 17 August 2023 (UTC)Reply[reply]
Commented there. Vininn126 (talk) 14:02, 17 August 2023 (UTC)Reply[reply]

I would appreciate your comments please

Hi,

Eyeglasses parts

I’ve been exploring a modest proof of concept to present terms in a visual way, much like visual dictionaries. It’s not new, and has been implemented here by the picdic template for instance. I propose to use traditional callouts with thin lines to link a visual element to its term. It works pretty well on mobile, desktop and print versions, even when zooming in and out. Ideally, a user could create these labels graphically in a small GUI external to Wiktionary, which would yield wikicode that they would then paste into the edit box. For the test I propose on the right, I created an SVG image with callouts in Inkscape, then converted it semi-automatically in Python to produce wikicode. This could be done in Lua of course.

I like the fact that labeling these pictures may reveal some missing terms.

This concept is not entirely farfetched, but I would really appreciate it if I could have your impressions about it. It may be out of scope for this project, the wikicode could be too ugly, and so on. I am not sure because I am a bit new at this. I also don't know how much time it would take to implement this elegantly in a module for instance.

Thank you.

Edit: This figure would appear in eyeglasses. Jeran Renz (talk) 19:26, 17 August 2023 (UTC)Reply[reply]

@Jeran Renz This looks great, thank you! We need more of this. WT:PICDIC is severely underdeveloped, but I think such diagrams are very helpful (and I like the design of yours much better than anything already in the picture dictionary). They're common in many print dictionaries and encyclopedias. As possible ideas for getting started, I might mention, in no particular order: parts of (1) a horse, (2) a bird, (3) a car, (4) a train, (5) a house, (6) various rooms in a house, (7) a computer, (8) a camera, (9) a tree, or (10) a flower. I suggest starting with more common items, especially ones where the terms are relatively well known, since this is extremely helpful for English learners. I look forward to seeing what you come up with! Andrew Sheedy (talk) 19:51, 17 August 2023 (UTC)
I second Andrew's comments. Looks great on my laptop and my phone too. —Al-Muqanna المقنع (talk) 20:40, 17 August 2023 (UTC)Reply[reply]
@Jeran Renz: This looks lovely. Yes, the voluminous code would be a bit much in an entry, but I assume it could be housed elsewhere and then the result transcluded into the entry. 0DF (talk) 11:56, 18 August 2023 (UTC)Reply[reply]
Yes, that would be the best approach. If you make these into a template, it would allow for including the image not just in the eyeglasses entry, but also at nose pad, endpiece, eyewire, etc., since it serves to illustrate those as well. Andrew Sheedy (talk) 17:37, 18 August 2023 (UTC)Reply[reply]
Looks really nice! — Sgconlaw (talk) 17:55, 18 August 2023 (UTC)Reply[reply]
Thank you for your constructive feedback. @Andrew Sheedy: Would a module be as acceptable as a template? The template language looks very unwieldy. --Jeran Renz (talk) 18:20, 18 August 2023 (UTC)Reply[reply]
Unfortunately, I'm pretty ignorant when it comes to the technical side of things. I'm sure someone who knows better will weigh in. However you decide to implement this, I think it will add a lot to the entries. Andrew Sheedy (talk) 18:22, 18 August 2023 (UTC)Reply[reply]
@Jeran Renz Modules are totally fine. Keep in mind that by convention we usually wrap a module using a template. It is fine to put the wikicode for a given diagram in either a module (wrapped in the appropriate template) or directly in a template and invoke it from the pages that need it (e.g. for this diagram, eyeglasses, nose pad, endpiece, etc.). One thing you might consider is generating the actual wikicode using a module along with a spec that just specifies the terms, positions and images. That way if people need to edit the spec manually, it's at least somewhat possible to do this, whereas editing the wikicode directly is more painful. Benwing2 (talk) 19:36, 18 August 2023 (UTC)Reply[reply]
One thing to bear in mind as well is that ideally it should be fairly straightforward to create foreign-language versions of your diagrams. Andrew Sheedy (talk) 19:44, 18 August 2023 (UTC)Reply[reply]
Thank you so much, I'll follow your recommendations! --Jeran Renz (talk) 00:17, 19 August 2023 (UTC)Reply[reply]
@Benwing2 @Andrew Sheedy I have finished a first version of a module which turns annotation specs into a full-fledged annotated diagram. The module is called visual-dict. You can see an example output on my sandbox here, as well as a few desiderata. My code is open for all to review, and comments are welcome. Thanks! --Jeran Renz (talk) 20:45, 25 August 2023 (UTC)Reply[reply]
I can't comment on the code, but I'm excited about what you have planned for it! This would really add a lot to Wiktionary if it becomes widely used. Andrew Sheedy (talk) 21:35, 25 August 2023 (UTC)Reply[reply]
@Jeran Renz Your code looks good. A few minor comments:
  1. I would document more clearly what the annotation format is and how the main entry point works; you can put this in a comment at the top of the main entry point in the code itself.
  2. You might consider simplifying the specification of links to English terms. One possible way is to use a format like this in the annotation spec: <<rim>> or <<lens|left lens>> (if you need to specify separate link and display terms).
  3. You might consider using Module:parameters to parse the arguments to the main entry point; see Module:form of/templates#L-509 for an example of how to do this.
  4. By convention we tend to use spaces instead of hyphens in module names; hence Module:visual dict would be a slightly better name.
Benwing2 (talk) 22:24, 25 August 2023 (UTC)Reply[reply]
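The link format suggested in point 2 could be handled along these lines. This is a minimal Python sketch for illustration only; the real module would be Lua, and the function name `expand_links` and the choice to emit {{l}} calls are assumptions rather than anything taken from Module:visual-dict. It turns `<<rim>>` into a plain link and `<<lens|left lens>>` into a link with a separate display form.

```python
import re

# Matches <<target>> or <<target|display>>; neither part may contain < > or |.
LINK_SPEC = re.compile(r"<<([^<>|]+)(?:\|([^<>|]+))?>>")


def expand_links(spec, lang="en"):
    """Replace each <<...>> marker in an annotation spec with an {{l}} call."""
    def to_template(match):
        target, display = match.group(1), match.group(2)
        parts = ["l", lang, target]
        if display is not None:
            # Separate display text goes in the next positional parameter.
            parts.append(display)
        return "{{" + "|".join(parts) + "}}"

    return LINK_SPEC.sub(to_template, spec)


print(expand_links("<<rim>>"))
print(expand_links("<<lens|left lens>>"))
```

Keeping the language code as a parameter would also leave room for the foreign-language versions of the diagrams mentioned earlier in the thread.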

Vertical fractions

How should we handle these?

I added a quotation at that included a vertical fraction. To enable that, I imported the template Template:sfrac from WP. User:Fenakhay deleted the template with a comment that templates should not be imported from WP. They then altered the quote to 1/3☉, which is the opposite of what was intended. I fixed it by adding parentheses, (1/3)☉, but that means there are now parentheses in our quotation that aren't in the original. I know trivial changes to quotations are allowed, but adding parentheses to a mathematical equation (even a simple one) doesn't strike me as trivial. So, how should this be handled? kwami (talk) 11:07, 18 August 2023 (UTC)Reply[reply]

@Kwamikagami: I'd just write "⅓☉" (precomposed fraction, no parentheses), personally. The vertical, as opposed to diagonal, arrangement is only really stylistic, anyway. Moreover, were you to copy a vertically stacked fraction (1 over 3) followed by "☉", you'd end up pasting "13☉", which is something very different semantically (a 39-fold increase in mass), and not just stylistically; copying and pasting "⅓☉" preserves its meaning. 0DF (talk) 11:42, 18 August 2023 (UTC)
"⅓☉" could be read as 1/3☉ rather than as 1☉/3. At best it's ambiguous, which is why diagonal fractions are not used in mathematical texts. To preserve the meaning, parentheses are needed. So, is that trivial enough a change to not worry about when quoting something? kwami (talk) 11:50, 18 August 2023 (UTC)Reply[reply]
I think if it's a quote, it's much better to keep the original format. Yes, it's stylistic, but so are many uses of punctuation, spelling, and word choice, all of which we preserve as originally written. Andrew Sheedy (talk) 17:41, 18 August 2023 (UTC)Reply[reply]
@Kwamikagami, Andrew Sheedy: The best thing would be if the vertical forms could be specified using a variation selector. Unfortunately, it appears that none of the precomposed fractions in Latin-1 Supplement and Number Forms or the fraction slash in General Punctuation have standardised variants defined for them. 0DF (talk) 01:35, 19 August 2023 (UTC)Reply[reply]
Indeed, and Unicode really doesn't like VS's. It would be difficult to get them to accept new ones for fractions. kwami (talk) 06:54, 19 August 2023 (UTC)Reply[reply]
Do we have a template for denoting fractions? If not, I'd support keeping {{sfrac}}, perhaps with some of the Wikipedianisms removed. @Fenakhay if what Kwami says is correct, this really should have gone to RFDO IMHO.
@Kwamikagami you can also use <math> notation, like <math>\textstyle\frac{1}{3}</math>. This, that and the other (talk) 11:56, 18 August 2023 (UTC)
The math tag option seems best. —Al-Muqanna المقنع (talk) 17:42, 18 August 2023 (UTC)Reply[reply]
Agreed. Importing templates from Wikipedia is often a bad idea, as they often have dependencies that we don't actually want because their functionality already exists under a different name. In this case, Kwamikagami also added Module:Unsubst, which pointlessly duplicated Module:unsubst. This has happened before, with Module:yesno being duplicated as Module:Yesno. Theknightwho (talk) 19:36, 18 August 2023 (UTC)Reply[reply]
@0DF, Kwamikagami, Fenakhay: I'm not aware of any prohibition on taking templates from Wikipedia, though we need to be careful to comply with the Creative Commons Attribution-ShareAlike License, and in this case it seems that we would be importing a lot of machinery. In this case, I supported 0DF's suggestion - the style difference does not reach the character level, though I think @This, that and the other's solution works better. --RichardW57m (talk) 11:58, 18 August 2023 (UTC)Reply[reply]
Just noting that we have {{frac}} but I suppose it doesn't produce the effect which @Al-Muqanna was seeking, in which case using math markup is better. — Sgconlaw (talk) 17:53, 18 August 2023 (UTC)Reply[reply]
I took a look at {{sfrac}} in Wikipedia and it doesn't look like importing it is the worst possible thing, since it seems to have no dependencies except a CSS file. But using the <math> tag directly is also fine. Benwing2 (talk) 19:41, 18 August 2023 (UTC)Reply[reply]
Okay, I'll use the 'math' tag. kwami (talk) 00:59, 19 August 2023 (UTC)Reply[reply]
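The arithmetic behind two points in this thread can be checked with a short Python sketch. It is purely illustrative: the mass value `m` is a hypothetical placeholder, not a real solar mass. The first part shows that the precomposed character "⅓" carries its numeric value in the Unicode data, so a pasted "13" in its place is 39 times larger. The second part shows why the grouping matters when a fraction is written on one line.

```python
from fractions import Fraction
import unicodedata

# The precomposed fraction keeps its value when copied as a single character.
third = Fraction(unicodedata.numeric("⅓")).limit_denominator(100)
print(third)                 # the value carried by the character itself

# A stacked fraction flattened to "13" on pasting is a different number.
print(Fraction(13) / third)  # ratio between the pasted and intended values

# Two readings of a one-line "1/3 m", using a placeholder value for m.
m = Fraction(2)
print(third * m)             # (1/3) * m
print(1 / (3 * m))           # 1 / (3 * m)
```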

Edit requests directed here

Heads up that on a different thread, there was some confusion about where to post edit requests for modules, and others seemed not to understand that we can direct these requests to the Grease pit. I don't typically edit language modules, so I'm not familiar with standard practice, but I went ahead and edited MediaWiki:Protectedpagetext to direct all such requests to the Grease pit. If this is wrong, please let me know. —Justin (koavf)TCM 23:04, 18 August 2023 (UTC)

Template for the similar text in every Variations page?

On pages like Appendix:Variations of "daga", there is always introductory text like: "The word “daga” appears in many languages with many variations in the use of capitalization, punctuation, and use of diacritics." I think this should be done with a template, for easy updating and to avoid typos. Equinox 23:05, 19 August 2023 (UTC)Reply[reply]

Sure, why not. cf (talk) 23:23, 19 August 2023 (UTC)Reply[reply]
To be honest, the variations pages need an overhaul anyway. They're a bit of a mess. Theknightwho (talk) 00:51, 20 August 2023 (UTC)Reply[reply]
That introductory text almost always includes a link to the corresponding Wikipedia page. I suggest including that in the template, but also a parameter |w= to specify the name of the Wikipedia page. —Mahāgaja · talk 08:59, 20 August 2023 (UTC)Reply[reply]
@Equinox, Theknightwho, Mahagaja: In fact we already have a template {{variations}} that purports to do this, but it's (a) unused, (b) not usable in its current state as it displays all the sections (e.g. "Capitalization and punctuation", "Diacritics" and "Other scripts") without any way of customizing their output. I think it should be trimmed to only display the header, and maybe be renamed to {{variations header}} or {{variations nav}}. Benwing2 (talk) 23:45, 20 August 2023 (UTC)Reply[reply]
@Equinox, Theknightwho, Mahagaja: Damn brackets. Benwing2 (talk) 23:46, 20 August 2023 (UTC)Reply[reply]
@Benwing2 Thanks for this. How about we go the other way and bring it up to scratch by using a version of the column template? That would allow us to use sort, and would also address an issue I came up against the other day regarding unsupported titles, since there’s no language-neutral link template. Obviously it’s possible to do a bare link, but it’s not ideal - particularly when it comes to terms in non-Latin scripts. {{also}} uses plain_link in Module:links for a similar purpose, and we could incorporate it here as well. Theknightwho (talk) 00:05, 21 August 2023 (UTC)Reply[reply]
@Theknightwho Can you clarify with an example? I'm not quite sure what you have in mind here by "using a version of the column template". Benwing2 (talk) 00:10, 21 August 2023 (UTC)Reply[reply]
@Benwing2 Currently all the lists at (e.g.) Appendix:Variations of "a" are manually sorted bare links in a bulleted list, under headers such as {{top5}}. Unlike in mainspace, there’s no way to convert these to column templates at the moment (which would allow them to take advantage of things like sorting and script tagging), because the links need to be language-neutral in the same way {{also}} is. {{also}} uses plain_link for this purpose, so I think we should use a column template that does the same. Theknightwho (talk) 00:19, 21 August 2023 (UTC)Reply[reply]
@Benwing2: I think {{variations}} is intended to be substed in when creating a new variations appendix. It creates the basic framework which the editor can then customize after hitting Save changes the first time. —Mahāgaja · talk 06:45, 21 August 2023 (UTC)Reply[reply]
@Mahagaja I see, that makes sense. But IMO it's the wrong approach; something like what @Theknightwho proposes would be better. Benwing2 (talk) 07:13, 21 August 2023 (UTC)Reply[reply]
We could have both, just replace the header text of {{variations}} with the new {{variations header}}. @BD2412: you created {{variations}} and a lot of the variations appendices, what do you think? —Mahāgaja · talk 07:59, 21 August 2023 (UTC)Reply[reply]
I use {{variations}} to create pages by subst'ing, but I have no objection to just making it a template. One note, we would have to be able to toggle on and off the "and in other scripts" portion, as every appendicized term has variations in capitalization, punctuation, and diacritics, but many do not have script variations. If the template is made accordingly, I'll be glad to do the job of incorporating it into all of the pages. bd2412 T 13:18, 21 August 2023 (UTC)Reply[reply]

Kapampangan standard characters and sortation

Originally posted earlier this month at module talk:languages/data/3/p

m["pam"] = {
	"Kapampangan",
	36121,
	"phi",
	"Latn", --also Kulitan, which lacks a code
	entry_name = {Latn = {remove_diacritics = c.grave .. c.acute .. c.circ}},
	standardChars = {
		Latn = "AaBbDdEeGgHhIiLlMmNnOoPpQqRrSsTtUuWwYyZz",
		c.punc
	},
	sort_key = {
		Latn = "tl-sortkey"
	},
}

Edit request will add typical characters for Kapampangan words as well as automatic accent stripping when a Kapampangan word with diacritics is provided to a link (e.g. amánu should link automatically to amanu without adding a second parameter). Proposed edit will also add sortation already used for Tagalog and some other Philippine languages (e.g. Cebuano, Ilocano, Hiligaynon, Waray-Waray) so Ñ and NG are handled as separate letters for sortation. TagaSanPedroAko (talk) 00:42, 21 August 2023 (UTC)Reply[reply]

@TagaSanPedroAko OK, I added this. Let me know if anything goes wrong. Benwing2 (talk) 07:19, 21 August 2023 (UTC)Reply[reply]
@Benwing2 all good, but I think I accidentally removed K from the list of typical letters. Here's a fix.
Latn = "AaBbDdEeGgHhIiKkLlMmNnOoPpQqRrSsTtUuWwYyZz",

TagaSanPedroAko (talk) 18:24, 21 August 2023 (UTC)Reply[reply]

@Benwing2 Also, these other characters should be removed from the typical letters: Q and Z. They mostly occur in spellings following the Bacolor norm and in some proper nouns. TagaSanPedroAko (talk) 18:40, 23 August 2023 (UTC)
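
To see why the missing K matters, here is a minimal sketch that reports which letters of a word fall outside a list of typical characters. It assumes the list is used to spot unusual letters in a term; the function name `nonstandard_chars` is hypothetical, and only ASCII letters are examined, so diacritics and Ñ are ignored here.

```lua
-- Hypothetical helper: return the ASCII letters of `word` that do not occur
-- in the `standard` character list (each letter reported once).
local function nonstandard_chars(word, standard)
	local found, seen = {}, {}
	for ch in word:gmatch("%a") do
		-- Plain (non-pattern) search, so no character is treated as magic.
		if not standard:find(ch, 1, true) and not seen[ch] then
			seen[ch] = true
			found[#found + 1] = ch
		end
	end
	return found
end

-- The list as first requested (no K) and the corrected list from the fix.
local without_k = "AaBbDdEeGgHhIiLlMmNnOoPpQqRrSsTtUuWwYyZz"
local with_k = "AaBbDdEeGgHhIiKkLlMmNnOoPpQqRrSsTtUuWwYyZz"

print(table.concat(nonstandard_chars("Kapampangan", without_k), ","))  --> K
print(#nonstandard_chars("Kapampangan", with_k))                       --> 0
```

With the first list, any term containing K would be reported as using an unusual letter, which is presumably not what was intended.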

Adding to talk page?

I attempted to add a section to a talk page but was not permitted with the following notice: "This action has been automatically identified as harmful, and therefore disallowed. If you believe your action was constructive, please start a new Grease pit discussion and describe what you were trying to do. A brief description of the abuse rule which your action matched is: probably vandalism". Uh, what? Redranger8402 (talk) 13:33, 21 August 2023 (UTC)Reply[reply]

@Redranger8402 That filter looks for signs of vandals who randomly click buttons in the edit window and add lots of the same characters (among other things). I'm guessing what triggered it was the combination of "Italic text" and lots of apostrophes because you had lots of bold and italics in the markup. The filter looks only at edits by new accounts, so you won't have to worry about it for long. Chuck Entz (talk) 14:37, 21 August 2023 (UTC)Reply[reply]
It's impressive how many edits that filter catches, and how obviously garbage all the rest of them (apart from this) are. (And although many of them at first look like unintentional edits akin to butt-dialing, it's interesting that users often try to re-add them, suggesting intentional vandalism after all.) As Chuck says, what triggered it is that you left an ''Italic text'' in your comment. BTW, you also initially added an unsigned comment directly onto the same line as another user's old comment, like this, which you noticed and corrected in subsequent attempts, and which I don't think any filter caught, but @other admins: should a filter catch that? It seems undesirable for new users to add text directly onto other people's old comments. - -sche (discuss) 14:55, 21 August 2023 (UTC)
Thanks, removed the 's and it was accepted. I understand what you mean regarding adding unsigned text to existing comments and will make sure to avoid doing so in the future. Redranger8402 (talk) 20:24, 25 August 2023 (UTC)Reply[reply]

Filter 82

The last edit that abuse filter 82 caught was in January, and before that it was March 2021; it's only caught 32 edits ever. Do we still need it, or can we turn it off? (Its contribution towards the condition limit is small, but every little helps, and no reason to keep it on if it's not doing anything anymore.) BTW I also notice abuse filter 96 has only caught five edits by 2-3 users this year (two users in May, another in February). - -sche (discuss) 15:31, 21 August 2023 (UTC)Reply[reply]

What are these filters for? JezperCrtp (talk) 22:13, 22 August 2023 (UTC)Reply[reply]
Blocking certain common vandal actions before they can even be submitted. Equinox 22:16, 22 August 2023 (UTC)Reply[reply]

Request for OED Online

Thanks to @Sgconlaw, {{R:OED Online}} has been updated to automatically generate the URL for the OED's new URL schema from the headword and POS. This is fine most of the time but it falls apart in some edge cases, e.g. multiword terms.

These are still predictable though with a few rules:

  1. Capital letters → lower case
  2. Apostrophes and initial hyphens are deleted
  3. Spaces → hyphens
  4. The POS comb. form becomes combform (the only problematic POS based on their glossary)

So "mauvais quart d'heure", n. is at /mauvais-quart-dheure_n; "Saxish", adj. is at /saxish_adj; "-sophy", comb. form is at /sophy_combform; "Low-Churchmanism", n. (with a hyphen in the middle) is at /low-churchmanism_n.

This should be a simple regex thing but my Lua is rudimentary so not sure how to handle it exactly. —Al-Muqanna المقنع (talk) 21:39, 21 August 2023 (UTC)Reply[reply]

@Al-Muqanna: This should be straightforward in Lua. I’ve noticed diacritics are deleted (“mouton enragé”, n. is at mouton-enrage_n), but how are distinctions like naive and naïve handled? Theknightwho (talk) 14:31, 22 August 2023 (UTC)
@Theknightwho: There's no collision for that one but a comparable case is resume (to start again) vs. résumé (to summarize). In that case they are just distinguished by the numerical identifier: /resume_v1 and /resume_v2. (The same is also done in the headword, so it's "resume" v.1 and "résumé" v.2 even though they have different accenting.) Actually the numerical identifier in general (n.1 and so on) probably needs to be a distinct parameter: at the moment it's handled by manually entering it into the POS field but now that it's used in the URL too it should probably be split out and formatted automatically. —Al-Muqanna المقنع (talk) 14:39, 22 August 2023 (UTC)Reply[reply]
@Al-Muqanna Thanks. They also seem to delete final hyphens and brackets: see 401(k). I haven’t been able to find any other nonstandard characters. Theknightwho (talk) 15:06, 22 August 2023 (UTC)Reply[reply]
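
The URL rules collected in this thread can be sketched in Lua roughly as follows. This is only an illustration under stated assumptions: the function name `oed_slug` and its optional `num` argument (for identifiers such as v.1) are hypothetical, the code handles ASCII input only, and the diacritic stripping noted above is left out (in a Scribunto module it would need Unicode-aware handling).

```lua
-- Hypothetical helper: build the last path segment of an OED Online URL
-- from a headword, a part-of-speech abbreviation and an optional number.
local function oed_slug(headword, pos, num)
	local slug = headword:lower()            -- capital letters -> lower case
	slug = slug:gsub("\226\128\153", "")     -- delete typographic apostrophes (U+2019)
	slug = slug:gsub("['%(%)]", "")          -- delete straight apostrophes and brackets
	slug = slug:gsub("^%-+", "")             -- delete initial hyphens
	slug = slug:gsub("%-+$", "")             -- delete final hyphens
	slug = slug:gsub("%s+", "-")             -- spaces -> hyphens
	-- "comb. form" -> "combform", "adj." -> "adj": drop full stops and spaces.
	local tag = pos:lower():gsub("[%.%s]", "")
	return slug .. "_" .. tag .. (num and tostring(num) or "")
end

print(oed_slug("mauvais quart d'heure", "n."))  --> mauvais-quart-dheure_n
print(oed_slug("Saxish", "adj."))               --> saxish_adj
print(oed_slug("-sophy", "comb. form"))         --> sophy_combform
print(oed_slug("Low-Churchmanism", "n."))       --> low-churchmanism_n
print(oed_slug("resume", "v.", 1))              --> resume_v1
```

Keeping the number as a separate argument matches the suggestion above that the numerical identifier should be split out of the POS field.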

A better definition template

I feel like we need a better way of handling certain definitions. I don't think anything will ever replace using a bare list of translations with a good gloss; however, this isn't always enough. {{transclude sense}} can handle certain things, but it copies everything. This is indeed useful in certain situations, like when two words are a perfect match, but I feel we need something in between. What if we had a template {{definition}} ({{def}} for short) that could handle most things on the definition line in a more elegant manner?

  1. A param for the language {{{1}}}
  2. For the definitions, perhaps multiple definitions could be given using {{{2}}}, {{{3}}}
  3. Add a senseid to the definition, perhaps using {{{id}}}
  4. Add labels and categories, probably using {{{lb}}}, {{{c}}}
  5. Ideally it would be able to (but not necessarily have to) link to a specific English definition, perhaps using a parameter {{{enid1}}}, where the number matches the word entered in params 1-whatever etc?
    1. We could also potentially have this template make {{transclude sense}} obsolete, with a parameter {{{transclude}}}, in which case the English ID would be mandatory
  6. Add a gloss, {{{gloss}}}, {{{gl}}}
  7. Add government? We never finalized a way to regularly add it, but in theory this could be added here as well.
  8. Perhaps controversially a reference, {{{ref}}}, this may or may not be useful for LDL's or something.
  9. I suppose in theory you could include {{rfeq}} as a parameter, but I'm not sure I see the value.
  10. Probably a qualifier? {{{q}}} may or may not be useful.
  11. Perhaps a way to deal with commas vs semicolons between translations if idea 1 is kept
  12. I'm not sure how {{defdate}} and references for it would factor in.

Again, this template would not be mandatory, but I feel it would definitely fill a gap and also really reduce a lot of noise. For example, one could take

{{senseid|pl|condition}} {{l|en|state|id=condition}} {{gl|set of circumstances}} ({{l}} is used for the id, very useful imo)

and change it into

{{def|pl|state|id=condition|enid=condition|gl=set of circumstances}}

The amount of space, keystrokes, and brackets saved only adds up when we add things like labels, etc.

We would also need to consider how verbs are handled, as verbs should use "to" or "I" in special cases.

Perhaps this is the result of personal preference, but I hate having to call tons of templates, it makes it difficult to read the definition line in the editor, and I find parameters easier. Another upside would be that it could help regularize definitions, things could appear in the proper order. Vininn126 (talk) 23:30, 21 August 2023 (UTC)Reply[reply]

Sounds good. I think this could possibly wait until resources are a bit less tight, but it’s inevitably the way we’re going to have to go if we want to make things more structured. The amount of unnecessary duplication across entries is huge at the moment, and while we often want to show information in multiple places, it’s usually good to have a way to keep it all synchronised. Having a special definitions template would greatly assist in that, because it would make transcluding senses across entries much easier. Theknightwho (talk) 23:43, 21 August 2023 (UTC)Reply[reply]
@Theknightwho This is why I wanted to include transclude as a parameter. Vininn126 (talk) 23:45, 21 August 2023 (UTC)Reply[reply]

PUA character in drab

drab contains a private-use-area character in the wikicode, in {{zlw-opl-IPA}}, which causes it to be unreachable / uneditable in AutoWikiBrowser. I'm not sure why AWB can't open pages with PUA characters (which is explicitly what's going on, it throws up an error message saying so), but it also seems like we shouldn't be using PUA characters in mainspace, so changing it to a stable character would resolve everything. - -sche (discuss) 16:01, 22 August 2023 (UTC)Reply[reply]

@-sche That module will be obsoleted soon-ish anyway~, so the tofu can be replaced. Vininn126 (talk) 16:18, 22 August 2023 (UTC)Reply[reply]
I replaced the PUA with something else. Vininn126 (talk) 16:23, 22 August 2023 (UTC)Reply[reply]
Thanks! - -sche (discuss) 17:42, 22 August 2023 (UTC)Reply[reply]

No table of contents on the Aug 2023 Etymology Scriptorium page

As of one minute ago, there was no table of contents for the (lengthy) August 2023 page of the Etymology Scriptorium - is this a known issue?

Thanks,

Chernorizets (talk) 08:30, 23 August 2023 (UTC)Reply[reply]

Fixed. Someone had added "__notoc__" to a thread, which removed the table of contents for the whole page. —Mahāgaja · talk 08:58, 23 August 2023 (UTC)Reply[reply]

Neighborhoods in topic cat

I think neighborhoods and districts aren't handled right in Module:category tree/topic cat/data/Places. They are subdivisions of a city, but are categorized in "Neighborhoods in State" and get all mixed up. For example, Neighborhoods in Santa Catarina, Brazil has three entries: two are from one city and one is from another. Trooper57 (talk) 21:55, 23 August 2023 (UTC)

@Trooper57 Yeah, this needs some work. I'll get to it eventually; there are a lot of accumulated fixes needed for the {{place}} modules. Benwing2 (talk) 05:55, 25 August 2023 (UTC)Reply[reply]

Manual transliterations don't work in Template:ko-regional

This template occasionally needs manual transliterations, which don't work as in {{ko-regional|나뭇잎|나무잎|tr=namunnip}}. See 나뭇잎 (namunnip) (manually transliterated "namunnip") and its North Korean equivalent 나무잎 (namu'ip). Anatoli T. (обсудить/вклад) 23:20, 23 August 2023 (UTC)Reply[reply]

Index namespace deleted

After Wiktionary:Votes/2021-07/Deleting the Index passed, the old Index namespace was emptied of pages, but it was never actually removed from the wiki. This has now been done. There is no longer an "Index" (number 104) or "Index talk" (number 105) namespace on this wiki. Update your bots, scripts, etc.

(If you don't know what I'm talking about, then you can safely ignore this announcement!) This, that and the other (talk) 07:25, 24 August 2023 (UTC)Reply[reply]

@This, that and the other Thank you. My scripts don't hardcode the set of namespaces; this is handled by pywikibot AFAIK. Benwing2 (talk) 05:56, 25 August 2023 (UTC)Reply[reply]

Nocat in citation page quotations

In a bot run today I see the |nocat= param was removed from quotations in Citations pages. Those pages are now categorised under "X terms with quotations". I'm not sure this is a good idea, because it means that non-existent entries with a citations page, either deleted by RFV or not yet ready to publish, are now categorised as entries. That's mainly why I was using |nocat=1 there. IMO the nocat parameter should either be restored or categorisation disabled on citations pages. @Benwing2 (sorry to bother again!) —Al-Muqanna المقنع (talk) 09:27, 24 August 2023 (UTC)Reply[reply]

Perhaps a maintenance category of Citations pages would be useful instead. Vininn126 (talk) 09:43, 24 August 2023 (UTC)Reply[reply]
That would also work. I do think it's odd, in general, that Citations pages and entries themselves are mixed together in the existing system, though it makes sense to indicate an existing entry where the quotations are on its Citations page. It might be pedantic but since it's not a hidden cat and the desc says it's for entries I do think there's potential for reader confusion. —Al-Muqanna المقنع (talk) 12:38, 24 August 2023 (UTC)Reply[reply]
@Al-Muqanna There are a large number of Citations pages and only a few of them were using nocat=1. I didn't realize it was you adding them but as a general rule something like this shouldn't be done manually. Either Citations pages should or shouldn't be in the "X terms with quotations" categories, or maybe it should be done only for Citations pages where the corresponding mainspace page exists. But in any case it should be handled by the module, not manually. Benwing2 (talk) 05:54, 25 August 2023 (UTC)Reply[reply]
@Benwing2: Yes I agree. Granted it's not usual atm though I don't think it's just me since I took it from somewhere originally—your last suggestion about only categorising them if the main entry actually exists seems like the most straightforward solution to this. —Al-Muqanna المقنع (talk) 07:39, 25 August 2023 (UTC)Reply[reply]
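
The last suggestion, letting the module decide instead of a manual |nocat=, could look roughly like the sketch below. Everything here is an assumption made for illustration: the function name `should_categorize` is hypothetical, the namespace is passed in as a plain string, and whether the main entry exists is supplied by the caller rather than looked up.

```lua
-- Hypothetical decision function: should a quotation add the page to
-- "X terms with quotations"?
-- namespace:         namespace name of the current page ("" for mainspace)
-- main_entry_exists: whether the corresponding mainspace entry exists
local function should_categorize(namespace, main_entry_exists)
	if namespace ~= "Citations" then
		-- Simplification: other namespaces keep their current behaviour.
		return true
	end
	-- Citations pages are categorized only when the entry itself exists,
	-- so terms deleted by RFV or not yet created stay out of the category.
	return main_entry_exists == true
end

print(should_categorize("", true))            --> true
print(should_categorize("Citations", true))   --> true
print(should_categorize("Citations", false))  --> false
```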

I notice that if you write {{homophones|en|}} or {{homophones|en||}}, it throws an error (as it should).
And if you write {{homophones|en|foo|}}, it just shows "Homophone: foo" and ignores the empty parameter.
But if you write {{homophones|en||foo}}, it doesn't throw an error or ignore the empty parameter, instead it displays "Homophones: term, foo" with the word term erroneously added to the list of homophones, and it doesn't seem to add any kind of error tracking category flagging that something's amiss, either.
And if you write {{homophones|en|||foo}}, it displays "Homophones: term, [Term?], foo".
Presumably it should do something to indicate something is amiss, whether by throwing a visible error or adding a tracking category for some bot to clean up. - -sche (discuss) 06:52, 25 August 2023 (UTC)
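
One way to surface such gaps is to separate trailing empty parameters (which are already ignored) from interior ones and to flag the latter. A minimal sketch, assuming the terms after the language code arrive as a plain Lua array of strings; `check_terms` is a hypothetical name, not a function of the existing module:

```lua
-- Hypothetical validator: returns the non-empty terms and the positions of
-- empty terms that occur before the last non-empty one.
local function check_terms(terms)
	-- Drop trailing empty terms, as in {{homophones|en|foo|}}.
	local last = #terms
	while last > 0 and terms[last]:match("^%s*$") do
		last = last - 1
	end
	local kept, gaps = {}, {}
	for i = 1, last do
		if terms[i]:match("^%s*$") then
			gaps[#gaps + 1] = i      -- interior gap: worth an error or tracking
		else
			kept[#kept + 1] = terms[i]
		end
	end
	return kept, gaps
end

local kept, gaps = check_terms({ "", "", "foo" })  -- {{homophones|en|||foo}}
print(table.concat(kept, ", "))  --> foo
print(table.concat(gaps, ", "))  --> 1, 2
```

A caller could then raise an error when `kept` is empty and add a tracking category (or an error) when `gaps` is not empty, instead of displaying a placeholder term.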

Semicolon to "and" in publisher parameter (quote templates)

@Benwing2, Sgconlaw: E.g., in {{RQ:Richardson Pamela}}, |publisher=[[w:Rivington (publishers)|C[harles] Rivington]],{{nb...|in St. Paul’s Church Yard}}; and J. Osborn,{{nb...|in Pater-noster Row.}} results in “C[harles] Rivington, […] and and J. Osborn, […]” instead of “C[harles] Rivington, […]; and J. Osborn, […]” because semicolons are automatically changed to “and”. Special:Search/"and and" shows many more examples of this issue. J3133 (talk) 11:24, 26 August 2023 (UTC)Reply[reply]

@Benwing2: it probably shouldn’t operate as a form of markup in the |publisher= field. — Sgconlaw (talk) 12:18, 26 August 2023 (UTC)Reply[reply]
@J3133 @Sgconlaw I will change this so that semicolons are displayed as semicolons in the |publisher= field; this allows multiple publishers with inline modifiers attached to each, but won't affect the display in cases like this. Benwing2 (talk) 19:19, 26 August 2023 (UTC)Reply[reply]
@Benwing2: ah, I see. Thanks for making it “smart” then! — Sgconlaw (talk) 21:21, 26 August 2023 (UTC)Reply[reply]
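
The doubled "and" can be reproduced with a small sketch: splitting the field on semicolons and joining the pieces with " and " repeats the word whenever a piece already begins with "and". The function names below are hypothetical, the publisher string is shortened, and this is not the code of the actual quote module.

```lua
-- Hypothetical helper: split a field on semicolons, trimming whitespace
-- around each piece.
local function split_on_semicolons(text)
	local parts = {}
	for part in (text .. ";"):gmatch("%s*(.-)%s*;") do
		parts[#parts + 1] = part
	end
	return parts
end

-- Join the pieces with the given separator.
local function format_publishers(text, joiner)
	return table.concat(split_on_semicolons(text), joiner)
end

local field = "C. Rivington; and J. Osborn"
print(format_publishers(field, " and "))  --> C. Rivington and and J. Osborn
print(format_publishers(field, "; "))     --> C. Rivington; and J. Osborn
```

Joining with "; " still lets each publisher be handled separately, for example to attach inline modifiers, while leaving the displayed text unchanged.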

Pinyin input method does not recognize "er"

Not sure where I am supposed to report this, but it seems that the Pinyin input method does not recognize "er" as a syllable and would not add the tone mark if a number is typed after "er". It works if you type "e", then tone number, then "r", though. Stormraiser (talk) 16:09, 26 August 2023 (UTC)Reply[reply]

@Stormraiser welcome! Can you please advise what you are referring to by "Pinyin input method"? I don't believe Wiktionary provides such a thing; normally input methods are provided by your device's operating system. This, that and the other (talk) 09:28, 27 August 2023 (UTC)Reply[reply]
It's a MediaWiki extension, I think https://www.mediawiki.org/wiki/Help:Extension:UniversalLanguageSelector/Input_methods/zh-pinyin-transliteration
Apparently it is not enabled on Wikipedia as I've never seen it there Stormraiser (talk) 09:39, 27 August 2023 (UTC)Reply[reply]

Problems with entry, talk, citations tabs in Vector skins

At quaint there are Talk:quaint (aka "Discussion") and Citations:quaint.

From Talk:quaint I cannot see a tab for Citations:quaint.

From Citations:quaint I cannot see a tab for the entry and I see a tab called "Discussion" that is a redlink.

These seem to me to be bugs, not features. Could this be fixed, please? — This unsigned comment was added by DCDuring (talk • contribs) at 17:43, 26 August 2023 (UTC).

All three tabs appear on all three pages for me. —Mahāgaja · talk 17:47, 26 August 2023 (UTC)Reply[reply]
I've tried this on both new and old MonobookVector, with the same result. Have there been any recent changes that might have caused this and since been changed? I don't often shut down/restart my browser or close/reopen wiki windows, let alone reboot my PC. DCDuring (talk) 17:51, 26 August 2023 (UTC)Reply[reply]
For the Talk and Citations pages what you're describing is what appears with Javascript off, but I don't know why the Citations tab would still appear on the main entry since that one is also loaded in by script. —Al-Muqanna المقنع (talk) 17:52, 26 August 2023 (UTC)Reply[reply]
I misspoke above. The problem occurs in the two Vector skins. The Citations tab did appear under 'Tools', but there was a redlinked 'Discussion' tab.
Is this the kind of thing that is a symptom of technical debt?
It's OK with Monobook, to which I have now returned. DCDuring (talk) 17:59, 26 August 2023 (UTC)Reply[reply]

Link issue

Isle of Man: From {{af|en|[[isle|Isle]]|of|[[Man#Etymology 2|Man]]}}.: “From Isle +‎ of +‎ [[Man.”, Man linking to “Unsupported titles/`lsqb``lsqb`Man”. J3133 (talk) 06:51, 27 August 2023 (UTC)Reply[reply]

@J3133: for this, use {{etymid}} at Man and {{af|en|...|...|Man|id3=...}} at Isle of Man. This, that and the other (talk) 09:27, 27 August 2023 (UTC)Reply[reply]
@This, that and the other: I did not add this etymology; this issue, which did not exist before, is also present in other entries such as Ronniecoln and Vriscourse. J3133 (talk) 09:39, 27 August 2023 (UTC)Reply[reply]
Ah, I see. Relevant modules have been edited by both @Theknightwho and @Benwing2 recently. This, that and the other (talk) 10:21, 27 August 2023 (UTC)Reply[reply]
@This, that and the other @J3133 Thanks - I didn’t realise it was affecting it like this as well. I’ll have a look. Theknightwho (talk) 10:24, 27 August 2023 (UTC)Reply[reply]

laryngeal - example of using {{affix}} with "lang1=NL.", which generates Category:New Latin twice-borrowed terms

It doesn't seem like this category should be generated in this case. I spent a few hours trying to figure out the call chain from Module:affix/templates to the pieces of code (in Module:etymology and submodules, AFAICT) that know about "twice-borrowed terms", but I came up empty. Could anyone please help identify why this category is being created in this case?

Note: after noticing this category was a redlink on a few pages, and before even asking myself whether it made sense where it appeared, I created it with {{auto cat}}. It is now complaining about an incorrect label being passed to {{topic cat}}. I guess that's secondary to the problem of whether the category makes sense at all.

Thanks,

Chernorizets (talk) 09:02, 27 August 2023 (UTC)Reply[reply]

By some magic I think I fixed this in Special:Diff/75803876. This, that and the other (talk) 11:02, 27 August 2023 (UTC)Reply[reply]
@This, that and the other ha, the one function I didn't look at closely, because I thought it just created some hyperlinks. But lo and behold, it loads Module:etymology and invokes the code that knows about "twice-borrowed terms". Nice catch!
It looks like the edit that occurred 3 edits before yours did, in fact, introduce the typo you fixed. That edit was done on 8/26, so we've only had this category weirdness for a few days. @Benwing2 for viz.
Methinks all the modules I looked at in this investigation - Module:affix, Module:affix/templates, Module:etymology, Module:etymology/templates and Module:auto cat need way better tests. The few tests that do exist are mostly failing. I'm happy to give it a go, but I'm still learning the codebase so I'll be slow at it.
Cheers,
Chernorizets (talk) 12:08, 27 August 2023 (UTC)Reply[reply]