User talk:JeffDoozan: difference between revisions

From Wiktionary, the free dictionary
We now have a module error at [[肋巴骨]] because [[:Module:quote]] is looking in your userspace instead of the template namespace. Please fix. [[User:Chuck Entz|Chuck Entz]] ([[User talk:Chuck Entz|talk]]) 17:37, 12 April 2024 (UTC)
: {{done|fixed}}, thanks for the heads-up. The calls to my userspace are temporary while I clean up invalid params on existing cite- template uses. [[User:JeffDoozan|JeffDoozan]] ([[User talk:JeffDoozan|talk]]) 17:45, 12 April 2024 (UTC)

== Changing T:cite-* {{para|1}} to {{para|lang}} ==

Where was it decided to change T:cite-* {{para|1}} to {{para|lang}}? I have to say I disagree with this change thoroughly. -- [[User:Sokkjo|Sokkjō]] 02:59, 13 April 2024 (UTC)

Revision as of 02:59, 13 April 2024

Welcome

Hello, welcome to Wiktionary, and thank you for your contributions so far.

If you are unfamiliar with wiki-editing, take a look at Help:How to edit a page. It is a concise list of technical guidelines to the wiki format we use here: how to, for example, make text boldfaced or create hyperlinks. Feel free to practice in the sandbox. If you would like a slower introduction we have a short tutorial.

These links may help you familiarize yourself with Wiktionary:

  • Entry layout (EL) is a detailed policy on Wiktionary's page formatting; all entries must conform to it. The easiest way to start off is to copy the contents of an existing same-language entry, and then adapt it to fit the entry you are creating.
  • Check out Language considerations to find out more about how to edit for a particular language.
  • Our Criteria for Inclusion (CFI) defines exactly which words can be added to Wiktionary; the most important part is that Wiktionary only accepts words that have been in somewhat widespread use over the course of at least a year, and citations that demonstrate usage can be asked for when there is doubt.
  • If you already have some experience with editing our sister project Wikipedia, then you may find our guide for Wikipedia users useful.
  • If you have any questions, bring them to Wiktionary:Information desk or ask me on my talk page.
  • Whenever commenting on any discussion page, please sign your posts with four tildes (~~~~) which automatically produces your username and timestamp.
  • You are encouraged to add a BabelBox to your userpage to indicate your self-assessed knowledge of languages.

Enjoy your stay at Wiktionary! Ultimateria (talk) 19:18, 11 May 2020 (UTC)[reply]

Bot Fails

Please check your code: {{syn}} doesn't work without a language code, and {{sense}} is not part of the synonym list. See carvallo, dobladora, Día de los Inocentes, and todo el tiempo, where I had to revert your bot due to module errors. Chuck Entz (talk) 03:21, 9 September 2020 (UTC)[reply]

Thank you for catching that. For what it's worth, I made all of those mistakes by hand when running the bot in a manual edit mode, so the blame is 100% on me. I'll build in some double-checking before using it like that again. Thanks! JeffDoozan (talk) 13:08, 9 September 2020 (UTC)[reply]

Pt bot run

Hi, could your bot also re-arrange synonyms for Portuguese entries? – Jberkel 19:48, 10 December 2020 (UTC)[reply]

It's running now, it should be able to clean up ~8,000 of the ~10,000 Portuguese entries with synonyms. – JeffDoozan (talk) 16:55, 17 December 2020 (UTC)[reply]
Great, thanks. So the bot doesn't make changes in ambiguous situations, which means that all remaining "Synonyms" sections (2000?) require human attention? – Jberkel 11:17, 18 December 2020 (UTC)[reply]
That's correct, the bot won't make any changes when it encounters any sort of ambiguity when matching the synonyms to the glosses (eg, when a synonym has no {{sense}} and the word has multiple glosses). I have a wild idea of looking up the synonym's glosses and matching them against the word's glosses as a way of resolving some ambiguities. From my experience cleaning up the unhandled Spanish Synonyms, I'd estimate that could resolve at least half of the remaining 2000 entries, leaving the most ambiguous cases (gustar) in need of human attention. I'll do what I can to reduce the 2000 to something more manageable. JeffDoozan (talk) 13:21, 18 December 2020 (UTC)[reply]
Ok, let me know how it goes, and maybe put a list of entries requiring attention somewhere, so the work can be distributed. – Jberkel 19:40, 18 December 2020 (UTC)[reply]
We're left with 971 articles that need manual attention. JeffDoozan (talk) 15:26, 19 December 2020 (UTC)[reply]
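
The skip-on-ambiguity rule described in this thread can be illustrated with a minimal Python sketch. It is not the bot's actual code: the function name `assign_synonym` and the plain substring match between a {{sense}} tag and a gloss are assumptions made only for illustration.

```python
from typing import List, Optional


def assign_synonym(sense_tag: Optional[str], glosses: List[str]) -> Optional[int]:
    """Return the index of the gloss a synonym belongs to, or None if ambiguous.

    Hypothetical helper: the real bot's matching logic is not shown on this page.
    """
    if sense_tag is None:
        # Without a {{sense}} tag the synonym is only safe to place
        # when the word has exactly one gloss.
        return 0 if len(glosses) == 1 else None
    tag = sense_tag.lower()
    matches = [i for i, gloss in enumerate(glosses) if tag in gloss.lower()]
    # Exactly one matching gloss is unambiguous; zero or several are skipped.
    return matches[0] if len(matches) == 1 else None
```

Entries for which the function returns None would be left untouched and collected into the list that needs human attention.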

Inflection > Declension

Hello,

Would you perhaps be interested in running a bot to replace all Inflection headers with Declension headers in Macedonian entries or subsections for nouns, adjectives, determiners and pronouns? Some will have three equal signs around them and others four. Since "Conjugation" has already been used for verbs, it is more logical to use "Declension" for nominal words, given that "conjugation" and "declension" are co-hyponyms of "inflection", such that having "Inflection" alongside "Conjugation" mixes degrees of specificity. Furthermore, the declension tables for nominal words use "declension" and "conjugation" in their internal title, and there is a general rule according to which nominal inflection templates are given names containing -decl-.

The change that I am proposing should only affect the bolded headers as they appear on a page when it is not being edited and not any headword-line templates or inflection templates. I am contacting you since you successfully took care of several other issues with Macedonian entries recently. Martin123xyz (talk) 08:48, 18 October 2021 (UTC)[reply]

@Martin123xyz, It's a very simple bot job to rename every Inflection header to Declension in all Macedonian entries. Is that what you're asking for, or are there some Inflection headers that should not change (eg, in subsections outside of nouns, adjectives, determiners and pronouns)?
For reference, there are currently ~18,000 Inflection headers, ~6,000 Declension headers and ~7,000 Conjugation headers. JeffDoozan (talk) 13:36, 18 October 2021 (UTC)[reply]
Yes, that is basically what I am asking for. As far as I am aware, there are no inflection headers that should remain as they are, because "Declension" and "Conjugation" cover all words and there are no cases where a single inflection table can be used for both a verb and a nominal word. I cannot think of any other scenario where "Inflection" would be the only possible choice. However, it is possible that a very small number of verb entries or subsections contain an Inflection header instead of a Conjugation header as well. In that case, those would need to be changed to "Conjugation". Could you check for the existence of Inflection headers in verb entries and subsections and program the bot based on your findings? Thank you in advance. Martin123xyz (talk) 13:54, 18 October 2021 (UTC)[reply]
@Martin123xyz, I found and fixed two verbs that used Inflection instead of Conjugation and now the bot is running to rename all remaining Inflections to Declensions. JeffDoozan (talk) 14:31, 18 October 2021 (UTC)[reply]
Thank you very much for the help. Martin123xyz (talk) 06:14, 19 October 2021 (UTC)[reply]
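
A header rename like the one discussed above might be sketched in Python as follows. This is an illustration under stated assumptions, not the bot's implementation: the function name `rename_inflection` is hypothetical, and the caller is assumed to have already isolated the Macedonian language section and to have fixed any verb sections by hand first.

```python
import re

# Matches an "Inflection" header at any level from L3 to L6. The backreference
# requires the same number of equals signs on both sides.
HEADER_RE = re.compile(r"^(={3,6})[ \t]*Inflection[ \t]*\1[ \t]*$", re.MULTILINE)


def rename_inflection(section_text: str, new_name: str = "Declension") -> str:
    """Rename every Inflection header in the given text, keeping its level."""
    return HEADER_RE.sub(
        lambda m: m.group(1) + new_name + m.group(1), section_text
    )
```

Because the pattern is anchored to whole header lines, headword-line templates and inflection templates that merely contain the word are left alone.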

DRAE non-links

Hi Jeff. Perhaps you could get your bot to make a note of all pages it doesn't add a DRAE link to - this would serve a couple purposes, firstly to find some errors and secondly to smugly say "ha, we are better than the DRAE coz we have all these words and they don't" QuickPhyxa (talk) 23:15, 18 October 2021 (UTC)[reply]

@QuickPhyxa Such a list would be too large to be useful. DRAE has ~87,000 unique lemma words, we have ~82,000. We only share ~44,500 with DRAE. I'm still looking for a good way to divide that into a useful list of "hey, here are some words DRAE has that we don't" that could be of some use to human editors. JeffDoozan (talk) 23:32, 18 October 2021 (UTC)[reply]
Those are pretty good numbers. QuickPhyxa (talk) 20:24, 20 October 2021 (UTC)[reply]

bot

When will you start your bot to create Spanish forms? Amirh123 (talk) 16:28, 21 October 2021 (UTC)[reply]

Yes. Your bot is creating lots of adjective and noun forms, that's great! MooreDoor (talk) 23:12, 23 October 2021 (UTC)[reply]

List of Macedonian comparable adjectives

Hello,

Would you by any chance like to create a list of all Macedonian adjectives which are currently formatted as comparable (with only {{mk-adj}} in the header, rather than {{mk-adj|-}}, which is for the ones lacking comparative and superlative forms) but which do not have comparative and superlative forms in the declension table, i.e. where the "c=1" parameter, which adds said forms, is missing? I could then open the pages of all such adjectives and add the c=1 parameter manually, essentially bringing the declension template in line with the headword-line template. This step should not be automatic because the adjective might currently be wrongly formatted as comparable when it is not, in which case I would add a hyphen to the headword-line template rather than adding the "c=1" parameter. Thank you in advance. Martin123xyz (talk) 12:16, 27 October 2021 (UTC)[reply]

@Martin123xyz Here's your list. 483 matches out of 2528 total adjectives. JeffDoozan (talk) 13:06, 27 October 2021 (UTC)[reply]
Thank you very much. Martin123xyz (talk) 13:09, 27 October 2021 (UTC)[reply]
I have now gone through the list and fixed just about all of the entries (barring a few obscure adjectives that I don't feel knowledgeable enough about). Martin123xyz (talk) 14:11, 28 October 2021 (UTC)[reply]

Bot added duplicate definition

See diff. I doubt this is a common error, but I figured you should know. Ultimateria (talk) 22:47, 28 October 2021 (UTC)[reply]

@Ultimateria: Thanks for the heads up. The bot doesn't currently parse {{form of}} - luckily it's only used on 39 pages and most of them are verb forms. I'll make sure the bot learns to handle that and on the next run, it'll automatically clean up the duplicates. JeffDoozan (talk) 22:53, 28 October 2021 (UTC)[reply]

Your bot blanked the page. DTLHS (talk) 20:28, 31 October 2021 (UTC)[reply]

@DTLHS: Good catch, thank you. For now, I've reverted the bot edit on that page. It was blanked due to a bug that's been resolved: the lemma, fulano, is technically a form of the Proper Noun Fulano, so the bot correctly removed the Noun form section, but failed to create the Proper Noun section. When the bot runs with the new database dump tomorrow, it should complete the page changes successfully (as well as fulanos, which was also affected but not blanked because it has a Portuguese section). JeffDoozan (talk) 23:47, 31 October 2021 (UTC)[reply]

Macedonian References

Hello,

Could you please fix all Macedonian reference sections where the references are not preceded by a bullet point, as you did with the IPA transcriptions? Thank you in advance. Martin123xyz (talk) 11:35, 3 November 2021 (UTC)[reply]

@Martin123xyz: All done. It fixed 94 entries. JeffDoozan (talk) 14:40, 3 November 2021 (UTC)[reply]
Thank you again. Martin123xyz (talk) 14:41, 3 November 2021 (UTC)[reply]

Macedonian verbs without labels

Hello,

Could you please compile a list of all Macedonian verbs with definitions that do not start with any label, i.e. which start directly with "# to", including those which have some definitions starting with a label and others starting without one? Until today's edit, се ороди (se orodi) belonged to this type of entry. I would like to fix them so that all verb definitions contain at least one of the labels {{lb|mk|transitive}}, {{lb|mk|intransitive}} and {{lb|mk|reflexive}}. Verb forms as well as verb lemmas with only a template in the definition (e.g. {{alternative form of}} or {{nonstandard form of}}) should be excluded. Thank you in advance. Martin123xyz (talk) 13:57, 10 November 2021 (UTC)[reply]

Here's your list. 214 verbs. JeffDoozan (talk) 22:42, 10 November 2021 (UTC)[reply]
@Martin123xyz: Oops, forgot to ping you. JeffDoozan (talk) 22:43, 10 November 2021 (UTC)[reply]
Thank you for the help. Martin123xyz (talk) 08:16, 11 November 2021 (UTC)[reply]
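
The filter requested in this thread can be sketched in Python. This is only an illustration of the stated criteria, not the script that produced the list; the function name `unlabeled_definitions` is hypothetical, and the input is assumed to be the wikitext of a single Macedonian Verb section.

```python
import re
from typing import List

# A definition that starts with a context label, e.g. {{lb|mk|transitive}}.
LABEL_RE = re.compile(r"^\{\{(?:lb|label)\|mk\|")
# A definition that consists of nothing but one template,
# e.g. {{alternative form of|mk|...}}.
TEMPLATE_ONLY_RE = re.compile(r"^\{\{[^{}]*\}\}\.?$")
# Top-level definition lines only: "#" not followed by ":", "*" or "#".
DEFINITION_RE = re.compile(r"^#(?![:*#])\s*(.*)$")


def unlabeled_definitions(verb_section: str) -> List[str]:
    """Return the definitions that do not start with a label."""
    hits = []
    for line in verb_section.splitlines():
        m = DEFINITION_RE.match(line)
        if not m:
            continue
        body = m.group(1).strip()
        if not body or LABEL_RE.match(body) or TEMPLATE_ONLY_RE.match(body):
            continue
        hits.append(body)
    return hits
```

A verb would go on the list whenever the function returns a non-empty result, which also covers entries where only some definitions carry a label.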

Preverbs

What justification do you have for doing this? DTLHS (talk) 17:15, 14 November 2021 (UTC)[reply]

@DTLHS "Preverb" isn't listed as an allowed header per Wiktionary:POS#Part_of_speech, but "Prefix" is. Since there were only 87 pages using Preverb and they all ended with "-", it seemed like it was a typo or a misunderstanding. Having now read the preverb entry, I can see that may have been the wrong assumption. I'm happy to revert all pages back to Preverb if you think we should keep Preverb as a POS. JeffDoozan (talk) 18:06, 14 November 2021 (UTC)[reply]
You shouldn't change headers in languages that you don't know. Please revert your edits. DTLHS (talk) 18:07, 14 November 2021 (UTC)[reply]
@DTLHS: All fixed, thank you for catching my mistake so quickly! JeffDoozan (talk) 18:49, 14 November 2021 (UTC)[reply]

Macedonian terms with red links in their headword lines

Hello,

I tried to create the category Category:Macedonian terms with red links in their headword lines with {{auto cat}} but it is empty, even though the Romanian and the Welsh equivalents, also created with that template, contain a great deal of pages. What code needs to be added and where so that the Macedonian page fills up too? I primarily need it in order to see which verbs have an aspectual pair in their headword line template for which no page has been created yet (i.e. verbs which have {{mk-verb|impf|pfn=X}} or {{mk-verb|pf|impfn=X}} where X is a redlink and n is an optional cardinal number (pf, pf2, pf3 etc.); for example, се распламти (se rasplamti) and мрази (mrazi)). If the page I have created cannot be made to work, could you create the list of verbs fitting the above description by other means? Thank you in advance. Martin123xyz (talk) 13:30, 18 November 2021 (UTC)[reply]

@Martin123xyz:, I have no idea, I haven't used the category system much, but I'd be interested in learning. If you post this on WT:BP, and get a technical explanation of the changes needed, I can help you implement them. If not, I can write something to generate the list for you. JeffDoozan (talk) 13:42, 18 November 2021 (UTC)[reply]
Thank you for the reply. I have posted my query in the Grease Pit, since it is for technical problems, whereas the Beer Parlour is for proposals. Martin123xyz (talk) 14:06, 18 November 2021 (UTC)[reply]
After receiving a reply in the Grease Pit, I looked at the language-specific headword modules and found that the redlink categories are populated by codes such as the following (for Welsh):
for _, inflection_set in ipairs(data.inflections) do
    for _, inflection in ipairs(inflection_set) do
        -- Skip inflections that already contain explicit wikilinks.
        if not inflection:find("%[%[") then
            local title = mw.title.new(inflection)
            -- Categorize the page when the inflected form has no entry yet.
            if title and not title.exists then
                table.insert(tracking_categories, langname .. " " .. poscat .. " with red links in their headword lines")
            end
        end
    end
end
I do not feel comfortable using such advanced code. Could you implement something like this for Macedonian at Module:mk-headword? Martin123xyz (talk) 09:32, 19 November 2021 (UTC)[reply]
@Martin123xyz: Thank you, that's exactly the code that it needed! I created Category:Macedonian verbs with red links in their headword lines, which is just now starting to populate. You may need to create additional categories for other parts of speech that have inflections or plurals. JeffDoozan (talk) 02:02, 22 November 2021 (UTC)[reply]
Thank you very much. For the time being, I do not need categories for parts of speech that have inflections in their headword lines, since I am focusing on creating new lemma pages, rather than pages for noun plurals, for example. I will leave those to a bot, when someone agrees to program one for Macedonian. Martin123xyz (talk) 07:45, 22 November 2021 (UTC)[reply]

Idioms → Idiom

Have seen the latest change by your bot removing the "s" on the Idioms header on the Japanese sumo page. Weird, but doesn't this break the flow of other headers like Derived terms, Proverbs, and the like (may apply to other languages other than Japanese), that are in the plural? ~ POKéTalker 11:05, 24 November 2021 (UTC)[reply]

@Poketalker: Before the bot edits, there were 7000+ pages using "Idiom" and only 400 using "Idioms", so it seems that the majority has already favored the singular spelling. Per WT:ELE, the rule seems to be that we prefer the singular for all parts of speech or, more broadly, for all things that would have definitions (including Proverb, without the s) and the plural for everything else (Derived terms, Alternative forms, Translations, Usage Notes, References). If you think of it as "Here are definitions for term when used as a noun/proverb/idiom" it makes sense to use the singular.
Per WT:POS, "Idiom(s)" is an explicitly disallowed section. Given its widespread usage that seems like something worth changing and while we're doing that we can decide which spelling should be used. If you have strong feelings about the category and its naming and you can get consensus to permit it and to use the other spelling, I'll be more than happy to run the bot again to rename Idiom -> Idioms. JeffDoozan (talk) 12:24, 24 November 2021 (UTC)[reply]
@Poketalker: I see now that the "Idioms" section in 相撲 was actually a L4 section below the L3 Noun section, so it's not subject to the same POS rules. I will fix the bot to prevent future mistakes and undo the mistakes it already made. JeffDoozan (talk) 22:28, 24 November 2021 (UTC)[reply]

Facilitating the categorization of Macedonian nouns

Hello,

New categories have been created for the classification of Macedonian nouns by inflection type, and in all cases in which it was possible, I have incorporated the categories into the declension templates. For example, {{mk-decl-noun-f-e}} now automatically fills Category:Macedonian feminine nouns with vocatives in -е, which is why there are already 901 pages there (and the number is growing as I write). However, there are other inflectional categories for which a single generic template is used, the inflectional differences being dealt with by means of individual parameters. For example, masculine nouns with plurals in -ови and those with plurals in -еви use the same template, and the correct forms are generated by indicating the plural stem, including -ов- or -ев- in the second parameter. In such cases, the inflectional categories need to be added manually. Is there a way to generate lists of all the noun pages to which a particular category needs to be added so that I don't have to go through the list of all Macedonian nouns and spot the relevant cases one by one? For example, could a list be generated for Macedonian masculine nouns where the indefinite singular ends in a consonant X, with an undetermined number of characters before it, and the indefinite plural ends in Xови (to be checked in the output of {{mk-decl-noun-m}}, e.g. столб), so that I can open those one by one and add the category Category:Macedonian masculine nouns with plurals in -ови? I have written a code at {{mk-decl-noun-m}} which adds the category through pov=1.

For plurals in -еви, the singular will end in X and the plural in Xеви, but there are also cases where the singular will end in XY and the plural in Xеви (with deletion of the final consonant, e.g. рој), so those might be more difficult to collect into a list. Finally, for Macedonian nouns with final palatalization, the singular will end in -к, -г or -х and the plural in -ци, -зи or -си respectively (e.g. ученик). If it is not possible to compile lists for these, I will just go through all masculine nouns - it's not a very important issue.

Thank you in advance. Martin123xyz (talk) 08:22, 13 December 2021 (UTC)[reply]

Does this search have the results you're looking for with your first example? If so, you can adjust it to find the other endings you're looking for. JeffDoozan (talk) 13:01, 13 December 2021 (UTC)[reply]
I changed it to search for -ов- rather than -б- (the consonant before the plural morpheme does not need to be specified), but it also includes many false positives from Bulgarian and Russian, e.g. радон, because it has the related adjective радонов, even though it does not have a Macedonian plural *радонови. What can be written so that it looks for -ов- only within the curly brackets of {{mk-decl-noun-m}}, with an undetermined number of characters to the left and to the right (to cover cases like зет, with {{mk-decl-noun-m|зет|зетов|voc_sg2=зете}} in the declension section)?
The solution will work similarly for -еви, but not for cases where -к, -г, and -х pluralize in -ци, -зи and -си, because the plural -и will not be in the source code and because simply searching for the consonants ц, з and с in the source will return too many false positives, including Macedonian masculine nouns which already end in -ц, -з and -с, like светец, with the plural светци.
Martin123xyz (talk) 13:34, 13 December 2021 (UTC)[reply]
I have another idea: to search for -овите and -евите in pages (the definite forms, which are distinctive enough to exclude many false positives). However, when I write *овите or *евите, with the asterisk representing an indeterminate number of letters to the left, I get no results, even though this works in the other direction: ов* gives me complete results for pages containing words starting with ов-. Martin123xyz (talk) 13:37, 13 December 2021 (UTC)[reply]
@Martin123xyz: If you search for insource: /\{\{mk-noun\|m/ insource: /\{\{mk-decl-noun-m[^}]*бов/, you'll get results only inside {{mk-decl-noun-m}} JeffDoozan (talk) 13:42, 13 December 2021 (UTC)[reply]
Thank you for the help. This still gives me false hits like вдовец which contains -ов- inside the root, but I think these are few enough to be ignored. Do tell me if you come up with a solution for the plurals in -ци, -зи and -си with palatalization. Martin123xyz (talk) 13:45, 13 December 2021 (UTC)[reply]
I have come up with a solution for all the categories where insource: is not specific enough: I export Macedonian nouns into Excel and use Excel functions to order them by their final letter or letter sequence. This enabled me to identify all Macedonian nouns ending in к, г and х. I then used more Excel functions to wrap them in -, so that I could copy paste them into a test page on Wiktionary, efficiently open them as links, and add the parameters assigning them to the relevant category. The whole categorization process is now advancing well. I just thought I'd post this in case it gives someone else useful ideas. Martin123xyz (talk) 09:19, 14 December 2021 (UTC)[reply]
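
For anyone working from a database dump rather than the search box, the same classification can be sketched in Python. This is a hedged illustration, not a tool used in this thread: the function names are hypothetical, and it assumes (as in the зет example above) that the plural stem is the second positional parameter of {{mk-decl-noun-m}} and that no named parameter precedes it.

```python
import re
from typing import Optional

# Captures the first two positional parameters of {{mk-decl-noun-m}}.
DECL_RE = re.compile(r"\{\{mk-decl-noun-m\|([^|{}]*)\|([^|{}]*)")


def plural_stem_suffix(wikitext: str) -> Optional[str]:
    """Return "ов" or "ев" if the plural stem ends in it, else None."""
    m = DECL_RE.search(wikitext)
    if not m:
        return None
    stem = m.group(2).strip()
    for suffix in ("ов", "ев"):
        # Checking the END of the stem avoids roots that merely contain -ов-.
        if stem.endswith(suffix):
            return suffix
    return None


def ends_in_velar(title: str) -> bool:
    """True for page titles ending in к, г or х (palatalization candidates)."""
    return title.endswith(("к", "г", "х"))
```

Testing the end of the stem, rather than searching for -ов- anywhere in the template, is what filters out roots of the вдовец type; the velar check works on the page title, so it does not depend on the plural -и appearing in the source.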

Good job making this list, BTW. Could you rerun it, as I think we've corrected most of them already? Br00pVain (talk) 22:06, 16 December 2021 (UTC)[reply]

It's generated from the database exports that run on the 1st and 20th of each month. When I have new data, I'll update the list. JeffDoozan (talk) 00:14, 17 December 2021 (UTC)[reply]

Also, Category:Requests for translations of Spanish usage examples is another useful cleanup list, but could be better - can you make a list of any untranslated quotations that appear more than once? I mean things like a costa de and coordinadamente, which have exactly the same quote. They should probably be done first to avoid repeating work, and I probably was responsible for most of them when adding them to help me study conectores for the C2 Spanish exam I did earlier this year (I passed, BTW) Thanks again for generally not being a substandard user. Br00pVain (talk) 22:13, 16 December 2021 (UTC)[reply]

That was a compliment, BTW Br00pVain (talk) 22:13, 16 December 2021 (UTC)[reply]
@Br00pVain: I appreciate the high praise. Here's your reward for passing the C2: Congratulations! JeffDoozan (talk) 00:14, 17 December 2021 (UTC)[reply]

Demonic demonyms

Another Spanish list that would be useful is one containing "someone from ..." or "of or from" but without being in Category:es:Demonyms. Any chance of whipping that up? Br00pVain (talk) 22:16, 16 December 2021 (UTC)[reply]

How do they currently get classified into Category:es:Demonyms? Is it an explicit tag, or a template that includes them? I'm generating reports based on the wiki source code so I have to break it down into something like "page source contains this and does not contain this" JeffDoozan (talk) 00:14, 17 December 2021 (UTC)[reply]
I always just manually classify it, either with [[:Category:es:Demonyms]] or {{C|es|Demonyms}} Br00pVain (talk) 11:14, 17 December 2021 (UTC)[reply]
@Br00pVain: Here's your list. JeffDoozan (talk) 16:00, 17 December 2021 (UTC)[reply]
Pretty good. We don't want any verbs in there, of course. Or stuff already under Category:es:Nationalities. I'll work through these... Br00pVain (talk) 20:18, 18 December 2021 (UTC)[reply]
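
The "page source contains this and does not contain this" approach mentioned above could look roughly like the following Python sketch. It is an illustration only: the function name and the default phrase list are assumptions, the phrase match is a plain substring test that would miss wording broken up by wikilinks, and the report's actual filter is not shown on this page.

```python
import re
from typing import Sequence

# Explicit categorization, either as a category link or via {{C}} / {{c}}.
DEMONYM_CAT_RE = re.compile(
    r"\[\[:?Category:es:Demonyms\]\]|\{\{[cC]\|es\|[^{}]*\bDemonyms\b[^{}]*\}\}"
)


def needs_demonym_category(
    spanish_section: str, phrases: Sequence[str] = ("someone from",)
) -> bool:
    """True if a definition phrase is present but the category is missing."""
    text = spanish_section.lower()
    has_phrase = any(phrase.lower() in text for phrase in phrases)
    return has_phrase and not DEMONYM_CAT_RE.search(spanish_section)
```

Further exclusions, such as verbs or pages already in Category:es:Nationalities, would be additional "does not contain" checks of the same kind.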

Your bot is making mistakes

Hi. Your bot wrongly corrected several Yoruba entries with ==Alternative terms== to ==Related terms== when they should have been corrected to ==Alternative forms==. E.g. Ọwọ, ọna, ẹnu, etc. Benwing2 (talk) 05:26, 5 January 2022 (UTC)[reply]

@Benwing2: Thank you, I'll adjust the code and fix the 20 affected entries. While you're here, it looks like our bots have different ideas about the appropriate level of that section at ao, ua, pari, airo, ilu, agho. My bot adjusts section depth relative to the parent section so that any child with a section level > parent+1 is adjusted to parent+1 and will make this adjustment to all sections on a page whenever it's already making a more substantial edit. From your edit at aio, it looks like you might agree with the N+1 depth in this specific circumstance, but I wanted to get your feedback so we can avoid any unintentional bot edit warring before I fix the titles and depth of the other entries my bot edited at atupa, Ọwọ, ẹdun, ẹnu, ọna, aghọfẹn, aafin, ọrọkọrọ, sọrọ, ẹni, eewọ, ariro and aọn. JeffDoozan (talk) 13:56, 6 January 2022 (UTC)[reply]
What my bot does is indent ==Related terms==, ==Derived terms==, ==Declension==, etc. to L4 unless there are multiple etymology sections (==Etymology 1== etc.), in which case they are indented to L5. In some circumstances they are left alone at whatever depth they are currently at, e.g. if there are two or more parts of speech above it in the same etymology section or if it's at L3 and after all etymology sections. This is because in some circumstances people want these headers to apply to all parts of speech in a given etymology section or to all etymology sections. I did not anticipate having these headers above the part of speech, which is why they are getting indented to L4 even in that case. My bot handles ==Alternative forms== specially and recognizes that it may occur above all parts of speech, in which case it should be L3. If you fix those entries to use ==Alternative forms== and put them at L3, my bot will leave them alone, but I will see about fixing it to recognize other headers above parts of speech (where they should not be ...) and leave them alone as well. Benwing2 (talk) 02:00, 7 January 2022 (UTC)[reply]
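
The parent+1 rule described by JeffDoozan above (any child deeper than parent+1 is pulled up to parent+1) can be sketched in Python. This is an illustration of that single rule, not either bot's code; the function name `normalize_depths` is hypothetical, and none of the special cases Benwing2 lists (multiple etymologies, ==Alternative forms== at L3) are handled.

```python
import re
from typing import List, Tuple

HEADING_RE = re.compile(r"^(=+)\s*(.*?)\s*\1\s*$")


def normalize_depths(lines: List[str]) -> List[str]:
    """Clamp every heading to at most one level below its parent heading."""
    out = []
    # Stack of (original_level, adjusted_level) for the open ancestor headings.
    stack: List[Tuple[int, int]] = []
    for line in lines:
        m = HEADING_RE.match(line)
        if not m:
            out.append(line)
            continue
        orig, title = len(m.group(1)), m.group(2)
        # Close headings at the same or a deeper original level, so that
        # siblings in the source remain siblings after adjustment.
        while stack and stack[-1][0] >= orig:
            stack.pop()
        parent = stack[-1][1] if stack else orig - 1
        level = min(orig, parent + 1)
        stack.append((orig, level))
        out.append("=" * level + title + "=" * level)
    return out
```

Tracking the original level alongside the adjusted one matters: two L5 headings under the same L3 parent should both end up at L4, not at L4 and L5.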

Hello. Unfortunately, WingerBot has converted all {{bor}} templates to {{bor+}} in Indo-Aryan language entries: diff, thereby making this project page empty. What should I do now 😬? Is it possible for you to regenerate a list of all pages that formerly employed {{bor}}, possibly by tracking down archived revisions? Really sorry for the trouble caused, I should have noticed this development beforehand. Thank you. ·~ dictátor·mundꟾ 16:01, 17 January 2022 (UTC)[reply]

Perhaps the revision history could be of some help… ·~ dictátor·mundꟾ 16:32, 17 January 2022 (UTC)[reply]
@Inqilābī: Updated JeffDoozan (talk) 18:06, 17 January 2022 (UTC)[reply]
Thanks a lot :) !! ·~ dictátor·mundꟾ 16:50, 18 January 2022 (UTC)[reply]

Bot request

Could you use your bot to remove inflectional info from translation tables? Specifically under adjective headers, I see e.g. flojo m, floja f, flojos m-p, flojas f-p. I would limit the search to Spanish to start and only those that include a masculine or epicene gender. Also, I may be the only one who is bothered by this, but I believe in cases like e.g. flojo m, suelto m, etc, the gender should be removed. This would be consistent with our headword templates for Romance languages. Ultimateria (talk) 06:53, 21 January 2022 (UTC)[reply]

@Ultimateria: It should be possible. Just to clarify, using smooth as an example, the bot should make the following change:
{{t+|es|sofisticado|m}}, {{t+|es|sofisticada|f}} -> {{t+|es|sofisticado}}
and in nominative:
{{t+|es|nominativo|m}} -> {{t+|es|nominativo}}
JeffDoozan (talk) 23:35, 21 January 2022 (UTC)[reply]
Yes, exactly. Ultimateria (talk) 03:35, 22 January 2022 (UTC)[reply]
@Ultimateria: Here's a list of the easiest changes, note that the bot will aggressively resolve forms to their "core" lemma, including forms that are "alternative spellings" (see Beige: beige -> beis, and Good: buen -> bueno, güeno -> bueno). JeffDoozan (talk) 22:38, 25 January 2022 (UTC)[reply]
Looks good to me! I hadn't thought to remove alt spellings but I think we're better off with fewer of them in t-tables. Ultimateria (talk) 01:00, 26 January 2022 (UTC)[reply]
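
The two rewrites confirmed above can be sketched as a regular-expression substitution in Python. This is a simplified illustration, not the bot's code: the function name is hypothetical, and the test for a feminine form (masculine in -o, feminine in -a) is an assumption standing in for the lemma lookup the bot performs. Restricting the edit to adjective translation tables is left to the caller.

```python
import re

# A masculine Spanish translation, optionally followed by a feminine one.
PAIR_RE = re.compile(
    r"\{\{(t\+?)\|es\|([^|{}]+)\|m\}\}"
    r"(?:,\s*\{\{t\+?\|es\|([^|{}]+)\|f\}\})?"
)


def _simplify(m: "re.Match") -> str:
    template, masc, fem = m.groups()
    kept = "{{%s|es|%s}}" % (template, masc)
    if fem is None:
        return kept  # lone masculine form: just drop the gender
    if masc.endswith("o") and fem == masc[:-1] + "a":
        return kept  # regular feminine of the same adjective: drop it
    return m.group(0)  # anything else is left for a human to review


def strip_inflection(line: str) -> str:
    """Remove gender and regular feminine forms from Spanish translations."""
    return PAIR_RE.sub(_simplify, line)
```

Leaving unrecognized pairs untouched keeps the sketch conservative; plural forms (m-p, f-p) and alternative spellings would need the form-to-lemma data the bot already uses.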

Macedonian verbal nouns and adjectival participles

Hello,

Could you please create a list of Macedonian verbs whose verbal nouns and adjectival participles are red links (in the conjugation tables, e.g. the slots occupied by the blue links носен and носење in the table for носи), so that I can create them? I have been creating these forms manually for all verbs but I have probably missed some. Verbs where no verbal noun and/or adjectival participle exists, i.e. where the corresponding slots are filled by a dash rather than a red link, should not be taken into consideration.

Thank you in advance Martin123xyz (talk) 14:10, 6 February 2022 (UTC)[reply]

@Martin123xyz:, My tools mostly use the wikidumps of the page source, so I just see {{mk-conj-и-и|impf|нос}}, and have no way of knowing if it should generate носен and носење or when it would generate a dash. If there's a simple rule like always take the parameter нос + ен or ење, I can make something for you. If it's more complex than that, it might be easier to have someone modify {{mk-conj-и-и}} to generate redlink categories for those slots. JeffDoozan (talk) 16:10, 6 February 2022 (UTC)[reply]
There are unambiguous rules, but they are complex and differ depending on the conjugation template. The generation of verbal nouns and adjectival participles is also regulated by special parameters. I was hoping that your tools could straightforwardly look at the output rather than the source code, but if that is not the case, I think it would be best for me to generate my own lists of red links manually. Thank you anyway. Martin123xyz (talk) 20:39, 6 February 2022 (UTC)[reply]

Small bot request

The category Category:Buddhas wrongly contains some English entries— they should be relocated to Category:en:Buddhas. Thanks in advance! ·~ dictátor·mundꟾ 18:41, 20 March 2022 (UTC)[reply]

@Inqilābī: I don't have a good way of doing that with my bot tools, sorry. JeffDoozan (talk) 18:47, 20 March 2022 (UTC)[reply]
You just need to change {{cat}} to {{c}}, as in diff. Seems quite simple… ·~ dictátor·mundꟾ 18:52, 20 March 2022 (UTC)[reply]
@Inqilābī: So '{{cat|en|Buddhas}}' should be replaced with '{{c|en|Buddhas}}' on these pages? JeffDoozan (talk) 19:01, 20 March 2022 (UTC)[reply]
Yes. ·~ dictátor·mundꟾ 19:02, 20 March 2022 (UTC)[reply]
@Inqilābī: Done. JeffDoozan (talk) 23:10, 20 March 2022 (UTC)[reply]
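For reference, the replacement agreed above can be sketched in a few lines of Python. This is only an illustration of the substitution itself, not the bot's actual code; the function name `cat_to_c` is hypothetical, and it assumes the page source is already available as a string.

```python
import re

# Matches only the opening of {{cat|en|...}}; other templates are left alone.
CAT_EN = re.compile(r"\{\{cat\|en\|")


def cat_to_c(wikitext):
    """Replace '{{cat|en|' with '{{c|en|' wherever it appears (hypothetical helper)."""
    return CAT_EN.sub("{{c|en|", wikitext)
```

Because the pattern requires the pipe right after `cat`, templates such as {{catlangname}} are not touched.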

Der/Rel/Af

Hey, sorry to bring this up again, it seems like some of the pages with {{l}} and {{affix}} were skipped for no reason? I took care of the lists, could you send your bot again? Vininn126 (talk) 11:33, 8 April 2022 (UTC)[reply]

@Vininn126: Done. I also updated the list of skipped pages. JeffDoozan (talk) 17:20, 8 April 2022 (UTC)[reply]
Awesome thank you. I'll do these. Vininn126 (talk) 17:28, 8 April 2022 (UTC)[reply]

Spanish noun and adj forms

Hi JD. I hope your bot can mass-create lots of the forms soon - it'll make finding missing lemmas to create a lot easier! Notusbutthem (talk) 13:24, 15 April 2022 (UTC)[reply]

Rhinos → Rhinoceroses

Hi. Per an RFM discussion, Category:Rhinos is to be moved to Category:Rhinoceroses. In order to do that the former category would have to be emptied; for that, would it be possible for you to change all [[Category:LANG:Rhinos]] to [[Category:LANG:Rhinoceroses]], {{topics|LANG|Rhinos}} to {{topics|LANG|Rhinoceroses}} (similarly for {{c}}, {{C}}, {{top}})? —Svārtava (t/u) • 09:20, 27 April 2022 (UTC)[reply]
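A rough sketch of the rename described in this request, written in Python under the assumption that each page's wikitext is available as a string. The helper name `rename_topic` and the two patterns are illustrative only; they cover raw category links with an optional sort key and the four templates named above, and nothing else.

```python
import re

OLD, NEW = "Rhinos", "Rhinoceroses"

# [[Category:LANG:Rhinos]], optionally followed by a sort key such as |foo
CATEGORY_LINK = re.compile(r"\[\[Category:([^:\]|]+):" + OLD + r"(\|[^\]]*)?\]\]")
# {{topics|LANG|...}}, {{c|LANG|...}}, {{C|LANG|...}}, {{top|LANG|...}}
TOPIC_TEMPLATE = re.compile(r"\{\{(topics|c|C|top)\|([^{}]*)\}\}")


def rename_topic(wikitext):
    """Rename the topic in category links and topic templates (hypothetical helper)."""

    def fix_link(m):
        return "[[Category:%s:%s%s]]" % (m.group(1), NEW, m.group(2) or "")

    def fix_template(m):
        params = m.group(2).split("|")
        # params[0] is the language code; only exact topic matches are renamed.
        params = [params[0]] + [NEW if p.strip() == OLD else p for p in params[1:]]
        return "{{%s|%s}}" % (m.group(1), "|".join(params))

    text = CATEGORY_LINK.sub(fix_link, wikitext)
    return TOPIC_TEMPLATE.sub(fix_template, text)
```

Other topics listed in the same template call are preserved, so {{C|es|Mammals|Rhinos}} keeps its first topic.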

List request

Can you create a list of Spanish lemmas ending in [x]ismo that don't link to [x]ista (where [x] is an identical root) and vice versa? Ideally the -ismos and -istas would be sorted separately. I'll probably request more of these but I wanted to start with a short-ish list. Ultimateria (talk) 04:32, 29 April 2022 (UTC)[reply]

Here are -ismos without -istas and vice versa. I don't have a good way of checking links, so it just checks for any occurrence of the text; let me know if you need a better filter. The lists will update automatically twice a month when a new database dump is available. JeffDoozan (talk) 15:54, 29 April 2022 (UTC)[reply]
@Ultimateria: and here's a ping that I forgot in the above message. JeffDoozan (talk) 15:56, 29 April 2022 (UTC)[reply]
Thanks! I forgot to specify though, I was looking for all existing pages to make connections rather than find redlinks. Could you filter out redlinks? Ultimateria (talk) 16:19, 29 April 2022 (UTC)[reply]
@Ultimateria: Done. JeffDoozan (talk) 19:48, 29 April 2022 (UTC)[reply]
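The check described in this thread (plain text occurrence rather than real link parsing, with redlinks filtered out) might look roughly like the following Python sketch. It assumes a hypothetical in-memory dictionary `pages` mapping each Spanish lemma title to its wikitext; the real lists are built from database dumps, and this is not that code.

```python
def unlinked_pairs(pages):
    """Return (-ismo titles, -ista titles) whose existing partner page is never mentioned.

    `pages` is a hypothetical dict of lemma title -> wikitext.
    """
    ismos, istas = [], []
    for title, text in sorted(pages.items()):
        if title.endswith("ismo"):
            partner = title[:-4] + "ista"
            # Skip redlinks: only report pairs where the partner page exists.
            if partner in pages and partner not in text:
                ismos.append(title)
        elif title.endswith("ista"):
            partner = title[:-4] + "ismo"
            if partner in pages and partner not in text:
                istas.append(title)
    return ismos, istas
```

As noted above, a substring test can produce false negatives when the partner word appears in the text without being linked.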

Missing Ety List

I finally migrated your list over to my own userpage, in case you wanted to remove it from yours! It's something I'm still gonna have to slowly chip away at. Thanks again! Vininn126 (talk) 13:00, 20 June 2022 (UTC)[reply]

bot change to refugiarse -- probably not helpful

Hi. Back in January, your bot made this change, from:

{{es-verb}}

# {{reflexive of|es|refugiar}}

to

{{head|es|verb form}}

# {{es-compound of|refugi|ar|refugiar|se|mood=infinitive}}

I'm not sure why you made this change, but it's not really correct. refugiarse is a pronominal verb meaning "to take shelter"; it's not literally refugiar + -se. (refugiar is a less common transitive verb meaning "to shelter, to provide shelter for".) The current convention is to place pronominal meanings like this on the non-reflexive page, and have the reflexive equivalent use {{reflexive of}}. Benwing2 (talk) 03:27, 22 July 2022 (UTC)[reply]

@Benwing2: At the time of that edit, refugiar didn't have any senses tagged as pronominal or reflexive, so the bot made the change using the best available information. Now that refugiar has a sense tagged as reflexive, the next time the bot runs against a database dump containing the new changes, it will re-edit refugiarse and replace {{es-compound of}} with {{reflexive of}}. JeffDoozan (talk) 14:25, 22 July 2022 (UTC)[reply]

... and friends are excellent lists! Some expressions are so basic that it's shameful we didn't have them until now. I'll duly work through the lists, doing the easy ones first. GreyishWorm (talk) 22:46, 27 September 2022 (UTC)[reply]

@GreyishWorm: User:JeffDoozan/lists/es missing drae is the new hotness. It will get updated automatically, and it has a fancy comment section where you can dump lines that don't need to be fixed and shouldn't be regenerated by the bot. JeffDoozan (talk) 18:53, 29 October 2022 (UTC)[reply]

Section reordering

I have a bot request to move lemma sections above non-lemma sections in Spanish entries (or any other languages you see fit). The worst offenders for Spanish are female equivalent nouns under adjective forms, and adjectives under past participles. Ultimateria (talk) 02:47, 29 September 2022 (UTC)[reply]

I like that idea. In the event that the sections are not alphabetized, should it preserve the original order, or should it sort them? I'm slightly in favor of sorting alphabetically, but there are cases like leche where the non-alphabetic sorting might be intentional. For reference, here's a list of pages with forms before lemmas, and a list of pages where the sections are not alphabetical. JeffDoozan (talk) 15:43, 30 September 2022 (UTC)[reply]
Lemma sections will be more complicated to sort. If possible they should follow the "primary" sense which can be difficult to determine. E.g. In Spanish demonyms, it's clear that the noun senses are extensions of adjective senses. The current format at leche seems ideal to me because the noun is much more common than the interjection, and also more "relevant" (whose exact meaning I can't pin down). In English dive, it's hard to say the verb is much more relevant than the noun, but the verb does predate the noun at least. Unfortunately these might be best handled on a case-by-case basis. Ultimateria (talk) 00:07, 3 October 2022 (UTC)[reply]
@Ultimateria: I could sort by the most frequently used POS for a given entry, so that van#Spanish would sort the verb form before the noun lemma, although it would likely produce inconsistent results for demonyms. Otherwise, I can preserve the lemma ordering and sort all forms alphabetically by POS beneath them. JeffDoozan (talk) 00:37, 3 October 2022 (UTC)[reply]
The latter option sounds better to me. I'm not sure non-lemmas should go above lemmas even when they're more frequent. Ultimateria (talk) 01:39, 3 October 2022 (UTC)[reply]
@Ultimateria: This should be done now. Let me know if you notice anything else that could use some bot work. JeffDoozan (talk) 23:16, 3 October 2022 (UTC)[reply]
Great, thank you! Ultimateria (talk) 01:27, 4 October 2022 (UTC)[reply]
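The ordering settled on in this thread (keep lemma sections in their existing order, then list form sections alphabetically by part of speech beneath them) can be illustrated with a small Python sketch. The `(heading, is_lemma)` tuple representation is an assumption made for the example, not how the bot models pages.

```python
def reorder_sections(sections):
    """Reorder POS sections: lemmas first in original order, then forms A-Z.

    `sections` is a list of (pos_heading, is_lemma) tuples in page order
    (a hypothetical representation for this sketch).
    """
    lemmas = [s for s in sections if s[1]]  # original order preserved
    forms = sorted((s for s in sections if not s[1]), key=lambda s: s[0])
    return lemmas + forms
```

Deciding whether a section is a lemma or a form is the harder part and is not covered here.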

Portuguese headwords

Can you replace the headwords of these past participles to {{pt-pp}}? Ultimateria (talk) 00:52, 6 October 2022 (UTC)[reply]

@Ultimateria: All done. JeffDoozan (talk) 02:08, 6 October 2022 (UTC)[reply]
I undid the change on derroto that looked like an exception. JeffDoozan (talk) 02:10, 6 October 2022 (UTC)[reply]

Latin comp/sup cleanup

Could you apply {{comparative of}} and {{superlative of}} to Latin forms? See diff, where I removed a now redundant link to the lemma in the headword template. Ultimateria (talk) 03:17, 2 November 2022 (UTC)[reply]

@Ultimateria: This looks like it'll need some manual review, here's a dump of the 'comparative of' forms where the bot has removed pos= and generated a line with {{comparative of}} but without removing the existing sense lines. You can delete or edit the lines that you think should be deleted. If there are entries that should be completely unchanged without removing pos= or generating the {{comparative of}} line, delete all the lines (but leave the ____xxx_____ header). JeffDoozan (talk) 02:18, 3 November 2022 (UTC)[reply]
 Done. Ultimateria (talk) 03:41, 4 November 2022 (UTC)[reply]
@Ultimateria: Fixed! Here's the list of 'superlative of' forms. JeffDoozan (talk) 22:46, 4 November 2022 (UTC)[reply]
 Done. Ultimateria (talk) 23:29, 4 November 2022 (UTC)[reply]

Spanish suffixed pronouns vs reflexives

Your bot just did a run of converting [infinitive]la to use {{es-verb form of}}. The problem is that these aren't reflexives, which are the only pronoun-suffixed inflections that the module has been programmed to deal with. That means all of them are in CAT:E. Also pinging @Ultimateria, Benwing2.Chuck Entz (talk) 03:00, 6 November 2022 (UTC)[reply]

Looking further, I see that things are more complicated than that, since most of those have no module errors. Something's seriously wrong, but I no longer have any idea what. Chuck Entz (talk) 03:04, 6 November 2022 (UTC)[reply]
@Chuck Entz {{es-verb form of}} does know about pronoun suffixes like -la, but something like adormilarla is indeed not a form of adormilarse, because the latter is reflexive and the former isn't. I'm not sure which changes you're referring to that don't cause module errors but they might have -sela or something suffixed. Benwing2 (talk) 03:06, 6 November 2022 (UTC)[reply]
Jeff, I would recommend making your bot change messages more descriptive than just 'Spanish: Replaced v forms'. If you look at some changes made by my bot you'll see the change messages are long and indicate exactly what is being done and why. This makes it possible later on to understand why a particular change was made. Benwing2 (talk) 03:11, 6 November 2022 (UTC)[reply]
@Benwing2: Yes, I figured that out after my first post, which prompted my second one. The problem seems to be that some edits are using the bare infinitive and some are using the reflexive one, and I have yet to see any pattern that would explain why. I'm afraid I'm not being much help here... Chuck Entz (talk) 03:16, 6 November 2022 (UTC)[reply]
@Chuck Entz Yeah, the existing mess as shown by examples like this is why I didn't fix up these cases by bot. I'm guessing Jeff's bot is looking to see whether there's an existing non-reflexive verb and picking the reflexive one otherwise. This probably works in most cases but there are almost certainly edge cases where this messes up. Benwing2 (talk) 03:22, 6 November 2022 (UTC)[reply]
Okay, I see now what's going on. As you say, the ones using the reflexive are those where the lemma is reflexive and the infinitive is a form-of entry. I'm not sure why a reflexive verb would even have non-reflexive suffixed forms, but, then, I only got up to second-year high school Spanish, so there's a lot I don't know. Chuck Entz (talk) 03:28, 6 November 2022 (UTC)[reply]
@Chuck Entz, @Benwing2 many thanks to two of my favorite editors for jumping in on this so quickly. As Ben pointed out, this is a case of my bot believing incorrectly that adormilar + la could be a possible form of adormilarse. Since this specific conversion is a new capability of the bot, I did review the first few bot edits manually to ensure that it wasn't generating module errors (a feature of {{es-verb form of}} that I really appreciate, since it surfaces corner cases like this) before stopping the run for unrelated errors. Unfortunately, as Chuck noticed, the first few edits didn't include these reflexive-only verbs and I didn't check CAT:E afterwards so I missed this.
While I fix the bot to not generate incorrect forms of reflexive verbs, what should I do with the pages for non-existent forms that still appear in CAT:E, revert them or flag them for speedy delete? JeffDoozan (talk) 13:18, 6 November 2022 (UTC)[reply]
My first inclination would be to do both: revert the bot edit, then tag them so the information isn't lost. I notice that these seem to be all created in a run by NadandoBot back in 2018, so we should probably get @DTLHS' opinion on this. I'm reluctant to just delete them based on my limited knowledge of Spanish. Chuck Entz (talk) 16:09, 6 November 2022 (UTC)[reply]
Okay, they've all been reverted. For DTLHS' reference, since the pages are no longer in CAT:E, here's a list of the affected pages aborregarla, abstenerla, achaparrarla, achicopalarla, adentrarla, adormilarla, adueñarla, agermanarla. If we decide to delete these, I can generate a list of all the similar pages that match the reflexive-only-infinitive + la pattern. JeffDoozan (talk)

Bot is removing categories

E.g. https://en.wiktionary.org/w/index.php?title=aband%C3%B3nenos&oldid=64931347 versus https://en.wiktionary.org/w/index.php?title=aband%C3%B3nenos&oldid=69687428. The latter is lacking Category:Spanish forms of verbs ending in -ar and Category:Spanish combined forms, as well as language about. Is this on purpose? Is there some consensus to remove these? —Justin (koavf)TCM 23:43, 7 November 2022 (UTC)[reply]

@Koavf: The bot is replacing {{es-compound of}}, which is difficult for human editors to use and contains no way to validate that the form is authentic, with {{es-verb form of}}, which automates all of the hard work and validates that the form actually matches the verb's conjugation (see also: previous thread on this page). Any category changes reflect the behavior of the new template and not a deliberate decision to change categorization. If those categories happen to be useful or important (I would argue they're not), it would be possible to adjust {{es-verb form of}} to generate the equivalent categories. JeffDoozan (talk) 00:02, 8 November 2022 (UTC)[reply]

Pali Usage Notes and Declensions

When you swap usage notes and declension, the wording in the usage notes may need to be reworded. For example, in နဝမ (navama), after the per-gender feminine stems have been listed as part of the declension section, it makes sense to explain, "နဝမဳ (navamī) is the Mon writing of the feminine", but before presenting the declension, one should rather say "The Mon writing of the feminine is နဝမဳ (navamī)". Perhaps the explanation should be in free form writing in the declension section; the design of notes is taking a long time as it is not easy. --RichardW57 (talk) 17:01, 4 December 2022 (UTC)[reply]

Hi @RichardW57, Forgive my lack of knowledge of Pali, but is that note something that could be automatically generated by {{pi-decl-noun}} whenever it has "gender" and "stem" parameters? If so, it's probably better to adjust the template rather than manually creating and maintaining Usage notes. JeffDoozan (talk) 17:09, 4 December 2022 (UTC)[reply]
That's the aspiration, but it's especially complicated with adjectives, which are currently just built by fishing together noun declension templates. In this case, I'd ideally detect that the stem is also used in Burmese Burmese script Pali, though I think that detection should probably be done manually, as with the two Thai script Pali writing systems, and the multitudinous Lao script Pali writing systems. In a parallel case, the neuter singular of the relative pronoun ya and demonstrative pronoun/adjective ta (total of 3 lemmas per writing system) needs a tailored note for each of the eleven or twelve basic writing systems for a total of 33-36 forms. --RichardW57 (talk) 11:55, 5 December 2022 (UTC)[reply]
That sounds pretty far out of my league. I see there are about 130 Pali entries with a Usage notes section and I'm happy to help get them adjusted to your liking. I see three options, feel free to suggest another:
  • If you think they all need a minor rewording for their new position above Declension, I can extract them to one page where you could edit them all at once.
  • If you prefer the freeform entry, I could extract the existing text and place it at the bottom of the Declension.
  • I can make them child sections of the Declension section and adjust my bot to leave them alone. JeffDoozan (talk) 01:44, 6 December 2022 (UTC)[reply]
@RichardW57 I forgot to ping you on the earlier reply. JeffDoozan (talk) 14:35, 6 December 2022 (UTC)[reply]
To my pleasant surprise, the other cases of my being flexible seem to have survived the transposition. There was just enough introduction, so we can leave them in the preferred, not mandatory order. Are you reviewing the changes, or just assuming that the commutator is trivial? --RichardW57 (talk) 14:57, 6 December 2022 (UTC)[reply]
@RichardW57: I wrote the bot very carefully to only operate where there is no ambiguity and I reviewed the proposed bot edits before running it to verify that it operates as expected. I have not reviewed the text of each individual page for logical consistency. JeffDoozan (talk) 15:16, 6 December 2022 (UTC)[reply]

Marking AutoDooz's edits as minor?

Hi, I was wondering if you could mark AutoDooz's formatting clean-up edits as minor. I'm getting a lot of emails because many of the entries are on my watchlist. — justin(r)leung (t...) | c=› } 00:26, 6 December 2022 (UTC)[reply]

@Justinrleung:  Done. My apologies, I thought the bot flag was enough to suppress the notifications, I had no idea I was spamming all of Wikipedia. JeffDoozan (talk) 01:16, 6 December 2022 (UTC)[reply]
Thanks, and I appreciate you running the bot! — justin(r)leung (t...) | c=› } 03:20, 6 December 2022 (UTC)[reply]
@Justinrleung But note that swapping sections round without reviewing the text is not minor. And yes, I too have had a flood of notifications, with, as it transpires, just one bad edit unless one considers draft policy violations. --RichardW57 (talk) 15:01, 6 December 2022 (UTC)[reply]
@RichardW57: I can see what you mean. I would think they're minor enough (like relatively uncontroversial and no content is being changed) to be marked as minor if it's run by a bot, but if you think otherwise, I am not too opposed to taking that off (and letting the notifications flood in again). — justin(r)leung (t...) | c=› } 15:08, 6 December 2022 (UTC)[reply]
@justinrleung: Most sections' content is fairly mechanical. However, 'usage notes' contain a greater element of composition, and could easily include 'above' and 'below' to reference other material. I don't know if it's a useful compromise, but perhaps transpositions should be considered minor unless 'usage notes' or 'trivia' are transposed. I would say that in general, transpositions are more likely to be minor if done by a human rather than a bot; one can hope for greater checking of subtle effects from a human. --RichardW57 (talk) 23:37, 6 December 2022 (UTC)[reply]
I've just read (reread?) Wiktionary:Votes/pl-2015-12/Usage notes. It would seem that the usage notes following the Pali declension tables were in the right order, but perhaps should have been subordinated to the inflection section, as suggested by JeffDoozan above. They're still close, but not afterwards as formally required. Unfortunately, L5 headings do not suggest their status outside tables of contents even when they are at level 5, let alone level 6! I think I should raise this in the Beer Parlour. --RichardW57 (talk) 23:37, 6 December 2022 (UTC)[reply]
Scrub that. The positioning clause was scrubbed in 2015. Usage notes are no longer required to follow what they talk about. The only defence for having them after the declension section is 'flexibility'. --RichardW57 (talk) 23:47, 6 December 2022 (UTC)[reply]

Missing female eqs and unexpected eqs

Would it be possible to make a list of Spanish nouns with any definition marked as "m. y f." in the RAE that is not marked as m/f here and also has no f= parameter in the headword? And on a related note, could you make a list of Spanish nouns whose f= parameter is not the default output for the pagename? I've come across a few errors that were mistyped before the template overhaul. Ultimateria (talk) 03:15, 14 December 2022 (UTC)[reply]

@Ultimateria: Here are not just mf, but all DRAE gender mismatches, and overridden feminine forms. JeffDoozan (talk) 23:42, 14 December 2022 (UTC)[reply]
@Ultimateria: I improved the mismatch detection on DRAE gender mismatches; it now lists any Spanish nouns on Wiktionary that don't match any of the genders of the corresponding DRAE entries. Hopefully this makes it easier to identify any mistakenly gendered nouns on Wiktionary. JeffDoozan (talk) 15:28, 15 April 2023 (UTC)[reply]
Great, thank you! I did sort of abandon this when I saw the number of false positives. Ultimateria (talk) 19:37, 15 April 2023 (UTC)[reply]

m= cleanup

Can you remove |m= parameters from Spanish nouns whose only definition is just {{female equivalent of|es|X}}? I'll review the ones with glosses inside or outside the template, and any with additional senses.

This is a holdover from the old acceleration code that has since been removed because of redundancy between the headword and definition lines, but should probably be included in special cases (most of which already have glosses). Ultimateria (talk) 05:06, 10 January 2023 (UTC)[reply]

@Ultimateria:  Done, here are the items with glosses or additional senses. JeffDoozan (talk) 15:49, 10 January 2023 (UTC)[reply]
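As an illustration of the rule in this request, here is a minimal Python sketch that strips |m= only when the entry's single definition is a bare {{female equivalent of|es|X}}. The function name and the split into a headword string plus a list of definition lines are assumptions for the example; anything with a gloss inside or outside the template, or with additional senses, is returned unchanged for manual review.

```python
import re

# A bare {{female equivalent of|es|X}} with no extra parameters or glosses.
ONLY_FEMALE_EQ = re.compile(r"\{\{female equivalent of\|es\|[^|{}]+\}\}")
# The |m=... parameter inside a headword template.
M_PARAM = re.compile(r"\|m=[^|{}]*")


def strip_m_param(headword, definitions):
    """Remove |m= from `headword` if the only definition is a bare female-equivalent line."""
    if len(definitions) == 1:
        body = definitions[0].lstrip("# ").strip()
        if ONLY_FEMALE_EQ.fullmatch(body):
            return M_PARAM.sub("", headword)
    return headword
```

Using `fullmatch` is what keeps lines with trailing glosses out of the automatic cleanup.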

Cleanup error

[1] This edit was a mistake because the last further reading section applied to both etymologies, whereas the ones nested below the etymology only applied to their respective senses. 70.172.194.25 03:03, 22 January 2023 (UTC)[reply]

You're right, thank you. I'll fix the bot and find and fix any affected pages. JeffDoozan (talk) 14:21, 22 January 2023 (UTC)[reply]

Whitespace changes error

Hello! Check this out. It looks like a mistake to me. Gorec (talk) 18:50, 3 February 2023 (UTC)[reply]

@Горец: It is a mistake (and easy to fix, too), thank you! JeffDoozan (talk) 18:54, 3 February 2023 (UTC)[reply]

Needs deleting. I think you muddled it with surtir efecto :) Half Norwegian Dude (talk) 12:50, 19 February 2023 (UTC)[reply]

Yes, my mistake. Good catch! JeffDoozan (talk) 14:08, 19 February 2023 (UTC)[reply]

Well, Jeff, you keep having to clean up my etymology-heading messes. I do appreciate your graciousness in helping me learn – albeit ever so slowly, and mostly by trial, error, and reversion – the subtleties of Wiki layout. 8-) – HelpMyUnbelief (talk) 21:48, 21 February 2023 (UTC)[reply]

@HelpMyUnbelief: No worries, keep up the good work! JeffDoozan (talk) 22:06, 21 February 2023 (UTC)[reply]

Bot messing up Akkadian Cuneiform/Logogram entries.

Hi! I've noticed that your bot is changing a lot of Akkadian Cuneiform/Logogram entries reverting the order of the "Sign values" and "Etymology" sections. The correct order is:

  1. Sign values
  2. Etymology
  3. Logogram

Would you be able to fix all those entries that now appear in the wrong order Etymology > Sign values > Logogram? Thank you! — Sartma 𒁾𒁉𒊭 𒌑𒊑𒀉𒁲 12:30, 4 March 2023 (UTC)[reply]

@Sartma:  Done Thanks for the clarification. I remember looking at some Akkadian entries when I was trying to figure out originally where "Sign values" should go, but I guess I looked at the wrong entries or just messed it up when I coded the order list! They're now all correctly sorted Sign Values > Etymology > Logogram. JeffDoozan (talk) 15:25, 4 March 2023 (UTC)[reply]
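The "order list" mentioned above can be pictured as a simple ranked list of headings. The Python sketch below is only a guess at the general shape, with a hypothetical `SECTION_ORDER` containing just the three headings from this thread; headings not in the list are pushed to the end in their existing relative order.

```python
# Hypothetical order list for Akkadian cuneiform entries, per this thread.
SECTION_ORDER = ["Sign values", "Etymology", "Logogram"]


def sort_sections(headings):
    """Sort headings by their rank in SECTION_ORDER; unknown headings go last."""
    rank = {name: i for i, name in enumerate(SECTION_ORDER)}
    # sorted() is stable, so headings with the same rank keep their page order.
    return sorted(headings, key=lambda h: rank.get(h, len(SECTION_ORDER)))
```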

Una pregunta acerca de encespedar

And we plunge from the erudite heights of Akkadian cuneiform back down to something on my level...your AutoDooz bot moved the Synonyms section for this (short and simple-structured) entry from its own section to a hideable line under the (single) definition. For my future reference – is the template the preferred format now even for entries whose definitions don't have multiple senses? – HelpMyUnbelief (talk) 09:06, 8 March 2023 (UTC)[reply]

@HelpMyUnbelief: When editing Spanish always use {{syn}}. For other languages, use {{syn}} unless the entry already has a separate Synonyms section in which case you should use {{sense}} to specify the exact sense the synonym applies to (see: medicine) JeffDoozan (talk) 15:11, 8 March 2023 (UTC)[reply]

Spanish noun and adj forms

Hi JD. I hope your bot can mass-create lots of the forms soon - it'll make finding missing lemmas to create a lot easier! Van Man Fan (talk) 23:25, 21 March 2023 (UTC)[reply]

@Van Man Fan: I just had it add forms for the most common 250k words in the google ngrams corpus (we were only missing 700!) so anything that's still a redlink is either a form of a lemma we don't have or a form that's used infrequently in the ngram corpus. If you're working through a specific list of words, let me know and I can have the bot clear out those forms for you. JeffDoozan (talk) 01:06, 23 March 2023 (UTC)[reply]
Thanks! How about those in User:DTLHS/eswikipedia and its subpages? I'm currently plodding through User:DTLHS/eswikipedia/7 Van Man Fan (talk) 08:13, 23 March 2023 (UTC)[reply]
@Van Man Fan:  Done through /8. Pages that were created with just forms but that should have lemmas will show up on User:JeffDoozan/lists/es_forms_with_drae_lemmata on April 1. JeffDoozan (talk) 19:27, 23 March 2023 (UTC)[reply]
Excellent teamwork :) If I have any quibbles, it is that Autodooz is missing a trick by creating yaruros but not yarura or yarura. Might as well get the whole set, right? Van Man Fan (talk) 21:33, 23 March 2023 (UTC)[reply]
Can you do all the subpages? Van Man Fan (talk) 19:26, 29 March 2023 (UTC)[reply]
I'll do everything with at least 7 uses, which will go through the middle of /10. If you make it past there with the lemmas, I can keep going. Note that User:JeffDoozan/lists/es_missing_drae only includes words with > 5000 uses in the Ngram corpus and I can always lower that limit if you want more words to work on there. My hunch is that it'll probably surface more interesting words than those with only 6 uses on Spanish Wikipedia. JeffDoozan (talk) 23:06, 29 March 2023 (UTC)[reply]
Love you, Jeff. That should probably be enough for now. It's true, most of the es.wikipedia words are pretty boring these days. Van Man Fan (talk) 02:12, 30 March 2023 (UTC)[reply]

Please check. Chuck Entz (talk) 16:35, 24 March 2023 (UTC)[reply]

Easy bot job

Hey, the Czech updates got me thinking, could we switch all instances of {{col3}} on Polish and Old Polish lemmas to {{col-auto}}? Vininn126 (talk) 13:02, 27 March 2023 (UTC)[reply]

@Vininn126: I can do that, do you want to change just {{col3}} or {{col2}}-{{col5}}? JeffDoozan (talk) 15:33, 27 March 2023 (UTC)[reply]
If those exist, then those should be changed too. {{col-auto}} is best for readers. Vininn126 (talk) 15:37, 27 March 2023 (UTC)[reply]
@Vininn126: Okay, it's running now, should be done in an hour or so. JeffDoozan (talk) 16:00, 27 March 2023 (UTC)[reply]
I'm cleaning up the Czech col-auto's, I think the "Multiple l's" section could easily be converted, looking at it. Convert all the l's into a single col-auto, and if it has |g=, I say just get rid of it. The other two remaining sections I think will require more attention. Vininn126 (talk) 16:36, 27 March 2023 (UTC)[reply]
I think we can reduce a lot of the text_outside_template section, because a lot of it is "}} se". I think we can safely delete the se (it's a reflexive particle and those aren't usually saved in the derived terms sections). Vininn126 (talk) 16:43, 27 March 2023 (UTC)[reply]
@Vininn126: I fixed up the pages with the "se" text, but I think I'll leave the "Multiple l's" for manual cleanup because there aren't a ton of them and I can foresee some corner cases that would take longer to code exceptions for than to manually edit. In case it's easier for you as an editor, you can just clean up the errors the bot's complaining about rather than converting everything by hand to {{col-auto}} and then I can run the bot again to do the conversion. Also, I didn't notice that you had edited the error page so your changes just got overridden when I updated it to exclude the "se" pages. Sorry! JeffDoozan (talk) 17:55, 27 March 2023 (UTC)[reply]
Oh, sorry! And thanks for the help. I can slowly work on the remainders. Vininn126 (talk) 18:03, 27 March 2023 (UTC)[reply]
I have gone through the list. Thanks for the help! Vininn126 (talk) 00:12, 28 March 2023 (UTC)[reply]
Could you do the same for Maltese entries? There are probably some leftovers. Thanks in advance. — Fenakhay (حيطي · مساهماتي) 15:52, 11 April 2023 (UTC)[reply]
@JeffDoozan There are some Old Polish pages without col-auto, could you have your bot make changes like this? diff (You can ignore the descendant section change). (No need for another language at the moment). Please and thank you! Vininn126 (talk) 10:27, 29 June 2023 (UTC)[reply]
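For the straightforward part of this job, switching {{col2}} through {{col5}} to {{col-auto}}, a Python sketch could be as small as the one below. It assumes the template call is already well formed; the cases discussed above (multiple {{l}} lines, text outside the template) are not handled and would still need the manual cleanup described in the thread. The function name is hypothetical.

```python
import re

# Opening of {{col2|...}} through {{col5|...}}; {{col-auto}} itself does not match.
COL_N = re.compile(r"\{\{col[2-5]\|")


def to_col_auto(wikitext):
    """Rewrite fixed-width column templates to {{col-auto}} (hypothetical helper)."""
    return COL_N.sub("{{col-auto|", wikitext)
```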

Another task

Man, I'm really sorry for these 3 quick-fire messages. It's like the time when I learned that Erutuon (talkcontribs) could make a list of pretty much anything, so I got them to make increasingly random ones, like the most reverted editor, all the Wonderfool accounts, the most thanked and thanking editors, and which editors stayed up until ridiculous times in the morning (it was basically just Equinox...) Van Man Fan (talk) 19:30, 29 March 2023 (UTC)[reply]

The request is a French version of User:JeffDoozan/lists/es missing drae. User:JeffDoozan/lists/fr missing AF Van Man Fan (talk) 19:31, 29 March 2023 (UTC)[reply]
I can make a list of missing lemmas pretty easily. Sorting it by frequency of use is a little trickier, but let me see what I can do. JeffDoozan (talk) 23:07, 29 March 2023 (UTC)[reply]

TLFi

Hi, {{R:TLFi}} was renamed to {{R:fr:TLFi}} some months ago, and I am currently doing a bot run to rewrite all the old usages, so you should update your bot script that adds these if you haven't already. Benwing2 (talk) 04:30, 2 April 2023 (UTC)[reply]

@Benwing2: Thanks, I hadn't noticed the change. I'll update the bot accordingly. JeffDoozan (talk) 14:18, 2 April 2023 (UTC)[reply]

Silesian Mantel

Why did you just delete that? Vininn126 (talk) 21:18, 2 April 2023 (UTC)[reply]

Nevermind, it was duped for some reason, sorry for the bother! Vininn126 (talk) 21:19, 2 April 2023 (UTC)[reply]

cite-text

Hi, your bot has converted 1500+ bare references to {{cite-text}}, which does not exist, so these instances are currently displayed as red links to the template (Special:WhatLinksHere/Template:cite-text). Could you please fix this? Thanks, Einstein2 (talk) 20:33, 23 April 2023 (UTC)[reply]

@Einstein2: Thanks! Temporarily fixed by redirecting Template:cite-text to Template:cite-book until I can get the bot to rename them to the appropriate citation templates. JeffDoozan (talk) 20:39, 23 April 2023 (UTC)[reply]

Bot job request

Could you run a quick bot job to replace stacked images with the {{multiple images}} template?

For example, replace

[[File:something1|thumb|something2]]
[[File:something3|thumb|something4]]
[[File:something5|thumb|something6]]

with

{{multiple images
|direction = vertical
|image1 = something1
|caption1 = something2
|image2 = something3
|caption2 = something4
|image3 = something5
|caption3 = something6
}}

The template both looks cleaner and stops the page layout from getting messed up on the mobile site. Thank you! Ioaxxere (talk) 22:32, 23 April 2023 (UTC)[reply]

@Ioaxxere: I assume it should also handle "Image" as an alias of "File" and "thumbnail" as an alternative of "thumb". What about the extended image syntax parameters, what should it do in the event of
[[Image:MENISCAS 180.jpg|thumb|right|An optical telescope.]]
[[File:Telescope render.jpg|thumb|Telescope render]]

or

[[File:Finger = 4 open.JPG|thumb|right|200px|A human hand, showing its four fingers and thumb.]]
[[File:Mallet Xray anterior.png|thumb|right|200px|An X-ray of human fingers.]]
[[File:Fishfinger1.jpg|thumb|200px|Fish fingers.]]
? I'm guessing it's okay to drop any existing values for Border, Location, Alignment, Size, Page and Langtag and pass along Link, Alt, Caption JeffDoozan (talk) 20:16, 24 April 2023 (UTC)[reply]
In those two cases, the "200px" and "right" parameters are redundant since those are the default values anyway. If you encounter some other alignment or size, I guess it would need a manual check but I doubt that nonstandard values are ever necessary. Ioaxxere (talk) 22:18, 26 April 2023 (UTC)[reply]
Also: using {{multiple images}} with a single image when that image is directly below a {{wikipedia}} template fixes a visual issue on mobile (example). This is a bit of a band-aid solution, but it would be great if you could do that as well. Ioaxxere (talk) 23:01, 26 April 2023 (UTC)[reply]
@Ioaxxere: I really don't like the idea of using a template called {{multiple images}} to wrap a single image, I'd prefer something like {{images}} or {{col-img}}. Are there other templates for wrapping images that should be considered? JeffDoozan (talk) 13:21, 25 July 2023 (UTC)[reply]
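To make the proposal in this thread concrete, here is a Python sketch of the conversion as discussed so far: "Image" is accepted as an alias of "File", "thumbnail" as an alias of "thumb", the default "right" and "200px" options are dropped, and any other alignment or size raises an error so the page can be checked by hand. The function name and the exact option lists are assumptions for the example; link= and alt= options are not split out and would end up in the caption text.

```python
import re

IMAGE_LINE = re.compile(r"\[\[(?:File|Image):([^|\]]+)\|(.*)\]\]\s*$")
# Options treated as defaults and silently dropped, per the discussion above.
DEFAULTS = {"thumb", "thumbnail", "right", "200px"}
# Any other alignment or size is flagged for a manual check.
NONSTANDARD = re.compile(r"^(left|center|none|\d+px)$")


def stack_images(lines):
    """Convert consecutive image lines into one {{multiple images}} call."""
    out = ["{{multiple images", "|direction = vertical"]
    for n, line in enumerate(lines, start=1):
        m = IMAGE_LINE.match(line.strip())
        if m is None:
            raise ValueError("not an image line: " + line)
        caption_parts = []
        for option in m.group(2).split("|"):
            if option.strip() in DEFAULTS:
                continue
            if NONSTANDARD.match(option.strip()):
                raise ValueError("needs a manual check: " + line)
            # Assumption: whatever is left over is the caption.
            caption_parts.append(option)
        out.append("|image%d = %s" % (n, m.group(1)))
        out.append("|caption%d = %s" % (n, "|".join(caption_parts)))
    out.append("}}")
    return "\n".join(out)
```

Captions that contain piped links survive because the leftover pieces are rejoined with "|".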

Templating quotes

Hey. Thanks for these guys. Are you aiming to templatize all the quotes on en.wikt? That obviously would be freaking awesome Wonderfool April 2023 (talk) 20:59, 24 April 2023 (UTC)[reply]

@Wonderfool April 2023: I'm aiming to templatize as much as can be reasonably automated but there are still over 50k untemplated quotes outstanding each seemingly using its own subtle format variation. JeffDoozan (talk) 21:07, 24 April 2023 (UTC)[reply]

Fake authors

Hi. Thanks for catching this one. I don't remember adding it, so I was probably drunk at the time Wonderfool69 (talk) 19:26, 26 May 2023 (UTC)[reply]

We are down to 49 entries in this after I deleted all the gerund+object forms of reflexive-only verbs. Can you help classify the remainder so we can figure out what to do with them? Some of them appear to be familiar imperative + se, which should maybe be deleted; some are third-person singular + se, which are archaic; etc. I am not super familiar with which of these combinations are allowed, which are disallowed, and which are archaic, maybe you have more experience here. Thanks! Benwing2 (talk) 05:23, 22 June 2023 (UTC)[reply]

@Benwing2: I finally had a chance to look through that category, it seems like half of them can be deleted and the rest look like some kind of bug in {{es-verb form of}} JeffDoozan (talk) 19:38, 10 July 2023 (UTC)[reply]

These look like errors to me:

acercándoseme - DELETE, se+me not a valid clitic combination
acuérdase - DELETE, 2nd person imperative + se is not valid
agrégase - DELETE, 2nd person imperative + se is not valid
apropríate - DELETE - should be aprópriate
apropríense - DELETE - should be aprópriense
apropríese - DELETE - should be aprópriese
apártase - DELETE, 2nd person imperative + se is not valid
avéntate - DELETE, should be aviéntate
avéntense - DELETE, 3rd person plural imperative + se is not valid
avéntese - DELETE, should be aviéntese
declárase - DELETE, 2nd person imperative + se is not valid
descúbrese - DELETE, 2nd person imperative + se is not valid
desházlo - DELETE, should be deshazlo
diviértese - DELETE, 2nd person imperative + se is not valid
déme - DELETE, should be "deme" without tilde
dése - DELETE, should be "dese" without tilde
encuéntrase - DELETE, 2nd person imperative + se is not valid
explícase - DELETE, 2nd person imperative + se is not valid
habiéndoseme - DELETE, se+me not a possible clitic combination
inclínase - DELETE, 2nd person imperative + se is not valid
levántase - DELETE, 2nd person imperative + se is not valid
pelete - DELETE, should be pélete
permítaseme - DELETE, se + me is not a valid clitic combination
permítasenos - DELETE, se + nos is not a valid clitic combination
preguntádselo - DELETE, should be pregúntadselo
sítuate - DELETE, should be sitúate
úsase - DELETE, 2nd person imperative + se is not valid

These seem to be unhandled by {{es-verb form of}}

hubiérale - UNHANDLED, hubiera (3p subjunctive) combined with a clitic (it would have)
hubiérase - UNHANDLED, hubiera (3p subjunctive) combined with a clitic (you would have)
manténlo - UNHANDLED - the tilde is nonstandard (normally mántenlo) and I don't think there's any way to override a combined slot

These look like valid forms to me, but generate errors with {{es-verb form of}}

apriétaselo - BUG - this should work
dásela - BUG
dáselo - BUG
díselo - BUG
encárgaselo - BUG
llévaos - BUG
llévaselo - BUG
llévente - BUG
llévete - BUG
pedídselo - BUG
pregúntaselo - BUG
págasela - BUG
págaselo - BUG
páguete - BUG
quítaselo - BUG
revuélvese - BUG
Thank you! I will take a look at the bugs and delete the bad forms. Benwing2 (talk) 21:03, 10 July 2023 (UTC)[reply]

Disable Etymology ↔ Pronunciation flip feature in AutoDooz bot on Indic language pages

Hello. I noticed that the AutoDooz bot is flipping the Etymology and Pronunciation sections on Indic language pages (e.g., Telugu language entries like this one). Is there a way to disable this? While it makes sense to have Pronunciation sections under Etymology sections for some languages that don't necessarily have phonemic orthographies (e.g., English), where the different etymology might inform the pronunciation differently, this is not the case for basically any Indic language, which are highly (if not perfectly) phonemic. As such, having the Pronunciation section below the Etymology section does not make much sense. This is especially the case where a word might have multiple etymologies, but only one pronunciation. Thanks. Getsnoopy (talk) 20:55, 1 July 2023 (UTC)[reply]

The time to complain is when Level 3 'Pronunciation', 'Etymology 1' gets flipped. Does this happen? I think not, as leave follows that structure. --RichardW57 (talk) 08:41, 2 July 2023 (UTC)[reply]
I get the impression that nuktas are frequently omitted, so words of Moslem origin may well not be homophonous with native homographs. Tatsamas may also follow different rules to native words. Finally, the schwa deletion rule in Hindi seems to work differently for uninflected and inflected words - I've found a Hindi vocative plural assigned different transliterations inside an inflection table and as a noun form. --RichardW57 (talk) 08:41, 2 July 2023 (UTC)[reply]
I haven't noticed that it does that.
As for the nuqtas, while that is true that Hindi has some quirks about the manifestation of the pronunciation of some words, it's a rare case. It isn't true for most of the cases, and even in those cases, the convention for the Perso-Arabic senses of those words is to list the pronunciation that would result from the nuqta being there first regardless.
Tatsamas and native words wouldn't normally be spelled the same way (in all the cases I can think of at the moment), so that wouldn't apply here. Getsnoopy (talk) 18:13, 2 July 2023 (UTC)[reply]
@Getsnoopy: Hi! I think the bot should already handle the corner cases you're worried about. For the entry you mentioned, where there's only a single etymology section and a single pronunciation section, the bot enforces the Etymology-first order specified in WT:ELE, which may not be ideal for the reasons you mentioned but probably won't misinform a reader. For other entries that have a single Pronunciation section shared by multiple Etymologies, the bot follows the guidelines in WT:ELE and maintains the L3 Pronunciation above the multiple L3 Etymologies. For more complex situations, the bot should do a pretty good job of respecting the relationship between multiple etymologies and pronunciations, so you can have a page with L3 "Etymology 1" with two nested L4 "Pronunciation 1" and L4 "Pronunciation 2" subsections, then L3 "Etymology 2" with its own L4 "Pronunciation". If it finds a page with sections "Etymology 1", "Etymology 2", "Pronunciation" in that order all at level 3, it will not do any sorting. JeffDoozan (talk) 13:17, 2 July 2023 (UTC)[reply]
Well the problem is that it then becomes inconsistent with the Pronunciation section being above the Etymology section(s) in some cases, but not others. WT:ELE says that the order is not a rigid set of rules, and I think it makes sense to make an exception in the Indic languages case because of what I mentioned above. Getsnoopy (talk) 18:16, 2 July 2023 (UTC)[reply]
@Getsnoopy: I agree it's weird, but right now it's consistently inconsistent across all entries in all languages. I'm happy to adjust the bot to sort Pronunciation before Etymology for specific languages if there's community support for the languages that would be affected. JeffDoozan (talk) 18:35, 2 July 2023 (UTC)[reply]
I understand; I guess petitioning to get the standard order changed on WT:ELE is another task entirely. Let's do it for Telugu, at least. I just realized I'm only one of two people in the Telugu workgroup, and am the one who has been active in the last year. Getsnoopy (talk) 18:51, 2 July 2023 (UTC)[reply]
@Getsnoopy Etymology always goes before Pronunciation unless there are multiple etymologies. I don't think there should be lang-specific exceptions. Benwing2 (talk) 22:22, 8 July 2023 (UTC)[reply]
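
The level-3 ordering rules described in this thread can be sketched roughly as follows. This is an illustration in Python under simplifying assumptions, not the bot's real logic: it looks only at level-3 titles within one language entry, ignores nested L4 sections, and the function name reorder_l3 is hypothetical.

```python
import re

NUMBERED_ETYMOLOGY = re.compile(r"^Etymology \d+$")


def reorder_l3(titles):
    """titles: level-3 section titles of one language entry, in page order."""
    if titles.count("Pronunciation") != 1:
        return list(titles)  # nothing to do, or too unusual to touch

    if any(NUMBERED_ETYMOLOGY.match(t) for t in titles):
        # A shared Pronunciation above "Etymology 1", "Etymology 2", ... stays
        # where it is; one that follows them is left unsorted as well.
        return list(titles)

    if titles.count("Etymology") != 1:
        return list(titles)

    if titles.index("Pronunciation") > titles.index("Etymology"):
        return list(titles)  # already Etymology-first

    # Single Etymology and single Pronunciation: put Pronunciation right
    # after Etymology and keep every other section in its original order.
    rest = [t for t in titles if t != "Pronunciation"]
    i = rest.index("Etymology")
    return rest[: i + 1] + ["Pronunciation"] + rest[i + 1 :]
```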

Etymology Misplacement

AutoDooz misreordered Level 3 heading sequence Letter, Etymology, Adjective to Etymology, Letter, Adjective in Special:diff/70859276. I've fixed it manually, but there are several unimplemented red flags for that combination that should suggest the need for manual intervention. --RichardW57 (talk) 08:16, 2 July 2023 (UTC)[reply]

Thanks for the heads up, I noticed that it did something similar on 𑄃𑄖𑄴𑄗 which it also should have classified as "needs manual intervention". I'll get it fixed and check for any other bad edits. JeffDoozan (talk) 12:53, 2 July 2023 (UTC)[reply]
I've only just noticed a wetware bug there, so there might be other cases. When I clone a Pali section from Roman script to another script, I strip out the etymologies and viciously collapse the senses to |t=, but of course I should only strip out the immediate contents below numbered etymology headers and leave the headers, or strip them out and promote the contents. With Pali, I generally don't need to worry about pronunciation sections because most editors know better than to try to cover the deeply divergent pronunciations. --RichardW57 (talk) 14:15, 2 July 2023 (UTC)[reply]

I haven't figured out why, but whatever your bot is doing to this page is wrong. My best guess is that {{quote-text}} is interpreting named parameters it doesn't recognize as positional parameters, and |year= is the first positional parameter after the language code. At any rate, your bot seems to be using the wrong parameter names for this particular template. Chuck Entz (talk) 21:28, 2 July 2023 (UTC)[reply]

@Chuck Entz: Thanks! It looks like the url has unescaped | characters in it and the template is interpreting part of the URL as its own parameters. The bot does a sanity check of each of the template parameters to make sure they don't contain |, but for some reason it's letting that one slip through. I'll track down the problem and hopefully you won't have to fight against the bot any more. JeffDoozan (talk) 21:39, 2 July 2023 (UTC)[reply]

Pronunciation 1

Hey I noticed you added a Pronunciation 1 header on rišti. IMO generally these should use Etymology 1, Etymology 2, etc. unless there's a good reason not to. Here, the non-lemma form does not have the same etymology as the lemma form even though it may ultimately be a non-lemma form of the same lemma. Benwing2 (talk) 22:24, 8 July 2023 (UTC)[reply]

@Benwing2: You're right. I think I was on a "cleanup {{IPA}} outside a Pronunciation section" run when I made that change and I didn't look too closely at the rest of the page. Thanks for fixing it. JeffDoozan (talk) 14:14, 10 July 2023 (UTC)[reply]

Quote translations

If you ever run that bot again, can I please request that you don't ever consider a line in a quote to be a translation unless it is intended further than the text itself? This isn't acceptable. — SURJECTION / T / C / L / 10:33, 17 July 2023 (UTC)[reply]

@Surjection: Luckily that was a very short-lived bug, the bot's much smarter at handling multi-line quotes and translations now. I thought I had cleaned up all of the bad edits but clearly I missed that page. Thank you for finding and fixing it. JeffDoozan (talk) 13:00, 17 July 2023 (UTC)[reply]
That's not an outlier. I've had to fix a few dozen of pages like that. — SURJECTION / T / C / L / 13:02, 17 July 2023 (UTC)[reply]
@Surjection: "Indented"? Chuck Entz (talk) 13:06, 17 July 2023 (UTC)[reply]
Yes, "indented", my mind was probably a bit elsewhere while I was still typing it. — SURJECTION / T / C / L / 13:07, 17 July 2023 (UTC)[reply]
@Surjection: Okay, I'll do another sweep to check for similar bad edits. If you run into future bad bot edits, feel free to ping me instead of cleaning them all up yourself, dozens of pages of fixes sounds pretty bad so thank you for fixing them. JeffDoozan (talk) 13:09, 17 July 2023 (UTC)[reply]
There wasn't much point to bot it. Someone had to add the English translations to the quotes anyway. — SURJECTION / T / C / L / 13:11, 17 July 2023 (UTC)[reply]
@Surjection: It looks like most of the bad edits affected Finnish proper nouns, which explains why nobody else had noticed them until now. I think they should all be fixed now. JeffDoozan (talk) 15:34, 17 July 2023 (UTC)[reply]

Bot job request

Could you run the bot to sort the lines under definitions? The usual order is synonyms, antonyms, other semantic relations, usage examples, and quotes. For example

# Definition
#: {{syn|en|xyz}}
#: {{ant|en|xyz}}
#: {{cot|en|xyz}}
#: {{ux|en|xyz}}
#* {{quote-text}}

(the newlines aren't showing up for some reason...) Ioaxxere (talk) 01:30, 25 July 2023 (UTC)[reply]

(You can use <pre> ... </pre> for preformatted text with line breaks.) Equinox 01:36, 25 July 2023 (UTC)[reply]
As far as I know, that order is generally accepted but not actually written in any policy document, which inevitably means that there exists some page where someone thinks it's really important to use a different order. I'm in favor of consistency, so if you want to start a discussion and get consensus for documenting and enforcing that order (don't forget to put collocations, too), I'm happy to do the sorting. JeffDoozan (talk) 13:16, 25 July 2023 (UTC)[reply]
@JeffDoozan Would the bot be extracting the dates of quotations by executing modules? While for many quotations the date is just sitting in a template invoked from the page, for stuff using {{Q}} or most Pali, Old Khmer (I think) and Northern Thai quotations, the date is sitting in a data module or a template invoked from Lua. I trust it won't be replacing template invocations by the current expansions; templates offer a means of improving the referencing of many quotations, and in a lot of what I do, coordinating corrections for any errors in quotation, transliteration or translation used on several pages (sometimes as many as 20). --RichardW57m (talk) 14:35, 26 July 2023 (UTC)[reply]
@RichardW57m: The bot will not sort quotes by date because, as you mention, there is no 100% reliable way for it to get a trustworthy date from the template. If a sense has multiple quotes, they will remain in the same order, but will be sorted after nyms, collocations, and usage examples. JeffDoozan (talk) 14:45, 26 July 2023 (UTC)[reply]
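
A sketch of the sorting described above, in Python. It relies on a stable sort so that multiple quotes under a sense keep their relative order, and it leaves the whole group untouched if any line is not recognized. The rank table is an assumption: the position of collocations (here {{co}}, between the nyms and usage examples) has not been agreed anywhere in this thread, and the set of "other" nym templates is only an excerpt.

```python
import re

# Leading "#", then ":" or "*", then the template name up to "|" or "}".
TEMPLATE_NAME = re.compile(r"^#+[:*]+\s*\{\{([^|}]+)")

OTHER_NYMS = {"cot", "hyper", "hypo"}  # excerpt only (assumption)


def rank(line):
    """Return the sort rank of a line under a definition, or None if unknown."""
    m = TEMPLATE_NAME.match(line)
    name = m.group(1).strip() if m else ""
    if name == "syn":
        return 0
    if name == "ant":
        return 1
    if name in OTHER_NYMS:
        return 2
    if name == "co":  # collocations: placement is an assumption
        return 3
    if name == "ux":
        return 4
    if name.startswith("quote-"):
        return 5
    return None


def sort_sense_lines(lines):
    """Sort the lines under one definition; give up on anything unfamiliar."""
    ranks = [rank(line) for line in lines]
    if any(r is None for r in ranks):
        return list(lines)
    # sorted() is stable, so quotes are never reordered among themselves.
    return [line for _, line in sorted(zip(ranks, lines), key=lambda p: p[0])]
```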

mistakes in templatizing quotes

See [2] for an example. URL's containing embedded vertical bars were converted to named params without URL-encoding the vertical bars. Probably the same issue with equal signs, although this may be less problematic. I'm trying to convert numbered params to named ones and there appear to be thousands of cases where this has happened. Benwing (talk) 23:25, 26 July 2023 (UTC)[reply]

BTW my script output warnings for 1,116 distinct quote templates when checking all the pages that have {{quote-book}} on them. Not all of them are due to conversions made by your bot or are specifically due to this issue. Benwing2 (talk) 00:20, 27 July 2023 (UTC)[reply]
@Benwing2: Okay, I'll get those cleaned up soon. The errors caused by my bot will probably have named url= or section= parameters containing a pipe and then an unexpected positional parameter, in case that helps you to ignore them for right now. JeffDoozan (talk) 01:15, 27 July 2023 (UTC)[reply]

@Benwing2: I escaped | with %7C in the urls listed on User:JeffDoozan/badquotes. To generate that list, I scanned all "{{quote-*" templates looking for named parameters containing "//" and "?" in their value and followed by a non-blank positional parameter or an unknown named parameter. I manually reviewed the list to discard valid items like

url=http://books.google.com/books?id=_wUJAAAAQAAJ&pg=PA62#v=onepage&q&f=false|Cain

and manually fixed a handful items with obvious typos like

url=http://books.google.co.uk/books?id=b8AfAAAAIAAJ&q=ambilevous&dq=ambilevous&ei=ah2SSMqaDoj2jgHb5_TXDQ&pgis=1|page 607

It looks like there's a lot of pretty ugly stuff when you search for quote templates with unnamed parameters following named parameters, let me know if you see anything else that looks like bot damage or if there's anything I can help out with on this. JeffDoozan (talk) 18:56, 27 July 2023 (UTC)[reply]
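
For illustration, the escaping step itself is small. The Python sketch below shows percent-encoding the vertical bar before a URL goes into a named parameter, plus a guard that refuses to emit a value still containing a bar. The helper names are hypothetical, and the guard is deliberately naive: it would also reject legitimate values containing piped wikilinks such as [[a|b]].

```python
def escape_pipes(url):
    """Percent-encode vertical bars so the URL is safe as a template argument."""
    return url.replace("|", "%7C")


def named_param(name, value):
    """Render |name=value, refusing values that would split into extra params."""
    if "|" in value:
        # Naive check: piped wikilinks inside the value would also trip this.
        raise ValueError(f"unescaped | in {name}= value: {value!r}")
    return f"|{name}={value}"
```

Usage: named_param("url", escape_pipes(raw_url)), where raw_url is whatever URL was extracted from the untemplated quote.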

Thanks! I also downloaded all of the quote-* templates (all 261,698 pages referencing Module:quote, over 11 hours 10 mins) and ran a script over them to find similar issues. It's not quite as correct as what you did because it only looks for numbered params directly following a named param |url= or |section= or |url2=, |section2=, etc. Right now I downloaded again just the pages flagged by this process, reran the script and manually filtered out the things that looked OK. The results are here: User:Benwing2/quote-possible-unescaped-vertical-bar-or-other-mistakes There are 41 lines, some of which are typos e.g. writing volume-6 instead of volume=6 but some look to be cases you missed. You might want to check to see why they were missed, if possible. Also, how did you check for an unknown named parameter? There are a lot of named params. If you have a regex or something for this, I can check for them as well to see if anything else is missed. Benwing2 (talk) 20:25, 27 July 2023 (UTC)[reply]
@Benwing2: I think I missed those entries because sometime during debugging, I filtered only templates that included at least 3 positional parameters and forgot to remove that. I'll get the rest of them cleaned up tomorrow. To get the list of params, I had my script dump out any named parameters immediately after the detected url and then verified them manually to get: 'accessdate', 'accessed', 'album', 'archivedate', 'author', 'book', 'brackets', 'by', 'chapter', 'chapterurl', 'city', 'column', 'date', 'digitized', 'doi2', 'edition', 'edition2', 'editor', 'editor2', 'editors', 'editors2', 'footer', 'genre', 'isbn', 'isbn2', 'issn', 'issn2', 'issue', 'journal', 'lccn', 'lines2', 'location', 'location2', 'magazine', 'month', 'network', 'newsfeed', 'newsgroup', 'newspaper', 'newversion', 'oclc', 'oclc2', 'origdate', 'original', 'origyear', 'others2', 'page', 'page2', 'pages', 'pageurl', 'passage', 'publisher', 'publisher2', 'quote', 'quotee', 'section', 'series2', 'subst', 't', 'text', 'title', 'title2', 'tr', 'trans', 'trans-title', 'trans-title2', 'translation', 'translator', 'translators', 'url', 'volume', 'volume2', 'work', 'year', 'year2', 'year_published' JeffDoozan (talk) 20:56, 27 July 2023 (UTC)[reply]
Great, thank you! Benwing2 (talk) 20:58, 27 July 2023 (UTC)[reply]
Your list of params is incomplete. I made a more-or-less complete list by going through the wikicode for the 12 quote templates, and also made a list of additional params that were seen following a URL and appear to be valid params but aren't recognized, or are misspellings of recognized params. (There are undoubtedly more that appear elsewhere in quote templates; haven't gotten there yet.) Here is the list of valid params:
recognized_named_params_list = [
  "accessdate", "accessdaymonth", "accessmonthday", "accessmonth", "accessyear", "actor", "album", "archivedate",
  "archiveurl", "article", "artist", "at", "author", "authorlabel", "authorlink", "authors", "autodate", "bibcode", 
  "blog", "book", "brackets", "by", "chapter", "chapterurl", "city", "coauthors", "column", "columns", "columnurl", 
  "column_end", "column_start", "composer", "date", "debate", "director", "directors", "DOI", "doi", "edition", 
  "editor", "editors", "email", "entry", "entryurl", "episode", "first", "footer", "format", "genre", "googleid", 
  "group", "house", "id", "indent", "ISBN", "isbn", "ISSN", "issn", "issue", "journal", "jstor", "lang", "last", 
  "laydate", "laysource", "laysummary", "LCCN", "lccn", "line", "lines", "lit", "location", "lyricist", 
  "lyrics-translator", "magazine", "mainauthor", "medium", "month", "network", "newsgroup", "newspaper", "newversion",
  "nocat", "nodate", "note", "notitle", "number", "OCLC", "oclc", "ol", "origdate", "original", "origmonth", "origyear",
  "other", "others", "page", "pages", "pageref", "pageurl", "page_end", "page_start", "passage", "periodical", "PMID", 
  "pmid", "publisher", "quote", "quoted_in", "quotee", "report", "role", "scene", "season", "section", "sectionurl", 
  "series", "seriesvolume", "site", "sort", "speaker", "ssrn", "start_date", "start_year", "subst", "t", "termlang", 
  "text", "time", "title", "titleurl", "tr", "trans", "trans-album", "trans-chapter", "trans-entry", "trans-episode", 
  "trans-journal", "trans-title", "trans-work", "transcription", "translation", "translator", "translators", 
  "transliteration", "ts", "type", "url", "urls", "url-access", "url-status", "version", "volume", "volume_plain",
  "work", "worklang", "writer", "writers", "year", "year_published", "2ndauthor", "2ndauthorlink", "2ndfirst", "2ndlast"
]
And here is the list of seen but unrecognized or misspelled params:
unrecognized_named_params_list = [
  "archive-date", "archive_date", "archiv-datum", "access", "access-date", "accessed", "archiveorg", "archive-url",
  "asin", "author1", "digitized", "fulltext", "i2", "isbn10", "meeting", "new version", "newsfeed", "oldurl", 
  "originalpassage", "p", "paragraph", "part", "pos", "producer", "publish-date", "rfc", "retrieved", "stanza", "via", 
  "website",
  # misspellings:
  "Chapter", "colunm", "Id", "IISBN", "isn", "Jnewsgroup", "oage", "Page", "pge", "Section", "tirle", "year_publsihed",
]
After updating my script to use your logic and (as a special case) ignoring cases with params |6= and |7= that have a numeric param value (since these are page number params that occur frequently), and running it on my original download of all 261,000+ quote templates, I find 2,548 possible cases of unescaped vertical bar. I then re-downloaded all the pages in question and reran the script; there are still 1,919 possible cases. Of these, 1,016 have one of [+%&] in the param name, which means they are almost certainly valid, and another 15 have %22 in the param value, which means they are very likely valid as well. This leaves 888 for manual review, most of which aren't valid (but some are, especially those without spaces in the param value). If it helps you, the likely valid cases are here: [User:Benwing2/quote-possible-unescaped-vertical-bar-or-other-mistakes-likely-valid] and the cases needing manual review are here: [User:Benwing2/quote-possible-unescaped-vertical-bar-or-other-mistakes-for-manual-review]. After you do a further round I'll check again and see if I see anything left. Thanks again for your help. Benwing2 (talk) 03:03, 28 July 2023 (UTC)[reply]
@Benwing2: Okay, I think this mess is pretty well cleaned up now. Thank you for generating a more complete list of named parameters. In addition to your list, I added
other_named_params_list = [
  "isbn2", "location2", "publisher2", "title2", "year2", "isbn2", "oclc2", "page2", "column2", "issn2",
  "author2", "authorlink2", "doi2", "edition2", "editor2", "editors2", "indent2", "lines2", "others2",
  "series2", "url2", "volume2",
  "pg", "editor(s)",
]
then I scanned all parameters containing "//" and "?" outside of square brackets and then appended all following parameters until encountering
  • a named parameter in recognized_named_params_list, unrecognized_named_params_list, or other_named_params_list
  • a positional parameter with no value (||)
  • a positional parameter in position 6 or 7 (page) containing a number < 10000
  • a positional parameter in position 5 or 6 (url) starting with http
  • any parameter after a parameter containing a newline
  • any parameter having a space in the name or value (I manually verified everything filtered by this and fixed the two valid matches (put paid to, unsub))
With the resulting urls, I manually reviewed any urls without '&' in the last concatenated parameter and discarded a handful of items like url=http://books.google.com/books?id=_wUJAAAAQAAJ&pg=PA62#v=onepage&q&f=false|Cain and url=https://books.google.com/books?id=Y0RfAAAAcAAJ|29–30. That left 1424 urls, including the items detected and fixed yesterday. All of the bad urls include ".google." and only 66 don't end with "&f=false" or "&redir_esc=y". JeffDoozan (talk) 18:04, 28 July 2023 (UTC)[reply]
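
A simplified Python sketch of the scan described above. It implements only a subset of the stop conditions (recognized named parameter, empty positional parameter, newline, space) and omits the position-based page and url checks and the square-bracket handling. KNOWN_NAMES is a tiny excerpt of the lists in this thread, and the function names are hypothetical. The output is a list of candidates for manual review, not something to apply blindly.

```python
# Tiny excerpt of the recognized parameter names listed earlier in this thread.
KNOWN_NAMES = {"author", "page", "pages", "passage", "text", "title", "url", "year"}


def is_stop(raw):
    """True if this parameter should end the run of suspected URL fragments."""
    text = raw.strip()
    if text == "":  # empty positional parameter (||)
        return True
    if " " in text:  # a space in the name or value
        return True
    name, sep, _ = text.partition("=")
    return bool(sep) and name.strip() in KNOWN_NAMES


def rejoin_candidates(params):
    """params: the pieces of one template body after a naive split on "|"."""
    out = []
    i = 0
    while i < len(params):
        current = params[i]
        _, sep, value = current.partition("=")
        i += 1
        if sep and "//" in value and "?" in value:
            # Absorb following pieces until a stop condition is met; a piece
            # containing a newline ends the run after it has been absorbed.
            while (
                i < len(params)
                and "\n" not in params[i - 1]
                and not is_stop(params[i])
            ):
                current += "%7C" + params[i]
                i += 1
        out.append(current)
    return out
```

With these rules, a trailing positional value such as Cain in the example above is still glued onto the URL, which is exactly the kind of false positive that had to be discarded by hand.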
Thank you so much for your work! I forgot to mention yesterday that I auto-generate all the named params ending in a 2 from the base list, which is why they were missing. I think I can now run the script to move numbered params to named ones. Benwing2 (talk) 19:15, 28 July 2023 (UTC)[reply]
BTW I found just a few issues with non-mainspace pages and fixed them by hand. All other pages look good. Benwing2 (talk) 20:38, 28 July 2023 (UTC)[reply]
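
Generating the names ending in 2 from a base list, as mentioned above, could look like the following Python sketch. The function name is hypothetical, and the base list shown in the usage note is only an excerpt.

```python
def with_numbered_variants(base, suffixes=("2",)):
    """Return the base names followed by each name with each suffix appended."""
    return list(base) + [name + s for name in base for s in suffixes]
```

For example, with_numbered_variants(["isbn", "title", "url"]) yields the three base names plus "isbn2", "title2" and "url2", so the two lists never drift apart.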

bad params in quote templates

Hi again. See User:Benwing2/quote-templates-bad-params. I made a list of all the unrecognized/unhandled params in all quote templates (there are actually four lists: (1) Handled numbered params; (2) Handled named params; (3) Unhandled params by reverse count; (4) Unhandled params by name). By looking at the unhandled params you can see the following types:

  1. misspelled params;
  2. params that should be handled (e.g. |author6=, |author7=, ...);
  3. URL fragments;
  4. cases of missing equal signs (e.g. |authorC. R. Gallistel, Rochel Gelman page=);
  5. params that logically might exist but don't (e.g. |place=, |editor-first=, |editor-last=, |date_published=);
  6. params that I suspect formerly existed because they are so common (e.g. |track=) (User:Sgconlaw can you comment on this?).

Can you help me fix up some of these? In particular, I'm working on changes to Module:quote, and among them I will implement support for |author6=, |author7=, etc. If you could help e.g. with the URL fragments, that would be great. Some of the misspellings can be handled by just making the appropriate conversion tables, but some will need further investigation. Benwing2 (talk) 22:08, 29 July 2023 (UTC)[reply]

Note, the hope is eventually to implement checking for unrecognized params so we don't get into this mess again; but first we need to eliminate most existing cases of bad params. Benwing2 (talk) 22:09, 29 July 2023 (UTC)[reply]
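
A conversion table for the misspelled parameter names might be applied along these lines. This Python sketch is illustrative only: the targets in RENAMES are guesses for the more obvious misspellings from the earlier list (ambiguous ones such as "isn" are left out), and the regex is naive, so a piped wikilink like [[a|Page = b]] inside a value would also be rewritten.

```python
import re

# Misspelling -> intended name. The targets are assumptions, not verified.
RENAMES = {
    "Chapter": "chapter",
    "colunm": "column",
    "oage": "page",
    "Page": "page",
    "pge": "page",
    "Section": "section",
    "tirle": "title",
    "year_publsihed": "year_published",
}

# "|", optional spaces, a parameter name, optional spaces, "=".
PARAM_NAME = re.compile(r"\|\s*([^|=\[\]{}\n]+?)\s*=")


def fix_param_names(template_text):
    """Rename known-misspelled parameter names, leaving everything else as is."""

    def repl(match):
        name = match.group(1)
        fixed = RENAMES.get(name)
        if fixed is None:
            return match.group(0)
        return match.group(0).replace(name, fixed, 1)

    return PARAM_NAME.sub(repl, template_text)
```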
+1 to staying out of this hole once we manage to climb out of it. From a very quick look at your list, it seems like most of the url stuff would be fixed by replacing "=" with "%3D" in "books.google.com/books?id=" and "books.google.com/books?isbn=". It might not even be necessary to restrict it to the quote templates; replacing those two strings globally on the mainspace sounds safe to me, which means I'm not thinking of something. If that's enough to unblock you right away, go ahead. Otherwise I'll take a more careful look at that and the rest of the list on Tuesday. JeffDoozan (talk) 22:49, 29 July 2023 (UTC)[reply]
BTW some of the examples look like this (I found 476 cases like this): {{quote-book|en|title=The Mystical Harvest|page=431|books.google.com/books?isbn=0595481140|author=Somar|year=2008|passage=The driver snatched a packet of cigarettes out of the glove compartment and '''absconded''' the driver's seat without a word}}, where the URL not only has an unescaped equal sign but is missing the https:// prefix. I checked a couple of these (on ablaut and abscond) and they were added around 2013 by User:DCDuring. Do you remember how you added them? I'm asking because if you used a tool, I'm wondering if there is a bug in the tool that leads to this outcome, which needs to be fixed. Benwing2 (talk) 23:22, 29 July 2023 (UTC)[reply]
I sometimes copy-paste quotations, sometimes urls too. Usually I remember to insert all parameter names with an equal sign. Sorry if I omitted that. I always include multiple authors in a single "author=" because it seems like something a bot or some other tool should do. I confess to probable occasional errors of types 1, 3, 4, 5. DCDuring (talk) 16:32, 30 July 2023 (UTC)[reply]
BTW, I would just as soon stop citing entries using quotations with multiple authors as do the multiple authors thing. That almost certainly means never using Google Scholar for quotations. DCDuring (talk) 16:34, 30 July 2023 (UTC)[reply]
@DCDuring That can be automated, just like the |page= vs. |pages= thing (that I was going to automate). Benwing2 (talk) 16:48, 30 July 2023 (UTC)[reply]
I hoped so. I wonder if it is a good use of time. Does anyone have any need for knowing whether someone is author5 or author3? Does anyone realistically foresee programs to find author counts? Are there any uses for these? Are we going to have to put author names in canonical form to facilitate such efforts? Are we really obliged to make Wiktionary even more of a therapeutic community for those with OCD by creating 'needs' of this kind? DCDuring (talk) 16:56, 30 July 2023 (UTC)[reply]
BTW, I notice that many pageurls seem inordinately long. They carry both evidence of how the quotation was found and unique identifiers for the containing document. I strongly suspect that we have no need for the record of the search. Can that be cleaned up automagically? DCDuring (talk) 17:00, 30 July 2023 (UTC)[reply]
@DCDuring The author3/author5 thing was a relic of when these templates were implemented entirely using template syntax (instead of Lua), where things like splitting on commas is hard. As for cleaning up pageurl's, I'm a bit loth to do that without a better understanding of how Google books params work in URL's; sometimes the page # isn't there and my fear is that by removing the search terms, we'll remove any way of finding the exact page of the quote. Maybe you could help me with that if you understand how they work. Benwing2 (talk) 17:04, 30 July 2023 (UTC)[reply]
Then why would we want to spend time on cleaning up "author#" now?
In my copious free time, I will attempt to determine how to retain the highlighting of the headword and the appropriate document and page links. But I am not very efficient at such tasks. If we know whether a search has failed due to changes at Google sites we can rerun a search using a large chunk of the passage. But Google seems to maintain the links. DCDuring (talk) 17:15, 30 July 2023 (UTC)[reply]
What I mean by the author3/author5 thing is that the code is now in Lua so we can easily split an author= value on commas and display them as if they were specified separately (which normally uses semicolons between authors). I'm not proposing actually rewriting the params to make them use author2/author3/author4/etc. Benwing2 (talk) 17:19, 30 July 2023 (UTC)[reply]
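
The comma-splitting idea can be sketched as below. The real code would live in Lua in Module:quote; this is only a Python illustration with hypothetical function names, and it assumes names are never written in "Last, First" form, which a plain split on commas would break.

```python
def split_authors(value):
    """Split a single author= value on commas into individual names."""
    return [name.strip() for name in value.split(",") if name.strip()]


def format_authors(value):
    """Display the names as if specified separately, joined by semicolons."""
    return "; ".join(split_authors(value))
```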

Vietnamese

Hi. It seems like the AutoDooz bot, when formatting quotes in Vietnamese entries, has accidentally hidden transliterations, for example, in trỗi. I would be thankful if you could make the transliterations show up again. PhanAnh123 (talk) 06:08, 8 August 2023 (UTC)[reply]

@PhanAnh123: Oops! Thank you for noticing that so quickly. I assumed that RQ: templates that handled |translation= would also handle |transliteration=, but it looks like there were a handful of Vietnamese templates that didn't:
I added |transliteration= support to each of the above templates, which should bring back transliterations on all Vietnamese entries. If you find anything else, please let me know. Thanks! JeffDoozan (talk) 13:58, 8 August 2023 (UTC)[reply]

Level request

While you're cleaning up leveling issues, can you fix indented rhymes like diff? They're mostly Icelandic and Faroese from before we had a standard format. I wouldn't touch it in entries with multiple pronunciations though; see course for an example where it's fine. Ultimateria (talk) 18:11, 8 August 2023 (UTC)[reply]

Removal of "static templates"

Are edits like this one useful? I think it is useful to know that these quotation templates, although crappy, exist so that I can clean them up when I encounter them. — Sgconlaw (talk) 22:53, 11 August 2023 (UTC)[reply]

@Sgconlaw: My idea was to remove the templates, too, after cleaning up the references. There are about 800 crappy templates like this that WF created between 2020 and 2022. Many of them weren't used anywhere and the rest were just used on a handful of pages. It's easier to nuke them entirely than to waste time trying to convert them to good templates. JeffDoozan (talk) 23:04, 11 August 2023 (UTC)[reply]
I see. Yes, I guess that's all right. The quotation templates can always be recreated properly where necessary. — Sgconlaw (talk) 21:51, 12 August 2023 (UTC)[reply]
@Sgconlaw: (also pinging @ExcarnateSojourner as another RQ expert) There are 600 static templates without any backlinks outside of their own documentation pages: User:JeffDoozan/lists/static rq template/fixes - I propose we remove all of them unless you have any objections. Additionally, there are about 150 static templates with backlinks from other documentation pages or active use. See User:JeffDoozan/lists/static rq template/errors for details - I'm not sure what to do with those, maybe you want to keep them all or maybe some should be removed. If someone can give me a list of the templates to remove, I'll take care of replacing them anywhere they're used on pages and you can take care of removing them from any documentation. JeffDoozan (talk) 20:30, 12 August 2023 (UTC)[reply]
Yes, I think quotation templates and/or their documentation subpages which are not actually in use in any entries can be safely deleted. — Sgconlaw (talk) 21:51, 12 August 2023 (UTC)[reply]
@Benwing2 Is it easy for you to mass-delete the unused static templates listed on User:JeffDoozan/lists/static_rq_template/fixes? I don't know if there are admin tools for that or if you would have to run it through a bot with delete rights. If it's complicated, please let me know who else to bother or how to make it easier. Thanks JeffDoozan (talk) 19:51, 21 August 2023 (UTC)[reply]
@JeffDoozan It's not hard. I have a bot script to do it. Benwing2 (talk) 20:30, 21 August 2023 (UTC)[reply]

error in incorporating passages from Template:quote

Hi. See this example in Esquimo:

#* {{quote-text|de|year=1837|author=George Back|title=Wunderbare Reiſen und Abenteuer zu Waſſer und zu Lande beſtanden von Capitän Back in den Jahren 1834 und 1835 um den für verloren gehaltenen Capitän Roß aufzuſuchen|page=67f
|passage=sc=Latf|t=Doch bald ſollte auch dieſe Freude getrübt werden. Der Bote berichtete uns nämlich, daß ſchon vor einem Monat ein ähnliches Paquet an uns abgeſendet worden ſey, und daß ſich mein alter Gefährte, der '''Esquimo'''-Dolmetſch  A u g u ſt u s, an die Überbringer desſelben alſogleich angeſchloſſen habe, ſobald er meine Anweſenheit erfuhr. {{...}} Die Anhänglichkeit dieſes guten '''Esquimo''', und ſein höchſt wahrſcheinlicher Tod betrübten mich auf lange Zeit.}}

The original had this:

#* '''1837''', George Back, ''Wunderbare Reiſen und Abenteuer zu Waſſer und zu Lande beſtanden von Capitän Back in den Jahren 1834 und 1835 um den für verloren gehaltenen Capitän Roß aufzuſuchen'', page 67f.:
#*: {{quote|de|sc=Latf|Doch bald ſollte auch dieſe Freude getrübt werden. Der Bote berichtete uns nämlich, daß ſchon vor einem Monat ein ähnliches Paquet an uns abgeſendet worden ſey, und daß ſich mein alter Gefährte, der '''Esquimo'''-Dolmetſch  A u g u ſt u s, an die Überbringer desſelben alſogleich angeſchloſſen habe, ſobald er meine Anweſenheit erfuhr. {{...}} Die Anhänglichkeit dieſes guten '''Esquimo''', und ſein höchſt wahrſcheinlicher Tod betrübten mich auf lange Zeit.}}

It looks like your code assumed that whatever comes after the language code in {{quote}} is the passage. This may be a unique example, because this is the only such case flagged by a check in a script of mine for embedded params in {{quote-*}} values, but it suggests to me that you're not using mwparserfromhell to do the parsing of templates. I'd definitely recommend using this library because there are lots and lots of edge cases in templates and this library does a great job of handling them all. Benwing2 (talk) 01:24, 12 August 2023 (UTC)[reply]

Thanks, you're right, at that time the code skipped parsing that specific line with mwparserfromhell because it only needed to handle {{ux}} with a single parameter, which of course grew up to bite me later on when I expanded it to handle a second parameter for translations and later added support for {{quote}}. I just scanned the August dump (well after this bug was fixed) for similar errors and found Esquimo plus 6 pages where it generated "t=tr=" all of which you had already fixed on August 7th with your "correct run-on params in {{quote-*}}" run. JeffDoozan (talk) 14:14, 12 August 2023 (UTC)[reply]
Cool sounds good, thanks for the update. Benwing2 (talk) 20:17, 12 August 2023 (UTC)[reply]
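The failure mode discussed above (treating whatever follows the language code as the passage) can be illustrated with a small stdlib-only sketch. This is a toy splitter written for illustration, not mwparserfromhell and not the bot's actual code; the function names `split_top_level` and `parse_template` are hypothetical. It separates named parameters such as sc=Latf from positional ones, and it ignores pipes and equals signs nested inside {{...}} or [[...]].

```python
def split_top_level(text, sep, maxsplit=-1):
    """Split text on sep, ignoring separators nested in {{...}} or [[...]]."""
    parts, depth, start, i = [], 0, 0, 0
    while i < len(text):
        pair = text[i:i + 2]
        if pair in ("{{", "[["):
            depth += 1
            i += 2
        elif pair in ("}}", "]]"):
            depth -= 1
            i += 2
        elif text[i] == sep and depth == 0 and maxsplit != 0:
            parts.append(text[start:i])
            start = i + 1
            i += 1
            maxsplit -= 1
        else:
            i += 1
    parts.append(text[start:])
    return parts


def parse_template(wikitext):
    """Return (name, params) for a single '{{name|...}}' call (toy parser)."""
    inner = wikitext.strip()[2:-2]  # drop the outer {{ and }}
    name, *raw_params = split_top_level(inner, "|")
    params, positional = {}, 0
    for raw in raw_params:
        pieces = split_top_level(raw, "=", maxsplit=1)
        if len(pieces) == 2:
            # Named parameter, e.g. sc=Latf or an explicit 2=term.
            params[pieces[0].strip()] = pieces[1].strip()
        else:
            # Positional parameter: number it in order of appearance.
            positional += 1
            params[str(positional)] = raw
    return name.strip(), params


name, params = parse_template("{{quote|de|sc=Latf|Doch bald {{...}} fin.}}")
print(name, params)
```

With this approach, sc=Latf ends up as a named parameter and the passage is positional parameter 2, so a converter would not copy "sc=Latf|..." into |passage=. Because positional parameters are keyed by number, {{m|en|term}} and {{m|en|2=term}} also compare equal. A real bot should still prefer a full parser, since this sketch skips many edge cases (comments, nowiki, HTML tags).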

Label fix

Could you replace all instances of incel|_|slang in labels with incel slang? This will allow proper categorization into Category:Incel slang by language rather than Category:Incel community. Ioaxxere (talk) 16:34, 13 August 2023 (UTC)[reply]

@Ioaxxere:  Done JeffDoozan (talk) 17:09, 13 August 2023 (UTC)[reply]

Slovenian > Slovincian

I've reverted your bot's edit of Klobučar. The surname is Slovenian, the templates are Slovenian, etc., so I suppose it was a mistake. — Phazd (talk|contribs) 20:50, 21 August 2023 (UTC)[reply]

@Phazd: Thanks for fixing that. Looking at WT:LOL, "Slovenian" isn't a canonical language name, which is why the bot was trying to fix it, however, "Slovenian" should have been corrected to "Slovene" and not "Slovincian". I'll figure out what went wrong and fix up any other damage it may have done. JeffDoozan (talk) 21:23, 21 August 2023 (UTC)[reply]
Right, now I realise I used the name "Slovenian" due to my native language, I didn't notice WT prefers "Slovene". I'll correct that now. — Phazd (talk|contribs) 22:19, 21 August 2023 (UTC)[reply]

A few (hopefully easy?) bot jobs

Hey Jeff, I was wondering if you'd be willing to run a few bot jobs on Polish lemmas. Nothing vital, so if it needs to wait, I understand.

  1. Can we remove any instances of {{etydate}} from Polish lemmas inherited from Old Polish? (there should be {{inh+|pl|zlw-opl}} in the same line, or ref={{R:zlw-opl:SPJSP}} or ref={{R:zlw-opl:SSP1953}}, or <ref></ref> with the same templates)
  2. Can we remove :* {{R:pl:NFJP}} and * {{R:pl:NFJP}} from the ===References=== section?
  3. Can we change any instances of <references/> to {{reflist}}? Vininn126 (talk) 12:27, 28 August 2023 (UTC)[reply]
@Vininn126: #1 and #2 are done. #1 may need some manual cleanup on a, arbuz, bieluń, bogomyślny, czeladnik, jąć, kosa, kolczyk, miecz, nóż, odszczepieniec, pogrzeb, rocznik, urynał, znamienity, świnia where there was a <ref> tag after {{etydate}} JeffDoozan (talk) 19:04, 9 September 2023 (UTC)[reply]
@JeffDoozan Thank you. I'll clean those up now. I was wondering if you'd be willing to help with two more changes, one very easy and one I'm not so sure about. By the way, welcome back! Vininn126 (talk) 19:08, 9 September 2023 (UTC)[reply]
More specifically:
  1. Could we check any instances of {{col-auto}} without titles and separate the terms therein by POS and give them a title? For example
    1. {{col-auto|pl|foo(adjective)|bar(particle)}} -> {{col-auto|pl|title=adjectives|foo(adjective)}} (new line) {{col-auto|pl|title=particles|bar(particle)}}
  2. Use plural titles in col-auto for Polish, Old Polish, Kashubian, and Silesian? e.g. |title=adjective -> |title=adjectives
  3. Change instances of From {{af|lang|foo|-o-|bar}}., From {{af|lang|foo|-i-|bar}}., and From {{af|lang|foo|-y-|bar}}. to {{com+|lang|foo|-vowel-|bar}} for Polish, Old Polish, Kashubian, and Silesian.
Please let me know if you have any questions and thank you for all that you do! Vininn126 (talk) 14:14, 12 September 2023 (UTC)[reply]

quote-* changes

Have you been keeping up with the quote-* changes? I've deprecated some params and added support in others for multiple semicolon-separated entities and inline modifiers. I know you have code to convert raw quotes to templatized quotes; can you post the code or email it to me? I can review it and let you know if anything needs to change. Benwing2 (talk) 13:07, 30 August 2023 (UTC)[reply]

@Benwing2: I've been pretty occupied with non-wiki stuff for the last few weeks so I haven't stayed on top of the {{quote-*}} changes. One thing I think may need changing is that the bot will produce |authorN= values for any names it can confidently identify, which I think was on your list of things to deprecate. I can easily change that to produce a semicolon-separated list under |author=. It doesn't do any handling of multiple language codes, so the language code prefixing shouldn't have any effect.
The quote fixer code is here with additional helper utils here. The code wasn't written with the expectation of anyone but me reading it, so it's pretty ugly and it might be easier to look at the (also pretty ugly) tests to see how it would parse/process different raw quotes. If you don't want to dive into that, I'm happy to dig through the {{quote-*}} revisions to see what needs to change.
Speaking of quote changes, I was thinking of adding something like |formatted_source= to Module:quote that, if it exists, would be used directly instead of calling export.source(). This would let us use templates for the 27,000 bare quotes that the bot can't parse so that, even though the source isn't machine parsable, the quote lines will be parsed into |passage= and |t= where they can be categorized and generate cleanup notices for untranslated passages. I would deliberately name it something long like |formatted_source= and probably only use it with {{quote-text}} in the hope that nobody would ever use it manually when adding new quotes. I think there's some precedent for this in {{zh-usex}}, which generates a source line if |ref= is specified. Thoughts? JeffDoozan (talk) 16:17, 30 August 2023 (UTC)[reply]
I wonder if it wouldn't be better to just use {{quote}} on the quotation part and leave the citation part unformatted; or maybe better, add a cleanup category using {{rfcleanup-quote}} (or something like that, which adds all the necessary categories). The alternative you propose, although it gets you a cleanup category, seems like it would result in materially fewer manual cleanups of quotations. Benwing2 (talk) 18:56, 30 August 2023 (UTC)[reply]

#1:1 fixed list item depth

Hi, AutoDooz just changed the indentation of {{alti}} instances on various Ukrainian noun/adjective/participle forms (e.g. беззаперечному, контакту). The alt forms had been deliberately double-indented with "##:" to line up with the correspondingly double-indented sense (in each instance, the last item in a list). The bot's edits have brought them out of alignment by deleting one of the hashes. May I request a revert? Thanks Voltaigne (talk) 17:54, 6 September 2023 (UTC)[reply]

@Voltaigne: Hi! Personally, I think the double indent makes it look like the alt form only applies to the last sense (ie, only when using the "dative singular"), but if there's consensus for making an exception after {{infl of}}, I can do that. Alternatively, if you don't like the way a single indent looks, maybe using an "===Alternative forms===" section, like on the Spanish word pie would avoid the problem. JeffDoozan (talk) 18:26, 6 September 2023 (UTC)[reply]
In these instances, the alt form does only apply to the last sense in the list rendered by {{infl of}}, hence the placement of {{alti}} directly underneath it (such that the "[alternative form▼]" open/close link appears to the right of it). When an alt form doesn't apply to the last listed sense, or applies to more than one sense in the list, I use a "===Alternative forms===" section and specify the relevant senses using {{sense}}. Voltaigne (talk) 21:04, 6 September 2023 (UTC)[reply]
@Voltaigne:, wow, that's a really interesting way to combine those two templates, but it seems like something that other editors (and bots) would break inadvertently in the future if they need to make changes. Ideally {{infl of}} would have a way to add per-sense modifiers for alternative forms, but since it doesn't, what do you think of doing something like
# {{infl of|uk|беззапере́чний||m//n|loc|s}}
# {{infl of|uk|беззапере́чний||m//n|dat|s}}
#: {{alti|uk|беззапере́чнім}}
to make it really explicit that {{alti}} is tied to a specific inflection? JeffDoozan (talk) 23:55, 6 September 2023 (UTC)[reply]
Okay, I will abandon the method I've been using (which admittedly is a workaround) and from now on will use either separate {{infl of}} instances per sense or ===Alternative forms=== sections. It would indeed be ideal if alt forms could be juxtaposed inline with listed senses within {{infl of}}, as (to my eye at least) it makes for a more elegant layout. Voltaigne (talk) 09:36, 7 September 2023 (UTC)[reply]

es missing drae

Hey. Can you regenerate i.e. eliminate the blue links on User:JeffDoozan/lists/es missing drae phrases3, User:JeffDoozan/lists/es missing drae phrases4 and User:JeffDoozan/lists/es missing drae phrases5? Could you also reduce the frequency from 5000 down to 4000, which hopefully springs some pretty things up. P. Sovjunk (talk) 21:33, 19 September 2023 (UTC)[reply]

@P. Sovjunk: The code I used to generate the phraseX lists evolved into what is now used to generate the unified list at User:JeffDoozan/lists/es_missing_drae. I'll adjust that to use 4000 instead of 5000 and hopefully you'll find something new and interesting over there when it gets regenerated from the next database export. JeffDoozan (talk) 13:25, 20 September 2023 (UTC)[reply]

Quote cleanup suggestion

[3]: In some rare one-line "year - passage - metadata" format, such as at wuzu. —Fish bowl (talk) 01:12, 21 September 2023 (UTC)[reply]

Pronunciation of a Misspelling

Hey, I was checking to make sure AutoDooz was working correctly, and I noticed one thing that I'm not 100% clear about. Here: [4] we see a pronunciation section moved from Etymology 1 only to the top so it would apply to both etymologies. However, the second etymology is a misspelling. Off the top of my head, I don't know of any misspelling entries on Wiktionary where a pronunciation is given; this would be a first for me. What do you think? I'm fine with the status quo because the misspelling "probably might" be pronounced the same way as the correctly spelled etymology. But I'm not sure exactly if everything is "right". I'm just gonna move this discussion to the Talk page since I think it could come up again, unless you have any objection; see Talk:Jingsha. --Geographyinitiative (talk) 16:20, 21 September 2023 (UTC) (Modified)[reply]

Titi

Titi as slang in Spanish does not mean a young chick; it means auntie. Check the Bad Bunny song "Titi me pregunto si tengo muchas novias": "Auntie asked me if I have a lot of girlfriends." 2607:FB91:D7D:47FB:2CC4:8198:C928:796A 11:42, 7 October 2023 (UTC)[reply]

Your bot is making mistakes in the quotes.

For some quotes on the main page, there's this sequence:

##*: blah blah blah.

Your bot's edit changes it to

##: blah blah blah.

This makes it so that the quote isn't identified as a quote, and thus no show-quote tab is added. CitationsFreak (talk) 15:27, 24 October 2023 (UTC)[reply]

@CitationsFreak: Diff please. JeffDoozan (talk) 15:30, 24 October 2023 (UTC)[reply]
https://en.wiktionary.org/wiki/Special:Diff/76434295 CitationsFreak (talk) 15:31, 24 October 2023 (UTC)[reply]
@CitationsFreak: Yup, that should have been corrected to ##* blah blah. I'll figure out what's going wrong. Thanks! JeffDoozan (talk) 15:35, 24 October 2023 (UTC)[reply]

Double redirect made by bot

https://en.wiktionary.org/w/index.php?title=Wiktionary_talk:Grease_pit/2023/December&redirect=no Justin (koavf)TCM 14:56, 25 December 2023 (UTC)[reply]

There are apparently 2 different paradigms for this verb, and your bot switched at least 39 of the inflected forms to the wrong one: e.g. lloviznéis used to have {{|tl|es-verb form of|lloviznar}} which generated "only used in os lloviznéis, second-person plural present subjunctive of lloviznarse", so switching it to {{es-verb form of|lloviznar<only3s>}} was definitely a mistake. The "39" is the number in CAT:E, but I have a hunch that there are others that aren't calling attention to themselves by throwing an error- but are still wrong. Chuck Entz (talk) 04:10, 4 January 2024 (UTC)[reply]

Hi. [This edit] by Autodooz introduced a bolding glitch (2004 quotation in the verb section). It turned out to be 'caused' by an apostrophe in a parameter of a "sic" template usage in the text that Autodooz embedded in a "quote-text". I 'fixed' it by changing the straight quote <'> to a curly quote <’>. I imagine this is well towards the rare end of the scale and hardly worth the bother of fixing. (Were one to do so, one might find oneself changing the "sic" template—perhaps simply to add a note in the documentation.)— Pingkudimmi 04:32, 2 February 2024 (UTC)[reply]

Weird! The ' in {{sic}} should not be interpreted as formatting. I added a workaround to the documentation on {{sic}} and opened a discussion on Grease pit that might lead to a better solution. JeffDoozan (talk) 17:11, 2 February 2024 (UTC)[reply]

Edits to quotation templates

Hi, I'm not familiar with Lua. What are the effects of your bot's recent edits to quotation templates by adding |propagateparams= in some cases and modifying |allowparams= in others? Thanks. — Sgconlaw (talk) 22:04, 4 February 2024 (UTC)[reply]

@Sgconlaw: Any param names listed in |propagateparams= are passed along to the module for handling, so |propagateparams = url has the same effect as writing url = {{{url|}}}.
|allowparams= lists the named and numbered params that are handled by the template, which allows for parameter checking (the template will throw an error if the user makes a typo like |txt= instead of |text= or tries to include a parameter that won't be used) JeffDoozan (talk) 22:18, 4 February 2024 (UTC)[reply]
I see ... I'm probably going to keep coding the templates without using these parameters as I'm still a bit unsure about how to use them, though. — Sgconlaw (talk) 22:25, 4 February 2024 (UTC)[reply]
@Sgconlaw: That's fair, the bot can figure out what should be in |allowparams=, and |propagateparams= is just a shortcut. JeffDoozan (talk) 22:36, 4 February 2024 (UTC)[reply]

──────────────────────────────────────────────────────────────────────────────────────────────────── Hi, Jeff. Could you explain to me what should be put into |allowparams= and |propagateparams=? Your bot removed |section= from a quotation in cough as an unused parameter. I realized this was an error in {{RQ:Wodehouse Summer Lightning}} so I reverted the change and added |section= to the quotation template. Your bot then added "section" to |allowparams= when it invoked Module:quote. However, the bot removed |section= from the quotation a second time and I'm not sure why. Thanks. — Sgconlaw (talk) 19:49, 8 February 2024 (UTC)[reply]

@Sgconlaw: The bot edited cough a second time because it was using the template data from the data export on 2/1/2024 and not the live version. I've fixed that so it shouldn't happen again and I've reviewed all of the bot's edits to avoid dropping data that wasn't being generated by the template, but I'm sure I've missed some things so in the next few days I'll make you a list of all of the bot edits so you can double check that it didn't drop any other important data.
|propagateparams= is a list of the parameters that, if passed to the RQ template, will be used directly by {{quote-book}}, minus a few parameters it always handles: "brackets", "footer", "passage", "text", anything listed in |textparams= (and, if |pageparam= is set, also minus "page", "pages", and all variables listed in |pageparam=)
|allowparams= is a list of parameters the template should "allow" users to specify even though they won't be used directly by {{quote-book}}. It's pretty much just a list of every variable used in triple braces inside the template. JeffDoozan (talk) 20:12, 8 February 2024 (UTC)[reply]
Thanks for explaining. I'll see if I can get this right when I work on quotation templates. — Sgconlaw (talk) 20:27, 8 February 2024 (UTC)[reply]
@Sgconlaw: User:JeffDoozan/lists/history contains a list of all the bad params the bot detected in pre-conversion RQ templates. They've all been renamed, removed, or added to supported params in the template and no immediate action is needed. However, it's possible some of the data was useful and in the future someone may want to adjust the templates to add support for the removed data. JeffDoozan (talk) 16:19, 10 February 2024 (UTC)[reply]
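The parameter checking described in this thread can be sketched roughly as follows. The real logic lives in Lua in Module:quote and has more special cases (for example the always-handled "brackets", "footer", "passage", and "text", plus |textparams= and |pageparam=); this Python sketch only shows the basic idea, and the function name `check_params` is hypothetical.

```python
def check_params(passed, allowparams, propagateparams):
    """Reject unknown parameters; return the ones to forward unchanged.

    passed: dict of parameter name -> value given by the user.
    allowparams: names the template itself consumes (assumed list).
    propagateparams: names forwarded directly to the quote module.
    """
    allowed = set(allowparams) | set(propagateparams)
    unknown = sorted(name for name in passed if name not in allowed)
    if unknown:
        # Mirrors the "throw an error on a typo" behaviour, e.g. txt vs text.
        raise ValueError("unsupported parameter(s): " + ", ".join(unknown))
    # Propagated params behave like writing url = {{{url|}}} in the template.
    return {k: v for k, v in passed.items() if k in propagateparams}


forwarded = check_params(
    {"url": "https://example.org", "section": "chapter 1"},
    allowparams=["section"],
    propagateparams=["url"],
)
print(forwarded)
```

In this sketch a parameter listed in neither place is an error, which matches the |section= situation above: once "section" is added to the allowed list, the quotation no longer needs to drop it.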

Bot job suggestion

Identify and correct pages where a reference comes before a punctuation mark. For example:

Incorrect: According to Jones (2020)[1], the term is []

Correct: According to Jones (2020),[1] the term is []

Ioaxxere (talk) 22:59, 14 February 2024 (UTC)[reply]

@Ioaxxere: There are 6000 refs before a punctuation and 51,000 refs after a punctuation so it looks like there's a consensus that refs should be after punctuation. Can you browse through some of the 6000 refs that would be affected by this and let me know if there are any exceptions I should look out for: is there anything in that list of 6000 that shouldn't get changed? JeffDoozan (talk) 01:38, 15 February 2024 (UTC)[reply]
I found a few cases where the punctuation is on both sides, so make sure those are handled properly. On menstruation there are two different punctuation marks which would probably have to be handled manually, but there should be relatively few of these cases.
Super incorrect: According to Jones (2020),[1]. the term is []
Ioaxxere (talk) 14:57, 15 February 2024 (UTC)[reply]
@JeffDoozan: The bot is causing issues in which the punctuation is worse than before. Ex: having the period in between references. This is happening on several pages, but two of the ones on my watchlist include: Jeju 으시 (-eusi-) and Igbo ụ̀tụtụ̀ ọma. I also think that if there is a consensus on this, it should actually be formalized and placed somewhere rather than solely discussed on a user page before implementation. People with backgrounds in other languages, ex: Spanish where both options are correct, may edit differently and until there's a policy, we shouldn't unilaterally adopt one or the other. AG202 (talk) 01:39, 16 February 2024 (UTC)[reply]
@AG202: Thanks for mentioning those two pages: the bot wasn't handling references that were a single tag, but that's an easy fix and already running as I write this. I regret opening this particular can of worms, have no opinion on the formatting, and am certainly not trying to unilaterally implement anything. Given that it was nearly 90% one format it seemed like a non-issue to fixup the remainder, but I should have asked @Ioaxxere to check for formal consensus before running this. JeffDoozan (talk) 02:01, 16 February 2024 (UTC)[reply]
No worries about it and thanks for the fixes! Your analysis was valid and made sense; I just wanted to make sure that more people had a chance to see it, as some people may have done their references like that on purpose and may now be confused. AG202 (talk) 02:05, 16 February 2024 (UTC)[reply]
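For reference, the reference-before-punctuation fix discussed in this section could be sketched with a regular expression along these lines. This is an illustrative sketch, not the bot's actual code; the function name is hypothetical. It treats a run of adjacent references (including single-tag ones like <ref name="x"/>) as one unit so that punctuation is never moved in between references, and it leaves the "punctuation on both sides" case alone for manual review.

```python
import re

# One reference: either a self-closing tag or an open/close pair.
REF = r'<ref\b[^>]*?/>|<ref\b[^>]*>(?:(?!<ref\b|</ref>).)*</ref>'

PATTERN = re.compile(
    # Do not start in the middle of a run of refs, and skip runs that
    # already have punctuation in front of them.
    r'(?<![.,;:])(?<!</ref>)(?<!/>)'
    r'((?:' + REF + r')+)'   # one or more adjacent references
    r'([.,;:])',             # the punctuation mark that follows them
    re.DOTALL,
)


def move_refs_after_punctuation(text):
    """Rewrite 'word<ref>..</ref>,' as 'word,<ref>..</ref>'."""
    return PATTERN.sub(r'\2\1', text)


print(move_refs_after_punctuation(
    'According to Jones (2020)<ref>Jones</ref>, the term is'
))
```

The lookbehinds are the important design choice here: without them, a pattern that matches only the last reference in a run would place the period between two references, which is the regression reported above.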

Korean with wrong script code

I recently noticed that because of changes in language data, Korean terms are no longer transliterated if the script code is given as Hang (or Hani or Cyrl or Armn, but I fixed those cases). There are some cases in link templates that I put into a list: User:Erutuon/lists/ko not with Kore. The solution is just to remove the script parameter when most of the characters in the text are Hangul. Then automatic script detection will assign an implicit |sc=Kore, which will enable transliteration.

Could you remove the script parameters in these cases? I've done some bot tasks like this in the past, but I'm reluctant because I'm slow at writing the code for a new task.

I'm not sure what should be done when the term is in Latin script and |sc=Latn (for instance, {{t+|ko|DVD|sc=Latn}}). Latn is not one of the scripts assigned to Korean, so I guess this is a matter for Korean editors to decide.

There are some other cases in my list of best script not matching |sc= parameter where the |sc= parameter could be removed, like when it's just a difference between various different Arab script codes. It won't have as noticeable an effect because many of the Arabic-script languages don't have automatic transliteration like Korean does, but I could print out the parameters that need to be changed if you want. — Eru·tuon 18:06, 20 February 2024 (UTC)[reply]

I had the bot remove |sc=Hang from templates matching {{t|ko}}, {{t-check|ko}}, and {{t+check|ko}} on the pages you listed because it was easy to manually verify that the bot's proposed fixes matched what was listed on your page (I can't read Hang or Hangul). The remaining handful of mismatches are probably faster to fix by hand than by bot since the bot is dumb and might propose removing a valid |sc=Hang if there are multiple matches on the page. User:Erutuon/lists/best_script_not_matching_sc_parameter looks interesting, if you can make me a machine readable list with ["page", "{{old-template\n|exact\n|string including whitespace }}", "{{new-template|string}}", "summary message"], for only the items that need to be fixed, I can have the bot apply your changes. JeffDoozan (talk) 18:56, 20 February 2024 (UTC)[reply]
forgot the ping @Erutuon JeffDoozan (talk) 18:57, 20 February 2024 (UTC)[reply]
Thanks! I'm working on a list of fixes User:Erutuon/lists/best script not matching sc parameter/fixes here (not in the correct format yet). I've got to write the correct edit summaries and possibly remove things that don't need fixing so that I can find some more rules that can be applied. Scanning the list is inefficient. — Eru·tuon 22:42, 20 February 2024 (UTC)[reply]
It's possible that some of the printed templates won't match what's in the wikitext, because I dump the templates as name, redirected name, and parameters as an array of key-value tuples and that means I could be printing numbered parameters in the wrong way. (In this format, {{m|en|term}} isn't distinct from {{m|en|2=term}} for instance.) — Eru·tuon 22:46, 20 February 2024 (UTC)[reply]
@Erutuon: Ok, I was hoping to be lazy and just do direct string replacements, but I can write a function to compare templates that matches implied/explicit numbered parameters. JeffDoozan (talk) 15:58, 21 February 2024 (UTC)[reply]
A literal replacement should work in the vast majority of cases. It's fairly rare for people to use explicitly numbered parameters and it's not very common for the templates to be changed after the dump is released. But here is a format with template name and parameters as key-value pairs. I would probably whittle it down further to the parameters that actually identify whether the template needs changing (language code, term, alt, maybe sc, with explicit nulls when the parameters are missing) and the name of the script parameter to remove. Then if someone only changes an irrelevant parameter (for instance, |id= or |tr=) after the dump was released, you can still remove the |sc= parameter. — Eru·tuon 18:22, 21 February 2024 (UTC)[reply]
Actually, scratch that. I don't have a way of determining what the name of a missing parameter will (for instance for alt, |3=, |4=, |alt=, |alt1=, etc., depending on the template). So, the bot still has to identify the template by all its parameters and bail out if something has changed. Unfortunate, but it's probably fairly rare. I also realized that there could be multiple changes per template (like if {{affix}} has multiple unnecessary script parameters; though apparently there are no cases right now), so I've changed the format to account for that. Also added edit summaries. Going to regenerate from latest dump (up till now, it has been from the 2024-02-01 dump). — Eru·tuon 20:04, 21 February 2024 (UTC)[reply]
Okay, sounds good. I'll do an exact match on template name and all param names/values and have it log anything where it doesn't find the expected data. If possible, can you wrap the whole page in [] brackets and add a comma after the end of each line so I can load the file as JSON without having to preprocess it? JeffDoozan (talk) 20:13, 21 February 2024 (UTC)[reply]
I can if you'd prefer, but another way would be [json.loads(line) for line in text.split("\n")] (or [json.loads(line) for line in open("path")]) if you're using Python. I like JSONL because it can be written in a loop (letting the program be interrupted) and be read bit-by-bit if the file is very large. — Eru·tuon 23:17, 21 February 2024 (UTC)[reply]
@Erutuon: Thanks for adding the formatting. I ran the first 30ish entries and everything seems to be working. I added the template name to the summary and removed any duplicate log messages. Please check that the edits look like what you're expecting and I can run the rest of the fixes tomorrow. JeffDoozan (talk) 01:15, 22 February 2024 (UTC)[reply]
Checked the edits in popups and they look good! The only minor change is you could link the template name as it's confusing when it's a single letter: T:l, T:bor (or [[T:l|{{l}}]], which looks like {{l}} in an edit summary). — Eru·tuon 01:22, 22 February 2024 (UTC)[reply]
@Erutuon: I linked the template name using the {{l}} format, see ahad, Allah, alur. Ready to push this through on your approval. JeffDoozan (talk) 15:55, 22 February 2024 (UTC)[reply]
Looks good! I forgot to mention (though perhaps you noticed) that some of the templates ({{bor-lite}} for instance) have "action":["add","sc","Latn"]. I think you haven't hit that yet in your test edits, so you might want to run a few of those or do them in a second pass after the "remove" ones. — Eru·tuon 18:02, 22 February 2024 (UTC)[reply]
@Erutuon: Ok, I added support for the "add" action - it will add a parameter if it's missing, or set the value if it already exists, see ba. All template actions are executed in order upon matching a template, so it's theoretically possible to add, modify and then remove the same parameter which would just generate a big summary message with no actual changes - I'm guessing you haven't done this but mentioning it just in case. Anything else? JeffDoozan (talk) 19:18, 22 February 2024 (UTC)[reply]
That's it and you should be good to go. I was surprised, but there weren't multiple actions for any of the templates (at least in the version from the first dump this month). So nobody inserted |sc1=Arab and |sc2=Arab into {{affix}}. — Eru·tuon 21:24, 22 February 2024 (UTC)[reply]
@Erutuon:  Done I'll keep the script around in case you come up with any other fixes. JeffDoozan (talk) 01:46, 23 February 2024 (UTC)[reply]
Thanks for your help! I definitely will. I'm filtering out the handled cases from the list and have already found some more fixes. And people are likely to insert more incorrect script codes of the types already identified. — Eru·tuon 15:46, 23 February 2024 (UTC)[reply]
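The workflow agreed on in this thread (JSONL fix list, exact match on all parameters, then "add" and "remove" actions applied in order) could look roughly like the sketch below. The record layout used here ("page", "template", "params", "actions", "summary") is an assumption made for illustration; the actual format of the fixes list is not reproduced in this discussion, and the function names are hypothetical.

```python
import json


def load_fixes(text):
    """Parse JSONL: one JSON object per non-empty line."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]


def apply_fix(current_params, record):
    """Apply a record's actions, or return None if the template changed.

    current_params: dict of the live template's parameters.
    record["params"]: list of [key, value] pairs as seen in the dump.
    record["actions"]: e.g. [["remove", "sc"]] or [["add", "sc", "Latn"]].
    """
    if current_params != dict(record["params"]):
        return None  # edited since the dump: log it and bail out
    updated = dict(current_params)
    for action in record["actions"]:
        if action[0] == "add":
            # Adds the parameter if missing, or overwrites an existing value.
            updated[action[1]] = action[2]
        elif action[0] == "remove":
            updated.pop(action[1], None)
        else:
            raise ValueError("unknown action: " + str(action[0]))
    return updated


sample = json.dumps({
    "page": "example", "template": "t",
    "params": [["1", "ko"], ["2", "term"], ["sc", "Hang"]],
    "actions": [["remove", "sc"]], "summary": "remove sc",
})
fix = load_fixes(sample)[0]
print(apply_fix({"1": "ko", "2": "term", "sc": "Hang"}, fix))
```

Reading line by line keeps the advantage mentioned above for JSONL: the list can be written incrementally and consumed without loading one large array.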

I've filled up the list with a bunch more changes that you can run whenever you have the time. I'm pretty confident with the list now because I've looked it over by filtering with a script and eliminated a bunch of dubious changes. I added language information to the edit summary to make it easier to search for edits if necessary. (Though now edit summaries might run up against the character limit when there are multiple changes per page. Maybe less likely if you can eliminate duplicates with something like "; ".join(set(summaries)).) — Eru·tuon 21:36, 27 February 2024 (UTC)[reply]

 Done JeffDoozan (talk) 22:16, 27 February 2024 (UTC)[reply]
Thanks! In time for the next dump in a few days. — Eru·tuon 03:06, 28 February 2024 (UTC)[reply]
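On the edit-summary point above: "; ".join(set(summaries)) removes duplicates but does not preserve order, so the same page could get differently ordered summaries on different runs. A small sketch of an order-preserving variant follows; the 500-character default is an assumption about the summary length limit and should be checked against the wiki's actual configuration, and the function name is hypothetical.

```python
def join_summaries(summaries, limit=500):
    """Join unique summaries in first-seen order, truncating if too long."""
    unique = list(dict.fromkeys(summaries))  # de-duplicate, keep order
    joined = "; ".join(unique)
    if len(joined) <= limit:
        return joined
    # Assumed limit exceeded: cut and mark the truncation.
    return joined[: limit - 3] + "..."


print(join_summaries(["remove sc=Hang", "remove sc=Arab", "remove sc=Hang"]))
```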

Thanks etc.

The list of template parameter errors has already been helpful to me in correcting those errors and others in organism name entries.

See User talk:JeffDoozan/lists/template params/errors for some specific comments. DCDuring (talk) 18:19, 20 February 2024 (UTC)[reply]

Hello, Jeff Doozan. Thank you for your recent work on {{taxlink}}. (See User talk:0DF#Gomphocarpus filiformis for my interest in the matter.) 0DF (talk) 00:35, 2 March 2024 (UTC)[reply]

Hello there. Would you mind fixing whatever the problem is that prevents Gomphocarpus filiformis from being properly italicised in Asclepias filiformis var. buchenaviana and Gomphocarpus filiformis var. buchenavianus, please? 0DF (talk) 12:45, 9 March 2024 (UTC)[reply]

@0DF: It's showing as italicized in your examples, are you saying it should be italicized differently? JeffDoozan (talk) 18:31, 9 March 2024 (UTC)[reply]
It doesn't look italicized to me, but the problem seems to be that {{synonym of}} is ignoring italics in that parameter. I think the italicized version would need to go in a separate |head= or |3= parameter. Chuck Entz (talk) 18:59, 9 March 2024 (UTC)[reply]
Ah, I thought you meant in the resulting output of {{taxfmt}} not on the pages themselves. Chuck's right, it looks like {{synonym of}} is adding italics to italics and resulting in bold text. JeffDoozan (talk) 19:09, 9 March 2024 (UTC)[reply]
@Jeff Doozan: The problem I see is as Chuck Entz describes it. I assume this is a fix that needs to be made to {{synonym of}}. 0DF (talk) 19:11, 9 March 2024 (UTC)[reply]
@0DF: how to handle italicization of italicized text is a broader issue. I'm not sure we want to mess with the way {{synonym of}} deals with that, since it probably would have to be applied to {{m}} and most if not all of the etymology templates. For many uses, plain text to contrast with surrounding italics is proper formatting. By the way, pinging someone on their own talk page does nothing and just makes you look clueless. Chuck Entz (talk) 19:28, 9 March 2024 (UTC)[reply]
@Chuck Entz: Well, DCDuring concerns himself with this proper italicisation, and I suspect this isn't the outcome he'd want, but whatever. By the way, you may notice that I wrote @Jeff Doozan: to show clearly at a glance to whom I was writing without the link to User:JeffDoozan that would've elsewhere generated a ping; I do know how that works, you know. 0DF (talk) 22:22, 16 March 2024 (UTC)[reply]
There is no fully satisfactory way to have contrasting type faces (italic vs. normal; normal vs. italic) when there is little or nothing to contrast with, so templates like {{syn of}}, {{alt of}}, etc. can do what they will AFAIAC. DCDuring (talk) 03:52, 17 March 2024 (UTC)[reply]

I see you've already generated a list of these problematic entries. Is there anything stopping you from fixing them? It would be pretty trivial, for example, to replace {{clipping of}} with {{clipping}}. Ioaxxere (talk) 19:12, 2 March 2024 (UTC)[reply]

@Ioaxxere: I made that list by request, but nobody followed up with specifics on cleaning it up. If someone makes a list of def_template/ety_template pairs, and checks that each ety_template can handle the parameters used by def_template (or, if it doesn't, lets me know which of def_template's parameters need to be renamed or manually adjusted), I can run the bot to rename the templates. JeffDoozan (talk) 19:54, 2 March 2024 (UTC)[reply]
{{clipping of}} -> {{clipping}}
{{initialism of}} -> {{initialism}}
{{contraction of}} -> {{contraction}}
{{acronym of}} -> {{acronym}}
{{ellipsis of}} -> {{ellipsis}}
{{syncopic form of}} -> {{syncopic form}}
{{apocopic form of}} -> {{apocopic form}}
{{causative of}} -> {{causative}}
{{aphetic form of}} -> {{aphetic form}}
For others, we should either create new templates or subst them, but these ones can be done right away. Ioaxxere (talk) 17:49, 6 March 2024 (UTC)[reply]
@Ioaxxere: Not all of those templates support the same parameters, and many of them need punctuation fixes after being replaced. I'm not interested in writing the code to handle all of the possible corner cases. I configured the list to auto-update every few weeks, feel free to use it to apply fixes manually. JeffDoozan (talk) 14:01, 16 March 2024 (UTC)[reply]
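To make the parameter-compatibility concern concrete, here is a rough sketch of the kind of check that would be needed before renaming. The rename pairs are the ones listed above; the helper name and the example parameter sets are hypothetical, since the parameters each template really supports are not given in this discussion.

```python
# Rename pairs proposed in this thread (definition template -> etymology template).
RENAMES = {
    "clipping of": "clipping",
    "initialism of": "initialism",
    "contraction of": "contraction",
    "acronym of": "acronym",
    "ellipsis of": "ellipsis",
    "syncopic form of": "syncopic form",
    "apocopic form of": "apocopic form",
    "causative of": "causative",
    "aphetic form of": "aphetic form",
}


def unsupported_params(used, supported):
    """Return the parameters used in a call that the target template would not accept."""
    return set(used) - set(supported)


# Hypothetical example: a call uses |nocap=, which the (assumed) target does not support,
# so this call would need manual attention rather than a blind rename.
leftover = unsupported_params({"1", "2", "nocap"}, {"1", "2", "t"})
```

This only covers parameter names; the punctuation fixes mentioned above would still need separate handling.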

Please stop the bot: LOTS of errors

Category:Entries with redundant template: taxlink is full of cases of {{taxlink}} being applied when {{taxfmt}} is what is wanted. DCDuring (talk) 23:31, 7 March 2024 (UTC)[reply]

I blocked the bot to prevent error continuation and propagation. Let me know if you have a problem getting it restarted once it is fixed. DCDuring (talk) 23:35, 7 March 2024 (UTC)[reply]
@DCDuring: Thanks. I'll figure out what went wrong and clean up the affected pages when the block expires tomorrow. JeffDoozan (talk) 02:16, 8 March 2024 (UTC)[reply]
More errors are still appearing, I assume due to latency in updating category membership. DCDuring (talk) 02:40, 8 March 2024 (UTC)[reply]
The root cause was the bot not processing the |mul=1 in the {{taxlink}}s when building a database of existing, valid taxlinks. I fixed that and added a secondary check to avoid creating taxlinks for any bluelinks. Don't worry about the number of pages in the category, it's easy and fast to roll back the bot edits. JeffDoozan (talk) 03:01, 8 March 2024 (UTC)[reply]
@DCDuring: Please don't roll all these changes back by hand, it takes 30 seconds to do it with the bot. JeffDoozan (talk) 03:03, 8 March 2024 (UTC)[reply]
I started doing that, but I stopped a while ago. Now I'm just working on items AutoDooz hasn't touched: my usual. DCDuring (talk) 03:05, 8 March 2024 (UTC)[reply]
Then I cleaned out the category, because I couldn't stand it.
What is the state of your runs? I'm not seeing any large numbers of items that need {{taxfmt}} and I am seeing lots of items that have {{taxfmt}}. I will tell you about any items that I find with redlinks, blue links, or unlinked items.
Do you need anything specific from me, either urgently or long-term? DCDuring (talk) 19:41, 10 March 2024 (UTC)[reply]
@DCDuring: Everything the bot is able to mark as {{taxlink}} has been converted. Almost everything the bot can mark as {{taxfmt}} has been converted, except for text that occurs inside of "ja-r/multi", "ja-r/args", "gl", "gloss", "coi", "syn", "ngd", "cog", "syn of", "synonym of", "obs form", "obsolete form of", "suffix" templates, which I'll convert in a few days.
Lists of taxon things that may be of interest to you:
  • Category:Pages_using_bad_params_when_calling_Template:taxlink contains a list of all the pages where {{taxlink}} is used with unhandled parameters.
    If you go to Preferences, and scroll down to "Gadgets" and enable "Catch My Attention" and then go to any of the pages containing a {{taxlink}} with invalid params, you should see an alert on the page. Also, in the edit preview, there will be a red warning on taxlinks using bad params.
    Are |ver= and |nover= still used, or can I have the bot delete those parameters from the existing templates? If they're still used, I'll add an exception for them in the template so they don't fill up the bad params category.
    Leave nover; i.e., it is not an error. It is not necessary to create a category for it, but it is useful info. These were for taxa that I couldn't find or correct after a moderately diligent search. I have always intended to get back to them in my copious free time. As to ver= I have removed ~170 from a Chinese data module. Your previous work should have eliminated them directly in mainspace. DCDuring (talk) 01:44, 11 March 2024 (UTC)[reply]
  • User:JeffDoozan/lists/external_taxons/errors - list of taxlinks that share a taxon but have a mismatch in the other parameters
  • User:JeffDoozan/lists/local taxons/errors - list of L2 Translinguals that have multiple {{taxon}}
  • User:JeffDoozan/lists/ttp with other l2 - All pages containing an L2 Translingual section plus any other L2 section.
  • User:JeffDoozan/lists/taxons with redlinks - All redlinks found inside an L2 Translingual section.
    Almost all of the ones with initial lowercase letters are vernacular names. Almost all the ones with initial uppercase are taxa. I will work through the lowercase ones. DCDuring (talk) 01:53, 11 March 2024 (UTC)[reply]
    I thoroughly reviewed the "A"s on that list. I didn't find any item with an initial uppercase that was not a taxon. Let me go through the rest to confirm that hypothesis and delete those that are place names, personal names, etc. DCDuring (talk) 02:45, 11 March 2024 (UTC)[reply]
  • User:JeffDoozan/lists/possible taxons - list of italicized and/or redlinked items that might be taxons.
  • User:JeffDoozan/lists/approved taxons - manually curated list of taxons and ranks that the bot should handle that are not currently used in {{taxon}} or {{taxlink}}
The bot will automatically update all of those lists every time a new data dump is released, usually on the 1st and 20th of each month. Whenever there's new data, I'll have it apply {{taxlink}} and {{taxfmt}} to any new text that has been added.
The only pressing thing I need from you is what I should do with |ver= and |nover=. I hope the above lists are useful for cleanup tasks, but there's no urgency in any of that. Is there anything else you need from me to help with any remaining cleanup? JeffDoozan (talk) 20:26, 10 March 2024 (UTC)[reply]
I never mentioned that there are hundreds of instances of {{taxlink}} on linked taxonomic names in a Chinese data module. The other linked taxa should, perhaps in principle, get {{taxfmt}}ed. We may not ever have to do anything about them. I'm going to work through the "error" categories. If I see any straightforward patterns of frequent errors, I will let you know. DCDuring (talk) 01:34, 11 March 2024 (UTC)[reply]

Fixing non-Latin lookalike characters in transliterations

I've extracted some fixes where there are Cyrillic characters in otherwise Latin-script transliterations from the somewhat complete list of instances of non-Latin characters in transliteration. The JSON format is similar to the script code one, except I put the edit summary at the page level in the JSON to reduce the repetitiveness. It'd be hard to add the template names to the edit summary this time, but I can try to do that if you'd prefer. — Eru·tuon 23:32, 8 March 2024 (UTC)[reply]

 Done JeffDoozan (talk) 19:20, 9 March 2024 (UTC)[reply]

Hi. Your edit in Template:commons produces an error. “tabletable” now appears on all pages that use the template. Vivaelcelta (talk) 18:28, 10 March 2024 (UTC)[reply]

@Vivaelcelta: Thanks, fixed in diff JeffDoozan (talk) 18:35, 10 March 2024 (UTC)[reply]

AutoDooz on {{cite-book}}

Hey Jeff, AutoDooz running {{cite-book}}: removed trailing empty positional param is removing intentional line breaks from templates, like seen in this edit, instead of removing the duplicated ||, as I assume you intended. -- Sokkjō 00:36, 31 March 2024 (UTC)[reply]

Hi @Sokkjo, the bot's edits, although lacking in aesthetics, are the only safe way a superfluous | can be removed without affecting other parameters. In the diff you mentioned, removing just the | would have the unintended consequence of adding <!-- --> to the value of the previous parameter. In the case of {{cite-book}}, it doesn't matter if |publisher= or |page= suddenly have <!-- --> appended to their values, but that's not true of all parameters across all templates, so the bot removes everything from the erroneous | until the next | to be sure it's not introducing any errors into the template's output, even if that affects the visual aesthetic of the template source. It does check that any HTML comments removed contain only whitespace before removing the parameter, to ensure that it's not removing important information while doing the cleanup. JeffDoozan (talk) 17:48, 31 March 2024 (UTC)[reply]
Yes, I understand that the change did not break the template, but removing the formatting on a template is still an unintended edit your bot is making, and should be fixed and avoided. Formatting is not simply a "visual aesthetic" but helps an editor parse and modify a template. @Benwing2 -- Sokkjō 20:01, 31 March 2024 (UTC)[reply]
@Sokkjo @JeffDoozan In this case I agree with Victar; your bot code should be smart enough to handle cases like this correctly. Benwing2 (talk) 20:34, 31 March 2024 (UTC)[reply]
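For reference, the safety check described above (only dropping an empty positional parameter when any HTML comments in it contain nothing but whitespace) could be sketched roughly as follows. This is an illustration under assumptions, not the bot's actual code; the function name and regular expression are invented for the example.

```python
import re

# Matches an HTML comment and captures its body, including across line breaks.
COMMENT = re.compile(r"<!--(.*?)-->", re.DOTALL)


def is_safe_to_drop(segment):
    """True if the text between two pipes holds only whitespace
    and HTML comments whose bodies are themselves only whitespace."""
    for body in COMMENT.findall(segment):
        if body.strip():
            return False  # the comment carries real content, keep it
    # After stripping the (empty) comments, nothing but whitespace may remain.
    return COMMENT.sub("", segment).strip() == ""
```

A version that also preserved the line break after the removed parameter, as requested in this thread, would need to re-emit the trailing whitespace; that part is not shown here.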

invoke checkparams warn at Greek templates

Hello. The R AutoDooz adds this

{{#invoke:checkparams|warn}}<!-- Validate template parameters
-->

in many Greek templates: Category:Pages_using_bad_params_when_calling_Greek_templates
In one case, I found the mistake. But I cannot find anything at others.
What must we do? What is wrong? Which ones are bad parameters? Could someone help? Thank you. ‑‑Sarri.greek  I 00:41, 7 April 2024 (UTC)[reply]

Hi @Sarri.greek. You don't have to do anything, it's just a cleanup category for tracking unused parameters on template calls, for example adding |author=Homer to {{el-phrase}} won't have any effect, it just gets ignored by the template. If you want to clean them up, you can go to preferences, and then scroll down to "Gadgets" and enable the "Catch My Attention – display {{attention}} templates when reading entries" gadget to see a message on the page telling you which template has invalid parameters. You'll also see a warning when previewing your edit if there are any templates with unused parameters. In most cases, the source of the error is pretty obvious like typos or parameters that were once used but are no longer needed, but there are some complex cases where templates are called by other templates where the cleanup is not obvious. If you find any warnings that seem wrong or that you don't know how to fix, just give me a ping and I'll help as much as I can. JeffDoozan (talk) 01:10, 7 April 2024 (UTC)[reply]
You might also be interested in User:JeffDoozan/lists/template_params/errors where you see all of the errors together to decide which pages are worth visiting. JeffDoozan (talk) 01:29, 7 April 2024 (UTC)[reply]
Thank you! I will go to User:JeffDoozan/lists/template_params/errors because I never use 'Gadgets', I do not know what they do. Sometimes, at inflectional Templates, such unused parameters are provisional for optional change of Template (e.g. X template does not use them, but if one switches to another declension, they would be needed). ‑‑Sarri.greek  I 03:11, 7 April 2024 (UTC)[reply]
It's great! that you show all typos. I will gradually do all mistakes concerning Modern templates. For some, I can see the first 10 that you show, but there are more! I hope your clever robot will go on adding them! -but it will take some time to do them all. Please, if you wish, give me instructions of how to mark done and finished, or 'done and please add more'. Thank you very much for this assistance. ‑‑Sarri.greek  I 07:29, 7 April 2024 (UTC)[reply]
@Sarri.greek: I'm glad it's helpful. I re-generated the page with full entries for all of the Greek templates so you should now see everything. It looks like {{el-adj-form}} and {{el-noun-form}} have a lot of uses where |1= or |2= is a transliteration, probably from a time before the templates could do the transliteration automatically, so I filtered out all of those to just focus on the real errors. If you think it's okay to automatically remove all of the transliterations, I can have the bot remove them. You don't need to mark anything as "done" unless it's helpful for you, the page will be refreshed every 2-3 weeks and anything that has been fixed during that time will automatically be removed from the lists. JeffDoozan (talk) 14:58, 7 April 2024 (UTC)[reply]
Great, Thank you, Sir! It is so good, to work in pairs! you know programming, I know greek, ... wonderful. I had no way up to now, to see these mistakes. While I remove the dated transliterations, I add more things (ipa etc), because it is a good opportunity to review old lemmata.
If you have time -sorry to bother for more: I became greedy-. Do you know how to do statistics?, like Pages with modern Greek and Ancient Greek only or Pages with modern Greek, Ancient Greek and Ls'...? There are double, triple and quadruple pages (with dialects). If there is an easy way to do this could you just type the command, and I will add it at the WT:... of these languages. It is interesting to see total Greek (e.g. many ancient and modern words coincide). I am waiting for Medieval Greek to be added as a new L2. Thank you. ‑‑Sarri.greek  I 15:13, 7 April 2024 (UTC)[reply]
@Sarri.greek: For the statistics, I think you can search for incategory:"Greek lemmas" incategory:"Ancient Greek lemmas" to find pages with both Greek and Ancient Greek. I added parameter checking to {{el-adj-form}} and {{el-noun-form}} so those categories should start slowly filling up on Category:Pages_using_bad_params_when_calling_Greek_templates if you really want to manually fix all of the transliterations. JeffDoozan (talk) 15:44, 7 April 2024 (UTC)[reply]
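As a small illustration of the search suggested above, the query for any combination of categories can be assembled like this. The helper name is made up for the example, and the "Ancient Greek lemmas" / "Greek lemmas" category names are the ones quoted in the reply; the resulting string is meant to be pasted into the site search box.

```python
def incategory_query(categories):
    """Build a search string matching pages that are in every listed category."""
    return " ".join(f'incategory:"{name}"' for name in categories)


query = incategory_query(["Greek lemmas", "Ancient Greek lemmas"])
```

Adding a third category name to the list narrows the result to pages that have all three.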

merged pages= into page=

I strongly disagree with the "merged pages= into page=" change. Was this discussed anywhere? I can't find a discussion about this.

If you look at the help text for cite-book, you see that these parameters are clearly distinct (the help text of cite-journal isn't so clear, it seems): "The page number or range of page numbers referred to. The parameters page and pages can be used together to indicate that the citation refers to, for example, “page 3 of 10”."

IMHO having "page" (specific page reference) and "pages" (range of pages for the whole text) as distinct parameters makes a lot of sense. I don't see any justification for merging them into one, as your bot did. tbm (talk) 04:39, 9 April 2024 (UTC)[reply]

Hi, @tbm! This is part of a bigger project to make the cite- templates share the same parameters and programming as the quote- templates because the quote- templates are better documented and support more advanced features. Unfortunately the use of |pages= is one of the few places where the two template families act differently: in the quote- templates |pages= is used when a quoted passage spans multiple pages and is used in place of |page=, while in cite- templates it is sometimes used the same way, and sometimes used in combination with |page= to indicate a larger range. It's only this last case, which is very useful but not widely used, where the bot made any changes, where it merged "of PAGES" into the existing |page= value to achieve the same output without using both values simultaneously. If you think it's important to maintain this as a separate parameter, I can add support for a new parameter that could be used by both the cite- and quote- templates (and re-do the bot edits to use the new parameter). What do you think? Do you have any suggestions for names for the new parameter, maybe page_range? JeffDoozan (talk) 13:53, 9 April 2024 (UTC)[reply]
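To show what the merge described above amounts to, here is a minimal sketch that only touches calls using both parameters together and folds "of PAGES" into the |page= value. The function name and the dict representation of template parameters are assumptions for illustration, not the bot's actual code.

```python
def merge_pages_into_page(params):
    """Fold |pages= into |page= as 'PAGE of PAGES', only when both are present."""
    merged = dict(params)
    if "page" in merged and "pages" in merged:
        merged["page"] = f"{merged['page']} of {merged.pop('pages')}"
    return merged


# A call with both parameters is merged; a call with only |pages= is left alone.
both = merge_pages_into_page({"page": "3", "pages": "10"})
only_pages = merge_pages_into_page({"pages": "3-5"})
```

If a separate parameter such as the page_range floated above were adopted instead, the same function could rename |pages= rather than fold it into |page=.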

"merged changes from Module:User:JeffDoozan/quote"

We now have a module error at 肋巴骨 because Module:quote is looking in your userspace instead of the template namespace. Please fix. Chuck Entz (talk) 17:37, 12 April 2024 (UTC)[reply]

 fixed, thanks for the heads-up. The calls to my userspace are temporary while I cleanup invalid params on existing cite- template uses. JeffDoozan (talk) 17:45, 12 April 2024 (UTC)[reply]

Changing T:cite-* |1= to |lang=

Where was it decided to change T:cite-* |1= to |lang=? I have to say I disagree with this change thoroughly. -- Sokkjō 02:59, 13 April 2024 (UTC)[reply]