Wiktionary talk:Todo

Old discussions have been archived to Wiktionary talk:Todo/archive.

reciprocal links

Sometimes existence of a link in one direction should imply existence in the other:

  • homophones — should always reciprocate, though this is not bottable, as it might be accent-specific
  • rhymes — the entry and the Rhymes: page should always reciprocate, though this is not bottable, as it might be accent-specific
  • {{also}} — should usually reciprocate (whether it links to another entry or to a forms-of appendix)
  • 'nyms and related terms — should usually reciprocate, though this is not bottable, as it might be sense-specific
  • derived terms — where listed as derived at [[foo]] should also list foo in the etymology; not usually bottable, as explanation is needed in the etymology, but perhaps if there is no etymology section at all then one can be added listing just the word?
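
A rough sketch of how a reciprocity check might work for one link type. It assumes the links have already been extracted into a plain dict mapping each entry to the set of entries it links to; the function name and the sample data are made up for illustration, and nothing here decides whether a missing backlink is actually an error.

```python
def missing_reciprocals(links):
    """Return sorted (source, target) pairs where source links to target
    but target does not link back to source."""
    missing = []
    for source, targets in links.items():
        for target in targets:
            if source not in links.get(target, set()):
                missing.append((source, target))
    return sorted(missing)


# Hypothetical sample data: {{also}} links per entry.
also_links = {
    "resume": {"résumé", "resumé"},
    "résumé": {"resume"},
    "resumé": set(),
}

for source, target in missing_reciprocals(also_links):
    print(f"[[{target}]] does not link back to [[{source}]]")
```

For accent-specific or sense-specific link types, the output is only a list for human review, in line with the "not bottable" caveats above.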

Any others?​—msh210 (talk) 19:19, 2 September 2010 (UTC)

(Note that although I said "also should usually reciprocate", I elsewhere questioned how usual the "usually" actually is.​—msh210 (talk) 17:33, 3 September 2010 (UTC))
I'm struggling to think of specific examples, but for related terms I sometimes use see {{term|foo|lang=foo}} to avoid repetition, something like dogmatically could link to dogma, as a specific example. Mglovesfun (talk) 17:37, 3 September 2010 (UTC)

Entries only in Category:Idioms and no others

Such a list would allow us to find a fair few of the entries lacking POS categories. Mglovesfun (talk) 14:42, 7 December 2010 (UTC)

Robert Ullmann's lists

As I imagine RU's analyses won't be run anytime soon, I've looked through his subpages for cleanup lists that we might want to independently generate. He had other projects, several aimed at finding missing entries, but I'll leave those for others. I've made a rough list of those I think we should try and replicate, and those I'm not sure about.


Anyone want to tackle any of these? I think I can do the L2/invalid one without too much hassle. Done. --Bequw τ 15:37, 29 August 2011 (UTC)

Mglovesfun's lists

If anyone wants to tackle any of the subpages of User:Mglovesfun/to do‎, please do. I'll be around a lot less so a lot of these lists may never get done unless someone else fixes a few entries. Mglovesfun (talk) 22:03, 24 September 2011 (UTC)

IPA cleanup things

Until just a moment ago, our edittools wrongly contained a non-IPA g in [g̊]. It could be corrected to [ɡ̊]. (It could be corrected straightaway without a list; there is no reason why a g in text should have a voiceless symbol.) A list could also be made of g (the non-IPA "g") in IPA sections (there may be valid uses of it, e.g. in refs). - -sche (discuss) 03:27, 29 August 2012 (UTC)

A bot could also convert instances of .ˈ and .ˌ, and if they exist even ˈ. and ˌ., to ˈ and ˌ, if this is indeed policy (to not mark syllable breaks with dots where there is already a stress marker). - -sche (discuss) 03:30, 29 August 2012 (UTC)
A bot could also convert diphthongs like /aɪ̯/ to /aɪ/ (especially in German entries?), if and only if the latter is (as I think) the preferred broad transcription format. - -sche (discuss) 07:06, 30 August 2012 (UTC)
I don't know if there is a way to catch this sort of thing, but: accents entered without {{a}}. - -sche (discuss) 21:53, 6 October 2012 (UTC)
from            to
g (U+0067)      ɡ (U+0261)
ε (U+03B5)      ɛ (U+025B)
ǝ (U+01DD)      ə (U+0259)
  • Also from AF's talk: 'Let's not forget the colon ":" versus the IPA colon "ː".' - -sche (discuss) 05:20, 4 January 2013 (UTC)
  • And: 'dotless-i ı (U+0131) should be corrected to small capital i ɪ (U+026A).' (Does anyone write the former meaning the latter? *Shudder* I'll try to handle this myself soon.) - -sche (discuss) 05:26, 4 January 2013 (UTC)
  • And ' (apostrophe) to ˈ (primary stress). That one might need a human to look it over, because in languages like Georgian or Tigrinya it might see erroneous use for ʼ (ejective marker). But a bot can at least do it automatically for the Germanic and Romance languages, which don't have ejective consonants. —Μετάknowledgediscuss/deeds 05:35, 4 January 2013 (UTC)
    Loads of Catalan entries are using the comma instead of the primary stress mark. That's what brought me here. Mglovesfun (talk) 11:43, 18 February 2013 (UTC)
Using AWB, I've found 130 entries where {{IPA|...}} contains ε, ı, ǝ or :, and started replacing them. - -sche (discuss) 03:45, 18 February 2013 (UTC)
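
The character substitutions discussed in this section could be sketched roughly as follows. This is a minimal illustration, not the AWB run described above: it assumes the transcription sits in unnamed parameters of a flat {{IPA|...}} call (no nested templates), leaves named parameters such as lang= alone, and does not attempt the apostrophe or comma cases, which need a human. The function names are made up.

```python
import re

# Look-alike characters and their IPA counterparts, as listed in this section.
LOOKALIKES = str.maketrans({
    "g": "\u0261",       # g (U+0067) -> IPA g (U+0261)
    "\u03b5": "\u025b",  # Greek epsilon -> open-mid e
    "\u01dd": "\u0259",  # turned e -> schwa
    "\u0131": "\u026a",  # dotless i -> small capital i
    ":": "\u02d0",       # colon -> IPA length mark
})

IPA_TEMPLATE = re.compile(r"\{\{IPA\|([^{}]*)\}\}")


def fix_ipa_argument(arg):
    # Named parameters such as lang=de are left untouched.
    if "=" in arg:
        return arg
    arg = arg.translate(LOOKALIKES)
    # Drop a syllable dot directly next to a primary (U+02C8) or
    # secondary (U+02CC) stress mark.
    arg = re.sub(r"\.([\u02c8\u02cc])", r"\1", arg)
    arg = re.sub(r"([\u02c8\u02cc])\.", r"\1", arg)
    return arg


def fix_ipa_templates(wikitext):
    def repl(match):
        args = match.group(1).split("|")
        return "{{IPA|" + "|".join(fix_ipa_argument(a) for a in args) + "}}"
    return IPA_TEMPLATE.sub(repl, wikitext)
```

Text outside {{IPA}} is not touched, so an ordinary "g" or colon in running text stays as it is.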

Sense used instead of context, qualifier or gloss

Another thing someone could check for: {{sense}} used in definition lines, like this. {{qualifier}} or {{gloss}} (or, if sense is at the start of the line, {{context}}) should be used instead. - -sche (discuss) 20:36, 30 September 2012 (UTC)

I've even found {{head}} and {{infl}} used where someone just wanted a formatted link. DCDuring TALK 23:59, 2 October 2012 (UTC)
Ugh. - -sche (discuss) 01:47, 3 October 2012 (UTC)
{{a}} makes its way into the mix, too: [1]. - -sche (discuss) 05:08, 15 September 2014 (UTC)

Fuzzy task: find SOP phrases linked-to as if they weren't

In [[dresser]], I just unlinked {{t-|it|persona che veste in un certo modo}}. I wonder if other mistakenly-linked SOP phrases could be found by a search for any redlink {{t-}} term (redlink and {{t-}} because neither we nor the other Wiktionaries are likely to have entries for such phrases) containing, say, more than two spaces. Obviously, this is "fuzzy" and would also find some valid translations of idioms, etc. Perhaps that could be reduced by excluding translations of Idioms, Proverbs or Phrases. - -sche (discuss) 23:47, 2 October 2012 (UTC)
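
A minimal sketch of the "more than two spaces" search, run over raw wikitext. It only inspects {{t-}} calls and cannot tell whether the target is actually a redlink, so that part is assumed rather than checked; the function name and threshold parameter are illustrative.

```python
import re

# {{t-|lang|term|...}}: capture the language code and the term.
T_MINUS = re.compile(r"\{\{t-\|([^|{}]+)\|([^|{}]+)[^{}]*\}\}")


def long_translations(wikitext, min_spaces=3):
    """Yield (lang, term) for {{t-}} terms with at least min_spaces spaces."""
    for lang, term in T_MINUS.findall(wikitext):
        if term.count(" ") >= min_spaces:
            yield lang, term


line = "* Italian: {{t-|it|persona che veste in un certo modo}}, {{t-|it|comò|m}}"
for lang, term in long_translations(line):
    print(lang, term)
```

Excluding entries whose headword is an idiom, proverb or phrase would need the entry's POS headers as well, which this sketch does not look at.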

Convert parentheses in Tbot entries to Gloss

Would it be possible/desirable to automatically convert parenthetical glosses in Tbot entries to use {{gloss}}? Perhaps only those entries with one set of parentheses per sense line could be converted, and any with more could be flagged for human eyes (which might decide to run a second bot pass on them all, lol). As usual, I'm just throwing ideas out here as they come to me. - -sche (discuss) 20:09, 5 January 2013 (UTC)
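
One way the "convert if exactly one set of parentheses, otherwise flag" rule might look. This is a sketch under assumptions: it works line by line on sense lines starting with "#", does not check that the entry is really a Tbot entry, and does not handle glosses containing "|" or "=", which would need escaping inside a template.

```python
import re

PAREN = re.compile(r"\(([^()]*)\)")


def convert_gloss(line):
    """Return (new_line, needs_review) for one line of wikitext."""
    if not line.startswith("#"):
        return line, False
    groups = PAREN.findall(line)
    if len(groups) == 1:
        new_line = PAREN.sub(lambda m: "{{gloss|" + m.group(1) + "}}", line)
        return new_line, False
    # No parentheses: nothing to do. Several: leave for human eyes.
    return line, len(groups) > 1
```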

Find language codes used in wikilinks

Example of the problem: [2]. —Μετάknowledgediscuss/deeds 04:44, 15 January 2013 (UTC)

Maybe this could be fixed as part of a larger effort to standardize links with {{l}} or some other template. DTLHS (talk) 06:36, 15 January 2013 (UTC)

Fix {homophones} bot error

SemperBlottoBot batch-loaded a bunch of French verb forms a while back that didn't have lang=fr specified in {{homophones}}. An example of the problem can be found here. Perhaps this could be added automatically like what Autoformat currently does for {{IPA}}? —Μετάknowledgediscuss/deeds 19:27, 16 February 2013 (UTC)
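
A sketch of the fix, assuming it is only ever run on the French section of the affected entries (the code does not check which language section it is in) and that the {{homophones}} call contains no nested templates. The function name is made up.

```python
import re

HOMOPHONES = re.compile(r"\{\{homophones\|([^{}]*)\}\}")


def add_lang(wikitext, lang="fr"):
    """Append lang=<lang> to {{homophones}} calls that have no lang= yet."""
    def repl(match):
        args = match.group(1).split("|")
        if any(a.strip().startswith("lang=") for a in args):
            return match.group(0)  # already specified, leave as is
        return "{{homophones|" + "|".join(args + ["lang=" + lang]) + "}}"
    return HOMOPHONES.sub(repl, wikitext)
```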

Redundant transliterations

It seems like nobody pays attention to this page anyway, but I was wondering if someone can figure out how to bot-remove redundant Armenian and Old Armenian transliterations like the ones here. —Μετάknowledgediscuss/deeds 22:49, 16 November 2013 (UTC)

Common misspellings

A great many misspellings occur in our entries, even in headers. See Wiktionary:Todo/Misspellings, and add to it. Then, we can periodically search for and eliminate instances of the listed misspellings. - -sche (discuss) 05:40, 2 March 2014 (UTC)

Etymology 2 without Etymology 1

It would be worthwhile to periodically check if there are entries like this, i.e. entries which

  1. have an ===Etymology 2=== without an ===Etymology 1===, or indeed have any number higher than 1 without also having all the numbers that lead up to it, or
  2. have an ===Etymology #=== section with L3 (rather than L4) POS sections in it.

I just fixed all current instances of the first problem. - -sche (discuss) 09:15, 29 March 2014 (UTC)
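
Both checks could be sketched as below. The assumptions: the input is the wikitext of a single language section, headers sit alone on their lines, and the set of POS header names is a small illustrative subset rather than the full list.

```python
import re

ETYM = re.compile(r"^===\s*Etymology (\d+)\s*===\s*$", re.MULTILINE)
L3_HEADER = re.compile(r"^===\s*([^=]+?)\s*===\s*$")
POS = {"Noun", "Verb", "Adjective", "Adverb"}  # illustrative subset only


def missing_etymology_numbers(section):
    """Return the numbers skipped below the highest Etymology N present."""
    numbers = {int(n) for n in ETYM.findall(section)}
    if not numbers:
        return []
    return sorted(set(range(1, max(numbers) + 1)) - numbers)


def l3_pos_under_numbered_etymology(section):
    """Return level-3 POS headers that follow a numbered Etymology header."""
    found = []
    in_numbered = False
    for line in section.splitlines():
        if ETYM.match(line):
            in_numbered = True
            continue
        header = L3_HEADER.match(line)
        if header and in_numbered and header.group(1) in POS:
            found.append(header.group(1))
    return found
```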

Stray spaces

Stray spaces appear in a number of predictable/easily-findable circumstances, such as this. - -sche (discuss) 06:55, 13 July 2014 (UTC)

Instances of people starting with pl2= or other numbered parameters >1

According to TemplateTiger, this was the only example of someone using {{en-proper noun}} and starting with pl2= rather than with the unnamed first parameter. It might be fruitful to check if other templates have been used in the same way, i.e. with a second plural form ("pl2=") declared before any first plural was declared, particularly if (as here) no first plural is automatically displayed, and/or pl2= is set to whatever the automatically displayed plural would be. - -sche (discuss) 19:23, 1 August 2014 (UTC)
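
Without TemplateTiger, a similar search over wikitext might be sketched like this. It assumes flat template calls (no nesting) and treats "no unnamed parameter at all" as "no first plural declared", which is how {{en-proper noun}} is described above; whether that holds for other templates is not checked. The default tuple of template names is only an example.

```python
import re

TEMPLATE = re.compile(r"\{\{([^|{}]+)\|([^{}]*)\}\}")


def pl2_without_first(wikitext, templates=("en-proper noun",)):
    """Return template calls that set pl2= but have no unnamed parameter."""
    hits = []
    for name, body in TEMPLATE.findall(wikitext):
        if name.strip() not in templates:
            continue
        args = body.split("|")
        has_pl2 = any(a.strip().startswith("pl2=") for a in args)
        has_positional = any("=" not in a and a.strip() for a in args)
        if has_pl2 and not has_positional:
            hits.append("{{" + name + "|" + body + "}}")
    return hits
```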

Tool for manually finding misspelt or unsupported parameters

TemplateTiger is a good tool for finding which entries use certain parameters of a given template, regardless of whether or not the template supports those parameters. Among other things, this can allow one to find misspelt or mistaken parameters, like the "compound=" or "current=" parameters formerly used in save-all and barrel roll, or the following misspellings of "head=" : "head]", "haed", "hwad", "heead". - -sche (discuss) 20:35, 1 August 2014 (UTC)
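
For comparison, a rough offline equivalent using Python's standard difflib. It assumes the caller supplies the list of parameter names a template supports (the list used in the example is hypothetical, not the real parameter list of any template) and that template calls are flat.

```python
import difflib
import re

TEMPLATE = re.compile(r"\{\{([^|{}]+)\|([^{}]*)\}\}")


def unknown_parameters(wikitext, known):
    """Yield (template, parameter, suggestion) for named parameters not in
    `known`; suggestion is the closest known name, or None."""
    for name, body in TEMPLATE.findall(wikitext):
        for arg in body.split("|"):
            if "=" not in arg:
                continue  # unnamed parameter
            param = arg.split("=", 1)[0].strip()
            if param in known:
                continue
            close = difflib.get_close_matches(param, known, n=1)
            yield name.strip(), param, close[0] if close else None


# Hypothetical list of supported parameters, for illustration only.
supported = ["head", "lang"]
sample = "{{en-noun|haed=barrel roll|compound=y|head=z}}"
for hit in unknown_parameters(sample, supported):
    print(hit)
```

A parameter with a suggestion is probably a typo; one without is more likely an unsupported or obsolete parameter like "compound=".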

Middle dots as decimal points

Doremítzwr seems to have used middle dots as decimal points(??) in some entries, e.g. the depth measurement here. These should be located and cleaned up. - -sche (discuss) 03:59, 20 August 2014 (UTC)
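
Locating these could be as simple as searching for a middle dot (U+00B7) between two digits. A minimal sketch, with a made-up function name; it reports lines for review rather than replacing anything, since the middle dot has legitimate uses elsewhere (e.g. Catalan l·l).

```python
import re

# A digit, a middle dot (U+00B7), then another digit.
MIDDOT_DECIMAL = re.compile(r"\d\u00b7\d")


def lines_with_middot_decimals(wikitext):
    """Return the lines that contain a middle dot between two digits."""
    return [line for line in wikitext.splitlines()
            if MIDDOT_DECIMAL.search(line)]
```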

Look for brackets in the displayed text of pages

If someone could examine the displayed text of pages (as opposed to the wikitext) and look for instances of {{, [{, {[, [[, }}, ]}, }], or ]], that would probably be informative. I imagine most occurrences of such strings are the result of mismatched brackets or bot-errors breaking templates across lines. - -sche (discuss) 20:11, 27 August 2014 (UTC)

I suppose one would have to have some sort of local wiki markup parser to do this. DTLHS (talk) 20:36, 27 August 2014 (UTC)
Is mwparserfromhell of use? - -sche (discuss) 20:54, 27 August 2014 (UTC)
I believe mwparserfromhell will only give you valid templates / links- it's not going to tell you if something is malformed. DTLHS (talk) 01:04, 28 August 2014 (UTC)
Wiktionary:Todo/bad links. I just looked for lines where the number of occurrences of "[[" doesn't match that of "]]". Technically links can extend over multiple lines (but they probably shouldn't). Looking for malformed templates is much harder. DTLHS (talk) 03:15, 29 August 2014 (UTC)
That looks like a very useful list; thank you! I've cleaned up a few entries already. One idea I may suggest in the GP is that we try to make an abuse filter that tags edits that leave a page with more [[s than ]]s or {{s than }}s, to alert us to new instances. I think abuse filters can do that; there's one that warns people against <ref> without <references/>. - -sche (discuss) 06:52, 29 August 2014 (UTC)
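
The two counting ideas in this thread might be sketched as follows: a per-line comparison of "[[" and "]]" (as used for the bad links list) and a whole-page surplus count for both links and templates (closer to the abuse filter idea, since templates often span lines). This is plain string counting on wikitext, not a parser, and the function names are made up.

```python
def unbalanced_link_lines(wikitext):
    """Return (line_number, line) pairs where "[[" and "]]" counts differ."""
    hits = []
    for number, line in enumerate(wikitext.splitlines(), start=1):
        if line.count("[[") != line.count("]]"):
            hits.append((number, line))
    return hits


def page_bracket_surplus(wikitext):
    """Return how many more openers than closers the page has."""
    return {
        "[[": wikitext.count("[[") - wikitext.count("]]"),
        "{{": wikitext.count("{{") - wikitext.count("}}"),
    }
```

A nonzero surplus only says that something is off somewhere on the page; balanced counts do not prove the markup is well formed.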

Italian plurals reinterpreted by bots as singulars

In addition to the obvious problem with diff being broken across multiple lines, a user has just pointed out that it also took what was clearly labelled a plural form and incorrectly labelled it a singular. I don't know if there are more entries like this out there. - -sche (discuss) 21:01, 26 January 2015 (UTC)

I think it might be easier just to redo all of the Italian verb forms- there have been so many bots doing different things I can't imagine there's much consistency any more. DTLHS (talk) 21:03, 26 January 2015 (UTC)
I've just searched the site for all pages containing both "third-person singular imperative of" + "second-person plural present of", and fixed the pages I found. In doing so, I noticed that some lines still haven't been templatized, e.g. "feminine plural past participle of contaminare" and "of dare" and "of impregnare", which don't even contain a wikilink to the lemma. - -sche (discuss) 21:17, 26 January 2015 (UTC)

Commas after {{circa}}

A bot could check for and remove commas after {{circa}} (which itself adds a comma, making an additional comma superfluous), like so. - -sche (discuss) 19:43, 10 June 2015 (UTC)
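
A minimal sketch of such a replacement, assuming the superfluous comma directly follows the closing braces of a flat {{circa}} call (optionally after whitespace). Commas separated from the template by other markup are not handled, and the sample line is invented.

```python
import re

# {{circa}} or {{circa|...}} followed by optional whitespace and a comma.
CIRCA_COMMA = re.compile(r"(\{\{circa(?:\|[^{}]*)?\}\})\s*,")


def strip_circa_commas(wikitext):
    """Remove a comma that directly follows a {{circa}} call."""
    return CIRCA_COMMA.sub(r"\1", wikitext)
```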

Random excessive whitespace

Like this. - -sche (discuss) 21:47, 20 June 2015 (UTC)

Latin infinitives glossed as first-person forms

I've noticed several entries like this one, where the infinitive (not the first-person form) of a Latin word is given, but it is glossed as a first-person form. This is obviously incorrect regardless of whether one prefers to lemmatize infinitives or first-person forms. - -sche (discuss) 02:50, 29 June 2015 (UTC)

Untemplatized links to dictionaries

Should be found and templatized like [3]. I will try to do this myself. - -sche (discuss) 02:19, 7 July 2015 (UTC)

English terms spelled with Æ/Œ not marked as archaic/obsolete

For example, [4]. Some are valid (Æsir) but most are not. - -sche (discuss) 05:37, 30 July 2015 (UTC)

@-sche User:DTLHS/cleanup/english ae oe DTLHS (talk) 20:15, 20 August 2015 (UTC)
Thank you! If it's not too difficult, would it be possible to remove inflected forms of lemmas which also have Æ/Œ (e.g. œcologies, plural of œcology) — in such cases, it's sufficient that the lemma be marked; the plurals are generally not any more obsolete than the lemmas. Plurals of lemmas that don't contain Æ/Œ (e.g. cassiæ, plural of cassia) should stay on the list, since in those cases the plurals usually are more obsolete than other possible plurals. If that's too much bother, don't worry about it — I'll go through the entries on the list with AWB and can easily ignore œcologies-type entries. - -sche (discuss) 22:24, 20 August 2015 (UTC)
I don't really have an easy way to distinguish them, sorry. DTLHS (talk) 22:35, 20 August 2015 (UTC)
Since I’m responsible for a sizeable chunk of these, I feel obligated to express my regret that I’m making you clean these up. I was pretty ignorant and immature back then, but I realise now that I was acting inappropriately. --Romanophile (talk) 08:47, 23 August 2015 (UTC)

Miscapitalized labels

Discussion moved to Wiktionary:Grease pit/2015/August#Miscapitalized_labels.

RFC discussion: August 2014–July 2015

The following discussion has been moved from Wiktionary:Requests for cleanup (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.

accents which are not accents

Everything in Special:WhatLinksHere/Template:accent:Others should be changed to specify which "other" accents, I think. And Special:WhatLinksHere/Template:accent:not according to standard pronunciation should be shortened to "nonstandard", at least. And how do we feel about using {{a}} to specify part of speech? See Special:WhatLinksHere/Template:accent:adjective, Special:WhatLinksHere/Template:accent:adverb, Special:WhatLinksHere/Template:accent:singular, Special:WhatLinksHere/Template:accent:plural, Special:WhatLinksHere/Template:accent:noun, etc. Should these entries be switched to {{qualifier}}? - -sche (discuss) 08:46, 5 August 2014 (UTC)

The switch to a module has made it harder to track these, but they still exist and are problematic. In particular, quite a range of unstandardized labels are in use in German entries. It would be useful if someone could make a list of all accent labels which are in use, so that unusual ones could be standardized or (in the case of e.g. "Others") cleaned up. 07:15, 22 July 2015 (UTC)

RFC discussion: June 2015

The following discussion has been moved from Wiktionary:Requests for cleanup (permalink).

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.

1811 Dictionary words

Many words from the 1811 Dictionary of the Vulgar Tongue, which are generally findable/recognizable as such because they cite that dictionary and/or use the context label "1811", are labelled and categorized as "obsolete". (Indeed, the "1811" label inalienably includes an "obsolete" label.) In many cases, however, there are as many or more citations from the modern period (using the term to create a historical atmosphere) as from the historical period, such that the correct label seems to be "archaic". FYI. - -sche (discuss) 22:01, 15 June 2015 (UTC)