Definition from Wiktionary, the free dictionary
This page is for cleanup jobs. Request jobs are at Wiktionary:Task lists.

This page lists cleanup requests affecting multiple entries. These may include updating templates, categories or generic entry structure, but not specific terms, which should be tagged with {{rfc}} and put on WT:RFC. Therefore, tasks that have previously been divided across discussion and user pages are grouped together in one place where they are easier to find.

Regular tasks

Semi-regular tasks

Usually dump-analyzed:

  • Unhelpful abbreviations — These should use the full term.
  • Category:Limit of template reached.
  • Occasionally, people write {{w:Foobar}}; this should be {{w|Foobar}} (instances can be found by searching the database for {{w:).
  • Occasionally, soft hyphens or other invisible/zero-width characters (­|​|‌|‍) sneak into the content of entries or even the pagenames; the soft hyphens should be removed; the other characters should be discussed.
  • People sometimes type {[, }] etc when they mean {{ / }}. It is useful to periodically scan dumps for instances of this. Here is some regex: ([^\[\{]\[\{[^\[\{]|[^\[\{]\{\[[^\[\{]|[^\]\}]\]\}[^\]\}]|[^\]\}]\}\][^\]\}]). Simply searching for ]} will not work, because there are many valid instances of it, e.g. {{m|en|a [[link]]}}.
  • Every few months, check for instances of the common but nonstandard headers "Alternative form", "Alternative spelling" and "Alternative spellings" (which should be "Alternative forms") and "Usage note" (which should be "Usage notes"). Many other nonstandard headers exist, but none are as common as those. Also, no L1 headers should exist in the main namespace (language headers should always be L2, and all other headers should always be L3 or more). See User:Erutuon/mainspace headers for a full list of non-language headers and User:Erutuon/mainspace headers/possibly incorrect for a list of possibly incorrect headers.
  • Check for entries using modifier letters or deprecated IPA characters.
  • Search for (using the site search function) and fix "Etymology 2" -"Etymology 1" and other cases of higher-number etymologies without the full complement of lower-number etymologies.
  • Check for misindented quotations (pages with a line containing {{quote- but not starting with #* or ##*)
  • Check for mismatched labels (e.g. {{lb|de|...}} not inside ==German==)
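
A few of the dump checks above are simple line-level pattern matches. The following is a minimal Python sketch, assuming the wikitext of one page is already available as a string. The function name scan_page and the problem labels it returns are hypothetical; the bracket regex is the one quoted in the list above.

```python
import re

# Regex quoted in the list above: a mixed "{[", "[{", "]}" or "}]" pair
# that is not adjacent to another bracket or brace.
MIXED_BRACKETS = re.compile(
    r"([^\[\{]\[\{[^\[\{]|[^\[\{]\{\[[^\[\{]"
    r"|[^\]\}]\]\}[^\]\}]|[^\]\}]\}\][^\]\}])"
)
# Soft hyphen, zero-width space, zero-width non-joiner, zero-width joiner.
INVISIBLE = re.compile("[\u00ad\u200b\u200c\u200d]")
# A quotation template should sit on a line starting with "#*" or "##*".
QUOTE_PREFIX = re.compile(r"^#{1,2}\*")


def scan_page(wikitext):
    """Return a list of (line_number, problem) pairs for one page."""
    problems = []
    for number, line in enumerate(wikitext.splitlines(), start=1):
        if "{{w:" in line:
            problems.append((number, "w-colon"))
        if INVISIBLE.search(line):
            problems.append((number, "invisible-character"))
        # Pad with spaces so matches at the line edges are not missed.
        if MIXED_BRACKETS.search(f" {line} "):
            problems.append((number, "mixed-brackets"))
        if "{{quote-" in line and not QUOTE_PREFIX.match(line):
            problems.append((number, "misindented-quote"))
    return problems
```

Valid text such as {{m|en|a [[link]]}} is not flagged, because the regex requires that the mixed pair is not next to another bracket or brace.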

To be monitored manually:


Useful search queries

All subpages

Subpages of Wiktionary:Todo:


Updating Webster's definitions

User:Visviva/Cobwebs contains 4 lists of entries that have markers of insufficient editorial attention since import from Webster's 1913. It would be nice if, over the next 3 years, we could modernize these entries so that by the hundredth anniversary of the publication we could honor it as an inspiration for a modern dictionary instead of just relying on its often obsolete set of senses.

The markers that put the entries on the list should not be eliminated until the entry is thoroughly updated. The problems run the gamut: excessive reliance on "literary" usage examples, missing citation information, obsolete language, inclusion of related terms' definitions, missing senses, superseded etymologies.

There are many basic words on the General Service List - core words - that suffer these problems. DCDuring TALK 00:28, 2 December 2009 (UTC)

Someone could use Wiktionary:Abbreviated Authorities in Webster and AWB to fix ambiguous author citations. It would be even better if pedia links were included. --Bequw¢τ 05:18, 2 December 2009 (UTC)
That's treating a symptom. Should a literary style quote from 100-200 years ago (or more) appear in a long entry for a basic word (vs. on the Citations page)? The main disease is that many of our entries from Webster's, especially the long ones, have not had enough constructive contributor attention. They are more likely to get some additional hyperspecific sense added (poker, cricket, mycology, etc) than to get updated definitions. If cleanup only treats symptoms we will succeed in masking the serious disease. Or perhaps users don't need entries for such basic words or at least don't benefit from unabridged-dictionary-style treatment of them. DCDuring TALK 22:42, 2 December 2009 (UTC)
True. BTW have all the original pages been imported? What are all the Appendix:webster 1913:* pages doing? --Bequw¢τ 03:39, 4 December 2009 (UTC)
I have assumed that the pages have been imported. The remaining terms mostly don't seem a high priority. I wouldn't add that list here until we do some of the more important or urgent work. Perhaps we need a warning to discourage uncritical importing without updating the wording and checking for obsolescence. DCDuring TALK 11:34, 4 December 2009 (UTC)


Misemboldened material in quotes and usage examples

How hard would it be to create a clean-up list of instances of emboldened text in quotes and usage examples that are not identical to the headword, especially in the entry, but also on any citations page? Is this a task for Autoformat? DCDuring TALK 17:28, 18 January 2011 (UTC)

Not all bold text different from the headword is incorrectly bold. E.g., [[say]] has boldface said and says in quotes/usexes, and foreign words have boldface transliterations and translations.—msh210 (talk) 17:34, 18 January 2011 (UTC)
It would be lovely if the matching was against any inflected form as well.
How about bold text with a space when the headword is a single word? That would capture cases where cites were copied from an MWE to one or more component words without adjustment. DCDuring TALK 18:28, 18 January 2011 (UTC)
What's an MWE? (Not seeing it at [[MWE]] (redlink), [[Appendix:Glossary]], or [[Wiktionary:Glossary]].) In any event, I'm sure there are languages where a cited term can have a space even if citing a word without a space. (Perhaps "auf-" verbs in German, cited as ".. auf"?) I seem to recall seeing even some English entries that do this: cite a space-including form like dump truck for dumptruck. (We don't like those at RFV, but they're useful to show early use, especially where the version in the cite is a form-of entry.)​—msh210 (talk) 19:26, 18 January 2011 (UTC)
Multi-word entry. I picked it up from BP or RfD. It doesn't prejudice the case as to idiomaticity.
As a manual cleanup item we should be able to catch those. It is English entries that have the most usage examples and citations that might need the clean up. I don't really understand why we have separate entries (vs, say, redirects) for alternative forms, but conflate them all in entry-page citations. If we want to conflate them somewhere, why not in citation space with transclusion or redirection? We have a problem with the multiple purposes of citations: usage example for a sense, attestation of form, history of lemma. Citations that, say, show a headword to be an adjective are not the best for sense usage or history of lemma. DCDuring TALK 20:36, 18 January 2011 (UTC)
Yeah, if it's to be done manually, okay. The false negatives (positives? I mean badly bold entries) shouldn't overwhelm the true ones.​—msh210 (talk) 20:48, 18 January 2011 (UTC)
When I mentioned AF I was thinking of the problem-detection-and-marking function it or its successors have. DCDuring TALK 23:05, 18 January 2011 (UTC)
Not too many in Latin script: /bolded spaces in single-word entries. --Bequw τ 02:58, 19 January 2011 (UTC) (Slightly edited by msh210 18:59, 19 January 2011 (UTC).)
I don't see any instance of blanks in emboldened text in the first few I looked at. Of course, I am a little suspicious that they are all one-character entries. Is there a problem in the selection logic? DCDuring TALK 12:45, 19 January 2011 (UTC)
Should be fixed. It also now shows the matches so that you don't have to guess as much. --Bequw τ 16:44, 19 January 2011 (UTC)
Thanks. The 5000+ entries fits my expectations better. Showing the matches is a help. DCDuring TALK 17:25, 19 January 2011 (UTC)
I just went through five dozen or so Hebrew entries and discovered that virtually all of them have the "error" in a translation and were actually fine. Any way you can re-run the script, skipping boldfacing in #*:: and #::, please? (Or maybe there's a better way to omit translations.)​—msh210 (talk) 18:06, 19 January 2011 (UTC)
Much less serious, because visible from the page, are the instances of "more" and "most" in the inflection line, and dates in quotes. I am only going for what are probably English entries because the list doesn't seem to have a very high yield of real problems for non-English because of what msh210 describes. DCDuring TALK 18:50, 19 January 2011 (UTC)
"More"/"most" is good to list: the inflection^Wheadword line needs a template then. No?​—msh210 (talk) 18:56, 19 January 2011 (UTC)
The instances I saw were within templates. DCDuring TALK 01:56, 20 January 2011 (UTC)
Reran not matching those line prefixes (or ##:: or ##*::). Ones already removed were kept out. --Bequw τ 01:37, 20 January 2011 (UTC)
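
The selection logic discussed in this thread can be sketched roughly as follows. This is not the script that was actually run; it is a minimal Python illustration, assuming the page title and wikitext are available as strings. The function name bold_spans_with_spaces is hypothetical, and the skipped line prefixes are the ones mentioned in the thread.

```python
import re

BOLD = re.compile(r"'''(.+?)'''")
# Translation lines under quotations and usage examples, skipped in the re-run.
SKIPPED_PREFIXES = ("#*::", "#::", "##::", "##*::")


def bold_spans_with_spaces(title, wikitext):
    """Return bold spans containing a space, for single-word titles only."""
    if " " in title:
        return []
    hits = []
    for line in wikitext.splitlines():
        if line.startswith(SKIPPED_PREFIXES):
            continue
        hits.extend(m for m in BOLD.findall(line) if " " in m.strip())
    return hits
```

As noted in the thread, a hit is only a candidate for manual review: a spaced form such as dump truck cited under dumptruck may be intentional.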

Category:Pages with incorrect ref formatting

Found this by chance. Some of the entries tagged are in the main namespace. Mglovesfun (talk) 13:54, 31 January 2011 (UTC)



This shouldn't contain names of animals, as animal names aren't limited to zoology. Names of animals should be in more specific categories, such as Category:en:Animals. Mglovesfun (talk) 09:38, 3 August 2012 (UTC)

The problem with this as a cleanup list is that too many of the items that are in the category are supposed to be in the category. I sorted out most of the capitalized terms, all those I was sure belonged in Translingual. I may try with those with Latinate endings, in search of species epithets that should be Latin. DCDuring TALK 17:59, 14 August 2012 (UTC)
Or Translingual, there's no consensus on this, apart from in your head, but I for one don't care enough to take any action on the matter, and based on observation, nor does anyone else. Mglovesfun (talk) 21:35, 14 August 2012 (UTC)
No one else active seems to care about any of these, except in principle, and not so much that either. Someone can mass-change the Latin stuff if ever a consensus emerges in favor of making New Latin a dialect of Translingual. I mark the senses of the New-Latin-coined Latin terms with {{New Latin}} and try to define the sense using {{n-g}} and the word epithet, so it should be easy enough to identify them. At the moment they might be in English, Translingual (not so many now), or Latin, but most likely they are redlinks, yet to be defined, at least in the form used in species names. Maybe the forms should be defined as inflected forms of unattested New Latin lemmas. DCDuring TALK 22:17, 14 August 2012 (UTC)

/External links to Wikipedia

RuakhTALK 15:04, 30 September 2012 (UTC)

/Entries containing non-template contexts, etc

- -sche (discuss) 07:56, 6 October 2012 (UTC)

Regenerated, and split off cases where {{a}}, {{q}} or {{sense}} is used after #:
- -sche (discuss) 17:32, 20 May 2018 (UTC)

/Former name of

Most entries which contain the phrase "former name of foo" should be changed to use {{obsolete}}, {{historical}} and {{qualifier}} and/or {{defdate}} ([1], [2]). "Stalingrad" isn't the word for "a former name of Volgograd", it is a former name of Volgograd. The word for "a former name of Volgograd" might be an "exvolgogradonym" or something. - -sche (discuss) 08:53, 31 December 2012 (UTC)

Wiktionary:Todo/former name of DTLHS (talk) 06:54, 2 January 2013 (UTC)


/All senses

These entries contain the gloss "all senses", which should usually be replaced by an actual list of which senses. - -sche (discuss) 05:55, 24 May 2013 (UTC)

Thanks! I wish I'd've thought of this. Mglovesfun (talk) 10:33, 27 May 2013 (UTC)
As -sche indicates, there are times when it isn't worth the effort of splitting up synonyms etc. Doing so might make it easier for machines to read the information, but not for humans to either enter it or read it. DCDuring TALK 16:41, 20 June 2013 (UTC)
The thing is, when you add a sense to the English word, you are implicitly adding that sense to the foreign language word as well. Same applies to deleting a sense. Mglovesfun (talk) 16:45, 20 June 2013 (UTC)

/Latvian adjectives

See Wiktionary:Grease pit/2013/May#Bot_request:_Latvian_adjective_homographs_missing_second_headers. - -sche (discuss) 18:32, 25 May 2013 (UTC)

Pronunciation problems

At User:-sche/pronunciation problems, I have attempted to make a comprehensive list (additions welcome) of all possible problems which may exist in pronunciations sections, to aid those who would create lists of and fix entries which suffer from those problems.

This incorporates User:Robert Ullmann/Pronunciation exceptions (from June 2010), Wiktionary:Todo/non-standard pronunciation transcriptions (from July 2012) (which is finished, but for a few Egyptian entries, but could be re-run periodically), and /Entries containing obsolete IPA characters (from September 2012) (which is also finished, but should be re-run periodically).

- -sche (discuss) 06:27, 25 June 2013 (UTC)

/Non-templatised genders

These 3027 pages use ''f'', ''m'', ''n'', ''c'', ''p'' or ''pl''. In many cases, the pages could be updated to call {{g|f}}, etc. - -sche (discuss) 21:22, 3 July 2013 (UTC)

Regenerated. Now <200 entries. - -sche (discuss) 14:30, 19 February 2018 (UTC)
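
A list like this one could be regenerated with a small pattern match. The sketch below is only an assumption about how such a scan might look, not the method actually used; the name bare_genders is hypothetical. It only detects markers and leaves the decision to convert to {{g|...}} to the editor.

```python
import re

# ''f'', ''m'', ''n'', ''c'', ''p'' or ''pl'' in plain italics, not inside a
# longer run of apostrophes (which would be bold or bold italic markup).
BARE_GENDER = re.compile(r"(?<!')''(f|m|n|c|pl|p)''(?!')")


def bare_genders(wikitext):
    """Return (line_number, marker) for each bare italic gender marker."""
    hits = []
    for number, line in enumerate(wikitext.splitlines(), start=1):
        hits.extend((number, m) for m in BARE_GENDER.findall(line))
    return hits
```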

Wiktionary:Todo/Slovene masculine translations

> This is the list of entries, as of the last database dump, that contain Slovene translations with the gender m ("masculine"). They should most likely be changed to use either m-an (+ "animate") or m-in (+ "inanimate"), since that distinction has grammatical consequences in Slovene. (?)

RuakhTALK 14:34, 11 September 2013 (UTC)


/Mandarin translation not nested under Chinese

These are Mandarin translations not nested under the Chinese section. They probably have to be cleaned up manually. Matthias Buchmeier (talk) 18:17, 31 January 2014 (UTC)

I have updated the list from the recent dump. Matthias Buchmeier (talk) 10:58, 11 March 2017 (UTC)


/Pages containing LTR marks and /RTL marks

In many cases, these are unnecessary and cause problems. - -sche (discuss) 18:16, 21 January 2015 (UTC)

What are LTR marks and how should one improve the entry? --A230rjfowe (talk) 21:00, 15 July 2015 (UTC)
What are RTL marks and how should one improve the entry? --A230rjfowe (talk) 21:00, 15 July 2015 (UTC)
They are invisible characters that otherwise behave like strongly left-to-right characters (such as Latin letters) or strongly right-to-left characters (such as Arabic letters), in that they influence the direction of surrounding characters that do not have a defined text direction. So they are sometimes used to change the direction of characters in text. For instance, on Wiktionary, where text direction is generally left-to-right, punctuation characters can be forced to render right-to-left by sandwiching them between Arabic letters and a right-to-left mark.
But CSS should be used to change text direction instead, whenever possible. On Wiktionary, we do this by adding classes that have the correct CSS properties: for instance, enclosing Arabic text in class="Arab", which has the CSS direction: rtl; unicode-bidi: embed; applied to it in MediaWiki:Common.css. This is done automatically by most linking templates.
You can read more in w:Left-to-right mark and w:Right-to-left mark and w:Bidirectional text. — Eru·tuon 16:50, 16 October 2019 (UTC)
Regenerated. - -sche (discuss) 14:48, 19 February 2018 (UTC)
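
Because these marks are invisible, it may help to locate them programmatically before editing. Here is a minimal Python sketch, assuming the page wikitext is available as a string; the function name find_direction_marks is hypothetical. It reports where each mark sits so that an editor can decide whether it is needed.

```python
LRM = "\u200e"  # LEFT-TO-RIGHT MARK
RLM = "\u200f"  # RIGHT-TO-LEFT MARK


def find_direction_marks(wikitext):
    """Return (line_number, column, name) for every LRM or RLM found."""
    names = {LRM: "LRM", RLM: "RLM"}
    found = []
    for line_number, line in enumerate(wikitext.splitlines(), start=1):
        for column, char in enumerate(line, start=1):
            if char in names:
                found.append((line_number, column, names[char]))
    return found
```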

/Page with untemplatized etymologies

A partial list of pages where at least one language section simply states, in plain text, without using {{etyl}}, that it derives from German, French, Latin, Greek, Ancient Greek, Chinese or Spanish. - -sche (discuss) 17:43, 25 January 2015 (UTC)

Regenerated (1469 entries). - -sche (discuss) 14:44, 19 February 2018 (UTC)
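
A rough idea of how such a list might be produced is sketched below in Python. It rests on assumptions: that the plain-text statement is worded "from <language>", and that a line using {{etyl}} anywhere can be skipped. The name untemplatized_sources is hypothetical, and the real list may have been generated differently.

```python
import re

LANGUAGES = ("German", "French", "Latin", "Greek", "Ancient Greek",
             "Chinese", "Spanish")
# Longest names first so "Ancient Greek" is tried before "Greek".
PLAIN_SOURCE = re.compile(
    r"\b[Ff]rom (" + "|".join(sorted(LANGUAGES, key=len, reverse=True)) + r")\b"
)


def untemplatized_sources(wikitext):
    """Return language names stated in plain text on lines lacking {{etyl}}."""
    hits = []
    for line in wikitext.splitlines():
        if "{{etyl" in line:
            continue
        hits.extend(PLAIN_SOURCE.findall(line))
    return hits
```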

/Last line in trans table could benefit from xte

The last line of one of the translations tables or checktrans tables in each of these entries ends in ]] and could benefit from being adapted to use {{t}}, e.g. via the xte gadget. (More adept searching could catch instances where other lines would benefit from xte, but this is a decent start.) - -sche (discuss) 22:47, 9 February 2015 (UTC)
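
The check described here could be approximated as in the following Python sketch. It assumes that tables open with {{trans-top}} or {{checktrans-top}}, close with {{trans-bottom}} or {{checktrans-bottom}}, and may contain {{trans-mid}}; those template names are assumptions about the table layout, and the function name is hypothetical.

```python
def tables_ending_in_plain_link(wikitext):
    """Return the last line of each translation table if it ends in ']]'."""
    openers = ("{{trans-top", "{{checktrans-top")
    closers = ("{{trans-bottom", "{{checktrans-bottom")
    hits, last_line, inside = [], None, False
    for line in wikitext.splitlines():
        stripped = line.strip()
        if stripped.startswith(openers):
            inside, last_line = True, None
        elif stripped.startswith(closers) and inside:
            if last_line is not None and last_line.endswith("]]"):
                hits.append(last_line)
            inside = False
        elif inside and stripped and not stripped.startswith("{{trans-mid"):
            last_line = stripped
    return hits
```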

/North American

A list of entries which are labelled as being Canadian, or American, but not both. It is likely that many should in fact have both labels. See Wiktionary:Beer_parlour/2015/March#North_American_English_vs_Canadian_and_American_English for a bit of background. - -sche (discuss) 05:00, 7 March 2015 (UTC)
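
For illustration, a scan of this kind might look like the Python sketch below. The label spellings in CANADIAN and AMERICAN are hypothetical placeholders, since the actual label names and aliases are not given here, and the function name one_sided_labels is likewise invented for the example.

```python
import re

LABEL = re.compile(r"\{\{(?:lb|label)\|en\|([^{}]*)\}\}")
# Hypothetical label spellings; the real label names and aliases may differ.
CANADIAN = {"Canada", "Canadian"}
AMERICAN = {"US", "American"}


def one_sided_labels(wikitext):
    """Return label lists that mention Canada or the US but not both."""
    hits = []
    for match in LABEL.finditer(wikitext):
        labels = set(match.group(1).split("|"))
        if bool(labels & CANADIAN) != bool(labels & AMERICAN):
            hits.append(sorted(labels))
    return hits
```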

Erroneous Greek characters

Any place that the character ϕ is used in place of φ or ϑ in place of θ in a string that is marked as being grc or el should be listed so that an editor can look them over and fix mistakes. I just found one lying around in a {{term}}, which made me think that these shouldn't be overly hard to find. —Μετάknowledgediscuss/deeds 21:01, 12 May 2015 (UTC)

@Metaknowledge: Never knew this page existed. Ironically, I came across this while searching for incorrect uses of ϕ. For future reference, here is the search for ϕ and here is the search for ϑ (other incorrect characters are ϖ ϛ ϰ ϱ ϐ ϵ ϲ ϗ ȣ; there may be more). --WikiTiki89 13:20, 21 April 2017 (UTC)
If nothing has been done about this, I can make Module:script utilities search for these characters when it tags text, and add a tracking template or a category. — Eru·tuon 23:50, 20 May 2017 (UTC)
@Metaknowledge, Wikitiki89: Done. Eru·tuon 00:02, 21 May 2017 (UTC)
@Erutuon: It's never done, people will keep adding them. --WikiTiki89 15:03, 22 May 2017 (UTC)
Oh sorry, you were referring to having Module:script utilities search for them. It's not that nothing has been done, I went through and removed over a hundred of these. But again, people will keep adding them. --WikiTiki89 15:05, 22 May 2017 (UTC)
Right. I just found one in polypharmacy... 🙄 — Eru·tuon 18:14, 22 May 2017 (UTC)
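
The kind of check discussed in this thread can be illustrated outside of Lua as well. The Python sketch below is not the code in Module:script utilities; it is a hypothetical stand-alone version that assumes the text passed in is already known to be tagged grc or el. Only the two pairs named at the top of the thread get a suggested replacement; the other characters listed are flagged for manual review.

```python
# Symbol-variant code points and the standard letters they are mistaken for.
SUGGESTED = {"\u03d5": "\u03c6", "\u03d1": "\u03b8"}  # phi symbol, theta symbol
# Pi symbol, stigma, kappa symbol, rho symbol, beta symbol, lunate epsilon,
# lunate sigma, kai symbol, Latin ou: flagged without a suggestion.
REVIEW_ONLY = set("\u03d6\u03db\u03f0\u03f1\u03d0\u03f5\u03f2\u03d7\u0223")


def suspicious_greek(text):
    """Return (index, char, suggestion_or_None) for each suspect character."""
    hits = []
    for index, char in enumerate(text):
        if char in SUGGESTED:
            hits.append((index, char, SUGGESTED[char]))
        elif char in REVIEW_ONLY:
            hits.append((index, char, None))
    return hits
```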

Not click characters

All over the dictionary, e.g. in the name and content of !nawas and in this translation, ! turns up for ǃ, and I wouldn't be surprised to find other substitutions for click consonants. The best way I can think of to find such uses is: create a list of all languages that use clicks, or as a presumably easier-to-make approximation of that a list of all Khoisan languages, then search a database dump for all translations, language sections, and {{m}}/{{l}}s of those languages that contain !. I've just cleaned up the few pages which misused ! in their pagenames (only 31 pages on Wiktionary used ! in their pagenames at all). - -sche (discuss) 18:42, 25 August 2015 (UTC)
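
The search proposed here might be sketched in Python as follows. The set CLICK_LANGUAGES is a hypothetical placeholder for the list of click languages that would first have to be compiled, and the function name is invented. Only {{m}} and {{l}} links are covered; translations and language sections would need similar handling.

```python
import re

CLICK = "\u01c3"  # LATIN LETTER RETROFLEX CLICK, which resembles "!"
# Hypothetical placeholder for "all languages that use clicks".
CLICK_LANGUAGES = {"naq"}
LINK = re.compile(r"\{\{(?:m|l)\|([^|{}]+)\|([^|{}]+)")


def exclamation_in_click_terms(wikitext):
    """Return (lang, term, suggested_term) for linked terms containing '!'."""
    hits = []
    for lang, term in LINK.findall(wikitext):
        if lang in CLICK_LANGUAGES and "!" in term:
            hits.append((lang, term, term.replace("!", CLICK)))
    return hits
```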


Not really unisex names

At User:-sche/names are lists of entries that are in both 'male given names' and 'female given names' but not yet 'unisex given names' (they should also be in the third cat if they are in the first two), and also entries in 'unisex given names' that are not in both 'male given names' and 'female given names' (so they are missing a 'female'/'male given name' definition line, or are not really unisex). - -sche (discuss) 06:52, 19 May 2017 (UTC)

Regenerated. - -sche (discuss) 19:15, 26 January 2018 (UTC)
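
The two lists amount to simple set arithmetic on category membership. A minimal Python sketch, assuming the members of the three categories have already been collected into sets (the function name and argument names are hypothetical):

```python
def unisex_discrepancies(male, female, unisex):
    """Compare three category membership sets and return two problem lists."""
    both = male & female
    missing_unisex = sorted(both - unisex)     # in both, not yet in unisex
    not_really_unisex = sorted(unisex - both)  # in unisex, missing one side
    return missing_unisex, not_really_unisex
```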

Check IDs

As discussed at Wiktionary:Grease pit/2017/May § Adding ids to enable linking to headwords, we need to check for sense ids in {{senseid}} and the |id= parameter of headword templates that are on the same page and have the same language and have the same id string: that is, those that would create the exact link when input into an entry linking template. Each sense id for a given language on a given page should be unique. — Eru·tuon 16:57, 19 May 2017 (UTC)
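
One possible way to run this check on a single page is sketched below in Python. It assumes that {{senseid}} takes the language code and the id as its first two parameters, and that headword templates look like the hypothetical call {{head|en|noun|id=foo}}; other headword templates are not covered, and the function name duplicate_ids is invented.

```python
import re
from collections import Counter

SENSEID = re.compile(r"\{\{senseid\|([^|{}]+)\|([^|{}]+)\}\}")
# Assumes the language code is the first positional parameter of {{head}}.
HEAD_ID = re.compile(r"\{\{head\|([^|{}]+)\|[^{}]*?\bid=([^|{}]+)")


def duplicate_ids(wikitext):
    """Return (lang, id) pairs that occur more than once on the page."""
    pairs = SENSEID.findall(wikitext) + HEAD_ID.findall(wikitext)
    counts = Counter((lang.strip(), ident.strip()) for lang, ident in pairs)
    return sorted(pair for pair, n in counts.items() if n > 1)
```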

Usage note template naming

User:-sche/Usage note templates lists some usage-note templates which could be moved to fit our usual naming scheme, as described on the page and [3]. - -sche (discuss) 22:01, 26 May 2017 (UTC)

Possibly mislabeled affixes

Wiktionary:Todo/interfixes: These look like interfixes, but are labelled "prefixes" or "suffixes". - -sche (discuss) 19:57, 8 June 2017 (UTC)

Regenerated (per request on my talk page). Note that some, e.g. for Navajo, may be fine as they are. - -sche (discuss) 03:34, 15 February 2020 (UTC)
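
How the list was generated is not stated here. One plausible heuristic, sketched below in Python purely as an assumption, is to treat a page title with a hyphen at both ends as looking like an interfix and then report any Prefix or Suffix header on that page. The function name is hypothetical.

```python
import re

AFFIX_HEADER = re.compile(r"^=+\s*(Prefix|Suffix)\s*=+\s*$", re.MULTILINE)


def mislabeled_interfix(title, wikitext):
    """Return the Prefix/Suffix headers found on a page titled like '-o-'."""
    looks_like_interfix = (
        len(title) > 2 and title.startswith("-") and title.endswith("-")
    )
    if not looks_like_interfix:
        return []
    return AFFIX_HEADER.findall(wikitext)
```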

Pronunciation audio files

User:DerbethBot/Add manually: DerbethBot adds pronunciation files to entries, but some audio files need to be added manually. (See also User:DerbethBot for more info.) -- Curious (talk) 12:00, 11 June 2017 (UTC)



Entries where label language does not match entry language. – Jberkel 00:01, 28 February 2018 (UTC)
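
A check of this kind can be sketched in Python as below. The CODES mapping is a hypothetical three-entry excerpt; a real run would need the full table of language names and codes. Sections whose language is missing from the mapping are skipped instead of being reported.

```python
import re

# Hypothetical excerpt of the language-name-to-code mapping.
CODES = {"English": "en", "German": "de", "French": "fr"}
L2_HEADER = re.compile(r"^==([^=].*?)==\s*$")
LABEL = re.compile(r"\{\{(?:lb|label)\|([^|{}]+)\|")


def mismatched_labels(wikitext):
    """Return (section_language, label_code) pairs that disagree."""
    hits, section_code, section = [], None, None
    for line in wikitext.splitlines():
        header = L2_HEADER.match(line)
        if header:
            section = header.group(1).strip()
            section_code = CODES.get(section)
            continue
        for code in LABEL.findall(line):
            if section_code is not None and code != section_code:
                hits.append((section, code))
    return hits
```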

Category:Requests for quotation by source

A big category of pages which need a quote, but whose source author is mentioned, so it may be easier to quote them than other requests. --XY3999 (talk) 19:36, 28 August 2018 (UTC)

Terms not restricted to legal jargon

Quite a few entries with usage notes like this are labelled {{lb|en|law}}, but are in fact in general use and not at all restricted to legal jargon (so the label should be removed). - -sche (discuss) 00:10, 23 December 2018 (UTC)



Verbs missing a corresponding Conjugation section JeffDoozan (talk) 00:38, 17 December 2020 (UTC)
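
As a rough illustration, and not necessarily how this list was produced, the check could be written in Python as follows. It assumes the section headers are literally named Verb and Conjugation, and the function name is hypothetical. For languages where a Conjugation section is not expected, the result would need to be filtered.

```python
import re

HEADER = re.compile(r"^(=+)\s*(.*?)\s*=+\s*$")


def languages_missing_conjugation(wikitext):
    """Return language sections that have a Verb header but no Conjugation."""
    sections = {}  # language name -> set of sub-headers seen
    current = None
    for line in wikitext.splitlines():
        match = HEADER.match(line)
        if not match:
            continue
        level, name = len(match.group(1)), match.group(2)
        if level == 2:
            current = name
            sections[current] = set()
        elif current is not None:
            sections[current].add(name)
    return [lang for lang, heads in sections.items()
            if "Verb" in heads and "Conjugation" not in heads]
```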