Wiktionary:Misspellings

From Wiktionary, the free dictionary
This is a Wiktionary policy, guideline or common practices page. This is a draft proposal. It is unofficial, and it is unknown whether it is widely accepted by Wiktionary editors.

Wiktionary accepts common misspellings. These entries are intended to help users who search for a misspelling: rather than being met with a red link, they are directed to the correct spelling.

Another use of misspelling entries is by editors mining corpora for forms not yet covered by Wiktionary: they may appreciate having a common database of forms known to be misspellings, even if relatively rare ones. The use case is rather different from the first one.

Common

Almost any word can be misspelled in dozens of ways. For this reason, we only accept common misspellings. There are currently no criteria to establish what a "common" misspelling is. Misspellings have to meet Wiktionary:Criteria for inclusion just as much as any other entry, so some entries may be required to pass Wiktionary:Requests for verification.

Evidence to support commonness could include uses in reliable third-party published materials, such as books, magazines, leaflets and newspapers. Per our criteria, anything that is in "widespread use" should be included.

Frequency ratio test

This paragraph is not known to be universally accepted.

One test of what is a "misspelling" and what is a "common misspelling" is the frequency ratio test, which considers how common the misspelling is relative to the correct spelling. Compared to less common alternative spellings, misspellings tend to have poor frequency ratios, and rare misspellings even worse. For instance, Google Ngram Viewer shows a conceive/concieve frequency ratio of about 2500, still fine for a "common" misspelling, whereas the conceive/concive ratio is over 47 000, which would make concive a "rare" misspelling. However, this test is not a policy and is not universally accepted by Wiktionary editors. It works less well for hyphenated forms, since they are all too often scanned as solid forms.
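The arithmetic behind the frequency ratio test can be sketched in Python. This is only an illustration, not an agreed procedure: the function names and the cut-off of 10 000 are hypothetical (this page sets no threshold), and the counts are placeholders chosen to reproduce the ratios quoted above, not real Ngram data.

```python
def frequency_ratio(correct_freq: float, misspelling_freq: float) -> float:
    """Return how many times more frequent the correct spelling is."""
    if misspelling_freq <= 0:
        raise ValueError("misspelling frequency must be positive")
    return correct_freq / misspelling_freq


def classify_by_ratio(ratio: float, rare_threshold: float = 10_000) -> str:
    """Label a misspelling by its ratio.

    rare_threshold is a hypothetical cut-off used only for illustration;
    it merely lies between the two ratios quoted on this page.
    """
    if ratio >= rare_threshold:
        return "rare misspelling"
    return "common misspelling"


# Placeholder counts chosen so that the ratios match those quoted on the page.
concieve_ratio = frequency_ratio(2_500_000, 1_000)   # 2500.0
concive_ratio = frequency_ratio(47_000_000, 1_000)   # 47000.0

print(classify_by_ratio(concieve_ratio))
print(classify_by_ratio(concive_ratio))
```

Any real use would replace the placeholder counts with figures read off a corpus, and the cut-off would have to be agreed by editors.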

Another test of a common misspelling is how common it is relative to other misspellings. For instance, if concieve is accepted as common, we may plot concieve, (enthousiastic*5) in Google Ngram Viewer and conclude that enthousiastic is not hugely rarer. Plotting authoritive, concieve shows authoritive to be about as common as concieve. By contrast, plotting (acclamate*20), concieve shows acclamate to be on a different order of magnitude of frequency, so the acceptance of concieve lends it less support. To use this test, one would have to pick a benchmark; concieve is a candidate, but it may be too common and thus too high a bar to compare against.
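The scaled comparisons such as (enthousiastic*5) can be read as asking whether a candidate, multiplied by some factor, reaches the benchmark. A minimal Python sketch of that reading follows; the function name and the frequencies are hypothetical placeholders, not values taken from Google Ngram Viewer, and the choice of factor remains a judgment call.

```python
def comparable_to_benchmark(candidate_freq: float,
                            benchmark_freq: float,
                            scale: float) -> bool:
    """True if the candidate, multiplied by `scale`, reaches the benchmark.

    This mirrors plotting "(candidate*scale), benchmark" and checking
    whether the scaled candidate line is at or above the benchmark line.
    """
    return candidate_freq * scale >= benchmark_freq


# Placeholder frequencies in arbitrary units (not real corpus data).
benchmark = 10.0

# A candidate within a factor of 5 of the benchmark.
print(comparable_to_benchmark(3.0, benchmark, scale=5))

# A candidate that stays below the benchmark even when scaled by 20.
print(comparable_to_benchmark(0.25, benchmark, scale=20))
```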

One idea behind the use of frequency ratios in Google Books is that it reveals copyeditors voting, as it were, for what is incorrect by removing it during the editing process. What slips through their fingers is going to be rare.

The test is consistent with WT:CFI#Spellings: "There is no simple hard and fast rule, particularly in English, for determining which category (correct spellings, misspellings, variant spellings) a specific spelling belongs to. Published dictionaries, grammars, style guides and statistics can be useful guides in this regard but they are not necessarily binding." Note "statistics". A previous version had more explicit language: "statistics concerning the prevalence of various forms".

Absence from Google Ngram Viewer

This paragraph is not known to be universally accepted.

If an English variant spelling is not found in Google Ngram Viewer while the "correct" spelling is found, it is hard to claim it is a common misspelling, at least in absolute terms. Such an entry may still serve the purpose of tracking the misspelling for corpus miners.

Copyedited corpus test

This paragraph is not known to be universally accepted.

If a putative misspelling is not attested in a copyedited corpus such as Google Books and is attested only in Usenet or the like, that can support its being a misspelling.
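As a rough illustration, the copyedited corpus test amounts to a simple membership check. The sketch below assumes each corpus is already available as a set of attested forms; the function name and the sample forms are hypothetical, and a positive result is supporting evidence only, not a verdict.

```python
def copyedited_corpus_test(form: str,
                           copyedited_forms: set[str],
                           informal_forms: set[str]) -> bool:
    """True if `form` is attested only in the informal corpus.

    copyedited_forms stands in for a corpus such as Google Books,
    informal_forms for Usenet or the like.
    """
    return form in informal_forms and form not in copyedited_forms


# Hypothetical attestation sets, for illustration only.
copyedited = {"example", "conceive"}
informal = {"example", "exampel", "conceive"}

print(copyedited_corpus_test("exampel", copyedited, informal))
print(copyedited_corpus_test("example", copyedited, informal))
```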

Style guides

WT:CFI#Spellings mentions style guides as one cue for classifying spellings as misspellings or variant spellings. However, RFD discussions mentioning style guides are hard to find. Moreover, it is unclear how this would work: for instance, the GPO style manual favors micro-organism, but that does not make microorganism spelled solid a misspelling.

Precedent

Typos

Typos, e.g. amgydala, are excluded regardless of frequency per WT:CFI. As an aside, after 2000 this typo is on the same order of magnitude of frequency as concieve (compare concieve, amgydala at Google Ngram Viewer).

Obsolete spellings

Obsolete spellings such as musick are not marked as misspellings. They are misspellings from today's point of view, but were standard spellings at the time, and their being today-misspellings follows from their being obsolete.

Anomalous spellings

Anomalous spellings, those failing a pattern, are not misspellings.

In English, prefixing capitalized words nearly always retains the capital letter and adds a hyphen. But:

English hardly ever uses diacritics such as diaeresis or acute accent in spelling. However, the following spellings are common and accepted:

See also W:English terms with diacritical marks.

Urgency of deleting common misspellings

As long as misspellings are marked as such, the reader will not be misled, and there is no urgency. However, going out of one's way to create entries for rare misspellings seems inadvisable: it is useless for readers and creates more cleanup work for others.

Formatting

Misspellings should appear under a part of speech heading like Noun, Adjective or Verb but should not appear in those categories. Misspellings can be included in entries that already have other meanings, or other languages. These entries should appear in the relevant categories. The template {{misspelling of}} is designed to do all the formatting necessary for misspellings.

Example (stationery)

==English==

===Noun===
{{en-noun|-}}

# [[writing]] [[materials]]

===Adjective===
{{head|en|misspelling}}

# {{misspelling of|en|stationary}}

So this entry appears in Category:English nouns but not Category:English adjectives, as the {{en-adj}} template is not used for misspellings.
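For editors who prepare such sections with a script, the layout of the misspelling section above could be generated roughly as follows. This is a minimal sketch: the helper name misspelling_section is hypothetical, and it only reproduces the {{head}} and {{misspelling of}} lines shown in the example, nothing more.

```python
def misspelling_section(lang_code: str, pos_heading: str, correct: str) -> str:
    """Build the wikitext for a misspelling section.

    Uses {{head|<lang>|misspelling}} rather than a part-of-speech
    template, so the entry is not added to the part-of-speech category.
    """
    return (
        f"==={pos_heading}===\n"
        f"{{{{head|{lang_code}|misspelling}}}}\n"
        "\n"
        f"# {{{{misspelling of|{lang_code}|{correct}}}}}\n"
    )


# Reproduces the Adjective section of the stationery example.
print(misspelling_section("en", "Adjective", "stationary"))
```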

Alternatives

Wiktionary documents usage; therefore, spellings that are commonly judged to be misspellings are included as such. There are alternatives, however:

  1. {{nonstandard spelling of}} for entries that are deliberate misspellings, such as kewl for cool.
  2. {{deliberate misspelling of}}
  3. {{archaic spelling of}} and {{obsolete spelling of}} for spellings that are no longer used, but were not considered incorrect at a certain time.
  4. {{alternative spelling of}} for spellings that are less common, but not considered incorrect.
  5. {{eye dialect of}}

See also