Wiktionary:Criteria for inclusion: difference between revisions

No edit summary
Uncle G (talk | contribs)
→‎Names: Rewritten and expanded to attempt to clarify what we do
Line 88: Line 88:




===Proper nouns (names)===
===Names===
Names fall into two categories: individual given names and family names, which are single words, and the names of actual people, places, and things. Wiktionary classifies both as proper nouns, but applies caveats to each.


====Given names and family names====
While [[proper noun|proper nouns]] are basically subject to the same guidelines as other terms, some special considerations apply. As a rule of thumb, a proper noun should be included only if:
'''Given names (such as [[David]], [[Roger]], and [[Peter]]) and family names (such as [[Baker]], [[Bush]], [[Rice]], [[Smith]], and [[Jones]]) are words, and subject to the same criteria for inclusion as any other words.''' Wiktionary has main articles giving etymologies, alternative spellings, meanings, and translations for given names and family names, and has two appendices for indexing those articles: [[Wiktionary Appendix:First names]] and [[Wiktionary Appendix:Surnames]].


For most given names and family names, it is relatively easy to demonstrate that the word fulfils the criteria, because such names are in widespread use in both spoken communication and literature. However, being a name ''per se'' does not automatically qualify a word for inclusion. A new name that has not been attested is still a protologism. A name that occurs only in the works of fiction of a single author, or within a closed context such as the works of several authors writing about a single fictional universe, does not meet the criterion for independence.
# It is used as a common noun (especially if it is commonly written without capitalization).

# It is used in an attributive sense with the expectation that the meaning will be widely understood (''a David Beckham hairstyle'').
[[hypocoristic|Hypocoristics]], [[diminutive]]s, and [[abbreviation]]s of names (such as [[Jock]], [[Misha]], [[Kenny]], [[Ken]], and [[Rog]]) are held to the same standards as names.
# Words or terms derived from the name are already in Wiktionary.

# The name appears in different forms in different languages (e.g. John/Johann/Jan/Juan/Jean/Giovanni ...)
The status of [[patronymic]]s has not been settled.

====Names of actual people, places, and things====
'''A name should be included if it is used attributively, with a widely understood meaning.''' For example: [[New York]] is included because "New York" is used attributively in phrases like "New York delicatessen", to describe a particular sort of delicatessen. A person or place name that is ''not'' used attributively (and that is not a word that ''otherwise'' should be included) should ''not'' be included. ''Lower Hampton'', ''Empire State Building'', and ''George Walker Bush'' thus should not be included. Similarly, whilst [[Jefferson]] (an attested family name word with an etymology that Wiktionary can discuss) and [[Jeffersonian]] (an adjective) should be included, ''Thomas Jefferson'' (which isn't used attributively) should not.

'''A name should be included if it has become a generic term.''' For example: [[Remington]] is used as a synonym for any sort of rifle, and [[Hoover]] as a synonym for any sort of vacuum cleaner. (Both are also attested family name words, and are included on that basis as well, of course.) [[Hamburger]] is used as a generic term for a type of sandwich. One good rule of thumb as to whether a name has become a generic word is whether the word can be used without capitalization (as indeed "[[sandwich]]" was in the previous sentence).

'''Being a trademark or a company name does not guarantee inclusion.''' (Of course, ''some'' company names are derived from family names, and are included on that basis.) Although some words are trademarks and company names, not all trademarks and company names are words. (Indeed, trademark holders will vigorously defend their trademarks against becoming words. According to [[w:Adobe Systems|Adobe Systems]], there is no such word as ''Photoshopped'', since Photoshop® is a trademark and [http://www.adobe.com/misc/trade.html not a common verb] that can have a past participle; according to [[w:Xerox|Xerox]] there is no such word as ''xerox'', since Xerox® is a trademark and [http://www.theinquirer.net/?article=18492 not a common verb]; according to Sony there is no such word as ''Playstationize'' since there's no word ''Playstation'' at all and PlayStation® is a trademark and not a common verb.) Many trademarks and company names are ''deliberately'' protologisms. To be included, the use of a trademark or company name ''other than its use as a trademark'' (i.e. a use as a common word) has to be attested.

====What Wiktionary is not with respect to names====
'''[[Wiktionary:What Wiktionary is not|Wiktionary is not a genealogy database]].''' Wiktionary articles on family names, for example, are not intended to be about the people who share the family name. They are about the name ''as a word''. For example: Whilst [[Yoder]] will tell the reader that the ''word'' originated in Switzerland (as well as give its pronunciations and alternative spellings), it is ''not'' intended to include information about the ancestries of people who have the family name Yoder.

'''[[Wiktionary:What Wiktionary is not|Wiktionary is not an encyclopaedia]].''' That's Wikipedia's job. Wiktionary articles are about words, not about people or places. Many places, and some people, are known by single-word names that qualify for inclusion as given names or family names. The Wiktionary articles are about the words; articles about the specific places and people belong in Wikipedia. For example: Wiktionary will give the etymologies, pronunciations, alternative spellings, and so forth, of the ''names'' [[Darlington]], [[Hastings]], [[David]], [[Houdini]], and [[Britney]]. But articles on the specific towns ([[w:Darlington|Darlington]], [[w:Hastings|Hastings]]), statue ([[w:David|David]]), escapologist ([[w:Houdini|Houdini]]), and pop singer ([[w:Britney|Britney]]) are Wikipedia's job.


===Wiktionary is not an encyclopedia===

Revision as of 17:56, 22 May 2005

As an international dictionary, Wiktionary is intended to include "all words in all languages".

General rules

As a general guideline, a term should be included if it is attested and idiomatic.

Attestation

"Attested" means any of these applies:

  • It is clearly in widespread use.
  • It is used in a well known work.
  • It appears in a refereed academic journal.
  • It has been used in running text in at least three independently recorded instances, whether in print, audio, video or on the internet.
  • Note: the fourth entry above (which attempts to validate unverified internet sources) is disputed as being in direct conflict with the preceding version of this page (oldid=125137).
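
The "any of these applies" rule above can be read as a simple disjunction. The following Python sketch is only an illustration of that reading, not an official procedure: the Citation record, the is_attested function, and the use of distinct authors as a stand-in for "independently recorded instances" are all assumptions made for this example. In practice, independence and running-text use are editorial judgments.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Citation:
    """One recorded use of a term (hypothetical record type)."""
    source: str          # e.g. a book title or a URL
    author: str          # rough proxy for independence (an assumption)
    running_text: bool   # True if the term is used for its meaning, not merely mentioned


def is_attested(
    widespread_use: bool,
    in_well_known_work: bool,
    in_refereed_journal: bool,
    citations: list[Citation],
    min_independent: int = 3,
) -> bool:
    """Return True if any one of the four attestation conditions holds."""
    if widespread_use or in_well_known_work or in_refereed_journal:
        return True
    # Count distinct authors among running-text uses as a stand-in for
    # "independently recorded instances"; real cases need human review.
    independent_authors = {c.author for c in citations if c.running_text}
    return len(independent_authors) >= min_independent
```

Under these assumptions, three running-text citations by three different authors satisfy the fourth condition, while three citations by the same author do not.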

Running text

In the above, in running text is meant to exclude references to words such as:

The word baeiouc has no known meaning, but does contain all five vowels in order.

The term should be used in ordinary sentences, for its meaning.

Independence

The criterion of independence is meant to exclude made-up or extremely specialized words that appear only in works by a given author or otherwise within a closed context. A use which defines the term is not independent. This applies particularly to proper names. For example, in an article mentioning "Lawrence city commissioner Boog Highberger", it's clear that "Boog Highberger" is a proper name denoting a city commissioner. On the other hand, if a New York Times article were to mention Boog without glossing who he is, that would definitely count.

Similarly, the Harry Potter books contain detailed accounts of Quidditch, but these are not independent, since Rowling does not expect the reader to be familiar with Quidditch without having first read her explanation of the game. A usage such as "Quidditch (a fictional game featured in the Harry Potter books)" would not be independent. A usage such as "the shipping department had all the order and decorum of a Quidditch game played by mutant wombats" would count (assuming it wasn't written by Rowling).

Protologisms

There is a separate designation, protologism, for terms defined in the hopes that they will be used, but which are not actually in wide use. These are listed on Wiktionary:List of protologisms, but should not be given their own entries.

Idiomaticity

"Idiomatic" means that a term is used with the expectation of being understood without further explanation, but its full meaning cannot be easily derived from its parts.

For example, this is a door should not get an entry, but shut up and red herring should. If a term is only used with an explanation of its meaning, there is no need to include it. This applies particularly to proper names. There is no particular need to include completely regular inflections such as cameras or singing. If they are present, they should redirect to the stem form. On the other hand, irregular forms such as geese and were should have their own entries. Inflected forms — whether regular or irregular — with idiomatic meanings, such as blues or smitten, should have their own entries, with the predictable meanings briefly noted.


Issues to consider

Phrases with multiple forms

Many phrases take several forms. If the forms vary only in the pronoun in use, use one or one's, as in feel one's oats. Use the least inflected form that is actually used. In the worst case, there may need to be separate entries for variants, with links between them.

The saying It's raining cats and dogs is an interesting example. One can also say It was raining cats and dogs, or I think it's going to rain cats and dogs any minute now, or It's rained cats and dogs for the last week solid. The entry should be (and is) under rain cats and dogs, with the other variants derived by the usual rules of grammar (including the use of it with weather terms and other impersonal verbs).

Attestation vs. the slippery slope

There is occasionally concern that adding an entry for a particular term will lead to entries for a large number of similar terms. This is not a problem, as each term is considered on its own based on its usage, not on the usage of terms similar in form. Some examples:

  • Any word in any language might be borrowed into English, but only a few actually are. Including spaghetti does not imply that ricordati is next.
  • Any word may be rendered in Pig-Latin, but only a few (e.g., amscray) have found their way into common use.
  • Any word may be rendered in leet style, but only a few (e.g., pr0n) see general use.
  • Grammatical affixes like meta- and -ance can be added in a great many more cases than they actually are. (Some basic suffixes like plural -s and past tense -ed really can be used almost anywhere.)
  • It may seem that trendy internet prefixes like e- and i- are used everywhere, but they aren't. If I decide to talk about e-thumb-twiddling but no one else does, then there's no need for an entry.

Language considerations

Uncommon languages are acceptable as long as they are (or were) used for everyday communication by some identifiable, natural population of humans. If the language lacks an ISO 639 language code, it's almost surely not acceptable.

Since this is the English Wiktionary, all definitions should be given in English. If a non-English word has the same spelling as an English one, place all of the definitions on the same page but arrange them under their respective language headings with the English entries first. For example:

==English==
===Noun===
'''boot'''
# A shoe that covers part of the leg.
===Verb===
'''to boot'''
# To kick.

==German==
===Noun===
'''Boot'''
# Boat.

For more information about formatting entries, see Wiktionary:Entry layout explained.
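
As a small illustration of the "English entries first" arrangement described above, here is a Python sketch. The function name order_language_sections is hypothetical, and keeping the non-English sections in their given order is an assumption: the text above only says that English comes first.

```python
def order_language_sections(languages: list[str]) -> list[str]:
    """Put English first; keep the remaining languages in their given order.

    The ordering of non-English sections is an assumption for this sketch;
    the policy text only requires English entries to come first.
    """
    english = [lang for lang in languages if lang == "English"]
    others = [lang for lang in languages if lang != "English"]
    return english + others
```

For the boot/Boot example above, a page listing German before English would be rearranged so that the English section leads.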

Terms included need not be "words" in any narrow sense

So long as it meets the criteria above, a term need not be a single word in the usual sense. Any of these is also acceptable:


Names

Names fall into two categories: individual given names and family names, which are single words, and the names of actual people, places, and things. Wiktionary classifies both as proper nouns, but applies caveats to each.

Given names and family names

Given names (such as David, Roger, and Peter) and family names (such as Baker, Bush, Rice, Smith, and Jones) are words, and subject to the same criteria for inclusion as any other words. Wiktionary has main articles giving etymologies, alternative spellings, meanings, and translations for given names and family names, and has two appendices for indexing those articles: Wiktionary Appendix:First names and Wiktionary Appendix:Surnames.

For most given names and family names, it is relatively easy to demonstrate that the word fulfils the criteria, because such names are in widespread use in both spoken communication and literature. However, being a name per se does not automatically qualify a word for inclusion. A new name that has not been attested is still a protologism. A name that occurs only in the works of fiction of a single author, or within a closed context such as the works of several authors writing about a single fictional universe, does not meet the criterion for independence.

Hypocoristics, diminutives, and abbreviations of names (such as Jock, Misha, Kenny, Ken, and Rog) are held to the same standards as names.

The status of patronymics has not been settled.

Names of actual people, places, and things

A name should be included if it is used attributively, with a widely understood meaning. For example: New York is included because "New York" is used attributively in phrases like "New York delicatessen", to describe a particular sort of delicatessen. A person or place name that is not used attributively (and that is not a word that otherwise should be included) should not be included. Lower Hampton, Empire State Building, and George Walker Bush thus should not be included. Similarly, whilst Jefferson (an attested family name word with an etymology that Wiktionary can discuss) and Jeffersonian (an adjective) should be included, Thomas Jefferson (which isn't used attributively) should not.

A name should be included if it has become a generic term. For example: Remington is used as a synonym for any sort of rifle, and Hoover as a synonym for any sort of vacuum cleaner. (Both are also attested family name words, and are included on that basis as well, of course.) Hamburger is used as a generic term for a type of sandwich. One good rule of thumb as to whether a name has become a generic word is whether the word can be used without capitalization (as indeed "sandwich" was in the previous sentence).

Being a trademark or a company name does not guarantee inclusion. (Of course, some company names are derived from family names, and are included on that basis.) Although some words are trademarks and company names, not all trademarks and company names are words. (Indeed, trademark holders will vigorously defend their trademarks against becoming words. According to Adobe Systems, there is no such word as Photoshopped, since Photoshop® is a trademark and not a common verb that can have a past participle; according to Xerox there is no such word as xerox, since Xerox® is a trademark and not a common verb; according to Sony there is no such word as Playstationize since there's no word Playstation at all and PlayStation® is a trademark and not a common verb.) Many trademarks and company names are deliberately protologisms. To be included, the use of a trademark or company name other than its use as a trademark (i.e. a use as a common word) has to be attested.

What Wiktionary is not with respect to names

Wiktionary is not a genealogy database. Wiktionary articles on family names, for example, are not intended to be about the people who share the family name. They are about the name as a word. For example: Whilst Yoder will tell the reader that the word originated in Switzerland (as well as give its pronunciations and alternative spellings), it is not intended to include information about the ancestries of people who have the family name Yoder.

Wiktionary is not an encyclopaedia. That's Wikipedia's job. Wiktionary articles are about words, not about people or places. Many places, and some people, are known by single-word names that qualify for inclusion as given names or family names. The Wiktionary articles are about the words; articles about the specific places and people belong in Wikipedia. For example: Wiktionary will give the etymologies, pronunciations, alternative spellings, and so forth, of the names Darlington, Hastings, David, Houdini, and Britney. But articles on the specific towns (Darlington, Hastings), statue (David), escapologist (Houdini), and pop singer (Britney) are Wikipedia's job.

Wiktionary is not an encyclopedia

Care should be taken so that entries do not become encyclopedic in nature; if this happens, such content should be moved to Wikipedia, but the dictionary entry itself should be kept.