Wiktionary talk:Criteria for inclusion



The archive for this talk page can be found at Wiktionary talk:Criteria for inclusion/Archive.

Proposed changes to CFI are made and discussed at Wiktionary:Editable CFI.

Subpages of Wiktionary:Criteria for inclusion:

  • Criteria for inclusion/Brand names
  • Criteria for inclusion/Editable
  • Criteria for inclusion/Editnotice
  • Criteria for inclusion/Fictional universes
  • Criteria for inclusion/Language-specific
  • Criteria for inclusion/Languages with limited online documentation
  • Criteria for inclusion/Well documented languages
  • Criteria for inclusion/attestation

Multi-word entries, sums of their parts and translations

I've been thinking about a possible guideline regarding the multi-word terms. In particular, I'd like to neglect Davilla's Pawley test topic for this post, although the best solution would probably be a combination of both. See this as a possible additional test.

Some words that have been RFD'd lately I feel do merit some kind of inclusion here, whilst others don't, and the easiest way for me to determine that is to look at their translations. Example is WT:RFD#indoor baseball.

Thinking in English-only, I don't see the merits of including this particular term, or any of the other indoor terms, as their meaning is defined by indoor. However, as my argument there described, such terms are translated in one word in at least two languages that I know of, German and Dutch, and possibly as well in more languages that I don't know of.

This rule may have a large impact, which those who know some German or Dutch will know, for terms like vintage car may be (I don't know) translated into one word there.

There may be two benefits:

  • Non-English entries of such terms, for instance the Dutch zaalvoetbal, can link properly to indoor football, instead of to indoor and football separately.
  • indoor football will list the correct translations for at least German and Dutch, so that users don't have to go through the process of looking up indoor, which would have the Dutch translation zaal- (a combining form), then looking up football, and then guessing how to link them, keeping in mind the various very complex rules for morphological word building in Dutch and German.

Opinions? — Vildricianus 13:24, 4 June 2006 (UTC)

I think there's a very close relationship between single words in other languages and what we would consider to be a single concept in English. However, other languages clearly also have concepts that do not exist in English as set phrases, such as father's older brother. To a person who speaks Chinese, mention of the word would immediately conjure images of what an older uncle might be to a younger, and the people in his own family who are associated with the word, as well as his father's friends as it turns out. To an English speaker, the phrase would have to be mentally summed, and the implications are not immediately obvious. So I don't think this could be used as an inclusive rule. I would wonder if it could be used as an exclusive rule, for instance if no other languages had a single word for skateboard wheel, or at least a term that passed the inclusive rules; essentially, if skateboard wheel isn't demonstrably a single concept in any other language, then it isn't one in English either. The inability to apply an exclusive rule like this to, say, vintage car because of some translation, would add credibility to the idea that it very well could be a single concept in English. Davilla 15:42, 5 June 2006 (UTC)
There have been a few debates on RFD where this might apply, particularly active volcano. Despite what I wrote above, I've been thinking that this is a pretty good criterion to fall back on, even if it isn't ever included specifically. We should definitely include last night because you can't say yesterday night in most contexts without sounding a little funny. We should probably include last year because of the translations, and it's a pretty common expression anyways. There isn't any reason I can see for keeping last financial year, but maybe I'm just trying to stir trouble. DAVilla 21:33, 15 December 2006 (UTC)

I think that the question of the inclusion or exclusion of an expression as a derived term should not simply be a question of whether the meaning can be derived from a composition of its constituent words, but rather if it also includes a significant degree of markedness such that the use of some other combination of words to express the same meaning would be considered unnatural to a native speaker. This conventionality can be measured by looking at the distribution of the collocation relative to the distribution of other collocations with a similar derived meaning. As I understand it we're not just building another Webster's here, but rather trying to declare a much larger, more detailed description of the human lexicon. We don't just want a laundry list of how you might express a particular meaning, but also how one would express a particular meaning. If Wiktionary is going to function well as a cross-linguistic resource, which I think it should, it needs to include the conventional. We can make another formal argument in favor of the inclusion of conventional expressions in the lexicon by considering the performance of Natural Language Processing systems. Systems that include statistically mined conventional multiword expressions in the lexicon perform significantly better at selecting a correct syntactic parse from amongst the many thousands of well-formed possibilities. This makes a lot of sense when you realize the agent, computer or human, that is listening or reading must first tokenize the input stream before interpreting it. If a collocation doesn't exist in the lexicon, then it can't be treated as a token, thus greatly (and unnaturally) increasing the combinatorial complexity of the language stream. Johnfbremerjr 10:54, 10 February, 2008
Please see Wiktionary:Idioms that survived RFD, which is an attempt to import the Pawley guidelines and rationalize why the community supports some phrases and not others. DAVilla 07:34, 11 February 2008 (UTC)
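The tokenization point made above can be illustrated with a minimal sketch. It assumes a greedy longest-match strategy and a tiny hand-written lexicon; the function name `tokenize`, the `max_len` parameter and the sample sentence are hypothetical and are not taken from any system mentioned in this discussion.

```python
def tokenize(words, lexicon, max_len=4):
    """Greedy longest-match tokenizer (illustrative assumption, not a real parser).

    At each position, take the longest run of words found in the lexicon;
    fall back to the single word when no multi-word entry matches.
    """
    tokens = []
    i = 0
    while i < len(words):
        # Try the longest candidate first, shrinking down to one word.
        for n in range(min(max_len, len(words) - i), 0, -1):
            candidate = " ".join(words[i:i + n])
            if n == 1 or candidate in lexicon:
                tokens.append(candidate)
                i += n
                break
    return tokens


# Hypothetical lexicon holding two collocations discussed on this page.
lexicon = {"last night", "active volcano"}
sentence = "we saw an active volcano last night".split()

with_mwe = tokenize(sentence, lexicon)   # collocations become single tokens
without_mwe = tokenize(sentence, set())  # every word is its own token
```

With the collocations in the lexicon the sentence yields five tokens rather than seven, which is the reduction in combinatorial complexity the comment describes, here on a toy scale.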


When were "blogs" added as durably archived? They are not. All of the citations of blog sources I've seen so far have not been google archive links (therefore, not to the durably archived source.)

This seems to be quite fallacious, as google doesn't seem to archive them.

The discussions that do mention blogs above give clear reasons for not using them (as CFI used to state) oddly, from the most inclusionist contributor Wiktionary has seen so far!

What gives? Who added "blogs" and why?

--Connel MacKenzie 10:27, 20 August 2006 (UTC)

Of course being durably archived is the most important aspect. If the CFI is incorrect then this needs to be changed.
I vaguely remember a time of revision of this article when this change might have been made, anyways it's somewhere in the history. (Wouldn't it be nice to be able to select a portion of the article and find when that text was most recently changed?) I don't think the intent was malicious but it would be nice to ask for the reasoning. One of the arguments at the time was about feeds being archived. I'm not literate enough in the technology to know if that applies to blogs. DAVilla 21:07, 21 August 2006 (UTC)
I'm still puzzled by this. Can we make it go away? -- Visviva 14:37, 22 June 2007 (UTC)
I have now removed this text. -- Visviva 02:20, 28 September 2007 (UTC)

Reconstructed languages

I object in the strongest possible terms to the unilateral imposition of 'policy' on the part of User:Robert Ullmann. Refusing repeated invitations to constructively state his position on Wiktionary_talk:Reconstructed terms, and following a failed deletion request, he just made unilateral changes to Wiktionary:Reconstructed terms to suit his whim, without bothering to give any explanation beyond '1/2 rewrite', knowing perfectly well his changes would be controversial.

I am not seeking to impose any fixed opinion of mine, I am looking for intelligent debate among people aware of the issues involved. Robert Ullmann's suggestion has some merit, but it also has flaws, and as long as he just keeps imposing it without debate, there is no way of ironing them out. Robert Ullmann does important work on wiktionary. But he has very idiosyncratic views on etymology and language reconstruction, and no interest in, and consequently no knowledge of, the matter. It is bad enough that he abuses his admin privileges to chastise me over alleged violation of CFI (which has still 'Semi-Official status'), but to insert such a "policy" into CFI after the fact, and after realizing that it had not in fact been there at the time he chose to chastise over it is simply wikityranny (making up your laws as you go along), indefensible under wikiquette, and unacceptable on any Wikimedia project. Let him either discuss the issue amicably, or step down from policing about it.

I do invite anyone interested in the topic to seek for a solution acceptable to everybody, but I will not put up with such bullying tactics. Dbachmann 10:48, 26 January 2007 (UTC)

User blocked for one week, knowingly removing CFI clarification made as result of policy vote, change reverted. Robert Ullmann 12:09, 26 January 2007 (UTC)

Oxford English Dictionary

I noticed that materteral is listed for deletion yet it appears in the OED. It is my thought that if a word is in the OED, it merits inclusion in wiktionary. What do others think? WilliamKF 19:43, 8 February 2007 (UTC)

The tenuous decision has been generally to allow such references as some kind of refereed academic work, even though it is no such thing. For RFV, see {{nosecondary}} which explains some of the reasons why we don't/won't/can't take everything the OED has. The concession was made, I might add, during a dispute with Wiktionary's most infamous copyvio vandal, long before his actions were exposed as being 100% copyright violations. If it comes to a vote, I'd vote strongly against such folly; the minute a vote passed, someone would start a bot stubbing in the OED entries, exposing WMF to certain copyright concerns. On the other hand, if a word appears in the OED and here, but no other major dictionaries, we probably should delete it, even if it isn't a word-for-word copy. --Connel MacKenzie 20:02, 8 February 2007 (UTC)
I'm not convinced by the copyvio reasoning implied in your last sentence (though I very much agree with the rest). AFAIK, OED only includes words for which it has either found prior use or prior mention (eg in earlier dictionaries). In the former case, we can make up our own minds as to the meaning of a prior use (though it would be dodgy if we cannot find the same or different cites via a separate search). In the latter case, copyright is less likely to be a problem, particularly since the dictionaries cited are usually >>120 yrs old.
In the case of materteral I see that there are at least three good b.g.c. cites, so having found its meaning, and feeling somewhat avuncular about it, I suppose I might weigh in behind it. --Enginear 20:28, 8 February 2007 (UTC)
Yes, the Oxford English Dictionary (OED) is strictly based upon giving examples quoted from literature. In terms of copyright violations, the OED in its first edition (1928) and supplements dates back to the beginning of the twentieth century and therefore, would not be subject to copyright similar to how the Encyclopedia Britannica 11th edition is used on wikipedia. WilliamKF 20:48, 8 February 2007 (UTC)
Here is a link where images of out of copyright pages of the fascicles may be found and rationale. WilliamKF 22:02, 8 February 2007 (UTC)
Note: I'm blocking this latest Primetime sockpuppet WilliamKF. --Connel MacKenzie 15:02, 13 February 2007 (UTC)

Sign languages?

Does Wiktionary in principle allow inclusion of words in sign languages? Showing the gesture shouldn't be too difficult -- stationary gestures can be shown with an image, mobile ones with a video -- but getting a gesture to be the name of an entry page might be more difficult. Angr 23:00, 10 February 2007 (UTC)

This might need a separate namespace because of the very different format of presentation. How do you list synonyms and antonyms, for instance? How do you put a sign language entry into a translations table? Besides which, there are many different sign languages, including American, British, Hungarian, and that of the American Plains Indians. There are some websites linked from the Wikipedia article on w:sign language that might provide ideas. Can you imagine what the American sign language Wiktionary would look like? --EncycloPetey 23:06, 10 February 2007 (UTC)
Listing synonyms and antonyms is not hard: include a picture or the like. Having an entry is what's hard: how do we include the term as the PAGENAME? See also my comment below, in this section.—msh210 17:44, 11 February 2008 (UTC)
There has been a bit of discussion on this question at various times. Wiktionary:About sign languages has some of the results of that discussion, and Wiktionary:Information desk#American_Sign_Language currently has my reply to someone else who recently asked the same question you just did, Angr. Any ideas you have would be great; probably the Wiktionary:Beer parlour or Wiktionary talk:About sign languages would be the best place to mention them, the former especially if they are ideas on how to include SL entries.—msh210 17:44, 11 February 2008 (UTC)

Inflected forms

I think that inflected forms should be included if they belong to two different words in the same language. I once looked at a play script in Spanish and had to pause at the word viste to figure out which verb was meant. Other examples in Spanish are fue etc., ve, and the regular siento, sienta, and siente. Russian has дне and хоре.PierreAbbat 02:25, 11 June 2007 (UTC)

Because spellings so easily overlap in different languages, here on en.wiktionary, we aim to include all inflected forms, not just ones that might have obvious problems. --Connel MacKenzie 07:51, 11 June 2007 (UTC)

use in a refereed journal: mathematics

I'm not sure about other fields, but in mathematics a refereed article will often have what we call "ad hoc definitions". That is, for example, the author, call him Smith, will say "let a foo subgroup be a subgroup that is finite and central". Smith then uses the word "foo" a hundred times over the course of his paper, but is never heard of again in the literature. Words like these should I think not have entries. On the other hand, sometimes Smith does the same thing, and then another author will say "Let a foo subgroup be, after Smith, a finite central subgroup" and use the term in his paper, and a third author will say "If the subgroup is foo (in the sense of Smith 2007), then..." and a fourth will say "If the subgroup is foo, then...". (This process does not occur over the span of four papers. But the progression is approximately correct.) At what point in this process does the word become acceptable in en.wikt?—msh210 18:45, 16 August 2007 (UTC)

(Note incidentally that the word may have been in use by Smith and his colleagues in various universities well before it was ever published. But I'm assuming for the sake of argument that we cannot attest that.)—msh210 18:45, 16 August 2007 (UTC)
If a word, e.g. “foo”, becomes strongly-enough associated with a particular definition that many authors begin to use the word without defining it, “foo” in that sense will naturally meet the existing CFI. Rod (A. Smith) 18:52, 16 August 2007 (UTC)
Well, yes, that's the "fourth" stage above. But does a "in a refereed academic journal" rule apply to any of the earlier stages, was more my question.—msh210 19:55, 16 August 2007 (UTC)
Well, the attestation section (actually, bulk of the CFI) was written to clarify the general rule: “A term should be included if it's likely that someone would run across it and want to know what it means.” Following the spirit of that general rule, I'd say that readers of academic journals are only likely to want to know what a given term means if the journal uses the term without defining it. That is, I assume that the “Appearance in a refereed academic journal” part of the attestation section is present to refine the clause “someone would run across it”, not to override the “[someone would] want to know what it means” clause. Does that make sense? Rod (A. Smith) 20:23, 16 August 2007 (UTC)
Actually, that was Dmh's phrase ("if it's likely that someone would...want to know what it means,") if I recall correctly. Frankly, I don't know how that phrase escaped notice. --Connel MacKenzie 20:39, 16 August 2007 (UTC)
The problem is that a word could have just one "appearance in a refereed academic journal" and be admitted, even if it were just the definition—a mere mention of existence, or even inexistence! Likewise, "usage in a well-known work" allows for literary nonces, which have been received with skepticism. We should alter CFI to say that all terms must convey meaning in three independent instances over a year, and that if disputed they must be so cited, with the exception of clearly widespread use. I cannot imagine that change eliminating anything of substance. DAVilla 13:09, 17 August 2007 (UTC)
I disagree. I think there is a problem here, and I think you've mostly identified it correctly, but I don't think the solution is to remove these exceptions; the exceptions serve an important purpose. For example, there are plenty of languages that simply don't have the written corpus needed for their words to meet the normal CFI; but the academic-journal exception means that if linguists (or anthropologists) publish papers about these languages (or people) and define some words, we can include those. Also, while it might be obvious to us, after looking for independent cites, that a word in Romeo and Juliet is a nonce-word, the casual reader might not find that so obvious; and while obviously it's not worthwhile to include every nonce-word in every work, some works (the King James Version of the Bible, several of Shakespeare's plays, the U.S. Constitution and Declaration of Independence, etc.) are sufficiently well-known and widely read that it does make sense to include even their nonces. —RuakhTALK 16:37, 17 August 2007 (UTC)
Anthropologists and ethnolinguists are wonderful people, but they are no more reliable than lexicographers when it comes to defining words. If there are no authentic durable records of a language whatsoever -- no recordings, no transcripts -- there is simply no material for us to work with. In this respect I think the use-mention distinction trumps any value in peer-reviewed scholarship ... so I'm inclined to agree with DAvilla that this clause no longer serves any useful purpose, and in fact contradicts our current practice. -- Visviva 02:39, 28 September 2007 (UTC)
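The rule proposed earlier in this thread ("three independent instances over a year") can be written out as a rough check. This is only a sketch under stated assumptions: "independent" is approximated as "by different authors", "over a year" is read as a span of at least 365 days, and whether a citation actually conveys meaning (use rather than mention) is not modelled at all. The function name, parameters and sample citations are hypothetical.

```python
from datetime import date


def meets_proposed_rule(citations, min_count=3, min_span_days=365):
    """Return True if the citations could satisfy the proposed rule.

    citations: list of (author, date) pairs, each assumed to be a genuine use.
    Assumes min_count >= 2. Independence is approximated by distinct authors.
    """
    authors = {author for author, _ in citations}
    if len(authors) < min_count:
        return False
    # Look for two citations by different authors that are far enough apart;
    # any further distinct authors can then fill out the required count.
    for author_a, date_a in citations:
        for author_b, date_b in citations:
            if author_a != author_b and (date_b - date_a).days >= min_span_days:
                return True
    return False


# Hypothetical citation sets, for illustration only.
spread_out = [("Smith", date(2005, 1, 10)),
              ("Jones", date(2005, 6, 1)),
              ("Lee", date(2006, 3, 5))]
one_author = [("Smith", date(2005, 1, 10)),
              ("Smith", date(2006, 6, 1)),
              ("Smith", date(2007, 3, 5))]
too_close = [("Smith", date(2005, 1, 10)),
             ("Jones", date(2005, 2, 1)),
             ("Lee", date(2005, 3, 5))]

results = [meets_proposed_rule(c) for c in (spread_out, one_author, too_close)]
```

Under these assumptions a single appearance in a refereed journal, or an author's own repeated use of an ad hoc definition, would not pass, which is the effect the proposal aims for.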

Personal names from languages with a non-Latin script

Having perused this I cannot fathom why the Russian name Дмитрий exists only in Latin form and Владимир exists in both scripts. I strongly recommend that names with no (original!) Cyrillic articles yet written be moved to articles with the appropriate Cyrillic titles, if nobody minds. Is the article in Latin letters appropriate at all, since transliteration is always provided? Bogorm 08:56, 16 August 2008 (UTC)

I think the issue here is simply the incompleteness of Wiktionary. Дмитрий certainly should exist (and Dmitry was in some serious need of cleanup). However, much as the two are related, the existence or lack thereof of Дмитрий should not in any way affect whether we keep Dmitry. If it can be attested, it should be kept. -Atelaes λάλει ἐμοί 09:08, 16 August 2008 (UTC)
Deleting of Dmitry is not my concern, since I am not administrator and until someone proposes it for deletion. If you do not mind, I am going to move Dmitry to Дмитрий, so that the Latin title be used as redirection. Bogorm 09:12, 16 August 2008 (UTC)
That would be inappropriate, since we do not use redirects that way. See Wiktionary:Redirects. --EncycloPetey 16:48, 16 August 2008 (UTC)
In the future, please bear in mind that an entry requires some reformatting if you move it to a different language. -Atelaes λάλει ἐμοί 17:58, 16 August 2008 (UTC)
Why have you deprived the article of the Transliterations section? What do you mean under work? It is not moved to a different language - Дмитрий is the only admissible form of the Russian name and Dmitry is an independent article about the English case, though I strongly doubt that any British would opt for a purely Russian name, unless he is a fervent adherent of the current Russian president. Bogorm 18:17, 16 August 2008 (UTC)
I have removed the transliterations because we have a specific transliteration format for Russian words, and I don't know Cyrillic script. I put a marker on it to garner the attention of someone who knows Russian to come and add it. It is moved to another language because you took the content of Dmitry (an English word), and moved it to Дмитрий (a Russian word), without properly reformatting it. It was still classified as an English proper noun, as well as being in the English category for Latin and grc derivations. -Atelaes λάλει ἐμοί 18:25, 16 August 2008 (UTC)
Well, I speak fluently Russian, but am not knowledgeable about formatting of proper names according to your wishes - so I shall essay to be helpful by elucidating here the transliteration but without adding it in templates, since I have not yet got accustomed to using templates (besides the quoted on my talk page): the official scientifical transliteration is "Dmitrij", but the popular for the English-speaking world is Dmitry(corroboration for my words is to be found in the article about President Medvedev, whose first name is rendered as transliteration in brackets and in the popular rendering in the title of the article). I do not know however, old Russian (Eastern Old Church Slavonic) and the question about the ancient spelling can hopefully be resolved by EncycloPetey (see below). Bogorm 19:15, 16 August 2008 (UTC)
I'm not sure what you mean by "official scientifical transliteration" of Russian, unless you mean the International Scholarly System, since there are several standard schemes in use just for Russian transliteration, including the system used by the Library of Congress (which would transcribe Дмитрий as "Dmitrii"). There are also different Latinizing transcription systems used in Germany and Poland that I have seen, and presumably there are many more besides. See Romanization of Russian on Wikipedia for a little more, including a table comparing 7 of the systems. The system preferred by the Russian government and the Russian Commonwealth is GOST 7.79. --EncycloPetey 20:16, 16 August 2008 (UTC)
Yes, I meant the first one, because it is international and present in all articles about Russia in Wikipedia as the quoted one. And regional regulations have not international jurisdiction. Bogorm 22:11, 16 August 2008 (UTC)
While there may be only one Cyrillic form in common current use, that is not the only spelling possible in previous centuries. Unfortunately, I have discovered that I am missing the relevant page from my copy of Nikolaj Michailovič Tupikov's Wörterbuch der Altrussischen Personennamen (Köln & Wien: Böhlau Verlag, 1989). However, Wickenden's Dictionary of Russian Names includes a large number of spellings. (Wickenden's are transliterated, but use a consistent transliteration system.) --EncycloPetey 18:26, 16 August 2008 (UTC)

Somewhat related discussion is at my talk page.—msh210 21:09, 20 August 2008 (UTC)

Scientific nomenclature

In general, how far should we be diving into the scientific/technical names of things? I have two main questions in mind:

  • Should every (established) genus-species of organisms have an entry? ("established" meaning there are good citations, but that's pretty easy if you consider all the scientific journals out there that use the terms without defining them)
  • Likewise, what makes chemical names worthy of inclusion? There's so many variations.. see w:8-Azaguanine where I listed all the synonyms I could find appearing in at least two independent sources. If you Google most of them, the chemical uses are buried deep in the results, however, if you use Google Scholar, they're usually all you get.

Now, I understand people don't normally use dictionaries for looking up this kind of information, but for the sake of completeness (and the limitless potential of Wiktionary), I guess my main question is where do you draw the line? Voxii 23:47, 3 March 2009 (UTC)

Here's my sense of the current state of things:
  • The sense of the community lately seems to be "no", that we should have genus names and species epithets but leave actual binomials and trinomials to Wikispecies. (But I would defer to EncycloPetey, DCDuring, and any others who have actually worked on this area lately.) This has varied over time, and we do have a number of binomial names. The not-yet-closed RFD for B. splendens may be pertinent.
  • Sum of parts hyphenated chemical names (e.g. full IUPAC names and variations thereof) are out, with a possible exception if they happen to be used outside of the chemical field. So I don't think we would want anything above "8AG" in your list. Trade/commercial/brand names are out unless they happen to satisfy WT:CFI#Brand names. I expect that identifiers like "NSC-749" are also out, though I don't think that's ever been put to the test. Terms like "triazologuanine" should be fine, I think, provided that they are verifiable. -- Visviva 03:11, 4 March 2009 (UTC)
Ok, that sounds reasonable. I figured things like "5-Amino-1,6-dihydro-7H-v-triazolo(4,5-d)pyrimidin-7-one" were out. Most of the ones that are a couple letters and numbers are usually only used by certain organizations or are something like a brand name, so I guess we don't want those either. As for the species names, I'll take a look around. I think some of the more popular ones might be worthy of inclusion. Thank you very much for your answers. Voxii 10:29, 4 March 2009 (UTC)
I agree that "some of the more popular ones" should be included, such as Homo sapiens, Tyrannosaurus rex, and E. coli. Angr 15:06, 6 March 2009 (UTC)
I think Visviva has fairly summarised current unofficial preferences.
I think we do a good service when we take attestable, includable vernacular (even brand/product) names and translate them into current scientific terms and find the best current WP article (and/or outside source) to link to. Sometimes dated scientific terms can be sussed out, though that is harder. WikiSpecies is already much more complete than we are ever likely to be about the structure of the taxonomic tree.
Getting all the one-part taxonomic names is plenty challenging. I suppose the same would be true for all the combining forms of chemical terms as well. DCDuring TALK 16:02, 6 March 2009 (UTC)


The rules for inclusion of proper names are outdated; they are not followed and are unhelpful. Maintenance complexity should not be a factor if name spelling can be checked against a variety of dictionaries.

The changes I propose are:

  • Allow all country names, their capitals in the English language and in the original language and script, etymologies, alternative spellings, meanings, pronunciation and translations.
  • Allow regional centres - capitals of states, provinces, counties, shires, regions, prefectures, oblasts, etc. regardless of their size.
  • The inclusion of other place names to be discussed. Population, historical or economical importance? Provide reference to a dictionary (to discuss, which dictionaries are considered valid)

In any case, I suggest not to restrict but encourage the inclusion of proper names. Anatoli 03:14, 11 March 2009 (UTC)

What do you suggest we use for evidence? Should all administrative units be included or just primary ones? Should the regions have a governmental administrative structure, or could they be statistical areas or popularly used names for areas? We have obsolete meanings of words; should we have obsolete meanings of place names? Should we have official names or popular names or both? What about mythical places (Valhalla) or historical "places" with uncertain boundaries (Scythia)? Obviously not all of these questions need be answered at once. DCDuring TALK 03:45, 11 March 2009 (UTC)
Judging from your questions, I can see that your main concern is maintenance and who is going to verify the accuracy. As in all Wiki projects, there is always a risk, and someone who knows the correct information can change it.
My main concern is that a gazetteer adds no value. The "maintenance" includes conceptual concerns.
  • By the regions I mean governmental administrative structure, like Urumchi/Ürümqi - capital of Xinjiang. Include unrecognised/partially recognised and disputed territories - in a neutral informative tone. (Western Sahara, Kosovo, etc). The status can have a leading entry explaining the status.
    Should that include statistical areas that are officially defined? Such as SMSAs and SCAs in the US?
  • Can't see any problem with obsolete names, if they are a redirect entry, "alternate or obsolete spelling" entry.
    I meant obsolete definitions of a word. Should every major border change be reflected?
  • Official names for place names (cities/towns), countries - popular names (as is the case already)
Some possible variations in the name can be described in the entry, as was the case with Rostov. The Russian city of Veliky Novgorod was officially called Novgorod till 1999. If I were to create the entry (it's an administrative centre of a region) now, I would call it "Veliky Novgorod" with a link to the alternative or older spelling - "Novgorod". Popular names may be useful, but only if they are appropriate for this language. In the Russian entry I would make Великий Новгород the main entry with a popular name Новгород.
  • Mystical and uncertain historical names are not my concern. Not sure if they need to be included but I don't see why not, if they can be useful for users. The leading sentence should specify what the entry is. Perhaps, we can exclude them for now but it's up to other Wiktionarians. Anatoli 04:11, 11 March 2009 (UTC)
Evidence? Happy to discuss this but in most cases, the names are obvious, well-known and easily verifiable by a simple search. The evidence may only be required in cases of a dispute. Then one needs to provide something solid. But isn't this the case already? If I write a name and you agree with the spelling, then there is no need for any evidence. This could be happening with the spelling of well-known names, such as San‘a’ (capital of Yemen), El Aaiún (Western Sahara), Urumchi, etc., which can have more than one spelling. In this case, we need to discuss the correct spelling for the entry. Anatoli 04:21, 11 March 2009 (UTC)
You seem focused in this discussion on big entities, but you have spoken of all places, whatever their population. There are many of them. Should governmentally designated places be automatically included, whether or not there is a government associated with the place? See w:Place (United States Census Bureau), especially w:Census-designated place.

Furthermore, there is likely to be a great deal of interest in natural features: bodies of water; mountains, hills, mountain ranges; valleys, plains, plateaus; below-surface features; and man-made structures (buildings, kurgans, jetties); public places (parks, squares, plazas); public transportation; and roads. Are these items beyond your present concern? Should they be? Are the entities for which you would propose amending WT:CFI more or less meritorious?

Lastly, but most importantly, how are users better off because Wiktionary includes such place names and/or names of political/governmental entities? DCDuring TALK 14:49, 11 March 2009 (UTC)

  • I think the proposed changes (1 and 2) would be fine, inasmuch as they mostly reflect current practice and keep the set of permitted entries small (countries, primary subdivisions of countries, and capitals thereof -- I suppose this would be a bit over 10,000 words, many of which we already have). Even though I'm not thrilled with the idea, and there are very serious unresolved problems in our treatment of proper nouns, I would support the change, just because it would reduce the current gap between policy and reality. Then we can go back to arguing about exactly how to handle these, and what to do with all the other place names.  :-) Question: there are some very large cities that aren't the capital of anything, notably Los Angeles and Chicago in the States; should we have a clause permitting any city of more than 1 million people? -- Visviva 15:29, 11 March 2009 (UTC)
    • I think we should at least have a clause for including the most heavily populated city in each state. It would be absurd to include Tallahassee but exclude Miami. I'd go farther down than a million, at any rate - 250,000 is a reasonable baseline. For place names with multiple uses (e.g. Jacksonville, Springfield), if one comes in, all should come in to avoid confusion. It's also fine to handle this, as we have, by defining the term as the name of multiple places and referring the reader to the Wikipedia disambiguation page. bd2412 T 03:47, 16 March 2009 (UTC)
    I think the proposal looks fine, for the same reasons as Visviva (having the same reservations as well). -Atelaes λάλει ἐμοί 18:48, 11 March 2009 (UTC)
Natural features are beyond my present concern (although I don't see any reason to object to their existence, if they are correct), as are smaller town districts. If the number of governmentally designated places is too large, I am happy to reduce it and limit it to about the next level down from the nation's capital (states, provinces, prefectures, autonomous regions, territories or oblasts). The entries are not forced to be created and are not created automatically; they are created by editors manually, so I don't see any reason for concern about having too many to handle. I've been checking the appendixes; they seem to be mainly linked to the main body of Wiktionary, anyway, and if I tried to create an entry from a red link, it would create an entry in Wiktionary, not in the appendix. Am I missing something? The benefit for the users? - I have already explained: like any dictionary, it's for the information; besides, here, it's multilingual and allows us to discuss and inform about etymology, pronunciation, transliteration, grammar and other linguistic issues. In reality yes, we have quite a number of proper names already, which is behind the policy. As I said before, my attitude is more is better than less, as long as it is accurate. Los Angeles is a big place and having it here is only beneficial. All 1 mln. (if not less) cities must be included, IMO. Sorry, mixed all answers in one paragraph, hopefully, it's readable. :) Just in case, I prefer "New York" to "New York city", and names of regions coinciding with their capitals/centres can go into one entry. Anatoli 00:45, 12 March 2009 (UTC)
Why is Wiktionary the right home for such entries, as opposed to Wikipedia? The users would seem to get vastly more from Wikipedia. Is all of this just so the various names can be translated? DCDuring TALK 02:46, 12 March 2009 (UTC)
Wiktionary is an online dictionary (among other things). Wikipedia has large articles with volumes of information irrelevant for finding translations of proper names. The linked multilingual articles (if you mean using this method for finding out what it is called in another language) are not necessarily linked to an identical article in another language, e.g. "USA" may be linked to "United States of America". The translations are not grouped in one place, and the etymology and basic pronunciation are not available. I understand what you are referring to but this method is not for everyone and is not user-friendly. Besides, Wiktionary provides a concise meaning of a proper name (at least, country and what it is, e.g. a city). That's all you need from a basic dictionary. Etymology and related terms would be a bonus but if you don't have a stub, there won't be anything to improve on. Anatoli 03:41, 12 March 2009 (UTC)

Toponyms are a special case, and I think a few extra rules need to be stated explicitly. As a general principle, we should handle them strictly lexicographically, and leave the encyclopedic documentation to Wikipedia. They should qualify for inclusion the same way as any other term: three attestations in durable works.

Contrary to common sense, most geographic references should not be used. Only references which examine place names from a linguistic (onomastic, toponymic, or etymological) angle should be used.

  • Exhaustive official lists of place names should not be used, because they are prescriptive. If no one has ever written about Lower Slobovia in English, then it shouldn't have an English entry in a descriptive dictionary like ours.
  • Place-name entries in general dictionaries should not be used as references or examples:
    1. General-reference dictionaries add toponyms to increase their quick-reference value for users. We add a Wikipedia link for the same purpose.
    2. General-reference dictionaries don't treat toponyms lexicographically, providing etymologies, documenting attested use, etc., rather they give encyclopedic or gazetteer information, like population, etc.
  • Atlases should also be prohibited as references:
    1. Modern atlases are prescriptive, relying on official lists of approved geographic names rather than actual native-language usage.
    2. Modern atlases transcribe native place names for all but the most well-known places, and don't necessarily present names as used in English.

The “definitions” or descriptions, like those of other terms, should be the minimum necessary to define the place. Encyclopedic information like population, etc, should be prohibited. Michael Z. 2009-03-16 03:12 z


The proverbs section states that if the phrase is a complete sentence, it should start with a capital letter. The linked example redirects to an uncapitalised version, and all entries in Category:English proverbs that do not begin with a proper noun are uncapitalised. I assume this document is dated, as opposed to many entries in that category being wrong, and thus needs revision. Mindmatrix 20:52, 24 March 2009 (UTC)

I've removed the section. Mindmatrix 13:45, 8 April 2009 (UTC)


The formatting says # {{misspelling of|[[...]]}} but I thought practice was # {{misspelling of|...}} with no linking (provided by the template instead). RJFJR 20:21, 19 May 2009 (UTC)

See discussion and news (s.v. December).—msh210 20:41, 19 May 2009 (UTC)
No, that's different. RJFJR is right: we don't even want misspelling-only entries to count in the statistics. (Of course, it's hard to prevent it, because some of the other Wiktionaries will automatically create entries in response to ours, which means that we get interwiki links, which contain [[. But that's the idea.) —RuakhTALK 21:43, 19 May 2009 (UTC)

Translation target

The criteria for inclusion could be extended to include sum-of-parts terms if they serve as a translation target. Specific criteria for how to recognize a translation target are not yet clear.

Examples of possible translation targets:

  • high school student – French: collégien or lycéen; added later: but: "highschooler"
  • indoor football – Dutch: zaalvoetbal; added later: is this actually a non-SoP name of a sport?
  • problem solving – German: Problemlösen; added later: but: "problemsolving"; but-but: "problemsolving" is much less common than "problem solving"
  • small boat – Czech: loďka, lodička; diminutives in general; added later: but: "boatlet"; but-but: "boatlet" is rare.
  • two-wheeled – Finnish: kaksipyöräinen
  • email message – Finnish: sähköpostiviesti
  • rice noodles – German: Reisnudeln


See also:

Feel free to add further examples and bullet items identifying discussions to this post.

--Dan Polansky 17:01, 22 August 2009 (UTC)

I believe that indoor football is a set phrase to be included anyway, and isn't SOP: indoor football is the name of a sport, with its own rules, it's not only football played indoors. high school student might be considered as a set phrase too, it would not be absurd. But adding small boat, small ... for the purpose of translations to languages such as Dutch, with a heavily used diminutive suffix, seems neither appropriate nor useful. So, yes, but only if a set phrase. Lmaltier 08:32, 23 August 2009 (UTC)
These are good points. Yet, "high school student" is a sum-of-parts, and set phrases are not included per current WT:CFI, so "high school student" would be a newly included term if translation targets are added to WT:CFI. --Dan Polansky 09:26, 23 August 2009 (UTC)
Why mention two-wheeled? It is already includable with current CFI. Lmaltier 13:21, 23 August 2009 (UTC)
It is not all that clear that "two-wheeled" is includable per current CFI, given the current request for deletion of "two-wheeled". To me, "two-wheeled" seems rather SoPish.
Feel free to add to the list above good examples of terms that would be added because of their translation-targetness. --Dan Polansky 14:33, 23 August 2009 (UTC)

None of the examples of SoP terms needed as translation targets are necessary; a high-school student is also a highschooler (or, specifically, a highschoolboy or girl, if you like), indoor football is idiomatic, problemsolving (written as a single word) is common, a small boat is a skiff (or, more predictably, a boatlet), and something that has two wheels can be called birotate.  (u):Raifʻhār (t):Doremítzwr﴿ 15:51, 23 August 2009 (UTC)

What's wrong with glossing by simple glosses, like “small boat?”
Accepting this proposal would multiply the potential inventory of acceptable words by an order of magnitude. Every inflected verb or noun would suddenly need a dozen or two new English entries created exclusively for it. Agglutinative languages might require English entries like for your (plural) repeated pretending to be undesecratable (Hu. megszentségteleníthetetlenségeskedéseitekért).*
This is taking glosses which belong in quotation marks and setting them in italics. It is also inviting editors to create entries for 100,000 S-o-P terms, phrases, and whole sentences. This is to increase the load on RFD a dozen-fold.
There will always be terms which have no synonym in a foreign language. Heck, every language has many regionalisms which have no general equivalent.
I'd be in favour of some new criterion for accepting “set phrases” or common expressions, but not for pretending that English has direct translations for every term in every language. Michael Z. 2009-08-23 18:11 z
I agree, -ish. We should never include a term that no one would ever look up. We can include terms that only a professional translator would expect to be able to look up; and we can include terms that most people would come across via internal links rather than by looking them up directly; but we should not include every series of English words that would be used to translate any foreign word. So, why do I say I only agree "-ish"? Because your comment purports to be objecting to Dan's proposal, but I don't think that is what Dan is proposing. He gives examples of series of English words that would be used to translate certain foreign words, and then labels them very explicitly as possible translation targets. Meaning that his proposal doesn't mandate all such entries. So, where you actually say what you don't think we should do, I agree; but where you object to "this proposal", "this", etc., I don't agree, or I don't know if I do, because I don't know if you're even talking about what you seem to be. —RuakhTALK 19:27, 23 August 2009 (UTC)
Okay, maybe I misinterpreted it some. But our RFV and RFD pages are already swamped with totally s-o-p phrases. Allowing more entries by criteria that require subjective judgment might be asking for trouble. I'd rather include English phrases by their intrinsic English qualities than because they are handy for reasons involving every language but English. Michael Z. 2009-08-23 22:23 z
Am I correct in assuming that the whole point of this proposal is to allow translation tables to exist housing single terms in foreign languages whose English-language æquivalents are SoP phrases?  (u):Raifʻhār (t):Doremítzwr﴿ 20:23, 23 August 2009 (UTC)
(indent) An explanation: I am not proposing anything yet. I am trying to execute a descriptive undertaking: to understand the specific and concrete, meaning example-based, impact of the proposal. I am sorry that I have redirected the discussion here; it could have stayed in Beer Parlour. I have created this section so that the topic has its home location, from which it should be possible to link to the discussions in Beer Parlour. The discussions should be easier to find months or years later.
In any case, examples of the impact are desperately needed; the above discussion shows that people do not as yet agree on what the impact of the proposal would be. And it is the impact or consequences of the proposal that make the proposal good or bad. --Dan Polansky 07:30, 24 August 2009 (UTC)

Voting on clarification at Wiktionary:Votes/pl-2009-08/Clarify names of specific entities

I started a vote, after BeeP discussion, to clarify the wording without changing the meaning of this section. Michael Z. 2009-08-27 04:54 z

The OED cites Usenet, too.

I find it interesting to note that the OED’s sub-entry for “ˈfelching n.” cites a Usenet newsgroup as its earliest quotation in support of the term; I reproduce it literatim hereat:

1989 Re: How can you eat Unwashed Pussy? in alt.sex (Usenet newsgroup) 17 Nov., The story also talks about sucking on the clitoris… But‥I want to read about *felching!

It seems like we’re not the only ones who allow Usenet groups as evidence of attestation…  (u):Raifʻhār (t):Doremítzwr﴿ 18:08, 18 September 2009 (UTC)

They also cite plain old websites occasionally, tagging it as something like "OED archive." I understand their thinking, and obviously they have the resources to create their own "durable archives", but it seems kind of lame. -- Visviva 08:26, 21 September 2009 (UTC)
Why? For better or for worse, we’re past the lexicographical age of restricting our quotations to those from literary magna opera. For a term in frequent current use, whether or not its use is in durably-archived media has nothing to do with whether a person will “run across it and want to know what it means”. Durable archiving is necessary solely for lexicographical verification. If a particular website coined or popularised a term, or represents the earliest recorded instance of its use, then it seems entirely appropriate to quote it as such; and if it isn’t durably archived, then it also seems entirely appropriate to durably archive it oneself, be that in the form of a printed screen-capture or whatever. In the continuum of descriptivist ethe, I could scarcely be described as a rabid inclusionist/inclusivist (Which is the better term there? Exclusionist, exclusivist, inclusionist, and inclusivist are all in the OED.), but I don’t see what value durable archiving has other than to facilitate lexicographical verification.  (u):Raifʻhār (t):Doremítzwr﴿ 13:15, 21 September 2009 (UTC)
Even if an attestation doesn't help qualify a term for inclusion, it might still be useful to show when and wherefrom it came into use, how it was used early on, etc. Michael Z. 2009-09-21 13:46 z
Agreed.  (u):Raifʻhār (t):Doremítzwr﴿ 14:16, 21 September 2009 (UTC)
My view was probably soured by the fact that I first came upon this when researching some dictionary word or other (not sure if it's one I've added to the list yet or not). Their lone bona fide citation for this word, which had been coined in the mid-17th-century and passed from one dictionary to another since, was from a 21st-century German website. It seemed painfully obvious that this was simply the infelicitous choice of a hapless website translator who made the mistake of relying on a German-English dictionary that had copied the word in turn from some earlier dictionary. I would have liked to think that the OED might feel just the slightest twinge of shame for their own role in perpetuating this misinformation.
But yes, there is certainly a valid use for this. -- Visviva 15:13, 21 September 2009 (UTC)
Perhaps they felt “just the slightest twinge of shame” for harbouring a “zombie word” based on an argumentum ad verecundiam and wanted to bolster their descriptive credentials by showing the word to be attestable. (That would also explain why so many of these dictionary-only words get tagged {{obsolete}} in post–second-edition draft revisions.) It seems to be their policy that once a word is added it never gets thrown out.  (u):Raifʻhār (t):Doremítzwr﴿ 17:14, 25 September 2009 (UTC)
It's also kind of lame in that their Web-site says, for example, “At the moment, because Internet addresses and references can change, texts that exist solely online cannot be used as a source for quotations.”[1] They don't say anything about an OED archive that renders a text non–solely online. —RuakhTALK 12:05, 25 September 2009 (UTC)
Yeah, they should probably clarify that…  (u):Raifʻhār (t):Doremítzwr﴿ 17:14, 25 September 2009 (UTC)

What Wiktionary is NOT

I believe we also need some statement of what Wiktionary is not.

Wiktionary is NOT an arbiter of what is suitable English, good English, correct English, or grammatical English. Like any English dictionary, Wiktionary merely documents and explains what is in use in English. It should be sufficient to show that a word or idiom is (or has been) in use, be it in common usage or within a specific group (such as the medical fraternity).

It seems to me that every time I come to Wiktionary and check through some of the words or idioms proposed for deletion, there are purists using arguments that essentially set them up as arbiters of what is good, acceptable English.

To quote from WT:RFD#US_American

  • "US America" is not a term that I have heard or read and is not plausibly an etymon of "US American". It seems not to matter to this self appointed arbiter that several citations of use are given.
  • Acceptability as English is one thing. Suitability for any specific purpose is another.

And neither has anything to do with whether it should be in an English dictionary. No one here is/should be setting themselves up as some authority to decide what is acceptable, what is suitable. Wiktionary should only be concerned with what is and is not used. If you want to decide what is acceptable or suitable use, or "Linguistically Correct", you should go join the French Academy (or similar). The role of Wiktionary is not to decide any such thing. Is it used? Is there reasonable evidence of its use? There is. End of argument.

see also WT:TR#chillaxin --Richardb 11:09, 25 September 2009 (UTC)

You seem to be looking for WT:NOT. A line about "Wiktionary is not prescriptive" would be a useful addition there. But this line seems especially pertinent to the current situation: "Wiktionary is not a battlefield. Every user is expected to interact with others civilly, calmly and in a spirit of cooperation."
You may also wish to reacquaint yourself with the distinction between idiomaticity and attestation, both of which are discussed at length on the present page. -- Visviva 11:41, 25 September 2009 (UTC)

I agree (anyway, there is no other possible practical option on a wiki if you want to avoid edit wars, this is the NPOV principle). I just want to add that this is true for all languages, not only English. Lmaltier 17:24, 25 September 2009 (UTC)

Clarification Required

The CFI need clarification on one point:-

“Attested” means verified through
 *Clearly widespread use, 
 *Usage in a well-known work, 
 *Appearance in a refereed academic journal, or 
 *Usage in permanently recorded media, conveying meaning, in at least three independent instances spanning at least a year. 

Are those 4 attestation criteria joined by OR, or by AND?

My personal view is that they should be joined by an OR, so that a term needs to meet ANY one of the criteria, and does not need to meet ALL of them.

I would suggest a change of the paragraph to

“Attested” means verified through meeting ANY of the following conditions
 *Clearly widespread use, 
 *Usage in a well-known work, 
 *Appearance in a refereed academic journal, or 
 *Usage in permanently recorded media, conveying meaning, in at least three independent instances spanning at least a year.

I cannot be bothered to mount a campaign or vote on my own. Does anyone agree enough to take it on? --Richardb 14:31, 1 October 2009 (UTC)

It may be that the wording could be better, but the "or" reading is how it is applied, without any controversy (about the "or", anyway) in my experience. DCDuring TALK 16:42, 1 October 2009 (UTC)
I believe the "or" at the end of the second-to-last line makes it clear that disjunction is intended. I missed that word reading it the first time, though, so it could be clearer. --Bequw¢τ 17:48, 2 October 2009 (UTC)
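
For illustration only, the "or" reading described above can be written out as a small Python sketch. The class and field names (Evidence, independent_citations and so on) are hypothetical and invented for this example; they are not part of CFI or of any Wiktionary tooling. The thresholds of three instances and one year come from the quoted wording.

```python
from dataclasses import dataclass


@dataclass
class Evidence:
    """Hypothetical summary of the evidence gathered for one term."""
    widespread_use: bool = False       # "Clearly widespread use"
    well_known_work: bool = False      # "Usage in a well-known work"
    refereed_journal: bool = False     # "Appearance in a refereed academic journal"
    independent_citations: int = 0     # citations in permanently recorded media
    citation_span_years: float = 0.0   # time between earliest and latest citation


def is_attested(e: Evidence) -> bool:
    """Disjunctive ("or") reading: meeting ANY one criterion is enough."""
    # The fourth criterion is itself a conjunction: at least three
    # independent instances AND a span of at least a year.
    durable_citations = e.independent_citations >= 3 and e.citation_span_years >= 1
    return (
        e.widespread_use
        or e.well_known_work
        or e.refereed_journal
        or durable_citations
    )
```

Under this reading, Evidence(well_known_work=True) counts as attested on its own, while three citations spanning only six months do not satisfy the fourth criterion. An "and" reading would replace each "or" in the return statement with "and", which almost no term could satisfy.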

Blunder needs to be corrected in CFI definition

Someone, at some time, has made a blunder, that has apparently been subsequently accepted by a vote.

Under ==General rule== we find the line-

A term should be included if it's likely that someone would run across it and want to know what it means. This in turn 
leads to the somewhat more formal guideline of including a term if it is attested and idiomatic.

I hate to point out the absurdity, but, if obeyed, this would mean we would have ONLY idioms in Wiktionary !

I propose that the General Rule should be changed to:-

A word should be included if it meets any of the following criteria
*Clearly in widespread use, 
*Used in a well-known work, 
*Appears in a refereed academic journal, or 
*Used in permanently recorded media, conveying meaning, in at least three independent instances spanning at least a year. 
(See below under Attestation for clarification of these criteria)

A term other than a single word needs to meet the above criteria, and additionally be idiomatic. (See below for Criteria for Idiomaticity)

This change would also remove the disparity between the very loose, almost colloquial general rule (if it's likely that someone would run across it and want to know what it means) and the more formal attestation requirements.

Again, it needs to be changed, but I personally can't make the effort to mount a vote and a campaign. Anyone want to take it on? --Richardb 14:58, 1 October 2009 (UTC)

So, you want to get logical about this, eh? To avoid a premature vote on all the wording changes, Wiktionary:Editable CFI has been begun. How that will interact with "official" CFI remains to be seen, but it is likely to be constructive. And it's an easier place to make such suggestions. DCDuring TALK 16:48, 1 October 2009 (UTC)
Not necessary in this case. It says "if", not "only if". The effect only applies if the condition is true. If the condition is false, then take no action one way or the other. --EncycloPetey 00:02, 3 October 2009 (UTC)
I think some of the confusion may be due to the different meanings of idiomatic, two of them listed here:
 1) Pertaining or conforming to the mode of expression characteristic of a language.
 2) Resembling or characteristic of an idiom.

I think CFI is using the first meaning here, not the second. (Please correct me if I'm wrong) Facts707 09:53, 3 February 2010 (UTC)
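
The "if" versus "only if" distinction raised earlier in this thread can also be sketched in Python. This is a hedged illustration only; the function names general_rule and strict_rule are hypothetical and do not correspond to any wording in CFI.

```python
from typing import Optional


def general_rule(attested: bool, idiomatic: bool) -> Optional[bool]:
    """'Include IF attested and idiomatic': a sufficient condition only."""
    if attested and idiomatic:
        return True   # the rule applies: include the term
    return None       # the rule is silent; it neither includes nor excludes


def strict_rule(attested: bool, idiomatic: bool) -> bool:
    """'Include ONLY IF attested and idiomatic': the reading objected to."""
    return attested and idiomatic
```

For an attested single word that is not idiomatic in the narrow sense, general_rule(True, False) returns None (no decision either way), whereas strict_rule(True, False) returns False and would exclude it. The objection at the top of this section assumes the second reading; the reply assumes the first.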

medical terms policy?

I'm wondering if there is any set policy on medical terms. Many of them are of Latin or Greek origin and the same term is used in many languages (eg: aorta). But bruit (from the French) is commonly used in English speaking medicine to describe a certain heart sound, but that is not mentioned in bruit. Also, foramen ovale is defined but not foramen magnum. Facts707 10:08, 3 February 2010 (UTC)

MW3 includes "bruit" in medical sense, but without a medical context, suggesting that it is likely worth inclusion. (Citations would be conclusive.) One of the best features of Webster's Third New International Dictionary ("MW3") is its coverage of scientific vocabulary. My print edition has Addenda with mostly technical terms dated as late as 1993. In the main (1961) portion they have five different singular compounds of "foramen" and one of "foramina". They add nothing further in the Addenda.
Are you asking about Translingual status? I don't think we would make something Translingual until we had evidence that the term was used in a "significant" range of languages. Thus for "bruit", attested usage in running English text should just lead to an English entry. If it is attestably used in German, French, Swedish, Italian, Russian, I would think that there is a case for Translingual. I haven't seen any particular shortcut to Translingual status for medical terms. Perhaps if multiple medical dictionaries declared it something like "International Scientific Vocabulary" ("ISV") as MW3 does with many entries, but not "bruit". DCDuring TALK 11:52, 3 February 2010 (UTC)

"Not a sum of parts" - proposed entry[edit]

We need an entry in "General rule" (after “Terms” to be broadly interpreted) to say that an entry shouldn't be included if it is just a sum of parts. For example, party leader should not be included (means "leader of a political party", "leader of an expedition", "leader of a celebration" depending on context), but post office should be, as it means "a place to send or receive mail" and not "a place that manages posts (such as football posts or signposts)" or "a place that manages jobs or positions (I'll put a new guard in that post)". Facts707 19:52, 4 March 2010 (UTC)

Uh, we do. See WT:CFI#Idiomaticity, "An expression is “idiomatic” if its full meaning cannot be easily derived from the meaning of its separate components.". --Yair rand 20:14, 4 March 2010 (UTC)

Why should we have "Given and family names"? - they are handled better and more completely by Wikipedia

I don't see why we need "Given and family names". Wikipedia has all the same information, but with greater coverage of names included and usually better etymologies and translations. If someone searches for a term that is not found in Wiktionary, the user will see the same term at Wiktionary, plus other related searches.

Compare Smith with w:Smith and w:Smith (surname) for example.

I don't think we should spend our limited resources trying to do something our sister project is already doing, and better.

Likewise, why do we have Seattle, but not Tacoma or Redmond? Or why do we need Tower of London when we have w:Tower of London?

These are not words in the English language, they're historical names and thus belong in Wikipedia unless they have entered the language for some other reason such as an idiom, e.g. Waterloo. Facts707 14:24, 13 May 2010 (UTC)

The toponym Churchill, for example, is an English word (more precisely, a lexical unit). It has an etymology, an eponymous literal meaning (“church hill”), and it is applied to certain kinds of referents (places and people). We systematically compile such lexicographical information, and a person who just wants to “look it up in the dictionary” needn't read a whole encyclopedia article for it. Wikipedia could (but currently doesn't) have an article about w: Churchill (name), including encyclopedic information which doesn't belong in the dictionary. There is also an open question of whether we should include onomastic information, more specific to names than to other words.
Of course, the person of w: Winston Churchill and the town of w: Churchill, Manitoba, are not words or names, so we shouldn't be duplicating Wikipedia's efforts by “defining” them here.
As you may have noticed, place names are not accounted for by our guidelines, and there have been many discussions and proposals regarding them over recent months, but none has yet achieved consensus. Michael Z. 2010-05-13 19:14 z
For an academic justification for proper names in dictionaries, see Mufwene (1988), “Dictionaries and Proper Names,” in International Journal of Lexicography, v 1, n 3, p 268. Michael Z. 2010-05-13 19:16 z
By the way, there's no deadline, so our resources are effectively unlimited. As long as we define concrete limits on the scope of the project, by our wt: CFI, then it will remain doable. Michael Z. 2010-05-13 19:19 z
Wikipedia's articles on names have none of the same goals as Wiktionary's. Wiktionary includes pronunciation, etymology (from a linguistic standpoint), translations, inflections (for non-English names), and other information, basically the same kind of things as for words. Names fit perfectly into the mission, I can't see any reason not to have them. Wiktionary includes information about words, Wikipedia covers concepts. Thus, information such as that Seattle in American Sign Language is S@NearSide-PalmForward Sidetoside and audio pronunciations of the word exist in Wiktionary, and information about the things themselves belong in Wikipedia. There is little (if any) overlap. --Yair rand 19:26, 13 May 2010 (UTC)

Permanently recorded media

For a discussion of "permanently recorded media", see also Wiktionary talk:Searchable external archives, and #The OED cites Usenet, too. --Dan Polansky 16:05, 19 May 2010 (UTC)


  • Beer parlour: What is Usenet?, September 2010

--Dan Polansky 12:04, 27 September 2010 (UTC)

Durably archived source

See #Permanently recorded media. --Dan Polansky 16:03, 19 May 2010 (UTC)

compare to Wikipedia

I added this at Wikipedia:Notability, but Wiktionary's CFI is locked:

"It is similar in basic concept, but has vastly different criteria from, the criteria for inclusion (CFI) on the Wiktionary project."—This unsigned comment was added by Facts707 (talkcontribs).

The basic concept is different: Wikipedia has articles on various topics, things, ideas, people, places, etc, while Wiktionary's entries are only about terms, names, proverbs. Michael Z. 2010-05-25 21:45 z

WT:Phrasebook not mentioned

While this page does mention the term "phrasebook", it does not explain what to do with these types of phrases, nor does it mention WT:Phrasebook. Facts707 21:36, 28 May 2010 (UTC)

Nobody yet has any inspiring vision for the Phrasebook and accordingly we don't have criteria either. Some think a phrasebook is sufficiently distinct from a dictionary that it should be a separate project. Some think it must be part of Wiktionary. Some think we should have a limited experiment. Some think it should have a separate namespace within Wiktionary. Some think that we should have a sex-tourism phrasebook as it is a neglected area in print phrasebooks. Others are offended or think it risks making us a laughing stock or placing us on blocked-site lists. In the meantime, I would not hesitate to add any phrase that is actually in a contemporary phrasebook. DCDuring TALK 15:50, 10 July 2010 (UTC)


Shouldn't Láadan be moved to "languages whose origin and use are restricted to one or more related literary works and its fans"? The complete use of the language is pretty much restricted to w:Native Tongue (Suzette Haden Elgin novel). And while I'm at it, shouldn't we remove Orcish from that list? w:Orcish doesn't even mention a language, and while I'm sure there have been many unnotable proto-languages named Orcish, it's not a real language of any note. And even further, why don't we delete Delason, Glos, Jakelimotu, Kyerepon, Latejami, Linga, Sasxsek, Suoczil, and Tceqli from the list? None of them have Wiktionary entries, and I'm somewhat familiar with the field and don't recognize any of them. We can't exhaustively list constructed languages, so why mention a bunch of unnotable ones?--Prosfilaes 15:01, 10 July 2010 (UTC)

Those seem like good changes. I haven't heard of anyone arguing for the inclusion of any of those languages. And none of the languages listed at the end even have Wikipedia pages. Bring it up in the WT:Beer parlour so that hopefully others will agree. --Bequw τ 23:55, 10 July 2010 (UTC)

Generic use

The attributive rule got voted out, but I didn't object to its idea, just its wording. It was as badly worded as can be imagined. It would be nice to add it back in a new form that fully explains what it means. A few points:

  1. Most important IMO and most potentially controversial, specific entities should not 'require' generic use, but generic use should be one way for an entry to pass. Therefore if Late Latin isn't used generically, it won't be deleted.
  2. Uncontroversially, the wording should be precise and leave as little room for doubt as possible. For example, attributive use could mean grammatical attributive use. So David Beckham haircut would be attributive use of David Beckham to modify haircut. Generic use, IMO, should be a meaning other than the primary one. So Billy Elliot would pass because of three citations of 'a Billy Elliot' referring to a young male dancer. All three citations would have to back up the same meaning, not just any meaning. Mglovesfun (talk) 08:38, 4 August 2010 (UTC)

Can you give some context please. What part of the CFI are you proposing to modify? What do you want to see included/excluded that isn't currently? Why? You appear to be talking about both generic use and attributive use, (although it's not clear what the uses are of) yet the section title is just generic use? Can you link to the vote in question so we can see what it was about and what the wording was? Have you got any specific wording in mind or is this just a statement of desire for someone to do something about something? Thryduulf (talk) 09:27, 4 August 2010 (UTC)

Wiktionary:Votes/pl-2010-05/Names of specific entities. Now that we don't have an attributive use rule, I'd like a generic use rule. I'll try and work out some wording when I have time. Mglovesfun (talk) 10:22, 5 August 2010 (UTC)
Input needed: This discussion needs further input in order to be successfully closed. Please take a look!

Usage in a well-known work

For a discussion of the criterion of "usage in a well-known work", see also WT:BP#CFI: Removing usage in a well-known work, January 2011. --Dan Polansky 13:01, 30 January 2011 (UTC)

Delete or improve?

Can I suggest that, rather than flagging entries for deletion, we try to improve them by adding missing etymology, pronunciation, linguistic info, etc.? User Msh210 has again started a war against place-name entries, e.g. Cannes and a few others starting with C. I was heavily involved in adding translations for many of them and I feel upset. Place names are allowed to have entries, and missing info should be nicely requested without threats to delete within a month.

Adding {{placename/box}} takes a second, adding all the required information takes much more effort. Can we all be more proactive and not try to wipe somebody's work but add the required information? --Anatoli 23:53, 8 March 2011 (UTC)

I'm sorry that I upset you. It was not my intention. It's also not my intention to delete the entries after only a month. More discussion on this is still in the BP, at [[#placenames up for deletion]]; might I suggest that you contribute to that discussion rather than here? —msh210 (talk) 18:00, 9 March 2011 (UTC)


I don't see the point of linking to Wiktionary:Votes/pl-2010-01/Renaming CFI section on genealogic names. DAVilla 13:50, 30 May 2011 (UTC)

"Encyclopedic" entries, attestation of individual elements of definitions

Those wishing to discuss the above-named issue(s) in the future may find this long, thoughtful discussion (sooner or later to be archived to Talk:Baidouska) interesting. - -sche (discuss) 04:17, 19 August 2011 (UTC)

Incidentally, baidouska might be considered a variant of pajduska/paidushka/pajduško/pajdusko/paidushko, which appears to be verifiable in English without too much trouble. Michael Z. 2012-01-30 22:55 z

SOP applied too deletionist

I feel that WT:SOP is applied too freely, and the result is that words that aren't particularly SOPy are deleted anyway, often almost arbitrarily. Part of the problem is that in a two-word phrase, one or more of the words has multiple definitions, and it isn't always 100% clear which definition is meant. Therefore, I propose a slight relaxation of SOP involving composite words where one of the words has multiple definitions. Purplebackpack89 (Notes Taken) (Locker) 16:01, 24 November 2011 (UTC)

There are a few problems with this. The main problem is that it would allow any combination of words where at least one of them has two meanings. So, since readily has two meanings, I readily agreed and I readily said yes would meet CFI. They're both attested. I has more than one meaning anyway, so any combination of words including the word I would meet CFI. There are other objections, but if I lump them all in one message, the conversation will get too sparse. Mglovesfun (talk) 15:49, 25 November 2011 (UTC)
IMO, the alternative, keeping SOP the way it is, is much worse. It means that phrases that aren't crystal clear are being deleted anyway. SOP really is in conflict with NOTPAPER Purplebackpack89 (Notes Taken) (Locker) 20:14, 25 November 2011 (UTC)
CFI are not clear, and misunderstood, because they are much too complex. The simple way would be to clearly state: all words (including set phrases) used in a language and that may be considered as belonging to the vocabulary of the language may be included. Lmaltier 20:32, 25 November 2011 (UTC)
"may be considered" does not seem like a usable rule. Considered by whom on what grounds? That's why we need CFI. Equinox 20:33, 25 November 2011 (UTC)
It's a simpler sentence, but applying it would not be simpler than what we have now. BTW I don't think NOTPAPER means "add every conceivable combination of words". We do need some credibility. Mglovesfun (talk) 20:37, 25 November 2011 (UTC)
Of course not every combination of words. Only elements of the vocabulary. And I'm sure that such a sentence would help very much (but additional rules would be required nonetheless, to understand what the rule means, and when limits are required for practical reasons, e.g. for numbers). Lmaltier 20:55, 25 November 2011 (UTC)
Since Purplebackpack89 hasn't replied to my first point, I'll go on to my second. The policy would support a fundamental mistake about how language works. In a dictionary, we define words and terms in isolation (one at a time) whereas in the real world words and terms appear in a context. It's often very clear from the context what something means. For example 'car crash' could mean the crash of a 'first part of a cons in LISP. The first element of a list' (etymology 2 of car); however, the context tells us which is meant, so we don't need to define car crash separately. If you spend too much time breaking something down into its parts, you forget about the whole. Mglovesfun (talk) 09:45, 26 November 2011 (UTC)
I didn't reply to your first point because I don't see any problem with that. Perfectly fine with lots more entries. Your argument seems to be based on a line of thinking (unproven, I might add) that. I think the opposite of that is true...if readers can't find a definition for the word or phrase they're looking for, or have to go to Wikipedia to complete the definition, they'll be disappointed. And if this community keeps thinking that commonly used two-word phrases like car crash and television show don't belong, I feel that readers will not think too highly of it and use something else. Purplebackpack89 (Notes Taken) (Locker) 15:08, 26 November 2011 (UTC)

General rule incomplete for Wikisaurus

The current general rule is "A term should be included if [and only if?] it's likely that someone would run across it and want to know what it means."

I think this rule is incomplete for Wikisaurus. For the most part, this rule addresses reading. A thesaurus primarily addresses writing. You know the definition of a word (possibly with great attention to its subtleties); you just want to find a synonym.

I would suggest that we amend the rule to read something like "A term should be included if it's likely that someone would run across it and want to know what it means or (for Wikisaurus) that someone would want to find a word or phrase with a similar meaning." Or we could deal with this in a new subsection. Or even in a brand new project page: say, CFI for Wikisaurus.

The reason for this is that the current rule has led us to delete thousands upon thousands of Wikisaurus idioms and slang phrases that do not meet the current criteria for inclusion in Wiktionary but, in my opinion, are useful in finding a synonym. A phrase may be defined by its words (and thus need no separate entry in Wiktionary) but still be useful in finding a synonym idiom with just the right connotations.

For example, the phrase "exercise my anus" was an entry in Wikisaurus's defecate in January 2009 but doesn't appear now. Similarly, "bikini stuffers" was a synonym for breasts in March 2006 but not now. If I was writing a story about a lazy summer on the beach, I might want to use "bikini stuffers" instead of "racks" or "boobs".

The Wikisaurus:Breasts entry now includes the warning "Only words that meet criteria for inclusion can be included." My point is that it's incomplete for thesauruses. --RoyGoldsmith 01:10, 29 January 2012 (UTC)

I don't think it unreasonable to include things in Wikisaurus that we don't have entries for. I don't find your examples the most convincing, though.--Prosfilaes 11:16, 30 January 2012 (UTC)
Why should Wikisaurus have laxer criteria for inclusion? Shouldn't a writer using this reference be confident that a suggested synonym is an attested part of the language, rather than some nonce coinage? Furthermore, should the writer find an unfamiliar word, they would also benefit from a detailed definition, notes about connotation, register, usage, etc., so they can employ the word correctly and appropriately. We have no business making up such entries out of the blue.
I don't see how we can include such terms sufficiently well, and it looks to me like this dictionary would suffer from their inclusion, while the thesaurus wouldn't really benefit. Michael Z. 2012-01-30 22:15 z
To answer my own question, perhaps currently-popular neologisms, too recent to meet our CFI could be included, but I would think they should be labelled as such. Michael Z. 2012-01-30 23:12 z
Because often the best term is SOP. Frequently, "at the seashore" is better than "littoral" and "named after a person" is better than "eponymous". But those two are rightly not eligible for entries.--Prosfilaes 00:21, 31 January 2012 (UTC)
Perhaps, but those feel more like definitions than something I would be hoping to find in a thesaurus. Michael Z. 2012-01-31 16:18 z
Can't thesaurus entries include non-dictionary-worthy items, just not making them links? Equinox 22:59, 30 January 2012 (UTC)
We used to have pages like Wikisaurus:breasts/more, but they were so embarrassing the community decided to get rid of them. -- Liliana 16:24, 31 January 2012 (UTC)
OK, let's say I'm a writer, writing a story about the beach. I might want to use "bikini stuffers" (if I knew about it) as opposed to "breasts" or "boobs" or "rack". Breasts are too clinical, boobs too teenage-ish and rack too racy. I might want something that specifically relates to swimsuits and bikinis. Roget's splits terms by numeric category so that you can find exactly the same meaning terms (synonyms), almost-exact-meaning terms (hyponyms or hypernyms), sort of the same meaning terms, terms that are related in some vague sense to other terms and, particularly, terms that are like other terms but with a different connotation.
In another example, "spitting chips" (a current red link) does not have the exact same connotations as any other synonym for Ws:angry. Are we supposed to leave out spitting chips (or bikini stuffers) just because they aren't listed as terms in Wiktionary? (And I'm not saying that spitting chips or bikini stuffers deserve an entry in the main dictionary.)
Remember, a thesaurus is not merely a list of synonyms. And it certainly isn't a list of attested synonyms. Creative writers want "nonce coinages" and one-time-only usages. In certain cases, they may even prefer phrases that have yet to see the light of publication. If a reader wants only attested usages, they always have blue links vs. either red links or straight text. For non-published phrases, I would isolate them on /more pages. --RoyGoldsmith 05:15, 1 February 2012 (UTC)
Are there professional thesauruses that make up new words, or are we about to corner the market in this category? Is there evidence of demand for this service, or is that speculation? Michael Z. 2012-02-01 17:35 z
What is a professional thesaurus? Do you mean published thesaurus? Perhaps even "published in paper-book form"? Or do you mean well-known thesauruses? Well, Roget's International Thesaurus (5th edition, published by HarperCollins in 1992) has tons of new terms. On virtually every page, in virtually every entry, you have roughly as many new reference terms (not in the thesaurus as a root) as root terms. For example, the first entry on page 1 is BIRTH. It has 28 noun terms, including "having a baby", "giving birth", "the stork", "birth throes", "blessed event" and so on. Most of these phrases are not used as root terms.
What I'm trying to say is that there should be NO connection between a dictionary (used for looking up meanings) and a thesaurus (used for finding like-meaning terms). The methodology for constructing them is totally different. For one thing, idiomaticity gets thrown out of the window. In a thesaurus, you want the root terms to be "easily derived from the meaning of [reference terms'] separate components" but a reference term might not be easily derived from the root term. For example, you can easily figure out that "bikini-stuffers" means "breasts" but, given the concept of breasts, you probably would not derive the phrase bikini-stuffers. --RoyGoldsmith 04:36, 5 February 2012 (UTC)

A summary?

I often thought that CFI is quite long and dense to read, as it uses a lot of technical terms. This makes it hard for newcomers to understand, which is a problem because they are the ones who need it the most! So maybe it would be a good idea to provide a short summary of the most important parts of CFI, maybe on a separate page, in simple "welcome message"-style prose? —CodeCat 19:11, 8 April 2012 (UTC)

Use full name of COALMINE

If you aren't going to summarize what WT:COALMINE means you could at least give the full name of the vote/policy ("Unidiomatic multi-word phrases to meet CFI when the more common spelling of a single word") which pretty much explains it. Siuenti (talk) 14:12, 15 July 2012 (UTC)

If you want it to be fixed, raise the issue in the WT:BP and get community consensus. --Μετάknowledgediscuss/deeds 07:06, 18 July 2012 (UTC)
Done: it used to give the full name, but Daniel changed that with no discussion, so I reverted it. -- Liliana 12:16, 18 July 2012 (UTC)
Oh, I see. Good call. --Μετάknowledgediscuss/deeds 23:46, 18 July 2012 (UTC)

Text for COALMINE.

Currently, § "Idiomaticity" ends with this indented, italicized paragraph:

:''The vote [[Wiktionary:Votes/pl-2009-12/Unidiomatic multi-word phrases to meet CFI when the more common spelling of a single word]] adds a criterion for inclusion without specifying text to be amended in this document, so please see it for the additional criterion.''<ref>([[WT:COALMINE]]) [[Wiktionary:Votes/pl-2009-12/Unidiomatic multi-word phrases to meet CFI when the more common spelling of a single word]]</ref>

I'd like to propose that it be replaced with this unindented, unitalicized paragraph:

If a collocation is significantly more common than an included single-word spelling, then the collocation <s>is</s> may be included as well, even if it is unidiomatic or debatable. For example, {{term|coalmine|lang=en}} is well attested, but {{term|coal mine|lang=en}} is significantly more common, so both are included, regardless of whether {{term||coal mine|lang=en}} is otherwise idiomatic.<ref>([[WT:COALMINE]]) [[Wiktionary:Votes/pl-2009-12/Unidiomatic multi-word phrases to meet CFI when the more common spelling of a single word]]</ref>

(O.K., so that wording isn't great. But it's an improvement over what we've got now. And I'd welcome further improvements.)

RuakhTALK 21:05, 17 August 2012 (UTC)

I'd support that, but this isn't the venue to propose it in. --Μετάknowledgediscuss/deeds 21:20, 17 August 2012 (UTC)
Why not? —RuakhTALK 21:22, 17 August 2012 (UTC)
I expect non-admins don't watch this page. --Μετάknowledgediscuss/deeds 21:26, 17 August 2012 (UTC)
I think it should read "then the collocation may be included as well" rather than "then the collocation is included as well." "Should be included" would be fine too, AFAIAC. DCDuring TALK 23:10, 17 August 2012 (UTC)
I see no need for such a change. However, if we're to make it, then I think the text should indicate in its discussion of the coal mine example that coal mine and coalmine are forms of the same word phrase thing, which it doesn't now.​—msh210 (talk) 05:13, 11 September 2012 (UTC)
Er, also in the normative part. As currently worded, it allows the house as significantly more common than encephalon.​—msh210 (talk) 05:20, 11 September 2012 (UTC)

Discussions of durability

Because it may be useful to have this index of them, here are some past discussions of durability:

- -sche (discuss) 23:43, 18 August 2012 (UTC)

Formatting of misspellings.

I assume it's uncontroversial to change this:

Once it is decided that a misspelling is of sufficient importance to merit its own page, the formatting of such a page should not be particularly problematical. The usual language and part of speech headings can be used, followed by this simple entry:
# {{misspelling of|[[...]]}}
An additional section explaining why the term is a misspelling should be considered optional.

to this:

Once it is decided that a misspelling is of sufficient importance to merit its own page, the formatting of such a page should not be particularly problematical. The usual language and part of speech headings can be used, followed by this simple definition:
# {{misspelling of|...}}
An additional section explaining why the term is a misspelling should be considered optional.


(I.e., changing "entry" to "definition", and removing the [[ and ]] from inside {{misspelling of}}?)

RuakhTALK 15:22, 10 September 2012 (UTC)

Never assume.  :-)  But I, for one, support such an edit without a vote.​—msh210 (talk) 16:29, 10 September 2012 (UTC)
I would support something like "... followed by a simple definition using the following format:". --BB12 (talk) 18:28, 10 September 2012 (UTC)
I support the spirit of the change, but it needs to mention and explain the lang= parameter too. —CodeCat 19:21, 10 September 2012 (UTC)
O.K., first of all, @BenjaminBarrett12 and @CodeCat: your comments imply that you don't support the currently proposed changes unless modified as you propose. If you don't, then — why don't you? Do you not consider them to be improvements? Do you feel that they're too minor, on their own, to warrant editing WT:CFI? Something else? (I ask because part of the point of being able to make uncontested changes after mere discussion, without a full vote, is that it allows smaller changes to be made piecemeal, without much bureaucracy. If lots of people jump on and add riders, refusing to support the original change, then I think we'll end up back where we started. I hope that you two aren't holding this change "hostage" to other changes you want.)
Those questions out of the way . . . how about:
Once it is decided that a misspelling is of sufficient importance to merit its own page, the formatting of such a page should not be particularly problematical. The usual language and part of speech headings can be used, followed by a simple definition using the following format:
# {{misspelling of|occurred|lang=en}}
An additional section explaining why the term is a misspelling should be considered optional.
 ? (This incorporates BenjaminBarrett12's change; it adds lang=en per CodeCat — though I suspect that now DCDuring will object; and it uses the "occurred" example from earlier in the "Spellings" section, rather than .... This last part is because lang=... was too vague, and I feared that of|...|lang=en could be taken to imply that only English misspellings are allowed, whereas of|occurred|lang=en seems more obviously just an example.)
RuakhTALK 19:41, 10 September 2012 (UTC)
Some people here think black and white like that but I try not to. Any improvement is good, even if it's not yet the end result I would prefer. I support the change, but I'm also pointing out that it can be improved further and that I would prefer that. —CodeCat 19:51, 10 September 2012 (UTC)
Basically ditto. --BB12 (talk) 21:39, 10 September 2012 (UTC)

Support with all changes (Ruakh's, BB's, and CodeCat's) --Μετάknowledgediscuss/deeds 14:00, 11 September 2012 (UTC)

Does that mean that you only support if all changes are made? Or do you support each change independently? —RuakhTALK 22:44, 11 September 2012 (UTC)
Independently. In general, you can assume that my votes for certain changes must be enacted together iff I say "iff". --Μετάknowledgediscuss/deeds 00:27, 12 September 2012 (UTC)
Support Ruakh's original change, don’t mind (would support, but would also be OK without) BB's or CC's. - -sche (discuss) 19:21, 11 September 2012 (UTC)
I support Ruakh's and Ruakh+BB12's also.​—msh210 (talk) 20:15, 11 September 2012 (UTC)
Does that mean that you object to CodeCat's change, or merely that you don't actively support it? —RuakhTALK 22:44, 11 September 2012 (UTC)
It's a good idea in theory, but I can't think of an implementation that is not too wordy or awkward and that refers to English also (not only foreign entries). So I suppose I'm opposed to the exact wording proposed above while in favor, perhaps, of another.—msh210℠ on a public computer 03:27, 12 September 2012 (UTC)
lol, this is almost as bureaucratic as a vote... - -sche (discuss) 23:14, 11 September 2012 (UTC)
"Fancy thinking the Bureaucracy was something you could hunt and kill!" said the head. "You knew, didn't you? I'm part of you?" —RuakhTALK 23:29, 11 September 2012 (UTC)
I agree with Sche. This really isn't that complicated, guys. Nitpicking ≠ consensus-gathering. --Μετάknowledgediscuss/deeds 00:27, 12 September 2012 (UTC)
I mean to say, אוי#Yiddish. --Μετάknowledgediscuss/deeds 00:38, 12 September 2012 (UTC)
I support Ruakh’s and CodeCat’s changes, and don’t mind Benjamin’s. I also assume it’s uncontroversial. — Ungoliant (Falai) 23:32, 11 September 2012 (UTC)
  • Done. Any admin who objects, please comment here and revert. Any non-admin who objects, please comment here saying so, and an admin will revert for you. —RuakhTALK 15:08, 18 September 2012 (UTC)

Suggest writing guidelines for...

Suggestion to add guidelines about why not to provide definitions of phrases like at the, or other words which are often paired but convey no special meaning or importance over the same words singly, such as biker gang. And why it is important, if a word is often used to describe something which shirks or goes against the usual defining meaning of the word, to explain with etymology or irony as appropriate, such as the use of the words hero and protagonist in literary review to describe a main character irrespective of the quality of being heroic, or the position of protagonising. RTG (talk) 12:12, 7 November 2012 (UTC)

Grammatical error

"A name that occurs only in the works of fiction of a single author, a television series or a video game, or within a closed context such as the works of several authors writing about a single fictional universe is not used independently and should not be included." I think it should be "If a name that occurs only in the works of fiction of a single author, a television series or a video game, or within a closed context such as the works of several authors writing about a single fictional universe is not used independently then it should not be included." Trongphu (talk) 05:59, 27 December 2013 (UTC)

And the reason for that is the former sentence doesn't make sense. Trongphu (talk) 06:00, 27 December 2013 (UTC)
It is fine. It says "a name [of that kind] should not be included". Equinox 06:13, 27 December 2013 (UTC)
Maybe a comma could help prevent future (mis)conceptions:
"A name that occurs only in the works of fiction of a single author, a television series or a video game, or within a closed context such as the works of several authors writing about a single fictional universe, is not used independently and should not be included."
Just my two paras. --biblbroksдискашн 17:36, 27 December 2013 (UTC)

Archaic inflected forms

There are a number of verbs in Russian that have multiple choices of inflection for all forms of the same verb (not just one complementary form). And these forms are equal in use, for example: "брызгать": я брызгаю/я брызжу, "алкать": я алкаю/я алчу, "рыскать": я рыскаю/я рыщу etc. For some of the verbs there is a contemporary way of inflecting them and an old one that was used pretty widely in the 19th century (by Russian classic writers). The old ways of inflection might even appear as the main ones in some grammar books from the beginning of the 20th century. Lexicographers, of course, do mention the contemporary way of inflecting in today's dictionaries and either omit the old ones or mark them as archaic. Sometimes the words are treated as w:defective verbs in the new dictionaries: even the infinitive is considered non-existent, while the most widely used forms are preserved. For example, the infinitive "обымать" is stated to be eligible only for the standard conjugation, but for the old one it's no longer considered an infinitive; for the old conjugation ("объемлю" 1st.p. pres.) only the present tense is considered existent.

Since the main purpose of Wiktionary is to describe all words (and their forms) regardless of how outdated they are, so that any form can be searched for, my idea is to specifically prescribe in the Wiktionary policy that all forms, even old ones (which belong to this language, not its Old Language counterpart), be included in the word articles, no matter whether they are in use today.

As an example, I made this article with two conjugations, with the second one marked as the old one. The other user, guided by today's dictionaries, moves the second conjugation into a defective verb article. Please arbitrate who was right, and prescribe the correct way of dealing with such cases. Soshial (talk) 16:30, 12 January 2014 (UTC)

I don't know if that's really feasible. Sometimes there are many different ways that an old form was written (seien is an extreme example), and we can't fit all those forms into one table. —CodeCat 17:02, 12 January 2014 (UTC)
I see, but I was talking not about spelling variants, but about forms that are equal in usage but their production belongs to different classes of conjugation. Soshial (talk) 18:14, 12 January 2014 (UTC)

children's language

Do we need to tweak CFI for children's language? For example, i seem to remember from long ago that "pesk" was used at least by children as a noun in English in the USA (S/He's a real pesk.). I haven't lived in an English-speaking country for a long time, so i don't know whether it's still used. Websites and even Google Books do a bad job of recording the language of children, so i'm not surprised Google only finds very few hits for "a real pesk". --Espoo (talk) 10:45, 3 February 2014 (UTC)

Slang and dialect will be hard to cite, but I don't see any reason to change our rules. On one hand, IMO the citation rules are important in keeping words that people might actually find and look up instead of pretending to cover all unrecorded slang. On the other, children's language is at the bottom a hopeless mire; every family has its own cute mispronunciations and English spellings will vary over the map.--Prosfilaes (talk) 15:00, 3 February 2014 (UTC)

Relevancy, misspelling or just not a word

Is "overcommissioning" a word, as it only has 125 hits on Google? (A virtual disk might rather be overcommitted, than overcommissioned; but acting role slots for a play, maybe not.) --Alien4 (talk) 08:37, 20 May 2014 (UTC)

It's rare, but I found 2 hits on Google Books, as well as at least one for hyphenated over-commissioning, so it's not completely made up. —Aɴɢʀ (talk) 10:23, 20 May 2014 (UTC)
In context, such a word might be easily understood, so the fact that it might not meet the standards for inclusion in a dictionary need not keep one from using it happily in an appropriate situation. OTOH, without more context I don't really get the connection between a verb overcommission or a possible noun overcommissioning and "acting role slots". DCDuring TALK 12:19, 20 May 2014 (UTC)

"Some examples include..."

This phrase has some redundancy about it. IMO, we should say "Examples include..." or "Some examples are..." Equinox 21:14, 11 July 2014 (UTC)

Question re attestation.

So we can't quote Wikipedia, or anything else Wiki. But if a book, otherwise quotable, quotes Wikipedia, can we quote its quote as attestation of the words quoted? DeistCosmos (talk) 06:19, 23 August 2014 (UTC)