Wiktionary talk:Thesaurus/criteria


Possible criteria:

1000 Google hits

One thousand Google hits is as arbitrary a standard as can possibly be: "throuhg" gets 102,000 hits! Do we want all the common typos to be included while real but uncommon words (threptic, 419) are excluded? Also, if I claim that "Joe McCarthy" is a synonym for "penis", 15 million hits say that it should be included as such, even though none of the hits necessarily indicate that there is any relationship between the words. If we are going to write criteria for inclusion, let us not half-ass it; let's come up with a meaningful standard. It may be necessary to establish a clear set of goals for this project, a statement of purpose or vision or whatever; it is hard to decide what course to take when there is no clearly stated goal. - TheDaveRoss 08:07, 30 May 2006 (UTC)

Wiktionary CFI

I am of the opinion that if a word doesn't meet the standards of the main namespace dictionary, it doesn't belong in WikiSaurus either. The purpose of WikiSaurus in my mind is not to include all of the many ways people have expressed each idea or concept over the years, but rather to provide the user with as many useful relationships to the word they are querying as possible. There is certainly some use to an all-inclusive listing of every slang term ever conceived, but it isn't what I think this project is or should be about. I would MUCH prefer to see an incomplete but accurate thesaurus than an overcomplete and relatively useless (beyond confirmation of a slang term's probable intent) cesspool of billingsgate. - TheDaveRoss 05:41, 31 May 2006 (UTC)

Small Improvement re Citations

With the idea of Wiktionary:Citations becoming more widely adopted, consider putting the supporting citations, Google count, etc. in the .../citations sub-page, and include a link in the main entry, "See Cites", linked to the citations page.

I added the above; I will start a discussion on the Beer Parlour. --Richardb 03:55, 13 June 2007 (UTC)

Other Criteria

What should be the standard for inclusion:

  • Usage in a published work.
  • Usage in X number of running text citations gathered from any primary source.

What doesn't merit inclusion

What should be excluded:

  • Any word exclusive to a small group of people (e.g. a high school community vs. regional).
  • Generic synonyms. For many entries (especially the "anatomical" ones) there are thousands of permutations from a basic concept, all of which could be called synonyms, e.g. any phallic-shaped object can be used as a synonym for 'penis'. This does not mean that everything which anyone has ever observed to be phallic-shaped, or even called a penis, should be included. There needs to be a stricter rule for whether or not something is valid for items such as these.
  • Nonce words. If a character on a television show refers to some bodily organ as their "little skippy", this should not warrant an entry unless a larger usage is sparked by it.

Let's decide WikiSaurus general rules for 99% of entries

Let's decide WikiSaurus general rules for 99% of entries, not for the 1% of billingsgate words, as you so quaintly put it. And then perhaps have some special rules for the "tough" words. I did not propose the 1,000 Google hits criterion for all words, just as a default if someone can't think of anything better as the criterion for any particular word. And, of course (why do people deliberately try to misinterpret?), any 1,000 Google hits have got to be seen to be at least partially relevant to the usage!

I strongly disagree with the limited idea of what the WikiSaurus is for. You are essentially saying that the WikiSaurus is for "writers". That is, someone is writing something, and wants either to avoid using the same word repeatedly, or to find a more appropriate (approved) word, or just to show their erudition. Whereas I am looking at WikiSaurus as also being useful for foreign language or less erudite "readers". That is, someone sees or hears a word and wonders what it means. It is even useful for writers who are trying to make their text less pompous and stiff. In this case it's better to have a synonym entry in WikiSaurus than no entry at all in Wiktionary. A crossover of these two positions can also be envisaged, where a foreign language translator knows a perhaps inappropriate translation, but wants to find a more appropriate synonym (cleaner or "dirtier" may be appropriate to the context). If you take all the disgusting / stupid synonyms out, then how will the translator find a link to what more appropriate words might be available?

Again, I would say to all the deletionists: Wiktionary is still of very poor quality because of the lack of thoroughness with the basic English words, and the number of commonly used words and phrases that are still missing. If the deletionists would only put their efforts into making the damn thing more complete and comprehensive rather than into bowdlerising it, limiting it, it might yet be a worthwhile project. But so many deletionists just can't see past the "dirty" entries. I think some of you have one-track minds. (Of course, I know many people think I'm the one with the dirty mind since I put so many "dirty" words in to start with. But I'm nowhere nearly as fixated on these words as some of our bowdlerisers are.)

So, while agreeing with TheDaveRoss on trying to find some common formats, I'm completely at odds so far as criteria for inclusion are concerned. That doesn't need to stop us working together on what we agree on, though. --Richardb 03:50, 4 June 2006 (UTC)

From my point of view, a dictionary is for what words mean; a thesaurus is for how words relate. If you want a new speaker to be able to understand the meaning of what they come across, then lobby for inclusion in the main dictionary. Lists of synonyms aren't what I think this should be about. It seems very incongruous with the nature of a thesaurus to try to use it as a dictionary for the less "acceptable" terms that got rejected from the main dictionary.
I don't think that the vision I have for WS is one that would only be useful to writers. I think it will be useful to a wide variety of people, from someone in search of a synonym to someone who wants a full set of semantic relations and beyond, the writers and so forth. I think that we should have what every thesaurus out there has and more; we can fit it all and we can make the layout and the hyperlink system work for us. Why not?
As for the 99% which is a great place to start: I would like to see the articles populated with more of the standard fare, the stuff that everyone in the known universe agrees is a word in the English language. This is of course not a good standard, but since I am one of the few people who are adding anything to WS at present, I get to pick and choose :)
For an actual guideline, I am very concerned about the following: 1) Nonce words, protologisms and generics should be excluded. I know this rubs you the wrong way, but that is what I will be advocating. I think I have articulated my arguments, and you have articulated yours, and we are just going to disagree on that, but I am solidly there. For the rest... I am fond of the Wiktionary CFI. I would actually be in favor of a stricter one than that; I would like to see words only from sources which are edited and hold a wide circulation. But I have gotten past this and am accepting of a wider selection of words, though there has to be a line if this project is to mean anything at all. I also don't agree with the characterisation of people who are against the inclusion of the uncommon items from the anatomy entries as "bowdlerisers". My intent (and, I am willing to bet, that of others) is not to censor these entries on moral grounds. I am totally in favor of even the raunchiest words being in this project; my desire is not to have WS reflect only the words that I myself use. I want it to reflect how the language is actually used widely, which I don't consider to be censorship at all, but rather a requirement of validation on the items we do include. - TheDaveRoss 04:51, 5 June 2006 (UTC)
I agree with Dave that the criteria here should be at least as rigid as, or perhaps even more rigid than, the ones for dictionary inclusion. Rather than making the Saurus a depository for obscure and vague terms, it should be the opposite and contain that which is acknowledged (and useful for the majority of users). If people want to look up ten thousand other (unattested) ways of saying "penis" or "breasts", let's make some kind of WT:LOP part two for synonyms.
For dubious terms and possible tosh, I think we should keep a stiff upper lip and subject them to the same verification processes as we currently do for the dictionary. What is more, perhaps we should tighten all links with the dictionary part of Wiktionary and allow only bluelinked terms in Wikisaurus. This will make the job for Wikisaurus easier and less complicated, will stimulate the dictionary part as well, and will not needlessly complicate things on the CFI side of Wikisaurus. The beginnings may be frustrating, but the end result will be much more impressive, I guess.
Excluding these thousands of synonyms for anatomical terms is not bowdlerizing, but, as Dave put it, merely following the CFI. Terms must be attested by means of printed sources if we ever want Wiktionary (both its dictionary and thesaurus parts) to be taken seriously. How am I supposed to find a synonym for any of these if I can't be sure it is actually in common use? — Vildricianus 11:32, 5 June 2006 (UTC)

Just a note re /more

I think the debate has moved on a bit since most of the above discussion, since the trial use of the /more page to allow the trashier stuff to be stored separately, and since the adoption of semi-protection for those words that were overpopulated with dross. --Richardb 03:58, 13 June 2007 (UTC)


The following information passed a request for deletion.

This discussion is no longer live and is left here as an archive. Please do not modify this conversation, but feel free to discuss its conclusions.


I can't believe this nonsense exists. It is in direct conflict with the VOTE passed on this topic. Crap in the WS namespace has to pass WT:CFI, or it is not on the WS page. Instead, such items are relegated to a "/more" subpage, without wikification. --Connel MacKenzie 08:47, 11 June 2007 (UTC)

Delete or rewrite. DAVilla 16:58, 11 June 2007 (UTC)
The only vote I could find related to Wikisaurus was Wiktionary:Votes/2006-09/Wikisaurus semi-protection. Is this the vote you are referring to? -- A-cai 20:41, 11 June 2007 (UTC)
Not surprising you could not find such a vote. As is often the case, CM asserts something is common practice, or has been VOTED on, when there is no truth in his assertion, and the evidence to counter him can be shown.
Contrary to CM's assertions, Wiktionary:Wikisaurus/criteria is a valid policy page. It has been a policy page since May 2006. Furthermore, Wiktionary:Wikisaurus/criteria is referred to by a paragraph in WT:CFI that has been there since May 2006. And WT:CFI has been VOTED on as policy since then, I believe. If WT:CFI is policy, then this is policy too, by virtue of this clear, long-standing exception reference.
What CM really means is that it doesn't match his idea of what policy should be. He's entitled to that view. But it is totally inappropriate to RFD a policy page. If you want to delete it, AND make the necessary amendments to WT:CFI, then come up with a proposal, argue it, and put it to a vote. Do not just RFD it, and abuse me in the process.
To avoid a "war" I am not removing the RFD banner at this time, even though I think it was put there totally inappropriately, contrary to policy. I would ask CM to remove the RFD tag, and put his proposal for change to a VOTE.--Richardb 12:49, 12 June 2007 (UTC)
This doesn't quite paint the whole picture. If you look at the edit history, you'll see that the page in question was a Policy Think Tank as recently as last October. It was not accepted policy. The current banner on the page does not assert that the page is policy, but rather that it is policy or a guideline or common practice. That banner was edited, but not the page on which it appears, and neither was the page ever elevated to policy except by the quirk that the banner which appears there was altered. This "policy page" was never voted in as policy, so as far as I'm concerned it's still a "think tank" or "guideline", not policy. --EncycloPetey 15:33, 12 June 2007 (UTC)
So, let me understand this. You know the history, so you "know" it isn't policy. And the facts, that it has a Policy Banner on it and is referred to directly by WT:CFI, carry less weight than your personal knowledge. But how are other, less "knowledgeable" users to know this? Why would they not believe it to be current policy, with the banner and the clear reference? You either have and abide by policies, or you act regardless of the policies. When will people get that into their heads?
Personally, I believe that since WT:CFI was voted on, with the clear reference to the exception for Wikisaurus criteria in it, then the Wiktionary:Wikisaurus/criteria page has some standing as Policy, or suggested policy. No matter your personal "knowledge" or assertions otherwise. --Richardb 02:58, 13 June 2007 (UTC)
The page history of {{Template:Policy-TT}} shows: 15:45, 28 January 2007 Connel MacKenzie (Talk | contribs | block) (Redirecting to Template:policy) [rollback]. I.e., Connel made the redirection so that Policy-TT pages (think tank) became Policy pages. Whether he intended that consequence or not I cannot tell, but that was the effect. --Richardb 03:12, 13 June 2007 (UTC)
Please don't try to spin-doctor my statements. I know you're upset at Connel, but getting angry at everybody will not bolster your position. I am laying out facts that are evident to anyone who cares to look in the edit histories. Your argument above seems to be that since Connel (knowingly or not) changed the banner to say that the page is policy, therefore it is policy. You're giving Connel a lot of personal power with that argument. --EncycloPetey 17:06, 13 June 2007 (UTC)
If someone who doesn't know the history comes to WT and sees the CFI page pointing to the Wikisaurus/criteria page, and the Wikisaurus/criteria page with a policy banner on it, and the policy banner says it can only be changed by voting on it, what are they then to make of CM marking it for deletion without a VOTE first? It would seem to be clearly a violation of that "no change without a vote" instruction. And surely, the policy as seen by such a person has to take precedence over some somewhat personalised remembrance of what was really intended, especially when that remembrance is disputed. To have remembrance override clearly stated policy and instructions is not the way a policy-controlled environment should be working. --Richardb 13:26, 18 June 2007 (UTC)
By the way, all along we have all been assuming that CFI was voted upon. Yet I cannot find a reference to this in the VOTE history, or in the page history of CFI. Does anyone know when/where the vote was taken? Can we, should we, "post-facto" move such an important VOTE record into VOTE history? --Richardb 03:12, 13 June 2007 (UTC)
No, you have been making that assumption. I have not because I have never seen such a vote take place. Since I have been the one taking the time to archive old votes, the fact that I haven't seen any evidence of such a vote should be taken to strongly mean there probably wasn't one. There certainly was never one on the VOTE page. In fact, very few policies have ever been voted into place. Most of them were grandfathered in under your policy restructuring a little over a year ago, rather than voted on. --EncycloPetey 17:06, 13 June 2007 (UTC)
I have been making that assumption precisely because CM repeatedly asserts that CFI was VOTED on by the community. My argument was only that IF it was voted on by the community, then the reference to Wikisaurus as an exception was in there WHEN it was VOTED on. But now you are telling me, that despite all CM's assertions, CFI as such was not voted on! --Richardb 13:26, 18 June 2007 (UTC)
How do you read "getting angry at everybody" into my trying to use a logical argument? --Richardb 13:26, 18 June 2007 (UTC)
What I'm saying is that you seem willing to accept that someone can slap a banner on something saying it's policy, even when no vote happened. When the same person returns to correct that mistake, you begin arguing that the page is now policy. Your position is coming from two irreconcilable viewpoints, and is therefore illogical. The only sense I can make of the situation and your behavior is that you are temporarily blinded by anger, otherwise none of this makes the least bit of sense to me. --EncycloPetey 22:32, 18 June 2007 (UTC)
Anger at me aside, has this been orphaned and deleted yet? --Connel MacKenzie 20:25, 30 January 2008 (UTC)
  • deleted. Never went anywhere, wasn't widely accepted. - [The]DaveRoss 01:47, 15 April 2008 (UTC)