Wiktionary talk:Requests for verification/Archive 1

This page contains archived threads from Wiktionary talk:Requests for verification from 2005, 2006, 2007, 2008 and 2009. Archived Requests for Verification (discussions of the attestation of individual words) are found on the words' talk pages or WT:RFVA.

General comments about the page

I'm so happy to see this that I'm reluctant to offer anything resembling a criticism. However ...

I'm not comfortable with the one-month deadline (or any such deadline). If it's the compromise needed to make this work, so be it. I'm certainly fine with deleting an article that turns up nothing at all after a month, particularly since this page seems to have had the desired effect of encouraging people to look for cites. My concern is, what happens to something that turns up one or two cites, or turns up inconclusive cites (e.g., the word is used, but the exact meaning isn't clear, or independence is in question)? To my mind, this is just an incomplete article. More evidence might turn up at any time. I would strongly prefer to see such an article carried along indefinitely with a nice big RFV disclaimer. Anyone looking up the word has been adequately warned not to put too much stock in the term. My problem with deleting a term has always been that we lose whatever work has been done so far. Even if it's not much (but don't underestimate the work that can go into winnowing out even two good cites), why lose it at all? If anything, showing our process in action so openly helps our credibility. -dmh 03:01, 12 October 2005 (UTC)

Well, the main purpose to adding a deadline was indeed the compromise between the deletionists who would throw such content right out, and people like you and me who would prefer to retain it especially when doubts on validity exist: I wanted a window where the words could prove themselves, instead of perching precariously on the sill of RFD. If we were a smaller wiki with fewer users, I would even have suggested a longer deadline, perhaps up to a year or more.
You do bring up a good point about what to do when we have evidence but not conclusive evidence. I would hope at the very least that the information accumulated in the process could be archived on the talk page.
At any rate: I did put some thought into the creation of this process, but I did it all myself; and, as we are a wiki, and as this process is all somewhat new and thus hopefully not yet set in stone, I hope that other users, such as yourself, will be bold and propose or implement improvements to this policy so we can have a better wiktionary overall in the end. —Muke Tever 04:36, 12 October 2005 (UTC)Reply
Moving borderline cases to the talk page and deleting the main entry seems reasonable to me. If we go with that, I'd like to see a notice on the "not found" page mentioning that there may be preliminary material available on the talk page (ideally with a link to the talk page itself). This would satisfy my desire to preserve work in progress, and I hope it would also satisfy the desire to limit main entries to well-verified material (or, more realistically, to increase the proportion of well-verified main entries, which goal I support). -dmh 15:32, 12 October 2005 (UTC)

I believe we have an adequate compromise on the main point of this debate. The main point for me was a time limit. Of course, I would have preferred a shorter limit, but I can accept one month as reasonable. We probably still disagree on what constitutes a good citation, but that problem is less immediate. I will continue to consider blogs and usegroups as less reliable references, and put considerable reliance on paper sources.

I don't think that conditions like having at least three citations are that important, nor is the condition that they be spread over a year. One good solid reference can be enough. Nevertheless, I would hesitate to remove these points from CFI. I think too that references can be requested for any word, including the most common ones, even though that request can be easily satisfied with a Webster reference. In the case of an obvious inflection, having a link to a documented root word should be enough.

On the matter of what to do when the month expires, there are many possibilities, and I would prefer to withhold comments until we start to have examples for which the month has expired. I trust that the sniping will come to an end and that we can find more ways of working together. Eclecticology 23:30, 14 October 2005 (UTC)

I would want to avoid referring to other secondary sources (e.g. Webster, OED) where possible. On the other hand, appearance in one of the major print dictionaries certainly presents a prima facie case that a term is or has been in use. So yes, cite other dictionaries as positive evidence that a term merits an entry, and even use out-of-copyright citations mentioned in other works, but don't copy the definitions themselves, even if the dictionary in question is out of copyright. The primary/secondary/tertiary/... distinction really crystalizes for me what I don't like about importing Webster 1913 wholesale (beyond concerns about the form of the definitions).
As a trivial case, if, say, Webster 1913 supports a term with a quote from Shakespeare, then we should consider the term supported by appearance in Shakespeare, regardless of its appearance in Webster. In cases where Webster quotes a (now) lesser-known author, it would be best for us to dig up the exact cite. I was going to give a specific example, but gave up after checking a couple dozen of Poccil's entries. Otra vez.
In any case, we can continue to hammer out details of policy and process. I believe CFI as it stands recognizes the value of "one solid reference" (in the form of a "well-known work"). However, rather than invite debate over just what that is, we should probably give a fairly narrow interpretation and revert to the more objective though admittedly somewhat arbitrary "three spanning at least a year" rule if there's any doubt. The whole point of appearance in a well-known work is that it tends to correlate with actual use.
Certainly, appearance in print in and of itself doesn't confer a great deal of solidity from a linguistic point of view.
I will say, however, that I will take a dim view of any policy which tends to lose even small amounts of serious work. This does not include vandals, BJAODN and the like, but it does include anyone's serious attempt at a definition (even, or especially, an anon's). In practice, this is not the subjective call that it might appear to be. -dmh 03:43, 15 October 2005 (UTC)
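
The attestation test discussed above (one use in a "well-known work", otherwise "three spanning at least a year") can be sketched in a few lines of Python. This is only a minimal sketch, not actual Wiktionary tooling: the Citation class, its field names, and the reading of "a year" as 365 days are assumptions made for illustration, and every citation passed in is assumed to have already been judged independent by a human editor.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Citation:
    """A dated use of a term (hypothetical structure, for illustration only)."""
    when: date
    well_known_work: bool = False  # e.g. a use in Shakespeare


def is_attested(citations: list[Citation]) -> bool:
    """Apply the two routes to attestation described in the discussion.

    Assumes every citation passed in has already been judged independent.
    """
    # Route 1: a single use in a well-known work is enough.
    if any(c.well_known_work for c in citations):
        return True
    # Route 2: at least three citations...
    if len(citations) < 3:
        return False
    # ...whose dates span at least a year (approximated here as 365 days).
    dates = [c.when for c in citations]
    return (max(dates) - min(dates)).days >= 365
```

What counts as a "well-known work" stays a human judgment in this sketch, which matches the suggestion above to interpret it narrowly and fall back on the three-citation rule when in doubt.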

I find importing from and quoting the definitions of secondary sources to be perfectly acceptable as long as credit is given. Even short citations from copyright sources can come within fair use. When we make up our own definitions we are getting too close to original research, and it is too easy for those definitions to be POV.

The problem with the citations in the 1913 Webster is that the attributions are skeletal, and give only the author's surname. Perhaps a print version has an appendix that identifies these authors and their works, but I don't have a copy. Even the Shakespeare quotes should at least identify the play. So certainly, exact cites are much better. I find the 1914 Century Dictionary much more helpful. Eclecticology 06:57, 15 October 2005 (UTC)

I honestly thought that producing our own definitions, drawing on actual usage, was exactly the main thrust of our work here. Pronunciations and such are also nice. Etymologies would be great, but except in the case of new words they're very hard to track down well, but definitions are our bread and butter. A definition from the early 1900's, even if it was a perfectly good definition then, is likely less than ideal now (and Webster 1913 bears this out).
Actually, I find the Webster 1913 entries are by now very good historical material to draw on. For example, I had no idea that worth had the verb sense given, but given that, it's very easy to see that it's cognate with German werden, and that in turn sheds light on the modern noun and adjective senses.
The danger here is not that the old entries are likely to be outright wrong (I'm sure they're very well researched) but that they lack context. A non-native speaker could become quite frustrated following the Webster 1913 definitions of worth. Moreover, worth is not a rare case. In short, an entry from an older dictionary cannot simply be imported wholesale. Or rather, it can be, but the usefulness of the result must be verified. There are basically two possibilities:
  • The imported entry is significantly out of date, in which case it's useful background material, but not useful as an entry.
  • The imported entry appears to be essentially up to date, in which case it is not really adding anything and we should do our own research.
As you mention, there is a question lurking here concerning "original research". Let's assume for the moment that we are to use Wikipedia as a basis for analogy. Wikipedia, as I understand it, is meant to pull together original sources such as news articles, scientific publications, public records and so forth, digest them and summarize them in a way that presents the various original sources as evenhandedly as possible. This text will generally be original, but originality per se does not make it POV or NPOV.
In our case, we are trying to pull together actual usages as our original sources, digest them and summarize them in a way that presents the original usages as evenhandedly as possible. When we make a definition, we are basically stating "We have observed that this word is used in this sense." In most cases, we don't need to say this explicitly, but in some contentious cases we use usage notes and such to be more explicit.
For example, in a case like podium, we define two senses, and also note that some consider the overwhelmingly common sense incorrect. In a case like decimate, we shouldn't even define the "correct" sense, unless it's actually attested. As far as I can tell, all that's really attested is that people object to the sense currently used.
In neither case is producing original definitions presenting a POV, except in the very narrow sense that it presents a POV about what is actually in use. Naturally, some definitions will start out POV (or be made POV) for various reasons, but we have a process for correcting that. -dmh 15:03, 17 October 2005 (UTC)
I agree that I thought non-copyvio definitions were our bread and butter, per se. But I disagree with your conclusion about the Webster 1913 entries. The defective entries seem more a product of Poccil's selective importing than poor original material. I do not know how many times I've come across entries of his, where the only definitions he chose to import were the obsolete meanings, omitting the current (correct) meanings. It has been quite a few though. --Connel MacKenzie 15:23, 17 October 2005 (UTC)
Poccil's imports were highly selective, and he certainly did not favour including the quotes that are in the Webster. It would not be fair to judge Webster by Poccil's actions. Webster's 1913 definitions were presumably valid in 1913. That does not make them wrong now; we just have additional more modern definitions. Yet other words were valid in Shakespeare's day, and their meaning remains unchanged in the present time. I would not consider either meaning of decimate to be wrong. (They were both there in the 1913 Webster.) I still hesitate at the meaning of reducing "to" one tenth instead of reducing "by" one tenth. Someone has put a lot of quotations on that page, and they will go a long way toward sorting that one out. There are currently no references for podium. Our first meaning has already migrated significantly from 1913, and our second is easily based on a misunderstanding. It presents a good example of why we need to document these usages with their dates.
Definitions may very well be our "bread and butter", but that does not mean that there is a need to always be writing new definitions. In a word like democide anything but the coiner's own definition is bound to lead to serious misunderstandings. Many scientific terms can only be defined in one way. Eclecticology 23:47, 17 October 2005 (UTC)

Stare decisis

Thinking over what happens when an entry here expires without sufficient evidence, I believe we need some sort of "stare decisis" rule, to the effect that, if an entry is removed for lack of verification, it may not be reinstated unless new evidence is presented. The practical effect would be that sysops should feel free to re-delete speedily material that someone re-creates without presenting any new supporting material.

Example: Someone creates an entry for flooby, claiming it means thirty minutes before sunset. The entry is rfv'd. Nothing turns up in a month, and the article is deleted. The original poster simply re-creates the article. It can be shot on sight. Later, someone turns up a citation ("The sun approached the horizon. It was flooby. Soon the sun would be down.") that at least suggests the definition is correct. The entry can be recreated, and (if, as in the example given, the evidence is not conclusive) go through another round of rfv.

All this is, of course, assuming that partial work is archived on the appropriate talk page or elsewhere, so that the next round doesn't start from scratch. The point is that, once we make a decision at the end of an rfv cycle, we stick with it until there is explicit reason to reconsider it. This would also apply the other way round. If we decide that evidence is sufficient, the article may not be removed unless someone provides new material indicating that the original evidence was flawed. -dmh 15:14, 17 October 2005 (UTC)
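
The proposed rule reduces to one question: does the challenger offer anything that the closed rfv cycle did not already consider? A minimal Python sketch of that decision follows. The ClosedRfv class, the next_step function and the three result strings are hypothetical names chosen for illustration; nothing like this exists in the wiki software, and treating a citation as a plain string is a simplification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ClosedRfv:
    """Result of a finished rfv cycle (hypothetical structure)."""
    passed: bool
    evidence_considered: frozenset[str]


def next_step(previous: ClosedRfv, evidence_offered: set[str]) -> str:
    """Decide what to do when someone challenges a closed rfv decision."""
    new_material = evidence_offered - previous.evidence_considered
    if new_material:
        # New evidence is an explicit reason to reconsider, in either direction.
        return "reopen rfv"
    # Nothing new: the earlier decision stands.
    return "keep entry" if previous.passed else "speedy delete"


# The flooby example: deleted after a month with no citations at all.
flooby = ClosedRfv(passed=False, evidence_considered=frozenset())
print(next_step(flooby, set()))  # bare re-creation -> speedy delete
print(next_step(flooby, {"The sun approached the horizon. It was flooby."}))  # -> reopen rfv
```

The sketch only works if the evidence from the earlier round is archived somewhere, which is the assumption stated in the paragraph above.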

This sounds difficult to administer. Wiktionary: List of bogus words to be deleted whenever they reappear? :-) --Connel MacKenzie 15:23, 17 October 2005 (UTC)
While I support the general thrust of what dmh is saying, I can also understand the practical difficulties raised by Connel. Of course one can get a clue to the situation by looking at the article history and seeing the comment that there are "n deleted edits".
A more likely scenario when one of these is deleted after a month is that no evidence is presented at all rather than insufficient evidence. If no evidence is presented there is nothing to archive. As long as there is a reasonable effort to maintain archives there should be no problem with this. Eclecticology 22:17, 17 October 2005 (UTC)
What about words which are demoted to protologisms, but later find their way into significant usage? One would have to be able to remove words from the Blacklist (Duncan 12:04 28th November 2005)
Absolutely. But those will presumably have numerous respectable print citations spanning over a year to back them up. --Connel MacKenzie T C 14:36, 17 December 2005 (UTC)

"Added to rfv by so-and-so"


Is prefacing every entry here with who submitted it helpful? Will I get a demerit for tagging too many entries with rfv, just because I seem at times to be the only one watching anon contribs? The entry's history will show that if needed; I don't think any emphasis should be on who made the first tagging of an entry, though. --Connel MacKenzie 15:35, 17 October 2005 (UTC)

Eheh... I meant rather to give the date the rfv was added, which generally does need noting, on account of the time limits; the who added it isn't particularly important. Sorry if it seemed to put undue emphasis on ya. :p —Muke Tever 19:27, 17 October 2005 (UTC)Reply
Muke's point is well taken. The date is the most important part of the originating signature. Admittedly, who posts something here, as on rfd, will give an initial impression about the claim, and whether I might be in more of a hurry to examine it. Over a full month, however, this will be less important. The ultimate decision should be based on the material itself. Eclecticology 22:28, 17 October 2005 (UTC)

frienemy


Just a question on process here. The term frienemy failed rfv, right? The term got "downgraded" but then changed again (without attestation). Is this how it is supposed to work, or should it be whacked? --Connel MacKenzie T C 19:42, 15 December 2005 (UTC)

pollyin'


Er... excuse me... will the culprit who deleted pollyin' without at least placing it on Rfv please stand up? It is an actual word, I have citations of its use in regional English going back at least to 1996 and will recreate the page. Alexander 007 04:42, 27 February 2006 (UTC)

I'm a bit ticked because I do not keep a record of entries that I start and if someone deletes something while I'm away I may not even notice. At least have the courtesy to notify me. You may not be familiar with a word and may not have the means to verify its existence, but I may. A word known to someone in Brooklyn, New York may be unknown to someone in Sydney, Australia, and vice versa. Alexander 007 04:52, 27 February 2006 (UTC)
Who knows what else some hasty lugnut has erased in my absence. I'm not here to completely waste my fucking time, you know. Alexander 007 04:55, 27 February 2006 (UTC)

Improvement Needed to Special Page "Contributions/user"


What we seem to need is an improvement to the Special Page "Contributions/user", so that it lists pages the user contributed to, even after they have been deleted. How can we request that? --Richardb 13:08, 18 March 2006 (UTC)

Discussion on RFV policy (from Beer Parlor)

Perhaps I am re-inventing the wheel. I want to re-open the debate about verification / inclusion on a policy level. Perhaps one of the clever chaps could move this to an appropriate forum and link discussions. There is much acrimonious debate over what should and should not be included and for me the discussion is too broad. I propose that we discuss one point comprehensively and reach consensus. Thereafter we discuss a further point and reach consensus etc. Each aspect of the colloquium should last for a period of two weeks. If consensus is reached, the summarized point will be put up on a semi-policy document. If not, the discussion will be removed from this page to a separate forum while the next point is discussed. I realize that two weeks is short, but we have all entered into the debate at one point or another, and this is an attempt to crystalise ideas and not to rehash old ground.

I should like people to be as concise as possible. In terms of formatting the debate, please use #* for your first point and make the point in bold. Thereafter use #: : for any sub-points and : for responses to a particular contribution, placed under that contribution.

Please remember that this relates to verification / inclusion. It could be used as a base for all entries, but I am considering the discussion particularly for purposes of verification procedure.

Please also let me know if you are not in favour of this type of discussion, as it would be pointless to continue if that is the case.

If you are in favour, please suggest topics to be discussed in this format. I do not want this to be my project only, and would prefer others to think clearly about topics that will improve the Wiktionary project and to prepare similar proposals. Andrew massyn 21:03, 31 May 2006 (UTC)

My proposal for the first point:


  1. Citations.
    • Current policy (observed in the breach) is to have three citations per definition.
    Citations to be independent and spread over at least a year
    OR one citation from a well known work --Enginear 03:12, 1 June 2006 (UTC)
    If a word is put up for rfv or rfd, it is not for the person who submits the word to find the verification.
    Those who wish to defend the word, should put three appropriate citations on the rfv/rfd page and on the page in dispute. If the defender of the word cannot be bothered to do so, and merely waffles about what a great word it is, it will be assumed that citations are not readily available, and the word will be a candidate for deletion after the appropriate period, without further discussion.
    There is no point in ?trebling the length of rfv/rfd pages by copying the cites to them in full, particularly as the entry page is linked in the section title. A note that cites have been added (perhaps also noting, eg, 2 books & 1 blog, from 1980 - 2005) should suffice. --Enginear 03:12, 1 June 2006 (UTC)
    If it is a new word, with no appropriate citations, the defender of the word must say so, and provide cogent reasons why the word should be kept.
    The reasons should contain at a minimum,
    Where used and a quote. (1 citation)
    Confirmation of the spelling.
    Confirmation of meaning.
    • The citations should be in date order.
    If an earlier citation is found, it would be great to substitute the earliest citation.
    • The citations should illustrate the particular definition aptly.
    Because Wiktionary is not a paper dictionary, the citations can be somewhat longer than in a paper dictionary, which would add interest to the word.
    Unless the earliest citation is over one year old, the definition will be tagged and marked as (Protologism)
    Unless the latest citation is less than 20 years old, the definition will be tagged and marked as (Archaic)
    These tags could probably be kept up to date by bots
    If all the cites are from technical sources (or sources relating to use only within a small community, or a particular region) and there is no non-citable evidence demonstrating likely wider usage, the definition will be tagged appropriately --Enginear 03:12, 1 June 2006 (UTC)
    • A written work is the preferred means of citation. (for purposes of this discussion, a dictionary is not a written work).
    While this is my practice too, I am not fully comfortable with it unless the text is available online -- it is obviously useful for people using wt to be able to look up the wider context of a quote, which they can do with an online book or a blog, but not with other books. Possibly therefore blogs, etc should rank between online books and offline books. --Enginear 03:12, 1 June 2006 (UTC)
    For a book work the citation should show in the following order and format:
    The first date of publication of the work. (in bold)
    The relevant quote with the word cited in bold.
    The author. (In italics)
    The name of the publication (In Italics)
    The publisher. (in Italics)
    The page number (in Italics).
    If applicable, a link to an online source for the publication.
    example:
    1976 “John grabbed his angora goat and ran for the hills.” William Spokeshave: The Trials of John and his Goat: Bloomsbury p.135.
    Wiktionary:Quotations#How to format a quotation explains that the format was changed from one similar to the above to the present standard, to accommodate the use of templates for frequently quoted works. --Enginear 03:12, 1 June 2006 (UTC)
    My policy has been to link to online sources, except when I read it on books.google -- any comments on this approach? --Enginear 03:12, 1 June 2006 (UTC)
    If the page becomes messy because of too many citations, they should be placed on the citations page with the appropriate link. I personally do not like this, and would prefer them on the talk page, but precedent is against me and it would not be sensible to change the policy now.
    This abuts, or even overlaps, discussion at WT:GP#2-level_dictionary (which could sensibly be renamed, since it has morphed). In my view (albeit as someone who likes cites to be immediately available adjacent to the defs they relate to) two cites usually look alright adjacent to each def. A minor change of formatting (eg greater indent) would probably mean three would look OK, ie not obscure the defs themselves. I shall be arguing for that. --Enginear 03:12, 1 June 2006 (UTC)
    If the citation is a newspaper or magazine, the citation should show in the following order and format:
    The date of publication of the newspaper or magazine in bold.
    The relevant quote with the word cited in bold.
    The newspaper / magazines name (in italics)
    If possible the publisher unless it is very well known, e.g. The Times (of London), Time Magazine, Popular Mechanics, The Washington Post. For purposes of this discussion, “The Angora Goat Breeder” is not a well known work. Please use your judgment here. :)
    (The Times is owned by the New York firm, News Corporation, who are very proud of owning a paper published since 1785, and would not like its simple name qualified, any more than some, ahem, Londoners ;-) --Enginear 03:12, 1 June 2006 (UTC))
    The page number (in italics)
    If applicable a link to an online source.
    Example
    16 June 2005 “Angora goat farmers suffered a setback when the price of angora wool tumbled” The Times of London. p6. wwwtimesof London.com.
    For academic papers
    The date of publication. (in bold)
    The relevant quote
    The Title.
    The writer of the article.
    The academic institution and department.
    Example
    July 2003 “The influence of angora goat farmers on medieval basket weaving cannot be underestimated. Indeed Smith op. cit. states that angora farmers bought most of the baskets!” The history and development of medieval basket weaving in Delft. Victoria Crumb. PhD. Thesis University of Leyden (History).
    The following are not written works. Pamphlets and flyers.
    • The next preferred means of verification is a published dictionary.
    My view is that the community should prefer certain sources.
    Oxford, Webster’s & Chamber’s spring to mind immediately. For technical words, a technical dictionary would be appropriate.
    The citation should be as follows.
    The date of publication (in bold).
    The name of the dictionary in full.
    The publisher.
    The ISBN number.
    Certain dictionaries should not be used.
    Urbandictionary seems to rank in this category. If there are others, please amplify.
    Any self-editable dictionary on line should not be used.
    Self-editable dictionaries seem to be self-referent, or worse plagiarise shamelessly from each other, and are therefore not to be implicitly trusted.
    One dictionary citation is not sufficient for verification. It could be part of the three sources quoted.
    There should be at best one dictionary source per entry, but as stated above, the preferred source is a written publication.
    I suggest not more than one dic to count towards the 3 cites (or even go for 3 non-dic cites + one dic) --Enginear 03:12, 1 June 2006 (UTC)
    • In ranking order, a web-page is third.
    The web-page should be from a cached source.
    Agreed, and I've read that google caches all blogs, but how do we actually know what's cached? Does anyone know? --Enginear 03:12, 1 June 2006 (UTC)
    Any word with less than 20 hits from a search engine must be examined closely.
    This is an arbitrary choice. My personal inclination is to make it less than 100 hits. This should exclude genuine spelling mistakes as opposed to spelling varieties.
    In my view, screening adequately to ensure that three cites relate aptly to the definition, and come from acceptable, independent sources, usually covers this (17 more are probably mirrors of Wikt, etc, and 80 are misspellings or relate to different defs) so it needn't be independently specified. --Enginear 03:12, 1 June 2006 (UTC)
    It would also screen dialectic use. See point below.
    Subject again to the outcome of WT:GP#2-level_dictionary I believe all attested dialects should be included (appropriately annotated) partly to increase usefulness to the majority of English speakers, who speak dialects, and partly because the etymology is fascinating (eg, I'm told that Ghanaian dialect for brother (pronounced brer) is totally separate from the Jamaican brere, even though many of the slaves shipped to Jamaica had been captured by Ghanaian tribes. One day, when I know more of linguistics, I may research it.)
  1. I would even argue that idiolects may be valid if they belong to people who make public speeches, write books, etc. Present-day examples of such people might be w:George W. Bush and w:John Prescott. People with only occasional word-related gaffes, eg correcting "potato" to "potatoe" are probably not eligible. A past master of gaffes was w:David Coleman, British sports commentator, after whom the [Private Eye: Colemanballs] column was named. Arguably, the criterion should still be 3 independent cites, presumably the original speech/text plus two other people knowingly quoting it in their own work. I do also think that words made up by authors, and which have a clear definition, should be added (and I don't understand claims that the Clockwork Orange words are undefined -- I am fairly sure there is a glossary of them at the back of my copy of the book (unfortunately packed away for a few months until I get my house extension built)). --Enginear 03:12, 1 June 2006 (UTC)
    Any editable page should not be used, except as a last resort.
    This will dispose of words which are used in tiny communities and are therefore so regional as not to be regarded as "English" but as a regional dialect.
    Because it is a last resort, such entries will be subject to harsher scrutiny.
    • In ranking order, spoken words rank fourth.
    Spoken words include words from movies and television.
    Unless a script is provided, verification of spellings, meanings, etc. is difficult.
    Meaning is particularly difficult. Take for example, Mark Antony's speech in Julius Caesar "For Brutus is an honourable man." Depending on the actor's intonation, Brutus could be either honourable or dishonourable. An actor interprets the scriptwriter's words, and may give a spin to the speech that the scriptwriter never intended. Obviously in this case irony featured strongly, but it is not always that easy.
    As Widsith pointed out to me recently "Any word can be used ironically." I now agree with his viewpoint that we should only define the non-ironic use. But I think we could add a note to a particular quote (intended ironically). Also, in this case, we would be quoting the script, rather than a particular actor. Your point is valid when there is no script, eg an ad lib, and for that, one would have to hear it to be sure. --Enginear 03:12, 1 June 2006 (UTC)
The last two points link with the next topic to be discussed, namely "Regional English, Nonce words and Neologisms". However, input here will assist me in developing the topic. Kindly provide input one way or another, including whether it is too detailed or not detailed enough, as I will then know whether to continue with this project or not. Many thanks to all. Andrew massyn 21:03, 31 May 2006 (UTC)
I think the level of detail so far is about right. --Enginear 03:12, 1 June 2006 (UTC)
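
The date-based tags in the proposal above ("Protologism" unless the earliest citation is over one year old, "Archaic" unless the latest citation is less than 20 years old) are mechanical enough that the remark about bots is plausible. A rough Python sketch of what such a bot might compute is given below. The function name date_tags and the use of 365.25 days per year are assumptions for illustration; the proposal does not say how a year should be counted, and no existing bot is being described.

```python
from datetime import date

DAYS_PER_YEAR = 365.25  # approximation; the proposal does not define how to count years


def date_tags(citation_dates: list[date], today: date) -> list[str]:
    """Return the date-based tags the proposal would attach to a definition."""
    if not citation_dates:
        return []  # no citations: the date rules say nothing
    tags = []
    earliest_age = (today - min(citation_dates)).days
    latest_age = (today - max(citation_dates)).days
    # "Unless the earliest citation is over one year old" -> Protologism
    if not earliest_age > 1 * DAYS_PER_YEAR:
        tags.append("Protologism")
    # "Unless the latest citation is less than 20 years old" -> Archaic
    if not latest_age < 20 * DAYS_PER_YEAR:
        tags.append("Archaic")
    return tags
```

Passing today in as a parameter, rather than reading the clock inside the function, keeps the result reproducible when the tags are recomputed or checked later.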

Davilla's comments

I'm not going to respond individually to any of the ideas above because, although I agree with them for the most part, aside from a few quibbles, I do not consider the discussion to be properly approached, and it would be better to start over with a more solid foundation. On the first hand, you mix issues of formatting with this policy. Your treatment of preferred texts also jumbles other issues into it. For instance:
  • You claim that dictionaries are not valid references. On the contrary! Although an entry for a particular word classifies under mention and does not meet the criteria for use, a word that is actually used in defining another word should definitely be included. Consider, for instance, if a printed slang dictionary claimed that a word was introduced as a protologism in a certain year, and became a neologism a few years later. I would definitely count that as an attestation of protologism!!! On the other hand, a peer reviewed journal that defines a term does NOT count as use if the term is not used thereafter, being defined only illustratively in a linguistics study, for instance. Yes, I understand your intent, and usually you'd be right, but I don't think the question is approached correctly.
One advantage of a wiki -- it only needs one of us to spot erroneous wording, and it can be corrected! I think we are agreed that the statement at WT:CFI#Conveying_meaning is appropriate. Whether the book is a dictionary is irrelevant (except to note that a reputable dictionary is a peer-reviewed work) --Enginear 06:01, 2 June 2006 (UTC)Reply
  • Meaning can be derived just as easily from a script as from a printed work. The question of indication of usage is an important one, but it can't be divided into print versus speech. Granted, irony is less common in writing because it can be misunderstood, but there are equally other ways to use a word without giving much indication of what it means, e.g. "I love my cat because she's so adorable" versus "A cat uses its long tail to balance itself." You see, indication of usage, supporting a definition worded one way versus another, is a separate problem. In my opinion you therefore rank spoken words too low, especially in modern media. Your example of "an honourable man" perfectly illustrates why recorded speech is in fact superior, since the printed script loses a layer of information. Web pages on the other hand are much less reliable. After all, if quotations from a blockbuster movie, lowly ranked in your scheme, occur on some schmo's webpage, would you then rank it more highly as a result? I've seen plenty of webpages that were flat wrong.
I feel cites have two equally important purposes:
  • they allow us editors to prove (now and in the future) that the usage we claim in the def is valid
  • they allow users (at times in the future) to see examples to improve their understanding of what the def means
For the first purpose, the cites must be expected to be available, archived, for some time, accessible to (say) at least two admins; for the second purpose, since it is often useful to read the context, the long term archive is still important, but it must now be readily available to all users, ie must be available online.
We don't necessarily need all cites to be available online, but I suggest at least one per def should be. Any highly successful TV programme is by [my] definition a "well known work". If the script or an audio file is available online, I would personally place it equivalent to a book, etc available online; otherwise, if available on video (or in a publicly-sold book of scripts) I would place it equivalent to a book, etc unavailable online. Obviously, if there is no written source generally available, it might require a letter to the scriptwriter, or similar, to confirm the spelling, but that is no more onerous than confirming the pronunciation of a written word. --Enginear 06:01, 2 June 2006 (UTC)Reply
Dialects again are a different thread of thought, and the question isn't so much how to verify the word as how to verify the dialect. This and the other issues should be separated from an assessment of what goes into verifying a word. You've also excluded a lot of process, such as what happens when a failed word is resubmitted. I would offer a different criterion than the pure number of hits on a search engine, but it's a personal preference, and I've blasted you enough already. Anyways, it's not because of the content so much that I disagree with the above, as it is the reasoning. Davilla 09:58, 1 June 2006 (UTC)
Thanks for that input. In the opening paragraph, I said that I wanted to review verification in general, and proposed that individual topics be discussed under that heading.
The first discussion (which I had hoped was going to be non-contentious) was of citations only.
The next thread would be Regional English, nonce words and neologisms, again under the broad heading of verification. In this way, as a community, we can "nail" a topic and be done with it before moving to another.
Sorry, I read too much into what you had written, and took Enginear's comments to support that assumption. Davilla
I understood the intended focus, but in hindsight my answers strayed outside it -- sorry. --Enginear 06:01, 2 June 2006 (UTC)Reply
Formatting relates to citations and was therefore included.
Unless you're going to delete entries because the quotations are incorrectly formatted, the importance as far as formatting goes is how much is required for verification. A link to a search doesn't do anyone favors. Are external links to the correct pages sufficient by themselves? Are they helpful when other information is given?
Agreed. Unfortunately, I'm currently also mulling over some comments to give to a GP discussion, where I will argue that, with some changes to layout of quotes, a more readable page can be produced while still keeping the cites adjacent to the relevant defs; so my initial response was to broaden the discussion rather than shut it down. I now agree with you, let's concentrate on what is required, and consider how it should look as a separate discussion when we have a better idea of what the whole entry will consist of. --Enginear 06:01, 2 June 2006 (UTC)
The ranking of dictionary citations and spoken words is again part of this discussion, as is the validity of dictionary definitions.
I totally agree, which is why I took space to argue your assessment of the rankings, perhaps confusingly so. Davilla 19:43, 1 June 2006 (UTC)
I agree that dialects are a different thread, which will hopefully be discussed in the next thread, and said as much in my closing paragraph.
If certain things are left out, I would be more than happy if they were included, either at the bottom of the general article or in the appropriate place in the article.
I therefore do not feel that our views are that far apart. Perhaps if you could re-look at the topic now that I have clarified, you will agree with me. If not, perhaps a short pithy series of points you would like discussed under the greater heading verification and the lesser heading citations, would give us all direction as to where we should be aiming this discussion. It is my belief that once the greater topic is cleared up, much acrimony would disappear and the difficulty with asspus type words will have a clear set of rules to provide verification. Regards Andrew massyn 15:54, 1 June 2006 (UTC)
The current breakdown of the CFI is this:
  1. Clearly widespread use
    • Dictionaries could be used as references in this case, although these words are not often in contention. Words from any dictionary with stronger criteria than our own should be automatically admitted, and rejected only in special cases when they arise.
      This deals only with the use of cites for editors to verify usage, but not with the use by users to flesh out our definitions. While perhaps a low priority, we should add cites to words in widespread use too. Adults learning English as a foreign language will find them useful. --Enginear 06:01, 2 June 2006 (UTC)Reply
      Usually well-crafted example sentences will suffice in this case. But you're right that the wording of a definition also has to be weighed at times. The full urban dictionary description of choad for instance was severely trimmed. This probably deserves a deal of treatment, and you're right I haven't given any. Davilla 16:12, 2 June 2006 (UTC)Reply
  2. Usage in a well-known work
    • This one's a little open to interpretation. For those funny nonce words especially, what does well-known mean?
      Well, someone's got to start: "A work which at least 10 million people (eg the majority of adults in at least one medium-sized English-speaking country) will have heard of and at least 100,000 will have personally read or heard." A strong indication of the latter can of course be gained from statistics of book sales or audience reach, which are, I believe, independently verified in most countries. --Enginear 06:01, 2 June 2006 (UTC)Reply
  3. Appearance in a refereed academic journal
    • This one is a bit too loosely restricted, and should eliminate e.g. the names of Java classes, ad-hoc definitions in mathematical proofs, etc.
      I agree, and would go further. In my view, one of our fundamental purposes is to confirm the meanings of words, and certainly the fundamental purpose of cites is to attest that the word was used in the way defined. So I believe that WT:CFI#Conveying_meaning should apply to individual defs of words in clearly widespread use, in well-known works, or in refereed academic journals, as well as the "other" category it currently applies to. Re "meaningless words" I would argue that even um and er convey the meaning I've just noticed that my brain is working slower than my mouth. It is arguable that, for the sake of confused readers trying to make sense of Jabberwocky we should include nonce words from well known works, but I sense the majority here believe that is the function of encyclopedias rather than dictionaries, and I am now ambivalent about it. --Enginear 06:01, 2 June 2006 (UTC)Reply
  4. Usage in permanently-recorded media, conveying meaning, in at least three independent instances spanning at least a year.
    • This is the meat of the matter. The others are for the most part pragmatics to make this practicable.
      We need to consider both the number of cites and the minimum time interval from first to last. I agree with the time interval. I am happy with three cites conveying meaning. I am prevaricating re whether two cites + "any" dictionary entry should be adequate. I think it should only be valid if the requirements for citation on that dictionary are similar to, or better than, ours. --Enginear 06:01, 2 June 2006 (UTC)Reply
      We also need to bear in mind that we are all thinking about English words. At some point we need to specifically consider foreign words and, in particular, "dead" language words, eg OE or Latin. Since it is more difficult to find cites (and particularly online cites) should we allow a lower standard for them? --Enginear 06:01, 2 June 2006 (UTC)Reply
(Note that this is an entirely separate issue from idiomacy or encyclopedic nature.)
Which we must be sure to discuss (perhaps under another heading) --Enginear 06:01, 2 June 2006 (UTC)Reply
I would start with the last item instead, forming a definition of what we consider to be a word (giving nonce words special treatment), and then list ways to verify it, whether from a dictionary or citations or what have you.
As a designer, I see here the standard disagreement between those who like to approach a design "from the bottom up" as AM has started, and those who like to approach it "from the top down", as you propose. Many studies in many different fields have concluded that there is little difference in efficiency or in the quality of the finished product, the reason being that actually the process is always to some degree iterative. Those arguing the merits of a bottom-up approach often point out, as AM did, that it allows a start on non-contentious issues. However, while some people's minds are attuned to this method, others find it hard to concentrate on details when they haven't agreed fundamentals.
Okay, that's fair. Davilla
I therefore propose a method I have used for managing complex building designs:
  • Start by looking at details at the bottom. Come up with solutions that seem about right, but do not cast them in stone. Park them, and move on up.
  • Sometimes, changes at the next level may mean that bottom items will require review. Note that, but do not make the corrections yet, even though the method required may be clear. Continue up, repeating the process through as many levels as necessary.
  • Arrive at the top. At this point, the "top down" people will perk up and give new life to the design, while the "bottom up" people will be very clear about what is likely to be practicable and what isn't. The result will usually be good, efficient and appropriate.
  • Start working down again, making corrections at each level as required. With the experience already gained by discussion on the way up, this will usually be very quick, but some arguments will have to be rehearsed again for the benefit of "top down" people who couldn't visualise them properly without knowledge of the "top" item.
  • Arrive at the bottom again, press "Go" and celebrate!
So in short, I am happy to support AM's proposed step by step process until we reach the fundamental issue of what we consider to be a word, leaving draft policies in place as we go. Then, moving down again, modifying the draft policies as required to fit the new requirements, until all items are in place and the policy can go live. I think this is a particularly valid way for a wiki to make policy decisions (certainly it has worked before for pretty disparate sets of designers with no clear hierarchy) since the ability for people to contribute either "top down" or "bottom up" according to their preference should enable a consensus to be reached quicker. --Enginear 06:01, 2 June 2006 (UTC)
Then go into citations. Permanent media is required, and the more professional the fewer demands are placed. For instance, an academic journal may not even have to convey meaning, whereas a personal blog or usenet post would be highly scrutinized.
I disagree. Certainly, there are some "privileges" for words from academic journals, eg (except for quoted speech, etc) they should be exempt from any allegations that they are (informal use only). But in general, no cite is useful for attesting usage according to a particular def unless it either conveys meaning or is defined in a dictionary which itself requires cites conveying meaning. (For an exception, see Talk:tonk.)
To take an example, I have just checked the title page of my NIV Bible. Surely, any words on a title page should be important cites? Well, No. The page ends with the following seven words: Hodder and Stoughton London Sydney Auckland Toronto. Perhaps the and is a useful cite, but only because it is a clear example of its use to join two words of near-equal status. Our familiarity with standard title page formatting would lead us to predict that L, S, A, T were locations where the publishers operated, and indeed that Hodder and Stoughton were parts of a publisher's name. But I do not see that any of the cites, except for and are worth having. --Enginear 06:01, 2 June 2006 (UTC)Reply
Otherwise, the easier to verify the better. Webpages are ranked low because they are malleable, hence a cached version, not the easiest thing to find, may be required. Discussion forums are ineligible.
I would still welcome any advice on exactly what on the net is known to be cached. Are discussion forums perhaps excluded because they are not cached? Is that the difference between them and discussion groups which, according to WT:CFI#Attestation are "favored"? And how am I supposed to know the difference? --Enginear 06:01, 2 June 2006 (UTC)Reply
Then go into the process. If Google hits don't turn up even a single relevant entry, this project and its mirrors aside, it's safe to speedily delete. If it turns up ONLY as definitions in other online dictionaries, then that counts as no attestation, so it's safe to speedy. If it turns up urban dictionary cruft and a bunch of questionable pages of material, it has to be placed through the verification process if not contested for other reasons. The number of hits is irrelevant. If there are two hundred Google hits and an admin goes through every one of them, and then through those of potential alternate spellings, it's safe to speedy delete. The question is just where you make the tradeoff between listing for deletion or completely checking first. A slang dictionary might not be sufficient to show widespread use, nor could it be used as a citation of the word in use, except for one printed dictionary in this above proposal, but regardless it would be legitimate grounds for giving the entry some leniency in the verification process. A discussion thread doesn't count as a citation, but it lends further credibility. I would say that only in the case that nothing legitimate is presented can the entry be deleted without being requested for deletion. Davilla 19:43, 1 June 2006 (UTC)
There are actually three points implied here:
  • What should be the procedure for Speedy DELETE/RFD/RFC testing/voting?
Not actually discussed here, and neither have AM or I mentioned it. Perhaps we should have. However I suggest that the decision to delete will require consideration of more than verification, so is beyond the remit of this discussion. It would seem appropriate to consider moving on to that at the end of the process.
  • What is the minimum verification requirement for a non-compliant entry to avoid the chance of speedy DELETE?
Since speedy DELETE is potentially wasteful, it seems reasonable to propose a minimum standard of citation which will avoid its use. However, it is soon clear that this is hard to generalise. For example, for an entry which was clearly extremely offensive, liable to promote mass violence, cf the "Images of Mohammed" furore which "resulted in over 130 deaths" [[1]], or even illegal, I suggest that the minimum to avoid speedy DELETE would be an entry in Webster's, the OED or Chambers. However for non-offensive safe, legal words, I would second your proposal, which I take to mean: if there are any google hits which the admin has not yet checked, or if there are any which show usage conveying the meaning in the definition, speedy DELETE should NOT be used. For words which are mildly offensive in between, eg possible libels, the admin can use judgement between the two extremes.
What does offensive have to do with it? This is the same crowd that railed when Webster's Third added ain't in 1961, that wouldn't accept fuck in a dictionary until 1965. A word's a word. Davilla 16:12, 2 June 2006 (UTC)Reply
Yes, and the pen is mightier than the sword! But my bad choice of words -- I was obviously too tired, and have now modified them. I am strongly against censorship, but I also feel that no one person, particularly an amateur, should be allowed, on their own, to use a community enterprise to cause civil unrest, or use it in such a way that it is shut down by the courts, to the detriment of the rest of us, and indeed the rest of the world. There are (at least potentially) a very few "words" which should be debated in relative privacy, and made public after they are approved as good quality and in accordance with CFI, rather than being left in the public gaze during the debate. We have a growing number of users. The best defence against court action is to have an appropriate policy in place and to follow it. More importantly, it is also the best defence against "collateral damage" in the war against censorship, which we might regret for the rest of our lives. --Enginear 01:03, 3 June 2006 (UTC)Reply
Two further occasions warranting speedy DELETE:
  • At the end of the RFV/RFV-Sense month (see below) there are still no cites at all, and there have not been any positive comments (except perhaps by contributors with a reputation for unreliability).
  • Further entries added by a contributor who has been asked to desist until RFV/RFD issues with an existing entry are resolved.
  • What should be the procedure for RFV/RFV-Sense testing?
This should be an "output" of the process, but we have not specifically mentioned it. In my view, unless the discussions at Wiktionary:Grease_pit#2-level_dictionary result, via BP, in a change, the requirement should be that unless the entry complies with the standard of citation mentioned within one month, it should be RFDed. Additional time could be given if a cite had been disallowed within the last week or, discretionally, if an editor specifically requested an extension to allow further research.
I don't have any problem with automatically deleting a failed word provided there are no quotations or references at all, and no comments by known contributors on its behalf, the "I've heard this before" sort. I would consider an effective extension of the RfV process to be automatic in the case that consensus is not reached on the subsequent request for deletion. An editor could request as much; however the vote of one person does not determine the consensus. Davilla 16:25, 2 June 2006 (UTC)Reply
In general, I agree. However, I suggest that positive comments should be acceptable "from all except known unreliable contributors" rather than only from "known contributors". To give an unknown the benefit of the doubt only means an RFD process rather than a speedy delete. I have to admit a personal interest here, as I discovered the RFV page at a time when I had only added one new entry and done minor edits on about six others, and noticed a queried word which I thought I'd heard, and which had already been listed for over a month. I left a note saying so, and that I would search for cites, but it might take a few days (16 days as it turned out) so please wait. It would have been very discouraging if I had returned to find the entry had gone. --Enginear 01:03, 3 June 2006 (UTC)Reply
If the contributor of an entry which appears to be against CFI then starts adding more entries which appear dubious, he should be asked not to do so until the RFV/RFD process on the first one is completed, and warned that any entries he does make during that period may be speedy DELETEd on sight (to avoid wasting everybody's time by running through the RFV/RFD process for each one).
A possible outcome of discussions at Wiktionary:Grease_pit#2-level_dictionary may be to recommend to BP a change of policy to allow words/senses without adequate verification to remain, but be tagged (sense not verified), (protologism), etc as appropriate. We should bear this in mind as we proceed, in case we have any useful comments to make. However, I do not think it is likely to affect the process of verifying the "normal" dictionary. --Enginear 06:01, 2 June 2006 (UTC)
  • The citation approach to verification is flawed because three citations are never enough to demonstrate meaning. Most dictionaries need at least a dozen citations for usage in a certain sense to be demonstrated. For example, the entry "magnetic resonance imaging" in Merriam-Webster's Collegiate Dictionary has 30 citations on file, and "greenmailer" has 19. Further, people verifying a sense on the Request for Verification page often grab quotations from publications in a single field from an online source. If you want to ascertain the meaning and popularity of a term, you will need to examine a broad range of media, from radio and television transcripts, to periodicals and books. You will also need to examine quotations from media designed for more than one audience. Printed dictionaries are, in general, the most reliable books in the world, so if I had the choice between verifying a word using a single citation from the Random House Unabridged Dictionary or three quotes from some books off Google Book Search, I would use the dictionary without any hesitation at all.--Dfd33 01:25, 3 June 2006 (UTC)Reply
  • I am wildly against the notion that new users/anon IPs should be respected for RFD discussions. The idea here is that we regulars are working together to build a dictionary. Occasionally we get outside help. But the reality is that the majority of new/unregistered users are simple vandals. It takes time to verify if certain users (e.g. User:Dfd33 who only posted here) are sockpuppet accounts of Primetime's or not. To give equal voice to anons/new/non users on the discussion pages is pointless. They can "easily" add references in whatever format they like, and someone will clean it up to our standards, but to have them enter the debate is pointless. How can they possibly know or understand the nuances of prior months' decisions? --Connel MacKenzie T C 21:19, 3 June 2006 (UTC)
Assume good faith, and don't bite the newbies. If the argument seems reasonable respect it on its own merits rather than on the basis of who's saying it. If it's from a genuine vandal, they soon make themselves obvious in the way they express themselves. Respecting anons doesn't mean we have to agree with everything they say. Eclecticology 07:57, 4 June 2006 (UTC)Reply
Yes, I am not suggesting treatment as the equal of a known trustworthy regular, but merely that they are given a little time to research. Were I "clearing out" RFV, my response would be to leave the entry in place for a fortnight after the newbie had posted, to see if they came up with anything useful, perhaps with a note to that effect, as AM has tended to do. A more active response would be to move to RFD (rather than speedy DELETE) or to move it to the newbie's User Talk page. And obviously, if the argument reeks of bad faith, then continue with the speedy delete anyway. --Enginear 17:06, 4 June 2006 (UTC)Reply

Split by month?

Hi, any chance we can split RfV by month? It's maaad long. Cheers! bd2412 T 22:18, 26 September 2006 (UTC)Reply

I agree that the page is unfortunately long, but I'd just like to point out that the current list is only two months long; if you didn't leave the previous month, you'd have very little chance to RFV items added toward the end of the month. Or perhaps you were suggesting a rolling window?
I had previously suggested in a BP discussion that perhaps this page would be better served by being just a set of links to talk pages of articles. That way, we wouldn't need to move the discussion there when done RFV-ing. However, it occurs to me that by doing that, you'd lose a durable archive of failed RFVs. Perhaps we could fiddle with the namespaces to do something along the lines of a table of contents hierarchy with links to months? Something like RFV\December\bogusword ? Should this discussion be in the BP or GP? --Jeffqyzt 10:45, 27 September 2006 (UTC)

little procedural question

When section heads are struck out here, as in Nicea, SDF, and Jeopardy! above, does that mean someone has decided the entries are adequately verified and don't need further discussion, or something else? —scs 17:59, 5 December 2006 (UTC)Reply

I think so. I believe that is the preliminary step, then after one week (if no further discussion erupts) the section can be archived to WT:RFVA and if kept, the entry's talk page. --Connel MacKenzie 18:03, 5 December 2006 (UTC)Reply
I usually strike them when I get to them, about a month after they have first been entered. At that point the conversations are usually finished, and a decision needs to be made. Once the decision is made, the heading is struck. I would appreciate if others would also strike once a decision is made, as it then means it is dealt with. Andrew massyn 07:59, 10 December 2006 (UTC)Reply

de rfv

megapenny has been on rfv for over a month now; can I remove the rfv template?

I notice a few others on the list have been held in limbo for just as long; is this normal or a backlog ? Jayvdb 21:24, 21 February 2007 (UTC)Reply

There is a tendency for a month's RFVs to be cleared out at or shortly after the end of a calendar month, so January's will probably be dealt with in about a week's time. The RFV tag can be removed earlier IF the article has been given adequate cites to meet WT:CFI, this has been noted on the RFV page, and a further week has then passed without any objection to the cites being raised.
Note that there is an approx summary of CFI at the top of the RFV page. CFI normally requires that we "Cite, on the article page, usage of the word in permanently recorded media, conveying meaning, in at least three independent instances spanning at least a year." "Conveying meaning" means that the word must be used rather than just mentioned or defined, so appearance as a headword in another dictionary or similar work does NOT help meet CFI.
We don't vote on RFVs. The decision should be made purely on the fact of whether there are adequate cites in place, in the article, or not. (Sometimes, if it is clear that adequate cites are available, just not yet added, the words are given a further month's grace, or occasionally even waved through. Similarly, the "durably archived" criterion is sometimes waived. However, in cases like megapenny where appearance here might be taken as spam for the eponymous website, the rules tend to be more strictly applied.)
So in the case of megapenny, it currently has no cites added, and the discussion on RFV hasn't noted any cites which clearly meet CFI, so unless someone comes to its rescue in the next week, it is likely to be deleted.
If you want to help save it, you could check out some of the usage that dmh mentioned (though you'll need to find the background info, ie dates, authors, etc (see WT:")). The hits you mention on b.g.c. seem to be the name of a website, rather than usage to match your definition. Similarly, nearly all the hits on A9 seem to either be "mention" or refer to the Megapenny project, but you might find enough there...you only need three, if they appear to be "durably archived", and even if not, the benefit of the doubt is often given provided they appear to be sources which will stay up for a few years. But the usage must match the definition. Mention of (say) the Megapenny project is irrelevant, not least because it is spelled with an M rather than the m in the definition.
It's quite a nice word, but unfortunately I've got too many others to champion at present. dmh seems keen on it -- why not ask him for some help if you feel unsure. --Enginear 23:35, 21 February 2007 (UTC)Reply
Thanks for the overview and an eye into the likely outcome as the article stands. I'll do a bit of work to add sources, if only to become more familiar with the wikt way. Jayvdb 19:11, 22 February 2007 (UTC)Reply
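
The attestation rule quoted in this thread (use conveying meaning, in at least three independent instances spanning at least a year) can be sketched as a small check. This is only an illustration, not an official tool: the `Cite` record, its field names, and the idea of treating cites with the same source label as non-independent are all assumptions made for the example, and judging independence or "conveying meaning" in practice takes an editor.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Cite:
    """Hypothetical record of one quotation; the field names are assumptions."""
    source: str            # label for the author or publication
    published: date        # date of the quoted use
    conveys_meaning: bool  # True for a use; False for a mention or dictionary headword


def meets_attestation(cites, min_sources=3, min_span_days=365):
    """Rough check: enough independent uses, spread over at least a year.

    Simplification: cites sharing a source label count as one instance, and
    the span is measured across all uses rather than one use per source.
    """
    uses = [c for c in cites if c.conveys_meaning]
    if not uses:
        return False
    independent_sources = {c.source for c in uses}
    dates = [c.published for c in uses]
    span_days = (max(dates) - min(dates)).days
    return len(independent_sources) >= min_sources and span_days >= min_span_days


example = [
    Cite("Author A", date(2004, 1, 1), True),
    Cite("Author B", date(2004, 6, 1), True),
    Cite("Author C", date(2005, 3, 1), True),
]
print(meets_attestation(example))
```

A headword entry in another dictionary would be recorded with `conveys_meaning=False`, so it would not count toward the three instances.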

Archive

The article badly needs archiving. It's monstrously large and uneditable. Wjhonson 14:04, 30 May 2007 (UTC)Reply

Yeah, we're working on it. That said, you probably shouldn't be trying to edit the whole page, anyway; section editing is better, as it reduces the risk of edit conflicts. —RuakhTALK 18:13, 30 May 2007 (UTC)Reply

subpages

Is there any reason each entry can't have its own subpage? Gets annoying having to wait for the entire long page to load/refresh each time... -Eep² 06:54, 26 June 2007 (UTC)Reply

Yes. We have only 50 administrators for 427,000 entries, not counting categories, appendices, indexes, and other pages. Having a few central fora for discussing issues makes it possible to find and keep track of discussions. Spreading them out over half a million talk pages would lead nowhere. --EncycloPetey 07:21, 26 June 2007 (UTC)Reply
Er, the main "Requests for verification" page would still have links to all subpages, obviously. Wikipedia does this for its Requests for comment, Articles for deletion, and other sections. ∞ΣɛÞ² (τ|c) 08:00, 26 June 2007 (UTC)Reply
This is not Wikipedia. Please re-read what I wrote above, because the situation I described is very different from the situation on Wikipedia. Wikipedia has several hundred administrators and a much larger body of regular contributors than we do. If that method works for them, that's great, but the situation here is very different. --EncycloPetey 18:25, 26 June 2007 (UTC)Reply
Putting things on subpages also makes it impossible to see the changes on your watchlist unless you are going to watchlist every subpage as it is created. What we really should do is get better at responding to and archiving entries in a timely manner. I think, thanks to Ruakh and others, we are getting better at that, but I have to admit usually I contribute to the backlog more than I help clear it. Dmcdevit·t 01:05, 27 June 2007 (UTC)Reply
How about dividing RfV in some other way, perhaps by month or week? bd2412 T 01:20, 27 June 2007 (UTC)
I think that dividing by month would be a good initial compromise between page size and visibility. Conceptually it also fits in well with the 1 month timeframe of verification requests. Thryduulf 20:43, 27 June 2007 (UTC)Reply
So, maybe run it the way VOTES happens, with an embedded page for the month? Then the RFV page itself can be set with "link to CURRENTMONTH" (I don't know the exact syntax), but that way the discussion pages could all be set in advance and no one would have to remember to update when a new month begins. --EncycloPetey 21:17, 27 June 2007 (UTC)
We used to do it exactly that way on Wikipedia. That was about three years ago, when we didn't have the flood of crap we do these days. bd2412 T 01:42, 28 June 2007 (UTC)Reply
The alternative is to task a bot with creating the new page at the start of the month and updating the links on the main RfV page. I think this is how the daily updates of w:Wikipedia:Articles for creation are handled, for example. Thryduulf 08:07, 28 June 2007 (UTC)
The other benefit is that the link remains the same and doesn't get into some annoying archive subpage... ∞ΣɛÞ² (τ|c) 10:33, 28 June 2007 (UTC)Reply
Ok, so let's do it - how hard can it be? bd2412 T 08:13, 10 November 2007 (UTC)Reply

I've thought about this at various times, in particular how to scale to lots of entries under review. What I'd come up with is something like a {review} tag (for both RfV and RfD), that would go on the talk page, followed by a section wrapped in onlyinclude tags (if you don't know, this is like includeonly, except text not inside any tag is excluded if onlyinclude appears) with the section header inside. The RfV/RfD pages can then be lists transcluding the talk pages, and section editing will edit the talk page. The only time things need to be moved is archiving deletions.

Or something like that ;-) Robert Ullmann 14:22, 10 November 2007 (UTC)Reply
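
The onlyinclude behaviour described above can be sketched roughly as follows. This is only an illustration in Python, not MediaWiki's actual parser: the function name `transcluded_text` and the sample talk-page text are made up for the example, and the sketch ignores noinclude and includeonly entirely. It assumes the rule as stated in the comment above: if any onlyinclude section is present, only the text inside those sections is transcluded; otherwise the whole page is.

```python
import re

# Matches the body of each <onlyinclude>...</onlyinclude> section (non-greedy,
# spanning newlines). Simplified: real wikitext parsing has more edge cases.
ONLYINCLUDE = re.compile(r"<onlyinclude>(.*?)</onlyinclude>", re.DOTALL)


def transcluded_text(page_source: str) -> str:
    """Return what a transcluding page would show (hypothetical helper)."""
    parts = ONLYINCLUDE.findall(page_source)
    if not parts:
        # No onlyinclude tags at all: the whole page is transcluded.
        return page_source
    # Otherwise only the tagged sections survive, concatenated in order.
    return "".join(parts)


# Made-up talk page: chatter outside the tags, one review section inside.
talk_page = (
    "Unrelated talk-page chatter.\n"
    "<onlyinclude>== example ==\nIs this word attested?\n</onlyinclude>"
    "More unrelated chatter.\n"
)

print(transcluded_text(talk_page))
```

Under that assumption, a central RfV or RfD list would show only the review section (header included) from each talk page, while the rest of the talk page stays out of the list.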

The problem with that, addressed above, is that some folks like to have all the changes pop up on their watchlists. I actually have two proposals to cut the list, which I think will suffice at this point. One is to group RfV's (and RfD's for that matter) by month. Another is to split RfV itself into separate pages on words (evaluated as to whether they exist at all) and phrases (evaluated as to whether they are idiomatic). I thought about proposing a split between English and non-English as well, but so few of our reviews are of non-English terms. Cheers! bd2412 T 15:36, 10 November 2007 (UTC)

Let me put it this way: would anyone object if I were to split the page by month right now? bd2412 T 18:12, 10 November 2007 (UTC)Reply

Yes, I strongly object. --Connel MacKenzie 18:27, 10 November 2007 (UTC)Reply
What would be the basis of your objection? bd2412 T 18:32, 10 November 2007 (UTC)Reply
I've a thousand objections to that. But first of all, who unblocked User:Eep?
Not everyone uses Watchlists in exactly the manner you suggest. (You do, but I'm not aware of anyone else.)
Watchlisting is not an effective way to review RFVs anyhow.
What is WT:RFV for, if not listing out the current open RFVs? Your complaint is about the archiving of dead RFVs, so you'd annihilate the current working list?
The existing links from entries to RFV sections are critical.
The existing links from RFV sections to entries are critical.
Your proposal breaks critical links in both directions, with no benefit to page rendering speed (what's-his-name's recent monthly subdivision made page loads almost twice as long.)
Your proposal breaks critical links in both directions, with no benefit in editing speed.
Your concepts about scalability are not unwarranted, but misdirected. If there were an [[/rfv]] subpage for each of these entries, there would be no central list (perhaps, what, a category? Puh-lease!) and no flow for the current (related) nominations, nor the current (related) resolutions, nor the current (related) discussions.
...and many, many, many more objections. If you are so concerned about the page speed, whittle away at the backlog, but get them archived properly. The more of them you do, the more AWB automation you'll add to the mix. The more special cases that have to be handled, that are identified, the better the automated archiving can be. Fuckup proposals (like the monthly subpage thing) only set everything back. (Fuck. Back to the drawing board, again. Those subpages now prevent the history-style links from being used...great. Just fucking great.)
--Connel MacKenzie 18:52, 10 November 2007 (UTC)Reply
No old links would be broken; the current page would still exist, as a way station to the monthly pages. How about doing it just going forward (i.e. beginning with RfV nominations on December 1)? bd2412 T 20:27, 10 November 2007 (UTC)Reply
Connel, do you have anything constructive to contribute, or are you just going to flame anybody trying to help? You're making stuff up: bd isn't suggesting anything about watchlists that I see here. This talk about breaking links -- RFV is already partly split by month, and no links are broken. Rather than making up fake objections, just admit you don't like change so people can ignore you. Cynewulf 20:40, 10 November 2007 (UTC)Reply
And as for page load times, the only thing that it's taking twice as long as is the archived version, before the transcluding, in order to satisfy what's-his-name:
  • Now: 11.8 10.9 11.4 11.9 9.8 9.8 9.9 avg 10.78 sec
  • After split, before transcluding: 5.0 4.1 8.9 4.4 5.3 5.0 5.0 avg 5.39 sec
  • Before split: 11.8 13.4 14.1 11.5 13.4 13.5 12.9 avg 12.94 sec
(if you want to test this remember to turn off popups, or it'll probably try to load the page twice)
As for "whittling away at the backlog", I'm done. I'm not doing another. It's up to Ruakh now. I may reconsider if Connel deigns to get his precious hands dirty and actually closes a few RFVs, but I'm not holding my breath. He'll just flame anybody who tries to make archiving easier, and then complain about why nobody is archiving and all the horrible horrible pageload times.
Incidentally, that covers all of Connel's objections so far. (do I have to point out that the suggestion he's objecting to is a monthly split, and that an /rfv subpage for individual entries has nothing to do with that?) Cynewulf 21:20, 10 November 2007 (UTC)Reply

Can anyone RFV a word?

Or do you have to be admin or something? RobbieG 18:37, 31 October 2007 (UTC)Reply

Anyone.—msh210 18:45, 31 October 2007 (UTC)Reply
Thanks. RobbieG 18:51, 31 October 2007 (UTC)Reply

using song lyrics as citations

When using song lyrics as citations, one should bear in mind Fontana's Law:

Every purported transcription of song lyrics available in a generally-accessible online medium contains at least one substantive error.

Also, Manfre's Corollary:

Any substantive error in an online transcription of song lyrics will be found in a vast number of places online because these Web sites steal content from each other wholesale.

Just my two cents.—msh210 18:26, 21 November 2007 (UTC)Reply

The solution, I would think, would be to use liner notes or go to the artist's website. bd2412 T 18:46, 21 November 2007 (UTC)Reply
I don't trust even that. The lyrics can change subtly with every performance, so I put on my Sennheisers and I listen to the recording very closely. But for the purpose of quotations, what the artist should have said, or thinks he said, is also durably archived (often in a sleeve), and probably less contentious, so yeah, use that. DAVilla 19:03, 21 November 2007 (UTC)

more old news

The Associated Press reported yesterday that "Google... is trying to expand the newspaper section of its online library to include billions of articles published during the past 244 years."—msh210 21:59, 9 September 2008 (UTC)Reply

Subpages by month?

Hi there all. I would like to bring up this topic here because I think that the requests for verification page is a bit too large for some people to edit. I would therefore like to propose that we split the page up into several subpages split up by month. That way, the RFV page would be smaller and easier to load, and you could possibly get more people involved in this process. What do you guys think about this solution? Thanks, Razorflame 07:05, 20 February 2009 (UTC)Reply

{{rfv-archived}}

Many discussions don't really end in "the sense in question has been cited and kept", nor in "the sense in question, being uncited, has been deleted". This happens for a variety of reasons:

  • no claims were being disputed to begin with.
  • the claims being disputed were not actual senses that we can cite, but rather things like etymologies, usage notes, sense labels (context tags), etc.
  • the initial concern is addressed by rewriting the sense, but without (fully) citing the new version.
  • the initial concern is addressed by editors vouching for the sense, by an authoritative source being produced, or by links to carefully crafted b.g.c. searches that seem to have enough relevant hits, but without (enough) actual citations being added to the entry or its citations page.
  • the claims being disputed are removed without waiting a month.
  • etc.

It's debatable whether all of these are really satisfactory — in some cases RFV was not the appropriate forum, and in other cases perhaps the outcome was not really consistent with the stated goals of RFV — but absent a major change in the human condition, there are probably better things we can be worrying about.
After a long period of being annoyed at having to use either {{rfv-passed}} or {{rfv-failed}}, even though often neither one's text is really accurate, I've just now created the neutral template {{rfv-archived}} to be used in such cases. The text of the discussion will simply have to stand for itself.
I'll wait a few days before I start using it, in case there are objections. If there are, please speak up.
RuakhTALK 23:45, 29 May 2009 (UTC)Reply

I like the idea of such a template, but generalize from "The discussion ended in such a way that neither {{rfv-passed}} nor {{rfv-failed}} is appropriate." to "". (I don't see a need for that sentence, in other words.) Or, if you really like that sentence, at least allow for a parameter to omit it, as "The discussion ended in a {{{result|way that neither {{rfv-passed}} nor {{rfv-failed}} is appropriate}}}."—msh210 18:34, 8 June 2009 (UTC)Reply

Archiving, 08/08/2009

I haven't had time to move all the stuff I removed from this page to the archives because there's so much of it, probably more than I can do tomorrow. Plus, check out #Subpages by month?. I'd strongly support this, since to archive you just remove {{/March 2009}} and, hey presto, it no longer appears on the main page. It's used on fr.wikt and various Wikipedias already. Mglovesfun (talk) 22:03, 8 August 2009 (UTC)

For the love of all that is good and sane, you should never, ever remove content that you believe should be archived, unless you are actually going to archive it yourself (or have already done so). Otherwise, once it's gone, it's gone. On a high-volume page like this, no one will ever be able to track it down again, even if they manage to figure out that there is something to be tracked down. —RuakhTALK 20:10, 26 December 2009 (UTC)
Also, I can't imagine we'd ever want to leave an entire month up until all of it is ready for archiving. The piecemeal approach, of archiving individual entries as they're ready, works better than any other approach we've tried — and trust me, we've tried others. —RuakhTALK 20:14, 26 December 2009 (UTC)Reply
The contents of RFV should not be archived on a /archive page. It doesn't make sense and isn't particularly useful. They should be archived on the talk page of the relevant term or terms. This way in the future if a question arises the discussion and previous material are all located on an adjacent page rather than in an archive somewhere. - [The]DaveRoss 22:42, 2 February 2010 (UTC)Reply