Wiktionary:Beer parlour/2006/March

This is an archive page that has been kept for historical purposes. The conversations on this page are no longer live.

Ec about Wiktionary:Assume good faith:

    • If we are going to keep this and the other behaviour "policies" they should be shortened and combined into something readable. Nobody ever reads this kind of dense bafflegab, so what's the point of having it? Since this "comment" is unattributed, I'm going to feel free to get rid of it in a few days. --Richardb 09:59, 15 April 2006 (UTC)
    • I think it is important to spell it all out, but I think having a quick blurb at the beginning summarizing the policy would be a good idea. Any thoughts? -- Psy guy 05:55, 19 April 2006 (UTC)

Primetime

I am not the only one who has expressed concerns about this contributor, but I may be the first to bring it up here. Primetime has been asked repeatedly to follow the standard formatting practices of Wiktionary, and hasn't, and recently their contributions have more and more closely resembled copyvios. This evening they made 4 edits in the space of a minute (xanthomatous, xanthophyll, xanthophyceae, xanthism) consisting of 113 words not including formatting, a tremendous feat of typing, let alone recall and expertise. I am not sure what the policy is regarding contributors and dubious activities, but I am certain that Primetime merits some scrutiny. - TheDaveRoss 06:04, 4 March 2006 (UTC) (I apologize if this is the wrong venue)

The reason the entries were created so quickly is that I use MS Word to write them, because it has a spell checker. I write a list of entries, then move them here. In any case, my format still includes headers, italics, etc. I have explained in detail in previous discussions here why I don't add a second, unnecessary header and why I don't link words unrelated to the entry in question. I don't see why Connel blocked me over this.--Primetime 06:39, 4 March 2006 (UTC)
What, then, are you using to upload them? And when did you allegedly compose all these wonderful submissions? Why do you have no references cited? Any of these considerations taken alone is untenable; as a whole, there is simply no way that you are not engaging in massive copyright violations. Even if you are alternating your sources for each entry, you are still cycling through them all and cutting and pasting from those resources. Your refusal to format any part of your entries as per our formatting conventions is obviously due to the inconvenience of reformatting your copyright-violation sources.
But the ultimate reason I blocked you, was because the rate of your copyvio flood was increasing. The longer you were permitted to continue, the more would have to be deleted later. --Connel MacKenzie T C 07:02, 4 March 2006 (UTC)
My copy of MS Word does not tack spaces onto the end of everything. (Just tested, pasting into a couple different browsers.) --Connel MacKenzie T C 08:07, 4 March 2006 (UTC)

An additional concern (besides spaces) is the use of ALL CAPS for linking to other definitions. Appropriate capitalization is much more important in Wiktionary than in some of the other WikiMedia. --EncycloPetey 07:17, 5 March 2006 (UTC)


I'm simply going to give two parallel sets of quotations. One comes from seven randomly chosen articles created by Primetime in the last couple of days. The other comes from Webster's Third New International Dictionary of the English Language (Unabridged), copyright 1993 by Merriam-Webster, Inc. Any commentary would be redundant. Keffy 22:45, 5 March 2006 (UTC)

  • quaddle
    • Primetime: dialect England: GRUMBLER
    • Webster's 3rd: dial Eng: GRUMBLER
  • quarter blanket
    • Primetime: a blanket used under a horse's harness to cover from the tail to beyond the saddle
    • Webster's 3rd: a blanket used under a horse's harness to cover from the tail to beyond the saddle
  • quaestuary
    • Primetime: archaic: interested in or undertaken for monetary gain or profit "this may be termed the quaestuary class, this being the end which they aim at" -- J.F.Ferrier
    • Webster's 3rd: archaic: interested in or undertaken for monetary gain or profit <this may be termed the ~ class, this being the end which they aim at --J.F.Ferrier
  • xystus
    • Primetime: a long and open portico used especially by ancient Greeks or Romans for athletic exercises in wintry or stormy weather; sometimes: a walk lined with trees
    • Webster's 3rd: a long and open portico used esp. by ancient Greeks or Romans for athletic exercises in wintry or stormy weather; sometimes: a walk lined with trees
  • gonystylus
    • Primetime: a small genus of East Indian trees (order Malvales) constituting a monotypic family, having alternate leathery leaves, regular paniculate flowers and woody fruits, and yielding fragrant timber resembling agalloch
    • Webster's 3rd: a small genus of East Indian trees (order Malvales) constituting a monotypic family, having alternate leathery leaves, regular paniculate flowers and woody fruits, and yielding fragrant timber resembling agalloch
  • gopura
    • Primetime: the gateway of a temple in southern India; often the massive tower resembling a pyramid above the gateway
    • Webster's 3rd: the gateway of a temple in southern India; often: the massive tower resembling a pyramid above the gateway
  • xerogel
    • Primetime: a solid formed from a gel by drying with unhindered shrinkage
    • Webster's 3rd: a solid formed from a gel by drying with unhindered shrinkage -- compare AEROGEL

You should provide evidence to back up your claims. You're Australian, so how did you get an American dictionary? I can guarantee you that these are not from M-W, Unabridged.--Primetime 23:29, 5 March 2006 (UTC)

Any single quote from M-W or anywhere can be fair use, but a pattern of copying from such a source is not acceptable. When there has been a reasonably documented claim of copyright infringement, the burden shifts to the contributor to show evidence of where the material comes from. If you can guarantee that the material is not from M-W, and is legally being used, then do it by showing your sources. Please make this your priority. Eclecticology 10:09, 6 March 2006 (UTC)
OK, so can you please tell us where they are from. The evidence is right here, and clearly suggests that these could be copyvios. Some of them are clearly too recent to be from an out-of-copyright source such as Webster 1913. Where did you get them from? — Paul G 10:01, 6 March 2006 (UTC)
Most of the entries are either from The Century Dictionary, edited by William D. Whitney, 1891; or Funk and Wagnall's A Standard Dictionary of the English Language, 1893.--Primetime 11:16, 6 March 2006 (UTC)
Primetime, the onus is on you, to provide individual, specific references. The very first term in question did not pan out when I searched. Where exactly are you pasting these from? --Connel MacKenzie T C 18:26, 6 March 2006 (UTC)
What do you mean exactly by "it did not pan out"? Where did you look? I use OCR (optical character recognition) to move the scanned pages directly to a word processor. The books indeed exist, you can find references to them here and here --Primetime 19:08, 6 March 2006 (UTC)
Well, it really doesn't matter what scanned images I looked at; you still haven't provided citations. Are you planning to? --Connel MacKenzie T C 19:15, 6 March 2006 (UTC)
I would be happy to provide citations if I were to be unblocked.--Primetime 20:07, 6 March 2006 (UTC)

Ever willing to give people the benefit of a doubt, Keffy goes off to check out Primetime's claimed sources. Alas, his non-Australian city has no publicly available 1893 editions of F&W, so the quotes below are from Funk & Wagnalls Standard Dictionary of the English Language, International Edition (1963).

Until somebody finds an original F&W, I'm willing to admit it's not completely impossible that Philip Gove, editor of Webster's Third, despairing that his huge staff would ever get the job done, begged his company's arch-rival for help, and that not only did F&W let him use verbatim copies of their definitions, they also graciously agreed not to use those definitions themselves in their own later editions. Not completely inconceivable, but not likely enough in this universe that I'm willing to waste any more of my time on this. Keffy 00:01, 7 March 2006 (UTC)

Sources compared: Webster's 3rd; Primetime; Century Dictionary (1895 printing); Funk & Wagnalls (1963)

  • gonystylus
    • Webster's 3rd: a small genus of East Indian trees (order Malvales) constituting a monotypic family, having alternate leathery leaves, regular paniculate flowers and woody fruits, and yielding fragrant timber resembling agalloch
    • Primetime: a small genus of East Indian trees (order Malvales) constituting a monotypic family, having alternate leathery leaves, regular paniculate flowers and woody fruits, and yielding fragrant timber resembling agalloch
    • Century Dictionary (1895 printing): (no entry)
    • Funk & Wagnalls (1963): (no entry)
  • gopura
    • Webster's 3rd: the gateway of a temple in southern India; often: the massive tower resembling a pyramid above the gateway
    • Primetime: the gateway of a temple in southern India; often the massive tower resembling a pyramid above the gateway
    • Century Dictionary (1895 printing): In India, especially in the south, a pyramidal tower over the gateway of a temple. Also gopuram.
    • Funk & Wagnalls (1963): (no entry)
  • quaddle
    • Webster's 3rd: dial Eng: GRUMBLER
    • Primetime: dialect England: GRUMBLER
    • Century Dictionary (1895 printing): (no entry)
    • Funk & Wagnalls (1963): (no entry)
  • quaestuary
    • Webster's 3rd: archaic: interested in or undertaken for monetary gain or profit <this may be termed the ~ class, this being the end which they aim at --J.F.Ferrier
    • Primetime: archaic: interested in or undertaken for monetary gain or profit "this may be termed the quaestuary class, this being the end which they aim at" -- J.F.Ferrier
    • Century Dictionary (1895 printing): (no entry)
    • Funk & Wagnalls (1963): (no entry)
  • quarter blanket
    • Webster's 3rd: a blanket used under a horse's harness to cover from the tail to beyond the saddle
    • Primetime: a blanket used under a horse's harness to cover from the tail to beyond the saddle
    • Century Dictionary (1895 printing): A horse-blanket intended to cover only the back and a part of the hips. It is usually put on under the harness.
    • Funk & Wagnalls (1963): (no entry)
  • xerogel
    • Webster's 3rd: a solid formed from a gel by drying with unhindered shrinkage -- compare AEROGEL
    • Primetime: a solid formed from a gel by drying with unhindered shrinkage
    • Century Dictionary (1895 printing): (no entry)
    • Funk & Wagnalls (1963): (no entry)
  • xystus
    • Webster's 3rd: a long and open portico used esp. by ancient Greeks or Romans for athletic exercises in wintry or stormy weather; sometimes: a walk lined with trees
    • Primetime: a long and open portico used especially by ancient Greeks or Romans for athletic exercises in wintry or stormy weather; sometimes: a walk lined with trees
    • Century Dictionary (1895 printing): (no entry)
    • Funk & Wagnalls (1963): an exterior portico in ancient Greece facing south where athletes exercised in the winter

I went ahead and looked at Merriam-Webster's Third myself. As you can see above, their definitions are completely different from Keffy's.--Primetime 06:11, 7 March 2006 (UTC)

My local library lets me check out references, so I have "Webster's Third New International Dictionary" subtitle "OF THE ENGLISH LANGUAGE UNABRIDGED A Merriam-Webster REG. U.S. PAT. OFF." © 1993 BY MERRIAM-WEBSTER, INCORPORATED.
ISBN 0-87779-201-1
Needless to say, each one of Keffy's citations above matches exactly. --Connel MacKenzie T C 23:43, 7 March 2006 (UTC)
M-W has been printing the Third since 1961, yet you happen to have a 1993 printing, also. What library did you get it at?--Primetime 14:04, 8 March 2006 (UTC)
"WEBSTER'S THIRD NEW INTERNATIONAL DICTIONARY" / "PRINCIPAL COPYRIGHT 1961". As to what library, none of your business. --Connel MacKenzie T C 16:51, 8 March 2006 (UTC)

Despite his block, he is still editing here, even today. Yes, I accidentally came across florid where I wanted to add a quote, and I immediately recognized his format. — Vildricianus 20:52, 11 March 2006 (UTC)


Action

It has been 6 days since Primetime's last "rebuttal", and I personally feel the evidence has been given and we ought to make a decision. I see two main things (and a third, tangential thing) which should be considered (read: voted on).

  • What to do with Primetime.
    1. Maintain the present indefinite ban.
    2. Alter it to some other timeframe (I do not know what the standards are here and elsewhere, but something along the lines of 1 month to 1 year seems more suitable to me.)
    3. Some other solution.
  • What to do with Primetime's edits, especially the ones where he is sole contributor. I think the evidence is clear here that many of them are copyvio, which may make this a legal matter; I do not know the precedent.
    1. Delete them all.
    2. Sift through and delete only proven copyvios. (I don't know exactly how this would work.)
    3. Some other solution.
      Just looking at the talk page, the user has a track record of copyvios. Revert all changes that fit that pattern unless they can be proved otherwise. Davilla 20:19, 15 March 2006 (UTC)
  • The third thing which I think should be addressed is how we should handle this situation as a community in the future. I felt like I didn't have any clear direction to take this, and couldn't find anywhere which explained what to do. Was this situation well handled? Should we move in the direction of *pedia's ArbCom for issues such as this? Thoughts on this are important as we grow because this situation WILL come up again. - TheDaveRoss 08:37, 16 March 2006 (UTC) (sorry I did not sign this initially, I don't know how I missed that.)

It is unfortunate that these suggestions were not signed.

  1. The indefinite block is just fine, and indefinite is not the same as infinite. If he shows himself willing to be more co-operative with the community and its standards he can come back. He will need to convince some key people that he is reformed before that happens, and his work will be closely watched if and when he does come back.
  2. There is no real need to delete anything. The simple fact indicated above that two versions, thirty years apart, of the same dictionary would give such radically different definitions of the same word should set off alarms. As David Crystal points out in his The Stories of English there is a strong tradition of "borrowing" in English dictionaries that goes at least back to the time of Robert Cawdrey's first English dictionary in 1604. Any single entry from another dictionary whose copyrights are still valid is not ipso facto a copyright violation. At worst it is fair use, and may not itself have been copyrightable by the people who put it in that dictionary. It may result from a series of borrowings that can be ultimately traced to a public domain source, or it may (as Patrik so ably indicated on my talk page) be subject to the merger doctrine which applies when the copyrightable form of expression and the uncopyrightable idea expressed come into conflict. We can protect ourselves by simply giving a reference to the source of a definition. What really can be a violation of copyrights is an extended pattern of borrowings; no-one should be allowed to establish such a pattern, and we should stop anyone who seems to be headed in that direction, as we have done with Primetime to disrupt the pattern.
  3. When I read about Wikipedia's Arbcom I find that it creates as many problems as it solves. Giving direction and maintaining a collaborative community are often conflicting aims. It is easy to apply rules to newbies, but it is also extremely problematical to apply the same rules to difficult contributors who have been highly innovative and have contributed a lot to the project. I think the Primetime situation has been well handled despite the uncertainties that often prevailed. A consensus system is made strong by its own imperfections. I agree that such ideas need more development, but preferably outside the context of the difficulties that we are facing with a specific individual. Eclecticology 21:37, 15 March 2006 (UTC)
  • Ec, regarding #2: THAT IS TOTALLY FALSE! Primetime offered fake evidence that a fictional dictionary had different meanings entered thirty years later!
  • All of these copyright violations (all of Primetime's entries) need to be removed.
  • --Connel MacKenzie T C 22:33, 15 March 2006 (UTC)
  • (1) I think we should set a specific amount of time; making it an arbitrary "as long as it takes" doesn't give Primetime any definite time when he can feel like he can rejoin the community if he wishes to, and it doesn't give the community any idea of when it is proper to accept him. Some may feel that a week down the road he is rehabilitated while others may feel that several months is more appropriate. I don't think this should be a "feel it out" sort of deadline. (My personal opinion is something in the range of 3 months to 1 year.)
    (2) I personally cannot reconcile banning a user for their contributions AND keeping those contributions; this is hypocrisy. If we have decided that the contributions were copied from another source, I feel they should be removed. Whether or not this is a legal issue, it is a moral one to not include other people's work passed off as our own, and I am absolutely willing to re-add my share of the copied terms if it is the content loss you are worried about. A pattern of copyvios HAS been established and it is irresponsible to keep those copyvios when we know they exist. delete
    (3) I agree that ArbCom has issues; however, they do fulfill a certain aspect of the disciplinary process that I think is crucial, namely that such decisions should not have the potential of becoming popularity contests. If two people get into a revert war, and person1 is well liked while person2 is not so much, person2 may be unjustly overruled simply because of bias. I am not saying this has happened, it is just an eventuality of mob-rule discipline. - TheDaveRoss 21:06, 16 March 2006 (UTC)
  • I hope that you aren't suggesting every one of the 430 entries I have created and every one of the 1270 edits I have made here are copied out of a dictionary. I can understand why you would be suspicious about the rapidly-created ones (about 220 of the total), but those are a recent phenomenon, created on three evenings in March and February. I have been editing here since November 22 of 2005, and most of my entries have subsequently been reformatted, expanded--and often reworded--by other editors.

    About the block, though: The reason why I came to Wiktionary in the first place was to add definitions to words I use in my articles in Wikipedia. I gradually developed an affinity for editing here regularly, but that is gone now. I don't think I really want to start editing here on a large scale again. I would like to be able to come here and add a few entries every once in a while that I think everyone would be interested in or that I need defined for Wikipedia. I probably wouldn't create more than three a day, or so, though. But, I don't think you need to ban me that long at all. I felt humiliated by being blocked like I was, so you can be assured that I was punished. And I think it's obvious why some people got upset with me, so I'll stop.--Primetime 22:08, 16 March 2006 (UTC)

Apart from the copyright violations, you have also resorted to sockpuppetry, not only to post here but to edit articles as well (see florid). I assume that blocking the Primetime account will not suffice. — Vildricianus 09:03, 17 March 2006 (UTC)
Vildric: Whatever block is decided upon, it will be enforced, based on the person not the username.
Primetime: I think that the ones which are obviously your work can stay, but many are obviously not and should go.
- TheDaveRoss 02:49, 18 March 2006 (UTC)
Yes, there is clearly a distinction for what was entered initially. Primetime claims some of that has changed over time. As to what's salvageable though, my partial reformatting, rather than rephrasing, of give is lost, as is probably the majority of like content. If Primetime is willing to make the obvious reverts himself, leaving any subjective cases to us, then you should allow him to do so, and consider a shorter ban thereafter for having done so. Davilla 05:25, 18 March 2006 (UTC)
  • I don't think that any of my contributions should be deleted. I agree with Eclecticology that even if they were copied from a newer source, they are not really copyvios. (See Meta:Avoid Copyright Paranoia.) All I was saying is that I understand why you were suspicious about them, and that if you unblock me, I wouldn't keep on doing what I was doing. Wiktionary is certainly not liable right now because we are partially exempted as a non-profit educational entity under copyright law.[1] A key question in determining whether a violation has occurred is whether another dictionary's ability to compete has been damaged by the use of any of their material. If it has, they are compensated, but since the entries I created aren't really referenced, and they lack the credibility of those in print, and since the audiences for online dictionaries and printed ones are completely different, I don't think anyone would even consider taking us to court, as they might be rewarded $5 (or less) by a jury in damages. Further, the trustees who run Wikimedia would need to be warned first by the dictionary in question that they are violating copyright law. (They can't really be expected to be aware of every discussion thread on every Wiki, and thus are not liable for knowingly aiding the copying of any material). TheDaveRoss's proposal about rewording definitions sounds OK to me, though, but they shouldn't be deleted, and many of the definitions in question cannot be reworded because they have only one definition. Also, please realize that I am a financial donor to Wikimedia, so if I believed that something would harm the wiki, I wouldn't do it! Finally, "give" is from the first edition of the OED.

    About the block again, though: I've never been blocked this long on any wiki. The longest I've ever been blocked is three days or so--and that was here. This seems like overkill, and let me tell you that editing using open proxies is not easy--they disconnect, they're slow, and they take time to find. The only reason why I created "florid" was because I needed it defined for an article in Wikipedia. I created "spic" because I had just written an article on the word in Wikipedia, so I figured it belonged here, as well. By and large, I'm really just using them--with difficulty--to argue my case here. If you need another type of guarantee from me to be unblocked that I haven't given, let me know.--Primetime 09:56, 18 March 2006 (UTC)

    Regardless of the legality of the issue, I don't think we should be in the business of doing as much as we can get away with; it is *wrong* to take someone else's work and claim it as your own, and it is *wrong* to act deceitfully about doing so. This is my biggest problem with the whole situation: you claimed someone else's work as your own, and when confronted you lied about it, repeatedly. Why should we delete what you added? Because the authors of the content are the only ones who can give it to us, the only ones who can re-license it (which happens when something is added to Wiktionary) and even if they could never do anything about it in court, we shouldn't take what isn't ours. Rewording them would be tricky; I am not sure that we have any contributors at present with the expertise to write definitions for many of the obscure words you copied, and they are not easily researched. - TheDaveRoss 04:17, 19 March 2006 (UTC)
    But isn't that what we do already with dictionaries like Webster's Second? Further, deleting the entries in question would do more harm than good. Making the definitions impossible to find is much worse, in my opinion, than being disingenuous about their authorship (if you can call that authorship). And let's be clear about how un-unique these definitions are. Almost all of them are less than a sentence in length. They're phrases, really. That raises the question, again, about whether anyone--legally or logically--can claim that a concept is their own. No dictionary invented these words--the use by others made them English. The terse statement of an idea is not a unique work--so how can you steal something that isn't someone else's to begin with?--Primetime 06:25, 19 March 2006 (UTC)
I am stunned that all of Primetime's contributions still have not been deleted. So far, every single statement from Primetime has been exposed as a lie, once investigated. I have no reason to believe that any of his submissions are not copyright violations. They are all GFDL violations in that he did not attribute the source. His copyright violations began with his arrival and were always suspicious; as he grew bolder he sped up, he didn't change methods. And he lied every step of the way. --Connel MacKenzie T C 03:21, 22 March 2006 (UTC)
I have started going through his entries and cleaning them up. I can find the source of the copyvios in almost half the cases, but (after research) I am rewording them all, just in case. Some of them have Wikipedia entries, so that helps. I may be some time! SemperBlotto 17:22, 26 March 2006 (UTC)
Thank you, but shouldn't they be deleted outright, along with all his submissions? A pattern of copyvio is clearly established. The ones you are not immediately identifying as copyvios are probably simply from another source that you don't have easy access to. --Connel MacKenzie T C 00:06, 27 March 2006 (UTC)
Thanks Blotto, I certainly don't feel I have the expertise to re-add many of these, nor the resources to do so (for many of them; the pedia ones I suppose I could do). I think that if they are going to be rewritten we ought to make a list of them, and delete them, then re-add them from scratch. I may be anal and this may be unnecessary, but it also can't hurt anything. I do not feel comfortable with keeping another dictionary's set of definitions, again, regardless of the legality, regardless of the precedent. - TheDaveRoss 03:41, 27 March 2006 (UTC)
Well, I'll continue to do the best I can, but where I am not confident I'll just delete the word. If anyone else wants to do some deletes, I'm sure few people will object. Words that warrant an entry will just get added again in the normal course of events. SemperBlotto 09:45, 27 March 2006 (UTC) - I have no idea what to do about give - it is just too daunting. SemperBlotto 10:57, 28 March 2006 (UTC)
Daunting indeed. However, we can't just take his word that these are from a public-domain source, can we? I'd say revert. — Vildricianus 12:26, 28 March 2006 (UTC)
Does anyone have a clue? — Vildricianus 14:29, 3 April 2006 (UTC)
All done (apart from give) - and I'd rather not have to do it again. SemperBlotto 15:23, 3 April 2006 (UTC)


After Primetime's IP started recreating give me fin on the soul side, which I had deleted, he proceeded to my RfA on Wikipedia here and decided to change his vote (which is fine, but it's what followed) here; then, after Dvortygirl made a comment, he reverted them as a personal attack here and here. It's yet another incident to add to the list -- Tawker 10:23, 9 April 2006 (UTC)

Speaking of which, why did you delete an entry with citations and quotations? That was not even part of the pattern discussed above! I voted against you on Wikipedia, but I have every right to do that. Dvortygirl then mentioned my past on Wiktionary, which seems incredibly malicious and irrelevant to Wikipedia.--Primetime 10:45, 9 April 2006 (UTC)
Well, I am not going to argue with you because it is frankly not worth my time. It was the consensus of multiple people. As for your Wikipedia vote, Dvortygirl is well within her rights to post your context as it is pretty obvious your vote was a direct result of what happened here. As for you not being blocked on Wikipedia, check your IP logs, someone decided that trolling and removing contextual comments was grounds enough for a block (for 24 hours or less until my RfA there is over) - In short, it was a consensus of 4 administrators plus the fact that it was previously deleted (as basically the same content) that led to the deletion. You are more than welcome to contribute if you wish to stick with the guidelines of the Wiktionary community though, so I look forward to seeing some constructive edits :) - On another note, it might be best just to create a new account, Primetime seems to have a little stigma attached to it around here. All the best -- Tawker 18:33, 9 April 2006 (UTC)
See [2] for my response. Also, I agree you shouldn't continue to write on this matter, as you haven't been participating in this discussion about me and you have had very little time to watch my edit history.--Primetime 01:41, 10 April 2006 (UTC)

I had a good laugh with Primetime's changes to the Wiktionary article at Pedia [3]. — Vildricianus 11:30, 11 April 2006 (UTC)

Administrative note: I am striking out Primetime's false "additional column" from the table above, from this archive. --Connel MacKenzie T C 07:16, 8 May 2006 (UTC)

Wiktionary:Redirections

Proposal. Please comment. There's little information available on the topic, so I've only represented that which I could think of, which is only a part of the actual practical policies. — Vildricianus 19:54, 5 March 2006 (UTC)

Thanks for taking the plunge into identifying a policy and starting to develop it. Great stuff! --Richardb 08:32, 6 March 2006 (UTC)
You can find some debate regarding the subject at
--Patrik Stridvall 09:31, 6 March 2006 (UTC)
I like this policy and it adds a lot of clarity. I think however there is considerable overlap with the CFI policy - would you consider merging them?
Jonathan Webley 20:49, 7 March 2006 (UTC)
  • I guess you're talking about the idioms? I'll leave it this way and refer to CFI. Both pages need to mention it the way they do right now I think. — Vildricianus 10:07, 8 March 2006 (UTC)
Not just idioms - the whole article about redirects is really discussing what merits an article (i.e. those which meet CFI) and those which should only get a redirect. Jonathan Webley 10:09, 8 March 2006 (UTC)

Something that occurs to me is the issue of citations of alternate spellings or inflected forms. Should they be listed under the primary article, or should they too go under the relevant specific inflected or alternate entry? I would prefer for them to go under the main article. This is helpful particularly where a group of citations aims to show a word's history, beyond its current spelling. Widsith 20:54, 7 March 2006 (UTC)

Both. Eclecticology 07:54, 8 March 2006 (UTC)

"Redirecting between obsolete spellings or regionally different spellings, not least of all from American to British or the other way round, is strongly discouraged." I can understand that for individual words, but what about compound words like food coloring? Redirecting seems right to me, but somehow I have a feeling I'm in the wrong here. Davilla 02:06, 15 March 2006 (UTC)

It's still bias towards one variant. We should avoid that. — Vildricianus 09:20, 17 March 2006 (UTC)

POS headings again (in particular Proper noun)

A comment of mine on having a separate "Proper noun" header from Eclecticology's talk page:

Headings such as "Proper noun", "(In)transitive verb", "Phrasal verb", etc. are problematic: First, they require contributors to have more grammatical knowledge compared with the simple POS headers (and so might not add a new definition for fear it might end up under the wrong header or, not much better, add it to the wrong section). Second, it is not clear why and what additional grammatical information should be added to them. We could as well have "(Un)countable noun", "(In)transitive phrasal verb", "(Not) comparable adjective", etc. headers, which, like the current exceptions, is highly unsystematic and irritating. A good bad example is "Wellington": The boot has wrongly been classified as a proper noun. That's why tagging individual definitions is necessary. We should aim to keep things simple and make it possible to add information in small bits. We also don't want users to have to know that when they are looking up a noun they have to check below the "Phrasal verb" header whether there is a "Proper noun" section.

It would be great if the community finally agreed to abolish the "Proper noun" header. Ncik 16:29, 7 March 2006 (UTC)

I think it would be much better to abolish all "POS" headings in favor of "===Definitions===" and a tag at the start of each "#" line indicating the part of speech (like a regular dictionary.) Since this idea had been vehemently rejected in the past, I don't expect much to have changed regarding it. But without a total transformation to the general approach here, the "Proper noun" heading should remain. --Connel MacKenzie T C 16:37, 7 March 2006 (UTC)
Although I wouldn't object to the ===Definition=== heading, I don't think that moving the POS into the separate lines would be helpful. "Regular dictionaries" do at least maintain some separation when a word functions as more than one part of speech. Unlike them, we also have extensive translations, and jumbling these together could be confusing for a passive user. Eclecticology 07:52, 8 March 2006 (UTC)
I wasn't suggesting "jumbling" them together, I was suggesting they belong together, grouped together, just like regular dictionaries have them. Although now that you mention it, if combined with the "##"/"###" sub-meanings proposed layout, such a jumbling could easily be possible. --Connel MacKenzie T C 20:14, 10 March 2006 (UTC)
I am strongly in favor of keeping the headers for Proper noun, Transitive verb, Intransitive verb, and (for Spanish) Reflexive verb. As for any other headers beyond the basic parts of speech, I'm either ambivalent about them or opposed to them. I do think it's very useful for the part of speech to be given in the header, particularly for long pages. As the entries become longer, it becomes harder to scan for the sense that you need. With a part of speech header, one can click on that POS in the Contents of a long page to get where one needs to go. Switching to line tags for each part of speech for each definition will make readability much worse, particularly in cases where there is already a (mathematics) or similar tag at the beginning of the line. --EncycloPetey 13:42, 8 March 2006 (UTC)
I don't care either way when it comes to Proper Noun, but I think some kind of POS should be given rather than have a general Definitions line. As for verbs, I don't much like Transitive Verb and the like, only because to me it seems counter-productive to split verbs which can be used both transitively and intransitively. Widsith 18:23, 8 March 2006 (UTC)
Sure you're not going to remove all POS headers? While I agree that "POS header" is a misnomer, they do serve a useful purpose with regards to browsing long pages and keeping translations separated.
Of proper nouns, I have even considered removing this tag at all and replacing it with just "noun" – at one point I wasn't sure of the need to identify whether a noun is proper or not – but then, I'm not sure of that anymore. There's probably plenty of good reasons to have it. Then I think we should keep it, right?
As for transitivity of verbs, well I've removed it and merged both into one header dozens of times, and most of the time, the one ===Verb=== is more practical towards definitions, but less towards space management (many weird-looking and paperdictionarish tags in front of the def). We should remain flexible here. — Vildricianus 20:46, 8 March 2006 (UTC)

Unfortunately, Connel's remark has caused this discussion to become concerned with proposals reaching much further than mine. I just want to know if we can agree that, on the basis of our current layout, abolishing the "Proper noun" header and similarly any other header with more than just the simple POS in it (with the usual exception that there might not be an actual POS in the POS header) and instead giving this additional information at the beginning of each definition would be a step forward. Ncik 01:55, 9 March 2006 (UTC)

No. Keep Prop. Noun, tr. v., Intr. v. at the least. As was pointed out above, we need a means to distinguish the two basic categories of noun, namely whether the noun refers to a specific thing or to a class of objects. Capitalization does not help because there are capitalized common nouns both in English (Wellingtons, Mackintosh, and the like) as well as in German. This is an important distinction, particularly for learners of English. I am also in favor of the basic split in verb headers, in part because it hugely affects the way the verb is used. Every major dictionary I know of separates the transitive from the intransitive definitions, so you're arguing against the format that lexicographers prefer and that readers of dictionaries have come to expect. --EncycloPetey 11:40, 10 March 2006 (UTC)
EncycloPetey, please read more carefully what other people write before making unqualified comments. It will be indicated whether a noun is a proper noun or a verb is transitive or intransitive when having a particular meaning. I'm just trying to convince people to get rid of these specifications in the POS header and move them to the beginning of the definitions. I explained the obvious advantages above. See also # Wiktionary:Beer_parlour_archive/October-December_05#Verbs that are transitive and intransitive. Ncik 01:57, 11 March 2006 (UTC)
Thanks. I've always been a bit ambivalent about the separate headings for Transitive and Intransitive verbs. You make a good point for separating them as an ease to navigating through a long article. Combining them and adding a line tag should not be automatic. Eclecticology 21:59, 10 March 2006 (UTC)
That makes sense, yes. I've always thought that merging them benefits the readability of definitions, while separating them benefits the ease of translations. Bearing in mind that our translations sections are for now more valuable than our definitions, I won't merge any more. As a logical result, Proper noun certainly deserves a header. — Vildricianus 22:09, 10 March 2006 (UTC)
Discouraging "Proper noun" as a heading would be a mistake. (Ncik, the first couple lines of this thread are where you introduced trans./intrans. to the conversation; my comments did not expand the scope.) --Connel MacKenzie T C 20:14, 10 March 2006 (UTC)
Could you please give reasons why it would be a mistake to drop the "Proper noun" heading. Ncik 01:57, 11 March 2006 (UTC)
"Proper" noun: don't care. "Countable" should never be included. An uncountable noun plus s means different kinds; e.g. milk is uncountable, but "milks" means different kinds of milk.
I don't understand what you mean by the above. Of course a countable noun should be tagged as countable. The "countable" just shouldn't appear in the header.
Sorry, I should have said "in the header". Davilla 12:32, 12 March 2006 (UTC)
Verb: don't further distinguish the type in the heading, but do split them. Personally I try to fit a pattern, e.g. to think someone somehow: To think some person to be some way: "I think him childish"; to look somehow: To appear to be some way: "Your grandmother looks young." Transitive and intransitive headings are not clear-cut in all cases. How to list inflections is a big question though.
The valence of a verb can easily be given at the beginning of each definition by means of a template. We currently have: Template:avalent for valence 0 (e.g. to rain), Template:intransitive for valence 1 (e.g. to die) and Template:transitive for valence 2. I don't know if there exists a template for tri-valent verbs (such as to buy: "I bought her roses"). Inflections are the same no matter what the valence of the meaning is. Ncik 01:57, 11 March 2006 (UTC)
The term valency seems to be a fairly recent addition to theoretical linguistic jargon; it is partly accepted in Britain, and not-at-all accepted in the United States. I don't know whether there really is such a thing as an avalent verb. The only reason "we" have Template:avalent is because Ncik put it there. I don't think that we are being helpful to the user when we start adopting linguistic theories that are only accepted by a subset of specialists in linguistics. In what little I've read there seems to be little support for the concept of avalency; perhaps it could be applied to the infinitive of a verb, but then we are not looking for characteristics that are only applicable to a single form of a verb. Eclecticology 06:17, 11 March 2006 (UTC)
I wasn't aware that this is new terminology (The OED has neither "valence" nor "valency" in the linguistic sense. www.dictionary.com does have "valence" in the linguistic sense and "valency" as a variant. Wikipedia links from "valence" to "valency (linguistics)".) Anyhow, apart from Template:avalent, which we might want to rename, I'm not suggesting using the word "valence". We should stay with the traditional terminology "transitive", "intransitive", and create Template:ditransitive for verbs of valency 3. Ncik 12:50, 11 March 2006 (UTC)
I do think it better to use "verbal"/"nominal"/etc. as attributes for other labels like "phrase", e.g. get out of here, but the confusion between "verbal phrase" and "phrasal verb" might merit elimination of the latter (as a heading, not of entries like look up). Davilla 22:20, 10 March 2006 (UTC)
The word "phrase" should never turn up in a POS header. For phrases the POS header should give the POS of the phrase, not the POS of the headword of the phrase. But we could consider indicating what the POS of the phrase's headword is elsewhere. Ncik 01:57, 11 March 2006 (UTC)
So try to teach grandma how to suck eggs should be ===Verb=== not ===Verbal Phrase===? Davilla 20:21, 13 March 2006 (UTC)
Yes. Ncik 21:00, 13 March 2006 (UTC)
Absolutely not! The header is supposed to be ===Phrase=== because phrases are formed here in their most generic sense - not how they are used! "It was a case of him trying to teach granny to suck eggs" would be an example where the phrase is not a verb. But that (and similar common forms) should redirect to the entry. --Connel MacKenzie T C 00:15, 14 March 2006 (UTC)
Your example sentence features the phrase as a gerund. There have been many discussions about how to deal with gerunds in the past (rather less recently), and I think we have never come to a conclusion. I've hardly ever come across pages with a "Gerund" or "Verbal noun" header or something like that. The user will have to figure out himself that, in the English language, almost any verb's present participle can function as a gerund. The header should definitely not be "Phrase". Ncik 17:36, 14 March 2006 (UTC)
"Phrase" alone does sound too simplistic. There's no fine line between, say, a compound noun and a noun phrase. Davilla 22:44, 1 April 2006 (UTC)

As I was studying some grammar stuff, I came across that "valency" thing you (Ncik) mentioned. It occurred to me that this can be far more complex than what you suggested above, and I would therefore recommend that it be left out of consideration. However, at the same time I also realized that transitivity of verbs – which is not the same as valency – has the same capability of becoming elaborate beyond the common user's understanding. E.g. there's a usage of to read that is called "pseudo-transitive", when one says I've been reading all day (as one omits the direct object here), which might have a different translation in some languages.

If we were to keep it simple and plain we should leave out the trans/intrans stuff and replace it with simple model phrases like the ones Davilla mentioned above. As these labels are good for English-only stuff but perhaps insufficient for other languages, this would be the most beneficial thing to do for translations and the least one for definitions. However, I'm still not sure what our main direction is: a translating dictionary or a defining dictionary, as both these objectives are hard to combine in a balanced way. It occurs to me, though, that these labels are being applied here in Wiktionary because they have been in monolingual dictionaries. Note: this is all just brainstorming of course. — Vildricianus 11:23, 12 March 2006 (UTC)

I don't quite understand what Davilla is proposing there, and even less understand how it is related to transitivity. As always we will have to have different policies for different languages. It would be naive to expect grammatical concepts to be the same in all languages (more than one would actually be surprising enough). We should by no means abolish tagging verbs as (in/di)transitive just because these notions are more complex than they seem on first sight. The exact implications of tagging a verb as (in/di)transitive should be explained on the according template pages. Ncik 21:00, 13 March 2006 (UTC)
Wiktionary places way too much emphasis on transitivity in the rendered entry already. Real dictionaries list v. tr. or v. itr. or n because that is not useful information to someone looking for a definition. We shouldn't have "intransitive", "transitive", "ditransitive" in headers or spelled out on definition lines. They should be abbreviated, at the start of each definition, just like a real dictionary. Not being paper, we can link the abbreviations to the proper entry for the curious souls out there. But spelling them out in headers gives uncalled-for emphasis to these grammatical distinctions. --Connel MacKenzie T C 00:15, 14 March 2006 (UTC)
What I mean is use a pattern instead of funny labels like transitivity and valency that are questionable in some of the corner cases (such as those above). For the regular cases, this comes out as:
===Verb===
to eat
  1. To consume food.
===Verb===
to eat something
  1. To consume something.
or something along those lines. Davilla 20:10, 15 March 2006 (UTC)
You want to blur these concepts because you can't be asked to define and explain them properly?! Not the way forward. (Completely apart from the fact that having two "Verb" headers is atrocious). Ncik 21:54, 15 March 2006 (UTC)
Am I blurring them? Do you not agree that these are two different senses, a transitive and intransitive? I would imagine all dictionaries distinguish those two definitions. I actually don't like the second header either, so I've started eliminating it. But the concept is the same.
This has nothing to do with how well they're defined. Even respectable dictionaries aren't very good at this IMO. For instance, AHD defines the word xxxxx (redacted) as "v. tr. To judge or regard; look upon." Consider that you are unfamiliar with xxxxx in English and would like to use it in a sentence. Please construct a grammatically correct sentence for me using the verb xxxxx and conveying the meaning given. 22:59, 1 April 2006 (UTC)

French Wiktionary article count

What the heck has happened over there? In the last three months they have increased their article count by 300%. It can't be a bot because the increase has been pretty steady over that period. So does anyone know what's been causing this monstrous increase? GeorgeStepanek 06:36, 9 March 2006 (UTC)

Apparently, they added an entire edition of the Dictionnaire de l'Académie. Eclecticology 09:55, 9 March 2006 (UTC)
Yeah, but that's only 33,000 articles. Where did the other 85,000 come from? GeorgeStepanek 06:04, 10 March 2006 (UTC)
  • Every entry that had a translation, they 'bot-created an article for the translation. So fr:hello now has a (short) entry. --Connel MacKenzie T C 08:01, 10 March 2006 (UTC)
    • Aha! Thank you for explaining that. This mystery has been bugging me for weeks. GeorgeStepanek 20:26, 11 March 2006 (UTC)
    • The increase comes from uploads of:
      • the Dictionnaire de l'Académie (30 000)
      • the 20k Chinese characters that are in en:
      • translations from other Wiktionaries (at least en: and de: )
      • an online dictionary of Japanese
      • translations from Ergane
      ... everything is there: fr:Wiktionnaire:Transferts. We don't have anything else to import for now, but we'll maybe do what Connel is talking about (still a project in discussion). Kipmaster 20:33, 12 March 2006 (UTC)
Wow. You trusted and used the translations here on en:, but not your own? --Connel MacKenzie T C 02:31, 13 March 2006 (UTC)
Yes, we can trust and import XX->fr in the XX wikt more easily than fr->XX in the fr wikt. I won't explain why here, because it would be too long a discussion.
Now, I'm dreaming of a world where all Wiktionarys will be united, and where we'll not have to do these duplicating imports... Kipmaster 13:34, 13 March 2006 (UTC)
Another comment from a Wiktionnaire contributor:
  • The Wiktionnaire, although the best wiktionary :-), is probably not the most advanced one: I feel that many words in the Wiktionary might be omitted from statistics because they lack a link (I found some by clicking on Random page).
  • If I am right, this is mainly due to a lack of control. Enforcing formatting rules through templates (whatever these rules are) saves much time, and provides a better consistency and much easier control. In the Wiktionnaire, thanks to templates, all words have at least one category and, therefore, at least one link.
  • If you want to, also have a look at my user page. Lmaltier 15:09, 18 March 2006 (UTC)

Help wanted - translations to be checked

Translations to be checked are now being categorised by language (see the discussion above) for easier maintenance. I'm going through the links at Wiktionary:Babel and inviting people with good knowledge of particular languages to help out with checking the translations for those languages. There are however many, many languages for which no one has a Babel entry, so I'd like to invite everyone who has a good (by which I mean Babel level 3 or above) knowledge of any foreign language (particularly the lesser-known ones) to take a look at the list of translations to be checked categorised by language and see what they can do to help. If you are interested in being involved, please read about how to check translations before you begin.

I also invite everyone, if they have not done so already, to add Wiktionary:Babel entries to their user page - it's quick and easy to do and lets everyone know which languages you are familiar with and at what level.

Thanks everyone!

Paul G 12:00, 11 March 2006 (UTC)

I've now contacted pretty much everyone with Babel level 3 or above who has made contributions in the last couple of months or so. Please excuse me if I've left you out - you are welcome to participate, naturally.
Here's a summary:
I've contacted everyone who says they know one or more of the following at level 3 or above, or who volunteered in the '"Translations to be checked" - a proposal' discussion in Wiktionary talk:Translations: Afrikaans, Bosnian, Chinese, Croatian, Czech, Danish, Dutch, Esperanto, Faroese, Finnish, French, Frisian, Ga, Galician, German, Greek (both Ancient and Modern), Icelandic, Italian, Japanese, Korean, Latvian, Latin, Norwegian, Portuguese, Romanian, Russian, Serbian, Spanish, Swedish, Tok Pisin, Turkish and Welsh; in other words, most of the major languages and a few others.
I've had replies from people interested in working on Dutch, Italian and Welsh so far. I'm prepared to work on French (which has one of the biggest lists).
If you haven't been contacted personally and would like to help, please leave a message on my talk page. Thanks. — Paul G 09:09, 12 March 2006 (UTC)

Etymology

I had a go at writing a guide to formatting etymologies at Wiktionary:Etymology. But my style of writing is far from informative and useful (although I excel at rambling). You guys can make a better guide than I can. --Expurgator t(c) 20:56, 11 March 2006 (UTC)

Formatting guidelines go on WT:ELE, nowhere else. The page Wiktionary:Etymology seems more appropriate to deal with issues concerning the contents of our "Etymology" sections. Ncik 14:26, 20 March 2006 (UTC)
Do we put in etymologies from oldest to newest e.g "From Greek xxx meaning yyy through Latin xxx: yyy through Anglo Saxon xxx: yyy." or the other way round "Anglo Saxon xxx: yyy from Latin xxx: yyy from Greek xxx: yyy." I put them in from oldest to newest, because it shows logical progress (sometimes). Andrew massyn 14:11, 15 April 2006 (UTC).
I usually use the Webster's format of newest to oldest: From XXX, (XXX) from YYY, (YYY) from ZZZ. --Connel MacKenzie T C 19:39, 15 April 2006 (UTC)

Request for 'bot status User:TheCheatBot

  • Name: User:TheCheatBot
  • Owner: User: Connel MacKenzie
  • Purpose: Upload inflected forms (only if missing), translation entries (only if missing), redirects
  • Method:
    1. Analyze the "latest" XML dump.
    2. Find all words that are linked (text or any template.)
    3. For undefined entries, use a preload template to generate text.
    4. Magic: Upload using the pywikipedia tool pagesfromfile.py as so:
      1. Simple noun plurals that are not also third person verbs.
      2. Comparatives.
      3. Superlatives.
      4. Third person verb forms that are not also noun plurals.
      5. Present participles that are not adjectives (manual list verification)
      6. Past and Past participles that are regular, i.e. match each other.
      7. Foreign language entries from list of de-wikified languages from WT:ELE.
      8. Noun plurals that are also third person verb forms.
      9. Fill in uppercase/lowercase missing redirects.

Note that for all nine tasks, each task will finish before moving on to the next.
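
The first task in the method above (simple noun plurals that are not also third person verb forms, created only if missing) can be illustrated with a minimal sketch. This is not TheCheatBot's code, which is not shown on this page; the function names, the naive regular "-s" rule, and the in-memory word sets are all assumptions made for illustration. A real run would read titles and links from the XML dump and hand the generated text to the upload tool.

```python
import re

# Matches the target of a [[wikilink]], stopping at "|", "#" or "]".
LINK_RE = re.compile(r"\[\[([^\]|#]+)")


def linked_titles(wikitext):
    """Return the set of page titles that appear as [[wikilinks]] in the text."""
    return {m.strip() for m in LINK_RE.findall(wikitext)}


def regular_s_form(word):
    """Naive regular -s form (hypothetical rule; irregular words need manual lists)."""
    if word.endswith(("s", "x", "z", "ch", "sh")):
        return word + "es"
    if word.endswith("y") and len(word) > 1 and word[-2] not in "aeiou":
        return word[:-1] + "ies"
    return word + "s"


def plural_candidates(nouns, verbs, existing, linked):
    """List (plural, noun) pairs that are linked, still missing,
    and not also a regular third person verb form."""
    verb_forms = {regular_s_form(v) for v in verbs}
    result = []
    for noun in sorted(nouns):
        form = regular_s_form(noun)
        if form in existing or form not in linked:
            continue  # only fill in entries that are linked but undefined
        if form in verb_forms:
            continue  # ambiguous with a verb form: leave for a later task
        result.append((form, noun))
    return result
```

With nouns "cat", "walk" and "box", the verb "walk", and "boxes" already defined, only "cats" would be proposed: "walks" is skipped because it is also a verb form, and "boxes" because the entry exists.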

VOTE:

  • Approve 'bot flag:

  • Deny 'bot flag:
    1. This bot does too many things, including imposing formats and templates. That does not give alternative formats a fair and equal chance. Eclecticology 18:08, 13 March 2006 (UTC)
      I find your complaints very irrational. You've embodied so many misconceptions in two little sentences, I'm baffled and unsure where to begin. --Connel MacKenzie T C 00:30, 14 March 2006 (UTC)
      • OK, The main thing I do not understand about your complaint is the "imposing formats" (no, I'm not!) "and templates" bit. Before you made your comment, I finished testing the first hundred or so entries. During that test, I refined the logic to exclude noun plurals that are also verb third person forms. During the initial test, five slipped through. But I invite you to point out so much as one single entry where the 'bot code employed a template! When parsing current root forms, I honor ALL templates, including the ones I loathe created by Ncik. --Connel MacKenzie T C 06:07, 14 March 2006 (UTC)
      • The second major misconception, is that these are not inherently integral tasks. The parsing that is done on an XML dump is naturally conducive to combining these. For practical implementation, I hoped to solicit comments at each phase, after completing the prior phase. That way, as new XML dumps become available, the earlier iterations are folded in. Also, finding where to add comments would then be easier. But since you insist they be broken apart, TheCheatBot is now resubmitted (in a separate beer parlour section!) to perform only the noun plurals portion. --Connel MacKenzie T C 06:07, 14 March 2006 (UTC)
      • The implication that this somehow "imposes" anything is quite far-fetched! I currently enter these manually. I thought I was being nice by trying to do them outside of my user account, to avoid accusations of artificially boosting my edit count or other such nonsense. Furthermore, I prefer to use my semi-automated edits only when human review is required. For all these entries, no such review is needed (beyond pre-scanning the generated lists before each run.) --Connel MacKenzie T C 06:07, 14 March 2006 (UTC)
    2. I support this effort, but there are a few concerns with the automation (see below). There are other additions that could be made to fill out the pages more completely, including an etymology and a note not to leave translations for inflected senses. Also I'm unsatisfied with the format (see below) but for me that's not a barrier to activation, which is fortunate since it seems no one wants to make concessions on style here anyways. Davilla 19:51, 13 March 2006 (UTC)
      Thanks for the support. --Connel MacKenzie T C 00:30, 14 March 2006 (UTC)
    3. Having these redirects is wrong. Proposing a bot that will add them is asking for me to vote against it. GerardM 20:18, 13 March 2006 (UTC)
      They are here by English Wiktionary consensus. We have headwords and redirects - the redirects are not "spelling errors" as you continually assert...and having them 'bot added is simply because no human wants to sit around entering them. But if they are in place, we frustrate fewer visitors. --Connel MacKenzie T C 00:30, 14 March 2006 (UTC)
      The massive number of redirects that we have from capitalized to de-capitalized titles are a by-product of our changeover to first letter case sensitivity. They are kept because of a desire by some to protect links that already existed before the changeover. This should not be viewed as a green light to enter more such redirects. Eclecticology 20:02, 14 March 2006 (UTC)
  • Comments:
  • Note that denying 'bot flag will only accomplish flooding Special:Recentchanges. --Connel MacKenzie T C 02:23, 13 March 2006 (UTC)
re: Fill in uppercase/lowercase missing redirects. - I would vote against doing that. We have a policy (of sorts!). A word is only capitalized in special circumstances. So why put in tens of thousands of simple redirects which go against that rule. If a link is mistakenly capitalized (ie: a link would succeed if it were not capitalized), then the preferred action would be to de-capitalize the mistaken capitalized link, not add a redirect. ?--Richardb 08:26, 13 March 2006 (UTC)
The idea for the redirects is to assist Wikipedia (and Wikipedia-like) external links, that have auto-capitalization correctly turned on. With a 'bot being able to do it for us (note: without affecting the entry count), why shouldn't it be considered as an option? --Connel MacKenzie T C 00:30, 14 March 2006 (UTC)
I support redirects from lower-case to upper-case entries, as the typical user will expect searches to be case-insensitive. I don't see any point at all in doing them in the opposite direction. — Paul G 10:04, 13 March 2006 (UTC)
Sounds like good policy to me. Davilla 19:51, 13 March 2006 (UTC)
I can abide with this, but I'd still like the previous possibility discussed. --Connel MacKenzie T C 00:30, 14 March 2006 (UTC)
Actually, shouldn't upper-case be redirected to lower-case since "Something" at the beginning of a sentence means "something"? But "villareal" is a misspelling of "Villareal" since the name should always be capitalized. Anyways the search mechanism already handles this. I would support deleting all redirects of capitalization, for single words (no spaces) at minimum. Davilla 03:19, 15 March 2006 (UTC)
That is demonstrably false. Wikified links do not search. External links do not search. Sister project wikifications do not search. Sister project links do not search. Mirror sites that politely/correctly refer back here do not search. (Note: most mirrors are using inherently out-dated XML dumps.) The search logic was modified as a stop gap, but the fundamental problem still exists; Wiktionary no longer (for the last 9 months) has headwords, so redirects function only as navigation aids. Redirects do not count against the entry count. Deleting redirects is just stupid. The only thing deleting a redirect accomplishes (vandalism aside) is making the site en.wiktionary.org less useful. --Connel MacKenzie T C 05:20, 15 March 2006 (UTC)
  • More thought goes into this when new entries are created manually for the inflections. Here are a few things that I can think of. For comparatives, sometimes both "adj+er" and "more adj" are acceptable, as in stupider = more stupid, and should be reflected as a synonym on the inflected page. Similarly for superlatives. (If in some cases these may have other definitions, those are rare.)
As with present participles, past participles that might have additional adjective senses, e.g. tired, could leave the page as a stub. It would almost be better to have nothing there and know as much than to have to comb through looking for obvious definitions that were left out by a bot. The best solution of course is to also verify past participles from a list.
As to present participles themselves, there's also the debate about gerunds that had come up between us before, and you sided on leaving them out. I still think the majority of "v+ing" forms can be nouns, as a simple translation of the sentence "I like to v..." into "v+ing... is fun" can attest to. Davilla 19:51, 13 March 2006 (UTC)
  • I'm not crazy about the language "third person singular simple present". Is there an alternative? I prefer "Simple past of verb" and "Past participle of verb", using the base form of the verb, to "Simple past and past participle of to verb". The infinitive form is more commonly recognized in romance languages, and different lines are potentially needed to differentiate regional use. In general, a distinction should be made between a definition that equates the title, preferred, and as here a definition that describes it, e.g. "A word that means...". Davilla 19:58, 13 March 2006 (UTC)
Davilla, I'll respond to these concerns more clearly elsewhere. --Connel MacKenzie T C 00:30, 14 March 2006 (UTC)
  • Great idea, but not quite yet. My entirely frivolous reason: our 60,000+ real-ish English articles are within spitting distance of the 70,000 entries claimed by a bunch of mid-sized dictionaries, like the Concise Oxford. I'd find it more emotionally satisfying to cross that threshold honestly, dance a jig, throw a party, then start cheating like hell. Keffy 22:35, 13 March 2006 (UTC)
  • More seriously: Comparative adjectives should be skipped and done manually if the stem is also a verb, since those will likely also have an agentive noun that will need a human-written definition. Keffy 22:35, 13 March 2006 (UTC)
Keffy, we already have thousands of such entries. Are you able to exclude them in a consistent manner now, when generating your statistics?
"TheCheatBot" is so named in honor of homestarrunner.com's StrongBad e-mail making me laugh for minutes - it is not about cheating the count.
For noun plurals, comparatives, and superlatives, yes, I skip them outright, if they share a verb inflection. --Connel MacKenzie T C 00:30, 14 March 2006 (UTC)
What was someone saying earlier about Beer Parlouring being addictive? I actually found inputting inflected forms a useful way of getting used to Wiktionary but there are way too many to contemplate doing them all manually. Let's get on with it, how can we let FR be ahead of EN? :-) MGSpiller 02:10, 14 March 2006 (UTC)
I really don't care which project has the most articles. IMHO quality is more important than quantity. In theory, if completed projects were possible, and they all included all words in all languages, they should all have the same number of articles. :-) Eclecticology 20:02, 14 March 2006 (UTC)

I read the strike throughs above as indications that this bot in this form has been withdrawn, and replaced by the series of narrower scoped bots described below. Further comments should preferably be put in the relevant section below. Eclecticology 20:02, 14 March 2006 (UTC)

Basic English words still in poor shape.

Connel asked if it was time to take down my Basic English clean up project. So I had a look at one word at random from the top 100 Basic English words: after. I added loads, and more could be added. There is no etymology. Translations are not linked to meanings. So, regrettably, it still seriously looks like the Wiktionary:Project - Cleanup of Basic English Words is still really needed. I also know that head needs lots of work. We really need to find a way of ensuring this dull and boring stuff is done well if Wiktionary is to be taken seriously.--Richardb 08:09, 13 March 2006 (UTC)

It's a lot more fun if it's done collaboratively. Maybe "word of the day" could be a word to improve rather than a stellar entry. The French Wikipedia has some sort of reward for outstanding entries, and guess what? There's only one word that fits the bill. Their reward really does demand quality though. Davilla 20:11, 13 March 2006 (UTC)
That wasn't quite what I asked, Richard. Perhaps I should reword it: can the cross-namespace links to the User: (or User talk:?) namespace be removed? Can this be moved to the Wiktionary namespace or something, so the cross-namespace cleanup list can get scrunched down to a manageable size? --Connel MacKenzie T C 09:13, 16 March 2006 (UTC)
Good idea of Davilla. Perhaps Cleanup of the day ? Some five entries per day, perhaps, corresponding to Translations of the week, or merged with it. It would in general be appreciated if people consecrated more time to Category:1000 English basic words (I do). — Vildricianus 17:36, 17 March 2006 (UTC)
The point of the Wiktionary:Project - Cleanup of Basic English Words was that we could all cooperate, by taking certain words, and marking when we had done our bit on a word. Then we could see when all words had been addressed and "voted" as done. I've no doubt you have done lots Vildricianus, as have others. But I have no idea which you have done and which you have not done, so no idea which one to look at next.--Richardb 14:31, 4 April 2006 (UTC)
Introducing Project Beacon the Collaboration of the week for what it's worth. Davilla 23:00, 17 April 2006 (UTC)
I'm pretty darn new here, but my two cents from what I have seen since my arrival is that wiktionary could benefit from the filling of omitted words far more than the improvement of the basic ones. I suppose it depends on whether you see this as a tool for those learning the language or as a more advanced reference guide. Anyway, it seems like every other word I look up is missing and so far I am spending more time writing definitions than finishing this research paper. Morticae 08:45, 19 April 2006 (UTC)

Bot codes

Where can I find the code for each of the bots below? Ncik 23:15, 14 March 2006 (UTC)

An interesting question for those of us who don't understand how a bot operates. Perhaps it would be a good idea to show the code on a sub-page of the bot. I have had some ideas on bot management, but would prefer to wait until namespace management is available before going too far on that path. Eclecticology 23:37, 14 March 2006 (UTC)
On http://sf.net search for "pywikipediabot". --Connel MacKenzie T C 01:40, 15 March 2006 (UTC)
That's the general framework. I'm interested in the (changes to the) code for these particular bots. Ncik 03:54, 15 March 2006 (UTC)
So far, I've changed one line of code, to mark the edits as minor. Since that is GPL, and this is GFDL, I'll try to post that change back up on sf. --Connel MacKenzie T C 04:20, 15 March 2006 (UTC)
I don't get it. Downloading the current snapshot, executing 'tar zxvf snapshot-20060312.tar.gz' and then 'ls pywikipedia' gives:
archive              deadlinks             imageharvest.py   makecat.py             solve_disambiguation.py   test.py            wikipedia.py
catall.py            disambiguations       imagetransfer.py  mediawiki-messages     spellcheck.py             titletranslate.py  wiktionary
category             distrib               __init__.py       mediawiki_messages.py  spelling                  touch.py           wiktionary.py
category.py          editarticle.py        interwiki-dumps   nowcommons.py          splitwarning.py           upload.py          wiktionarytest.py
catlib.py            extract_wikilinks.py  interwiki-graphs  pagefromfile.py        sqldump.py                userinterfaces     windows_chars.py
config.py            families              interwiki.py      pagegenerators.py      standardize_interwiki.py  us-states.py       xmlreader.py
CONTENTS             family.py             LICENSE           redirect.py            standardize_notes.py      warnfile.py        xmltest.py
cosmetic_changes.py  featured.py           login-data        replace.py             table2wiki.py             watchlist.py
CVS                  followlive.py         login.py          saveHTML.py            template.py               watchlists
date.py              gui.py                logs              selflink.py            testfamily.py             weblinkchecker.py
Where are your bots? Are they part of other bots? In which file do I find the code that will execute the edits you are proposing? Ncik 14:24, 15 March 2006 (UTC)
pagefromfile.py. --Connel MacKenzie T C 15:05, 15 March 2006 (UTC)
But that file does not tell me what the new page will look like and what starttext and endtext your bots will use. Where is the piece of code that is used to detect what the inflected form for which we would like to create a new page is? Ncik 12:33, 17 March 2006 (UTC)

Request for bot status: TheCheatBot

  • Bot: User: TheCheatBot
  • Owner/operator: User: Connel MacKenzie
  • Purpose: Fill in plurals, exactly as is currently done manually, without templates of any sort whatsoever.
  • Generation restrictions:
    1. Entry must not already exist.
    2. Root form must link the plural one or two lines after the ===Noun=== heading.
    3. Inflection can be provided by regular text wikification, template:en-noun-reg, template:en-noun, template:en-noun2, template:en-noun-unc, template:en-noun-both, template:en-noun-irreg, or any other template that wikifies terms.
    4. All but the last two characters of the root form must match the plural form to be auto-generated in this manner.
    5. Auto-generated only if there is no other inflected form (e.g. Verb 3rd person).
    6. ===Noun=== header of root term must be within an ==English== language section.
  • Name: The cutesy name is in honor of http://homestarrunner.com/sbemail143.html, as the character (a stuffed doll) named "The Cheat" became "The Cheat Bot" for this episode by duct-taping on a box covered with aluminum foil.
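
Editorial illustration (not TheCheatBot's actual code, which is not posted in this thread): a minimal Python sketch of how generation restrictions 1, 4 and 5 above might be checked. The function names `shares_stem` and `may_create_plural`, the `existing_titles` set and the `has_other_inflection` flag are all hypothetical, and the treatment of very short root forms is an assumption. Restrictions 2, 3 and 6 depend on parsing the root entry's wikitext and are left out.

```python
def shares_stem(root: str, plural: str, tail: int = 2) -> bool:
    """Restriction 4: all but the last `tail` characters of the root
    must match the start of the plural form."""
    # Assumption: for roots of `tail` characters or fewer the check is vacuous.
    stem = root[:-tail] if len(root) > tail else ""
    return plural.startswith(stem)


def may_create_plural(root: str, plural: str, existing_titles: set,
                      has_other_inflection: bool) -> bool:
    """Return True only if the hypothetical bot would create the plural entry."""
    if plural in existing_titles:      # restriction 1: entry must not already exist
        return False
    if has_other_inflection:           # restriction 5: skip if e.g. a verb form also exists
        return False
    return shares_stem(root, plural)   # restriction 4: stems must agree


if __name__ == "__main__":
    print(may_create_plural("city", "cities", set(), False))
    print(may_create_plural("mouse", "mice", set(), False))
```

Under these assumptions "cities" would be created from "city", while "mice" would be left for a human because it shares too little of "mouse".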


VOTE:
  • For:
    1. --Connel MacKenzie T C 05:43, 14 March 2006 (UTC)
    2. -- Tawker 06:26, 14 March 2006 (UTC)
    3. -- SemperBlotto 08:31, 14 March 2006 (UTC)
    4. -- MGSpiller 01:43, 15 March 2006 (UTC)
    5. I trust that issues of soundness will be worked out. Davilla 02:47, 15 March 2006 (UTC)
    6. Keffy 06:32, 15 March 2006 (UTC)
    7. With all due caution and plenty of checking/testing, please. --Dvortygirl 05:22, 17 March 2006 (UTC)
    8. Vildricianus 09:53, 17 March 2006 (UTC)
    9. --EncycloPetey 01:25, 18 March 2006 (UTC) (original objection was addressed)
    10. Interestingly, my piece (definition number 5) is also named The Cheat. --Rory096 23:09, 6 April 2006 (UTC)
  • Against:
    1. --EncycloPetey 23:47, 14 March 2006 (UTC) (see below)
  • Comments:
  • Now that I have seen what this one does I can be a little less hostile; the description page should match what's above but that is only a matter of housekeeping. I reviewed what TheCheatBot has done, and in the one case where there was a questionable plural, abatiss, that questionable plural was already there in the article. The link to the name justification article didn't work for me, but the name is not an issue for me. It could still be misunderstood by others in the future. There seem to be adequate safeguards to prevent this bot from taking too big a swath.
It would be nice if the bot could double-check, much as a human would do instinctively. The ambiguous spelling rules are: -fs or -ves from single f; -fes (possibly?) or -ves from fe; usually -oes or rarely just -os for long o; -ces or -cs from c; not sure about q. The standard rules are, -ies from y preceded by consonant, or -ys by vowel (not sure about wy); -es from e; -ses from s; -shes from sh; -ches from ch; -zes from z; -xes from x. (Trying to remember if there are any more.) Otherwise just add -s. Davilla 02:47, 15 March 2006 (UTC)
These should have been checked by humans when they went on the root page for the singular in the first place. The bot should not be trying to analyse ambiguous spelling rules, just copying what's already there. Eclecticology 09:31, 15 March 2006 (UTC)
The standard rules can be checked easily. This would avoid problems like adding just "s" to a word that ends in "s". The rarer ambiguous cases would not be checked for preciseness. If the bot doesn't ignore them, it would just make sure that at least one of the patterns fits. Davilla 20:57, 29 March 2006 (UTC)
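
Editorial illustration: a rough Python sketch of the kind of sanity check Davilla describes, covering only the "standard" rules listed above. The names `standard_plural` and `fits_standard_rule` are hypothetical, "w" is treated as a consonant (the comment above is unsure about "wy"), and the ambiguous endings (-f, -fe, -o, -c, -q) are deliberately not handled. This is a sketch under those assumptions, not a description of any bot's real behaviour.

```python
VOWELS = "aeiou"


def standard_plural(word: str) -> str:
    """Apply only the standard spelling rules listed in the discussion."""
    # -ies from y preceded by a consonant; -ys when preceded by a vowel.
    if word.endswith("y") and len(word) > 1 and word[-2] not in VOWELS:
        return word[:-1] + "ies"
    # -ses, -shes, -ches, -zes, -xes from s, sh, ch, z, x.
    if word.endswith(("s", "sh", "ch", "z", "x")):
        return word + "es"
    # Everything else, including words ending in e, just takes -s.
    return word + "s"


def fits_standard_rule(root: str, plural: str) -> bool:
    """True if the plural already on the root page matches the standard rule."""
    return standard_plural(root) == plural


if __name__ == "__main__":
    for root in ("city", "day", "bus", "dish", "box", "horse"):
        print(root, standard_plural(root))
```

A check like this would flag a page that lists "buss" as the plural of "bus", which is the problem case mentioned above, while leaving the ambiguous endings to human review.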

this section moved a bit up:

  • Problem: If we're going to have a "third person" bot, then there will be duplicate entries for some items. For example, consider that shop is both a noun and a verb in English, and so will have both a plural noun form and a third person verb form. I would vote for this bot IF the bot could somehow work in conjunction with ThirdPersonBot to create combined entries. That is, words that have both a noun and verb entry are treated simultaneously, so that we don't end up with only the plural or only the third person. --EncycloPetey 23:47, 14 March 2006 (UTC)
    • NotAProblem: At this point in time, I am skipping those. That was the intent from the start, but I had trouble parsing some templates initially. If there is both, that will be handled by a separate 'bot, not this one nor ThirdPersBot. That is, if I ever propose a 'bot again. --Connel MacKenzie T C 02:54, 15 March 2006 (UTC)
      • That's good. Eclecticology 09:31, 15 March 2006 (UTC)
        • More importantly that we can find a consensus on one part of your original proposal. Eclecticology 01:15, 18 March 2006 (UTC)
        • That I never propose a 'bot again?  :-) --Connel MacKenzie T C 18:08, 16 March 2006 (UTC)
  • The description on the bot user page could do with updating (it still refers to the original spec). MGSpiller 01:43, 15 March 2006 (UTC)
  • One step at a time is fine by me. Do the easy stuff (i.e. plurals only, 3rd person only) first then review & go for the harder ones which include both. MGSpiller 01:43, 15 March 2006 (UTC)
    • Exactly. Thank you for your kind words and support. --Connel MacKenzie T C 18:08, 16 March 2006 (UTC)
      • No problem, some wise person whose name is buried in some wiki flame war somewhere once suggested that you should try to bring more light than heat to an argument (or something to that effect). I try to keep to the spirit of that suggestion when I can. MGSpiller 02:34, 18 March 2006 (UTC)
    • What is clear in my mind is that we started with a bot that would do many things. To arrive at a consensus that everyone could live with we may have needed to break these tasks down into excruciatingly small bits. Now this one appears as though it will work, and a few of the others are likely to follow suit. That's progress! Eclecticology 01:15, 18 March 2006 (UTC)




One format issue that should be looked at before this goes further is how the resulting line will look. Currently we mostly (including myself) have been using 'Plural of word'. Another user suggested to me that word should be italicized or in quotation marks; after considering this I had to admit that he was grammatically correct. Another argument is that 'Plural of' should be italicized and word left in roman face since 'Plural of' is descriptive rather than definitive. This is a stylistic rather than a substantive issue, and I can live with any of these solutions. Are there general preferences? It would be nice to have a broad sense of direction on this. A similar issue will come up with some of the other bots mentioned below. Eclecticology 23:28, 14 March 2006 (UTC)
I very much support your idea of italicising descriptive definitions (and accordingly not italicising what usually would be italicised). Ncik 03:53, 15 March 2006 (UTC)
  • I have no strong preference on the formatting of this type of entry. I chose the only prevalent format that exists today for this type of entry. Whatever format is decided upon, it should be consistent. If existing entries are 'bot re-formatted, would that need a separate bot request, or could that fall under the aegis of this one? Is there consensus that the descriptive text should be italicized? Would that rule apply to all form descriptions? --Connel MacKenzie T C 18:08, 16 March 2006 (UTC)
I suggest you just do the italicising if nobody objects here. It has an obvious advantage. Italicising should consequently apply to all descriptions of inflected forms. Ncik 12:23, 17 March 2006 (UTC)
I agree with Ncik, standardizing all entries this way would be a good job. Although I've always added these things with only the basic word italicized, the correcter thing to do is the other way round. — Vildricianus 12:32, 17 March 2006 (UTC)
To summarize - Is it the fundamental consensus that the definition line for words should now be formatted
  1. Plural of word.
and this includes the capital "P", and the full stop? Anyone may reformat a line in that way. If someone sets up such lines in any other way it could be changed, but there would be no penalty unless the guy is being a complete jerk. Eclecticology 01:15, 18 March 2006 (UTC)
I'd just like to clarify the first sentence of Eclecticology's last comment: We don't want inverted italicising for all definitions, only those that are secondary descriptions. Examples include definitions that say
  • that a word is an inflected form of another, ("dogs": Plural of dog.)
  • that a word is an abbreviation of another, ("abbr.": Abbreviation for abbreviation.)
  • that a word is a spelling variant of another, ("œsophagus": Alternative spelling of oesophagus)
  • that an interjection is used as an expression of something ("ouch": Used to express physical pain.)
WT:ELE needs to be updated. Ncik 21:42, 19 March 2006 (UTC)
  • The more I see of this, the less I like the idea of superfluous italics. This is a fairly major change to the formatting of nearly all Wiktionary entries.
What makes you think this change would affect "nearly all Wiktionary entries"? We don't have many inflected forms yet. And the number of abbreviations (around 3000) and alternative spellings (a few hundred at most, I'd say) is limited as well. Ncik 13:55, 23 March 2006 (UTC)
  • Ncik, I don't think there is consensus on your last suggestion for interjections. Actually, while all four suggestions seem reasonable, I don't think there is widespread consensus for 1, 2, 3 or 4. Again, such formatting changes would affect most entries, needlessly. --Connel MacKenzie T C 06:27, 22 March 2006 (UTC)
I crossed out the interjections. The situation is not as clear as in the other three cases. Ncik 13:55, 23 March 2006 (UTC)
Hrm. OK. Thanks, yes, the interjections were the most problematic of those four. The wording you used was a little misleading; it seemed to me that you were talking about a generic convention for italics for all descriptive text. A convention like that would eventually affect all entries. But since that isn't what you were saying, now I'm left wondering why these cases should be so differently formatted from all the rest of the main namespace entries. That is, I think a definition/meaning line should look like a typical definition/meaning line as much as possible; the italics don't do that. --Connel MacKenzie T C 17:25, 23 March 2006 (UTC)
Although Ncik crossed it out, #4 is how this extends naturally to the other entries.
ouch =interjection= Used to express physical pain.
 : "Don't you dare stick that needle into my... ouch!"
excruciatory =adjective= Used to express physical pain.
 : "He's not a good actor when it comes to excruciatory lines."
 : "China has banned Google searches for the excruciatory emoticons."
Granted the second is made up, as I don't know any word that means exactly that, but you get how the two words are distinguished, an explanation of use from a definition of synonymy. Davilla 20:57, 29 March 2006 (UTC)

Layout

Now the 'bot flag has been approved, the layout issue is still pending. Among the possible options are the following:

  1. Plural of [[word]].
  2. Plural of ''[[word]]''.
  3. ''Plural of'' [[word]].
  4. Plural of '''[[word]]'''.
  5. ''Plural of'' '''[[word]]'''.
  6. (Combined Davilla)
    (a) Plural of "[[word]]".
    (b) Plural of "[[word]]."
  7. Plural of: [[word]]. (Added Davilla)
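
Editorial illustration: since any of the layouts above could later be applied or re-applied by a bot, the options can be written down as simple templates. The Python sketch below only transcribes the seven options listed above as format strings; the names `LAYOUTS` and `render` are hypothetical and no real bot interface is implied.

```python
# Wikitext templates for the layout options listed above, keyed by option number.
LAYOUTS = {
    "1": "Plural of [[{word}]].",
    "2": "Plural of ''[[{word}]]''.",
    "3": "''Plural of'' [[{word}]].",
    "4": "Plural of '''[[{word}]]'''.",
    "5": "''Plural of'' '''[[{word}]]'''.",
    "6a": 'Plural of "[[{word}]]".',
    "6b": 'Plural of "[[{word}]]."',
    "7": "Plural of: [[{word}]].",
}


def render(option: str, word: str) -> str:
    """Return the definition-line wikitext for one layout option."""
    return LAYOUTS[option].format(word=word)


if __name__ == "__main__":
    for key in LAYOUTS:
        print(key, render(key, "dog"))
```

Keeping the layout in one table like this is one way a later reformatting run could switch every generated entry from one option to another once a choice is made.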

#1 and #2 prevail right now, with #2 being the one I prefer. Even though I suggested enthusiasm about #3, I tend to dislike long lines of italicized text. Applying it in these instances would require us to consequently apply it in many more instances than is actually feasible or advantageous. The more I see of it, the less I think it's a good idea to italicize all descriptive text. Therefore, my personal vote goes to #2. (Should we turn this into a formal vote?) — Vildricianus 17:00, 25 March 2006 (UTC)

I think I'll have to try yet again to say more precisely when italicising should apply: Italicising should apply if the definition is not a "primary" description in the sense that it is not a description of what the word being defined denotes, but instead is a "secondary" definition, i.e. describes the word being defined itself. Example: Dogs: Plural of dog. Italicised, because it describes the word "dogs" itself (in grammatical terms). A primary definition would be: Two or more members of the genus Canis. In most cases it is quite obvious what is a primary and what is a secondary definition, but I've come across examples where one couldn't tell. I'll try to find an example or construct one, and then post it here. Ncik 17:20, 26 March 2006 (UTC)

Note: see u#Dutch for an example of ugly italicization. — Vildricianus 17:03, 25 March 2006 (UTC)

However, this is a rare example. We normally don't add illustrative sentences to inflected forms. These belong on the page of the uninflected form, even if they feature the word in an inflected form. Ncik 17:20, 26 March 2006 (UTC)
Ok, then see ik#Dutch, me#Dutch, jullie etc. Note that I have applied your italicized proposal there to test it out. However, I can't see what you mean with "inflected forms". "U" is just a personal pronoun. — Vildricianus 10:17, 27 March 2006 (UTC)
One could interpret me as the accusative of ik, and u as the second-person of ik, hence call these inflected forms. But since those three are personal pronouns, one wouldn't use this terminology. This, and the personal pronouns' massive irregularities if regarded as inflected forms, is why I deem it acceptable to have illustrative sentences in these cases. The number of pronouns and similarly affected words is clearly very small in comparison to the remainder of the lexicon. Ncik 02:11, 28 March 2006 (UTC)
I prefer a fourth option (added above). Italicization becomes a real problem with non-Latin fonts, so I would rather not use them around links to entries where it can be avoided. I do like the idea of bolding a word that is intended to be a main entry, particularly in this case, where the page is presumably to have little other than a link to the singular form. --EncycloPetey 17:29, 25 March 2006 (UTC)
I'm not convinced that emboldening text is any easier for browsers (fonts) than italicising. I just checked a couple of scripts (Hindi, Tamil, Japanese, Greek, Chinese) in my own browser, and italicising works fine with all of them but only Tamil letters get emboldened. Apart from that, since this is the English Wiktionary, and English is written in Latin script, neither italicising nor emboldening words in non-Latin script can cause any confusion. Ncik 17:20, 26 March 2006 (UTC)
Italicizing doesn't work fine for Russian. Again, I fail to see what "the English Wiktionary" has to do with this. — Vildricianus 10:17, 27 March 2006 (UTC)
Suppose it was a proposal not to italicise or embolden words in non-Latin script at all. Ncik 02:11, 28 March 2006 (UTC)
Suppose. Suppose #1 is equally fine then for Latin script as well. — Vildricianus 12:27, 28 March 2006 (UTC)
If we are not to italicize entry names, then that eliminates one of the more popular options in running, #2. I'm backing #3 and #5 since the distinction is logical to me and I don't like any of the arguments against. (a) Going through existing entries to apply a new standard has never been and should never be a deterrent. (b) The problem with u#Dutch and jullie is that the examples are also italicized. It's unreadable because there's simply too much italic text. I've suggested ways to eliminate the italics in quotations, but it didn't draw much attention. (c) Long descriptions of text can often be rephrased or formatted, as I have done for nibling and Martial. Davilla 20:57, 29 March 2006 (UTC)
Even though this 'bot is concerned only with English entries, I agree with your conclusion, EncycloPetey. But it is still different from current practices. --Connel MacKenzie T C 20:30, 25 March 2006 (UTC)
Like Vildricianus my personal preference is/was #3 because I want them to stand out. However reading this I think perhaps #4 would do that job better. I don't think #2 will do that, it will just make the name of the main entry harder to read. But perhaps #5 would be even better, but I can live with #4. --Patrik Stridvall 09:23, 26 March 2006 (UTC)
To me, #5, just as #3, still bears the problem of too much italicized text. I think EncycloPetey's #4 is by far the best up to now. — Vildricianus 09:33, 26 March 2006 (UTC)
I don't see #3 as having too much italicized text. The Dutch example given above seems to be an extreme case where other solutions may be available. Our concern for now is with English language entries. Using italics follows the same reasoning that applies for tags like (Sports), or (Obsolete), etc. that we put at the beginning of a line. My second choice (#6.a) is to use quotation marks in accordance with ordinary punctuation rules. Eclecticology 07:23, 27 March 2006 (UTC) [ Edited Davilla ]
Even though our concern is with English, we have to be consistent across the entire Wiktionary, right? Using bold text follows EncycloPetey's reasoning of directing the user to the main entry. — Vildricianus 10:17, 27 March 2006 (UTC)
I suppose I could throw support towards #6.b. (7 formats? Only 6 people are commenting here!) --Connel MacKenzie T C 07:54, 27 March 2006 (UTC) [ Edited Davilla ]
I'm eagerly waiting for #8. — Vildricianus 10:17, 27 March 2006 (UTC)
Guess you'll have to keep waiting. ;-) Davilla 20:57, 29 March 2006 (UTC)

More layout

  • Status: Meta request forwarded, approved. Awaiting better resolution on formatting before proceeding. --Connel MacKenzie T C 22:14, 24 March 2006 (UTC)
    • No consensus seems to be emerging.  :-( --Connel MacKenzie T C 20:30, 25 March 2006 (UTC)
    • No consensus has emerged. Although I'd lean towards format #7, the format #1 is the one that is clearly most used currently, so that is what I intend to proceed with. If the concept of "voting" is ever adapted, and used for this, these can then be 'bot changed. Without a conclusive vote, trying to keep the formats consistent for later change is the only reasonable approach. --Connel MacKenzie T C 07:12, 31 March 2006 (UTC)
Sounds reasonable, but I strongly suggest using #2 (or 4 or #6a), since #1 is orthographically wrong. Ncik 14:28, 31 March 2006 (UTC)
"Wrong?" Um, no. Just consistent with the entries that already exist. --Connel MacKenzie T C 15:43, 31 March 2006 (UTC)
Ncik, that's exactly what we're discussing, right? Are you planning to just repeat the above? BTW, does anyone think we should have a vote right now on this matter? — Vildricianus 13:27, 1 April 2006 (UTC)
I thought the main point of contention was italicising the definition itself. So for now I propose using #2 in order to get the orthography right, while avoiding the issue of italicising the definition. Can we agree on this? Ncik 14:25, 1 April 2006 (UTC)
Although #1 is one of the formats I like least, my and your opinion won't ultimately count for anything until everyone's input has resolved in some conclusion. I would say go ahead with any format. I see objections above to every option save the newest (which hasn't had time to garner objection). I would have to agree with Vildricianus that #4 is the least contentious. And anyways, we bold headwords as well. However, it's not clear that bolding will be necessary in the end, so why do it now? Start the bot, but let's not end the debate. (Not that debates ending themselves is ever a problem.)
The question as I see it is how to categorize these. It might make sense to first vote on the use of italics for distinction of definition-use. Combined with that proposal would be a commitment to end the italicization of examples and quotations, with their new style deferred for a later vote. If that proposal fails, we continue with the use-definition distinction by first voting on using a colon as in #7 before other style differences. The colon does this job well in some cases, but it cannot when no synonymous phrase is given. Hence italicizing, if uglier, is more versatile. With the field narrowed, we can more sanely choose among the remaining options. Davilla 23:49, 1 April 2006 (UTC)
I'd be happy with anything but #1 and #7. Ncik 22:28, 2 April 2006 (UTC)
  1. I follow Connel's reasoning to proceed with the bot right now, following #1 as this is currently the most prevalent format.
  2. I tried to compile a summary from the above; I failed. In general, this debate numbs the mind and leaves me currently fairly indifferent as to what the final outcome will be. As long as the plurals are there. Any formatting could afterwards be redone by the bot.
  3. If someone wants a vote on this, then please put it forward; I don't feel like doing so.
  4. Personally, I support either #1 or #4. In addition to any italicization, I dislike both colon and quotation marks. Personal opinion is what will make the difference here, so perhaps we do need a vote. — Vildricianus 13:57, 3 April 2006 (UTC)
How can you support something that is orthographically wrong? Incidentally, I don't think format #1 is much more common than #2. Ncik 02:11, 4 April 2006 (UTC)
No, I'm pretty certain that the majority of plural entries that I entered matched format #1 - before the template was changed to be format #2. In either case, what exists is a mix of the two. Considering the disdain for italics expressed here, I think #1 is the better (more consistent) short-term choice. --Connel MacKenzie T C 08:23, 4 April 2006 (UTC)
By the way, what about format #1 is it that you assert is orthographically wrong? Some compelling (and not-so compelling) arguments against italics have been raised. I still don't understand why you object to format #1. --Connel MacKenzie T C 22:22, 4 April 2006 (UTC)
Format #1 is wrong as it does not quote the referenced word. If you haven't learnt this in school, I recommend the Wikipedia article on quotation marks. No arguments against putting the referenced word in italics have been raised. The above discussion was about italicising the whole definition. Ncik 23:03, 4 April 2006 (UTC)
The argument against all italics may have been archived with the original proposal. EncycloPetey did object to putting the referenced word in italics in this section though. My observations remain: format #1 matches the majority of existing "plural" entries and format #1 matches the majority of all other existing English Wiktionary entries. I think it would be best to remain consistent with these, so that future conversions (if a format is ever agreed to) have fewer cases to search for. --Connel MacKenzie T C 06:04, 5 April 2006 (UTC)
Well, and I object to wrong orthography. This is certainly a stronger argument than EncycloPetey's observation that certain fonts have difficulties displaying italicised non-Latin script, especially since the bot will only be concerned with English entries. Ncik 13:23, 5 April 2006 (UTC)
# The other argument I was referring to was against all italics in entries.
What are you referring to? Who proposed not to use italics anymore? Ncik 23:58, 6 April 2006 (UTC)
# Your proposal is still inconsistent with the rest of the entries in Wiktionary.
As I said, and you admitted, there already are entries in format #2. But considering the amount of entries the bot will create, this is not relevant anyway. Ncik 23:58, 6 April 2006 (UTC)
# Your proposal does not have consensus!
Neither has any other. The only thing I'm asking for is correct orthography. Ncik 23:58, 6 April 2006 (UTC)
--Connel MacKenzie T C 14:10, 5 April 2006 (UTC)
So you are just repeating what you said above. Great. Thanks for the extra delay. You are correct that format #1 no longer seems to have consensus either. Are you asserting that proceeding with it would be harmful (in light of the reformatting that conceivably needs to be done to all ~ 3,000 - 5,000 entries if/when consensus is ever reached?) --Connel MacKenzie T C 16:55, 7 April 2006 (UTC)
  • The "italics" comments were on my talk page (you posted there, before and just after the section, so I assume you saw it and chose not to comment.)

...I'd like to recommend against putting English words in italics (as in the recent run of past participles). Reason 1: italics is usually reserved for foreign words by editorial convention. Reason 2: Despite what most people think, italics does not emphasize a word visually -- it merely makes them "small and hard to read", as one of my friends has put it. --EncycloPetey 20:53, 25 March 2006 (UTC)

--Connel MacKenzie T C 17:32, 7 April 2006 (UTC)
I think I missed it. As to reason 1: Would such a convention make sense in our multilingual dictionary? Do we want
Plural of "pizza".
for the English word "pizzas" and
Plural of pizza.
for the French word? We could make the convention subject to language headers to avoid this, though. As to reason 2: First sentence is irrelevant, since we don't aim at emphasising words. We are concerned with orthography. I disagree with EncycloPetey's friend. Ncik 03:29, 9 April 2006 (UTC)
As a dictionary, we are always talking about words. No other dictionary seems to follow the Wikipedia orthography "rule" you cited above (that seems like a reasonable rule for prose, but dictionaries are technical listings.) I don't think the approach of further fragmenting layout conventions by languages would be helpful to anyone...just more confusing to everyone.
Depending on what platform I am using, I sometimes agree with EncycloPetey's friend. Some platforms have "better" font families installed by default, while many others do not.
I maintain that format #1 still has the most support on this "voting"-type thread. It is also the most consistent style that matches all other entries on Wiktionary the best. --Connel MacKenzie T C 16:00, 10 April 2006 (UTC)
Yes, we are always talking about words, but we also use words to talk about them. The orthographic rules Ncik mentions were specifically created to help make the distinction between use and mention required when using words to talk about words, hence my vote for the italicized version below. Rodasmith 20:44, 10 April 2006 (UTC)
Indeed. Why then don't we do it in all other entries? Because it is a style convention, right? --Connel MacKenzie T C 20:53, 10 April 2006 (UTC)
I don't know whether the convention you mention was chosen from a list of candidates, but if so, neither do I know why an orthographically poor choice was made. Regardless, we should avoid supporting a poor choice based on an appeal to tradition. Rodasmith 18:18, 11 April 2006 (UTC)
Well, tradition or not, I agree with E.P.'s friend - on some platforms the italics are hard to read. I also maintain that consistency will allow for changes in the future more easily. I also believe the prose formatting conventions (for use/mention distinction) are not applicable to a technical listing, such as this dictionary. Instead, our chosen style should be our chosen style (like all other dictionaries.) --Connel MacKenzie T C 18:58, 11 April 2006 (UTC)
(unindenting for readability:)

Consider what the unquoted format does for the following:

  • plural of adjective
  • plural of magnitude [Possible interpretation: "This entry is a word meaning a plural of [great] magnitude."]
  • plural of noun
  • plural of one [Possible interpretation: "This entry is a word for terms inflected in the plural form that refer to collections known by the listener to be singular (but presumably of unknown quantity to the speaker)."]
  • plural of plural [Possible interpretation: "This entry is a word for the second of multiple levels of plurality."]
  • plural of singular [Possible interpretation: "This entry refers to the plural form of any singular noun."]
  • plural of three [Possible interpretation: "There is a separate form for plurals referring to collections of exactly three objects."]
  • plural of uncountable [Possible interpretation: "This entry refers to the plural form of items whose quantities are uncountable."]
  • plural of verb

I hope the above examples illustrate that standard use-mention orthography is important not just in prose, but whenever words in one language talk about words found within that language. Rodasmith 19:29, 11 April 2006 (UTC)

I'd like to point out that your examples omitted the wiki-links, which in this context, make that distinction. Just not as explicitly as in the syntax that you'd like. --Connel MacKenzie T C 19:39, 11 April 2006 (UTC) (edited) 19:44, 11 April 2006 (UTC)
Links don't make that distinction, because many definitions link terms from the grammatical descriptions themselves, e.g. "subjunctive mood conjugation of ...". Rodasmith 19:47, 11 April 2006 (UTC)
I don't think we should link "subjunctive". — Vildricianus 19:52, 11 April 2006 (UTC)
Why not? Because it's a commonly known term? If that's the reason not to link it, there is a slippery slope with more obscure grammatical terms (e.g. jussive, copula...). Rodasmith 19:55, 11 April 2006 (UTC)
IMO, because the words should be defined in an appendix rather than individually. Davilla 20:33, 16 April 2006 (UTC)
The links are probably the absolute least distinction that could be made since we are trained to ignore them when reading. In some cases there may be no distinction whatsoever, for instance in printing, which doesn't often preserve them. Davilla 20:33, 16 April 2006 (UTC)
Rodasmith, excellent examples! Davilla 20:33, 16 April 2006 (UTC)

Vote

Anyone has a better idea? I suppose all parties would be comfortable having these plurals bluelinked while they're still alive. Might possibly lack interest, yet, even with three votes this could be finally "agreed on". Other options not allowed; this is only a partial democracy.

Is 4/17 Midnight GMT a good deadline for this vote? (1 week.) --Connel MacKenzie T C 07:57, 13 April 2006 (UTC)

Yup. — Vildricianus 07:58, 13 April 2006 (UTC)
Can we please extend this deadline. I hadn't been online for quite awhile, hence didn't know about this vote. I don't think this is the appropriate place to hold it anyways, since hardly anybody will read this discussion anymore. Ncik 23:11, 16 April 2006 (UTC)
Well, you just voted. I think everyone else has seen it. And your method here has been to be as obstructionist as possible. So what the heck? Why not add another week. Or were you thinking just another day or two?
You say the Beer Parlour is not read? I find that incredible. I see you've "re-advertized" it with a current posting. I think that is good - I too would like to see more people's opinion on this matter. But moving this now would be a horrible mistake. Moving this vote would garner claims of foul play from both sides, wouldn't you agree? Or is that your intent, to delay this bot more by claiming an invalid vote (even though we don't normally even do any such thing.)
Ncik, the original routine bot request has been stymied by you for months now. Perhaps it is more accurate to say the delay is from my pandering to your series of pointless petty complaints. What are you so afraid of? --Connel MacKenzie T C 03:29, 17 April 2006 (UTC)
Enough people have read it by now. Note also that I did plan to extend it because you hadn't been here since. However, since you've seen it now, I think we can close it on the aforementioned date. — Vildricianus 09:16, 17 April 2006 (UTC)
  • -- Format #1: Plural of [[word]].
    1. Support. — Vildricianus 16:44, 10 April 2006 (UTC)
    2. --Connel MacKenzie T C 20:51, 10 April 2006 (UTC)
    3. Not really necessary to have more visual distraction than necessary -- Tawker 02:33, 11 April 2006 (UTC)
    4. --Dvortygirl 04:52, 11 April 2006 (UTC) Less typing, less clutter.
    5. --Widsith 07:59, 17 April 2006 (UTC)
    6. --SemperBlotto 10:03, 17 April 2006 (UTC)
    7. Kipmaster 17:19, 17 April 2006 (UTC) for consistency (since the other one is ugly for some scripts)
      Would you consider a format only for Latin script?
  • -- Format #2: Plural of ''[[word]]''.
    1. Support for roman script entries. Rodasmith 20:35, 10 April 2006 (UTC)
    2. As per Rodasmith. Davilla 20:27, 16 April 2006 (UTC)
    3. Ncik 23:04, 16 April 2006 (UTC)
    4. As per Rodasmith. --Patrik Stridvall 08:24, 17 April 2006 (UTC)
    5. \Mike 16:28, 17 April 2006 (UTC) I would prefer the descriptors to be italicized to the word being italicized, but one of them should be.
  • -- Format #3: Plural of '''[[word]]'''. Here's why:
  1. Format 1 is incorrect and we should definitely not be using it. Compare Christmas is coming with "Christmas" is coming. The first means "it will be Christmas soon", while the second means "the word 'Christmas' is coming" (perhaps it's about to be entered into Wiktionary). The wikification of the word to be looked up is not sufficient, as we might, perhaps, decide that we want to wikify "plural" as well, or, indeed, all words used in definitions. We probably won't, but the same applies to less well-known grammatical terms (for example, consider this definition of "was": "third-person singular indicative present tense of to be"). We must indicate that we are referring to the word and not actually using it. We should not be lazy and sacrifice clarity by omitting a few keystrokes.
I'm against format 2 because I think headwords should be emboldened when referred to, as they are emboldened when they feature as headwords in their respective entries. This is the policy of most print dictionaries (eg: foo: see bar). I prefer to reserve italicisation for non-English words.
Looks like I might be too late anyway - has the bot already done its stuff? — Paul G 15:43, 20 April 2006 (UTC)
You are not "too late" as the bot has not added any entries past the initial test batches. I'd like to note Vild's reasoning for limiting choices to only those two possibilities though. We are not other dictionaries, we have a community accepted format in use for years, and no consensus is emerging (still!) on this topic. Again, if one or the other can be decided on, the bot can go forward. At that point, it would be reasonable to suggest alternate "official" format choices, to convert all entries (prior existing entries, bot-added entries and future entries) to whatever format all can agree to. --Connel MacKenzie T C 16:43, 20 April 2006 (UTC)
You didn't see any consensus emerging, but both Vildricianus and I noted that the least objected distinction was to bolding the stem word. Granted it isn't currently done for these pages, but by current practice it would be done in the etymology at least. You can also restrict bold to Latin script. A vote between #1 and #4 would still be close, but necessary to be sure. You know, like Chris Berman says: "That's why they play the game." Davilla 17:20, 23 April 2006 (UTC)
Your comment doesn't seem very helpful, therefore I must have misunderstood it? We are not other dictionaries and we have a community accepted format that has been in use for years. The main argument is that Ncik all-of-a-sudden wants to change how Wiktionary functions. To make his point, he is obstructing this vote with claims that dictionaries all must follow his style, when in fact, most do not. And "we" (Wiktionary) certainly never have. --Connel MacKenzie T C 08:16, 24 April 2006 (UTC)

Darn, there's no end in sight. Too many people seem to attach a great deal of value to these minor aspects of "style", but perhaps fail to realize that Wiktionary has little to no consistency regarding them. Take a look at the style of our main contributors, they're all different from one another. They use different headers, different header levels, different italicizing or boldening, begin sentences with a capitalized letter or not, end it with a period or not, etc. At this point of the wiki, maintaining a consistent style is not feasible and merely obstructs the adding of content, which is what we're all about nowadays. Style is for bots to implement when the content is more fossilized than it is now; bots can change everything we want to see changed, so why let it bother us now? This bot is meant to add content. Sure it would have been nice to see a consensus (at least we can say that we tried), but apparently we're not ready to have that regarding our style. Let's draw a line here, let's archive this thing, and let's keep an open mind on all this when we are up to discussing our style in another, probably the last, phase of Wiktionary. Certainly we won't attract people by having the "right" orthographical style, or frighten them by having the "wrong" one, as long as we have red links for susceptibility, assiduity or unwillingness. — Vildricianus 09:00, 24 April 2006 (UTC)
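For reference, the three candidate formats in the vote above differ only in the wiki markup wrapped around the linked word. A minimal sketch of how a bot might emit each one follows; the function name `plural_definition` and its `fmt` parameter are hypothetical and are not taken from any bot discussed here.

```python
def plural_definition(word: str, fmt: int = 1) -> str:
    """Return the wikitext definition line for a plural entry.

    fmt 1: plain link, fmt 2: italicized link, fmt 3: bolded link,
    matching Format #1, #2 and #3 as listed in the vote.
    """
    link = f"[[{word}]]"
    if fmt == 1:
        target = link
    elif fmt == 2:
        target = f"''{link}''"      # two apostrophes: wiki italics
    elif fmt == 3:
        target = f"'''{link}'''"    # three apostrophes: wiki bold
    else:
        raise ValueError(f"unknown format: {fmt}")
    return f"Plural of {target}."


if __name__ == "__main__":
    for n in (1, 2, 3):
        print(plural_definition("word", n))
```

Because the choice is isolated in one place, switching formats later (as suggested above for a future style pass) would be a one-line change for a bot.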

Request for bot status: ComparBot

  • Bot: User: ComparBot
  • Owner/operator: User: Connel MacKenzie
  • Purpose: Fill in comparatives, exactly as is currently done manually.
  • Generation restrictions:
    1. Entry must not already exist.
    2. Root form must link the comparative one or two lines after the ===Adjective=== heading.
    3. Inflection can be provided by regular text wikification, or any other template that wikifies terms.
    4. All but the last two characters of the root form must match the plural form to be auto-generated in this manner.
    5. Auto-generated only if there is no other inflected form (e.g. any verb or noun inflection).
    6. ===Adjective=== header of root term must be within an ==English== language section.
VOTE:


  • Against:
  • Comments:
    1. I would prefer that the headers of the created pages use ===Adjective form=== rather than simply "Adjective". This makes it clearer that more information will be found in a main entry. --EncycloPetey 23:40, 14 March 2006 (UTC)
    The format suggested by Connel matches current practice & various templates including those invoked by nogomatch. I support as is. MGSpiller 01:48, 15 March 2006 (UTC)
    Simpler as is. Davilla 02:51, 15 March 2006 (UTC)
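The generation restrictions listed for ComparBot can be sketched roughly as below. This is only an illustration under stated assumptions: the function names, the page-text representation, and the `other_inflections` set (standing in for restriction 5) are all hypothetical, and templates that wikify terms (restriction 3) are not expanded here.

```python
import re

# Captures the target of a [[link]] (text before any "|" or "#").
LINK = re.compile(r"\[\[([^\]|#]+)")


def english_section(wikitext: str) -> list[str]:
    """Return the lines inside the ==English== language section (restriction 6)."""
    out, inside = [], False
    for line in wikitext.splitlines():
        level2 = re.fullmatch(r"==([^=].*?)==", line.strip())
        if level2:
            inside = level2.group(1).strip() == "English"
            continue
        if inside:
            out.append(line)
    return out


def linked_after_adjective(wikitext: str) -> list[str]:
    """Terms linked one or two lines after ===Adjective=== (restriction 2)."""
    lines = english_section(wikitext)
    found = []
    for i, line in enumerate(lines):
        if line.strip() == "===Adjective===":
            for follow in lines[i + 1:i + 3]:
                found.extend(LINK.findall(follow))
    return found


def may_generate(root, candidate, wikitext, existing_titles, other_inflections):
    """Apply the listed restrictions to one root/candidate pair."""
    return (
        candidate not in existing_titles                    # restriction 1
        and candidate in linked_after_adjective(wikitext)   # restrictions 2 and 6
        and candidate.startswith(root[:-2])                 # restriction 4
        and candidate not in other_inflections              # restriction 5
    )
```

With a page such as `==English==`, `===Adjective===`, `'''happy''' ([[happier]], [[happiest]])`, the pair happy/happier passes, while good/better fails the prefix check and would be left for a human editor.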

Request for bot status: SuperlBot

  • Bot: User: SuperlBot
  • Owner/operator: User: Connel MacKenzie
  • Purpose: Fill in superlatives, exactly as is currently done manually.
  • Generation restrictions:
    1. Entry must not already exist.
    2. Root form must link the superlative one or two lines after the ===Adjective=== heading.
    3. Inflection can be provided by regular text wikification, or any other template that wikifies terms.
    4. All but the last two characters of the root form must match the plural form to be auto-generated in this manner.
    5. Auto-generated only if there is no other inflected form (e.g. any verb or noun inflection).
    6. ===Adjective=== header of root term must be within an ==English== language section.
VOTE:


  • Against:
  • Comments:
    1. I would prefer that the headers of the created pages use ===Adjective form=== rather than simply "Adjective". This makes it clearer that more information will be found in a main entry. --EncycloPetey 23:39, 14 March 2006 (UTC)
Don't like the word "form", but what about use as a noun, and doesn't -er have that too? Davilla 02:58, 15 March 2006 (UTC)

Request for bot status: ThirdPersonBot

  • Bot: User: ThirdPersBot
  • Owner/operator: User: Connel MacKenzie
  • Purpose: Fill in third person verb forms, exactly as is currently done manually.
  • Generation restrictions:
    1. Entry must not already exist.
    2. Root form must link the 3rd person form one or two lines after the ===Verb=== heading.
    3. Inflection can be provided by regular text wikification, or any other template that wikifies terms, including Uncle G's inflection templates as well as Ncik's templates (or any others!)
    4. All but the last two characters of the root form must match the plural form to be auto-generated in this manner.
    5. Auto-generated only if there is no other inflected form (e.g. any adjective or noun inflection).
    6. ===Verb=== header of root term must be within an ==English== language section.
VOTE:
  • Against:
  • Comments:
    1. I would prefer that the headers of the created pages use ===Verb form=== rather than simply "Verb". This makes it clearer that more information will be found in a main entry. --EncycloPetey 23:39, 14 March 2006 (UTC)
  • Spelling rules as above. Also as above, should not have a noun sense, as that would imply both plural and verb inflection. I know there was some opposition at first, but I think you went overboard by dividing these tasks so finely. Davilla 03:02, 15 March 2006 (UTC)
    • See "Generation rules #5" - yes, if there is a noun form, it is excluded. I don't think I went overboard dividing the tasks. This is the identical subdivision I had when I first proposed it and Ec unilaterally denied the entire request specifying that each sub-task needed separate approval. I think that is very illogical, but it has had the side benefit of separating the flamewars to only the sub-components generating controversy (i.e. redirects and translations.) --Connel MacKenzie T C 03:45, 17 March 2006 (UTC)

Request for bot status: PastBot

  • Bot: User: PastBot
  • Owner/operator: User: Connel MacKenzie
  • Purpose: Fill in past and past participle verb forms, exactly as is currently done manually.
  • Generation restrictions:
    1. Entry must not already exist.
    2. Root form must link the past and past participle forms identically, one or two lines after the ===Verb=== heading.
    3. Inflection can be provided by regular text wikification, or any other template that wikifies terms, including Uncle G's inflection templates as well as Ncik's templates (or any others!)
    4. All but the last two characters of the root form must match the plural form to be auto-generated in this manner.
    5. Auto-generated only if there is no other inflected form (e.g. any adjective or noun inflection).
    6. ===Verb=== header of root term must be within an ==English== language section.
VOTE:


  • Against:
  • Comments:
    1. I would prefer that the headers of the created pages use ===Verb form=== rather than simply "Verb". This makes it clearer that more information will be found in a main entry. --EncycloPetey 23:38, 14 March 2006 (UTC)
  • What does this mean? "Auto-generated only if there is no other inflected form (e.g. any adjective or noun inflection)." Are these going to be list-verified as per earlier example of tired? Davilla 03:30, 15 March 2006 (UTC)
It would appear to mean that the script will check for links to noun inflections (as per the plural bot above) & if there is a plural and a third person both pointing at the same page then this bot will not generate a new page. It's only as good as the existing data in the XML dump which is not perfect but pretty good. (That's how I'd write it anyway.) MGSpiller 18:50, 15 March 2006 (UTC)
Still don't get it. What do "-ed" entries have to do with plurals and third-person verbs? If I were to write it I'd verify each to check for adjective senses. Davilla 19:59, 15 March 2006 (UTC)
Sorry yes, I'm confusing my bots, there are too many... Granted as adjectives like tired are root forms the bot is likely to miss them. My gut feeling is that these will be the exception rather than the rule and not significantly worse than a human editor missing the same homonym. MGSpiller 01:50, 17 March 2006 (UTC)
Yes, by "adjective...inflection" I do mean adjective senses. Although your example of tired makes me realize I need to revisit the parser. Right now, I only search for wikified terms. I need to expand the parser to also include terms that are merely bolded. But for tired itself, since the entry already exists, no 'bot using this method can upload it. This 'bot method will only work for entries that do not exist. --Connel MacKenzie T C 04:02, 17 March 2006 (UTC)
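The parser change described above (picking up terms that are merely bolded as well as wikified ones) might look something like the following sketch. The regular expressions and the function name `inflection_terms` are assumptions made for illustration; the actual parser is not shown anywhere in this discussion.

```python
import re

# [[term]] or [[term|label]]: capture the link target only.
WIKIFIED = re.compile(r"\[\[([^\]|#]+)(?:\|[^\]]*)?\]\]")
# '''term''': bold text that does not itself contain a link or apostrophes.
BOLDED = re.compile(r"'''(?!')([^'\[\]]+)'''")


def inflection_terms(line: str, headword: str) -> list[str]:
    """Collect inflected forms from an inflection line.

    Wikified terms come first, then bolded-only terms; the headword
    itself is usually bolded on the line, so it is excluded.
    """
    terms = [t for t in WIKIFIED.findall(line) if t != headword]
    for t in BOLDED.findall(line):
        if t != headword and t not in terms:
            terms.append(t)
    return terms


if __name__ == "__main__":
    print(inflection_terms("'''tire''' ([[tires]], [[tiring]], '''tired''')", "tire"))
```

A form that is both bolded and linked is matched by the link pattern only, so it is not counted twice.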

Request for bot status: TranslationBot

  • Bot: User: TranslationBot
  • Owner/operator: User: Connel MacKenzie
  • Purpose: Fill in translation entries of non-English terms, from translations given in translation sections of English entries.
  • Generation restrictions:
    1. Entry must not already exist.
    2. Translation must be in an "un-ambiguous" non-numbered-translations section.
    3. Language must be one of the "top 40"ish languages (whatever it is that WT:ELE currently recommends.)
  • No interwikis will be auto-added during this phase, unless GerardM asks me to auto-add them (on the assumption that his interwiki 'bots will remove the relatively few that don't have corresponding entries elsewhere.)
Please do not add interwiki links.. The bot will pick it up when it sees a corresponding entry in another language. GerardM 08:50, 17 March 2006 (UTC)
VOTE:
  • For:
    1. --Connel MacKenzie T C 05:44, 14 March 2006 (UTC)
    2. -- Tawker 06:26, 14 March 2006 (UTC)
    3. --Yyy 15:25, 15 March 2006 (UTC)
    4. Vildricianus 10:29, 17 March 2006 (UTC) (see comment somewhere far below)
    5. -- MGSpiller 02:21, 18 March 2006 (UTC) as per Vildricianus' comments...


  • Against:
    1. Ncik 17:44, 14 March 2006 (UTC)
    2. --Patrik Stridvall 18:41, 14 March 2006 (UTC)
    3. --EncycloPetey 23:32, 14 March 2006 (UTC)
    4. Would have to pull together definitions from multiple red links for the word if more than one, and in most cases there should be more than one. Davilla 03:12, 15 March 2006 (UTC)
      • Davilla, I may have worded it poorly above, but that is exactly the intent; how ever many English words link to a term, that term when generated will have one line for each definition line that has that as a translation. But only for well formed translation tables. Translation tables that don't start with a description line (if more than one definition is entered) are skipped. --Connel MacKenzie T C 05:45, 15 March 2006 (UTC)
        By "pull together" I don't just mean enumerate. For a foreign language word, a human could deduce a single or possibly multiple meanings from "what links here", but it takes a little mental computation that a bot hasn't got. For instance, this bot would have made niño into something like:
        1. baby boy...
        2. boy...
        3. child...
        A person halfway fluent in Spanish could do a better job. Davilla 19:53, 15 March 2006 (UTC)
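To make the disputed behaviour concrete, here is a rough sketch of the generation step as described in this request: one definition line per English entry that lists the term as a translation, each consisting of the wikified English word, a semicolon, a space, and the description from the translation sub-section. The tuple layout, the function name `build_stub_lines`, the `# ` list prefix, and the sample glosses are assumptions; a missing gloss stands in here for the "ambiguous" tables that the proposal says are skipped.

```python
from collections import defaultdict


def build_stub_lines(translations, existing_titles, allowed_languages):
    """Group translation rows into stub definition lines per foreign term.

    translations: iterable of (english_word, gloss, language, foreign_term).
    Returns {(language, foreign_term): [definition lines]}.
    """
    stubs = defaultdict(list)
    for english, gloss, language, term in translations:
        if term in existing_titles:
            continue  # entry must not already exist
        if language not in allowed_languages:
            continue  # only the recommended languages
        if not gloss:
            continue  # treated as an ambiguous table and skipped
        stubs[(language, term)].append(f"# [[{english}]]; {gloss}")
    return dict(stubs)


if __name__ == "__main__":
    rows = [  # glosses below are made up for the example
        ("baby", "infant boy", "Spanish", "niño"),
        ("boy", "young male", "Spanish", "niño"),
        ("child", "young person", "Spanish", "niño"),
    ]
    for line in build_stub_lines(rows, set(), {"Spanish"})[("Spanish", "niño")]:
        print(line)
```

The output is exactly the three-line enumeration objected to above: the sketch collects lines, it does not merge them into a single sense the way a fluent editor would.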
  • Comments:
  • I'm not voting for this one as I am not convinced that machine translation can always get the nuances right. But I won't vote against it either until I've seen it in action (by which time it will be too late!). SemperBlotto 14:31, 14 March 2006 (UTC)
  • I'm not convinced this is a good idea either. When I add Swedish entries, the possible translations are the minor issue. The big issue is everything else including grouping the possible translations in different senses as well as adding a short qualification since even words that only have one possible English word as a translation might only mean that in a small subset of the English senses. See for example the Swedish word mede that translates into English as runner or rocker. Try to guess what the word really means then click on the link and see the qualifications. You guessed wrong, didn't you? Then look at Swedish rygg. If you just added the words the entry wouldn't make much sense would it?
  • Note limiting yourself to "Translation must be in an "un-ambiguous" non-numbered-translations section." will in many cases mean that you pick entries that have the lowest quality. --Patrik Stridvall 18:41, 14 March 2006 (UTC)
    • My goodness, what a perfect example! This 'bot would not make an entry for mede as 1) it already exists, 2) the translation is only in the "to be checked" section, is not entered in either entry, not the un-ambiguous sections before it. But, assuming that the entry didn't exist, and the translations were entered and placed correctly in the un-ambiguous sections, the entry that would be created for mede would have two "definition" lines with one wikified word each, a semicolon, a space, then the meaning description from the translation sub-section, on each line. Did you completely misread what I had, or was what I wrote too confusingly worded or something? --Connel MacKenzie T C 19:29, 14 March 2006 (UTC) (corrections) --20:55, 14 March 2006 (UTC)
      • I partly misunderstood the sentence "Translation must be in an "un-ambiguous" non-numbered-translations section.", yes. Still, it only makes it partly better. The Swedish senses are usually not the same as the English senses despite being in a named translation table for that English sense. They are there because the English word can be translated to that particular Swedish word and it is the best match given the alternatives. mede was perhaps not the best example. Thinking again, the English sense they attach to is almost exactly the same as the corresponding Swedish sense. This is usually not the case, not even for cognates.
      • You will just end up creating a lot of low quality Swedish entries that will not help anybody very much. It will just be painful to sort it out. When doing the translation checks for Swedish I have often been frustrated how badly the translation tables match. I have tried to clean it up in some cases but often I simply haven't felt I had the time.
      • I hate to put it like this, since I'm not really much into personal attacks, but do you speak any language besides English? Your user page indicates that you don't, so I wonder if you really understand how different even related languages like Swedish are? And you are talking about doing it for the top 40 languages...
      • Note that you now have two native speakers of related languages (Swedish and German) against you. I wonder how speakers of unrelated languages feel about it? --Patrik Stridvall 21:06, 14 March 2006 (UTC)
Connel and Eclecticology are the main obstacles to sensibly treating non-English entries. They oppose any change to their own, anglo-centric policies. It is not allowed to add proper definitions to non-English words. I'm not surprised Connel now wants to generate entries using a bot. It's just sad. Ncik 23:02, 14 March 2006 (UTC)
There is no need to make this personal. There are times when my urge to strangle Connel is just as strong as my urge to strangle you. The reference to "anglo-centric policies" is plain false, as is the claim that "proper definitions to non-English words" are not allowed. I do not support the Translation Bot, but neither do I doubt that it was offered in good faith. Eclecticology 20:14, 15 March 2006 (UTC)
I agree that it is a bad idea for non-English words to be limited to a "translation", particularly when none of the English definitions captures the sense of the foreign word. Consider the word pavo in Latin. Yes, it refers to the peacock, but that doesn't give the reader information that it could be a food or that it is connected to the goddess Hera, both of which could be very important in the context of a Latin document. Consider the Latin word chamaeleon. Yes, the English word chameleon is cognate with the Latin, but in Latin documents the word refers to a mythical creature that subsisted on air alone, without eating food. It is NOT the lizard of sub-Saharan Africa that is meant in Latin texts, though that is the definition one would get from the English page by "translation".
        • As a speaker of English and a dabbler in others, I say it is better to be able to look up a word and get _something_ rather than a dead end. I like to try and follow what the German speakers in #wiktionary on IRC are saying sometimes, and I use en.wikt as my primary resource for this. All too often I have to go elsewhere because we don't even have a simple 1 word entry for what I seek. I have been adding lots of 1 worders from the Spanish language because it is a start, and if you come across abad in your daily life you probably want to know a similar English word, regardless of the fact that it might have nuances in Spanish that the word abbot doesn't have in English. - TheDaveRoss 22:15, 14 March 2006 (UTC)
Just use the Search button, DaveRoss. The bot won't add anything new. Ncik 23:02, 14 March 2006 (UTC)
The search button brings us no closer to the "every word, every language", the bot does. - TheDaveRoss 23:08, 14 March 2006 (UTC)
A low quality bot added Swedish (or German) entry will not help you very much, in fact it might fool you to believe that a human has entered it or at least checked it and that the qualification actually means something that it really doesn't. Even somebody with a Babel level of 2 or 3 might be fooled. In fact for uncommon words you might even fool a native speaker. I believe myself to understand subtle nuances of common English words almost as well as with Swedish common words, but I still have problems finding good qualifications for translations. Believing that the English senses can be used as qualification is simply wrong even for many cognates. So adding the qualifications can actually make things worse and not adding them doesn't really offer anything useful that you can't find using "Search". Humans regardless of Babel level are more likely to understand their own limitations than a bot and are less likely to add something that might fool others. --Patrik Stridvall 23:25, 14 March 2006 (UTC)
A human did enter them, Patrik, the bot will take the definitions people added to the English pages and create pages from them. An example: solar system has a Swedish translation solsystem which was added by Mike. The bot would simply create solsystem and state that it meant solar system in Swedish. Mike seemed to think this was correct, and anyone who subsequently looked at the page did also. The bot won't be making anything up, it is simply doing the redundant work that users would have to do anyway, for the simple translations. - TheDaveRoss 23:36, 14 March 2006 (UTC)
Bad example, a solar system is something concrete that is clearly defined and doesn't really have any subtle nuances. Most words do. The senses in English are a bad match for senses of foreign words and the human that entered it perhaps should have split the senses but didn't, perhaps because he didn't have time, perhaps because it sometimes is damn hard to do and perhaps because of a number of other reasons. Most translations are only good one way and trying to make it go the other way as well is asking for trouble. We have "Search", please use it instead. --Patrik Stridvall 00:01, 15 March 2006 (UTC)
But translations between languages are not necessarily reflexive, nor are they usually one-to-one. Suppose someone has entered for the English word woobit that the Dutch translation is waabijd. That may mean that waabijd is a strict one-to-one translation of woobit, or it may mean that waabijd is simply the best translation of a difficult to translate word. It does NOT mean that woobit is the best translation for the Dutch word waabijd. In short, just because A translates into B, does not mean that B necessarily should translate to A. There may be a much better and more accurate word in English. --EncycloPetey 23:53, 14 March 2006 (UTC)
Exactly. --Patrik Stridvall 00:01, 15 March 2006 (UTC)
I would be interested in an example where the translation listed in the English entry would not produce a reasonable new page. It is limited to only the simple ones (no multiple definition issues) and only uses already contributed translations. If woobit is translated into waabijd in Dutch, how is it possible that at least one of the senses of waabijd isn't woobit? If dog translates to perro, one of the senses of perro IS dog, even if perro can also mean spaceship. Even though the bot's additions will be incomplete, which of the human entries IS complete? The basis you list for denying this bot a run does not stand up when you look at what the bot actually will do, only if you assume the bot will do things it won't. - TheDaveRoss 01:51, 15 March 2006 (UTC)
If dog translates to wobitas, there's nothing to prevent the sense of wobitas being, for example, canine, where dog is not a "sense" of wobitas at all but a more specific word English uses that Woobitwegian doesn't, the disparity in information being normally ignored in translation but which is important in definition. This happens often with things like names of colors, for example. γλαυκός may translate blue or blue-green or gray but the real, single sense appears to be broader, and to give one would be misleading. —Muke Tever 12:12, 17 March 2006 (UTC)
Precisely. Translation in one direction either produces a word with nearly the same circumscription of meanings and connotations, or else produces a term with a broader definition. Reversing the translation is therefore inappropriate, since it applies a narrower meaning to a term than is intended. Consider that most English-Spanish dictionaries translate dog, hound, and mutt as perro in Spanish. This does not mean that Spanish perro carries all those specific senses. It merely means that Spanish does not have a word that specifically means hound with the sense of a hunting-dog while still applying in general nor a word that means mutt in the sense of implying the mixed history of the breed. Thus, to back-translate perro as "dog, hound, mutt" is terribly inappropriate. The same problem occurs many times in translation, any time that the vocabulary of two languages is not one-to-one, which it seldom is. --EncycloPetey 21:15, 17 March 2006 (UTC)

  • Reading the comments here, I am beginning to think you are all insane.
    1. Patrick, I do speak one other language and have dabbled in several others (prior to Wiktionary, now I've had much more exposure.) Not being fluent, I am not comfortable asserting that I can reasonably contribute in any other languages authoritatively, so my babel template lists only English. There is no testing requirement for Babel templates; I feel that many who claim en-3 should really have en-1 listed. You, on the other hand, seem to be very competent.
    2. EncycloPetey, go look at the explanation for mede again; translations get pulled from wherever the term is defined, NOT ONE TO ONE! (You misunderstand my point. See the passage re: perro just above the line divider preceding this passage. --EncycloPetey 01:31, 18 March 2006 (UTC))
    3. Patrick, a stub is much easier for you to enhance than a blank entry. If the definitions are wrong then it is a problem, (but then they are a problem already) and if they are already right then you have been saved some typing; no harm no foul.
    4. solar system is a great example; simple entries are filled in correctly. That is what this 'bot is for!
    5. Ncik, this is the English Wiktionary. Your treatment of Hand is in no way oriented towards a native English speaker. Of course it makes sense in German, but what you said in English borders on gibberish. If you don't like terms being explained in English for English readers, then don't contribute here.
  • --Connel MacKenzie T C 01:38, 15 March 2006 (UTC)


  • Woah there, calm down.

This bot is specified as generating a word in a foreign language from a red link already put in as a translation of an existing english word. There are some restrictions placed to ensure that it does not attempt to create translations which are obviously more complex than a simple one to one but I think a couple more checks should be implemented.

  1. It should generate an italic footer like the Webster entries making clear that it is likely that there are other senses & possible nuances which may be missed by this simple first step.
  2. It should generate a checklist for ease of checking by humans of what it's done, linked from the cleanup pages.
  3. As part of the preparatory work it should check whether a redlinked translation is present more than once, i.e. from more than one English word. (It would be nice if the results of this check could be output as another cleanup page, as it should give a human editor a head start on creating pages.)
Hopefully my humble suggestions may provide, if not an outright solution, then some thinking points for improvements. MGSpiller 02:17, 15 March 2006 (UTC)
Forgot to say, the cleanup pages (perhaps that should be translations to be checked) should be split by language, though that is probably obvious. MGSpiller 02:24, 15 March 2006 (UTC)
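
The third check, together with the split by language, could be sketched roughly as follows. This is only an illustration in Python and not the bot's actual code: the input rows, the `page_exists` flag and both function names are assumptions made up for the example, and a real run would have to take its rows from the parsed translation sections.

```python
from collections import defaultdict

# Hypothetical input rows: (english_word, language, translation, page_exists).
# The page_exists flag is assumed to come from some earlier lookup.
TRANSLATIONS = [
    ("runner", "Swedish", "mede", False),
    ("rocker", "Swedish", "mede", False),
    ("solar system", "Swedish", "solsystem", False),
    ("dog", "Spanish", "perro", True),
]

def redlink_checklists(rows):
    """Group red-linked translations by language, then by foreign word."""
    by_language = defaultdict(lambda: defaultdict(list))
    for english, language, foreign, page_exists in rows:
        if not page_exists:  # only red links are candidates for the bot
            by_language[language][foreign].append(english)
    return by_language

def shared_redlinks(by_language):
    """Keep only foreign words listed under more than one English word."""
    return {
        lang: {word: sources for word, sources in words.items() if len(sources) > 1}
        for lang, words in by_language.items()
    }

checklists = redlink_checklists(TRANSLATIONS)
multi = shared_redlinks(checklists)
```

Here `checklists` would feed one checklist page per language, and `multi` would feed the extra cleanup page of words that need a human to merge several English senses.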


  • I'm baffled at how I could have been unclear. The intent of this is to glean meanings from multiple words, such as runner or rocker to build an entry for mede.
  • Tagging these entries with an explanatory footer is a very good idea.
  • It is easier to create the list as a gigantic single page (as was done for the first few hundred Webster entries, the first time around) and let the translators take a shot at them. If that works out anything like the Webster entries have, then we'll get 5 to 10 entries actually entered from that list in the next two years.
--Connel MacKenzie T C 02:38, 15 March 2006 (UTC)


I don't think it's you that is unclear but the very nature of the beast of translation. Though you did not specify on the BP what would happen if two English words both had the same foreign word listed as a translation, this should be treated carefully. I'm naturally optimistic and still newish to wikis, and there are also more editors here than there were before, so perhaps we can hope for 20 or 40 entries a year :-).

I was going to post this but I was still typing when you posted....

  • Another subtle point is what Kipmaster said here about translations in other Wiktionaries with respect to what the French have already done. If a French Wiktionary writer has said that the closest translation of hrunk is weeble in English, that is not the same as an English Wiktionary writer saying that the closest translation of weeble is hrunk in French. Half the time the closest translation of weeble in English is going to be hrunkle in French, a similar but different word. Perhaps the French were right to translate words that are marked up in other Wiktionaries. The ideal would be to do a lot of number crunching and cross-referencing, checking both this and our sister for the same language. Then we could start with the ones where both agree and then move on cautiously to the ones where they don't agree. MGSpiller 02:50, 15 March 2006 (UTC)
  • Well, hehe, guess what? I'll find a French match every time, if I did that. Here-to-there or there-to-here is a straw man. Either we trust our own definitions that we have or we don't. If we don't then we should just remove all translations. --Connel MacKenzie T C 05:49, 15 March 2006 (UTC)
  • 2 comments:
  • I'm, for now, opposed to running such a bot on the French Wiktionary; it looks to me like just a way to increase the number of articles without increasing the quality (the foreign words in translation tables can already be found using the Search button). When we import en->fr from the en: Wiktionary, we of course take the translation, but also the gender (well, not in en...), preterite, pronunciation, ... So the resulting article looks OK. If we create an en: or de: article from a fr translation table, the most we can have is the gender, which is not always indicated.
  • On a more positive note, I think we have on the French Wiktionary a good amount of fr->en translations, and I guess that other Wiktionaries would have a lot of them too. Since the templates on the French Wiktionary are very convenient, it's easy to extract those translations from there, plus pronunciations, genders and so on. I'll be glad to help do that if somebody wants to spend some time on it (I have not enough time and too many projects to do it myself).
  • PS: I've never heard of hrunk in French ;-) Kipmaster 12:51, 15 March 2006 (UTC)
    • It should probably be hrunque from the French spoken in the Western US; it is derived from the sound heard when a jackalope from Boisé (pronounced /boy-zee/) gets its antlers caught in the trees on the way to the Grand Teton. There may also be a Spanish translation, jrunque, to reflect the sounds heard when a Texas Horned Frog gets stuck in a gopher hole in the course of its horny pursuits. :-) Eclecticology 20:14, 15 March 2006 (UTC)

"If you are keeping your head while others around you are losing theirs, perhaps you've misunderstood the whole situation." - Unknown

Connel, sorry for being personal; it is just that in my experience many Americans have a rather vague notion of the world around them, especially foreign languages. I probably went too far; I apologize.

Both sides have made mistakes by choosing bad examples. My mistake was mede (English: runner or rocker) which is something very concrete and distinct. Dave's mistake was solsystem (English: solar system) which is also concrete and distinct. Sure it would work for such words. This is unfortunately the small minority of words. Note however that Dave hasn't voted for despite his comments. Now let's forget those mistakes and move on.

As for "Either we trust our own definitions that we have or we don't.". Well, if that is how you wish to put it, I vote that we don't. Seriously, getting the English senses, and by extension the labels on the translation tables, right is a gradual process and is far from complete even for very common English words. Note that my point of view is that of Swedish, which shares a common ancestry with English in both language and culture. Both languages have additionally been influenced by Latin: in the case of Swedish either through German or directly, in the case of English either through French or directly. Unfortunately the additional languages I speak are German and French, so I really can't offer any outside perspective. But even from the inside of our common conceptual heritage I see large fundamental differences in how the languages work. As for the top 40 languages of the world, I have a hard time even imagining...

Now even with the English senses correct you really can't normally go in the other direction except for concrete and distinct concepts. Unless you are fluent in at least one foreign language I don't think you can have any real understanding of how bad English<->(foreign language) dictionaries really are, and we are talking about dictionaries made by professionals that have been gradually evolving, perhaps over hundreds of years. The only sure way to understand a word is to read an explanation in the foreign language itself.

As for "A stub is much easier for you to enhance than a blank entry": no, not really; our labels for the translation tables are in most cases too bad to be of much use. Finding possible English translations is the easy part; the hard part is all the rest. I would much rather add entries de novo.

Please don't let any animosity toward the French lead to rash actions. Now let's try to be constructive instead of just criticizing. It is a non-trivial problem. Perhaps a website that cross-references all Wiktionaries, with automatically generated suggestions for entries to cut and paste from, would be useful. Not only translations but also synonyms and such things. Perhaps at "xref.wiktionary.org". --Patrik Stridvall 10:42, 15 March 2006 (UTC)

Which are the "top 40"ish languages? (I did not find a list in WT:ELE.) --Yyy 11:24, 15 March 2006 (UTC)

I would support this, but I do not know if Latvian is in the top 40 (I suspect not). Also, it would be good to add a category (cat Latvian nouns for Latvian nouns, verbs for verbs and so on), if the corresponding category exists and if this applies to Latvian words. --Yyy 13:10, 15 March 2006 (UTC)
  1. Patrick, So you vote then that we should remove all translation sections from all entries?!
  2. Patrick, I assume TheDaveRoss made an oversight, since he did vote in favor originally.
  3. The "Top-40 languages" I keep referring to was how dewikified languages were originally referred to in ET:ELE. The talk page of that page goes on at length about which languages to de-wikify. I shall make an effort not to call them "top 40" but rather "de-wikified by consensus" languages. I won't just take entries if the language is dewikified though; rather only ones for the official list (wherever it is.)
  4. I personally hold no animosity towards French or the French people or the country of France. French is my favorite foreign language. The French Wiktionnaire is IMNSHO the best Wiktionary. I'd like to visit Paris someday.
  5. Patrick, are you suggesting I post the generated list (when I get to the point of being able to generate it) as a page here, to let you comb through? At that point, you can either enter terms that are "tricky" (thereby preventing the 'bot from ever entering them) or offer logic corrections, or enhance the root entries that cause problems. But to then let that list remain would be a mistake (like the Webster entries) in my opinion.

--Connel MacKenzie T C 15:03, 15 March 2006 (UTC)

Of course we shouldn't remove the translation sections. Don't be silly. The point is that many of them barely fill the role they are meant to accomplish. Sure, they will get better over time, but you intend to use them NOW in their current state for something they are not "designed" for.
I suggest that we set up a website "xref.wiktionary.org" that cross-references all Wiktionaries and that anybody can use to save time as described above. This will help all Wiktionaries, not just us. I see it as a continuous process, not as a "Now let's see if we by using some dirty trick can quickly regain the lead somehow".
Note that even if I would agree to check Swedish, you still have 39 languages to go... Furthermore, have you any idea how long it takes to check and modify a list of entries? Especially since most of them will be at least partly wrong. It must be treated as a continuous process... --Patrik Stridvall 16:20, 15 March 2006 (UTC)

While the other bots in this series have some hope of success, or are at least repairable, it's clear to me from the above discussion that this one raises more questions than it answers. It is at best premature. Ultimate Wiktionary (or WiktionaryZ) has been suggested as a solution that would appear to accomplish what is suggested by Patrik's "xref.wiktionary", but we can't just sit around waiting for them to come up with something practical.

If any kind of bot solution for translations is workable it should probably work in the other direction, taking the existing foreign word entry and matching it with the translation lists -- assuming that that is something feasible. Foreign word entries are referenceable; translation lists can't be referenced easily or practically because nearly every element in the list would need a separate reference. Eclecticology 20:14, 15 March 2006 (UTC)

Oh, my "xref.wiktionary.org" suggestion is not even remotely as ambitious as WiktionaryZ. We are talking ant compared to dinosaur. It's strictly readonly, it doesn't even modify any Wiktionary by itself it is only an aid to help adding entries. The idea is that you take the dumps from all Wiktionaries and parse them and generate some sort of database. Then you have a website that allows you to enter a language and a word. It then shows which Wiktionaries have definitions for that word and what other languages and words that have that word in translation sections. A simple cross reference. Of course it could be made more advanced and actually suggest a possible entry that you can cut and paste. Approximately what Connel's bot is supposed to do but not actually adding anything by itself. A human will have to decide what makes sense and what does not. --Patrik Stridvall 21:01, 15 March 2006 (UTC)
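
A very rough sketch of such a read-only cross reference, written in Python purely for illustration. It assumes the dumps have already been parsed into simple tuples; the tuple layout, the sample rows and the function names are all made up for the example, and the hard part (a parser for each Wiktionary's own templates) is not shown.

```python
from collections import defaultdict

# Hypothetical, already-parsed dump data:
# (wiktionary_code, entry_language, entry_word, [(target_language, target_word), ...])
PARSED_ENTRIES = [
    ("en", "English", "runner", [("Swedish", "mede")]),
    ("en", "English", "rocker", [("Swedish", "mede")]),
    ("sv", "Swedish", "mede", [("English", "runner")]),
]

def build_xref(entries):
    """Build two indexes keyed by (language, word)."""
    defined_in = defaultdict(set)  # which Wiktionaries have an entry for the word
    listed_by = defaultdict(set)   # which entries list the word as a translation
    for wikt, lang, word, translations in entries:
        defined_in[(lang, word)].add(wikt)
        for t_lang, t_word in translations:
            listed_by[(t_lang, t_word)].add((wikt, lang, word))
    return defined_in, listed_by

def lookup(defined_in, listed_by, language, word):
    """Read-only query: nothing is written back to any Wiktionary."""
    key = (language, word)
    return {
        "defined_in": sorted(defined_in.get(key, ())),
        "listed_by": sorted(listed_by.get(key, ())),
    }

DEFINED_IN, LISTED_BY = build_xref(PARSED_ENTRIES)
result = lookup(DEFINED_IN, LISTED_BY, "Swedish", "mede")
```

With the sample rows, looking up Swedish mede reports that the (hypothetical) sv data defines it and that the English entries rocker and runner list it as a translation. Whether any of that makes a sensible entry is still left to a human.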
It's the lack of ambition that makes this idea look attractive. I may perhaps sound extremely negative in this, but I see WiktionaryZ as promising everything and producing nothing despite the early special funding that the project received. At the moment I only have a very broad vision of how xref might work, but I would be prepared to support this if the idea can be fleshed out a bit and WiktionaryZ is acknowledged as going nowhere. Eclecticology 18:04, 16 March 2006 (UTC)
If WiktionaryZ is a dinosaur, then a sugar ant to your fire ant is the inclusion of other language entries on a spelling page. Davilla 14:39, 17 March 2006 (UTC)

This seems to be Wiki at its best. Unstructured comments, animosity and lots of spelling errors :-). I agree, though, that the proposed bot's operation was largely unclear at first. Although I was strongly against this when I first heard about it a week ago, I've had the time to think of a good argument and haven't found one.

It may have been suggested above (or below, as the discussion is everywhere), but reading through it once is more than enough: I recommend adding all bot-added entries (for translations) to something like Category:Bot-added entries (Dutch) (or something less verbose). Then it'll work I guess, I can't see why not, certainly if only applied to the top-40. I haven't found that many well-founded arguments in the above/below masses of text. I trust that it will be run with great care and plenty of consideration, and I guess any other issue will settle itself by time. I also trust that it'll be postponed until all other bots have run (as was planned). — Vildricianus 10:29, 17 March 2006 (UTC)

No amount of care can magically conjure up information that doesn't exist. No bot that we can program with a reasonable effort is likely to have an "understanding" even remotely similar to that of a human who doesn't even speak the specific language. Translation is very hard. I can read and write in English with not much more effort than I do in my native Swedish. Still, interpreting Swedish into English or vice versa is often hard. Not only because you have to realize which sense of the word is the relevant one but also because you often can't use it as is. Sometimes you even have to change the POS in order to make a correct sentence. Sometimes the best fit is too vague, so you have to modify it with an adjective, an adverb, a preposition or something else to adapt it to the situation. The words given as translations are often just vague hints; they are certainly not reversible.
So, how will the bot determine the POS of the translation? Yes, sure, in most cases it is the same as in English, but even in a related language like Swedish a translation doesn't necessarily have the same POS. For example, Swedish has a preference for making genitive noun + noun compounds where English uses adjective + noun. So the translation for the adjective ivory is the noun elfenben in the prefix form elfenbens-. Then we have senses of English words that are normally used in the form of + noun, where the correct translation of the noun is an adjective in Swedish.
Note that Swedish is a related language. Most of the top 40 languages are not. There is no reason to believe that it will be better in those languages; more likely worse... --10:22, 18 March 2006 (UTC)

Suggestion

As a non-linguist here, with little knowledge on the operations of bots, I'll add my (possibly naive) two cents. What I reckon is firstly that Wiktionary is a rather slow-moving project. The French Wiktionary has overtaken us in article-count, which is great for them. If we can get a bot running that does as Connel says, let's give it a try. I'd like to see this 'bot in motion. Or maybe we could start by only doing it for nouns to start with, as nouns are less ambiguous in translations. Just an idea --Dangherous 21:12, 15 March 2006 (UTC)

Thank you. I agree that I must start with nouns only. --Connel MacKenzie T C 22:35, 15 March 2006 (UTC)
Don't be disheartened Connel, we seem to have a consensus (more or less) on everything but the translations. Take it one step at a time, do the plurals first. By the time the plurals are done I reckon we will have no more opposing votes on any of comparative, superlative etc. (I did mention my irredeemable optimism before, didn't I? :-) ) They would all appear to be fair game at this point.
I am not aware of any form of automatic translation which is free of controversy, even if based directly upon human work (as this is) so don't feel that one bot being controversial is a failure. It is an ambitious proposal made in good faith, perhaps you did not realise how ambitious it really is but that is no bad thing.MGSpiller 00:51, 16 March 2006 (UTC)
I am disappointed by the Luddite mentality that seems to be prevailing for translations. To say that the current translations entered are incorrect is an impeachment of all translation attempts made here on Wiktionary. To rearrange the translations entered as I have suggested accomplishes several things: 1) a framework is popularized for filling in translation entries (which so far have seen rather paltry entries) 2) a lot of typing is saved for people! 3) it makes Wiktionary the translation resource for English language speakers it was intended to be, ever since its motto became "every word in every language", 4) it spurs people into correcting the generated entries (which otherwise wouldn't exist) 5) it makes direct lookup of foreign terms possible! 6) It naturally (via interwiki links generated one day later) allows for correction and enhancement from other language Wiktionaries. 7) It indirectly builds other language Wiktionaries' lists of possible/probable words that they should enter. 8) It provides a starting map-point for WiktionaryZ, especially for controversial and problematic terms.
I am disappointed that the choice of the name (chosen because of a humorous character on a cartoon website) misled people to think this was some kind of "pump up the count" effort. I have discussed and planned this series of 'bots for quite some time now - more than months; about a year (maybe more than a year.) Furthermore, this one does nothing to increase the English language entry count!
I am disappointed that the complaints have ranged from bizarre to absurd. The only semi-legitimate complaint is that Wiktionary is currently incomplete. Well, duh. How is preventing more entries (of exactly equal quality of our current entries) helpful?
--Connel MacKenzie T C 08:15, 16 March 2006 (UTC)
The reason adding bad entries is unhelpful is that when the entries you borrow from improve, the bad entries that have already been added do not get any better. It is exactly the same reason why copy-and-paste programming is bad: it breaks the semantic link between various parts of a program and by doing so makes a program much harder to maintain. Doing what you can to preserve the semantic link is what separates the amateurs from the professionals.
Why do you think WiktionaryZ is so complicated? Because it tries as far as possible to preserve the semantic link between words and the concepts they represent. It is an enormously complicated problem that is not likely to have a simple solution. However, calling people Luddites just because they don't have a solution is not helpful.
I refuse to take the bait and comment individually on the 8 points above. It is not about what good comes from doing it, it is about the price you pay. But most people don't want to hear that; they vote for whatever politician promises to fix their specific problem without thinking about the consequences... ---Patrik Stridvall 20:30, 16 March 2006 (UTC)
Well, I apologize for returning mild insult for mild insult. But the comparison to copy-paste programming is invalid. We are discussing data elements, not logic elements. Yes of course it would be really cool to have both updated automatically...but that is not how Wiktionary works. Currently, all semantic links "are broken" according to you. But when you find an error in a translation section, you follow the link and fix it wherever else it is broken. That happens occasionally just for English words here (for inflected forms.) That doesn't mean that having entries is wrong! It means the few mistakes are equally wrong wherever they appear. The chances that the error will be found and corrected in both places is better, as the inaccuracies will probably stand out more.
I'm disappointed that after such a scathing (fallacious) impeachment as yours, you arrogantly dismiss numerous benefits...saying you "won't take the bait." They are all valid, tangible benefits - that is the only reason you don't respond. Again, the "price paid" is negligible (if it is any more at all) while the benefits are significant. One way or another, Wiktionary should eventually have all these entries. --Connel MacKenzie T C 04:44, 17 March 2006 (UTC)
Yes, sure I can be and probably was arrogant. Still, the reason I don't want to debate the details is that the existence or non-existence of useful benefits is not the real problem. Debating it would just lead the debate in the wrong direction. That is what I meant by "I refuse to take the bait".
First of all, the difference between logic and data, even in programming, is purely artificial. In this context the difference is even smaller. In fact it is almost meaningless to even talk about. It is better to talk about semantic links instead.
If a page doesn't exist, the semantic link is not really broken, since it can be restored by algorithmic means without the need for human intelligence and in most cases even knowledge of the language in question. In fact this is what the "Search" button does in a limited way. If you add the page, the semantic link is broken, because when the entries you borrowed from change you can no longer propagate the changes by algorithmic means. This makes it even more important to get it right the first time. The problems are:
  • The quality of most of our entries is not that good yet.
  • Even setting aside available editor time, there are huge theoretical problems involved in getting it right.
  • Even entries that are right in some meaning will not necessarily produce good entries, because translations are by their nature only partially reversible.
To respond to these three points: 1) It will never be perfect, 2) I can't imagine a more abstract complaint. 3) That is why this is a wiki! Humans have the opportunity to correct entries; they have added incentive, especially if tagged as suggested above. Looking at the separate-but-cut-n-paste method suggested of xref.wt.org, I must point out again that the existing Webster pages (an identical concept, without the separate server) only very very rarely get imported into "proper" entries. (I can't say never because I went through the painful process of entering a couple from the list that someone uploaded before I got here.) Why do they get imported so rarely? Because it is a horrible, cumbersome process. Whereas if non-existent entries are 'bot populated, they are much easier to update. With proper labeling (as suggested above) it becomes clear even to passing visitors that the information is suspect, and should be corrected whenever possible. Also, since this is the first pass, many things will be learned in the process...and future updates will likely build on this, to incorporate updates from the translations sections (for unmodified bot-entered entries only, of course.) But that is months away, I think. --Connel MacKenzie T C 08:26, 24 March 2006 (UTC)
I'm not questioning the benefits, even though many of them are more wishful thinking than anything really useful; I questioned and am still questioning whether you understand the price you pay for it. Sure, Wiktionary has numerous technical limitations; that is why people are working on WiktionaryZ. But that only makes it more important not to ignore the limitations. What you are basically saying is "Oh well, we already have a lot of other problems, it doesn't matter so much if we create a few new ones".
I don't understand how you can say that such direct benefits are "more wishful thinking than anything really useful" - that strikes me as extraordinarily arrogant, especially when you haven't seen a sample of a dozen or a hundred entries. That is contempt prior to investigation. Your assumption that I'm creating a few new problems is invalid. --Connel MacKenzie T C 08:26, 24 March 2006 (UTC)
So, trying to be constructive instead, I think your time is better spent implementing something like my "xref.wiktionary.org" proposal. You will need something similar anyway if you are to have any hope of generating anything even remotely useful. Then we can discuss whether there are enough acceptable potential entries to bot add them. Even if not, humans can always copy, paste, edit and submit whatever makes sense themselves. --Patrik Stridvall 10:04, 17 March 2006 (UTC)
The major problem with that line of reasoning is that the method you suggest has been tried (with the Webster entries) before, and has failed miserably. Perhaps a compromise would be to upload the generated pages to a holding area for one week before running (say, 1,000 at a time?) so that you or any other interested translators can remove entries from the list that are off the wall? Then after a week, upload the remaining entries proper, and replace the holding area with the next batch. Even better would be for you (or other interested translators) to preclude questionable entries with proper entries, to prevent the 'bot from re-attempting those translations in the future. Would you be willing to try something like that? After the first two or three batches, I'm sure you'll agree it is easier to upload them all, and make corrections to the handful of exceptions...but the proof is in the pudding. --Connel MacKenzie T C 08:26, 24 March 2006 (UTC)
    • Why aren't we moving entries found in the page "Webster 1913" that don't already have a definition? I understand they need formatting, but the vast majority of formatting tweaks could be accomplished using the Find and Replace box in Microsoft Word (e.g., replace <i> with ''; replace <syn> with [nothing]). We could worry about the formatting and updating each entry further later. Taken to the extreme, you could even write a bot that just copies them straight from the "Webster" pages into a new entry.--217.91.66.6 07:13, 16 March 2006 (UTC)
  • My major concern about doing so is the copyright notice in the Project Gutenberg copy of Webster 1913. The secondary concerns are the formatting challenges which you scratched the surface of. For automation, none of the XML tags can be used (that is what the copyright seems to cover most especially.) I need a copyright-free version to start from, which I currently don't have. And a lot of time parsing the entries as they exist there. Then another fair amount of time formatting them into Wiktionary style (which changes each month.) But yes, this is another project I hope to get to, overcoming the barriers I just mentioned, as well as a few other minor ones. --Connel MacKenzie T C 08:21, 16 March 2006 (UTC)
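
For illustration only, the two Find and Replace tweaks mentioned above (turn <i> into '' and drop <syn>) could be scripted roughly like this in Python. The handling of the closing tags is an assumption, the sample line is invented rather than taken from the Webster pages, and the sketch does nothing about the copyright or parsing concerns just raised.

```python
import re

def webster_to_wikitext(text):
    """Apply the two replacements suggested above to one chunk of text."""
    # <i>...</i> becomes wiki italics; treating </i> the same way is an assumption
    text = re.sub(r"</?i>", "''", text)
    # <syn> and </syn> are replaced with nothing
    text = re.sub(r"</?syn>", "", text)
    return text

# Invented sample line, not real Webster 1913 markup
sample = "<syn>A <i>stub</i> entry.</syn>"
converted = webster_to_wikitext(sample)
```

Every other tag in the source would need its own rule, which is where most of the formatting effort would go.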

Someone who speaks of the Luddite mentality should at least read the Wikipedia article on the subject. An appropriate modern comparison with post-Napoleonic Luddites, who saw the mechanization of the weaving industry as an attempt to concentrate wealth in the hand of the factory owners at the expense of the small weavers, might be the free software movement which seeks to prevent the concentration of knowledge in the hands of companies like Microsoft. A person who understands the early role of the Luddites in the struggle for workers' rights can only wear such an epithet as a badge of honour.

The claim that there is a copyright problem with the 1913 Webster is entirely spurious. Whatever difficulties I may have with such a massive import have nothing to do with copyright.

I don't care if the French Wiktionary has more entries than the English one. A French verb, for example, has many more inflections than an English one. That alone would suggest more articles. Quality is far more important than quantity.

Given the number of bots involved, it makes sense to work out the bugs in the nouns first. Speaking for myself, the criticisms that I have about some of the others are very similar to the criticism that I have about the template for nouns. When there is agreement about the nouns, the others should largely fall into place. Eclecticology 19:09, 16 March 2006 (UTC)

They are separate 'bots only because you unilaterally denied the request of having them all as one. But how they are arranged administratively is orthogonal; the nouns will be done before the comparatives are started; the comparatives will be done before the superlatives are started, etc.
Again, this is not about French having more entries. I've become comfortable enough with the 'bot technology and Wiktionary formats that I can do these 'bots that I've thought about ever since discovering Wiktionary. I've talked about these for a year (give or take), long before Wiktionnaire's stunt. --Connel MacKenzie T C 19:15, 16 March 2006 (UTC)
By the way, please re-read the introduction of Luddite. I'm using the "epithet" not as an epithet, but rather exactly as it is described there, with the current modern meaning...not the meaning the term might have had about 200 years ago. The analogy drawn seems very misguided, as well.
As for Eclecticology's ignorance regarding the copyright status of the Project Gutenberg copy of W 1913, please take a look at it. I do not have a copyright-free electronic copy of W 1913 to use! Do you know of one?
--Connel MacKenzie T C 08:26, 24 March 2006 (UTC)
Huh? Any copyright on republished public domain works is limited to any additional formatting that was not present in the original work. See User_talk:Eclecticology#Template:R:Century 1914. -Patrik Stridvall 13:44, 24 March 2006 (UTC)

First of all, "xref.wiktionary.org" is primarily meant to be a cross reference and only secondarily a tool to help add translations and other things. So comparing it to the Webster entries is not really that relevant. Besides, as I said, you need something similar anyway. The hard parts are writing a good parser that stores it in a good database and then writing a good generator for the new pages. When you have done this you have done 95% of the difficult parts of "xref.wiktionary.org". Doing the rest should be easy. Note that you could have a button that says "Automatically add this" that queues it up for adding by a bot.

Having a real cross reference will make it possible to discuss any shortcomings. Yes, I may be pessimistic and possibly arrogant, even though I would prefer to label myself realistic. However, perhaps I'm wrong. You admitted yourself that "the proof is in the pudding" so now it is up to proof. :-)

BTW, you still haven't answered how you will determine the POS for translations. It is not always the same, even for related languages like Swedish. In Swedish, I suppose you could take the adjective or the verb that an English noun translates to and make it an adjectival or a verbal noun, but this might not be how a sentence that contains the English noun is normally translated. Hmm, perhaps we should mark translations that change POS somehow.

As for "Humans have the opportunity to correct entries; they have added incentive, especially if tagged as suggested above.". True; however, have you any idea what effort is involved in doing good translations? I have been trying to check the Swedish translations and have managed to do a large part of them by now. The problem is that even very simple words have a lot of different meanings that translate to different things. Even entries that look good on the surface are often, on closer examination, revealed to be much more complicated. Some words have taken more than one hour to fix. In many cases I had to add better English definitions, more English senses and more English synonyms and antonyms before I could properly understand how to translate the word. My Swedish-English dictionary is quite good, but very insufficient for this task. See for example what I did on dirty, prime and variable. When I started I thought that it would be easy. It was not. And no, it will not be much easier the other way.

Your autogenerated entries are likely to not even look very good on the surface. How many of them do you think I can correct in a week? Even 100 would be very optimistic. 1000 is far out. Just saying that an entry is bad will not help very much. Then you have the other 39 languages... It must be treated as a continuous process. --Patrik Stridvall 13:44, 24 March 2006 (UTC)

gibberish titles 12 March

Could someone check the history for the entries that were deleted 12 March with gibberish titles (esp. with mixed numbers and letters)? I removed two of them from my watchlist before wondering how on earth they got to be on my watchlist, and I'm wondering if someone didn't just move the page and replace the content and thus make it appear to be complete garbage when in fact it used to be something meaningful and completely different. Best way to know is simply look at the history, but I don't have that privilege for deleted files. Thanks! Davilla 19:32, 15 March 2006 (UTC)


These are from the move log, and are probably what you mean, and all result from vandalism.

  1. 11:00, 12 March 2006 Connel MacKenzie Talk:58u639586759th5htu45hfg7yd7yg7345762793748972384h3tuihuhg795y moved to Wiktionary talk:Administrators
  2. 10:54, 12 March 2006 Connel MacKenzie 58u639586759th5htu45hfg7yd7yg7345762793748972384h3tuihuhg795y moved to Wiktionary:Administrators
  3. 10:50, 12 March 2006 Todo el mundo Wiktionary talk:Administrators moved to Talk:58u639586759th5htu45hfg7yd7yg7345762793748972384h3tuihuhg795y
  4. 10:50, 12 March 2006 Todo el mundo Wiktionary:Administrators moved to 58u639586759th5htu45hfg7yd7yg7345762793748972384h3tuihuhg795y
  5. 10:40, 12 March 2006 Connel MacKenzie 7856745kj546o2j348u6hjkujhtguoh3495 moved to Wiktionary:Requests for verification
  6. 10:38, 12 March 2006 Connel MacKenzie Talk:yytjertj489 moved to Wiktionary talk:Requests for verification over redirect
  7. 10:33, 12 March 2006 Connel MacKenzie Talk:7856745kj546o2j348u6hjkujhtguoh3495 moved to Talk:yytjertj489 over redirect
  8. 10:29, 12 March 2006 Ttw2s Talk:yytjertj489 moved to Talk:7856745kj546o2j348u6hjkujhtguoh3495
  9. 10:29, 12 March 2006 Ttw2s yytjertj489 moved to 7856745kj546o2j348u6hjkujhtguoh3495

Hope this clears things up. Jonathan Webley 20:39, 15 March 2006 (UTC)

Thanks Jonathan. That is mostly correct, but to explain a tiny bit further, the ones that I couldn't/didn't move back over redirect were ones where the redirect had been edited, and I opted to delete those manually first, before moving the real entry back. I'm sorry if that hosed your watchlist. In the future, I'll check the entry, and force the move-over-redirect rather than clearing the path first. --Connel MacKenzie T C 22:27, 15 March 2006 (UTC)

No, I guess everything's fine. Doesn't appear RFV ever left my watchlist. Davilla 05:08, 18 March 2006 (UTC)

CheckUser permission

Hi, considering our sock-puppet professional vandal (ex***t) I was thinking that it may be a good idea to give the CheckUser permission (help, policy) to some people here. According to the policy, these rights must be given to at least 2 people on a wiki if given to any. In my opinion, it should be given to bureaucrats, or maybe very trusted admins, who understand how IP works (IP ranges and whois). Could you organize a vote or something?

Also, does somebody know where to ask if we want to checkuser the vandal of yesterday? Kipmaster 09:41, 16 March 2006 (UTC)

In principle I have no particular objection to this idea. Currently, AFAIK the only admin with this power is Jon in his capacity as a steward. Extending this to bureaucrats is fine with me, even though I have personally not sought this. Having this apply to selected other admins opens up the question of who would be trusted with it. The issue here is the ability to deal fairly with others, and a history of being able to avoid fights. I may not have the technical familiarity that is required, but that is learnable if circumstances require. I trust our other bureaucrat's social skills, but know absolutely nothing about his technical abilities. I look forward to his comments on this. Eclecticology 17:44, 16 March 2006 (UTC)
I have no desire to have this privilege, if that is what you mean. My tentacles are spread out thinly enough already. And I could do without accusations of abuse of that ability. If you are uncomfortable doing nslookups, whois', traceroute/tracert's and learning the nuances of CIDR address notation, perhaps we should check the meta: pages more closely: meta does not demand that such a person be a sysop, let alone a bureaucrat, right? I'd be comfortable with anyone that knows how to check and comprehend these: [4], [5], [6], [7], [8], [9] and [10]. --Connel MacKenzie T C 19:05, 16 March 2006 (UTC)
If you do not desire the privilege, the question of your suitability is moot. Please note that the only persons that were identifiably mentioned in my above comments are Jon and Paul. (Maybe I should have mentioned George and Ringo? :-)) Thank you for the links, I'll look into them further when needed. Eclecticology 19:27, 16 March 2006 (UTC)
I couldn't be trusted with checkuser! However, it may be necessary to create a Wiktionary:CheckUser page. --Dangherous 19:17, 16 March 2006 (UTC)
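
For readers unfamiliar with the CIDR address notation mentioned above, here is a minimal sketch in Python of what checking an IP address against a range involves. It uses only the standard-library ipaddress module; the helper name in_range and the addresses (taken from the reserved documentation ranges) are illustrative assumptions and have nothing to do with how the CheckUser tool itself works.

```python
import ipaddress


def in_range(ip: str, cidr: str) -> bool:
    """Return True if the address `ip` falls inside the CIDR block `cidr`.

    Hypothetical helper for illustration only.
    """
    return ipaddress.ip_address(ip) in ipaddress.ip_network(cidr)


# A /24 block fixes the first 24 bits, leaving 8 bits (256 addresses) free.
block = ipaddress.ip_network("192.0.2.0/24")
block_size = block.num_addresses

inside = in_range("192.0.2.77", "192.0.2.0/24")
outside = in_range("198.51.100.1", "192.0.2.0/24")
```

The number after the slash is the count of leading bits shared by every address in the block, so a smaller number means a larger range.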

Well, I agree that our current bureaucrats seem like less than optimal choices for the CheckUser privilege. I'll start off some nominations of people I've seen demonstrate internet savvy that I'd be comfortable with having the ability. I do not know if each of these four desire this ability/responsibility. --Connel MacKenzie T C 19:47, 16 March 2006 (UTC)

Nominations:

  • I think that the most trustworthy people on Wiktionary are our Bureaucrats, especially Eclecticology. I haven't taken a single remark from him personally, whereas every single word of criticism Mac has ever given me I have taken personally because I knew it was maliciously intended.--Primetime 20:07, 16 March 2006 (UTC)
    • Our bureaucrats, although trustworthy perhaps, are clearly not technically savvy enough to notice your abuses, Primetime. That alone is perfect justification for excluding them from consideration with this complicated tool. --Connel MacKenzie T C 20:20, 16 March 2006 (UTC)
      • I think it's ironic that you are lecturing me about abuses.--Primetime 20:26, 16 March 2006 (UTC)
    • How about User:Uncle G? He's savvy, and gets involved with CheckUser on Wikipedia (to some extent at least). Anyway, I'm going to copy this to Wiktionary:CheckUser, and advise that you do similar. --Dangherous 20:13, 16 March 2006 (UTC)
      • If he's not too busy, I'd be happy with him being given this ability here. --Connel MacKenzie T C 20:21, 16 March 2006 (UTC)

first and second definitions disappeared - question

Moved to Wiktionary:Information desk

Pronunciation Guide/Audio Files - A single location to discuss?

Moved to Wiktionary:Information desk

category:Stub and Wiktionary:Lists of words needing attention

Moved to Wiktionary:Information desk

Word of the Day

Moved to Wiktionary:Information desk

Nasimhussain

Moved to Wiktionary:Information desk

Watch list

Default preferences seem to have changed very recently such that pages you create are automatically added to your watchlist. If you don't want this to happen - go to "preferences" => "edit" and untick the box. SemperBlotto 08:39, 19 March 2006 (UTC)

Saves me asking "Why the fuck is my watchlist flooded with crap I don't wanna watch" then. Thanks SB... --Expurgator t(c) 22:39, 19 March 2006 (UTC)
There's nothing new about this option. Eclecticology 09:33, 21 March 2006 (UTC)
True. But those of us who had it turned off suddenly had it turned on. By whom, I don't know. SemperBlotto 10:27, 21 March 2006 (UTC)
Well, the old one is "watch all pages I edit" - the new alternative is "watch only the pages I *create*". I think the second one is quite new, indeed. \Mike 10:44, 21 March 2006 (UTC)

What's going on!?

I've noticed some very strange things going on here. Heavy vandalism notices, AOL users blocked, and now a secure.wikimedia.org link. What's going on!? Gerard Foley 10:59, 21 March 2006 (UTC)

Great Pronunciation Flood -- the dikes are cracking

The pronunciation flood I threatened above is now imminent. I have 1600 entries ready to go any time now, with thousands more to come. The KeffyBot successfully added its first entry (entitle) this morning. For the rest of this week, I'll continue testing the bot a word at a time and adding safety features. I should be ready to ask officially for a bot flag sometime this weekend.

In the meantime, I'd like to ask anybody who's interested in such things to check out the pages below and offer any suggestions for improvements or other feedback.

And the example words:

Most cosmetic issues can be addressed any time by redefining the template. But please make any comments that would require changing the way the bot fills in the templates soon, sometime during the next few days.

Some miscellaneous points/questions:

  • The bot as it's currently written will skip any article with an existing pronunciation. (Though I will eventually show no mercy to existing pronunciation entries that I wrote myself while impersonating an American.) In the distant future, when most articles have pronunciations, we can talk again about how best to deal with the entries that already exist.
  • I don't think it's worth having a category for all pages with synthetic pronunciations. (By summer, more than half our English articles should have them and it won't be hard to find one.) But I'd personally like to have a way, as I'm going along, to tag those synthetic pronunciations that are especially lame and should go to the front of the queue for replacement by a live human. Would {{rfap}} be the most appropriate way to do this?

Keffy 20:48, 21 March 2006 (UTC)

Keffy, please clarify: by "existing pronunciation" do you mean they have a goofy IPA entered, or that they have an .ogg file linked? Either?

It's now skipping anything that has the string "=Pronunciation=" anywhere between "==English==" and the end or the next second-level header. The existing entries are (in most cases) better than nothing, so I'm giving higher priority to articles that have nothing. Keffy 23:28, 22 March 2006 (UTC)
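
The skip rule described above could be sketched roughly as follows in Python. This is not KeffyBot's actual code, which is not shown in this discussion; the function name english_has_pronunciation and the regular expression are assumptions made for illustration.

```python
import re

# Matches a second-level header line such as "==French==",
# but not a deeper one such as "===Noun===".
LEVEL2_HEADER = re.compile(r"^==[^=\n].*?==\s*$", re.MULTILINE)


def english_has_pronunciation(wikitext: str) -> bool:
    """Return True if "=Pronunciation=" appears anywhere between
    "==English==" and the next second-level header (or the end)."""
    start = wikitext.find("==English==")
    if start == -1:
        return False  # no English section at all
    body_start = start + len("==English==")
    next_header = LEVEL2_HEADER.search(wikitext, body_start)
    end = next_header.start() if next_header else len(wikitext)
    return "=Pronunciation=" in wikitext[body_start:end]
```

A bot following this rule would pass over any page for which the function returns True and only consider the rest. Because the test is a plain substring match, it also catches deeper headers such as "====Pronunciation====".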

I think rfap is the correct template to use, if in total you are talking about a couple hundred. But then, I don't really do audio much anymore, so maybe you should ask User:Dvortygirl directly. --Connel MacKenzie T C 08:18, 22 March 2006 (UTC)

On voting already: Whoa! It's only been tested on one word! Keffy 23:28, 22 March 2006 (UTC)

Has there been any discussion on narrow vs. broad transcriptions in IPA? Neither in my speech is there any distinction between cot and caught, witch and which, or mary, merry, and marry, but I think there should be in a general pronunciation, esp. those currently labeled AHD. I like that your Canadian pronunciation is labeled (should it be more specific than Canada?), but I don't think it could be classified as a narrow transcription either since it leaves out some pedantic details, including use of a more convenient "r". I don't mind that IPA, a standard that applies to any language, is often corrupted for convenience. But considering that this dictionary is supposed to be multilingual, it would be nice to at least denote narrow transcriptions as such, in case these are ever to be compared across languages. Davilla 06:29, 23 March 2006 (UTC)

Please, don't restart the "r" flamewar, again!
On starting the voting: 1) The voting process takes quite some time. This clearly is something that we will want flagged as a bot eventually. You are being "good" about discussing the tests as you proceed with them; explicitly linking the test items. 2) Flagging an account with "bot" status does nothing to its past contributions. All a successful bot labelling does is make it less obtrusive to Special:Recentchanges - nothing else! It does not give the account any special status. It does not make that account a uber-account or any other such nonsense, it only unclogs RecentChanges. --Connel MacKenzie T C 07:58, 23 March 2006 (UTC)
Davilla: My transcriptions are all as broad as humanly possible, which is as it should be. Dictionary pronunciation transcriptions should give the barest minimum amount of information necessary to keep the user from accidentally saying a different word than the one they intend. If anyone wants more detailed information in order to sound more native-like in a particular accent, narrow IPA isn't the best way to give it to them. Audio files are. -- Keffy 06:27, 25 March 2006 (UTC)
I don't mean to restart any wars because this could be a complete tangent and, if it concerns the format, it certainly does not pertain to the approval or disapproval of this bot. I've noticed that the French Wiktionary uses different symbols for general and regional pronunciations, with audio files applying to the latter. (See allemand or other entries.) I seem to be a latecomer in the discussion, but I had suggested doing something similar a few months ago, specifically that SAMPA be used for broad and X-SAMPA for narrow transcriptions. Although "IPA" means the latter to me, again I don't mind the corruption of IPA for convenience, so long as it is recognized and properly placed.
As it pertains to this bot, I presume you are confident that your transcriptions are broad enough to be labeled Canada rather than any specific region in Canada. Presumably many Canadians should exist who would theoretically be able to add a matching pronunciation file to every one of your additions. Then wouldn't it be redundant to provide the narrow transcription, which would override your Canadian one assuming that a broad transcription has been added by that point? If we have to wait for that time then so be it, but it seems more straightforward to just start with the narrow. Davilla 08:52, 25 March 2006 (UTC)
We should definitely have the broad/narrow discussion (especially since I suspect we're not using the terms the same way), but on one of the pronunciation pages, not here. On just the regional question: Yes, the entries reflect the pronunciations of almost all Canadian English speakers from Quebec westward (fewer in the Maritimes, fewer still in Newfoundland). No simple country tag will ever be completely accurate, but a Canada tag is closer to being true than any US or UK tag could hope.
The transcriptions and audio files are mostly applicable even to many/most American speakers. Between my experience and the fact that my accent is already damned close, I'm sure I'd be 99% accurate if I were to add US tags where I thought they should go. But I'd never know when the other 1% would happen (see, for example, the history page of capillary), and with tens of thousands of entries, even a 1% error rate is an awful lot of errors. So I'm sticking to the Canada tag and leaving tagging other regions to the people who live there. -- Keffy 19:33, 25 March 2006 (UTC)
"U.S." should really be General American aka Standard Midwestern, but I wonder how many people contributing audio files have really categorized their accents? Personally mine is GA with cot-caught merger, so I wouldn't know how to label it. Davilla 20:50, 3 April 2006 (UTC)

Okay. It looks like the KeffyBot now correctly changes what I want it to change and takes a pass on what I want it to take a pass on. (But if anyone can suggest some pathologically formatted pages that I can try to trick it with, I'd love to hear them.) I'll be adding words in small batches till bot flags are approved both here and on Commons. Vote early, vote often. Keffy 19:33, 25 March 2006 (UTC)

The page for Greek ταχύ certainly qualifies as pathologically formatted. The entire contents are embedded in a template, most of which hasn't been filled in. --EncycloPetey 21:14, 25 March 2006 (UTC)

We can do without this bot and its idiosyncratic version of IPA. If we can avoid sticking in templates, so much the better. Also, it is important to remember that in many words Canadians vary considerably in whether they follow UK or US pronunciations. Your entry for biodiversity failed to note that the vowel for the third syllable can be either /ɪ/ or /ʌɪ/. The entry for capillary should also show that two pronunciations are used. There are really too many questionable entries that can be generated by such a bot. Eclecticology 09:23, 27 March 2006 (UTC)

The bot is not generating any transcriptions at all. I am generating each and every transcription by hand. The bot is merely filling them in in the appropriate places in the articles. If you don't trust me to do transcriptions that are better than empty space, come right out and say so. There are many other ways I could be productively spending my time. -- Keffy 15:35, 27 March 2006 (UTC)
Any pronunciation made by a native speaker, regardless of where they are from, is better than no pronunciation at all. Actually, even automatically generated pronunciations that have been checked by a native speaker are better than nothing. It certainly would be much more useful to any non-native speaker than guessing and getting it wrong. Furthermore, unlike most other things that can be bot added, this is actually possible to automatically check with another bot if somebody feels like it. --Patrik Stridvall 18:28, 27 March 2006 (UTC)

Vote for approving User:KeffyBot for bot status:
  • For:
  1. --Connel MacKenzie T C 08:18, 22 March 2006 (UTC)
  2. --Patrik Stridvall 18:28, 27 March 2006 (UTC)
  3. Davilla 00:10, 2 April 2006 (UTC)
  • Against:
Comments:

Women's Football League ?

Is there one? 216.220.231.226 21:58, 21 March 2006 (UTC)

How about this one Victorian Women's Football League? Jonathan Webley 22:07, 21 March 2006 (UTC)

Is that like real women playing football? 216.220.231.226 22:09, 21 March 2006 (UTC)

I think the original poster is being facetious. We should just quietly ignore him (there's little doubt it's a male user). — Paul G 14:48, 24 March 2006 (UTC)

daltry ?

What does it mean? I requested it! --130.111.98.244 21:20, 22 March 2006 (UTC)

Speedy delete?

I followed a redlink and created an entry...before I noticed the link was red because it was misspelled. How do I request speedy delete of an monumen on wiktionary? (Sorry about that). JillianE 16:18, 23 March 2006 (UTC)

Not sure JillianE ? Sorry Waha 16:19, 23 March 2006 (UTC)

The easiest way is to include the text {{rfd}} in the entry. --Connel MacKenzie T C 17:09, 23 March 2006 (UTC)
Even better is {{delete|reason why}}. — Vildricianus 14:08, 9 April 2006 (UTC)

Adding entries in HINDI language

How can one make entries in Hindi Language using Devanagari -- the script in which Hindi is written?

If you are familiar with the keyboard sequences to enter those characters, or have somewhere to cut-and-paste the characters into the search box, you should be able to enter them. At this time, we don't have a Devanagari alphabet entered in the edit box. Can you help determine what should be there? --Connel MacKenzie T C 06:18, 24 March 2006 (UTC)
For entries in the Hindi language (as opposed to about the Hindi language) you want the Hindi Wiktionary, which I believe has help on setting up browsers for Indic scripts, if you need help using Devanagari as well. —Muke Tever 14:29, 24 March 2006 (UTC)

In case anyone is looking, the Devanagari characters have now been added to the edit tools box, so if you enable javascript for Wiktionary.org, you can just click on the characters to add them. - Taxman 13:39, 16 April 2006 (UTC)

Shortcut formats

I noticed a new shortcut CAT:UW and was immediately concerned. Does anyone else care that incorrect Wikipedia style shortcuts are being entered (i.e. the "CAT:" prefix) here? Also, "UW" would be confused for "Ultimate Wiktionary" (now WiktionaryZ) here - perhaps "WT:WARN" would have been a better choice? Or should I just shrug this off? --Connel MacKenzie T C 06:52, 24 March 2006 (UTC)

I've left this message for User:Psy guy
re: User warning templates- Interested in what you might be trying to do ? We already have a Wiktionary:Cleanup and deletion process in place. Could you please indicate why you feel it's a good idea to create another group of templates ? You might be right, or you may just be trying something out, but it would be nice to let the rest of us know. Could you please put your thinking on the talk page of Category:User warning templates--Richardb 10:08, 24 March 2006 (UTC)

I may just be "pedia-centric" but I thought CAT was the standard way to shortcut Category. If it is not, I would be more than happy to change it. CAT:WARN might be more explicit than just CAT:UW. Again, that is what it is called on WP. I think some standardization across projects is very important. I would enjoy your input. -- Psy guy 14:50, 24 March 2006 (UTC)

I'm sorry if it seems like I'm biting a new visitor. That is not my intent. What alarmed me was the use of the CAT: prefix, as we haven't relied on that (see WT:WT.) It is a bit distressing that 'pedia uses the WT: prefix to mean "Wikipedia Talk" instead of "Wiktionary". I didn't put this message on User talk:Psy guy's talk page, but instead here, because I honestly don't know if anyone else cares. That was what I hoped to find out. Do we have a prefix we'd like to use for categories? I thought the same prefix was supposed to be used no matter what the namespace, within a project, for example WT:RQ, WT:POL, WT:WSI. So categories should not have a special prefix, right? I don't know that we could convince 'pedia to standardize their shortcuts correctly to be cross-project friendly; they have been using them "wrong" for a very long time. --Connel MacKenzie T C 16:27, 24 March 2006 (UTC)
I didn't realize that Wiktionary used WT for everything. I think that is a good idea, but if it is used for everything, does it lose meaning? Of course, using the same prefix does make it easier to organize the shortcuts. I am not trying to be critical, I am just trying to think of all possibilities. -- Psy guy 17:58, 24 March 2006 (UTC)
Well, I don't think different prefixes make them easier to remember at all.  :-) And they certainly are not cross-project friendly. --Connel MacKenzie T C 18:04, 24 March 2006 (UTC)
I'm sold! :-) I have put a speedy tag on CAT:UW and created WT:WARN to replace it. Would an admin please delete the former. -- Psy guy 20:50, 24 March 2006 (UTC)

Special:Categories

Can someone perhaps improve this page, so that we have a compact TOC - A-B-C-D etc. --Richardb 11:21, 24 March 2006 (UTC)

Vlicindarius and I have added {CategoryTOC} to pretty much all of the categories that required it --Expurgator t(c) 13:50, 24 March 2006 (UTC)
Certainly the most hilarious misspelling I've seen up to now :-D . Note: I have created Template:CategoryTOC-Ru for the purpose of Russian categories. — Vildricianus 14:04, 24 March 2006 (UTC)
You should get a WT:EC- or WS:CMK-esque shortcut for yourself. They're all the rage these days. --Expurgator t(c) 14:31, 24 March 2006 (UTC)
The page Special:Categories itself, being automatically generated, can't really be improved—I don't think it has provision for by-letter lookup the way individual categories do. w:Special:Categories links to a category browser tool that might be useful. —Muke Tever 14:26, 24 March 2006 (UTC)
Whoever is responsible for the programming of the Special:Categories page, presumably they can program in an A-B-C-D type compact TOC. Anyone know how to ask for this to be done. Would be useful in Wikipedia too.--Richardb 23:53, 24 March 2006 (UTC)
Cool link! I've added it to MediaWiki:Categoriespagetext. --Connel MacKenzie T C 16:39, 24 March 2006 (UTC)

Do we need another "room" for newbies asking questions ?

The Beer Parlour gets more daunting all the time for beginners. These days it's a bit like a stranger in town walking into a new pub to find brawls and shouting going on all over the place! Enough to frighten them off.

Perhaps we should have a more sedate place for newbies to simply ask "How To" questions, without being surrounded by people trying to tear each other's head off.

Could we perhaps have a room named as sedately as "Information Desk" ?

What do you all think ?--Richardb 04:33, 25 March 2006 (UTC)

The Wikipedia Information Desk page is for general questions, not just formatting and whatnot. Perhaps we should have an equivalent page, for questions like "What does 'hrunk' mean?" --Connel MacKenzie T C 07:14, 25 March 2006 (UTC)
I would agree with anything that both helps the non-familiar AND cleans up the BP. I will even volunteer to keep track of it. :) - TheDaveRoss 07:28, 25 March 2006 (UTC)
Connel, isn't that what the tea room is for? —Muke Tever 16:04, 25 March 2006 (UTC)
No, Tea Room is generally for help on entries that exist. Information Desk is for "what does this word mean" questions. They are similar concepts, but different intended audiences. --Connel MacKenzie T C 16:16, 25 March 2006 (UTC)
The "Cognac saloon"? "Wine tavern"? "Cocktail lounge"? "Absinthe dungeon"? "Brawl bistro"? Or perhaps something with a different spelling on both sides of the Pond? Now seriously: we could have a place where general Wiktionary questions are asked and battles are fought: bot requests, proposed changes and projects, ideological revolutions, template wars etc. The other one would then involve specific, one-off questions: 'French article count' etc. — Vildricianus 16:29, 25 March 2006 (UTC)
The "Cafe"? "Coffee house"? "Salon"? "Saloon"? --EncycloPetey 21:16, 25 March 2006 (UTC)
Sounds like some people around here enjoy a little drink every now and then! ;-) -- Psy guy 03:40, 26 March 2006 (UTC)
I have started a discussion page at Information desk about what we think this should/shouldn't be about, please pitch in! - TheDaveRoss 23:37, 25 March 2006 (UTC)
Newcomers seem to always use talk:Main Page anyways, so why not officially sanction that as an information desk, and leave comments about changes to the Main Page on a different page? Davilla 17:13, 26 March 2006 (UTC)

Would this be an extension of Wiktionary:Welcome, newcomers? -- Psy guy 03:38, 26 March 2006 (UTC)

Partly yes, partly no. I am hoping that it is useful to both newcomers and long standing citizens who just have a quick question, i.e. "Is there already a template for doing hrunk?" or "How do I edit the Navigation bar which appears on every page?" Even though someone has been here a while, wiki is expansive enough and wiktionary is certainly big enough that no one knows everything, and everyone needs help sometimes. - TheDaveRoss 03:46, 26 March 2006 (UTC)

So is Wiktionary:Information desk agreed upon? If so, should we make a selection of current Beer parlour conversations and move them there in order to make a start? The following spring to mind: "French Wiktionary article count", "gibberish titles 12 March", "first and second definitions disappeared - question", "Pronunciation Guide/Audio Files - A single location to discuss?" etc. All one-off questions. — Vildricianus 14:50, 3 April 2006 (UTC)

Um, could we please match the Wikipedia name "Reference desk" for this? Good thing it hasn't gotten started yet! --Connel MacKenzie T C 18:02, 3 April 2006 (UTC)

Fair enough. I've put a proposal for some disambiguating WP:VP-like introbox on Wiktionary talk:Information desk. Please comment. — Vildricianus 18:13, 3 April 2006 (UTC)

If everyone agrees on Wiktionary:Reference desk, I will set up the page, perhaps move some of the most recent Beer parlour comments that belong there. OK? — Vildricianus 11:36, 5 April 2006 (UTC)

Urban Dictionary

Just a word of advance warning... Urban Dictionary is now out as a book, so be on the look-out for references to this book claiming that the word is in print and so must be good. — Paul G 09:31, 25 March 2006 (UTC)

Luckily tis still a dictionary, thus its headwords and use of them in examples can't be cited for CFI :p —Muke Tever 16:02, 25 March 2006 (UTC)
Does this make me a published author then? --Expurgator t(c) 16:36, 25 March 2006 (UTC)
You know maybe so, drinks on you! MGSpiller 00:34, 27 March 2006 (UTC)

Tawkerbot2 (anti vandalism bot)

This is a proposal to bring a clone of the Wikipedia bot Tawkerbot2 to automatically revert obvious blatant vandalism from Wiktionary. With various concerns on #wiktionary about Wiktionary's rising popularity and the increase of vandalism that may come with that, an automated tool on our side would be a great idea. -- Tawker 04:32, 27 March 2006 (UTC)

More information at w:User:Tawkerbot2. - dcljr 19:45, 30 March 2006 (UTC)

Support

  1. What the heck. Vandalism, bad. Less work, good. Davilla 22:38, 1 April 2006 (UTC)
  2. Page blanking auto-rollback is a very good idea. --Connel MacKenzie T C 18:08, 3 April 2006 (UTC) (Note: the bot is designed specifically for the SW type attacks, such as the ~1,200 pages blanked today that I cleaned up.)

Oppose


Comments

  • Perhaps you should wait another couple of months with this. It seems like Wiktionary is not ready/has no need for it yet. — Vildricianus 12:47, 3 April 2006 (UTC)
    • I hesitated at first too. But trying to get kinks worked out from anything automated is harder when trying to fend off a hostile party at the same time. I think waiting for our growing popularity to increase before getting a first-hand look at how it functions, might ultimately be harmful. --Connel MacKenzie T C 18:08, 3 April 2006 (UTC)

VfD for b:GAT: A Glossary of Astronomical Terms

There is a wikibook that is up for deletion where it has been strongly suggested that this content be moved to Wiktionary. My question here is: do you want it on Wiktionary? Most of the content is sub-par compared to typical Wiktionary entries, but there might be some stuff that is worth keeping. In a move to tighten up Wikibooks content standards, a policy change has made content of this nature prohibited from Wikibooks with a strong recommendation that if you wanted to create a dictionary or glossary that it should instead be a special project on Wiktionary. See b:Wikibooks:What is Wikibooks#Wikibooks is not a dictionary for policy details.

There are enough pages involved that doing a formal transwiki then deletion by admins here would be pointless if you don't want the content. Feel free to leave a message on b:Wikibooks:Votes for deletion#GAT: A Glossary of Astronomical Terms if you have comments or opinions on this topic. --Robert Horning 15:56, 27 March 2006 (UTC)

Yes, we want these, as per Wiktionary: Beer parlour archive/January-March 06#Names of Constellations and Stars. --Connel MacKenzie T C 15:59, 27 March 2006 (UTC)
I've put them at Appendix:Astronomical terms for now. And will clean up that page shortly. --Dangherous 16:00, 27 March 2006 (UTC)
Please don't dump stuff in the appendix. Ncik 02:16, 28 March 2006 (UTC)
Maybe some sort of staging sub-page? Appendix:Astronomical terms/transwiki or something so the appendix stays clean while we sort and Wiktionary-ize it all. - TheDaveRoss 02:19, 28 March 2006 (UTC)
I moved the page from Appendix to Wiktionary namespace. Ncik 02:34, 28 March 2006 (UTC)
That doesn't seem quite right - the Wiktionary namespace is for stuff dealing with how we function, the pseudo namespace Appendix: does seem better for this, even if it is intended as temporary. --Connel MacKenzie T C 14:30, 28 March 2006 (UTC)
Neither of these two seems ideal, but the Appendix really should only contain stuff that is presentable to an ordinary, unsuspecting user. The Wiktionary namespace on the other hand contains all sorts of stuff. Ncik 03:27, 29 March 2006 (UTC)
Transwiki:? - dcljr 19:40, 30 March 2006 (UTC)
Ncik, if you don't like the way the entry is formatted, clean it up - but that has nothing to do with where the entry belongs. The resulting entry belongs in the Appendix: pseudo-namespace. Perhaps as a temporary entry in the Category: namespace, but certainly not in the namespace reserved for describing and discussing Wiktionary policies. --Connel MacKenzie T C 08:18, 31 March 2006 (UTC)
I withdraw the first sentence of my last comment and replace it with "...should only contain stuff that belongs in a dictionary appendix." A general list of astronomical terms is not appendix material (but a list of planets with specifications, or a list of star signs is). Lists of that kind should be handled by categories or, as was proposed some time ago, put in a "Lexicon" namespace. I disagree that the "Wiktionary" namespace is reserved for describing and discussing Wiktionary policies. It is for all stuff that is not covered by any other namespace, in particular all sorts of request pages, as which the astronomical terms list can be interpreted as long as its contents haven't been categorised. Ncik 18:32, 2 April 2006 (UTC)

OK, so where is the list now? And, more importantly, where are all the other pages, such as big_bang_model [[11]]? And what do we do with those? Strip them down to a dictionary definition and a pointer to the hoped-for Wikipedia article? --Richardb 14:07, 4 April 2006 (UTC)


First quarter 2006 US vs. UK flamewar

I've been brave enough / arrogant enough / stupid enough to propose a DRAFT POLICY. Wiktionary:Spelling Variants in Entry Names - Draft Policy
Can we move the discussion to Wiktionary talk:Spelling Variants in Entry Names - Draft Policy? I will be moving this great chunk to an archive page of that discussion page.
--Richardb 08:19, 15 April 2006 (UTC)

initial discussion

As is the regularly recurring cycle on en.wiktionary, questions have been cropping up again recently regarding American vs. Commonwealth spellings of words.

Since my comments in the past may have been unclear, I'd like to say that as an American, I think any respectable dictionary should list the UK spellings only as errors, perhaps used in a nonce fashion for comedic or Shakesperian effect. But they would be better off deleted.

In the interest of NPOV however, I have made numerous enormous concessions in my behavior regarding the incorrect UK spellings. I do not intend to change those compromises; that is, I'm not about to start deleting the UK spellings, even though I know in my heart it is wrong to include them here, especially indicated as valid spellings.

The purpose of this thread is to revive the older discussions so that those who are new, or have otherwise missed salient points of the conversation can adapt to the current practices.

My understanding of the current practices (that I disagree with) is:

  1. Separate entries must be created for UK/CW spellings.
  2. Separate entries must be created for US spellings.
  3. Each must indicate the existence of the other in the ===Alternate spellings=== heading, before the definitions.
  4. Translations should not be duplicated, but rather limited to the "older" UK/CW spellings.

It is my hope that this quarter's discussion of the topic will not immediately revert to ad-hominem attacks, nor other such flamboyant nonsense.

Confer:

--Connel MacKenzie T C 17:09, 28 March 2006 (UTC)

I firmly support discussion of this important matter. However, it is unclear to me how you intend to bring this up as earnest dialogue, pretending to "hope" it will not result into flamboyant nonsense, while starting off with exactly such palaver about "erroneous" UK spellings. Or is this your humour again? — Vildricianus 18:43, 28 March 2006 (UTC)
In all seriousness, what is the current policy? Is Wiktionary:Policy Think Tank on American or British Spelling an accurate synopsis of where the issue stands? —Scs 20:37, 28 March 2006 (UTC)
There is no firm guideline nor policy. That makes the issue resurface regularly. --Connel MacKenzie T C 07:08, 29 March 2006 (UTC)
  • Palaver? Not at all. Commonwealth spellings are erroneous here in the US. Why would you think they are not? Sorry for generally trying to lighten the mood (especially on IRC) but I am quite serious here. --Connel MacKenzie T C 07:08, 29 March 2006 (UTC)
    • Palaver I say. — Vildricianus 09:48, 29 March 2006 (UTC)
    • You say "here in the US", but "here" for me, and indeed for many others, is not the US. I was not aware that en.wiktionary had suddenly been moved to en-us.wiktionary. Oh, wait…it hasn't. The lack of smileys in your text makes it difficult for me to tell whether you are being humorous or inflammatory. Please elucidate. HTH HAND —Phil | Talk 09:21, 29 March 2006 (UTC)

Other reference links:

--Connel MacKenzie T C 07:08, 29 March 2006 (UTC)

Here in Europe (in France for example), the spelling we learn is the UK one. The US spelling is only spelling errors that have been more or less officialised (learned is horrible for example, color?, gr(a|e)y, beurk ). English really lacks an Academy or something (that invents spellings that nobody wants to use...). Btw, all this war about spelling is the fault of the US who decided to change the spellings somewhere... (why?)
Now, the idea is: having two separated articles is stupid redundant. The definitions will be the same, the translations will be the same, it's just a spelling difference. So, the thing to do is to decide which spelling should contain the whole article, and which one should be simplified with links to the other. Google can help: the spelling that returns the most pages can be considered as the dominant one. Since most websites are American, it'll mainly be the American one, so what?
That's what we use on fr: when we have to choose between 2 spellings (the one we learnt at school, and the one proposed by the Academy), and I don't mind if my favorite spelling does not contain the main article. Kipmaster 09:42, 29 March 2006 (UTC)
BTW, "learned" is not a different spelling per se but reflects a different pronunciation — it really does have a D in America. —Muke Tever 23:41, 29 March 2006 (UTC)

Connel, you are being deliberately inflammatory or facetious, aren't you? Perhaps you can remind us when we decided on the policy that Wiktionary should carry American English only. My recollection is that we are in the business of not favouring any particular flavour of English over any other (except in postings to the Beer Parlour, of course :-) ), with the version that is posted first being the page that the "transpondian" spelling cross-refers to.

Oh, and it's "Shakespearean" or "Shakespearian", BTW; "Shakesperian" must be some US spelling I've never come across ;-) ;-P — Paul G

Yikes! Did I really type that? --Connel MacKenzie T C 21:20, 29 March 2006 (UTC)

Kipmaster, the spellings were changed by the US lexicographer Noah Webster. From the Wikipedia article: As a spelling reformer, Webster believed that English spelling rules were unnecessarily complex, so his dictionary introduced American English spellings like "color" instead of "colour," "music" instead of "musick," "wagon" instead of "waggon," "center" instead of "centre," and "honor" instead of "honour." Additionally, there are "tho" and "thru" as alternatives to "though" and "through", although US English still has "cough"/"rough"/"bough" and "height"/"weight" rather than "coff"/"ruff"/"bow" and "hite"/"wate", so Webster didn't go the whole hog and reform the entire English spelling system. — Paul G 10:50, 29 March 2006 (UTC)

The idea of Paul G to consider the article written first as the main article is also ok for me (as a substitute for the Google idea I proposed). Thanks to the histories (historys in US? :-p), we can get that information. Any objective criterion would do it in fact. We should proceed to a vote soon so that the problem is solved once and for all.
PS: "favour?" Wow, now I'm speaking half US and half UK without knowing it. Learning UK at school, and watching US on tv... Thanks Paul G for putting a name (Webster) on that! Kipmaster 11:16, 29 March 2006 (UTC)
  • I was expressing my opinion, clearly marked as such, in difference to Wiktionary practices. The tremendous British prevalence here on en.wiktionary has taken a toll, leaving en.wiktionary looking less like a dictionary and more like a joke. If it sounds inflammatory perhaps you should check your own assumptions. As I said earlier, I've no intention of going back on earlier compromises, and I am not about to start deleting CW spellings. But to arrogantly, Britishly dismiss the notion is, well, arrogant and British.
  • No, I am not about to start deleting UK/CW spellings. From an American perspective though, I feel they should be. Kipmaster raised a very very common misconception in his arguments above...that American and British terms are equivalent. Paul seems to agree with Kipcool. But I find that rarely to be the case, when the spellings differ. Note that several of the cases have been rolled back to whatever Paul thinks is the right approach (and backed up by the Comonwealth cabal.) But if you want to learn the distinction between the "pondian" versions, you'd be better off looking outside of Wiktionary.
  • That then, is the heart of what I'd like to see fixed, after discussion and possibly even a vote. I can understand the desire for reducing "duplication" only for translations; but even then I doubt such removal of content is accurate. I would like to see each entry clearly identified as invalid spellings wherever they are considered invalid. Flavor should be marked as a spelling error in CW English, while flavour should be indicated as erroneous in US English. Perhaps that marking merits a separate vote of its own? --Connel MacKenzie T C 21:20, 29 March 2006 (UTC)
    • I think I see your point, after about three re-readings of your posts and skipping the irrelevant palaver about your American perspective (meaning: both perspectives should be respected and considered in order to create a healthy balance). I more or less agree with you, yes; words with variant spellings at either side of the water usually require more than just an =Alternative spelling= header to mark this. Probably, different derived/related terms, different meanings, and accordingly, different translations are also called for. Maybe they also deserve a section explaining what differences there are in usage, history etc, if relevant. At the moment, I can't think of any such difference, but there are certainly people who can add this information. — Vildricianus 07:49, 30 March 2006 (UTC)
    • Connel, I think I am broadly in agreement with you, in that I am in agreement with Vildricianus, but you must understand that, to me and others, your posting came across as (unintentionally) high-handed and disparaging. I think, on careful re-reading and re-interpretation, that what you meant is that "any respectable dictionary of American English should list the UK spellings as errors", but what you wrote initially is far removed from this meaning. I think you could have worded your posting more clearly to avoid raising people's hackles.
    • I and others have been more than happy to recognise that Wiktionary represents all varieties of English and so to be NPOV in my treatment of spelling variations. For myself, I have never asserted in my edits that UK English is somehow superior to other varieties. Naturally, being from the UK, I prefer to enter new words using UK spellings and then cross-refer other spellings to these, but I have no problem with others doing things the other way around. Incidentally, I'd be interested to find out more about this "Commonwealth cabal" that you claim exists.
    • As for marking certain spellings as erroneous in UK or US English, marking "flavour" as Commonwealth English and "flavor" as US English is sufficient, surely? Do we really need to say "this is the only spelling allowed in the UK/US and the other one is incorrect?" Incidentally, I think you'll find that spellings ending in "-our" are in fact acceptable in US English, but are little used. For example, the American Heritage Dictionary of the English Language at dictionary.com marks "flavour" as "chiefly British". Now, that might be intended to include Canada or Australia - it doesn't say - but I'm sure I've seen American dictionaries that acknowledge "-our" spellings as acceptable variants of the more common "-or" spellings. Perhaps the "-our" spellings are archaic in US English rather than erroneous. — Paul G 09:27, 30 March 2006 (UTC)

I am sorry for raising your hackles. I am glad that you re-read my initial posting and were able to realize that I wasn't suggesting that this is an American English dictionary, but rather that from my POV, those entries are wrong (as from the British POV, the correct American spellings are considered errors.)

When at work, if I type "flavour" in Microsoft word, it gets the red squiggly line underneath it. Why? Because it is a misspelling. Yes, it is recognized as British if I right-click it and look it up. But to include it would be erroneous.

As to the cabal, I was referring to center/centre when last year you did assert that everything in the correct entry center should point to centre, whilst removing the US-specific meanings from centre. I believe that individual pair has been partially corrected since then. To call the pro-British sentiment expressed at that time merely a cabal, is perhaps too kind.

Vildricianus, please confer color/colour as a good example of diverging meanings. Or dig through the history and compare center/centre.

  • Yes, center is indeed a good example. — Vildricianus 15:57, 30 March 2006 (UTC)

The points I outlined above are not mentioned anywhere as official policy. Or even semi-official. Or even as guidelines. Yet they are the current practice, right? We need a vote of some sort, or an otherwise official policy stating what is what. Currently people are left guessing. Guessing what is appropriate continues to cause problems.

--Connel MacKenzie T C 10:11, 30 March 2006 (UTC)

Agreed, and apology accepted. I hope we can get down to discussing the matter at hand now.
As far as I am aware, there is no requirement to cross-refer US spellings to UK ones. The (unwritten) policy is to write an entry in any particular type of English and then cross-refer all other variations to that entry. The fact that most contributors use UK English would mean that UK spellings would tend to get entered with US spellings having the cross-references, perhaps making it look as though new articles had to use UK spellings, which isn't the case. Such a requirement would introduce bias in favour of UK spellings, which we don't want to promote.
As for what I did with centre/center, I don't remember doing that, but maybe that was before I understood the appropriate way to treat this issue. I'm sorry if I gave the impression that "centre" had to be the "main" entry because it was a UK spelling. If it was entered before "center", then that would be why it should get the full treatment rather than because it was in one or other variety of English.
As Connel says, we certainly do need to get matters clear here and establish official, documented policy on this issue. So what do others think this policy should be? I'm keen on "what gets entered first is the main entry; others cross-refer" but this might not be sufficient as it doesn't cover shades of meaning that might exist in some Englishes but not others. — Paul G 10:50, 30 March 2006 (UTC)

translations

I think the translations, not the definitions, are the big problem, since, as people have pointed out, they might theoretically mean slightly different things depending on spelling. Since we strictly speaking should have at least three independent quotes for each sense of each spelling, we really need entries for all variants, if nothing else to show the quotes somewhere. A minor issue is what to do with misspellings that are so common that it is even possible for them to meet the criteria for inclusion.

Now back to translations: while of course the different spellings might possibly mean slightly different things, such nuances are unlikely to affect translations except in rare cases. So the big question is where the translations should be. For example colorize, colorise, colourize and colourise all translate to färglägga in Swedish and I can't imagine any language that translates them differently depending on the spelling. Now I suppose you could argue that 2 of them are misspellings but even so they seem to be widely used none the less. Still, it leaves two of them. The point is that one of the spellings really must be the main entry. Note main entry, not right entry. --Patrik Stridvall 14:21, 30 March 2006 (UTC)

Well, it's a theoretical problem, as you say, Patrik, so it's unlikely there will be many cases like this. We'll just deal with them as they arise.
Here's an example that already exists. "Program(me)" is spelled "programme" in UK English and "program" in US English. However, the only spelling in the computing sense is "program" in all varieties of English. So this requires treatment at program that will be absent from programme. We just need to ensure that we provide that treatment. I don't think it is such a big problem. — Paul G 14:58, 30 March 2006 (UTC)
Actually, I don't agree at all with the first-gets-it-all principle. Yes, that would be the easiest for us, contributors, but we're making this for the user, aren't we? I don't think anyone looking up "colorize" should be referred to "colourize" or vice versa for any information whatsoever, not even translations. Beside the fact that it's not fair whatever way you turn it, it will make the user wonder. "Do we prefer UK spellings?" - "No we don't, but the UK version got here first." Pardon my language, but this sounds like rubbish. Each entry needs the relevant information there where it belongs, not in a variant which happened to be created first. Yes, there will be duplication. But until the software can handle this properly, we'll need to balance out these entries in order to respect the N in NPOV. For the user, it doesn't help having this first-gets-it-all rule. — Vildricianus 15:57, 30 March 2006 (UTC)

more discussion

In a comment on the proposal below, Paul G wrote:

...we are aiming at doing more than that here. We want to eliminate the unintentional bias that is introduced by having a page for (say) "aeroplane" that gives full treatment of the word, and another for "airplane" that just says "See aeroplane" as if "airplane" is a mere variant rather than the US spelling of the word.

I'd like to consider the possibility of not worrying about this, after all. Educated people understand that color and colour are two variant spellings of essentially the same word. Educated people understand that no one spelling is universally "right" or "wrong" or "good" or "bad"; they're just different, that's all. Educated people understand that it's useful to have one page (not two) on which all the central, relevant information about a unique word is to be found. Educated people understand that (for the moment, at least) Mediawiki requires pages to have exactly one name. So, I assert, educated people do not see any bias when color redirects to colour, or vice versa; all they see is the unavoidable logistical repercussion of the simple facts that spelling variants exist, and that Mediawiki is the way it is.

(Now, it's true, uneducated people might perceive the "unintentional bias". But -- and I'm not trying to be elitist or anything; this is a plain fact -- uneducated people don't use dictionaries, so let's not worry about them so much.)

Some day, perhaps, Mediawiki will have a way for one article to have two (or more) different names, with absolutely no way of telling which is the "main" or "preferred" name and which are the variants. (Wikipedia could of course use such a mechanism, too.) That's really the only solution to the "bias" problem. Until then, I suggest we agree that the superficial appearance of "bias" is one that can best be solved by user education.

Up above, Connel MacKenzie wrote:

The tremendous British prevalence here on en.wiktionary has taken a toll, leaving en.wiktionary looking less like a dictionary and more like a joke.

I'd like to challenge this assertion, too, because I just don't see it that way at all. En.wiktionary is a dictionary that is rolling up its sleeves and getting down to the business of defining words; it is wisely and maturely not getting bogged down (well, present company excluded :-) ) in internecine, relatively unimportant, utterly unwinnable arguments about The One True Official English Spelling. Guess what? The English language does not have "one true official spelling", if for no other reason than that there is no one, single body to officiate it.

I also don't see any "Comonwealth cabal", nor do I understand what Connel is referring to when he talks about "arrogantly, Britishly dismiss the notion". Sure, there are a lot of Commonwealth spellings on Wiktionary. So what? The English invented the language, after all (or, at least, their cabal conspired to get it named after them :-) ), so I really don't see the problem if their spellings get used, even in a reference work that gets read in America. There are plenty of American spellings on Wiktionary, too. Again, so what? (Emerson's quote on consistency springs to mind.)

Scs 16:13, 2 April 2006 (UTC)

Well-reasoned of you, and I'd perhaps agree, if it hadn't been for such an obvious section title. Your solution will evoke further disagreement in the future, if it could allay the current one at all. — Vildricianus 12:45, 3 April 2006 (UTC)
Eh? I didn't think I'd proposed a "solution", nor did I use a section title. Was it someone else you meant to reply to? —Scs 17:18, 3 April 2006 (UTC)
I fail to see how User:Scs' ad hominem attack is well-reasoned. I never said that en-us is "The One True Official English" and my comments very clearly, repeatedly expressed that I had no interest in trying to make such an assertion. As for his idiotic statement that there is no cabal, one needs only to look at the statistics. Again, as the only American contributor in the top 10 contributors on en.wiktionary, (starting out at en.wiktionary a year or more later than several of the others) I have perhaps made a dent. But I've been stymied several times with specious arguments such as "a UK spelling already exists" == "The One True Official English is en-uk." Or arguments such as "terms shouldn't be entered as anything other than redirects."
I still feel it is irresponsible of Wiktionary to list UK spellings such as colour, flavour or parlour without identifying them as spelling errors in American English. It is not as if we don't know they are errors. But the commonwealth cabal in place refuses to let them correctly be listed as such. I find that odd, as I'd assume those same parties would want the American spellings likewise identified as incorrect in commonwealth English.
Until we have some kind of official policy indicating that both need to be entered, and both need to be identified properly, we'll continue to have periodic flamewars on the topic. Any other approach is certain to offend one side of the pond or the other (as has been demonstrated several times in the past now.) --Connel MacKenzie T C 17:36, 3 April 2006 (UTC)
Whoah, Connel, calm down! I'm sorry you perceived an ad hominem attack, but truly, none was intended! I didn't say you said "One True Official English", and if you want to tar that statement of mine with anything, call it "hyperbolic" or a strawman, please. But at any rate, it's an objective fact: there is no one true official English, or else we wouldn't be having this debate.
Since you brought it up again, though, I'd like to ask why you're apparently so worried about asserting that words like colour are "spelling errors in American English". Would it not suffice to say that "color" is the accepted spelling in American English, and "colour" the accepted spelling in Commonwealth English, and leave it at that (i.e. and not brand either of them as "errors")? —Scs 18:40, 3 April 2006 (UTC)
Perhaps I misread your statement. I did not consider the notion that you were suggesting "user education" as this is a wiki and therefore such a thing is impossible. I'll try to remember you are not making a personal comment.  :-)
I was probably wrong to bring up the erroneous tagging again. As I said before, if I enter any British/commonwealth (is Commonwealth a proper noun?) spellings in, say, Microsoft Word, I'll be prompted to correct it. Those with the default "auto-correct" feature turned on may not even see the replacement. Now, since we aim to be correct in what we say about words, it does not make sense to lead someone on incorrectly. A visitor that arrives here and looks up the single word colour would not have any indication that what they entered is not a word in American English. Simply having a note somewhere that says it is British does not convey quite enough information to be useful. --Connel MacKenzie T C 20:49, 3 April 2006 (UTC)
In terms of the hypothetical visitor, see my new proposal below, which ends up addressing this (at least, if we can agree on having a single page for colo(u)r rather than two).
As far as Microsoft Word is concerned, and please don't take this as any kind of personal attack, I really, really don't care what it does. This is not because I'm a Microsoft basher, but simply because (as I've said before) there is no one authority on "correct" English spelling, and even if there were, it certainly would not be a software company in Redmond.
Here's a thought experiment. You're editing a manuscript (using your U.S. copy of Microsoft Word), and you happen to be including an excerpt from the Guardian Unlimited:
God, the archangel says, is also disturbed by Mr Blair's remark that while religious beliefs might colour his politics, "it's best not to take it too far".
(There's nothing special about this quote; it was just one of the first hits I got in a Google search for "Blair colour".) Now, when you type or paste in this quote, Microsoft Word is likely to give you the dread squiggly red underline for "colour". Is this a problem? Me, I don't think so. There's nothing intrinsically wrong with stringing together the six letters c o l o u r, even within the shores of the revolutionary colonies. Whatever it is that the squiggly red underline means, it is not, "Thou shalt not use this spelling; it is wrong; correct or remove it at once".
Scs 22:43, 3 April 2006 (UTC)
Well, I'm not running MS Word right now, but on Linux, I use the command "spell." When I type ctrl-D after pasting in your text, it (correctly) informs me that colour is not a word. --Connel MacKenzie T C 01:15, 4 April 2006 (UTC)
If you think that the Linux "spell" command carries any more (or less) weight here than Microsoft word, you haven't understood my point at all. At any rate, there is no dispute that "colour" is not the preferred American spelling. But that doesn't mean that it's "wrong", and it certainly doesn't mean that it's "not a word" (just as, of course, "color" is not wrong, either). —Scs 16:12, 4 April 2006 (UTC)

This is probably a naive comment of mine, but I fail to see how on earth this can be such a big deal. En.wiktionary includes both American and British English, as well as Australian, New-Zealand, South-African, Irish, Indian, Canadian and any other regional variant of English whatsoever; therefore, all words and spellings of either variant should be included, treated and valued with the same esteem, regardless of any personal affiliation or custom, in other words, with a neutral view. The problem arising out of identical translations for two different spellings must not influence our stance on this general principle, as it is clearly secondary to it. Even if we were to have our main or sole purpose to be a translating dictionary, we should respect English in all its variants, no matter what the result is for our layout, format, translations etc. I wonder which of the main contributors refuses to honour/honor this principle. — Vildricianus 18:07, 3 April 2006 (UTC)

Rather than naming more names, I'd rather work towards solidifying a policy acceptable to all. Then stating it explicitly so that it no longer resurfaces (as it does now on a regular basis.) This is perhaps the poster-child of why we should have something like Wiktionary:Votes. --Connel MacKenzie T C 20:49, 3 April 2006 (UTC)

Proposal

A header for editability. — Vildricianus

The "first-gets-it-all" principle is not ideal, but I'm not sure we have anything better right now. Avoiding duplication is important because full pages for both/all spellings quickly get out of synch (with information being added or corrected on one page only).
I proposed the following solution before, but I don't remember what became of it. I'll use "color"/"colour" as an example.
  • Have a single page called "color/colour" or "color, colour", or something similar; the name of the page lists the variations in alphabetical order, so there can be no claims of language bias (although, for most UK/US variations, this favours the US spelling).
  • Move the entire contents of color and colour into this page and format it appropriately so that meanings for a particular spelling are distinguished.
  • Turn color and colour into redirects to this page (but see the next bullet point).
  • As "color" is also a word in Latin and Spanish, color would actually be more than a redirect: the English entry would say "See color/colour", and the Latin and Spanish entries would remain on that page.
I think this is a simple solution that would clear up all the issues around spelling variations once and for all. — Paul G 16:25, 30 March 2006 (UTC)
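As a rough illustration of the naming rule in the proposal above (variants listed in alphabetical order, with the individual spellings turned into redirects), here is a minimal Python sketch. The function names and the choice of separator are assumptions made for illustration only; nothing here is an existing Wiktionary tool. The only MediaWiki feature relied on is the standard #REDIRECT wikitext.

```python
def combined_title(variants, sep="/"):
    """Join spelling variants into one page title, in alphabetical order.

    Sorting removes any choice about which spelling comes first.
    The separator is a parameter because the thread has not settled on one.
    """
    return sep.join(sorted(set(variants)))


def redirect_text(target):
    """Return MediaWiki redirect wikitext pointing at the given page."""
    return f"#REDIRECT [[{target}]]"


def plan_pages(variants, sep="/"):
    """Return the combined title and a redirect stub for each variant."""
    title = combined_title(variants, sep)
    return title, {v: redirect_text(title) for v in variants}


title, redirects = plan_pages(["colour", "color"])
print(title)                # combined page name
print(redirects["colour"])  # wikitext for the "colour" page
```

Note that this sketch treats every variant as a pure redirect. A page such as color, which also carries Latin and Spanish entries, would need a "See ..." line in its English section instead, as the last bullet point says.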
There is still the possibility to include the page color, colour (or whatever) as if it were a template: {{:color, colour}}. I don't know if it is feasible, but at least theoretically there is a possibility that one could put the common info on that page and include it in both spellings... \Mike 16:43, 30 March 2006 (UTC)
Templates would make it too difficult to change content. I read elsewhere about Paul's idea, and couldn't understand why this didn't make a resolution at the time. Sounds reasonable. — Vildricianus 16:55, 30 March 2006 (UTC)
Also, there should be a better way to name these things. Commas imply some phrase; slashes are part of the url syntax. Perhaps color;colour? Note that there's no space. — Vildricianus 09:18, 31 March 2006 (UTC)
Agreed - this was one of the things I was unhappy about with my proposal. The phrase "there, there" could be wrongly interpreted as a page giving two identical spellings of "there", for example. — Paul G 10:50, 31 March 2006 (UTC)
People here seem to believe that words which have different spellings in American and British English often have slightly different meanings because they are spelt differently. This is clearly not the case. Nuances in meaning exist due to cultural and geographical factors. And this is the case for all English words, not only those that are spelt differently. Simply tagging words with the appropriate templates (Template:US, Template:UK, etc.), as we have always done, does the job. It is obvious that a meaning tagged as AmE will be spelt the American way, one tagged as BrE the British way, and one tagged as both can be spelt either way (according to where one comes from, or wants to have one's work published). Ncik 18:12, 30 March 2006 (UTC)
Also, if what we're worried about is translations, or specifically, the problem that might arise if the Commonwealth "colour" ought to have a different translation into, say, Klingon than does the U.S. "color", it's already a much bigger problem that there might not be a single perfect translation at all, i.e. neither for colour nor color. It's often the case that a single word in one language will be translated to different words in some other language, depending on the sense in which the word is used. Any translation scheme must accommodate (or at least acknowledge) this possibility, and having done so, if it happens that some of the distinctions between translated-from senses end up being correlated with distinctions between translated-from spellings, then not only is this no problem, but it makes it even easier to document (for a particular translated-to word) which sense is being translated from.
(I suppose there's also the question of which spelling to use when translating to English. Does de.wiktionary say that Farbe is translated to color or colour? [Answer: de.wiktionary.org has Farbe → colour, but here in en.wiktionary.org we've got Farbe → color.])
Scs 20:37, 2 April 2006 (UTC)
This is true; however, we are aiming at doing more than that here. We want to eliminate the unintentional bias that is introduced by having a page for (say) "aeroplane" that gives full treatment of the word, and another for "airplane" that just says "See aeroplane" as if "airplane" is a mere variant rather than the US spelling of the word. We also want to eliminate the duplication of effort and inevitable inconsistencies that would arise if we had two (or more) pages, one for each spelling. The page "aeroplane, airplane" (or separated by whatever punctuation mark is chosen) would indicate where "aeroplane" is the correct spelling (UK, where else?) and where "airplane" is the correct spelling (US, where else?), and then give all of the definitions as they currently stand. In a few cases (such as "program"), there are senses that have only one spelling, and in this case, the senses themselves would be marked accordingly; so (excuse my concise definitions):
program, programme
  1. A series of planned events
  2. A leaflet outlining such events
  3. A TV show
  4. (always spelled program in all varieties of English) A computer program
or something like that. — Paul G 10:50, 31 March 2006 (UTC)

This bothers me. What about historical spellings of words? More to the point, what about the fact that many US spellings are in fact archaic UK spellings? We can't list all alternative spellings on the page title. In my opinion the best solution is to have duplicated information under all relevant headings – the problem of pages becoming out of synch with each other seems to me the lesser evil. Widsith 15:03, 1 April 2006 (UTC)

I don't like Paul's proposal either (for the same reason). But isn't our page layout flexible enough to incorporate all information on historical spellings on one page? This could be done under the "Etymology" header (alternatively a "Word history" header if we ever decide to create one), and by means of various usage notes and an expanded "Alternative spellings" section. Ncik 15:19, 1 April 2006 (UTC)
Why don't we just wait for the Indians to get more net active. Then clearly the commonwealth spellings will overwhelm the American spelling in numbers of users ! :-) --Richardb 00:34, 2 April 2006 (UTC)
I think the historical spellings, etc, could all be catered for in a section (maybe called "Spelling" towards the top of the page):
program, programme (as the page title)
==Spelling==
  • Program is the usual spelling in the US (and wherever else). The spelling programme is archaic (or historical, or whatever)
  • Programme is the correct spelling in the UK (and wherever else). However, in the computing sense, program is the correct UK spelling.
We could put any other information we like in there about who uses which spelling, which spellings are historical, archaic, etc, and then leave the rest of the article to give the meanings, etc.
Widsith, Ncik, do you have an alternative proposal that we could consider? — Paul G 10:05, 3 April 2006 (UTC)
Well I'm not really convinced of the need for any new proposal....as I said above I don't think it's too infeasible to keep a separate page for every current spelling, i.e. color and colour would both exist with very similar information on them (though each would have different citations reflecting the different spellings). Historical forms are a bit different, personally I'm coming round to the Word History idea but we don't need to worry about that till it's become an issue. Widsith 11:49, 3 April 2006 (UTC)

further discussion

Personally, after reconsidering this entire issue, I think it's perhaps the best solution to deal with it as we're doing now. That is, two pages for colour and color, and trying to keep them as much in sync as possible. I agree that this is not very constructive, and even though I partly like Paul's proposal, I'm not sure whether this would help our project a lot. Perhaps we first need to experiment a bit with a low-profile entry (not color/colour). — Vildricianus 12:45, 3 April 2006 (UTC)

Please also note that Paul's suggestion was tried (by user:Dmh?) with color, colour & color coloured and deleted by Ec as nonsense over a year ago. --Connel MacKenzie T C 17:10, 3 April 2006 (UTC)
Hm, why was that? — Paul G 09:44, 4 April 2006 (UTC)
As I recall, it was deleted as an abandoned/failed experiment. --Connel MacKenzie T C 16:51, 4 April 2006 (UTC)

another proposal

Here's what I would propose, for now at least. This isn't just a policy statement; it also touches on goals and explanations. But what it says isn't really very different from the status quo, as I understand it. (In other words, I'm not proposing any new policy here, mostly just restating the current one.)

That is a wildly false statement. What you propose is even worse than Paul's "partial redirect by section" proposal. What follows on here (based on your incorrect assumption that redirects are acceptable by anyone) is very well formatted, but totally unacceptable. Your choice of torch is interesting; do you realize that in America, a torch is only a wooden stick with oily rags on one end? If you tried to say "torch" in America to refer to a flashlight, you would not be understood; you'd probably be suspected of being mentally retarded. --Connel MacKenzie T C 22:50, 3 April 2006 (UTC)
I've clarified the "torch" example. (It wasn't hypothetical; it's pasted directly from torch.)
As far as the assumption that redirects are acceptable: I'm prepared to be proved wrong, but I had gotten the impression that plenty of people do find them acceptable. (Not perfect, but a decent compromise under the given constraints.) —Scs 22:56, 3 April 2006 (UTC)
Where did you get that assumption? The only person that suggests it these days is Paul; each time the topic comes up he re-suggests it innocently (sometimes suggesting that it has consensus) as if he'd never suggested it before, nor ever heard the arguments against it. It gets tiresome. Consensus has never been to use those redirects here on en.wiktionary. Some experiments with them have been made, but AFAIK, each has been undone. --Connel MacKenzie T C 01:05, 4 April 2006 (UTC)
Wow. Assuming I'm the Paul referred to, I find it extraordinary how I am being misrepresented here ("each time"; "innocently", "suggesting that [it] has consensus", "as if he'd never suggested it before nor ever heard the arguments against it"). Is that really how you see me, Connel, or are you playing it up a bit here? — Paul G 09:51, 4 April 2006 (UTC)
Paul, on this topic, it seems to me that your normal rational self takes a vacation. --Connel MacKenzie T C 16:54, 4 April 2006 (UTC)
Paul doesn't need me to defend him, but he's so polite he might not say anything at all, so let me point out that denying his rationality here is uncalled for, and could be seen as offensive. Paul's been utterly rational on this topic, it's just that he's arguing from different premises and opinions than you are. You would do well to remember that many of your opinions are just that, also.
People have been bending over backwards to assume rationality and good faith on your part in spite of the wildly provocative way (yes, it really did look that way) you opened this thread. You might think about returning the favor. —Scs 01:43, 5 April 2006 (UTC)
I do not mean to offend; I was stating my opinion as a matter of fact (that is, from my perspective, his actions and statements on this topic do not coincide with his normal behavior, polite bearing and refreshingly clever intuition.)
I opened this thread in a calm manner, compared to the two inquiries (on my and others' talk pages) that immediately preceded re-opening the topic. I have been clear from the outset that I am not ignoring the CW perspective, merely stating the inverse of it: an American perspective. In doing so, I have gotten a far too healthy dose of negative responses. I do wonder why. Perhaps it is too much to swallow when coming from the (flawed) perspective that UK English is The One True English. The only thing even mildly incitful I did, was putting "flame war" in the topic heading (but even that has proven to be partly accurate.)
It is curious that the prevailing mindset here is still not one of openness to allowing all words in all languages (as is the Wiki Way.) Isn't it clear by now, that whenever an attested alternate spelling exists, there must be two separate entries to be accurate?
--Connel MacKenzie T C 03:24, 5 April 2006 (UTC)
I guess there are different kinds of "openness". From where I sit, statements like "any respectable dictionary should list the UK spellings only as errors" and "colour is not a word" don't look very open. In what way do the arguments I've been making look not open?
With respect to having separate entries, no, it's not at all clear that "there must be two separate entries to be accurate". For example, we have one page on bald, even though it's also a completely different and unrelated word in German. (And of course this is not an isolated example; it's just the first one I thought of.)
Finally, with respect to your suggestion of a "perspective that UK English is The One True English": that would indeed be flawed, but, again, I just don't see it. I certainly don't see it in this thread, and in particular not in any attempt to (say) unify the color and colour pages. —Scs 17:00, 5 April 2006 (UTC)


I have made one little change in the described use of the "Alternative spellings" section. I've also alluded to the possibilities of combined titles (e.g. "colour/color"), and of separate entries with guaranteed-identical, transcluded content, but I don't get the impression there's consensus around those yet so I'm leaving them as ideas still under discussion.

Suggestions, criticisms, rewordings welcome. (In particular, there's probably a better taxonomic nomenclature than "variants".)

En.wiktionary is a dictionary of the English language, embracing several distinct variants such as British (or "Commonwealth") English, American English, Australian / New Zealand English, Indian English, etc.
In some cases, of course, these variants involve different words for the same idea, different meanings for the same word, different spellings for the same word, and words unique to a particular variant. (If there were no such differences, they wouldn't be variants!) Wiktionary entries must therefore be careful when defining these mixed-use words to indicate how the words are used in each variant.
When two variants have different words for the same idea, those entries should be tagged with their variant:
torch
1. a stick with a flame on one end used as a light source
2. (British, Aust) a portable source of electric light
Synonyms
* flashlight (US)
----------
flashlight
1. (US) An electrical hand-held lightsource.
Synonym
* torch (UK, Aust)
Cross-references between the other-English "translations" can be in the form of synonym lists (as in the examples above), or directly in the definition (e.g. "flashlight: (US) An electrical hand-held lightsource (a British torch).").
When two variants have different meanings for the same word, again, each sense in that word's definition should be appropriately tagged:
subway
1. (North American) underground railway.
4. (British) underground walkway, tunnel for pedestrians.
When a word is specific to one variant, it should obviously be so tagged:
godown
1. (Indian English) A building for the storage of goods; a warehouse.
The situation is trickiest when two variants have different spellings for the same word. In this case, it is preferable to collect the word's etymology, definitions, and other information in a single entry, to avoid duplication of effort, and so that translations can be consistently listed. The alternative spellings are listed in the "Alternative spellings" listing:
colour
Alternative spellings
* colour (Commonwealth English)
* color (US)
All alternatives (including that of the nominal headword) should be listed in the "Alternative spellings" section, as shown.
An unavoidable technical limitation is that any entry must have one title, which will perforce use a particular spelling. The preferred solution is to list the entire entry once under one spelling, and to use redirects to that entry from the other spelling(s). The choice of which spelling "gets" the entry, and which spellings are redirects, is almost accidental; the current practice is simply that the spelling first used when an entry is created stays with the entry, and that the later-added spellings are redirects. (This approach, though unabashedly empirical, does have a certain Solomonlike appeal to it.)
This issue has aroused considerable debate, but the contention can be minimized by observing that there is no claim or assertion of primacy or "correctness" attached to the choice of spelling of the main entry, versus the redirects. All spellings listed in the "Alternative spellings" section are equally valid in the context of their respective variants. The fact that one spelling happens to be listed in an entry's title is an artifact of Wiktionary's database architecture; it is not a value judgment.
To further reduce any appearance of bias, it has been suggested that the title of such an entry be something like "colour/color" or "colour,color" or "colo(u)r", with all individual spellings as redirects to it. This proposal is under discussion but has not reached consensus.
To reduce the appearance of bias, it would also be possible to retain one arbitrary spelling as an entry's formal title, but to list the variants in the entry's various sections:
colour
Alternative spellings
* colour (Commonwealth English)
* color (US)
Noun
colour/color
1. The spectral composition of visible light.
So that visitors unfamiliar with these issues will not perceive any unintended bias, it might be appropriate to include a templateized disclaimer at the top of multi-spelling entries:
Due to technical limitations, this entry's title uses a particular spelling, and is redirected to from other spellings. No value judgment is intended by these choices. See the "Alternative spellings" section for the list of all spellings of this word and their status.
This disclaimer is inspired by the {{lowercase}} template which Wikipedia uses for words which are supposed to start with a lower-case letter (see e.g. Wikipedia:zsync).
Finally, it is worth asking whether an enhancement to the Wikimedia software could be pursued which would enable a single entry to exist under multiple names, to completely eliminate the implication that the "main" entry uses a "preferred" spelling, or that there is anything inferior about spellings that use redirects.
(It has also been suggested that much the same effect could be achieved without any software changes, by having two or more distinct pages, each containing identical content transcluded from some central place via a template or other mechanism. This idea poses difficulties for those editing the content, and is still being discussed.)

Scs 22:26, 3 April 2006 (UTC)

  • Whoever corrected my error with the heading levels here in this BP section, thank you. --Connel MacKenzie T C 22:50, 3 April 2006 (UTC)
You're welcome. :-)
Thank you Scs. --Connel MacKenzie T C 01:05, 4 April 2006 (UTC)
  • I do not see how software could overcome the bias issue. No matter what technical solution is proposed, the result is still sub-optimal. If a user looks up the word color would we then have lots of {{PAGENAME}} tags within the page to display only "color" and not "colour"? That would then raise accusations of bias from the Commonwealth proponents, would it not? (Also, given inflected forms, I don't think such a solution is possible anyhow.)
  • Simply applying the wiki default policies here would be a monstrous improvement to this (Scs') proposal...that is, as Vild said, just have two entries.
  • For some reason, people seem also to forget why heading templates are not allowed on the English Wiktionary. If we simply had template: translations:color, colour (noun) and template: translations:color, colour (verb) then each term could re-use the other's translations, rendering them properly. Additionally, clicking on the translation section's [edit] link to the right would then properly edit the template not the entry. Any visitors editing the translations for one or the other would (correctly) then enter the translations for both (without even knowing it, presumably.)

  • Any objections to me editing color and colour to demonstrate this technique? I've put those two templates in place in preparation. --Connel MacKenzie T C 01:22, 4 April 2006 (UTC)
Please go ahead. I'm looking forward to a fresh idea to overcome the seemingly unresolvable stalemate in this matter. Ncik 02:40, 4 April 2006 (UTC)

At the same time, would anyone object if I moved this whole discussion over to Wiktionary:(Policy Think Tank on) American or British Spelling? —Scs 03:23, 4 April 2006 (UTC)

Please, not just yet. --Connel MacKenzie T C 03:48, 4 April 2006 (UTC)

Yet another proposal

Regarding the third part of User:Scs's proposal above:

Restating what I said there, I have now set up the example contentious entries color and colour in a manner I think may address all concerns. Those being (in order of importance):

  1. Redundant translations should be avoided at all costs.
    1. visiting translators may have a hard enough time with English - let's not make it impossible
    2. visiting newcomers often enter translations in one and not the other ==> very very bad
    3. keeping multiple translation lists is very difficult even for seasoned Wiktionarians
    4. synchronizing reverse/complementary translations becomes increasingly difficult
  2. The perception of a preferred spelling is not NPOV.
  3. The arbitrary choice of a preferred spelling is not NPOV.
  4. The practice of using redirects is abhorred.
  5. Dialectal entries must have the flexibility to express the distinguishing elements.
  6. As per previous Beer Parlour discussions, only the translations section merits this special treatment (the etymology of color is supposed to say that it is derived from colour, for example.)
  7. Solution must not be overwhelmingly complicated for the Wiktionarians doing the grunt-work of setting entries up "properly."
  8. Format resulting from this should conform to WT:ELE as best as possible.

What I've done is replace the two translation sections in both entries to links to the two "common" translation templates named {{translations:color, colour (noun)}} and {{translations:color, colour (verb)}}. Using section editing to edit the translation section of either color or colour will automagically edit the proper template instead.
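The shared-template arrangement described above can be illustrated with a toy model. This is only a sketch in Python, not how MediaWiki actually implements transclusion: the `pages` dictionary, the `render` and `translations` helpers, and the sample translation lines are all hypothetical stand-ins, and only the template name is taken from the discussion.

```python
import re

# Hypothetical page store (title -> wikitext). The sample translations are
# illustrative data, not the real contents of the entries.
pages = {
    "Template:translations:color, colour (noun)": "* French: couleur\n* German: Farbe",
    "color": "==Noun==\ncolor\n\n===Translations===\n{{translations:color, colour (noun)}}",
    "colour": "==Noun==\ncolour\n\n===Translations===\n{{translations:color, colour (noun)}}",
}

# Matches a simple {{name}} inclusion with no parameters and no nesting.
TRANSCLUSION = re.compile(r"\{\{([^{}]+)\}\}")


def render(title):
    """Expand each {{name}} with the text stored under 'Template:name'."""
    def expand(match):
        # Unknown templates are left as-is rather than raising an error.
        return pages.get("Template:" + match.group(1), match.group(0))
    return TRANSCLUSION.sub(expand, pages[title])


def translations(title):
    """Return the rendered text that follows the Translations heading."""
    return render(title).split("===Translations===\n", 1)[1]
```

In this model, a single edit to the template text shows up in the rendered translations of both spellings, while the rest of each entry stays free to differ. That is the property the experiment relies on; the real section-editing behaviour is not modelled here.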

Things I did wrong on this experiment:

  • I don't like the naming convention, as it is too much to type. Template: trans-color-colour (noun) and Template: trans-color-colour (verb) would have been a better choice.
  • I haven't (yet) converted the verb section to unnumbered translation style. I'm sure this will only increase in importance if this technique is pursued. Note: The "{{ttbc}}" stuff works just fine in this context.
  • I haven't experimented with putting "Return to entry color or colour" somehow at the top using "noinclude"s. Editing one of these templates leaves you at the template right now, not at the entry you started editing. Also, we may want to inform people that they must use section editing to edit translations (or edit the template directly.)
  • Care must be taken with creating the initial templates, if more than one POS section has translations entered. For now, I used Wikipedia-style disambiguation to differentiate them.

Are there any concerns not addressed by this? If this is agreeable to all, I'll run through the various lists...

--Connel MacKenzie T C 04:54, 4 April 2006 (UTC)

  • Notes:
    • Perhaps the naming could even simply be {{color-colour-verb}}, as it is obviously only for translations.
    • The mention of please use section editing could go right under the =Translations= header in the template.
    • Apart from the following concerns, this is a good proposal. I didn't know about the section editing taking you to the template.
  • Concerns:
    • This proposal doesn't take care of differences in other sections than translations. Solution: the affected entries should be expanded into more or less "complete" ones ASAP, so that few revisions/additions are necessary after the different variants are synchronized.
    • What about senses and their accompanying translations that apply to only one variant, for instance in program/programme or center/centre? We could put these right below the corresponding ones, and have an additional translation table below the template. However, this will then only be reached by editing the entire page. You may have to experiment further with disambiguated translation tables to show how it works.
Vildricianus 07:28, 4 April 2006 (UTC)
  • Heh. I thought I picked the trickiest example. OK, then, I'll try program/programme tomorrow, unless someone beats me to it. --Connel MacKenzie T C 08:39, 4 April 2006 (UTC)
Well, I must say there aren't any concerns I can think of that haven't already been raised. It seems as though this might be a productive approach. --EncycloPetey 09:33, 4 April 2006 (UTC)
  • I'll start on center/centre then gray/grey next. I'm still undecided on the naming, e.g. color-colour-verb vs. color-colour (verb) since the disambiguation hiding is turned off here - I guess it doesn't really matter? (WP-style disambiguation dictates that stuff in parentheses in an article title is hidden when displayed as a link...either I'm doing it wrong or it is turned off here.) --Connel MacKenzie T C 19:51, 4 April 2006 (UTC)
  • OK, I realized I didn't give the granularity Vild asked for with program/programme nor center/centre. I inadvertently included a superset of translations, I think. But perhaps "accidentally" having the superset works? --Connel MacKenzie T C 20:39, 4 April 2006 (UTC)
  • Comments:
Impressive! I had no idea it was possible for section editing to automatically flow through to the template contents like that.
I appreciate the work you've done and I won't try to dissuade you from further experimentation, but for the record I can't say I like the result. There are still, predominantly, two separate pages, which still strikes me -- please don't anyone take this personally -- as retarded. It's a "share the misery" approach; it's the compromise that succeeds not because it's any good, but because it's the least unpalatable.
All we're centralizing so far is the translations, and while those are significant, they're not the only or even the most important parts of the entry that are problematically duplicated, that are at risk of diverging. The definitions, examples, and etymologies are still redundant. (Although someone mentioned that an earlier discussion deemed etymologies to be necessarily distinct, for reasons that escape me.) —Scs 03:04, 5 April 2006 (UTC)
The NPOV issue arises again if the definitions are shared. Furthermore, when looking at colored vs. coloured I think you'll see dramatic differences in the definitions (etc.)
The etymology of the American terms must be different; they are, after all, bastard children of the commonwealth spellings (courtesy of Noah Webster.)
--Connel MacKenzie T C 03:28, 5 April 2006 (UTC)
See comment below. —Scs 22:08, 5 April 2006 (UTC)
The pages look very similar now because of the previous iterations of this flame war, not because they should look the same. As they are given the opportunity to diverge, they should gradually become more accurate. --Connel MacKenzie T C 03:31, 5 April 2006 (UTC)
In what way(s) are they currently inaccurate? —Scs 22:08, 5 April 2006 (UTC)
  • I disagree with you, Scs. Actually, the only things that are shared by these pages are the translations. Etymology: different; pronunciation: different; derived/related terms: different; and so forth. Certainly the definitions; when first looking at Connel's proposal I also thought about expanding this system to all corresponding sections, including defs, until I realized the definitions are even the main point of difference between these entries. Also, American articles should be written in American English, right? Even more differences.
  • Connel, American spellings on -or are historically as correct as Commonwealth -our. See Paradise Lost, there are only words on -or (IIRC). This is comparable to -ize/-ise, where the former predates (and is etymologically correcter than) the latter.
  • Up till now, I haven't got any complaints about this proposal. — Vildricianus 10:25, 5 April 2006 (UTC)
Right. So as the "-*our" pages are corrected, they will reflect that they are valid archaic/obsolete spellings (everywhere, not CW nor US) while the "-*or" pages obviously will not contain those same definition lines. --Connel MacKenzie T C 14:18, 5 April 2006 (UTC)
On reflection, some of these arguments are really not so convincing. It's been asserted that all sorts of things have to be different on the separate and differently-spelled U.S. vs. Commonwealth pages, but I'm not seeing those differences. (Perhaps this is because, as Connel suggested above, they've been artificially synchronized, although at first glance they don't look the worse for that, if so.) In particular:
  • Definitions and examples. The definitions on the color and colour pages are, as mentioned above, virtually identical. The definitions on the colored and coloured pages (which Connel suggested I look at for "dramatic differences") are virtually identical. Besides the spelling and the irrelevant divergence in senses 6 and 7, the only difference that I can see is that the colour page includes tags for countable vs. uncountable. (Am I missing something?)
  • Spelling and other usage within definitions. Now, it's certainly true that a U.S.-slanted entry for "color" is likely to use "color" and other U.S. spellings in its definitions and examples, while a U.K.-slanted definition is likely to use "colour". But this is potentially true of every single definition of any word on Wiktionary. If this is a problem, then we need separate definitions and examples for every word, or in other words, we need a separate en_uk.wiktionary.org and en_us.wiktionary.org. But if we don't need to make that split, if we can tolerate U.K. spelling and usage in a definition that might be read by an American, or American usage in a definition that might be read by the rest of the world, then I don't see why we can't tolerate such quirks on a hypothetical unified color/colour page.
  • Pronunciation. Obviously, the pronunciation of many or most words differs between Commonwealth and U.S. English. If we can capture those differences adequately on a page like father, then we don't necessarily need two separate pages to capture differences in U.S./U.K. pronunciation for color/colour.
  • Etymologies. A couple of people have asserted that the etymologies for colour and color are different. But let's look:
colour: From Old French coulour, from Latin color. In American spelling the 'u' was dropped from colour to simplify the spelling. In British spelling the 'u' remains.
color: From Latin "color" via Old French "coulour"; in U.S. spelling the 'u' was dropped from colour to conform to the word's Latin origin. In the rest of the English-speaking world the 'u' remains.
So the only significant difference is that the two pages give different reasons for why the spelling is different in the U.S.
Now, in terms of hypothetically consolidated pages, I grant that there are additional complications for the color page (which has to contend with the so-spelled Latin and Spanish words), and the tyre page (which has to contend with the city). I grant that, for some people (though I don't know how many), the transpondian spelling variations loom large, and that the appearance of stigma associated with redirects is a real issue. But for all of these other alleged reasons why colour and color have to stay on separate pages, the reasons either aren't compelling (it's demonstrably possible to minimize the differences, as color/colour shows), or the reasons extend past color/colour to suggest that we end up needing (to butcher a phrase) two separate dictionaries separated by a common language. —Scs 22:08, 5 April 2006 (UTC)
Please clarify what you are trying to say there at the end. I, for a long time, advocated using redirects (particularly for inflected forms of English words!) Many reasons exist for not using redirects here. The primary reason is that other language entries might share the same spelling. The secondary reason is the havoc caused to interwiki links. A third reason is that certain spellings are not NPOV. (I'm sure many more arguments were offered a year ago, when I started entering redirects for inflected forms of English words. I don't feel like looking them up right now.)
If en.wiktionary.org has the practice of not using redirects, then it is a very British POV to quash American English spellings. To shoe-horn multiple entries into single entries (as has been done and I repeat: has not yet been undone on these entries) is obviously not a neutral point of view. I have not suggested having separate en-us and en-uk Wiktionaries (you, from your POV have.)
The only component of the entries that is unlikely to diverge (and diverge by a lot they will!) is the translation sections. Artificially merging the entries has always been and will always be, POV. --Connel MacKenzie T C 03:09, 7 April 2006 (UTC)
I do wonder what you think my "POV" is. I think the only strong opinions I have here are that (1) having separate pages for e.g. color and colour is lame, and (2) redirects do not necessarily connote second-class status.
I haven't seen anyone trying to "quash American English spellings", and I'm certainly not trying to. (If there were to be a combined color/colour page, and if it were not called "color/colour", it would likely be under color, seeming if anything to "quash" the British spelling, since it would make more sense to share the same page with the Latin and Spanish spellings.)
When I mentioned the possibility of separate en-us and en-uk wiktionaries I was not proposing or advocating it! What I was trying to show was that many of those who insist that color and colour must remain distinct should also, by logical extension of their own arguments, find themselves requiring such a wholesale split.
And that is what I was trying to say at the end there, which I shall try to clarify. The question is, are color and colour so closely related that they deserve to be discussed on the same page, or are they so different that they require two separate pages? And if they require two separate pages, are there other pairs of related-but-not-identical words that should similarly be split?
Let's look at some other cases:
  • sewer pronunciation 1 (a system of pipes) versus sewer pronunciation 2 (one who sews). Different etymologies, different pronunciations, totally different meanings, yet they share the same page.
  • periodic etymology 1 (repeating) versus periodic etymology 2 (chemistry, per + iodic). Again, completely different etymologies, pronunciations, and meanings, yet they share the same page.
  • father. Different pronunciations in the U.K. versus the U.S., but one page.
  • subway sense 1 (underground railway) versus subway sense 4 (tunnel for pedestrians). Different meanings in the U.S. versus the U.K., but one page.
Here we have pairs of words with completely different meanings and etymologies, or significantly different pronunciations, sharing the same page. Yet color and colour, which are clearly the same word but with a minor regional spelling difference, are consigned to separate pages. Why should spelling be the difference that trumps all others?
It has been repeatedly argued here that color and colour ought to be on separate pages because, aside from their spelling, they have or ought to have significant differences in the way their pronunciations, etymologies, definitions, or examples are listed. But if there ought to be separate color and colour pages for those reasons, then by the very same arguments, there should be separate pages for U.K. vs. U.S. father, and U.S. vs. U.K. subway, and chemistry vs. common usage periodic, and the two very different senses of sewer. If U.K. versus U.S. spelling and usage matter in definitions and examples, then every word (even if it's spelled and defined the same) potentially needs separate U.K. and U.S. definitions — hence, the hypothetical, reductio ad absurdum en.uk versus en.us split.
Real dictionaries do have separate entries for different words that happen to be spelled the same, such as sewer and sewer or periodic and periodic. Real dictionaries don't tend to have separate, redundant entries for color and colour — if they list both as separate headwords, one is invariably a "see" (i.e. a redirect) to the other.
Scs 02:12, 11 April 2006 (UTC)
  • Note that at some point in the future there should be entries on colour for the Middle English and Old French words colour, just as the Latin and Spanish words color are on color. -Silence 22:43, 5 April 2006 (UTC)

Scs, entries on the English Wiktionary are distinguished by spelling - why we don't use Wikipedia-style disambiguation, I don't know. I wish we did, but we don't. With that premise in place, to not make the spelling distinction when the spelling distinction exists is POV. The "real" dictionaries you refer to are not multilingual dictionaries, but rather one or the other - the American Heritage Dictionary has an obvious bias towards the American spellings, while the Oxford English Dictionary has an obvious bias towards the British spelling. It is wrong for any Wikimedia project to adopt one bias or the other. That is the heart of neutral point of view! --Connel MacKenzie T C 02:39, 11 April 2006 (UTC)

  • Has anybody thought about using templates for the shared parts which do not depend on spelling, such as synonyms and translations?
    Template:Shared:color/colour:Synonyms Template:sh:colo(u)r:Translations or some other variant?
  • Or what about this: make the whole article a template taking the spelling differences as parameters:
    Let me make an example on colour/experiment, color/experiment, and Template:sh_colo(u)r... — Hippietrail 22:18, 11 April 2006 (UTC)
    OK some things work and some things don't work. I was going to fiddle with it to make it as good as possible but this machine is crash-prone and I know there are some other people here who will be able to fiddle with it besides me so I leave it as is so you can first see the problems before I tout it as a miracle cure. I do think it has possibilities, especially with some extra parameters. Have fun! — Hippietrail 22:51, 11 April 2006 (UTC)
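
The "whole article as a template" idea above can be illustrated outside the wiki. What follows is only a minimal sketch in Python, not the actual Template:sh_colo(u)r: the entry body, the placeholder names `headword` and `other`, and the function `render_entry` are all hypothetical, chosen purely to show one shared text being rendered once per spelling.

```python
from string import Template

# Hypothetical shared entry body. The placeholders stand for the
# spelling-dependent parts; everything else is written only once.
SHARED_ENTRY = Template(
    "$headword\n"
    "Alternative spelling: $other\n"
    "Noun: $headword (plural ${headword}s)\n"
    "Derived terms: ${headword}ful, ${headword}less"
)


def render_entry(headword: str, other: str) -> str:
    """Fill the shared body with one spelling as the headword."""
    return SHARED_ENTRY.substitute(headword=headword, other=other)


# One shared source, two rendered pages.
us_page = render_entry("color", "colour")
uk_page = render_entry("colour", "color")
print(us_page)
print()
print(uk_page)
```

The sketch also hints at the limits mentioned above: anything that is not a simple substitution of one spelling for the other (regional usage notes, quotations that must keep their original spelling) would need extra parameters.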

Ligatures

What about fetus, foetus and fœtus? Or is the ligature extinct in modern English? Jonathan Webley 06:46, 31 March 2006 (UTC)

  • Briefly: US English drops the "o" and UK English retains it; the spelling with the ligature is archaic in UK English and obsolete (or archaic? or erroneous?) in US English. — Paul G 10:50, 31 March 2006 (UTC)

Draft Policy

I've been brave enough / arrogant enough / stupid enough to propose a DRAFT POLICY. Wiktionary:Spelling Variants in Entry Names - Draft Policy
Can we move the discussion to Wiktionary talk:Spelling Variants in Entry Names - Draft Policy? I will be moving this great chunk to an archive page of that discussion page.
--Richardb 08:19, 15 April 2006 (UTC)



rhymes?

I'm noticing that very few entries have rhymes listed. A while ago I thought I remembered more did. There hasn't been any emphasis on removing them, has there? We've still got nice big lists of rhymes underneath rhymes:English. Is there any reason not to insert the appropriate rhyme: links in the pronunciation sections of individual words? —Scs 20:01, 28 March 2006 (UTC)

More likely the case is that the number of rhymes per capita has decreased as the number of "capitas" has increased. I just don't think anyone has focused on adding them, we certainly didn't decide against them or delete them to my knowledge. - TheDaveRoss 20:04, 28 March 2006 (UTC)

Thanks. Second question: would anyone object to backlinks from the various English rhyme lists to rhymes:English, and thence to Wiktionary:Rhymes? —Scs 20:14, 28 March 2006 (UTC)

I wouldn't object, but I wouldn't want to do them myself :) Perhaps templates could be written for the links, which could then be added automatically and used on new pages in the future.
I think the reason that the rhymes links aren't added is that few people think to do it, and the fact that there are relatively few pages that have these links means that people aren't reminded to do it. The rhymes links are of secondary importance rather than a necessity, so if they are missing, few people are troubled by their absence. — Paul G 10:23, 29 March 2006 (UTC)

"Translingual" header

I was wondering whether we should have this header. The only thing it can contain are definitions and some related terms and links. But pronunciations, homophones, rhymes, inflected forms, synonyms etc, quotations, anagrams, and so on usually require separate language headers anyway. Ncik 18:40, 29 March 2006 (UTC)

Numbers, symbols, elements and other terms that are universal in all languages fit nicely. --Connel MacKenzie T C 18:47, 29 March 2006 (UTC)
Yes, I would recommend it only to be used for those symbols which are not pronounced as they are written, such as abbreviations like Na or e.g., and symbols like &. Then the definition would link to, on this wiki, the English entry, and from there translations would be given. However any abbreviation pronounced as written (whether an acronym like NASA or an initialism like pH or e.g.) I would not put there, as they are essentially words whose pronunciation &c will differ from language to language. (I put initialisms here, because the pronunciation of letters differs from language to language—and where will the stress go?—and in languages that don't normally use the Latin alphabet, the pronunciation may be nontrivial to find.) —Muke Tever 23:36, 29 March 2006 (UTC)
Yes, it must be used only for symbols that are not pronounced as they are written. "Na" is a universally used symbol, not an abbreviation (even though it is derived from the Latin natrium). However, "e.g." is not translingual: the French is "par ex." (for "par exemple"). When I was in France some of my colleagues were astonished that we use Latin abbreviations for the English expressions "for example" and "that is" (French abbreviates the French equivalents). Furthermore, "e.g." is often pronounced (informally) as /i:"gi:/ ("ee jee") in the UK and perhaps elsewhere, and "i.e." is very often pronounced /aI"i:/ ("eye ee").
Yes, I know about the pronunciations of e.g.. That is why I listed it both as an example of an abbreviation pronounced as written (as "e gee"), and one not pronounced as written (as "for example"). Anyway, "Translingual" doesn't mean "in all languages"—just in multiple (what might go under ISO 639 code 'mul', I suppose). e.g. has at least English and Latin (though Latin also uses e.c. for this). —Muke Tever 22:58, 30 March 2006 (UTC)
The pronunciation of what these symbols stand for goes in the entries for what the symbols stand for. So "Na" both stands for and is pronounced "sodium" in English, "sodio" in Italian, etc, so the pronunciation and other information go under those entries.
The Translingual header is also appropriate for all those formal scientific names of taxa, which are used across most languages without italicization as foreign words would have. Thus, Echinodermata is Translingual, since it is the same name in all European scientific publications, but echinoderm is English, since it translates in other languages. --EncycloPetey 06:38, 31 March 2006 (UTC)
To what extent can given names or surnames be translingual? They're a tricky thing, partly because of borrowing and translating them. For example, John is clearly of English origin, but is also used in other languages, Dutch among them. — Vildricianus 09:10, 31 March 2006 (UTC)
Echinodermata, the whole of the Linnaean taxonomy, and later extensions to it, are Latin, just as Linnaeus wrote in. Use of such words without italicization (which is not universal, btw; cf. w:sl:Iglokožci, w:lt:Dygiaodžiai, etc.) indicates either borrowing (cf. the word in Webster 1913 as English), alternate typographical practice, or both. —Muke Tever 16:11, 31 March 2006 (UTC)
But most such names have never appeared and do not appear in Latin texts. They are used as words within the texts of whatever language is being used by that author. It is true that Linnaeus wrote in Latin, but that is because it was the language of international scientific and mathematical discourse at the time. New taxonomic names of organisms and groups continue to be published, and many of them use terms and elements that have never appeared in Latin -- I am thinking of species epithets such as yokohamaensis. To classify such words as Latin, then, and list them as such in Wiktionary is misleading. We make no distinction here between Classical Latin, medieval Latin, and nomenclatorial Latin as languages, but the three are quite dissimilar in vocabulary and very different in usage. I see nothing to be gained by listing 20th century invented words (using Latin grammatical conventions) alongside the words of Cicero and Ovid as if they belonged to the same language. Taxonomic names are strictly nouns -- strictly names of organisms and groups of those organisms. They are not used as words in the Latin language and the majority of them never have been. --EncycloPetey 05:30, 1 April 2006 (UTC)
Please check your sources before making assumptions. For example the very first Google hit for Yokohamaensis is not taxonomic at all, but a Catholic website giving the Latin name for the diocese of Yokohama. (If Karl Egger didn't coin "Yokohamaensis" for his Lexicon Nominum Locorum, it doesn't look any different from if he would have.) Latin did not die with Cicero and Ovid. It was the literary language of most of Europe through the 18th century—which was why Linnaeus was writing in it, after all. The meme that "Roman Latin is the only pure Latin" was a major factor in the decline of its use, but it is no invalidation of the vast corpus of Latin produced since the Augustan Era, and neither is there any magic dividing line that must divorce 20th and 21st century Latin neologisms from those of the 19th century, 18th century, 17th century, 16th century, or any preceding. Even for those words that have not been in use in running text, the situation is no different from a print dictionary writer-slash-language activist who coins/collects words in a minority language for modern concepts (like Egger himself!)—even if they haven't been used before, they're there for when they will be. Hence w:la:Felidae, etc., and if you like I can write more articles like w:la:tigris to make the epithets nice and unproblematically attested. —Muke Tever 15:37, 1 April 2006 (UTC)
I think you are being a bit over-generous in extending Latin's use through the 18th century. By the early 16th century, court documents and official records across Europe had switched in favor of the local vernacular over traditional Latin as the language of choice. True, the erudite elite continued to use Latin over the next couple of centuries, but it had lost considerable ground by then. While my choice of yokohamaensis may not have been the best, I can cite dozens of others (the most odious being josecuervensis) that I cannot imagine appearing outside of taxonomic circles. All too many are essentially neologistic. --EncycloPetey 15:50, 1 April 2006 (UTC)
Generous? Linnaeus' Systema Naturae is an 18th century work! —Muke Tever 19:44, 1 April 2006 (UTC)

Definite articles in place names

Some time back there was a discussion over whether Le Caire (the French name for Cairo) should be given at Caire rather than at "Le Caire". Wiktionary's policy with phrases that include an initial article ("the", "a" or "an") is to create an entry without this article but to state it with the article under the part-of-speech heading. The same applies to place names beginning with "the", so "the Netherlands" can be found at Netherlands.

However, I disagree that this should be extended to foreign place names that begin with foreign articles. French place names do not usually begin with an article, but "Le Caire" is one example where it does (there is no such place as "Caire"). "Le Havre" is another, and most English-language atlases, gazetteers and encyclopedias list it under "L", not "H". Similarly, I don't think anyone would seriously suggest that "Las Vegas" should go under "V", or "Los Angeles" and "El Alamein" under "A", or "La Spezia" under "S".

So I propose that French should not be given special treatment and that Caire be moved back to "Le Caire" (with a redirect from "Caire"). We already have La Mecque (French for "Mecca"). — Paul G 10:23, 30 March 2006 (UTC)

What about Sphinx? There are both Sphinx "name of a Greek demon" and The Sphinx "the great monument in Egypt". --Patrik Stridvall 14:32, 30 March 2006 (UTC)
I'm referring to place names with foreign articles. Both "Sphinx" and "the Sphinx" belong at Sphinx. — Paul G 14:54, 30 March 2006 (UTC)
Exactly. For instance The Hague versus Thames, even though one always says the Thames. — Vildricianus 15:48, 30 March 2006 (UTC)
Well, we also have the translations of The Sphinx. Should French be Le Sphinx? In Swedish it is even worse since it inflects in the definite to Sfinxen instead of just Sfinx which BTW is the name Oedipus uses in the Swedish translation when he talks to the Sphinx of Greek mythology. When he talks about it he uses the definite Sfinxen. Sigh. --Patrik Stridvall 19:55, 30 March 2006 (UTC)
You can clarify that with a small note within translations tables. Deviant translations are no reason for creating entries like The Sphinx. — Vildricianus 09:04, 31 March 2006 (UTC)
Hmm. Deviant translation is probably right in this case. IIRC Ancient Greek uses the definite article in front of names. So a correct translation into Swedish would probably use Sfinx everywhere since that is how Swedish normally treats names. Names don't inflect in the definite, or rather they don't unless they always do like in the case of "the Netherlands" (Swedish: Nederländerna). But in such cases the form is frozen. You can't "uninflect" it. The same goes for Sfinxen in the sense of the monument in Egypt. So I think the Swedish translation of Sphinx is correct. The big question is what the entry at Sfinxen should say? Except for the sense for "the monument in Egypt", that is. "Mistranslation of Sfinx" or what? --Patrik Stridvall 20:35, 31 March 2006 (UTC)

Does anyone have objections to moving Caire back to Le Caire? If not, I'll make this change. — Paul G 09:16, 31 March 2006 (UTC)

Incidentally, I don't intend this to be done for country names that include the definite article in other languages, such as le Royaume-Uni, la Chine, les États-Unis in French (the United Kingdom, China, the United States). These should stay at Royaume-Uni, Chine, and États-Unis respectively (which is how the entries appear in the French Wiktionary, incidentally). — Paul G 09:21, 31 March 2006 (UTC)
To make things clear: in French, most country names are used with an article, but this article is not a part of the name, and is therefore not capitalized (the same applies to le Sphinx). In Le Caire or Le Havre, Le is a part of the name, which cannot be used without it, and is capitalized (although it is still considered as an article: we don't write à Le Caire or à Le Havre, but au Caire and au Havre). Also note that most French paper dictionaries list Le Havre at Havre (Le), sometimes with an L entry redirecting to the H entry. But here, it should be Le Havre (my place of birth, by the way)... Lmaltier 17:14, 31 March 2006 (UTC)
Then these are narrow exceptions. No objections to moving it back. Davilla 17:10, 1 April 2006 (UTC)
Moved back to Le Caire. Thanks. — Paul G 10:09, 3 April 2006 (UTC)

Curly apostrophes

Are we really using curly apostrophes now (one’s)? - dcljr 19:57, 30 March 2006 (UTC)

Absolutely. We have been using it for a few months now. Ncik 01:59, 31 March 2006 (UTC)
Yes, but never in wikilink syntax and always with a redirect from the normal ASCII apostrophe. --Connel MacKenzie T C 06:15, 31 March 2006 (UTC)
Seems pretty silly to me. Let's force as many needless redirect pages as we can so that we have a prettier apostrophe. Great. - 71.254.2.52 06:45, 31 March 2006 (UTC)
Well, Ncik says "we", but, for myself, I never use them, nor do lots of other contributors. I don't see any good reason to. I can type an apostrophe directly, but I have to do more work to enter a curly/slanted apostrophe. Ncik, can you remind us why this is being done? — Paul G 09:14, 31 March 2006 (UTC)
See also User talk:Hippietrail#Replacement of apostrophes. — Vildricianus 11:09, 31 March 2006 (UTC)
As far as I remember, the main argument was that ’ is the Unicode character specifically designed to be used as the apostrophe, whereas ' is essentially an ASCII remainder which exists in Unicode for compatibility reasons. Ncik 14:22, 31 March 2006 (UTC)
Not quite true. See below.—Scs 02:35, 2 April 2006 (UTC)
Also, "'" is the only character that can be entered in the search box on a US keyboard. --Connel MacKenzie T C 15:52, 31 March 2006 (UTC)
On a computer screen, it doesn’t make much difference, but when something is copied and pasted into a document to be printed, then it is important. Prior to the last decade or so when everyone began to do his own typing, formatting and printing, the straight quote was always an indication of typewritten text (manual or electric typewriter), and the curly quote was a requirement for professional typeset material. Over the last ten to fifteen years, so much has been entered and printed by casual typists that the formal quote almost disappeared not only from English documents, but from formal texts in almost all other languages. The French guillemets « x » in well-known French magazines turned into straight American "x"; the same happened to German quotes, Dutch quotes, Russian quotes, Italian quotes, and so on. Beginning with Microsoft’s Word 97, this all began to right itself, and good word processors now automatically select the correct quotes according to the language and country, and the proper formal quotes are expected again in printed materials. So, the straight quote is fine on screen, but it’s very sloppy and unprofessional-looking for anything that someone might want to copy and print out. —Stephen 11:52, 1 April 2006 (UTC)
"Always" — I like that. Pretty difficult to get the year ’95 right. Davilla 17:05, 1 April 2006 (UTC)
Yes, there certainly is that little problem. Word gets the apostrophe in terms such as ’95 and ’til wrong, and that’s why we’re now finding ‘til ‘95 even in commercially prepared materials. —Stephen 19:48, 6 April 2006 (UTC)
dcljr's question was about apostrophes, but what just about everyone else has been talking about is quotes, and there's a difference. Ncik said, "...’ is the Unicode character specifically designed to be used as the apostrophe", but the character he used is actually U+2019, the Right Single Quotation Mark. This, on the other hand, is the Unicode character specifically originally designed to be used as the apostrophe: ʼ (it looks the same, it's true, but the code underneath is U+02BC).
I'm all for nice typographical appearance, but it has to be secondary to proper semantic encoding, database consistency, and ease of searching. If someone gets some quotes wrong, or if someone's browser can't display Unicode, that's not so bad. But when it comes to apostrophes, we're talking about the actual spelling of words. It seems to me it would cause far fewer problems to use the plain old simple ASCII apostrophe ' in actual entry names. If you don't like the way a (plain) apostrophe looks, it seems to me that's a problem to fix in your display or printing software, not by introducing extra complexity into the data, and expecting every other editor ever to go along.
It's not merely an issue of display. The ASCII apostrophe is ambiguous as an ambidextrous single quotation mark. Davilla 19:58, 3 April 2006 (UTC)
Sure, we can play games with redirects, but what a nuisance! And strictly speaking we'll need lots more of them. Right now it's redirects to it’s, but itʼs does not; if I try to look up itʼs (spelled with a Unicode apostrophe) I get the "No page with this exact title exists" page. —Scs 02:35, 2 April 2006 (UTC)
Umm you're quite wrong about U+02BC. It's called "modifier letter apostrophe" and has a special meaning. It is specifically for use with languages that have a letter which looks like an apostrophe but is a true letter rather than punctuation. Usually it represents a glottal stop but sometimes it indicates palatalization. One language I've added a few words of which should use this is Amuzgo though in that case I've decided to use the plain straight apostrophe for now. Most languages I'm aware of which use it are rare minority languages.
The character U+2019, "right single quotation mark" is the correct character for both its namesake and the apostrophe as this comment from the Unicode entry states: "this is the preferred character to use for apostrophe". See here and elsewhere... — Hippietrail 03:05, 2 April 2006 (UTC)
No, actually it's more complicated even than that. In earlier revisions of the Unicode Standard, U+02BC was the preferred character for (all) apostrophes. They changed their minds along the way for some reason. It's quite a mess really. —Scs 04:29, 2 April 2006 (UTC)
Whee! - dcljr 22:50, 2 April 2006 (UTC)
Yup. (Aren't you glad you asked? :-) ) —Scs 12:01, 3 April 2006 (UTC)
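
For anyone who wants to check which of the three look-alike characters a given page title actually contains, the names and code points can be read from the Unicode character database. This is a small illustrative sketch in Python using the standard `unicodedata` module; the list `CANDIDATES` and the helper `describe` are names made up for this example.

```python
import unicodedata

# The three characters discussed in this thread.
CANDIDATES = ["'", "\u2019", "\u02bc"]


def describe(ch: str) -> str:
    """Return the code point and official Unicode name of one character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"


descriptions = [describe(ch) for ch in CANDIDATES]
for line in descriptions:
    print(line)
# U+0027 APOSTROPHE
# U+2019 RIGHT SINGLE QUOTATION MARK
# U+02BC MODIFIER LETTER APOSTROPHE

# The strings look alike on screen but compare as different titles.
titles_differ = "it\u2019s" != "it\u02bcs"
print(titles_differ)  # True
```

The last comparison is the reason an exact-title lookup fails when the "wrong" apostrophe is typed: the software sees two distinct strings.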
Ordinary users should not be subjected to the need to distinguish different types of apostrophes. Most will only be confused by these technical details. The KISS principle has much merit. Eclecticology 02:30, 3 April 2006 (UTC)
Yes, thank you, Eclecticology. As far as I see, having entries with the "correct" apostrophe means that:
  • Contributors have to know that this is what they should be using;
  • Contributors have to find out how to enter it;
  • Numerous redirects are required;
  • Users of the dictionary entering the "straight" apostrophe will not find those entries that have been entered using only the "correct" apostrophe.
...all of which is a complete waste of time, IMO, as they come out the same on my display anyhow! If we drop this, absolutely no one, believe me, is going to criticise us for not using "curly" apostrophes, and we're going to save ourselves a lot of work. To me, it's a pedantic refinement too far. Let's keep it simple, stupid. — Paul G 09:59, 3 April 2006 (UTC)

Yes, I agree with Paul. There are too many other things to be done right now. If we ever decide we should do it the "correct" way, we can easily set a bot working and get it done overnight. — Vildricianus 12:18, 3 April 2006 (UTC)

Yes, but that ignores that entries are and have been entered using the unicode apostrophe instead of "'". Right now those entries can only be found (using the MediaWiki software) if the redirects for each are in place. --Connel MacKenzie T C 16:53, 3 April 2006 (UTC)
If someone goes through that much trouble, then they should have enough know-how to make a redirect page. What's most important is that it works. Rather than making entries consistent across the board, just keep the redirects from ASCII apostrophe to unicode apostrophe where those pages exist. Does this create any problems? In particular, how well would links work if this were carried out to completion? Davilla 19:58, 3 April 2006 (UTC)
Davilla's suggestion seems sensible, under the circumstances. There is no point undoing what has been done, provided all of the content with Unicode apostrophes is accessible using low-ASCII apostrophes too. — Paul G 09:56, 4 April 2006 (UTC)
  • This was all gone through and settled months ago. Let me respond to each point above:
    • Contributors have to know that this is what they should be using;
      Not really. We should tell them we prefer it and how to do it in the formatting howtos, but I'm not in favour of forcing anybody to do it. Those of us who care have been fixing entries for ages and will continue to do so.
    • Contributors have to find out how to enter it;
      Put it in the formatting howtos, there's no "have to".
    • Numerous redirects are required;
      No redirects are required. All titles with apostrophes should use the typewriter apostrophe. People linking to these should endeavour to pipelink printers' apostrophes but they shouldn't be forced to. We do not need to link from printers' apostrophes to typewriter apostrophes.
    • Users of the dictionary entering the "straight" apostrophe will not find those entries that have been entered using only the "correct" apostrophe
      Any entries using the printer's apostrophe should be moved to the straight apostrophe. The inverse problem will still occur but very few people will use the printers' apostrophe in the search box.
    Hippietrail 19:03, 6 April 2006 (UTC)
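
The convention described in this last reply (titles keep the typewriter apostrophe, links may display the printer's apostrophe through a pipelink) could be applied mechanically. The following Python sketch is an assumption about how a helper script might do it, not an existing bot or MediaWiki feature; `canonical_title` and `pipelink` are hypothetical names. It also folds U+02BC into the straight apostrophe, which would be wrong for languages where that character is a true letter, so a real tool would need a per-language exception.

```python
# Map apostrophe look-alikes to the ASCII apostrophe used in entry titles.
# Assumption: U+02BC is folded too, which is only safe for English entries.
_APOSTROPHE_TABLE = str.maketrans({"\u2019": "'", "\u02bc": "'"})


def canonical_title(title: str) -> str:
    """Return the entry title spelled with the straight ASCII apostrophe."""
    return title.translate(_APOSTROPHE_TABLE)


def pipelink(display: str) -> str:
    """Build wikilink text: straight-apostrophe target, original display."""
    target = canonical_title(display)
    if target == display:
        return f"[[{display}]]"
    return f"[[{target}|{display}]]"


print(canonical_title("it\u2019s"))  # it's
print(pipelink("one\u2019s"))        # [[one's|one’s]]
print(pipelink("colour"))            # [[colour]]
```

With titles normalised this way, no redirect from the curly form is strictly needed for links, although a search typed with a curly apostrophe would still miss unless the search input is normalised as well.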