Wiktionary:Beer parlour: difference between revisions

From Wiktionary, the free dictionary
Connel MacKenzie (talk | contribs)
→‎Some are being too Autocratic: They're better than everyone else.
We elect responsible people as admins and they should be trusted to use their judgement.
[[User:Jonathan Webley|Jonathan Webley]] 10:15, 29 May 2006 (UTC)

::Oh yeah. They're all perfect. Their opinions are more important than everyone else's because they have enough friends to get elected. Why let it last through a vote? The fact that most users can't even double-check the entries they deleted is irrelevant. The fact that I've seen them delete entries on sight with citations is not a sign of abuse, either.--[[user:Primetime|Primetime]] 04:34, 30 May 2006 (UTC)


:I think we should also keep a log of when [{{fullurl:Special:Listusers|username=Richardb}} administrators] [{{fullurl:naivety|diff=1061328&oldid=975040}} revert other people] who try to give a bit of consistency to our entries. [[Special:Mypage|—Vildricianus]] 11:17, 29 May 2006 ([[User:Vildricianus|U]][[User talk:Vildricianus#|T]][[Special:Contributions/Vildricianus|C]])

Revision as of 04:34, 30 May 2006

Wiktionary:Beer parlour/header

Policies in Development (Mostly)

For a full list see Category:Policies - Wiktionary Top Level or Wiktionary:Index to Policies. The most current / active are:

Summarized sections

Computing languages

Do we want to include reserved words from computing languages?

xpage was recently posted to WS:RFD (summarised below) with the objection that this is a Java class and, by another user, that this does not belong under the heading "English".

Perhaps we could include these, with the computing language as the title. Our remit is "all words in all languages", after all - as this stands, it doesn't restrict us to natural languages only. So should we include C's "void", Visual Basic's "Dim" and the like?

Note that there is precedent for this: I believe the OED includes PEEK and POKE (which are used in BASIC). That said, I also believe it gives these as verbs (as in "what value do you get if you PEEK memory location 12345?"), so they might be "English" words after all. — Paul G 09:34, 27 April 2006 (UTC)Reply

I'm broadly in favour of including such things. Though we must be careful not to get too encyclopedic - e.g. all the formats of the if (or IF) statement in FORTRAN, COBOL, BASIC, C++ etc SemperBlotto 09:38, 27 April 2006 (UTC)Reply
I would prefer restricting ourselves to natural language. peek, poke, and I'm sure many other words like goto are okay since they're used as verbs, that is, quite integrally within a sentence. They can be cited within English or other natural language texts, and are probably best placed under a broader header. Not necessarily all-caps, but I'd think all major languages accept reserved words that way, so it would be just as well to place them there. Any more than that would be opening a can of worms. I mean, do we want to start quoting lines of code? And anyways, what constitutes a legitimate language? Think of all the variants of Basic. Think of all the languages that have been around since the dawn of man, er... machine. There's so much of no import and too much gray area. Assembly shorthand, Unix commands like echo, operating system calls like gestalt, standard files like config, MUD commands like emit... where do you draw the line? Davilla 20:37, 27 April 2006 (UTC)Reply
I'll hold off with IBM System 360 machine-code mnemonics for a while then! SemperBlotto 21:14, 27 April 2006 (UTC)Reply
I would much prefer that Wiktionary allowed terms from all computing languages, but sadly, it does not. Certainly tracking syntax changes (by programming language version) would be very useful, but the English Wiktionary seems to have a hard enough time with spoken languages at this point. --Connel MacKenzie T C 22:00, 27 April 2006 (UTC)
I'm all for creating a separate Wiktionary for programming language statements, and even for all the standard library routines on all computer platforms ever. Throw in all the tags in all markup languages ever, the official names of all the Unicode characters, and all the standard Unix commands from /bin - but that stuff has nothing to do with a "normal" dictionary of a spoken language. The fact that the term "language" has other senses shouldn't throw us off course. So put in the request for a new wiktionary and you can count on my 100% support. — Hippietrail 02:20, 28 April 2006 (UTC)
In the meantime, it might be worthwhile to begin amassing a list of computer syntax terms as an appendix page (without links). That way, there would be physical evidence of (1) the extensive list of terms, (2) cross-language use, and (3) demonstration of someone's willingness to work on the project. --EncycloPetey 05:51, 28 April 2006 (UTC)
Perhaps we could start something like that, initially containing only the Wiki-relevant protocols, languages and jargon? Off the top of my head: .css, PHP, python, Perl, WikiSyntax, JavaScript, Solaris (toolserver), RedHat/Fedora Core 3 (cluster), HTML, XML, XHTML, bash, DOS .bat files, M. Perhaps only the top 200 most relevant keywords for each, for a start? --Connel MacKenzie T C 15:26, 12 May 2006 (UTC)Reply

What about "citing" a keyword in computer languages rather than computer programs? For instance, goto is a keyword in at least every version of basic that has ever existed. Same criteria: three citations in "independent" languages, e.g. different implementations of {Java} by Sun or IBM or what have you. They could go under a ==Programming== language header or something. Languages must be Turing complete. Reserved words only is probably too restrictive, but it's safer to start out strict on this one, IMO. 59.112.38.180 21:07, 19 May 2006 (UTC)Reply
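A rough sketch of how the "three independent implementations" test proposed above might be checked mechanically. Everything here is hypothetical: the function name, the (keyword, implementation) pair format, and the sample implementation names are assumptions for illustration only, and the sketch does not attempt the Turing-completeness requirement.

```python
def meets_keyword_criterion(keyword, attestations, required=3):
    """Return True if `keyword` is attested in at least `required`
    distinct implementations.

    `attestations` is an iterable of (keyword, implementation) pairs;
    this format is an assumption made for the sketch.
    """
    # Collect the distinct implementations that attest this keyword.
    implementations = {impl for word, impl in attestations if word == keyword}
    return len(implementations) >= required


# Hypothetical sample data: implementation names are placeholders.
sample = [
    ("goto", "BASIC implementation A"),
    ("goto", "BASIC implementation B"),
    ("goto", "BASIC implementation B"),  # repeat of the same implementation
    ("goto", "BASIC implementation C"),
    ("dim", "BASIC implementation A"),
]
```

Repeated attestations from the same implementation count once, which mirrors the "independent" requirement in the proposal; deciding what counts as independent would still be a judgment call for editors.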

Censorship

This is America's Sweetheart. Steven G. Brown has been going through all of my contributions and reverting them. I added quotes to two entries that I got from the OED2. A new entry I created he RFVd. I asked for an explanation on his talk page, but he gave none, just reverted. When I started restoring my quotes, he protected the pages. Is this OK? I got upset and RFVd one of his entries, and he just removed the tag and protected the page. He then said every single one of my contributions needs to be verified on the RFV page and blocked me. Is it OK to block someone you're in a content dispute with and revert all of their changes without explanation? —This unsigned comment was added by 129.82.42.77 17:56, 27 April 2006 (UTC).

Your message to me entitled "Primitive Censorship attempts" ended in the words "Now go play outside." You clearly were not looking for an explanation. All of your contributions (Special:Contributions/America's Sweetheart) focus on sexual perversion and racial denigration and add no deeper understanding or anything of any value whatsoever. Nevertheless, to be fair, I allowed one of them to stand (an extraordinarily filthy article named sperm burper) and rfv’d it so that others could have a look and give their opinions. Rather than wait to see what the community had to say, you began tagging my recent Russian pages with rfv. You began to create such a disturbance that I blocked you for one day. Frankly, I think you should be blocked for a much longer time. —Stephen 18:58, 27 April 2006 (UTC)Reply
So, the OED2 was wrong to include those quotations in their dictionary? If tagging an entry because you think it's "filthy" is fair, I'd hate to see what you think is unfair. I tagged one of your entries, and I still think it needs verification. I also spent a lot of time formatting those quotations and looking up the authors for attribution, but I guess stopping you from reverting them all is vandalism. I personally think your censorship attempts are vandalism and that you should be de-sysopped and banned permanently from editing Wiktionary.
Unsigned comment 13:10, April 27, 2006 62.7.244.103
I won't try and defend America's Sweetheart. I don't want to defend America's Sweetheart. I mean, this user was in an edit war with a sysop. And I have to believe that the sysops need to have the power to block users who act that way. But I can't defend all of Stephen's actions either. He deleted citations from Hemingway and Twain. He deleted a quotation from Gone with the Wind. I have to agree, that sounds like censorship to me. Maybe in his opinion they don't add anything of value. And if it comes down to deleting trash then sure, call it censorship or what have you, but it's got to go. This, on the other hand, seems pretty legitimate. I don't want Wiktionary to be that strongly, um... regulated. Davilla 19:51, 27 April 2006 (UTC)Reply
Since you are not a sysop, you can't see the nonsense that was added, but other sysops can. The entries that potentially could be entries that meet our criteria can by all means be reentered properly. While less restraint exists for entries that have been previously deleted, a genuine contribution (of correctly entering a term) is not at all likely to be redeleted. That is not censorship. No one is preventing that user from making their point on their own website. No one is trying to prevent that person from discussing their opinion in public. What Stephen did was quash a vandal, who judging from their IP hopping, trolling and other activities, has no intention of being a helpful contributor here. Taking the vandal's bizarre selection of (unhelpful) citations as a whole, I strongly agree with Stephen's actions. Except that I think he should have blocked longer. --Connel MacKenzie T C 20:04, 27 April 2006 (UTC)
Where are you getting all of this? Anyone can look at what I wrote by looking in the history of the pages ([1], [2], and [3]). None of the definitions were deleted. They were reverted and one was tagged (he left my uncontroversial definition, lighterman, alone [tellingly]). I also didn't know that you could read my mind and tell that I intended to stop contributing suddenly ("based on his IP hopping, blah blah blah") judging how I just got here . . . ok . . . Also, what is all this about not censoring me in public or on another website? You're censoring my contributions right now on Wiktionary.
Unsigned comment 14:56, April 27, 2006 159.61.240.143
Nice troll. Deleting your cruft here does not prevent you from presenting your particular point of view elsewhere, in an appropriate place. --Connel MacKenzie T C 21:28, 27 April 2006 (UTC)Reply
I'd have to agree. Cable can do what it wishes, but the TV networks are still considered censored. I'm not talking about the world wide web, I'm talking about censorship as the meaning applies right here on Wiktionary. And it's not necessarily wrong. Omitting some constructed languages, requiring a certain degree of attestation, not allowing certain images: these are all forms of censorship, in a way, but they are standards decided by the community. The sysops are free to pursue vandals, and I have full faith that they use their best judgement when doing so. But I am just as thankful that contributors are allowed to voice their opinions here. Davilla 21:40, 27 April 2006 (UTC)Reply
So do I understand this right? When pages and their histories are deleted, they also disappear from a user's contributions. You're saying there was a lot more than what I can see. Then I'd have to agree, America's Sweetheart should have been blocked sooner, instead of being allowed to make later, more legitimate contributions whose reversions make the sysops look guilty. Davilla 21:40, 27 April 2006 (UTC)Reply
None of my entries were deleted! Look in the deletion log: [4] None of them are mine!
I've been looking at some of the pages in question... why are the edits being reverted? One would think that a revert war would be explained on the talk page, but apparently the sysop(s) in question didn't feel like taking this responsibility —Muke Tever 00:34, 28 April 2006 (UTC)Reply
This is a general comment. I have not looked closely at the history of this incident. Wiktionary gets a great deal of traffic and those volunteers who undertake to make some sense of it and keep order must sometimes make quick judgments with regard to what is appropriate content. Please believe me when I say that the vast majority of edits that look like junk, are. That said, admins do make the occasional mistake. I don't believe that any of us seek to censor the content here.
Admins do not outrank the other contributors. We're really more like the janitors around here. The check is community trust in electing admins in the first place and discussions like this one. --Dvortygirl 06:47, 28 April 2006 (UTC)Reply
I agree with Dvortygirl. If Stephen was hasty it's because he and other admins are fed up with dealing with the quantities of junk etc. which are put on to Wiktionary. But there is a right way and a wrong way to respond to that, and the snide comments and RfVing from America's Sweetheart was not conducive to resolving the problem. Hopefully once his block has expired he will be able to make a useful contribution here. Widsith 07:03, 28 April 2006 (UTC)Reply
Perhaps someone could restore my work, and, in return I could apologize to Steven?

A few elements and factors are indispensable in the process of becoming "trusted" and less easily reverted. One of them is not applying the "eye for an eye" principle and rfv'ing a perfect entry. Another one is spelling somebody's name correctly (mine is an exception, everyone is allowed to misspell it). Other things that may help is de-redlinking one's userpage, adding a Babel or something like that, and in general, not restricting one's contributions to editing likely targets for vandalism, like fag or nigger. Certainly, this sounds hypocritical, as it is one's contributions that count, but that's the way a wiki works. If all sysops were to consider and think ten minutes before reverting (I do so at times, usually to no avail), we would need a thousand of them. — Vildricianus 10:40, 28 April 2006 (UTC)Reply

  • It is very clear that User:America's Sweetheart is, of course, a simplistic vandal. Stephen's intuition was right on the spot. The retaliatory vandalism that followed the blocking of this user (who then resorted to several other dynamic IP addresses) clearly demonstrates this. The childish page removals are even more childish than the initial questionable entries. --Connel MacKenzie T C 01:43, 29 April 2006 (UTC)
In making up my mind about America’s Sweetheart’s intentions, the biggest point is the fact that all but one of his contributions are about the most vile and dehumanizing of terms, and he seems intent on making them even more horrible. The definitions already listed under nigger, for instance, are already pretty complete, and the quotes that already appear there seem to me to be sufficient for a dictionary. I can’t see how adding "*A similar error has turned Othello..into a rank woolly-pated, thick-lipped nigger.--Hartley Coleridge Essays and marginalia (1851) I" adds anything of value to a dictionary page. It may be fit for an encyclopedia, but I don’t think we have to list every nasty cite in a dictionary. —Stephen 09:33, 29 April 2006 (UTC)Reply
By the way, these racist and perverse entries are continuations of the ones such as nigger baby that we had to deal with in early March. —Stephen 09:54, 29 April 2006 (UTC)Reply
This is a dictionary. We gather quotations, to illustrate the use of a word: both its use in grammatical and pragmatic contexts, and its use over time. Nasty words have nasty cites; that's just a fact of life. —Muke Tever 14:22, 29 April 2006 (UTC)Reply
  • BTW, MacKenzie has deleted "sperm burper" after I added three quotes to it. He didn't care that it was in an RFV or that it was proven to exist. He also reverted changes to "angel" after I proved that sense existed with three quotes. 62.118.249.75 21:13, 30 April 2006 (UTC)Reply
You didn't prove the sense existed. Two of your quotes were from the Bible and Shakespeare, neither of which used angel in any sense but the primary; and your third citation didn't even use the word. Widsith 21:19, 30 April 2006 (UTC)Reply
The quotes make very clear that they're talking about an angel in a homosexual sense. The word is a variant of ingle which is defined in the OED in the same sense. I got two of the quotes from this book which has the same interpretation of the quotes as me. 21:51, 30 April 2006 (UTC)
You've misunderstood, and horribly, the source I brought up :( The sense 'homosexual' is only one of the senses that entry mentions—not all the cites it offers are for the same sense. (It also speaks about the sense of ingle we already have, i.e. a fire[place]; and the Bible quote appears to be intended as an explanation for the source, or at least the long-standingness, of the association of homosexuals with angels.) In any case, a pun or allusion is only a suggestion of usage, and not a usage itself. —Muke Tever 23:17, 30 April 2006 (UTC)Reply
  • By the way, MacKenzie has just deleted "shit stabber", "shit on a raft", and "shit hunter", all of which were attested. It's impossible that the confluence of events is a mere coincidence. 85.31.186.86 21:55, 30 April 2006 (UTC)Reply
Of course all User:Primetime sockpuppet submissions are deleted. Got any more IP addresses for me to block? --Connel MacKenzie T C 05:55, 1 May 2006 (UTC)Reply

Look, I'm a hasty guy too. But I try my best not to delete things! (Just move them). Some of our "censors" argue that some words should not be in WikiSaurus because they don't have page entries. Then someone goes to the trouble of defining sperm burper and finding citations. This term has 36,100 Ghits. But it is deleted! I'm afraid I already have some of the "Censors" down in my book as bowdlerisers. I think this is uncalled for Censorship. I am going to put sperm burper back.--Richardb 03:54, 14 May 2006 (UTC)Reply

Richardb, please feel free to enter a legitimate entry for sperm burper (of course, with three print citations). Do remember not to restore previous iterations from the known copyright violation source User:America's Sweetheart, confirmed sockpuppet of User:Primetime; I'm pretty sure that doing so would be a direct violation of WMF policy. Please note discussions about User:Primetime here and on w:WP:AN#User:Primetime regarding the much harsher treatment his entries are receiving on Wikipedia, which is also a WMF project. --Connel MacKenzie T C 20:00, 14 May 2006 (UTC)
To me it was a valid entry. In the history it had another contributor too, with some slightly bowdlerised version. So, a valid entry could have been constructed by merging the entries. But I'm not going to bother entering a sysop war over delete/reinstate/delete/reinstate.--Richardb 12:53, 16 May 2006 (UTC)
What you are suggesting would be a "derivative work," therefore also a copyvio. --Connel MacKenzie T C 07:15, 18 May 2006 (UTC)Reply

Redirects from lower to uppercase

Relevant Policy

We've had a draft policy since 15 Apr 2006, Wiktionary:Spelling Variants in Entry Names - Draft Policy that covers this !--Richardb 03:37, 14 May 2006 (UTC)Reply

The "draft pol" discussion there, of course, is now overcome by events. Read on here, and find out why this "draft" needs massive revision, before a vote on accepting it will ever be remotely conceivable. Furthermore, that "draft" starts with a horrific error - since it was created on the sly, it does not yet reflect that it is "conflicting directly with existing practices." The comments I've made on that topic are mysteriously absent now, with the very first "rule" it suggests being the most outrageous. Any Latin-alphabet spelling difference is not acceptable as a redirect here on the English Wiktionary.
As current technical developments evolve (as discussed below) this will need further discussion and revision. Some improvements have already been made; more are imminent. This "draft" (which apparently reflects only one individual's POV,) does not acknowledge the former, nor recognize the latter. --Connel MacKenzie T C 21:07, 14 May 2006 (UTC)Reply


Is it safe to delete entries that redirect from, say, paris to Paris? — Vildricianus 18:50, 27 April 2006 (UTC)Reply

No. --Connel MacKenzie T C 19:05, 27 April 2006 (UTC)Reply
It's been quite a long time now. There are quite a few wrong redirects (especially but not only those that involve German). There also seem to be people who are either actively making new such redirects or actively wikifying uppercase words in articles when they should be pipelinking so that we can eventually clean these up. So when will "eventually" be? I think we should decide on a time, put up a warning notice between now and that time that they'll be deleted, and then get rid of them. — Hippietrail 21:37, 27 April 2006 (UTC)Reply
As time has passed, I've noticed greater variety in how mirrors and other Wikis/Wikts link here, not less. Until the MediaWiki software is fixed to correctly handle external links, we should be adding more redirects, not removing them.
I am shocked at the notion that deleting these navigation aids is somehow considered helpful, by anyone anywhere. Deleting a redirect increases the db size, while breaking links. How is that helpful? --Connel MacKenzie T C 22:05, 27 April 2006 (UTC)
That's all nice, but this here involves the other direction of redirects (read title), whose purpose I find doubtful. Who'll be ever linking to paris (ok, Kipmaster destroyed my example, say then, london)? — Vildricianus 22:12, 27 April 2006 (UTC)Reply
Well, yes. Some mirrors recognize that en.wiktionary is now case sensitive, and therefore convert all links to lowercase first character (from their site to ours!) Others, like onelook, assume we still do it the same as Wikipedia, and link directly to uppercase first characters. The variety is just as bad within Wikimedia projects, it seems. --Connel MacKenzie T C 22:31, 27 April 2006 (UTC)Reply
Mmm, that's even worse then. Bah. — Vildricianus 22:34, 27 April 2006 (UTC)Reply
I don't see any reason to be shocked. Just tell us why you think it's a bad idea. If it's because other sites are lazy about how they link to us then I think my suggestion of putting a warning at the top of every page is a good one - or do we really want to encourage sites to continue ignorantly linking to us indefinitely? Doesn't that in a way make us a slave to outside sites? — Hippietrail 22:43, 27 April 2006 (UTC)Reply
But you know very well that such links used to work correctly. --Connel MacKenzie T C 00:45, 28 April 2006 (UTC)
Yes, but that has nothing at all to do with trying to fix things. I really can't understand why you are fighting to keep things as they are instead of trying to move forward in a nice way which will lead us to a better overall situation. Well what do other people think? Especially long-time contributors - should we keep this upper to lower and lower to upper, the good ones and the broken ones, indefinitely? Forever? Or should we say - in the place we usually ask for donations and such - that we're deprecating this practice and sites linking to us have 2 months, 6 months, 12 months, whatever to comply? And if we don't want to move forward with this, why keep only half a system instead of making some bot which makes redirects for all the words which currently only have lowercase entry? I for one am quite annoyed each time it looks like a German noun or a proper noun already has an entry but the blue link turns out to be misleading due to one of these stale links. — Hippietrail 01:02, 28 April 2006 (UTC)Reply
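For a sense of scale, the bot idea mentioned above (adding redirects for every word that currently has only a lowercase entry) might start from something like the following sketch. It assumes only a plain set of page titles; the function name is hypothetical, and it does not touch the wiki or any MediaWiki API.

```python
def missing_capitalized_redirects(titles):
    """List (redirect_title, target_title) pairs for lowercase-initial
    entries whose capitalized counterpart does not exist yet."""
    existing = set(titles)
    pairs = []
    for title in sorted(existing):
        # Only entries that start with a lowercase letter are candidates.
        if not title or not title[0].islower():
            continue
        capitalized = title[0].upper() + title[1:]
        # Never propose a redirect over a page that already exists,
        # e.g. "Smith" when both "smith" and "Smith" are real entries.
        if capitalized not in existing:
            pairs.append((capitalized, title))
    return pairs
```

Note that a redirect created this way would make a later real entry at the capitalized title (a German noun or a proper noun, say) look as if it already existed, which is the misleading blue link complained about above.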
What I am saying has everything to do with fixing things! Wikipedia (AKA Big Sister) does not break external links in this manner; they do the opposite. Any site linking here reasonably expects us to follow that convention (apparently.) [Note: all mirrors are supposed to link back to us - the only ones that are having problems are the ones correctly linking back!!!] --Connel MacKenzie T C 05:37, 28 April 2006 (UTC)Reply
I'm sure there's a CSS hack to make redirects have a different color. I have that on Special:Allpages now, and am still looking for something to make it work everywhere. It also works by setting the stub threshold infinitely high, so everything looks brown except redirects. — Vildricianus 10:17, 28 April 2006 (UTC)Reply

What about redirects from upper to lowercase? Those seem even less necessary. (But if it's still policy to retain them, does that mean we should be adding one for each new entry created?) —Scs 14:23, 3 May 2006 (UTC)Reply

Ec opposed doing this for new entries; I am still unclear why. --Connel MacKenzie T C 06:12, 5 May 2006 (UTC)Reply
I don't know about Ec, but my reason would be: creating all these redirects is clearly a waste of time and database space. If users should have easy access to a differently-capitalized version of the word they were seeking (as of course they should), that's clearly a function which ought to be (and in fact for the most part already is) handled automatically by the mediawiki software, not handled manually by exhaustively creating a redirect for every single word in the dictionary.
There are three circumstances I can think of where this matters:
  1. the search box
  2. internal links (using [[]] wikilink syntax)
  3. external links (especially from mirrors)
The search box already works perfectly, as you can verify by experimenting with words like "Tamper" and "cypriot" (neither of which currently have redirects).
Internal links with the wrong case (e.g. Tamper and cypriot) don't automatically work, but it's not clear that they should. It might be nice if they could (as they do in the "caseless" Wikipedia), but of course editors can always use the pipe syntax.
Finally, and maybe I'm being too callous, I'm really not worried about redirects back to Wiktionary from mirrors. Those mirrors are mostly freeloaders who are mooching off our hard work already; why should we do even more work just to make their lives easier?
Scs 17:51, 6 May 2006 (UTC)Reply
Internal links have to treat upper- and lowercase forms differently - otherwise, once we had added smith there would be no (easy) way to add Smith. SemperBlotto 17:56, 6 May 2006 (UTC)Reply
  • Connel, considering your emotional responses you may not be aware that we don't know what you mean. Please tell us why this breaks valid mirrors and doesn't break invalid mirrors, and why there is no long-term fix or plan to create one. — Hippietrail 21:12, 6 May 2006 (UTC)Reply

I am chagrined at the notion of someone flippantly saying they don't care what work derives from Wiktionary. The point of it being GFDL is to encourage derivatives. If that isn't why you are here, then go work for one of the "closed" dictionaries and get paid for your work.

That said, there are derivatives out there that honor the GFDL, and there are derivatives that do not. The ones that do honor the GFDL have to link back here. They must, as a GFDL compliance requirement. Following the meta: directions, they will consistently link back here wrong. The mirrors that were valid in the past, linking back correctly to how we used to work, no longer work. The mirrors that "correct" entries to lower-case will randomly get the wrong result.

Just as much confusion exists within our sister projects as with external valid mirrors. The variety of when to choose upper case vs. lower case is astounding. The only project that has even tried to rectify their redirects is the English Wikipedia; I think User:Uncle G may have abandoned that effort as well. English Wikisource, English Wikinews, English Wikibooks, English Wikispecies, English Wikiquote and Wikicommons are each much worse off than English Wikipedia. Other language Wikimedia projects I can't even guess at, but I do know that many use the "visible extended interwiki" style references (e.g. fr:bon) in their translation sections. With the recent decapitalization of all Wiktionaries, many have changed the rules they follow for such redirects - some assuming lowercase, some assuming upper case, some assuming a link to [Search], etc.

With the software changes that have been made to decapitalize Wiktionary (decapitate, as I like to say,) the external links no longer work. Using redirects is the only viable work-around that I know of, to date. At this point I don't think the WM developers even acknowledge the problem. Apparently, many here don't quite get it either.

--Connel MacKenzie T C 23:10, 6 May 2006 (UTC)Reply

There are two different questions here: how bad is the problem, and what's the right fix?
  1. How bad is the problem? I'm sorry you thought I was being flippant. You're right, derived projects are important. The ones I was dismissing (and don't feel like helping) are just those that slurp a wiki's contents and redisplay it with no value added other than the negative value of a bunch of ads.

    With that said, I'm still puzzled what the problem is. Can someone give some specific examples? (The only one mentioned so far has been http://onelook.com/, and it seems to work perfectly.) The obvious way for a site mirroring our definition of "foo" to link back is to link back to "foo" -- and, similarly, for "Foo" to link back to "Foo", "fOo" to link back to "fOo", etc. Are there really sites that mirror an entry titled "foo" and say to themselves, "Since the wiki software treats initial capitalization as nonsignificant, I should link back to 'Foo'"? That takes work, quite unnecessary work; why would a site go out of its way to do something unnecessary which would only cause problems (just like this) later? (And what's such a site doing mirroring an entry titled "foo" in the first place? Shouldn't it think that the title is "Foo"? Or is that the problem, that it was once titled "Foo", and along the way, when we decapitalized, we changed it to "foo"?)
  2. What's the right fix? To my mind, the right fix is clearly not manual duplication of every entry. That's a viable workaround only if the code is inviolate. But a software fix for the broken external link problem would be trivial. I'll work on it myself (it's a perfect excuse to dive into the Mediawiki code, which I've been meaning to do) and report back later. —Scs 13:58, 7 May 2006 (UTC)Reply
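To illustrate the kind of one-point software fix being suggested here, the sketch below resolves a requested title by exact match first and only then tries the other case for the first character. This is not the MediaWiki code and not a confirmed design; the function name and the set-of-titles lookup are assumptions made purely for illustration.

```python
def resolve_title(requested, existing_titles):
    """Resolve an incoming link to a page title.

    An exact match always wins, so "smith" and "Smith" stay distinct
    entries. Only when the exact title is missing do we try the same
    title with the case of its first character flipped.
    """
    if requested in existing_titles:
        return requested
    if not requested:
        return None
    first = requested[0]
    flipped_first = first.lower() if first.isupper() else first.upper()
    candidate = flipped_first + requested[1:]
    return candidate if candidate in existing_titles else None


# Hypothetical title set for demonstration.
titles = {"smith", "Smith", "tamper", "Cypriot"}
```

With a fallback of this shape, an external link to "Tamper" or "cypriot" would land on the existing entry without any redirect page being stored, while internal links could remain strictly case-sensitive.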
I have to agree. From a software design standpoint, fixing the problem at one point is a million times simpler than creating all of these redirects. My apologies for the understatement. Consider that the software fix is a one-time solution, and the latter requires continual management and updating every time a page is created, indefinitely into the future. The second is not a viable solution and should not influence our reasoning on this topic. But the problem is the more important matter. If mirror sites may not function properly as a result, and as claimed many do not now, I would consider this to be a PRIORITY. Davilla 17:52, 8 May 2006 (UTC)Reply
  • If you are adept enough at PHP to effect the right change, then please do! BTW, doing a lookup on www.onelook.com for "dog" links one to DOG (which I just added {{see}} to). Perhaps "dictionary" is a better example, as that links to Dictionary which redirects to dictionary. Why do they do that? I assume it is to be consistent with their links to en.wikipedia.org. But my point is that if such a massively public, well-known site as onelook gets it wrong, how can we expect 500+ other mirrors to get it right? If you really need me to, I'll go to Alexa's list of whatSitesLinkThere and find more examples...but I'd rather not. --Connel MacKenzie T C 19:33, 7 May 2006 (UTC)
I'm not sure what onelook is doing, but they may not be as broken as you might think. If you search for "cypriot" there, they link to our Cypriot, and if you search for "Tamper", they link to our tamper. ("Tamper" and "cypriot" are two examples of words we don't currently have case-redirecting entries for.) Now, I see that for words we have both entries for, they do seem to always link to the capitalized one. They may be linking to the first spelling (irrespective of case) we ever added, or they may be linking to the one that's the first alphabetically. But it does appear they're taking care to link only to words we do have entries for. So they may be linking to our capitalized words not because they're stupid, but simply because we have them -- and our having them might therefore be, not a necessary fix, but rather enabling behavior!
As a test, it would be interesting to delete one of our redirects for some relatively unimportant word (say, Decile), wait a few weeks for their mirror to catch up, and see what they do.
Speaking of mirrors, we've been talking about "proper mirrors that do link back", but another important thing a proper mirror does is update itself regularly. (That is, after all, what "mirror" in this sense means; anything else is just a "snapshot", and those have way more problems than case mapping.) If we were to delete all our upper-case redirects, any proper mirror ought to catch up and fix themselves automatically on their next scan. (But it's true, sister projects with manually-composed links are another story.)
If you know of any other specific mirrors, please mention them. Besides onelook, so far the only one I've found is http://open-dictionary.com/, which seems to be only half-broken: "cypriot" finds their "Cypriot" which is a copy of ours, but "Tamper" doesn't work. (And there seems to be something broken about their links back to us in general.)
Scs 23:48, 7 May 2006 (UTC)
Well, that is very interesting. Onelook seems to be using the wrong index, as they link back to us for proteger (even though they had me generate an English-only list, so they would only get English terms.) Apparently, there are more problems than I first suspected.
Alexa seems to be reporting a tremendous number of false-positives. open-dictionary.com looks pretty broken (and seems to auto-convert the first character to uppercase.) thefreedictionary.com seems to link when there is no "better" dictionary definition elsewhere, but links "correctly." Perhaps I should check the list of mirrors on meta: instead. --Connel MacKenzie T C 02:23, 8 May 2006 (UTC)Reply
  • This Nicaraguan internet cafe is not as good as the one I was at last night so I'll be brief and maybe I even missed something above.
    • If the information on how to mirror on Meta is wrong, isn't it our duty to edit it and correct the mistakes? Isn't Meta editable by everybody just like any other Wiki?
    • If the Wiki software contains a bug or lacks a feature that means we are putting in lots of work to get mirrors working - and still with many imperfections, isn't it our duty to report that bug on http://bugzilla.wikipedia.org ?
    • Wiktionary is part of a community and we should act the part by being responsible and reporting problems to the other parts of the wider Wiki community and seeing that they are looked at. There is no need for us to be so passive and waste a lot of effort in workarounds when we can be proactive and help improve the Wiki experience for everybody. — Hippietrail 18:49, 8 May 2006 (UTC)Reply
These are excellent points HT. Perhaps I'm not seeing the forest for the trees. --Connel MacKenzie T C 19:18, 9 May 2006 (UTC)

AFAIK, there is no such list at Meta. There is this and this, but the real "mirrors and forks" list is at Wikipedia. I haven't looked very thoroughly, but IIRC, there is no list of Wiktionary-only mirrors. Anyone? —Vildricianus | t | 19:21, 8 May 2006 (UTC)Reply

Some time back, a visiting Wikipedian referred to the meta mirrors list as their starting point for finding a non-compliant mirror. I assumed that meant there was such a list. Perhaps they were just checking for Wiktionary content at each of the Wikipedia mirrors? --Connel MacKenzie T C 19:18, 9 May 2006 (UTC)Reply

Having actually thought about this a little more, my attention is drawn to MediaWiki:Newarticletext. I've been bold and added a search link that may mitigate the problem to a certain degree. Clicking on that "search" link invokes the full [Go] functionality (which includes the search logic.)

If there were some way of telling what the referring URL was, we could possibly:

  1. add an identifying id= or name= somewhere in MediaWiki:Newarticletext,
  2. detect that tag in MediaWiki:Monobook.js,
  3. add logic to check the refurl to make sure we don't repeat searching in a loop,
  4. add logic to automatically invoke the [Go] link when those conditions are met.

Should I pursue this experiment? If this works, I would have no objection to deleting all "de-capitalization" redirects. This would have the secondary benefit of offering the preload templates to internal red links. --Connel MacKenzie T C 19:18, 9 May 2006 (UTC)Reply
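
A rough sketch of the four steps listed above, for illustration only. It is not the code in anyone's monobook.js: the function name `shouldAutoSearch` and the marker id `noarticle-marker` are hypothetical. `document.referrer` is the standard browser property that exposes the referring URL, although browsers may leave it empty, in which case this sketch does nothing.

```javascript
// Hypothetical sketch: decide whether a "no such page" view should jump
// straight to the search ([Go]) behaviour. All names are illustrative.
function shouldAutoSearch(hasMarker, referrer, ownHost) {
  // Step 2: only act when the marker from the system message is present.
  if (!hasMarker) return false;
  // No referrer available: be conservative and do nothing.
  if (!referrer) return false;
  // Extract the host part of the referring URL.
  var m = /^https?:\/\/([^\/:?#]+)/i.exec(referrer);
  if (!m) return false;
  // Step 3: a referrer on our own host means an internal link, so
  // skipping it also prevents a search -> page -> search loop.
  return m[1].toLowerCase() !== ownHost.toLowerCase();
}

// Browser glue (skipped when no DOM is available).
if (typeof document !== 'undefined') {
  // Step 1 is assumed to have put an element with this hypothetical id
  // into the system message.
  var marker = document.getElementById('noarticle-marker');
  if (shouldAutoSearch(marker !== null, document.referrer, location.hostname)) {
    // Step 4: this is where the [Go] search link would be followed.
  }
}
```

With these assumptions, a visitor arriving from an external mirror would be sent on to search, while an internal red link would still show the normal page.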

Hey, Connel, what city you in? We should be sitting in the same room, so we can compare notes and work on this side by side. :-)
I just tried your Wiktionary "E-mail this user" link but you don't seem to have a confirmed e-mail address. So instead of sending you my contact info, I guess we'll have to use Vulcan Mind Meld or something.  :-) --Connel MacKenzie T C 19:53, 9 May 2006 (UTC)Reply
I'd been approaching the problem from a slightly different direction. I've got some new PHP code that's trying to do the right thing (and I got it working, too, not five minutes ago, in my own home wiki here), but the problem is that it can't inject its output quite where it wants, precisely because 'newarticletext' and 'noarticletext' are templates, not dynamically-generated.
(Did you mean MediaWiki:Newarticletext, or MediaWiki:Noarticletext? It was the page that comes up as MediaWiki:Noarticletext that I was trying to augment with a link to the other-case article, to fix broken links coming in from the outside.)
It might be possible to do some fancy programming right there in the templates (using mediawiki code, not PHP code, more or less as you suggest), but I haven't explored this yet.
I'm now going to try to ask the real mediawiki developers for a little advice. There's probably a right way and a wrong way to proceed, and they'll have a better feel for that than we will.
Scs 19:42, 9 May 2006 (UTC)
Your last point is the best...I agree completely.
Actually, it's even better than that. I posted a message to the Wikitech-l list, and one of the regulars there has already provided what I thought couldn't be done: the "fancy programming right there in the templates, using mediawiki code". See here. I haven't tested this yet, because I don't have the expression evaluation parser loaded into my home wiki. —Scs 04:12, 10 May 2006 (UTC)Reply
Yes, here on en.wikt: we've had the inelegant {{~if}} and {{if}} for a while, but those are shunned for a variety of reasons. In the last week or two, now the "#IF" magic-word exists, I've been thinking about correcting links to our templates. Unfortunately, that doesn't do a http 302 nor even a javascript redirect. And since we can't determine the referring URL, we won't know if that was intended or not (e.g. an internal link should always get the edit page with an option to redirect, while external links should just be redirected immediately.) --Connel MacKenzie T C 05:02, 10 May 2006 (UTC)Reply
As far as "New" vs. "No", I think I may have confused myself there. "New" is the result of an internal link, which is what I was testing a few minutes ago. --Connel MacKenzie T C 19:53, 9 May 2006 (UTC)Reply

technical discussion

New section for edit link...

OK, it seems clear to me now that my basic assumption was wrong. Anyone arriving at a page containing MediaWiki:Noarticletext should be redirected (if alt page exists.) Anyone arriving at a page containing MediaWiki:Newarticletext should be warned if alt pages exist.

On the Noar page, I'll add an identifier within the conditional template(s). Within Monobook.js (my own, for now) I'll trigger a redirect. --Connel MacKenzie T C 05:17, 10 May 2006 (UTC)

For those still with us -- I got the {{#ifeq:}} hack in MediaWiki:Noarticletext working as its author intended, so we shouldn't need to muck around with monobook.js or CSS after all. See cypriot, (cypriot/Cypriot) Tamper, (tamper/Tamper) and nopageatall for a demonstration. —Scs 05:37, 11 May 2006 (UTC) added regular wikilinks. --Connel MacKenzie T C 06:23, 11 May 2006 (UTC)Reply
Another example: milf/MILF/milf. Oh crap, does this mean I have to make good on my DeleteRedirectsBot idiotic promise? --Connel MacKenzie T C 06:23, 11 May 2006 (UTC)Reply
By the way, I would like to see Hippietrail's auto-redirect experiment as well... --Connel MacKenzie T C 06:25, 11 May 2006 (UTC)Reply
Now that this is "solved" for NS:0, perhaps we should let this conversation get archived, and restart a fresh discussion about policy regarding #REDIRECTs? --Connel MacKenzie T C 15:16, 12 May 2006 (UTC)Reply
Not yet, I'd say. Let's see for a while how well this works. —Vildricianus | t | 17:53, 12 May 2006 (UTC)Reply
  • Oh. I feel kind of silly now. Whether Noarticletext or Newarticletext finds a match, the redirect for both will be the same: to Special:Search/{{PAGENAME}} (or Special:Search/{{NAMESPACE}}:{{PAGENAME}}). There is no looping danger, in that case, right? If the page does not exist, the search page will open up, not a target page. If the page does exist in any capitalization form, the target will be reached without this nonsense. Forest for the trees, I tell ya. I must be ill, to have spun my wheels this much. --Connel MacKenzie T C 05:13, 13 May 2006 (UTC)
    • Actually, the choice to not do it for "Newarticletext" is so that we are able to enter upper and lower case entries for any given term. So we'll actually be "nicer" to external links than internal. Unless of course, we go berserk and add some kind of back-link (via cookie?) similar for "regular" redirect pages that when clicked on will go back to the internal link with something extra in the url (like redirect=no.)

Anyway, I have it working in my User:Connel MacKenzie/monobook.js (the top-most function in the file.) For external links, when you get to buenos Aires it auto-redirects to the search page, which zaps you directly to Buenos Aires. --Connel MacKenzie T C 07:19, 13 May 2006 (UTC)Reply
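
For illustration, the two pieces discussed in this thread (finding the other-case title, and building the Special:Search target) might look something like the sketch below. The function names and the `/wiki/` path prefix are assumptions, and this is not the function in User:Connel MacKenzie/monobook.js. It only flips the first letter, so pairs such as milf/MILF are left to the search logic itself.

```javascript
// Hypothetical helper: return the title with its first letter's case
// flipped, or null when the first character has no other case.
function otherCaseTitle(title) {
  if (title.length === 0) return null;
  var first = title.charAt(0);
  var flipped = (first === first.toUpperCase())
    ? first.toLowerCase()
    : first.toUpperCase();
  if (flipped === first) return null; // digits, punctuation, etc.
  return flipped + title.slice(1);
}

// Hypothetical helper: build a Special:Search/<title> path.
// The '/wiki/' prefix is an assumption about the site's URL layout.
function searchPath(title) {
  return '/wiki/Special:Search/' + encodeURIComponent(title.replace(/ /g, '_'));
}
```

For example, `otherCaseTitle('buenos Aires')` gives `'Buenos Aires'`, and `searchPath('Buenos Aires')` gives `'/wiki/Special:Search/Buenos_Aires'`.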


Proper nouns/place names

I've been adding some capital cities lately, and I was wondering to what extent we're going with place names. We've got most countries and capitals now, and a couple of other cities and towns, but are all place names on Earth considered to be the part of the all words of all languages statement? I think they are, but I'm not certain everyone agrees on this.

If they are, then, how are we going about categorizing them? I now see that the Category:Capital cities may not have been an excellent choice, for I think it may involve politically loaded inclusions/exclusions and therefrom resulting discussions/edit wars etc. that are better left in Wikipedia. Any thoughts? — Vildricianus 21:21, 27 April 2006 (UTC)Reply

I'm curious too, what would ideally go in such an entry? Just a short one-line listing, "a village / town / city in XXX country", with a "See also" pointing to Wikipedia? I think I'll go look at a couple Wiktionary entries and see if I can answer my own question.  :) Cheers, Eiríkr Útlendi | Tala við mig 21:28, 27 April 2006 (UTC)Reply
I see no good reason for deleting any place name that is entered, even ones as small as, say, Butetown or Denigomodu. Otherwise we'd have to draft some kind of policy saying "only towns with x number of people in are allowed for inclusion", and nobody likes making policies, do they ;). As far as adding them goes, it should be very low down on our "priority list". Category:Capital cities is a good enough category in my eyes. When, in 5 years or so down the line, we've run out of non-proper nouns to add, we'll end up creating them anyway, lol. --Dangherous 21:36, 27 April 2006 (UTC)Reply
Well, if Dunabökény can stay in here, then anything will. Take that entry as a test to the system...whatever system it may be. --Dangherous 21:46, 27 April 2006 (UTC)Reply
We might want to consider (though not necessarily right away) whether we want a single all-inclusive Category:Capital cities, or some kind of regional breakdown. I can think of several ways to do this, but then the category isn't overly large right now. --EncycloPetey 05:45, 28 April 2006 (UTC)

Etymologies and translations are two good reasons to have them, though I confess I still feel in two minds about it myself. Widsith 07:05, 28 April 2006 (UTC)Reply

That's what I thought as well, but there's not much to say in either section for less notable places, like, for instance, Big Lake, Texas. — Vildricianus 10:07, 28 April 2006 (UTC)Reply
Does anyone have a sense for what criteria are used for determining inclusions in published geographical dictionaries like Webster's? --EncycloPetey 09:18, 29 April 2006 (UTC)Reply
My gut feeling is that they have an idea of how big the book should be, and what price they can sell it for (and to whom) and include places in reverse order of size and importance until the book is "full". They probably include smaller places in the USA than in China if that is where they plan on marketing it. But our Wiki can be as big as it likes, is free, and we market to the world! SemperBlotto 10:14, 29 April 2006 (UTC)

These must be treated in a dual nature, just like given and family names. I really don't care which historical figures had the name David, and I don't really care which states in the U.S. have cities named Athens. The first is a common given name and the second is a place name. However, the Biblical figure and the city in Greece each deserve an entry. By what criteria though? The CFI currently says that names must be attributive. I've suggested before not including a place name (as a specific city or what have you) unless it has a common or non-literal translation on the other side of the world, which would indicate its importance. I'm sure "Big Lake" has a translation into Chinese, but would any Chinese person know anything about the city aside from the presumed big lake nearby? Taipei, on the other hand, isn't the most well received transliteration of the Chinese word, but it is the universally standard one. Davilla 13:36, 30 April 2006 (UTC)Reply

My view is that place names should only be included if:-

  • they have a different name in a different language. We need the name in order to show the translation.
  • they are necessarily referenced from another entry: Athenian => Athens.
    Though I worry about Leodensian => Leeds. But, I suppose if Leodensian was written in a novel I'd want to know what that meant.

Mostly that then limits us to having entries for significant places.

But WT:CFI already says something - "A name should be included if it is used attributively, with a widely-understood meaning." Perhaps it could do with a bit of updating to reflect the above though.
--Richardb 03:28, 14 May 2006 (UTC)

Wikisaurus cleanup

There is a project space established for this sort of discussion: Wiktionary talk:WikiSaurus improvements. Please try to be a bit disciplined and conduct the discussions there. I will try to move this discussion to that place, and apologise in advance if I don't do it perfectly. (And Connel, please don't kneecap me or something if I make any mistakes). It sure would help if you guys had the discipline to use such discussion places in the first place. We would have one central place to carry on the discussion, and to look back on discussions. It would also help keep Beer Parlour more manageable.--Richardb 00:59, 14 May 2006 (UTC)

Moved "en-bloc" to Wiktionary talk:WikiSaurus improvements/BP extract 29-Apr-2006- "WikiSaurus cleanup"

Numerical accuracy

We should be careful when giving approximate values in the definitions of terms. A dictionary serves to define words, but many times values are measurements, which must always be rounded at some point, and are therefore superfluous information, even if useful.

absolute zero is by definition zero on the Kelvin scale, so this is appropriate to note in the definition. However, the Celsius and Fahrenheit values are approximations since those scales are based on the freezing and boiling points of water. The values should be provided only as additional information.

A year is defined as exactly 365.25 days in scientific terms, but the more common meaning is defined astronomically. In fact any value given would not only be approximate but is in fact slowly changing over time.

I've also noted that the elements, which are defined according to atomic number, have included atomic mass in the definitions, but the latter is also a measurement. Worse, it's based on specific isotopes, so I'm not certain it's at all correct to place the information there. More descriptive definitions could include information about valence, which directly relates to the charge of the nucleus.

I'm not saying that informative descriptors should be excluded from definitions in every case. What's important is to make a clear distinction between what information defines a term, and what information can only additionally illustrate it. Davilla 23:52, 1 May 2006 (UTC)Reply

I was unable to find the previous discussion about this. The topic is which: colloquial vs. scientific? Or the accuracy of scientific measurements, as reported by Wiktionary? --Connel MacKenzie T C 06:40, 5 May 2006 (UTC)Reply
Nor can I. Maybe it was on WiktionaryZ? Anyways, are my comments out of line? Davilla 17:31, 8 May 2006 (UTC)Reply
Not at all...I am trying to suggest that this topic still needs further discussion, as it was unresolved that last time it was mentioned, IIRC. --Connel MacKenzie T C 18:40, 8 May 2006 (UTC)Reply

---- hidden between articles in view mode

This has come up often and I think I've come up with the definitive solution thanks to some help. I have modified the global monobook CSS page to hide the extra horizontal line but to increase blank space between language sections in its place. A small extra benefit is that only one blank line before and after ---- in the wiki source will result in nice formatting. If anybody thinks the amount of space needs adjustment or finds any problems please comment here. If it's controversial we may need to vote as to whether the standard is to have the extra line with a per-user option to hide it, or vice versa. Please note that skins besides monobook are entirely unchanged. If you would like this change for another skin please also comment here.

Those monobook users who would like to retain the old look can insert this code into their custom CSS file:

.ns-0 #bodyContent hr { visibility: visible }

Those who don't want extra space between language entries add this:

.ns-0 #bodyContent hr { margin-top: auto }

Hippietrail 22:28, 2 May 2006 (UTC)

Perfect! —Vildricianus 16:00, 4 May 2006 (UTC)Reply
The first three times I read this, I misunderstood it to be the inverse of what you actually wrote. Why would you blank these from the default view? I'm sorry, but that just doesn't make sense to me. --Connel MacKenzie T C 20:29, 4 May 2006 (UTC)Reply
Connel has reverted this without commenting either here, the custom CSS talk page, or my talk page. In his edit comment he merely decides: "rv HT's line-blanking thing for personal monobook.css's"
Does anybody beside Connel not like it and is anybody in favour of dropping the "be bold" guideline? — Hippietrail 20:27, 4 May 2006 (UTC)Reply

I'm sorry if you felt that I was stepping on your toes. I honestly thought what you did was a simple error when I saw it (finally) in Monobook.css. NOTE: Please see the conversation below, in the section about the broken Main Page as to why what you've proposed is not a workable solution. It still has some promise, but still has big kinks that need to be worked out. --Connel MacKenzie T C 20:35, 4 May 2006 (UTC)

It has promise, yes; where are the kinks beside on the main page? —Vildricianus 22:00, 4 May 2006 (UTC)Reply
Well, I for one (perhaps the only?) thought that this was a discussion about personal monobook.css customization from the very start. I do feel that if something is in an entry, something visible on the rendered page should correspond to it. The "kinks" I was referring to were that all HRs were made invisible by this change, not just the ones above the headings - but also the ones below. I don't think that is the desired effect, is it? --Connel MacKenzie T C 00:29, 5 May 2006 (UTC)
P.S. Evil site-wide code restore to functionality. --Connel MacKenzie T C 01:10, 5 May 2006 (UTC)Reply
Actually, maybe we never had lines underneath 3rd level headings. I thought we did, but I guess not. --Connel MacKenzie T C 01:17, 5 May 2006 (UTC)Reply
  • Ok everybody, I've cooled down again although I missed my bus before. Connel and I are still friends. I finished testing my CSS that gives the usual result for ---- on Main page, and the result I thought (almost) everybody wanted on other articles. I'm not a CSS guru though and I don't feel it's elegant enough so please take a look. The ---- is definitely visible but I'm not 100% sure the spacing is as it was before. I'd appreciate any comments.
  • By the way the line before the level-2 heading was an HR, the line after it is produced magically by the monobook core CSS and just looks like an HR. I'm not aware of any other horizontal lines anywhere but please let me know if I missed some so that I can make exceptions for those too.
  • Also, if anybody really does hate this change please share your feelings here. We can always vote on it or revert it but I was honestly under the impression that all Monobook users wanted such a fix. For now please test it though and comment. Thanks for your patience. — Hippietrail 01:29, 5 May 2006 (UTC)Reply
I don’t think the extra space with no visible line is appealing. Typographically, it doesn’t yield a sharp, clean look to the page. A better idea would be to give it the double spacing preceding the ---- just as you’ve done it, but keeping the ---- visible. That way, typing a single HR, then ----, then another HR, would produce professional-looking margins and virtual page boundaries. —Stephen 21:48, 6 May 2006 (UTC)Reply
Try putting this in your custom CSS, then refresh your cache (CTRL+F5):
.ns-0 #bodyContent hr { visibility: visible !important; }
Others please try this too and comment here. We can make whichever is more popular default, and the other optional. — Hippietrail 21:53, 6 May 2006 (UTC)Reply
Ah, much better! Thanks. —Stephen 22:42, 6 May 2006 (UTC)Reply
OK, I think perhaps the visibility:visible is more proper as default mode. —Vildricianus | t | 18:21, 7 May 2006 (UTC)Reply
If the power doesn't go out again I'm going to turn the line back on by default and update the customization page with how to hide it. Is everybody happy with the amount of space? — Hippietrail 16:50, 8 May 2006 (UTC)Reply

Stats

Hi, I was wondering about this page. How does one get it updated? I'd like to know where I am on this list. --Dangherous 18:14, 3 May 2006 (UTC)Reply

It is updated based on the DB dumps, but the person who does the updating (Author:Erik Zachte Mail:###@chello.nl (<nospam> ### = epzachte </nospam>)) seems to have skipped the last few. - TheDaveRoss 04:43, 4 May 2006 (UTC)Reply
I asked about this on #wikimedia-tech. The host name stats.wikimedia.org resolves to albert, which is part of the core server cluster. So someone there may be inclined to run it when the last of the XML dumps finish. Who knows - maybe they'll automate it. Or maybe just ignore it. Hard to guess. --Connel MacKenzie T C 06:46, 5 May 2006 (UTC)Reply

Hey, it's just been updated! Congrats to SemperB, the top non-mechanical contributor...though sometimes I wonder.. Widsith 17:45, 14 May 2006 (UTC)Reply

Suppose you are an English speaker trying to learn Spanish, and you come across the word "tener" (to have). It is easier to remember if you recognise related words in English such as "contain".

Would it be possible to collect such related words? It would probably have to be done for pairs of languages. —This unsigned comment was added by 70.50.115.35 (talkcontribs) 19:59, 3 May 2006.

It is not an exact cognate, but there is no reason why you couldn't add ‘Compare English contain, retain, etc.’ to the =Etymology= section. As long as you have a clear understanding of what the relationship is there is no problem. Widsith 20:09, 3 May 2006 (UTC)Reply

Well, it looks to me as though it might be better in a section like the translations into other languages. Perhaps under a title like "Cognates, near cognates and false cognates." (False cognates are known as "falsos amigos" in Spanish.) It might be like the translations by having a sub-section for each language.

I guess I am also asking two questions simultaneously. Roughly, could this be organised and would people be so kind as to make the necessary entries.

I do have a book on false cognates between English and Spanish, but I have never seen anything on near cognates (in this sense). —This unsigned comment was added by 65.95.116.102 (talkcontribs) 21:20, 3 May 2006.

The Etymology section is the section for cognates; that is where they belong. Near-cognates may be added as well if they are helpful (but not all of them will be: Latin tenere, in your example from earlier, has several score descendants in English alone). False cognates between different languages might be best placed in a Usage note (‘not to be confused with...’) where the words are false friends, otherwise they can also appear in the Etymologies (‘Not related to...’). Widsith 21:33, 3 May 2006 (UTC)Reply
There was preliminary discussion a couple months ago about a ===Miscellaneous=== heading at the end of an entry. A subsection under that might be ====Cognate mnemonics====, if I understood what you are saying correctly. But I don't think I'd vote for inclusion of those items in Wiktionary. Maybe write a book on Wikisource on the topic? We're trying to build a dictionary, but we aren't even close to having the basic language covered just yet. Ooops, the same argument can be used against {{rank}}s. Hmmm. --Connel MacKenzie T C 01:35, 4 May 2006 (UTC)Reply
Ranks are very useful for translators browsing through the basic entries. In general, I'd vote no for this proposal. As Connel points out, we're already having a hard time doing the basics for English. —Vildricianus 15:59, 4 May 2006 (UTC)Reply

We had something similar on fr:, because someone proposed two sections: "derived terms in other languages" and "related terms in other languages". It appeared that the second would be too difficult to fill (in a multilingual dictionary), considering that every related word (in every language) would need that same section with links to the other words (the list can be huge, and the information would be repeated several times). So, we proposed to gather all that information in the article of the etymon, in a section "derived terms in other languages". For example, the page fr:bellum contains links to the French fr:belliqueux and the English belligerent. I think it is the most logical way to handle this. - Dakdada 18:35, 4 May 2006 (UTC)

Hmmm... a reverse etymology, of sorts. Fery interestink. There's at least one Wikibook I've seen on amigos falsettos, by the way. Davilla 19:39, 4 May 2006 (UTC)Reply
Reconsidering. What if it really is critical, as in the French for pint?

I have been adding sections like this to Old English entries, under a =Descendants= heading. Of course in that case it's really only English or Scots, but in the case of Latin there would be many languages. Such words are normally called derivatives, but in Wiktionary we use Derived Terms for related words in the same language, so it was necessary to distinguish. (It is more complicated and important in Romance languages, because French and Spanish have some words that have evolved from Latin, and others which have been borrowed from Latin later on. It is important to distinguish between these, but that implies yet another section...) Widsith 19:54, 4 May 2006 (UTC)Reply

Given the possible options, I think it would be preferable to have a separate section of =Cognates= and a section of =Derivates in other languages= in which the individual words were listed by language. This is certainly the case for Latin entries, which will have derivates in several languages as well as cognates. --EncycloPetey 08:53, 5 May 2006 (UTC)

How to hide sense numbers when there is only one sense

Some people have had strong feelings that there should be no #1 for entries which only have one sense. Now there is a way for these people to hide that number without changing the wiki code, and everybody else will see just what they always saw.

See Wiktionary:Customizing your monobook for details, in the Javascript section.

Anybody who is good at documentation, especially those who also already have custom JavaScript files, please improve the page. Specifically we need some kind of introduction telling people how to start such a page, and then telling them how to add functions.

Right now the position of the definition will be exactly the same with only the number invisible but basically still taking up its space—so it looks too far indented. CSS gurus might be able to tell us a way to get rid of this indent.

I've done some CSS experiments of my own and the best I have found is this:
.ns-0 ol.single-entry-list li { list-style-type: none }
.ns-0 ol.single-entry-list { margin-left: 0; padding-left: 1em }
This merely gives the definition indentation a value that looks kind of right for me, but I really want a solution which results in the definition beginning in the exact position the sense number would begin, and on all browsers. Can anybody tell me if this is possible? — Hippietrail 20:32, 6 May 2006 (UTC)Reply
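
For readers wondering what the JavaScript side of this could look like: the CSS above relies on a `single-entry-list` class being present on one-item lists. The sketch below is only a guess at that step, not the function documented at Wiktionary:Customizing your monobook; the name `markSingleEntryLists` is hypothetical.

```javascript
// Hypothetical helper: add the 'single-entry-list' class (the class name
// used by the CSS above) to every list that has exactly one <li> child.
// Returns how many lists were newly marked.
function markSingleEntryLists(lists) {
  var marked = 0;
  for (var i = 0; i < lists.length; i++) {
    var ol = lists[i];
    var items = 0;
    for (var j = 0; j < ol.children.length; j++) {
      if (ol.children[j].tagName === 'LI') items++;
    }
    var already = (' ' + ol.className + ' ').indexOf(' single-entry-list ') !== -1;
    if (items === 1 && !already) {
      ol.className = ol.className
        ? ol.className + ' single-entry-list'
        : 'single-entry-list';
      marked++;
    }
  }
  return marked;
}

// Browser glue (skipped when no DOM is available). This would need to run
// after the page content has loaded.
if (typeof document !== 'undefined') {
  markSingleEntryLists(document.getElementsByTagName('ol'));
}
```

Because the class is added on the client only, the wiki source is untouched and users without the script see the numbered list exactly as before.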

I expect the broken livery entry is related to this experiment. For me, it's missing the number "1" before its first definition. Granted, the entry's layout itself is wrong, but is the number "1" missing for other editors or just for me? Rodasmith 17:35, 9 May 2006 (UTC)Reply

No, this is a customization you have to install in your own JS and CSS files, I haven't made it global. In this case the formatting is just wrong. I suggest placing an {{rfc}} on that page. — Hippietrail 17:52, 9 May 2006 (UTC)Reply

Category tree

I posted the following question to Wiktionary talk:Categorization and am posting a short version of it here for visibility: Are all English language entries really supposed to get Category:English language, as recommended in Wiktionary:Categorization? Rodasmith 03:02, 4 May 2006 (UTC)Reply

I think in an ideal world, yes. But in practice, it doesn't really happen, because we don't have templates like the French Wiktionnaire, so we tend to get lazy and omit them. That's my theory anyway. --ex-admin part-time sockpuppetting quasi-vandal Wonderfool 11:22, 4 May 2006 (UTC)

In practice, if all English entries got that tag, the category would be useless, since it would contain more than a million entries (eventually). --EncycloPetey 11:26, 4 May 2006 (UTC)

Thanks. Wiktionary:Beer parlour archive/January-March 06#Category:English Adjective and WT:BP#Categories of the form <language>:<part of speech> agree that huge categories are undesirable. I'll update Wiktionary:Categorization accordingly. Rodasmith 17:35, 4 May 2006 (UTC)Reply

== Word Characteristics ==

(Copied over from Tea Room)

I am beginning a project intended to optimize the order in which the various characteristics of words are organized. This project requires a comprehensive list of such characteristics and their values. For instance: using the characteristic “Parts of Speech” the characteristic values would be noun, verb, adjective, adverb, etc. Although a good list of characteristics can be obtained from almost any dictionary entry I would prefer to develop an exhaustive and comprehensive list here. Any ideas as to the best way that this might be done?

Pce3@ij.net 18:19, 30 April 2006 (UTC)Reply

This user is a member of the Association of Inclusionist Wikipedians

The motto of the AIW is Salva veritate, which translates to, "with truth preserved." This motto reflects the inclusionist desire to change Wikipedia only when no knowledge would be lost as a result.

Characteristics also include: number of double letters (Mississippi has 3), number of capitalized letters (McDonald has 2), number of vowels (twyndyllyngs has none), type of symmetry if any, and value considered as a number in base 36. Davilla 19:13, 1 May 2006 (UTC)Reply
By symmetry I assume you mean words like Abba. What do you mean by "type" of symmetry? Can you list the "types"? Also what do you mean by "value" in base 36? Thanks. -- Pce3@ij.net 19:29, 1 May 2006 (UTC)Reply
Besides the palindromes, and it's too bad this one doesn't have an entry, words like "Pd" and "SWIMS" have rotational symmetry. Davilla 19:22, 4 May 2006 (UTC)Reply
I think you might get more help and response over in the Beer Parlour, so I'm copying this section to there, and suggest the conversation continues there, not here in the tea room, which is more about specific words, rather than methods.--Richardb 13:06, 4 May 2006 (UTC)
This is a little off-topic, but the motto of the AIW is actually Conservata veritate and not Salva veritate (which means "safe with truth"). --EncycloPetey 15:22, 6 May 2006 (UTC)Reply
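
The mechanical characteristics Davilla lists can be computed directly from the spelling. The following is only a minimal sketch: the function name and the dictionary keys are made up for illustration, "vowel" is assumed to mean a, e, i, o, u only, and rotational symmetry is left out because it depends on letter shapes rather than on the spelling.

```python
VOWELS = set("aeiou")  # assumption: "y" and "w" are not counted as vowels


def word_characteristics(word):
    """Return a few mechanical characteristics of a word (illustrative only)."""
    lower = word.lower()
    return {
        # adjacent identical letter pairs, e.g. "ss", "ss", "pp" in Mississippi
        "double_letters": sum(1 for a, b in zip(lower, lower[1:]) if a == b),
        # upper-case letters anywhere in the word, e.g. "M" and "D" in McDonald
        "capitals": sum(1 for ch in word if ch.isupper()),
        "vowels": sum(1 for ch in lower if ch in VOWELS),
        # reads the same in both directions, ignoring case (e.g. Abba)
        "palindrome": len(lower) > 1 and lower == lower[::-1],
        # digits 0-9 then letters a-z as 10-35; None if other characters occur
        "base36": int(lower, 36) if lower.isascii() and lower.isalnum() else None,
    }


for example in ("Mississippi", "McDonald", "twyndyllyngs", "Abba"):
    print(example, word_characteristics(example))
```

Words containing hyphens, apostrophes or non-ASCII letters get no base-36 value in this sketch; how to treat them would need to be decided separately.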

Main Page

Why are the <hr>'s missing/invisible on Main Page, especially in the "other languages" section? --Connel MacKenzie T C 19:52, 4 May 2006 (UTC)Reply

Because of the recent Monobook.css customization of hr's being rendered invisible in main namespace. Creating a manual line doesn't work either (tested in MediaWiki:Noarticletext). —Vildricianus 20:11, 4 May 2006 (UTC)Reply
Um, what recent customization? There were notes on how individual people could do it to their own personal monobook (if they were so inclined) but such a thing should not have made it into the site-wide version. I didn't see it there, when I looked, BTW. Has it already been rolled back? --Connel MacKenzie T C 20:16, 4 May 2006 (UTC)Reply
OK, I found that directive (it looked like a comment, due to line-wrapping, earlier) and commented it out. Where is the previous conversation, somewhere here? --Connel MacKenzie T C 20:23, 4 May 2006 (UTC)
I don't know why you chose to ignore the topic already here above, ignore my talk page, ignore the global CSS's talk page, and ignore my specific request to comment here in the global CSS file. I was trying to actively work on this right now; it was difficult to find what was going wrong since you didn't comment in any of the expected places. I filed a bug report yesterday on the subject of no CSS to distinguish the special Main page from normal default namespace pages. I have meanwhile added a CSS id to the main page as a workaround, and I have unfinished CSS code in my personal CSS file that was trying to get the HRs to be hidden only on "normal" default-namespace pages. I have to give it up now due to lost time and I have a bus to catch. Anybody with an interest in this and a knowledge of CSS please feel free to continue the work... — Hippietrail 20:37, 4 May 2006 (UTC)

Wikimedia Toolserver

I finally got my account at the Wikimedia Toolserver and it seems to work or at least partially work. I have a page there now with a very simple database access example.

Unfortunately it seems that just the meta data not the actual page data is there. If you click on an entry it just shows where the data is stored not the actual data. If I run it on my own box with the dump imported it shows the actual page data... But perhaps the data is somewhere else... --Patrik Stridvall 21:21, 4 May 2006 (UTC)Reply

This, I think, is a fantastic start. --Connel MacKenzie T C 06:50, 5 May 2006 (UTC)Reply
Lemme know if you get Kate's "markthrough" working. --Connel MacKenzie T C 06:53, 5 May 2006 (UTC)Reply
Get what working? Anyway, I will work more on it during the weekend. I guess I can get the page data from the XML dump if everything else fails but that is just a reserve solution. In the meantime, does anybody have something that only needs the meta data? --Patrik Stridvall 08:19, 5 May 2006 (UTC)
Kate has a tool called "markthrough" for doing wiki-ish markup for toolserver html pages (tags the links at the bottom correctly, various other "required" cross-links and stuff.) I've had trouble setting it up, due to time constraints. --Connel MacKenzie T C 18:35, 6 May 2006 (UTC)
OK. I personally prefer PHP and since I do a lot of database stuff with dynamic formatting I don't see much use for it. Especially since PHP can easily share headers and footers between files which seems to be one of the major advertised advantages. --Patrik Stridvall 21:04, 7 May 2006 (UTC)

I have written PHP code to parse the XML dump and find the headers and store it in a database on the toolserver. I have also done a few other updates. See for yourselves on my toolserver page.

Since I now have a base for parsing the XML dump I can now build a database on whatever we like. Any suggestions? Translations perhaps? --Patrik Stridvall 21:04, 7 May 2006 (UTC)Reply
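
For readers who want to see what "parse the XML dump and find the headers" could look like, here is a small sketch. It is written in Python rather than PHP and is not Patrik's code; the function names, the heading regular expression and the sample data (including the namespace URI) are assumptions made for illustration. It streams over page elements, ignores the XML namespace, and collects the ==Heading== lines of each page.

```python
import io
import re
import xml.etree.ElementTree as ET

# A wiki heading line: 2 to 6 "=" signs, the title, then the same run of "=".
HEADING = re.compile(r"^(={2,6})[ \t]*(.+?)[ \t]*\1[ \t]*$", re.MULTILINE)


def local(tag):
    """Strip an XML namespace such as '{http://...}page' down to 'page'."""
    return tag.rsplit("}", 1)[-1]


def headings_by_page(xml_stream):
    """Map each page title to a list of (level, heading text) tuples."""
    result = {}
    for _event, page in ET.iterparse(xml_stream, events=("end",)):
        if local(page.tag) != "page":
            continue
        title, text = None, ""
        for node in page.iter():
            if local(node.tag) == "title":
                title = node.text
            elif local(node.tag) == "text":
                text = node.text or ""
        result[title] = [(len(m.group(1)), m.group(2)) for m in HEADING.finditer(text)]
        page.clear()  # release memory; real dumps are large
    return result


# Hypothetical miniature dump, only to show the shape of the input.
SAMPLE = """<mediawiki xmlns="http://www.mediawiki.org/xml/export-0.3/">
  <page>
    <title>opts</title>
    <revision><text>==English==
===Noun===
# plural of opt
===Anagrams===
* post</text></revision>
  </page>
</mediawiki>"""

print(headings_by_page(io.BytesIO(SAMPLE.encode("utf-8"))))
```

The resulting (title, level, heading) rows could then be loaded into whatever database table a tool needs; that step is not shown here.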

Wow, that was quick. It seemed to work on the first click-through I did, but now is just giving blank pages? Hrm, now working again. I'll look more - so far it looks like some fantastic stuff.
I remembered I had forgotten something a while after posting so I made a few more changes. Perhaps that was what caused the temporary failure. The machine running the MySQL server seems to be down right now so I can't check. --Patrik Stridvall 20:37, 8 May 2006 (UTC)
I guess the next "really useful" thing would be to have something that listens to irc://irc.wikimedia.org#en.wiktionary and creates a list (once daily) of pages updated that day, so that at any time we can simply get the delta pages from Wiktionary instead of the XML dump for any entry touched since the last dump. (Perhaps it should even exclude edits from the interwiki robot User:RobotGMwikt?)
There is a table that contains the recent changes that is replicated live so that is not a problem. I could exclude everything updated after the date of the dump from the pages to make it possible to "check off" fixed problems. That is planned but not done. Or I could download everything newer directly from here but that is more complicated. --Patrik Stridvall 20:37, 8 May 2006 (UTC)
--Connel MacKenzie T C 21:43, 7 May 2006 (UTC) (edited)Reply
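
The "delta pages" idea above amounts to filtering a list of recent changes by the dump date and by user. A rough sketch follows, assuming each change is available as a (title, timestamp, user) tuple; that layout, the function name and the sample rows are hypothetical and do not describe the real recent-changes table.

```python
from datetime import datetime, timezone


def pages_changed_since(changes, dump_time, skip_users=("RobotGMwikt",)):
    """Return sorted titles edited after dump_time, ignoring the listed users."""
    titles = {
        title
        for title, timestamp, user in changes
        if timestamp > dump_time and user not in skip_users
    }
    return sorted(titles)


# Hypothetical sample data, only to show the expected shape.
dump_time = datetime(2006, 5, 3, tzinfo=timezone.utc)
changes = [
    ("livery", datetime(2006, 5, 9, 17, 35, tzinfo=timezone.utc), "Rodasmith"),
    ("chat", datetime(2006, 5, 4, 2, 0, tzinfo=timezone.utc), "RobotGMwikt"),
    ("words", datetime(2006, 5, 2, 8, 0, tzinfo=timezone.utc), "Davilla"),
    ("livery", datetime(2006, 5, 9, 18, 0, tzinfo=timezone.utc), "Hippietrail"),
]
print(pages_changed_since(changes, dump_time))
```

Pages on the returned list would be the ones to re-fetch live (or to leave out of dump-based reports) until the next dump arrives.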

Boy do I feel stupid. I didn't realize I had read access to /home/strivall/public_html. I must say, I am very impressed by your style. I see now why you shun the markthrough thing. As a petty administrative note, you might want to tag everything as GPL (assuming that is your intent, or copyright P.S., or whatever) and the icon links (spelled out at meta: Toolserver#Using convention.) --Connel MacKenzie T C 22:38, 7 May 2006 (UTC)Reply

I do similar things for a living so I guess I have learned something over the years. As for license, GPL is fine unless somebody has any better idea. I will fix the administrative stuff and publish the source code when it is more ready than it is now. In the meantime feel free to use any code under the GPL. --Patrik Stridvall 20:37, 8 May 2006 (UTC)

Patrik, do you know how often the XML dump (that your tools are based on) is refreshed? —Scs 01:12, 9 May 2006 (UTC)Reply

From m:Data_dumps "dumps will be run approximately once a week". Since the last one is from 2006-05-03 a new one should be available tomorrow. I have to download and run the parser again manually though. Note that not all pages use the dump. Each page says whether it uses the dump or not. Everything except for the page data is live or almost live; there might be a small replication delay. --Patrik Stridvall 16:57, 9 May 2006 (UTC)
Thanks! —Scs 04:04, 10 May 2006 (UTC)Reply
That is quite a new development. When the XML dumps are running well, I think we've been able to rely on them about once a month. More often, there are minor problems amounting to more significant delay. I think the once-a-week thing refers to when the pass starts for small wikis (which we are not.) We'll see soon if we really do get more than one a month from now on, but I'm not doing anything more than crossing my fingers at this point. (Of note: the en.wikipedia dump pooched out again...not a good sign.) --Connel MacKenzie T C 15:08, 12 May 2006 (UTC)Reply
It seems that you are right. Oh well, not much we can do but wait for the next. --Patrik Stridvall 20:00, 12 May 2006 (UTC)Reply
New XML dump generated for en.wikt: last night... --Connel MacKenzie T C 16:24, 21 May 2006 (UTC) Something tells me, I should have downloaded the XML dump first, then posted this message.  :-) I've never seen download.wikimedia.org pushing less than 100KB/sec before. --Connel MacKenzie T C 16:41, 21 May 2006 (UTC)Reply
And now Patrik's tools are seeing it! Hooray!
(Unfortunately the dump was from just before I figured out a sneaky way to track down and fix all the headers with apostrophes in them. And it looks like Patrik fixed his script so it doesn't choke on those any more, anyway. Thanks!) –Scs 16:48, 26 May 2006 (UTC)Reply

Dictionary.com and m-w.com

I see that we can't copy/paste entries from dictionary.com. Is there a copyright issue? I thought copyright laws don't apply to words. By the way, what about m-w.com? Can we copy/paste from there too? --68.102.193.78 08:29, 5 May 2006 (UTC)Reply

Yes, it is. Aren't there any older English dictionaries in the public domain now?--Jusjih 11:01, 5 May 2006 (UTC)Reply
Plenty are in the public domain - but few are available online. We often use the 1913 Oxford Dictionary. I've recently added a Dictionary of the Chinook Jargon from Gutenberg, and am still searching for a digital copy of the 1910 Black's Law Dictionary. BDAbramson T 13:15, 5 May 2006 (UTC)
Aside from the potential legal issues, simply mirroring another dictionary is counter to the idea behind this project. We do not aim to simply be a dictionary.com mirror, but a secondary source like dictionary.com. - TheDaveRoss 14:13, 5 May 2006 (UTC)Reply
Copyright doesn't apply to individual words, sure, and on that ground we have several wordlists from various sources. But what you'd be copying from a dictionary website is not the word, but a definition of it—which may or may not, depending on the style, be copyrightable, but would you really care to risk a lawsuit when you could write it yourself?—and possibly other information such as etymology and pronunciation.
Another decent PD source is Webster 1913. The problem with public-domain-by-age stuff though is its lack of descriptions of newer developments in the language (which can be fine for historical perspective and completeness, but can make a poor beginning for an entry). —Muke Tever 14:23, 5 May 2006 (UTC)Reply
Good. Having one online dictionary in the public domain is much better than having none, so please provide any online sources so someone may use them freely with any needed updates.--Jusjih 15:34, 6 May 2006 (UTC)Reply

Registering words

EDUCTIVITY - From the Latin educere, to lead out or out of. The dynamic between deduction (leading from, general to specific) and induction (leading in or into, specific to general). Similar to the Hegelian dialectic of thesis (rationalism), antithesis (empiricism) and synthesis (eductivity) except that the process must be constant to create 'good,' either positive or normative. The stone tablets must be continually created, broken, recreated, etc.

Retrieved from "http://en.wiktionary.org/wiki/eductivity"

Is there a way this word can be registered? robert.h.yauger@verizon.net

-- asked by User talk:71.245.183.154

"Registered" how? If you mean in copyright, that's not possible; if you mean in trademark, you will want to check the laws of your country; if you mean to make it acceptable to this dictionary, include three verifiable quotations of other people using the word over the span of at least two calendar years (see WT:CFI); if you mean to make it acceptable to the English language, you can only do that by using it, and in ways that incite others to use it as well. —Muke Tever 14:30, 5 May 2006 (UTC)Reply

Oops. I somehow missed this BP thread before beginning, but FYI, I'm using Wiktionary:List of protologisms/eductivity to develop a protologism maintenance tool. If anyone would rather I not use that submission, please say so. Otherwise, I'll continue and post a follow-up here when I'm finished. Rodasmith 04:42, 9 May 2006 (UTC)

Well, the idea was to do something with the "Sniglets" that were being submitted in 2003/2004. At that time, only a tiny minority here wanted them included on Wiktionary. Trimming/neutering them down to one line seemed acceptable to most. But your experiment has great promise; the Wiktionary namespace isn't searched, by default, right? This sort of page could serve as a holding spot for building citations. Primetime, E. Seguora, other vandals, sockpuppets and POV pushers would love that opportunity...but then so would some amateur linguists. I think you should limit the experiment to only a small number of entries until consensus emerges. --Connel MacKenzie T C 06:00, 9 May 2006 (UTC)
As it turns out, my end suggestion does not require a maintenance tool. Instead, it amounts to managing each new protologism submission as follows:
  1. move it to Wiktionary:List of protologisms/sampleProtologism
  2. instead of listing its definition on Wiktionary:List of protologisms, add it to Category:Protologisms.
I'm not sure that is any better than the current process, so I won't be particularly hurt if the suggestion is shot down. Ready... aim.... Rodasmith 07:17, 9 May 2006 (UTC)Reply
Fire. Sorry. Even with a huge disclaimer on the page, it's simply not something people here are interested in promoting. Any more than one line is granting way too much freedom. Davilla 21:16, 9 May 2006 (UTC)Reply

substandard

The word liquification turns out to be incorrect (I, and 62,000 google hits, thought otherwise). If I make a page to say the correct word is liquefaction, does it belong in a category of incorrect words? I considered creating category:substandard but do we already have a category for such cases? Is substandard the best name for such a category? JillianE 16:12, 5 May 2006 (UTC)Reply

Incorrect? Liquification tells me something else. —Vildricianus 16:16, 5 May 2006 (UTC)Reply

Dictionary.com doesn't seem to agree.

Also, there are 62,000 googles for liquification but over 2 million for liquefaction (and google asks if I meant liquefaction). JillianE 16:22, 5 May 2006 (UTC)Reply

If you make a page to say the correct word is liquefaction it would belong in a category of POV entries to be rewritten, unless you back that up with quotations from grammarians in a ==Usage note== section. (If you like, I could contribute a note in the ==etymology== section saying a better-justified spelling is liquefication, since the better-accepted form of the verb is liquefy; the -i- seems to come by analogy from liquid, which is from liqu-idus.) The existence of one word does not negate the existence of another of similar formation or meaning (cf. comic and comical, and the whole debate elsewhere on cacodemoniacal or whichever it was). —Muke Tever 12:46, 6 May 2006 (UTC)Reply

Not quite the same – comic and comical (and all variants thereof) are both entirely valid formations within the established patterns of English suffixing. Liquification arose only through error. It is hardly POV to point that out, since liquification would not be acceptable in a business letter, job interview, etc etc, and users ought to know that. Widsith 07:58, 9 May 2006 (UTC)Reply

Given that very few words are properly in -efaction (putrefaction is the only other relatively common one I can think of) it's no surprise that lique/ification be recreated on the analogy of verbs in -fy, which most frequently produce formations in -fication within the "established patterns of English suffixing." Modification and creation of words based on analogy are an ordinary linguistic phenomenon, and can easily be mainstream (such as spelling island with an s) or regional (color/colour with French -our or "restored" Latin -or) or disturbingly nonstandard (orifii as a plural of orifice on the model of other words ending in /əs/). Deciding to label liquification an 'error' has to be POV. —Muke Tever 23:28, 9 May 2006 (UTC)Reply
That is UK/European POV, but the opposite seems to be true here in America. <insert pejorative comment about the US Congress' (Senate and House) language use here>. --Connel MacKenzie T C 17:06, 9 May 2006 (UTC)Reply

Ah, OK – the Usage note will be a complicated one, evidently! Widsith 17:08, 9 May 2006 (UTC)Reply

Maybe nonstandard would be a better word to describe this. JillianE 19:47, 10 May 2006 (UTC)Reply

Probably. --Connel MacKenzie T C 14:58, 12 May 2006 (UTC)Reply

Is it just me, or is the pronunciation depicted in our logo missing a syllable? Wouldn't the construction currently displayed be pronounced like Wik-tion-ry as opposed to Wik-tion-a-ry? BDAbramson T 00:49, 6 May 2006 (UTC)

There is a section on the topic hidden in WT:FAQ.  :-) It is also discussed about a dozen times in the archives of this page, if you need an in-depth answer. --Connel MacKenzie T C 02:23, 6 May 2006 (UTC)Reply
Ah - didn't know we had a FAQ! BDAbramson T 15:55, 6 May 2006 (UTC)Reply

Anagrams Category

Is there a category for words that are anagrams, and if so, is there a language-specific category? --Think Fast 01:05, 6 May 2006 (UTC)Reply

Can't most words be anagrammed? How would such a category be subdivided? --Connel MacKenzie T C 02:25, 6 May 2006 (UTC)Reply
I'm sorry. I should have been more specific. I meant words that can be anagrammed into other words that are actually words. (I guess that otherwise the only words that couldn't be anagrammed would be "I" and "a":) There's a page about anagrams at Wiktionary:Anagrams that gives a very small explanation of it.
As for how the category would be subdivided, I don't quite know what you mean. (I haven't been here very long.) But guessing I would say options would include by language, alphabetically, and by number of anagrams. --Think Fast 13:41, 6 May 2006 (UTC)Reply
Well, that would still result in (one or more) gigantic category(ies), right? I'm not sure that people would find that helpful. Doing a crazy subnamespacing by the alphabetic first-sort doesn't work well either, e.g. Category:Anagrams:opts for opts, post, pots, spot, stop, tops. The resulting sub-categories would generally be too small, and the pseudo-namespace talk pages would be in the wrong place.
I don't think categories is the right approach. Perhaps Wiktionary:List of anagrams or something like it? That could have a one line entry for opts. Pages could have *See [[Wiktionary:List of anagrams#opts|anagrams]]. in the ===See also=== section. Does this work well for everyone? --Connel MacKenzie T C 15:42, 6 May 2006 (UTC)
Correction: I meant Appendix:List of Anagrams. The Wiktionary namespace is not appropriate. --Connel MacKenzie T C 15:44, 6 May 2006 (UTC)Reply
Would this mean that the ===Anagrams=== section would have to be removed and replaced in the ===See also=== section? Also, I don't know if adding something like *See [[Wiktionary:List of anagrams#opts|anagrams]] would work because what if there were more than one anagram for a word? What would we do then? --Think Fast 00:03, 11 May 2006 (UTC)
No, I do not think the ===Anagrams=== heading would have to be removed. I think it would be greatly simplified, as each entry (opts, post, pots, spot, stop and tops) would have the same one line, perhaps more like * [[Appendix:List of anagrams-o#opts|Anagrams for {{PAGENAME}}]]. Note that I think a similar subdivision method (as used in Appendix:Names) would keep the lists manageable. I'm not sure at all what you mean by "more than one" though. --Connel MacKenzie T C 02:54, 11 May 2006 (UTC)
I mean something such as having opts, post, pots, spot, stop, and tops all for the same word. Now all are listed under the ===Anagrams=== category, but under the appendix idea, would all you see just be a line saying "Anagrams for {{PAGENAME}}" with a link to the appendix? --Think Fast 23:31, 11 May 2006 (UTC)Reply
Yes, that was what I was saying. AFAIK, the anagrams were supposed to be sorted alphabetically, so all the words would point to the Appendix:List of anagrams-o#opts. In the appendix list, there would be "== opts ==" with the anagrams listed under them. In the individual main namespace entries, yes, all you'd see is the ===Anagrams=== heading plus one line saying "Anagrams for ...". This would have the benefit of reducing clutter in the main namespace, for a "trivia" item that is not widely adored by other contributors here, while still providing a maintainable way to enter them. --Connel MacKenzie T C 14:55, 12 May 2006 (UTC)Reply
Now I see what you're saying. But why would an appendix be needed instead of listing the words in the article? The anagrams really don't take up all that much space... but then again, maybe it would be better to have an appendix. Is there a policy on things like this, or should we have a poll, or what? --Think Fast 23:28, 12 May 2006 (UTC)Reply
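
To make the appendix proposal concrete: words are anagrams of each other when their sorted letters match, each group is filed under its alphabetically first member (opts in the example above), and every member's entry gets the same one-line link. The sketch below only illustrates that idea; the function names are made up, and the link format is copied from Connel's example rather than from any agreed policy.

```python
from collections import defaultdict


def anagram_groups(words):
    """Group words by their sorted letters; key each group by its first member."""
    by_letters = defaultdict(set)
    for word in words:
        by_letters["".join(sorted(word.lower()))].add(word)
    groups = {}
    for members in by_letters.values():
        if len(members) > 1:  # a word on its own has no anagrams to list
            ordered = sorted(members)
            groups[ordered[0]] = ordered
    return groups


def anagram_link(heading):
    """Build the one-line wikitext link shared by every word in a group."""
    return "* [[Appendix:List of anagrams-%s#%s|Anagrams for {{PAGENAME}}]]" % (
        heading[0],
        heading,
    )


groups = anagram_groups(["stop", "pots", "opts", "tops", "post", "spot", "cat"])
for heading, members in groups.items():
    print("== %s ==" % heading, ", ".join(members))
    print(anagram_link(heading))
```

A list like this could be regenerated from a dump at any time, which is one argument for keeping it in an appendix rather than maintaining it by hand in each entry.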

WikiSaurus:new

Anyone care to have a stab at adding to this one? There are a lot of subtly different meanings that need Thesaurus lists. It's a bit of a test of whether WikiSaurus really is going to work for us, and how we might need to refine it.--Richardb 15:17, 6 May 2006 (UTC)

Apostrophes versus quotation marks

I've noticed that the titles of some articles that contain apostrophes in their titles use the character U+2019 (’) to represent an apostrophe. While this character may be typographically correct in appearance, it is encoded in Unicode as the "right single quotation mark," whereas the "typewriter" apostrophe (') or U+0027 is encoded merely as the "apostrophe." Thus, I believe it is incorrect to use the former character in titles as it is not really an apostrophe (in addition to not existing on a keyboard.) At any rate, it is not Unicode's job to dictate how a character is actually drawn: that is up to the user agent. The fact that a quotation mark "looks" more correct than an apostrophe is really an error on the part of the font or user agent and it is an erroneous substitute.

(Unicode also defines U+02BC as the "modifier letter apostrophe", which in many fonts looks different than the regular apostrophe. This character, however, is meant for use as a diacritic and not as a regular apostrophe either.)

Addendum: it appears this issue has already been argued about a great deal, and dragging it on further would be of little benefit. Since, however, there appears to have been no formal resolution, it would seem prudent to use the U+0027 character, which appears to be the prevailing trend.

-- Ian Bollinger 04:43, 7 May 2006 (UTC)Reply
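
The three code points discussed above can be checked against the Unicode character database, and curly apostrophes in titles can be replaced mechanically. This is a minimal sketch using Python's standard unicodedata module; the function name normalize_title is an assumption for illustration, and it deliberately leaves U+02BC alone since that character may be a legitimate letter in some orthographies.

```python
import unicodedata

TYPEWRITER_APOSTROPHE = "\u0027"
RIGHT_SINGLE_QUOTE = "\u2019"
MODIFIER_APOSTROPHE = "\u02bc"

# Print the official Unicode name of each look-alike character.
for ch in (TYPEWRITER_APOSTROPHE, RIGHT_SINGLE_QUOTE, MODIFIER_APOSTROPHE):
    print("U+%04X" % ord(ch), unicodedata.name(ch))


def normalize_title(title):
    """Replace curly apostrophes (U+2019) with the typewriter apostrophe."""
    return title.replace(RIGHT_SINGLE_QUOTE, TYPEWRITER_APOSTROPHE)


print(normalize_title("don\u2019t"))
```

A redirect from the curly-apostrophe spelling to the normalized title would still let people who paste the typographic form find the entry.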

Actually, in the most recent go-round (see Wiktionary:Beer parlour archive/January-March 06#Curly apostrophes), according to User:Hippietrail (who I have no reason to doubt), there is a resolution, and it agrees with yours: "All titles with apostrophes should use the typewriter apostrophe." —Scs 15:19, 7 May 2006 (UTC)Reply

For those curious, see User talk:Connel MacKenzie/apostrophe. --Connel MacKenzie T C 06:40, 7 May 2006 (UTC)Reply

User:Primetime

Due to the recent spate of increased Primetime vandalism here, I came across familiar looking edit patterns at w:J. At the time, I was verifying an unconfirmed sockpuppet's copyvio vandalism of our entry for j. As a result of looking at the history of w:J, and a cursory amount of additional observation on Wikipedia, I discussed the matter with several others. In the end, I followed a suggestion to list a notice on w:Wikipedia:Administrators'_noticeboard#Wiktionary_user. (Thanks again for the link - I probably never would have found WP:AN nor WP:ANI.) Thank you whoever fixed the link, from when it got changed (by Primetime?) on Wikipedia. I'd re-added the link w:WP:AN#User:Primetime there, with a note to keep it affixed. --Connel MacKenzie T C 01:49, 9 May 2006 (UTC)Reply

I am still concerned that many entries created by User:Primetime still exist here. Systemic copyright violation was demonstrated, beyond any reasonable doubt. So why should any be retained? --Connel MacKenzie T C 09:48, 8 May 2006 (UTC)Reply

These are probably also Primetime's: Special:Contributions/67.165.217.42. —Vildricianus | t | 10:11, 10 May 2006 (UTC)Reply
This underscores the need for a CheckUser on enwikt; we know he's using open proxies, but we can't block them when he creates accounts (because only CheckUsers can see accounts' IPs.) --Rory096 09:37, 12 May 2006 (UTC)Reply
Discussion here: Wiktionary:CheckUser, but nobody wants to have the power. I could do it, but as a new admin, I'm not sure I'm the right person. Kipmaster 09:46, 12 May 2006 (UTC)Reply
Amgine and Hippietrail want to do it, but I had thought of you as well for the job, Kip. —Vildricianus | t | 10:21, 12 May 2006 (UTC)Reply
Note the latest on Wikipedia: w:User:Jimbo Wales has blocked Primetime indefinitely. --Connel MacKenzie T C 01:37, 22 May 2006 (UTC)Reply

WikiSaurus proposal

Would it be possible to implement a Wikisaurus: namespace in the same style as the Category: namespace? Equivalently, could we move Wikisaurus entries from the main space into the Category: space? For instance, consider Category:Wikisaurus:unhappy and Category:Wikisaurus:pathetic. If sad had templates referencing Wikisaurus, then it would be to these pages rather than any other pages. Because unhappy would also have a template reference, the page Category:Wikisaurus:unhappy would include "unhappy". It would also include any other word that had the template reference to "unhappy" in Wikisaurus. Because pathetic would also have a template reference, the page Category:Wikisaurus:pathetic would include "pathetic". To add a word to that list, one must add the Wikisaurus template to the word's entry, effectively claiming that all of what's listed on the WikiSaurus page are synonyms of a given sense. Davilla 18:05, 8 May 2006 (UTC)Reply

I experimented some months ago with the concept at Category:Wikisaurus:Book. See the talk page for why I (and Richardb) consider the experiment a failure. --Connel MacKenzie T C 03:48, 9 May 2006 (UTC)Reply

Blocking policy for spam?

Did we (the English Wiktionary) ever come up with a recommendation for block-duration for spammers? I'd like to see either an infinite block or a one year block for first offense, but that would be conditional on doing an ISP check through ARIN (or similar.) In the situation where it is an ISP, how long is appropriate? One month? Three months? 2 hours?

It is quite hard to tell if an ISP dynamic address is really dynamic or not. I know US cable companies (e.g. *.rr.com, *.comcast.net/com) and DSL (*.*bell.com, *.sbc.*, etc) tend to be semi-static, changing only once or twice a year. Dial up ISPs obviously give a different IP address with each connection.

Getting spam indicates a compromised/hax0r3d host - and as such indicates a significantly long block is warranted. Does anyone know of a reliable way of determining if an ISP connection is a dial-up? I know that many ISPs use a DNS naming convention of "ppp-nnn-nnn-nnn-nnn.ISP.tld" for their (point to point protocol) dial-up pools, but that hardly seems reliable.

--Connel MacKenzie T C 17:39, 9 May 2006 (UTC)Reply
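
For what it is worth, the "ppp-nnn-nnn-nnn-nnn.ISP.tld" convention mentioned above is easy to test for once the reverse-DNS hostname of an address is known. The sketch below only checks that naming pattern; the function name and example hostnames are hypothetical, and, as noted, a match (or a miss) proves very little about the connection type.

```python
import re

# "ppp-" followed by four dash-separated numeric groups, then the ISP domain.
PPP_HOST = re.compile(r"^ppp(-\d{1,3}){4}\.", re.IGNORECASE)


def looks_like_dialup(hostname):
    """Heuristic only: True if the hostname follows the ppp-n-n-n-n.* pattern."""
    return bool(PPP_HOST.match(hostname))


print(looks_like_dialup("ppp-10-0-0-1.example.net"))
print(looks_like_dialup("cpe-10-0-0-1.example.com"))
```

An ISP is free to name its pools however it likes, so this could at most be one input to a human decision about block length.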

Vote for User:TheCheatBot format

Fourth revote cancelled and restarted


  • Comments:
    1. I presume multiple changing votes are allowed, and I will consider doing it only if it appears that another variation I like could overtake the one I chose. In fact the one I chose is not the one I prefer, but from previous votes it would seem to be a run-off between the first two options. Davilla 21:07, 9 May 2006 (UTC)Reply
    2. While I prefer the traditional use of italics to signify use-mention distinction of latin script words, the last round of voting indicated that option #4 had little support. Following Hippietrail's lead of using css to give users a flexible experience, however, I'd like to add the following option:
      # {{plural of|word}}
      I suggest the above to help us achieve consistent results and to allow users to customize their monobook.css if they wish to distinguish use from mention with italics. Would anyone strongly object to my adding it to the above voting options? Rodasmith 01:04, 10 May 2006 (UTC)Reply
      See "words" for the css solution in use. For most users, it appears in style #1 below. For me, it appears in format #4 below, because User:Rodasmith/monobook.css has the "italicized singular" option below. To apply any of the following styles to the template-driven example at "words", copy the corresponding code into your User:YourUserName/monobook.css (and refresh your browser cache):
      Plain (consistent with majority of existing entries):
      .use-with-mention { font-style: normal; }
      .mention { font-style: normal; }
      Bold singular:
      .use-with-mention { font-style: normal; }
      .mention { font-weight: bold; }
      Italicized qualifier:
      .use-with-mention { font-style: italic; }
      .mention { font-style: normal; font-weight:bold; }
      Italicized singular:
      .use-with-mention { font-style: normal; }
      .mention { font-style: italic; }
      Please let me know if you have any questions or difficulty using this framework. Rodasmith 05:33, 10 May 2006 (UTC)Reply
      Nice work. I'm glad to see people joining the effort to CSS-ize Wiktionary. A couple of comments: Why not just call it "mention" since it's not both a use and a mention, "use-mention" is the name of the distinction. Also, I would like to see "Regular ..." for the regular cases, at least as an option via CSS. I'm trying to think of ways to make it dependent on script too. Different people may want to emphasize non-latin scripts in other ways. Maybe Template:mentionXX where XX can be AR FA RU TH etc as in my XXchar templates? Or we could just use HTML's language feature - does anybody know how well that works in combination with CSS on current browsers? — Hippietrail 01:31, 10 May 2006 (UTC)
      I followed Hippietrail's advice and adjusted the css name to be "mention" for the mentioned word and added a style for the using sentence. Note that using a template-driven solution allows us to refine it easily with language-specific styles later. Rodasmith 05:33, 10 May 2006 (UTC)Reply
Thank you for thinking outside of the box. Yes, I'll change my vote to that as well. I'd appreciate it if you came up with (assuming you haven't already) the various css solutions to display them in the varieties spelled out below. I can think of a dozen minor issues offhand, but I'd support this as it is the first open-ended solution.
Done. See above. Rodasmith 05:45, 10 May 2006 (UTC)Reply
  • Minor issues:
  1. What should be the default? (I for one, could care less. But I'm sure it'll become an issue eventually.)
    Can we simplify this vote by choosing the default style for the template-driven solution later since this vote is supposed to let us run the bot? Rodasmith 05:33, 10 May 2006 (UTC)Reply
    No. One reason is that the use of templates themselves may evoke negative response. Another is that we may as well kill two birds with one stone, as that is still the only remaining point of contention against running the 'bot. --Connel MacKenzie T C 06:29, 10 May 2006 (UTC)Reply
  2. What is the css to italicize "plural of"?
    I updated my post above to show that. Rodasmith 05:33, 10 May 2006 (UTC)Reply
  3. How do I capitalize the first character of an arbitrary string (not just Plural) in css?
    To capitalize the first letter of each word, the css would include the style "text-transform:capitalize;". To capitalize just the first character of a multi-word string, the string would need to be marked up with a span around the first character or the first word. Rodasmith 05:45, 10 May 2006 (UTC)
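    One possible alternative to the extra span is the first-letter pseudo-element. This is only a sketch: it assumes the whole phrase sits inside an element with the "use-with-mention" class from the framework above, and that making that element an inline-block is acceptable for layout. The pseudo-element applies to block containers such as inline-blocks, but not to plain inline elements.

```css
/* Assumption: the phrase "plural of word" is wrapped in .use-with-mention. */
.use-with-mention {
  display: inline-block;  /* first-letter does not apply to plain inline boxes */
}

/* Uppercase only the first letter of the element's text.
   The legacy single-colon form :first-letter is also accepted. */
.use-with-mention::first-letter {
  text-transform: uppercase;
}
```

    The two text-transform approaches differ in scope: "capitalize" on the element affects every word, while the pseudo-element rule touches only the first letter.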
As for multiple votes, I'd recommend against that - we are striving for a simple vote. Allow me to rearrange things a little, since this is being restarted on the very first day... --Connel MacKenzie T C 03:07, 10 May 2006 (UTC)Reply

Note: This is the FIFTH re-vote being held on this topic.

Since the last vote died in the Beer Parlour archives (waiting for Ec and Ncik to return and comment), I'd like to try again. Please vote only once. This vote will last for two weeks, ending on May 24th, 23:59:59 UTC. Adding additional formatting choices will automatically restart the vote, with only a one-week duration from the time the choice is added, or the original end date (whichever is later), to allow time for people to change their vote.

Note: This vote will end shortly. Those who have declined to vote, while otherwise contributing (ignoring notices on their talk pages) will probably be regarded with less weight, as the matter clearly is not important to them now. --Connel MacKenzie T C 18:43, 23 May 2006 (UTC)



How should the approved 'bot TheCheatBot format English plural noun entries?

  1. consistent with majority of existing entries: # Plural of [[word]].
    • For:
  2. bold singular: # Plural of '''[[word]]'''.
    • For:
  3. italicized qualifier: # ''Plural of'' '''[[word]]'''.
    • For:
  4. italicized singular: # Plural of ''[[word]]''.
    • For:
  5. template driven: # {{plural of|word}}
    • For:
    1. --Connel MacKenzie T C 03:07, 10 May 2006 (UTC)Reply
    2. Rodasmith 05:33, 10 May 2006 (UTC)Reply
    3. \Mike 06:06, 10 May 2006 (UTC)Reply
    4. Widsith 09:29, 10 May 2006 (UTC)Reply
    5. —Vildricianus | t | 09:38, 10 May 2006 (UTC)Reply
    6. —Stephen 11:32, 10 May 2006 (UTC)Reply
    7. --Patrik Stridvall 19:17, 10 May 2006 (UTC)Reply
    8. MGSpiller 22:05, 10 May 2006 (UTC)Reply
    9. Clear winner --Dangherous 14:48, 11 May 2006 (UTC)Reply

More inflection templates?

For those of us who are not bots restricted to English nouns, there could be instances where we would like a similar template but with a user-chosen string instead of "plural of". But such a template would definitely benefit from the aforementioned possibility to adapt whether to italicize depending on the script. I'm unsure about how to implement it, but if anyone can do it, it'd be great! \Mike 06:23, 10 May 2006 (UTC)

I'd be happy to show an example of such additional templates, Mike. A key is to identify the purpose of the optional italics so that we can create a sensible css style. Just let me know a scenario where you would want to apply such a template so I can demonstrate. Rodasmith 16:43, 10 May 2006 (UTC)
I like the template option, but it doesn't seem to allow piping, which might be important for some languages. For example gewat at the moment links to [[gewitan|ġewitan]] and [[gewitan|ġewītan]] – both the same page but the second one usefully distinguished with a macron. I know that isn't a plural but you see what I mean. Can anyone comment on this? Will that be impossible with this template? Widsith 08:08, 10 May 2006 (UTC)
Can be solved by adding a second parameter. —Vildricianus | t | 09:28, 10 May 2006 (UTC)Reply
OK, I don't really know what that means but I'm sure it'll work. Widsith 09:29, 10 May 2006 (UTC)Reply
Something like Plural of [[{{{1}}}|{{{2|{{{1}}}}}}]]. perhaps? —Vildricianus | t | 09:37, 10 May 2006 (UTC)Reply
Thanks, Vildricianus. I put your above code into {{plural of}} to enable piped links. Widsith, could you provide an example of a plural term whose singular link should differ from the singular form displayed so that I can show that example in the template documentation? Rodasmith 16:43, 10 May 2006 (UTC)Reply
Well, a simple example is stān ‘stone’, plural stānas. Entered as stan and stanas respectively, with macrons piped in as per modern editors/dictionaries. Widsith 16:50, 10 May 2006 (UTC)Reply
Thanks. See "stanas". Rodasmith 18:29, 10 May 2006 (UTC)Reply
  • WHOA! Those entries should be split into separate entries with {{see}} on the first line. What printed dictionary uses "piped" hyperlinks? This seems like a very poor example, as it is directly in conflict with existing Wiktionary practices. --Connel MacKenzie T C 02:24, 11 May 2006 (UTC)Reply
OK, so there's a separate policy discussion to have regarding whether we allow piped diacritic-stripping links. Let's not let it derail this cheat-bot vote. Rod (☎ Smith) 02:52, 11 May 2006 (UTC)Reply
For those interested, I have raised this with Connel on his Talk page. In fact piping is well established here for OE, Latin, Russian stress marks, Arabic vowels etc. Widsith 06:48, 11 May 2006 (UTC)Reply
Precisely. Sorry for the noise. --Connel MacKenzie T C 07:02, 11 May 2006 (UTC)Reply

I can't wait to vote on the style of the template:plural of page. :-P Davilla 14:51, 10 May 2006 (UTC)Reply

You won't have to vote. With Rodasmith's proposal, you can choose whichever you like with CSS. —Vildricianus | t | 15:11, 10 May 2006 (UTC)Reply
But what is the default style? I guess the vote actually pertains to use-mention, however it is those classes are decided here. Davilla 15:14, 10 May 2006 (UTC)Reply

I'd like to comment on the voting method used here. I think it is appropriate in this case due to the small number of voters, their communication, and the ability to gauge the outcome and change one's vote. I'd just like to point out that, from a purely theoretical aspect, restriction to one vote when there are multiple options is not the best way to reach consensus (in its most basic sense a majority) because of the spoiler problem. I know that wiki does not equate with democracy, but as voting methods are a particular interest of mine I would hope that some people here would be interested in pursuing a technical treatment of what's considered "fair". If so I suggest looking into approval voting, which is applicable to this case and probably one of the most transparent methods. On the other hand I would have to say that the multiple votes taken to resolve this one issue are a clear indication that there is a good-faith effort to reach a consensus and to treat all contributors' opinions fairly. Davilla 15:52, 10 May 2006 (UTC)

Drat, that's what I thought you were getting at when you first suggested it. I am very mildly against approval voting, as I'd prefer a simple vote. But hey, it's a wiki - if you guys want to redo this in a "superior" manner, well, I guess I can't object. Does anyone else (besides Davilla) feel strongly one way or the other, about the "approval voting" style at this point in time? --Connel MacKenzie T C 02:21, 11 May 2006 (UTC)Reply
Actually I feel very strongly that this vote is going well, and I wouldn't even object to the one vote restriction unless it were a plurality rather than a majority that constituted a consensus. It was brought up just for your information and future reference. Davilla 17:30, 11 May 2006 (UTC)Reply

Misspellings in Wiktionary definitions

There is a list of common misspellings here. Anyone who is looking for a little project might like to run a search (either within Wiktionary, or using Google with the domain set to en.wiktionary.org) for misspelled words in definitions in Wiktionary and correct them. For example, there are lots of "accidently"s in Wiktionary definitions that should be changed to "accidentally". (Note that there is an entry for accidently, an obsolete word, which is correct as it stands and should not be changed.) Unfortunately, the page doesn't always give the usual incorrect spelling, so some will have to be deduced.

There is a link at the bottom of the page that makes the list of words to check grow from 100 to 250 words.

(Incidentally, why does this page have two "+" tabs now?)

Paul G 11:19, 11 May 2006 (UTC)Reply

(Now that is a really good question! I've never understood why that tab sometimes appears and sometimes doesn't, but it's even more baffling that it's even possible for it to appear twice.... —Scs 14:39, 11 May 2006 (UTC))Reply
("Guess": Because in MediaWiki:Monobook.js there is code to add this "+" tab specifically to Beer Parlour. At some later date, someone added the new magic word __NEWSECTIONLINK__ at the top, which I guess does the same thing, though I cannot be sure until I've saved this as none shows up in edit mode... (Yes, I'm removing it now for the sake of scientific inquiry :) Please replace at will. \Mike 20:13, 11 May 2006 (UTC))Reply
Fun fun. That's my code from when I was first learning JavaScript and wiki customization. The same happens at the Tea room. I've commented out my code since it's slower than the new magic word. — Hippietrail 20:55, 11 May 2006 (UTC)Reply
Perhaps a page could be set up somewhere that lists all of these and records dates next to them saying when that misspelling was last hunted down and eliminated. This would be very useful. — Paul G 11:23, 11 May 2006 (UTC)Reply
Another thing of course is that most (but perhaps not all) of these should have entries in Wiktionary of the form "* Common misspelling of..." (see alot, for example). — Paul G 11:36, 11 May 2006 (UTC)Reply
Isn't our list much more extensive? I've been running my list of typos for some time now - is there a better place for this list to go? --Connel MacKenzie T C 15:01, 11 May 2006 (UTC)Reply
I'm sure you know Wikipedia has a list of misspellings, and it's pretty darn big. It might make sense to update theirs and modify it to your own needs. Seems to be one and the same.
Eventually there must be a solution so that words used in modern English, and only words used in modern English, show in blue. (The others would not be red of course, but neither blue.) Davilla 17:17, 11 May 2006 (UTC)Reply
Um, wow. There is a lot more to your request than I realized the first time I glanced at it. "Modern" English being a sub-language or something? (I assume you don't mean delete obsolete or archaic spellings, but rather have the links appear in a different color or something, right?) I'm not sure how a MediaWiki extension would actually do it, offhand. --Connel MacKenzie T C 14:43, 12 May 2006 (UTC)
In fact possibly not just modern English, but common English, excluding words like therefor and assay, words that raise eyebrows when they're used. This all seems quite a ways off, but anyhow... There must be a point where the software actually looks up the word to see what color it is. Maybe the quality it's looking for isn't called "color", rather "existence" or something, but I would extend that to simply mean the desired color, or add another field to that effect. This would be the only way to do it efficiently. The question is how that field would be filled, how the color would be determined. The difference between blank pages, nonexistent pages, redirects, and real pages is pretty easy to ascertain. But this would require an additional flag, something that could distinguish between good English words and alot of others. It would take someone familiar with the software to determine the best way to do that. A special category is an idea, or maybe using [en:word] on the page just as [fr:word] and others are added for a different special meaning. Incidentally this is a distinction needed only for common English words on the English Wiktionary, only common French words on the French Wiktionary, etc. But it might also be smart to make the solution extensible--so long as it's globally within the Wikimedia: space rather than something where anyone could choose any color for their word. Davilla 16:09, 12 May 2006 (UTC)Reply
:-) Category: Good-er, more betterer English? Seriously though, something like the "stub threshold" functionality (that Vild and I use to make regular entries appear as brown, and redirects appear as blue) shows that this concept is perhaps possible, by modifying the core MediaWiki software. That would, of course, reopen the prescriptivist/descriptivist debate, as to which terms belong in the magic category. Also, I think the developers are (justifiably) leery of touching those sections of code. --Connel MacKenzie T C 18:27, 12 May 2006 (UTC)
You are definitely the most futuristic among us, Davilla :-). I don't have a clue how this would get implemented, and moreover, which criteria would be used for it (what the heck is wrong with assay)? You see, I don't think it's really feasible right now. —Vildricianus | t | 18:04, 12 May 2006 (UTC)Reply
Sounds kinda nice, but... It's not a word alone that is common or not. It is the usage. assay = "the qualitative or quantitative chemical analysis of something" is current, and moderately common, while assay as "trial, attempt, essay" is totally foreign to me. At this point I think the idea breaks down, until we have the proposed new version of Wiktionary that has meanings as a separate database item.--Richardb 03:09, 14 May 2006 (UTC)
This is one area where =Reference notes= come into their own. The ones for alot are really useful. Widsith 17:47, 11 May 2006 (UTC)

Ah, but the page I referred to gives a list of misspellings, not of typos. "Teh" is a typo for "the", while "accomodation" is a misspelling of "accommodation". The difference is that someone typing "teh" knows that the word is spelled "the" (and, if writing with a pen, would write "the"), but someone writing "accomodation" with a pen is unaware that the correct spelling is "accommodation". A typo is a slip of the fingers that is corrected on a careful re-reading; a misspelling is ignorance of the correct spelling and will be overlooked on a re-reading.

If Connel's list of typos already does this job, then that's good. However, it would, as Davilla says, be worth while checking that the list includes the words on the website I am referring to. — Paul G 15:05, 12 May 2006 (UTC)Reply

Could you please add whatever ones that are missing to Wiktionary:List of common misspellings / w:Wikipedia:List of common misspellings? --Connel MacKenzie T C 16:56, 12 May 2006 (UTC)Reply
I spent some time building a proper, simple list from the yourdictionary.com article, so yeah, I can do that. —Scs 03:03, 13 May 2006 (UTC)Reply

Okay, short answers now, longer answers later (because there's a lot more to do here):

  1. Our list and Wikipedia's are nowhere near in synch.
  2. As complete as the two lists are, there's a remarkable number of words on the two yourdictionary.com lists (misspelled.html and 150more.html) that are not on either of our lists. In fact, I count 105 of them (out of 250). A preliminary list is below.

Among other things, Wikipedia's list is broken up into 26 per-letter lists and a master "machine format" list. I haven't investigated whether the former are built from the latter, or what.

Here's a preliminary list of the ones we're missing:

  • a while
  • accelerate
  • accumulate
  • acknowledge
  • acquit
  • axle
  • barbecue
  • bellwether
  • broccoli
  • camouflage
  • cantaloupe
  • carburetor
  • chauvinism
  • chili
  • chocolaty
  • coliseum
  • collectible
  • colonel
  • column
  • conscience
  • conscientious
  • coolly
  • daiquiri
  • deceive
  • defendant
  • defiant
  • desiccate

  • deterrence
  • difference
  • diorama
  • disappoint
  • discipline
  • dissipate
  • drunkenness
  • dumbbell
  • ecstasy
  • especially
  • exceed
  • exercise
  • exhilarate
  • experience
  • explanation
  • fiery
  • flabbergast
  • flotation
  • fourth
  • fulfill
  • genius
  • gross
  • handkerchief
  • horrific
  • ignorance
  • immediate

  • inadvertent
  • ingenious
  • inoculate
  • irascible
  • jewelry
  • judgement
  • kebab
  • kernel
  • lightning
  • liquefy
  • lose
  • magically
  • marshmallow
  • memento
  • mischief
  • nauseous
  • octopus
  • onomatopoeia
  • perseverance
  • physical
  • pigeon
  • pistachio
  • plenitude
  • preferable
  • presumptuous
  • principal/principle

  • publicly
  • puerile
  • putrefy
  • questionnaire
  • raspberry
  • receive/receipt
  • sacrilegious
  • sandal
  • savvy
  • scissors
  • sensible
  • septuagenarian
  • shish
  • simile
  • special
  • supersede
  • tableau
  • tariff
  • their/they're/there
  • too/to/two
  • tragedy
  • twelfth
  • ukulele
  • vicious
  • village
  • you're/your

Scs 05:57, 16 May 2006 (UTC)Reply

Recent changes text

As my computer is... not doing so well right now, can somebody else maintain Special:Recentchanges? I can only get on like once a day, and just for short periods of time, so I don't really have time to do it right now. --Rory096 09:39, 12 May 2006 (UTC)Reply

Pushing for the definitive WikiSaurus name and namespace

Purpose of this thread is to achieve both the following:

  1. A final decision on whether to keep WikiSaurus as the definitive name for the Thesaurus part of Wiktionary.
  2. The establishment of an independent namespace for it, instead of the current pseudo-namespace. This also for WikiSaurus talk of course.

Personally, I'm neutral to the first. I don't like it (it reminds me too much of prehistorical creatures), but I don't have a valid alternative right now. I didn't find any threads about it, but then, I didn't look very well. The second is an absolutely necessary beginning in the current developments that should bring the Thesaurus out of its pre-embryonic state. —Vildricianus | t | 12:30, 12 May 2006 (UTC) ....So, would thesaurus also remind you of prehistoric creatures?

Is the 'S' capitalized? Davilla 15:46, 12 May 2006 (UTC)Reply
Why not simply Thesaurus: ? There is no need for the Wiki part in the name (too much a dinosaur name...) and Thesaurus is more intelligible. - Dakdada 15:58, 12 May 2006 (UTC)

That's what I like. Voting without bothering to put forward arguments! :-) Before perhaps actually considering what a name is used for. Before all the arguments are put. Will you later read the additional arguments put, and reconsider your vote?--Richardb 01:54, 14 May 2006 (UTC)Reply

I think the thesaurus name should be preserved in some way. For uniformity, preferably in the same way "encyclopedia" is preserved in Wikipedia, and "dictionary" is preserved in Wiktionary.
It's part of "marketing". We need a name which both fairly obviously means thesaurus, but also is our unique name for it. There are already approximately 12,000 external links to "WikiSaurus". How are people going to refer to our Thesaurus ? Are they going to have to have links to the "Wiktionary Thesaurus". Bit of a mouthful. And currently, do they refer to the "Wikipedia Dictionary", or to Wiktionary? --Richardb 01:54, 14 May 2006 (UTC)Reply
  • Comments:
I believe the "WikiSaurus" can grow to be more than a "simple" thesaurus, as it is not bounded by the paper limitations of a Thesaurus. So I would not like to see us limit our thinking by adopting a limited thinking name of "Thesaurus" with all the baggage that carries. (A publication, usually in the form of a book, that provides synonyms (and sometimes antonyms) for the words of a given language.) --Richardb 17:36, 12 May 2006 (UTC)Reply
Calling a thesaurus a thesaurus can only help people visiting understand what the sub-project is all about. Calling it a Wikisaurus seems to only cause confusion. The topics that I've seen debated about it were all about criteria for inclusion/what line-in-the-sand to use, or about formatting/layout. I haven't heard any suggestions about it becoming anything more than a thesaurus, to date. It is still small and fairly easy to correct now - why wait? --Connel MacKenzie T C 17:52, 12 May 2006 (UTC)
Hands up anyone else who is confused by the name WikiSaurus/Wikisaurus ? Are you also confused by the names Wikipedia and Wiktionary ?  :-)Is that supposed to be a real argument Connel ?--Richardb 01:54, 14 May 2006 (UTC)Reply
There have been a number of extension suggestions, certainly beyond synonyms and antonyms to all manner of other relationships. Whether they are valid or not I am not sure. Possibly those relationships should be put in the main word entry. One suggestion was to have a "range" relationship, eg: freezing, cold, tepid, warm, hot, blistering. This is a minor part of the argument for the name though.--Richardb 02:00, 14 May 2006 (UTC)
Well I'm sure, Richard, that you know what thesaurus literally means: treasury, or storehouse. I certainly think that, whichever direction it is we decide to follow, this name is quite suitable. —Vildricianus | t | 18:29, 12 May 2006 (UTC)Reply
Jay Leno once did a joke that scientists had discovered a new dinosaur called the thesaurus, which defended itself from predators with flowery language. BDAbramson T 18:49, 12 May 2006 (UTC). Is that an argument for or against anything??Reply
This seems like a non-issue when we lack content to such an appalling degree. Don't spend time thinking about the name or debating it, spend time making up lists of semantically and thematically related words, then make WikiSaurus into something deserving of a name at all. - TheDaveRoss 19:03, 12 May 2006 (UTC)Reply
The name itself, indeed, is not very important. WikiSaurus being a real namespace is. But we need a settled name before applying it, right? —Vildricianus | t | 19:15, 12 May 2006 (UTC) Totally support Vildricianus that it is better to settle this as early as possible, before the thing gets much bigger. Already Connel is asking is it too big to rename now.--Richardb 01:54, 14 May 2006 (UTC)Reply

Good, that's at least 6 votes, which is remarkable! On a technical note, though: how are things with the namespace manager thingy? Is this available to us, and if so, who has access to it? Developers, bureaucrats? —Vildricianus 21:21, 20 May 2006 (UTC)

Meta indicates that Bureaucrats should have it at the very bottom of their Specialpages. --Connel MacKenzie T C 02:14, 22 May 2006 (UTC)Reply

Continuing the debate

Here's what I'm thinking right now:

  • WikiSaurus/Wikisaurus is not the right name for the namespace for these reasons:
    • It's confusing. People do think of a dinosaur. Let's keep it easy, clear and simple please.
    • It's not appropriate to append Wiki- to everything (cf. Rodasmith above). That'd become silly. So why do it for our thesaurus part? Because it is said that Wiktionary is the free dictionary and thesaurus? That's invalid. Our thesaurus is only part of the project, as are our Appendixes, Indexes, Rhymes, Quotations subpages, Concordances etc. All of these, or most, are absent from standard dictionaries, like a thesaurus usually is. So why allow exactly our thesaurus to become more Wiki than everything else? Already we have way more Rhymes: pages than WikiSaurus: pages. WikiRhymes? No.
  • WikiSaurus/Wikisaurus is the right name for the following reasons:
    • It's unique. Granted, every Google search that turns up "WikiSaurus" has to do with our project. And there are 13,000 search results. Most of these, though, apart from the mirrors, seem to profit from the huge amount of vulgar language that's in WikiSaurus, so it's rather defaming to have 13,000 Google hits.
    • There's too much infrastructure in place. Mmm, perhaps. On the other hand, there's relatively little infrastructure in place, compared to, for example, the number of different English inflection templates that are in use across our dictionary content. But we're still going to change them all, aren't we? Yes we are. So never mind a couple of WS: pages and links.
    • There's no burden or baggage from what people expect from a Thesaurus. True, we can fill in the idea according to our wish. But we can do that just as well when we use "Thesaurus" instead of our own unique name, can't we? Actually, I own a couple of thesauri, and all of them seem to have a unique interpretation of the concept. That's right, a thesaurus is "a treasury of words", and it's up to its creators how to interpret that.

Summary: no problem for me to have Wikisaurus as the definitive name, as long as there's a definitive name. But personally I support Thesaurus:, per the above arguments. —Vildricianus 17:09, 28 May 2006 (UTC)

proverb vs an aphorism vs an adage vs a maxim

I was going to categorise a saying clothes don't make the man as an aphorism. Then saw that there were synonyms to aphorism. And Widsith pointed out we already have Category:en:Proverbs. So what is the difference? Can we categorise them all as proverbs? Or should we distinguish between them? (Widsith says it's six of one and half a dozen of the other!) --Richardb 17:30, 12 May 2006 (UTC)

This bears a similarity to the abbreviation/acronym/initialism issue, which also has never been properly "solved." That is to say, all sub-categories are more reasonably listed as a combined list somehow, in addition to being listed in their more granular sub-category. On the other hand, a large combined list is often considered to be "too big," here. I think the approach I took with a/a/i categories is a reasonable and useful one, but I know that several disagree. Fresh ideas/viewpoints would help on both these issues, I think. --Connel MacKenzie T C 18:06, 12 May 2006 (UTC)Reply

Customize italicized parenthesized terms

I'm just finalizing a template which allows each user to choose between:

  1. (foo) — my favourite because it’s beautiful
  2. (foo) — butt ugly
  3. (foo:) — straight out of satan’s bottom

But to show how impartial I am, I'm making the default #3 since I see it everywhere. Please discuss here which one the majority prefer and if necessary vote on it.

The template is {{italbrac}} - I can change the name if you like. I've used it in the article media. — Hippietrail 21:58, 12 May 2006 (UTC)Reply

  • As an added bonus, I've added it to {{idiom}} to show how it can be used with other templates. This is designed to be a very simple single-function formatting template only. But it likes to play with other templates! — Hippietrail 22:18, 12 May 2006 (UTC)Reply
  • .ib-inner, .ib-outer { display: none !important } and depending on whether you want to show or hide the colon set .ib-colon to inline or none, but it seems you must always use !important. — Hippietrail 23:27, 12 May 2006 (UTC)Reply
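  • A sketch of the colon toggle described above, for a personal stylesheet. It assumes the ".ib-colon" class name exactly as given in this comment (later revisions of the template may have changed it), and only one of the two rules would be active at a time:

```css
/* Option A: always hide the colon. */
.ib-colon { display: none !important; }

/* Option B: always show the colon (use instead of option A). */
/* .ib-colon { display: inline !important; } */
```

    The !important flag is kept because, as noted above, the override did not seem to take effect without it.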
Now that is beautiful: THANK YOU. --Connel MacKenzie T C 05:16, 13 May 2006 (UTC)Reply
Correct me if I'm wrong. #1 should be the default, and #3 on the example page (as it shows for me) isn't very common. The default style for synonyms is also different, using a colon after the parenthesis.
Perhaps the templates could be named for function, as per {context} below, rather than the current style of parenthesized italics or whatever it may be. Davilla 09:54, 14 May 2006 (UTC)Reply
I also want #1 default but don't want to force my opinions or taste on everyone. I hit random a lot yesterday trying to find what was most common but it was very inconclusive. I also asked Connel to grep the SQL or XML dump for a scientific answer to which is most common, but he hasn't had the time yet. If anyone else can do it please see Connel's talk page for what I need - there are 8 possible variations - I don't know which are in use let alone most common. I can make #1 default but fear less people will notice and therefore less people will come here and comment/vote/opine. As for the colon outside for synonyms, would that be (foo): or (foo): ? — Hippietrail 23:30, 14 May 2006 (UTC)Reply
Due to popular demand of 2 people I've made #1 the default. I guess now I'll hear the wrath of somebody who liked the other way. In any case keep communicating the ideas. P.S. You don't even have to refresh your caches! — Hippietrail 00:35, 15 May 2006 (UTC)Reply
Hoorah. Thank you for seeing sense. There is precedent in print dictionaries for #1. I can't imagine what possessed you to sell your soul to Beelzebub and initially plump for #3 :) — Paul G 10:16, 16 May 2006 (UTC)Reply
Gee, looking at this, this, this and this, I'm left wondering what pagan dictionary you are using.  :-)   The parentheses are simply unacceptable, (especially) (when) (stacked) (ridiculously.) We should perform a robotic exorcism of these cacodaemoniacal constructs.  :-)   --Connel MacKenzie T C 07:34, 18 May 2006 (UTC)
I don’t have any idea what Hippietrail’s No. 3 is supposed to represent, but as a long-time professional typographer, I can assure you that, in U.S. English, at least, parentheses follow their base contents. That is, if, for example, you put parens around an italicized Latin term such as (hoc), all of it goes in italics. If there are mixed contents, the parens follow the base part: (the Latin is hoc); or (hoc is the Latin word); but (je ne parle pas le English). Usually the contents are not mixed, and parentheses always follow the contents. This goes not only for italics but also for bolding and for choice of typeface. If the contents are Times Roman, the parens are too. If the contents are mixed Times Roman and Arial, the parens follow the base part. If the base content is superscripted, subscripted, or formatted in any other way, it applies to the parentheses as well. This holds true not just for parentheses, but also for single and double quotes, and to colons. If a sentence or phrase ends in a colon, the colon receives the formatting treatment that was given to the base part of the sentence, regardless of how the word immediately before the colon was formatted. For example, "a list of auto parts:" or, "a list of auto parts:". These rules also hold for semicolons, commas, periods, question marks, and exclamation marks. This is why fontographers make italicized parentheses and punctuation that are actually italicized. Some symbols, such as bullets, do not do this, and therefore they are font-independent. —Stephen 18:11, 20 May 2006 (UTC)Reply

This code:
.ib-outer { display:none }
.ib-content { font-variant: small-caps; font-style:normal; }

is just fricking amazing! Absolutely! Perfidiously brilliant! Are we getting {{italbrac}} implemented everywhere? What are we waiting for? —Vildricianus 21:18, 20 May 2006 (UTC)

  • In answer to Stephen, #3 is not mine at all: of the 8 possibilities when including variations on whether the colon is inside or outside the parentheses and whether or not it is italicized, all but one are actually used here on Wiktionary. The only one which is mine at all is #1 since that's what I use when I'm adding text. Also, I believe everything you say since you have the experience, but when looking at actual books I see both systems. Just today I saw #1 in a language-related book I was looking at here in Costa Rica. That's why I think giving people a choice is the best answer. It's true that customization requires editing CSS right now and that's too technical for the average user - but hopefully we can improve that too, possibly with interaction from MediaWiki or a developer.
  • Secondly, I've removed the colon from this template as the colon is only used in certain situations so it seems. For those situations I have split off a 2nd template, {{italbrac-colon}}. It includes a colon both inside and outside the parentheses so a user will either always hide colons or display the one she wants to see. Outside makes more sense to me, but I'm not an expert. To manipulate the colon, here is an example:
    .ib-outer .ib-colon { display: inline; font-style: italic }

Hippietrail 23:06, 25 May 2006 (UTC)

  • I've simplified these templates and altered the way their CSS works so you might need to edit your CSS file if you customized it already. Please read the updated documentation here. — Hippietrail 20:22, 29 May 2006 (UTC)Reply

Wiki machine translation?

This hit me today when I was trying to machine-translate some Japanese into English: Knowing some of each language, it would be easy for me to correct the atrocious output of the translator. If there was a way the translator could take my input and use it to guide future translations, translation quality would improve significantly. Like Wikipedia, these corrections could be peer-reviewed.

Anyone heard of such a project? —This unsigned comment was added by 69.181.40.145 (talkcontribs) 06:45, 13 May 2006.

I would guess any attempt at the like would build from WordNet, although I agree that a wiki would be more powerful. I find foreign languages fascinating, and I have my own ideas on how to put a wiki to work, toward a different goal. The difficulty has got to be the programming. Davilla 18:04, 13 May 2006 (UTC)Reply
Machine translation between languages like English and Japanese is notoriously difficult because the two grammars are so very different. Most of the atrocious problems in that area would not be directly remedied by creating a wiki-style system of letting readers expand the translation vocabulary. Just figuring out what to use for the English sentence subject is challenging. Borrowing from Wikipedia:
  • 僕は鰻だ
  • 僕 ("I"/"manservant") + は (topic marker) + 鰻 ("eel") + だ (copula)
  • "As for"+"me" + "an eel"+"is"
  • "Regarding me, it's an eel."
In a cartoon with an eel speaking, it might translate as "I am an eel." In a restaurant, however, it translates as "I'd like an eel." So you see, machine translation from Japanese depends on context and would not clearly be assisted by wiki-enabling the word-to-word translation tables. Rod (A. Smith) 21:49, 15 May 2006 (UTC)
Indeed. I'm a professional translator myself, to get the disclosure out of the way.  :) Nonetheless, some of the more hilarious bits of non-English I've run across have been generated by deliberately abusing online machine translation engines. In one instance, translating "My dog has fleas" just from English to Japanese and back again, without going through any other language pairs, results in "there is a chisel in my dog." Wow. I mean, wow. I couldn't smoke enough to get to that point myself, but hey.
Looking into it, it looks like part of the problem stemmed from how the translation engine parsed the sentence -- nomi in Japanese is both "flea" and "chisel", depending on context and subtle grammatical cues. Going into Japanese, the engine decided on nomi ga aru ("flea/chisel" [subject] "exists (inanimate)"). Since the verb chosen is only for inanimate objects, nomi here could only be "chisel".
But without the kind of living-breathing-human understanding, any translation based on a word-for-word engine model is doomed to fail -- languages are not so tidy. And without a deeper understanding of both source and target languages, even a human might be hard-put to figure out that "there is a chisel in my dog" was originally meant to convey "my dog has fleas" -- they'd only know that it sounds purty durn weird, and that something was probably wrong, but they wouldn't know how to fix it.
However, the basic idea behind what our anonymous GP poster describes is in fact the way that most machine translation trainers are going, i.e. building up the capabilities of the system based not on word pairs but rather on whole text pairs, with the MT-produced translations edited by knowledgeable humans and fed back into the system. Some of the fancier systems also go about using a statistical analysis of the paired texts to try to predict how to translate similar texts.
As one might imagine, these efforts at building up and correcting a machine translation system are extremely labor intensive, and as such these systems are not cheap. It's also worth pointing out that any such system is limited to what has been fed into it -- if your MT system is all about legal boilerplate, good luck getting it to produce sane medicalese.
The upshot is that humans remain your best bet for flexible spot-market translation needs. MT is great for churning out masses of documentation, particularly if the end result is only needed for informational or in-house purposes, but for a polished finished document you wouldn't be embarrassed to post in public, you're still going to need a human at some point of the process -- either as the translator, or at the bare minimum as the editor.
As a side note, those interested in poking about might want to look at Lost in Translation, a funny project similar to the summer camp game of Telephone. Type in a sentence or two, and the website feeds that through several language pairs, showing you how the original changes along the way. "My dog has fleas." became, after seven manglings, "_ with the mine of the dog of the chisel _". I find the punctuation changes most puzzling. Even more interesting, perhaps, is how this process stabilizes past a certain point, where feeding the resultant gibberish in at the front gives you more or less the same gibberish at the end. Food for thought, at any rate. Cheers, Eiríkr Útlendi | Tala við mig 04:21, 16 May 2006 (UTC)Reply
From your description of the way modern MT works, I would think that's possible within a wiki framework, except that the concept isn't transparent to the average person. Wikipedia is successful because everyone knows what an encyclopedia is and therefore the implicit goals of the project. Not everyone knows what an automated translation table of hand-coded text pairing mumbo jumbo database is. In other words, when knowledgeable minds come together to consider the engineering of such a project, the focus will have to be on minimizing the learning curve for your necessarily multilingual contributors. Davilla 17:36, 25 May 2006 (UTC)Reply
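
The feedback loop described above (machine output edited by knowledgeable humans and fed back into the system) can be sketched very roughly as a lookup of reviewed text pairs that takes priority over the raw engine. This is only a minimal illustration, not how any real MT trainer works: the `TranslationMemory` class and the `toy_engine` stand-in are hypothetical names invented here, and the example sentences are the eel example quoted earlier in this thread.

```python
from typing import Callable, Dict


class TranslationMemory:
    """Keeps human-reviewed translations of whole segments (hypothetical sketch)."""

    def __init__(self, machine_translate: Callable[[str], str]) -> None:
        self._machine_translate = machine_translate
        self._reviewed: Dict[str, str] = {}

    def translate(self, source: str) -> str:
        """Prefer a reviewed translation; fall back to the raw engine output."""
        key = source.strip()
        if key in self._reviewed:
            return self._reviewed[key]
        return self._machine_translate(key)

    def correct(self, source: str, better: str) -> None:
        """Record a human correction so later requests reuse it."""
        self._reviewed[source.strip()] = better


def toy_engine(text: str) -> str:
    # Stand-in for a real engine: it only knows one literal rendering.
    literal = {"僕は鰻だ": "Regarding me, it's an eel."}
    return literal.get(text, text)


memory = TranslationMemory(toy_engine)
first_try = memory.translate("僕は鰻だ")
memory.correct("僕は鰻だ", "I'd like an eel.")
second_try = memory.translate("僕は鰻だ")
```

Note that an exact-match lookup like this only helps when the same segment recurs, and it stores one correction per segment regardless of context, which is exactly the limitation the eel example illustrates.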

categories - toward consensus?

Hi, I have been looking at categories and I think there are way too many of them. My difficulty is that if I look for a word (let us say a technical term I don’t understand), e.g. consanguinity, I would expect to find it in the category [[Law]], but not necessarily in a category [[Law of persons]], because I wouldn’t know that it related to the law of persons.

This whole discussion is surely based on a false premise. If you are looking for a word consanguinity, logically you would use GO and SEARCH to find the word, not categories. Right? If you expect to find it in the category LAW and it's not, then guess what: you put it in the category LAW. No revision of categories is called for at all. This is not Wikipedia.--Richardb 02:38, 14 May 2006 (UTC)
Not to speak for Andrew massyn, but I think the scenario he described is when a reader can't quite remember the word but will know it on sight. Rod (A. Smith) 17:24, 14 May 2006 (UTC)Reply
Ta, that's what I should have meant. Andrew massyn
A possible solution for the editor is to put it in [[Category: Law]], [[Category: Civil Law]], [[Category: Law of Persons]], [[Category: Family Law]], [[Category: Law of Inheritance]], [[Category: Criminal Law]], [[Category: Law of Marriage]], but in practice this is not happening. Further, because consanguinity relates to at least three branches of law, as well as the overarching categories, it becomes unwieldy and impossible to deal with.
A second possible solution is to get rid of as many sub-categories as possible. This is the view that I favour. If one is seeking a legal definition, surely one category is sufficient. As commonly pointed out, we are not an encyclopaedia. If one is to distinguish between [[law of persons]] and [[family law]], this distinction is at its most basic an encyclopaedic definition.

When discussing parts of speech, my view is that the category [[English Nouns]] for example is useless. The standard when editing words is to put in the part of speech.

Parts of speech like [[transitive verbs]] or (my personal worst) [[uncountable nouns]], should be on the article page. Again, if a specialist is looking for the distinction between perfect participles and pluperfect participles, this is not the forum for it.

My thought is to have as few general categories as possible, and if necessary to link between the general categories. Thus for example, the category [[Food and Drink]] is good. [[Category: Chickens]] is not. Category [[Poultry]] (which I created) is not. [[Category: Italian Dishes]] is also not. I realize that certain words will get lost in certain categories, e.g. poulter would disappear from [[Food and drink]], but could well find itself revived in a [[Category: Work and Leisure]].

If there is general consensus, it would entail a lot of recategorising and tidying up of disused categories, but my personal view is that it is worth it.

What is the community's view on the above? If the answer is in general yes then each category would have to be looked at individually and a decision made on each one. If no then that is the end of it. Andrew massyn 14:27, 13 May 2006 (UTC)Reply

  • I am probably unique in that I wouldn't mind if ALL categories were removed. Do we have any evidence that our users actually use them at all to find words (or even know that they exist)? I wouldn't be surprised if the Wiki went faster without them. SemperBlotto 14:33, 13 May 2006 (UTC)Reply
I also tend to this point of view. Widsith 14:43, 13 May 2006 (UTC)Reply
Agree as well. Ordering words by semantic field is something that belongs in a Thesaurus or Appendix. In the main namespace, "temporary" categories, like [<language>:<POS>] for non-English words are useful, though, and I know they're being used. Same goes for maintenance categories like TTBC or RFC of course. —Vildricianus | t | 15:01, 13 May 2006 (UTC)Reply
  • Just one point: don't forget that categories are very useful to people interested in languages using another writing. For example, how do you think that hieroglyphs can be found without categories? And this is true for other languages, too. Lmaltier 20:18, 13 May 2006 (UTC)Reply
Agreed. Paul Willocx 20:23, 13 May 2006 (UTC)Reply

Yes, ordering words by semantic field is something that belongs in a thesaurus or appendix, but from where should the thesaurus or appendix acquire the semantic information?

While many categories may be irrelevant to most editors, that does not make them useless. Categories are WikiMedia's way of associating meta-data with entries. It does not hurt anything for each term to appear in its appropriate positions under Category:All languages. There are potential uses for the categorization (e.g. automated thesaurus features, automated phrasebook organization, automated translation) even if any particular user never browses those categories directly in the dictionary portion of the project.

By fixing the problems with our current application of grammar categories, context categories, inflection templates, and context templates, we can simplify editing, gain consistency, and retain the useful category functionality:

  • Current problems:
  1. It is difficult to know where in Category:*Topics to look for any given word (e.g. "grand larceny").
  2. When editing, it is difficult to know the right categories to apply.
  3. Editors differ in the names and styles they use for context tags.
  • Proposed solution:
  1. Use Category:Inflection templates (e.g. {{en-noun-reg-y}}) to assign grammar categories instead of hard-coding entries with categories like [[Category:English nouns]].
  2. Instead of hard-coding things like [[Category:Criminal law]] in an entry, use {{context}}{{cattag}} on the definition line to assign Category:*Topics, e.g.:
    # {{cattag|US|criminal law}} [[larceny]] of [[property]] whose...
    1. Template:italbrac larceny of property whose...
  • Benefits of retaining specific categories:
  1. Consistency: Using this system, editors need not concern themselves with how to format definition contexts or with what categories to use for a term, because the Category:Context templates and Category:Inflection templates will handle categorization and {{context}}{{cattag}} will handle formatting.
  2. Wiktionary can retain the flexibility to be used as the database for other projects where the software must be able to read meta-data of entries (e.g. in a hypothetical WikiMedia Translator or Thesaurus).
  3. Review of topic categories can reveal language-specific deficiencies in, e.g. Category:Criminal law.

Should we branch this conversation off into a new subpage of BP? Rod (A. Smith) 21:39, 13 May 2006 (UTC) (I withdraw my recommendation for {{context}}, as {{cattag}} has precedence.) Rod (A. Smith) 15:42, 19 May 2006 (UTC)Reply

Rod. You could try creating a policy think tank page.--Richardb 02:33, 14 May 2006 (UTC)Reply
Yes, that would be a good way to eliminate any further discussion, as it will remain hidden from everyone's view.  :-) --Connel MacKenzie T C 21:23, 14 May 2006 (UTC)Reply
I know User:Eclecticology was very interested in how the categories were turning out. I also know he has had tremendous influence on implementing the current scheme. I vaguely recall him saying something about a one month wikibreak (although I can't find that reference now) so he should be back any day now. I do not think it would be reasonable to proceed with a large-scale re-engineering of our current category scheme in his absence. --Connel MacKenzie T C 00:49, 14 May 2006 (UTC)Reply
Wow. If Connel is advocating patience, then we definitely must wait!

Seriously, any contemplation of removing category information (i.e. deleting knowledge) must be considered for at least three months before any such destructive action is taken.--Richardb 02:33, 14 May 2006 (UTC) I created a template {{rfcc}} for category page cleanup a while back. And I've just updated it by adding these words to the banner heading: A category page should/must include a description of what the category is for, and how it fits into any structure of categories and sub-categories. My view is that we should pounce on any new categories that are created (but how to detect them?) as quickly as possible, and ask the user who created them to document what the category is for. For a category without an explanation is often a waste of time indeed. I'd be happy if we could tackle/cleanup (by consensus, and the cleanup process) some of the categories which have very few entries and no explanation.--Richardb 02:47, 14 May 2006 (UTC)

Since this is the first time {{rfcc}} has been announced here, should we wait three months before using it then? --Connel MacKenzie T C 21:23, 14 May 2006 (UTC)Reply
I agree with a lot of the above discussion, such as templates being the primary force by which categories are added, and the cleanup of categories tackling the sparsely and over-populated before any others.
I'm not up-to-date on which templates are used to show context. I always hard-code, which means categories aren't being added. Is there a way to add multiple, arbitrary contexts such as chemistry and physics, or logic and computer science, to a single definition? And are these contexts narrowly enough defined for our purposes? Davilla 09:48, 14 May 2006 (UTC)Reply
My suggestion above (which I should move to a policy think tank) is to have a complete set of context tags, exactly corresponding to the tags that editors want to display on the definition line. Each of those context tags will have both display text (e.g. chemistry) and a set of categories (typically one per context, e.g. Category:Chemistry). The display formatting (i.e. the parentheses and italics) would be handled by {{context}}. It will then be easy to manage what contexts are in use at Wiktionary. Rod (A. Smith) 17:24, 14 May 2006 (UTC)Reply
I don't think it is a good idea to limit the potential sub-category breakdowns while Wiktionary is still in its current embryonic state. Having a uniform way of adding them (such as {{cattag}}, {{cattag2}} or {{context}}) has several benefits. --Connel MacKenzie T C 21:23, 14 May 2006 (UTC)
I absolutely can't stand the numbers added to e.g. see also templates. Isn't there a way to take an arbitrary number of arguments? Regardless, these need to be funnelled somehow so we don't end up with categories soccer, Soccer, football, Football, soccer (football), Soccer (football), Soccer (Football), and probably a few items under "foot ball". But before that can be tackled we need to know how subcategories will be handled. There are too many open questions at this point! Davilla 20:41, 15 May 2006 (UTC)Reply
{{cattag}} and {{cattag2}} predate the ability to have parameter defaults in templates, IIRC. Perhaps {{cattag}} (category + tag) should be upgraded to our current standards. The lcfirst: and ucfirst: magic keywords also did not exist when these templates were created, but should be used now. --Connel MacKenzie T C 21:07, 15 May 2006 (UTC)Reply
OK, {{cattag}} now takes up to nine (9) such tags. --Connel MacKenzie T C 21:14, 15 May 2006 (UTC)Reply
Modified using template:foreach Davilla 02:17, 29 May 2006 (UTC)Reply
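
The funnelling problem raised above (soccer, Soccer, football, Football, and so on all ending up as separate categories) amounts to normalizing a context tag before turning it into a category name. The following is only a rough sketch of that idea outside the template system: the `ALIASES` table and the `category_for` function are hypothetical, and treating "football" as a synonym of "soccer" is an editorial assumption made purely for the example. The final capitalization step mimics what the ucfirst: magic word does to the first letter.

```python
# Hypothetical alias table: maps variant tags onto one canonical tag.
ALIASES = {
    "football": "soccer",
    "soccer (football)": "soccer",
    "foot ball": "soccer",
}


def category_for(tag: str) -> str:
    """Return a single canonical category name for a context tag."""
    # Collapse runs of whitespace and ignore case when matching.
    key = " ".join(tag.split()).lower()
    # Funnel known variants onto the canonical tag.
    key = ALIASES.get(key, key)
    # Capitalize only the first letter, as ucfirst: would.
    return "Category:" + key[:1].upper() + key[1:]


variants = [
    "soccer", "Soccer", "football", "Football",
    "soccer (football)", "Soccer (football)", "Soccer (Football)", "foot ball",
]
categories = {category_for(v) for v in variants}
```

Here every variant lands in the same category, while an unrelated tag such as "criminal law" simply becomes "Category:Criminal law". How subcategories should be handled is left open, as in the discussion.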

reflexive verbs

This is an issue which concerns certain languages which use reflexive pronouns. There are some (not many actually) entries for reflexive verbs, eg French se lever. But I think that's a bit silly, and that it would be better going under the lever page, on a def line having a (reflexive) marker instead of a (transitive) one. Is there any policy on including reflexive pronouns in the page titles? And what do others think? To me it seems a bit like having a page for kill oneself etc. in English. Widsith 11:23, 15 May 2006 (UTC)Reply

Yes, I'm familiar with that for Spanish; but the changes are entirely regular and never alter the stem (don't know if it's the same in Italian) – also, that implies you'd also need entries for lavarmi etc etc - I think personally all of them (including lavarsi) should be redirects (of some kind). The forms are very obvious to anyone who knows the language even slightly, or at least that's the case with Spanish. In French of course the pronoun is always a separate word so there's even less call for it, IMO. Widsith 11:36, 15 May 2006 (UTC)Reply

I agree with Widsith. Reflexivity, if you will, is just one possible option of the transitivity of verbs, at least in Spanish and French verbs. That is, some verbs are intransitive, some are transitive, some are reflexive, and some have multiple senses including some from each of those categories. So, by analogy with plural entries, the reflexive entries (at least for Spanish and French verbs) should redirect to the primary entries or should have a simple definition pointing to the verb's main entry (e.g. on "lavarse": "# {{reflexive of|lavar}}"). The main entry then shows the different senses, including intransitive, transitive, and reflexive. Rod (A. Smith) 15:26, 15 May 2006 (UTC)Reply
I don't see the value of diverging so far from standard en.wiktionary practice. We have separate entries for each spelling. So something like lavarmi should never be just a redirect. But, as Rodasmith indicates, the wiktionary way, would be to have a short entry for it, indicating it is a form of lavar. This is especially helpful for those of us who speak English, but not Spanish.
I strongly agree with Widsith that "reflexive" should be a definition-specific qualifier, at the start of a definition line. I don't object to having an entry for French se lever. But it seems to make more sense if listed as a definition line of lever#French, under verb, qualified as {{reflexive}}. --Connel MacKenzie T C 16:00, 15 May 2006 (UTC)Reply
I don't agree here; I support having these entries on separate pages (see zich herinneren). My reasoning here is that sometimes, it's possible that there is no non-reflexive variant (see zich gedragen vs. gedragen). Vildricianus 19:05, 15 May 2006 (UTC)Reply
But even in that case, wouldn't it still be better to define the word just once, rather than at every variation of the reflexive? How have you singled out which form to use? The other pages aren't filled out, so I don't know what you have in mind. Davilla 20:48, 15 May 2006 (UTC)Reply
  • I think we should do just as we do with past and present participles. Sometimes they have a special sense as an adjective which warrants a full entry, when they don't they can have the mini entry or be a redirect. A case I can think of that coincides with a spelling in another language is ververse (in a way parallel to dardame). Print dictionaries always carry the reflexive senses in the same entry but after putting the reflexive spelling in the same font and style as the headword. There are a few words which have no non-reflexive sense and these are then the primary/only headword. — Hippietrail 04:31, 16 May 2006 (UTC)Reply
Also, in Italian (don't know about other languages) there are some verbs (addirsi "to be suitable", for instance) for which there is no "normal" form (I'll get round to it sometime). SemperBlotto 10:28, 16 May 2006 (UTC)Reply

Changed Proposal for Policies and Guidelines to Semi-Official Status

After giving notice in February, I have now upgraded Wiktionary:Proposal for Policies and Guidelines to Wiktionary:Policies and Guidelines - Policy, and changed the status to Semi-Official.

At this time, I will leave the redirect in place, and will slowly replace the important links.--Richardb 12:09, 16 May 2006 (UTC)Reply

Urgent: experiment gone wrong??

I've got some strange bugs right now that weren't here this afternoon:

  1. HTML displayed as text: This is a <a target="_blank" href="/wiki/Help:Minor edit" class='internal' title="Minor edit =typos, formatting etc (opens a new window)">minor edit</a>
  2. Contents of edit box being reformatted - adding linebreaks.

I do have custom stuff in my js but it wasn't doing this before. The global js doesn't look changed. I'll doublecheck my js now but just in case somebody is doing js experiments somewhere, I'm reporting it here so it can be stopped soon. — Hippietrail 23:31, 17 May 2006 (UTC)Reply

    1. 2 seems to be a false alarm. It looks like it was some weird dormant side-effect of my own custom js. Still, if you do see anything odd, best report it here just in case. #1 is still current though... — Hippietrail 23:37, 17 May 2006 (UTC)Reply
  • Goofy. I'm seeing #1, too. Specifically, next to the "this is a minor edit" checkbox on the edit page. Somebody changed something, that's for sure! Anybody want to 'fess up? :-) —Scs 01:56, 18 May 2006 (UTC)Reply
You all probably know this, but that's the text of MediaWiki:Minoredit. The MediaWiki server for some reason no longer believes that the HTML there is balanced and so it's escaping the HTML. No changes or deletions have recently occurred on that resource or anything near it, though. Seems like a bug. Could an admin please try making a minor change to MediaWiki:Minoredit to see if that kicks the server into re-reading it? Rod (A. Smith) 03:13, 18 May 2006 (UTC)
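
To illustrate the behaviour guessed at above (markup that is judged unbalanced gets escaped and therefore shows up as literal text), here is a minimal sketch. It is not MediaWiki's actual code, which is not quoted anywhere in this thread; `is_balanced` and `render_message` are hypothetical names, and the check is a deliberately simple tag stack built on Python's standard `html.parser`.

```python
from html import escape
from html.parser import HTMLParser

# Elements that never take a closing tag.
VOID_TAGS = {"br", "hr", "img", "input", "meta", "link"}


class _BalanceChecker(HTMLParser):
    """Tracks open tags on a stack and flags mismatched closing tags."""

    def __init__(self) -> None:
        super().__init__()
        self.stack = []
        self.balanced = True

    def handle_starttag(self, tag, attrs):
        if tag not in VOID_TAGS:
            self.stack.append(tag)

    def handle_startendtag(self, tag, attrs):
        # Self-closing tags such as <br/> open and close at once.
        pass

    def handle_endtag(self, tag):
        if not self.stack or self.stack.pop() != tag:
            self.balanced = False


def is_balanced(fragment: str) -> bool:
    checker = _BalanceChecker()
    checker.feed(fragment)
    checker.close()
    return checker.balanced and not checker.stack


def render_message(fragment: str) -> str:
    """Pass balanced markup through; escape anything judged unbalanced."""
    return fragment if is_balanced(fragment) else escape(fragment)


good = '<a target="_blank" href="/wiki/Help:Minor edit" class=\'internal\'>minor edit</a>'
broken = '<a href="/wiki/Help:Minor edit">minor edit'
```

Under this sketch the link text quoted in the bug report counts as balanced, so a checker of this kind would pass it through unchanged, which is consistent with the suspicion that the server's verdict is a bug rather than a problem in the message itself.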

Word of Day

Who's in charge of updating the word of the day? Is there a bot for that? JillianE 13:47, 18 May 2006 (UTC)Reply

I think User:EncycloPetey has been most active populating them. But I know there have been several requests for additional volunteers to assist with adding new entries. --Connel MacKenzie T C 14:11, 18 May 2006 (UTC)Reply
In case you were wondering how it changes at 00:00 UTC: that happens automatically. —Vildricianus 19:39, 20 May 2006 (UTC)

Wiktionary:Spelling Variants in Entry Names - Draft Policy

I've incorporated the suggested improvements. I think the policy is now ready to upgrade to "Semi-official" status, which I will do in one month, unless the debate remains active at that time.--Richardb 11:03, 19 May 2006 (UTC)

I've changed its name to Wiktionary:Spelling variants in entry names. The convention is all lowercase, and policy status should not be reflected in the page title. —Vildricianus 19:37, 20 May 2006 (UTC)

exic*rnt

User:Ronnie11 entered definition "exic*rnt ." (note period) and deleted exic*nt entry from Wiktionary:Vandalism in progress/Long-term alerts. JillianE 18:23, 19 May 2006 (UTC)Reply

any Hebrew scholars here?

The pages דניּאל and בּית שׁמשׁ don't have language headers. I don't know whether they're properly classified as "Biblical Hebrew" or just "Hebrew". —Scs 12:43, 20 May 2006 (UTC)Reply

They are the same in both Biblical and Modern Hebrew, so I just put the ==Hebrew== heading. However, they should not be pointed, so I moved them to דניאל and בית שמש. —Stephen 17:33, 20 May 2006 (UTC)Reply
I've been attacking a plethora of these just on the most basic formatting. Do all the entries that were spammed here from the "Strong's" concordances go under just ==Hebrew== then? --Connel MacKenzie T C 23:02, 20 May 2006 (UTC)Reply

CheckUsers for en:wikt (revisited)

It's time to get it moving, right? Does anyone object to a vote for CheckUser status for any of the nominees here, or to a CheckUser at all on en:wikt? Now is the time! Please read m:CheckUser Policy and m:Help:CheckUser before considering. This would not be your average admin (or even bureaucrat) election, so it should go with even more consideration than at WT:A. Keep in mind: the thing here is not only confidentiality, but also technical ability on top of that. Nominees are also supposed to be aware of the Wikimedia Privacy policy. And lastly: as per m:CheckUser#Access, a Wikimedia project is supposed to have at least two CheckUsers, or none at all, to allow mutual checking. If no serious objections arise, I'll start three voting-style nominations at Wiktionary:CheckUser for the users who showed interest. —Vildricianus 21:01, 20 May 2006 (UTC)

possible music theory copyvios

I've come across a bunch of musical theory terms entered by User:Hyacinth on 2004-04-29. Examples: artificial grammar, metrical structure, time-span reduction, transposition, well-formedness rules. The formatting is poor and the definitions are fragmentary, and they appear to be copied verbatim out of some music theory books -- though at least these are cited. I've lightly edited a couple of them, but something more major is probably in order. Opinions? —Scs 02:00, 21 May 2006 (UTC)Reply

When you say "appear to be copied verbatim" do you mean you found the publications online, or you have a copy of that book handy? (Sorry, but the word "appear" makes your statement slightly ambiguous.) --Connel MacKenzie T C 02:02, 22 May 2006 (UTC)Reply
No, I don't have copies of the publications. But look at artificial grammar and the others; you'll see what I mean. —Scs 14:02, 22 May 2006 (UTC)Reply
What does "appear to be copied verbatim" mean? If you read the record, dmh had given Hyacinth some pointers a couple of years back, to which the contributor in question commented that he had "copied [some of the] definitions almost verbatim". Davilla 00:27, 23 May 2006 (UTC)Reply
I meant, the wording appeared to be more what a formal textbook would use than a random wiktionary contributor would use, and furthermore, the definitions are in many cases in quotes, as if to say, "I quoted this directly from the source I'm citing". And I did notice the contributor's comment, which only confirmed my suspicions. (Lastly, the quoted fragments aren't really in the form of dictionary definitions, either, and could use cleanup for that reason alone. I would have embarked on that, but in cases of systematic copyright violations, sometimes it's better to delete and start from scratch. Which is why I asked for opinions before proceeding.) —Scs 02:28, 23 May 2006 (UTC)Reply

Warning: funky notice - Revamping Beer parlour

Brainstorming still proceeding on Wiktionary talk:Beer parlour. Please comment on any of the proposed solutions, or your 56k modem will die. —Vildricianus 21:48, 20 May 2006 (UTC)

Italics

Why are quotations here set all Italic? Isn't a separate line and indentation enough? I mean, the Italics along with boldface for the entry word looks like something from a comic book. I've never seen it in a dictionary. We're already using Italics for the notes, and Italics are used for foreign words and words referred to as words. It makes them stand out. I changed it for realize, but someone reverted me. I'm going to change it back again.—Uulgjm 18:00, 21 May 2006 (UTC)

Mentioned sentences must be distinguished from used sentences to avoid confusion. Otherwise, it would appear that Wiktionary is making the claims of the quoted authors. Traditionally, quotation marks or italics make that distinction. Rod (A. Smith) 19:17, 21 May 2006 (UTC)Reply
According to WT:ELE, your edit of realize was quite unhealthy. We don't work with submeanings here. As for the quotations in italics: you're correct, they should be in normal font style, whereas italics is reserved for example sentences only. —Vildricianus 19:34, 21 May 2006 (UTC)
The Webster's entry for "realize" that the entry was copied from merges half the meanings. So, I don't know why you guys don't use submeanings, but it makes it much easier for the reader, who doesn't have to hunt down a long list of nearly-identical definitions. I usually use double bars || à la Larousse and Espasa-Calpe house style. Merriam-Webster uses bolded letters (a.) and others use letters in parenthesis like this: (a) or even just numbers in parentheses like this: (1).—Uulgjm 19:50, 21 May 2006 (UTC)Reply
We have experimented with submeanings (see deal for example), but they are not yet official policy. You might have a point that realize needs some attention, but ignoring the established style here just because you don't like it is not a good approach. Widsith 19:57, 21 May 2006 (UTC)Reply
We don't, for the sake of translations, for one. The only acceptable format would be double ##, if we were ever going to use it. Personally, I don't believe in the idea of subsenses, but that's not relevant here. —Vildricianus 20:39, 21 May 2006 (UTC)
It would be kind of cool if the software recognized ## not as a new list but as a continuation, indented. Then you could have:
1. General meaning
2. Narrower meaning
3. Et cetera
Davilla 20:50, 23 May 2006 (UTC)Reply
I apologize for not replying on your talk page, immediately after clicking [rollback]...I got tied up with other concerns. I'm glad to see you found an appropriate place to ask your questions, and have them answered (much as I would have.) --Connel MacKenzie T C 21:37, 21 May 2006 (UTC)Reply

Language wikification in translation tables round 7

I feel kind of moronic bringing this up yet again, but I feel even more moronic looking at these funky translation tables. You guys know how perfectionist I am :-). Now listen, let me sing another tune than last time: what if we settled on "not wikifying any darn language at all in the translation tables." Sounds tough, huh? I've come to believe that either of the two extremes (all or nothing at all) is still a way better solution than the current randomness. But I promise, you won't hear me again on this one from now on. —Vildricianus 21:30, 21 May 2006 (UTC)

That would certainly be closer to a NPOV. It would also be much easier to parse (programmatically.) WT:ELE could then be unambiguous, therefore less confusing to newcomers. --Connel MacKenzie T C 21:36, 21 May 2006 (UTC)

Thank God, I thought I was the only one who wanted this. I agree! Widsith 17:01, 22 May 2006 (UTC)

  • I think it's always useful to have unusual, rare, or exotic English words wikified to promote their lookup. Many language names fall into this category. Other issues such as parsing and worrying over why Esperanto or Yiddish is wikified in one table and not in another seem quite trifling and solvable. To me at least, wikifying hard words is good for the user, the other things are difficult in minor ways to editors, but if they're worried about them they can just ignore them and leave it up to other people. — Hippietrail 20:34, 23 May 2006 (UTC)Reply
That's what I used to think as well, but I've gathered a number of opinions on it:
  • It's not NPOV, unless we adopt some serious criteria for dewikification.
  • It's inconsistent and promotes arbitrariness.
  • It's very confusing for newcomers and has become one of the most frequently asked questions.
  • For the tech-minded: it makes Wiktionary harder for a bot to analyse.
  • It's ugly and absolutely unprofessional.
Your arguments could also count for the option "wikify all languages," which is also better than current practice. Rules of thumb, however, depend on personal judgement and knowledge, and are therefore likely to differ widely among editors. That can be considered a bad thing, for the reasons given above, and it will only become less "trifling and solvable" as we grow bigger and bigger. PS: I even found a post in my talk page archives on it. —Vildricianus 10:26, 24 May 2006 (UTC)
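
To make the bot-analysis point in the list above concrete, here is a minimal Python sketch of what a parser currently has to tolerate: the same translation line may arrive with or without a wikified language name. The regular expression and the function name `parse_translation_line` are illustrative assumptions, not code from any existing Wiktionary bot, and piped links such as [[a|b]] in the language position are deliberately not handled.

```python
import re

# Accepts both "*French: [[mot]]" and "*[[French]]: [[mot]]".
# The optional "[[" and "]]" around the language name are exactly the
# variation that a single all-or-nothing convention would remove.
LINE_RE = re.compile(
    r"^\*\s*(?:\[\[)?(?P<lang>[^\[\]:]+?)(?:\]\])?\s*:\s*(?P<rest>.*)$"
)


def parse_translation_line(line):
    """Return (language, [linked translations]) or None for other lines."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    # Collect link targets, ignoring any "|label" or "#section" suffix.
    translations = re.findall(r"\[\[([^\]|#]+)", match.group("rest"))
    return match.group("lang").strip(), translations


if __name__ == "__main__":
    for sample in ("*French: [[mot]]", "*[[German]]: [[Wort]]"):
        print(parse_translation_line(sample))
```

If every table followed one convention, the two optional groups in the pattern could be dropped, which is the simplification the parsing argument is about.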

translations to be checked

What's the right way to handle translations to be checked? Tag them with {{ttbc}}? Keep them in a separate section (perhaps "Translations to be checked"), tagged with {{checktrans}}? Both?

I ask because currently there are a lot of uncertain translations that are not tagged in either way, but are just listed in a section labeled with some variation on "Translations to be checked", and I'm worried that those will never be found, let alone checked or fixed. —Scs 21:56, 22 May 2006 (UTC)

Both. It should be like this:

=====Translations to be checked=====
{{checktrans}}
*{{ttbc|French}}: [[mot]]
*{{ttbc|German}}: [[Wort]]
etc.

—Vildricianus 22:02, 22 May 2006 (UTC)

Cool; that's what I had almost convinced myself of. Thanks for the confirmation. —Scs 23:11, 22 May 2006 (UTC)
If you are feeling lazy, you can put them in a separate section with only {{checktrans}} and my Javascript/cleanup lists will catch it on the next XML dump. It is best if you just do as Vild indicated above, though. Someone indicated there was some desire to deprecate {{checktrans}} a while back, in deference to {{ttbc}}. Perhaps we should start removing {{checktrans}} from entries that are "properly" TTBC'ed? Or is it felt that the combination is still the best approach?
For translation sections that are still in the original 2003/2004 "one translation only" format, I'm marking them with {{rfc-trans}} as I find them. --Connel MacKenzie T C 23:24, 22 May 2006 (UTC)Reply
The {{checktrans}} template should be retained because it contains useful information on how to check the translations and what to do when all the translations have been checked and tabulated. It also contains two categories. — Paul G 10:09, 23 May 2006 (UTC)Reply
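
As a rough illustration of how the untagged sections raised at the start of this thread could be found in a dump, the following Python sketch lists headers containing "to be checked" whose section text has neither {{checktrans}} nor {{ttbc}}. The function names and the plain substring matching are assumptions made for this example; this is not the Javascript/cleanup-list code referred to above.

```python
import re

# A wikitext header line such as "=====Translations to be checked=====".
HEADER_RE = re.compile(r"^=+\s*(.*?)\s*=+\s*$")


def split_sections(wikitext):
    """Split wikitext into (header title or None, section body) pairs."""
    sections = []
    title, body = None, []
    for line in wikitext.splitlines():
        match = HEADER_RE.match(line)
        if match:
            sections.append((title, "\n".join(body)))
            title, body = match.group(1), []
        else:
            body.append(line)
    sections.append((title, "\n".join(body)))
    return sections


def untagged_check_sections(wikitext):
    """Return titles of 'to be checked' sections that carry neither template."""
    return [
        title
        for title, text in split_sections(wikitext)
        if title is not None
        and "to be checked" in title.lower()
        and "{{checktrans" not in text
        and "{{ttbc" not in text
    ]


if __name__ == "__main__":
    entry = "====Translations to be checked====\n*French: [[mot]]\n"
    print(untagged_check_sections(entry))
```

Matching on the header text rather than on a template is what lets the sketch catch the "some variation on" sections that carry no tag at all.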

Wiktionary:Tutorial

Rather here than at RFC: somebody needs to go through it and clean it up a bit. A lot of things seem outdated or simply erroneous, and it seems that newcomers really use it. Any takers? —Vildricianus 21:59, 22 May 2006 (UTC)

True, I had a helpful newbie say that (s)he is reading through Wiktionary:Tutorial (Wiktionary links)#Linking dates, and (s)he wikified the dates, which I thought we didn't do. I'll have a go at tweaking the pages, but ideally someone who's been here longer than me should tidy it up (I'm still a relative newbie; Dangherous has been a regular for just over 5 months, and there's probably quite a bit I'm still unfamiliar with). --Dangherous 09:53, 24 May 2006 (UTC)

Images

Could I ask that people adding images to pages put them at the top (as the very first line of the entry)? Two pages that I have edited today include images that were lower on the page and overlapped the content, making it unreadable. The user shouldn't be required to change the size of their browser or the screen resolution in order to be able to see everything. (Indeed, this is not always possible - I have my flat screen and the browser set to their maximum respective resolutions, and the content was obscured.) — Paul G 10:12, 23 May 2006 (UTC)Reply

We don't have set rules on image position, because there are so many exceptions to each rule. I agree that the layout seems to work best with the image(s) opposite the TOC. I believe WT:ELE suggests placing an image near the definition it applies to, but I find that this does not often work very well. The most common mistake I see with images is forgetting the caption, which seems to break the "float=right" if the keyword "thumb" is also used. Hard-coding the thumbnail size is rarely the best approach. --Connel MacKenzie T C 14:14, 23 May 2006 (UTC)
The TOC is almost invariably much narrower than the page, ensuring that there is room for the image in most cases, so this is usually a good position. Putting it anywhere else, there is no guarantee that the content and the image will not overlap or obscure each other. The layout for light bulb seems to work well, as the text wraps before it reaches the image (at my resolution, anyway).
Looks ugly to me. Chops into the horizontal rule, and anyways I usually scroll down immediately, ignoring the TOC. It's almost impossible not to do it automatically after a while. Hmmm.... it's odd that we would require scrolling down on every page to get to the meat. Davilla 20:45, 23 May 2006 (UTC)
You can hide TOCs in your preferences. —Vildricianus 09:33, 24 May 2006 (UTC)
My "preference" is to see the page exactly as do 90% or more of visitors. Maybe the TOC should be collapsed by default? Davilla 17:16, 25 May 2006 (UTC)Reply
{{wikipedia}} can go at the top, of course, but I think it is preferable to put it between the POS and the inflections, as this puts the link next to the word it refers to and usually does not introduce any extra whitespace. — Paul G 14:45, 23 May 2006 (UTC)Reply
Yes, but I found multiple instances where that placement is not good, for example when the boxy templates are in place. —Vildricianus 17:04, 23 May 2006 (UTC)
I prefer to place the image in that sweet spot, between the POS and inflections, because of the edit links. The pedia box can go there alternatively, or otherwise, with both, I bump the latter down to somewhere below the translations. Davilla 20:45, 23 May 2006 (UTC)Reply
I put the image on the line below the language header, which is a mix between top-placement and placement relevant to the entry (e.g. la:libellula). I think it looks more balanced than the placement under the POS header (e.g. la:canis) which can leave text spilling around it. It's also better for if there have to be multiple images (e.g. la:Venus). —Muke Tever 23:07, 24 May 2006 (UTC)Reply

Why aren't Categories made more use of? - proposal

This is a proposal for a system that would make Wiktionary easier to use. I don't know if it's in the right place - if not, someone move it please.

Consider: If every translation/thesaurus reference was made as a categorisation, you'd automatically set up a much easier translation and thesaurus mechanism.

Fully explained:

Say you are writing an article about the French word Travailler. Under "translations" (or rather, just under its meaning in English), instead of just writing its translation ("# work"), use a template-call ("{{t|work|fr}}"). Then template {{t}} would have the following contents:

# {{{1}}} [[Category:Translation: {{{1}}}|{{{2}}} {{PAGENAME}}]]
where {{{1}}} is the English translation of the word
      {{{2}}} is the language code of the foreign word

This would generate the same output (i.e. "# work") but would simultaneously categorise the French article as a translation of "work".

Then in the article on the English "to work", a simple link to the category Category:Translation: work ([[:Category:Translation: {{PAGENAME}}]]) will automatically draw you up a list of every translation in the Wiktionary.

This process would work equally well with WikiSaurus. --w:User:Alfakim

It does have the difficulty — at least for translations — that the category itself will not be usable: If someone is looking for a French word for something, say, they won't be able to find it, because it won't be labelled as French in the category listing (sort keys are not displayed, as your proposal seems to indicate). However it might be usable for thesaurus. —Muke Tever 22:55, 24 May 2006 (UTC)
Hint: WiktionaryZ :). --Celestianpower háblame 09:05, 25 May 2006 (UTC)Reply
Yeah, I know about Z. It isn't much use though, as it won't let me put stuff in in my language :p And Z last I checked doesn't do translations as categories either, having the more redundant step of sharing all translations under every entry. —Muke Tever 00:26, 26 May 2006 (UTC)Reply
This encounters the same problems as the use of categories for Wikisaurus (See above references under my #Wikisaurus proposal). However, I think it's right to think of the issue in this way. Would it be possible to someday make Wikisaurus multilingual? Depends on the success of WiktionaryZ. Davilla 17:12, 25 May 2006 (UTC)Reply
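
The proposal and the sort-key objection in this thread can be sketched in a few lines of Python. This is only a simulation under stated assumptions: `expand_t` mimics the template body quoted in the proposal, `build_category_listing` is a hypothetical stand-in for how a category page orders its members, and neither is MediaWiki code.

```python
from collections import defaultdict


def expand_t(english, lang_code, pagename):
    """Mimic the proposed {{t}} body for one call such as {{t|work|fr}}.

    Returns the visible text, the category name, and the sort key
    ("{{{2}}} {{PAGENAME}}" in the proposal).
    """
    visible = f"# {english}"
    category = f"Translation: {english}"
    sort_key = f"{lang_code} {pagename}"
    return visible, category, sort_key


def build_category_listing(entries):
    """Group pages by category and order them by sort key.

    entries: iterable of (pagename, english, lang_code).
    Only page names are returned, because a category page shows titles
    and uses the sort key for ordering only.
    """
    members = defaultdict(list)
    for pagename, english, lang_code in entries:
        _, category, sort_key = expand_t(english, lang_code, pagename)
        members[category].append((sort_key, pagename))
    return {cat: [page for _, page in sorted(pairs)]
            for cat, pairs in members.items()}


if __name__ == "__main__":
    pages = [("travailler", "work", "fr"),
             ("arbeiten", "work", "de"),
             ("trabajar", "work", "es")]
    print(build_category_listing(pages))
```

The resulting list is grouped by language code but contains bare page names, so a reader cannot tell which entry is the French one. That is the difficulty raised in the first reply, and it is why the scheme may suit a thesaurus better than translations.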

redirects from caps to lowercase

I've been noticing a lot of redirects from the capitalized version of words to the uncapitalized where the uncapitalized entry doesn't exist. JillianE 16:26, 24 May 2006 (UTC)Reply

Examples?
My first guess (BICBW) is that they're to entries that were deleted, but someone forgot to delete the redirect. —Scs 17:47, 24 May 2006 (UTC)Reply


  • Inflected forms were, for a very short time, entered as redirects, until that was deemed unacceptable. Then along came the case-conversion. Since that time, I've deleted many/most of those redirects, leaving the capitalization redirect for when the 'bot-uploaded inflected form is entered. Now that the latest vote has expired (with the clear knowledge of the most abusive objector,) this "problem" can finally be fixed. As soon as I clear up a few other things first, that is. --Connel MacKenzie T C 06:38, 25 May 2006 (UTC)Reply

pronunciation of names/places

I noticed that, in Wiktionary, the pronunciation of names of people or places, like Washington, is not labeled. I am wondering if it can be added. This could be very useful in many ways. For example, it would be polite to be able to pronounce someone else's name correctly. Of course it can be done by directly asking the person. But what if I am reading an article and would like to remember the name of the author? I found that it becomes easier to remember a name if I can pronounce it. In academia, researchers are from all over the world and it would be of great help if their names can be pronounced properly.

I googled the web and found some websites related to this idea:

The first one is an online, automatic name pronunciation program developed at Carnegie Mellon University. But now it is not usable; the synthetic pronunciation probably is not good enough.

The second is a website about how to pronounce Finnish names, which is similar to what I have in mind. But it goes further by providing recorded pronunciations.

Massachusetts has a webpage asking people to notate how to pronounce their hometown, and Arizona also has a website showing how to pronounce the town names of Arizona correctly.

By using Wiktionary, I believe a wider range of names and places can be incorporated. Since a person is authoritative on how to pronounce his own name, everyone is encouraged to put his name in Wiktionary and notate his pronunciation in a proper way.

I am new to Wiktionary and would like to have your opinions.

This is definitely something Wiktionary can and does (in some cases) provide. If examples you have looked up here have been without pronunciation info, that is probably just because proper nouns have not received a huge amount of attention here. Also, the extent to which we include minor or obscure place-names is still open to debate. However, if you have specific requests in mind, you could add them to Wiktionary:Pronunciation_file_requests and you should get what you need. Widsith 19:18, 24 May 2006 (UTC)Reply
Yes, there is definitely debate on the more obscure placenames, esp. their inclusion as specific places rather than under a generic placename label. Davilla 17:07, 25 May 2006 (UTC)Reply
It sounds like you're interested in a batch process to accomplish this. The computerized weather channels have a good amount of hand-coded pronunciation information that they may be willing to share. Of course the format wouldn't be the same and requires processing, but you could be sure the pronunciation of place names matches what is used in the place itself. (This is the goal, right, to list "Houston" as /hjustn/, rather than /haustn/ as in New York's "Houston Street"?) I remember hearing that the automatically generated pronunciations had failed miserably at this. Davilla 17:05, 25 May 2006 (UTC)Reply

new word posting

I have a suggestion for a new word, for a specific purpose, in the English language. As far as the distinction between a protologism or a neologism, I hope I'm getting those words right, I have not heard the word used before, googled it, and came up with 2 usages, both agreeing with the meaning I propose.

I tried to put it in as a neologism, no luck with figuring how exactly to do that, and had the same luck with the protologism.

Any suggestions? Thanks. p —This unsigned comment was added by 68.35.9.33 (talk • contribs) 2006-05-24 23:46:12.

Did you see Wiktionary:List of protologisms? That's the best place for such words. If you did try there, what problem did you experience? Rod (A. Smith) 23:58, 24 May 2006 (UTC)Reply

Maybe time to rewrite Wiktionary:Neologisms? Davilla 17:23, 25 May 2006 (UTC)Reply

ALT-A

Someone (in the past couple days) changed something. ALT-A no longer functions on Wiktionary: namespace pages (like ALT-C does on NS:0 pages.) This has worked for a very long time. Anyone know what changed? --Connel MacKenzie T C 06:54, 25 May 2006 (UTC)Reply

Change Logo to SVG-Version

Hi there! Please change the logo in Template:wikibookspar; there is a vector version available: Image:Wikibooks-logo-en.svg. Regards, Schaengel89 09:45, 25 May 2006 (UTC)

I'm not sure why we'd want that. I thought all logos were supposed to be kept locally, for times when commons is experiencing a slowdown (all too frequently.) The template protection has been lowered to "semi-protected" though. It is not clear to me what the correct action here should be. --Connel MacKenzie T C 15:48, 25 May 2006 (UTC)Reply
Also, since SVG support is not quite universal in browsers yet, I'd say that logos should remain GIF or JPEG for a bit longer. –Scs 18:08, 25 May 2006 (UTC)Reply
Just a note, that the MediaWiki servers produce PNG thumbnails "on the fly" for SVG images. Users will only encounter the actual SVG if they click through and try and open the original file. Regards, commons:User:pfctdayelise 07:20, 28 May 2006 (UTC)Reply

French Beer parlour

Does anyone happen to know what the equivalent of the Beer parlour is on Wiktionnaire, the French wiktionary? That's the place, not the translation (which apparently is "arrière-salle", meaning "back-room"), please. I'd like to post a question there. Thanks. — Paul G 11:43, 25 May 2006 (UTC)Reply

This page has an interwiki link to Wiktionnaire:Wikidémie on the left-most column. --Connel MacKenzie T C 15:44, 25 May 2006 (UTC)Reply

part-of-speech header for articles

Since there's one definite article in English and two indefinite ones, it would make more sense to me for the level-three header for all of them to be just "Article", not "Definite article" and "Indefinite article". Any objections to coalescing them in that way? (The point being, that if you think of part-of-speech as a category, a category with just one thing in it seems kind of useless.) –Scs 15:05, 25 May 2006 (UTC)Reply

Not sure I care either way but kein and keine are German indefinite articles and der, die, das are German definite articles.JillianE 15:52, 25 May 2006 (UTC)Reply
Ah, good point -- I always forget that the English Wiktionary isn't just English! Although I notice that der, die, and das are sitting under headers that just say "Article".
Looking at Patrik Stridvall's header tool, I see that as of 5/3 we had 40 articles, 49 definite articles, and 18 indefinite articles. –Scs 18:01, 25 May 2006 (UTC)Reply
  • Ec used to change these to "Adjective". I don't know if he changed his opinion or practice, but you might keep it in mind. As always, I advise people to check what print dictionaries do: since they are made by trained professionals, we can learn from them. — Hippietrail 20:14, 25 May 2006 (UTC)

I think =Article= is fine, as long as you find somewhere else in the entry to say whether it's definite or indefinite. Widsith 06:40, 26 May 2006 (UTC)

Agree. There are different ways to look at parts of speech depending especially on the language, and there has been a move to keep these as simple as possible. Davilla 14:05, 28 May 2006 (UTC)Reply

Discussion of multi-word lexical items on lexicographers' mailing list

This topic may be one to watch for many of our contributors and other interested parties. Please even join the list and take part. I'm subscribed but I can find a website that provides access here. Please don't miss this opportunity. We can all learn from trained lexicographers. — Hippietrail 21:45, 25 May 2006 (UTC)Reply

The purpose of RFD

Moved from RFD; interesting discussion, broad target, shouldn't get lost amid the heaps of RFDs. —Vildricianus 22:47, 25 May 2006 (UTC)


I've broken these meta-comments off from the above discussion about vintage car. Davilla 19:17, 9 May 2006 (UTC)Reply

  • What do we have here? Hippietrail called for the deletion of a term he knew from the start would never be deleted. Why? <cough>POINT</cough>. I'm very sorry, but nominating something for deletion is just plain silly.
  • I do understand HT's frustration regarding including phrases in this dictionary. But I still (as I did a year ago) disagree about 95%. While we have no strict policy, the Pawley list seems the most reasonable set of tests we've encountered so far. Certainly in a wiki, where ultimately we will have all those entries, it seems quite counter-productive to be fighting it. Especially when numerous people disagree completely...not just on the impracticality of enforcement, but rather on the basic principle he is trying to assert: that multi-word nouns are not nouns. I maintain, as I did a year ago (and before then), the opposite. --Connel MacKenzie T C 06:52, 9 May 2006 (UTC)
  • I was thinking about this all night and I think we have some basic problems on Wiktionary. Anybody should be able to dispute a term, sense, criteria for inclusion, policy, etymology, pronunciation, translation, etc, without a fight breaking out. I RFD'd this because that seems to be the way we dispute something here. Maybe I should've RFV'd it instead but to me at the time RFD made more sense and it's what I'm more used to. So it seems that RFD'ing an article is interpreted by some as an attack on the article and even an attack on other contributors. This is silly. I think we need a better way of disputing articles in such a way that we can all be adult about it. Perhaps RFV is that way, perhaps a new place to list articles under dispute, with a small banner on disputed pages. Isn't this what Wikipedia does? Further, I think it might make a lot of sense to carry out the disputes on the articles' talk pages rather than here (or RFV) - these pages can be instead lists of links with perhaps a line of text indicating the current status of each dispute. Disputes which involve more than one page might be better off here than in the talk pages.
    No, RfV would not be correct as it's clearly a common noun phrase. The problem is that there are several levels of these requests, and people take offense. Some material is uncertain and needs to be checked first. A lot of this turns out to be cruft, but there have been some embarrassing nominations. Clearly some material needs to be deleted immediately and is, although mistakes are made there too. If anyone confuses vintage car with either of those two cases, then of course they're going to be offended. The point of this discussion is to hammer out the rules, and to see where the line is drawn as far as consensus. It's perfectly acceptable, in my opinion, to bring this up for debate provided that the page isn't flooded, first of all, and that you really do believe it doesn't fit our current standards. As to the latter, this is one of the grayest areas I've seen in the few months I've been here, in this period of inclusionism as it would seem. Davilla 19:50, 9 May 2006 (UTC)Reply
  • Contrary to Connel's attempt at explaining my position, I am not at all against multi-word entries. Just take a look at my contributions. I am against misleading dictionary users. The thing is that there is nothing on our pages to tell the user why a word is included. Currently we include items by language and "part of speech", the latter field being slightly overextended. We have nothing to (consistently) say "idiom", "set phrase", conversational phrase, encyclopedic entry, translating dictionary entry, character from a computer game, etc. Because of this somebody will see "vintage car" and not "vintage motorcar", "pick up the phone" and not "pick up a phone", "fried egg" and not "fry an egg". It's impossible for the user to tell which phrases are a lexical part of the English language, and which are simply common phrases, words which have special senses when used in combination, terms with figurative senses sometimes only in some contexts, phrases which have semantics over and above their literal contents, etc. Since most dictionaries only include words, idioms, and some set phrases, that is what most dictionary users are used to - when they come across our other kinds of entries they are bound to assume they fall into one of those usual categories and that they are always basic components of the English language. If we cause anybody to think that, we are misleading them. It is our duty to tell them what a phrase is and what it isn't. If we don't know we can say so, if we disagree amongst ourselves we can say that too - at least then the dictionary user will know it's not a cut-and-dried case and can think about it themselves or look it up in another source for more information.
  • Here's how I think we need to move forward:
    1. Classify entries at a level other than only part-of-speech.
      Not unless they don't meet our CFI under the broadest standards. Right now I would consider only phrasebook entries to fall under this category. Davilla
      Other examples would be inflected terms such as plurals and past tenses, which I'm very much in favour of having, and common misspellings, which indeed are one class that is marked by a different format, though one I rather dislike. — Hippietrail 22:06, 9 May 2006 (UTC)
      Misspellings I could agree with. The inflected forms will be sorted out meticulously on their own. Recently we can add romanizations. Davilla 18:20, 25 May 2006 (UTC)Reply
    2. Have a transparent and adult way of handling disputes.
      Up to this point I thought it was being handled pretty well. Davilla
      Perhaps you weren't around for tidal wave / tsunami or Egyptian pyramid. And I'm quite certain there were some big ones that I wasn't involved in as well. — Hippietrail 22:06, 9 May 2006 (UTC)Reply
    3. Mark pages as disputed.
      Unnecessary IMO. But the original nomination could have been lighter, distinguishing tags that debate the idiomacy of a term rather than its correctness. Davilla
    4. Quit the attacks and emotional responses.
    5. Either ban original research fully like on Wikipedia, or have an open and transparent way to research terms without fighting like children.
      The CFI requires research, the wiki style necessitates that it be open and transparent, and we will always fight like children. Davilla 19:50, 9 May 2006 (UTC)

Please comment and feel free to move this topic to a better place. — Hippietrail 18:36, 9 May 2006 (UTC)Reply

If you are sincere about point #4, it would behoove you to follow your own advice (#2, #5.) I'd prefer it if you not resort to name-calling every time your points are criticized. --Connel MacKenzie T C 19:38, 9 May 2006 (UTC)Reply
I would like to take this opportunity to apologize sincerely for any and all childishness, attacks, and other uncivil behaviour, wilful or thoughtless, against Connel, or anyone else. (One caveat is that I do not feel that adding terms I see as equally acceptable to the title term as being disruptive and so do not apologize for that.) — Hippietrail 22:06, 9 May 2006 (UTC)
Please rephrase your last sentence. I don't understand what you mean. --Connel MacKenzie T C 19:20, 11 May 2006 (UTC)Reply
The both of you I'm sure realize that the best way to stop fighting is to actually stop. If saying that isn't helpful enough then I'm content keeping out of this. Davilla 19:50, 9 May 2006 (UTC)Reply
People who've only seen Connel and me interact in RFD might not be aware that we are actually friends and get along rather well other than when waging a bitter dispute about which each of us feels strongly yet opposed. — Hippietrail 22:06, 9 May 2006 (UTC)

Am I correct in my understanding: your objection is not to Wiktionary including multi-word terms, but rather your objection is labeling them as nouns? Is that your main objection, or am I still missing your point? --Connel MacKenzie T C 19:20, 11 May 2006 (UTC)Reply

No, I am for including the types of multi-word terms dictionaries generally include, such as steering wheel, and against mundane combinations, the inclusion of which turns us into some sort of combinatorial dictionary. I really don't know if you just can't see this difference, or see it but don't care about it, or see it and don't want our users to see which word is which, etc. I am not the only one who feels that including skateboard wheel opens the door for motorcycle wheel, car wheel, and every other wheel designed for one or another vehicle or device, plus all the combinations involving their synonyms, as long as at least 3 people have got them into print over the space of at least one year. I don't see how this is helpful and I don't agree that the full OED leaves them out merely to save space. But what I really disagree with is downplaying the difference so that casual users see them presented here in exactly the same way with nothing to tell them whether they're an elemental lexical part of the English vocabulary or just a combination that people can use. — Hippietrail 22:05, 11 May 2006 (UTC)
I just can't see the difference. I believe that "traditional" dictionaries exclude appropriate combinations due to a historic lack of space in printed editions, not because they are not valid.
I implore you to find evidence to back up your belief then, or in the future when stating it, state also that it is your opinion. I plan after my travelling to write to the major dictionaries asking them about their policies. — Hippietrail 18:17, 18 May 2006 (UTC)Reply
Your last sentence gives me pause. Are you suggesting the only thing here should be "elemental lexical part[s] of...English"? If so, I disagree. Past discussions indicate that I am not alone in that sentiment. But the last phrase of that last sentence shows a clearer misconception/difference of opinion. You say "just a combination that people can use" while I say "a combination that has specific meaning." I think that is the larger difference of opinion, between you and me.
Yes. I believe that is what are called listemes. Items of vocabulary which must be memorized as a list. It does appear that others share your sentiment but it is also apparent that others share my sentiment. On the "specific meaning" front, I still don't get it. "Old car" has specific meaning. How can a dictionary in 27 or so large volumes just leave out entries with specific meaning due to lack of space if their general meaning is not enough to define what they convey? — Hippietrail 18:17, 18 May 2006 (UTC)Reply
I think any combination that does have a specific meaning can merit an entry. But more to the point, if someone has taken the time to create an entry, they obviously feel that such a thing is either distinct enough on its own, or woefully ambiguous when described by either component.
I think you need to rigorously define "specific meaning" as a concept we can include in our CFI. For instance what specific meaning does "pick up the phone" have which "pick up a phone" has? Would an entry for another sense of "pick up" go under "pick up the girl", and if not, why not?
On your 2nd point, the fact that we are all amateurs and make mistakes often leads us to take the time to create an entry. You yourself delete and vote to delete many of these - more than me it seems. Having well-defined methods to decide what goes in is a good thing, not a bad thing, and contributors should be made aware of them. — Hippietrail 18:17, 18 May 2006 (UTC)Reply
To turn around, and nominate legitimate information that provides differentiation from the components for deletion is quite a different thing, than passively allowing entries to exist. Please note that I personally do not make a habit of entering compound terms.
To comingle a watered-down version in one or the other component term's definition is misleading and confusing to a general reader. Doing so is also inaccurate, when not added to each component term.
This is the strangest part for me, you really need to indicate what this new term "watered-down version" means. Is "skateboard wheel" a watered-down version of "wheel"? Why isn't "pick up the phone" a watered-down version of "pick up"? I believe including those is confusing and inaccurate. I cannot see any difference. — Hippietrail 18:17, 18 May 2006 (UTC)Reply
So yes, I still "can't see the difference." To me, deleting terms because you "don't see how this is helpful" is enormously different from saying that they could be improved upon, especially when they do convey information that we don't have elsewhere (but probably should.) --Connel MacKenzie T C 16:59, 13 May 2006 (UTC)Reply
I don't believe I've deleted very many articles at all. I have nominated a small number, and voted on a still fairly modest number. Besides RFD, which other process would you recommend? My recommendation is to not treat or encourage others to treat RFD as an attack but rather as a discussion to help us amateurs improve our lexicographical skills. Sorry I'm out of time. More answers to come... — Hippietrail 18:17, 18 May 2006 (UTC)
I think my view is generally similar to Connel's. I would generalise it as:
  • We should only add items that are significantly helpful
  • We should only remove items that are significantly unhelpful, e.g. because
    • they are badly wrong
    • they take us in a direction we feel is damaging
  • We should hope that the no-mans-land between helpful and unhelpful is wide enough to cover most differences of opinion, but inevitably not all; that is what this and the RfV page are for.
To give a new (to me) example of a direction we feel is damaging I recently put in a sarcastic usage. Widsith took it out, on the basis that any word can be used sarcastically, and on reflection, I agree with him. Our entry for sarcasm has a link to the (rather poor) Wikipedia entry. In general, I agree it would be better to spend time improving the latter, so that people can work out for themselves what any word will mean when used sarcastically, rather than to add sarcastic senses to some of the more popular words.
If this discussion succeeds in crystallising (or at least approximating to) the edges of the no-mans-land re phrases to be included, it will have been worthwhile. --Enginear 14:45, 14 May 2006 (UTC)
  • There are some points I disagree with and it's because I want a free online dictionary that contains just the kind of things dictionaries generally contain. I am certain there are many other people in the world, both current contributors and also people who've never heard of Wiktionary, who also want such a thing. I never expected it a few years ago, but now it is quite clear from my experience at Wiktionary that there is also quite a large class who wants something that embraces a lot more things than traditional dictionaries do. Hopefully rather than forking or creating a new similar project we can work Wiktionary into a site that both camps will find useful.
  • These are the things that would be different for a "traditional" Wiktionary from what is listed above.
    1. "significantly helpful" is too broad and fuzzy for a traditional dictionary. It wants only lexical items. The basic elements from which speech and text are made. This is something like the sum-of-parts argument but there it is also based on intuition such that "human being" is in most dictionaries. Many terms are so common and easy to understand that people might say "that's not really useful" but a traditional dictionary will want to include those too - and so does Wiktionary - so it's not a well-defined term.
  • I believe strongly that there are many people who can benefit from a free and open yet traditional dictionary so that hitting the random button or looking at a full index of words will only show the types of things that a traditional dictionary does. Such features will be hard to use for people not interested in video game characters or phrases easily understood from their parts.
  • I think one solution might be to change from a 2-way system (delete vs keep) to a 3-way system (delete vs traditional entry vs expanded entry). I don't know the best names to give them. By using categories, templates, css, javascript, etc we can provide ways for users to see just what they want to see. Users who want to see all kinds of things not in even the biggest of dictionaries can see everything. Users who want something just like the OED or Websters but free and open can see just the traditional kinds of things. Stuff that is just rubbished can still be deleted outright. — Hippietrail 22:13, 25 May 2006 (UTC)Reply

So what is this topic actually about? Does it question CFI or the procedures of RFD?

  1. As for the latter, I agree that the current system is not waterproof. I sometimes wonder where to add a request: here, at RFV or at RFC. I don't know whether there is a need for another page where one can request community input without having to mention "deletion" or "verification". Some people already seem to understand that this page is not just for requesting an article's deletion, others do not. Wikipedia has things like Wikipedia:Current surveys, but of course their aims are entirely different from ours. I would mildly oppose moving general discussions about articles to their talk pages. It seems that the prevalent idea here on Wiktionary is that discussions should be concentrated on as few pages as possible, which is probably because of our small group of contributors.
  2. How to decide what to include and what not? I wonder. Pretty basic question, but it seems it has been discussed ever since Wiktionary was set up. There are many possibilities, but it also seems people are afraid to move too far away from the current formula. Who is to decide whether or not to include vintage car? How will what we decide today reflect itself in the future? Who is this dictionary for anyway? Who will use it, how will they use it, and what will they expect when they hit "Go"? This was WT:CFI just a bit more than a year ago – how will it look in another year's time? As it is right now, the rules are as thin as paper, but as has been put above, it had perhaps better remain that way for Wiktionary to stay fun. —Vildricianus 20:20, 25 May 2006 (UTC)
  • "What is this topic actually about? Does it question CFI or the procedures of RFD?"
    It's about both of those and more. It's about how we do things here, why we do them, how we discuss them, the problems we've had, the differences between how different groups deal with things, etc, etc. Basically it's about improving Wiktionary for everybody. We're getting bigger and we need to develop - which we've always done anyway. — Hippietrail 23:10, 25 May 2006 (UTC)Reply

Creation of translation need to be checked

Do many people go around moving translations to "need to be checked" status, or is it some kind of bot? How often are translations moved to this status, and under what criteria? It seems to me that many perfectly good translations are moved, and isn't that simply counter-productive? Especially for languages with few users here. Ptalatas 23:39, 25 May 2006 (UTC)

I don't know about other editors, but there are a couple of situations I notice in which it's clear that the translations might be wrong and need to be re-checked.
For example, suppose a word used to have one definition. Then several translations were added. Then a second sense is added. Now, maybe, some of the old translations belong with the old sense and some belong with the new sense, and maybe some new translations are needed for the new sense, too. So the old translations are moved to "Translations to be checked" so that people who know the translated-to languages can come back and rework them in light of the new sense distinctions.
Or, suppose that a word has several senses, and there are several sets of translations all nicely separated for the several senses, except that the way the translations are tied to the senses is by number. Then, suppose that the senses have been added to and rearranged. It's quite likely that when senses are rearranged, the numbered translations don't all get renumbered properly to match. So, again, when it looks like this has happened, the best thing to do is to move all the translations to the "Translations to be checked" section, so that people who know the translated-to languages can come back and rework them using a better tagging scheme, such as little snippets ("glosses") describing in words which sense is being translated.
The problem, of course, is that no one person knows every language, so it often isn't obvious that the current state of some translations is "perfectly good". If you see some translations under "Translations to be checked" that are in a language that you know and that are obviously "perfectly good", please, take a moment and move them back! –Scs 00:42, 26 May 2006 (UTC)Reply
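To illustrate the renumbering problem described above, here is a minimal Python sketch. The sense texts, the dictionary layout and the helper name are all hypothetical and are not how Wiktionary stores entries; the point is only that a key based on position silently changes meaning when senses are added or reordered, while a gloss key keeps pointing at the sense it was written for.

```python
# Hypothetical data model: translations keyed by sense number vs. by gloss.
def translations_for_sense(senses, by_number, by_gloss, index):
    """Return (sense, via_number, via_gloss) for the 0-based sense index."""
    sense = senses[index]
    via_number = by_number.get(index + 1, [])  # sense numbers on the page are 1-based
    via_gloss = by_gloss.get(sense, [])
    return sense, via_number, via_gloss


NOUN = "joint between the foot and the leg"
VERB = "to walk (slang)"

# Translations were added when the noun was the only (first) sense.
by_number = {1: ["French: cheville"]}
by_gloss = {NOUN: ["French: cheville"]}

# Later a new sense is inserted ahead of the old one.
senses_after = [VERB, NOUN]

verb_row = translations_for_sense(senses_after, by_number, by_gloss, 0)
noun_row = translations_for_sense(senses_after, by_number, by_gloss, 1)
print(verb_row)  # the numbered translation is now attached to the verb
print(noun_row)  # the gloss-keyed translation is still attached to the noun
```

In this sketch the numbered translation ends up on the wrong sense without anyone having edited it, which is the situation where moving it to "Translations to be checked" is the safe choice.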
Yeah, I'm going through the Greek and Norwegian ones now, but it will take some time. And I could be using that time to add new content. Or read for my exams. ;) But it is frustrating to know that if you translate something into a language with few users, then much of it will be bumped to "need to be checked" over time. Ptalatas 01:24, 26 May 2006 (UTC)
If you put your translations into translation sections with the sense identified, e.g. like the translation sections in "pump", other editors will know that you translated the specific sense and your translation will not likely be moved into {{ttbc}} (unless, of course, the translation just looks wrong). Rod (A. Smith) 02:07, 26 May 2006 (UTC)Reply
Here's another case I just discovered. True story. Once upon a time, our ankle entry listed only a noun sense of the word. It had several translations -- eight of them, as of last August. But then, someone added a verb sense, but they stuck it between the noun sense and its translations -- so, all of a sudden, the translations seemed to be attached to the verb sense (which is evidently U.K. slang for "walk", so it's not going to have the same translations at all). Since then, eleven more translations have been added. Now, are those new translations for the noun or the verb? Probably for the noun, but I can't be sure. (Some of them I can recognize or guess, one of them is a blue link so I can check, but for the rest, who knows?)
I just moved the translations section back to where it belongs, but strictly speaking I should probably move at least the suspect new translations that have been added since August to the dread ttbc status.
(Rod's right, tagging translation sections with words is much better than numbers, but what this episode demonstrates is that, strictly speaking, you ought to tag the translations even when there's only one sense that doesn't need disambiguation -- because, later, it might, but sometimes by then it's too late.)
Scs 06:53, 26 May 2006 (UTC)Reply
Excellent idea. We should start doing that, ourselves, when we encounter single entries with translations. --Connel MacKenzie T C 07:53, 26 May 2006 (UTC)Reply
I've practically always done this. Note also that the idea behind last March's overhaul of the TTBC system was partially to make tagging translations "to be checked" a less dreaded practice. Frustrating, certainly, but whereas other information is easy to check by native speakers, translations are a more complicated matter, and there's nothing worse than having one or two bad translations spoil an entire section. That doesn't mean one can be reckless in tagging everything that seems wrong as TTBC; on the contrary, a lot of trouble can be solved by checking page histories, interwiki links and sometimes Wikipedia and interwiki links over there. But one shouldn't shun {{ttbc}} just out of fear of undoing other people's work. Accuracy comes first! However, in order to avoid such things happening, use disambiguated translation tables and level 4 headers. —Vildricianus 13:31, 26 May 2006 (UTC)
The other thing to remember is that tagging a translation with {{ttbc}}, and/or moving it to a "Translations to be checked" section, doesn't really undo someone else's work. A reader needing that translation can still find it, and they can try to determine how accurate it might be, but at least they won't get the false impression that it's known to be accurate.
If we do this right, we can minimize the work on people checking and repairing translations, minimize the chance that a reader will be misled by a wrong translation, and maximize the chance that a reader can get some value out of an uncertain translation. –Scs 14:40, 26 May 2006 (UTC)Reply

etymonline.com

Is there something special about this site? Is it public domain? A number of different people have added complete chunks of text from it here, as if they weren't copyvios. Am I missing something? —Vildricianus 16:03, 26 May 2006 (UTC)

They shouldn't do, it's all copyrighted as far as I know. Widsith 16:06, 26 May 2006 (UTC)Reply

Redirect pinyin?

I'd like to start either redirecting Pinyin terms to the Chinese characters they represent (e.g. huā to , or making articles out of each Pinyin term indicating that each is, in fact, the Pinyin for that respective Chinese character. Does anyone have any thoughts as to which would be preferable, or why either would be a bad idea? My thinking is that a lot of "learn-Chinese-for-travel" type books use only Pinyin. BDAbramson T 03:46, 23 May 2006 (UTC)Reply

Some Pinyin transliterations will collide with terms from other languages, so "hard" redirect ("#REDIRECT") is not universally feasible. That suggests using a "soft" redirect (a definition line saying "# Pinyin transliteration of ..."), ideally via a template like {{pinyin of}} (ala {{plural of}}) for ease of management and to help with consistent wording and style. Rod (A. Smith) 04:12, 23 May 2006 (UTC)Reply
Yep, sounds good. Widsith 07:18, 23 May 2006 (UTC)Reply
There are usually going to be multiple characters with the same reading. I've gone ahead and made an entry for huā if anyone wants to take a look. Kappa 12:03, 23 May 2006 (UTC)Reply
Yes, definitely a better option in light of the multiple characters that may be represented by one Pinyin word. I'll work on 'em! BDAbramson T 13:33, 23 May 2006 (UTC)Reply

I agree in principle, not because I think it should be possible to look up words phonetically (although I do, and actually bopomofo is just a step away), but because it's possible to run across the pinyin in an English text without having the Chinese equivalent printed. However, I'm not sure all of the issues have been worked out. Immediately it has been noticed that there are homophones in Chinese, so it's quite obvious a page is needed and not a redirect. But what about other romanizations? In fact there are at least two kinds of pinyin, so I'm not sure using "pinyin" is appropriate universally. How are different romanizations to be handled more generally? e.g. what if a Wade romanization matches the pinyin romanization of a different word? Davilla 15:17, 24 May 2006 (UTC)

  • Pinyin being the most popular, I intend to worry about that first... if we run into conflicting Pinyin and Wade transliterations in the future, we can lay them out like so:

==Chinese==

===Pinyin===

  1. character 1: meaning
  2. character 2: meaning
  3. character 3: meaning

===Wade romanization===

  1. character 1: meaning
  2. character 2: meaning
  3. character 3: meaning

... and so forth. Indeed, see . BDAbramson T 16:22, 24 May 2006 (UTC)Reply

That addresses one point. Maybe ===Wade-Giles=== is a better option. But what about the flavors of pinyin? I imagine it must be all right to let "Pinyin" mean Hanyu Pinyin. Is that the opinion of people here, or is the distinction simply being overlooked? Tongyong Pinyin is probably set to die since even in Taiwan it is not used consistently, nor has it replaced zhuyin (bopomofo) in the curriculum, as has Hanyu Pinyin in China, sadly. (Postal System Pinyin is no longer in use and never was generally applicable to the language.) Davilla 15:54, 25 May 2006 (UTC)Reply
I'm happy to see how closely "bú#Chinese" resembles romaji entries (e.g. "bui#Japanese"). One slight deviation is that romaji entries repeat the "inflection line", if you will, after the "POS" heading. Perhaps it makes sense not to do so with Pinyin, since Chinese doesn't really have an equivalent of Japanese kana (the Japanese transliteration shown on the "inflection line" of romaji entries), but I want to point out the difference to gather people's opinions on the matter. Should all entries in all languages consistently repeat the headword after each "POS" heading even if the "POS" isn't really a part of speech and the headword has no actual inflections?
By the way, there is some contention as to whether romaji entries should have "===Romaji===" or "===Noun===" etc. for their POS heading. Hopefully any consensus achieved on Pinyin layout will be reflected in romaji entry layout. Rod (A. Smith) 17:30, 24 May 2006 (UTC)
I have modified to reflect the style used for Japanese. However, I don't understand why you link the kana. I thought that was, for the most part, a pronunciation, especially as it's being used here. So wouldn't a page for the kana be equivalent to the romanization page?
I'm fine with leaving the headers as they are, rather than trying to classify the words by part of speech. In fact, I wonder to what extent even translations are needed. Davilla 15:54, 25 May 2006 (UTC)Reply
Although kana forms of 漢語 (kango, i.e. Japanese words of Chinese origin) are "transliterations" of kanji, the reverse is true for 和語 (wago, i.e. Japanese words of ancient Japanese origin). That is, kanji for ancient Japanese words are actually "transliterations". In any event, several Japanese terms don't even have corresponding kanji (e.g. particles). All romaji, however, are transliterations of kana terms, so it is consistent and etymologically appropriate to have each romaji link to the kana entry. Rod (A. Smith) 20:15, 25 May 2006 (UTC)Reply
Thank you for the information. It answers my question if in an indirect way. I will spell it all out for completeness. Your argument, the second half particularly, supports the inclusion of kana under the romaji header. A parallel argument in Mandarin supports the inclusion of zhuyin under the pinyin header, since zhuyin is the etymological and somewhat idiosyncratic Chinese pronunciation system reflected verbatim in the more modern pinyin. However, zhuyin is considered a pronunciation, mixed with Chinese characters only in rubi text, and therefore would not be instantiated under current policy. This is where Japanese differs. The linking to kana is supported by the necessity of those pages. Davilla 16:45, 26 May 2006 (UTC)Reply
Case in point: on one of my cleanup lists was xianzai. As an English speaker, I haven't a clue what this page is trying to convey. It seems to be something pertaining to Chinese characters, but the state the entry is in now makes even that something of a stretch. What would be helpful to me, as an English speaker (i.e. the intended audience of the English Wiktionary), would be a listing of definitions for the term, perhaps with links to the Chinese characters themselves. Seeing what POS the term functions as (broken down in sections the same way English entries are, for consistency) would be helpful as well. --Connel MacKenzie T C 03:11, 28 May 2006 (UTC)
My understanding was that the POS headings were supposed to be POS when possible. That would mean a Romaji term would have an ===Adjective===, ===Noun=== and ===Verb=== heading, each with an "inflection line." Is my assumption regarding Japanese entries incorrect? --Connel MacKenzie T C 18:22, 25 May 2006 (UTC)Reply
The problem with this approach is that the pages have to be synchronized. If a POS heading is added to the word, it has to be added to its romanizations as well. I wouldn't break off the parts of speech unless the new term gained meaning of its own, e.g. if ASAP came to be used as a verb. Davilla 19:02, 25 May 2006 (UTC)
I don't see how that is a problem. Yes, eventually, the corresponding entries will be corrected as well. But we edit one entry at a time... --Connel MacKenzie T C 08:07, 26 May 2006 (UTC)Reply
The problem is that it creates a lot of needless maintenance. It's an issue of scale, of one set of tasks in comparison to the other. Davilla 16:52, 26 May 2006 (UTC)Reply
The same could be said about English language entries (particularly inflected forms) but that really does not match what we've been doing to date. Just as English entries are now benefitting from some automation, I can see these getting the same class of treatment, eventually. --Connel MacKenzie T C 15:16, 27 May 2006 (UTC)
Careful, you're starting to convince me. On the other hand Hippietrail, I believe, had advocated using e.g. ===Verb form=== which would circumvent issues of gerunds, past participles and such. You've never taken the noun and adjective uses very seriously.
Anyways, given that the header contains part-of-speech information, how would you mark the entry as being romaji, pinyin or what have you? Davilla 16:58, 27 May 2006 (UTC)Reply
Indeed, that is one of the challenges with representing multiple transliterations, not just for kanji/kana/romaji and hanzi/Pinyin/Wade-Giles, but also for regional spelling variations like "color/colour". Personally, I'd love to see these issues resolved ala "theatre"/"theater", but I'm not certain that solution is easy enough for all editors to follow. Rod (A. Smith) 20:15, 25 May 2006 (UTC)
I think the approach used with color/colour is better than theater/ theatre for several reasons: 1) it uses the template namespace, 2) it has back-links, 3) it is much more flexible 4) only the common sections are shared. --Connel MacKenzie T C 08:07, 26 May 2006 (UTC)Reply
If your reason 1 ("it uses the template namespace") is a benefit, that benefit is not obvious to me. Reason 2 is a good idea, so I added backlinks to "theater"/"theatre". I don't understand your reason 3 ("it is much more flexible"). Your reason 4 is confusing to me, because in "theater"/"theatre", only the common sections are shared.
Reason #1 is that there is inherent confusion when the main namespace is overloaded. Reasons #3/4 pertains to multiple sections in "theater"/"theatre" being shared, while in "color/colour" each shared section gets a separate shared template. I should have worded #4 more clearly. By having more specific shared sections, there is more flexibility for common sections diverging (as they should) while other sections remain common. To me, including more than one heading section in a common template is therefore likely to shoehorn data into a common section, that shouldn't be shared. With color/colour, the issue of translations being common is plain, but making any other section common would not be NPOV. I see the same issue for theater/theatre, by the way. But the practice of using shared sections too much seems dangerous, in that certain sections incorrectly commingled into a common section will then stagnate, rather than diverge correctly. --Connel MacKenzie T C 15:16, 27 May 2006 (UTC)Reply
I think the "color"/"colour" approach is great. I just suggest using the main namespace for entry-specific content. My motivations are (a) organizing content with the content's entry and (b) avoiding overpopulation of the template namespace. Consider managing the template namespace if every "-ize"/"-ise" entry has its own template, as well as every term of languages (like Japanese) that have multiple transliterations. Creating hundreds of thousands of templates seems excessive. Rod (A. Smith) 22:46, 26 May 2006 (UTC)
I was thinking about this kind of thing a long time ago but didn't get around to experimenting with it fully. My idea was to request the devs to give us a new namespace "Shared:" which is specifically for these templates and no others. You can sort-of try it now but since it's only a pseudo-namespace now it will require [[:Shared:color-colour]] rather than [[Shared:color-colour]] if it were a real namespace. — Hippietrail 23:29, 26 May 2006 (UTC)Reply
Wasn't that the technical intent of the template: namespace, to begin with though? I'm not clear on just what the distinction being suggested is. Would "shared:" be used for things like cacodaemonically's sections? Or would "shared:" be limited to cases where the content is shared between only two entries? --Connel MacKenzie T C 15:16, 27 May 2006 (UTC)Reply
Template: would be for templates that do funky things and are included by diverse pages for diverse reasons. Shared: would be only for contents shared between articles with differing spellings or possibly also such as romanized article or regional synonyms in some cases - but mainly just for color/colour and center/centre. No tricky stuff would be allowed in Shared: — Hippietrail 19:37, 28 May 2006 (UTC)Reply
  • Oh I forgot to mention, I requested a relevant feature a few days ago and it's already here. When an article wikifies to itself it is wrapped in an HTML strong tag, but now the tag also has the CSS class "selflink". Using this feature we can have Shared Templates even for sections which might reference the article title - such as the Alternative spellings section. If all Shared templates have something like an HTML DIV with CSS class shared-template we can have a CSS rule .shared-template .selflink { display: none }. If the DIV is infeasible we can add some JavaScript that searches for them and removes them. Anyway this is all Grease pit talk. — Hippietrail 19:44, 28 May 2006 (UTC)

List of words used in Wiktionary that aren't defined in wiktionary

I just added a list of words used in Wiktionary that aren't defined in wiktionary at User:RJFJR/WTconcord. It's based on the XML dump (my program for building the list still needs some work but this list is still something to look at). RJFJR 06:46, 27 May 2006 (UTC)Reply

Ooohh, excellent! Thanks very much. This is an extremely good idea.
The next challenge, of course -- (you were waiting for someone to bring this up, weren't you?) -- is to filter out all the simple derived words (plurals, past tenses, etc.) which do have definitions for their base words. (But having written code that tries to do this myself, I know it's not quite as simple as it sounds.) –14:28, 27 May 2006 (UTC)
This link may be helpful then: http://www.tartarus.org/~martin/PorterStemmer/ --Connel MacKenzie T C 15:19, 27 May 2006 (UTC)Reply
The Porter Stemmer may be a bit too powerful for the purpose of inflections only. I would hope that derivations like poorly and weakness do get filled out. Davilla 17:04, 27 May 2006 (UTC)Reply
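As a rough illustration of the "inflections only" filtering discussed above, here is a minimal Python sketch. It is not the Porter Stemmer and not RJFJR's program; the function names and the suffix rules are assumptions made up for this example. It only undoes a few regular English endings (plural, past tense, -ing), so derivations such as poorly and weakness are left alone and would still show up as missing.

```python
def base_candidates(word):
    """Return possible base forms if `word` is a simple regular inflection."""
    w = word.lower()
    candidates = set()
    if w.endswith("ies") and len(w) > 4:
        candidates.add(w[:-3] + "y")       # ponies -> pony
    if w.endswith("es") and len(w) > 3:
        candidates.add(w[:-2])             # boxes -> box
    if w.endswith("s") and not w.endswith("ss") and len(w) > 2:
        candidates.add(w[:-1])             # wheels -> wheel
    if w.endswith("ed") and len(w) > 3:
        candidates.add(w[:-2])             # walked -> walk
        candidates.add(w[:-1])             # baked -> bake
    if w.endswith("ing") and len(w) > 4:
        candidates.add(w[:-3])             # walking -> walk
        candidates.add(w[:-3] + "e")       # baking -> bake
    return candidates


def is_simple_inflection(word, defined):
    """True if some candidate base form of `word` already has an entry."""
    return any(base in defined for base in base_candidates(word))


# Hypothetical set of entry titles, for demonstration only.
defined = {"wheel", "bake", "pony", "poor", "weak"}
for word in ["wheels", "baked", "ponies", "baking", "poorly", "weakness"]:
    print(word, is_simple_inflection(word, defined))
```

Irregular forms and doubled consonants (e.g. "stopped") are not handled here; a real filter would need more rules or a word list.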
Also, it would probably be good to filter out headings (the XML dump marks those as such, right?), because the stylized terms used there (e.g. "romanizations") are swamping the rest. –Scs 14:54, 27 May 2006 (UTC)Reply
Filtering out headings would be a bad idea, IMHO, as we do want those entered as well. --Connel MacKenzie T C 15:19, 27 May 2006 (UTC)Reply
Doh! Of course. (What was I thinking?) Scs 16:39, 27 May 2006 (UTC)Reply
Excellently done, RJFJR! I often filter out entry names that contain a colon in the headword, when trying to limit things to NS:0. This has the side benefit of excluding pseudo-namespaces as well. --Connel MacKenzie T C 15:22, 27 May 2006 (UTC)Reply


I've been making modifications. To get around the problem of not knowing whether a word is supposed to be capitalized, it now scans against titles case-insensitively. I also modified it to ignore template invocations, to clean up some of the entries.

But what's really cool (at least to me): Somebody linkspammed us and they put it in so many times that part of the URL (between periods) floated to near the top of the list of words that are used but not defined, and when I searched for it I found and removed it. (And one of the words this happened with was in links that were made non-displaying, so we wouldn't even see it if we looked at that word.) RJFJR 02:10, 30 May 2006 (UTC)
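For anyone curious how such a list can be built, here is a minimal Python sketch of the general idea described in this thread. It is not RJFJR's actual program: the function name, the regular expressions and the in-memory `pages` dictionary (title mapped to wikitext, as one might extract from the XML dump) are all assumptions for illustration. It skips titles containing a colon, strips template invocations, and compares words against entry titles case-insensitively.

```python
import re
from collections import Counter

# Innermost {{...}} invocation; applied repeatedly to peel off nested templates.
TEMPLATE_RE = re.compile(r"\{\{[^{}]*\}\}")
# Runs of letters, optionally joined by an apostrophe or hyphen.
WORD_RE = re.compile(r"[^\W\d_]+(?:['-][^\W\d_]+)*")


def undefined_words(pages):
    """Count words used in entry text that have no entry of their own."""
    # Titles with a colon are treated as namespaces or pseudo-namespaces.
    defined = {title.lower() for title in pages if ":" not in title}
    counts = Counter()
    for title, text in pages.items():
        if ":" in title:
            continue
        previous = None
        while previous != text:            # strip templates until none remain
            previous = text
            text = TEMPLATE_RE.sub(" ", text)
        for word in WORD_RE.findall(text):
            if word.lower() not in defined:
                counts[word.lower()] += 1
    return counts


# Tiny hypothetical sample standing in for the dump.
pages = {
    "cat": "A small {{term|feline}} animal",
    "animal": "A living creature",
    "small": "not big",
    "Template:foo": "ignored words",
}
for word, count in undefined_words(pages).most_common():
    print(word, count)
```

Sorting by count is what makes oddities such as a repeated spam URL fragment rise to the top of the list.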

The Grease pit has been created

With all the exciting new development underway by Connel, myself, and lots of people whose names have sadly gone in one ear and out the other, I think it's high time to create a separate talk space where we can discuss ideas related to templates, javascript, CSS, customization generally, toolserver; and also generally discuss how to make Wiktionary better in both the short-term and the long-term. It's not pretty right now but please go over and sow some ideas and tidy it up. I think many of us have an idea of how to use such a place. Now over to the Grease pit. — Hippietrail 17:40, 27 May 2006 (UTC)Reply

Can you clarify its purpose a bit?
  • It is a place to discuss how to solve technical problems.
  • Its purpose is specifically for discussing the future development of the English Wiktionary both as a dictionary and as a website.
  • It is also a place to think in non-technical ways how to make the best free open online dictionary.
These statements range from specific to general. How do you see it? Technical stuff only or a broader idea? —Vildricianus 18:05, 27 May 2006 (UTC)
The name seems to signify that, if you don't want to get your hands dirty, you should stay away. Sounds like good advice to me! SemperBlotto 18:10, 27 May 2006 (UTC)Reply
  • I've tried to cover this a bit in the intro now. I think the Beer parlour is for politics and policy and the Grease pit is for engineers, to put it very basically. In the grease pit we can talk about how to lower your suspension or add a link to the navigation bar, but we can also talk about what we'd do if we were in charge of Ferrari. In other words policy talk might belong there if it's very long range - thinking about how things could be, but talk about the finer points of current policy would be much better off right here in the Beer parlour. But it's all up in the air right now so feel free to start a topic in the Grease pit about what it should and shouldn't cover. — Hippietrail 18:45, 27 May 2006 (UTC)

Standardizing inflection templates

In Category talk:Conjugation and declension templates/Inflection, conjugation, and declension template names, I have proposed that we standardize the names, purpose, and display style of three types of templates used in Wiktionary.

My interest in that topic is now renewed because of {{fr-infl-adj}} (used on "digital" and other entries). {{fr-infl-adj}} displays like a floating declension table (with the added property of showing pronunciation) but is named like {{en-infl-reg-other-e}} et al., which are used on the "inflection" line (immediately following the POS heading) to show the headword and its main inflections. Is there yet a preferred pattern for naming and display styles of the two types of templates?

Also, the existence of two sets of English inflection line templates ({{en-noun-reg}} and {{en-noun2}}) leads me to believe that editors have agreed to disagree about inflection line display style. If so, some CSS magic can let everyone have his or her way just as it does with {{cattag}}. Would anyone object to standardizing the inflection template display in a way that allows each reader to choose his or her inflection line display style? (Should I move this to WT:GP?) Rod (A. Smith) 19:39, 27 May 2006 (UTC)Reply

  • I think a good place to start would be to come up with some standard CSS classes. We would need one SPAN or DIV (depending on whether it's inline or creates some kind of box or table) as a wrapper for the whole thing, and another SPAN for the actual inflected words. All templates should use the same classes so that people can say select italics for the body of the section and bold for the actual words - or however they like it.
  • The next step would be to come up with a flexible template that uses whatever it takes to please all the camps and produce an inline version or a version in a box or table etc. That's a lot more work and the place to do it is over in the Grease pit. — Hippietrail 22:05, 27 May 2006 (UTC)
This conversation will continue at WT:GP#Standardized customizable inflection templates. When the technical details are worked out, a proposal will be posted here. Rod (A. Smith) 23:45, 27 May 2006 (UTC)Reply

What is going on?

I just entered "prescriptivistic" in the search box and pressed [Go]. I was presented with a page that had "create an entry with that title" in BLUE. Clicking the link, I arrived at $1. Anyone know what the @$#% is going on? What changed in the last day? --Connel MacKenzie T C 04:34, 28 May 2006 (UTC)Reply

Temporarily solved: Will file a bugzilla, then give a description here. --Connel MacKenzie T C 05:20, 28 May 2006 (UTC)Reply
bugzilla:6115. A change made today to the MediaWiki software allows for ridiculously punctuated characters (such as $1) to be passed through in a more consistent manner. Unfortunately, that means that "$1" is no longer expanded when referencing the "template" Wiktionary:Project-Nogomatch from MediaWiki:Nogomatch. A suggestion from Splarka was to subst: the "template" back into the MediaWiki: namespace, which did the trick. So that page is back to being sysop-edit only again. If needed, we could go back to the extensive Javascript manipulation style, to allow it to still be a template, but I am quite leery of doing so.
Some talk-page moving around (and other assorted cleanup) is now needed for the residual "template", but otherwise, things seem to be back to normal. I don't know of any other MediaWiki: system messages that are similarly affected at this time, but I have not done any extensive searches. Yet. --Connel MacKenzie T C 06:06, 28 May 2006 (UTC)Reply
Ah, that explains why $1 got "vandalized" last night - I had better unblock the innocent party. SemperBlotto 07:16, 28 May 2006 (UTC)Reply
Looking in the comments, in history, the article $1 was created because it had a lot of wanted links to it, but I don't think it should have an article. The wanted links probably came out of entries intending to call the $1 variable marker in a Mediawiki message, not because anybody needs to know that it's a symbol for one dollar or one peso. When we're done with the technical corrections, let's delete the article and perhaps correct some "what links here"s in the process. --Dvortygirl 08:45, 28 May 2006 (UTC)Reply

Wikipedia system of queueing articles from anons

Recently, I accidentally tried to add an article to Wikipedia when I wasn't logged on. It wouldn't let me. Instead the article can be put on a queue for checking. See w:Wikipedia:Articles for creation. What do you think are the pros and cons of this system? Would it work here? SemperBlotto 17:38, 28 May 2006 (UTC)

I thought this would come up sooner or later (sooner, actually). I'd really like to keep it as a last resort. Of course I know how much rubbish gets added, but we are not Wikipedia, in no way whatsoever. Pedia gets thousands and thousands of edits per day, and prior to the restriction of article creation for anons, thousands of new entries a day. That's hardly necessary for a one-million-article encyclopedia. But it's totally inappropriate, I think, to do the same for a dictionary with barely the basic English entries covered. I assume WP was at the time of the rule's institution drowning in bullshit new articles. I don't think we are. Let's try to keep it up, we've just had 9 more sysops this month. —Vildricianus 17:54, 28 May 2006 (UTC)
I don't think it will ever be appropriate for Wiktionary, or at least not for a long long time, because unlike Wikipedia articles, which are topics edited thousands of times as the page grows, Wiktionary entries are sparse, with the words defined very simply usually. -Davilla 59.112.52.124 19:21, 28 May 2006 (UTC)

Moved to Tea room. —Vildricianus 19:55, 28 May 2006 (UTC)

Oops. Thanks! Davilla 02:12, 29 May 2006 (UTC)

Some are being too Autocratic

I have just restored a couple of entries - pussyjunk and veritaserum. Both were deleted in very autocratic style, simply because of the prejudices and autocratic nature of the administrator.

pussyjunk is clearly at least a protologism. That policy / process is tough enough, without arbitrary instant deletion by a bowdlerising administrator.
veritaserum is as valid as Jabberwocky. But, even if this doesn't count, there is a process which is to be followed, allowing time for people to collectively consider if a word is valid or not, or perhaps just needs cleanup. It is not for some autocratic administrator to just decide Harry Potter is not real literature and to instantly delete words which only appear in Harry Potter.

I think we should perhaps start keeping a log of when people abuse their administrator rights, and those who become too autocratic should be on some sort of warning about losing their deletion rights.--Richardb 01:08, 29 May 2006 (UTC)

Well, I deleted pussyjunk and I stand by it. It is nonsense and only gets 9 Google hits – random strings of letters would get more. Widsith 09:33, 29 May 2006 (UTC)
Delete. —Stephen 18:34, 29 May 2006 (UTC)

I think it was right to delete pussyjunk, though I'm waiting to be convinced about veritaserum.

This is the junk that was deleted yesterday:

  1. adding new content
  2. bien sur
  3. White Snot
  4. pussyjunk
  5. cabalidad
  6. Hinge moment
  7. Vitaut the Great
  8. Kategoria:eston'ski (indeks)
  9. eweqweqweqweqw
  10. cybernetica
  11. pitted tubesoulder
  12. andier
  13. Sylvie ilter
  14. xpage
  15. nr.
  16. manasd?r
  17. to give
  18. disfrute
  19. fulling
  20. Snape Killed Dumbledore!

If this is a typical day, and each of these had to pass through rfd before deletion, then the process would become swamped and overloaded. We elect responsible people as admins and they should be trusted to use their judgement. Jonathan Webley 10:15, 29 May 2006 (UTC)

Oh yeah. They're all perfect. Their opinions are more important than everyone else's because they have enough friends to get elected. Why let it last through a vote? The fact that most users can't even double-check the entries they deleted is irrelevant. The fact that I've seen them delete entries on sight with citations is not a sign of abuse, either.--Primetime 04:34, 30 May 2006 (UTC)
I think we should also keep a log of when administrators revert other people who try to give a bit of consistency to our entries. —Vildricianus 11:17, 29 May 2006 (UTC)

Transliterations

There has been a lot of confusion about transliteration, due mostly to the fact that there are several different systems in use for any given script. I’ve been saying for a while now that we should develop an official policy, at least in regard to major non-Roman languages such as Russian, Korean, and Arabic.
For Korean, we have been pretty consistent about using the new w:Revised Romanization of Korean, which avoids all diacritics and most hyphens. For Chinese, the Pinyin system is well founded and already being followed. Japanese doesn’t present many problems, but there are a couple of important considerations.
In case everybody agrees that this is a good idea, I have included below a table showing the main systems in use for Russian. If we choose any one of these as official policy for Russian, it will mean a lot of work because up to now we have been using a slightly different system. I don’t know anything about bots, but perhaps somebody could create a bot to change existing Russian transliterations (which are very consistent) to the agreed-on system.
If this can be made to work for Russian, we should consider a system for Arabic and for Greek. Of these systems, I don’t recommend the ALA-LC because of its reliance on diacritics and special digraphs. —Stephen 21:05, 29 May 2006 (UTC)

Transliteration table

Common systems for romanizing Russian
Cyrillic Scholarly ISO/R 9:1968 GOST UN ISO 9:1995 ALA-LC BGN/PCGN
А а a a a a a a a
Б б b b b b b b b
В в v v v v v v v
Г г g g g g g g g
Д д d d d d d d d
Е е e e e e e e e, ye†
Ё ё ë ë jo ë ë ë ë, yë†
Ж ж ž ž zh ž ž zh zh
З з z z z z z z z
И и i i i i i i i
Й й j j j j j ĭ y
К к k k k k k k k
Л л l l l l l l l
М м m m m m m m m
Н н n n n n n n n
О о o o o o o o o
П п p p p p p p p
Р р r r r r r r r
С с s s s s s s s
Т т t t t t t t t
У у u u u u u u u
Ф ф f f f f f f f
Х х x ch kh h h kh kh
Ц ц c c c c c t͡s ts
Ч ч č č ch č č ch ch
Ш ш š š sh š š sh sh
Щ щ šč šč shh šč ŝ shch shch
Ъ ъ "  ″*
Ы ы y y y y y y y
Ь ь '
Э э è ė eh è è ė e
Ю ю ju ju ju ju û i͡u yu
Я я ja ja ja ja â i͡a ya
Pre-1917 letters
І і i i ĭ ì ī
Ѳ ѳ f
Ѣ ѣ ě ě ě ě i͡e
Ѵ ѵ i
Pre-nineteenth century letters
Ѕ ѕ
Ѯ ѯ
Ѱ ѱ
Ѡ ѡ
Ѫ ѫ ǎ
Ѧ ѧ
Ѭ ѭ
Ѩ ѩ


Notes
* ALA-LC: ъ is not romanized at the end of a word.
† BGN/PCGN: ye and yë are used to indicate iotation word-initially, and after a vowel, й, ъ, or ь.
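
To give an idea of what a bot would have to do, here is a minimal sketch of table-driven transliteration using the Scholarly column above. It is an illustration, not a proposal for the official system: the names `SCHOLARLY` and `transliterate` are made up for this example, only the modern letters are covered, and ъ and ь are left untouched because their cells are not clear from the table as reproduced here.

```python
# Scholarly column of the table above, lowercase modern letters only.
# ъ and ь are deliberately omitted and pass through unchanged.
SCHOLARLY = {
    "а": "a", "б": "b", "в": "v", "г": "g", "д": "d", "е": "e",
    "ё": "ë", "ж": "ž", "з": "z", "и": "i", "й": "j", "к": "k",
    "л": "l", "м": "m", "н": "n", "о": "o", "п": "p", "р": "r",
    "с": "s", "т": "t", "у": "u", "ф": "f", "х": "x", "ц": "c",
    "ч": "č", "ш": "š", "щ": "šč", "ы": "y", "э": "è", "ю": "ju",
    "я": "ja",
}

def transliterate(text, table=SCHOLARLY):
    """Romanize Cyrillic text letter by letter using the given table."""
    out = []
    for ch in text:
        latin = table.get(ch.lower())
        if latin is None:
            # Not in the table (punctuation, Latin text, ъ, ь): keep as is.
            out.append(ch)
        elif ch.isupper():
            # Capitalize only the first Latin letter, so Щ gives Šč.
            out.append(latin.capitalize())
        else:
            out.append(latin)
    return "".join(out)

print(transliterate("Москва"))
print(transliterate("Чехов"))
```

A system such as BGN/PCGN would need more than a lookup table, since the choice between e and ye depends on the preceding letter. A bot would also probably be safer regenerating each transliteration from the Cyrillic headword than trying to convert the existing Latin text from one system to another.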