Wiktionary:Beer parlour/2006/December

This is an archive page that has been kept for historical purposes. The conversations on this page are no longer live.

Slavic noun templates and Lithuanian letters

I've come to notice that we may need some additional gender and number templates to deal with Slavic nouns (at least Polish and Czech). Specifically, a singular noun can use {{s}} and a plural noun can use {{p}}, but Polish and Czech also have dual number nouns. We could use {{d}}, but that currently redirects to "speedy delete", so we'd have to be careful if we made that switch. Also Polish and Czech have two categories of masculine nouns: animate and inanimate, which affects the inflectional endings. And no, the distinction doesn't strictly follow the actual characteristics of the thing described any more than the grammatical gender does. There are rules for deciding whether a Polish/Czech noun is "animate" or not, but the rules run to three pages in most grammars and still have exceptions. Should we have a {{m-i}} and {{m-a}}, or would some other form be preferable?

Additionally, I've noticed that the character insertion tool in the Edit page does not include Latvian / Lithuanian characters. I have to go to Wikipedia and copy-paste from an edit window there. Could someone add: Ė ė Ą ą Ų ų Ģ ģ Ķ ķ Ļ ļ Ņ ņ to the character insertion tool? I've no clue how to do it myself. They could go under general Latin/Roman, but since Latvian and Lithuanian are Baltic languages, this list wouldn't fit any of the existing sublists. They're not quite Slavic. Actually, it would be nice if all the ogonek vowels were added: Ą ą Ę ę Į į Ǫ ǫ Ų ų since right now only Ą ą Ę ę are available, and those only under the Slavic Roman sublist. --EncycloPetey 02:45, 2 December 2006 (UTC)

I have to deal with duals, paucals, animates and inanimates all the time, and it has been my practice never to abbreviate dual or paucal, and rarely animate or inanimate. The only abbreviations in this regard that I use are vai, vii, vta, vti, since these are standard linguistic abbreviations. —Stephen 03:38, 2 December 2006 (UTC)
I agree that we should not have any more abbreviations. Wikipedia is not paper. The templates, if we create any more, should expand to the full form (perhaps wikified) so that they are understandable to the user. Already we have "c" (for "common gender") and that is not known to many people who understand "m", "f" and "n" ("masculine", "feminine" and "neuter"). We also have "m1" for Irish nouns, and I still don't understand what that is. — Paul G 16:09, 6 December 2006 (UTC)

EncycloPetey, I've added the characters you want (the ones with ogoneks under Latin/Roman, between the ones with cedillas and the ones with haceks). I added a new menu option for Latvian/Lithuanian, but it's not showing up... I'll need to check what I've done. — Paul G 16:19, 6 December 2006 (UTC)

Connel has kindly fixed this now. — Paul G 17:32, 6 December 2006 (UTC)

Categorize by number of letters?

I was working on a crossword puzzle and needed a nine-letter word starting with "m" meaning "she heads her family", when it struck me that we should have a category for words by number of letters (at least for English words), e.g. [[Category:English nine-letter words]]. Seems to me this would be really easy to do with a bot - just tell it to count the number of letters, ignore entries including spaces, numbers, and punctuation marks, and add the right category. It would basically make an instant crossword-puzzle dictionary. Any thoughts? bd2412 T 19:46, 3 December 2006 (UTC)

Sounds like category creep to me. I think this would be better left to a more sophisticated search form made specifically for crosswords. It should also let you search for a 9-letter word with 'm' as the 5th letter. I wouldn't like to see a category for that. Pengo 02:34, 4 December 2006 (UTC)
I think that would be better implemented on toolserver, than locally in categories. Next would be a subpage for every word in Wiktionary, "[[/Crossword puzzle questions]]" to build an easier cheat index of crossword puzzle hints. I think going down the path of a "crossword puzzle dictionary" far exceeds our remit and would probably induce too many copyright questions. --Connel MacKenzie 04:53, 4 December 2006 (UTC)
It's matriarch, by the way. Widsith 12:07, 4 December 2006 (UTC)
Slippery-slope arguments aside, if such complexity in interface ever comes to Wiktionary, where one would be able to, say, intersect the category of 9-letter words with those of English words and words whose fifth letter is M, then such categories creatable by mere computation would probably be better implemented as such. On the other hand, information as basic as the number of letters in a word is essential even to currently provided information, specifically anagrams, although I see no compelling connection to categories. DAVilla 15:27, 4 December 2006 (UTC)
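
Editorial sketch: the counting rule bd2412 describes, and the positional search Pengo asks for, can both be computed on demand rather than stored as categories. The Python below is a minimal illustration under stated assumptions: the function names and the sample titles are hypothetical, and "letters only" is approximated with str.isalpha(). It is not an existing bot or toolserver tool.

```python
from collections import defaultdict

def is_plain_word(title):
    """True if the title is letters only (no spaces, digits, or punctuation)."""
    return title.isalpha()

def group_by_length(titles):
    """Map each letter count to the sorted list of plain words of that length."""
    groups = defaultdict(list)
    for title in titles:
        if is_plain_word(title):
            groups[len(title)].append(title)
    return {n: sorted(words) for n, words in groups.items()}

def match_pattern(words, pattern):
    """Return words fitting a crossword pattern, where '?' stands for any letter."""
    pattern = pattern.lower()
    return [
        w for w in words
        if len(w) == len(pattern)
        and all(p == "?" or p == c for p, c in zip(pattern, w.lower()))
    ]

# Hypothetical sample of entry titles.
titles = ["matriarch", "beer parlour", "mother", "mass-market", "patriarch", "route66"]
by_length = group_by_length(titles)
nine_letter_m = match_pattern(by_length[9], "m????????")
```

Because the pattern match is computed from the title alone, a search form could answer "nine letters, fifth letter m" with the pattern "????m????" without any category having to exist for it.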

Language parameter added to form of templates

I recently had a conversation thread with User:Connel MacKenzie where I queried a recent change to the templates below, in which English categories had been added to them. The templates are not language-specific, so I have removed the hard-coded categories from the templates below and modified them to accept a language as a parameter; when the lang is given, the article will be automatically categorised into the appropriate category. See the list below for examples.--Williamsayers79 20:05, 4 December 2006 (UTC)

Note: because {{comparative of}} and {{superlative of}} can be used for adverbs as well as adjectives, I have included a POS parameter where the part of speech must be specified along with the language in the lang parameter.--Williamsayers79 20:05, 4 December 2006 (UTC)
Please see talk pages of the templates above for further info.--Williamsayers79 20:05, 4 December 2006 (UTC)
This is not the first time users have assumed that these templates are for English use only. The hard-coding of "to" before verbs is a prime example. I've already seen fr:I can't remember what. Perhaps because of the comparison with category naming, maybe we should just make it so that they are? Or if fr: etc. is a bad idea then we need to develop a multilingual approach to these. DAVilla 06:00, 5 December 2006 (UTC)
I have no problem with using the "lang=" parameter, provided the English category is defaulted in as the third pipe segment of the #if. Exotic "#if"s are not needed here though; see this edit for what I mean. I do have a very big problem with extraneous <CR>s in widely used templates though! These are supposed to follow the in-line conventions, with glosses added after the template, if needed. --Connel MacKenzie 18:18, 6 December 2006 (UTC)
P.S. I really dislike the notion of adding "|lang=English" to all those entries. That doesn't follow any of the previous conventions, that on the English Wiktionary, we deal first with English. --Connel MacKenzie 18:23, 6 December 2006 (UTC)
Thanks for your reply Connel. I have not added the lang parameter to many articles yet, as I was waiting to hear from the other admins. The main reason for the lang= parameter was to appease all camps: those wanting to autocategorise with the form of templates, those sick of typing in endless categories when entering inflected words into Wiktionary, and finally those that did not like English categories by default in the form of templates.--Williamsayers79 21:29, 6 December 2006 (UTC)
I concur that the default should be English. If it turns out that we don't go that way, at least the automated buttons for entry creation should be modified to add the English flag. Also, should your flags for language be based on the category names, or on the language ISO codes? --Jeffqyzt 20:49, 12 December 2006 (UTC)

What was entry 300,000?

The Announcements don't say which entry was historic number 300,000. I'm curious to know. --EncycloPetey 05:12, 5 December 2006 (UTC)

<blushing> TheCheatBot's English plural electroplatings became entry 300,000 on 2006-11-04 (31). It is listed at Wiktionary:Milestones, but not nearly as important as 250,000. Are you volunteering to add weekly updates to the announcements page? --Connel MacKenzie 05:48, 6 December 2006 (UTC)
I don't know how to do so. Frankly, I'm not sure that there is always something worth announcing each week. I would like to see every 50,000th entry listed there, however. --EncycloPetey 02:15, 12 December 2006 (UTC)

Mandarin pinyin in Han character entries

Having sorted the problems with the Korean Yale, I've turned to finding the problems with the pinyin for Mandarin. There are several hundred problems (250 so far, there are more), plus a systematic mis-generation of the pinyin from the form with tone numbers for a few (if the syllable ends with ui, the diacritic is supposed to be on the i ...)

See User:Robert Ullmann/Mandarin Pinyin for the current state of the analysis.

Comments are appreciated, as is help fixing any of the identified entries, if you are so inclined. There is little that can be fixed automatically. When I can get this sorted, it can generate the missing pinyin syllable entries, otherwise a very laborious task. Robert Ullmann 15:37, 5 December 2006 (UTC)

Does anyone know what the asterisks are supposed to mean? They were added by (one or more) IP-anon user(s) several years ago. Robert Ullmann 21:01, 5 December 2006 (UTC)
It is possible that the entries with the diacritic on the u instead of the i can be fixed automagically. I'd be interested in opinions? Robert Ullmann 21:06, 5 December 2006 (UTC)
I tried updating the report to include all the cases of ui with the diacritic on the u, but there are way too many ... will look at trying to fix all or most of them first. Robert Ullmann 22:04, 5 December 2006 (UTC)
Several hundred such. Perhaps 6-700. Working on fixing them. Robert Ullmann 12:45, 6 December 2006 (UTC)
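
Editorial sketch: the "ui" case mentioned above follows from the commonly stated placement rule for pinyin tone marks (a or e takes the mark; in "ou" the o takes it; otherwise the last vowel does). The Python below is a minimal illustration of that rule, not the generator or the analysis code used for the entries. The function names are hypothetical, and it assumes a single syllable with a trailing tone digit, where 5 or 0 means neutral tone.

```python
import unicodedata

# Combining diacritics for tones 1-4; neutral tone (5 or 0) stays unmarked.
TONE_MARKS = {1: "\u0304", 2: "\u0301", 3: "\u030c", 4: "\u0300"}
VOWELS = "aeiou\u00fc"  # includes u-with-diaeresis

def mark_index(base):
    """Index of the vowel that carries the tone mark, or None if no vowel."""
    lower = base.lower()
    for vowel in "ae":              # a or e always takes the mark
        if vowel in lower:
            return lower.index(vowel)
    if "ou" in lower:               # in "ou" the o takes the mark
        return lower.index("ou")
    for i in range(len(lower) - 1, -1, -1):
        if lower[i] in VOWELS:      # otherwise the last vowel: ui -> i, iu -> u
            return i
    return None

def numbered_to_diacritic(syllable):
    """Convert e.g. 'gui4' to the diacritic form; other input is returned as-is."""
    if not syllable or not syllable[-1].isdigit():
        return syllable
    base, tone = syllable[:-1], int(syllable[-1])
    pos = mark_index(base)
    if tone not in TONE_MARKS or pos is None:
        return base
    marked = base[: pos + 1] + TONE_MARKS[tone] + base[pos + 1:]
    return unicodedata.normalize("NFC", marked)
```

Under this rule, a stored form with the diacritic on the u of "ui" can be detected by regenerating the syllable from its tone-number form and comparing the two, which is one way the several hundred affected entries might be listed before fixing.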

alpha privatives

I would like to create a category containing words which have alpha privatives (amnesia). I've found about 20 such words on wiktionary. Could eventually grow to twice as many. Lethe 03:44, 6 December 2006 (UTC)

We don't categorize entries by prefixes or suffixes. Instead, you can put those words in under a level-4 heading ====Related terms==== on a- (Etymology 5) and an- or, maybe, on alpha privative itself. See an-, bi- or for- for example. --Tohru 14:22, 6 December 2006 (UTC)
In Ancient Greek entries (e.g. ἄδικος), I've included it in the etymology, linking to the page alpha privative. It seems to me, however, as if that task would be un-ending. Medellia 18:26, 6 December 2006 (UTC)
I don't see how this would help any of our users. What we normally do with such words is to list them, as "Derived terms" after the definition of the prefix. See poly- as a fairly typical example. SemperBlotto 09:25, 7 December 2006 (UTC)

Definition numbers carried through to other sections

I thought that the use of numbers (from the definitions section) was deprecated in the Translations section (and anywhere else), since if someone reorders the definitions... (see bounty). Am I right to remove them when found? —— Saltmarsh 15:19, 6 December 2006 (UTC)

Yes, you are, provided there are translation tables that correspond to the senses (if not, please add them). Numbers are redundant and prone to breaking. — Paul G 15:58, 6 December 2006 (UTC)
bounty now updated to the new format for translations. — Paul G 16:02, 6 December 2006 (UTC)
Just to clarify for others reading this: don't blindly remove definition numbers, instead convert the entire Translations section to using the newer format (like Saltmarsh and Paul did.) If in doubt, (or too lazy) then tag it with {{rfc-trans}} instead of removing information. If reformatting and encountering languages you don't speak natively, put the entries you are unsure of in a "Translations to be checked" subsection, tagged with {{checktrans}}. --Connel MacKenzie 17:47, 6 December 2006 (UTC)

Lost it again...

(Discussion moved to WT:GP)

Italicize Spanish words in many, many entries

A VOTE is happening on this topic.

I recently had an exchange with the owner of TheDaveBot, which has added tons of conjugated Spanish verbs. My problem with what has been done is illustrated by an example entry:

  1. The first-person plural of vedar in the present subjunctive.
  2. The first-person plural of vedar in the imperative.

It seems to me that this should actually be:

  1. The first-person plural of ''vedar'' in the present subjunctive.
  2. The first-person plural of ''vedar'' in the imperative.

In particular, I'd like the Spanish words italicized in the definitions. Not only are foreign words typically italicized in English text, but the Spanish words are being used to represent the words themselves, as opposed to being used for their meaning, which is often another reason to italicize words (e.g., "the word word"). I'd like other people's opinions on this. - dcljr 17:03, 6 December 2006 (UTC)

Yes, you are right that this is the appropriate thing to do. Wikification alone is not enough, as I have pointed out before, as it makes no distinction between the English and Spanish wikified words. Wikification is also not equivalent to quotation (writing "the word word" makes no distinction between the two instances of "word" in that sentence, where the first is being used grammatically and the second is quoted, but not indicated as such). There is a question of whether the terms should be italicised (as they are foreign in the context in which they are used) or emboldened (because the wikipedia links take you to headwords). Italicisation is probably OK. — Paul G 17:25, 6 December 2006 (UTC)
Totally agree indeed. henne 17:48, 6 December 2006 (UTC)
I meant to say with this that I agree that wikifying is not enough. If there is strong objection against italics, bold is fine for me. I do like Petey’s suggestion, and indeed also think the answer is to use templates. henne 11:07, 11 December 2006 (UTC)
I am unsure if I agree or disagree. I do recall that the main reasons for avoiding italics were: 1) many computer fonts render italics poorly, especially for non-English scripts (where pixel accuracy is needed even more!); 2) the formatting conventions of this dictionary are whatever we say they are, as long as we are consistent; 3) other dictionaries also use their own styles...we're not writing a term paper, we are trying to present something that can be read and understood. However, Paul's point that we aren't indicating the use/mention distinction at all needs to be addressed somehow (if not with italics.) --Connel MacKenzie 18:03, 6 December 2006 (UTC)
Regarding foreign scripts, perhaps we ought to italicize only foreign words written in the Roman alphabet as it is quite easy to assume that a foreign script denotes an alternate language (and I'm fairly certain I've seen no use of w:Shavian or the like for English entries!). Medellia 18:22, 6 December 2006 (UTC)
A minor problem came to mind: we would need additional "form of" templates (as Template:feminine of) with the term italicized. Medellia 21:03, 6 December 2006 (UTC)
It seems we've had this discussion before, but I don't think a consensus was ever reached. Personally, I prefer to italicize non-lemma entries this way:
  1. ''The first-person plural of '''vedar''' in the present subjunctive.''
  2. ''The first-person plural of '''vedar''' in the imperative.''
While traditionally italics are used to "emphasize" text, the actual result on an electronic screen is to make the italicized text smaller and harder to read. Italicizing non-Roman scripts (as Medellia has noted) just makes them even harder for someone to read who is not familiar with them, and I expect that it will do the same for people less familiar with Roman typeface. If it were feasible, I would argue against using italics at all, but there are too many places on Wiktionary where they're a necessary evil.
In the particular example above, the word vedar is the most important one in the "definition", since it is on that lemma page that the actual definition/translation will be found. The rest of the "definition" text is simply giving grammatical context of the entry rather than its lexical interpretation. This is not what one would normally find in a true definition of a word, and we want people to be aware of this difference. We also want to guide users to the lemma page, since that's where the lion's share of information will be found on that "word". The use of much italic text on the definition line quickly and easily makes this important difference visible. If only one word were in italics, the user will not have such an important visual clue. My preferred format also makes it clearer that one should follow the bold link for a definition and further information. --EncycloPetey 21:11, 6 December 2006 (UTC)

There are over 1,400 words in Category:Italian verb forms that are affected in the same way. So let me know what to do before I add too many more. (I'm not changing any of the existing ones, but I won't object if anyone else does). SemperBlotto 22:19, 6 December 2006 (UTC)

Please don't read my lack of doing the changes as a "don't change" vote...I am fine with it but don't have the time to do it. - TheDaveRoss 00:47, 7 December 2006 (UTC)

I don’t like the italic idea at all. In the first place, it does not produce the same visual effect on a computer screen that it does on paper ... on a computer it does not stand out. Secondly, some languages such as Russian have rather different character forms in the italic, and italicized Cyrillic often confuses Westerners ... for example, мать vs. ''мать''. Thirdly, some languages aren’t normally italicized at all, such as Arabic and Chinese. It can be done but it looks horrible. I recommend bolding instead, but only for languages in the Roman alphabet ... bolding does not work well for some non-Roman scripts. Leave non-Roman scripts unformatted except for special templates such as {{ARchar|}} for Arabic. —Stephen 07:09, 7 December 2006 (UTC)
I can agree with you wholeheartedly. I still get confused whenever I'm trying to read Cyrillic script (italics) as opposed to standard Cyrillic typeface, for which I recognize characters quite readily. --EncycloPetey 01:31, 12 December 2006 (UTC)
On a side note, why not include the italicized version of Cyrillic words on the page, since the appearance is so substantially altered in some cases? And does this possibly even apply to some forms of English words? Does anyone want to auto-image the scripted forms, or am I going too far with this? DAVilla 20:32, 24 December 2006 (UTC)


The answer is, use templates. The style can be deferred. We should be questioning the naming convention for templates instead of having this argument again. In fact I think I'll start a vote to hopefully settle the question. As for the templates, do we want to have a separate group of templates for each and every language? Remember there are several times as many minor languages which don't have their own Wiktionaries. What about categorization? If it's handled with a lang parameter, could we re-use for instance {{past of}}? DAVilla 07:12, 9 December 2006 (UTC)

I wrote Template:es-conjline for the 'bot to use, including the spans with use-with-mention and mention classes, as we use in the other form-of templates. If the 'bot had used it (or copied the correct syntax), there wouldn't be a problem. But the bot used Template:esbot:conjline and subst'd it, which it should not have done, so now it is painful to fix. We have to go back through all the entries and add some form-of template (whether es-conjline or something else) and then our standard will apply. (E.g. whatever default we set in the CSS classes, or whatever a logged-in user cares to set for themselves.) Robert Ullmann 18:05, 10 December 2006 (UTC)
(strike "painful"), is fairly easy to fix. My comments on the 'bot were almost all on this point (see User Talk:TheDaveBot), trying to say that the text of the "conjugation line" had to be in a "permanent", i.e. non-subst'd template that could do the CSS-span-class magic. It is easy to fix, though you do have to touch every entry. Robert Ullmann 21:09, 10 December 2006 (UTC)


  • The correct answer is to use CSS styles. This way some people can use italics, others bold, or people can use italics just for some languages. Or everybody can use the same but at least we only have to edit the global CSS file when we change our minds rather than thousands of articles.
  • The best way to add styles in a Wiki is with templates.
  • The problem is that we want to take into account a couple of factors: firstly, that the word is "affected", whether that ends up meaning italic, bold, or some fancy font; secondly, either the language or the script, which is a lot for casual editors, but I guess dedicated editors or bots could try to fix them; thirdly, I'm not sure if you'll want to take into account the use/mention stuff talked about lately.
  • Also I'm not a CSS genius but I think it might make a difference which CSS span is inside which. I think you can make one span with two CSS classes separated by a space like this: <span class="wiki-italic wiki-langscript-arabic">foo</span>
  • A good template could take an argument for the language which would call another template which specifies the script for that language.
  • This way you could say "make this word stand out" and that might mean italics for English and Spanish but bold or a different font for Japanese or Hebrew.

Hippietrail 11:59, 11 December 2006 (UTC)
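
Editorial sketch: the language-to-script lookup and the two-class span described in the list above could look roughly like the following. It is written in Python only to show the logic a template would carry. The class names, the language table, and the function name are hypothetical, not existing Wiktionary CSS classes or templates.

```python
from html import escape

# Hypothetical table: which script each language code is written in.
LANG_SCRIPT = {
    "es": "latin",
    "it": "latin",
    "ru": "cyrillic",
    "ar": "arabic",
    "zh": "han",
}

def mention_span(word, lang):
    """Wrap a mentioned word in one span carrying two space-separated classes."""
    script = LANG_SCRIPT.get(lang, "unknown")
    classes = f"wiki-mention wiki-langscript-{script}"
    return (
        f'<span class="{classes}" lang="{escape(lang, quote=True)}">'
        f"{escape(word)}</span>"
    )

spanish = mention_span("vedar", "es")
```

With markup like this, the choice between italic, bold, or no styling per script would live in the global stylesheet or a user's own stylesheet, so changing the convention would not require touching the entries.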

Translations - wiki links

Is the caret the correct way to link to other language dictionaries? As in:

  • German: Wissenschaft f (^)
    and can this be put in the policy document - which mentions the links but doesn't say how? —— Saltmarsh 07:25, 7 December 2006 (UTC)
There was a discussion not too long ago about using {{trad}}. Some people object to the asterisk, others to the circumflex. I'd prefer it look like *German: Wissenschaft Deutsch but I don't know of a straight-forward way of implementing that. --Connel MacKenzie 17:50, 7 December 2006 (UTC)
I've modified {{t}} although I expect it to be rolled back. Is this what you're aiming for? DAVilla 19:58, 7 December 2006 (UTC)
As I have commented in {{t}}, I do not like this at all. It takes way too much space, see e.g. false, Dutch translations. I propose it be reverted to the language abbreviation. henne 11:48, 11 December 2006 (UTC)

I hate to be an aguafiestas but I really hate the asterisk, the caret, and the superscript native language name. I think I might hate it less if it looked smooth and modern and professional. I can imagine a Unicode character that looks like a solid triangle pointing right set to a specific colour that doesn't have a current wiki meaning like green or yellow or magenta or cyan. — Hippietrail 12:06, 11 December 2006 (UTC)

Something like this:
Though it looks like we'd have to use a CSS class name and edit it in the global CSS file. Inline colours don't seem to work with links. There's a bunch of other symbols which look nice for these sorts of things in Unicode's "Geometric Shapes" page, U+25A0 – U+25FF.
Hippietrail 12:30, 11 December 2006 (UTC)
I love Hippietrail's idea. It's clear and yet unobtrusive. Some modern print dictionaries use a similar symbol to mean "see also" or "see under", so I agree that this would be a good symbol to use. — 14:32, 12 December 2006 (UTC)
Check out what they're doing on the Latin Wiktionary for this issue. --EncycloPetey 02:52, 14 December 2006 (UTC)
  • Spanish: así   — looks best at my screen resolution.
  • Spanish: asíen   — the pointing finger too small. —— Saltmarsh 04:56, 15 December 2006 (UTC)

There seems to be no consensus, how do we reach one? The choices seem to be between:

Are there any others? —— Saltmarsh 05:39, 20 December 2006 (UTC)

There's the asterisk...
and perhaps some other unicode shapes:
..or, perhaps, the same as all of the above, but without the parentheses:
However, may I suggest that whatever we end up with be implemented via use of a template, so that writing it would be something like the following (I chose "native" as the template name, but whatever suits):

*German: [[Wissenschaft]] {{native|de:Wissenschaft}} {{f}}

...the point of which would be to allow easy modification later without touching the articles by updating the template.
As for how to reach consensus, once you're reasonably sure you've got all the options you could call a WT:VOTE. Or possibly multiple votes, first to determine whether we want to codify such linkages in the style guide, and secondly about the specific implementation details. --Jeffqyzt 15:41, 21 December 2006 (UTC)
But why not make it all into one template {{native|de|Wissenschaft|f}} yielding [[Wissenschaft#German|Wissenschaft]] [[:de:Wissenschaft|de]] {{f}} or however we want to name it and modify the appearance of it... Is it possible to automatically expand de as a parameter to "German" for use in the anchor, btw? \Mike 15:56, 21 December 2006 (UTC)
Even better :-) Of course, someone who actually creates templates should step in and tell us whether this is all possible in our template coding system... --16:05, 21 December 2006 (UTC)
Of course: it is template {{t}} (or {{trad}}), which is what started this discussion! Robert Ullmann 12:20, 26 December 2006 (UTC)
I would argue against having too many arguments to the template - they make them more inscrutable to the newbie and may discourage use; also, different ones might be needed for different parts of speech. And don't we already have template {{t}}, which is also redirected to from {{trad}}? —— Saltmarsh 06:59, 22 December 2006 (UTC)
I have called for a vote WT:VOTE#Translations_-_wiki_links on this. To take place in two parts - first to decide in general whether text or symbols is preferred - and then to select which text/symbol. The first vote ends 15 January 2007, the second on 31 January 2007. —— Saltmarsh 11:43, 26 December 2006 (UTC)
The first vote doesn't include using the code? Odd. Doing it in two stages does not work. I'd prefer a symbol I liked over the code, but very strongly prefer the code over a symbol I don't like, which is almost all of them ...
Note that all of this is discussion about what such a template might look like, to come up with a reasonable proposal. There has not been any discussion yet of whether such a template should be used. Robert Ullmann 12:17, 26 December 2006 (UTC)
Sorry - by code do you mean en, de etc ? Saltmarsh 12:24, 26 December 2006 (UTC)
No one had voted so I rejigged the options - along lines that I hope User:Robert Ullmann will support. —— Saltmarsh 15:11, 26 December 2006 (UTC)
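
Editorial sketch: the single-template form Mike suggests, {{native|de|Wissenschaft|f}}, amounts to a small expansion rule plus a code-to-name table for the section anchor. The Python below models that expansion only to make the proposal concrete. The function name and the table are hypothetical, and whether the template system itself can do the code-to-name lookup is exactly the open question raised in the thread.

```python
# Illustrative subset; a real table would cover every language code in use.
LANGUAGE_NAMES = {"de": "German", "es": "Spanish", "nl": "Dutch"}

def expand_native(code, word, gender=None):
    """Build the wikitext that {{native|code|word|gender}} is proposed to yield."""
    name = LANGUAGE_NAMES[code]
    parts = [
        f"[[{word}#{name}|{word}]]",   # local entry, anchored at the language section
        f"[[:{code}:{word}|{code}]]",  # link to the foreign-language Wiktionary
    ]
    if gender:
        parts.append("{{" + gender + "}}")
    return " ".join(parts)

german = expand_native("de", "Wissenschaft", "f")
```

Keeping the output format in one place is the point made earlier in the thread: whichever symbol or code wins the vote, only the expansion rule changes and the entries stay untouched.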

'A picture' v '1000 words'

Could we have (do we have) some "perfect example" pages (no more than ~3) illustrating recommended layout? I can spend more than 50% of a session searching for an example/guidance. Not only would these serve as guidance - they could provide the examples needed when a discussion is taking place. —— Saltmarsh 07:31, 7 December 2006 (UTC)

I assume you mean something like leap, be or portmanteau? These used to be referenced on WT:ELE as examples of properly formatted entries, but they've been removed from that page for some reason. - dcljr 21:49, 8 December 2006 (UTC)
This is probably fine - except that someone may edit, unnoticed, such an example.
WT:ELE illustrates part of the problem - guidance gives Quotations as a separate heading after definitions, I suspect most articles have them interspersed with definitions. Meanwhile people are adding entries in differing formats - which may need straightening out later. Total uniformity may be undesirable, but I for one would like to see what I should be aiming for.
RTFM? I have often spent considerably more time than I want to looking for help - more often than not a picture (epitome article) would do it in a few seconds. [[User:Saltmarsh/Sandbox2]] User:Saltmarsh/Example1 shows a starting point in outline. Such pages would be readonly - or refreshed regularly?
This is not to criticise those who have spent many hours writing help - but I cannot always find what I know must be there! —— Saltmarsh 14:25, 11 December 2006 (UTC)
I've started working to organize a set of greatly fleshed out entries that will be simple for various cases. The first one to be in suitable shape is listen, which serves as a simple verb example with a single etymology and single pronunciation but with many of the other bells and whistles in place. I've got a short list (not currently handy) of noun, adjective, entry-with-1-pron/2-ety, entry-with-1-ety/2-pron, entry-with-2-ety/2-pron, entry-with-homophones, and so forth. I'll try to remember to set this up somewhere in my user space for you and other folks to take a look at. I intend to keep entries from the list on my watchlist, and use them as points of discussion for updating some of the older portions of the ELE. I've begun the project page here. --EncycloPetey 02:34, 12 December 2006 (UTC)

How tag for deletion as vandalism?

How do I tag an entry like horrificus to be deleted on the grounds that it's vandalism? I think there's a template but I don't remember which. RJFJR 17:30, 7 December 2006 (UTC)

  1. Nominate more sysops on WT:VOTE
  2. Use {{rfd|vandalism}} or {{delete}}
  3. WT:VIP
--Connel MacKenzie 17:45, 7 December 2006 (UTC)

"would" for "had"

There seems to be another use of "would" emerging. Here's an example:

"If you would have done what I told you to, we wouldn't be in trouble."

This isn't grammatically correct - the pluperfect needs to be used in the first clause ("had done" instead of "would have done"). I suspect this comes from the fact that "had" is often contracted to "'d" in speaking, and this is mistaken for a contraction of "would".

Is this US-specific (I've heard a few US speakers use it), and if so, is it a regional/dialectal variation, or is it just non-standard?

Whether it is non-standard, incorrect, regional or whatever, I think that, if we are to be descriptive, we need to include this usage. — Paul G 10:18, 10 December 2006 (UTC)

Excellent. DAVilla 17:56, 10 December 2006 (UTC)
I'm sorry - but I don't understand what you are saying is incorrect about it. I think the usage above is very common in the US, but I'm not clear on what you are saying is wrong about it...nothing is wrong from what I see. "If you'd have done..." is a contraction of "If you would have done..." so I'm not clear on why you think the word "have" can be removed there. I'm not sure I've ever heard had pronounced as 'd. I agree that it is very common, and should be described here (but I'd like to understand what you would like it to say, first.) To me, the flavor of the sentences is quite different "If you would have done..." vs. "If you had done..." but possibly that's just me. --Connel MacKenzie 18:01, 10 December 2006 (UTC)

I always hear this less as a pluperfect than as a kind of colloquial subjunctive - it's like the equivalent of "If you were to have done it...", but much less formal. This may be the difference that Connel instinctively feels. (btw Connel: you must be mistaken about never hearing had as 'd – think of sentences like I'd already eaten dinner by the time she got home.) Widsith 12:34, 11 December 2006 (UTC)

Oh, thank you. I misread what he had said with regard to "'d". I thought he was saying that "had" was somehow pronounced as "d", without being written as a contraction. My bad. --Connel MacKenzie 05:18, 12 December 2006 (UTC)

Analysis of L3 headers

While waiting this afternoon for Chelsea/Arsenal (still nil-nil at 70 minutes), it occurred to me that it would be easy to do a complete analysis of the level 3 headers in the wikt; I had 90% of the code already.

See User:Robert Ullmann/L3

Interesting. Now if Arsenal can hold, Man U will be either 8 or 9 points clear ... ;-) Robert Ullmann 17:31, 10 December 2006 (UTC)

http://tools.wikimedia.de/~stridvall ? --Connel MacKenzie 17:44, 10 December 2006 (UTC)

Just some Python code. I'm not suggesting some particular discussion, just thought some people might be interested. Robert Ullmann 18:10, 10 December 2006 (UTC)
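For readers curious what such an analysis might look like, here is a minimal sketch in Python. It is not the code behind User:Robert Ullmann/L3; the function name, the regular expression, and the sample wikitext are all assumptions made for illustration. It only counts lines that are exactly a level 3 header (===Title===).

```python
import re
from collections import Counter

# A line that is exactly a level 3 header: ===Title===
# The [^=] guard keeps level 4 headers (====Title====) from matching.
L3_HEADER = re.compile(r"^===[ \t]*([^=].*?)[ \t]*===[ \t]*$", re.MULTILINE)


def count_l3_headers(pages):
    """Count level 3 header titles across an iterable of wikitext strings."""
    counts = Counter()
    for text in pages:
        counts.update(L3_HEADER.findall(text))
    return counts


# Hypothetical sample input; a real run would read entries from a database dump.
sample_pages = [
    "==English==\n===Etymology===\nFrom ...\n===Verb===\n====Translations====\n",
    "==English==\n===Noun===\n===Verb===\n",
]

counts = count_l3_headers(sample_pages)
for title, n in counts.most_common():
    print(title, n)
```

Sorting the result by frequency makes rare or misspelled headers easy to spot, since they collect at the bottom of the list.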

New image source

Perhaps some of you know commons:Category:Vocabulary. I have just found another source of images illustrating basic words: w:simple:Wikipedia:Basic English picture wordlist. --Derbeth talk 23:34, 10 December 2006 (UTC)

Etymology: Different templates for ‘X derivations’ and ‘related to X’ where X is language

Often, an etymology section contains stuff like ‘From somelanguage word, from olderlanguage oldword, compare with relatedlanguage similarword, lessrelatedlanguage lesssimilarword’ etc. Nice templates exist to replace somelanguage and olderlanguage (and I have been creating a few more), which can be found in WT:I2T#Webster 1913 abbreviations. One tends to use those same templates for (less)relatedlanguage too, but this gets the word inserted in the category of (less)relatedlanguage derivations. Do we want this? Should separate templates be created which do not add the category tag? Should these languages not be linked? henne 10:46, 11 December 2006 (UTC)

My opinion: You could link the language name directly, but templates should only be used in the etymology section when they give the immediate source of derivation. So, a word that entered English from French should use the {{F.}} template, but if that French word comes from Latin you would not use {{L.}}, because the word did not enter English from Latin. It would be nice if {{L.}} could be expanded to accept a parameter so that it could be used on non-English words whose etymologies are Latin (such as most of the Romance languages ;) ). Unfortunately, I can't think of a simple way to do that. There is a Category:Latin words from Greek, for instance, but the category must be added manually right now. --EncycloPetey 01:45, 12 December 2006 (UTC)
This would mean a lot of the etymology sections have to be reviewed. I would not be so strict: use the template for all the languages in the chain it has been derived from, but not for the languages which have cognate words.
OTOH, it would be easy to have a template for those too. Maybe we can add a parameter to those templates to indicate it is the primary source, and only include the ‘derived from’ category if this parameter is present. Then they can still be used for the other languages, conveniently adding pedialinks. henne 09:20, 12 December 2006 (UTC)
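The parameter idea above could be sketched roughly as follows. This is only a Python illustration of the logic, not actual template code; the function name, the "primary" flag, the link format, and the category name "X derivations" are assumptions chosen for the example.

```python
def render_etym_template(language, primary=False):
    """Return wikitext for a language name used in an etymology section.

    The Wikipedia link is always emitted. The derivation category is
    added only when the language is flagged as the primary source.
    """
    # Hypothetical link format: an interwiki link to the Wikipedia article.
    text = "[[w:%s language|%s]]" % (language, language)
    if primary:
        # Hypothetical category naming scheme.
        text += "[[Category:%s derivations]]" % language
    return text


# Source language of the word: linked and categorised.
print(render_etym_template("French", primary=True))
# Cognate given only for comparison: linked, but no category.
print(render_etym_template("Old Norse"))
```

With a default of "no category", existing uses for cognates would stop being categorised, and only entries that are reviewed and flagged would appear in the derivation categories.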

I think we should keep with using the Etymology templates (staying clear of the plain language ones of course!) but not for cognates or compares etc.--Williamsayers79 16:39, 12 December 2006 (UTC)

I also think we should stop calling templates such as {{OE.}}, {{Zulu.}}, {{F.}} Webster templates; we should call them Etymology templates to stop confusing people (this is half the reason why people, including myself, ended up using the plain language templates in etymologies). I reckon we should move them to category:Etymology templates also. I'd like to make a separate page called Wiktionary:Etymology templates to detail them all; currently we have a very confusing situation where there are two Webster Etymology template lists at Wiktionary:Index to templates#Webster 1913 abbreviations and Wiktionary:Etymology. What do you all think?--Williamsayers79 16:39, 12 December 2006 (UTC)
So, have you devised a usable naming scheme for those templates? I think the format {{etym-OE}} would be too cumbersome, offhand. Perhaps {{e-OE}} or "e-" + ISO 639 lang code? Hey, did you just volunteer to set these all up and document them? FANTASTIC! As each of the legacy Webster templates are cleared, we should probably mark them as deprecated and retain the template saying so (and what it should be called now.) --Connel MacKenzie 17:21, 12 December 2006 (UTC)
Have created a sub-page Wiktionary:Etymology/language templates and category:Etymology templates, which holds all the so-called Webster templates and some others that have been raised too. In time I will devise a nifty way to make these templates take language parameters to make things easier for foreign word etymologies.--Williamsayers79 21:04, 14 December 2006 (UTC)
Tangential note: At one point, we had an appendix (or appendix-like) thing for abbreviations used in Webster 1913. Until one of us gets around to doing an organized import of all the rest of them (perhaps into the Transwiki namespace, or something?), we'll need to know what those wacky abbreviations were. So please remember to deprecate them, but not delete them (when you get to that phase.) --Connel MacKenzie 06:45, 15 December 2006 (UTC)
Why do you want to deprecate them? I see absolutely no problem in the {{OE.}} format. The point at the end distinguishes them very nicely from the other language templates. Ok, for newcomers it is not immediately obvious what the templates are for, but I think that is a small initial step. I see no benefit in renaming them all, only much extra work.
William: good work! I added a description to Category:Etymology templates, copying some of the content from WT:I2T there. I edited WT:I2T too, to just link to the list in Wiktionary:Etymology/language templates. henne 12:14, 15 December 2006 (UTC)
My suggestion for deprecating them was only a response to the above comments, that they are misnamed. If we are going to adopt a consistent "etym-" prefix, and a language name for them all, then the "Webster" ones will all become redirects, right? But even after all the existing entries are converted to a consistent naming scheme, those redirects absolutely must be retained, somehow. In the past, people have raised objections to retaining anything related to "Webster 1913" for reasons I never did understand. --Connel MacKenzie 18:23, 18 December 2006 (UTC)

Christmas Competition 2006

This year's Christmas competition is underway at Wiktionary:Christmas Competition 2006 SemperBlotto 11:51, 11 December 2006 (UTC)

Due to unprecedented demand - this has been extended to the New Year. (see also irony) SemperBlotto 22:47, 26 December 2006 (UTC)


Moved to The grease pit. --Connel MacKenzie 05:06, 13 December 2006 (UTC)

Japanese katakana

I intend to write basic articles for 43 of the basic katakana. I won't touch : they're fine for now. I also won't do anything about katakana with [han]dakuten. Of the remainder, and exist as normal articles, if stubby; the others are nonexistent or redirects to hiragana. See and for examples of the style.

I bring this up here because when I do it it might be confused for a bot run; I'll be pasting subst:s of User:Cynewulf/sandbox, and it might go somewhat quickly.

If anyone has an objection or suggestion for improvement, please let me know. If no one objects, I'll begin about a week from now.

Cynewulf 06:30, 15 December 2006 (UTC)

The format looks very nice to me. Only two minor points:
  • I think, in respect of consistency, it's a good idea to overwrite the first five katakana entries, , too with your template, salvaging the well written etymologies and the noun section on . No valuable information will be lost by it.
  • The link Special:Allpages/か should be Special:Allpages/カ, shouldn't it? Hiragana entries can have their own all-pages links.
--Tohru 13:53, 18 December 2006 (UTC)
Ah, OK. I made the Allpages link to hiragana because I was thinking about the hidx= parameter to every template. I'll fix that. Cynewulf 22:27, 18 December 2006 (UTC)
Good ... "Stroke order" shouldn't be a L3 header. You might look at {{Han stroke}} and {{stroke order}} if you want to maybe use a float right box; otherwise just make stroke order a caption (e.g. with ; at the start of the line) And I concur with Tohru, subst the template in at the top of the existing entries, and then edit out the older redundant format. Robert Ullmann 22:05, 18 December 2006 (UTC)
Heh, I wrote {{stroke order}}. It calls {{Han stroke}}, which I suppose is misusing it for kana. I'll stick {{stroke order}} in and we can change the template later if necessary. Cynewulf 22:27, 18 December 2006 (UTC)
Indeed ;-) "Han stroke" was sort of a stop-gap while formatting the Han character entries; all it is is a float box with the caption "stroke order" (as you know, but others reading this may not); we can figure this out presently. Robert Ullmann 22:34, 18 December 2006 (UTC)

Sorry for the delay -- everything's ready on my end. I'll give it a day or two and start work. Cynewulf 16:36, 8 January 2007 (UTC)

Here I go... Cynewulf 14:31, 14 January 2007 (UTC)

Done! Cynewulf 15:08, 14 January 2007 (UTC)

Sorry, I've never looked at this discussion or these pages before, but the use of phoneme is incorrect for most kana. They are typically syllables composed of two phonemes though the vowels and a few others are phonemes.--BrettR 16:04, 14 January 2007 (UTC)

Sigh, nice timing.. I agree that e.g. "ka" in English is two phonemes, but since Wikipedia says "The phoneme can be defined as 'the smallest meaningful psychological unit of sound.'" and since we say in phoneme "An indivisible unit of sound in a language.", because in Japanese individual consonants don't exist, is "ka" then a phoneme in Japanese? (It's certainly two phones.) (Yes, I'm arguing this out of laziness.) (See also , [2]) Cynewulf 16:54, 14 January 2007 (UTC)
Even in Japanese, you can show that phonemes are psychologically real independent of syllables by considering the way ~てしまう becomes ~ちゃう taking the beginning of the て and the end of the ま. Spoonerisms are also evidence for it.--BrettR 19:10, 14 January 2007 (UTC)
And the -く -> -かない sort of thing as well. Oh well, I'll fix it eventually. Cynewulf 19:28, 14 January 2007 (UTC)
How about simply replacing "phoneme" with "syllable"? I think it is good to explain [ka] as a unit of Japanese speech sound, though it actually consists of two phonemes. --Tohru 11:46, 16 January 2007 (UTC)
Yes, that was what I imagined. I would do it even for the few that are phonemes.--BrettR 13:23, 16 January 2007 (UTC)

Old Spanish?

I'd like to add some obsolete Spanish forms, but I'm not sure whether to label them as ==Spanish== or ==Old Spanish==. Where does Old Spanish begin and end? Is there a Middle Spanish? (The Wikipedia article on the subject, Linguistic history of Spanish, is currently crap.) --Ptcamn 12:32, 15 December 2006 (UTC)

This also interests me. Some months ago I added things like "muger" as Alternative spellings marked as "obsolete". I also couldn't find much info on Old Spanish / Middle Spanish but you will find the equivalent for Portuguese and until you find something better I would go roughly with similar dates for Spanish since the languages grew up together in the same area under similar influences and from the same roots. — Hippietrail 01:14, 17 December 2006 (UTC)
Wikipedia says Old Portuguese is 12th to 16th centuries. After that is considered early modern Portuguese. — 12:52, 17 December 2006 (UTC)
Even better. The Linguist List states "The language spoken in central Spain from the 10th -- mid 13th century AD, and the precursor of Modern Spanish." -- which turns out to be quite different to the Portuguese case after all. [3] -- Hippietrail 13:07, 17 December 2006 (UTC)


Have you seen the French Wiktionary? It is beautiful and very easy to use. For example, compare site with http://fr.wiktionary.org/wiki/site (ignoring content) for an example of what I mean - gorgeous and so easy to locate the information you are after. It would definitely help en.wikt a lot! --HappyDog 12:53, 15 December 2006 (UTC) (ps this page is far too long! time for an archive, someone?)

I like the coloured language heading bands - these make them stand out. But I don't think that the heading icons add much.
What would improve things would be to differentiate the different levels of heading better - on my screen the fonts are all pretty much the same size. —— Saltmarsh 12:14, 16 December 2006 (UTC)
I disagree that it is easier to find information you are looking for, there. I strongly agree that it is prettier. --Connel MacKenzie 18:15, 18 December 2006 (UTC)
Yes, it looks better. Russian Wiktionary also uses color for this (e.g., ru:общий). —Stephen 18:58, 19 December 2006 (UTC)
Hey, didn't we agree to do that here, also? Who was going to mangle the CSS to put the background colors on the headings? Or did that get quashed? --Connel MacKenzie 08:50, 12 January 2007 (UTC)
It looks "pretty" but English Wikt looks slick and neat if you ask me. The Russian version looks more businesslike than the French one. I'm in favour of sticking with what we've got.--Williamsayers79 09:10, 20 December 2006 (UTC)
Yes, the little icons are pretty. They add no information. I for one don't like them because they are distracting. But there is another problem: the way the fr.wikt defines templates for headers does not allow for the extended structure we use with multiple etymologies. Compare our entry for a with fr:a, where there isn't any way they can represent the multiple etymologies for English. (The preposition POS is missing entirely, as are several others, and there is no place to put the etymologies, which may be why they are missing entirely.) Even if the fr.wikt had some template variants, the cute icons get very tiresome very quickly.
All that said, we should always be looking at some improvements in aesthetics. Robert Ullmann 12:50, 20 December 2006 (UTC)
Some good points, and I concede that a fair amount of it is subjective. A couple of specific responses:
  • Icons - Personally, I like them, although some of them are a bit 'twee' and they are not quite unified enough (they seem like they come from different icon sets, which they do). In one sense, yes, they add no information. However, what they add is an at-a-glance overview of the type of information being presented. When you are familiar with them it makes it easier to locate the information you want simply by scrolling - you don't have to read the headings. I know that there will be strong resistance to adding them here by people who see them as distracting, or worse 'dumbing down', but I definitely would find them useful.
  • Templates - surely we can create templates that work how we want them to work? If our method won't work with their existing templates then make new templates.
Another thing to consider is the user experience. Currently I feel that looking up information on en.wikt is a chore, whilst fr.wikt is a very positive experience. I'm pretty sure that en.wikt is a more complete source of information, but that is not my only consideration - I tend to use fr.wikt as my first port of call (even though I am a native English speaker) falling back on en.wikt when necessary. I know it may not make too much sense, but that's the way people work on the web - why else do you think movie studios spend so much money on making their vapid websites look so good? --HappyDog 03:06, 12 January 2007 (UTC)
We (en.wiktionary.org) get "hit" by up to 30,000 anonymous users per day. I find it very hard to even imagine that the majority of them are regular "readers" that haven't chosen to contribute...instead I suspect the million+ visitors to Wikipedia provide a non-stop flow of new users here. So I don't think the concept of them being able to recognise the icons (after they've gotten used to them) has a tremendous amount of leverage. For editors, yes of course.
My main objection to the icons is performance. Every Monday, when the load peaks, image loading from Commons: suffers. Yes, I understand that from day to day, images are cached in the browser. But again, most of our visitors seem to be visiting for the first time. Making the page load go from 5 seconds to 30+ seconds seems very detrimental. (I suppose we could have sysops upload the icon images here...but the performance problem would still exist, to a lesser degree.)
Then again, people do wait for much longer than that for their youtube movies to load.
Templates. Well. We've had a long legacy of militantly opposing heading templates on en.wiktionary.org. The section editing is too useful to give up on, just for pretty icons. The obvious arguments regarding template complexity (for newbies) do not need to be mentioned, do they? On fr.wikt:, everything is wrapped in a template. There are good and bad aspects of that. As much progress as en.wikt: has made toward using templates better, I think a wholesale jump to the fr.wikt: style would meet an unspeakable amount of resistance.
--Connel MacKenzie 08:34, 12 January 2007 (UTC)
It's the first time I see either Russian or French Wiktionary, and I have to agree with the poster above who said that the Russian one looks more professional. Perhaps the icons work better when you get used to them, but for a first time visitor I think they are distracting. The problem is that there are too many of them. You can't emphasize almost every subsection like that and have it be clear. Emphasis must be used sparingly to work, and those icons very much stand out, even more than the language headings, which should be the most prominent divisions of the article. What English Wiktionary could use is emphasis of the language headings and perhaps the actual definitions of the words, which is what the reader usually wants aside from translations, which stand out already due to their being dense collections of links. -- 00:59, 14 January 2007 (UTC)

Protected page

What is the Edit protected template called? Where is the administrator's noticeboard? Specifically, Template:En-noun needs to be changed to use a much clearer term than simply "uncountable". This is inscrutable to any common reader. Centrx 08:59, 16 December 2006 (UTC)

(Also, why is the style guide in Pathoschild's userspace? Centrx 09:03, 16 December 2006 (UTC))
Such a request would go on Template talk:en-noun, but it would not be honored without (another) very lively discussion and a vote.
We don't have the critical mass of administrators to split out an administrator's noticeboard from the beer parlour. With 25 times as many sysops, perhaps.
Pathoschild's experiment has much merit, but is no more than a personal experiment at this point in time. --Connel MacKenzie 22:34, 16 December 2006 (UTC)
Where is the original or other discussion? Centrx 03:43, 17 December 2006 (UTC)
In the beer parlour archives (2006) somewhere; the flashpoint of the previous flamewar was Category:English nouns. --Connel MacKenzie 18:12, 18 December 2006 (UTC)
I remember objections to "uncountable" even at the time. Did anyone ever dig this conversation up, or has the topic been dropped? DAVilla 02:00, 30 December 2006 (UTC)


Can Wiktionary please get a new favicon, to differentiate it from Wikipedia? I find it annoying that they're the same because I don't know which tabs in my browser are which, and then I have to think too much when using the search box, which I don't like to do. :( Pengo 22:28, 17 December 2006 (UTC)

We had the separate one recognized for a while...not sure what bug crept in. I do remember the Wikipedia favicon being affected by our change at one point. bugzilla:6096 seems to be the most recent duplicate bug on the topic. --Connel MacKenzie 18:10, 18 December 2006 (UTC)

non-English word of the day

I was thinking, could we perhaps arrange sometime a section for foreign language words? This would be a good way to jump start readers into browsing other scripts and such. John Riemann Soong 22:54, 17 December 2006 (UTC)

  • I suggested this before but was shot down in flames - apparently this is the English wiktionary. SemperBlotto 23:02, 17 December 2006 (UTC)
  • I don't see a problem with it. Actually sounds like a good idea to me. Pengo 01:48, 18 December 2006 (UTC)
  • For (edit: separate from en:WOTD). DAVilla 08:57, 18 December 2006 (UTC)

I like the idea of having a separate WOTD for foreign entries. The current WOTD has a growing following though (see the info desk requests for a cleaner RSS feed.) Commingling non-English entries there would be unwise. --Connel MacKenzie 18:06, 18 December 2006 (UTC)

How about we just ask the people that are subscribed to the RSS-feed? It shouldn’t be that difficult to include an extra paragraph once, right? See who answers... henne 19:22, 18 December 2006 (UTC)
I also oppose the notion of commingling them for the obvious reason that this is the English Wiktionary. That said, a separate WOTDforeign is a good idea. FWOTD? FOTD? --Connel MacKenzie 19:30, 18 December 2006 (UTC)
MDJ? (Mot de jour?) :-P DAVilla 19:45, 18 December 2006 (UTC)
A separate WOTD project to feature non-English entries does sound feasible. It just needs volunteers to make it happen. --EncycloPetey 22:38, 26 December 2006 (UTC)
This issue (of including non-English entries in the WOTD) has come up several times before, but no one has ever attempted to answer the criticisms I have raised. As I have stated before, the primary challenge with such a proposal is finding a fair and equitable way to select them, as well as criteria for how often a non-English (as opposed to English) entry should be featured. I can't think of any way to do these.
And there are many technical and procedural issues I would want addressed first. What makes a non-English entry a "good" WOTD candidate? How do we judge the validity of nominations when the only person able to read the nominated word is the person who nominated it? What percent of WOTD entries in a given month should be English and what percent should be non-English? How often should such a selection be French? Arabic? Polish? Kazakh? Telugu? How do we deal with languages in non-Roman fonts? Do we allow entries in fonts that may not display correctly for a significant proportion of users? How do we deal with languages that aren't written left-to-right? And what is the point of learning exactly one word in a single foreign language anyway?
One of the driving points of selection for WOTD has been the potential of the word for use. Granted, some selections (particularly early on) did not meet this criterion well, but I can't see how expansion of WOTD to encompass our 389 languages will improve WOTD in this regard. Quite the contrary I think. My own feeling is that each Wiktionary project can select words of the day from the language of the project. Let the French Wiktionary feature French words, the Bulgarian Wiktionary feature Bulgarian words, the Vietnamese Wiktionary feature Vietnamese (or Khmer) words, and so on. Some other Wiktionaries (such as German and Russian) are doing exactly that. I don't see the need to duplicate their efforts. --EncycloPetey 22:36, 26 December 2006 (UTC)
The original post, and everyone else so far, seems to have suggested that the foreign word of the day (FWOTD) be a separate entity to the en-WOTD, so the problems of how to mix the two are moot. Subscribers to WOTD didn't ask for FWOTD and they won't get it unless they opt-in. As for the suggestion that we let other language wiktionaries select their own words of the day in their own language on their own main page, well that's all fine and dandy, but it doesn't give English speakers a foreign word of the day. English speakers simply aren't going to go pick a random wiktionary each day and see if the Bulgarian word of the day has a good English translation. As for how to stop any one language monopolising the FWOTD, two suggestions:
  1. The easiest way to stop any language taking over is to pick a word from a different language each day. One French word a year, one Cantonese word, one Latin, and so on, and just 24 languages might miss out. This would probably be better as "Language of the day", perhaps a separate thing again.
  2. The other way is to have "proportional representation", randomly pick 25 (non-stubby) foreign entries from the entire pool of foreign language entries each week and rank them, taking the best 7 for the week. Languages with better coverage get more words of the day, but others get some too. No two same-language words should appear within 7 days. Pengo 22:23, 28 December 2006 (UTC)
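The two suggestions above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name, the (language, word) data layout, and the quality score (which stands in for human ranking) are all hypothetical, and the sketch only keeps languages distinct within one week, not across the boundary between weeks.

```python
import random

# Option 1 arithmetic: one language per day, 389 languages, 365 days.
LANGUAGES = 389
DAYS_PER_YEAR = 365
missed = LANGUAGES - DAYS_PER_YEAR  # languages that get no day in a year


def pick_week(entries, quality, sample_size=25, week_size=7, rng=None):
    """Option 2: sample entries at random, rank them, keep the best seven.

    entries: list of (language, word) pairs for non-stub foreign entries.
    quality: function scoring an entry, higher is better (assumed here;
             in practice this would be a human judgement).
    """
    rng = rng or random.Random()
    # Sampling from the whole pool gives well-covered languages more chances.
    sample = rng.sample(entries, min(sample_size, len(entries)))
    sample.sort(key=quality, reverse=True)
    chosen, seen_languages = [], set()
    for language, word in sample:
        if language in seen_languages:
            continue  # no two words from the same language in one week
        chosen.append((language, word))
        seen_languages.add(language)
        if len(chosen) == week_size:
            break
    return chosen


# Hypothetical pool: 30 made-up languages with 2 entries each.
pool = [("lang%d" % i, "word%d_%d" % (i, j)) for i in range(30) for j in range(2)]
week = pick_week(pool, quality=lambda entry: len(entry[1]), rng=random.Random(0))
print(missed, week)
```

Because the random sample is drawn from all entries rather than from a list of languages, representation is proportional to the number of entries each language has, which is exactly where the concern about bot-generated inflected forms would come in.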
You've started to attack the question of how to select from multiple languages, which attempt I admire. The problem I see with option 2 is that it will grossly overrepresent certain languages and underrepresent others. Consider that there are an enormous number of Spanish entries here because a 'bot went through and added conjugated forms of many Spanish verbs. So, while there are a huge number of Spanish entries, a significant fraction of them look like hablamos or trabajando. We would need a means for using a count of lemma forms, rather than total number of entries. I can't envision a way to easily do that. --EncycloPetey 22:35, 28 December 2006 (UTC)
I would personally prefer option number two if it could be worked somehow, as some languages are quite scantily done at this point, Coptic for example. But EncycloPetey raises an excellent point in that non-lemma forms should not count towards representation. In Ancient Greek we're using separate categories for lemma and non-lemma forms. I don't know how many other languages are doing something similar, but it seems that in Spanish the lemma forms have categories and the non-lemma forms don't. I have no idea if this fact can be used to count anything. I also agree that the FWOTD should be a separate heading/entry/paragraph. Perhaps this is far too tangential to merit discussion on this thread, but I wonder if Wiktionary would benefit from some variation on Wikipedia's "Portal" theme. I have often thought it would be nice if there was some home page for each language, where people could see everyone who's working on any particular language, find out the conventions and templates used for the language, and perhaps have its own word of the day, so that people who are trying to learn German, Spanish, Greek, etc, could learn a new word of the language of their choosing each day. Cerealkiller13 00:29, 29 December 2006 (UTC)
To some degree that does exist with the "About" pages (see Wiktionary:Language considerations). --EncycloPetey 01:26, 30 December 2006 (UTC)
You have an excellent point about the other Wiktionary projects having their own words of the day. However, the number of Wiktionary projects doesn't run into the hundreds, and certainly there are only a handful running a WOTD, so even if we exclude them, which with a few exceptions I'm thinking is probably a good idea, there will still be plenty of material to pick from. The reason leaving out languages of other Wiktionaries is a good idea is that the best pages for those languages would more likely be on the other site rather than this one, or otherwise the page here wouldn't be much more than a translation. You have to admit that at the moment most of our foreign language entries show only the primary sense of a word, and that's not something we should be interested in showing off.
If that means pushing aside the major languages then so be it. It's more significant to have some common animal be the primary definition of an Ancient Greek word than of a Spanish one. In the first case I'd think we're succeeding, in the second failing.
Now that may be the case for Spanish, but not all of the other Wiktionaries are off to such a good start, and it wouldn't surprise me if often we had more thorough content. In those cases our duty is to attract contributors onto the foreign-language Wiktionary if not just this one, and the words we select in those cases, and their presentation, should be towards that aim. We can only rejoice when the contributions towards other projects grow enough to start a WOTD of their own.
Besides potential foreign-language contributors, our audience also includes English-only contributors. That I can think of, there are two main reasons why someone who isn't even that interested in foreign languages might want to see a Foreign WOTD on English Wiktionary. The first is that it's a word in the news, and the second that it's a word that's borrowed in English. In both cases an English-only speaker might be likely to run across it. While we wouldn't really want to pass off words like jefe as being English, attested or not, FWOTD is an excellent place to highlight the Spanish meaning. These are obvious exceptions to the major-language exclusion rule above. Another exception, which again appeals more to the translation crowd, would be a word that is particularly difficult to translate, although in truth such words are more plentiful in cultures that are more foreign to us rather than just languages.
Now if I've inadvertently split the audience into two camps, it's just a result of addressing the fringes before the central crowd. There is a large middle segment interested in foreign languages yet incapable or barely capable of contributing to them, and the primary goal in getting this word out is to drive home the point that this is a multilingual dictionary unlike any other. You're worried about scripts not displaying properly, but part of the FWOTD is the shock and awe factor in having a strange-looking script on the front page. To be effective, that might require extra work imaging the word, but it's well worth it if it emphasizes how broadly this project aims. Insofar as people may want to learn new foreign language words, what's most important isn't that the letters are decipherable to them, but that they have some grasp on how to pronounce it. IPA is okay but we're really shooting for audio, which only adds to the shock and awe. Seeing such entries brought to the forefront might be just what people need to motivate them into beefing up their own foreign-language contributions.
Which languages are to be chosen? How about judging quality word by word? Let's not assume this will start bickering, or at least try to channel that into positive competition when the time comes. But I have to say that I don't think we're even looking at this question of languages in the right light. I would want some of the foreign words to be defined in multiple languages, so many that it makes your jaw drop. Instead of giving the definition, give a laundry list of languages it's defined in, undoubtedly some unheard by us more ignorant. Or give false friends of English words or better yet in two other languages, or words with different shades of meanings in other languages, or equivalent words with different slang meanings derived from them, and highlight the differences. Or give a word history from Old English to Middle English to its modern derivatives, linking only the Old and Middle English words, or the equivalent in your language of choice. There are a multitude of possibilities for FWOTD that need to be hammered out, with words to be chosen under very clear goals, and presentations tailored towards those goals, the overarching one being the show-off of multilingual content and talent that WOTD doesn't do justice to. DAVilla 15:53, 29 December 2006 (UTC)
I'm not against the idea -- quite the contrary -- but I raise the technical and procedural issues so that they can be tackled and (hopefully) subdued at the outset of any undertaking to begin FWOTD (or whatever it may be called), rather than having the difficult issues inadvertently discovered later. The biggest obstacle right now is getting a person or persons to spearhead the initial setup and organization. --EncycloPetey 01:26, 30 December 2006 (UTC)
I nominate the Little Red Hen. DAVilla 02:03, 30 December 2006 (UTC)

Pronunciation in SAMPA

I know that the policy right now is to put either/both IPA and SAMPA in pronunciation guides. However, considering that SAMPA came about to represent IPA characters in 7-bit ASCII characters and since Unicode and templating have negated the need for SAMPA on Wiktionary, I propose that we do away with the redundant inclusion of SAMPA as part of this policy. This is how it has been done for quite some time on Wikipedia and I think Wiktionary ought to be brought up to speed. Ƶ§œš¹ IPA: [aɪm ˈfɻɛ̃ⁿdˡi] 06:19, 21 December 2006 (UTC)

I believe that we should have both, for two reasons:
1.) Not all browsers support Unicode or IPA characters
2.) Some people still will use SAMPA
I believe these to be valid reasons to keep both. There is no sense either in removing it once it has already been entered, unless, of course, it is errant. --Williamsayers79 08:44, 22 December 2006 (UTC)
Concur with Williamsayers79: while IPA is clearly to be preferred, we have no reason to deprecate or remove SAMPA. I understand that while the pedia may reasonably want one single standard, in the wikt we can provide for more, since our purpose is to describe the words. Robert Ullmann 16:18, 22 December 2006 (UTC)

I'm pretty certain that, since MSIE can display IPA with the IPA template and the other commonly used browsers are even better at displaying IPA, the overwhelming majority of users will be able to view IPA with no problem. I understand that it'll be cumbersome to remove it all as of this point, but the policy can certainly be to just have IPA and not necessarily require SAMPA removed. IMHO, the history and correspondence that SAMPA has with IPA shows that it is IPA and one would be hard pressed to find a large group of people familiar with SAMPA and not the IPA. It is, therefore, very redundant. Note also that I'm not proposing the removal of other pronunciation guides, since those are certainly different enough. Ƶ§œš¹ IPA: [aɪm ˈfɻɛ̃ⁿdˡi] 09:37, 23 December 2006 (UTC)

The display is the lesser issue; it is also about what the user can read and understand. (Your user name is AEuSoes1, perfectly easy to read and use. The way you display it is cute, but utterly useless ;-) We can perfectly well have both. Robert Ullmann 12:33, 26 December 2006 (UTC)

Reconstructed terms, request for assistance

I've revived the discussion at Wiktionary_talk:Reconstructed terms, but instead of receiving constructive input there, I am receiving block threats from User:Robert Ullmann, who apparently has admin buttons here. Upon my question which policy exactly I had been violating, I received more threats instead of an answer. I would request broader input both on the topic itself, and on what I consider misbehaviour. Dbachmann 12:11, 21 December 2006 (UTC)

Warnings are misbehavior? Should I do away with awkwardly wording warnings entirely, then? Re-adding material that has been removed or moved (after it has been Wiktionary-ified) is always a bad thing. A great many Wikipedia conventions are followed here, but many are not (e.g. "notability".) It is a bad assumption to think of Wiktionary merely as an extension of Wikipedia.
That said, I am curious as to what you think would help improve the cross-project coordination, particularly to reduce the occasional animosity that arises. I'm also unclear on what action should be preferred when an admin from another project comes barging in, wishing to "reform" Wiktionary to some other standard. The latter seems to be something of a trend. --Connel MacKenzie 23:19, 21 December 2006 (UTC)


(also applies to Proto-Germanic, -Uralic, -Sino-Tibetan, etc.)

For some time now, PIE entries have resided in the Appendices, with references from the etymologies of various words. This is appropriate, as they are conjectural reconstructions, not actual words that can meet CFI.

The issue arises because one otherwise valuable contributor has begun trying to move these entries into the main namespace. For some background, see Wiktionary:Requests for deletion#*men-, User talk:Dbachmann, User talk:Robert Ullmann, Wiktionary talk:About Proto-Indo-European, and WT:VOTE. The vote is about removing PIE entirely, which is apparently not the common view; the common view seems to prefer the status quo ante: that PIE (et al) belongs in the Appendices.

I would propose the following:

  • That WT:CFI state explicitly that reconstructed languages do not meet CFI.
  • That all etymologies referring to PIE et al use template {{proto}} which will format the links to the appendices and also provide a way to identify these entries. (Special/Whatlinkshere, or adding a category or categories to the template as desired.)

This is essentially codifying what we have been doing to date. Robert Ullmann 12:51, 21 December 2006 (UTC)

I strongly agree. Could you please start a separate vote for that, as the one I proposed seems unlikely to pass. --Connel MacKenzie 22:59, 21 December 2006 (UTC)
Okay, see Wiktionary:Votes/pl-2006-12/Proto- languages in Appendicies Robert Ullmann 16:12, 22 December 2006 (UTC)

Translation project work

I've been looking over Wiki pages for Spanish words and translations, and I've noticed that a lot of pages need work.

A few of the things that I noticed that need work are:

  • Many pages need categorizations (Spanish verb, adverb, adj, etc.)
  • Many pages are also missing conjugations for verbs.
  • Some pages that have the same spelling as words in Italian and Portuguese are missing their Spanish sections.
  • In articles a Spanish word should link to the "Spanish" section of an article, not the article itself. For example, marcha, which is under a Spanish translation, should be marcha#Spanish.

These are just a few concerns, and I think that an organized translation project would be helpful. Please give me your insights. Oh and, I'm only using Spanish as a reference, as I know it somewhat.

Bearingbreaker92 02:23, 23 December 2006 (UTC)

In patrolling recentchanges, I tend to notice only three people editing Spanish entries in earnest. Two of them are MIA right now. On all four points you raise above, I agree completely. But not having ever taken a Spanish class, nor studied anything beyond a menu (mmmmm, carne asada torta!) I can't offer any direct help. You can try searching the Babel template categories for other Spanish speaking contributors and organizing something. I'm not sure we have quite the critical mass of contributors that certain other sister projects seem to have. Since you'd be the first of that type, you'd be setting a nice precedent for others to emulate. Do you have a namespace in mind, for the translation project area? --Connel MacKenzie 04:28, 23 December 2006 (UTC)
You say that there are not many people working on translation projects, perhaps because of a lack of organization. Perhaps I shall start working on a project page, just a rather ambiguous one that could be applied to any language, if people seem interested enough.

Bearingbreaker92 13:28, 23 December 2006 (UTC)

Well, I was talking about Spanish only, when I said there aren't many people working on translations. But yes, I personally am interested to see what you come up with. --Connel MacKenzie 15:16, 23 December 2006 (UTC)
Three of these points seem like (a) they're pretty mundane tasks that could be accomplished by a bot, and (b) they could be applied to any language, with minor adaptation, which would make a bot really useful. Would whoever's currently converting all the English pages be interested in doing what they can for other language entries as well?
Incidentally, I've always wondered why [[marcha#Spanish]] resolves to marcha#Spanish instead of marcha. The bar used in e.g. [[marcha#Spanish|marcha]] is there primarily to cover up code, but the syntax necessitates duplicacy. While the hash can be very useful in the code, why would we ever want to show it on a main page? How difficult would it be to automatically parse it out? DAVilla 18:16, 25 December 2006 (UTC)
Something like $ python replace.py -regex "\[\[(.*)?\#(.*)?|(.*)?\#\(.*)?]\]" "[[\1#\2|\3]]" I'd guess, but my "regex" knowledge is pretty horrid, so I usually just ask for help in the Grease pit. --Connel MacKenzie 16:12, 26 December 2006 (UTC)
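
For anyone picking this up later, the rewrite discussed above (turning a bare [[marcha#Spanish]] into [[marcha#Spanish|marcha]]) can be sketched in plain Python. This is only an illustration, not the replace.py invocation itself: the function name and the pattern are assumptions, and the pattern assumes page titles and section names contain no brackets, pipes or extra hash signs.

```python
import re

# Hypothetical pattern: a wiki link with a section anchor and no pipe yet.
# Group 1 is the page title, group 2 is the section name.
SECTION_LINK = re.compile(r"\[\[([^\]\[|#]+)#([^\]\[|]+)\]\]")


def pipe_section_links(wikitext: str) -> str:
    """Rewrite [[title#Section]] as [[title#Section|title]].

    Links that already carry a pipe, or have no section anchor,
    are left untouched because the pattern does not match them.
    """
    return SECTION_LINK.sub(r"[[\1#\2|\1]]", wikitext)


if __name__ == "__main__":
    print(pipe_section_links("Spanish: [[marcha#Spanish]]"))
```

The displayed text reuses the page title (group 1), so the hash never shows on the rendered page while the link still lands in the right language section.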

Why has my username User:Shoof been blocked

Apparently the computer is telling me that my username has been blocked for "readding" numbers. I don't understand that because I haven't recently added any numbers as separate entries to Wiktionary. I've added them to the protologisms page (where invented words belong) but not as actual entries (where they don't belong). Adding number terms to the protologisms page is not a reason to be blocked as User:Jimp has also done it (to the now deleted Appendix:List of protologisms/numbers) and User:Jimp isn't blocked. If you have any doubt about this, check my contributions page. I should be unblocked as someone blocking me was clearly a mistake. I was trying to add way back when as an entry and got a message that I was blocked 19:51, 24 December 2006 (UTC)

  • Block log: [4] You got blocked just for entering things into the list of protologisms? Ouch. Of course technically this means you yourself have been banned, not just the account, and it's very naughty of you to come here and complain... Kappa 22:07, 24 December 2006 (UTC)
It's not naughty to come here and complain if you've been blocked for a very nonsensical reason. It's naughty for someone to say "it's very naughty of you to come here and complain" when I've clearly been blocked just for adding to the protologisms list. I'm sure anyone who's been randomly blocked would complain. 01:37, 25 December 2006 (UTC)
I would have to judge that in this case it's not unreasonable that the user did not fully understand the reason for a block. It was discussed here in the BP last time, which I would think the user must have read, having commented on it even, IIRC. However a message was never posted to the user's talk page (my fault if anyone's) and the full reasoning potentially forgotten. I certainly don't consider asking to be disruptive. However complaints can always be resolved by email. DAVilla 17:40, 25 December 2006 (UTC)
You were only blocked for a week. Protologisms go on the protologism page all right, but you were stuffing the page with a very large amount of material that doesn’t even count as protologisms. Please try to control your interest in unusual numbers and number notations. Develop an interest in real, honest-to-goodness words that are spelt alphabetically and are actually found in books (and I don’t mean books about weird words or number theory). —Stephen 14:09, 25 December 2006 (UTC)
You have to look at the block log like this. This user has been blocked repeatedly (5 times) for section blanking faggot, use of sockpuppets, and re-adding deleted entries. With no useful contributions. As far as I am concerned one more chance when the week runs out, next is permanent. Robert Ullmann 15:17, 25 December 2006 (UTC)
I think that’s reasonable. —Stephen 16:05, 25 December 2006 (UTC)
Actually it had only been blocked twice before, and the three others were corrections, unblocked and reblocked with new timeframe. But I don't think another week is excessive and I'm not convinced the reasoning wasn't clear. The problem with the lists of numbers was that it was potentially a copyright violation. DAVilla 17:40, 25 December 2006 (UTC)

Appendix:List of protologisms/numbers and Appendix:List of protologisms by topic/numbers

These were apparently just randomly deleted by some admin. The consensus before was clearly for the number terms to be split from the main Appendix:List of protologisms page, rather than remaining on that page. It seems to me like the deletion was vandalism and that the admin that deleted them is abusing their privilege of being an admin. These should be restored and the admin should be warned against just randomly deleting things from Wiktionary at will. Having the ability to delete pages doesn't mean you can delete them at will or with no consensus. 20:00, 24 December 2006 (UTC)

Who says there was no consensus? henne 11:36, 25 December 2006 (UTC)

December 25th WOTD

What could you possibly have been thinking when you selected a Jewish holiday as the "word for the day" for December 25 - the day Christians celebrate the birth of Jesus Christ? Is anybody up there Christian? Is your intent to be controversial? provocative? cantankerous? dense? I don't get it. —This unsigned comment was added by (talkcontribs).

The Christians were all busy celebrating? --Versageek 04:22, 27 December 2006 (UTC)
I don't see anything provocative about the selection, particularly considering that the 25th of December is only two days after the end of Hanukkah this year. I don't think anyone would complain if we featured Easter instead of Vesak in April, for example. —{admin} Pathoschild 05:15, 27 December 2006 (UTC)
Now now, it does seem to be a valid complaint (if somewhat overstated.) The day Christmas falls on doesn't vary from 12/25. And Hanukkah really should have been the WOTD at the start or end of Hanukkah. While I am not offended by it, I can only say that I was too busy with Christmas preparations to notice the minor impropriety. On the other hand, I think it is a very Christian notion to respectfully acknowledge other faiths, particularly on such an important day. If the WOTD had been bargain, toys, returns, last-minute sale, X-mas or other commercialism tripe, I would have been quite offended. --Connel MacKenzie 05:44, 27 December 2006 (UTC)
Whoa, many Christians object to the word X-mas? I had no idea. As such knowledge can't hurt, I think it would've been an appropriate word on Christmas :-) -- 01:30, 14 January 2007 (UTC)
If you'll take a look at the whole of December, you'll see that a deliberate attempt was made to show the diversity of cultures and languages that have contributed into English for the latter half of December. Hanukkah seemed like a logical choice for Hebrew given the season. Its selection for 25 December was a little tongue-in-cheek, perhaps, but as a Christian myself, I don't see anything offensive in it. Remember also that Wiktionary has users working globally, so the date listed on the WOTD is purely nominal, and where I live it frequently doesn't appear for much of the "day" for which it is selected. The WOTD rolls over eight hours early in my time zone, so, from where I edit, the word was only up for part of Christmas anyway. How would you feel about the word Easter, since it derives from the name of a pagan deity? --EncycloPetey 08:28, 27 December 2006 (UTC)
Christmas celebrates the birth of a Jew, as prophesied in what were then Jewish religious texts even though they are now also used as part of the Christian Bible. Indeed the whole Christian faith revolves around Christ offering Himself as a sacrifice to atone for all human sin, in accordance with the Jewish law on sin offerings. Hanukkah celebrates the rebuilding of the Jewish Temple, which is often taken as a type of the resurrection of Christ ("...I will rebuild it in three days"). Altogether a highly appropriate word for Christmas Day, but would have been even better for Easter Sunday. :-) Delayed response because I lost the thread and have only just seen it again --Enginear 14:28, 9 January 2007 (UTC)

what is happening in biblical archaeology right now? are there any new findings which support the word of god's account?

I would like to discuss how archaeology has proved that the word of God's accounts are true. Are there any new archaeological finds that are something outstanding?

Is there any particular reason you're asking this question on a dictionary website? You might have better luck on Wikipedia or at Wikiversity. --EncycloPetey 23:23, 28 December 2006 (UTC)


This is supposed to be a template for English adverbs. I don’t know why a template would really be needed for adverbs, but here it is. The problem is that it gives the comparative and superlative of adverbs. As far as I know, adverbs have no such forms. Should this template be renamed for adjectives or should it be rewritten for English adverbs? If adverbs, then all it should do is add a category. —Stephen 21:27, 29 December 2006 (UTC)

Adverbs do have comparative and superlative forms; they simply aren't as widely advertised as for their adjectival counterparts. Consider:
  • She ran to the store quickly. versus She ran to the store more quickly.
There is an additional degree of comparison in the latter that does not exist in the former. Latin also has comparative and superlative forms of adverbs, where these forms are inflected rather than adding a helping word as in English. --EncycloPetey 01:18, 30 December 2006 (UTC)
The comparative and superlative forms of adverbs are also considerably more regular than those of adjectives. They are usually formed by adding 'more' and 'most' before the positive. E.g.: adverb: quickly, more quickly, most quickly; adjective: quick, quicker, quickest. But there are exceptions: hard, harder, hardest. Ncik 01:23, 30 December 2006 (UTC)
Oh, I see what the problem is now. Some people leave out a parameter. I was looking at ===Adverb=== yesternight, and "more yesternight, most yesternight" just looked ridiculous. —Stephen 05:39, 31 December 2006 (UTC)
Ridiculous? Spot on! The positive form can be used as "We ate much chicken yesternight." The comparative would be "We ate more yesternight." ;) --EncycloPetey 06:11, 31 December 2006 (UTC) (Sorry, I've been reading from Samuel Johnson's Dictionary.)
Surely, adverbs can most always be inflected :-) I think I've already cooked my goose; just don't suggest cold turkey right now! --Enginear 15:04, 31 December 2006 (UTC)
Can we change the en-POS templates so that they don't list any inflections unless given explicitly? (If there really are no forms, e.g. comparative and superlative or plural, then that would be given explicitly too.) Besides addressing misuse such as this, it answers how to deal with rare forms where the inflections cannot be attested and goes along the lines of the "uncountable" complaints. DAVilla 06:37, 31 December 2006 (UTC)
I prefer the current default of automatically generating the normal pattern. Otherwise, we'd have to request each en-noun template to present the plural, and request every en-adv to display forms. As it currently stands, people will make errors entering the template if (1) they aren't familiar with the documentation, or (2) they aren't paying attention to their edits. I've been guilty of not marking -es plurals, but I'd rather have a default to rely on instead of always having to know the right coding to get it to work. --EncycloPetey 07:14, 31 December 2006 (UTC)
I support DAVilla. I've come across so many wrong inflections caused by people not bothering to read the documentation properly and assuming the templates would work in all cases without any parameters. To be safe, the default should be "no inflections". Ncik 19:31, 1 January 2007 (UTC)
I also prefer the templates the way they are - it would be an immense amount of work to fix all entries after a rash change in {{en-adv}} - it is simpler to fix the ones that are not inflected.--Williamsayers79 22:16, 1 January 2007 (UTC)
Indeed, since after all the template will automatically categorize them in a single place. One could even do a search for the unmodified template, and visually do a scan of the results for adverbs that don't have comparatives or superlatives, then alter the pages accordingly. --EncycloPetey 15:07, 2 January 2007 (UTC)
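
To make the two positions in this thread concrete, here is a rough Python sketch of the "generate the normal pattern unless told otherwise" behaviour. It is not the actual {{en-adv}} code; the function name, the parameter names and the comparable flag are all hypothetical, chosen only to mirror the examples given above (quickly, hard, yesternight).

```python
from typing import Optional, Tuple


def adverb_forms(
    positive: str,
    comparative: Optional[str] = None,
    superlative: Optional[str] = None,
    comparable: bool = True,
) -> Optional[Tuple[str, str]]:
    """Return (comparative, superlative), or None if the adverb has neither.

    Hypothetical model of the current default: when no explicit forms
    are given, fall back to the regular 'more X' / 'most X' pattern.
    """
    if not comparable:
        # Explicitly marked as having no comparative or superlative.
        return None
    return (
        comparative or f"more {positive}",
        superlative or f"most {positive}",
    )


if __name__ == "__main__":
    print(adverb_forms("quickly"))
    print(adverb_forms("hard", "harder", "hardest"))
    print(adverb_forms("yesternight", comparable=False))
```

Under the alternative proposed above, the fallback branch would go away and an entry with no explicit parameters would simply list no forms at all.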

blocking question

Looking at this:


I am inclined to block this anon IP. What do other sysops think? Is a one year block long enough? --Connel MacKenzie 22:53, 29 December 2006 (UTC)

What is this, a joke? The edit is commented "removed quotations from unreliable sources" which proves true. Each is a discussion thread on a forum. In a few months the action would have been completely justified: two dead links instead of just the one. DAVilla 00:35, 30 December 2006 (UTC)
Well, you'd probably have to block DAVilla too, for [5]
But that isn't the point: removing the citations is a problem. If the link breaks, so what? (remove the link itself) We don't link citations in most cases anyway: we cite by year/date, publication, author, and the quotation. Since we have that, the citations should not have been removed in either case, unless they could be replaced with better ones. At HYPSMC, the citations should be put back. The user wasn't a vandal (should not be blocked): just misguidedly removing broken links (in the same way we sometimes see people unlinking red links). At nihilartikel we have only one left, the others should not have been removed. (Whether they "count" for RfV purposes is a separate issue.) 20:08, 30 December 2006 (UTC)
There are plenty of examples that do not rely on an anonymous comment to an anonymous blog, nor to a file in cyberspace that doesn't even have a permalink. The problem is that they aren't independent, referring for instance to the 2005 April Fools joke "Today's featured Nihilartikel". But even the del.icio.us link I deleted wasn't independent, and the other quotation impossible to tell. So really, what good are they? If you want quasi-independent try this quote apparently from a published magazine, which I'm clueless on how to source. DAVilla 21:31, 30 December 2006 (UTC)
things hasn't published an issue since 2004, the quote is from 2005 ... we might have a print citation sometime ... web whatever quotes are better than nothing, but can always be replaced with better ones. found the de:w article link for the etymology Robert Ullmann 21:42, 30 December 2006 (UTC)

Review processes

We have a couple of review processes on Wiktionary that (1) are working well and (2) have heavy traffic. They are WT:RFD and WT:RFV. They're working well because the processes are established in making objective decisions about content. However, they are receiving so much traffic that it makes the pages difficult to navigate. Archiving is one problem which has to be addressed soon. Another is that there is simply too much stuff there that doesn't belong. Some stuff lasts no more than a few days without needing to finish the process, some entries are moved (or should be moved) between processes, and a lot of the foreign words are just handled as clean-ups by Stephen, while the WT:RFC page is nothing to take lightly either. The result is a bit of unneeded clutter.

On Help:Nominating an article for cleanup or deletion:

"Thinking deeply before listing an article here is not discouraged, but it is definitely not a requirement either. Most comments here are first impressions, not considered analyses of an article."

How ridiculous!

I've created a new template {{rfr}} which is intended as a preliminary step to avoid the cumbersome lists. There is a category, Category:Requests for review, which anyone knowledgeable of the different request templates can clean out by replacing the tag with something more specific. However, my intention is not to complicate the process. Instead of advertising RFR as yet another clean-up template, I want to encourage contributors to leave comments on RFC, RFD, and RFV just as they do for {{delete}}. The criteria for speedy deletion are important because individual cases aren't brought to the community's attention, and perhaps because of having to leave comments the reasons for deletion are second-nature to us. I want to make the reasons for RFC, RFD, and RFV second-nature too, and "redirect" these templates to RFR when no comment is given. The purpose primarily is to make people think about how and whether to list the article rather than relying on first impressions, maybe even Google the word before wasting everyone's time on it. And for those who are using the processes correctly, it really wouldn't be any more work if it were possible to transfer the [[title]] to the subject and the comment to the body of an added listing. DAVilla 18:15, 30 December 2006 (UTC)

Interesting. Rather than complicate those three templates to conditionally redirect to RFR, I've simply used {{wikify}} as a "speedy RFC". I'm not sure I understand how your proposal would ease the archival problems. (That is the single problem you are trying to mitigate, right?) --Connel MacKenzie 19:16, 30 December 2006 (UTC)
Archiving needs an automated solution. This only eases archiving insofar as there's less to archive. Archiving e.g. the Beer Parlour is simpler, as the conversation simply ends. For RFD and RFV you need a ruling, and then to preserve the discussion on the talk page. What I'm trying to do is more clearly delineate the processes. DAVilla 20:11, 30 December 2006 (UTC)
Well, let's discuss the archiving in WT:GP...I've some automation ideas. --Connel MacKenzie 09:27, 31 December 2006 (UTC)

Three letter words in English

I was looking for the definition of a word, when I stumbled upon this section. And being curious continued onwards.

In the section on "the", there appears to be no comment on the use of the word pronounced as 'thee', as in 'the book'. Further, nothing on the old use of 'ye' as the definite article, from which (I think) 'thee' comes.

Yes, it is marked as pronounced that way: *:enPR: thē, IPA(key): /ðiː/, Template:X-SAMPA --EncycloPetey 06:05, 31 December 2006 (UTC)

Also, in the section on 'the' it has a link for other three letter words in English. It states that there are 16 in this category and all but 'the' begin with 'a'. Somehow I am unable to link to the others such as 'bee, see, sea, was, ' etc.

Chris 19:17, 30 December 2006 (UTC)

I've added a list to Category:English three letter words so those entries can be added to the category by anyone so inclined.
The entry the does have a usage notes entry that describes the pronunciation anomaly.
See ye#Etymology 2 for the thorn character. IIRC, there was quite a bit of discussion on that, a while back. --Connel MacKenzie 21:52, 30 December 2006 (UTC)

Thank you! Now how would I be able to spend some time setting the list in the alphabetized listing?


Ooh, can't wait for Category:English four letter words! DAVilla 06:16, 31 December 2006 (UTC)
Natch. --EncycloPetey 06:23, 31 December 2006 (UTC)
We have Category:Vulgarities, don't we? --Connel MacKenzie 22:03, 3 January 2007 (UTC)
Seriously though, if we consider that these categories are necessary (or at least not discourageable), shouldn't we be thinking about the naming? Aside from weeding out initialisms, affixes, and words under more delicate criteria, this seems a fairly automatic construction to me. But do we really want the name of the language in the category name (and this applies more broadly to other categories) and the name of the number rather than just the Arabic numerals themselves? It looks prettier perhaps, but it's an extra step of coding if they're to be put to any computational use. DAVilla 06:33, 31 December 2006 (UTC)
More seriously is that no one questioned what filtering I used for generating that list. <shrug>. Renaming categories is always much easier than jump-starting them. I have no particular liking for Uncle G's name for the categories (note that some others have talk pages but no categories) but then, I never was clear on the proper category names. By the way, can't that category be a subcategory of Category:English 3 letter words? --Connel MacKenzie 09:33, 31 December 2006 (UTC)
Well part of it depends on what is meant by "word". As with all categories, it would be good to clearly define on the page.
I gather you're a lot more confident now about the naming. So is it Category:English three letter words or Category:en:3-letter words or something in between? DAVilla 15:32, 3 January 2007 (UTC)
No, with the ensuing BP conversation, I'm at more of a loss than ever. I think the proper title should be used, and happen to be a subcategory of the numeric-computational title...whatever form they ultimately take. --Connel MacKenzie 22:02, 3 January 2007 (UTC)

Gold star indicators for WOTDs

Anyone mind a gold star on the top-right corner of past WOTDs? Using the same technique as Wikipedia's Featured Article thing, that is. I don't know about anyone else, but I get annoyed when stumbling across a word that should be nominated, then after finally finding the nomination page, realize that I have to check somewhere else to see if it already has been used. The only down side is that having an extra WOTD indicator may cause other words to be nominated (which, isn't such a bad thing, in and of itself) because more people are reminded about it. Is this the sort of thing I should start a WT:VOTE for, or should I just be bold, or should I forget about it? --Connel MacKenzie 09:38, 31 December 2006 (UTC)

Yes, it’s a good idea. I think you should just go ahead and do it. —Stephen 09:56, 31 December 2006 (UTC)
I kind of like the idea too. I'd wait a bit for the holidays to end and see if any issues are raised in discussion. Otherwise, I agree that no vote is needed. --EncycloPetey 10:05, 31 December 2006 (UTC)
Does the gold star imply some condition of quality? Could another symbol be used, or perhaps a note on the talk page? DAVilla 10:45, 31 December 2006 (UTC)
Sure, we could use another symbol. It's precisely for the quality implication issue that I never implemented it myself. No, I don't think the talk page would be a good location for the symbol, since one of the points in having the mark at all is to make it easier to determine when a word had been selected as a past WOTD. Relegating the symbol to a secondary page would render the process nugatory. --EncycloPetey 22:23, 31 December 2006 (UTC)
Symbol only on the talk page wasn't considered, but your points apply to a note as well. DAVilla 17:02, 2 January 2007 (UTC)
How about "WOTD" in the top right corner, then? --Connel MacKenzie 19:13, 1 January 2007 (UTC)
Or maybe yestern word of the day? :) Yeah, that's okay by me, or any symbol you like that doesn't imply "good page". DAVilla 17:02, 2 January 2007 (UTC)
{{was wotd}} look ok? --Connel MacKenzie 19:28, 2 January 2007 (UTC)
Should these be restricted to archive WOTD pages only? --Connel MacKenzie 19:36, 2 January 2007 (UTC)
Can't tell. Could you use it on the page zygodactylous, so we can see it in use? I can't tell what it's supposed to look like when it's used. --EncycloPetey 12:41, 3 January 2007 (UTC)
OK. How's it look? --Connel MacKenzie 15:15, 3 January 2007 (UTC)
Right now it looks awkward, appearing above "dismiss" for the fundraising drive. DAVilla 15:27, 3 January 2007 (UTC)
Could you take a look at it, with WT:PREFS (hide site notice) turned on (then you can turn it back on as soon as you're done.) --Connel MacKenzie 00:03, 4 January 2007 (UTC)
I think I'd like to see a simple icon used (or box?). Possibly identifying the year in which it was WOTD. After all, this could accumulate over a very long time, and we do expect to eventually begin recycling selections. --EncycloPetey 00:52, 4 January 2007 (UTC)
I'm one of those wacky people who actually look at an URL before I visit it...where the year is actually a component of the URL. But a visual indication shouldn't be too hard. Would just indicating it in the "hint text" be sufficient? What color do you want that box, around it? Wait, on second thought...why don't you be bold with it, please. I think a decent icon that conveys "WOTD" without making implication of quality is going to be a bit of a struggle. --Connel MacKenzie 06:39, 4 January 2007 (UTC)
Um, I wasn't trying to be a wise-guy there. I actually forgot that I had it in the "hint text" already. (Could be prettied up a bit, of course.) --Connel MacKenzie 06:40, 4 January 2007 (UTC)
Well, it would appear that the restriction is that the "WOTD" indicator should only point to archive pages. (Based on what you said above, EP.) Keeping that in mind, the 24 I've tagged so far are easy to fix; instead of one parameter, three (year, month, day) to make the output nicer-looking. Of course, the same could be done for regular WOTDs, to help the appearance of the title, there. Adding the category I think was a good idea (auto-alphabetic index, ready-made.) So if this works, you may choose to retire the manually updated alphabetic index. (OTOH, it seems you have that method worked out, quite well.) I shall experiment with {{was wotd}} a little more now... --Connel MacKenzie 01:10, 5 January 2007 (UTC)
OK, how's it look now? --Connel MacKenzie 01:49, 5 January 2007 (UTC)
The structure of the implementation looks fine. I don't know that I can design an icon though, since I don't currently have access to Photoshop or anything similar (and it's a bit expensive, so I won't be making that purchase anytime soon). I think an open (yellowed) book, angled slightly away from the viewer, with "WOTD" in suitably contrasting letters would work, if someone out there could design it. --EncycloPetey 02:06, 5 January 2007 (UTC)
I mentioned this to Dvortygirl - she said she may have time next week to try something. On the other hand, perhaps just a colored box around it would be better (especially in these bandwidth-strained days.) Was the addition of the date helpful? --Connel MacKenzie 00:04, 6 January 2007 (UTC)
Yes, but I think convention is that a comma is not used when the day precedes the month. ...Yes, the Chicago Manual (sxn 6.46) notes that in the day-month-year format, commas are not required. --EncycloPetey 00:19, 6 January 2007 (UTC)
Did you ever get around to correcting that? Also, have you started using this for future WOTDs yet, or were you expecting me to automate it (beyond the cutting&pasting stuff I did for year #1)? --Connel MacKenzie 06:09, 11 January 2007 (UTC)
I hadn't gotten around to thinking that far ahead yet. It might as well be something automated, or something done at the end of a month's worth of WOTD. --EncycloPetey 01:04, 14 January 2007 (UTC)
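
As a rough illustration of the three-parameter idea discussed above (year, month, day) and the day-month-year format without a comma, here is a minimal Python sketch. The function name `wotd_date` and the month table are hypothetical and are not part of the actual {{was wotd}} template, whose wording is not shown in this thread.

```python
# Hypothetical sketch: build a display date from separate year, month and
# day parameters, in day-month-year order with no comma.
MONTH_NAMES = (
    "January", "February", "March", "April", "May", "June",
    "July", "August", "September", "October", "November", "December",
)

def wotd_date(year: int, month: int, day: int) -> str:
    """Return a date such as '3 January 2007' (no comma after the month)."""
    if not 1 <= month <= 12:
        raise ValueError("month must be between 1 and 12")
    return f"{day} {MONTH_NAMES[month - 1]} {year}"

print(wotd_date(2007, 1, 3))
```

Keeping the three values separate means the same parameters could also be used to build a link to the matching archive page, if that is wanted.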

Parts of speech in other languages

There are those nice categories Category:English nouns etc., for all POS. Similarly, they exist for Danish: Category:Danish parts of speech. However, for other languages, there seems to be this strange languagecode:POS convention, as in Category:nl:Pronouns, but at the same time this exists: Category:Dutch nouns. I strongly prefer the spelled out version, since it does not use the cryptic language code, which the average user probably will not know. Is it alright for me to convert the nl: categories to their more readable counterparts? (I got a warning that it is not; see my talk page.) henne 16:54, 31 December 2006 (UTC)

I asked henne to check here before going further with his changes - as I know there was a reason we moved to the Category:Code:POS format. That reason may have been rendered moot since then, I know there was some talk about mapping language codes to the actual languages automajically for sorting purposes - but I'm uncertain what became of it. Better to check first, than have a mess to clean up later. --Versageek 18:42, 31 December 2006 (UTC)
I know I am continuously confused on this issue, and would also appreciate policy-level clarification. --Connel MacKenzie 19:02, 31 December 2006 (UTC)
Policy: Categories meant to contain exclusively words from one language other than English have to include the language's ISO code.
I also suggest to apply the above rule if the language is English and then rename "English nouns" to "en:Nouns" etc. This has the advantage that there won't be any difficulties if we have categories consisting of words from different languages (e.g. categories containing etymological derivatives). Ncik 20:42, 31 December 2006 (UTC)
Thank you. I agree that the English categories should be brought in line. Does everyone agree? Should we have a WT:VOTE to solidify it, or does the policy page already exist somewhere? --Connel MacKenzie 20:50, 31 December 2006 (UTC)
POS categories in every language, not just English, consistently use the full name of the language. Topic categories always use the code. This is very well settled, Ncik is completely off-the-wall on this. Robert Ullmann 17:14, 1 January 2007 (UTC)
Strong disagree. I thought when we had this discussion before we decided differently, though we may not have reached consensus. The current general practice as I've noticed it, is that POS categories use the spelled out language name while Topical categories use the ISO code. This has the advantage of automatically sorting separately those categories that are POS from those that are Topical. It also avoids the following problem: Under the proposed revision, would pretérito go under Category:es:Adjectives or Category:es:Verbs or both? Grammatically it's an adjective (and sometimes a substantive), but it's a word that describes a verb tense. Under the "current" system, there would be no such confusion. It goes under Category:Spanish adjectives because its POS is adjective, but it could also go under Category:es:Verbs if such a category existed because its meaning pertains to verbs. Switching everything to the proposed system creates a huge problem that would never go away. --EncycloPetey 21:44, 31 December 2006 (UTC)
Question: shouldn't it be listed in both? --Connel MacKenzie 22:15, 31 December 2006 (UTC)
Both what? I've described two possible situations, and need to know which one you mean. --EncycloPetey 22:19, 31 December 2006 (UTC)
The problem is the term "verb tense". But any tense is a "verb tense". Only verbs can assume different tenses. So the solution is to put pretérito in Category:es:Tenses. Ncik 14:02, 1 January 2007 (UTC)
Actually, it depends on the language. In some languages, other parts of speech can have tenses. In Hausa, for example, the pronouns have tenses. Hausa pronouns also show mood and negativity. —Stephen 11:33, 2 January 2007 (UTC)
A category "es:Verbs" would be Spanish words about verbs ;-). As EP says, this is very well established. Robert Ullmann 17:14, 1 January 2007 (UTC)

Note that Category:nl:Pronouns was created by Eclecticology in July 2005, well before the discussion of using codes for topics and keeping the full language names for POS, it is an outlier (there are a few others created by EC and Dijan); it should be (and is) Category:Dutch pronouns. It is instructive to look at Category:Nouns by language, Category:Verbs by language etc. Oh, and Spanish verb forms go in Category:Spanish verb forms. pretérito goes in Category:es:Parts of speech, a topic category, and (of course) in Category:Spanish adjectives as it is one. Robert Ullmann 17:47, 1 January 2007 (UTC)

If you want a category about verbs, you should give it an appropriate, specific name (e.g. Category:es:Grammar of verbs or Category:es:Grammatical aspects of verbs). Such a category can then contain Spanish words for the tenses, for "transitive", "intransitive", "reflexive", etc. The same situation exists by the way for any other topic and is not particular to the part of speech case: A category Category:es:Mammals contains (the Spanish words used to denote) mammals and not the Spanish translations of words such as "mammary gland", "neocortex" etc., which should go in something like Category:es:Characteristics of mammals. Ncik 18:47, 1 January 2007 (UTC)
  • It appears that I am not the only person unclear on this topic. Can we have a vote on this as well? --Connel MacKenzie 19:10, 1 January 2007 (UTC)
It used to be that we named a category [[Category:Dutch mountains]] if the category was intended only for Dutch mountains (mountains situated in the Netherlands), but [[Category:nl:Mountains]] for mountains around the world but in the Dutch language. In other words, if the category is simply "Mountains", but translated into Dutch, then it’s [[Category:nl:Mountains]]; but if the category is "Dutch Mountains", then it’s [[Category:Dutch mountains]]. In the case of parts of speech, there are nouns and then there are Dutch nouns. Dutch nouns should be, and indeed used to be, [[Category:Dutch nouns]]. Someone who was confused by the difference in nouns and Dutch nouns came along later and changed it to [[Category:nl:Nouns]], and then it became very confusing for everyone. If the normal English name for a category is explicitly "Dutch nouns", the category should be named [[Category:Dutch nouns]]. Only when the normal English name for a category is "Nouns" should we ever have [[Category:nl:Nouns]]...and since all parts of speech for any language are called in the style of "Spanish nouns", "Russian verbs", "French adjectives", "English conjunctions", etc., they should all be named like [[Category:English nouns]], the way it reliably used to be. This way, [[Category:ru:Sciences]] are the various recognized sciences (such as Geology, Biology, Genetics) but translated into Russian; and [[Category:Russian sciences]] would be for Russian sciences that are peculiar to the nation of Russia (perhaps sciences related to the Siberian tundra). —Stephen 11:33, 2 January 2007 (UTC)
(sorry, I had one more thing to say, but the network here sometimes goes away for hours ;-) We already do have a policy document on this, at WT:POS. (addressing the POS cats, I don't know where the topic cats are written up.) Connel, the policy isn't confused, but you are right there is still some confusion because of some existing categories that should be cleaned up. (as henne was doing). Robert Ullmann 12:17, 2 January 2007 (UTC)
I've been doing category cleanup too (as noted at WT:DW), but primarily at the level above this one (merging Category:Romany language into Category:Romani language, for instance). It seems there's not only confusion about the issue, but disagreement about what is "correct" among some of the most frequent contributors. I'd just be interested even to see a straw poll about who thinks each view is the currently correct one, regardless of what the actual policy is. --EncycloPetey 15:02, 2 January 2007 (UTC)

I think it would be a lot less confusing to always consider the language as part of the categorization, that is, to use Ncik's proposal of Category:en: for English words. That would create Category:en:Nouns under which, as per Stephen, it would then make sense to use Category:nl:Nouns rather than the confusing "language name for POS only" rule we apparently currently have. DAVilla 16:55, 2 January 2007 (UTC)

Well, thank you for reminding me what the issue is. This is the English Wiktionary. No, adding "en:" to the start of all/most category names is not desirable. The categories without a language prefix are English topics. The categories with "English " in the category title are about linguistic feature topics of the English language. Which, I believe is how Wiktionary:Categorization spells it all out (just as Stephen said.) --Connel MacKenzie 18:12, 2 January 2007 (UTC)
If Category: = Category:en: conceptually, then Category:English nouns is repetitive. Why wouldn't it be just Category:Nouns? One good reason can be found at Category:Abbreviations which are chock-full of English abbreviations. When leaving out the en: subnamespace, we need to insert the word English to remind people that these words belong at Category:English abbreviations. At least I think they do, since Abbreviation may be a 3-level header but is no more a part of speech than any of the other misfits. But then what is Category:Abbreviations for anyways? For translingual headers? Or if I'm mistaken about Abbreviations not being a POS, and it's Category:English abbreviations that's out of line, then where do translingual abbreviations belong? DAVilla 18:37, 2 January 2007 (UTC)
Category:Abbreviations should be a supercategory, which has as subcategories Category:English abbreviations, Category:Dutch abbreviations, Category:German abbreviations, etc. Just as Category:Nouns is a supercategory for Category:Nouns by language, which is in its turn a supercategory for Category:English nouns, and, indeed, Category:Dutch nouns.
So Category:nl:Pronouns is a misnomer, and I will be correcting this. Or is anyone willing to write a robot (or assist me in writing a robot) which can automate these changes (i.e. all words in it have to be recategorised)?
I think it makes sense to summarise once more: Categories for English words contain no language code. Categories for foreign words contain the language code. So no adding of berg to Category:Mountains, and no mountain in Category:nl:Mountains, but rather the inverse.
POS fall out of this system, since they are on a higher level. Therefore they are interpreted as categories in the English language, since we are on the English WT. Therefore they do not need the language code: Category:Nouns contains all nouns in all languages, further subdivided in other categories, to give more overview.
However, reading WT:CG raises a little doubt: it suggests categories such as Category:fr:Cardinal numbers and proposes Category:xx:Proper nouns. I would expect these two to fall under the POS guideline. henne 11:21, 3 January 2007 (UTC)
It's clearly wrong about the suggestion for Category:xx:Proper nouns. The category Category:fr:Cardinal numbers could be right or wrong depending on how you look at it. If "Cardinal number" is interpreted as a POS, then the category name is incorrectly formed. However, if "Cardinal number" is interpreted as a mathematical topic, then the category name is correctly formed. This was one of the issues I had difficulty reconciling in my own mind during the "Is the part of speech 'Number' or 'Numeral'?" discussion. I'm inclined to think we should prefer Category:fr:Cardinal numbers and the like, and interpret this as a topical category, since not all the words in the English category will function as the POS called "number/numeral" (e.g. aleph-null). Whether such categories belong in a metacategory titled Category:French numbers or Category:French numerals is a matter for debate that probably should not be reopened at this time. --EncycloPetey 12:37, 3 January 2007 (UTC)
I'm inclined to think that we should prefer not to have to make such distinctions, and simply use xx: when we mean the language, and Ex-ex when we mean the region. DAVilla 17:29, 4 January 2007 (UTC)
But the point is that a system like that breaks down for POS categories. Currently, If you see Category:fr:Time, you know it is the French counterpart to Category:Time (the English version). However, if we have a metacategory for Category:Verbs (including all languages), then where does that leave Category:fr:Verbs? It would be the French category in parallel with Category:Verbs, and not a subcategory parallel with Category:English verbs. To keep the categories parallel, we need a consistent set of names, or we run into (even more) confusion. --EncycloPetey 01:59, 5 January 2007 (UTC)
The solution is to either (1) use en: for English-language categories, which Connel strongly objects to, or (2) consider Category:Verbs to be verbs in English only, just as with any of the topic categories. In the latter case it could still be possible to have a category for verbs in all languages (and why we would want to do that is beyond me), it just couldn't be called Category:Verbs. DAVilla 21:57, 6 January 2007 (UTC)
I had tried to start a bot request page for category renames, but no one else liked the idea. If you pepper my talk page with requests, (with the exact category renames you'd like performed,) I or someone else will run them. --Connel MacKenzie 15:22, 3 January 2007 (UTC)
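
For anyone thinking of automating such renames, the convention as most contributors describe it in this thread (full language name for part-of-speech categories, ISO code prefix for topic categories, no prefix for English topics) might be sketched as follows. This is only an illustration: the function names and the small language table are hypothetical, and it does not reflect any existing bot or the exact wording of WT:POS.

```python
# Hypothetical sketch of the category naming convention discussed above.
# The language table is a small illustrative sample, not a complete list.
LANGUAGE_NAMES = {
    "en": "English",
    "nl": "Dutch",
    "es": "Spanish",
    "fr": "French",
    "ru": "Russian",
}

def pos_category(lang_code: str, pos_plural: str) -> str:
    """Part-of-speech categories spell out the language name,
    e.g. 'Category:Dutch pronouns'."""
    return f"Category:{LANGUAGE_NAMES[lang_code]} {pos_plural.lower()}"

def topic_category(lang_code: str, topic: str) -> str:
    """Topic categories use the ISO code as a prefix, except for English,
    which has no prefix on the English Wiktionary."""
    if lang_code == "en":
        return f"Category:{topic}"
    return f"Category:{lang_code}:{topic}"

# The outlier mentioned in the thread and its intended target:
print("Category:nl:Pronouns", "->", pos_category("nl", "Pronouns"))
print(topic_category("nl", "Mountains"))
```

Under this reading, berg would be placed in the category returned by `topic_category("nl", "Mountains")` and pretérito in the one returned by `pos_category("es", "adjectives")`.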

Nonce words and hapax legomena

As a newly-registered user, I'm not sure if this is the right page to ask general questions about the project (if not, could somebody please direct me to the right page?), but there are a couple of things I was wondering about that I can't find advice on dealing with:

First, I'd like to ask: what exactly is a nonce word (and what isn't) and what is Wiktionary's policy on them?

Secondly, what is Wiktionary's policy on words used only by specific authors (but notable authors - e.g. Shakespeare, Dickens)? I am particularly curious about words appearing as hapax legomena in the works of such prestigious authors.

Thanks, RobbieG 17:50, 31 December 2006 (UTC)

  1. This is the best place for these questions.
  2. We list nonce word as a word invented for the occasion. There is significant, long-standing resistance to including invented words on Wiktionary. Authors of historic importance (not merely notable) seem to dance around the more aggressive rules and practices. But our criteria try to address that with the combination of the independence clause and the spanning one year clause.
  3. Historic hapax legomenon terms do seem to pass our criteria easily as well, often from other mentions by other writers (contrary to the actual definition) or secondary sources, but perhaps more due to the fact that no one can reasonably justify tagging them with {{rfv}}. Some have argued that our criteria should be relaxed for such terms, but not much progress has been made on the issue, while other technical implementation aspects are addressed first.
--Connel MacKenzie 18:40, 31 December 2006 (UTC)
Thank you! RobbieG 18:58, 31 December 2006 (UTC)
Maybe the difference between (2) and (3) is that in (3) "it's likely that someone would run across it and want to know what it means." (e.g., honorificamumble in Shakespeare), while a nonce word is generally defined as it's used (if only by context), so the reader is not likely to want to know what it means. If you buy that, then there should be no problem with a short special note for hapax legomena. -dmh 16:23, 15 January 2007 (UTC)

WT:MTW (Move To Wiktionary from Wikipedia)

I've created a new helper page for sysops, so that I'm no longer the only one doing Transwiki: cleanups. Comments are appreciated. (Reword it to satisfaction, please. It is a first draft right now, and I was annoyed when I wrote it.) --Connel MacKenzie 20:30, 31 December 2006 (UTC)