Wiktionary:Beer parlour: difference between revisions

From Wiktionary, the free dictionary
Latest comment: 13 years ago by Ruakh in topic Poll: Choosing topical categories
Mglovesfun (talk | contribs)
→‎Flood Flag: not talking about deletion
→‎Flood Flag: Clarify.
Line 2,789: Line 2,789:
:: We can compromise and say that everyone must ''inform'' DCDuring, the preferred mechanism for doing so being an informative reason in the flag assignment. - {{User:TheDaveRoss/sig}} 23:30, 10 February 2011 (UTC)
:::Yes totally agree, perhaps move [[Wiktionary:Requests for flood flag]] to [[Wiktionary:Flood flag/requests]] and create [[Wiktionary:Flood flag]]. Then admins could simply 'inform' others of their use of a flood flag, instead of 'requesting it'. [[User:Mglovesfun|Mglovesfun]] ([[User talk:Mglovesfun|talk]]) 00:15, 11 February 2011 (UTC)
: @TheDaveRoss: I think we might as well keep the page, and have admins comment there when they flag themselves, just so people who care can keep the page on their watchlist. We can easily drop that requirement at some point if we decide in future that it's too onerous and not necessary. —[[User: Ruakh |Ruakh]]<sub ><small ><i >[[User talk: Ruakh |TALK]]</i ></small ></sub > 16:15, 11 February 2011 (UTC)
::I'm talking about moving it and replacing the redirect [[WT:Flood flag]] with content. [[User:Mglovesfun|Mglovesfun]] ([[User talk:Mglovesfun|talk]]) 16:28, 11 February 2011 (UTC)

Revision as of 16:42, 11 February 2011

November 2010

Rhymes adder

I've been working on a script to simplify adding rhymes to rhymes lists (User:Yair rand/rhymesedit.js, available in WT:PREFS). It works much like the existing translations adder, placing input forms at the end of each section, and also automatically updates the entry for the added rhymes upon saving. Right now it's quite likely that there are a lot of bugs in it (due to it being pretty much untested), but assuming it can be made mostly bug-free, do people think that this is something we would want to have enabled by default at some point? --Yair rand (talk) 06:40, 12 November 2010 (UTC)Reply

What happens when the target rhymes page does not exist (which I find is often the case)? --EncycloPetey 19:44, 12 November 2010 (UTC)Reply
I'm not completely sure if this is what you're asking, but if the appropriate language section is not available in the entry or the entry does not exist, nothing is added to the entry, but the red link is still added to the rhymes list. --Yair rand (talk) 01:11, 14 November 2010 (UTC)Reply
It pops up an error message. The rest of the rhymes get added OK tho. — lexicógrafa | háblame 01:38, 14 November 2010 (UTC)
Hm, that wasn't supposed to happen. Now fixed. --Yair rand (talk) 03:02, 14 November 2010 (UTC)
OK, I'm confused even more now. At first, it sounded as though you were adding rhyming words to lists on pages in the Rhymes namespace. Now it sounds as if you're adding a {{rhymes}} link to the pronunciation section of an entry based on what's extant in the Rhymes namespace. Which is it? --EncycloPetey 19:53, 16 November 2010 (UTC)Reply
Input boxes appear on the list in the Rhymes namespace, which are used for adding words to the list, and then the script also automatically adds {{rhymes}} to the entries of each of the newly added rhymes upon clicking the save button. --Yair rand (talk) 22:22, 16 November 2010 (UTC)
Is it case-sensitive? It shouldn't be; that causes problems for German. -- Prince Kassad 23:30, 16 November 2010 (UTC) (edit: it also does not work with {{top4}}, which is a problem for long rhymes lists...)
It's not case-sensitive in its sorting, and I'm pretty sure it does work with {{top4}}. (Note: I just fixed a bug in the script a few minutes ago that was causing it to mess up on any page that had the beginning of a header as the first character of the page. There's a decent chance that the problems were due to that bug.) --Yair rand (talk) 00:11, 17 November 2010 (UTC)
Hmm yes, it seems to work better now. -- Prince Kassad 10:40, 17 November 2010 (UTC)Reply
Yes IMO (to answer the original question).​—msh210 (talk) 17:35, 27 December 2010 (UTC)Reply

Okay, this has been sitting here for quite a while. If there are no objections, I'm turning this on in a few days from now. --Yair rand (talk) 19:19, 21 January 2011 (UTC)Reply

The rhymes adder is now enabled by default. --Yair rand (talk) 19:51, 23 January 2011 (UTC)Reply
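
The sorting behaviour described in this thread (new words slot into the list without regard to case, so capitalised German nouns sort alongside lowercase words) can be illustrated with a short sketch. This is not the code of User:Yair rand/rhymesedit.js, which is JavaScript; the function name `add_rhyme` and the sample German words are assumptions made up for the illustration.

```python
import bisect


def add_rhyme(rhymes: list[str], word: str) -> list[str]:
    """Return a new list with `word` inserted in case-insensitive order.

    Hypothetical helper: assumes `rhymes` is already sorted
    case-insensitively and that exact duplicates should be skipped.
    """
    if word in rhymes:
        return list(rhymes)
    # Compare on casefolded keys so "Laus" sorts between "Haus" and "Maus"
    # instead of being grouped with other capitalised words.
    keys = [r.casefold() for r in rhymes]
    pos = bisect.bisect_right(keys, word.casefold())
    return rhymes[:pos] + [word] + rhymes[pos:]


print(add_rhyme(["Haus", "Maus", "raus"], "Laus"))
```

A case-sensitive sort would place every capitalised word before every lowercase one, which is the problem raised for German above.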

Geographic language categories

There seems to be no real consensus on how to handle geographic language categories. This is evidenced by the recent RFDOs on Category:Languages of Tibet and Category:Languages of New Mexico, as well as the historic RFDO on Category:Languages of the United States Virgin Islands (passed as no consensus). Some people are inclusionists and think we should have any conceivable area, others think it is redundant and useless. There is as of yet no policy on this, only the draft Wiktionary:Language categories created by myself which generally only allows language categories for internationally accepted sovereign countries, with other areas being decided on a case-by-case basis. The question is whether this should be elevated to policy, or whether we want a completely new idea. -- Prince Kassad 16:29, 16 November 2010 (UTC)Reply

As a first step towards higher organization, I propose removing Category:Languages of the Americas from Category:Languages by country, because "the Americas" is not a country.
As for your specific concern of inclusionism, it makes sense to create categories for small areas if there are languages restricted to them: for example, Category:Languages of Vatican. --Daniel. 05:23, 17 November 2010 (UTC)Reply
Well, that's a country anyway. But let's say, er, Languages of Naples (Neapolitan) and Languages of Dalmatia (Dalmatian)? I think not. I'm for restricting to countries (whatever they might be. I suppose internationally recognized countries, or some such. Including past ones, like the Holy Roman Empire, though I think that including all of the USSR, the w:Russian Democratic Federative Republic, the w:Russian Empire, and the w:Tsardom of Russia is too much. Where to draw the line...).​—msh210 (talk) 06:06, 17 November 2010 (UTC)Reply
Past countries might be useful for categorizing ancient languages, see Category:Languages of the ancient Near East. But yeah, it's rather difficult to draw the line, what to include and what not to include.
re Vatican, we have that one: Category:Languages of Vatican City. About the Americas, it is indeed not a country. I consider it more of a continent of some kind, so it would fit with Category:Languages of Africa et al. which are also categorized into Languages by country. -- Prince Kassad 10:04, 17 November 2010 (UTC)Reply
Per my RFDO comments, I see no linguistic value in these. Let Wikipedia handle it. I don't really feel strongly about deleting the lot, as I imagine the page traffic for these categories is very, very low. I suppose it's one way to find obscure languages, like people browsing Category:Languages of Italy and finding out there's a Venetian language. That's the only positive point I can think of. Mglovesfun (talk) 23:57, 21 November 2010 (UTC)Reply
We can do it geographically rather than politically: Languages of [the various continents], of the Americas, of the Caribbean, of Iberia, of the Middle East, of the Arabian Peninsula, of the Sahara, of Asia Minor, of the Himalayas, of the Alps, of the Italian Peninsula, of the Great Plains, of the Appalachians, of the Canadian Shield, of Carpathian Ruthenia.​—msh210 (talk) 18:38, 1 December 2010 (UTC)Reply
That is even worse. Political borders are clear and precise. Geographical borders are not. -- Prince Kassad 18:41, 1 December 2010 (UTC)Reply
Geographical borders are not, but neither are language borders (if you will: I mean borders of regions in which specified languages are respectively spoken). I suspect the two may match up somewhat, though, not that I know much about this.​—msh210 (talk) 18:57, 1 December 2010 (UTC)Reply
I'm just thinking that users are probably more familiar with political countries than geographical areas of the earth. -- Prince Kassad 19:06, 1 December 2010 (UTC)Reply
True, but we run into the problems mentioned above. Category pages can have little maps, and the categories can be subcats of the cats for the areas their referents are parts of.​—msh210 (talk) 19:57, 1 December 2010 (UTC)Reply

"Appearance in a refereed academic journal"

For previous discussions, see Talk:infotactic#RFV discussion, Wiktionary:Requests for verification#fluffragette.
(Also NB supramillion / WT:RFV#supramillion / WT:RFV permalink. — Beobach 07:45, 19 November 2010 (UTC))

Wiktionary:Criteria for inclusion#Attestation allows "[a]ppearance in a refereed academic journal" to bypass any other form of attestation. It doesn't require that the term be used (as opposed to mentioned); it doesn't require that the term have been around for at least a year; and it doesn't even require that the journal be durably archived, which doesn't usually come up, but personally I'm not so sure about the durable archival of the e-journal that mentions fluffragette (see its RFV discussion). Oh, and technically, it doesn't explicitly require that the appearance itself be in a refereed article, as opposed to a letter-to-the-editor or something; but I take that to be implicit, so that aspect doesn't worry me. (And, interestingly, it doesn't put any restrictions on what the journal says about the word; technically one could argue that even something like "the form *foobar is unattested, and perhaps impossible" would count. Dunno why we currently shuffle reconstructed forms off to appendices. ;-) )

Does anyone support the current version? (Not necessarily the weirdest quirks resulting from a naïve or too-literal reading, but the overall idea: that a single use or mention in a peer-reviewed paper should bypass all other attestation requirements?)

If not — what should we change it to? I think my preference is just to remove it, but I've put a few options at Wiktionary:Votes/pl-2010-11/Attestation in academic journals to get discussion going. (Feel free to modify that page.)

RuakhTALK 20:49, 14 November 2010 (UTC)Reply

I think the principle that a refereed academic journal can be assumed to have a higher level of authority than other literature is sound, and that a single citation is therefore sufficient. The lack of a requirement for a use-mention distinction is concerning. I wouldn't go so far as to say that we must require only uses, however. I am more partial to option three in your vote, though not the exact wording. Academic articles like these, because of their authority, should be trustworthy sources on words that they document without using. A list of phobias on a trivia site is a good example of why we disallow mentions as sources; a list of words in a journal of linguistics or lexicography documenting real words from a linguistic community which may not appear in literature (as from a spoken language, or even perhaps words in written languages that don't appear in print) seems like a good source.

The distinction that I would draw in academic sources, then, is not whether the word is used, but whether the context of the article suggests that the word is an attestable member of a particular language. I realize that's not completely objective, but I think we could make it work. The problem I have with the article where "fluffragette" is found is not that it is a list of words, but that the article itself is about word formation and indicates that the list is of neologisms verbally observed by the author, possibly only once. A field linguist's list of common animal words in Shabo, on the other hand, would seem worthy of pulling from.

To me, it makes sense to extend this principle to all peer-reviewed academic works (like university press books, and not just articles), but I don't like the broadness introduced by "reliable source." I would also assume that when this criterion was introduced, it was not contemplated that a peer-reviewed academic journal could be not durably archived; it would make sense to have that apply to all quotations universally.

And really, if we are looking at the standards for attestation, "Clearly widespread use" is the one that really bugs me. If it is so clear, just demonstrate it with a quotation fitting one of the other criteria. Dominic·t 12:08, 15 November 2010 (UTC)

I agree that we need to do something about tiny languages (and perhaps even tiny dialects of English), but I think it might be better to do that explicitly, rather than by trying to create some general principle that will de-disadvantage those languages. BTW, feel free to modify that proposal in whatever way you see fit. So far you're the only one who's commented supporting it, so for now, you own it. ;-)   —RuakhTALK 15:45, 15 November 2010 (UTC)Reply
I may have overemphasized the point about languages without literature (because it seemed like the clearest example) but the point that I am really trying to make applies to all languages we are documenting. Because it is languages we are documenting, after all. Our mission is to document language, not literature. The privileging of the written word over the spoken word, and, at that, the durably archived written word over the word in more transitory media, is something we and all dictionaries must struggle with. Basically, the use requirement forces us to find a word used by its speakers in their own voice (whether fictional or not), but not all voices are recorded that way. That doesn't even hurt just the disadvantaged speakers and their words—there is lots of technical language that is also difficult to cite because it is language that might not see publication in the voice of its speakers either. If a (durably archived) peer-reviewed academic work says something is a real word without necessarily using it, even if we cannot find it in use elsewhere, that is acceptable to me as proof. I do think the questions of use vs. mention and of one or three quotations are entirely separate though. Dominic·t 10:47, 16 November 2010 (UTC)
  • I'd go for "Proposed Change 1: Remove", in the absence of a rationale that points otherwise. --Dan Polansky 14:36, 15 November 2010 (UTC)Reply
    • I'd be happy enough to remove the academic journal criterion - count them as durably archived and therefore as valid citations; yes. To put them ahead of other durably archived sources, no! Regarding "I think the principle that a refereed academic journal can be assumed to have a higher level of authority than other literature is sound" (Dominic) I don't think that's the case. As sound yes, ok, but authority with respect to written language? No. Mglovesfun (talk) 15:02, 15 November 2010 (UTC)Reply
Like Dominic, I like option 3. An alternative is to count mentions in journals (but only uses elsewhere), while requiring at least one use (somewhere). That's what's done in Wiktionary:Votes/pl-2007-12/Attestation criteria. (I like Ruakh's option 3 better than that, though.)​—msh210 (talk) 18:47, 15 November 2010 (UTC)Reply
A problem with mentions in articles, particularly linguistics articles, is that transliterations are used of languages that are not written, and even of languages that are not written using Latin letters (example). We don't want the latter, and (I think) we want the former iff the transliteration scheme is standard. AFAICT this problem applies also to option 3 in Ruakh's vote, but not to the alternative I suggest just above, as that requires some attested use.​—msh210 (talk) 18:56, 15 November 2010 (UTC)Reply
Furthermore something that is mentioned once, with no usage (anywhere else) there's no way it can support any definition. If I were to write 'gollygalf is an interesting word', that couldn't support any possible sense of gollygalf other than 'we don't know what it is'. Mglovesfun (talk) 19:02, 15 November 2010 (UTC)Reply
NB, when that happens with ancient languages, dictionaries (including Wiktionary, I think) write "a word of uncertain meaning". — Beobach972 19:16, 15 November 2010 (UTC)Reply
  • Each of the proposed changes is better than the status quo, but I think Proposal 1 (remove the privilege) is best. (At a minimum, we should require that appearances be in durably-archived journals... but I wouldn't add that as an option to this vote, because I fear enough votes might be diluted between it and Proposal 2 that none of the options would have a majority. Perhaps we could have a runoff, if no one option has a majority, but change has more support than the status quo.) One clarification question, though: in the event Proposal 3 (require assertion of common use) passed, and we found a word used in two books and a journal article — but only used, not asserted to be common — it would meet CFI, correct? — Beobach972 19:11, 15 November 2010 (UTC)Reply
  • Re: vote dilution, runoff: it's approval voting, so theoretically people should vote for every option that they support. I'm beginning to doubt the theory, though. :-P
    Re: clarification: correct.
    RuakhTALK 21:11, 15 November 2010 (UTC)Reply

Shall we start the vote? — Beobach 01:29, 24 November 2010 (UTC)Reply

I'd been hoping that people would improve the proposals, but no one has . . . I've set the vote to start in three days, to give a last chance for improvements before the vote starts. —RuakhTALK 02:09, 24 November 2010 (UTC)Reply
Does it make sense to differentiate by language characteristics? For example, should we be more willing to rely on academic sources for extinct languages and/or tiny languages? I don't really see any reason to enshrine academic journals in any way for usage in modern languages with abundant online corpora like English. Furthermore academic journals are not readily searchable except by those blessed with access. Public libraries afford only the most limited access to such specialized and costly resources. DCDuring TALK 11:31, 24 November 2010 (UTC)Reply

Non-language three-letter templates

What are our three-letter templates that are not language codes? I remember {{rfd}}, {{rfc}}, {{rfv}}, {{rfp}}, {{rfe}} and {{sic}}. --Daniel. 10:57, 15 November 2010 (UTC)Reply

This is the 15-th discussion in Beer parlour that you have initiated in November. That makes one new Beer parlour thread per day, on average. What about you reduce the rate to one half? Or what about doing something uncontroversial for a while, such things that do not require much discussion? --Dan Polansky 14:31, 15 November 2010 (UTC)Reply
Also {{art}}, {{dat}} and {{gen}}. More specifically, three-letter, no caps, Latin script, no diacritics. Or, using only abcdefghijklmnopqrstuvwxyz. We had a drive to get rid of these on fr: and now, apart from one, only a couple exist as redirects, and the redirects are listed as deprecated templates. I've always wondered why we don't rename stuff like rfv and rfd to avoid future clashes. Mglovesfun (talk) 15:05, 15 November 2010 (UTC)
From Dan's apparent criticism above, I suppose I would not be able to satisfy everyone if I tried to. Two months ago, there were people claiming that I don't discuss enough.
Thank you for listing these templates, Martin. --Daniel. 15:31, 15 November 2010 (UTC)
The complaints about your not discussing enough were not about the absolute volume of discussion you generate, but rather about your doing too many possibly controversial changes without a discussion. If you reduce the volume of controversial changes, you will be able to reduce the volume of accompanying discussion you generate in Beer parlour. Anyway, maybe other people see it differently from me. --Dan Polansky 15:37, 15 November 2010 (UTC)Reply
Hmm, I don't remember ever doing a controversial change without it being discussed first, but perhaps my views on controversy and necessity for discussions are different from those of other people.
By the way, I like to point out suggestions for improvement of Wiktionary and question its practices when they are too obscure. I don't feel the need for reducing the absolute volume of discussions created per day by me. --Daniel. 16:08, 15 November 2010 (UTC)Reply
Re "I don't feel the need for reducing the absolute volume of discussions created per day by me": I know. That is why I have pointed out that I do feel the need that you reduce the volume. I do not know how other people see it, though. --Dan Polansky 16:11, 15 November 2010 (UTC)Reply
You have quite often (I remember four or five occasions) edited major templates without prior discussion and caused problems that broke hundreds or thousands of entries. Those changes might not be controversial in nature, but they were damaging in effect, and the breakage might have been avoided if others had had a chance to review them first. Equinox 00:46, 16 November 2010 (UTC)Reply
{{rft}} comes to mind also.​—msh210 (talk) 17:40, 15 November 2010 (UTC)Reply
Dan Polansky, can I complain about your off-topic input to this debate, then? Thank you. Mglovesfun (talk) 18:54, 15 November 2010 (UTC)Reply
As of the last dump, I think the only ones not yet mentioned are {{voc}}, {{inv}}, {{rfi}}, {{wse}}. --Bequw τ 00:44, 17 November 2010 (UTC)Reply

Thanks to msh210 and Bequw too, for replying. Another template worthy of mentioning is {{see}}, that should be used only as a language template (see is the code for Seneca), but also retains the functionality of being equal to {{also}}. --Daniel. 08:57, 18 November 2010 (UTC)Reply
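
The criteria given in this thread (exactly three letters, no capitals, unaccented Latin script) amount to a simple pattern check. The sketch below applies it to template names mentioned on this page. The set `assumed_language_codes` is a hypothetical stand-in containing only the one code the thread itself names; it is not an authoritative list of language codes.

```python
import re

# Exactly three lowercase, unaccented Latin letters, per the thread's criteria.
THREE_LETTER = re.compile(r"[a-z]{3}")

# Template names taken from the discussion; the last two are longer names
# mentioned elsewhere on the page, included to show the filter rejecting them.
mentioned = ["rfd", "rfc", "rfv", "rfp", "rfe", "sic", "art", "dat", "gen",
             "rft", "voc", "inv", "rfi", "wse", "see", "top4", "rhymes"]

# Assumption: only "see" (Seneca, as stated in the thread) is treated as a
# language code here. A real check would need the full code list.
assumed_language_codes = {"see"}


def is_three_letter_name(name: str) -> bool:
    """True if the whole name is three lowercase ASCII letters."""
    return THREE_LETTER.fullmatch(name) is not None


candidates = [n for n in mentioned if is_three_letter_name(n)]
clashes = [n for n in candidates if n in assumed_language_codes]
print(clashes)
```

Using `fullmatch` rather than `match` matters here: `match` would accept "rhymes" because its first three characters fit the pattern.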

Preventing creation of new entries by anons

Wikipedia prevents "anons" (IP addresses, who aren't logged in to a named user account) from creating pages. I think this might be a good idea on Wiktionary too, because a lot of IP-created entries are obvious vandalism, and (relatively speaking) we have far fewer users patrolling vandalism than WP does. While it's slightly annoying to have to sign up (I didn't bother for a few months when I first came here circa 2008), I think it's a reasonable thing to ask, and restricting entry creation to registered users would probably kill a significant category of vandalism — plus we have Requested entries for those who want to suggest an entry but don't know how to write it properly, or don't want to bother. 1. Is this something that could be rolled out across Wiktionary? 2. If so, how do people feel about it? Equinox 00:42, 16 November 2010 (UTC)Reply

I'd want to see some data about how many pages from anons are immediately deleted. Maybe we can get in touch with someone with access to that kind of information (or is it available in the database dumps?). Nadando 00:57, 16 November 2010 (UTC)
As we weigh whether or not to require new would-be editors to create accounts before creating pages, we should also consider that some of our established editors sometimes edit without logging in, if they are for example on public computers. When I look at Special:NewPages, the vandalism I find is (as you say, Equinox) obvious (and thus easy to spot and delete)... to me, preventing anons from creating pages is unnecessary. Furthermore, that new anons sometimes give us useful content, and that established users sometimes choose not to log in before giving us useful content — to me, each of those things justifies allowing users to create pages without needing to create or log in to accounts first. — Beobach972 04:30, 16 November 2010 (UTC)
As for the statistics Nadando does well to request: this is only a day's anecdote, but may give some idea: On the 15th, anons gave us meningsverschil, Daygo, chính trị, bộ chính trị, and řádka. Daygo is currently undergoing RFV (rightly doubted and listed by Equinox), but the other 4 are good Dutch, Vietnamese, and Czech entries. Meanwhile, 32 of the ~83 pages deleted on the 15th were created by anons. Thus, slightly more than 10% of anon contributions on that day were good, while anon vandalism represented slightly less than 40% of what was deleted (the rest being vandalism by logged-in users or miscarriages by bots or logged-in users). — Beobach972 04:30, 16 November 2010 (UTC)
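
The two percentages in the comment above can be reproduced from the counts it gives. The sketch assumes that "anon contributions" means the 4 good entries plus the 32 deleted anon pages, leaving out Daygo, which was still under RFV. That denominator is an interpretation, not something the comment states.

```python
good_anon_entries = 4    # entries the comment calls good
deleted_anon_pages = 32  # anon-created pages deleted that day
deleted_total = 83       # approximate total deletions ("~83")

# Assumed denominator: good entries plus deleted pages, excluding Daygo.
anon_total = good_anon_entries + deleted_anon_pages

share_good = good_anon_entries / anon_total
share_deleted = deleted_anon_pages / deleted_total

print(f"good share of anon creations: {share_good:.1%}")
print(f"anon share of deletions:      {share_deleted:.1%}")
```

Counting Daygo in the denominator gives 4/37, which is still slightly more than 10%, so the conclusion does not depend on that choice.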
We have some known regular IPs, like User:71.66.97.228, I would not want to scare them away. -- Prince Kassad 09:41, 16 November 2010 (UTC)Reply
I don't really like it, it seems to me that the whole idea of a Wiki is that anyone can create or edit an article. To be honest it's our job to ditch the ones that aren't any good. Ƿidsiþ 09:47, 16 November 2010 (UTC)Reply
Just as a point of fact, Wikipedia's current hostility to anonymous editing is an attempt to prevent libel issues from editors who would use the site to cause very serious harm in the real world. This is an issue which we do not yet have, if we ever will. The ban on anonymous article creation and semi-protection were developed in the wake of the John Seigenthaler controversy not because of the overall quality of anonymous edits (which are good, for the same reason the wiki system works), but because of the potential risk that even a single instance of a certain type of vandalism represented. I think it is important to understand that we don't have the same equation here; even on Wikipedia it is recognized that losing anonymous article creations was a trade-off, but we have far less to lose and far more to gain from anonymous creation.

Also, the data only gets us so far, as the intangible aspects of anonymous editing may be even more important than the contributions themselves. All of us started out anonymous, and anonymous editing, even clueless, bad anonymous editing, is the gateway to becoming a good editor. It is also in line with the open and meritocratic principles that sustain the project. Targeting it is a form of collective punishment, of creating security through creating barriers to all newcomers, good and bad. It doesn't matter if registration is simple and easy; just as the vandals who are not committed won't bother, so too the casual outsiders with something to add won't take the extra step either. Dominic·t 10:23, 16 November 2010 (UTC)Reply

I oppose. I know many productive anon users, I don't want to scare them away either. --Anatoli 01:47, 18 November 2010 (UTC)Reply
If we believe that new entries by anons are particularly likely to be vandalism, then I think we should first try to make entry-creating edits easier to patrol. Right now a patrolling admin can ask the recent-changes interface to show only unpatrolled anonymous edits, but (s)he can't ask it to show only unpatrolled anonymous entry-creating edits. Such a feature should lower the "cost" of allowing anons to create entries. If we try that sort of step, and we still find this to be a problem, then we can consider stronger measures afterward. —RuakhTALK 04:41, 18 November 2010 (UTC)Reply
Is this what you were suggesting? --Yair rand (talk) 05:08, 18 November 2010 (UTC)Reply
Yes! Awesome! Thank you! —RuakhTALK 19:33, 19 November 2010 (UTC)Reply
Or this (last updated in 2009- might not still work). Nadando 05:14, 18 November 2010 (UTC)Reply
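
The filter asked for in this thread (unpatrolled, anonymous, page-creating edits only) can also be expressed as a query against the MediaWiki web API. The sketch below only builds the request URL and is not necessarily what either linked tool does. The helper name `unpatrolled_anon_creations_url` and the endpoint constant are assumptions for illustration, and the wiki may return patrol information only to accounts with patrol rights.

```python
from urllib.parse import urlencode

# Assumed endpoint for the English Wiktionary API.
API_ENDPOINT = "https://en.wiktionary.org/w/api.php"


def unpatrolled_anon_creations_url(limit: int = 50) -> str:
    """Build a recent-changes query for new pages by unpatrolled anons.

    Hypothetical helper: it composes the URL and performs no network request.
    """
    params = {
        "action": "query",
        "list": "recentchanges",
        "rctype": "new",              # page creations only
        "rcshow": "anon|!patrolled",  # anonymous and not yet patrolled
        "rcnamespace": "0",           # main (entry) namespace
        "rcprop": "title|user|timestamp",
        "rclimit": str(limit),
        "format": "json",
    }
    return f"{API_ENDPOINT}?{urlencode(params)}"


print(unpatrolled_anon_creations_url(10))
```

Restricting the list to creations is what lowers the patrolling cost described above: an admin reviews only the new entries rather than every anonymous edit.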
Every time this comes up I have to say that creation of new entries and editing in general must be given the same treatment in terms of access privilege. If you want to make it impossible for anonymous users to create new entries then that's fine with me as long as anonymous users are not able to edit pages either. Otherwise you will not deter the deposit of cruft, you will only redistribute its accumulation. People will start defining terms where they are listed in derived and related terms rather than on a separate page where they belong, and where the merits of including or excluding the terms are more easily weighed. In fact they already do do this, which is a royal pain but nowhere near the landfill of carnage that will be brought on with a shortsighted policy change. DAVilla 19:21, 4 December 2010 (UTC)Reply

I am rather late to this discussion, and I understand that, but I would like to say that I oppose the concept (though I see it's been significantly opposed as it is). For myself, I know that there have in the past been several productive anon editors of languages such as Vietnamese. I don't want to scare away people who want to add a few words in their language but don't necessarily want to make an account, because they don't feel that level of dedication to the project. Anyway. --Neskayagawonisgv? 20:19, 13 January 2011 (UTC)Reply

Some attested nonfictional terms

I have attested Citations:damage counter, Citations:Basic Pokémon, Citations:Baby Pokémon, Citations:Potterism and Citations:plotkai, so I believe they qualify to be kept as entries in the main namespace, according to our CFI.

As usual, I used Google Books as a durable source of works from independent authors, with the exception of plotkai, which has dozens of quotes from webpages (that is, articles and forums) and employs Archive.org as the durable source (though I'm not sure if I used Archive.org correctly here); as a result, I am pondering whether or not it fits the label "Internet slang".

Dan Polansky and Equinox, please refrain from being impolite. --Daniel. 06:48, 17 November 2010 (UTC)Reply

Actually it looks like damage counter will be deleted. It doesn't seem to be used outside the Pokemon universe, and the definition itself is SOP and unnecessarily specific. ---> Tooironic 11:17, 17 November 2010 (UTC)Reply
The current definition is of a real game, and specific enough to avoid being SOP: it is not anything that counts damage; it means 10 points of damage in that specific trading card game.
In more than one conversation, I have compared Pokémon TCG to other games for purposes of inclusion to the dictionary. For example, there is the fact that development is defined as both
  1. (uncountable) The process of developing; growth, directed change
  2. (chess, uncountable) The active placement of the pieces, or the process of achieving it
The latter definition is a more specific version of the former, but conveys a linguistic nuance restricted to the game of chess. The same applies to damage counter of Pokémon TCG, in comparison to any counter of damage.
Note: When I commented this before, there was at least one reply suggesting that chess terms are more important and more worthy to be included than terms of Pokémon TCG. I fundamentally disagree with this idea. --Daniel. 11:55, 17 November 2010 (UTC)Reply
I think our fictional universe rule requires citations independent of the universe. Chess is not a fictional universe, Pokemon on the other hand is. -- Prince Kassad 13:44, 17 November 2010 (UTC)Reply
Pokémon, basically, is not a fictional universe; it is a franchise that depicts multiple fictional universes.
There are words coined to represent fictional concepts from Pokémon, such as "Pikachu", "S. S. Anne", "Goldenrod" and "Oran", that are under the rules of WT:FICTION for the purpose of being or not defined on Wiktionary.
There are other words that represent real concepts directly related to Pokémon, that are not represented by special rules. For example, game mechanics and strategies such as "F.E.A.R." and "Masuda method". --Daniel. 14:47, 17 November 2010 (UTC)Reply
For context: W:Pokémon Trading Card Game (Pokémon TCG). Related: W:Magic: The Gathering, a collectible card game. --Dan Polansky 14:06, 17 November 2010 (UTC)Reply
I don't think the Pokémon thing is a more specific sub-sense of damage counter; it is just an instance of one. Daniel's pending additions in Appendix:Chip's_Challenge include block (“a brown object that can be moved by Chip, by ices or force floors”) and creature (“any of a set of harmful moving things”). These are blocks and creatures, and they have particular aspects to match the game's needs, but that doesn't make them anything more than blocks and creatures. Hundreds of video games feature "monsters" and "zombies", but I wouldn't want to see hundreds of separate senses, one for each game, simply because they have different abilities and colours. Equinox 14:31, 17 November 2010 (UTC)Reply
As a related note, I have fundamentally opposed the possibility of having dozens of senses for zombie (of zombies of various universes) at Wiktionary talk:Votes/pl-2010-10/Disallowing certain appendices#And the mainspace?, in the message from "01:56, 6 November 2010 (UTC)". --Daniel. 16:07, 17 November 2010 (UTC)Reply
I don't think many of your citations are valid per WT:CFI because you have taken them from Web sources like LiveJournal, which are not durably archived. Equinox 14:32, 17 November 2010 (UTC)Reply
Quite right. And the citations that are durably archived don't all support the headword and/or sense. For example, Citations:Potterism has two cites for Harry Potterism, one for anti-Potterism (anti- + Potter + -ism, IMHO), and only one for Potterism (where it doesn't quite refer to the fandom; but that one could be fixed by tweaking the def). —RuakhTALK 16:50, 17 November 2010 (UTC)
I believe Archive.org does the job of being durably archived. Multiple citations from the page Citations:plotkai include the piece of text "acessed on" with a link to Archive.org. --Daniel. 21:46, 17 November 2010 (UTC)
"For example, the Wayback Machine maintained by Archive.org is not considered usable for attestation, because the archive of a site can be erased at the request of the site owner." Equinox 21:55, 17 November 2010 (UTC)Reply
(Interestingly, Google Groups will also remove Usenet posts at the author's request. But Usenet is archived in other places, presumably...?) Equinox 21:58, 17 November 2010 (UTC)Reply
(For what it's worth, the piece of text that Equinox quoted is from Wiktionary:Criteria for inclusion/Editable.)
Can you please point me to any discussion where the result was never using the Wayback Machine? When I searched for this subject on Wiktionary, I found both opinions (that we should and that we shouldn't use it). For example, here is one message from Wiktionary:Requests for verification archive/October 2005 and another from Wiktionary:Beer parlour archive/2010/August:
  "This reminds me, btw, of a bit on WT:CFI about pages being "durably archived" on Google. I know it's the case that after a page has been down awhile it no longer appears in search results — does the cache still remain after that? I think if a durable archive is to be suggested it should probably be the Wayback Machine. —Muke Tever 04:57, 12 October 2005 (UTC)"
  "[...] Another point of inclarity in your post is that it seems to be assuming archive.org (the Internet Wayback machine) is accepted as a source, and arguing that webcitation.org is no worse; in fact, though, we don't accept the Wayback Machine as a source of attestation. —msh210℠ (talk) 15:21, 9 August 2010 (UTC)"
--Daniel. 22:34, 17 November 2010 (UTC)
Aha, I might have quoted from an inappropriate place, because I thought the rule existed so I used Google to search pages on Wiktionary. Equinox 22:46, 17 November 2010 (UTC)Reply
No, Daniel, you misread what I'd written. I wasn't opining we shouldn't use it: I was stating what I thought was an already decided-upon practice that we don't.​—msh210 (talk) 01:43, 18 November 2010 (UTC)Reply
Actually, I didn't say that you were opining we shouldn't use it. Your quote is pretty clear as stating a supposed fact (that we have the practice of never using the Wayback Machine as a source of attestation), rather than an opinion (that you, rather than the community as a whole, prefer to dismiss the Wayback Machine as counting for attestation).
I assumed that if a consensus was hypothetically attained, someone must have opinions supporting said consensus. (Or, rather, since you assumed we had a consensus, you must have assumed that someone supports said consensus.)
Apparently I have skipped a few steps of thought in that message. I apologize for that. --Daniel. 11:50, 18 November 2010 (UTC)Reply
In that case, I misread what you wrote.  ;-)  Sorry.​—msh210 (talk) 16:31, 18 November 2010 (UTC)Reply

What are the durably archived sources?

I would like to know the answer to the question above. If anyone knows a durably archived source that has not yet been mentioned in this conversation, please do so. Especially if your source is accessible from the Internet.

  • The ones I remember are: Books (including the ones that can be viewed through Google Books and Wikisource), movies, video games and Usenet.
  • Wikimedia projects, including Wikipedia and Wikibooks, and presumably other Wikis such as Wikia or Uncyclopedia are "durably archived" by definition, but I can see a consensus not to count them towards the 3-cite rule of attestation, apparently because these sites can be easily edited by anyone.
  • Finally, there is the Wayback Machine from Archive.org, that has raised some controversy because "the archive of a site can be erased at the request of the site owner" according to WT:CFIEDIT.

--Daniel. 06:31, 18 November 2010 (UTC)Reply

I'd qualify some of the above, but you asked for more, not less, so: laws, court decisions, legislative minutes, any of which are published officially; archived periodicals; engravings on monuments, tombstones, Walk-of-Fame-star-type things.​—msh210 (talk) 08:40, 18 November 2010 (UTC)Reply
Good examples. Magazines and newspapers also serve as durably archived sources.
Oh, msh210, you have my formal permission to express disagreement with any of the sources that I have mentioned. --Daniel. 17:25, 18 November 2010 (UTC)Reply
One of the things in CFI I have a problem with. Since durably is comparable, how durably? Does it have to be forever? How can we know that these sources will last forever? I'd much rather we use Wikipedia's "reliable third-party sources", with a bit more qualifying. Wiktionary:Durably archived sources would be a massive help. To be honest, I don't think anyone gives a sh*t what 'durably archived' actually means. It's just one of those things, if we can't fix it, just ignore it. Mglovesfun (talk) 17:31, 18 November 2010 (UTC)Reply
"Durably" means - will last at least as long as this wiki. SemperBlotto 17:34, 18 November 2010 (UTC)Reply
Sure, but we can't know that, can we? We don't have a crystal ball. Mglovesfun (talk) 17:36, 18 November 2010 (UTC)Reply
This wiki will last forever! Or, at least, 20 billion years. --Daniel. 17:48, 18 November 2010 (UTC)Reply
I think SB's is the right short answer, but I like long non-answers.
The non-parametric statistical theory of extreme values suggests that for any randomish phenomenon measured over a period of time, T, with extreme values X_high and X_low, there is a 50% chance the phenomenon will register values higher than X_high and a 50% chance that it will register values lower than X_low over the next time period of length T. What percentage of the works of Classical Greece and Rome were lost each century over the last 2000 years? What was the highest level of loss: 10%, 20%, 40%, more? I think some similar statistical reasoning can lead one to estimate that a phenomenon that has lasted T years can be expected to last another T years. Another way of looking at it is economically: how much does it cost to maintain an archive and access to it once its immediate economic utility is negligible relative to the resources of the civilization that might incur the cost? High-value religious, official, and literary texts seem to last a long time, outlasting their languages. But low-value text seems likely to not be worth the effort of transcribing and may not outlast its initial physical medium. There are even questions of how long it pays to maintain a community of scholars that can decode ancient scripts and translate ancient texts into more modern forms.
In our case, even high-acid paper has lasted 100 years or so and better quality papyrus, parchment, and paper much longer. How old is the oldest analog sound recording? How old is the oldest digital data? The means of access for electrically recording information are also problematic, with old formats requiring retranscription, which raises questions of economics.
How long will it be possible or worthwhile to maintain electronic copies of blogs, organizational web pages, and news forums that are out of reach of erasure by the original owner of the data?
So, w:Wiktionary hasn't been around for 8 years yet. w:Usenet covers 28 years, but archiving by commercial enterprises is only 15 years old. The w:Wayback machine/w:Internet archive is about 14 years old. By the statistical logic and SB's threshold criterion alone, we should accept all of these. A problem with the Internet Archive is that the owners of the content (and others, such as the Scientologists) have the power to have content removed or to prevent their content from being archived. DCDuring TALK 20:25, 18 November 2010 (UTC)
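The "lasted T years, expect another T years" reasoning above can be made concrete with a small sketch. It assumes (this is an illustrative assumption, not something stated in the comment) that we are observing each archive at a uniformly random point of its total lifetime; under that assumption the chance of lasting at least t more years, given an age of T, is T / (T + t). The function name and the rounded ages are only for illustration.

```python
def survival_probability(age_years: float, extra_years: float) -> float:
    """Chance that something already `age_years` old lasts `extra_years` more.

    Assumes the observation happens at a uniformly random fraction of the
    total lifetime, which gives P = age / (age + extra).
    """
    if age_years <= 0 or extra_years < 0:
        raise ValueError("age must be positive and extra_years non-negative")
    return age_years / (age_years + extra_years)


# Approximate ages quoted in the comment above (as of late 2010).
ages = {
    "Wiktionary": 8,
    "Usenet": 28,
    "Usenet commercial archives": 15,
    "Wayback Machine": 14,
}

for name, age in ages.items():
    # Probability of lasting at least as long again as it already has.
    print(f"{name}: {survival_probability(age, age):.2f}")
```

Under this assumption every source has an even chance of doubling its current age, which matches the "another T years" estimate; the same function can compare a source's age against any other horizon, such as the expected life of the wiki itself.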
If a Wiktionarian successfully quotes something from the Wayback Machine, then the owner evidently did not "prevent their content from being archived"; it leaves only the problem of the owner being able to remove them later. Couldn't it be solved by a bot checking for dead links from the Internet Archive once in a while? Perhaps once every five or ten years? --Daniel. 10:13, 21 November 2010 (UTC)Reply
There is nothing that prevents the owner from requesting removal at any time. The owners of the Internet Archive need to comply to avoid hostile judicial and legislative action which would jeopardize the very existence of the archive as a public resource. DCDuring TALK 12:29, 21 November 2010 (UTC)Reply
What would be the purpose of re-assessing our evaluation of a term every five or ten years? Would a barely legal word suddenly not be considered a word just because some guy decided he didn't want people to see something from the past on a completely unrelated (or worse, an entirely pertinent) concern? No, the criteria are for a durable source precisely for this reason. Barring a change to the criteria themselves, once a term is accepted it is accepted for good. Consider that when you brandish dubious terms like MissingNo.. We expect these to be here for all time. DAVilla 08:21, 4 December 2010 (UTC)Reply
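For reference, the periodic dead-link check proposed in this thread could look roughly like the sketch below. It is only an illustration under stated assumptions: the helper names are hypothetical, the regular expression assumes Wayback links of the form web.archive.org/web/<timestamp>/<original URL>, and the lookup relies on the Internet Archive's public availability endpoint (archive.org/wayback/available) and its JSON layout as commonly documented, which should be verified before any real use. It does not address the objection that a term's status should not depend on later removals.

```python
import json
import re
import urllib.parse
import urllib.request

# Assumed link shape: http://web.archive.org/web/<timestamp>/<original URL>
WAYBACK_LINK = re.compile(r"https?://web\.archive\.org/web/(\d+)/([^\s\]|}]+)")


def extract_wayback_links(wikitext):
    """Return (timestamp, original_url) pairs for Wayback links in wikitext."""
    return WAYBACK_LINK.findall(wikitext)


def has_snapshot(payload):
    """True if an availability response reports an archived snapshot."""
    closest = payload.get("archived_snapshots", {}).get("closest")
    return bool(closest and closest.get("available"))


def query_availability(timestamp, original_url):
    """Ask the availability endpoint about one URL (network access needed)."""
    query = urllib.parse.urlencode({"url": original_url, "timestamp": timestamp})
    url = f"https://archive.org/wayback/available?{query}"
    with urllib.request.urlopen(url, timeout=30) as response:
        return json.load(response)


def find_dead_links(wikitext, fetch=query_availability):
    """List Wayback links in wikitext that no longer have a snapshot."""
    return [
        (timestamp, url)
        for timestamp, url in extract_wayback_links(wikitext)
        if not has_snapshot(fetch(timestamp, url))
    ]
```

Passing a different `fetch` function makes it possible to try the logic on a citations page without touching the network.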
I don't consider video games to be durably archived. Nor wikis (WMF or otherwise). Nor web-pages archived by archive.org. And for books, I'd only count "real" books that are actually published to print; there's some book-like content on Google Books that I'm not sure has ever been to paper. —RuakhTALK 21:04, 18 November 2010 (UTC)Reply
I consider a given video game or WMF wiki to be durably archived, basically because millions of people have copies of it. Modern books would fall into the same (personal) criterion. --Daniel. 10:13, 21 November 2010 (UTC)Reply
In the spirit of openness we disfavor all sources that are not available without incremental cost, i.e., from libraries. Games are available from few libraries. — This unsigned comment was added by DCDuring (talk | contribs).
Well, I suppose there are not many, if any, words that can only be cited from video games, so that is not much of an issue. Even video game-related terms such as HP and special attack are citeable from other places, such as books about video games. --Daniel. 07:24, 22 November 2010 (UTC)Reply

WT:SEA was built based on the types of sources that have been generally agreed to be durable. DAVilla 08:21, 4 December 2010 (UTC)Reply

Note that Google keeps in a cache all pages it finds, for a long time, and that this cache is accessible even after the page has been changed and normal searches don't find it any more. I don't know how to access these old cache pages, but User:Stephane8888 has accessed an old page in this cache, through the following URL: http://webcache.googleusercontent.com/search?q=cache:CtePz131FCwJ:sites.univ-provence.fr/veronis/Parole2007/transcript.php%3Fn%3DLepage+%22remiter+le+territoire+et+de+refaire+toute+une+s%C3%A9rie%22&cd=1&hl=fr&ct=clnk&gl=fr&client=firefox-a
In addition, there is Wayback and, for French sites only, archiving performed by BNF (Bibliothèque nationale de France), which are accessible to researchers. BNF archives everything published in France (books, magazines, etc.), indefinitely.
Therefore, it seems that even normal Internet pages are durably archived.
About wikis, there is no reason to exclude them, when it is clear that the use is a natural use, not an artificial use coined to deceive us. When there is any doubt, the citation should be excluded, but only in this case. Lmaltier 09:02, 4 December 2010 (UTC)Reply
Does the Google cache reflect that the page hasn't been re-visited by the web-crawling spider, or does Google intentionally archive these, and if so for how long? Wayback is mentioned specifically, and the argument against its use is not my own invention. This is not a policy but a reflection (and interpretation, of course) of how the community feels, so although you're certainly welcome to question it, realize there will have to be a lot of minds to change.
On the other hand, BnF is durable by the sound, look, and feel of it, being the national library of France and all. That's exciting because one of its programs, Gallica, has started archiving e-books.
Wikis are neither included nor excluded. If durable then certainly the quotations would count (although frankly I can't think of any wikis that are considered durable except where they are published in another form). The text there just comments on how difficult it is to cite them. DAVilla 18:22, 4 December 2010 (UTC)
About Google, the page I mention was deleted from the site a long time ago, and is not found by normal searches, but is still accessible nonetheless. For how long, I don't know.
About BnF, yes, archiving is durable, but accessibility is low (unfortunately)... Nonetheless, it's interesting. Lmaltier 21:34, 4 December 2010 (UTC)Reply

Poll: Inflection to inflection-line

Recently, a process has been started to move templates that belong to inflection line from Category:English inflection templates to Category:English inflection-line templates, Category:Spanish inflection templates to Category:Spanish inflection-line templates and the same for other languages. The process was approved in RFM by four people (Wiktionary:RFM#Category:English inflection templates), but this seems insufficient for such a big change. Hence this poll to find what people support.

The RFM discussion also dealt with two other moves, but in this poll I am focusing only on the categories for templates that belong to an inflection line.

The rationale for the move was basically that a template with a conjugation table is an inflection template but not one that belongs to an inflection line, so the name "Spanish inflection templates" is misleading, as the category is only for templates that belong on the inflection line.

Do you support moving from Category:English inflection templates to Category:English inflection-line templates and the same for other languages?

If you basically agree with the move but prefer "Category:English inflection line templates" to "Category:English inflection-line templates" (the difference is only in the missing dash or hyphen), please indicate so in your cast vote. --Dan Polansky 11:17, 19 November 2010 (UTC)Reply

Support:

  1. RuakhTALK 14:04, 19 November 2010 (UTC)Reply
  2. ​—msh210 (talk) 16:11, 19 November 2010 (UTC)Reply
  3. Support CodeCat 16:54, 19 November 2010 (UTC) but without the hyphen.
    In this case you need to vote oppose. -- Prince Kassad 18:49, 19 November 2010 (UTC)
    Why? This is as intended: if you basically agree, you support. If you strenuously disagree with the hyphenated form, then an oppose would be in order, but if you merely prefer the form without hyphen, you support and state your preference. Anyway, I am surprised that you oppose the move on the grounds that the new name does not sound like proper English when native speakers have had no problem with the new name so far. --Dan Polansky 18:56, 19 November 2010 (UTC)
  4. Support, even though I never liked the name inflection line. I prefer headword line. --Vahag 17:26, 19 November 2010 (UTC)Reply
  5. Support Rod (A. Smith) 19:24, 19 November 2010 (UTC) But I would prefer "headword-line templates". —Rod (A. Smith) 19:24, 19 November 2010 (UTC)
  6. Support Daniel. 19:41, 19 November 2010 (UTC) I, too, prefer headword, rather than inflection, in this case; "headword" was my first suggestion when I proposed deprecating the name "Category:English inflection templates". --Daniel. 19:41, 19 November 2010 (UTC)Reply
    Should it be "Category:English headword line templates" or "Category:English headword templates", per your preference? --Dan Polansky 20:05, 19 November 2010 (UTC)Reply
    Thanks for asking. I prefer Category:English headword line templates. After pondering on this subject, I came to the conclusion that I don't like the other alternative, Category:English headword templates, because headword lines includes other items, not just headwords, such as genders, inflections and transliterations (and parentheses, for that matter). --Daniel. 09:20, 20 November 2010 (UTC)Reply
  7. Support as is.   AugPi 04:23, 20 November 2010 (UTC)Reply
  8. Support. --Yair rand (talk) 09:57, 21 November 2010 (UTC)Reply
  9. Support Dan Polansky 08:34, 22 November 2010 (UTC)Reply
  10. Support Saltmarsh απάντηση 17:03, 22 November 2010 (UTC) But (for what it's worth) (1) the hyphen seems unnecessary, (2) I would have preferred Headword line template etc.
    The hyphen is there to indicate that the noun phrase "inflection line" is being used attributively, to indicate that "inflection line" modifies "template" rather than "inflection" modifying "line template."   AugPi 04:54, 25 November 2010 (UTC)Reply

Oppose:

  1. Oppose sounds too much like broken English. -- Prince Kassad 14:23, 19 November 2010 (UTC)Reply
  2. Oppose Maro 19:26, 19 November 2010 (UTC)Reply
  3. Oppose I opt for the current terminology. The uſer hight Bogorm converſation 09:22, 6 December 2010 (UTC)Reply

Abstain:

Comments:

Quote signs in category names, appendices

Hi, there is a policy that words or phrases with an apostrophe are entered with a straight ASCII apostrophe, for technical reasons, unfortunately. Hopefully one day we will overcome this imperfection. On the other hand, it is accepted that curly quotes should be used in articles wherever possible. In that respect we are more advanced than Wikipedia. However, we need to come up with some policy: I have been changing links to stuff like Appendix:Variations of "man" to Appendix:Variations of “man” and then making a redirect to the page with the straight quotes. For Categories, there is a problem: the redirect redirects you when you visit the category, but doesn’t redirect the pages which are added to the category. Compare Category:Nouns ending in “-ism” by language and Category:Nouns ending in "-ism" by language (please leave like this for now, until this discussion is resolved).

What policy shall we use here? I would rather not use the quote signs at all in those categories, which is already done in some cases: Category:Danish words suffixed with -isme. It avoids the problem and doesn’t look bad to me.

Of course, my vote would be to abandon straight quotes altogether and fix the software, but… H. (talk) 08:02, 20 November 2010 (UTC)Reply

Re "... it is accepted that curly quotes should be used in articles wherever possible." It seems you get this wrong. I know of no consensual support of this by the community. There may be a plain majority of supporters of such a thing, judging from Wiktionary:Votes/pl-2008-12/curly quotes in WT:ELE, which ended (8-8-0) at the end of voting period, (9-8-1) counting the late votes. It seems at best tolerated when lovers of curly quotes place them at various places, to avoid an edit war. I certainly do not feel obliged to use curly quotes whenever possible. --Dan Polansky 08:58, 20 November 2010 (UTC)Reply
I have reverted some of your changes in category names, such as this. Category names do not use typographic or curly quotes; if you want to change this, you have to garner consensus or at least some support in a discussion. --Dan Polansky 09:10, 20 November 2010 (UTC)Reply
Category:English suffixes suggests the practice of omitting quotes in category names: "English words suffixed with -cyte". --Dan Polansky 09:16, 20 November 2010 (UTC)Reply
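The category problem described in this thread comes down to the curly-quote and straight-quote names being two different strings, so pages tagged with one never appear under the other. A minimal sketch of how a maintenance script might detect such split categories follows; the function names are hypothetical, and the mapping covers only the four common curly quote characters, which is an assumption about which variants actually occur.

```python
# Map typographic quotes to their straight ASCII equivalents.
CURLY_TO_STRAIGHT = str.maketrans({
    "\u201c": '"',  # left double quotation mark
    "\u201d": '"',  # right double quotation mark
    "\u2018": "'",  # left single quotation mark
    "\u2019": "'",  # right single quotation mark
})


def normalize_category_name(name):
    """Return the category name with curly quotes replaced by straight ones."""
    return name.translate(CURLY_TO_STRAIGHT)


def find_split_categories(names):
    """Group category names that differ only in their quote style."""
    groups = {}
    for name in names:
        groups.setdefault(normalize_category_name(name), set()).add(name)
    # Keep only groups where more than one spelling is in use.
    return {key: sorted(variants) for key, variants in groups.items() if len(variants) > 1}
```

Run over a list of category names, this would report the pair of "-ism" categories mentioned above as one group, while a quote-free name such as Category:Danish words suffixed with -isme would never be reported.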

Removing e-mails of authors

Apparently, as standard practice, we mention the e-mail of the author of any message from Usenet. For three examples, see Citations:kawaiily.

Shouldn't we delete all these e-mails, due to the danger of exposing these people to spam? --Daniel. 11:32, 21 November 2010 (UTC)Reply

[e/c] No reason to IMO. By posting to Usenet, they've exposed themselves to spam already: spammers use NNTP, too.​—msh210 (talk) 05:04, 22 November 2010 (UTC)Reply
We could comment them out, leaving them just in the wikitext. This would be helpful to editors and I bet no automated spider will check there. --Bequw τ 05:01, 22 November 2010 (UTC)

@msh210&Bequw: Well, Google Groups hides people's e-mail addresses behind a CAPTCHA. If that were so useless, I wouldn't worry about hiding the e-mails here too. As for leaving them just in the wikitext, I would disagree because it seems pointless to add this information where it can't be read immediately. Besides, eventually spammers would learn that Wiktionary has a bunch of e-mails in wikitext and how to gather them at once. --Daniel. 06:36, 22 November 2010 (UTC)

I quite agree with your last point: If we agree to hide e-mail addresses, then they shouldn't be in comments either.​—msh210 (talk) 07:20, 23 November 2010 (UTC)Reply

We can perhaps use image-file versions of the commercial-at sign, as is done at [[foundation:contact us]].​—msh210 (talk) 07:20, 23 November 2010 (UTC)Reply

I don't necessarily disagree with the cool appearance of the at-signs of your example, but it would be extremely easy to convert automatically the piece of text info{{@}}wikimedia.org into info@wikimedia.org in order to spam it. --Daniel. 03:53, 25 November 2010 (UTC)Reply
Right, but AFAICT anything we do to hide e-mail addresses will have that problem if spammers read our dumps and catch on to our system of obfuscation, so we have to ignore that problem. I've made {{@}} meanwhile, q.v. (Of course, another solution, as you suggested originally, is to remove the addresses altogether. (And presumably to delete old revisions? Well, whatever.) But the benefit of having a better-cited citation overrides IMHO any concern for the privacy of Usenet posters who, remember, don't really have it anyway. What do others think?) —msh210 (talk) 08:23, 25 November 2010 (UTC)
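To illustrate the point made in the two comments above, here is a small sketch of the {{@}} substitution and of how little effort it takes to undo. The function names and the address pattern are assumptions made for the example; only the {{@}} template name and the info@wikimedia.org address come from the discussion.

```python
import re

AT_TEMPLATE = "{{@}}"

# A deliberately simple pattern: an "@" between a local part and a dotted domain.
AT_IN_ADDRESS = re.compile(r"(?<=[\w.+-])@(?=[\w-]+\.)")


def obfuscate(wikitext):
    """Replace the at-sign in e-mail addresses with the {{@}} template."""
    return AT_IN_ADDRESS.sub(AT_TEMPLATE, wikitext)


def deobfuscate(wikitext):
    """What a harvester reading the dump would need to do: one string replace."""
    return wikitext.replace(AT_TEMPLATE, "@")


hidden = obfuscate("info@wikimedia.org")
print(hidden)               # the address as it would be stored in wikitext
print(deobfuscate(hidden))  # the original address, recovered trivially
```

The template therefore only helps against harvesters that read rendered pages or that have not yet learned the convention, which is the trade-off both comments describe.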

Late again, but please also remember that most users do not enter typographic quotation marks directly from their keyboards. Putting these in category names for which the typographic quotation is not the absolute rule makes the category substantially less accessible, and adds a need for yet more redirects --Neskayagawonisgv? 20:27, 13 January 2011 (UTC)Reply

Citation tools and templates

Given the numerous citation tools for Wikipedia, should we provide template-name and parameter compatibility with their citation templates? I imagine ours are a bit different (or at least more restrictive). Our display can and will differ, but several of these tools are quite helpful and easy to use. --Bequw τ 01:03, 22 November 2010 (UTC)

How to list word pairs and their translations

Many pairs of verb + preposition have their own entries, not only go on and take back, but also believe in. Is there any guideline for when to create a separate entry like that? I'm asking from the perspective of a non-English language. The Swedish translation to "believe in" is "tro på". This is actually a non-trivial piece of information, since "in" normally doesn't translate to "på". Swedes believe "on" things, not "in". Likewise, "believe of" (to believe something of someone; which doesn't have its own entry) translates to "tro om" (believe about). So, do I really need to create separate entries for "tro på" and "tro om"? That would lead to very many entries, with the benefit that I can link directly to each entry, but also with the drawback that a reader looking at the separate entry would not get the whole picture of how the verb is used. Or should I fit this knowledge into the article for the verb, and how? Just as one more example sentence? Are there any examples of how this dilemma has been solved or addressed for other non-English languages? --LA2 04:02, 22 November 2010 (UTC)Reply

I made two examples: tro#Swedish (the verb, definition 1) uses example sentences, while vika#Swedish uses separate inflection lines for each phrase, under the same Verb heading, above the same Conjugation heading. --LA2 07:14, 22 November 2010 (UTC)Reply
I think one important distinction is that phrasal verbs (Appendix:English phrasal verbs) put stress on the particle: go on, take back, whereas the verb is stressed in believe in.
  • Perhaps the latter should not have an article of its own?
  • In the sentence "the painting will go on sale next week", the verb "go" happens to be followed by "on", but this is not the idiomatic phrasal verb "go on", is it?
  • I wonder if this is not also the case with "I didn't have anything to go on", listed as definition 3 of go on (added by Taxman in April 2006).
  • In the sentence "let's go on with the show", the phrase "go on" happens to be followed by "with", but we don't have a separate article for "go on with". Should we?
Maybe there are lots of false friends among these phrasal verbs. --LA2 02:30, 23 November 2010 (UTC)Reply

Poll: Inflection to inflection-line 2

In a recently started poll about renaming categories for certain templates (still running), some people preferred a naming option that was not explicitly offered from the start: "headword line". Let me ask two more sets of questions, to resolve the possible naming options.

Old name:

  • Category:Spanish inflection templates

Renaming options:

  • (a) Category:Spanish inflection-line templates
  • (b) Category:Spanish inflection line templates
  • (c) Category:Spanish headword-line templates
  • (d) Category:Spanish headword line templates

The two sets of questions are such that each disregards one aspect: one asks about the choice of phrase to the disregard of hyphenation, the other asks about hyphenation to the disregard of the phrase.

Please post "support" under those statements that capture your preference. You would post two preference votes in total, one per triplet of preference statements.

--Dan Polansky 08:32, 22 November 2010 (UTC)Reply

1.1 I prefer "inflection line" to "headword line":

  1. Support -- Prince Kassad 11:07, 22 November 2010 (UTC)Reply
  2. Support Mglovesfun (talk) 17:26, 22 November 2010 (UTC)Reply

1.2 I prefer "headword line" to "inflection line":

  1. Support The appellation "inflection line" in Wiktionary historically comes from an English bias. Because English is almost a non-inflected language, it can accommodate all information about inflection in the headword line. Other languages use the headword line for other stuff, e.g. gender, transliteration, perfective/imperfective, alternative script, etc. and put the inflection under Declension and Conjugation. I remember when I was new I was very confused when people referred to "inflection line". It's a misleading and unintuitive name. --Vahag 16:53, 22 November 2010 (UTC)
  2. Support Daniel. 17:56, 22 November 2010 (UTC) As I stated in one or more previous discussions, I agree with what Vahagn said now. It is possible and very common for a headword line to be devoid of inflections, therefore "inflection line" is imprecise and should be discouraged in favor of the more intuitive name "headword line". --Daniel. 17:56, 22 November 2010 (UTC)
  3. Support Rod (A. Smith) 23:36, 22 November 2010 (UTC)
  4. Support Internoob (DiscCont) 00:49, 23 November 2010 (UTC) per above
  5. Support, provided we modify Wiktionary:Entry layout explained#Inflections accordingly. —RuakhTALK 06:10, 23 November 2010 (UTC)Reply
  6. per Ruakh (06:10, 23 November 2010 (UTC) in this section 1.2) and Beobach (04:24, 23 November 2010 (UTC) in section 1.3, just below). — This unsigned comment was added by msh210 (talk | contribs) on 23 November 2010.
  7. Support, per Vahag's explanation, which has convinced me to switch my preference. --Dan Polansky 08:54, 23 November 2010 (UTC)Reply
  8. Support, as per 1 and 2 above. If the change is made every effort should be made (I'll help) to make sure that this nomenclature change is made throughout the project. —Saltmarshαπάντηση 11:44, 24 November 2010 (UTC)Reply
  9. Support \Mike 10:23, 30 January 2011 (UTC)Reply

1.3 I am indifferent about "headword line" vs "inflection line":

  1. Support CodeCat 10:47, 22 November 2010 (UTC)
  2. Support(?), but we should really pick one and be consistent in what it's called. --Yair rand (talk) 23:55, 22 November 2010 (UTC)Reply
  3. Support — Beobach 04:24, 23 November 2010 (UTC) Inflection line "sounds better" to me, but Vahag has a good point. — Beobach 04:24, 23 November 2010 (UTC)Reply
  4. Support Ƿidsiþ 14:07, 23 November 2010 (UTC)Reply

2.1 I prefer a hyphenated version to a version without hyphen:

  1. ​—msh210 (talk) 15:55, 22 November 2010 (UTC)Reply
  2. Support Mglovesfun (talk) 17:26, 22 November 2010 (UTC) (but mainly indifferent)Reply
  3. Support Rod (A. Smith) 23:36, 22 November 2010 (UTC)
  4. I prefer a hyphenated version, though I do support being indifferent to whether it is hyphenated or not. :) --Yair rand (talk) 23:55, 22 November 2010 (UTC)Reply
  5. Support Internoob (DiscCont) 00:49, 23 November 2010 (UTC) It's a { headword-line } template, not a headword { line template } — what's a "line template"?
    A "line template" is a template that generates a line, such as definition line templates and headword line templates, which basically defeats the purpose of your distinction. (: But I still prefer the hyphenated version, for other reasons. --Daniel. 01:52, 23 November 2010 (UTC)Reply
  6. Support Daniel. 01:52, 23 November 2010 (UTC)Reply
  7. Support — Beobach 04:24, 23 November 2010 (UTC)Reply
  8. SupportRuakhTALK 06:10, 23 November 2010 (UTC)Reply
  9. Support. --Dan Polansky 08:54, 23 November 2010 (UTC)Reply
  10. Support for the reason I just gave above, in the first poll.   AugPi 05:04, 25 November 2010 (UTC)Reply

2.2 I prefer a version without hyphen to a hyphenated version:

  1. Support CodeCat 10:47, 22 November 2010 (UTC)
  2. Support -- Prince Kassad 11:07, 22 November 2010 (UTC)
  3. Support \Mike 10:23, 30 January 2011 (UTC)

2.3. I am indifferent about a version without hyphen vs a hyphenated version:

  1. Support --Vahag 16:54, 22 November 2010 (UTC)
  2. Support, —Saltmarshαπάντηση 11:45, 24 November 2010 (UTC)

I realized that the original reason for adding "line" to the category name (viz, to avoid confusion with a category of inflection templates) doesn't apply if "headword" is used in the category, so I'm starting another straw poll: "headword templates" vs. "headword[- ]line templates". —msh210 (talk) 16:10, 24 November 2010 (UTC)

3.1. I prefer "headword" alone over either "headword-line" or "headword line".

  1. —msh210 (talk) 16:10, 24 November 2010 (UTC)
    I do not know, but let me note that the templates generate not only the headword but also other parts of the headword line. --Dan Polansky 18:07, 24 November 2010 (UTC)
    Yes, Dan Polansky's statement is correct. msh210, please read the first two messages (Vahagn's and mine) about this subject in the section "2.1 I prefer a hyphenated version to a version without hyphen:". --Daniel. 21:03, 24 November 2010 (UTC)
    Vahagn didn't write anything in 2.1 AFAICT, but I had read your comment there. "Headword templates" is shorter and as accurate as "headword[- ]line templates", as the templates do generate the headword (if often other things also). MHO, of course; the straw poll will determine what people like. If everyone except me (who already put my name in 3.1) and Yair (who already put his in 3.2) puts his in 3.4, I'll be glad to yield to Yair. —msh210 (talk) 07:54, 25 November 2010 (UTC)
    I'm sorry, I meant section "1.2". That is:
      The appellation "inflection line" in Wiktionary historically comes from an English bias. Because English is almost a non-inflected language, it can accommodate all information about inflection in the headword line. Other languages use the headword line for other stuff, e.g. gender, transliteration, perfective/imperfective, alternative script, etc. and put the inflection under Declension and Conjugation. I remember when I was new I was very confused when people referred to "inflection line". It's a misleading and unintuitive name.--Vahag 16:53, 22 November 2010 (UTC)
      As I stated in one or more previous discussions, I agree with what Vahagn said now. It is possible and very common for a headword line to be devoid of inflections, therefore "inflection line" is imprecise and should be discouraged in favor of the most intuitive name "headword line". --Daniel. 17:56, 22 November 2010 (UTC)
    --Daniel. 13:12, 25 November 2010 (UTC)
    Yes, I'd read those, too. My reply immediately above was to them rather than to anything from 2.1. —msh210 (talk) 08:43, 26 November 2010 (UTC)

3.2. I prefer either "headword line" or "headword-line" over "headword" alone.

  1. Support. Most of the templates aren't simply headword templates. They display the headword, usually along with some inflections and/or other information about the word. They produce content to fill the headword line. --Yair rand (talk) 04:02, 25 November 2010 (UTC)
  2. Support Dan Polansky 08:33, 26 November 2010 (UTC) I am not wholly sure but I'll take a stance. I am willing to yield to plain majority. --Dan Polansky 08:33, 26 November 2010 (UTC)
  3. Support \Mike 10:23, 30 January 2011 (UTC)

3.3. I expressed a preference in the hyphenated vs. unhyphenated poll above (2.1 or 2.2) and prefer that form of "headword[- ]line" over "headword" alone, but "headword" alone over the other form of "headword[- ]line".

3.4. I am indifferent about "headword" vs. "headword[- ]line".

  1. Support RuakhTALK 19:45, 24 November 2010 (UTC)

Comments:

The introduction of section 3.3 is too long and unnecessarily complex, because it repeats a few sections while effectively deprecating others. --Daniel. 21:03, 24 November 2010 (UTC)

"Redundant" languages

I'm sure that this has been talked about before, though how generally I don't know. I'd also like to split the debate in two, as I've been criticized before for bringing up too many points in one discussion.

There are essentially some languages with ISO 639-1 or ISO 639-3 codes that we don't (currently) allow in NS:0. Ignoring constructed languages (a different issue) we don't allow Ancient Hebrew (known as Classical Hebrew) as we treat it as Hebrew (he as opposed to hbo). We also don't allow Chinese, but only Mandarin, Cantonese, Min Nan, etc.

The current debate is on Flemish, whether Flemish sections are 'redundant' to Dutch ones. This could apply to quite a few other languages - Anglo-Norman, Norwegian Nynorsk, Norwegian Bokmål, Scots to name just four. So, what do we do? Decide them all on an individual basis I suppose. But using what criteria? Is there a minimum burden of proof, or is it just voting? Mglovesfun (talk) 15:44, 22 November 2010 (UTC)

To answer the question "using what criteria?":
Consulting other authorities, at least on living speeches (to use the Middle English meaning of that word). We might ask: is the speech protected as a language by international treaties or inter-governmental organisations? Scots is a protected minority language (protected by the English); it is not English. We will definitely ask: do dictionaries of the speech(es) agree on what language they are dictionaries of? Dictionaries which contain "Flemish" speech and dictionaries which contain "Dutch" speech agree that they are dictionaries of the Dutch language, not dictionaries of separate languages. In this way, we avoid as much as we can the capriciousness of votes on each, and the bogeymen (or, in Middle English, bugge-men) of "original research" and the "slippery slope". (For example, the differences between Old Norse and Icelandic are as small as or smaller than the differences between Middle and modern English: if we combined Middle and modern English, we would struggle not to slide down the "slippery slope" of "having" to conflate very-similar languages because we'd already combined less-similar ones. However, if we did conflate Old Norse and Icelandic, we would be in a small minority; we might even be the first to do so: "original research".) — Beobach 05:41, 24 November 2010 (UTC)
The Oxford English Dictionary includes Scots and Middle English, the latter silently except for dates, so I don't see your solution as a magic one. Nor, particularly, do I see any problem with distinguishing any pair of languages we like, no matter what we conflate. If it would serve the purposes of our users to separate Moldovan and Romanian, then separate we should. And original research is an issue for Wikipedia, not here.--Prosfilaes 06:30, 24 November 2010 (UTC)
The OED includes Scots (and English) and calls itself a dictionary of the English language, but the Dictionary of the Scots Language calls itself a dictionary of the Scots language — ie, the answer to the question "do dictionaries of the speech(es) agree on what language they are dictionaries of?" is "no". (When we see that Scots is protected as a minority language, we then see more evidence that Scots and English are separate languages.) The idea of consulting such authorities is not magical; we have been doing it all along: if someone proposed merging Spanish and Russian, the idea would sooner be shot down as unsound "because no-one considers them the same language" than as a disservice to our users or our editors. The idea is not a solution, either: if authorities agree, as on Spanish/Russian or Old Norse/Icelandic, there's no problem to solve (except perhaps, in that hypothetical situation: why do some of our editors want to combine Spanish and Russian?!); if authorities are quite divided, or if there are no authorities for a particular speech, they're not much help at all. It's simply an answer to Mglovesfun's question, "is there a minimum burden of proof, or is it just voting?" — Beobach 21:59, 24 November 2010 (UTC)
The OED's use of Middle English is very specific: it doesn't include ME words for the sake of it. But words which exist in modern English are traced back to their earliest appearance in the language, including in Old English or Middle English periods. This means that most OE words and many ME words are necessarily excluded. Ƿidsiþ 09:15, 3 December 2010 (UTC)
I believe you're mistaken there. w:Oxford English Dictionary quotes the 1933 preface as saying "Hence we exclude all words that had become obsolete by 1150" and the 1879 Appeal to the English-speaking and English-reading public says "In the Early English period up to the invention of Printing [i.e. Middle English] so much has been done". That is, Middle English (but not Old English) is clearly part of the OED.--Prosfilaes 18:29, 3 December 2010 (UTC)
I don't generally support the idea that we shouldn't blindly follow ISO 639 - which we don't, we already know there are ISO-639 codes we don't use. ISO 639 wasn't designed to be used for Wiktionary; while it's pretty important for us to have shorter codes for longer and/or difficult-to-type language names (like Old Provençal), we shouldn't "handcuff" ourselves with somebody else's system. Mglovesfun (talk) 15:16, 3 December 2010 (UTC)
I don't generally support the idea that we shouldn't blindly follow ISO 639 - so we should blindly follow ISO? -- Prince Kassad 16:09, 3 December 2010 (UTC)
Why look for problems, for controversies? The only way to prevent controversies is the use of an external source. On fr.wikt, we allow at least all languages recognized by the foundation + all languages with an ISO-639 code, and, as ISO-639 is incomplete, we have defined clear criteria for inclusion of other languages. Note that allowing a language does not mean that sections for this language are encouraged. Lmaltier 20:22, 5 December 2010 (UTC)
That is a remarkably sane approach, Lmaltier. I, for one, would endorse it on en.WT. - Amgine/talk 02:52, 30 December 2010 (UTC)

Middle English/English crossover (for example)

A bit of research tends to show that anything that's attestable in Middle English is attestable in Early Modern English too. With respect to Old/Middle/Modern French, see estre, where we have four entries. Does anyone 'mind' having this 'duplication'? I mean they're all valid, attestable in the given languages, but does it make sense to have four entries for what are essentially four different stages of the same language (or even three, combining Anglo-Norman into Old French)? Mglovesfun (talk) 15:44, 22 November 2010 (UTC)

A point you should note though is that while there will be words that are identical in both stages, there will also be some that differ. It makes little sense to say 'they are the same except when they are not' and then decide to include only words in the earlier language that differ from those in the later one. And don't forget that while MidE and EME share many words, there are some quite significant pronunciation differences. If we start adding pronunciation sections (as we have for OE) then we will definitely want to keep them apart. —CodeCat 21:50, 22 November 2010 (UTC)
Part of that is our normalizing of Middle English spelling towards Modern English trends, which I'm personally not thrilled with. I also suspect that early Middle English had many more words that didn't survive to Modern English. In any case, the fact that a word can be attested in Middle English, and the definitions for which it can be attested do some service towards giving a history of the word that we don't usually do elsewhere.--Prosfilaes 03:15, 23 November 2010 (UTC)
I think the tricky part is in deciding where one language ends and the next begins. The really comprehensive English dictionaries bypass the problem by including plenty of Middle English words in modernized spellings; the really comprehensive Hebrew dictionaries bypass the problem by including Ancient Hebrew words; and from what I understand, the really comprehensive Dutch dictionaries more or less bypass the problem by including Flemish words, except that they apparently reintroduce the problem somewhat by trying to tag the Flemish-only words. We are a bit unusual in trying to be really comprehensive, while also trying to distinguish all these languages. One approach is to declare, rather arbitrarily, that Middle English ceased to exist on January 1st, 1550 (when its native speakers adopted Modern English en masse), that Flemish ceases to exist when you cross the boundary from King Albert's domain to Queen Beatrix's, and so on. (Note: any Dutch works produced outside the Low Countries will be arbitrarily classified as English.) Once we have such a dictum, I don't see a problem with duplicate entries for any words that dare flout it. —RuakhTALK 06:42, 23 November 2010 (UTC)
"Dutch works produced outside the Low Countries will be arbitrarily classified as English" haha! :)
As far as I know, the comprehensive Dutch dictionaries distinguish between Flemish Dutch and Netherlands Dutch natiolects (like American vs British English) — not asserting them to be different languages. We should consider that approach here: tagging words as {{context|Flemish|lang=nl}}/{{context|Belgium|lang=nl}} vs {{context|Netherlands|lang=nl}} if they are specific to one region, and tagging them as "pan-Dutch", or leaving them tagless, if they are used in both places. — Beobach 07:35, 23 November 2010 (UTC)
That's certainly the most reasonable approach, but how will it help in our war with the French? (And yeah, I think pan-Dutch words should be tagless.) —RuakhTALK 13:09, 23 November 2010 (UTC)
I like the system using arbitrary dates to define the difference from one language to another. It's far, far from perfect, I'd say it's equivalent to having a minimum age for the consumption of alcohol or sexual consent in that it's better than nothing. Using spelling is a bit POV, plus in the same text, one word could be Middle English, and the next word only Modern English. Per CodeCat, sometimes the word may be the same, but the pronunciation, gender, inflected forms (etc.) may be different. Mglovesfun (talk) 13:40, 23 November 2010 (UTC)
  • To me, Middle English is a stage of the English language, not something separate from it, and considering it as a different entity helps no one. It even weakens our modern English sections by having words appear to come out of nowhere. It is uniquely awkward, because English was a flourishing literary language during exactly the period when it supposedly switched over, and since I work with quite a lot of stuff published around then -- like Malory -- I'm very aware that picking one or other language is arbitrary. This is very different from the situation between Old English and Middle English, which is characterised by a near-total lack of attested writings (allowing us to draw a convenient line between the two), and during which time the language lost grammatical gender and very quickly absorbed thousands of new words from a completely new source. There the historical record has left us with a very clear language change. Above, someone suggested that ME entries are useful for things like pronunciation detail. This is fair, although singling out a 14th century pronunciation still seems rather random given that there are vast periods of "modern English" not covered by our modern English pronunciation sections: during Shakespeare's time there was no such sound as /ʌ/, and love rhymed with cove, to take one obvious example. Nor were spellings fixed in ME, making lemmata very awkward. The same can be said for most pre-modern languages, but in the case of ME there is a good solution available in merger with "English". Actually, when it comes right down to it Middle English wouldn't even have to disappear -- I think the sections are potentially useful for those interested in the period -- I just don't think their existence should exclude ME data and citations from modern English entries, that's what annoys me about it.
  • (On the French issue: for similar reasons, I would favour merging Middle French into obsolete form of entries under "French", but retaining Old French as separate for reasons of grammatical inflection etc.) Ƿidsiþ 13:47, 23 November 2010 (UTC)
Yes, w:fr:Moyen français was a redirect to w:fr:Français for nearly a year, until 2008. Anglo-Norman should really be merged into Old French too - I know it had a distinct influence on Middle English (and therefore English) but I'd rather use context labels, which is what I was doing before I discovered that Anglo-Norman had an ISO 639-3 code. Mglovesfun (talk) 13:55, 23 November 2010 (UTC)
On the Dutch thing, I suppose there are two points that need to be addressed here. The first is Flemish. I think quite simply that it makes little sense to distinguish it as a separate language. There are some obvious differences, yes, but they rely on the same written standard for the most part. The differences between dialects in the extremes of the Netherlands (excluding those that are now recognised as languages, of course) are generally no greater than the difference between Netherlandic Dutch and Belgian Dutch. The Dutch as spoken in Belgium, even if it is not called Dutch, is readily readable and understandable by anyone from the Netherlands, particularly those in Brabant.
The second point regarding Dutch is the treatment given to Middle English and Middle French. Middle Dutch as a language is already quite similar to modern Dutch, just as Middle English resembles modern English to a degree. The pronunciation differences are about the same, too. But a modern Dutch speaker cannot understand a complete text, just as a modern English speaker can't easily understand Chaucer without at least a glossary. What exactly the grounds are for differentiating what are essentially 'dialects' of one language spaced apart not in space but in time, I don't know. But I think presumed mutual intelligibility would play a big part, not just of the writing but also of the pronunciation. If a Middle text is read out to a modern speaker using authentic pronunciation (so far as we can tell what it was), would a modern speaker be able to make good sense of it? I think in that regard the difference between Middle English and Modern English might be as great as the difference between Modern English and Modern Scots. And we do treat Scots as a separate language. —CodeCat 14:24, 23 November 2010 (UTC)
Re: "If a Middle text is read out to a modern speaker using authentic pronunciation (so far as we can tell what it was), would a modern speaker be able to make good sense of it?": In the case of Middle English, I'd say the answer is "no". When we were reading The Canterbury Tales in high school English, one of the other English teachers dropped in and recited the Pardoner's Prologue for us, so we could get a sense of the rhythm and rhyme and whatnot, and I literally did not understand a single word. (And none of my classmates seemed to, either, though of course I can't say that for sure.) But that's only if we ask what a 2010 (or 1998) speaker would understand; a 1551 speaker might well have understood more, and certainly a 1551 speaker would have no problems with a 1549 work (provided it was from the same dialect). It's like a dialect continuum in time. —RuakhTALK 14:54, 23 November 2010 (UTC) edited 16:03, 23 November 2010 (UTC) per CodeCat's comment below
Plus if Shakespeare had appeared in your class and recited one of his sonnets, even assuming you could get over your astonishment and concentrate, I suspect you would hardly have understood a word of that either. Ƿidsiþ 15:08, 23 November 2010 (UTC)
Nah, we also watched some clips from Shakespeare in Love, and had no problem understanding it. ;-) —RuakhTALK 16:03, 23 November 2010 (UTC)
Realistically it's more like a dialect continuum. So we would use the same approach that is used for such dialects. Either the definition is geographically arbitrary, or it is formed by an isogloss, the analogues of which are respectively a fixed year and a sound change in diachronic linguistics. —CodeCat 15:03, 23 November 2010 (UTC)
Re: dialect continuum: Oops, yes, that's what I meant. Thanks. —RuakhTALK 16:03, 23 November 2010 (UTC)

Fula/Pulaar

Afaik, both words are simply synonyms and refer to one and the same language. But we treat them separately (cf. Category:Fula language and Category:Pulaar language). Why is that so? -- Prince Kassad 19:01, 5 December 2010 (UTC)

Akan/Twi/Fante

Apparently, they used to have a different literary tradition but now they're all written the same. If that's true, certainly we don't need to differentiate. (There's a similar discussion on Meta about merging ak.wikipedia and tw.wikipedia due to this situation.) -- Prince Kassad 20:42, 5 December 2010 (UTC)

User:Daniel./Nonfiction

I have created User:Daniel./Nonfiction to keep track of these words.

In particular, there are Citations:MissingNo., Citations:Curselax, Citations:Matrixism, Citations:Jediism, Citations:Narutard and Citations:Eevolution as recently attested entries. "Curselax" is a game strategy, "MissingNo." is a glitch, "Narutard" is a fan of a series, "Matrixism" and "Jediism" are religions, and "Eevolution" is a fanmade word that represents a group of fictional creatures. I find the last one very controversial as confronting the basic idea that fictional characters shouldn't ever be defined here, though not from a wikilawyerist point of view, because it does not fall under the criterion of "originating in fictional universes". Finally, I removed Citations:Potterism's status of "attested", because it indeed isn't. --Daniel. 17:49, 22 November 2010 (UTC)

These are mostly specific to their universes, as has been pointed out to you before. The fact that they can refer to "real" things does not mean they can't be universe-specific. They have no place here and are pernicious. Equinox 23:24, 22 November 2010 (UTC)
Matrixism, Jediism, and Narutard are not fiction, IMO. --Yair rand (talk) 23:33, 22 November 2010 (UTC)
Yes, I agree on those. I did say "mostly" — actually, it's only half, three of the six, but I think there are further Pokémon-specific entries still extant by Daniel. Equinox 23:38, 22 November 2010 (UTC)
Equinox, you are right: I estimated in another discussion that I am able to define at least some hundreds of nonfictional terms whose context is of Pokémon. Do you have any suggestion for a guideline for inclusion that excludes all "universe-specific" words (which would deprecate or amend WT:FICTION), particularly one that additionally allows the existence of entries for Jediism, Matrixism and Narutard?
Your objections, as I remember them, are merely nuances of "[these words are] polluting this otherwise-useful project". I am aware that practices from other Mediawiki projects do not apply here, but I feel a direct analogy with Wikipedia's short policy named WP:UNENCYCLOPEDIC. Basically, "Delete it from the [dictionary] because it does not belong in the [dictionary]." without further elaboration is a personal opinion or circular reasoning, rather than a relevant logical argument. --Daniel. 01:47, 23 November 2010 (UTC)
Jediism is (until someone can find evidence to the contrary) never used in the Star Wars franchise; it's an invention of people who are referring to the Star Wars franchise, and seems to be cited for CFI. Mglovesfun (talk) 13:46, 23 November 2010 (UTC)
Correct. --Daniel. 21:28, 23 November 2010 (UTC)

ChuispastonBot for bot status

Hi all! I propose my bot, ChuispastonBot, to the English Wiktionary community for a bot flag. You can ask questions and give your opinion here. Grimlock 20:46, 27 November 2010 (UTC)

December 2010

Poll: Inflection to inflection-line 3

In the recent poll number 2, one renaming option was clearly preferred by a comfortable supermajority: "Category:Spanish headword-line templates". In that same poll, another candidate name was proposed, "Category:Spanish headword templates", and a further poll on it was started that most people have overlooked so far.

Please provide your input on your naming preference once more (hopefully for the last time), so we can proceed to finalize the renaming from "Category:Spanish inflection templates" to "Category:Spanish headword-line templates" or "Category:Spanish headword templates". I apologize to those few users who have already voted on this; please excuse me and vote again.

In this poll, the two compared alternatives are "headword-line templates" vs. "headword templates":

  • Option 1: Rename "Category:Spanish inflection templates" to "Category:Spanish headword-line templates"
  • Option 2: Rename "Category:Spanish inflection templates" to "Category:Spanish headword templates"

--Dan Polansky 08:33, 1 December 2010 (UTC)

I prefer "headword-line templates" to "headword templates":

  1. Support Dan Polansky 08:33, 1 December 2010 (UTC) I will yield to plain majority on this issue. --Dan Polansky 09:00, 1 December 2010 (UTC)
  2. Support CodeCat 09:55, 1 December 2010 (UTC)
  3. Support --Yair rand (talk) 09:56, 1 December 2010 (UTC)
  4. Support Daniel. 14:24, 1 December 2010 (UTC)
  5. Support lexicógrafa | háblame 21:42, 2 December 2010 (UTC)
  6. Support Rod (A. Smith) 20:45, 9 December 2010 (UTC)
  7. Support AugPi 21:41, 9 December 2010 (UTC)
  8. Support Saltmarshαπάντηση 06:07, 10 December 2010 (UTC)

I prefer "headword templates" to "headword-line templates":

  1. unless a plain majority prefers the opposite (which it seems to, so far). —msh210 (talk) 16:32, 1 December 2010 (UTC)
  2. Support 50 Xylophone Players talk 20:22, 2 December 2010 (UTC) I don't know...just sounds better IMO.
  3. Support Ƿidsiþ 09:10, 3 December 2010 (UTC) How official is a "poll" compared to a vote anyway?
  4. Support JamesjiaoTC 09:07, 4 December 2010 (UTC)
  5. Support The uſer hight Bogorm converſation 09:18, 6 December 2010 (UTC)

I am indifferent about "headword-line templates" vs. "headword templates":

  1. Support Internoob (DiscCont) 00:58, 8 December 2010 (UTC)
  2. Support --Vahag 10:45, 9 December 2010 (UTC)

Questioned/questionable altering of templates

This pertains to a user, FreezerTwelve, who recently edited Swedish templates. See here for the meagre response. I took the liberty of bringing this here because the user does not seem to be around at the moment to do so themself. Personally, though I don't speak Swedish, my view is that "Swedish doesn't have a case system" is a somewhat outlandish statement and genitives should not be called possessives because in the languages in which they are used they do not always function as "possessives" of some kind. 50 Xylophone Players talk 20:20, 2 December 2010 (UTC)

Quoth [[w:Swedish grammar#The genitive]]:
The Swedish genitive is not considered a case by all scholars today, due to a tendency of language users to put the -s on the last word of the noun phrase even though that word is not the head noun. This use of -s as a clitic rather than a suffix has traditionally been regarded as ungrammatical, but is a rather acceptable use today. It also mirrors English usage (e.g. Mannen som står där bortas hatt. "The man standing over there's hat.")
It's unreferenced, so doesn't justify/support the changes, but I think it shows where the user was probably coming from.
Also, your view doesn't seem well-supported, unless you have reason to suggest that in Swedish the "genitive" doesn't always function as a possessive. Note that English's possessive is frequently described as a "genitive", even though it's only ever used as a possessive.
RuakhTALK 21:41, 2 December 2010 (UTC)
Of the problems facing Swedish entries in en.wiktionary, this must be the least of them. As a native speaker but not a linguist, I know the -s is moving around (1910: mannens där borta → 2010: mannen där bortas), but I don't know what this trend is called or to what extent it is unique to Swedish, and I don't have useful sources at hand to quote. To a lay person like me, genitive and possessive are synonyms. As a template designer I want fewer, more elegantly crafted templates. Right now templates are named {{sv-noun-form-indef-gen}} with "gen" as in "genitive", but print out "indefinite possessive singular of" because somebody thought this was right. Whichever way we might want to change this in the future, it will be a lot easier than some of the other changes I'm considering. I'm taking most of my questions about Swedish language to the village pump of the Swedish Wiktionary, where more informed and detailed opinions can be heard. --LA2 13:20, 4 December 2010 (UTC)

Citations of terms from fictional universes

I want to know why citations from Citations:middle earth were moved to Appendix:J. R. R. Tolkien/Middle-earth. After some debate it still says on Wiktionary:Citations that "if the citations page exists, it should hold all quotations and references for the term". This applies whether the term is fully cited or the entry does not exist, whether the citation counts as durably archived and whatever else, and regardless of the meaning of the term, even if the meaning is unknown.

What's worse is that, as far as I can tell, Middle-earth is fully cited as a place name in works outside of the Tolkien universe, so I'm not sure why even the definition was moved in the first place. DAVilla 23:59, 2 December 2010 (UTC)

By the way, this does open up the more trivial question of how terms that pass the fictional universe test will be represented in the appendix among those that do not. Or maybe someone has already thought of how that should look. DAVilla 00:09, 3 December 2010 (UTC)

This is part of the alternate universe of fictional-universe subpages that are just like entries. They were justified based on the treatment of words from reconstructed languages. The citations seem to have been copied, not moved, AFAICT. I intensely dislike both the subpages and the duplication of cites. DCDuring TALK 00:21, 3 December 2010 (UTC)
DCDuring, how come "They were justified based on the treatment of words from reconstructed languages."? As the most active editor of appendices for fiction, I say that reconstructed languages and fictional universes are completely different concepts, and may naturally have different treatments if applicable. Can you please link to a statement that contains this justification? --Daniel. 00:32, 3 December 2010 (UTC)
DAVilla, moving the citations that reference Tolkien works from Citations:middle earth to Appendix:J. R. R. Tolkien/Middle-earth seemed the most natural way of separating the concepts of "main namespace" and "appendix namespace" that apparently were so strongly discriminated from each other. Since then, I have started relevant discussions that came to confirm what you say now: Citations:middle earth should contain the citations from Appendix:J. R. R. Tolkien/Middle-earth as well. See Citations:mutant for a suggested format that links both to an entry and an appendix.
I believe the universe-specific terms should be represented in appendices. For example, the appendix namespace may be used to explain dozens of attestable varieties of zombie of Marvel Comics, One Piece, Buffy, Doom, The Sims, Final Fantasy, Harry Potter, GURPS, Fallout, World of Warcraft and other works. --Daniel. 00:32, 3 December 2010 (UTC)
Okay, thanks for your responses. Like DCDuring I have to say that I dislike the wholesale duplication, preferring sprinkled example quotations. But my primary concern was the Citations: space.
Daniel, I would have no problem approving an Appendix:zombie or something more descriptive, but I would think that this is an exception to the rule. DAVilla 00:50, 3 December 2010 (UTC)
The reasons and results of duplication of citations are discussed in Wiktionary:Beer parlour archive/2010/October#Duplication of citations. Last month I started a similar discussion (WT:BP#Citations for fiction) focused on appendices for fiction, but unfortunately no one replied.
I basically approve the suggestion of an "Appendix:zombie", preferably with a more intuitive name or a different namespace if possible. It would more naturally be an application of a new rule than an exception, because it would serve as a precedent for many other words with nuances that vary wildly between works. As examples, there are dragon, centaur, god, witch, human, moon, giant, vampire, time travel, electricity, mutant, sex, demon, soul, love, radiation, stone, kiss, evolution, blue, elf, fairy, death, werewolf, immortal, monster, mermaid, angel and ghost. --Daniel. 02:24, 3 December 2010 (UTC)
I've been thinking that the fictional universe stuff is actually fairly closely related to concordances. The latter gives a list of all words used in a work (and sometimes the count or even a reference to where the words are used). What we are creating is the subset of that list consisting of all the nonces, be they entirely new neologisms or just terms where the definition is different from the standard definition. There is a very strong similarity in the organization, which is by work or author or corpus. Of course it depends on where the community would like to see these, but it makes just as much sense to me to list definitions not suitable for the main namespace in a concordance as it does to put them in the catch-all appendix as was voted.
For now I would put this new idea of yours as a subpage of some appendix page that lists them all, for instance Appendix:Neologism/zombie, but I would consider that to be tentative. DAVilla 06:35, 4 December 2010 (UTC)
I've started Appendix:Common themes in fiction; feel free to move it. DAVilla 07:26, 4 December 2010 (UTC)
I saw Appendix:Common themes in fiction, which appears to be the formalization of the proposed "Appendix:Neologism/zombie". It may be a good place to explain universe-specific nuances such as the fact that zombies of Resident Evil are infected with "T-virus", which is in no way relevant to the other works.
However, I disagree with the alleged close relation between entries and concordances for fictional universes, for two reasons:
  1. It is not feasible and readable to define varieties of words within a list of hundreds of common words that are better defined on their respective entries. (For example, Concordance:Sherlock Holmes/M.)
  2. Commonly these words are unique and well-known by people, naturally including but not limited to the readers, players, etc. of the media that depict the fictional universes.
For example, as I mentioned in previous discussions, shiny contains three senses. Nonetheless, if one wants to use Wiktionary to understand the sentence "I got three shiny Gyarados for my team." word by word today, he or she would fail, because this phrase contains two words whose context is strictly that of Pokémon: Gyarados is a fictional creature and shiny basically means "whose colors are rare". These conspicuous words are said and written by people all the time. Therefore, it is reasonable to expect readers to come to Wiktionary looking for them, and for us to define the words, whether in a findable appendix or anywhere else. --Daniel. 08:18, 4 December 2010 (UTC)
I was thinking the structure would be to define the neologisms on a separate page, like at Concordance:Pokémon/Nonces or in this case even Concordance:Sherlock Holmes is available. That's just a thought an alien planted in my head, so don't kill the messenger!
I would definitely have a link from the shiny entry to wherever the Pokémon definition resides, and I agree that the appendix or wherever such definitions are placed must be included in the searchable dictionary space. DAVilla 18:10, 4 December 2010 (UTC)
I agree with the basic idea of creating a page like Concordance:Pokémon/Nonces as a list of terms and quantities of neologisms, not a place to define them. I can easily list thousands of words related to Pokémon, so each word would still be lost within a list of many others. These definitions can already be better handled by separate pages such as Appendix:Pokémon items, Appendix:Harry Potter objects and so on.
One obvious way of making the definition of "shiny" in the context of Pokémon searchable would of course be to define it in the main namespace. However, since we have a consensus on not defining "fictional" terms, and another on not imitating the format of entries for them, it may be better to create an additional namespace for these words, such as "Fiction:shiny", formatted as a simple list. --Daniel. 09:30, 5 December 2010 (UTC)
My experience with namespaces is to hold off on it until there's a preponderance of information necessitating it. While I have done so in the past, today I would not bounce around such ideas for something as controversial as this.
The current vote on the subject will lead us to put all definitions in one big list. Most of the fictional universe terms we run across will be of this type. The ones that are defined specially in several universes will be the exceptions, and while you are right that there are a lot more than at least I had originally imagined, the number will still be a small fraction of the total. What I imagine doing in these cases is linking one (or a handful) of these exceptional terms in a long list of definitions to the respective special page(s) where the definition(s for each) can be found. So Appendix:Harry Potter whatever and Appendix:Star Wars whatever will define a bunch of terms, but a few of them will instead say See [[Appendix:Neologism/such-and-such#Star Wars|such-and-such]] pointing to the common page for "time travel" or "magic" or what have you. (You know, it would probably be a lot better to flesh this out than to try and describe it, but I have a tendency of doing that and then being ignored.) DAVilla 16:56, 7 December 2010 (UTC)
OK, an additional namespace may wait or never be created, especially since the appendix namespace virtually serves the useful goal of holding whatever information we deem necessary.
I'm not sure what would be the ratio of fictional terms eligible to be included in the main namespace vs. the ones to be kept only in appendices. For example, by comparison with Citations:Pikachu, I suppose it would be relatively easy to include and attest the first 151 Pokémon species, which were introduced in 1996 and are the most famous ones.
I naturally agree with, for example, linking one or more appendices of Pokémon to the entries that define terms of this franchise. I think it would not be very different from Appendix:Star Wars derivations.
One example of universe that defines strict rules for what "immortal" means is Highlander. As a result, Appendix:Highlander might have both links: one to immortal and another to Appendix:Fiction/immortal.
Yes, the final part of your message is confusing, but I could understand it. --Daniel. 11:29, 8 December 2010 (UTC)
It strikes me that it's going to be very difficult to keep these in sync. In some cases we'll wind up with the word attested in the main space but still defined in the appendix for that universe, or an unattested word defined in both the appendix for that universe and the appendix for that word as compared between universes, or in the latter plus the main space, or all three. The goal should be to have it defined in exactly one place, which means that there should be a hierarchy: appendix for universe, appendix for the word across universes, and finally the main space. The higher levels should refer to the lower levels, e.g. the main space back to the appendix. Is this maintainable? DAVilla 17:04, 8 December 2010 (UTC)
(unindenting) This subject is becoming increasingly complex, which is not necessarily a bad thing.
We have the format of "glossaries", such as Appendix:Glossary of baseball jargon (M) (and the already mentioned Appendix:Star Wars derivations). They commonly have the same definitions that are found in entries, and apparently are always doomed to be out of sync.
Well, first of all, we can consider appendices and entries as different projects because not everybody would be inclined to edit both, so it is natural that they become out of sync. I would appreciate it if synchronization could be done by bot (or if a bot could warn when definitions need manual synchronization), but this would be a complex automated task that has not been introduced as of today. On the other hand, having out-of-sync definitions is not that bad, as long as they are correct (and link to each other).
I suggest organizing the "hierarchy" as follows:
  1. Naturally, when a word may be added to the main namespace, add it.
  2. When there are multiple basic nuances, they should be included in the main namespace too. The entry immortal was referring only to "death" and "aging". I have added other forms of immortality as examples.
  3. There are universe-specific details such as the fact that (if I remember correctly) immortals of Highlander die if they lose their head. And in Hellsing, there is a vampire named Alucard who is immortal because he is undead and seemingly can restore himself from any injuries. One who reads the current entry immortal can already understand both types. Evidently, it is neither necessary nor feasible to explain "superpowers" and laws of physics of each individual work in the entry. However, these peculiarities, when they are attestable, are expected to be important in the context of the series, so they may be at Appendix:Fiction/immortal (or, even better, Appendix:Fiction/immortality).
  4. The pages Appendix:Hellsing and Appendix:Highlander may simply link to immortal and Appendix:Fiction/immortality with a good design and without any additional explanation. Or, alternatively, they may describe the intersection from a different point of view.
In addition, it may be worthwhile to create Appendix:Religion/immortality (or merge Appendix:Religion/immortality and Appendix:Fiction/immortality into another name), because there are multiple religious views on this matter. For example, depending on each doctrine, immortality may be virtually synonymous with "reincarnation", "eternal Heaven or Hell" and/or "a basic characteristic of God". Hearing this concept said seriously by a Protestant and by a Spiritist are entirely different experiences.
--Daniel. 19:15, 8 December 2010 (UTC)
That stuff is what an encyclopedia is for. Equinox 13:05, 10 December 2010 (UTC)
Which part of the conversation contains "stuff" for an encyclopedia, and how does it affect Wiktionary? --Daniel. 13:33, 10 December 2010 (UTC)
"there are multiple religious views on this matter" Equinox 14:48, 10 December 2010 (UTC)
I agree that mentioning the views of each religion regarding immortality is something purely encyclopedic. There is no reason to create such an appendix in a language dictionary. Our subject is words, only words (in the linguistic sense of the word), and I believe that the word immortality can be defined in a general way without dealing with these details. Lmaltier 07:08, 11 December 2010 (UTC)
I am thinking of a simpler and broader approach than what is found in an encyclopedia: that is, a list of reincarnationist religions, another list of monotheistic religions, and so on. Lists of words by their contexts are very common in Wiktionary. --Daniel. 17:41, 11 December 2010 (UTC)
Even that strikes me as purely encyclopedic. Mentioning some reincarnationist religions in the page about the word reincarnationist may be useful to understand more easily the meaning of this word. But attempting to list all reincarnationist religions is out of our normal scope. Nonetheless, these words may be mentioned in a reincarnation thesaurus page, but not in the spirit of an encyclopedia, only because words for these religions may be needed when writing about reincarnation. Lmaltier 18:07, 11 December 2010 (UTC)
Hmmm, I personally find your distinction between dictionary and encyclopedia too subjective. Nonetheless, since my basic proposal of having a list of reincarnationist religions was accepted by you, at least, you and I don't need to discuss this anymore. I agree: it would be very natural to choose Wikisaurus as the place to list these religions by their characteristics. --Daniel. 23:52, 14 December 2010 (UTC)

Category:Nouns ending in "-ism" by language

Is this really the sort of thing we want? At the very least, shouldn't Category:English nouns ending in "-ism" be moved to Category:English words suffixed with -ism? Mglovesfun (talk) 14:26, 4 December 2010 (UTC)

In any case, the name is unclear: the actual meaning intended is nouns ending in "-ism" (or a similar ending) by language. Lmaltier 15:01, 4 December 2010 (UTC)
We should probably use {{suffix|..|ism}} to categorize, also with "lang=xxx". Also see Category talk:German words suffixed with -ismus and Category talk:Nouns ending in "-cide" by language. Mutante 19:58, 4 December 2010 (UTC)
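For illustration only, a minimal sketch of the template-based categorization suggested in the comment above. It assumes that {{suffix}} takes the stem, the suffix, and a lang= parameter in the way that comment describes; the word chosen (heroism, from hero + -ism) is a hypothetical example, not one taken from the discussion.

```
===Etymology===
{{suffix|hero|ism|lang=en}}
```

If the template behaves as assumed, the entry would be placed in Category:English words suffixed with -ism through its etymology line, so no "nouns ending in" category would need to be added by hand.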

What is a fictional universe?

WT:RFV#Curselax got me thinking about this (again). How do we define a fictional universe? Cryptex failed RFV because it's a term from a fictional universe, that of the Da Vinci Code, yet the book is set in modern-day Europe! That sort of makes me think that all works of fiction take place in fictional universes. So all nonce words that are only used in fictional contexts would by their very nature fail RFV. Ones used in non-fiction works would, I suppose, pass. I contrasted cryptex with warwood, which is seemingly only used in Moby Dick, which is of course fictional.

Furthermore, didn't we once consider Monopoly, the board game, a fictional universe? Therefore video games and TV shows (rightly IMO) can be considered fictional universes. Mglovesfun (talk) 12:44, 6 December 2010 (UTC)

If the places and situations of Monopoly comprise a fictional universe, then I suppose Marvin Gardens (a variety of Marven Gardens) cannot be defined here. On the other hand, Chance card, Community Chest, Speed Die and GO are eligible as nonfictional.
We also have categories for Rubik's cubes (and even the entry F2L!), Chess and other games here.
Notably, certain characteristics of Tetris may be defined here as well, such as gravity, infinite spin, wall kick, S and Z. However, if I remember correctly, the Tetris-style game Columns has an embedded history about the pieces being jewels to be sold, which would clearly constitute a fictional universe. --Daniel. 13:42, 6 December 2010 (UTC)
Wiktionary:Criteria for inclusion/Fictional universes seems to be the 'policy' at hand. SFAICT it itself is not a policy; it's a fleshed-out explanation to go with WT:CFI#Fictional universes itself. Neither of them tries to define what a fictional universe actually is. Also, it seems pretty clear that if a term appears in more than one fictional universe, it is includable. The examples in WT:FICTION are a bit problematic, as in many cases it's unclear what they refer to, so it is difficult to show that they are not referring to a fictional universe. Ideally, citations in this sort of case should be at least a full sentence to show context. "Wielding his flashlight like a lightsaber, Kyle sent golden shafts slicing through the swirling vapors.", as an example, doesn't show independence. Mglovesfun (talk) 15:01, 6 December 2010 (UTC)
Wiktionary:Criteria for inclusion/Fictional universes is a full-fledged policy because it was born from this vote.
It should be noted that, for example, considering "Pokémon" an individual fictional universe would be a broad interpretation, because there are various comicverses, gameverses, etc. of Pokémon that are of different authors and don't interact with each other.
The concept of "independence" from that policy is indeed very problematic. One arguable result from it is that the entry Gyarados (defined as a Chinese-style aquatic dragon with whiskers and a furious face) should be created, because it appears in at least three broad fictional universes: "Pokémon", "Pucca" and "Dungeons and Dragons", not to mention "Smash Bros." --Daniel. 15:50, 6 December 2010 (UTC)
The overall consideration is that linguistic phenomena that are not part of the lexicon should be excluded from treatment as entries. Spending time on the meaning of fictional universe is letting the tail wag the dog. Under what circumstances can a word be considered to have entered the lexicon? Clearly it takes usage outside the universe of authored works that have created the terms. The existence of fanzines covering the fictional universe makes it somewhat problematic because these entities have a degree of formal independence from the corporate purveyor of the fictional universes. The problem is similar to the question of which computer terms to include when the principal source might be the purveyor of Java or the .NET Framework. That there is a community of supposedly independent companies and programmers who use the terms doesn't really make them part of the lexicon. There is perhaps also an analogy to WT:BRAND. What has always seemed lame about fictional universe terms is the extent to which they are an integral part of the marketing of commercial products. Whereas once commercial interests often lost control of user vocabulary for their products, they now seek to develop, influence, and exploit it, i.e., to control it. DCDuring TALK 16:11, 6 December 2010 (UTC)
Re Daniel., no, it doesn't say anywhere that it's a fully fledged policy. The vote itself perhaps, but not the subpage. Mglovesfun (talk) 16:24, 6 December 2010 (UTC)
Mglovesfun, Wiktionary:Criteria for inclusion/Fictional universes is a copy of the proposal that has been accepted by Wiktionary:Votes/pl-2008-01/Appendices for fictional terms. They share the status of policy because they are identical.
DCDuring, my thoughts on this matter are probably influenced by the fact that I don't blindly support the "overall consideration" of what is or not "part of the lexicon" as you explained it. I personally read your arguments as very similar to saying that monogoneutic should not be included here because only entomologists are expected to use this word.
The mere existence of "Gyarados" in those four broad and different fictional universes is not necessarily a reasonable criterion for defining that word on Wiktionary. The entry Gyarados can of course be justified by WT:FICTION now, but I believe the wisest solution would be simply modifying the policy to avoid this excuse.
Nonetheless, perhaps differently from you, I consider the existence of fanzines as an argument in favor of creating, attesting and keeping fictional words on Wiktionary, as I would like to use this site to understand them if I want to.
It may be worth noting that (please correct me if I'm wrong; I didn't do extensive research) apparently Wikibooks and Wikiversity (and obviously Wikispecies) are not fond of fictional universes, which appear to be mentioned as examples of certain subjects but are never the center of attention. For example, I couldn't find any Wikibook teaching how to play Pokémon games; and some RFDs deleted Pokémon material from there.
Other Wikimedia projects don't seem to care about sharing information about fiction, regardless of their commercial branches (usually as long as the tone of their texts is not of a blatant advertisement): Wikipedia, Wikiquote, Wikinews, Commons and especially Wikisource contain multiple pages focused on fiction when applicable, rather than having a rule that excludes them as inherently unworthy. --Daniel. 18:05, 6 December 2010 (UTC)
About "Clearly it takes usage outside the universe of authored works that have created the terms": no, it's not clear, and I disagree. Should a mathematical term be rejected because it's not used outside the field of mathematics? Should topological space be excluded because it is never used outside topology? Should intransitive be excluded because it is never used outside grammar? I agree that, if an obscure author creates a word that nobody uses elsewhere and that nobody knows, it should be excluded. But, obviously, well-known words from famous works are words of the language. The sentence about fictional universes should be removed, and normal CFI should apply. Lmaltier 22:36, 6 December 2010 (UTC)
Re Daniel., you don't find the word 'policy' anywhere on that page. Not a 'this will be policy' or 'this will be considered policy'. It just says 'voting on the following'. In such a case, where to me the intention of the author isn't clear at all, I'm not going to generously 'assume' he/she meant policy, but didn't say it. Mglovesfun (talk) 23:11, 6 December 2010 (UTC)
The vote is worded like a policy, starts with "pl-", which means policy, and it basically repeats a section of another policy named CFI.
I like Lmaltier's idea of applying the "normal CFI" on fictional terms as well. He and I were having an interesting conversation about this suggestion last month. --Daniel. 11:04, 8 December 2010 (UTC)

Encouraging FL adjectives as translations of attributive use of English nouns

I am reasonably sure that we do not have FL adjectives appearing as translations of English nouns to cover attributive use. I would venture to guess that many languages do not have as much free attributive use of nouns as English. In any event, I assume that some languages have adjectives in instances where English uses a noun. We attempt to eliminate the needless duplication of senses between Noun and Adjective sections where there is no evidence of use of a word as a true adjective in English (see Wiktionary:English adjectives). Thus the Adjective header is not there to remind translators to make sure that their translations cover cases where English uses a noun attributively. For which languages is a separate adjective always, sometimes, or never required? How should we provide the reminders required, if any? DCDuring TALK 18:16, 6 December 2010 (UTC)

Separate table?—msh210 (talk) 18:30, 6 December 2010 (UTC)
For each sense, for every language? That would seem to put us on the road to a high level of duplication of headings. We cannot assume that one FL adjective would cover attributive use of every English sense. I was thinking that the languages that had a separate adjective (or adjectival circumlocution) would include the relevant adjective (or circumlocution) with a qualifier in the noun sense translation tables. It might be possible to provide a reminder in our translation facilitator, possibly with messages tailored by language. DCDuring TALK 19:04, 6 December 2010 (UTC)
Yeah, probably wiser.  :-)  GP?—msh210 (talk) 19:38, 6 December 2010 (UTC)
I was thinking that this needed wider airing because of my lack of knowledge about the relevant grammar of languages other than English. GP is where this would need to go to plead for someone to take up the implementation. DCDuring TALK 20:16, 6 December 2010 (UTC)
How worthwhile is that? w:Star cluster's French interwiki is fr:w:Amas stellaire ("stellar cluster"), not fr:w:Amas d'étoiles ("star cluster"; lit. "cluster of stars"), so let's pretend, due to my inability to find a better example, that amas stellaire were the most common French term. Would this mean that "star" should have "stellaire" listed as a French translation? How useful is that? Do any other bilingual dictionaries do anything like it? (I suppose the comprehensive ones might have ~ cluster as a run-in entry, so stellaire might appear somewhere in the greater entry; but I doubt that any dictionaries take a very thorough or consistent approach to this.) —RuakhTALK 20:25, 6 December 2010 (UTC)
I constantly hear the argument that we are unlike all other dictionaries in our ambitions. Sometimes the argument extends to our ability to eventually realize those ambitions. It is a matter of many, many facts whether there are cases where English uses attributive use of a noun and a FL uses a different grammar. In this particular case, perhaps both French and English have an adjective that can replace attributive use of the noun, but in this one collocation French uses an adjective, but English uses a noun. What about other collocations and other languages? If we can't answer this question very well, what are the implications of this inability for how we handle this? To me it seems that we need some kind of constructive message to encourage translators to offer the associated adjective in the translation table if it is used with any frequency at all where English would use the noun. DCDuring TALK 21:34, 6 December 2010 (UTC)
Re: "To me it seems that we need […] to offer the associated adjective in the translation table if it is used with any frequency at all where English would use the noun": Why? —RuakhTALK 22:27, 6 December 2010 (UTC)
To offer a user the appropriate word in a given FL to translate the English attributive use of a noun if attributive use of the noun is not how the same meaning is achieved in that FL. DCDuring TALK 00:00, 7 December 2010 (UTC)
I guess I'm just not convinced that that's necessary, or helpful. Rather, it seems potentially confusing. Should [[hungry#Translations]] include words that mean "hunger" in languages that say "have hunger" rather than "be hungry"? Should [[yesterday#Translations]] include words that mean "the day before yesterday" in languages that have a single word for that rather than casting it as a phrase? And conversely, should stellaire be glossed as "Stellar, star"? Maybe the answer to all of these questions is "yes". Maybe we can come up with a good format that makes clear what's going on, or that makes clear that the reader needs to click through to the linked entry for an explanation of what's going on. But you don't seem to be addressing the underlying questions of "do we want this?" and "what should it look like?", just taking for granted that we want it and that everyone knows what it should look like, and skipping to the next question of "how do we get translators to add it?". Did I miss previous discussions that already addressed those questions? —RuakhTALK 01:20, 7 December 2010 (UTC)
I am not in a position to address this knowledgeably. I am reacting to the possible bad consequences of removing Adjective sections from words that have a lot of attributive use but not a true adjective sense by our criteria. If this relatively simple case cannot be addressed, I think an a fortiori case could be made that more complicated cases of other types of complicated circumlocutions cannot be addressed and that our dictionary cannot escape some inherent limits of a lexicon, ie, conceptual ones, not technical ones.
Practically, I am imagining this as being a limited, partial solution, not a global one. I doubt that we have participants of sufficiently broad knowledge to anticipate all possible problems or solutions.
I am only considering the cases where this can be accomplished by the simple addition of an adjective translation to supplement the noun translation. There are plenty of cases where I can more easily imagine construction-grammar appendices (one per language) covering each language's treatment of a particular type of communication need, eg, communicating compass or other types of directions, rather than attempting to include every possible lexical multi-word entry. DCDuring TALK 01:54, 7 December 2010 (UTC)

The best Russian dictionaries translate the words star and sea as the adjectives звёздный (zvjózdnyj) and морской (morskoj), in addition to the noun senses. The same goes for Armenian dictionaries. But I don't know how we can accommodate such translations on Wiktionary. --Vahag 02:06, 7 December 2010 (UTC)

How do they format it? —RuakhTALK 13:32, 7 December 2010 (UTC)
Like this:
1. сущ.
1) звезда
2) знаменитость
2. прил.
1) звёздный
2) известный
--Vahag 18:59, 7 December 2010 (UTC)
Include it with a gloss in the table of translations for that sense, e.g.:
Russian: звёздный (adj.)
DAVilla 16:39, 7 December 2010 (UTC)
I could do that if we decide to make that the official format. --Vahag 18:59, 7 December 2010 (UTC)
You may be right, but we need to make sure that the benighted souls who search on a word-by-word basis are directed to leading generic possibilities, at least until we have time to reach consensus and then add all the new entries required. The project will assure us of continued gainless employment for at least the lifetimes of the youngest here. DCDuring TALK 22:00, 9 December 2010 (UTC)

Proposed Phrasebook criterion

I propose that, aside from the regular rule of idiomaticity for proper dictionary entries, there should be at most one phrasebook entry for any particular thought expressed. For instance, assuming there were the need for such a phrase, any one of I need a drink of water, I need some water, I need water would be fine, but only one. Just as on Wikipedia, we could vote to move a page to a different title. I don't think it's necessary to vote when (1) the meaning doesn't change, as with these three, and (2) it's clear that a better title exists. As to the latter, in this case there's not much distinction. Each of the exact phrases in Google books get 773, 1310, and 2290 hits respectively, which is not surprising since they are progressively shorter. In the case that two of these existed, any disinterested contributor could make a judgement as to the best title to merge them to. When the meaning does not change, the removal of one for merger into the other would not need any formal process as it could be assumed uncontroversial.

Note that I need a drink gets 16300 hits (excluding "I need a drink of"), but this might be considered idiomatic in the sense of an alcoholic drink. I'm thirsty is also a strong contender at 25200 hits, but of course a lot of us expect to satisfy thirst with something other than plain water. The proposed rule wouldn't say anything explicitly about these alternatives. It is weak in that it would only specifically exclude having two phrasebook (i.e. unidiomatic) phrases to cover the same idea. On the other hand, on the grounds of this general principle of uniqueness, one could vote to delete I need water in favor of I'm thirsty, which one could argue is a more useful phrase anyways.

Another example is you need a condom at 57 results vs. you should use a condom at 364 results. Although they aren't exactly equivalent, they are close enough to express the same concern. Since arguably the second does a much better job of that, the page title could be moved to you should use a condom without necessarily consulting the whole community... even in the case that a deletion request for you need a condom had failed since technically it isn't being deleted but rather moved to an equivalent. As with any changes made on this wiki, discretion should be used for when to be bold and when to ask for opinion. As I said for the first example, the best choice is not clear based on statistics, so a move e.g. from I need water to I need some water is more subjective and requires at least discussion on the talk page. DAVilla 04:04, 8 December 2010 (UTC)

Regarding specifically the "I need [] " phrases, in UK English this would be a bit impolite unless it's someone you know. For me, standard British English would be "[please] can I have [] ". Mglovesfun (talk) 12:35, 8 December 2010 (UTC)Reply
You may be onto something. On Google Books may I have some water gets 820 hits, can I have some water gets 1130, which is pretty high given the length of the phrase. Although approaching the situation differently, it would seem that these expressions fill the same need in expressing the author's interest in obtaining water. Is there any way that we can code this, to say that I need water and may I have some water should not both exist, and that the best phrase to represent the idea should be chosen, and that best is sometimes clear but often subjective? DAVilla 16:54, 8 December 2010 (UTC)Reply
At worst, another vote is necessary to determine that. -- Prince Kassad 22:14, 8 December 2010 (UTC)Reply
Wording? DAVilla 06:26, 11 December 2010 (UTC)Reply

Holonyms

Is this ridiculously jargony for a Wiktionary header? The word isn't in the OED, Chambers or the AHD. (Some of the other ones at Wiktionary:Semantic relations seem a bit user-unfriendly too). Ƿidsiþ 12:02, 8 December 2010 (UTC)Reply

A long-standing issue for which we have had no good solution. If someone has even a partial solution, I'm all ears. DCDuring TALK 15:50, 8 December 2010 (UTC)
Partial solution: link the header (====[[holonym|Holonyms]]====).​—msh210 (talk) 16:16, 8 December 2010 (UTC)Reply
Yes, it's ridiculously jargony. My solution: excluding holonyms, meronyms, hypernyms, hyponyms and troponyms from the main space (after all, they don't relate to the word itself, only to its meaning), and moving them to Wikisaurus pages, where they belong (with clear subtitles, adapted to the page, of course: e.g. Subspecies, etc. instead of Hyponyms if the page is about a species).
I would keep a few synonyms and antonyms in the page describing the word, and move the other ones to Wikisaurus: they relate to the meaning, too, but users expect them in the page describing the word. Lmaltier 21:02, 8 December 2010 (UTC)Reply
Many of the items called antonyms in our entries are more validly considered coordinate terms. I have no idea how other semantic relations would fit in with Wikisaurus, nor do I see signs of life there. How much use does it get?
We could allow "Other semantic relations" as an L4 header and put all semantic relations except Synonyms there under a show/hide bar. The show/hide bar could include a link to a page explaining all the semantic relations. DCDuring TALK 13:28, 9 December 2010 (UTC)Reply
Cf. the proposal at [[User:Msh210/ELE]].​—msh210 (talk) 18:57, 9 December 2010 (UTC)Reply
Wikisaurus should be considered as seriously as the main space, and inclusion criteria should be the same. There is not much activity, but it's probably because its mission and its criteria are not very clear. I think that its mission should be All words related to a subject you might want to look for when writing about this subject (especially when you know that a word exists to express your idea, but you have forgotten it). These words have to be organized as clearly and logically as possible. When possible, it is recommended to include pictures, e.g. a picture of a bicycle showing the names of each of its parts (this is what a visual dictionary does: a visual dictionary has the same objective as a thesaurus, except that it is limited to concrete words). Each page should be dedicated to one subject in one language. Lmaltier 18:45, 9 December 2010 (UTC)
Hear, hear! Amend the about page. DAVilla 08:20, 15 December 2010 (UTC)Reply
I may be wrong, but I always prefer to get a consensus through discussion before this kind of change... Lmaltier 21:07, 15 December 2010 (UTC)Reply
Please don't. Wikisaurus pages are not about subjects; they are about senses or concepts. The semantic relations of hyponymy and meronymy (the inverses are hypernymy and holonymy) give a nice guide for how to organize the material, a guide that has proved successful with WordNet. The terms "hyponym", "hypernym", "meronym" and "holonym" sound unfamiliar, but so did "synonym" at first. Thesauri in information science sometimes use the terms "narrower term" and "broader term", but these are vague and broad; someone can think that "electricity" is a narrower term than "physics", and indeed, electricity as a subject is narrower than the subject of physics. But Wikisaurus is not a subject thesaurus or a thesaurus of information science; it is a lexicographical thesaurus. --Dan Polansky 09:26, 15 December 2010 (UTC)
We don't have to create a new namespace before embarking on a new project, do we? I don't see what would be more appropriate for this information at least at the moment than the thesaurus, and a picture dictionary is still a dictionary, isn't it? Still, on a second read, I guess it is premature to put all that language on the about page, but at the same time I still feel it's inappropriate even after several years to dictate that Wikisaurus is one thing or another, by some sort of precedent, rather than to imagine what it could be. DAVilla 09:52, 15 December 2010 (UTC)Reply
to Dan Polansky: When I refer to a subject, it may be any sense, but with a sufficient scope to deserve a thesaurus page, but a narrow enough sense, so that the thesaurus page is not too large. I fully agree that we should build a lexicographical thesaurus e.g. if we build a thesaurus page about association football, we should include footballer, goalkeeper, goal, etc. but not the names of famous footballers. A thesaurus page is an organized list of words. Would you agree with that? Lmaltier 18:52, 15 December 2010 (UTC)Reply
To support what I propose, look at Roget's thesaurus. At the dressing entry, there is a footwear sub-entry, with words such as balletshoe, ski-boot or sabot (which are hyponyms); there is also a clothier sub-entry, with words such as tailor, tirewoman or milliner; there is a verb sub-entry with verbs such as dress, attire or accoutre and also, in a different sub-sub-entry: put on, slip into or carry; and many other sub-entries. There might also be words such as button or collar in the page. This is exactly what I propose: a thesaurus page contains words suggested by an idea, here the idea of dressing. It doesn't provide encyclopedic information about how to make clothing or the like, only words. Lmaltier 21:05, 15 December 2010 (UTC)Reply
For French, I recommend a classical thesaurus: the Dictionnaire des idées suggérées par les mots (Paul Rouaix). It is in the public domain, and may be reused at will. Lmaltier 07:56, 18 December 2010 (UTC)Reply

"Harry Potter objects"

I'm not sure that using franchise names as attributive nouns works well in the titles of their respective appendices.

Pray, native speakers of English, shouldn't Appendix:Harry Potter objects and Appendix:Harry Potter spells be renamed to Appendix:Objects in Harry Potter and Appendix:Spells in Harry Potter? --Daniel. 12:05, 10 December 2010 (UTC)Reply

I agree. - [The]DaveRoss 13:27, 12 December 2010 (UTC)Reply
I would agree, except that the former sounds like gossypibomas.​—msh210 (talk) 18:29, 13 December 2010 (UTC)Reply

"H. sapiens"

Is there any rule or guideline preventing the inclusion of abbreviations of taxonomic species, such as H. sapiens? If I remember correctly, at least some of them were deleted. --Daniel. 15:23, 11 December 2010 (UTC)Reply

What's the term for a compound part, that doesn't exist as a word

In atom- I tentatively used the heading "Noun", which was quickly changed to "Prefix". In Danish it is called "førsteled" ("first part (of a compound)"). "Atom-" is derived from atom, but the meaning in a compound is different. Compounds with "atom" must concern atoms, e.g. atomkerne ("nucleus of an atom"), whereas compounds with "atom-" concern atomic energy or nuclear processes.--Leo Laursen – (talk · contribs) 11:06, 12 December 2010 (UTC)

Er, what's wrong with prefix?--Prosfilaes 21:01, 12 December 2010 (UTC)Reply
Well it's not a prefix, that's all.--Leo Laursen – (talk · contribs) 20:46, 13 December 2010 (UTC)Reply
When I ask "what's wrong with prefix?", it means that it looks like a prefix to me. All answering "Well it's not a prefix, that's all." does is convince me you don't know what a prefix is.--Prosfilaes 22:24, 13 December 2010 (UTC)Reply
Maybe I don't know what a prefix is, 'cause I thought I explained above that it isn't a prefix.--Leo Laursen – (talk · contribs) 22:37, 13 December 2010 (UTC)Reply
Affix (also see wikipedia) seems to cover a multitude of sins. Pingku 03:11, 13 December 2010 (UTC)Reply
Combining form?​—msh210 (talk) 18:30, 13 December 2010 (UTC)Reply
Combining form sounds good, but all the examples on w:Combining form seem to be affixes.--Leo Laursen – (talk · contribs) 20:46, 13 December 2010 (UTC)

Never mind, I'll just stay away from those kinds of entries.--Leo Laursen – (talk · contribs) 20:46, 13 December 2010 (UTC)Reply

In my readings (only on the English language) I haven't come across a term that covers exactly the example you offer. The components of compounds are unbound roots/morphemes, but so are other things. DCDuring TALK 22:42, 13 December 2010 (UTC)Reply
I don't speak Danish, so I couldn't tell what atom- is. It looks like a prefix to me, though. Definitely not a noun, at least not with that "-" at the end. However, if a native speaker thinks of it as a noun, they shouldn't have created a separate entry at all. They could instead add the additional meanings in the entry for atom and explain the difference in a note. We do something similar with Ancient Greek prepositions which have special meanings when used in compounds. --flyax 06:52, 14 December 2010 (UTC)Reply
You're probably right that it was a mistake to create it, so I'll just go and delete it. It is hard to tell exactly how a native speaker thinks about "atom-", since it only exists in compounds. Perhaps someone will eventually describe that "atom" has a sense that does not exist on its own and is only used in compounds, generally translated with the adjective atomic or nuclear, and that the meaning differs from compounds with "atom" in the usual sense. Unfortunately there are quite a lot of these compound types in Danish, where the compound part is either obsolete or simply doesn't exist by itself.--Leo Laursen – (talk · contribs) 16:43, 14 December 2010 (UTC)
An archetypal example of a closely related phenomenon is the morpheme "cran-" as in cranberry. So archetypal that the term cranberry morpheme is used by some linguists. But "cran-" is not exactly the same. There is no unbound morpheme "cran" with or without a related meaning. DCDuring TALK 17:19, 14 December 2010 (UTC)Reply
That is exactly it, thanks. I guess the only solution is to duplicate the information in the etymology of all the compounds with the morpheme. Too bad, because it would have been nice to educate the readers about the two different types of compounds with atom.--Leo Laursen – (talk · contribs) 20:23, 14 December 2010 (UTC)Reply

It seems you should write the entry førsteled and find a good English translation for this word, as well as example sentences. Can you point to any Danish texts where a førsteled is described as being neither a compound nor a prefix? As far as I'm concerned, all words starting with atom- are compounds with atom, even if "atomic energy" is implied, e.g. atombomb. --LA2 22:44, 18 December 2010 (UTC)

Have a look at Den Danske Ordbog: "atom-" and Den Danske Ordbog: "førsteled".--Leo Laursen – (talk · contribs) 17:22, 23 December 2010 (UTC)

μεγάλος → μέγιστος → μέγιστου

Am I over-egging the declined form of a superlative by indicating the original positive form, as in the entry: μέγιστου ? —Saltmarshαπάντηση 15:11, 14 December 2010 (UTC)Reply

Can't all the forms just be indicated in the declension table of the positive form? -- Prince Kassad 15:38, 14 December 2010 (UTC)
With a few exceptions the answer is probably yes. Μέγιστος is an irregular superlative, but I think most cases are regular. Are any examples available? —Saltmarshαπάντηση 15:49, 14 December 2010 (UTC)
German already does this. See for example gut#German. It has always worked perfectly fine this way. -- Prince Kassad
Ah ha! that looks promising - I shall investigate. Thanks —Saltmarshαπάντηση 18:39, 14 December 2010 (UTC)Reply

Non-brand names of software products

What is the rule for things like vi (currently defined as "(software) The primary text editor for Unix")? Okay, it's not a brand name, but surely we don't want to include the name of every free bit of software out there? What covers it? Equinox 16:33, 14 December 2010 (UTC)

It still looks like a brand to me. I would expect such terms should be easier to so attest than most software product names, especially relative to the size of the actual user bases, probably because of the use of Unix in university environments. DCDuring TALK 17:12, 14 December 2010 (UTC)Reply
vi has been around for well over 30 years. It is hardly "a free bit of software": it evolved from old text-based 'teletype' editors and is probably the first visual editor many of us used. It's not 'commercial', and its many versions have evolved in the UNIX community. I think it should stay —Saltmarshαπάντηση 18:30, 14 December 2010 (UTC)
The famous, popular game Bubble Bobble has been around for 20 years or more, but I would be appalled to see it in a dictionary (an encyclopaedia is fine). My ideas about a dictionary seem to be at odds with everyone else's. Equinox 22:21, 14 December 2010 (UTC)Reply
vi would easily meet the brandname CFI. -- Prince Kassad 18:59, 14 December 2010 (UTC)Reply
Why try to exclude whole classes of words? Words are words, no matter what they mean. Lmaltier 19:17, 14 December 2010 (UTC)
I suppose my idea of a dictionary is the kind that I grew up with, and that's dead now. Oh well. Equinox 22:19, 14 December 2010 (UTC)Reply
Arguably it's a specific entity, which would have unwritten rules. I wonder if when we write those rules we're going to have to take into account exactly how "specific" an entity it is. DAVilla 08:18, 15 December 2010 (UTC)Reply

Abbreviations

I gather the "abbreviation" and "initialism" parts of speech are deprecated. If that is so, how do we express them? e.g. str says "Abbreviation" in its definition, but that's bad because it suggests that "a str" is "an abbreviation"; this isn't really part of the def. Where can it be put? (Or should we use the non-gloss definition? That sucks though.) Equinox 22:57, 14 December 2010 (UTC)

Huh? Since when were they deprecated? -- Prince Kassad 23:33, 14 December 2010 (UTC)Reply
If they aren't, why have I often seen them replaced by noun entries (if only to permit a plural)? Equinox 23:38, 14 December 2010 (UTC)Reply
I think it doesn't need and hasn't had a vote. Of course, someone could insist. I have just edited [[str]] in accord with my understanding of the intent of the latest discussion that I remember. Does anyone have any better ideas? DCDuring TALK 00:10, 15 December 2010 (UTC)Reply
I'm happy with what you did. It is true that "abbreviation" isn't a part of speech (any PoS can be abbreviated), but I wanted to see at least a template, so we wouldn't be defining things as literally "Abbreviation of X" as though the word referred to an abbreviation. Equinox 00:18, 15 December 2010 (UTC)Reply
I'm not entirely happy with the entry as an example of the emerging consensus, but it seems better than what we had and I can't think of anything better. There is a lot of repetition of the words "abbreviation of". I dread the inclusion of pronunciation sections. Is the programming use of "str" Translingual? DCDuring TALK 00:33, 15 December 2010 (UTC)Reply
It's translingual insofar as most major programming languages tend to be based on English. But I don't think language-specific keywords are part of our remit, so really we're probably trying to define str as a programmers' abbreviation that would be used in a sentence or something (dare I say inlined?). The fact that one or two languages have a Str command, or that the natural abbreviation for substring is substr, doesn't have much bearing on it. Equinox 00:41, 15 December 2010 (UTC)Reply
Gotcha. DCDuring TALK 00:54, 15 December 2010 (UTC)Reply

I agree that the "abbreviation" and "initialism" parts of speech should be deprecated. Abbreviation of should not be in the definition, but in the etymology section. Lmaltier 06:16, 15 December 2010 (UTC)Reply

A long-standing practice in Wiktionary has been to use "abbreviation" and "initialism" as part-of-speech headings. I do not see that this has been deprecated, and by whom. I am not so sure that noun initialisms in Czech are best classified as nouns, given they often lack clear gender and inflection. A vote could seem needlessly formal, but right now I have no idea how many people actually support the deprecation of "abbreviation" and "initialism" as part-of-speech headings. --Dan Polansky 07:39, 15 December 2010 (UTC)Reply

This issue has long been at least somewhat controversial, and the policy handed down with a bit of opposition that was strongarmed in my view. We may have even had a vote at one point, or at least the proposal. Can't find it in the timeline though, maybe because there wasn't enough consensus at the time to address an overturn. There have been a few ideas like this that have been rejected in the past, and which I'm surprised to see in place today, for instance italicizing certain definitions, those that behave as usage notes moreso than synonyms. DAVilla 08:13, 15 December 2010 (UTC)Reply

I have removed the deprecation tag from {{acronym}} and {{abbreviation}}; {{initialism}} is protected, so I cannot do it. I see no clear consensus for the deprecation. In the previous BP discussion in which the deprecation of the templates was proposed, EncycloPetey had some misgivings about the proposal: #Re: Classification of abbreviations, initialisms and acronyms, November 2010 --Dan Polansky 07:53, 15 December 2010 (UTC)Reply

I promise you it goes back much further than that, but anyway I suggest we have a civilized conversation about it and finally confirm it one way or the other. DAVilla 08:16, 15 December 2010 (UTC)Reply
Oh the templates should be deprecated - we don't use them in headers because they break section linking. Instead, you should use plain ===Abbreviation=== or ===Initialism===. -- Prince Kassad 09:30, 15 December 2010 (UTC)Reply
Why don't you create a poll or a vote? Using these templates is a long-standing practice. The claim "we don't use them in headers" is obviously wrong: {{acronym}} is used in more than 500 pages; {{initialism}} is used in more than 4000 pages; {{abbreviation}} is used in more than 3500 pages. --Dan Polansky 09:43, 15 December 2010 (UTC)Reply
But it already says in WT:ELE that we use only plain-text headers! -- Prince Kassad 09:45, 15 December 2010 (UTC)Reply
Can you be specific? What sentence in WT:ELE does say that? Was there a vote (probably not) or a BP discussion (more likely) leading to that sentence of WT:ELE? --Dan Polansky 10:17, 15 December 2010 (UTC)Reply
As Lmaltier will know, these sorts of headers were deprecated years ago on the French Wiktionary. From experience, a small pocket turns out to be very tough to deal with, like LTNS meaning long time no see. Long time no see is a phrase, but having LTNS under the header ===Phrase=== seems a bit ridiculous. Anything else usually turns out ok. Oh, by the way, sometimes I just write [[Category:English initialisms]] instead of putting it in the definition. Mglovesfun (talk) 20:32, 16 December 2010 (UTC)
Fwiw, I agree these templates should not be used in headers (though I use them in etymology sections, as they categorize). However, I've never seen consensus to stop using them.​—msh210 (talk) 15:33, 19 December 2010 (UTC)Reply
I think that there can be a consensus about several facts: that SQL, PHP or APL are as much nouns as Fortran or Python, that initialism is something about the etymology of these nouns, etc. Discussions and votes would benefit from a list of short arguments (pros and cons), a list everybody could add to until it is considered as complete. A vote can be useful only after nobody finds anything to add to this list of arguments. In many cases, this is the only practical way to get a consensus. Lmaltier 16:45, 19 December 2010 (UTC)Reply

Wiktionary:New Year's Competition 2011

Announcing. It hasn't yet opened; feel free to edit it for a while (and start it whenever it's deemed appropriate).​—msh210 (talk) 21:00, 15 December 2010 (UTC)Reply

I was thinking of something very similar to this. It's also written very technically, just as I would write it. Can someone make it more fun? Basically, find a word that's missing a lot of senses. The more senses missing, the bigger the find! DAVilla 15:48, 21 December 2010 (UTC)Reply
I'm not such fun, I'm afraid, and no one else has rewritten it. Anyway, it's now begun.​—msh210 (talk) 04:11, 23 December 2010 (UTC)Reply

Category:Ligurian language

There's a problem with this category.

There are two languages called "Ligurian". One is the Romance dialect, represented by the code {{lij}} and also this category. But there's a second Ligurian language, an ancient language spoken around that area and represented by the code {{xlg}}.

The problem is, how do we differentiate between these two? -- Prince Kassad 20:28, 16 December 2010 (UTC)Reply

Ligurian and Ancient Ligurian, maybe? —CodeCat 21:40, 16 December 2010 (UTC)Reply
Maybe. We have such a pair already: Category:Macedonian language and Category:Ancient Macedonian language. -- Prince Kassad 21:31, 17 December 2010 (UTC)Reply

Luckas-bot for bot status

Hi all! I propose to the English Wiktionary community that my bot, Luckas-bot, get a bot flag. You can ask questions and give your opinion here. --Luckas Blade 14:28, 17 December 2010 (UTC)

We have a bot already, ChuispastonBot (talk · contribs), but a bureaucrat needs to grant it status because its vote passed some two weeks ago. —Internoob (DiscCont) 22:13, 17 December 2010 (UTC)

Google Books Ngram Viewer at Google Labs

This link offers a new tool for examining trends in relative frequency of usage of words in a subset of Google's scans of books. This article from the NY Times provides some background. There are datasets available for free download. DCDuring TALK 15:14, 17 December 2010 (UTC)Reply

Addictive! Now I can't do anything else.--Makaokalani 17:26, 17 December 2010 (UTC)Reply

CFI simplification

WT:CFI#Constructed languages currently has four separate lists: those approved by consensus, those not approved, those with no consensus, those rejected by consensus. This seems needlessly complicated as by current interpretations the latter three are equivalent (all not allowed in main namespace). Can we simply state that constructed languages aren't allowed by default and just list the exceptions? I think we should also remove references to ISO codes as our reasons for including a language don't really rely on that feature. What do others think? --Bequw τ 02:03, 20 December 2010 (UTC)Reply

At the very least, we should make a distinction between those languages which simply have no approval and those which have been explicitly rejected via a VOTE. Still, that renders the middle two redundant. -- Prince Kassad 15:18, 20 December 2010 (UTC)Reply
I would keep these separate as you suggest. It's strange that consensus is required before a word can be included. Usually consensus is required to kick a word out. I guess in this case the consensus is presumed to allow only those words in approved languages, but I wonder what would happen if there were an RFD for a term in Romanova or Romanica. I suspect there are people here who wouldn't let it have its day in RFD.
Keep them separate because at least we would remember that there is uncertainty and the rule is subject to change. Mention ISO codes parenthetically only for the "no consensus" group, but no need to list them all. Effectively you could delete the third group of the four. I don't see why it's necessary to list a lot of artificial languages that don't have ISO codes and in some cases are even red and undefined. DAVilla 15:35, 21 December 2010 (UTC)Reply
Agreed; most of the red undefined languages are completely obscure, and I don't believe Orcish actually refers to any specific language at all.--Prosfilaes 20:14, 22 December 2010 (UTC)Reply
On the other hand, if we read CFI as to say that these two middle groups aren't prohibited, then certainly it is useful to list them all. But I doubt this is the way things will lean in the end. None of the undefined terms have an entry in Wikipedia. That is, Wikipedia doesn't think them noteworthy enough to even mention their existence. DAVilla 23:59, 23 December 2010 (UTC)Reply

I think there should be clear and simple criteria for the inclusion of languages, whether constructed or not. We have defined such criteria on fr.wiktionary:

  • all languages with an ISO 639 code
  • all languages with a Wikimedia code
  • all languages already used as their mother tongue by some people
  • all languages already learned in some schools
  • all languages with a literature
  • all languages with their own description page in at least 2 of the main 10 Wikipedias (the existence of these pages not being contested)

Any language meeting at least one of these criteria is automatically accepted. Other ones require a vote. Lmaltier 18:59, 20 December 2010 (UTC)Reply
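The rule described above (any one criterion suffices, otherwise a vote is needed) could be sketched roughly as follows. This is only an illustration of the logic as stated in the comment, not an actual fr.wiktionary tool; the `Language` class, its field names and the `inclusion_status` function are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Language:
    """Hypothetical record of the facts the six criteria ask about."""
    name: str
    has_iso_639_code: bool = False
    has_wikimedia_code: bool = False
    has_native_speakers: bool = False
    taught_in_schools: bool = False
    has_literature: bool = False
    # Number of uncontested description pages among the 10 main Wikipedias.
    major_wikipedia_pages: int = 0


def inclusion_status(lang: Language) -> str:
    """Return 'accepted' if at least one criterion holds, else 'needs vote'."""
    criteria = [
        lang.has_iso_639_code,
        lang.has_wikimedia_code,
        lang.has_native_speakers,
        lang.taught_in_schools,
        lang.has_literature,
        lang.major_wikipedia_pages >= 2,
    ]
    return "accepted" if any(criteria) else "needs vote"


# Made-up example languages, purely for illustration.
print(inclusion_status(Language("Examplish", has_iso_639_code=True)))
print(inclusion_status(Language("Obscurian", major_wikipedia_pages=1)))
```

Under this reading, a language with no ISO code could still be accepted automatically through any of the other five criteria, which is the breadth the proposal is aiming for.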

We had LFN terms, a constructed language which was among the "no explicit approval" group. Recently, all entries in this language have been removed as "not approved for inclusion", which in my opinion is a misunderstanding of CFI. -- Prince Kassad 15:41, 21 December 2010 (UTC)Reply
Lingua Franca Nova has got its own ISO 639-3 code (lfn): it would be automatically approved for inclusion if the above rules are adopted. In this case, and in almost all cases, adopting these rules would make things much simpler, and many controversies could be avoided (note that accepting a language only means that sections for this language are allowed, not that they are encouraged). Lmaltier 06:51, 22 December 2010 (UTC)
Essentially what it comes down to is, do we need consensus to keep or do we need consensus to delete? Without a proper vote on the matter, we must assume that consensus is needed to delete, and the removal of Lingua Franca Nova is inappropriate. I'm starting a vote to see if this can be considered appropriate instead. DAVilla 23:13, 23 December 2010 (UTC)
I find that overbroad and redundant. Maybe all languages with an ISO 639 code, though that overturns the no Klingon rule. But ISO 639-3 by design covers everything; we shouldn't create a bunch of other rules to cover stuff that ISO 639-3 should already cover. And I don't see your rules as helping; many are too easily gameable or subject to interpretation. ("with a literature", for example.)--Prosfilaes 09:10, 22 December 2010 (UTC)Reply
ISO 639 codes are very useful for saving time, however I don't think we should live and die by them. Wiktionary users have to come first. RE: LFN, I had a similar feeling that 'not approved' doesn't necessarily mean 'not allowable'. I don't think deleting them was wrong so much as just one way to interpret that line in CFI. Therefore, if someone restored all the LFN stuff they'd be (IMO) in the same position - not wrong, just interpreting that line in CFI a different way. Mglovesfun (talk) 16:34, 22 December 2010 (UTC)Reply
I don't think that we should live and die by them, but I think that any exceptions are worth discussing. Any language that could seriously claim to have a literature is either listed in ISO 639-3 or was considered a dialect of a language already in ISO 639-3 and hence is worth a discussion. I'm less against making exceptions, and more against automatically making exceptions.--Prosfilaes 20:08, 22 December 2010 (UTC)Reply
The idea of the above criteria was to be very broad (remember: all words, all languages), while preventing inclusion of "languages" nobody has ever heard of except their creators. Forbidding the inclusion of words for languages recognized as such by international organizations, or when the language clearly exists, is against the fundamental principle of the project, will always seem arbitrary to many readers, and will always cause controversies. Yes, Wiktionary users have to come first. For users, it's better to allow sections for all languages, even when some sections are not used much, than to frustrate users looking for forbidden sections. Lmaltier 17:50, 22 December 2010 (UTC)

Old French archaic forms (for example)

It's occurred to me there is a difference between obsolete and archaic. To put 'obsolete' for a word in any dead language would be silly - but you can have archaic forms. Pro (recently created) could IMO be described as an archaic form of por, but not as an 'obsolete' form of it. Would give a Middle English example, just I don't have one. Mglovesfun (talk) 13:07, 20 December 2010 (UTC)Reply

I had always assumed that neither tag made sense for dead languages. As applied in English (and presumably other living languages), AFAIK "archaic" means something like 'not current, but intelligible at present (in its context)' and "obsolete" 'not current, not intelligible at present (in its context)'. For purposes of dead languages is "at present" appropriate? Does it mean at present from the point of view of a speaker of Modern French? Does it mean from the point of view of the end of the Old French period? What does fr.wikt do? How does Robert show such words?
Widsith's work in some English entries providing a period during which a word had a given sense is a major improvement over the use of these tags. It seems to be the highest standard that we have been able to achieve and the highest our users would be likely to appreciate. DCDuring TALK 15:13, 20 December 2010 (UTC)Reply
Couldn't archaic in this context just mean 'very old', that is older than other attested forms? Mglovesfun (talk) 15:20, 20 December 2010 (UTC)Reply
I don't think that's what it means in dictionaries. Have you seen Widsith's use of {{defdate}}? [[bead]] is an entry that has it in use. I don't know whether the online Robert always shows first use, nor how good its coverage of Old and Middle French is, let alone Anglo-Norman et al, but they seem to have some indication of an early use of a given sense. DCDuring TALK 19:04, 20 December 2010 (UTC)Reply
To me "archaic" for a modern language means that it's now an archaism, but for a historical language that doesn't make sense. So if I saw it in a historical language entry, I think I would take it to mean that, over the entire course of the language's history, it was predominantly an archaism. For example, if a Middle English word were tagged as "archaic", I would imagine that (1) a Middle English speaker by 1150 or so would already have considered it archaic and (2) it remained in use until at least 1400 or so. Or something like that.
But historical linguists seem to use "archaic" a bit differently; they seem to use it to mean that a form is an exceptional holdover from an earlier form of the language, without necessarily considering it an "archaism" in the sense that we mean. For example, the second component of Latin (deprecated template usage) pater familias is often said to be the "archaic" genitive of (deprecated template usage) familia, because by the time of Classical Latin the normal genitive was (deprecated template usage) familiae. I don't think that means that (deprecated template usage) familias was otherwise used as an archaism in Classical Latin (though I can't claim to be sure about that), only that (deprecated template usage) familias is an Old Latin form that persisted in this fixed expression, at least in legal contexts. And here Don Ringe writes of the reconstructed non-Anatolian Indo-European word for "wheel" that its "pattern of derivation ( [] ) is unique (archaic?)", which obviously isn't meant to suggest that speakers of the ancestor of the non-Anatolian Indo-European languages might have thought of their own word for "wheel" as archaic.
And you, Mglovesfun (talk · contribs), seem to interpret it differently from either of these ideas. (I think? Are you saying that all words in the Oaths of Strasbourg, say, would be "archaic"? Or only the ones that aren't also attested in much later Old French?)
So all told, I think we might be best off avoiding this word entirely for dead languages, and using more verbose context tags such as "rare after circa 900".
RuakhTALK 22:47, 2 January 2011 (UTC)

[Poll] American English, use /ɒ:/ instead of /ɔ/

Hello,

I think we should stop using /ɔ/ for words like law, dog, and bought in American English without the caught-cot merger. The reason is that nowadays it's uncommon in American English, and it's used only in some parts of New England.

If British English law is /lɔ:/, then how can American English law be /lɔ/? The difference is very easy to hear and even to see with a sound editor, because the vowel used in American English is not as "closed" as /ɔ/; it's much more open, because in reality it's something closer to /lɒ:/. It's not just a difference in length.

Also, if you listen to how the British say dog and then how (non caught-cot merged) Americans say it, it will be clear that they're using pretty much the same vowel (Americans perhaps with an extra schwa at the end), but the American one is longer. So BE /dɒg/, AmE /dɒ:g/, but certainly not /dɔg/.

Examples of the use of this vowel can be seen in dictionaries like Longman's Dictionary of Contemporary English. Other major American dictionaries either just use /ɑ/ or their own pronunciation system.

The only reason people are still using /ɔ/ is they're used to it, but the fact is, that is not the standard vowel found in American English today, and I don't see why we should keep using it. We are using an International alphabet, and it is very important to use the correct sound, because, if people go over to Wikipedia and hear what /ɔ/ sounds like, they will get a wrong idea. So, do you support using /ɒ:/ instead of /ɔ/ (except for before r) in American English?--AmeGOD 12:06, 21 December 2010 (UTC)Reply

You say "only in some parts of New England", but it's in fact used in NYC also, so perhaps I'm biased: I say ɔ and hear ɑ (as I live in a cot-is-caught area), so don't know of this ɒ use at all that you say is so common. Is it really?​—msh210 (talk) 17:58, 21 December 2010 (UTC)Reply
The New York dialect is kind of unique, because the vowel can be /ɔə/ or /oə/ or even something closer to /ʊə/. But usually in American English, the vowel in words like awe is different from the one the British use. It's easy to hear. And the vowel in words like dog is the same as the ones the British use, but longer. Under the current system, American "law" should sound exactly like British "law," because they both use the same vowel, /ɔ/, albeit in American English it is shorter. And the vowel in American "dog" should be more closed than British "dog." This is clearly not the case.--AmeGOD 19:50, 21 December 2010 (UTC)Reply
Again, the precise vowel that is transcribed as "ɔ" may vary. In a few speakers, particularly in a few parts of New England, it may approach "ɒ", but many of these speakers are cot-caught merged. So they would pronounce words like "not" and "lot" with the "ɒ" vowel as well. And in many cot-caught merged Canadian dialects, the vowel that is merged into is "ɒ" as well, so in fact many cot-caught merged speakers do pronounce "caught" "bought" and "sought" with a "ɒ".
Yes, the vowel a non cot-caught merged American would use for "awe" wouldn't usually sound identical to what a British speaker would use, but "ɔ" is the closest and most accurate approximation, not "ɒ". And it isn't just "some parts of New England"; that's a highly inaccurate description, especially considering how many cot-caught merged speakers there are there, especially in Northern New England. Many parts of the Midwest and South also preserve the contrast. If you head to Michigan or Wisconsin, you will hear a clear "ɔ" for "bought" and "taught", just as you would if you headed to Philly or New York.--Dezzie 14:31, 22 December 2010 (UTC)
You aren't confusing /ɒ:/ (Open back rounded vowel) with /ɑ/ (Open back unrounded vowel) are you? (they look almost the same) I'm not talking about the caught-cot merger here, I'm talking about the vowel that non merged Americans use, which is in fact more similar to /ɒ:/ (a long version of the vowel in British "hot") than to /ɔ/ (vowel in British "law")--AmeGOD 14:49, 22 December 2010 (UTC)Reply
No, I'm not. Some speakers with the cot-caught merger merged the vowel into /ɒ/ rather than /ɑ/. A few cot-caught merged accents in New England and Canada will pronounce both cot and caught with a /ɒ/. But the closest approximation of the vowel used by non cot-caught merged speakers for caught is /ɔ/, and is transcribed as such. And it is not "Northeastern", again, there are speakers throughout the Midwest and the South who keep the /ɑ/ and /ɔ/ in cot and caught, or bought and bot distinct. Here is a study done on the cot-caught merger in Ohio. http://www.ohio.edu/linguistics/workingpapers/2008/flanigan_2008.pdf --Dezzie 15:30, 22 December 2010 (UTC)Reply
You are talking about a different phenomenon, which is using the short /ɒ/ in some sort of caught-cot merger. Again, I'm not arguing that Americans always use /ɒ:/, I'm saying that's the vowel they use, not /ɔ/. If you want, here are some quotes from an old phonology/English forum and website:
  • "The thing here, though, is that the traditional transcription of /ɔ(ː)/ is rather inaccurate in the case of GA, which really has the more open /ɒ(ː)/; speakers of GA-like dialects may very well perceive actual [ɔ(ː)] as being closer to /o(ʊ̯)/ rather than perceiving it as being the same as their actual /ɒ(ː)/. The transcription of /ɔː/ is more appropriate for Received Pronunciation, which has a significantly higher vowel for such than its counterpart in GA." [1]
  • In American transcriptions, ɔ: is often written as ɒ: (e.g. law = lɒ:), unless it is followed by r, in which case it remains an ɔ:. [2]

These are just two examples; you can find many more. But again, the difference between the vowel in British and American "law" is crystal clear. --AmeGOD 15:49, 22 December 2010 (UTC)

How about a compromise?
  • (US) /bɔ(ː)t/
  • (cot-caught) /bɑ(ː)t/
or
  • (US) /bɔ(ː)t/
  • (cot-caught), (Canada) /bɑ(ː)t/--Dezzie 16:12, 22 December 2010 (UTC)
Fine, but why not just
  • (North American) /bɑ(ː)t/, (US) often /bɔ(ː)t/ ?

or

  • (North American), (cot-caught) /bɑ(ː)t/, (US) often /bɔ(ː)t/ ?

After all, Wikipedia says: "The merger occurs in some accents of Scottish English and to some extent in Mid Ulster English but is best known as a phenomenon of many varieties of North American English." And it looks better.--AmeGOD 16:57, 22 December 2010 (UTC)Reply

Again, many North Irish and Scottish accents are cot-caught merged as well. The cot-caught merger spans both countries and continents, and should be considered its own separate phenomenon, and not "North American" necessarily. I thought my suggestion was fair. How about:
  • (US) /bɔ(ː)t/
  • (cot-caught), (US), (Canada) /bɑ(ː)t/--Dezzie 17:27, 22 December 2010 (UTC)
My only beef with that is that considering how widespread the merger is in the US, writing it the way you suggest may make it look like an idiolect there, but since adding the /bɑ(ː)t/ after (US) would be redundant, I agree with your version.--AmeGOD 17:36, 22 December 2010 (UTC)
A typical example of our current practice would be dot (permanent link) where we use ɒ for British English and ɑ for American English. I'm not condoning this, just saying this is what we do. Mglovesfun (talk) 15:09, 22 December 2010 (UTC)Reply
No, words like "dot" or "hot" aren't in the same category as "bought", "taught" or "caught". Almost all Americans would pronounce "dot" with an "ɑ", save for a few speakers with the Northern Cities Vowel Shift who might front it to "a".--Dezzie 15:30, 22 December 2010 (UTC)Reply
Actually, fwiw, I'm from NYC, not, I think, considered subject to the NCVS, but pronounce dot with an a also. See [1].​—msh210 (talk) 01:35, 23 December 2010 (UTC)Reply

[Poll] Caught-cot merger in American English

I think we should stop considering the caught-cot merger an isolated phenomenon, and instead treat it as an alternative/secondary pronunciation.

According to a 1996 telephone survey conducted by Professor Labov, about 40% of Americans had the merger. That is already very significant, and I think more than enough for a secondary pronunciation. Moreover, in a 2006 NPR interview Mr. Labov said: "Half of this country has a merger of the word classes, cot, caught, don, dawn, hock, hawk...," which is very plausible considering that mergers keep spreading and that 10 years is plenty of time.

Besides, there are words like blog, log, gone, cross etc. that can be said with an /ɑ/ even by those who lack the merger. The distinction is fading away.

So you can't really call it an idiolect. When an American has this merger, shouldn't this pronunciation be included as a secondary pronunciation?

What I mean is we should write, for instance, bought as

  • (US) /bɒ:t/, /bɑt/

or

  • (US) /bɒ:t/, also (US), (Canadian) /bɑt/

or

  • (North American) /bɑt/, (US) often /bɒ:t/

but not

  • (US) /bɒ:t/
  • (cot-caught merger) /bɑt/


Do you agree?--AmeGOD 12:24, 21 December 2010 (UTC)

(Comment for other readers: The (IMO odd) ɒ in the above post is used per the OP's own opinion as expressed in the preceding section that it be used in pronunciations instead of ɔ and may for the purposes of this discussion be read as if it were an ɔ. If you want to discuss whether to use the latter or ɒ, see the preceding section.—msh210 (talk) 17:52, 21 December 2010 (UTC))
If it were purely an American phenomenon, I would agree to label it under "US". But since it is standard in Canadian English as well, along with many dialects of Scottish and Irish English, it should be labelled as the separate phenomenon it is, not just existing in the US, and not just existing in North America, but occurring trans-continentally. Even if 40-50% make the merger in America, labelling it the way you are suggesting presents the cot-caught merger as exclusively American.--Dezzie 14:40, 22 December 2010 (UTC)Reply
I understand what you mean, how about my third option then? (*(North American) /bɑt/, (US) often /bɒ:t/)--AmeGOD 14:53, 22 December 2010 (UTC)
I can't hear the difference very well which makes it both fascinating to me and also more difficult to follow the conversation. I won't give my input except to say that I'm glad we're not considering using parentheses here. I really hate the use of parentheses to indicate that it's a different dialect. That is not at all obvious and more easily interpreted as optional in the same way that inflection in French is unimportant. The worst is the representation of the rhoticized ɚ as ə(ɹ). Am I crazy in thinking there's a difference between ɚ and əɹ (speaking to those who know something other than English here)? Worse, this affects the rhyming dictionary, resulting in words that must rhyme in any dialect, where we should list exceptional words that only rhyme in certain dialects. Different pronunciations should be listed separately. We're not paper and we don't need to cut corners. DAVilla 22:58, 23 December 2010 (UTC)
Yes I agree with this idea. Ƿidsiþ 14:11, 28 December 2010 (UTC)

Company names

Per WT:RFV#Wikimedia, would anyone object to the extension of WT:BRAND to include company names as well? There is already some overlap. For instance Boeing, Nike, Pepsi, Sony, and Toyota can refer to either the company or its respective product. This would allow a definition line for the company and, therefore, also entries like Verizon and Wal-Mart which do not have a product identified with them. Anyways these are usually just nicknames in fact, and we can explicitly state that Inc., Ltd. etc. are not to be included in the entry title. DAVilla 06:42, 24 December 2010 (UTC)Reply

Do we want to have any limits whatsoever beyond attestation? Do we want names that are attestable only in combination with generic terms, eg, Mount Vernon Fire Department? IBM Americas? Do we want any multi-word names (ie, open compounds)? DCDuring TALK 09:13, 24 December 2010 (UTC)Reply
I don't, and I'm not opposed to tweaking the requirements a little. DAVilla 16:54, 24 December 2010 (UTC)
As I think on it, I wonder if I would object to any company name (or nickname) attestable from our currently accepted sources that was one word or hyphenated. Though I could imagine some multi-word (open compound) company names that would be acceptable, I think such names need some criteria that prevent the waste of time that most of them will represent. If we can't exclude most of them, I would be inclined to exclude company names unless they have a special, iconic meaning beyond their proper noun meaning. Eg, McDonald's, Wal-Mart, Macy's, Gimbel's, Neiman Marcus, Sears, Harrods, General Motors. I think we might want to make clear whether, when we refer to trademark, we mean to include service marks, as Verizon and Wal-Mart would be if they were not trademarks (as I would bet they are). I understand that service marks are easier to obtain and may include the names of performing groups, ie, bands.
I see a few simple approaches to excluding unwanted names, such as:
  1. Only attestable company names with a clear iconic meaning not referring to the company/brand/trademark/service mark.
  2. Only attestable one-part (or hyphenated compound) company names.
  3. Any attestable trademark or former trademark (or service mark) could be automatically included, possibly whether or not it is a one-part name.
The last appeals to me most, as the combination of attestability and registration implies that there has been some level of use in advertising that inevitably forces a term into the lexicon. DCDuring TALK 18:55, 24 December 2010 (UTC)Reply
We require that for trademarks the source not indicate that the term is a trademark, e.g. no "TM". We could do the same for service marks, of course, and for company names require that it not indicate explicitly that the entity is incorporated, for instance no "Ltd." or "Inc." Pretty much everything else could be the same, such as the requirement that it not be clear what type of company is referred to, or that additional knowledge about the company is required to understand the author's intent. I couldn't imagine that Mount Vernon Fire Department or IBM Americas could be cited in this way, but to be sure there could be an extension of the requirement for idiomaticity that reflects location or subsidiaries, but does not necessarily prohibit compounds. I guess the test case would be International Business Machines, which I would allow given the proper citation. DAVilla 09:35, 26 December 2010 (UTC)Reply
I feel there is no problem for inclusion, provided that these names can be considered as words. It's probably clear to everybody that Mount Vernon Fire Department or International Business Machines are names, but not words. But IBM or Wal-Mart can be considered as words (Wal Mart would probably have to be considered as a word, too, despite the space; I agree that the limit is not always clear, and clarifying it would require some work; linguistic interest should be a criterion). Therefore, including them is useful (for a definition, but also pronunciation, etymology, derived words such as IBMer, etc.). I feel this is the same difference as between Confucius or New York (acceptable, because a word) and Charles Darwin, composed of two words, and without any linguistic interest.
Nonetheless, as anybody can create a company with any name, or a trademark with any name, I would require a minimum number (e.g. 10?) of independent citations not originating from the company, nor from its staff, nor from advertisement, etc. Lmaltier 21:23, 26 December 2010 (UTC)Reply
I oppose an extension of WT:BRAND to include company names. I support removing WT:BRAND from CFI, given that I have seen no brand that unequivocally meets WT:BRAND, while there are some brands that ambiguously meet WT:BRAND. WT:BRAND consists of seven requirements, some of them complex; the requirements put a considerable cognitive load on anyone trying to demonstrate that a brand meets the requirement. OTOH some of the seven requirements seem okay and simple enough; removing the seventh requirement of WT:BRAND would do a lot to make WT:BRAND acceptable to me. (R7: "The text preceding and surrounding the citation must not identify the product to which the brand name applies, whether by stating explicitly or implicitly some feature or use of the product from which its type and purpose may be surmised, or some inherent quality that is necessary for an understanding of the author’s intent.") --Dan Polansky 11:28, 27 December 2010 (UTC)Reply
Yes, the criteria are much too complex. The criteria I propose above would be much simpler. Generally speaking, I'm against considering the sense of a word as a criterion (except for phrases, of course), because we accept all words. But it seems that it's impossible to do without stricter criteria for brands or companies. Lmaltier 17:43, 27 December 2010 (UTC)
What I'm understanding is that you don't like the current rules, but if that were fixed, you'd be willing to apply it to company names equally as to trademarks. I have some rudimentary ideas for simplifying CFI in general, but these are really long shots, and sneaking companies in under this existing rule is the only way I could imagine allowing company names in the near future. DAVilla 17:54, 28 December 2010 (UTC)Reply
That seems right: If WT:BRAND is simplified and made more inclusive, then regulating brand names and company names together could make sense. OTOH company names are names of specific entities, whereas brands are names used to refer to sets of specific entities (individual things).
Re "...sneaking companies in under this existing rule is the only way I could imagine allowing company names in the near future": I don't think that anything like that is needed to allow company names. As regards voting and consensus, company names are not forbidden from Wiktionary. It is only an unvoted-on part of WT:CFI that forbids them. --Dan Polansky 09:29, 29 December 2010 (UTC)Reply

Category:German cardinal numbers = Category:de:Cardinal numbers

217.224.181.81 10:33, 24 December 2010 (UTC)

Yes. The latter needs to be merged into the former. This is really a job for a bot, but mine seems to have broken. -- Prince Kassad 13:03, 24 December 2010 (UTC) (addendum: hmm it worked fine this time. It's done now.)
The regex needed should be so simple that even I can write it. Such as txt=txt.replace(/\[\[Category\:([a-zA-Z][a-zA-Z\ ]*)\:Cardinal numbers/g, "[[Category:{{subst:$1|l=}} cardinal numbers]]");. I suspect I've made a mistake somewhere, I usually do. Mglovesfun (talk) 19:02, 24 December 2010 (UTC)Reply
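A note on the one-liner above, offered as a sketch rather than a tested bot script: the pattern stops before the closing ]] while the replacement string adds its own, so the output would end in ]]]]. The version below matches the closing brackets, carries over an optional sort key, and allows hyphens in the language code. The function name renameCardinalCategories is made up for illustration, and the hyphen and sort-key handling are assumptions about what the category links look like; the {{subst:...|l=}} call is copied unchanged from the comment above.

```javascript
// Hypothetical helper (not an existing bot function): rewrites
//   [[Category:de:Cardinal numbers]]
// to
//   [[Category:{{subst:de|l=}} cardinal numbers]]
function renameCardinalCategories(txt) {
  return txt.replace(
    // $1 = language code (letters, spaces, hyphens; hyphens are an assumption)
    // $2 = optional "|sort key" (also an assumption), kept as-is
    // The closing ]] is part of the match, so the replacement does not double it.
    /\[\[Category:([a-zA-Z][a-zA-Z -]*):Cardinal numbers(\|[^\]]*)?\]\]/g,
    "[[Category:{{subst:$1|l=}} cardinal numbers$2]]"
  );
}
```

Run on [[Category:de:Cardinal numbers]], this gives [[Category:{{subst:de|l=}} cardinal numbers]] with a single pair of closing brackets; links that are already in the "German cardinal numbers" form are left alone.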

Category:Portuguese cardinal numerals --217.224.181.225 14:57, 26 December 2010 (UTC)

That is a long-standing issue. There is no agreement in the community on the usage of "number" versus "numeral". -- Prince Kassad 15:44, 26 December 2010 (UTC)

Categorizing people

Shouldn't Shakespeare and Hitler be members of a category for names of individual people? I suppose Category:People would be perfect for that, but it is crowded with seemingly random common nouns such as dendrophile, banana bender and maid. --Daniel. 14:39, 24 December 2010 (UTC)Reply

No. We aren't Wikipedia. We shouldn't categorize individual people together. -- Prince Kassad 14:46, 24 December 2010 (UTC)
We could do that, maybe with different naming: Category:Actual people or Category:Real people or Category:Types of people are my lame suggestions --Parttimer 14:57, 24 December 2010 (UTC)
Subcategory Individuals. DAVilla 16:55, 24 December 2010 (UTC)
I like DAVilla's suggestion of Category:Individuals. I personally consider it better than the alternatives Category:Actual people, Category:Real people or Category:Types of people. I have added Hitler and Shakespeare to it, but I suspect there are other entries that can be included there as well. Thank you. --Daniel. 23:45, 24 December 2010 (UTC)Reply
That looks good. I've added a couple more entries to Category:Individuals. What do we do about Category:Biblical characters though? ---> Tooironic 02:48, 25 December 2010 (UTC)Reply
Thanks for helping me to populate Category:Individuals. Apparently, it is synonymous with "Category:Historical people", so why not include Jesus and David in it as well? --Daniel. 03:42, 25 December 2010 (UTC)
  • On a sidenote, how do we categorise entries like Magellan, which define it as a surname but which mentions the famous person at the same time? Should we divide these into two different senses? ---> Tooironic 03:05, 25 December 2010 (UTC)Reply
    I have moved the etymological information of Magellan to the etymology section, not to another sense. On the other hand, naturally Washington has one sense for the famous person and another for the name. I would oppose a merged definition like "A surname of various people, including George Washington." --Daniel. 03:42, 25 December 2010 (UTC)Reply
From a linguistic point of view, Washington is a surname (and a place name). From this point of view, the only valuable distinction is between the original surname and the surname of people named after George Washington. Listing some famous people named Washington would be like listing some famous dogs at dog; this is an encyclopedic consideration. It's only worth a link to Wikipedia (and, possibly, a note stating that, when used alone, the word almost always refers to some individual, but this is not a different sense: even when referring to him, the sense still is a surname).
As for George Washington, I think this is never considered as a word, always as a name composed of two words : a first name + a surname. New York, Washington, Churchill or Berlin are considered as words, not George Washington nor Cheyenne Mountain Resort. Lmaltier 18:42, 25 December 2010 (UTC)Reply
A linguistic point of view of whom? --Daniel. 19:17, 25 December 2010 (UTC)
  • Oh my God Daniel. I can't believe you went ahead and created Charles Darwin and Charles Dickens... Making real-people definitions for surnames is one thing, but full names? Are you serious? ---> Tooironic 00:36, 26 December 2010 (UTC)Reply
    I am not advocating a distinction between "one-word name" versus "multiple-word name", where only the former can be included on Wiktionary. I simply prefer including aliases and pen names. Both "Dickens" and "Charles Dickens" are shorter versions of Charles John Huffam Dickens. --Daniel. 06:14, 26 December 2010 (UTC)Reply
    We must not forget that we include words, not names. But we must understand word in its linguistic sense, not its typographical sense (e.g. New York or presqu'île are one word in the linguistic sense, 2 words in the typographic sense). Charles Darwin is two words in both senses of the word word. To understand who Charles Darwin is, you should either consult Wikipedia, or consult Darwin, then click on the link to Wikipedia. Lmaltier 21:02, 26 December 2010 (UTC)Reply
    Lmaltier, if I may ask again essentially the same question, can you please elaborate what you understand by "linguistic sense"? The existing definitions of "linguistic", "nonlinguistic", "name" and "word" on Wiktionary don't seem to support your distinctions, but it may be simply due to the perceived poor coverage of some of them. Yes, articles of Wikipedia should be linked from entries of Wiktionary when applicable. We include names, such as "Daniel", as a subset of "words". Your personal criteria seem pretty broad if you support the inclusion of "Darwin" as the famous biologist, but the simultaneous exclusion of "Charles Darwin" seems contradictory. Why should we deny the possibility of searching for the latter but not the former? --Daniel. 22:17, 26 December 2010 (UTC)
    See word: A distinct unit of language (sounds in speech or written letters) with a particular meaning, composed of one or more morphemes, and also of one or more phonemes that determine its sound pattern. This is the linguistic sense. I contrast this sense with the typographic sense (word defined as a sequence of letters between separators: space, comma, apostrophe, etc., without any reference to the sense: for them, eg is one word, e.g. is composed of two words). And I don't support the inclusion of Darwin (the biologist) as a separate sense, not at all: the sense is the surname, the list of all people with this surname is not of linguistic interest, the interest is encyclopedic only (but a link to Wikipedia is necessary, of course). Lmaltier 17:37, 27 December 2010 (UTC)
    Ah, so you, like Kassad, prefer names of individual people completely absent from Wiktionary, in favor of only defining whether a word is a "given name", a "surname", etc. Thank you for sharing this point of view. --Daniel. 21:24, 27 December 2010 (UTC)Reply
    Not completely absent, but included only when they are considered as words (e.g. Confucius or Noah). Lmaltier 21:30, 27 December 2010 (UTC)Reply
  • As anyone who has been around here for a while cannot help but have noticed, there has never been a consensus for including the names of individual entities, except for demonyms, language names, certain toponyms, and certain others that have had an attributive-use rationale. By long-standing consensus, we also have long had entries for given names and surnames. Accordingly, this discussion of categorizing individuals can be taken as, at best, premature. Less generous interpretations would include "railroading", waste of time, and vandalism. As this is not the first time that consensus has been ignored, it is increasingly difficult to maintain the charitable view. If users do not like the consensus represented by our practice, said users can initiate a frank, explicit discussion of the main issue, rather than use such indirect methods as initiating (during a slow period, yet) a discussion that implicitly assumes a consensus on an issue known to be contentious. It does not take a paranoid imagination to suspect that the discussion of the side issue would be used to claim an implied consensus on the main issue.
  • Accordingly, I think all matters relating to individual entities are out of order until there is a clear consensus or a vote to include names of individual people or other individual entities not sanctioned by vote or long-standing usage here. DCDuring TALK 20:29, 27 December 2010 (UTC)Reply
    Has there ever been a discussion about the following question: should all words be included, whatever their meaning? In any case, the first sentence of CFI states that we welcome all words. The question should not be is this an individual entity? but is this a word?. (note that I think that some feel that individual entities should not be included (except those with an attributive use) only because this seems to be Webster's policy. But you cannot compare an Internet dictionary without space constraint and dealing with all words of all languages to a paper dictionary dedicated to English). Lmaltier 20:48, 27 December 2010 (UTC)Reply
    DCDuring, your last comment may be interpreted as disapproving the existence of Category:Individuals (among multiple other disapprovals), because it fits into "all matters relating to individual entities". Do you? Since we have two or more names of individuals, it seems natural to have a way to find them all at once.
    Perhaps I am underestimating Kassad's "We shouldn't categorize individual people together", but the most controversial subject that has been raised in this discussion is the distinction between one-word names and multiple-word names of famous people. Since this is the contentious issue, I don't see where consensus is implicitly or explicitly assumed. On the contrary, I, the person who is willing to discuss against it, know that "Darwin" is widely accepted here; and I am asking, why "Charles Darwin" is treated as unacceptable by most editors, especially including the ones who accept "Darwin" to refer to the same biologist? Similar multiple-word names have been created before, such as Alexander the Great, John the Baptist, Jesus Christ and Mao Zedong. Am I missing certain merits of other existing entries that are absent from "Charles Darwin"? If so, what are they? --Daniel. 21:24, 27 December 2010 (UTC)Reply
Is this a word? is a useful starting point, but says nothing of what the entry should contain, if it is valid. Smith is a word, but I wouldn't want every attestable meaning of it included. By that, I mean everyone who has been referred to as Smith three times in durably archived independent sources. Mglovesfun (talk) 21:34, 27 December 2010 (UTC)Reply
If Is this a word? is a useful starting point, Does it have a meaning? is a useful step two. By "meaning", I mean if the word implies some individual person, rather than a person out of a group of people of the same name. In the phrase "I believe in Jesus!", one assumes that the speaker is Christian, not that he believes in the words of (for example) his uncle named Jesus Johnson, unless the context clarifies it. --Daniel. 22:08, 27 December 2010 (UTC)Reply
For Smith, the sense is not a person, the sense is that it's a surname used in English (it would make sense to define a sense as the common name of all people sharing this name and with the same origin, but this solution would be difficult to manage; one line for each etymology is much more practical)
Yes, for Jesus, the person may be considered as the normal, original, sense of the word (at least in English). I don't object to a definition line in such cases. For Alexander the Great..., there is a linguistic interest too: this name is not composed of a given name + a surname, it's a name given to an individual person, just like Confucius; spaces don't change this fact, and this is a good reason for inclusion. Also note that:
  • while Charles Darwin can be easily retrieved through the link to Wikipedia, finding Alexander the Great from great is much less easy...
  • in the case of biologists, individual person names may be defined in a separate definition line because of the conventional sense used in scientific names (very often, using a surname in a scientific name refers to a well-defined person; if somebody else with this name describes some species, another name has to be used). But this is an exception. Lmaltier 06:50, 28 December 2010 (UTC)Reply
To say "For Alexander the Great..., there is a linguistic interest too: this name is not composed of a given name + a surname, it's a name given to an individual person, just like Confucius", your personal understanding of "surname" must be different from mine. "the Great" is a surname, especially one used by dozens of famous people.
If, by chance, you understand that a surname is the one given from the parents, then the inclusion of Allan Kardec (the de facto founder of Spiritism) on Wiktionary might be possibly supported by you, since that is a multiple-word pen name given by himself.
Whether or not Charles Darwin can be easily found on "Darwin" and point to Wikipedia doesn't matter, because [1] Wiktionary is a dictionary, not an index of Wikipedia articles (though the functions often overlap); and [2] finding encyclopedic information about a person by searching for him or her on Wiktionary, and following a link from the surname to Wikipedia is not the most natural, intuitive and functional choice.
I can find information about individuals on both projects because I'm smart and used to the system, not because the system is well done. There are only 86 individuals at Category:Individuals. The absence of most American presidents, divine writers, biblical characters and founders of religions gives the impression that our coverage is deficient, not that certain editors are convinced that Wikipedia is the best and only place for them. --Daniel. 12:00, 29 December 2010 (UTC)Reply

Style of "plural of..."

The entry dogs has these two definitions:

  1. {{plural of|dog}}
  2. {{third-person singular of|dog}}

Why does the former begin with a lowercase letter and the latter with a capital letter? I remember the existence of "Plural of" with an initial uppercase letter for years. The documentation supports my memories, by saying "By default, this template's output looks like a complete sentence, in that the word "plural" is capitalized at the beginning [...]". --Daniel. 00:05, 25 December 2010 (UTC)Reply

Yes, it should start in an uppercase letter. I have no idea why it doesn't. -- Prince Kassad 00:38, 25 December 2010 (UTC)Reply
I agree wholeheartedly, but, now that it doesn't, people may have relied on that fact and created entries that will look terrible if it's to be switched, like # Canines ({{plural of|dog}}). (Not likely, and not the biggest deal even if true, so I'm not sure how much we care about accounting for such a possibility.)​—msh210 (talk) 07:15, 26 December 2010 (UTC)Reply
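The capitalization question above can be illustrated with a minimal sketch. It is not the actual template code: the function name `plural_of` and the `capitalize` switch are hypothetical stand-ins for whatever the template does internally. It only shows why switching the default would affect entries that embed the template mid-line, as in msh210's "Canines" example.

```python
def plural_of(lemma: str, capitalize: bool = True) -> str:
    """Render a 'plural of' gloss for a lemma (hypothetical stand-in for the template).

    With capitalize=True the gloss reads like the start of a sentence;
    with capitalize=False it can be embedded in the middle of a line.
    """
    text = f"plural of {lemma}"
    if capitalize:
        # Uppercase only the first character, leaving the lemma untouched.
        text = text[0].upper() + text[1:]
    return text


# A definition line that consists only of the gloss.
standalone = plural_of("dog")

# A definition line that embeds the gloss after other text.
embedded_default = f"Canines ({plural_of('dog')})"
embedded_lowercase = f"Canines ({plural_of('dog', capitalize=False)})"
```

Under these assumptions, an entry written against the lowercase output would need an explicit lowercase switch once the default is capitalized.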

"Comics characters"

I'm pretty sure that all or most of the characters currently listed at Category:Fictional characters appear in comics, among other media.

With that in mind, isn't Category:Comics characters entirely, or mostly, redundant? Is there any particular reason to keep it? --Daniel. 09:04, 26 December 2010 (UTC)Reply

Foxy Loxy is not from a comic. The latter should be a subcategory of the former. All comic characters are fictional, but not vice versa. DAVilla 09:46, 26 December 2010 (UTC)Reply
There are various comic books (and other media) where famous real people appear; apparently, no fictional time traveller makes a trip to the Second World War without meeting Hitler in person. Whether it makes the dead Nazi guy a "nonfictional comics character" or a "fictional comics character who is a depiction of a real person" is a can of worms. Nonetheless, Category:Comics characters is a subcategory of Category:Fictional characters. Foxy Loxy was not present in either of these two categories before I added it now to the latter (I don't care enough about the former to actively populate it). Apparently, it still would be more feasible and readable to list together the characters who don't appear in comics, if possible and desirable. Foxy Loxy appears in comics. --Daniel. 10:13, 26 December 2010 (UTC)Reply
Foxy Loxy appears in comics? Okay, my interpretation then would be characters whose origin is the graphic novel. This would exclude Foxy Loxy. Obviously, Hitler did not originate there either. Whatever this should be called, it's clearly a subcategory of Category:Fictional characters. DAVilla 04:51, 27 December 2010 (UTC)Reply
A category for characters with that origin would exclude Superman and Mickey Mouse, but include Astro Boy. I believe it would be better to organize them by cultural origins, imitating Category:Arabic fiction to possibly create Category:American fiction and Category:Japanese fiction, and perhaps subcategories for their respective characters. --Daniel. 10:39, 27 December 2010 (UTC)Reply
Superman, really? You're probably the most qualified here to address this issue then. DAVilla 17:38, 28 December 2010 (UTC)Reply
Well, thanks. Yes, The Reign of the Super-Man has more information on the origin of the character; specifically, it describes how a villain named Superman of a short story was developed into a hero of comics years after. (Which may be interpreted as the existence of two characters, one "nondeveloped villain" and one "developed hero", but I would oppose this interpretation, because Superman has undergone many stages of development, depending on the writer or the decade, including multiple times when he was the villain, among multiple other changeable characteristics.)
I created Category:Japanese fiction to contribute to this project of narrowing categories for works of fiction. --Daniel. 20:43, 28 December 2010 (UTC)Reply

"Star Wars derivations"

Should UTSL, Yoda, lightsaber, Jedi, Jedi mind trick, MTFBWY and other members of Category:Star Wars derivations be removed from this category and added into Category:Star Wars? The distinction between them is becoming more subjective with their new members, and apparently it isn't recognized or followed as standard practice by most editors anyway. --Daniel. 22:58, 26 December 2010 (UTC)Reply

UTSL is not part of Star Wars (which doesn't mention source code!) so calling it a derivation seems best. Equinox 09:52, 27 December 2010 (UTC)Reply
UTSL, Yoda, Chewbacca, lightsaber and all other members of these categories are related to Star Wars but also used outside this context, otherwise, according to WT:FICTION, they wouldn't or shouldn't be defined on Wiktionary.
Before this vote, Category:Star Wars was synonymous with Special:PrefixIndex/Appendix:Star Wars/; Category:Pokémon was synonymous with Special:PrefixIndex/Appendix:Pokémon/; and Category:Star Wars derivations was the one with entries instead of appendices.
Since we don't allow numerous appendices for each universe anymore, I believe we don't need two sets of categories for each universe anymore. --Daniel. 19:50, 28 December 2010 (UTC)Reply

Vote on renaming categories for inflection or headword templates

As a follow-up to three recent polls, I have created a vote:

If the results of the polls were unequivocal, a vote would seem unnecessary, but as the last poll showed no supermajority, a confirmation vote seems in order.

The vote starts on 29 December 2010 and is planned to last 14 days. --Dan Polansky 12:37, 27 December 2010 (UTC)Reply

It looks ok to me. —CodeCat 11:26, 28 December 2010 (UTC)Reply

Poll: Including individual people

I would like to ask your opinion in an informal poll on whether at least some individual people should be included in Wiktionary. Entries that currently do host sense lines for individual people include Shakespeare, Einstein, Socrates, Plato, and others. --Dan Polansky 20:41, 27 December 2010 (UTC)Reply

Some individual people should have dedicated sense lines in some entries.

  1. Support [The]DaveRoss 20:49, 27 December 2010 (UTC) I have always been in favor of including glosses for the names of important people with a link to the relevant Wikipedia page. For what it's worth I am in favor of both Dickens and Charles Dickens.
  2. Support But the real issue is what some means. I support the inclusion only when the name can be considered as a word. A few examples are Confucius, Noah, Molière, Stendhal or Charlemagne (in many cases, they are nicknames or pseudonyms, but not always). Dickens must be included, too, but as a surname, not as an individual person. I strongly oppose inclusion of names which are not words, and, therefore, have no linguistic interest, such as Charles Dickens (there might be exceptions, but only when there is some linguistic interest). We are not an encyclopedic dictionary, only a language dictionary. Lmaltier 21:19, 27 December 2010 (UTC)Reply
  3. Support - I support the inclusion for the following reasons. 1. English learners don't always know how to pronounce names, so it would be important to include individual names. 2. The translation section will lead to the FL entry of the same name. English names are inflected in some of the foreign languages such as Hungarian and the inflection is not always intuitive. --Panda10 21:51, 27 December 2010 (UTC)Reply
    I note that WP is including more translations/transliterations in its entries. I further note that Charles and Dickens yield pronunciation and transliteration/translation information for "Charles Dickens". Dickens also provides half of the pronunciation for "Benjamin Dickens", who lives in my city. Charles provides useful information for a large number of individuals, including "Prince Charles" and "Ray Charles". DCDuring TALK 00:51, 28 December 2010 (UTC)Reply
  4. Support Dan Polansky 08:43, 28 December 2010 (UTC); in particular, Shakespeare should include a dedicated sense line or definition line reading like "William Shakespeare, an English playwright and poet of the late sixteenth and early seventeenth centuries". A case less contentious than Shakespeare might be Confucius--"an influential Chinese philosopher who lived 551 BCE – 479 BCE" in contrast to mere Confucius--"A given name" or of the sort. --Dan Polansky 10:44, 28 December 2010 (UTC)Reply
  5. Support single word entries such as Dickens that, as well as being a surname, have a single line entry pointing to a Wikipedia entry for a famous person known by that name. SemperBlotto 11:45, 28 December 2010 (UTC)Reply
  6. Strong support including the full name where there is literary merit. DAVilla 17:42, 28 December 2010 (UTC)Reply
  7. I support very strict criteria for inclusion of (some?) single-word names modeled after and similar to our brand-names criteria.​—msh210 (talk) 17:46, 28 December 2010 (UTC)Reply
  8. Support Daniel. 20:10, 28 December 2010 (UTC) Among multiple other discussed names, I'd appreciate if Copernicus (the name of the astronomer known for heliocentrism) were defined here in the future. --Daniel. 20:10, 28 December 2010 (UTC)Reply
  9. Support single-word names of well-known historical, mythological and biblical characters, in some cases with an epithet. Shakespeare should be explained here, but I'd prefer a more logical definition like "A surname, most famously held by the English playwright w:William Shakespeare". Strongly oppose entries like Charles Dickens.--Makaokalani 15:47, 29 December 2010 (UTC)Reply
    In Portuguese, we translate William Shakespeare and William Tell into, respectively, William Shakespeare and Guilherme Tell. There is no Guilherme Shakespeare or William Tell in Portuguese, unless someone is, for whatever reason, playing with translations of famous Williams. So, there are differences between the treatment of names of individuals that may be mentioned in their respective entries. Another notable example is Charlie Chaplin, whose name was translated into Carlitos. --Daniel. 00:37, 30 December 2010 (UTC)Reply
  10. Support in rare cases. —RuakhTALK 03:19, 31 December 2010 (UTC)Reply
  11. Support. bd2412 T 00:44, 3 January 2011 (UTC)Reply
  12. Support Particularly for historical and literary figures. -- A-cai 02:47, 29 January 2011 (UTC)Reply

No individual person should have a dedicated sense line in any entry.

  1. Support DCDuring TALK 00:39, 28 December 2010 (UTC) No multipart names of individuals, no single-person sense lines for one-part names. We have a wonderful resource a link away, with very comprehensive coverage of persons with a given surname, for example, as well as of works bearing such names, etc., all on disambiguation pages of great scope.
    But, for some names (e.g. Charlemagne), the person is the sense. In such cases, the only possible definition is for the person. Lmaltier 06:31, 28 December 2010 (UTC)Reply
    All names have an original target. Almost all are not historically unique. Charlemagne is no different. --Bequw τ 07:51, 2 January 2011 (UTC)Reply
    Yes, Charlemagne is the name of a single person (it's a francization of Carolus Magnus, I think, and it's not used for anybody else). And it's the case of many other names / nicknames. Definitions should explain the actual sense, and the actual sense may be an individual. But I agree that these cases, although numerous, are exceptions. Your almost should lead to another choice in this poll. Lmaltier 21:23, 2 January 2011 (UTC)Reply
    (Actually, "Charlemagne" has been used by many people, such as Charlemagne Péralte.) --Yair rand (talk) 21:35, 2 January 2011 (UTC)Reply
    In this case, it's a (very rare) first name, it's not the same meaning at all. The emperor is the etymology for this first name, but the proper noun should get a definition. First names and surnames are not really proper nouns, they don't name a specific entity (on fr.wikt, we use 3 different parts of speech for first names, surnames and proper nouns, it's clearer). Lmaltier 22:10, 2 January 2011 (UTC)Reply
  2. Support When we can identify the original target of a name, that individual will go in the etymology section. That will often identify the famous individual, but that should not be a primary concern. --Bequw τ 07:51, 2 January 2011 (UTC)Reply

I am hesitant or don't care.

  1. Support I support inclusion of nicknames and perhaps famous surnames, but definitely not full names like Charles Dickens. — lexicógrafa | háblame 20:53, 27 December 2010 (UTC)Reply
    If you support inclusion of Charles Dickens in Dickens, that would count as "some individual people should have dedicated sense lines in some entries". --Dan Polansky 20:57, 27 December 2010 (UTC)Reply
  2. These vote options are very strange. I think there should not be separate senses for notable instances of specific senses. This opinion doesn't seem to fall under any of the options given. --Yair rand (talk) 06:47, 28 December 2010 (UTC)Reply
    At Yair Rand: It looks like you are saying that "No individual person should have a dedicated sense line in any entry"; if you are not saying that, then you must want to include at least one dedicated sense line for a person in some entry, right? But I do not wholly understand what you are saying, especially what you mean by "notable instances of specific senses". An example of what you mean could help clarify. --Dan Polansky 08:39, 28 December 2010 (UTC)Reply
    Senses which are just specific instances of names should be deleted. In certain situations there could be a word/name that doesn't refer to anyone other than a specific person, which might be kept. I don't know whether the sense should be formatted as a name sense, a description, or a Wikipedia link, but the redundancy is the reason for deletion, not that it only has a single referent. Words/names that have entered common usage in a language as the standard way of referring to an individual but not to any others should still have at least one sense. In practice, I doubt that this would make much difference, as most previously unused names of historical figures have probably entered common usage and are attestable as referring to people other than the original holder of the name (though cites would probably be difficult to find in some cases). --Yair rand (talk) 17:28, 28 December 2010 (UTC)Reply
    To be specific: Are you saying that "Confucius" should be defined merely as "given name" or the sort? If yes, and if your answer is the same for all given names, surnames and other names of people, then you probably agree with the second option. But if there is at least one term in which you want to see a sense line that specifically refers to a particular person and only to that person (as might be "Confucius" or "Charlemagne"), then you agree with the first option. By listing specific examples of inclusion and exclusion, you can make it rather clear what sort of inclusion you support. --Dan Polansky 10:40, 29 December 2010 (UTC)Reply
    "Confucius" should be defined simply as a given name, as should "Charlemagne". "Stendhal" should have one sense shown even if it only refers to one person. --Yair rand (talk) 11:23, 29 December 2010 (UTC)Reply
    Should "Stendhal" be defined like "given name" (or "pen name", "nickname", or of the sort) without referring to W:Marie-Henri Beyle? What sense line or definition line would you like to see at "Stendhal"? --Dan Polansky 12:06, 29 December 2010 (UTC)Reply
    Charlemagne is the name of a single person, the name has been invented for him, he is the sense of this word. It is impossible to define the word without stating which person it is. The same applies to Stendhal (and to Confucius too, unless I am mistaken, even if this name has also been used, later, as a normal given name). But, of course, we should not write articles about these persons, their biography, etc., only enough to understand who they are, i.e. what the word means. But, in Hitler, I would remove the 2nd sense, which is the same as the first sense. It might be changed to a gloss explaining that, used alone, the word almost always refers to Adolf Hitler. Lmaltier 12:31, 29 December 2010 (UTC)Reply
    Lmaltier has essentially said what I was going to say. Unless you exclude Charlemagne altogether, I don't see how you can define it without mentioning he was a Germanic emperor. Mglovesfun (talk) 12:38, 30 December 2010 (UTC)Reply

Of or pertaining to Shakespeare...

I would like to categorize adjectives of individual persons together, such as Kinseyan, Dickensian, Rowleian and Abrahamic. Any ideas for the name of the new category? Or would we simply use Category:Individuals for that, mixing adjectives with nouns? --Daniel. 11:28, 28 December 2010 (UTC)Reply

Eponyms, or more precisely, eponymic adjectives. — lexicógrafa | háblame 14:01, 28 December 2010 (UTC)Reply

Rollback request

Is it possible to request rollback on this wiki? I have rollback access on Wikipedia and I would like to request access due to recent ongoing vandalism. nh.jg (talk) 06:26, 29 December 2010 (UTC)Reply

We don't have any official process by which such a right can be granted or rescinded, but using the process we have for the autopatrol right seems reasonable (and that's how our one rollbacker got his right), in which case (since any admin can defer an autopatrol request) I say deferred, too few undos.​—msh210 (talk) 16:34, 29 December 2010 (UTC)Reply
Whenever this question arises I always feel inclined to point out that this 'tool' does not add any special abilities to the user account, it merely adds a shortcut to an already available action. Furthermore the shortcut is one which could be added by modifying the user's custom JS. I am not sure what the downside is to allowing all auto-patrollers, or all auto-confirmed users this functionality on request or by default. - [The]DaveRoss 19:27, 29 December 2010 (UTC)Reply
Hm, I didn't realize it's JSable. A downside of allowing it by default is that some autopatrolled users may be trigger-happy. But I have to admit I don't see a downside to allowing it on request (since anyone requesting it can use the JS instead if denied). Does anyone else? (And I rescind my deferral, above.)​—msh210 (talk) 19:42, 29 December 2010 (UTC)Reply
I'm not exactly sure if most of the user scripts used on Wikipedia (e.g. Twinkle, GLOO, etc.) can be compatible for use on the Wiktionary. (BTW, I'm not familiar with any of the scripts used here on Wiktionary or outside of Wikipedia.) nh.jg (talk) 20:59, 29 December 2010 (UTC)Reply
Stupid question: As an admin, when I use the "rollback" feature, the edits that I'm rolling back are now considered "patrolled". I assume the same thing happens when a patroller+rollbacker uses the feature. But what about a rollbacker who can't otherwise mark edits as patrolled? —RuakhTALK 16:52, 11 January 2011 (UTC)Reply
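The point that rollback is only a shortcut for an action any editor can already take can be sketched as follows. This is an illustration in Python rather than the user JS mentioned above, and the helper names `rollback_target` and `undo_url` are hypothetical. It assumes the usual MediaWiki edit-form parameters `undo` and `undoafter`, and it only builds the link to the ordinary undo form; the real rollback action is performed server-side and is not reproduced here.

```python
from urllib.parse import urlencode


def rollback_target(history):
    """Find the revision span that a rollback of the newest editor would undo.

    history is a list of (revision_id, username) tuples, newest first.
    Returns (undoafter, undo): the last revision by someone else and the
    newest revision, or None if there is nothing to go back to.
    """
    if not history:
        return None
    newest_id, last_editor = history[0]
    for rev_id, user in history[1:]:
        if user != last_editor:
            return rev_id, newest_id
    return None  # every revision is by the same user


def undo_url(title, undoafter, undo,
             base="https://en.wiktionary.org/w/index.php"):
    """Build a link to the standard undo form for the given revision span."""
    query = urlencode({
        "title": title,
        "action": "edit",
        "undoafter": undoafter,
        "undo": undo,
    })
    return f"{base}?{query}"


# Example history (made-up revision ids and usernames), newest first.
example_history = [(105, "Vandal"), (104, "Vandal"), (103, "Editor"), (102, "Other")]
example_span = rollback_target(example_history)
example_link = undo_url("dog", *example_span)
```

The undo form still requires the editor to press save, which is the practical difference from one-click rollback that the "trigger-happy" concern is about.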

Note to admins: Please watch major templates for a potential vandal

This notice is being cross-posted to the major administrators noticeboard (incidents or alerts) style pages on all the major projects.

Earlier today, a w:User:Meepsheep2 was blocked on English Wikipedia. Apparently in reprisal, he vandalized a major template on English Wiktionary with a fake fundraising banner that he photoshopped. Someone reported it on IRC, and we blocked him quite quickly for this time of night, but we want you to be on the lookout for future similar incidents. Please help keep an eye on major templates for vandalism specifically related to the fundraiser banners, and if they occur, globally lock their accounts (if you do not have that access, please block them locally on the wiki they vandalized, and then find someone on IRC who can globally lock the account). Stewards can assist with this. I know you guys all watch the high value templates anyway, and I'm not asking you to do anything different with those. I'm specifically referring to incidents that spoof the fundraising banners. Please keep an extra careful eye out for those, and take the extra step of globally locking the account to prevent future recurrences of this specific kind of vandalism. Please send any questions to drosenthal (at) wikimedia.org, or use my English Wikipedia User Talk page as I cannot respond locally on all projects. DanRosenthal Wikipedia Contribution Team 07:04, 29 December 2010 (UTC)Reply

BP bug

Am I the only one who is seeing exactly the penultimate revision of BP, instead of the last revision, always? --Daniel. 13:25, 29 December 2010 (UTC)Reply

I am getting the current version each time I load BP, and this is across multiple PCs and browsers. Do you get the last version you loaded until you refresh? Or do you actually get the second most recent revision regardless of when you last loaded the page? - [The]DaveRoss 14:40, 29 December 2010 (UTC)Reply
When I compare revisions or edit a section, I can see the code of the last revision (The text as is looks good.​—msh210 (talk) 16:21, 29 December 2010 (UTC)), but it doesn't appear where it should on BP.Reply
I actually get the second most recent revision regardless of when I last loaded the page. This situation does not change when I refresh. This situation does not change if I never got the chance to see the penultimate revision when it was actually the last. It simply chooses the second most recent revision and shows it on my monitor. I use Mozilla Firefox 3.6.8.
When I tested Internet Explorer, it did work as expected, with all the recent messages visible. I feel inclined to reinstall Firefox now (especially because the stable version 3.6.13 was recently developed), though this problem appears to come from the server. --Daniel. 17:13, 29 December 2010 (UTC)Reply
If you don't mind saying, do you have Firefox set up to edit through a proxy? If you are getting stale pages with only one browser I am not sure how it could be a server-side problem. - [The]DaveRoss 20:24, 29 December 2010 (UTC)Reply
I am not using proxies. Hours ago, I discovered that apparently if the last revision exists for more than approximately 20 minutes (a phenomenon that has been a little rare today), it appears correctly when I try to see the Beer Parlour on my Firefox. --Daniel. 00:05, 30 December 2010 (UTC)Reply
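One way to test whether a stale copy comes from a cache rather than from the browser is to ask the wiki to rebuild the page. The sketch below only constructs the link; it assumes the standard MediaWiki `action=purge` parameter, the helper name `purge_url` is hypothetical, and nothing in this thread establishes that purging resolves the behaviour described above.

```python
from urllib.parse import urlencode


def purge_url(title, base="https://en.wiktionary.org/w/index.php"):
    """Build a link that asks the wiki to regenerate its cached copy of a page."""
    query = urlencode({"title": title, "action": "purge"})
    return f"{base}?{query}"


# Link for the page discussed in this thread.
beer_parlour_purge = purge_url("Wiktionary:Beer parlour")
```

If the page is current right after a purge but stale again on the next edit, that would point toward caching on the way to the reader rather than a fault in the browser installation.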

Bring back WT:IRC

I've noticed over the festive period that the IRC channel has been quite popular. It's much, much more efficient than using user talk pages and pages like this one. Sadly there are often only two or three people on even at 'peak' times; if we could get our most active users (such as Equinox, Internoob, DCDuring) on there more often, we could discuss issues many many times quicker than you can on talk pages. Mglovesfun (talk) 15:55, 29 December 2010 (UTC)Reply

But without the record of what was said when and by whom. IIRC, Wikipedia decisions have sometimes been criticised as done by an IRC-based "cabal" rather than on talk pages. (That's not to say that I dislike the IRC channel, but I see it mainly as a social venue.) Equinox 17:45, 29 December 2010 (UTC)Reply
That is correct. Ok, it also allows working together on problems (disruptive editors, dubious entries). Some of my and others' RFD, RFV and RFC nominations have been based on IRC discussions. Mglovesfun (talk) 17:49, 29 December 2010 (UTC)Reply
The lack of record seems contrary to the spirit of wikiness for discussions with policy implications that don't require instant response. Vandalic "contributions" might warrant such response, but not where there is underlying substantive disagreement and non-negligible support for the "contributions". DCDuring TALK 18:30, 29 December 2010 (UTC)Reply
It wouldn't be particularly difficult to log the IRC channel and publish the logs. That being said, the real value of the channel is to bounce ideas around informally before presenting them for on-wiki discussion. Making final decisions there would be opaque and un-wiki; discussing them there is just expedient. The most important upside of the IRC channel is that it is informal and conversational; it adds a more social aspect to the project. A closer-knit community works together better, improving the project overall. - [The]DaveRoss 19:23, 29 December 2010 (UTC)Reply
The #wiktionary-gfdl channel is meant to be published, and Neskaya used to log it; I don't know whether that's still the case.​—msh210 (talk) 20:44, 29 December 2010 (UTC)Reply
Even so, an IRC log doesn't have the separation into topics of talk pages. It's just one huge mass full of irrelevant asides. Equinox 22:28, 29 December 2010 (UTC)Reply

Microsoft

I created Category:Microsoft and populated it with existing terms. Sixty, to be exact. They include programs such as MS-DOS, PowerPoint, Excel and FrontPage, among functions and computer languages of that company. Our coverage may be increased by the addition of Word, Access and Windows, but I suppose they would be controversial. Thoughts? --Daniel. 17:20, 29 December 2010 (UTC)Reply

Subject to WT:BRAND. IMO, very unsuitable. Equinox 17:47, 29 December 2010 (UTC)Reply

A Christmas Carol

I created the category A Christmas Carol and its members that name characters of the book, such as Ghost of Christmas Present. My rationale is that they are well-known, in addition to appearing in a huge number of adaptations for TV shows, including many that don't mention the source, thus passing WT:FICTION. --Daniel. 19:42, 29 December 2010 (UTC)Reply

Have they passed RfV? By what standard? How many RfV-passing characters are there in w:A Christmas Carol? Why would we want a topical category that is unlikely to have more than ten members? DCDuring TALK 19:49, 29 December 2010 (UTC)Reply
Delete this category named as a topical-context category (and apparently so intended, from its description on the category page). We do not include words used only in the context of one work of fiction, so if anything belongs in the category it (the sense) should be deleted.​—msh210 (talk) 19:55, 29 December 2010 (UTC)Reply
None of these terms has passed RFV because nobody has created any RFV discussion for them. If they did, the foreseen standard would be WT:FICTION, as I said. Each topical category that is unlikely to have more than ten members should be analyzed by its own merits: Category:Days of the week is informative enough; Category:A Christmas Carol has the benefit of allowing me to create this BP discussion on these terms as a whole. The latter can be changed into Category:Charles Dickens, like Category:Lewis Carroll. The senses can be twisted to contain the fact that these characters were created by Charles Dickens but used in many other works. --Daniel. 20:03, 29 December 2010 (UTC)Reply
The fundamental premise is wrong. There is no presumption favoring fictional characters when WT:FICTION makes inclusion time-consuming. Perhaps taking time to cite such entries in a way that meets the standard would help you see the problem. Or perhaps the problem is that you don't share the stated goals of the project to be a dictionary and not an encyclopedia competing with the mother ship, except for those with short attention spans. DCDuring TALK 20:23, 29 December 2010 (UTC)Reply
"There is no presumption favoring fictional characters when WT:FICTION makes inclusion time-consuming." does not make sense.
Avoiding anything encyclopedia-like is one of many ways to make a dictionary compete with encyclopedias; please don't do it.
My message of "21:27, 28 December 2010 (UTC)" from this discussion contains some counterarguments that fit your understanding of "short-attention encyclopedia". --Daniel. 20:42, 29 December 2010 (UTC)Reply
It is you who are making the dictionary "compete with encyclopedias" by putting loads of encyclopaedic material into the dictionary! Let a dictionary define words and an encyclopaedia expound on topics. Equinox 22:30, 29 December 2010 (UTC)Reply
Equinox, since we disagree in many subjects, it would be safer to expose and explain the merits of your opinions instead of giving plain instructions or orders when talking to me. --Daniel. 22:52, 29 December 2010 (UTC)Reply
Do you disagree, then, Daniel, that a dictionary is supposed to define words and an encyclopedia expound on topics?​—msh210 (talk) 05:03, 30 December 2010 (UTC)Reply
No. I endorse the paragraph written by Lmaltier at "10:58, 30 December 2010 (UTC)", below, in this discussion. --Daniel. 12:29, 30 December 2010 (UTC)Reply
It would be "safer" to say nothing and let you wreak your havoc undisturbed. I refuse. Equinox 09:29, 30 December 2010 (UTC)Reply
Do you refuse to expose and explain the merits of your opinions? --Daniel. 12:29, 30 December 2010 (UTC)Reply
The term is not usual in English (while dictionnaire de langue is usual in French), but it might be better nonetheless to always use language dictionary to refer to Wiktionary, rather than simply dictionary, because encyclopedic dictionaries do exist (a classic example is the Petit Larousse, a French dictionary providing both linguistic and encyclopedic information).
The differences between Wiktionary and Wikipedia are:
  • as Wiktionary studies words and Wikipedia studies topics, page titles must be words here, e.g. cat, New York, for instance (or prefixes, etc.) while any topic description could do for Wikipedia: words, but also List of ants of Minnesota, which is not a word.
  • here, information given in the page should be about the word, while it should be about the topic for Wikipedia (logically, the only common part should be the definition, but Wikipedia sometimes also provides some linguistic information).
Stating that a word is encyclopedic is meaningless: the contents of the page may be encyclopedic, but not the word. Lmaltier 10:58, 30 December 2010 (UTC)Reply

Mythological characters

Is there any consensus for the inclusion of gods and other characters of various mythologies, such as Jupiter, Bacchus, Osiris and Achilles, aside from the mere existence of these senses?

Naturally, my personal opinion is inclusionist. Apparently there is not any editor interested in deleting these senses as well. Am I wrong?

For what it's worth, we have Category:Greek mythology with 226 members, Category:Roman mythology with 20 members and approximately a dozen other categories of mythologies by culture. On the other hand, we are currently lacking Echo, Cloacina, Nefertem and various other characters. --Daniel. 21:46, 29 December 2010 (UTC)Reply

We welcome all words (see beginning of CFI). Their meaning is important for providing a definition, but is irrelevant (or should be irrelevant) for inclusion, when it's obvious that they are words. But the sense is relevant for phrases, in order to ascertain that they can be considered as words. Lmaltier 22:10, 29 December 2010 (UTC)Reply
In my unfashionable opinion, they are more worthy than recent pop culture (TV, anime, Web sites, etc.). But they should be defined as words and not encyclopaedia topics. Equinox 22:26, 29 December 2010 (UTC)Reply
I would be happy if the entries for one-word names existed to carry the wikipedia links to the dab pages there, the translations or transliterations, the etymology, RTs, DTs, etc, and any attributive meaning they had as embodiments of specific well-defined attributes, where such usage was attestable. I see no reason to apply qualitatively different standards to Classical mythology or Harry Potter. I expect that attributive use of figures from Classical mythology would be easy to attest from sources before the 20th century. DCDuring TALK 23:18, 29 December 2010 (UTC)Reply
Why look for attributive uses, since they are words? The definition should always reflect the meaning: explaining in the definition that a cat is an animal, and what kind of animal, is necessary to be able to understand the word. It's the same for Jupiter: the meaning is the god, even if some figurative attributive uses exist, and the definition should reflect this fact. Anyway, attributive uses exist in English but, as they do not exist in other languages, this is not a possible criterion here. It seems clear to me that this criterion is very sound for Webster's (an English dictionary normally excluding proper nouns, surnames and given names, but including Abidjan because of a possible attributive use), but does not make sense for a dictionary describing all words of all languages. Lmaltier 10:35, 30 December 2010 (UTC)Reply
I agree: "the meaning [of Jupiter] is the god, even if some figurative attributive uses exist, and the definition should reflect this fact". If we, hypothetically, find attributive uses for Jupiter, Bacchus, Osiris and Achilles as common nouns, then delete the proper noun definitions (resulting in the effect of the current Britney Spears, among other entries), the entries will probably seem incomplete. --Daniel. 12:55, 30 December 2010 (UTC)Reply
I agree on part of that. I see no reason to give mythology different treatment from pop culture. Lest we forget, Shakespeare was pop culture in his day, criticized for several centuries in fact. I just disagree that exclusion of Jupiter and all else is the best way to accomplish this. There are objective ways to determine which terms to allow. Subjective ones, you might agree, are only going to lead to a popularity contest, deciding which parts of our culture are to be embraced and which are to be detested. DAVilla 20:15, 30 December 2010 (UTC)Reply
How can you define Jupiter as a "word" without mentioning anything encyclopedic like the Solar System or Roman mythology? No matter how you slice it, Jupiter is a proper noun, and it refers very specifically to a particular planet and to the individual god who was the son of Saturn. DAVilla 20:10, 30 December 2010 (UTC)Reply
As the definition provides the meaning, there is something common with an encyclopedia in it, even for simple words as cat, but this must be limited to what is required to understand what the word means. There are objective ways to determine which terms to allow.: yes, the objective way is that we allow all of them. Any other way would be either subjective or arbitrary, or lead to disagreement between contributors. I don't understand why some words are not allowed here (I understand it for paper dictionaries, e.g. too recent words, but not here). Lmaltier 20:23, 30 December 2010 (UTC)Reply

Regarding Wiktionary:Votes/2010-09/Language codes in templates

I was thinking of organizing a new vote basically saying 'language codes are encouraged as much as possible, but some templates also allow language names'. If I created such a thing, would other editors proofread it and/or vote on it? Mglovesfun (talk) 12:36, 30 December 2010 (UTC)Reply

I might, for one. But what would be the purpose of such? Custom, which already includes the same 'rule', is as strong as a toothless voted-on policy, I think.​—msh210 (talk) 21:37, 30 December 2010 (UTC)Reply
My lack of response says it all. I think. Mglovesfun (talk) 19:09, 17 January 2011 (UTC)Reply

User:MalafayaBot for operation in article namespace

Hi, all.

I'm requesting authorization to run my interwiki bot in the article namespace. It runs in -auto mode which means no existing redirects will be removed.

Please, read and cast your vote at Wiktionary:Votes/bt-2010-12/User:MalafayaBot for operation in Article namespace. Thanks, Malafaya 14:51, 30 December 2010 (UTC)Reply

The Bible and other books

We have entries for the Bible, and its books: Daniel, Mark, Genesis and others.

They comprise another group of proper nouns whose inclusion is unclear. If WT:BRAND applies to books, I suppose famous ones such as How to Win Friends and Influence People, War and Peace and Divine Comedy would be easily attestable, and most others would not.

All or most book titles fit the criterion of idiomaticity because their meanings of "A book [...]" cannot be inferred from their components. Nonetheless, I can see reasons for not making Wiktionary a compendium of every attestable book title, which are related to the reasons for not making it a compendium of names of every living and dead person:

  1. One of these reasons is that the sense can usually be inferred from the context very easily. If an English student wants to understand and/or analyze the sentence "I was reading Harry Potter last week and I loved it!" by searching each word on Wiktionary, then every entry except Harry and Potter would be expected to convey useful information about what exactly the speaker means, but the rest of the phrase already clarifies that Harry Potter is something written and entitled "Harry Potter". This is enough to understand the nature of this sense of the term "Harry Potter". If, alternatively, the sentence to be analyzed or understood were shorter, such as "I love Harry Potter!", the proper nouns still would be understood from the context, because it would be unusual to shout this on the street without any introduction. It could be a discussion about movies, or even about characters.
  2. Additionally, even when we accept the possibility of making an individual sense of proper noun for it that starts with "A book..." (generously avoiding the notion that titles of books comprised of multiple words are sums-of-parts), there is the fact that the title already explains something about the contents in the book. If the title is Harry Potter, it must be about a guy or something else named Harry Potter. It is short but explanatory enough.

These two reasons alone, as I see them, prove that in many situations, "Harry Potter" does not even need a definition in a dictionary to be understood, thus it fails the criterion of "A term should be included if it's likely that someone would run across it and want to know what it means." from WT:CFI. The Bible also fails that criterion, since it etymologically merely means "book", not to mention that it is simply a bible (common noun: A comprehensive manual that describes something.)

One possible generic and simple interpretation is that all works of literature should be treated in the same way, thus excluded from Wiktionary. That is, we can hypothetically develop explicit rules for never adding the Bible here as the title of one or more specific religious books or collections of books. Similarly, we would formally exclude other famous titles, not to mention much less famous ones such as A history of language or Thought and language: an essay, having in view the revival, correction, and exclusive establishment of Locke's philosophy.

We seem to have very few titles of artistic works. Apparently, none of them are paintings, sculptures or musical works: Monalisa and Für Elise are not defined on Wiktionary. Several of the works whose titles are defined here are fairy tales. If I remember correctly, one apparent criterion mentioned on IRC is that when the names of a fairy tale and its title character are spelled identically, we allow the inclusion of both, such as Little Red Riding Hood but never Snow White and the Seven Dwarfs nor The History of Tom Thumb. I personally don't take this criterion seriously, but apparently the groups of existing and nonexisting entries basically support it.

Thoughts? --Daniel. 16:02, 30 December 2010 (UTC)Reply

You quote "A term should be included if it's likely that someone would run across it and want to know what it means.". I think that sentence should be extended: looking for a meaning is not the only use of a dictionary, you might look for a spelling, a conjugation, a pronunciation, an etymology, etc.
When names can be considered as words, I see no problem, but this is very exceptional. An example is Parthenon: this is a word, and this word is the name of a work of art. Bible is another example. However, even when book titles are composed of a single word, this word normally means something; it should not be defined as a book title, but as this something. Generally speaking, such names (book titles, etc.) cannot be considered as words (e.g. Für Elise). The Italian Wiktionary used to include (or still includes?) a few book titles, translations being titles chosen when these books were published in other languages. I personally think we should not follow this example, because it's beyond the already huge scope of all words of all languages.
Little Red Riding Hood might be considered as a word (on the same grounds as Gulliver or Superman), but I agree that there is no reason to define two senses for this word, the character should be the only sense. Lmaltier 16:43, 30 December 2010 (UTC)Reply
Very simply, all specific entities should be attested in metaphorical use. This would permit War and Peace, Divine Comedy, and Harry Potter—just wrap the title with "the...of" in a Google Books search—presumably also Für Elise although those cites would be harder to hunt down, but would almost certainly exclude Snow White and the Seven Dwarfs and How to Win Friends and Influence People (although oddly I was able to find one cite for the latter). DAVilla 19:15, 30 December 2010 (UTC)Reply
Shouldn't we be able to ascribe a particular meaning to the "metaphorical" use? At Donald Duck#Adjective the citations are focused on the voice of Donald Duck. That seems to be, by far, the most common attribute of w:Donald Duck that has entered the lexicon as something other than as the name of an individual. At WP there is often a section in the articles about various individuals with a title like "In popular culture" that can provide directions for searches for citations to support the kind of "metaphorical" use under discussion. I do not believe that three attestable instances of the formula "the [proper noun] of" is sufficient, unless it is clear that one set of attributes is being invoked. OTOH such searches may be effective in providing clues about the set of attributes that might be invoked in other usages of the proper name. DCDuring TALK 19:53, 30 December 2010 (UTC)Reply
First let me just clarify that not all results for "the...of" are metaphoric/metaphorical, naturally. It is just an easy way to find such uses, which is otherwise extremely difficult to do in a text search.
Your proposal is too restrictive, in my opinion. It would require citing not just metaphorical use, but implicit understanding of a particular attribute, meaning all three metaphors must agree. Often it's not exactly clear which quality a metaphor refers to in the first place. It might work in some cases, but it's not like authors spell these things out for us. Criteria formed on this basis would be a nightmare even scarier than WT:BRAND. DAVilla 20:49, 30 December 2010 (UTC)Reply
When something is a word (specific entity or not), there is no need for a metaphorical use to be included. You probably mean that a metaphorical use transforms something into a word even when it was not possible to consider it as a word before this use. Lmaltier 20:16, 30 December 2010 (UTC)Reply
The reason to cite metaphorical use is to show that the term is part of the lexicon, not just a name. Of course it is a name, and should be defined, in my opinion, as the thing we all know it to be, but why some names and not others?
In DCDuring's example of Donald Duck above, we have
  • "local patrons can request that their greeting be sung in Donald Duck voice"
    It is not literally Donald Duck's voice, it is people impersonating Donald Duck.
  • "banish those irritating Donald Duck squeaks and quacks from speeded-up playback"
    Here it isn't even imitating Donald Duck deliberately.
  • "his voice had risen into the Donald Duck register"
    This register existed before there was the cartoon character. Speaking this way has no direct connection to Donald Duck. It's called that only because of the similarity, but they are not the same.
Metaphor is saying something is Donald Duck when it's not. Contrast that with the fourth quotation, which is a simile:
  • And then the most Donald Duck-like screaming and jabbering you ever heard.
DAVilla 20:35, 30 December 2010 (UTC)Reply
Right, similes don't count for this kind of attestation, though they illustrate a related usage. DCDuring TALK 20:54, 30 December 2010 (UTC)Reply
  • @Lmaltier
That ship sailed some years ago. We already exclude brand names, for example, and took the trouble to have a vote specifically including toponyms (which remain pathetically non-comprehensive, even random, in their coverage, a discredit to Wiktionary, IMHO). We are trying to decide on the desirability of including some proper nouns now. Whether we will ultimately include all proper nouns or all collocations is not really at issue now. There is almost no objection to the inclusion of single-word terms, though there is some objection (e.g., by me) to including senses that refer to individuals. It already depends on an extended definition of "word" to include any multi-word term. I think our arguments about which class or types of proper noun words we could include now should be less based on arguments from first principles and more on practical considerations. One consideration is whether we can do a creditable job of including a large portion of the class of terms, at least in English.
I suppose another practical consideration is the possible role of proper nouns (names of individual entities) in recruiting newbies. Names of specific entities are easy enough for our newest contributors to add given the skills, time, and motivation that they bring. Proper nouns often require no painful effort to attest to their meaning or painful conceptual effort to ascertain their meaning. As such they provide some nice easy entries. Perhaps we should reserve such proper-noun entries (for specific individuals) for unregistered users. Our more experienced and skilled contributors could work on correcting the entries and trying to convince the newbies to register and work on mainstream entities. DCDuring TALK 20:54, 30 December 2010 (UTC)Reply
We don't include names, we include words (whatever they mean). But many names are considered as words, and there are etymologic (not encyclopedic) dictionaries about place names, etc. Here, I understand word in its linguistic sense: this is not an extended sense, this is the sense given in word (but I understand that most people might understand this term more or less in its typographical sense, this is the difficulty with polysemic words). Lmaltier 21:38, 30 December 2010 (UTC)Reply
Mainly from DAVilla's comments, I believe Citations:Oxford English Dictionary is very attestable, in addition to already having five citations. --Daniel. 05:20, 31 December 2010 (UTC)Reply
It's a typical example of a name which is not a word. We might consider that the last two quotes (not the first ones) in this page use this name as a word. But, as this figure of speech may be applied to any proper name, including your own full name or mine, I feel uneasy about allowing this name on these grounds. Dealing with true words is a huge enough task. Lmaltier 09:16, 31 December 2010 (UTC)Reply
For anyone who's keeping count, only one of those OED quotations is metaphorical use, incidentally. The rest are merely out of context. DAVilla 16:47, 31 December 2010 (UTC) Updated. DAVilla 09:16, 12 January 2011 (UTC)Reply

Lingua Franca Nova

Apparently we had a lot of terms in this constructed language until recently. If we were to vote to include this language then now would be a good time to say something. You may also be interested in this current vote. DAVilla 02:52, 31 December 2010 (UTC)Reply

Is there any significant body of text that is durably archived? Looking at the Wikipedia page, I don't see any evidence that there is a single work of any length that is durably archived.--Prosfilaes 07:53, 31 December 2010 (UTC)Reply
What difference does that make? Wiktionary covers many languages like that. --Yair rand (talk) 08:03, 31 December 2010 (UTC)Reply
I don't see why we would include a language that has no words that could pass WT:CFI. And I don't think it does; virtually all natural languages are durably archived, by linguists if no one else.--Prosfilaes 08:23, 31 December 2010 (UTC)Reply
Read the Wikipedia page itself, and references. I cannot understand how you can state that no word would be includable. Lmaltier 09:04, 31 December 2010 (UTC)Reply
What words do we have in three independent durably archived sources? Between Usenet and published materials, what's available? All I see is a lot of stuff on Wikia and a couple English-language articles that at best use Lingua Franca Nova words for short example snippets.--Prosfilaes 00:57, 1 January 2011 (UTC)Reply
Well, there needs to be three independent citations in durably archived sources. So he has a point. -- Prince Kassad 17:04, 31 December 2010 (UTC)Reply
You would want to delete everything in Category:!Xóõ nouns? The "three durably archived uses" is really built for English words, sometimes for some other major languages. This part of CFI is really broken, it doesn't allow most words. The best we can get to is consider virtually all works in languages with a small literary base to be "well-known works", try and count everything in small languages to be "clearly widespread use", and ignore as many instances as possible. There is a pretty clear consensus that we should include "all languages", including minor ones, and to include "all words", not just the top few that might be attestable. The real dispute right here is what makes something a "language", which is by the definition a "form of communication", which means that the issue here is whether LFN is used by people to communicate. As to the answer to that, I have no idea. --Yair rand (talk) 17:20, 31 December 2010 (UTC)Reply
I don't see why, if someone invents a word for "modem" in !Xóõ, it's any more notable than any other word that's invented and used by one person. I don't know what is recorded about !Xóõ, but linguists have made it their life's job to record languages, so for all real languages, there are durably archived records. If you want to permit attestation from one durable source or even a durable dictionary (which would certainly cover these !Xóõ words), that's one thing. But if we permit people to add words claiming they're !Xóõ with no evidence, nobody will have any reason to trust that, especially not the linguists, who are about the only people that will care.
If in 20 years there's durably archived uses of Lingua Franca Nova, then we can include the language. If not, then likely there will be nothing left of the language and no one will be interested in looking up words in it.--Prosfilaes 23:40, 31 December 2010 (UTC)Reply
Yes, the three independent citations rule was designed to exclude "words" somebody has invented some day, but are not really words of the language. If I use britchinettery in a sentence, and I explain that this word means cat, that does not make it an English word. And the durably archived rule is only for verifiability. If verifiability can be achieved through other means (e.g. several contributors testifying that the word has been used on some sites, even if these uses are not accessible any more), I see no problem. Basic principles (such as all words, all languages) are much more important than rules, when they seem to contradict. Lmaltier 17:38, 31 December 2010 (UTC)Reply
For what it's worth, Lmaltier's guideline of "If verifiability can be achieved through other means (e.g. several contributors testifying that the word has been used on some sites, even if these uses are not accessible any more), I see no problem." would probably make Citations:plotkai attestable. (The entry formally isn't attestable due to the lack of durably archived sources, since editors achieved the consensus that these forums and other websites are not durably archived.) --Daniel. 21:00, 31 December 2010 (UTC)Reply
If you use britchinettery in a sentence meaning cat, then in the sentence you wrote it means cat, and is as much English as the rest of the sentence. We should turn away nonce and idiosyncratic uses, but I don't think we can justify that as "are not really words of the language". Perhaps I should say that you're appealing to a Platonic interpretation of language that I don't think is justifiable; there is no bright-line test of when or if kawaii or Clintonomics or britchinettery or antidisestablishmentarianism became real terms of English, nor whether a text is Scots, English or Middle English. I have problems with any form of verifiability that does not demand a citation that can be checked indefinitely in the future, and as a dictionary for the ages, I wonder about the long-term value of any entry for words that may not exist outside the entry in the future.--Prosfilaes 00:39, 1 January 2011 (UTC)Reply
I think the real key point here is that language serves as a form of communication between people. So it all comes down to whether the word actually communicates the same idea to multiple people. If you use a word and there are others who understand it without having to ask, then I would say the word has entered the language at least in a limited way. The next point from there is just a matter of how many people are aware of the meaning compared to how many that aren't. —CodeCat 01:12, 1 January 2011 (UTC)Reply
You are right. People invent words all the time and, if they can be easily understood, they may be considered as English words. My POD (Pocket Oxford Dictionary) considers that all adjectives built as noun + -like must be considered as correct English words. This might make antidisestablishmentarianismlike an English word (as soon as somebody needs to use it). But it's not the case for words invented for fun that nobody understands and nobody has ever understood. Lmaltier 15:32, 1 January 2011 (UTC)Reply
But words that nobody understands and nobody has ever understood aren't really the question. (Well, except for Selah and a few other translation words and words from well-known works.) What about CFI? You're sitting in a community of people who can read CFI in running text and understand what it means. What about floopy? One hundred students in Springfield Middle School agree on what floopy means and in five years no one will ever use floopy with that meaning. Words come into existence and disappear from record all the time. Do we really want every definition of floopy that two English speakers have mutually understood?--Prosfilaes 19:49, 1 January 2011 (UTC)Reply
Including local words is useful, yes, including floopy or rare ones. And why not CFI: including it here would be especially useful to readers. But I would not call an English word a word shared by only 2 speakers (as a joke, as a code, or otherwise) and not understood by anybody else. Lmaltier 21:33, 1 January 2011 (UTC)Reply
Fine, Floopy, of a fart, to be noisy and disrupt class. I can have three sockpuppet users aver that's used at Springfield Middle School. Not that anyone will ever come looking for that definition, or if they do, that they will be able to find the right one under the pile of definitions.
Again, you're acting like there's a Platonic definition of language. I certainly don't agree that a word used in English sentences between monolingual English speakers can be said not to be an English word; it is certainly a unit of language, with meaning to its user and those he's using it to, and English is the only language it could be a part of. You can't dismiss this issue by reference to "real" English words; you're going to have to set lines excluding words only used between two people. (Three? Four?) And then while we're replacing CFI, you're going to have to find a way to tell whether a word fits the standards, and I for one don't see any way of telling whether a word is in current use among a small verbal community or is just made up and not really used. (And looking at Google Books, I don't think CFI would be especially useful to readers if we tossed the current rules to the side, as I can see at least six new definitions for CFI from the first couple pages of hits, and I can easily see dozens and dozens of definitions with a little work. Wikipedia has sixteen, and that only includes two out of the six.)--Prosfilaes 23:23, 1 January 2011 (UTC)Reply
When there are too many senses, the best way is to organize them by domain. There are dictionaries of initialisms, and they really help. Lmaltier 09:23, 2 January 2011 (UTC)Reply

January 2011

Historical events

I created Category:Historical events (which is different from Category:History in many aspects) and filled it with fourteen members. Naturally there are many other names of historical events yet absent from this category and from Wiktionary.

Since proper nouns are currently a delicate subject, I ask: Are there any limits and thoughts about the inclusion of these events? For instance, in the subject of Christianity, I suggest the creation of Council of Chalcedon, Protestant Reformation and Great Awakening. --Daniel. 00:59, 1 January 2011 (UTC)Reply

Err, how is World War I or the Middle Ages an "event"? ---> Tooironic 14:04, 1 January 2011 (UTC)Reply
The entry event is defined as "An occurrence; something that happens." World War I and the Middle Ages happened. Historically. --Daniel. 17:12, 1 January 2011 (UTC)
The Middle Ages aren't really an "event" that "happened"; they're a time period (in a certain region). World War I, however, I would consider to be an event. —RuakhTALK 22:52, 1 January 2011 (UTC)Reply

Very often, these events are known under somewhat arbitrary names, and this fact makes it possible to consider these phrases as idiomatic phrases, as words (they should be defined as proper nouns). It's the case for Great Awakening, World War I or Révolution française: the meaning does not derive from the sense of their components, it cannot be guessed precisely with certainty. But including Council of Chalcedon is very disputable, for the same reason as Winston Churchill. I would not include it. Lmaltier 15:21, 1 January 2011 (UTC)Reply

From your chosen examples, the proposed distinction between events that are or are not arbitrary idiomatic names is obscure. One can guess that "Council of Chalcedon" is a council held in a place named Chalcedon, or by a person named Chalcedon. Similarly, one can guess that "World War I" is the first of a sequence of wars that affect the world on a great scale; and that Révolution française is a revolution that occurs in France. --Daniel. 17:12, 1 January 2011 (UTC)
There have been several revolutions in France, only one of them is called this way. The meaning of World War I is not obvious at all, you can only guess it was a world war. On the other hand, councils are normally named after the place where they were held, you cannot add any linguistic data, what you can add would be encyclopedic. Lmaltier 21:26, 1 January 2011 (UTC)Reply
There's only one Révolution française, but it's frequently referred to simply as la Révolution. —RuakhTALK 21:32, 1 January 2011 (UTC)
You are right, Révolution might be created (but certainly not la Révolution, the article does not belong to the word). fr:Révolution française includes many translations, and a well-known anagram. Lmaltier 21:54, 1 January 2011 (UTC)Reply
As you can see, I did link to [[Révolution]]; but I definitely think la should appear on the headword line. —RuakhTALK 22:48, 1 January 2011 (UTC)Reply
I disagree: no, the article does not belong to the word, even in the headword line: it's like France, and unlike La Haye. The use of la seems to be systematic in sentences, you are right, but here is an example of a use without the article: http://www.bertrand-malvaux.fr/p/486/plaque-de-giberne-de-la-garde-nationale-revolution-1789-1792.html Lmaltier 22:27, 2 January 2011 (UTC)Reply
Is the Council of Chalcedon the only council ("committee that leads or governs", "discussion" or "deliberation") that happened in Chalcedon? --Daniel. 22:23, 1 January 2011 (UTC)Reply
From "World War I", one can infer that it is not only a world war, but the first of them. With a little creativity, and a certain lack of knowledge of basic history of the 20th century, one possible alternative explanation is that the "World War I" is the ninth world war, after World War A, World War B, World War C, World War D, World War E, World War F, World War G and World War H. --Daniel. 22:23, 1 January 2011 (UTC)Reply
  • What bothers me about all of these entries for singular entities or events is that they are so essentially different in nature from the fundamentally empirical nature of a dictionary. A "good" encyclopedic entry is a fairly precise and comprehensive description of something, but does not correspond to actual usage by very many people. It seems to me that such entries end up being prescriptive. How do we know what most people actually mean when they use the term French Revolution? Do we have recourse to experts about what they should mean? Do we abridge encyclopedias and other references? How do we know which facts are the salient ones in common usage for inclusion in a dictionary-length definition (or short-attention-span encyclopedia entry)? Is it just a matter of individual opinion? That would politicize many matters. Are there any facts about usage that we could have recourse to in order to clarify such matters? — This unsigned comment was added by DCDuring (talk • contribs) at 00:35, 3 January 2011 (UTC).
    How do we know what people mean by French Revolution? The same way we know what they mean by oat; we look at examples of how people are using it, and try and infer a definition. What facts are salient? That's a hard problem; I find the oat solution, which gives a definition for the word that is precise, accurate, and completely useless to any one who actually needs the definition, to be frustrating, but it's probably the best thing to do with Wikipedia to point people to. (A "short-attention-span encyclopedia entry" would probably be more useful for readers, though.) Could a definition be political? Sure, like "A grain, which in England is generally given to horses, but in Scotland supports the people." as a definition for oats. And if you want facts about usage hit Google Books or Usenet, and see how it's being used.--Prosfilaes 03:36, 3 January 2011 (UTC)Reply
  • Well, the definition of a word should be what the word means. Yes, this is the meaning actually intended by people using the word (if we don't know with certainty what people were intending, it's better either not to include any definition, or to add a warning). The definition must be sufficient to understand what the word means, and not include anything more. For a revolution, this might include the location and the date, everything else being covered by Wikipedia (a link is necessary). I see no difference between oat and French Revolution about all these points. Lmaltier 16:00, 3 January 2011 (UTC)

"singular of..."

The entry Assembly of God is defined as "The singular of Assemblies of God." Are there other English nouns whose plural is lemmatized and the singular is defined as a "singular of"? We don't categorize on that basis, so it's hard to know. --Daniel. 03:49, 1 January 2011 (UTC)Reply

Does shoop count? —RuakhTALK 03:52, 1 January 2011 (UTC)
I sometimes feel with entries that are rare in the singular, but the plural is much more common, that {{singular of}} seems justified. Mglovesfun (talk) 22:14, 2 January 2011 (UTC)Reply

.az language name / Azeri / Azerbaijani

I have been asked to change the language names for .az on my Wikistats [2]. Quote: "Please change "Azeri" to "Azerbaijani" and "Azərbaycan" to "Azərbaycanca" in this page. It's true name of language (http://en.wikipedia.org/wiki/Azerbaijani_language). And "Azərbaycanca" too. (See interwikis in wikipedia). Thanks." Does that mean we should think about changing it here too? For example, all words in Category:Azeri nouns use "===Azeri==="; should they be "Azerbaijani" instead? Are the reasons given above valid? Mutante 17:12, 2 January 2011 (UTC)

Azeri refers to the people of Azeri descent, while Azerbaijani explicitly only refers to the people in Azerbaijan. This would therefore exclude the Azeris living in Iran. Unless we do split them up like this, the name should not be changed. -- Prince Kassad 17:20, 2 January 2011 (UTC)Reply
Btw, WP says "Azerbaijani or Azeri or Torki" and we don't have Torki as a word yet. Mutante 17:24, 2 January 2011 (UTC)
Azerbaijani does not refer explicitly only to the people in the Republic of Azerbaijan. It also refers to the people of w:West Azerbaijan and w:East Azerbaijan, provinces of Iran. In fact, the Turkic-speaking people of the Republic of Azerbaijan were not referred to as Azerbaijanis before 1920s. Before that they were either "Transcaucasian Tatars" or simply "Muslims".
Azeri and Azerbaijani are 100% synonyms and because we have already picked Azeri, we should stick to it, if only not to bother with renaming so much stuff. --Vahag 17:41, 2 January 2011 (UTC)Reply
I think that the most common name should always be used, when it's obvious. In this case, it's not obvious, and I would keep Azeri. Lmaltier 22:18, 2 January 2011 (UTC)Reply

In the spirit of being bold as well as the spirit of letting people know what you are being bold about, you may notice that "(by language)" has disappeared (twice) from the sidebar. Neither of the tools supporting those features is currently working, one since October and one longer than that, I am told. - [The]DaveRoss 21:58, 3 January 2011 (UTC)

Wiktionary:Per-browser preferences

Shouldn't this be merged into the new gadgets system? It has the disadvantage that it only works on one browser, so if you login from somewhere else all the settings are gone. -- Prince Kassad 22:25, 3 January 2011 (UTC)Reply

I think it would be better if both per-browser preferences and per-user preferences were available. Perhaps a bit of JavaScript could allow the gadgets page to have the option of saving preferences in a cookie to make it browser-dependent? --Yair rand (talk) 19:44, 4 January 2011 (UTC)

IP block exemptions

We have a user group called "IP block exemption" (ipblock-exempt), which has the rights ipblock-exempt (yes, same name as the group, just to confuse people) and torunblocked. The ipblock-exempt right allows someone to edit even if his IP number is blocked, and is a right shared by admins. The torunblocked right allows someone to edit even if a Tor user, but admins don't have that right. Now, we have several members of the group "IP block exemption", which are all admins' accounts: I, for example, added my alternate accounts to that list to avoid IP blocks, which I thought uncontroversial. But now I realize that I've actually assigned myself a right never allowed by the community (and so have others), namely torunblocked, so I thought I'd bring it here for discussion.​—msh210 (talk) 19:44, 7 January 2011 (UTC)Reply

Well... it is only needed if you use Tor. The only reason one would have for using Tor is if he comes from the PRC. Otherwise, I don't really see the point in this right for admins. -- Prince Kassad 21:59, 7 January 2011 (UTC)Reply
No, I wasn't proposing assigning the torunblocked right to admins (NTTAWWT) but merely broadcasting that my account (and other accounts) already have the right, despite no community consensus on anyone's having it, and allowing for discussion on who should have it and how that should be determined. The status quo is fine by me for now, but I didn't want to leave it that way without mentioning it here.​—msh210 (talk) 18:14, 11 January 2011 (UTC)
Yes, and I explained why no admin should ever have this right. -- Prince Kassad 18:24, 11 January 2011 (UTC)Reply
I assume you mean no admin account should have it? Some admins' non-admin accounts have it for the reason outlined above: that it comes along with the ipblock-exempt right. What if an admin does want to edit from China?​—msh210 (talk) 19:29, 11 January 2011 (UTC)Reply
All of these 'access rights' are to allow good faith editors exemptions from restrictions placed generally to block bad faith editors. Admins are all good faith editors, so they should be exempt from all restrictions placed generally, including Tor blocks, open proxy blocks, IP blocks. The people using the Tor/proxy/IP are people who we want to contribute, so the blocks are not for them. Why do we care if these exemptions are given to people we want editing anyway? - [The]DaveRoss 20:18, 11 January 2011 (UTC)Reply
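The point raised in this thread can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the group and right names come from the discussion above, but the mapping is a deliberately simplified, hypothetical model (real admin accounts carry many more rights), and `effective_rights` is an invented helper, not part of any wiki software.

```python
# Hypothetical, simplified model of the groups and rights discussed above.
# Only the two rights named in the thread are modelled.
GROUP_RIGHTS = {
    "sysop": {"ipblock-exempt"},                            # admins: no Tor exemption
    "ipblock-exempt": {"ipblock-exempt", "torunblocked"},   # the group bundles both
}


def effective_rights(groups):
    """Return the union of the rights granted by every group an account is in."""
    rights = set()
    for group in groups:
        rights |= GROUP_RIGHTS.get(group, set())
    return rights


# An admin account versus an admin's alternate account placed in the group.
print(sorted(effective_rights(["sysop"])))
print(sorted(effective_rights(["ipblock-exempt"])))
```

Under this model, putting an alternate account into the group to avoid IP blocks also hands it torunblocked, which is the side effect msh210 describes.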

plurals versus noun forms

There's been a suggestion on WT:RFDO that we should move from plurals to noun forms. Reasons for this, well, plural isn't specific to any part of speech. In English nouns, proper nouns and pronouns can all have plurals. It would also create standardization - noun forms for all languages, while currently many use noun forms, many use plurals, and some (like Catalan) use both. Lastly, some Wiktionaries already use this system - the French one only allows noun forms, adjective forms, adverb forms (etc.), never plurals. Thoughts? Mglovesfun (talk) 16:02, 9 January 2011 (UTC)

Dunno. De-facto consensus was to use plurals when a language has only plurals (and no other noun forms), and noun forms for languages which actually inflect their nouns. I think this should be unified in one way or another, as it's currently causing a big chaos. Just look at Category:Plurals by language, Category:Plurals and Category:Noun forms by language. -- Prince Kassad 16:51, 9 January 2011 (UTC)Reply
Which class of users would be helped by a procrustean-bed treatment? How? We aren't being run for the benefit of the categorizing advocates and contributors, are we? DCDuring TALK 17:25, 9 January 2011 (UTC)Reply
I see added complexity for the general case, at the expense of other cases. The fact that all nouns, including special nouns like proper nouns and pronouns can have plurals doesn't lead me to believe that we should move from plurals to noun forms.--Prosfilaes 22:43, 9 January 2011 (UTC)Reply

German verbs ending in -eln, -ern

Should a category be created for Category:German verbs ending in -eln? Category:German verbs ending in -ern? Most verbs end in -en. Those, that end in -eln or -ern, share the same conjugation pattern. Would it be helpful to group them? Category:German nouns ending in "-ismus" and Category:German adjectives suffixed with -bar are already categories, for example. Also, what is the style? That is, should Category:German nouns ending in "-ismus" be Category:German nouns ending in -ismus or Category:German nouns suffixed with -ismus, or should Category:German adjectives suffixed with -bar be Category:German adjectives suffixed with "-bar" or Category:German adjectives ending in "-bar"? - -sche 20:27, 10 January 2011 (UTC)Reply

Both should be deleted. They do not conform to the affix category system we use, which does not distinguish by part of speech. -- Prince Kassad 17:00, 11 January 2011 (UTC)Reply

XML dumps, stats

Finally, after two months of standstill, XML dumps are being produced again and en.wiktionary was one of the first. Consequently, WT:STATS has been updated (by Conrad.Bot) and reveals that Swedish is now the 11th biggest language here, having more than 40K entries, up from position 19 (19K entries) in October and 24 (11K entries) in August. --LA2 14:08, 11 January 2011 (UTC)Reply

New languages: Catawba, Dolgan, Meänkieli, Nogai, O'odham, Old Swedish, Sundanese, Uab Meto. Lingua Franca Nova is no longer in the list. -- Prince Kassad 15:12, 11 January 2011 (UTC)Reply

Fictional and other characters

I believe I could place all entries defined as fictional characters into Category:Fictional characters and some of its subcategories accordingly.

I have not placed Zeus, Osiris, sphinx, mermaid, Santa Claus and Mephistopheles, among other entries, in that category. It would be more reasonable to place them into Category:Folkloric characters and/or Category:Mythological characters. --Daniel. 20:06, 11 January 2011 (UTC)Reply

  • Do these categories have a use? Or is it just to satisfy editors' mania for categorisation? Ƿidsiþ 20:07, 11 January 2011 (UTC)Reply
    Yes; no. --Daniel. 20:15, 11 January 2011 (UTC)Reply
    If yes, Daniel, would you please specify?​—msh210 (talk) 20:37, 11 January 2011 (UTC)Reply
    My personal view on topical categories can be generalized and explained briefly as follows: They serve as useful references. For example, I own multiple dictionaries, including one of chess, another of Greek mythology, another of literature... Since we have Category:Chess, Category:Greek mythology and Category:Literature, I virtually feel like we are competing with them, and I like it. I like to be able to find meaningful and restricted relationships between terms, including lists of words that name animals, colors, etc. In theory, if I want to know all terms about chess, I should just navigate the specific category. In exchange, editors interested in the maintenance of topical categories would be reasonably expected to make them navigable and complete if possible. If a topical category is perceived as incomplete, it serves as a clue for creating the necessary entries and categorizing them.
    Notably, the functions of Wikisaurus and topical categories often overlap, but I personally see them as clearly different projects, with different scopes, merits and presentations. I expect topical categories to be simpler and cleaner in comparison with Wikisaurus; both projects provide lists of "synonyms", "coordinate terms", "hyponyms", "meronyms" and "instances", but the categories effectively hide these labels by simply displaying a raw list of items. Wikisaurus displays useful linguistic information and more relationships.
    In addition, I expect Wikisaurus pages to be eventually much more numerous, due to their elaborated and complex approach to each concept, that may naturally lead to the inclusion of more concepts, regularly. While we have both Category:Animals and WS:animal with pleasant results (aside from few issues such as which categories of a category tree should categorize each member, that is a can of worms to be perhaps discussed in another thread), I highly doubt we would have equivalent categories for WS:boring and WS:precognitive.
    The choice of whether a topical category should exist is highly subjective and prone to editor discretion. Recently, on WT:RFDO, various topical categories have been nominated for deletion with only the obscure argument of considering them "overly specific". As a result, I parodied this phenomenon somewhere by mentioning that we would not want Category:Brown quadrupedal animals or Category:Musical instruments that touch the ground indeed. With the natural limits of topical categorization in mind, I believe it is safe to maintain Category:Fictional characters. It contains specific characters such as Dr Jekyll, Dracula and Batman; stock characters such as tsundere, shoulder angel and superhero; and also roles of characters, such as protagonist, title character and antagonist. These groups of entries fit well together, especially separately from the broader Category:Fiction, which contains not only those characters, but multiple other terms.
    The additional categories for folklore and mythology would simply follow suit, presenting a way to find all the characters defined on Wiktionary easily and also a separation from broader categories such as Category:Mythology. It seems natural to me to separate fiction from mythology. If necessary, it may be useful to create Category:Religious characters as well in the future, while we already have Category:Biblical characters.
    Most or all of my words above may or not seem obvious due to the nature of the discussed projects, but to be on the safe side, I did explain them anyway. --Daniel. 11:17, 12 January 2011 (UTC)Reply
    No; yes. -- Prince Kassad 20:24, 11 January 2011 (UTC)Reply
I can't explain why, but "folkloric characters" and "mythological characters" both sound wrong to me. The former I would express as "characters from folklore"; the latter, as "mythological figures". Is that just me? —RuakhTALK 20:35, 11 January 2011 (UTC)Reply
Me, too.​—msh210 (talk) 20:37, 11 January 2011 (UTC)Reply
For what it's worth, there are at least some thousands of pages containing either "mythological character" or "folkloric character" according to Google. Nonetheless, Ruakh's suggestions are good enough for me. I probably would not oppose the possible creation of Category:Mythological figures and/or Category:Characters from folklore. --Daniel. 11:17, 12 January 2011 (UTC)
Are characters from fairy tales, such as Big Bad Wolf, Little Red Riding Hood and Prince Charming, "fictional", "from folklore" or "mythological"? --Daniel. 11:17, 12 January 2011 (UTC)Reply
At the present, poor Little Red Riding Hood can be found in categories: Fiction | Fairy tale | Fairy tales | Artistic works | Fictional characters | Fictional people | Fairy tale characters. Isn't it a bit repetitious? Isn't fairy tale folklore, and why must it be said three times? Please don't add any new categories. Shouldn't we have as few categories as possible? --Makaokalani 17:33, 14 January 2011 (UTC)Reply
It depends on whether or not we want to repeat categorization in various levels.
For example, the word dog is a member of Category:Dogs and of Category:Canids. It is not a member of Category:Mammals or of Category:Animals, but arguably can be added to them. --Daniel. 00:39, 15 January 2011 (UTC)Reply
Wouldn't it make more sense for clearly hierarchical categories like Category:Dogs > Category:Canids > Category:Mammals > Category:Animals to handle parent categorization automatically? I.e., such that simply marking a word as Category:Dogs alone would automatically categorize the word under all the parent categories as well. -- Eiríkr Útlendi | Tala við mig 00:53, 15 January 2011 (UTC)Reply
I agree. --Daniel. 05:45, 15 January 2011 (UTC)Reply
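The automatic parent categorization suggested above could look roughly like the following Python sketch. It assumes each category has at most one parent, which is a simplification; the `PARENT` table and the `with_ancestors` helper are hypothetical names used only for illustration, and the chain is the one given in the discussion.

```python
# Hypothetical parent table for the chain mentioned in the thread.
# Assumes a single parent per category, which real category trees may not satisfy.
PARENT = {
    "Dogs": "Canids",
    "Canids": "Mammals",
    "Mammals": "Animals",
}


def with_ancestors(category, parent=PARENT):
    """Return the category followed by all of its ancestors, nearest first."""
    chain = [category]
    seen = {category}
    while chain[-1] in parent:
        nxt = parent[chain[-1]]
        if nxt in seen:  # guard against accidental loops in the table
            break
        chain.append(nxt)
        seen.add(nxt)
    return chain


# Marking an entry as "Dogs" alone would then imply every broader category.
print(with_ancestors("Dogs"))
```

Whether entries should actually be listed in every ancestor category is the open question in the thread; the sketch only shows that the expansion itself is mechanical once the hierarchy is recorded in one place.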

Template:defn outside of language sections

In my opinion, editors shouldn't add {{defn}} outside of language sections, such as {{defn|lang=et}} in a Finnish or Spanish section. I'm sure the idea is to get an editor to add the word, but I've seen some weird things like {{defn|lang=ro}} between two Latin definitions. The 'correct' procedures would be Wiktionary:Requested entries or to create a language section for the word using defn or {{rfdef}} (defn has been nominated for deletion). Mglovesfun (talk) 15:30, 12 January 2011 (UTC)Reply

Coincidentally, I removed all the {{defn|lang=et}}s a few minutes after you posted this, listing them all at [[Wiktionary:Requested entries (Estonian)#A]]. Needless to say, I agree with you. —RuakhTALK 00:09, 15 January 2011 (UTC)Reply

Organizing Japanese entries

Forgive me if this has already been gone over recently; I've been out of the loop for a few years.

Japanese presents an interesting organizational challenge for Wiktionary, in that we appear to have multiple different locations where a single Japanese word can be entered:

  • Romaji (possibly more than one page)
  • Katakana
  • Hiragana
  • Kanji (plus okurigana if appropriate; possibly more than one page)

The question becomes, where should etymologies and inflection information go? Putting all this information on all the pages creates additional work, and increases the likelihood of the information falling out of synch. I'd like to propose that phonetic entries not include extensive information like inflections and etymologies, and primarily list other more specific entries that would then include fuller information. Exceptions would be cases where the phonetic rendering is the form most commonly used, such as しかし (然し exists, but its usage is archaic, and thus this entry should point the user towards しかし instead).

Subaru presents an excellent example here. The various possible renderings, in no specific order:

  1. Subaru (the company, romaji for the star cluster)
  2. subaru (romaji for the verb)
  3. スバル (the company, possibly also a phonetic entry for the verb and noun forms)
  4. すばる (basic phonetic entry for the company, the verbs, the noun)
  5. 昴#Japanese (the Pleiades star cluster)
  6. 窄ばる (one verb meaning with specific contextual overtones)
  7. 統ばる (another verb meaning with different contextual overtones)

This gives us seven different headings for subaru. Other Japanese dead-tree and electronic dictionaries take the general approach that phonetic (generally hiragana, sometimes also romaji) entries list (or redirect to) the specific katakana or kanji-based entries, which then give the etymologies, inflections, and other information. So for the subaru example, this would break down as follows:

  1. Subaru: Full definition as the automaker division of Fuji Heavy Industries, link to Wikipedia page, etymologies, スバル as alternate
  2. subaru: List other renderings briefly defined to help user pick the relevant one
  3. スバル: Subaru as alternate; list other renderings briefly defined to help user pick the relevant one
  4. すばる: List other renderings briefly defined to help user pick the relevant one
  5. 昴#Japanese: Full definition as Pleiades star system, examples, etymologies, etc.
  6. 窄ばる: Full definitions, verb conjugation, examples, etymologies, etc.
  7. 統ばる: Full definitions, verb conjugation, examples, etymologies, etc.

Is this clear? What does everyone else think? I look forward to the discussion. -- Eiríkr Útlendi | Tala við mig 00:02, 14 January 2011 (UTC)Reply

Sorry to hijack your topic, but this problem concerns English entries as well. Compare for example center and centre, or color and colour. -- Prince Kassad 00:11, 14 January 2011 (UTC)Reply
No worries, that's a good point. Part of this issue is caused by the limitations of the Wiki software and how data is presented. A different database design could easily show all relevant data in one window, possibly reordered as appropriate. Which makes me wonder if there'd be any way to use transclusion -- i.e. keep all relevant info stored under one entry heading, and simply transclude into the other possible renderings? This wouldn't be a template per se, but using the same basic mechanism. Would that even be possible? -- Eiríkr Útlendi | Tala við mig 00:17, 14 January 2011 (UTC)Reply
Partially answering my own question, I've learned about labeled section transclusion, which would seem to do exactly what's needed here. After choosing a headword under which to enter all entry information, alternates could then simply transclude all relevant portions. This would neatly avoid the problem of entering identical information in multiple places, with that information possibly diverging over time. I just looked at centre and center, for instance, and found some notable differences that should probably be resolved (center's etymology is much more complete, for instance). -- Eiríkr Útlendi | Tala við mig 22:12, 14 January 2011 (UTC)Reply
I've just used labeled section transclusion to basically clone the 忍坂 entry on its alternate spelling page 忍阪. Might be worth a look-see; this could be an easy and elegant way of dealing with alternate spellings. -- Cheers, Eiríkr Útlendi | Tala við mig 23:31, 14 January 2011 (UTC)Reply
Personally I think that the place of the main content should conform to etymology. If the word is native or its borrowing from Chinese cannot be ascertained, it should have its main content in its hiragana (sometimes katakana) entry, and other forms should only include a link, with no detailed explanations. The reason is that kun'yomi (as well as ateji and (to a lesser extent) jukujikun) often involves multiple correspondences of hanzi to create nuances in meaning, and it's hard to decide which one is the dominant form. If the word is clearly Sino-Japanese, then its main content should be at its Kanji entry. For words with unknown etymologies, the most common form should be used. (This is how the Japanese Wiktionary organises its entries.) Wjcd 00:33, 14 January 2011 (UTC)Reply
So if I understand you correctly, for the subaru example above, the verb definitions and etymologies should then be listed under the hiragana heading すばる, since the etymologies are substantially the same, but with the definitions specifying which kanji is used for which sense. Is this what you mean? -- Eiríkr Útlendi | Tala við mig 22:12, 14 January 2011 (UTC)Reply
Yes. Wjcd 04:40, 16 January 2011 (UTC)Reply
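The placement rule proposed by Wjcd can be written down as a small decision function. This is a minimal Python sketch, assuming the etymological origin of a word has already been classified; the `Origin` labels and the `main_entry` function are hypothetical, and the 電話 example is an illustrative Sino-Japanese word that does not appear in the discussion.

```python
from enum import Enum


class Origin(Enum):
    NATIVE = "native"              # native, or borrowing from Chinese cannot be ascertained
    SINO_JAPANESE = "sino"         # clearly Sino-Japanese
    UNKNOWN = "unknown"            # etymology unknown


def main_entry(origin, kana, kanji, most_common):
    """Pick the page that should hold the full content, per the proposed rule."""
    if origin is Origin.NATIVE:
        return kana          # other spellings only link here
    if origin is Origin.SINO_JAPANESE:
        return kanji
    return most_common       # unknown etymology: use the most common form


# The verb subaru is native, so its content would live at the hiragana page.
print(main_entry(Origin.NATIVE, "すばる", "統ばる", "すばる"))
# An illustrative Sino-Japanese word would live at its kanji page.
print(main_entry(Origin.SINO_JAPANESE, "でんわ", "電話", "電話"))
```

All other renderings of the same word would then carry only a link to the page this function returns, as described in the proposal.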

Desysopping

Believing this is the best venue to discuss this issue (rather than opening various votes for desysop pages on every single one of our inactive admins), I want to request confirmation on what this past discussion regarding such sysops means for the following administrators. Note that their desysopping is not a jab at their character or any of their normal functions; it is just a matter of having no edits within the past year:

TeleComNasSprVen 22:09, 14 January 2011 (UTC)Reply

Boy, when you do edit, you really go for it! Mglovesfun (talk) 22:43, 14 January 2011 (UTC)Reply
I'd like to decide them on an individual basis. At least for brion, however, he does not need it since he has the developer right, which grants him the right to give himself sysop status on any wiki. -- Prince Kassad 22:57, 14 January 2011 (UTC)Reply
There is certainly precedent for a desysopping without prejudice after one year of inactivity. The benefits as I see them are a more accurate picture of the number of people working (which gives us incentive to promote more people to help when needed) and the fact that "the community" changes and most likely many of these folks are not known by "the community" and do not have "the community"'s support. This is the best reason for term limits on wikirights, just because 5 years ago the group of people working on this project thought it was a good idea to make me a sysop does not mean that the folks working on the project now still think it is a good idea. I would be in favor of removing bits unless there is a good reason to keep them, though I would also be in favor of utilizing the "email this user" function (or in Alhen's case just messaging him in IRC, he is still active on es) to let them know what is going on in case they wish to unretire. - [The]DaveRoss 23:32, 14 January 2011 (UTC)
I agree completely. I should note, though, that there's not precedent for the notion that "just because 5 years ago the group of people working on this project thought it was a good idea to make me a sysop does not mean that the folks working on the project now still think it is a good idea"; on the contrary, the precedent in de-sysopping votes has generally been to make explicit that if the user ever returns, they can get the bit back without ceremony. (But I agree with your notion, and dislike this aspect of the precedent, and would be happy to abandon it.) —RuakhTALK 00:07, 15 January 2011 (UTC)
Not actually true about brion etc. Devs get sysop on a wiki through the usual community means, unless they need to make a change in their official capacity (for example for legal reasons), which is a totally different matter. (Spoken as a dev :-P) -- ArielGlenn 20:10, 19 January 2011 (UTC)Reply
I would have liked to see the users mentioned emailed and contacted before bringing this up in the Beer Parlour. But that might just be me. --Neskayagawonisgv? 00:16, 15 January 2011 (UTC)Reply
Not only do I agree with all of the above, but I have suggested similar things in the past. See associated data at User:SemperBlotto/Sysop Activity. SemperBlotto 08:42, 15 January 2011 (UTC)Reply
I don't see a problem with this - a message could be left on their talk page and they can be undesysopped on request if they resurrect themselves —Saltmarshαπάντηση 11:48, 15 January 2011 (UTC)Reply
Has anyone yet written any communication to these folks to inform them of the ongoing discussion? I don't think any action should be taken until that happens. - [The]DaveRoss 13:54, 16 January 2011 (UTC)Reply
I prefer that someone use the EmailUser function to contact them and ask them to come back as well. This is a well-known proposal for desysopping that I've made on other wikis, too. If they receive an email, they might turn active again. However, I use a fake address and so I can send emails but I cannot receive them. That's why I chose to bring it up at the Beer Parlour; to see whether anyone had any other ideas about them, and whether or not EmailUser would be necessary. TeleComNasSprVen 19:13, 16 January 2011 (UTC)Reply

Functionally, no real difference from my point of view, don't see any harm in having flags set as long as passwords are secure but no real need for the tools - I haven't had much time in the past year to contribute, I think I did the odd edit but forgot to sign in. -- Tawker 01:41, 20 January 2011 (UTC)Reply

Yeah, go ahead, I've been read-only for long enough anyway. Cynewulf 14:36, 31 January 2011 (UTC)Reply

Renaming "Mandarin" headings to "Chinese"

The Wikipedia page of w:Standard Mandarin has been recently moved to w:Standard Chinese after months of discussion, and I think it's about time for Wiktionary to follow this practice. "Modern Standard Chinese" (MSC, direct translation of its Chinese term 现代标准汉语) is the official language of the People's Republic of China and Taiwan, and is the de facto literary standard of all Chinese varieties. Although it is based on the phonology of the Beijing dialect and the grammar of northern Mandarin dialects in general, it is nonetheless inappropriate to simply call MSC "Mandarin". It functions as a high prestige spoken and written standard of all Chinese variants, in a way quite similar to Modern Standard Arabic (MSA), but is much more widely used than MSA. Saying that Standard Chinese is Standard Mandarin is implying the presence of other standard Chinese languages when in fact there are none; there is no such thing as Standard Gan, Standard Wu, Standard Hakka, etc. Even Cantonese does not have a de jure written standard and its written form has to converge as much as possible to MSC. Just look at the major news websites from Hong Kong [3][4][5][6][7][8][9] - the thing written on those pages is essentially what Wiktionary calls "Mandarin" (I'm not eliminating the possibility of variety headings for dialectal words however). Nevertheless, the current Wiktionary policy of eliminating "Chinese" headings, based on some editors' superficial impression that the lack of mutual intelligibility across Chinese varieties makes them vastly different when spoken, ignores the fact that Wiktionary is written-language-orientated, and gives unfair treatment of Chinese compared with other languages in similar situations.
For example, Serbo-Croatian, Arabic and Chinese are all macrolanguages under ISO 639-3; Serbo-Croatian is written in different scripts and has dissimilar literary standards in different countries; Arabic has a modern standard language (MSA) based on Qur'anic Arabic which is less promulgated and used than MSC; and Chinese has a single (spoken and) written standard (ignoring the simp-trad complication) which is widely used amongst virtually all varieties. Yet Serbo-Croatian and Arabic headings are accepted perfectly fine, but the Chinese headings have to be forcibly changed to "Mandarin". This hardly seems reasonable.

I know this may seem like a significant change, but personally I think this is something that has to be changed sooner or later, before Wiktionary is filled up with horrendous-looking reduplicate sections of written-language-sharing languages (as you like) which are neither official in any state nor phonologically and grammatically standardised. With bots, the change should be fairly easy. But first, we have to start acknowledging that designating a written language as "Chinese" is perfectly justified (because written Chinese is Vernacular Chinese which is in turn Standard Chinese) and hence give some tolerance to those headings; and then, it's the work to expunge the misnomerous "Mandarin" headings. Wjcd 04:40, 16 January 2011 (UTC)Reply

This does not seem like a significant change; it's a label. It is, however, one I oppose, since Cantonese, Wu and other languages are as equally Chinese as Mandarin. The rhetoric about intolerance towards those languages not official in any state I find rather disturbing. It's simply bizarre to say that a language is not "phonologically and grammatically standardised"; by the definition of language, there is sufficient agreement among speakers as to phonology and grammar to communicate. Probably more so among most unofficial languages than among the large official languages that have millions of second language speakers and groups of speakers that have been geographically separated for some time.--Prosfilaes 06:24, 16 January 2011 (UTC)Reply
It is true that these languages are equally Chinese as Mandarin, but the thing is that there is only one de facto literary standard in Chinese, which happens to be based on a dialect of Mandarin (w:Regional language#Relationship with official languages). So whilst the other varieties are sometimes significantly different in speech from the literary standard, when written convergence has to be implemented. Because there is virtually nothing published which can be used as a written attestation of a term in Chinese varieties other than MSC, and it is similarly quite hard to find audio or video material which is "of verifiable origin" and "durably archived" to show the "clearly widespread use" of the term, it is probably best to avoid this kind of attestation. The high lexical similarity resulting from the logographic nature of Chinese characters also makes it hard to say that "this word is a strictly dialectal word", because locals often mingle into their MSC some regional expressions without realising their dialectal nature, and cross-dialectal borrowing is very common (the resultant mixture is termed "regional MSC"). Wjcd 09:33, 16 January 2011 (UTC)Reply
I still see no reason to label non-Mandarin languages as dialects.--Prosfilaes 21:57, 16 January 2011 (UTC)Reply
Like you said, "dialect" or "language" is only a label. Wjcd 00:49, 17 January 2011 (UTC)Reply
Sure. But dialect is a way of putting down people who don't speak the standard languages and the languages they speak. "Respectable people don't speak dialect." And ISO 639-3 has labeled them languages, so that's the default position for us; we have to argue why we're going against that.--Prosfilaes 02:45, 17 January 2011 (UTC)Reply
If we had the header ==Standard Mandarin==, your argument for writing ==Standard Chinese== would make sense to me: Modern Standard Chinese already implies a form of Mandarin. But we don't, so your argument seems like a stretch. You state that "Wiktionary is written-language-orientated", but I don't think that's true. Wiktionary itself is exclusively written, and it is largely dependent on written sources for verification; but these are limitations to be confronted, not strengths to be celebrated. —RuakhTALK 06:34, 16 January 2011 (UTC)Reply
The current header is a misnomer; the language that the section underneath it describes is properly called "Modern Standard Chinese". Proponents of the name "Mandarin", while realising that the de facto written form of Chinese is based on a dialect of Mandarin and that "Chinese" is too inhomogeneous a macrolanguage to be packed under one heading, fail to realise that the group of Mandarin dialects is not homogeneous enough to be described in such a way either, and that using "Mandarin" to refer to MSC is an inappropriate underrepresentation of MSC's actual use. Wiktionary's written-language-orientatedness is inherent; we essentially take words as how they are normally written, and attestation via non-written means is essentially negligible. This is especially true for languages which may differ greatly in written and spoken forms, such as Tibetan and Finnish (and Chinese). Wjcd 09:44, 16 January 2011 (UTC)Reply
I see no reason to change the system, and certainly we should not necessarily model ourselves after Wikipedia since the aims of the projects are so different. However I have an open mind, and would like to hear how exactly you propose to change it - e.g. how would you define words with the same characters but under different languages (Mandarin, Cantonese, Min Nan, etc)? These are different (read: mutually unintelligible) languages and one reading will inevitably have different meanings, pronunciations, etymologies, usage notes, etc, all of which is very relevant information for Wiktionary. ---> Tooironic 07:17, 16 January 2011 (UTC)Reply

My envisaged layout would be:

Wjcd 10:48, 16 January 2011 (UTC)Reply

Initially oppose, if I've understood we're talking about merging the Chinese languages into one, not just changing a header. OK, sure there are lots of similarities between the languages, but what about Spanish, Italian, Portuguese, Occitan, Catalan (etc.) I wouldn't want to merge them into "Romance". That said, I'm not too hot on Chinese, so I await further argument. Mglovesfun (talk) 12:44, 16 January 2011 (UTC)Reply
I hope that the idea is not to merge Chinese languages.
I oppose renaming Mandarin: this is a language, with its own ISO code, etc. and Mandarin seems to be the most standard name for this language (Chinese is ambiguous, and standard Chinese less common). But I would not oppose Chinese headers to be allowed in addition, for readers looking for Chinese words, without knowing exactly what precise language they mean. Lmaltier 14:36, 16 January 2011 (UTC)Reply
Arabic and Serbo-Croatian are both macrolanguages under ISO 639-3, and these headings are currently allowed in Wiktionary. Would it be reasonable to further split a single Arabic section into 20+ copies, for people may be more interested in their language status than the actual content? Or to further split Serbo-Croatian into its component languages, as Serbo-Croatian is pluricentric with multiple literary standards? Wjcd 14:51, 16 January 2011 (UTC)Reply
Of course, this is reasonable. The basic principle is a section for each language. Sections would not be identical, they may differ for pronunciation, examples, probably senses of the word (at least in some cases), usage notes, homophones, etc. We have sections for Egyptian Arabic, etc, Croatian, etc. and this is perfectly normal. Lmaltier 15:42, 16 January 2011 (UTC)Reply
No it's not normal. "Croatian" is a language fabricated in the 1990s by Croatian nationalists once the former federal state seceded from the Communist Yugoslavia. It's exactly the same language as Bosnian, Serbian and Montenegrin, having 99% identical grammar. The notion of "macrolanguage" is non-existent in linguistics: that's a term only used by Ethnologue, a Christian organization intent on translating the Bible to as many languages as possible. Croatian language sections are completely obsoleted by Serbo-Croatian. I don't see how the situation with it is applicable at all to the Chinese scenario: the issue here is merely terminological (Mandarin vs. Chinese), not how the content should be formatted or split. EDIT: Whoops: after reading more thoroughly it appears that the issue is more than about renaming Mandarin to Standard Chinese. This discussion is very misleadingly named. --Ivan Štambuk 16:17, 16 January 2011 (UTC)Reply
The discussion is not about Croatian. The Chinese case is more similar to Arabic. I don't speak Arabic but I know that words are different between different Arabic languages. An example: a river is pronounced as something like wadi in the Yemen, and something like wed in Algeria, they are different words, even if the writing and the origin is the same (it might be somewhat different, I'm not a specialist). Lmaltier 17:12, 16 January 2011 (UTC)Reply
Just because a word has different pronunciations, does that mean they aren't the same language? With r's and vowel changes, there are very few words pronounced the same in Northern English dialects of English and Western American dialects. I'm not sure that comparing Chinese, Serbo-Croatian and Arabic is reasonable here; they're each sui generis. Chinese at least seems to have less of a resistance to the concept of multiple languages than the Arabs do, and yet a script that has more tolerance to phonological diversity--though I think in Arabic, if I'm not mistaken, wadi/wed would both be written vowelless and hence spelled the same.--Prosfilaes 18:48, 16 January 2011 (UTC)Reply
wadi/wed is not a small variation in pronunciation... What I mean is that words from different languages are different words. Would you want to merge the English and the French section of interjection, because the spelling, the etymology and the meaning are the same? No, because languages are different. Lmaltier 19:16, 16 January 2011 (UTC)Reply
Orthographically, interjection is one and the same word, written in English or French. --Ivan Štambuk 19:34, 16 January 2011 (UTC)Reply
There's a lot of languages with major systematic changes. Losing phonemes is not rare in dialects of English or Spanish. Is not kasa and kaθa a large variation in pronunciation? There are two words in Spanish, either of which can take either pronunciation, depending on where you are in the Spanish speaking world, and the variation between s and θ, even when it confuses words pronounced differently in standard European Spanish, is systemic. What about fag versus cigarette in English? So, no, without a serious corpus and linguistic study, I'm not really interested in debating what is or isn't a large variation in pronunciation. I wouldn't merge them because there's advantages to being systematically consistent. I think whether or not they are the same word is a definitional game; they are if you think that words with the same pronunciation, spelling, etymology and meaning are the same word.--Prosfilaes 21:48, 16 January 2011 (UTC)Reply
The Spanish s-θ isn't a big variation. In Mandarin, for example, in the Tianjin dialect (100+ km from Beijing), there are significant phonological variations from the Beijing dialect. All the retroflexes are dropped, merged with their fricative or affricate counterparts or a semivowel, and the high level tone becomes a low falling one, the dip tone becomes purely low rising. These are, however, not important. For languages which have a literary standard, such as Arabic and Chinese, reduplicating the sections just because the languages have their own ISO codes is not reasonable. This is more so for Chinese, which has a script tolerant to phonological variation, in which this ثلج - تلج kind of dialectal variation in written Arabic is not possible. Wjcd 00:49, 17 January 2011 (UTC)Reply

Support renaming Mandarin to Chinese. Mandarin is a fruit grown by Georgians, not a language. --Vahag 15:12, 16 January 2011 (UTC)Reply

Chinese is a nationality of a state with 292 different languages that fall into seven different language families. Chinese is also the name of an ethnicity that speaks at least seven Sinitic languages. For linguistic confusion, I think that blows the fruit/language one out of the water.
(And I think you weren't being entirely serious with that second sentence, but it surely has me confused. w:Mandarin orange lists neither Georgia nor the US as major growers of the fruit, so I don't know which one you were referring to. I suspect that Georgian Caucasian language versus US dialect is probably an actual confusion among some of our users, but one a little--non-Wiktionary-specific--education will clear up, and I believe most of our users will be wise enough to look up Georgian if they are confused by a Georgian entry. Of course, on Commons, we had someone telling us that an image needed fixed, because Hurricane Dora was in the Atlantic, not the Pacific, and it was in 1969, not 1988, so apparently some people won't pause to figure out polysemy.)--Prosfilaes 18:48, 16 January 2011 (UTC)Reply

Do note the existence of Dungan, however. It is not written in Chinese characters. -- Prince Kassad 19:53, 16 January 2011 (UTC)Reply

  • Strongly oppose. I think the system that has been proposed above is utterly complicated and confusing - as if making Chinese language entries on Wiktionary isn't difficult enough as it is! Furthermore, to group all the languages under one language header is misleading because just because a word can have multiple readings does not mean that all the readings potentially have the same meaning - they are different languages after all. ---> Tooironic 22:55, 16 January 2011 (UTC)Reply

1) These are all valid languages with proper ISO codes. We'd rather have fourteen or so languages which share a single written form listed separately, because of fear of potential differences in meaning. Please, are there any nameable cross-dialectal nuances in meaning in any of the English words "ambiguous", "vague", "obscure", "intimate", "dubious", "explicit", "relationship", "modern", "language"? Yet English is unregulated officially, and written Chinese is regulated.

response: I think Tooironic was thinking of a word like 垃圾. -- A-cai 02:43, 17 January 2011 (UTC)Reply

2) There is basically nothing that can be regarded as an attestation of this term in Min Zhong, Min Bei etc., according to Wiktionary guidelines. What's spoken in dialects remains spoken, in written form it has to be Modern Standard Chinese. So if someone lists the above definitions for attestation, all but "Mandarin" will fail.

response: I would be very careful about making that type of assertion. The fact that a language is not generally written down does not mean that it can't be written down. This is why we have Wikipedias written in Min Nan, Min Dong, Hakka, Cantonese and Gan, among others. -- A-cai 02:43, 17 January 2011 (UTC)Reply

3) The name "Mandarin" is erroneous. There is no such thing as "modern written Mandarin". The language that is described under this header is "Modern Standard Chinese", which is applicable as a literary standard to nearly all Chinese varieties. Mandarin is a group of dialects spoken across much of northern and southwestern China, not written. Wjcd 00:49, 17 January 2011 (UTC)Reply

Wjcd, first of all, allow me to welcome you to Wiktionary. I see that you are a native Chinese speaker. We desperately need those around here, so I hope you stay and continue to contribute for many years to come. You may or may not be aware, but I was the person who originally advocated switching the label from Chinese to Mandarin several years ago. The debates are still in the Beer parlor archives, if you care to look them up. I don't want to rehash that stuff all over again, so I will confine myself to responding to some of the points that you made above.
  1. You said, "Standard Mandarin is implying the presence of other standard Chinese languages when in fact there is not." This is not how I interpret "Standard Mandarin." "Standard Mandarin" means that there are other varieties of "Mandarin" that are not standard. These varieties of Mandarin can be quite different from the "Standard Mandarin" that you hear on CCTV or read in the People's Daily.
  2. You said, "The current header is a misnomer; the language that the section underneath it describes is properly called "Modern Standard Chinese." That's not quite right. There is nothing modern or standard about the term 仆射, yet it is entirely appropriate to include under a Mandarin label. The reason it is appropriate, in my view, is that there is a Mandarin reading for it and it appears in Mandarin language texts. However, it is a historical term, not a modern one. It also features a non-standard reading for the second syllable, not a standard one. That means that we would not be able to include it under a label that implies modern and standard.
In conclusion, the "Mandarin" label simply means any word in Mandarin, whether ancient or modern, standard or non-standard (Mandarin). Of course, for the sake of simplicity, anything not otherwise labelled is considered to be the "Modern Standard" form of Mandarin, as opposed to an "archaic" or "non-standard" form of Mandarin. I hope this clarifies my position, and I look forward to your response. -- A-cai 02:16, 17 January 2011 (UTC)Reply
First of all, Mandarin is a rather new classification of Chinese varieties. The term 仆射 has basically fallen into disuse by the Song Dynasty as the position was abrogated, which is before the widely-regarded time of inception of Mandarin (using Zhongyuan Yinyun as a division between Middle Chinese and Mandarin). Thus saying that this term is Mandarin is not very appropriate; sure it can be used later on in Mandarin, when talking about historical events, but there is similarly no restriction on this term being used in other Chinese varieties, just that the pronunciation changes according to the dialectal readings of the characters. Therefore it is best called "Classical Chinese" or simply "Chinese".
Secondly, we need to agree that the current "Mandarin" header refers 99% of the time to "Standard Mandarin", because that is the way it is normally written. Other Mandarin dialects are rarely mentioned, not because of the absence of anyone knowledgeable, but because most of the time the definitions are identical, only the pronunciations differ. Your example of 垃圾 is also a manifestation of this. What is taken to be the pronunciations in this "Mandarin" language is the pronunciations in Modern Standard Chinese; the Taiwanese pronunciation is Min-influenced, but still, it is Beijing dialect-based Guoyu, an example of regional MSC, not another Mandarin dialect. (By the way, 垃圾 may also be used adjectivally in MSC, as well as figuratively. Similarly from what I know, there are also nominal uses of this term in Min Nan (in Fujian), by analogy with MSC.)
That being said, since the Wikipedia page has been moved, in accordance with the actual use of MSC as a prestige literary (and spoken) standard, we should also reconsider the validity of the name "Mandarin" in headers. In fact, the Chinese either call the written form 现代标准汉语 (modern standard Chinese), 普通话 (common speech), or 國語 (national language), 華語 (Chinese language). There is no mention of "Mandarin" (官话, 北方话), which is not surprising since this written form is applicable to nearly all Chinese varieties. Surely local dialects could be written down using the phonetic aspects of Chinese characters, but this is not how people write their language (as evidenced in the above Hong Kong example). Wikipedia content certainly does not constitute an indication of attestability of term usage in non-MSC dialects; anyone could write there. Hence if amalgamating the Chinese definitions is not allowed, attestation in non-MSC dialects will be very difficult. Wjcd 03:40, 17 January 2011 (UTC)Reply
Is the proposal to rename Mandarin headings to Chinese or to merge all the Chinese languages under a Chinese header? --Yair rand (talk) 04:10, 17 January 2011 (UTC)Reply
It's about renaming, because the current "Mandarin" header refers essentially to "Modern Standard Chinese", the modern literary standard amongst Chinese varieties. Wjcd 05:14, 17 January 2011 (UTC)Reply

Mandarin Chinese: It's complicated

I'm creating a new subheader because the thread is becoming unmanageable.
  1. 垃圾 may be used as an adjective in Mandarin, but it is not used as a noun in Min Nan. The colloquial Min Nan equivalent is 糞埽. That's the point I was trying to make. The usages don't always match up. Also, sometimes the meaning between dialects is the same, but a given word may be in a formal register in one dialect whereas it is in an informal register in another. See Appendix:Sino-Tibetan Swadesh lists for a good example of what I'm talking about.
  2. Your point about 仆射 is well taken. It is in fact a Classical Chinese term, or to be more precise, an Old Chinese (ISO code och) and a Middle Chinese (ISO code ltc) term. It is also a Mandarin term, albeit an archaic one. The modern Mandarin reading is púyè. To be completely accurate, we would have to include an Old Chinese header, with our best guess as to how it might have been pronounced, along with a separate "Middle Chinese" header, along with our best guess for the pronunciation from this period. The definition section would probably also change, since the meaning of the term gradually evolved over time.
See for an example of how we thought this might eventually be implemented here. I know it seems like a pain to separate words under multiple headers, but the purpose of Wiktionary is not to create yet another mediocre bilingual dictionary. If you want one of those, I can direct you to about a dozen online dictionaries that would fit the bill. Our goal is to document every aspect of the world's spoken and written languages. It will take many years, and may simply not be possible in some instances. Yet, some of us continue to toil away year after year :) -- A-cai 04:23, 17 January 2011 (UTC)Reply
1) Yes, that is what I meant. When a paragraph based on MSC grammar or imitating MSC grammar is read out aloud by a Min Nan speaker using Min Nan readings, no MSC speakers would regard that as MSC. It may however be perceived as a more formal register of Min Nan.
2) I think our main focus should be on literary languages, not spoken. There is no point in differentiating between Old Chinese or Middle Chinese since the style of writing is largely identical (based on the eloquent speeches of Confucius etc.), and it's hard to ascertain when a new sense developed. Phonological reconstructions up to Proto-Sino-Tibetan are only necessary for monosyllabic characters; for polysyllabics it is primarily derivative work, based on monosyllabic characters, which I think is unnecessary since mechanically combining the reconstructed pronunciations ignores the important tone sandhis (as we are not even certain of the exact tonal values). The only differentiation needed is between Classical Chinese and Modern Vernacular Chinese, which can also be omitted if the senses deemed archaic or obsolete are tagged accordingly.
3) Actually, according to ISO standards, is not completed. There are at least six Chinese languages unaccounted for there. That's why I don't think we necessarily have to rely on ISO codes when it comes to headers. The work in getting them sectionalised gets exponentially laborious as one gradually proceeds, and in the end we come to the realisation that these "languages", whether historical or modern, are basically the same in terms of written language, and they converge to the prestige literary standard when written; and it's also very difficult to find attestations of these terms in languages other than Classical Chinese and MSC. For example, in , the sense "to marry, to wed" and "to nurture" regarded as already archaic in Old Chinese still pops up as a free morpheme in Qing Dynasty literature, and in a way still exists in Modern Chinese, in the form of many fossilised expressions (e.g. 待字闺中). If you are interested, please have a read of my proposed layout for monosyllabics above. By separating out languages that have a fundamentally different core vocabulary (ko, ja, vi) and listing the definitions in Chinese in a roughly chronological order, a general pattern of the dialectal differences in pronunciation and semantic development of characters should become clear. Wjcd 05:14, 17 January 2011 (UTC)Reply
I understand what you are proposing. I'm just not convinced at this point that it is the best approach. The fact that we're attempting the impossible is not lost on me. However, I don't view this as a short term project. I think of it as a multi-year, possibly multi-decade endeavor. As for your argument about written vs spoken, you're entitled to your opinion and I wouldn't want to discourage you from working on whatever aspect of wiktionary you feel is important. We can certainly use all the help we can get. However, please recognize that there may be some who feel that spoken languages are even more important to document, especially since so many of them are rapidly facing extinction. My personal opinion is that wiktionary needs both, and I don't necessarily think it matters which part gets done first. -- A-cai 05:35, 17 January 2011 (UTC)Reply
Wjcd, I basically support and understand your point of view about the standard Chinese, as Mandarin = Standard Chinese and most written Chinese is actually Mandarin and even the majority of dialectal words are already borrowed into Mandarin. Only a few dialects have a standardised or semi-standardised written form and the Chinese themselves choose to write in Mandarin when they do write. The marginal cases, notably Cantonese, with a limited number of specific characters, only support the case for a unified treatment of the language. However, a lot of contributions are made for the dialects as well and the headings and the way we translate Chinese entries have been discussed ad nauseam. We have many archived discussions and votes. The decision is made and can be changed only if there is another vote that passes. The status quo is a compromise for all contributors with a different view on Chinese languages/dialects/topolects (whatever you prefer). If you are willing to stay, you are welcome but please follow our guidelines. I did change my attitude since I joined. If you want to make a difference, not make a point, please contribute in Mandarin. We need more skilled editors. Everybody knows that the terms "Mandarin" and "standard Chinese" are interchangeable. --Anatoli 06:13, 17 January 2011 (UTC)Reply

It's complicated only if simple principles are not followed. Strictly following the rule all languages and the rule a language=a section, has several advantages:

  • it's simple,
  • it's the only solution anyway, as the project is not allowed to take a position on sensitive, controversial, issues (NPOV principle). This principle applies to Wikipedia, but also here.

multi-decade endeavor? Much more: the project cannot be completed: it's impossible, as many new words are created each year. But the objective of all words, all languages is what makes it successful. Lmaltier 06:54, 17 January 2011 (UTC)Reply

It has been stated at least three times that all the dialects write in standard Chinese but in fact it is the case that often newspaper articles written in Chinese characters in some regional dialects can be 40% or more unintelligible to standard Chinese-only readers, due to different grammar, vocabulary, etc. Doesn't everyone know this? This fact came up several times in the original discussion for separating the Chinese languages/dialects into separate headers several years ago, which I am not certain the new editor has yet read, or expressed an interest in reading, or plans to read. Yes, some local newspapers in southern Chinese regions probably use standard Chinese but those that actually write the local dialect in the local vernacular cannot be understood well by Mandarin-only readers. I'm sure A-cai, as a speaker of Min Nan, can easily find some text like this. 71.66.97.228 08:24, 17 January 2011 (UTC)Reply

Yes, that is one of the reasons I included the following example in the quotations section of 阿斲仔:
  • 有一ê歐巴桑去美國chit-thô,欲去便所ê時,因為m7捌字,煞行入去查甫 e0 彼間,無外久,一ê阿督仔行入去,隨擱闖出來,一直喝講:「I am sorry , I am sorry。」尾a0,彼ê阿婆仔行出來氣 tshuà tshuà 講:「夭壽哦!一ê阿督仔真無禮貌,行入來人ê便所,也擱怪人門「抑m7鎖咧!」
The above is from a collection of various writings, written in vernacular Min Nan, that I found on this website. Note that some words are written in Romanized script. This is quite common in vernacular Min Nan writings, since there are many Min Nan words that lack a standard written form. One solution to this problem is to simply write those words in Romanized script. Another solution, as is evident in the word 阿斲仔, is to substitute the original obscure character with a character whose Mandarin pronunciation roughly apes the pronunciation in Min Nan. Hence, 阿斲仔 becomes 阿桌仔 or 阿督仔. -- A-cai 12:04, 17 January 2011 (UTC)Reply

That looks like something from a blog; do you have any examples of newspaper text written where some southern dialect is spoken, which can unquestionably be said to not be written in Mandarin, but in one of those southern dialects? 71.66.97.228 22:41, 17 January 2011 (UTC)Reply

response: I'm not sure if there is anything like that on the web for Min Nan. I don't believe I have run across anything that formal. Such publications have certainly existed in the past. One example is given in the Taiwanese Hokkien article: Taiwanese Min Nan in Romanized text. However, there has never been an agreement as to how non-Mandarin languages should be written. Many of the informal blogs use a mixture of Chinese characters and letters. There are even entire novels in Min Nan that do it this way, an example of which is also provided in that same article: Examples of published works in Min Nan in a variety of different orthographies. Again, the larger point is that just because a language isn't generally written down doesn't mean that it can't be. Actually, do you want to know the most common example of written Taiwanese Min Nan in non-academic publications? Taiwanese karaoke song lyrics on KTV systems. The song lyrics are almost always written exclusively in Chinese characters, but heavily borrowing from Mandarin usage. For a more complete discussion of this phenomenon, see User_talk:A-cai/2009#甲伊. -- A-cai 01:22, 18 January 2011 (UTC)Reply

With hardly any support, I realise that the proposal is unachievable and unimplementable. Please close this discussion. Wjcd 22:47, 17 January 2011 (UTC)Reply

Wiktionary:Topical categories

My recent comments at WT:RFDO have stressed that there are no rules for topical categorization, so it's purely down to personal preference. Is anyone brave enough to write such a 'policy'? Does a 'consensus' exist? Mglovesfun (talk) 12:37, 16 January 2011 (UTC)Reply

My reply to the first question is: Sure, I guess. I started it now with a little information. --Daniel. 14:17, 16 January 2011 (UTC)Reply
There is a tentative guideline at Wiktionary:Categorization#Topic, linked to from Wiktionary:Topical category. It seems that Wiktionary:Topical categories can be turned into a redirect too. --Dan Polansky 14:45, 16 January 2011 (UTC)Reply
I tend to agree with the redirect unless there is enough information about topical categories to justify a separate page, rather than the "main" WT page. Mglovesfun (talk) 23:15, 16 January 2011 (UTC)Reply
I, too.​—msh210 (talk) 16:58, 18 January 2011 (UTC)Reply
I have turned Wiktionary:Topical categories into a redirect. --Dan Polansky 09:01, 22 January 2011 (UTC)Reply

give a man a fish, feed him for a day; teach a man to fish, feed him for a lifetime

Proverbs composed of two or more sentences seem rare or nonexistent in Wiktionary. Is that a consensus against their inclusion? Does anyone object to the creation of give a man a fish, feed him for a day; teach a man to fish, feed him for a lifetime or Give a man a fish, feed him for a day. Teach a man to fish, feed him for a lifetime.? --Daniel. 15:12, 16 January 2011 (UTC)Reply

We already have give a man a fish and you feed him for a day. Teach a man to fish and you feed him for a lifetime. "Wonderfool" beat you to the chase. --Downunder 19:18, 16 January 2011 (UTC)Reply
It also survived RFD almost unanimously, so I'd say go ahead and make more. —Internoob (DiscCont) 01:17, 17 January 2011 (UTC)Reply
Aren't they just rare in general? What other two-sentence proverbs are there? Equinox 10:54, 17 January 2011 (UTC)Reply
I'd say these.
--Daniel. 11:18, 17 January 2011 (UTC)Reply
Here's another.
And, here is just an interesting variety of that initial proverb.
  • Sell a man a fish, he eats for a day. Teach a man how to fish, you ruin a wonderful business opportunity.
--Daniel. 11:52, 17 January 2011 (UTC)Reply
Hm. IMO, some of these are platitudes, not proverbs, and one ("light a man on fire") is just a comedian's quip playing on the "teach a man to fish". That would be a bit like including knock-knock jokes. But I'm sure there are some okay multi-sentence ones. Equinox 13:07, 17 January 2011 (UTC)Reply

Always bold the head word?

WT:ELE says "We give a word's inflections without indentation in the line below the "Part of speech" header. There is no separate header for this. For uninflected words it is enough to repeat the entry word in boldface. Further forms can be given in parentheses." which actually isn't mega-helpful. I wanted to add the changes made by Ruakh to Template:Grek to all the script templates. However Prince Kassad undid my revision to {{Goth}}. I don't see where WT:ELE says always list the head word in bold, does it say it anywhere? I would like it to, I think per Wikipedia and other Wiktionaries/Wikipedias, we should always bold the head word, no matter what script. I've had no technical problems resulting from this on either Firefox or Internet Explorer. Mglovesfun (talk) 15:40, 17 January 2011 (UTC)Reply

On rereading, WT:ELE#A very simple example says "the inflection word itself (using the correct Part of Speech template or the word in bold letters),". Some languages don't use 'letters', but I think the spirit of the rule is "all head words in boldface". Mglovesfun (talk) 15:47, 17 January 2011 (UTC)Reply
I think that the only time we don't bold (is that a verb?) a headword is when it is in "Chinese" or other strange characters. Other than that - always. SemperBlotto 15:48, 17 January 2011 (UTC)Reply
Well "strange language" is too subjective in this context - see Template talk:ja-noun. It's true that Kanji and Hiragana don't use "letters" so much as "characters", so it would bypass ELE under the letter of the law. Mglovesfun (talk) 15:50, 17 January 2011 (UTC)Reply
One of our Hebrew editors (Ruakh? Shai? someone else?) decided that the he- inflection^Wheadword-line templates' making the words larger rather than boldfaced allows for greater readability. Methinks that's more important than consistency. (And as far as following the letter of the law, the passage Martin quotes from ELE says only that a template or boldfacing must be used.)​—msh210 (talk) 16:08, 17 January 2011 (UTC)Reply
Hmm that's a good point - and I agree that consistency should come second to language-specific considerations. It should say something like "the head word should appear in bold, except for certain scripts where this worsens readability." Mglovesfun (talk) 16:22, 17 January 2011 (UTC)Reply
Hebrew's biggitude is a blend of readability and consistency considerations: boldface makes the vowel diacritics impenetrable, and though we don't always include vowel diacritics, I think it would be weird to use embiggening when we include diacritics and emboldening when we don't. (That, and there's no good way to distinguish the two cases, since they're both he and Hebr.) —RuakhTALK 19:32, 17 January 2011 (UTC)Reply
Actually, I reverted the addition because it caused the template, strangely, to not work at all. -- Prince Kassad 16:23, 17 January 2011 (UTC)Reply
You might have to revert again, as I redid the modification and using diff, it's ended up exactly the same. That said, I've had no problem with it - has anyone else? — This unsigned comment was added by Mglovesfun (talkcontribs) at January 18 2011.
I've set up a small test case here. Tell me which of the letters you can see. -- Prince Kassad 19:13, 17 January 2011 (UTC)Reply
Running Firefox on Debian Linux unstable (that is, beta) showed me all of them, but they all looked exactly the same.--Prosfilaes 21:05, 17 January 2011 (UTC)Reply
I see the same as Prosfilaes. Mglovesfun (talk) 12:35, 20 January 2011 (UTC)Reply

Italics

Personal opinion, it's the way {{term}}'s set up, but we should only use italics for the Latin (Latn) script. I boldly remove the face=ital option from {{Cyrl}}. Perhaps a bit too boldly, but ah well. Mglovesfun (talk) 19:06, 17 January 2011 (UTC)Reply

Too boldly IMO unless you first checked that face=ital wasn't being used anywhere.​—msh210 (talk) 19:13, 17 January 2011 (UTC)Reply
You've slightly misunderstood (probably not actually). Cyrillic script is really hard to read in italics, to the extent that м and т (that is, м and т) look almost identical in italics. Same for some other letters. I removed it precisely because it is used by some template, the same way that {{infl}} calls face=bold for {{Hebr}}, but so long as face=head doesn't exist, it won't bold the head words. Mglovesfun (talk) 19:17, 17 January 2011 (UTC)Reply
I didn't follow your last sentence. What I meant, though, wasn't that it's unwise to remove face=ital if it's used (which is what all but the last sentence you just wrote seem to be thinking). I meant, rather, merely that it's unwise to remove it without discussion if it's used.​—msh210 (talk) 19:37, 17 January 2011 (UTC)Reply
Correct. Mglovesfun (talk) 23:52, 17 January 2011 (UTC)Reply

Publishing a Wiktionary

Let's say that i think that the Hebrew Wiktionary has enough good translations of English words into Hebrew. I write a script that dumps them to a nicely formatted file, print it and sell it in bookstores as "The Free English-Hebrew Dictionary, by Wiktionary contributors". It's supposed to be legal in general, but in practice - would it be enough to write on the first page: "This dictionary is published under the terms of the CC-BY-SA license. The list of contributors can be found in the history listing of each headword at http://he.wiktionary.org ."?

And did anyone already publish a printed Wiktionary in any language? --Amir E. Aharoni 00:08, 18 January 2011 (UTC)Reply

The Hebrew Wiktionary actually only includes Hebrew words, but to answer your question: yes, I think that would be enough. (I'm not a lawyer, though.)
I don't know about printed Wiktionaries, but a publisher called ICON Group International puts out lots of books that consist largely, or even primarily, of snippets from the English Wikipedia. I assume these books are print-on-demand, but still, you have to imagine that someone at some point has accidentally bought a copy of one, thereby bringing it into print. Their copyright page is here, if you're curious. I don't know if it would hold up in court, but obviously WMF hasn't sued them . . .
RuakhTALK 00:19, 18 January 2011 (UTC)Reply
I would suggest talking to an IP lawyer in the country where you wish to publish and sell your book. The purpose of the CC-BY-SA license was to make it easier for people to reuse content, and I know there is some reasonable method for content with many authors, but we are not lawyers (mostly) and the best place for legal advice is not here. - [The]DaveRoss 00:22, 18 January 2011 (UTC)Reply
I'm assuming that Amire80 (talkcontribs) was asking hypothetically, given that he chose an impossible example; but yeah, [[w:Wikipedia:Legal disclaimer]] obtains, as always. —RuakhTALK 00:41, 18 January 2011 (UTC)Reply
I believe Wiktionary is a trademark of the foundation. And the ICON Group International may get away with it, but I think it violates the license to handwave at someone else's website. Print the list of contributors to each page, just like PediaPress does, and you'll be clearly legit.--Prosfilaes 03:22, 18 January 2011 (UTC)Reply
The problem with that is that it would be larger than the rest of the entire book. The attribution requirement, as shown by the Mediawiki edit screen, is that the authors "agree to be credited by re-users, at minimum, through a hyperlink or URL to the page you are contributing to". --Yair rand (talk) 03:54, 18 January 2011 (UTC)Reply
You can fit a thousand names on a letter-size page in six point font. If you were to print out all two million articles, the 450 pages you'd need for the names wouldn't be that much of your text. alphabet has 24 editors credited, and several of them are bots that would arguably only need to be listed once. If you were doing a lot of excerpting, it would be more problematic, yes. But the Mediawiki requirements don't at all fit the requirements of attribution in the legal text; there's no guarantee the page will exist or have not been deleted and rewritten. And even at that, they say "through a hyperlink or URL to the page you are contributing to"; i.e. not "http://he.wiktionary.org", but a URL to every page. Oxford found space to thank me for my meager efforts to the Oxford Science Fiction Dictionary; it'd be nice to see the same when my work was used on Wikimedia projects.--Prosfilaes 06:17, 18 January 2011 (UTC)Reply
The Creative Commons Attribution-ShareAlike 3.0 says "You must attribute the work in the manner specified by the author or licensor (but not in any way that suggests that they endorse you or your use of the work)". In the case of the content of Wikimedia projects, the manner specified is simply that one gives the URL or hyperlink. And in any compiled printed list of the translations of Wiktionary, the list of authors would be about the same length as the translations list itself. --Yair rand (talk) 06:32, 18 January 2011 (UTC)Reply
BTW, the foundation:Terms of Use seem to say the same thing: Re-users must include either "a) a hyperlink (where possible) or URL to the page or pages you are re-using, b) a hyperlink (where possible) or URL to an alternative, stable online copy which is freely accessible, which conforms with the license, and which provides credit to the authors in a manner equivalent to the credit given on this website, or c) a list of all authors." --Yair rand (talk) 06:45, 18 January 2011 (UTC)Reply
Which means that if you have a list of words and their translations, you must provide a URL for each word. Again, about the same length as the translations list.--Prosfilaes 07:01, 18 January 2011 (UTC)Reply
But the URL for each word would just be he.wiktionary.org/wiki/WORD. It seems a bit ridiculous to waste so much paper printing that string for each and every word. If CC-BY-SA requires it, it's a practical problem. --Amir E. Aharoni 08:49, 18 January 2011 (UTC)Reply
The CC-BY-SA requires that you list the authors. Wikimedia is effectively amending the license by letting you use a list of articles instead. You could get away with a lot, but as I said, I'm a little miffed that my right to attribution is getting waived at all, so I'm more inclined to be a hard-liner on what I'm getting instead.--Prosfilaes 18:48, 18 January 2011 (UTC)Reply
The question was indeed hypothetical, but the practical side is indeed, this: unlike in an encyclopedia, the articles in a bilingual dictionary are usually very short and a complete list of contributors may make the printed dictionary very long. Also, if i just dump all the contributors to all articles, it will include translators of the word to other languages, who, with all due respect, are not related to this work - unless, of course, i write some clever code that filters them out.
But maybe someone already wrote such filtering code? And maybe someone already measured how much space a list of contributors would take? Or am i the first to raise this problem? --Amir E. Aharoni 08:49, 18 January 2011 (UTC)Reply
I think so. It's not a problem if you're printing full articles, only if you take a small excerpt from many of them. Again, a letter sized page can fit a thousand names in six point font. There are 450,000 unique contributors to Wiktionary; I believe that includes IPs, which I would argue waived their attribution rights by remaining anonymous. alphabet had 24 unique authors, some being bots or hard-working humans that will repeat from article to article. If someone is serious, then they might want to write a program to figure out exactly how many contributors you'd have to list.--Prosfilaes 18:48, 18 January 2011 (UTC)Reply
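As a rough check of the page-count estimate above, here is a minimal sketch. It assumes one printed name per unique contributor and about a thousand names per page, both figures taken from the comment above. The function names and the sample data are hypothetical, and nothing here filters out bots, IPs, or translators of other languages.

```python
import math


def unique_contributors(authors_by_entry: dict[str, list[str]]) -> set[str]:
    """Collect each contributor once, however many entries they edited."""
    names: set[str] = set()
    for authors in authors_by_entry.values():
        names.update(authors)
    return names


def attribution_pages(contributor_count: int, names_per_page: int = 1000) -> int:
    """Number of printed pages needed to list every contributor once."""
    return math.ceil(contributor_count / names_per_page)


# Hypothetical sample: two entries that share one author.
sample = {"alphabet": ["Alice", "SomeBot"], "nine": ["Alice", "Bob"]}
print(len(unique_contributors(sample)))  # three distinct names
print(attribution_pages(450_000))        # pages for 450,000 names at 1,000 per page
```

Deduplicating before counting is what keeps repeat editors and bots from inflating the list, which is the point made above about authors who "repeat from article to article".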

Don't forget, in the citation provided above, "or c) a list of all authors". I think that providing the URL of the site + an explanation on how to build the URL to each individual page + a list of all authors, at the beginning of the book (2nd page), would probably meet the conditions. Lmaltier 18:06, 18 January 2011 (UTC)Reply
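The "explanation on how to build the URL to each individual page" could look something like the sketch below. It assumes the he.wiktionary.org/wiki/WORD pattern mentioned earlier in this thread and the usual MediaWiki convention of turning spaces into underscores; the function name `entry_url` is hypothetical.

```python
from urllib.parse import quote

# Base address as given in the discussion above.
BASE_URL = "http://he.wiktionary.org/wiki/"


def entry_url(headword: str) -> str:
    """Build the page URL for one headword.

    Spaces become underscores, and any non-ASCII characters
    (for example Hebrew letters) are percent-encoded as UTF-8.
    """
    return BASE_URL + quote(headword.replace(" ", "_"))


print(entry_url("alphabet"))
print(entry_url("give a man a fish"))
```

Since the rule is the same for every headword, a printed book would only need to state it once rather than repeat the full string next to each entry.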

If you provide a list of authors anywhere, that should be sufficient on its own.--Prosfilaes 18:48, 18 January 2011 (UTC)Reply

Category:Definitionless terms

I've managed to get the contents of this category down from over 400 to under 70. Would anyone else like to try to get rid of the rest? There are entries in French, Hebrew and other strange things as well as English. Some of them might need deleting. SemperBlotto 11:14, 19 January 2011 (UTC)Reply

I wouldn't worry too much about getting them down to zero - I should be able to do escumer. I found citations for it but couldn't be bothered analysing them - that's why I've had a few days off to de-wikify my brain. NB if {{defn}} fails RFDO it will be way more than this, as {{defn}} doesn't categorize in this category, but {{rfdef}} does. Mglovesfun (talk) 12:37, 20 January 2011 (UTC)Reply

Rukhabot vote.

Quoth Wiktionary:Votes/bt-2011-01/User:Rukhabot for bot status:

I hereby request the Bot flag for Rukhabot (talkcontribs) for the purpose of adding interwiki links, as in these edits: [10][11][12][13][14][15][16][17]. Unlike other interwiki bots, Rukhabot uses custom code that only takes into account what mainspace page-names exist on other Wiktionaries; that is, it doesn't depend on existing interwiki links between other Wiktionaries.
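To illustrate the approach described in the request (links derived only from which page names exist on each Wiktionary, not from existing interwiki links), here is a minimal sketch. It is not Rukhabot's actual code: the function name, the data layout, and the sample titles are all assumptions made for the example.

```python
def interwiki_links(title: str, home: str, titles_by_wiki: dict[str, set[str]]) -> list[str]:
    """Return interwiki links for `title`, one per other wiki that has a page
    with exactly the same name. Existing links between wikis are never consulted."""
    return [
        f"[[{code}:{title}]]"
        for code in sorted(titles_by_wiki)
        if code != home and title in titles_by_wiki[code]
    ]


# Hypothetical page-name lists for three Wiktionaries.
pages = {
    "en": {"chat", "nine"},
    "fr": {"chat"},
    "de": {"chat", "nine"},
}
print(interwiki_links("chat", "en", pages))
print(interwiki_links("nine", "en", pages))
```

Because the lookup depends only on page-name lists, a wrong link on one Wiktionary cannot propagate to the others.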

RuakhTALK 13:11, 19 January 2011 (UTC)Reply

Esperanto corpus

I took the 66 Esperanto texts on PG and made a corpus. I haven't completely uploaded it, but User:Prosfilaes/Esperanto corpus has links to lists of all words found in more than 56 texts. It lists the word, and links to every text that includes it. Already I found a word, terura, that's a genuine Esperanto word used in 60 out of 66 texts, and isn't up here.--Prosfilaes 23:02, 19 January 2011 (UTC)Reply
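A word-to-texts index of the kind described above can be sketched as follows. This is only an illustration, not the script actually used for the corpus: the tokenizer (runs of letters, lowercased), the function names, and the tiny sample texts are assumptions.

```python
import re
from collections import defaultdict

# Runs of Unicode letters, so Esperanto letters such as ĉ and ŭ are kept.
WORD_RE = re.compile(r"[^\W\d_]+")


def build_index(texts: dict[str, str]) -> dict[str, set[str]]:
    """Map each lowercased word to the set of text IDs that contain it."""
    index: dict[str, set[str]] = defaultdict(set)
    for text_id, body in texts.items():
        for word in WORD_RE.findall(body.lower()):
            index[word].add(text_id)
    return dict(index)


def words_in_more_than(index: dict[str, set[str]], threshold: int) -> list[str]:
    """Words that occur in more than `threshold` distinct texts, sorted."""
    return sorted(word for word, ids in index.items() if len(ids) > threshold)


# Hypothetical three-text corpus; the real one has 66 texts and a threshold of 56.
corpus = {"a": "Terura tago.", "b": "La terura nokto", "c": "la tago"}
index = build_index(corpus)
print(words_in_more_than(index, 1))
```

Storing the set of text IDs per word, rather than a bare count, is what makes it possible to link each word to every text that includes it.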

nine and 9

How should we relate Roman and Arabic "ciphered" numbers to their "lettered" counterparts? Sometimes they are listed as Synonyms (as IX is on nine), and sometimes as Alternative forms (as I is on one). I don't much care at the moment. --Bequw τ 23:11, 19 January 2011 (UTC)Reply

I'd vote for all these to be synonyms. Alternative forms just doesn't seem right. Thus a user of any language's lettered number entry would have the translingual symbols available in the entry, whereas a user of the symbol entry needs to traverse the link to the English lettered word to find non-English lettered numbers. DCDuring TALK 01:30, 20 January 2011 (UTC)Reply
They are merely alternative spellings. Characters like 9 are logographs akin to Chinese characters in which one character represents the entire word. The underlying word, however, is the same word.--Brett 15:57, 22 January 2011 (UTC)Reply
Yes, but we present "9" as Translingual. If [A] is an alternative spelling of B, then [B] is an alternative spelling of [A]. This would seem to leave us with each language's lettered number appearing at the top of the entry for 9. Is this the presentation that you would recommend? Or is this a case where English "nine" has a privileged position over neuf and neun for example? I also doubt that we want a language section at 9 for each language. DCDuring TALK 17:39, 22 January 2011 (UTC)Reply
Currently, the entry for 9 gives only the English nine, and I think that's fine given that this is the English Wiktionary. Only in cases where 9 happens to mean something else would you need another language section there. But in the entry for nine or neuf, I would simply list 9 and IX as alternative spellings. I suppose it's not perfectly elegant following your logic, but it should suffice. Perhaps it might be momentarily frustrating if you look up 9 hoping to find out how to say it in French, but there are links to other language wiktionaries.--Brett 18:40, 22 January 2011 (UTC)Reply

Category:Cosmetics, Category:Makeup

Should they be merged? Ƿidsiþ 09:15, 20 January 2011 (UTC)Reply

I don't know, but FWIW there is Wikisaurus:toiletry, broader than both mentioned categories. --Dan Polansky 15:49, 20 January 2011 (UTC)Reply
Category:Makeup might only include some nouns, like into Category:Perfumes or more commonly like the Category:Sports subcategories. JackPotte 21:04, 20 January 2011 (UTC)Reply

New gadget

As some people asked for it, I have added a new gadget which adds country flags next to the language headers on entries, similar to how it's done on other wiktionaries such as Lithuanian Wiktionary. It is mainly meant for those people who think the current headers are too boring and/or do not stand out enough in the page. I realize that there are still many flags missing, hopefully I'll manage to add all of them. -- Prince Kassad 22:48, 20 January 2011 (UTC)Reply

Rather than fear of being boring in style, I sometimes think that people who look at Swedish sections might want a way to get in touch with the Swedish-speaking users of Wiktionary, i.e. a link to Wiktionary:About Swedish or something like that. But such anonymous newcomers have of course not installed any special gadgets. --LA2 22:50, 21 January 2011 (UTC)Reply
"All of them" meaning that, for example, the "English" header will have, next to it, the flags for the US, the UK, South Africa, India, Austl., NZ, a whole bunch of Caribbean countries, several African ones, and a few Middle Eastern ones?​—msh210 (talk) 08:17, 23 January 2011 (UTC)Reply
That was an issue raised on IRC, namely that the flag chosen is completely arbitrary. Obviously, we can't have all flags (it would cause vertical scrolling which is bad), we must decide on one. -- Prince Kassad 15:09, 23 January 2011 (UTC)Reply
Flag Law, as proposed by yours truly: 1. Flags will be displayed in the order (left to right) of the number of speakers of the language residing in the country. 2. Only the flags of countries which represent >= 10% of the global speakers of the language will be displayed; the maximum limit on the number of flags being the lesser of ten or the number which cause wrapping or scrolling on an 800px width resolution. 3. Flags will be displayed to the left and right (or only right) of the language name, a maximum of one flag to the left of the name and the remainder to the right. 4. You do not talk about Flag Law. - [The]DaveRoss 16:15, 23 January 2011 (UTC)Reply
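For what it's worth, rules 1 to 3 above are easy to state in code. The sketch below is only an illustration: the speaker counts are made up, "global speakers" is taken to be the sum of the table passed in, and the 800px wrapping limit from rule 2 is not modelled. All names are hypothetical.

```python
def flags_for(
    speakers_by_country: dict[str, int],
    max_flags: int = 10,
    min_percent: int = 10,
) -> tuple[list[str], list[str]]:
    """Return (flags left of the language name, flags right of it).

    Countries are ordered by number of speakers, only those holding at least
    `min_percent` of the total are kept, at most `max_flags` are shown, and at
    most one flag goes on the left.
    """
    total = sum(speakers_by_country.values())
    if total == 0:
        return [], []
    ranked = sorted(speakers_by_country.items(), key=lambda item: item[1], reverse=True)
    # Integer comparison avoids floating-point trouble at exactly 10%.
    kept = [country for country, n in ranked if n * 100 >= min_percent * total]
    kept = kept[:max_flags]
    return kept[:1], kept[1:]


# Made-up figures, purely for illustration.
left, right = flags_for({"A": 60, "B": 30, "C": 9, "D": 1})
print(left, right)
```

Note that a 10% threshold already caps the list at ten flags, so the explicit maximum only matters if the threshold is lowered.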
Can we have a Roman eagle for Latin? SemperBlotto 16:20, 23 January 2011 (UTC)Reply
I think it currently displays the flag of Vatican City. Is that offensive or misleading? -- Prince Kassad 16:24, 23 January 2011 (UTC)Reply
Change it to an eagle. Where do I edit flags? --Vahag 18:54, 23 January 2011 (UTC)Reply
MediaWiki:Gadget-WiktCountryFlags.css. —RuakhTALK 19:18, 23 January 2011 (UTC)Reply
I definitely like it and I'll be using it, but I do think the flags are rather big. I think they should be shrunk down to at most two thirds of the current size. There are also quite a few languages that are missing flags still, but I take it those are going to be added later? Perhaps for translingual sections a picture of Earth can be added. —CodeCat 16:34, 23 January 2011 (UTC)Reply
Hmm, I used the same size that's used at Lithuanian Wiktionary (45 pixels width), and it seems to work fine for me, but maybe not for everyone. And yes, the missing flags will be added soon. -- Prince Kassad 17:16, 23 January 2011 (UTC)Reply
More options: multiple flags can be displayed as shown here, or all flags can line up on the right side. They could all line up on the left side but I think we would all agree that is silly. - [The]DaveRoss 19:59, 23 January 2011 (UTC)Reply
Some sites make a single flag composed of the UK and US flags together. There is a diagonal divide from top right to bottom left, and the US flag is on the top left half and the UK flag bottom right. Maybe that would work for us too? —CodeCat 20:20, 23 January 2011 (UTC)Reply
I have seen that before on Wikipedia. Now if I could remember where it is... -- Prince Kassad 10:05, 24 January 2011 (UTC) (addendum: gotcha! )Reply
People are going to bitch for choosing the flag of communist Yugoslavia for Serbo-Croatian... --Vahag 23:37, 23 January 2011 (UTC)Reply
What would be the 'best' choice for Serbo-Croatian? Also, it is currently an opt-in gadget, and personal CSS will override gadget CSS, so if someone wanted to replace a certain flag (or all of them) they can also do that. - [The]DaveRoss 23:49, 23 January 2011 (UTC)Reply
I don't know. I like the communist flag with the red star. --Vahag 23:56, 23 January 2011 (UTC)Reply
There was a heated discussion over the icon choice for Perapera-kun, my favourite Mozilla Firefox plug-in for Chinese (also Japanese) pop-up dictionary. Some American users refused to use the Communist red flag as the icon (one guy's argument was he worked for the government and for him it was an issue if someone sees a red flag on his desktop) and chose an icon with a Chinese character instead, which I find very strange. Perhaps we don't need any icons? --Anatoli 00:57, 24 January 2011 (UTC)Reply
I'm decidedly not a fan of using a single country's flag to represent a language, as (a) it's insulting to those in other countries who speak the language and (b) may imply something about the dialect of the word in the entry. As the script is now, we should IMO never make this default, and I frankly don't like having it even as an option. If we use TDR's ideas, above, or others, for representing multiple countries, I wouldn't be (so) against it (though I wouldn't enable it, myself).​—msh210 (talk) 07:33, 24 January 2011 (UTC)Reply
I basically agree with Msh210. Unlike flags, languages are not country-specific; even currencies aren't. We will be giving the wrong impression. Equinox 09:49, 24 January 2011 (UTC)Reply
I more or less agree. (See also <http://www.useit.com/alertbox/flagproblem.html>, BTW.) I'm O.K. with it as an opt-in thing, though, since there are apparently users who want it. (The blend of U.S. and U.K. flags seems disrespectful to both, though!) —RuakhTALK 12:49, 24 January 2011 (UTC)Reply
Since the flags don't link to their file: pages, I assume they can't be GFDL/CC-BY licensed (which need to link to the authorship info AFAIK (IANAL)), so we'd need to use only PD files. Right?​—msh210 (talk) 07:35, 24 January 2011 (UTC)Reply
Can images of national flags be copyrighted in the first place? --Yair rand (talk) 07:44, 24 January 2011 (UTC)Reply
The SVG certainly can. I suppose if the server is sending PNGs that wouldn't be a problem in many cases, but File:Flag of Sicily.svg and File:Flag of Sicily (revised).svg show copyrightable variation. However, I think virtually all flags are PD, just for this reason.--Prosfilaes 04:54, 25 January 2011 (UTC)Reply
Templates don't link to the Template page, either (except for the edit window, which however is not immediately obvious). You could, for the same reason, remove all the templates on Wiktionary. -- Prince Kassad 08:45, 24 January 2011 (UTC)Reply
Arguing that something isn't true because we don't like the consequences is not a valid argument. We are pushing the lines by not linking to the template pages, but templates were made for and by Wikimedians who understand how templates are used so it's less of a big deal. And you can reach templates through the edit page; I see no way besides figuring out what a filename must be from a long URL hidden in a CSS file. Images were sometimes made by non-Wikimedians for non-Wikimedians and deliberately released under the CC-BY license (instead of having it pushed on them, like template writers). They may not appreciate as free an interpretation of the license.--Prosfilaes 04:54, 25 January 2011 (UTC)Reply
A better example might be our MediaWiki:Common.css and MediaWiki:Common.js, which are used for every single page. It is absolutely impossible to find these out if you don't know them. -- Prince Kassad 20:02, 28 January 2011 (UTC)Reply

Targeted Translations

I think it would be useful if User:Yair rand/TargetedTranslations.js were enabled by default. The script (based on User:Atelaes/TargetedTranslations.js) brings translations of specified languages into the NavHead (the grey part at the top of translation tables). It adds a small "Select targeted languages" button to the inside of translations tables so the users can specify their preferred languages, and not have to repeatedly open tables and search for the translation for every page. The script is currently available as a gadget in Special:Preferences. --Yair rand (talk) 23:58, 20 January 2011 (UTC)Reply

I don't know enough to say one way or the other about enabling it by default, but thanks to Yair rand for putting this together -- it sounds quite useful to me and I'm enabling it now. Cheers! Eiríkr Útlendi | Tala við mig 20:38, 21 January 2011 (UTC)Reply

Well, it's been a while, and nobody's objected, so ... I've enabled targeted translations by default. --Yair rand (talk) 23:30, 31 January 2011 (UTC)Reply

Just to make Yair happy I will voice approval here, so that it doesn't seem unilateral. - [The]DaveRoss 23:42, 31 January 2011 (UTC)Reply
  • Adding translation glosses is really really boring. I had a stint during my Volants time (well, I did many boring things then, you know, to branch out a little, and this was even more boring than adding {{also}} disambigs...possibly on the same level of boringness as archiving RFC or Tea Room discussions) and it is thankless. --DStirke 00:08, 1 February 2011 (UTC)Reply

Sign gloss namespace

We need to reevaluate the benefits of creating an entire namespace to include a single entry to support said namespace. Sign gloss is not getting anywhere, and if Rodasmith was here, maybe xhe'd help. TeleComNasSprVen 09:04, 21 January 2011 (UTC)Reply

There is no harm in having the namespace so that when the editors who work with sign languages have the time, they can create the appropriate entries in that namespace. In fact, I might get around to making some for the sign language entries we have at the moment, so that every ASL entry has a sign gloss: linking to it. I think I'll start that this week. In the meantime it does not hurt anyone for the namespace to have been created, which is what I read from your comment here (a presumption that it existing is causing some imaginary harm) --Neskayagawonisgv? 09:49, 23 January 2011 (UTC)Reply
My question is, how are the glosses in the least bit useful? What we really need are some people who would like to sit in front of a camera and sign a few thousand words each, then create either .gifs or .ogg videos which we can put into translation tables. That I would find useful. - [The]DaveRoss 20:38, 28 January 2011 (UTC)Reply

Wikisaurus:defecate

(Not sure whether this belongs to Beer parlour.)

There is a host of redlinks at Wikisaurus:defecate of which it is not quite straightforward to decide whether they are attestable. If someone could help create in the mainspace those redlinked terms that are attestable, that would be great. I am trying to do just that, but, as I am a non-native, this is more of a challenge to me.

I do not know of any other Wikisaurus page with redlinks. My goal is to ensure there are no or very few redlinked entries in Wikisaurus, so it is possible to spot new spurious additions to Wikisaurus as redlinks. Yesterday, I succeeded in adding some attestable terms for urination to the mainspace, without adding the attestation in order to reduce the cost of the addition. --Dan Polansky 09:29, 21 January 2011 (UTC)Reply

This is a worthy goal. We could consider in the future instating a no redlinks policy on Wikisaurus (which would also require that there be a corresponding definition). It's easier to attest through RFV than to have two processes where inevitably ignoring one results in accumulation of cruft. DAVilla 06:31, 25 January 2011 (UTC)Reply
To correct myself: there are many more Wikisaurus entries with redlinked terms. These include WS:thingy, WS:glans, WS:masturbate, WS:insane, WS:corpse, WS:nude, WS:watercraft, WS:grass, WS:antihistamine, and more. I would prioritize fixing redlinks in those entries that are likely to attract random contributors, such as the anatomy ones. --Dan Polansky 09:14, 28 January 2011 (UTC)Reply

Topical category for places that people live in

Currently we have Category:Home, Category:Housing and Category:Place names, but we don't have a category for general places where several people live together. A category for terms like village, city, hamlet and so on. I imagine that 'place names' would be a subcategory of that, and maybe 'home' too. I would create the category myself, but I don't actually know a good term for such a category. Can anyone help? —CodeCat 19:10, 21 January 2011 (UTC)Reply

The Category:Cities does have a textual description that includes villages and towns of all sizes. In German and Swedish we would prefer the abstract term Ort. The English Wikipedia has a long tradition of chaos in naming such categories. Right now it seems that w:Category:Populated places has the upper hand. --LA2 22:46, 21 January 2011 (UTC)
I don't think the name of that category is very well chosen then. I've never heard anyone refer to a small settlement as a city, although I did notice address forms in America tend to ask for 'city' rather than 'place of residence' or something like that. Do you think 'places of residence' would be a good name for the category, or is that still too vague? And also, the category I'm thinking of is not for specific places of residence, but for terms relating to them. So it would not contain London but it would contain town, metropolis and so on. —CodeCat 12:16, 22 January 2011 (UTC)Reply
The experience from en.wp is that this discussion never ends and renaming the City category leads to just as much surprise as keeping any existing name. "Populated places" has now been in stable use on en.wp for almost an entire year, perhaps a new record. If we rename our "cities" to "places of residence", we will still differ from en.wp. --LA2 13:51, 22 January 2011 (UTC)Reply
It may be a British/American thing, but w:Ingersoll, Oklahoma, population 9, is a city. While village does get thrown around in an informal context, the only place I've seen it in a formal context in the US is when the village is legally part of a city.--Prosfilaes 20:09, 22 January 2011 (UTC)
Maybe that depends on the state? In Michigan and Ohio there are plenty of municipalities that are formally and officially incorporated as "villages" (look through google:"the Village of * Ohio" and you'll see plenty of examples). That said, in ordinary colloquial usage I think they're much more likely to be referred to as "towns". —RuakhTALK 20:27, 22 January 2011 (UTC)Reply
Ah, here: w:Village (United States). It does depend on the state. —RuakhTALK 20:31, 22 January 2011 (UTC)Reply
I doubt that we want to depend on legal definitions, which vary in the US by state as Ruakh suggests. In NY we have, at least villages, towns, and cities (of different classes!). The US Census has faced the US nomenclature problem and resolved it by coming up with its own set of names often distinct from the states' terms. In New York, for example, they dispense with "town" which can include villages within its borders and have CDP (Census-designated place), possibly for the portion of a town not also included in any village. CDP also serves as the type for places that are outside any municipal jurisdiction (aka, unincorporated places). They have another set of their own terms for large populated places that include multiple jurisdictions, possibly in multiple states. Coming up with standard names across countries seems quite hopeless. I can only hope our translations of the jurisdiction-type terms do justice to the realities. "Populated places" seems like a wonderful subset within toponyms. DCDuring TALK 21:14, 22 January 2011 (UTC)Reply
As to CodeCat's original question, I think the problem is that the category that includes all of the natural, legal, and administrative types of place/jurisdiction names is not a natural category, being both abstract and heterogeneous. Moreover, such names are not limited to places that have human populations. Would something as awkward as "Area designation types" be acceptable? DCDuring TALK 21:14, 22 January 2011 (UTC)Reply

One very natural thing is what you write in a postal address, next to the ZIP code. In U.S. mail order forms, that field is typically called "City:", which is a very short and convenient word, except that what's written in that field is not always a "major town with a cathedral". German and Swedish mail order forms say "Ort:" which is even shorter, and always correct since it is an abstract term (almost as generic as "place"). A similar problem occurs with "rivers", which is a short and convenient word, but one really doesn't want separate categories for creeks and streams. I'm perfectly happy with the "postal address" definition of "city", and think our categories can keep their current name. --LA2 02:49, 23 January 2011 (UTC)Reply

Nicoletapedia

The user Nicoletapedia has been blocked, and in IRC they were asking what to do; they said they'd emailed, and had no reply.

We couldn't find a Wikt admin around, so I thought I'd try posting here, to see if someone could have a look - User_talk:Nicoletapedia#Block. The user also said that they were unable to edit their own talk page.

So...please can someone investigate. And...sorry if this is the wrong place to ask, but I couldn't find anywhere more appropriate. Ta. Chzz 09:59, 22 January 2011 (UTC)Reply

Message from the user, via IRC;

User:Nicoletapedia was contributing to Wiktionary's Czech vocabulary by adding genitive & genitive plural forms of words already in Wiktionary when she suddenly found her account blocked by User:Mglovesfun with no message or reason given other than the ambiguous phrase "disruptive edits." No postings were made to her talkpage nor was any attempt made to communicate to her what, if anything, she was doing wrong. She was further blocked from contacting any Wiktionary admins regarding the matter, but was able to e-mail Mglovesfun via his profile on the Wikipedia project. Mglovesfun continued to be ambiguous and evasive and stopped responding to any correspondence since January 10. Nicoletapedia was subsequently able to contact User:Dominic on #Wiktionary IRC chat who assured her she'd done nothing wrong and that her account would soon be reactivated. Her account has still not been reactivated nor has she received any further correspondence on the matter.

Note: I am not involved with this case at all; I am just pasting this on behalf of that person, trying to help out. Thanks, Chzz 10:47, 22 January 2011 (UTC)Reply

I'm pretty sure what I said at User_talk:Nicoletapedia#Block covers it. Mglovesfun (talk) 11:06, 22 January 2011 (UTC)Reply
Unblocked, hopefully someone will follow her round and correct her entries. Or even better, she will read her own talk page. Mglovesfun (talk) 14:14, 22 January 2011 (UTC)Reply
Don't be ridiculous, women don't edit Wiktionary, they can't read; User:Nicoletapedia is a man. BTW, Gloves, don't go down Semper's path, be nice to newbies. --Vahag 16:45, 22 January 2011 (UTC)Reply
1st edit April 2010 - so not a newbie (and hence fair game) SemperBlotto 17:41, 22 January 2011 (UTC)Reply
I know you're joking, but for the record, Nicoletapedia is a woman; or at least, {{gender:Nicoletapedia|he|she|he or she}} currently evaluates to she, so that is the appropriate set of pronouns. —RuakhTALK 17:03, 22 January 2011 (UTC)
Oh my god, she can read then. Do you know what that means? She's a witch! --Vahag 17:21, 22 January 2011 (UTC)Reply
Also for the record, not funny. DAVilla 06:23, 25 January 2011 (UTC)Reply

Vestigial Quotation headers

Many Quotation headers are now just storing {{seeCites}} notices. This is quite cumbersome, as the notice can be separated from the actual quotations (below sense lines) by several headers. I've added a pos=right option to {{seeCites}} so that it displays the notice in a thin right-floating box (similar to {{slim-wikipedia}}). This allows the notice to be displayed in the PoS area(s) and, in my view, improves the layout of entries (when there aren't too many RHS elements already). Are there any problems with making this kind of change (example) on a wider scale? This would reduce the number of quotation headers and help in the broad task of moving content from the quotation header to either below definitions or to the citations page. --Bequw τ 03:13, 26 January 2011 (UTC)

I do not like this. We should better avoid right-floating boxes, I think. I dislike both {{wikipedia}} and {{slim-wikipedia}}, and prefer {{pedia}} AKA {{pedialite}}. So for me, {{seeCites}} is basically okay as it is, while I admit that having a dedicated section that contains just a link seems a bit odd. However, I for one was never quite convinced that it was a good idea to list quotations directly below the definitions in the definitions section, as is now the common practice. Originally, the section for quotations was intended for them. --Dan Polansky 07:54, 26 January 2011 (UTC)Reply
I'm not a big fan of right-side floating boxes, but IMO the diff provided looks good. I think that as long as there aren't other such boxes near it (pictures, interproject links, {{examples-right}}), this is fine. (What about {{seeMoreCites}}?)​—msh210 (talk) 16:55, 26 January 2011 (UTC)Reply
On large screen resolutions, this box is a bit too innocent and easily missed. I don't consider it a good replacement to the current solution. -- Prince Kassad 19:45, 26 January 2011 (UTC)Reply

Renaming CFI section for spellings

I would like to see one section of CFI renamed: "Misspellings, common misspellings and variant spellings" to "Spellings".

The edit would be the following:

===Spellings=== (replacing ===Misspellings, common misspellings and variant spellings===)
Misspellings, common misspellings and variant spellings: There is no simple hard and fast rule, particularly in English, for determining whether a particular spelling is “correct”. A person defending a disputed spelling should be prepared to support his view with references. Published grammars and style guides can be useful in that regard, as can statistics concerning the prevalence of various forms.

As a result, the section would have a succinct heading, while the fuller scope of the section would be stated immediately after the heading.

What do you think of this proposal? Are there other succinct versions of the heading that you would like to see? --Dan Polansky 07:50, 26 January 2011 (UTC)Reply

I don't care. That is, I certainly don't mind it, and if there's a vote (though I sincerely hope this can be concluded in the BP) I'd not oppose, but I don't see the point, why the new version is better than the current.​—msh210 (talk) 16:48, 26 January 2011 (UTC)Reply
Right now, the section 1.4 ("Misspellings, common misspellings and variant spellings") is rather conspicuous in the table of contents of CFI, because of its length. Succinct headings are generally better, I think.
If people won't oppose the proposal, I will set up a short vote. CFI should not be modified without a vote, which is IMHO a good thing. --Dan Polansky 18:14, 26 January 2011 (UTC)Reply

I have created a vote: Wiktionary:Votes/pl-2011-02/Renaming CFI section for spellings. I have packed one more proposal into the vote, a proposal that can be rejected separately. --Dan Polansky 15:27, 1 February 2011 (UTC)Reply

I have removed the other proposal from the vote, so the vote is only about renaming. --Dan Polansky 15:43, 1 February 2011 (UTC)Reply

Removing CFI section on leet spellings

I suspect the last section on the page [[WT:CFI]], "Typographic variants", no longer matches what the vast majority of editors think, so should be excised. Thoughts?​—msh210 (talk) 16:48, 26 January 2011 (UTC)Reply

I've been considering this whole section of CFI, "Issues to consider" and its two subsections. They're not really criteria. As other people have said (notably, in my mind, Conrad.Irwin), CFI tries to do too much; it doesn't contain so much "rules" or "criteria" as just general discussion, which is interesting and well written, just not in the right place. I'd like CFI to be more black-and-white with less "narrative". In other words, those two paragraphs could be removed entirely, or replaced with a much more concise version. Mglovesfun (talk) 17:02, 26 January 2011 (UTC)
I agree; CFI should be a clear set of rules, not discussion.--Prosfilaes 02:04, 27 January 2011 (UTC)Reply
In what sense does it not match what the editors think? It implies that the consensus is to include i18n, G-d and such forms.--Prosfilaes 02:04, 27 January 2011 (UTC)Reply
It suggests to me that there's no consensus and that whoever wrote those lines is for their inclusion. Perhaps I'm reading it wrong, though.​—msh210 (talk) 06:09, 27 January 2011 (UTC)Reply
I tend to agree that section "Typographic variants" can be removed from CFI. It seems that even the whole section "Issues to consider" can be removed, as proposed by MG. I assume that the inclusion of the mentioned terms (G-d, pr0n, i18n or veg*n) is no longer controversial. If you set up a vote, it could consist of two subvotes, one subvote proposing the removal of the larger section, while the other subvote proposing the removal of the narrower section. --Dan Polansky 15:44, 27 January 2011 (UTC)Reply
I agree. (Or we could have one subvote for each subsection, with the understanding that we'd remove the whole thing if both subvotes pass.) —RuakhTALK 17:02, 27 January 2011 (UTC)Reply
I like the system of WT:BRAND, where the relevant passage in CFI is only a few words long, as there is a subpage several hundred words long explaining what that policy means. Mglovesfun (talk) 17:04, 27 January 2011 (UTC)
I've created Wiktionary:Votes/pl-2011-01/Final sections of the CFI.—msh210 (talk) 18:28, 27 January 2011 (UTC) 17:59, 3 February 2011 (UTC)
And now it's started.​—msh210 (talk) 17:59, 3 February 2011 (UTC)Reply

Citations of letters

Is it really necessary to add citations that show letters in use, as done on a? -- Prince Kassad 19:54, 26 January 2011 (UTC)Reply

Not on a. But it might be if you claimed that is an English letter.--Prosfilaes 02:09, 27 January 2011 (UTC)Reply
I would say not. —Internoob (DiscCont) 04:31, 27 January 2011 (UTC)Reply
That was for a WT:FUN, fwiw.​—msh210 (talk) 06:07, 27 January 2011 (UTC)Reply
It's not necessary to add citations to common letters such as a for the purpose of attesting them, because their existence is very unlikely to be challenged. That said, I fundamentally agree with Prosfilaes's statement: one would likely need to provide citations to attest as an English letter, because that's a controversial letter. Nonetheless, the existence of the verb be and the article the also are unlikely to be challenged, but we have Citations:be and Citations:the. Citations of entries with "clearly widespread use" are basically not required, but they also are susceptible to be informative and useful. --Daniel. 16:37, 27 January 2011 (UTC)Reply
Well of course, for be and the citations can be useful because they show exactly how to use these terms, which may not be obvious for non-native speakers of English (think Russian, which has neither articles nor does it generally use the verb to be). On the other hand, how to use the letter a should really be obvious to everyone. -- Prince Kassad 16:42, 27 January 2011 (UTC)Reply
Citations are often needed to attest to specific PoSes, uses, forms, or senses of headwords that are in widespread use in other uses or senses. Whatever is specifically challenged needs to be cited. DCDuring TALK 16:51, 27 January 2011 (UTC)Reply
I'd like to see earliest citation of J/j - middle ages from Spain I think. SemperBlotto 16:56, 27 January 2011 (UTC)Reply
I'd like to see examples of various Latin letters used to romanize Japanese, Chinese, Russian, etc. by multiple systems and authors eventually. Wiktionary still is not a very good place to look for this. --Daniel. 19:25, 27 January 2011 (UTC)Reply
Kassad's "how to use the letter a should really be obvious to everyone" is not accurate: People start to learn how to read all the time. Certainly lists of examples of usage of each letter would be useful for many people. --Daniel. 07:39, 30 January 2011 (UTC)Reply

Let's talk Wonderfool.

I see that more incarnations of Wonderfool have been blocked, and I think it might be worth discussing what exactly we should be doing in this case. The facts are more or less as follows: the guy is one of our most prolific contributors, probably top 5 or 10 in number of contributions. The guy is also the subject of something like 80% of all checkuser queries on this project. Technically, he is always subject to being blocked because his original (I think?) account is indef blocked. Whether or not it should have been an indef block or something shorter is debatable. Lastly he likes to make some dubious edits, but he makes an awful lot of pretty good edits in between and frankly there are several admins who make edits at least as suspect as Wonderfool's.

So my question is, what should we do about him moving forward? I see three distinct options, there may be others.

  1. Unblock User:Wonderfool or leave unblocked the next account he creates and let him edit on a single known account which is not subject to indef blocking any more than any other account is.
  2. Continue to play the game of cat and mouse which has been ongoing for the past 4 years or so, blocking him whenever someone spots his edits.
  3. Actually pursue his removal from the project. We can block the entire ISP he edits from, he is the only editor who uses it. We can also send an abuse report to the ISP but in this case they probably wouldn't care that much, unless Wikipedia also blocked the ISP.

Personally I am inclined to option 1, I would prefer not having to mess around with the other two, and it makes sense to me to have someone as dedicated to this project (in his own special way) as Wonderfool editing openly and under the same individual scrutiny that we all are. Perhaps Wonderfool would also like to comment, particularly if option 1 gets any support. Thoughts? - [The]DaveRoss 20:38, 27 January 2011 (UTC)Reply

  • My current strategy is to let him carry on editing for several days / weeks until something goes click in his head and he starts misbehaving. I then block him, and block him again if he comes back too quickly. It's no big deal. SemperBlotto 20:42, 27 January 2011 (UTC)Reply
    • I guess this is my point, there is a huge downside to that method of dealing with him. Whenever anyone new begins to edit someone suspects them of being the next incarnation of Wonderfool. I get lots of requests for checking IPs, people fling accusations about, it isn't a good thing. We are sitting on the fence and it isn't as harmless as it seems. - [The]DaveRoss 21:00, 27 January 2011 (UTC)Reply
  • The thing of #1 is that he frequently has multiple simultaneously editing sockpuppets, so it's a unilateral decision: we can say that we're "let[ting] him edit on a single known account", but that decision is not determinative of reality. This is not necessarily a fatal problem with #1 — after all, he'll presumably continue to use multiple sockpuppets no matter what we do — but it's worth keeping in mind. #1 will not stop us from accusing people of being Wonderfool; at most, it will be a community acknowledgment that being Wonderfool is not a capital offense, and hence that accusations of such are not, um, capital accusations. —RuakhTALK 21:08, 27 January 2011 (UTC)Reply
    • That has not been the case recently; most of the time it is a single account at a time. To be sure, in the past he has had several sleeper accounts going at once, or multiple active accounts at once, but unless he is playing quite a deep game he has been using one at a time for the past few months. I agree, though, that if he is officially given back the right to edit, we are also assuming that the sock puppetry will stop. I guess that were we to go with option #1 there would be a caveat included that were more socks to emerge we would change to option #3. - [The]DaveRoss 21:21, 27 January 2011 (UTC)
I'm not a profiler, but this person is a French speaker, and on fr.wikt we've got exactly the same kind of guy, called User:X (mainly present in the English, Esperanto and Greek sections), also known as User:ABC on el.wikt. He used to contribute accurately, BUT by copying pages from one Wiktionary to another without attribution, without reading any syntax consensus, and by being rude or unresponsive on his talk page. However, we've chosen to leave him unblocked (after seven blocks) because he continued to edit from other IPs every day, and a bot can easily correct his syntax while he collaborates only partially. We can't confuse him with the other daily contributors because he's incurably asocial. In conclusion, I propose that we stop speaking to him ASAP, because we've already lost many hours for no result, and I would bet that his addiction would make him try several ISPs. JackPotte 23:48, 27 January 2011 (UTC)
No, this doesn't sound like our guy at all. Wonderfool is a very talkative and social creature. --Vahag 23:53, 27 January 2011 (UTC)Reply
@JackPotte: If it helps: Wonderfool is Foumidable. —RuakhTALK 00:50, 28 January 2011 (UTC)Reply
User:DStirke and User:Romanb are his latest incarnations. Does anyone want to talk to him? SemperBlotto 19:58, 28 January 2011 (UTC)Reply
I see the advantage of unblocking him, but only if he stuck to one account. I get the impression he likes his multiple accounts so it would achieve nothing. He's got loads on the French Wiktionary where AFAIK he's never been blocked for a long period. Mglovesfun (talk) 11:36, 29 January 2011 (UTC)Reply

Voting percentages

Author's note, this is mainly abstract and hypothetical: the thought occurred to me last night. Take the current vote Wiktionary:Votes/pl-2011-01/Final sections of the CFI. The vote is to remove the sections, which should require 70% approval or more. What if the vote were worded the opposite way? Voting to keep the final sections of CFI. Wouldn't you then need 70% consensus to keep them?

Less hypothetical: it seems odd in a way to need 70% approval to remove something from CFI. It leaves open the possibility of one of our official criteria not having community support, but as long as 31% of the community support it, too bad for the other 69%. Mglovesfun (talk) 09:13, 28 January 2011 (UTC)Reply

One of the many reasons why votes are bad and discussion leading to consensus is good. Then again we can't usually agree on much of anything; 31% is actually quite a good showing! - [The]DaveRoss 11:05, 28 January 2011 (UTC)Reply
Re: "Wouldn't you then need 70% consensus to keep them?": No, because we don't have a bias in favor of "oppose", but rather a bias in favor of the status quo. Usually proposals are to change something, such that the status quo is reflected by the "oppose" option, but if you were to structure a vote such that the status quo were reflected by the "support" option, then the "support" option is what would get the boost. —RuakhTALK 12:16, 28 January 2011 (UTC)Reply
Votes are often good. The threshold of 2/3 can be used for this vote, as it is not a meta-vote, instead of the sometimes mentioned 70%. --Dan Polansky 13:45, 28 January 2011 (UTC)Reply
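The arithmetic behind the two thresholds mentioned in this thread (70% and 2/3) can be sketched as follows. This is only an illustration: the function name is hypothetical, and it assumes that only support and oppose votes are counted, which the discussion does not state. It also reflects Ruakh's point that the threshold applies to whichever side would change the status quo, however the vote is worded.

```python
from fractions import Fraction

# Thresholds mentioned in the discussion above.
SEVENTY_PERCENT = Fraction(7, 10)
TWO_THIRDS = Fraction(2, 3)


def change_passes(for_change: int, for_status_quo: int, threshold: Fraction) -> bool:
    """Return True if the share of votes for changing the status quo meets the threshold.

    Assumption: abstentions are ignored; only the two counts given are used.
    """
    total = for_change + for_status_quo
    if total == 0:
        return False  # no votes cast, so the status quo stands
    return Fraction(for_change, total) >= threshold


# 69 votes to remove a CFI section, 31 votes to keep it.
print(change_passes(69, 31, SEVENTY_PERCENT))  # False: 69% falls short of 70%
print(change_passes(69, 31, TWO_THIRDS))       # True: 69% exceeds 2/3
```

Using exact fractions rather than floats avoids rounding surprises at the boundary, for example with exactly 7 votes out of 10.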

Wiktionary talk:Todo#Chinese translations

Heads up. Mglovesfun (talk) 10:30, 28 January 2011 (UTC)Reply

CFI: Removing usage in a well-known work

I would like to see this part or clause removed from CFI's attestation criteria:

  • "Usage in a well-known work, or"

One entry that would as a result fail attestation is bababadalgharaghtakamminarronnkonnbronntonnerronntuonnthunntrovarrhounawnskawntoohoohoordenenthurnuk. Another one is lukkedoerendunandurraskewdylooshoofermoyportertooryzooysphalnabortansporthaokansakroidverjkapakkapuk.

Thoughts? Opposition? --Dan Polansky 12:23, 28 January 2011 (UTC)Reply

I'd oppose it (though I'm glad it's being discussed). We only require three durably archived citations anyway, lessening this to one when it's a well-known work seems appropriate. These two entries are pretty ridiculous, yet we could conceivably lose along with these entries used in Shakespeare, Chaucer or Dickens (etc.). Another example is overwicked which I created based on the NIV which is a well-known work. I didn't check for any other independent citations. Mglovesfun (talk) 12:52, 28 January 2011 (UTC)Reply
I suppose one possibility is to have appendices for very well-known authors (like Shakespeare) with their nonce words. Shakespeare of course has tons, as do Spenser, Milton, etc. whereas Melville, Nabokov and Pynchon might just have a few. Joyce is so much deliberate wordplay and stream-of-consciousness that the appendix doesn't seem worth doing. Equinox 13:00, 28 January 2011 (UTC)Reply
Words that can only be cited based on usage (not mention) in a single work are words that cannot be said to have accepted meaning, except possibly by onomatopoeia or by composition of the meaning, possibly allusive - something we rarely accept - of components. I would strongly favor Appendices by author of their unsuccessful coinages, preferably linked to by {{only in}}.
I am not committed to imposing such a prohibition in languages other than English. In English the corpus potentially provides ample evidence of usage for real words. DCDuring TALK 13:46, 28 January 2011 (UTC)Reply
The rule comes in handy for lesser attested languages like Old French and Anglo-Norman (in my case in particular) where getting three citations might be difficult. I definitely wouldn't support it being removed entirely, though I think there is some room for maneuver. If the proposal made a few months ago, "only one citation is needed for word/terms in dead languages", were to pass, it would circumvent part of the problem I just mentioned, though not for languages that aren't dead where little written material exists. Mglovesfun (talk) 17:32, 28 January 2011 (UTC)
I can see your point with less attested languages, but I do not see how the "well-known work" clause saves them. "Well-known" to whom, then, are the Old French works or Ancient Greek works supposed to be? Is there a single Old French term that is actually included because of the well-known-work clause? --Dan Polansky 17:41, 28 January 2011 (UTC)Reply
Debatable, and that goes to the meaning of "well-known work" which oddly, nobody has brought up. Simple answer is that I've RFV'd some Old French words, only one of which from memory has failed (host). None of the ones that passed (just preindre isn't it?) have passed via this rule. So I think the fairly simplistic answer is "no". Mglovesfun (talk) 17:47, 28 January 2011 (UTC)Reply
I don't know about Old French, but there are many well-known Ancient Greek works--the Iliad, the Odyssey, Plato, Homeric Hymns, Sophocles, etc. If it were interpreted as being language specific, I could find a list of well-known Esperanto works--Zamenhof's Hamleto, Fundamento Krestiomatio, Vikimoj, etc.--Prosfilaes 18:41, 28 January 2011 (UTC)Reply
I don't see why lack of accepted meaning should be a fatal problem. If a word appears in the KJV and nowhere else, there is a lot that we can usefully say about it, even without knowing for sure what the translators thought the source-word meant. —RuakhTALK 18:31, 28 January 2011 (UTC)Reply
KJV has an unusually limited vocabulary of words excepting proper names. To say something about hapax legomena we would need to simply rely on authorities, in the manner of WP, unless we accept each other's non-attestation-based opinions. DCDuring TALK 20:13, 28 January 2011 (UTC)Reply
I think I'd oppose. The cases that have come up are for unquestionably well-known works, and I don't see why it's bad to have bababadalgharaghtakamminarronnkonnbronntonnerronntuonnthunntrovarrhounawnskawntoohoohoordenenthurnuk for our Joyce reading users.--Prosfilaes 18:41, 28 January 2011 (UTC)Reply
Are you saying that a reader of Joyce is going to consult a dictionary to find out what the long thing means? --Dan Polansky 19:00, 28 January 2011 (UTC)Reply
After typing the first 5 letters in the search box only this "word" shows up, so we don't have a good test of the typing skills of readers of Joyce. Do we have anything meaningful to say about such an entry? We can't define it or translate it. Do we transliterate it? Accepting a single cite from a "well-known work" as proof of the existence of the "word" makes little contribution to attesting to its meaning. That, say, Nabokov or Joyce or Burgess uses a word a certain way says nothing about whether the word has "entered the lexicon" of any community of users. DCDuring TALK 20:07, 28 January 2011 (UTC)
To have this BP discussion more complete, let me quote Visviva: "The rationale for the well-known work exemption, as I understand it, is that a complete version of Wiktionary should leave no word-sense questions unanswered for someone reading Shakespeare, Milton, etc. This seems reasonable enough to me, though the flip side of that is that we are currently missing thousands of words and word forms that appear even in respelled modern editions of Shakespeare. (I have some lists, if anyone is interested.) On the other hand, this particular need could arguably be better addressed in Concordance: or Appendix:-space, though that approach also has problems. That said, if we eliminate the exemption entirely, we need to replace it with a more nuanced approach to languages that are poorly-attested (Homeric Greek, Eteocypriot, Cia-Cia) or unstandardized (Middle English, Middle Korean, actually almost any Middle/Old language). "Well-known work" gives us an loophole for including forms that appear only in the Homeric hymns, or that are found in a particular spelling only in Chaucer. This is unsatisfactory, of course, since it still excludes less-known writings; but I don't think the well-known-work issue can be addressed before the poorly-attested-languages issue. -- Visviva 05:55, 17 October 2009 (UTC)", from Talk:bababadalgharaghtakamminarronnkonnbronntonnerronntuonnthunntrovarrhounawnskawntoohoohoordenenthurnuk. --Dan Polansky
Visviva's statement is sensible. IMHO, the use of {{only in}} to direct users to appendices or concordances or even a Citation space page would address the Shakespeare/Joyce-reader problem in that we could confirm that a given term is a hapax legomenon and provide opportunity for evidence and conjecture about meaning and derivation. I see no reason why all languages need to follow identical policies. English in particular can afford its own policies at English Wiktionary. Extinct languages, constructed languages, languages (not) distinguished by Google are all candidate classes for uniform treatment, should individual treatment by language be deemed inappropriate. DCDuring TALK 17:44, 30 January 2011 (UTC)Reply
I'm not seeing the value in making the change, though. It would be silly to make a special rule for, e.g., phobias, but this rule stood with consensus for a while. And we aren't going to be able to cleanly reverse it; there's going to be entries added under this rule that won't be ferreted out for a while. I also don't always see the advantage of appendixes instead of just tagging them on their page; I'd be happy to move dictionary-only words into the mainspace, as long as they were so tagged. Lastly, if we do want to treat other languages separately, that's something that needs discussion.--Prosfilaes 18:48, 30 January 2011 (UTC)Reply
  • Let's think about how writers coin new words. I would distinguish four kinds: 1, words made by recombining existing English elements (eg overwicked, elbowlessness); 2, words made from foreign elements which are fitted to existing English analogues (eg aqueity, anemocracy); 3, blends (eg spife, a spoon-cum-knife, which I just invented); 4, onomatopoeic creations (eg Joyce's hundred-letter word above). To me, group 1 is a set of clearly valid English words and I would not have a problem including them with a single citation from a well-known work. Group 2 is a bit different, but I think we have a good way of including them without creating an actual page for each coinage. What they demonstrate is a function of English's ability to neologize with specific elements, and we can use them as citations of those elements rather than of the whole word. For example, anemocracy may not meet criteria on its own, but if we can also find a citation for anemophobe and anemophagic then we have three citations for anemo- as a valid prefix in English. Similarly on the -ity page, apart from a link to all the derived terms, we can include direct citations of on-the-fly uses like Ben Jonson's aqueity. Groups 3 and 4 I would exclude (until they meet regular CFI). Also note that some novels are clearly not written in modern standard English (eg some futuristic novels like A Clockwork Orange or books like Finnegans Wake) and their vocabulary shouldn't count on a single citation. This is just me thinking out loud, I don't know how exactly I would want to codify it all into criteria. Ƿidsiþ 11:37, 2 February 2011 (UTC)
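The idea of counting single-citation coinages towards a shared affix, rather than towards each whole word, could be sketched roughly as below. This is a hedged illustration only: the function names are hypothetical, plain string matching stands in for real morphological analysis, each listed word is assumed to have exactly one citation, and the figure of three is the usual citation requirement mentioned earlier in this discussion.

```python
REQUIRED_CITATIONS = 3  # the "three citations" figure from the discussion


def matches(word: str, affix: str) -> bool:
    """Crude check: 'anemo-' is treated as a prefix, '-ity' as a suffix."""
    if affix.endswith("-"):
        stem = affix[:-1]
        return word.startswith(stem) and word != stem
    if affix.startswith("-"):
        stem = affix[1:]
        return word.endswith(stem) and word != stem
    return False


def attested_affixes(cited_words, affixes, required=REQUIRED_CITATIONS):
    """Map each affix to its supporting words, keeping only affixes with enough of them."""
    result = {}
    for affix in affixes:
        words = sorted({w for w in cited_words if matches(w, affix)})
        if len(words) >= required:
            result[affix] = words
    return result


# One citation each (an assumption for the sake of the example).
cited = ["anemocracy", "anemophobe", "anemophagic", "aqueity"]
print(attested_affixes(cited, ["anemo-", "-ity"]))
# {'anemo-': ['anemocracy', 'anemophagic', 'anemophobe']}
```

With only one listed word ending in -ity, that suffix does not reach three here; on the real -ity page it would of course be supported by many other derived terms.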

Non-CFI-attestable verb forms

If I create a verb I check whether its inflections are CFI-attestable before adding them. (The -s form is often the rarest, especially with scientific terms, e.g. today I couldn't find palettises even though the other inflections meet CFI.) However, User:CodeCat has told me that it's standard practice for bots to create these forms, even if they are not attestable (here talking about Finnish, which has far more obscure inflections); see User_talk:Hekaheka#Inflection_tables_that_don.27t_match_page_name. Is that true, and if we don't have other "theoretically extant but unused forms", like femtobyte, is it wise? Equinox 21:52, 28 January 2011 (UTC)

I think I can find some French forms I can more or less guarantee are unattested; past historic and imperfect subjunctive forms of modern verbs like coacher for example. — This unsigned comment was added by Mglovesfun (talk • contribs).
And do you think we should have entries for the theoretical-only forms? Why, or why not? Equinox 22:01, 28 January 2011 (UTC)Reply
I don't favor keeping them, just spending a month trying to verify these will be a nightmare. Though I don't ideologically oppose it, my reservation is on practical grounds. Mglovesfun (talk) 22:03, 28 January 2011 (UTC)Reply
I'm not proposing we RFV all of the inflections of every bot-edited verb. I just question the utility of having yet another bot to create dubious inflections, especially in Finnish where some verbs have dozens of unused forms. Equinox 22:13, 28 January 2011 (UTC)Reply
If these entries should not exist in the first place, we shouldn't really list them in inflection tables either. We already have bots that ignore certain forms such as the plural if they are not present in the inflection table. —CodeCat 22:27, 28 January 2011 (UTC)Reply
I still don't oppose it on ideological grounds, but it would be akin to getting three citations for each verb form before creating the inflections. It would basically kill off inflection bots like SemperBlottoBot and MewBot. Mglovesfun (talk) 22:41, 28 January 2011 (UTC)Reply
I think that we should include all forms of a word, unless there's some reason to doubt a certain form. Note that if we were to apply the CFI separately to each individual form, we'd run into problems with rare words in highly inflected languages. Suppose that a language inflects its verbs to indicate tense (past/present/future), aspect (progressive/perfect/frequentative/neutral), mood (indicative/jussive/imperative/optative/contrafactual/negative/interrogative), evidentiality (observed/inferred), subject number (singular/dual/plural), object number (ditto), subject person (first/second/third-proximate/third-obviative), object person (ditto, plus third-reflexive), subject gender (masculine/feminine/neuter), and object gender (ditto). Yes, languages really do these sorts of things. In such a language, a word could appear dozens of times in durably archived works without any one specific form appearing three times. "I see him" and "have you seen them?" and "she sees someone-else-who's-not-important" and "they must have seen him" and "I hope someone sees it" and so on would all count toward different attestational "buckets", even if "see" is a perfectly regular verb. It makes no sense. —RuakhTALK 23:26, 28 January 2011 (UTC)Reply
Oh, unless you're saying that a cite for abdicates (say) would count toward both abdicate and abdicates, whereas a cite for abdicate would count only toward itself? I might be O.K. with that sort of thing. (And I'm not sure how much we really disagree. I note that in the discussion that led to this one, the issue came up because of forms that there is reason to doubt: a verb describing a political process that can't logically be used in certain subjects, voices, and/or moods.) —RuakhTALK 23:33, 28 January 2011 (UTC)
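The "bucket" problem described in the two comments above can be sketched in a few lines of Python. This is only an illustration: the citation list is invented, and the threshold of three is an assumption taken from the "three citations" figure mentioned elsewhere in this discussion. It compares counting each inflected form separately with counting every form toward its lemma.

```python
from collections import Counter

THRESHOLD = 3  # assumed: the three-citation figure mentioned in this thread

# Hypothetical citations: (inflected form found in the quotation, its lemma)
citations = [
    ("see", "see"),
    ("seen", "see"),
    ("sees", "see"),
    ("seen", "see"),
    ("sees", "see"),
]

def attested_per_form(cites, threshold=THRESHOLD):
    """Forms that reach the threshold when each form is its own bucket."""
    counts = Counter(form for form, _ in cites)
    return {form for form, n in counts.items() if n >= threshold}

def attested_per_lemma(cites, threshold=THRESHOLD):
    """Lemmas that reach the threshold when all forms count toward the lemma."""
    counts = Counter(lemma for _, lemma in cites)
    return {lemma for lemma, n in counts.items() if n >= threshold}

print(attested_per_form(citations))   # set()
print(attested_per_lemma(citations))  # {'see'}
```

With five citations spread over three forms, no single form reaches the threshold, yet the verb as a whole clearly does.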
Coming from the seat of an editor in Esperanto, it would seem weird and pointless to have one form of a verb and not another. If someone understands pluvas, they're going to use pluvis and pluvus without thinking about it, whether or not we can attest them. Even in English, if you can attest backspace as a verb, you know that backspaced is a word. (As for Ruakh, I don't know the instance you're talking about, but I still remember my German teacher refusing to conjugate kosten for ich, even though we all knew it was ich koste, and most of us could find a place to use it.)--Prosfilaes 01:42, 29 January 2011 (UTC)
Yeah, but I gave the example of femtobyte. Clearly if someone wants to talk about a quadrillionth of a byte (for some reason; it doesn't make sense in current computer science), they will say femtobyte. But until they say it, it's not a word we should list. Ditto weird inflections, IMO. Equinox 01:46, 29 January 2011 (UTC)
But it's not just weird inflections. Neither Esperanto nor English really have weird inflections; they have a small set of normal ones. With Esperanto, our corpus is small enough that we may be able to attest a word in the accusative (-n) but not quite in the nominative root form. I don't know how to even enter that; you put the main definition under the nominative and just accusative form of ---- in the accusative. In English, what if you have a word that appears five times, three in the plural (a simple -s) and two in the singular? Do you put the definition in the plural and act like the singular doesn't exist? Either case could seriously confuse someone looking for the obvious root. I've been attesting Esperanto words with different conjugations of the root, since whether it ends in -, -j, -jn, or -n, it's the same underlying word and every speaker will recognize that.
Femtobyte doesn't scare me. It's a different issue; we don't have every combination of prefix + root for any case. Also, we could take the standard prefixes * the normal words they attach to, add them to Wiktionary automatically, and the net effect would be minimal. I wish we had better control over the search; it should be possible to say: Did you mean femto- + byte?--Prosfilaes 03:08, 29 January 2011 (UTC)Reply

How to choose topical categories

Let me describe and compare the coverage of some topical categories.

There are Category:Foxes and Category:Dogs. Yet a few entries for dogs and foxes are scattered around Category:Canids and Category:Mammals, with none in Category:Vertebrates and none in Category:Animals.

There are Category:Greek mythology, Category:Roman mythology, Category:Norse mythology and categories for various other mythologies. We also have Category:Mythology, which itself contains many terms from these mythologies. By contrast, Category:Culture does not contain any entry defined as something from a mythology.

None of the members of Category:Sex positions is also a member of Category:Sex.

Some but not all members of Category:Algebra are also members of Category:Mathematics.

By rationalizing them, I could form the following guideline from my view of an apparent consensus. If it is correct, it is not yet followed consistently in practice:

All entries should only be members of the narrowest topical categories available. German Shepherd should be a member of Category:Herding dogs, but not of Category:Dogs, Category:Canids, Category:Vertebrates or Category:Animals.
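The proposed guideline can be sketched as a small Python function. The parent links below are hypothetical and simplified (one parent per category, loosely following the example in this thread); real category trees may have several parents per category.

```python
# Hypothetical parent links, loosely following the example in this thread.
PARENT = {
    "Herding dogs": "Dogs",
    "Dogs": "Canids",
    "Canids": "Mammals",
    "Mammals": "Vertebrates",
    "Vertebrates": "Animals",
}

def ancestors(category):
    """Return every category above `category` in the assumed tree."""
    found = []
    while category in PARENT:
        category = PARENT[category]
        found.append(category)
    return found

def narrowest_only(categories):
    """Drop each category that is an ancestor of another one in the set."""
    redundant = {a for c in categories for a in ancestors(c)}
    return sorted(set(categories) - redundant)

print(narrowest_only(["Animals", "Dogs", "Herding dogs", "Canids"]))
# ['Herding dogs']
```

An entry tagged with several levels of the same branch keeps only the deepest one; unrelated categories are left alone.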

Am I right? Does anyone agree with it?

As I may have mentioned earlier, I can see some minor problems with this possible practice (mostly related to category names, multiple languages, etc.); but they aren't worth mentioning now, since I decided to open one can of worms at a time. Let's focus on that possibly consistent rule, if possible. --Daniel. 09:56, 30 January 2011 (UTC)

Personally I agree, but if we remove Category:Skiing from Category:Sports to keep it only in Category:Winter sports, we should explain to the readers how to list all of the sports (eg: http://toolserver.org/~daniel/WikiSense/CatScan.php). JackPotte 12:00, 30 January 2011 (UTC)Reply
We should use that opportunity to delete some topical categories which are too narrow. Category:Water is a prime example. -- Prince Kassad 14:52, 30 January 2011 (UTC)Reply
Even taking Water as liquid form only, that's a lot of words; lake, sea, ocean, pond, river, stream, rain, rainfall, raindrop, just off the top of my head.--Prosfilaes 18:07, 30 January 2011 (UTC)Reply
I'm going to list some terms related to water, excluding life forms, transportation and proper nouns: ablution, absolute humidity, aquatic, baptism, bath, bathtub, bathroom, blackwater, canal, cloud, cryokinesis, cryokinetic, cumulonimbus, cumulus, dam, deep water, deep-water, dehydration, dew, dew point, dihydrogen monoxide, distilled water, drinking water, drown, drowning, ecohydrology, falls, faucet, fountain, freezing, freezing rain, freshwater, frost, frost point, glacier, glaciology, glaze, graupel, graywater, gray water, groundwater, hail, hailstone, heavy water, hoar frost, hoarfrost, hoar-frost, humidity, hydr-, hydrate, hydrated, hydration, hydro-, hydroelectric dam, hydroelectric generator, hydrogen, hydrogen oxide, hydrogeology, hydrokinesis, hydrokinetic, hydrology, hydrometeor, hydrophilic, hydrophobic, hydrosphere, hydrous, ice, ice cap, icecap, ice pellets, irrigation, lake, limnology, maelstrom, meltwater, mikva, mikvah, mikveh, mist, mizzle, Mpemba effect, oasis, ocean, oceanography, oxygen, pond, pool, precipitation, puddle, rain, rainbow, rain cats and dogs, raincoat, rain dogs and cats, raindrop, rainfall, rain gauge, rain hat, rain off, rain pitchforks, rainsoaked, rain-soaked, rainwater, relative humidity, river, sea, seawater, serein, serene, shower, sink, sleet, slush, snow, specific humidity, spring, squirt gun, steam, steam bath, stream, streamflow, surface water, tide, underwater, vapor, vapour, virga, wash, washing, wastewater, water, water bed, waterbed, water cycle, water down, watered-down, waterfall, water gun, waterlog, waterlogged, water pistol, water ski, water turbines, well, whirlpool, wudu. --Daniel. 00:03, 31 January 2011 (UTC)
I agree with the substance of the proposal: that, if we have topical categories, then entries should be in the narrowest one only. I do not agree with how it's stated, though, as it assumes we should have a category "Herding dogs". —msh210 (talk) 21:28, 30 January 2011 (UTC)
If, for any reason, people eventually agree to delete Category:Herding dogs, then the wording of the proposal would naturally change as a result, probably resulting in:
All entries should only be members of the narrowest topical categories available. German Shepherd should be a member of Category:Dogs, but not of Category:Canids, Category:Vertebrates or Category:Animals.
It should be noted, though, that despite our Category:Herding dogs being virtually empty, there are at least 70 herding dogs according to Wikipedia, so there is potential for expansion. --Daniel. 00:03, 31 January 2011 (UTC)Reply
IMO we can get too specific, and then we'll fail to see the wood for the trees and make a load of useless, hyper-specific categories. In theory I feel that a word should be in its narrowest category and all of the containing categories, but I doubt Wikimedia software supports this. Equinox 21:31, 30 January 2011 (UTC)Reply
Do you mean placing German Shepherd into Category:Herding dogs, Category:Dogs, Category:Canids, Category:Vertebrates and Category:Animals simultaneously? Wikimedia surely would accept that. --Daniel. 00:03, 31 January 2011 (UTC)Reply
Yes, but I mean IMO membership of a topical category should imply membership of all parent categories, without the need for manual/bot insertion. Equinox 10:12, 1 February 2011 (UTC)Reply
We could give the appearance of "implying" membership through templates. For example, the code {{category|Herding dogs}} is almost identical to [[Category:Herding dogs]] and can easily be programmed to categorize any entry into Category:Herding dogs, Category:Dogs, Category:Canids, Category:Vertebrates and Category:Animals simultaneously. --Daniel. 12:18, 1 February 2011 (UTC)Reply
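As a rough sketch of the "implied membership" idea (in Python, not in actual template code), one could walk the parent links upward from the narrowest category. The PARENTS table is a hypothetical stand-in for whatever data such a template would consult; nothing here reflects how {{category}} is actually implemented.

```python
# Hypothetical parent links; a category may list more than one parent.
PARENTS = {
    "Herding dogs": ["Dogs"],
    "Dogs": ["Canids"],
    "Canids": ["Vertebrates"],
    "Vertebrates": ["Animals"],
}

def implied_categories(category):
    """Return the category plus every category reachable via parent links."""
    result = []
    stack = [category]
    while stack:
        current = stack.pop()
        if current not in result:  # guards against loops and repeats
            result.append(current)
            stack.extend(PARENTS.get(current, []))
    return result

print(implied_categories("Herding dogs"))
# ['Herding dogs', 'Dogs', 'Canids', 'Vertebrates', 'Animals']
```

The editor would still type only the narrowest category; the wider ones would be derived rather than inserted by hand or by bot.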
What about categories? Should they also be in only the narrowest available categories? For example, Category:Herding dogs is now in Pastoral dogs and in Dogs, and Pastoral dogs is in Dogs. (Not that I think we should have either of the two narrower ones anyway. They're just an example.) IMO yes: if we are to have topical categories, then they should be in the narrowest categories only.​—msh210 (talk) 21:42, 30 January 2011 (UTC)Reply
I personally consider categorization of topical categories a separate "can of worms", with many circumstances to be discussed, perhaps later. That said, yes, I believe that the basic idea of only using the narrowest categories would work for them too. In my opinion, Category:Dogs should be a member of Category:Canids, but not of Category:Animals. --Daniel. 00:03, 31 January 2011 (UTC)

Apparent loose cannon IP user, 90.209.77.78 (talk)

Apologies beforehand if this isn't the right venue.

I've noticed a number of apparently well-meaning edits by IP user 90.209.77.78 (talk), where the content is at least partially useful, but is very often in the wrong place. I just went through their contribs (after noting a change on my watchlist), and found that everything of theirs that I've looked at so far needs help. I don't have time right now to go through their whole contribs list, so I thought I'd post here as a heads-up, and to request help in notifying this clearly enthusiastic user (which is good) of WT formats, etc. -- TIA, Eiríkr Útlendi | Tala við mig 19:14, 30 January 2011 (UTC)Reply

Looks like this user is now at IP 90.209.77.67 (talk). 90.205.76.31 (talk) is older, but follows the same pattern, and could well be the same user.
The crux of the issue is that this user, or group of users, appears to have an intimate knowledge of manga and animé, but little real knowledge of Japanese. In their zeal to add this knowledge to Wiktionary, they are getting various things wrong, and often adding things in the wrong place (such as Japanese-specific content in the "Translingual" section of Chinese character entries). I'm seeing cases where they've gone back to undo fixes I've implemented just in the past couple days (mostly at 結界). I'm cleaning things up as best as I can, time allowing, but if anyone has any advice on how best to get through to enthusiastic but misinformed IP-based users, I'm all ears^W um, eyes.  :) -- Eiríkr Útlendi | Tala við mig 01:05, 31 January 2011 (UTC)Reply

I was having a browse today and ended up following an etymology to the page موم#Persian when I thought: wouldn't it be good if there was a link back to the wikipedia article on "Persian language". I notice some of the languages are linked, but has there ever been any discussion on doing this in a more widespread way? AndrewRT 22:05, 30 January 2011 (UTC)Reply

Adding a user preference would make the most sense IMO. Nadando 22:10, 30 January 2011 (UTC)Reply
I both love and hate cross project linking, it is great that we have an encyclopedia so closely tied to us, and it is awful internet design practice to shunt people from one site to another. It would be great if we had very solid "About:LangX" pages here which could then link to the full Wikipedia entry. We would not need to be too comprehensive, perhaps an abbreviated version of the pedia page describing the who (spoken by), what (orthography, grammar), when (duh) and where (geography) of the language in question. I know that we are not an encyclopedia but in certain, restricted ways it may make sense to keep folks on Wiktionary while providing certain encyclopedic information (I know, heresy, I am ready to be stoned). - [The]DaveRoss 23:46, 31 January 2011 (UTC)Reply

Independence.

How do y'all interpret this passage from WT:CFI?

Independence

This is meant to exclude multiple references that draw on each other. Where Wikipedia has an article on a given subject, and that article is mirrored by an external site the use of certain words on the mirror site would not be independent. It is quite common to find that material on one site is readily traced to another. Similarly, the same quote will often occur verbatim in separate sources. While the sources may be independent of each other, the usages in question are clearly not.

The presumption is that if a term is only used in a narrow community, there is no need to refer to a general dictionary such as this one to find its meaning.

There are a few things here that confuse or bother me (like, what's with all the non–durably-archived examples?), but the biggest is that the two paragraphs seem to be describing two very different phenomena, but the second one is worded in such a way as to imply that it's actually just explaining the rationale for the first. I doubt there's much correlation between mirroring or verbatim quotation on the one hand and sharing "narrow community" on the other.

Because the section is so vague and seemingly self-contradictory, it's hard to resolve certain RFV questions that hinge on it. For example, as I wrote at the RFV discussion about novum, the term seems to get tens of thousands of b.g.c. hits in the relevant sense, which is really quite a lot, but it's hard to find hits that don't acknowledge Darko Suvin's coinage of the term. Does that make them non-independent? Other affected terms would be ambient findability (where most uses are in books that also mention the book Ambient Findability and/or its author), osmotic communication (where most uses are in books that also mention Alistair Cockburn, who coined the term, and/or Crystal Clear, his development methodology), and possibly iDollator (where almost the only news articles using it are articles about one selfsame man, who, incidentally, wants to popularize and standardize the term).

My preferred interpretation would be something like this:

  • Non-independence is reflexive, symmetric, and transitive.
  • Verbatim quotations, near-verbatim quotations, and translations are not independent of their original source.
  • Multiple quotations from a single author are not independent of each other.

but I'd like input before I start applying that interpretation to RFV discussions.

—RuakhTALK 03:51, 1 February 2011 (UTC)

I'm going to raise another question, with an example. In 1983, Dave Gabai published an article, "Foliations and the topology of 3-manifolds", in the Journal of Differential Geometry, in which he first described a process called disk decomposition: a means of breaking up certain topological spaces into smaller ones ("disk-decomposing" them). This caught on and became an important tool in the low-dimensional topologist's toolkit. For several years thereafter, probably every reference to disk decomposition added a caveat as in "disk decomposition in the sense of Gabai" or "disk decomposition as in [4]" (where "[4]" is an item in the list of references at the back, viz Gabai's paper). Does that mean they're dependent? Later papers didn't use such caveats, as disk decomposition was by then well known (in the right circles), so only said "disk decomposition [4]", where, again "[4]" is a reference to Gabai. (I'd not be surprised to find that papers still did that.) Are these dependent? They're still "readily traced to" Gabai (to quote the CFI as quoted above). According to Ruakh's "preferred interpretation" above, seemingly all of the above are independent from Gabai (except any written by him, naturally).​—msh210 (talk) 08:46, 1 February 2011 (UTC)Reply
As a quick and first-impression response, the section's current wording seems worth scratching and replacing with something like your bullets. --Dan Polansky 09:45, 1 February 2011 (UTC)Reply
Addendum: Curselax failed RFV because of this criterion, because all quotations were from postings to a single Usenet group. (See Talk:Curselax.) I participated in that decision, and still agree with it, but it's actually not covered by what I gave above as my preferred interpretation. Maybe a fourth criterion:
  • Multiple quotations from the same book, periodical, or Usenet group are not independent of each other.
? —RuakhTALK 18:30, 1 February 2011 (UTC)Reply
Sounds good. If this bullet point turns out too stringent, which is merely a hypothetical possibility right now, it can be amended later. --Dan Polansky 18:38, 1 February 2011 (UTC)Reply
Multiple authors' posts in the same Usenet newsgroup seems akin to multiple authors' works on the same topic in different books and by different publishers. The only "dependencies" inherent in a newsgroup are (1) that the authors are (usually) writing about similar topics and (2) that the authors (usually) have read each others' posts. (This argument does not apply to posts in the same thread, where a later author is replying to an earlier one (or to someone who replied to an earlier one, or some iteration of that), so will often use a word the earlier one used just because the earlier one did so.)—msh210℠ on a public computer 20:58, 1 February 2011 (UTC)Reply
Multiple quotes in the same book (but by different authors, as in a compilation or Festschrift) or periodical would seem to be independent in general, as authors, I think, choose their words, for the most part. Copy editors, however, choose spelling, so perhaps they should be considered dependent for purposes of proving a particular spelling. (Of course, the way RFV usually works, attesting a particular spelling is attesting the word, so this distinction might be too fine.)—msh210℠ on a public computer 20:58, 1 February 2011 (UTC)Reply
In a newsgroup not only have the writers likely read each other's posts, but they are also writing for each other -slash- for a shared audience. It's like when one author takes over another's series, and the target audience is basically "everyone who liked the first part enough that they're willing to put up with the new author's inability or unwillingness to emulate the old". (Of course, in the latter case the original author is usually still listed as a co-author, so it's covered anyway. :-P ) As for quotations from the same book or periodical: fair enough. I was mostly thinking of spellings (such as diacritics in The New Yorker) and perhaps construals (such as the choice of preposition after a verb, which seems potentially subject to house style), but that may be too narrow a case to try to catch with a CFI rule. But be warned that if all issues of Penthouse are independent of each other, then "I never thought it would happen to me" is in "clearly widespread use". ;-)   —RuakhTALK 21:12, 1 February 2011 (UTC)
  • Your second two bullet points seem to sum up the situation well, as I understood it too. I can't quite work out what the first one means though. Ƿidsiþ 10:59, 2 February 2011 (UTC)Reply
    • The first means that if work A is dependent on (by which I mean not independent of) work B, and B on C, then B is also dependent on A, A on C, and A on A. (And, therefore, C is dependent on A, C on B, B on B, and C on C.)​—msh210 (talk) 19:32, 2 February 2011 (UTC)Reply
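The closure described just above amounts to treating non-independence as an equivalence relation, so that citations fall into classes and only one citation per class counts. A minimal Python sketch, assuming the direct dependencies are supplied as pairs (the work names and pairs here are invented), could use a union-find structure:

```python
def independence_classes(works, dependent_pairs):
    """Partition works into classes of mutually non-independent works."""
    parent = {w: w for w in works}

    def find(w):
        # Follow parent links to the class representative (with path halving).
        while parent[w] != w:
            parent[w] = parent[parent[w]]
            w = parent[w]
        return w

    for a, b in dependent_pairs:
        parent[find(a)] = find(b)  # merge the two classes

    classes = {}
    for w in works:
        classes.setdefault(find(w), []).append(w)
    return sorted(sorted(c) for c in classes.values())

works = ["A", "B", "C", "D"]
pairs = [("A", "B"), ("B", "C")]  # A depends on B, B depends on C
classes = independence_classes(works, pairs)
print(classes)       # [['A', 'B', 'C'], ['D']]
print(len(classes))  # 2
```

Under this reading, four quotations yield only two independent attestations, because A, B and C collapse into a single class.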
This section would benefit massively from being entirely rewritten. Mglovesfun (talk) 11:41, 2 February 2011 (UTC)Reply

Languages and flags

Hi there,

I just noticed that, of 35 Wiktionary editions I have analysed, only 4 of them (Italian, Greek, Lithuanian, Hungarian) use flag icons to denote languages.

I'm not too sure about Wiktionary, but I know for a fact that at Wikipedia this practice was abandoned and flags were dropped for very good reasons.

  1. Relationship between languages and countries is generally not 1-to-1. In fact, it's not even one-to-many, it's many-to-many.
  2. Flags tend to stir up all sorts of nationalist arguments that we'd rather avoid.

Please see the current discussion on Italian Wiktionary.

My feeling is that we should get rid of flags in all Wiktionary editions, especially silly ones like [flag image], or [flag image], or (my god) [flag image]. Thoughts? 220.100.118.132 13:25, 1 February 2011 (UTC)

The Portuguese Wiktionary uses flags too: http://pt.wiktionary.org/wiki/ser
The English Wiktionary uses flags of countries and other geographical representations occasionally on categories of languages: Category:Cantonese language, Category:Old Frisian language, Category:Frisian languages, Category:Manx language, Category:Old Prussian language... --Daniel. 13:43, 1 February 2011 (UTC)Reply
I agree that the flags of countries to represent languages are a bad idea. For what it's worth, File:English language.svg is inaccurate. It lacks about 97% of the languages listed at the bottom of Category:English language. --Daniel. 13:43, 1 February 2011 (UTC)
Like I said, flags should only be an option, not the default. -- Prince Kassad 15:45, 1 February 2011 (UTC)Reply

What I would like to achieve is consensus towards demoting this use of flags, from option to strongly deprecated option, or something like that.

Just to expand on the technical point, getting flags right for languages is not simply a hard problem, it's an impossible task. The reason is that the mapping is many-to-many, and so if you see (or should I say, manage to spot) e.g. a Swiss flag in one of those mongrel flags, you have to look at the rest of it to finally work out what language the flag was supposed to convey "at a glance" - proving that its intended purpose has completely been defeated, because it's much quicker and easier to just read the word representing the language. So, no matter what alternative representation you come up with, you will always get it wrong, by construction.
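The many-to-many point can be illustrated with a tiny Python sketch. The table below is hypothetical and deliberately incomplete; it only shows that once a country appears under more than one language, its flag alone can no longer identify a single language.

```python
# Hypothetical, deliberately incomplete data for illustration only.
SPOKEN_IN = {
    "German": {"Germany", "Austria", "Switzerland"},
    "French": {"France", "Switzerland"},
    "Italian": {"Italy", "Switzerland"},
}

def languages_for(country):
    """Languages in the table that list `country`."""
    return sorted(lang for lang, places in SPOKEN_IN.items() if country in places)

def ambiguous_countries():
    """Countries whose flag alone cannot identify a single language."""
    all_countries = set().union(*SPOKEN_IN.values())
    return sorted(c for c in all_countries if len(languages_for(c)) > 1)

print(languages_for("Switzerland"))  # ['French', 'German', 'Italian']
print(ambiguous_countries())         # ['Switzerland']
```

In the other direction, each language in the table also maps to several countries, so no single flag covers a language either.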

This is to say nothing about those distracting stroboscopic horrors that ignore a well-established web design best practice of not flashing contents - or in this case, non-contents. I mean, I respect the effort people have put in creating these things, but in my opinion they have to go. 220.100.118.132 23:45, 1 February 2011 (UTC)

I am strongly opposed to telling other language communities what they should and should not be doing stylistically on their projects. I am strongly opposed to removing optional features which in no way detract from the general usage of the project, especially for potential future problems. - [The]DaveRoss 23:59, 1 February 2011 (UTC)Reply
I'm totally with you on your first point. I'm not at all with you on your second, in that I think all optional features have a cost, if only in that it's hard to find the useful optional features if there are too many useless ones. In general, I'm inclined to take a descriptivist view (let's get rid of features that no one is using) rather than a prescriptivist one (let's get rid of features I don't want people to use), but in the particular case of flags, we're more or less endorsing a specific set of language-to-flag mappings (by making it, and only it, available through the "Gadgets" tab of Special:Preferences), so if there's significant risk of people being offended by that endorsement, I would definitely support removing it from gadgets and making people who really want it add it to their vector.css or whatnot. —RuakhTALK 00:11, 2 February 2011 (UTC)Reply
I'm with Dave on this one. I do understand Ruakh's point, but to take the descriptivist approach, you'd need to do a survey of who's using what, which might not be realistic -- for example, how many WT users, who may or may not use / like / not use / dislike the flags, will never register and never see this posting, yet have strong opinions about features of WT that they do make use of?
My own ¥2 here is that I quite like the flags, as they give me a very easy-to-scan visual cue to look for. I can very quickly scroll through a long entry and tell whether there's a Navajo or Japanese entry, for instance, just from the colors -- no reading required, which is easier on the eyes and quicker to visually parse. I think they improve the site's usability. -- Cheers, Eiríkr Útlendi | Tala við mig 03:06, 2 February 2011 (UTC)Reply
I agree strongly with the anonip that we should not have flags, and agree with Ruakh re "we're more or less endorsing a specific set of language-to-flag mappings […], so if there's significant risk of people being offended by that endorsement, I would definitely support removing it from gadgets" (and I think there is such risk). —msh210 (talk) 03:44, 2 February 2011 (UTC)
Yet, the flags are only an option -- you have to turn on the gadget to get them to display at all. So far as I understand it, there is no risk of John Q. Public / Yusef Mustafa / Juan Carlos / Ernst Baumann / etc. wandering over and seeing the flags in any default configuration, which eliminates most of the people that might be offended. This leads me to wonder if we might be ultimately catering to offensensitivity? -- Eiríkr Útlendi | Tala við mig 04:03, 2 February 2011 (UTC)Reply
My understanding is that John Public can currently see the flags in the Wiktionary editions that have them turned on. Five such editions were identified above. 205.228.108.58 05:28, 2 February 2011 (UTC)Reply
Well, I certainly agree with TDR that we should not be deciding for (or, really, even suggesting to) other Wiktionaries that they cease using flags.​—msh210 (talk) 06:01, 2 February 2011 (UTC)Reply
Fair enough, I'm not well-versed in localisation policy. 205.228.108.58 07:16, 2 February 2011 (UTC)Reply
It's not just offensensitivity. Displaying a single flag also implies stuff about dialect that we don't (usually) wish to imply. Plus there's Ruakh's too-many-gadgets point.​—msh210 (talk) 06:01, 2 February 2011 (UTC)Reply

I have explicitly included the hybrid flags above to fully display their ugliness, although aesthetics is subjective and it is not my main point.

I agree with Eirikr that the original intention of flags is to be "easier on the eyes and quicker to visually parse", and if it were for me, I'd just stick to the one, de-facto standard flag. However, for languages that (unlike Japanese and Navajo) don't have a trivial language-country relationship, people do start to take issue with it, and there is no right answer to this ill-posed question, which is why we end up with such... elaborate solutions. 205.228.108.58 05:14, 2 February 2011 (UTC)Reply

While it's true on one side that mapping flags to countries is many-to-many, we are a dictionary. Dictionaries describe words and their origins, they don't try to describe various countries and cultures, that's encyclopedia material. So what we could do is use the flag of the place whose name the language is named after. We use an English flag (the red cross one) because English is named after England, a Japanese flag because Japanese is named after Japan, and so on. This means we don't need to deal with all the different places where a language is spoken, because we can just go with the etymological origin of the language's name. After all, even Americans call their language English... —CodeCat 10:45, 2 February 2011 (UTC)Reply
Then it becomes impossible to do consistently because Klingon, Esperanto, etc. aren't named after countries. I bet there are some where the origin is disputed too; more contentious politics. Equinox 10:53, 2 February 2011 (UTC)Reply
For what it's worth, Esperanto was a particularly easy flag to choose, and a quick commons search turns up a singular flag for that language as well. Certainly the lack of mapping makes it a challenge, but the ability to have multiple flags or even options allowing people to choose which flag they want displayed for each language would alleviate these concerns. - [The]DaveRoss 14:23, 2 February 2011 (UTC)
If we can get a high consensus on one or more sets of language-flag mappings, that set or sets could be offered as a gadget. But I, for one, am strongly opposed to the use of the UK flag to represent English on any such set. I may have other strongly held beliefs in specific cases where some kind of implicit minimisation of minority languages or political groupings is involved. Language names themselves can be taken as offensive to some, but have the advantage of long-standing and inevitable use, which can be attested using our customary methods or by appeal to external "authorities". DCDuring TALK 12:19, 2 February 2011 (UTC)
This whole debate is why I had the foresight to write Flag Law, I don't remember where that was though. - [The]DaveRoss 03:13, 4 February 2011 (UTC)Reply

The entries English, Anglo-Norman and base contain simultaneously the sections "External links" and "See also", that apparently are interchangeable. There are links to Wikipedia and to 1911 Encyclopædia Britannica in both sections.

Compare with talent, mouse and Texas, that link to encyclopedias (including Wikipedia) and other external sites using the "External links" section.

Also compare with second, nostrum and marionette, that contain external links in the "See also" section.

I may or may not be able to once more describe and rationalize an apparent practice, and try to answer why there is this discrepancy in the usage of sections. But, frankly, it seems just too random. Instead, I am going to directly propose a guideline that I believe would look good, be relatively easy to implement and even easier to maintain, and, most importantly, provide consistency.

I propose always using the "External links" and never the "See also" to place external links.

Naturally, boxes such as {{wikipedia}} would be exceptions to the proposed rule, because they are supposed to fit virtually anywhere. That's it. --Daniel. 11:13, 2 February 2011 (UTC)Reply

Interestingly, the ELE says nothing about this issue. But yeah, See also should only contain internal links, and External links should have all links which lead out of English Wiktionary. -- Prince Kassad 18:54, 2 February 2011 (UTC)
Thirded.​—msh210 (talk) 19:28, 2 February 2011 (UTC)Reply
Curiously (given the comments above) I'd prefer everything under see also, unless there's enough content that it's better to separate them for tidiness purposes. What about {{pedia}} then? Mglovesfun (talk) 14:13, 3 February 2011 (UTC)Reply
I have been putting {{pedia}} to See also, given that Wikipedia is semi-external to Wiktionary. If a plain majority prefers putting {{pedia}}, {{commonslite}}, etc. to "External links" rather than "See also", okay with me. --Dan Polansky 14:22, 3 February 2011 (UTC)Reply
Mglovesfun, how much content is enough content? After pondering this subject, I came to the conclusion that I prefer never placing external links (including {{pedia}}) under see also. One reason for my preference, as I stated, is consistency, which is good by itself. Another reason is that "External links" clarifies the limits between Wiktionary and other websites. Compare with how "See also" literally implies "You, Wiktionary user, who came to see a definition, perhaps an etymology, derived terms, inflections and maybe more linguistic information, take your time to admire this encyclopedic article, or list of images, or this additional dictionary, too." We don't want to send this message. Do we? --Daniel. 14:31, 3 February 2011 (UTC)

Hyper-verbs

So, is "Hyper-verbs" an acceptable header? Does it mean something? See the current revision of punch. --Daniel. 16:50, 2 February 2011 (UTC)Reply

Presumably he meant hyperonyms. I've changed it now. —RuakhTALK 18:05, 2 February 2011 (UTC)
The usual Wiktionary heading is "hypernym". See also WT:ELE#Further semantic relations. --Dan Polansky 18:15, 2 February 2011 (UTC)Reply

Wiktionary talk:Etymology

In trying to proofread the page a bit, I've come across two issues, which I've put on the talk page (bottom two, as of this date and time). Mglovesfun (talk) 14:11, 3 February 2011 (UTC)Reply

Linking to Commons.

The File:Compaq keyboard and mouse cropped.jpg contains an automatic message:

This file is from Wikimedia Commons and may be used by other projects. The description on its file description page there is shown below.

All images linked from Commons to Wiktionary share that text. I personally consider it a little hard to spot. I proposed deleting that bland message and replacing it with the box from w:File:Compaq keyboard and mouse cropped.jpg.

Thoughts? --Daniel. 16:22, 3 February 2011 (UTC)Reply

Sounds like a good idea. I've copied over the message. --Yair rand (talk) 23:00, 3 February 2011 (UTC)Reply
Thanks. --Daniel. 09:49, 4 February 2011 (UTC)Reply

Sense IDs

Previous discussions: Wiktionary:Grease_pit_archive/2010/June#Sense_referentials_and_links, Wiktionary:Grease_pit_archive/2010/July#Stable_identifiers_for_meanings

Wiktionary has the problem of not being able to refer to specific definitions in links, which could be fixed by adding anchors containing glosses to individual definitions. The template {{senseid}} could work for this, if there was a simple way to add glosses to links via existing link templates. I propose that {{senseid}} be allowed for general use in the mainspace, but not be bot-added to all entries yet, and that {{l}} be changed to accept the id= parameter to link to definitions with glosses ({{l|en|peach|id=fruit}} would link here). --Yair rand (talk) 02:24, 4 February 2011 (UTC)Reply

I am unconvinced that this is the best, or even a good solution, but I need to think about it. So I am going to think about it and come back here and see all of the good reasons why my thoughts are dumb lined up for me. Get to it! - [The]DaveRoss 03:11, 4 February 2011 (UTC)Reply
It doesn't 'solve' the problem, but templates such as {{context}} and {{gloss}} could contain anchors. This would only work for senses using these glosses, of course, and the same gloss may appear more than once in an entry. Mglovesfun (talk) 00:13, 5 February 2011 (UTC)Reply

Also this would make {{gloss}} better than just writing something inside brackets, which, weirdly, is all the template does right now. No clever span stuff. — This unsigned comment was added by Mglovesfun (talk • contribs) at 5 February 2011.

Adding anchors to {{context}} couldn't really work, as there are lots of times multiple senses of a word that contain the same context tag. I don't really see how adding anchors to {{gloss}} would really be helpful either. We need to have some way of connecting senses, and if no one has any better way, I don't see why not to use the {{senseid}} template. --Yair rand (talk) 06:09, 7 February 2011 (UTC)Reply
I do not know if this is relevant or useful, but Icelandic entries like falla#Icelandic anchor synonyms to senses. - -sche 02:54, 9 February 2011 (UTC)Reply
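
To make the id= idea above concrete, here is a minimal sketch of how a link template could turn an entry name, a language and a gloss into a link target. The anchor format (language name, hyphen, gloss) and the names sense_anchor and sense_link are assumptions chosen for illustration; they are not the actual behaviour of {{senseid}} or {{l}}.

```python
from typing import Optional
from urllib.parse import quote


def sense_anchor(language: str, gloss: str) -> str:
    """Build a hypothetical anchor name for one sense (format is assumed)."""
    # Spaces become underscores, as in ordinary wiki section anchors.
    return f"{language}-{gloss}".replace(" ", "_")


def sense_link(entry: str, language: str, gloss: Optional[str] = None) -> str:
    """Return a link target; point at a single sense when a gloss is given."""
    page = quote(entry.replace(" ", "_"))
    if gloss is None:
        # No id: fall back to linking to the language section.
        return f"{page}#{quote(language.replace(' ', '_'))}"
    return f"{page}#{quote(sense_anchor(language, gloss))}"
```

Under these assumptions, sense_link("peach", "English", "fruit") gives a target ending in the "fruit" anchor, while omitting the gloss links to the language section as links do today. As noted above, a scheme like this only works if each gloss id is unique within its language section.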

Poorly attested languages

The obvious solution, for me anyway, is that instead of listing them on individual language consideration pages (such as WT:About English), we have a CFI subpage on attestation. Like I say, I favor the use of subpages to 'declutter' the CFI page, so that it contains only criteria for inclusion, not discussion about those criteria.

Anyway, something like Wiktionary:Criteria for inclusion/Attestation should do it. And something like:

"The following are considered exceptions to the 'three durably archived citations' rule as they are poorly attested"

Then stuff like

  • Ancient Greek: 1
  • Old English: 2
  • Old French: 2

clearly it can only be one or two; not zero, and three is the norm. Mglovesfun (talk) 10:58, 4 February 2011 (UTC)Reply

But how is one going to add to this list? Via a series of VOTEs? I would favor a general exception for ancient languages. (probably obscure ones as well but it's hard to define that) -- Prince Kassad 11:12, 4 February 2011 (UTC)Reply
Why not just consider all the works in languages with only a small amount available to be "well-known works"? --Yair rand (talk) 20:59, 4 February 2011 (UTC)Reply
Because it twists the meaning of the phrase, and still doesn't define "a small amount". Does "The Flag of My Country. Shikéyah Bidah Na'at'a'í: Navajo New World Readers 2" really count as a well-known work by any standard? I would be generous as to "well-known works" with Navaho, but not that generous.--Prosfilaes 21:12, 4 February 2011 (UTC)Reply
That doesn't strike me as a practical solution. I'm with Kassad; a general exception for ancient languages would be better, though we probably don't want to accept modern translations, like Ancient Greek Harry Potter. We could define obscure languages; say if the Ethnologue gives them less than a million speakers, ask for 2, less than 100,000, ask for 1. That doesn't achieve everything; Oromo (17.3 million speakers) is probably a lot harder to cite than Estonian (1.0 million speakers). But it is a definition that will catch all the American and Australian languages.--Prosfilaes 21:12, 4 February 2011 (UTC)
I'm not sure that this would be controversial, or even 'interesting' enough for editors to disagree over it. Might not take as much finagling as you might think. Re number of speakers, not the best criterion as speakers and written language are independent. Middle French is very well attested because it's relatively recent (post 1400) but has zero speakers, since it's 'become' Modern French. Mglovesfun (talk) 00:09, 5 February 2011 (UTC)Reply
I don't know what the difference between 1, 2 and 3 should be, and if we can set up rules for that, why do we need to discuss each and every one? Number of speakers isn't perfect, but I wasn't suggesting that it be used for dead languages. The only non-dead language that stands out as being more easily attestable than that rule would imply is Yiddish, and given the lack of good Yiddish OCR and of Yiddish readers, asking for only 2 attestations wouldn't be a big deal. Otherwise, it improves the condition of many, many languages dramatically. (I also wasn't planning on it being used for artificial languages, which need their own rules here.)--Prosfilaes 00:44, 5 February 2011 (UTC)
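
For reference, the speaker-count rule floated above (fewer than a million speakers: 2 citations; fewer than 100,000: 1; otherwise the usual 3) can be written down as a small function. This is only a sketch of that suggestion for living languages; the function name is made up, and dead and artificial languages are deliberately left out because the discussion says they would need their own rules.

```python
def required_citations(speakers: int) -> int:
    """Citations needed for a living language under the suggested thresholds."""
    if speakers < 0:
        raise ValueError("speaker count cannot be negative")
    if speakers < 100_000:
        return 1  # very small speaker base: one citation
    if speakers < 1_000_000:
        return 2  # fewer than a million speakers: two citations
    return 3      # the normal 'three durably archived citations' rule
```

With the figures quoted above, both Oromo (17.3 million) and Estonian (1.0 million) stay at three citations, which is exactly the limitation already pointed out: the number of speakers says little about how much written material is available.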

Dawnraybot and pronunciations

Some of the pronunciations mentioned by Dawnraybot are incorrect, at least in scruterais (now fixed) and scruterait (to be fixed), the only ones I checked. There are probably many more pronunciations to be changed. Lmaltier 10:24, 5 February 2011 (UTC)Reply

Which specific forms have errors? If we can't be specific about the forms we should bot-remove all pronunciations from Dawnraybot. Nadando 00:19, 6 February 2011 (UTC)Reply
I've been finding errors from Dawnraybot all over the place for a while. No specific forms have errors, as often the pronunciation of the stem is wrong, [18] [19] [20] but often it's the ones that end in /ɛ/ that are shown with /e/, understandable because some people pronounce them that way. —Internoob (DiscCont) 03:40, 6 February 2011 (UTC)Reply
Yes. For example, the forms of peinturlurer have wrong SAMPA (using the dollar sign). And the forms of récurer have wrong IPA as well. -- Prince Kassad 18:18, 6 February 2011 (UTC)Reply
If Plowman is right with his statement above — and for wonderfully obvious reasons we may probably assume that this is the case — then Nadando's suggestion is probably the easiest solution, even though a number of correct pronunciations might be deleted as well. -- Gauss 18:44, 6 February 2011 (UTC)Reply
I asked a native speaker about the final /e/ vs. /ɛ/. She said they are different but for most people you won't hear the difference (that is, they pronounce them the same). So that's so minor I wouldn't worry about it, that is, listing pronunciations which do exist but aren't the ones listed in dictionaries. The harder part, like Internoob says, is just tracking down the ones that are totally wrong. Mglovesfun (talk) 20:50, 6 February 2011 (UTC)Reply
In case of any errors, I apologize: they were not intentionally incorrect. --Plowman 20:55, 6 February 2011 (UTC)Reply

Customizing TOC

Does anyone know how to customize the appearance of the table of contents of an entry? I would like to know the following:

  • 1. How do I make only language names visible using CSS, while hiding "Noun", "Synonyms" and other deeper headings from the TOC?
  • 2. How do I replace the numbered lists with bulleted lists using CSS?
  • 3. How do I hide the list mark altogether?

Thanks for any input. --Dan Polansky 13:51, 6 February 2011 (UTC)Reply

#1: Add table#toc ul ul { display: none; } to your CSS.
#3: Add table#toc span.tocnumber { display: none; } to your CSS.
#2: This is a bit trickier. The version that seems to fit best with the Vector scheme is
table#toc ul
{
  list-style-type: square;
  list-style-image: url(http://bits.wikimedia.org/skins-1.5/vector/images/bullet-icon.png?1); 
  margin-left: 1.5em;
}
(plus the CSS for #3), but you seem to be using Monobook?
RuakhTALK 14:32, 6 February 2011 (UTC)Reply
Thanks! It works well even with Monobook. --Dan Polansky 14:39, 6 February 2011 (UTC)Reply
A Monobook appearance can be obtained by using Monobook bullet, like this:
table#toc ul {
  list-style-type: square;
  list-style-image: url(http://bits.wikimedia.org/skins-1.5/monobook/bullet.gif?1);
  margin-left: 1.5em;
}
--Dan Polansky 08:13, 8 February 2011 (UTC)Reply
Re #1 (table#toc ul ul { display: none; } ): I have tried it and it has at least one downside: it also applies to TOC in Beer parlour, so it hides headings of the discussions. --Dan Polansky 14:54, 11 February 2011 (UTC)Reply
Try .ns-0 table#toc, etc.​—msh210 (talk) 15:55, 11 February 2011 (UTC)Reply

Poll: Etymology and the use of less-than symbol

I would like to ask you about your preference in using "<" vs "from" in etymologies in Wiktionary. Etymology sections in Wiktionary are not united in the use of "<" vs "from".

An example of the two different formats:

A longer example of the two formats:

This poll disregards whether the etymology should start with "from" or omit "from" from the start (or "<", respectively). The preference of "<" is compatible with the format "From A < B < C < D"; the only things in question are the second, third, and later words or symbols.

I tend to prefer "<", but am okay with "from" if this is the majority preference. For me, the use of "<" makes it easier to scan the string of items using my eyes and locate the individual items, while "from" gets more easily lost in the jumble. One argument against the use of "<" is that its meaning is much less obvious than the meaning of "from". But I think the meaning of "<" can be quickly picked up by the user of the dictionary. Century 1911 and Encarta[21] use "<", while some other dictionaries including Merriam-Webster online[22] use "from".

This poll combines discussion with a clear indication of one's current preference, a preference that can change later as a result of discussion. Feel free to make other proposals and comments alongside your indication of your preference.

Thank you for your attention and your input! --Dan Polansky 08:42, 7 February 2011 (UTC)Reply

Preference 1

I prefer the use of "<" over the use of "from".

  1. Support Mglovesfun (talk) 12:19, 7 February 2011 (UTC). And the reason is, otherwise you repeat 'from' a lot which I find irritating. I don't feel that strongly about it, I don't mind being outvoted. Mglovesfun (talk) 12:19, 7 February 2011 (UTC)Reply
  2. Support Dan Polansky 14:14, 7 February 2011 (UTC) As I said, for me, the use of "<" makes it easier to scan the string of items using my eyes and locate the individual items, while "from" gets more easily lost in the jumble. I am okay with going by the option that is preferred by a plain majority. --Dan Polansky 14:14, 7 February 2011 (UTC)Reply
  3. Support We can do fancy things if we use a <, we can do even more fancy things if etymologies were templatized. We can do nothing fancy with a from. Also I think it is a much simpler, easier to read solution. - [The]DaveRoss 05:04, 9 February 2011 (UTC)Reply
    We can do fancy things if etymologies are templatified. For example, if, instead of (at [[sully]])
    From {{etyl|fro}} {{term|lang=fro|souillier}} (> {{etyl|fr|-}} {{term|souiller|lang=fr}}). Compare {{term|soil|lang=en}}.
    we had
    {{from|lang=fro|souillier}} {{whence|etyl=fro|lang=fr|souiller}} {{more at|soil|lang=en}}
    and instead of (at [[wend]])
    {{etyl|enm}} {{term|lang=enm|wenden}} from {{etyl|ang}} {{term|lang=ang|wendan||to turn, go}}, causative of {{term|windan|||to wind|lang=ang}}. Akin to {{etyl|ofs|-}} {{term|lang=ofs|wenda}}, {{etyl|osx|-}} {{term|lang=osx|wendian}}, {{etyl|non|-}} {{term|venda||to wend, to turn|lang=non}} ({{etyl|da|-}} {{term|vende|lang=da}}), {{etyl|de|-}} {{term|wenden||to turn|lang=de}} and {{etyl|got|-}} {{term|𐍅𐌰𐌽𐌳𐌾𐌰𐌽|sc=Goth|tr=wandjan|lang=got}}.
    we had
    {{from|lang=enm|wenden}} {{from|lang=ang|wendan||to turn, go}} {{etymon form of|form=causative|windan||to wind|lang=ang}} {{cognate|lang=ofs|wenda}} {{cognate|lang=osx|wendian}} {{cognate|non|venda||to wend, to turn}} {{whence|etyl=non|vende|lang=da}} {{cognate|wenden||to turn|lang=de}} {{cognate|𐍅𐌰𐌽𐌳𐌾𐌰𐌽|sc=Goth|tr=wandjan|lang=got}}
    that'd be both machine-readable and (with some work) human-readable as opposed to the current system which is only the latter. (However, I still think that the human-readable display should include "from" rather than "<".  :-) ) But I don't know what fancy things we can do with "<" (untemplatified) that we can't do with "from".​—msh210 (talk) 16:34, 9 February 2011 (UTC)Reply
    We can't easily parse from since it is a word which may appear in an etymology for other reasons. - [The]DaveRoss 20:24, 9 February 2011 (UTC)Reply
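
The parsing point can be illustrated with a small sketch. The two functions below split an untemplatized etymology chain, one on the "<" symbol and one on the standalone word "from"; the function names and the sample strings in the note that follows are made up for illustration and are not taken from any entry.

```python
import re
from typing import List


def split_on_symbol(etymology: str) -> List[str]:
    """Split an etymology chain on the "<" separator."""
    return [part.strip() for part in etymology.split("<")]


def split_on_word(etymology: str) -> List[str]:
    """Naively split on the standalone word "from" (case-insensitive)."""
    parts = re.split(r"\bfrom\b", etymology, flags=re.IGNORECASE)
    return [part.strip() for part in parts if part.strip()]
```

For a two-step chain whose gloss happens to contain the word, such as 'From A, from B ("away from")', split_on_word returns three pieces because it also cuts inside the gloss, while split_on_symbol applied to 'A < B ("away from")' returns the expected two. A templatized etymology would avoid the problem for either notation.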

Preference 2

I prefer the use of "from" over the use of "<".

  1. I prefer "from", but sometimes "which is from" or the like, whichever fits best in the paragraph, over "<". But I don't mind "<" terribly.​—msh210 (talk) 09:46, 7 February 2011 (UTC)Reply
  2. Support Ƿidsiþ 09:55, 7 February 2011 (UTC), in general.
  3. Support H. (talk) 11:52, 7 February 2011 (UTC) Yes please, this has been bothering me for a while, Wiktionary is not paper, remember? Furthermore, I never understood why to use < but not >. Furthermore, please begin with ‘From’ as well, in the same line of thinking: it is a sentence, not a telegraph message.
    You are entitled to your preference, but the use of "<" has not much to do with whether Wiktionary is paper. You seem to imply that the people who prefer "<" do so to make the etymology shorter, but that is not necessarily the case, not with me anyway. For me, it is all about the ease of visual parsing. --Dan Polansky 12:16, 7 February 2011 (UTC)Reply
    I always thought that it's < because it looks like an arrow pointing from oldest to newest, showing the direction of progression. —Internoob (DiscCont) 03:00, 9 February 2011 (UTC)Reply
  4. Support CodeCat 11:58, 7 February 2011 (UTC)
  5. Support It's not a huge thing for me, but from is more obvious and less jargony.--Prosfilaes 20:49, 7 February 2011 (UTC)Reply
  6. Support the less-than sign is really confusing to most people. It's a far better idea to write it out. -- Prince Kassad 20:53, 7 February 2011 (UTC)Reply
  7. Support — is the principle that Wiktionary is not paper applicable here? It has the space to spell words out rather than to use unclear abbreviations. - -sche 05:14, 8 February 2011 (UTC)Reply
  8. Support Daniel. 15:16, 8 February 2011 (UTC) I personally feel more comfortable with "from", but I don't mind if people choose any of these possibilities. --Daniel. 15:16, 8 February 2011 (UTC)Reply
  9. Support RuakhTALK 16:08, 8 February 2011 (UTC)
  10. Support DCDuring TALK 16:23, 8 February 2011 (UTC)Reply

Preference 3

I am indifferent or indecisive about the use of "from" vs the use of "<".

  1. Support Vahag 13:57, 7 February 2011 (UTC) I use "from" in one- and two-member etymologies, but "<" in longer ones. Long chains with < are easier to read. --Vahag 13:57, 7 February 2011 (UTC)Reply
  2. Support Internoob (DiscCont) 03:01, 9 February 2011 (UTC) Per above.
  3. I'm not really a fan of either method. In my opinion, a template would be better. --Yair rand (talk) 05:26, 9 February 2011 (UTC)Reply
    Isn't that orthogonal to the question under discussion?--Prosfilaes 19:49, 9 February 2011 (UTC)Reply
    Not really. If "from" and "<" were identical in practice, and we had templates to display them, then any user would be able to see what he wants to see. (For example, I may see "from" and you may see "<".) --Daniel. 16:10, 10 February 2011 (UTC)Reply
    There would necessarily be a default view, though, so this discussion would not be moot.​—msh210 (talk) 06:41, 11 February 2011 (UTC)Reply

Discussion

Wiktionary:"/Templates

I created that and it was speedy deleted, without asking me. Well, it is not used as a redirect, but like most shortcut pages, it is meant to be used in the search box. I regularly want to look up one of those templates, and it is easier to type ‘WT:"/Templates’ than ‘Wiktionary:Quotations/Templates’, let alone remember the name of the latter page. So why not keep it? H. (talk) 11:14, 7 February 2011 (UTC)

Because we don't allow anything to redirect to anything. And like you say, it's your personal redirect. Another admin may disagree with me. Have you considered just adding the link from your user page? Oh having said that WT:" already exists. Sigh, shoot (that's a euphemism). Mglovesfun (talk) 12:29, 7 February 2011 (UTC)
Definitely agree with Mg, Wiktionary namespace shouldn't have shortcut redirects. If you want to redirect to something in the Wiktionary namespace, a WT namespace page should be created for the purpose, maybe WT:QT for Quotations/Templates (which seems like a badly titled page to me). - [The]DaveRoss 14:43, 7 February 2011 (UTC)Reply
I'm not saying that no redirects should exist, just that you want to avoid ambiguity. For WT:QUOTE to redirect to WT:Quotations seems fine, as it's obvious what you're linking to. Having said that, we use a lot of initialisms which are only obvious once you click on the page. Still, that's not a reason to allow anything that someone cares to type in and hit enter. Mglovesfun (talk) 14:47, 7 February 2011 (UTC)Reply
But [[WT:"]] seems pretty obvious, and this is an obvious extension of it.​—msh210 (talk) 21:12, 7 February 2011 (UTC)Reply
Wiktionary: and WT: namespaces coincide now, TDR. (See, e.g., [[WT:Criteria for inclusion]] and [[Wiktionary:CFI]].) If you mean that only short titles should redirect, well, there are loads and loads of redirects within the Wiktionary: namespace with long titles. Also some with mixed long and short titles (which I mention in case that's your objection), like [[WT:Editable ELE]]. Undelete or, if restored, keep.​—msh210 (talk) 21:12, 7 February 2011 (UTC)
Oh, I had forgotten that we made WT an alias instead of just a pseudo-namespace. - [The]DaveRoss 21:50, 7 February 2011 (UTC)Reply

Wiktionary:Requests for moves, mergers and splits

Badly needs input. TBH I'm just gonna grant some of these as unopposed (1-0) unless someone objects. Mglovesfun (talk) 11:30, 8 February 2011 (UTC)Reply

Appendix-only constructed languages

Subcats of Category:Appendix-only constructed languages should not also be subcats of Category:All languages, right? I mean, Category:APL language (for example) should not show up among all the real languages in Category:All languages, right?​—msh210 (talk) 16:09, 9 February 2011 (UTC)Reply

I would advise deleting Category:Appendix-only constructed languages, because its title is too technical, in favor of creating "Category:Minor constructed languages" and "Category:Computer languages"...
As for your question, yes, APL should be a member of Category:All languages. Why not? It's a constructed language too, per constructed language. --Daniel. 16:26, 9 February 2011 (UTC)Reply
Re "why not": Because Category:All languages is not a topical category where we can debate whether something "is a constructed language too" and belongs in it. It's a lexical category, and we've decided to exclude APL (and everything else in Category:Appendix-only constructed languages) from the lexicon.​—msh210 (talk) 16:38, 9 February 2011 (UTC)Reply
I think having Category:Appendix-only constructed languages in Category:All languages is sufficient. Mglovesfun (talk) 16:41, 9 February 2011 (UTC)Reply
Having all languages in Category:All languages makes them easier to be found. If I want to find Category:Klingon language, or simply want to know whether we have a category for Klingon, I would like to have the possibility of browsing through the "K" part of Category:All languages. If any language is deliberately excluded from Category:All languages, I would at least expect this fact to be announced somewhere, like "This category does not contain certain minor or computer languages, that may be found in this other category." --Daniel. 16:50, 9 February 2011 (UTC)Reply
Another 2p in favor of including, well, all languages under Category:All languages. I'm actually a bit surprised this is even being discussed. The category name is, after all, all languages, and the category page states quite clearly that This category contains, indeed, all languages, or rather, all language names (in English). -- Bemused, Eiríkr Útlendi | Tala við mig 19:50, 9 February 2011 (UTC)Reply
I agree with User:Eirikr and User:Daniel. — it is counter-intuitive to include languages only somewhere other than Category:All languages. - -sche 08:31, 10 February 2011 (UTC)Reply
Above all, most of the categories from Category:Appendix-only constructed languages should be deleted, per Wiktionary:Votes/pl-2010-10/Disallowing certain appendices. For instance, Category:Klingon language contains Appendix:Klingon/ghommey, which should not exist per the vote. After the subpages are deleted, the category for Klingon gets pointless. --Dan Polansky 09:14, 10 February 2011 (UTC)Reply
I was under the impression that that vote was about fictional universe appendices, not fictional language appendices. --Yair rand (talk) 09:26, 10 February 2011 (UTC)Reply
Oops, you are right. So maybe we should have another vote that extends the treatment also to appendix-only languages. --Dan Polansky 12:17, 10 February 2011 (UTC)Reply
The treatment of fictional universe appendices is still relatively unclear. However, I can safely assume that, if appendix-only language appendices eventually should follow the rule of being defined in lists and never in individual entries, then the following facts about Category:Klingon language should be taken into consideration:
  1. We would still eventually have many lists of Klingon words, possibly alphabetically (Appendix:Klingon/A, Appendix:Klingon/B, Appendix:Klingon/C...), per part-of-speech (Appendix:Klingon/List of nouns, Appendix:Klingon/List of adjectives), and/or per subject (Appendix:Klingon/List of animals, Appendix:Klingon/List of clothing), thus justifying the existence of Category:Klingon language.
  2. We would still eventually have appendices for certain pieces of information that are common for other languages as well, such as Appendix:Klingon Swadesh list and Appendix:Klingon given names, thus justifying the existence of Category:Klingon language.
  3. We would still perhaps have Klingon templates (for example, to display multiple scripts), thus justifying the existence of Category:Klingon templates and Category:Klingon language.
  4. Not to mention Category:Klingon derivations and Wiktionary:Requested entries (Klingon).
  5. Category:Klingon language fits an intuitive and organized category tree of "all languages", and displays, or should display, relevant information such as script, family, a link to an entry, links to policies, subcategories, a link to a Wikipedia article and a useful warning about Klingon being forbidden in entries, so it is justified.
That's it. --Daniel. 14:00, 10 February 2011 (UTC)Reply
I think the problem is that Category:All languages is misnamed: it doesn't contain entries like English and French, as its name implies, but rather categories like Category:English language and Category:French language. A more accurate name might be "Category:All entries by language", or "Category:All language categories". —RuakhTALK 23:41, 10 February 2011 (UTC)Reply
In my opinion, "Category:All languages" is not a bad name, but for the sake of clarity...
--Daniel. 23:45, 10 February 2011 (UTC)Reply
Category:Klingon language is in this category, albeit not directly, but via a subcategory. Mglovesfun (talk) 23:53, 10 February 2011 (UTC)Reply
Category:Klingon language is (and always has been as far as I remember) a direct member of Category:All languages; check again. --Daniel. 23:57, 10 February 2011 (UTC)Reply
Re: "Category:Portuguese language, Category:English language, Category:French language, etc. contain not only entries but much more information": O.K., but let's focus on one problem at a time. ;-)   —RuakhTALK 00:06, 11 February 2011 (UTC)Reply
@Ruakh: "Category:All entries by language" and "Category:Entries by language" sound good to me. These categories mostly contain entries; the only other things they contain are indexes and appendixes, but these are very few compared to entries, so the misnaming is not too bad. If we want to be more accurate, we can have "Category:Content by language". --Dan Polansky 14:24, 11 February 2011 (UTC)
Language categories such as Category:English language and Category:Portuguese language also contain templates, relatively many pages of rhymes, few categories or pages of requests for attention, requested entries, etc., and citations. I appreciate accuracy, and I also appreciate your suggestion of "Category:Content by language". --Daniel. 15:47, 11 February 2011 (UTC)Reply
I like "Content by language".​—msh210 (talk) 15:57, 11 February 2011 (UTC)Reply

Poll: Choosing topical categories

I would like to know the opinions of other Wiktionarians about the problem mentioned in WT:BP#How to choose topical categories:

Most entries fall into the scope of various redundant topical categories simultaneously. For example, German Shepherd may be categorized into Category:Herding dogs, Category:Dogs, Category:Canids, and so on. Should "German Shepherd" be a member of all these categories? Or, perhaps, should it simply be a member of only the narrowest one, that is, Category:Herding dogs, and all the others would be merely implied?

This poll is not about deciding categorization of topical categories, or deciding names of topical categories, because I believe these are complex and separate problems to be discussed eventually. Nonetheless, I also believe our category tree is good enough to undergo this project of becoming more consistent by actually deciding and letting editors know where they should categorize entries.

Feel free to make other proposals and comments. Thank you for your attention and your input. --Daniel. 18:04, 10 February 2011 (UTC)Reply

Preference 1: Narrowest topical categories only

I prefer all entries as members of only the narrowest topical categories available.

If you agree 100% with this practice, or if you agree in essence with it but have ideas of some situations where it would be better to use less narrow categories, please vote for this option. Feel free to elaborate your thoughts.

Examples:

  1. Support Daniel. 18:04, 10 February 2011 (UTC)Reply
    I, personally, prefer the system of using only the narrowest category, because:
    1. It avoids redundant superpopulated categories such as Category:Nature with thousands of terms from Category:Animals, Category:Plants and Category:Weather. Feel free to correct me, but I believe they don't have any practical use; that is, I don't remember or assume that any particular target group of Wiktionary users would need, want or appreciate the existence of superpopulated topical categories. (And these possible wide lists have the potential to be scanned by external tools anyway, if anyone bothers to mention or create such a tool.)
    2. It is easier to visually scan the list of categories from an entry that avoids that redundancy. For example, the list of categories of the current revision of "dog" randomly contains various versions of Category:Canids, Category:Mammals, and Category:Animals, and I wouldn't call it the most comfortable piece of text to be read and understood.
    3. It helps to organize existing categories, by keeping a balance of populated "narrow" categories, and underpopulated "wide" categories. For example, with this system, if any editor finds any name of a specific animal on Category:Animals (or Category:fr:Animals, Category:pt:Animals, etc.), he/she would automatically know that these entries need to be recategorized under Category:Insects, Category:Mollusks, and so on. --Daniel. 18:04, 10 February 2011 (UTC)
  2. Support Mglovesfun (talk) 18:18, 10 February 2011 (UTC)
  3. I agree that, if we are to have topical categories at all, then entries should be only in the narrowest available. On the other hand, if we are to have topical categories, IMO the categories should be as broad as feasible, so that the examples given for this option are not in accord with my view. (Even the wording of the option, "I prefer all entries as members of only the narrowest topical categories available", while technically correct, is misleading in that someone might mistakenly infer that I prefer having topical categories.) —msh210 (talk) 18:40, 10 February 2011 (UTC)
  4. Support Internoob (DiscCont) 00:38, 11 February 2011 (UTC) I actually thought that that was (more or less) already one of those de facto policies. —Internoob (DiscCont) 00:38, 11 February 2011 (UTC)
  5. Support, but you need to keep in mind that a term like 'narrow' only really works if the categories actually behave like a tree. But they don't always: some categories are nested in strange ways and there isn't really a clear definition of what's 'narrower' than something else. —CodeCat 14:10, 11 February 2011 (UTC)

Preference 2: All topical categories

I prefer all entries as members of all the topical categories available.

If you agree 100% with this practice, or if you agree in essence with it but have ideas of some situations where it would be better to disregard certain topical categories, please vote for this option. Feel free to elaborate your thoughts.

Examples:

Preference 3: Indifference

I am indifferent or indecisive about the choice of topical categories.

Discussion

  • I oppose the two simple options offered in this poll. The second option seems unworkable; it would lead to huge top-level categories. The first option is not too bad, just that I can imagine wanting to have an entry both in a finest-level category and its supercategory. The first option is clear and simple, but not necessarily the best one. If I were asked which of the two options I prefer, I would clearly prefer option 1 over option 2, but that does not mean that I want the wording of option 1 to become a policy for Wiktionary practice. --Dan Polansky 18:30, 10 February 2011 (UTC)
  • I, too, oppose the two simple options offered in this poll. [[German Shepherd]] should be in Category:Dogs. I think a better approach might be to think about how many entries a topical category should have in order to be useful; Category:Dogs should have (say) the 500 most common/important/whatever dog-itive terms, most of which will also be in subcategories, but less important words should only be in subcategories. (That won't be perfect, because there might be some dog-related words that it's hard to categorize any more narrowly than that, such that they have to go into Category:Dogs no matter how unimportant they are; but I think that's the kind of logic that needs to be applied.) —RuakhTALK 16:28, 11 February 2011 (UTC)

Poll: Deprecation of topical categories

As of this revision, DCDuring has created an additional option on the poll above, to deprecate topical categories. This option has since been supported by two people.

Since the deprecation of topical categories is a separate subject, and there was no space to formally oppose or just comment on this proposal in the poll above, I'm moving it to this additional poll, naturally with the additional options.

Once more, feel free to make other proposals and comments. Thank you for your attention and your input. --Daniel. 21:00, 10 February 2011 (UTC)

Option 1: Support deprecation

Vote for this option if you believe that topical categories should be deprecated at this time, with no further topical categories to be created after 20:37, 10 February 2011 (UTC).

  1. Support DCDuring TALK 20:37, 10 February 2011 (UTC)
  2. Support I have to wonder if anyone even used these. They don't seem to be worth the trouble to me. -- Prince Kassad 20:43, 10 February 2011 (UTC)
  3. Support strongly. See my comments with the same timestamp in the next subsection. —msh210 (talk) 06:36, 11 February 2011 (UTC)

Option 2: Oppose deprecation

  1. Oppose Daniel. 21:00, 10 February 2011 (UTC)
    Topical categories serve the purpose of subdividing Wiktionary into various lists of words related by their subjects, including lists that would normally be found as various types of dictionary: a dictionary of medicine, of law, of technology, etc.
    If I want to know terms used in psychology, I use Category:Psychology. If I want to know the "vulgarities" of English and other languages, I navigate through Category:Vulgarities. If I want to know terms used in chess, I use Category:Chess.
    Appendices, Wikisaurus pages, and topical categories are valuable tools for the creation and maintenance of various lists of words by their contexts; each of these methods has its own scope and qualities, and I personally like them all. Topical categories are not only dynamic, but also easy to populate and easy to find at the bottom of each entry, and readers are used to them after all these years of creating and keeping topical categories. Unlike Wikisaurus, the scope of topical categories is simple and flexible enough to often allow adjectives, nouns, adverbs, etc. together, and unlike appendices, they do not necessarily carry additional information that may not be required, such as definitions of each word, or very detailed explanations of usage and existence of groups of words. Topical categories also usually don't need to be updated often to keep up with entries.
    Furthermore, the appearance of categories is consistent among all projects, so if I find an interwiki link from Category:Childish to fr:Catégorie:Langage enfantin and sv:Kategori:Barnspråk, I know I will be able to navigate them without having to learn much; and, if I compare the same topical category in two Wiktionaries, I may find the terms that are missing from them. In addition, it's easy to find or create tools to browse topical categories and gather specific contents (such as downloading specifically the dictionary of chess). For those reasons, I oppose their deprecation.
    I almost forgot to mention that categories get praised or criticized occasionally, but regularly, at WT:Feedback, so other people are aware of them and use them too. --Daniel. 21:33, 10 February 2011 (UTC)
    Terms found in a specialized dictionary (and not generally defined the same way in a general dictionary) are jargon terms, which are properly tagged with an appropriate jargon context tag like {{mathematics}} and categorized in the appropriate category like Mathematics. When I voted in support of getting rid of topical categories, it was in support of getting rid of a category that contains all mathematics-related terms: that's what I think of as a topical category. I oppose getting rid of jargon categories. Thus, the Mathematics category would remain (under my proposal) but contain only jargon terms (and the category description on the category page would indicate as much). Category:Herding dogs would presumably go (unless there's a jargon of the field of herding dogs specifically). So, Daniel, Psychology and Chess will remain categories and will contain, as you put it, "terms used in psychology" and "terms used in chess" — but not terms used outside of but about psychology or chess. Childish and Vulgarities are not topical categories at all: they're categories for registers, not topics. They'd of course be kept. (They should probably be renamed "English...", but that's a separate issue.) —msh210 (talk) 06:36, 11 February 2011 (UTC)
    With the distinction presented by msh210 in mind, let me clarify that, in my opinion, Category:Greens to find all names of shades of green of a language, Category:Dogs to find all names of breeds of a language, etc. are equally helpful as Category:Chess, Category:Psychology, etc. and for exactly the same reasons. --Daniel. 15:43, 11 February 2011 (UTC)
    I may also use topical categories to check if certain terms already exist in Wiktionary and are properly categorized. For example, by reading Category:Chess, I know that we lack many names of strategies of this game; and, by reading Category:Dogs, I know that we lack many names of breeds of dogs. --Daniel. 21:37, 10 February 2011 (UTC)
    You seem to have misread my proposed option in the above straw poll. I am not sure why. A category such as Category:Vulgarities is clearly linguistic in its scope. The truly topical categories are essentially part of the overall non-linguistic, encyclopedic tendency within Wiktionary. I consider this tendency destructive of Wiktionary as a language resource and incompetently duplicative of the superior work on encyclopedic topics that is carried out at Wikipedia. It seems to me that we need more work on attestation of old and new terms and their senses and usage, something which is not likely to be covered in WP. DCDuring TALK 22:57, 10 February 2011 (UTC)
    I believe I did not misread anything you wrote in this or the above thread. First of all, I've seen, more than once, people using the umbrella of "topical categories" for anything whose naming system basically consists of "language code, then colon, then label". Second, the proposal does not make a clear distinction of categories to be kept or deleted, and their reasons (though you gave some reasons now). And, third, even if I assume that your proposal never meant to attack Category:Vulgarities, my defense of the existence of that category does not contradict my opposition. Apparently you have misread yourself. :p --Daniel. 23:37, 10 February 2011 (UTC)
  2. Oppose The proposal is crazy talk. Categories are a living part of any wiki. They are created by users when needed, and the bad ones can be deleted, given some criteria for good and bad. Banning the creation of categories is unnatural. --LA2 21:46, 10 February 2011 (UTC)
  3. Oppose. --Yair rand (talk) 22:12, 10 February 2011 (UTC)
  4. Oppose Mglovesfun (talk) 23:08, 10 February 2011 (UTC). I don't oppose topical categories. I get annoyed by our lack of regulation, essentially meaning that it's a free-for-all, but I wouldn't want all topical categories deleted. Mglovesfun (talk) 23:08, 10 February 2011 (UTC)
  5. Oppose Internoob (DiscCont) 00:45, 11 February 2011 (UTC) Per Daniel.
  6. Oppose Dan Polansky 08:17, 11 February 2011 (UTC) I oppose deprecation of topical categories. For clarification of what I mean by "topical category": "vulgarities" is not a topical category; "Category:de:Latin derivations" is not a topical category; "mammals" and "vehicles" are topical categories; "geography" can be seen as a topical category, which contains "mountain", but it could be designed and regulated as a category of terms that are only used in geography, in which case "mountain" would not belong to "Category:Geography". While I do support having topical categories, I do not support any arbitrary level of their granularity: Category:Herding dogs could be too specific. --Dan Polansky 08:17, 11 February 2011 (UTC)
  7. Weak oppose. I don't think we're doing a great job with topic categories, but they do fall in our remit as a dictionary-cum-thesaurus. If we allow ===Synonyms=== and ===Hypernyms=== and [[Wikisaurus:foo]] and so on, then clearly we're not just mapping from words to meanings, but also from meanings to words; and a topic-category system, done well, is a key component of that. —RuakhTALK 13:55, 11 February 2011 (UTC)

Option 3: Indifference

I am indifferent or indecisive about the existence or deprecation of topical categories.

Discussion

If this turns out to get enough support, we should start a formal VOTE to implement this change. -- Prince Kassad 21:14, 10 February 2011 (UTC)

Flood Flag

In the spirit of getting things done, I propose we remove the whole Flood Flag "process" for administrators. I propose that administrators be able to use the flag at their discretion, with the understanding that while it is enabled they will make only edits which are non-controversial, and that they use a verbose reason when they flag themselves, e.g. "adding lots of glosses to trans sections". I don't see any reason why we need to have two people agree and wait 48 hours before someone is allowed to do some work; many requests get done before the approval process can complete, flooding RC needlessly. - [The]DaveRoss 22:02, 10 February 2011 (UTC)

Since I'm the most contrary person here, we could have a simple procedure: Check with User:DCDuring {;-)}. DCDuring TALK 23:01, 10 February 2011 (UTC)
We can compromise and say that everyone must inform DCDuring, the preferred mechanism for doing so being an informative reason in the flag assignment. - [The]DaveRoss 23:30, 10 February 2011 (UTC)
Yes totally agree, perhaps move Wiktionary:Requests for flood flag to Wiktionary:Flood flag/requests and create Wiktionary:Flood flag. Then admins could simply 'inform' others of their use of a flood flag, instead of 'requesting it'. Mglovesfun (talk) 00:15, 11 February 2011 (UTC)
@TheDaveRoss: I think we might as well keep the page, and have admins comment there when they flag themselves, just so people who care can keep the page on their watchlist. We can easily drop that requirement at some point if we decide in future that it's too onerous and not necessary. —RuakhTALK 16:15, 11 February 2011 (UTC)