Wiktionary:Beer parlour: difference between revisions

From Wiktionary, the free dictionary
Latest comment: 13 years ago by Mglovesfun in topic Romanized Korean, again
Mglovesfun (talk | contribs)
m →‎German CFI: this could use more input...


== German CFI ==
{{look}}

I've started a draft [[Wiktionary:About_German#Criteria_for_inclusion|here]]. The problem with German is that it writes most words together, so the community is split on whether such words can be technically sum of parts. These rules are supposed to clarify the situation. Feel free to suggest things. -- [[User:Prince Kassad|Prince Kassad]] 00:04, 20 February 2011 (UTC)
: I believe that we should accept all German words (i.e. everything considered as a word in German) provided that their use can be attested. Some time ago, there have been many comments on the ''longest German word'' used in official texts. These comments clearly show that even very long compound words are considered as words in German (but this kind of word is exceptional). The attestation constraint is sufficient to limit the inclusion of such words. [[User:Lmaltier|Lmaltier]] 09:32, 20 February 2011 (UTC)

Revision as of 18:22, 3 May 2011


February 2011

Independence.

How do y'all interpret this passage from WT:CFI?

Independence

This is meant to exclude multiple references that draw on each other. Where Wikipedia has an article on a given subject, and that article is mirrored by an external site the use of certain words on the mirror site would not be independent. It is quite common to find that material on one site is readily traced to another. Similarly, the same quote will often occur verbatim in separate sources. While the sources may be independent of each other, the usages in question are clearly not.

The presumption is that if a term is only used in a narrow community, there is no need to refer to a general dictionary such as this one to find its meaning.

There are a few things here that confuse or bother me (like, what's with all the non–durably-archived examples?), but the biggest is that the two paragraphs seem to be describing two very different phenomena, but the second one is worded in such a way as to imply that it's actually just explaining the rationale for the first. I doubt there's much correlation between mirroring or verbatim quotation on the one hand and sharing "narrow community" on the other.

Because the section is so vague and seemingly self-contradictory, it's hard to resolve certain RFV questions that hinge on it. For example, as I wrote at the RFV discussion about novum, the term seems to get tens of thousands of b.g.c. hits in the relevant sense, which is really quite a lot, but it's hard to find hits that don't acknowledge Darko Suvin's coinage of the term. Does that make them non-independent? Other affected terms would be ambient findability (where most uses are in books that also mention the book Ambient Findability and/or its author), osmotic communication (where most uses are in books that also mention Alistair Cockburn, who coined the term, and/or Crystal Clear, his development methodology), and possibly iDollator (where almost the only news articles using it are articles about one selfsame man, who, incidentally, wants to popularize and standardize the term).

My preferred interpretation would be something like this:

  • Non-independence is reflexive, symmetric, and transitive.
  • Verbatim quotations, near-verbatim quotations, and translations are not independent of their original source.
  • Multiple quotations from a single author are not independent of each other.

but I'd like input before I start applying that interpretation to RFV discussions.

RuakhTALK
03:51, 1 February 2011 (UTC)Reply

I'm going to raise another question, with an example. In 1983, Dave Gabai published an article, "Foliations and the topology of 3-manifolds", in the Journal of Differential Geometry, in which he first described a process called disk decomposition: a means of breaking up certain topological spaces into smaller ones ("disk-decomposing" them). This caught on and became an important tool in the low-dimensional topologist's toolkit. For several years thereafter, probably every reference to disk decomposition added a caveat as in "disk decomposition in the sense of Gabai" or "disk decomposition as in [4]" (where "[4]" is an item in the list of references at the back, viz Gabai's paper). Does that mean they're dependent? Later papers didn't use such caveats, as disk decomposition was by then well known (in the right circles), so only said "disk decomposition [4]", where, again "[4]" is a reference to Gabai. (I'd not be surprised to find that papers still did that.) Are these dependent? They're still "readily traced to" Gabai (to quote the CFI as quoted above). According to Ruakh's "preferred interpretation" above, seemingly all of the above are independent from Gabai (except any written by him, naturally).​—msh210 (talk) 08:46, 1 February 2011 (UTC)Reply
As a quick and first-impression response, the section's current wording seems worth scratching and replacing with something like your bullets. --Dan Polansky 09:45, 1 February 2011 (UTC)Reply
Addendum: Curselax failed RFV because of this criterion, because all quotations were from postings to a single Usenet group. (See Talk:Curselax.) I participated in that decision, and still agree with it, but it's actually not covered by what I gave above as my preferred interpretation. Maybe a fourth criterion:
  • Multiple quotations from the same book, periodical, or Usenet group are not independent of each other.
? —RuakhTALK 18:30, 1 February 2011 (UTC)Reply
Sounds good. If this bullet point turns out too stringent, which is merely a hypothetical possibility right now, it can be amended later. --Dan Polansky 18:38, 1 February 2011 (UTC)Reply
Multiple authors' posts in the same Usenet newsgroup seem akin to multiple authors' works on the same topic in different books and by different publishers. The only "dependencies" inherent in a newsgroup are (1) that the authors are (usually) writing about similar topics and (2) that the authors (usually) have read each other's posts. (This argument does not apply to posts in the same thread, where a later author is replying to an earlier one (or to someone who replied to an earlier one, or some iteration of that), so will often use a word the earlier one used just because the earlier one did so.)—msh210℠ on a public computer 20:58, 1 February 2011 (UTC)Reply
Multiple quotes in the same book (but by different authors, as in a compilation or Festschrift) or periodical would seem to be independent in general, as authors, I think, choose their words, for the most part. Copy editors, however, choose spelling, so perhaps they should be considered dependent for purposes of proving a particular spelling. (Of course, the way RFV usually works, attesting a particular spelling is attesting the word, so this distinction might be too fine.)—msh210℠ on a public computer 20:58, 1 February 2011 (UTC)Reply
In a newsgroup not only have the writers likely read each other's posts, but they are also writing for each other -slash- for a shared audience. It's like when one author takes over another's series, and the target audience is basically "everyone who liked the first part enough that they're willing to put up with the new author's inability or unwillingness to emulate the old". (Of course, in the latter case the original author is usually still listed as a co-author, so it's covered anyway. :-P ) As for quotations from the same book or periodical: fair enough. I was mostly thinking of spellings (such as diacritics in The New Yorker) and perhaps construals (such as the choice of preposition after a verb, which seems potentially subject to house style), but that may be too narrow a case to try to catch with a CFI rule. But be warned that if all issues of Penthouse are independent of each other, then "I never thought it would happen to me" is in "clearly widespread use". ;-)   —RuakhTALK 21:12, 1 February 2011 (UTC)Reply
  • Your second two bullet points seem to sum up the situation well, as I understood it too. I can't quite work out what the first one means though. Ƿidsiþ 10:59, 2 February 2011 (UTC)Reply
    • The first means that if work A is dependent on (by which I mean not independent of) work B, and B on C, then B is also dependent on A, A on C, and A on A. (And, therefore, C is dependent on A, C on B, B on B, and C on C.)​—msh210 (talk) 19:32, 2 February 2011 (UTC)Reply
This section would benefit massively from being entirely rewritten. Mglovesfun (talk) 11:41, 2 February 2011 (UTC)Reply

Languages and flags

Hi there,

I just noticed that, of 35 Wiktionary editions I have analysed, only 4 of them (Italian, Greek, Lithuanian, Hungarian) use flag icons to denote languages.

I'm not too sure about Wiktionary, but I know for a fact that at Wikipedia this practice was abandoned and flags were dropped for very good reasons.

  1. Relationship between languages and countries is generally not 1-to-1. In fact, it's not even one-to-many, it's many-to-many.
  2. Flags tend to stir up all sorts of nationalist arguments that we'd rather avoid.

Please see the current discussion on Italian Wiktionary.

My feeling is that we should get rid of flags in all Wiktionary editions, especially silly ones like [flag image], or [flag image], or (my god) [flag image]. Thoughts? 220.100.118.132 13:25, 1 February 2011 (UTC)Reply

The Portuguese Wiktionary uses flags too: http://pt.wiktionary.org/wiki/ser
The English Wiktionary uses flags of countries and other geographical representations occasionally on categories of languages: Category:Cantonese language, Category:Old Frisian language, Category:Frisian languages, Category:Manx language, Category:Old Prussian language... --Daniel. 13:43, 1 February 2011 (UTC)Reply
I agree that using the flags of countries to represent languages is a bad idea. For what it's worth, File:English language.svg is inaccurate. It lacks about 97% of the languages listed at the bottom of Category:English language. --Daniel. 13:43, 1 February 2011 (UTC)Reply
Like I said, flags should only be an option, not the default. -- Prince Kassad 15:45, 1 February 2011 (UTC)Reply

What I would like to achieve is consensus towards demoting this use of flags, from option to strongly deprecated option, or something like that.

Just to expand on the technical point, getting flags right for languages is not simply a hard problem, it's an impossible task. The reason is that the mapping is many-to-many, and so if you see (or should I say, manage to spot) e.g. a Swiss flag in one of those mongrel flags, you have to look at the rest of it to finally work out what language the flag was supposed to convey "at a glance" - proving that its intended purpose has completely been defeated, because it's much quicker and easier to just read the word representing the language. So, no matter what alternative representation you come up with, you will always get it wrong, by construction.

This is to say nothing about those distracting stroboscopic horrors that ignore a well-established web design best practice of not flashing contents - or in this case, non-contents. I mean, I respect the effort people have put in creating these things, but in my opinion they have to go. 220.100.118.132 23:45, 1 February 2011 (UTC)Reply

I am strongly opposed to telling other language communities what they should and should not be doing stylistically on their projects. I am strongly opposed to removing optional features which in no way detract from the general usage of the project, especially for potential future problems. - [The]DaveRoss 23:59, 1 February 2011 (UTC)Reply
I'm totally with you on your first point. I'm not at all with you on your second, in that I think all optional features have a cost, if only in that it's hard to find the useful optional features if there are too many useless ones. In general, I'm inclined to take a descriptivist view (let's get rid of features that no one is using) rather than a prescriptivist one (let's get rid of features I don't want people to use), but in the particular case of flags, we're more or less endorsing a specific set of language-to-flag mappings (by making it, and only it, available through the "Gadgets" tab of Special:Preferences), so if there's significant risk of people being offended by that endorsement, I would definitely support removing it from gadgets and making people who really want it add it to their vector.css or whatnot. —RuakhTALK 00:11, 2 February 2011 (UTC)Reply
I'm with Dave on this one. I do understand Ruakh's point, but to take the descriptivist approach, you'd need to do a survey of who's using what, which might not be realistic -- for example, how many WT users, who may or may not use / like / not use / dislike the flags, will never register and never see this posting, yet have strong opinions about features of WT that they do make use of?
My own ¥2 here is that I quite like the flags, as they give me a very easy-to-scan visual cue to look for. I can very quickly scroll through a long entry and tell whether there's a Navajo or Japanese entry, for instance, just from the colors -- no reading required, which is easier on the eyes and quicker to visually parse. I think they improve the site's usability. -- Cheers, Eiríkr Útlendi | Tala við mig 03:06, 2 February 2011 (UTC)Reply
I agree strongly with the anonip that we should not have flags, and agree with Ruakh re "we're more or less endorsing a specific set of language-to-flag mappings [] , so if there's significant risk of people being offended by that endorsement, I would definitely support removing it from gadgets" (and I think there is such risk).​—msh210 (talk) 03:44, 2 February 2011 (UTC)Reply
Yet, the flags are only an option -- you have to turn on the gadget to get them to display at all. So far as I understand it, there is no risk of John Q. Public / Yusef Mustafa / Juan Carlos / Ernst Baumann / etc. wandering over and seeing the flags in any default configuration, which eliminates most of the people that might be offended. This leads me to wonder if we might be ultimately catering to offensensitivity? -- Eiríkr Útlendi | Tala við mig 04:03, 2 February 2011 (UTC)Reply
My understanding is that John Public can currently see the flags in the Wiktionary editions that have them turned on. Five such editions were identified above. 205.228.108.58 05:28, 2 February 2011 (UTC)Reply
Well, I certainly agree with TDR that we should not be deciding for (or, really, even suggesting to) other Wiktionaries that they cease using flags.​—msh210 (talk) 06:01, 2 February 2011 (UTC)Reply
Fair enough, I'm not well-versed in localisation policy. 205.228.108.58 07:16, 2 February 2011 (UTC)Reply
It's not just offensensitivity. Displaying a single flag also implies stuff about dialect that we don't (usually) wish to imply. Plus there's Ruakh's too-many-gadgets point.​—msh210 (talk) 06:01, 2 February 2011 (UTC)Reply

I have explicitly included the hybrid flags above to fully display their ugliness, although aesthetics is subjective and it is not my main point.

I agree with Eirikr that the original intention of flags is to be "easier on the eyes and quicker to visually parse", and if it were up to me, I'd just stick to the one, de-facto standard flag. However, for languages that (unlike Japanese and Navajo) don't have a trivial language-country relationship, people do start to take issue with it, and there is no right answer to this ill-posed question, which is why we end up with such... elaborate solutions. 205.228.108.58 05:14, 2 February 2011 (UTC)Reply

While it's true on one side that mapping flags to countries is many-to-many, we are a dictionary. Dictionaries describe words and their origins, they don't try to describe various countries and cultures, that's encyclopedia material. So what we could do is use the flag of the place whose name the language is named after. We use an English flag (the red cross one) because English is named after England, a Japanese flag because Japanese is named after Japan, and so on. This means we don't need to deal with all the different places where a language is spoken, because we can just go with the etymological origin of the language's name. After all, even Americans call their language English... —CodeCat 10:45, 2 February 2011 (UTC)Reply
Then it becomes impossible to do consistently because Klingon, Esperanto, etc. aren't named after countries. I bet there are some where the origin is disputed too; more contentious politics. Equinox 10:53, 2 February 2011 (UTC)Reply
For what it's worth, Esperanto was a particularly easy flag to choose, and a quick commons search turns up a singular flag for that language as well . Certainly the lack of mapping makes it a challenge but the ability to have multiple flags or even options allowing people to choose which flag they want displayed for each language would alleviate these concerns. - [The]DaveRoss 14:23, 2 February 2011 (UTC)Reply
If we can get a high consensus on one or more sets of language-flag mappings, that set or those sets could be offered as a gadget. But I, for one, am strongly opposed to the use of the UK flag to represent English on any such set. I may have other strongly held beliefs in specific cases where some kind of implicit minimisation of minority languages or political groupings is involved. Language names themselves can be taken as offensive to some, but have the advantage of long-standing and inevitable use, which can be attested using our customary methods or by appeal to external "authorities". DCDuring TALK 12:19, 2 February 2011 (UTC)Reply
This whole debate is why I had the foresight to write Flag Law, I don't remember where that was though. - [The]DaveRoss 03:13, 4 February 2011 (UTC)Reply

External links are external links

The entries English, Anglo-Norman and base contain simultaneously the sections "External links" and "See also", that apparently are interchangeable. There are links to Wikipedia and to 1911 Encyclopædia Britannica in both sections.

Compare with talent, mouse and Texas, that link to encyclopedias (including Wikipedia) and other external sites using the "External links" section.

Also compare with second, nostrum and marionette, that contain external links in the "See also" section.

I may or may not be able to once more describe and rationalize an apparent practice, and try to answer why there is this discrepancy in the usage of sections. But, frankly, it seems just too random. Instead, I am going directly to propose a guideline that I believe would look good, be relatively easy to implement and even easier to maintain, and, most importantly, provide consistency.

I propose always using the "External links" and never the "See also" to place external links.

Naturally, boxes such as {{wikipedia}} would be exceptions to the proposed rule, because they are supposed to fit virtually anywhere. That's it. --Daniel. 11:13, 2 February 2011 (UTC)Reply

Interestingly, the ELE says nothing about this issue. But yeah, See also should only contain internal links, and External links should have all links which lead out of English Wiktionary. -- Prince Kassad 18:54, 2 February 2011 (UTC)Reply
Thirded.​—msh210 (talk) 19:28, 2 February 2011 (UTC)Reply
Curiously (given the comments above) I'd prefer everything under see also, unless there's enough content that it's better to separate them for tidiness purposes. What about {{pedia}} then? Mglovesfun (talk) 14:13, 3 February 2011 (UTC)Reply
I have been putting {{pedia}} to See also, given that Wikipedia is semi-external to Wiktionary. If a plain majority prefers putting {{pedia}}, {{commonslite}}, etc. to "External links" rather than "See also", okay with me. --Dan Polansky 14:22, 3 February 2011 (UTC)Reply
Mglovesfun, how much content is enough content? After pondering this subject, I came to the conclusion that I prefer never placing external links (including {{pedia}}) under see also. One reason for my preference, as I stated, is consistency, which is good by itself. Another reason is that "External links" clarifies the limits between Wiktionary and other websites. Compare with how "See also" literally implies "You, Wiktionary user, who came to see a definition, perhaps an etymology, derived terms, inflections and maybe more linguistic information, take your time to admire this encyclopedic article, or list of images, or this additional dictionary, too." We don't want to send this message. Do we? --Daniel. 14:31, 3 February 2011 (UTC)Reply

Hyper-verbs

So, is "Hyper-verbs" an acceptable header? Does it mean something? See the current revision of punch. --Daniel. 16:50, 2 February 2011 (UTC)Reply

Presumably he meant hyperonyms. I've changed it now. —RuakhTALK 18:05, 2 February 2011 (UTC)Reply
The usual Wiktionary heading is "hypernym". See also WT:ELE#Further semantic relations. --Dan Polansky 18:15, 2 February 2011 (UTC)Reply

Wiktionary talk:Etymology

In trying to proofread the page a bit, I've come across two issues, which I've put on the talk page (bottom two, as of this date and time). Mglovesfun (talk) 14:11, 3 February 2011 (UTC)Reply

Linking to Commons.

The File:Compaq keyboard and mouse cropped.jpg contains an automatic message:

This file is from Wikimedia Commons and may be used by other projects. The description on its file description page there is shown below.

All images linked from Commons to Wiktionary share that text. I personally consider it a little hard to spot. I proposed deleting that bland message and replacing it with the box from w:File:Compaq keyboard and mouse cropped.jpg.

Thoughts? --Daniel. 16:22, 3 February 2011 (UTC)Reply

Sounds like a good idea. I've copied over the message. --Yair rand (talk) 23:00, 3 February 2011 (UTC)Reply
Thanks. --Daniel. 09:49, 4 February 2011 (UTC)Reply

Sense IDs

Previous discussions: Wiktionary:Grease_pit_archive/2010/June#Sense_referentials_and_links, Wiktionary:Grease_pit_archive/2010/July#Stable_identifiers_for_meanings

Wiktionary has the problem of not being able to refer to specific definitions in links, which could be fixed by adding anchors containing glosses to individual definitions. The template {{senseid}} could work for this, if there was a simple way to add glosses to links via existing link templates. I propose that {{senseid}} be allowed for general use in the mainspace, but not be bot-added to all entries yet, and that {{l}} be changed to accept the id= parameter to link to definitions with glosses ({{l|en|peach|id=fruit}} would link here). --Yair rand (talk) 02:24, 4 February 2011 (UTC)Reply

I am unconvinced that this is the best, or even a good solution, but I need to think about it. So I am going to think about it and come back here and see all of the good reasons why my thoughts are dumb lined up for me. Get to it! - [The]DaveRoss 03:11, 4 February 2011 (UTC)Reply
It doesn't 'solve' the problem, but templates such as {{context}} and {{gloss}} could contain anchors. This would only work for senses using these glosses, of course, and the same gloss may appear more than once in an entry. Mglovesfun (talk) 00:13, 5 February 2011 (UTC)Reply

Also, this would make {{gloss}} better than just writing something inside brackets, which, weirdly, is all the template does right now. No clever span stuff. — This unsigned comment was added by Mglovesfun (talk • contribs) at 5 February 2011.

Adding anchors to {{context}} couldn't really work, as there are lots of times multiple senses of a word that contain the same context tag. I don't really see how adding anchors to {{gloss}} would really be helpful either. We need to have some way of connecting senses, and if no one has any better way, I don't see why not to use the {{senseid}} template. --Yair rand (talk) 06:09, 7 February 2011 (UTC)Reply
I do not know if this is relevant or useful, but Icelandic entries like falla#Icelandic anchor synonyms to senses. - -sche 02:54, 9 February 2011 (UTC)Reply

Poorly attested languages

The obvious solution, for me anyway, instead of listing them on individual language consideration pages (such as WT:About English), would be to have a CFI subpage on attestation. Like I say, I favor the use of subpages to 'declutter' the CFI page, so that it contains only criteria for inclusion, not discussion about those criteria.

Anyway, something like Wiktionary:Criteria for inclusion/Attestation should do it. And something like:

"The following are considered exceptions to the 'three durably archived citations' rule as they are poorly attested"

Then stuff like

  • Ancient Greek: 1
  • Old English: 2
  • Old French: 2

clearly it can only be one or two; not zero, and three is the norm. Mglovesfun (talk) 10:58, 4 February 2011 (UTC)Reply

But how is one going to add to this list? Via a series of VOTEs? I would favor a general exception for ancient languages. (probably obscure ones as well but it's hard to define that) -- Prince Kassad 11:12, 4 February 2011 (UTC)Reply
Why not just consider all the works in languages with only a small amount available to be "well-known works"? --Yair rand (talk) 20:59, 4 February 2011 (UTC)Reply
Because it twists the meaning of the phrase, and still doesn't define "a small amount". Does "The Flag of My Country. Shikéyah Bidah Na'at'a'í: Navajo New World Readers 2" really count as a well-known work by any standard? I would be generous as to "well-known works" with Navaho, but not that generous.--Prosfilaes 21:12, 4 February 2011 (UTC)Reply
That doesn't strike me as a practical solution. I'm with Kassad; a general exception for ancient languages would be better, though we probably don't want to accept modern translations, like Ancient Greek Harry Potter. We could define obscure languages; say if the Ethnologue gives them less than a million speakers, ask for 2, less than 100,000, ask for 1. That doesn't achieve everything; Oromo (17.3 million speakers) is probably a lot harder to cite than Estonian (1.0 million speakers). But it is a definition that will catch all the American and Australian languages.--Prosfilaes 21:12, 4 February 2011 (UTC)Reply
I'm not sure that this would be controversial, or even 'interesting' enough for editors to disagree over it. Might not take as much finagling as you might think. Re number of speakers, not the best criterion as speakers and written language are independent. Middle French is very well attested because it's relatively recent (post 1400) but has zero speakers, since it's 'become' Modern French. Mglovesfun (talk) 00:09, 5 February 2011 (UTC)Reply
I don't know what the difference between 1, 2 and 3 should be, and if we can set up rules for that, why do we need to discuss each and every one? Number of speakers isn't perfect, but I wasn't suggesting that it be used for dead languages. The only non-dead language that stands out as being more easily attestable than that rule would imply is Yiddish, and given the lack of good Yiddish OCR and of Yiddish readers, asking for only 2 attestations wouldn't be a big deal. Otherwise, it improves the condition of many, many languages dramatically. (I also wasn't planning on it being used for artificial languages, which need their own rules here.)--Prosfilaes 00:44, 5 February 2011 (UTC)Reply

Dawnraybot and pronunciations

Some of the pronunciations mentioned by Dawnraybot are incorrect, at least in scruterais (now fixed) and scruterait (to be fixed), the only ones I checked. There are probably many more pronunciations to be changed. Lmaltier 10:24, 5 February 2011 (UTC)Reply

Which specific forms have errors? If we can't be specific about the forms we should bot-remove all pronunciations from Dawnraybot. Nadando 00:19, 6 February 2011 (UTC)Reply
I've been finding errors from Dawnraybot all over the place for a while. No specific forms have errors, as often the pronunciation of the stem is wrong, [1] [2] [3] but often it's the ones that end in /ɛ/ that are shown with /e/, understandable because some people pronounce them that way. —Internoob (DiscCont) 03:40, 6 February 2011 (UTC)Reply
Yes. For example, the forms of peinturlurer have wrong SAMPA (using the dollar sign). And the forms of récurer have wrong IPA as well. -- Prince Kassad 18:18, 6 February 2011 (UTC)Reply
If Plowman is right with his statement above — and for wonderfully obvious reasons we may probably assume that this is the case — then Nadando's suggestion is probably the easiest solution, even though a number of correct pronunciations might be deleted as well. -- Gauss 18:44, 6 February 2011 (UTC)Reply
I asked a native speaker about the final /e/ vs. /ɛ/. She said they are different but for most people you won't hear the difference (that is, they pronounce them the same). So that's so minor I wouldn't worry about it, that is, listing pronunciations which do exist but aren't the ones listed in dictionaries. The harder part, like Internoob says, is just tracking down the ones that are totally wrong. Mglovesfun (talk) 20:50, 6 February 2011 (UTC)Reply
In case of any errors, I apologize: they were not intentionally incorrect. --Plowman 20:55, 6 February 2011 (UTC)Reply

Customizing TOC

Does anyone know how to customize the appearance of the table of contents of an entry? I would like to know the following:

  • 1. How do I make only language names visible using CSS, while hiding "Noun", "Synonyms" and other deeper headings from the TOC?
  • 2. How do I replace the numbered lists with bulleted lists using CSS?
  • 3. How do I hide the list mark altogether?

Thanks for any input. --Dan Polansky 13:51, 6 February 2011 (UTC)Reply

#1: Add table#toc ul ul { display: none; } to your CSS.
#3: Add table#toc span.tocnumber { display: none; } to your CSS.
#2: This is a bit trickier. The version that seems to fit best with the Vector scheme is
table#toc ul
{
  list-style-type: square;
  list-style-image: url(http://bits.wikimedia.org/skins-1.5/vector/images/bullet-icon.png?1); 
  margin-left: 1.5em;
}
(plus the CSS for #3), but you seem to be using Monobook?
RuakhTALK 14:32, 6 February 2011 (UTC)Reply
Thanks! It works well even with Monobook. --Dan Polansky 14:39, 6 February 2011 (UTC)Reply
A Monobook appearance can be obtained by using Monobook bullet, like this:
table#toc ul {
  list-style-type: square;
  list-style-image: url(http://bits.wikimedia.org/skins-1.5/monobook/bullet.gif?1);
  margin-left: 1.5em;
}
--Dan Polansky 08:13, 8 February 2011 (UTC)Reply
Re #1 (table#toc ul ul { display: none; } ): I have tried it and it has at least one downside: it also applies to TOC in Beer parlour, so it hides headings of the discussions. --Dan Polansky 14:54, 11 February 2011 (UTC)Reply
Try .ns-0 table#toc, etc.​—msh210 (talk) 15:55, 11 February 2011 (UTC)Reply
Works nice; thanks. Now I would like to do something even more fancy: how do I format the list in TOC to make it a horizontal list instead of vertical, like "English · French · Spanish"? --Dan Polansky 13:31, 13 February 2011 (UTC)Reply
What browser do you have? In IE8, and in common non-IE browsers, you can do
.ns-0 table#toc ul ul { display: none; }
.ns-0 table#toc span.tocnumber { display: none; }
.ns-0 table#toc li { display: inline; }
.ns-0 table#toc li + li:before { content: ' · '; }
but that won't work in IE6 or IE7.
RuakhTALK 15:07, 13 February 2011 (UTC)Reply
I have Firefox 3.6.13, so this works great. I have added .ns-0 div#toctitle { display: none; } to hide the title of the TOC, thereby removing one more vertical element from the box. The result is extremely compact, typically taking only one line. Thanks again. --Dan Polansky 11:16, 14 February 2011 (UTC)Reply

Poll: Etymology and the use of less-than symbol

I would like to ask you about your preference in using "<" vs "from" in etymologies in Wiktionary. Etymology sections in Wiktionary are not united in the use of "<" vs "from".

An example of the two different formats:

A longer example of the two formats:

This poll disregards whether the etymology should start with "from" or omit "from" from the start (or "<", respectively). The preference for "<" is compatible with the format "From A < B < C < D"; the only things in question are the second, third, and later words or symbols.

I tend to prefer "<", but am okay with "from" if this is the majority preference. For me, the use of "<" makes it easier to scan the string of items using my eyes and locate the individual items, while "from" gets more easily lost in the jumble. One argument against the use of "<" is that its meaning is much less obvious than the meaning of "from". But I think the meaning of "<" can be quickly picked up by the user of the dictionary. Century 1911 and Encarta[4] use "<", while some other dictionaries including Merriam-Webster online[5] use "from".

This poll combines discussion with a clear indication of one's current preference, a preference that can change later as a result of discussion. Feel free to make other proposals and comments alongside your indication of your preference.

Thank you for your attention and your input! --Dan Polansky 08:42, 7 February 2011 (UTC)Reply

Preference 1

I prefer the use of "<" over the use of "from".

  1. Support Mglovesfun (talk) 12:19, 7 February 2011 (UTC). And the reason is, otherwise you repeat 'from' a lot which I find irritating. I don't feel that strongly about it, I don't mind being outvoted. Mglovesfun (talk) 12:19, 7 February 2011 (UTC)Reply
  2. Support Dan Polansky 14:14, 7 February 2011 (UTC) As I said, for me, the use of "<" makes it easier to scan the string of items using my eyes and locate the individual items, while "from" gets more easily lost in the jumble. I am okay with going by the option that is preferred by a plain majority. --Dan Polansky 14:14, 7 February 2011 (UTC)Reply
  3. Support We can do fancy things if we use a <, we can do even more fancy things if etymologies were templatized. We can do nothing fancy with a from. Also I think it is a much simpler, easier to read solution. - [The]DaveRoss 05:04, 9 February 2011 (UTC)Reply
    We can do fancy things if etymologies are templatified. For example, if, instead of (at [[sully]])
    From {{etyl|fro}} {{term|lang=fro|souillier}} (> {{etyl|fr|-}} {{term|souiller|lang=fr}}). Compare {{term|soil|lang=en}}.
    we had
    {{from|lang=fro|souillier}} {{whence|etyl=fro|lang=fr|souiller}} {{more at|soil|lang=en}}
    and instead of (at [[wend]])
    {{etyl|enm}} {{term|lang=enm|wenden}} from {{etyl|ang}} {{term|lang=ang|wendan||to turn, go}}, causative of {{term|windan|||to wind|lang=ang}}. Akin to {{etyl|ofs|-}} {{term|lang=ofs|wenda}}, {{etyl|osx|-}} {{term|lang=osx|wendian}}, {{etyl|non|-}} {{term|venda||to wend, to turn|lang=non}} ({{etyl|da|-}} {{term|vende|lang=da}}), {{etyl|de|-}} {{term|wenden||to turn|lang=de}} and {{etyl|got|-}} {{term|𐍅𐌰𐌽𐌳𐌾𐌰𐌽|sc=Goth|tr=wandjan|lang=got}}.
    we had
    {{from|lang=enm|wenden}} {{from|lang=ang|wendan||to turn, go}} {{etymon form of|form=causative|windan||to wind|lang=ang}} {{cognate|lang=ofs|wenda}} {{cognate|lang=osx|wendian}} {{cognate|non|venda||to wend, to turn}} {{whence|etyl=non|vende|lang=da}} {{cognate|wenden||to turn|lang=de}} {{cognate|𐍅𐌰𐌽𐌳𐌾𐌰𐌽|sc=Goth|tr=wandjan|lang=got}}
    that'd be both machine-readable and (with some work) human-readable as opposed to the current system which is only the latter. (However, I still think that the human-readable display should include "from" rather than "<".  :-) ) But I don't know what fancy things we can do with "<" (untemplatified) that we can't do with "from".​—msh210 (talk) 16:34, 9 February 2011 (UTC)Reply
    We can't easily parse from since it is a word which may appear in an etymology for other reasons. - [The]DaveRoss 20:24, 9 February 2011 (UTC)Reply
    • Making that human-readable would be really difficult. Etymology editing should eventually be done through a WT:EDIT module (hopefully one that also updates the necessary Derived terms/Descendants sections of other entries), rather than editing the wikitext itself, so it doesn't matter all that much if the wikitext is more complicated, so long as the wikitext is machine readable/editable and the output is human readable. --Yair rand (talk) 11:12, 22 February 2011 (UTC)Reply
  4. Support Maro 19:21, 11 February 2011 (UTC)Reply
  5. Support The uſer hight Bogorm converſation 11:11, 14 February 2011 (UTC)Reply
  6. Support Ivan Štambuk 16:47, 14 February 2011 (UTC)Reply
  7. Support —Stephen (Talk) 23:15, 14 February 2011 (UTC) I don’t know about parsing difficulties or even why parsing should be needed, but IMO the use of < makes an etymology much more readable.
  8. Support — Stevey7788 18:06, 17 February 2011 (UTC) Concise and elegant, and most people would quickly figure out what this means.Reply
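The parsing point raised under item 3 above (that "from" may appear in an etymology for other reasons) can be illustrated with a minimal Python sketch. It assumes an untemplatized etymology written as plain text. The function name split_chain and both sample strings, including the placeholder language names and the gloss, are made up for illustration and are not taken from any entry.

```python
import re


def split_chain(etymology, separator):
    """Split a plain-text etymology into steps on a whitespace-delimited separator."""
    pattern = r"\s+" + re.escape(separator) + r"\s+"
    return [step.strip() for step in re.split(pattern, etymology) if step.strip()]


# Hypothetical etymologies: the gloss happens to contain the word "from".
with_symbol = "LangA aaa < LangB bbb (to go away from home)"
with_word = "LangA aaa from LangB bbb (to go away from home)"

# Splitting on "<" yields the two intended steps.
print(split_chain(with_symbol, "<"))

# Splitting on "from" also cuts the gloss in half, yielding three pieces.
print(split_chain(with_word, "from"))
```

This only shows that a naive split is ambiguous for "from"; it says nothing about readability, and a templatized etymology would avoid the problem for either style.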

Preference 2

I prefer the use of "from" over the use of "<".

  1. I prefer "from", but sometimes "which is from" or the like, whichever fits best in the paragraph, over "<". But I don't mind "<" terribly.​—msh210 (talk) 09:46, 7 February 2011 (UTC)Reply
  2. Support Ƿidsiþ 09:55, 7 February 2011 (UTC), in general.Reply
  3. Support H. (talk) 11:52, 7 February 2011 (UTC) Yes please, this has been bothering me for a while, Wiktionary is not paper, remember? Furthermore, I never understood why to use < but not >. Furthermore, please begin with ‘From’ as well, in the same line of thinking: it is a sentence, not a telegraph message.Reply
    You are entitled to your preference, but the use of "<" has not much to do with whether Wiktionary is paper. You seem to imply that the people who prefer "<" do so to make the etymology shorter, but that is not necessarily the case, not with me anyway. For me, it is all about the ease of visual parsing. --Dan Polansky 12:16, 7 February 2011 (UTC)Reply
    I always thought that it's < because it looks like an arrow pointing from oldest to newest, showing the direction of progression. —Internoob (DiscCont) 03:00, 9 February 2011 (UTC)Reply
    On that subject of Hamaryns', we should use "from the Language term": "from the English water" sounds like it's talking about the term, whereas "from English water" sounds like it's talking about the liquid. — Raifʻhār Doremítzwr ~ (U · T · C) ~ 11:05, 14 February 2011 (UTC)
  4. Support CodeCat 11:58, 7 February 2011 (UTC)
  5. Support It's not a huge thing for me, but from is more obvious and less jargony.--Prosfilaes 20:49, 7 February 2011 (UTC)Reply
  6. Support the less-than sign is really confusing to most people. It's a far better idea to write it out. -- Prince Kassad 20:53, 7 February 2011 (UTC)Reply
  7. Support — is the principle that Wiktionary is not paper applicable here? It has the space to spell words out rather than to use unclear abbreviations. - -sche 05:14, 8 February 2011 (UTC)Reply
  8. Support Daniel. 15:16, 8 February 2011 (UTC) I personally feel more comfortable with "from", but I don't mind if people choose any of these possibilities. --Daniel. 15:16, 8 February 2011 (UTC)Reply
  9. Support RuakhTALK 16:08, 8 February 2011 (UTC)
  10. Support DCDuring TALK 16:23, 8 February 2011 (UTC)Reply
  11. Support — I prefer "from" to "<", but msh210's proposal in #Preference 1 above would be much better. — Raifʻhār Doremítzwr ~ (U · T · C) ~ 11:05, 14 February 2011 (UTC)Reply
  12. Support – for reasons mentioned above, notably more understandable by casual users (who are already a bit confused by etymologies) and I find “from” easier to read. —Nils von Barth (nbarth) (talk) 11:18, 14 February 2011 (UTC)Reply
  13. Support Equinox 11:20, 14 February 2011 (UTC) But I'm not really picky; both seem all right. Equinox 11:20, 14 February 2011 (UTC)Reply
  14. Support Bequw τ 13:41, 14 February 2011 (UTC)Reply
  15. Support, we have no need for space-saving devices here. bd2412 T 16:40, 14 February 2011 (UTC)Reply
  16. Support (more explicit). Lmaltier 17:49, 14 February 2011 (UTC)Reply
  17. Support, because it is easy to understand Kinamand 07:14, 15 February 2011 (UTC)Reply
  18. Support ---> Tooironic 07:32, 15 February 2011 (UTC) Me too.Reply
  19. Support --Makaokalani 08:44, 15 February 2011 (UTC)Reply
  20. Support Eclecticology 09:51, 16 February 2011 (UTC) Wiktionary is not paper. Older dictionaries that used "<" did it to conserve space as much as anything else. They also had conveniently and predictably located glossaries of the abbreviations and symbols that they used.
  21. Support Neskayagawonisgv? 18:54, 16 February 2011 (UTC) I dislike the less than symbol, in general. Additionally I would like to bring to attention the fact that from is far clearer than the less than symbol for users of screen readers or other accessible technology, such as a Braille display. In fact I would say that at this point I strongly oppose usage of the less than symbol, due to the accessibility concerns. A screen reader would read out the etymology section with 'less than' as the representation of the symbol, which makes utterly no sense in terms of etymology. Less than language name? Seriously? --18:54, 16 February 2011 (UTC)Reply
  22. I'd like this to be customisable, with a very slight preference for "from" as the default. Thryduulf (talk) 15:43, 30 March 2011 (UTC)Reply

Preference 3

I am indifferent or indecisive about the use of "from" vs the use of "<".

  1. Support Vahag 13:57, 7 February 2011 (UTC) I use "from" in one- and two-member etymologies, but "<" in longer ones. Long chains with < are easier to read. --Vahag 13:57, 7 February 2011 (UTC)Reply
  2. Support Internoob (DiscCont) 03:01, 9 February 2011 (UTC) Per above.
  3. I'm not really a fan of either method. In my opinion, a template would be better. --Yair rand (talk) 05:26, 9 February 2011 (UTC)Reply
    Isn't that orthogonal to the question under discussion?--Prosfilaes 19:49, 9 February 2011 (UTC)Reply
    Not really. If "from" and "<" were identical in practice, and we had templates to display them, then any user would be able to see what he wants to see. (For example, I may see "from" and you may see "<".) --Daniel. 16:10, 10 February 2011 (UTC)Reply
    There would necessarily be a default view, though, so this discussion would not be moot.​—msh210 (talk) 06:41, 11 February 2011 (UTC)Reply
  4. Support SemperBlotto 11:27, 14 February 2011 (UTC) and I think that people who add etymologies that state that a word came from language1 via language2 via language3 should be prepared to supply evidence. SemperBlotto 11:27, 14 February 2011 (UTC)Reply
  5. Support Conrad.Irwin 20:01, 14 February 2011 (UTC). I am a fan of standardising on one or the other; but I don't really mind which is chosen. Conrad.Irwin 20:01, 14 February 2011 (UTC)Reply
  6. Support Whatever becomes the standard, I'll accept, no preference. --Anatoli 22:20, 14 February 2011 (UTC)Reply
  7. Support Whatever becomes the standard, I'll accept, no preference. --Jcwf 00:45, 15 February 2011 (UTC)Reply
  8. Support JamesjiaoTC 06:38, 15 February 2011 (UTC) This depends on the situation. Overall, I don't have any preference. So will see how this poll goes.Reply
  9. Support -Atelaes λάλει ἐμοί 10:57, 20 February 2011 (UTC) I think '<' does make it slightly easier to parse at a glance (but not enough, so we really need to think of something more), while 'from' is more intuitive and I think, a tad more professional. Using a hodge-podge of the two is confusing and ugly, and so I support a standard, whatever that standard may be.Reply
    Again, I bring up the accessibility concern of '<', that while it may be obvious enough to sighted users, anyone who looks at wiktionary with a screen reader will not be easily able to parse it. --Neskayagawonisgv? 18:13, 21 February 2011 (UTC)Reply
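Daniel.'s remark under item 3 above (that with templates each reader could see either "from" or "<") can be sketched in a few lines of Python. This is only an illustration of the idea: the function render_chain and its style parameter are hypothetical and do not correspond to any existing template or preference.

```python
def render_chain(steps, style="from"):
    """Render the same list of etymology steps with either separator."""
    if style not in ("from", "<"):
        raise ValueError("style must be 'from' or '<'")
    if not steps:
        return ""
    separator = " < " if style == "<" else " from "
    # The leading "From" is kept in both styles, as in the "From A < B < C < D" format.
    return "From " + separator.join(steps)


steps = ["A", "B", "C", "D"]
print(render_chain(steps, "<"))     # From A < B < C < D
print(render_chain(steps, "from"))  # From A from B from C from D
```

As msh210 notes, some default style would still have to be chosen, so a sketch like this does not settle the poll.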

Discussion

A few thoughts:

  • While > (greater than) is elegant (terse and symmetric with <), it can be rather confusing; I’m not sure how much it is used.
  • Various alternative presentations of etymologies (and cognates and the like) are possible and could be interesting – one can imagine giving etymologies as lists (rather than prose, which makes scanning easiest), timelines, graphs (of ancestors and cognates), time-lapse animations (spread and evolution of a word across time, space, and other languages), etc., etc. – though for now these are science fiction (other than timelines of usage).
  • More formal templates, as msh210 discusses above, would help with making etymologies more computer-parsable – currently they are mostly pretty formulaically {{etyl}} + {{term}}, with the main variation being “from” vs. “<”, as Dan is discussing here.
—Nils von Barth (nbarth) (talk) 11:32, 14 February 2011 (UTC)Reply
    • I would like to point out the fact that the less than/greater than symbols are NOT screen reader/braille display accessible, and could be very potentially confusing to anyone using such technology. The intended meaning as far as etymological context is not clear at first glance, let alone at hearing 'less than'. --Neskayagawonisgv? 18:56, 16 February 2011 (UTC)Reply
  • What are the effects on machine-parsing en.WT content? How might this affect the dbpedia-folx efforts to build ontology parsing of en.WT content? Are there similar or developing standards across Wiktionary languages? What specific problem does this address? Just some of my thoughts. - Amgine/talk 23:01, 14 February 2011 (UTC)
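On the machine-parsing question, here is a minimal Python sketch of what an external tool might do with the template calls already present in an etymology. The sample string is the current [[sully]] wikitext quoted in the poll above; the function name parse_templates is hypothetical, and the regular expression is a simplification that assumes templates are not nested.

```python
import re

# Matches {{name|arg|arg...}} with no nested braces (a simplifying assumption).
TEMPLATE = re.compile(r"\{\{([^{}|]+)\|([^{}]*)\}\}")


def parse_templates(wikitext):
    """Return (name, positional_args, named_args) for each flat template call."""
    calls = []
    for name, raw_args in TEMPLATE.findall(wikitext):
        positional, named = [], {}
        for arg in raw_args.split("|"):
            key, has_equals, value = arg.partition("=")
            if has_equals:
                named[key.strip()] = value.strip()
            else:
                positional.append(arg.strip())
        calls.append((name.strip(), positional, named))
    return calls


sample = (
    "From {{etyl|fro}} {{term|lang=fro|souillier}} "
    "(> {{etyl|fr|-}} {{term|souiller|lang=fr}}). Compare {{term|soil|lang=en}}."
)

for call in parse_templates(sample):
    print(call)
```

The template calls themselves are easy to extract this way; what the sketch cannot recover is the relationship between them ("from", "<", "Compare"), which lives in the untemplatized text around them.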

Wiktionary:"/Templates

I created that and it was speedy deleted, without asking me. Well, it is not used as a redirect, but like most shortcut pages, it is meant to be used in the search box. I regularly want to look up one of those templates, and it is easier to type ‘WT:"/Templates’ than ‘Wiktionary:Quotations/Templates’, let alone remember the name of the latter page. So why not keep it? H. (talk) 11:14, 7 February 2011 (UTC)

Because we don't allow anything to redirect to anything. And like you say, it's your personal redirect. Another admin may disagree with me. Have you considered just adding the link from your user page? Oh having said that WT:" already exists. Sigh, shoot (that's a euphemism). Mglovesfun (talk) 12:29, 7 February 2011 (UTC)
Definitely agree with Mg, Wiktionary namespace shouldn't have shortcut redirects. If you want to redirect to something in the Wiktionary namespace, a WT namespace page should be created for the purpose, maybe WT:QT for Quotations/Templates (which seems like a badly titled page to me). - [The]DaveRoss 14:43, 7 February 2011 (UTC)Reply
I'm not saying that no redirects should exist, just that you want to avoid ambiguity. For WT:QUOTE to redirect to WT:Quotations seems fine, as it's obvious what you're linking to. Having said that, we use a lot of initialisms which are only obvious once you click on the page. Still, that's not a reason to allow anything that someone cares to type in and hit enter. Mglovesfun (talk) 14:47, 7 February 2011 (UTC)Reply
But [[WT:"]] seems pretty obvious, and this is an obvious extension of it.​—msh210 (talk) 21:12, 7 February 2011 (UTC)Reply
Wiktionary: and WT: namespaces coincide now, TDR. (See, e.g., [[WT:Criteria for inclusion]] and [[Wiktionary:CFI]].) If you mean that only short titles should redirect, well, there are loads and loads of redirects within the Wiktionary: namespace with long titles. Also some with mixed long and short titles (which I mention in case that's your objection), like [[WT:Editable ELE]]. Undelete or, if restored, keep.​—msh210 (talk) 21:12, 7 February 2011 (UTC)
Oh, I had forgotten that we made WT an alias instead of just a pseudo-namespace. - [The]DaveRoss 21:50, 7 February 2011 (UTC)Reply

Wiktionary:Requests for moves, mergers and splits

Badly needs input. TBH I'm just gonna grant some of these as unopposed (1-0) unless someone objects. Mglovesfun (talk) 11:30, 8 February 2011 (UTC)Reply

Appendix-only constructed languages

Subcats of Category:Appendix-only constructed languages should not also be subcats of Category:All languages, right? I mean, Category:APL language (for example) should not show up among all the real languages in Category:All languages, right?​—msh210 (talk) 16:09, 9 February 2011 (UTC)Reply

I would advise deleting Category:Appendix-only constructed languages, because its title is too technical, in favor of creating "Category:Minor constructed languages" and "Category:Computer languages"...
As for your question, yes, APL should be a member of Category:All languages. Why not? It's a constructed language too, per constructed language. --Daniel. 16:26, 9 February 2011 (UTC)Reply
Re "why not": Because Category:All languages is not a topical category where we can debate whether something "is a constructed language too" and belongs in it. It's a lexical category, and we've decided to exclude APL (and everything else in Category:Appendix-only constructed languages) from the lexicon.​—msh210 (talk) 16:38, 9 February 2011 (UTC)Reply
I think having Category:Appendix-only constructed languages in Category:All languages is sufficient. Mglovesfun (talk) 16:41, 9 February 2011 (UTC)Reply
Having all languages in Category:All languages makes them easier to be found. If I want to find Category:Klingon language, or simply want to know whether we have a category for Klingon, I would like to have the possibility of browsing through the "K" part of Category:All languages. If any language is deliberately excluded from Category:All languages, I would at least expect this fact to be announced somewhere, like "This category does not contain certain minor or computer languages, that may be found in this other category." --Daniel. 16:50, 9 February 2011 (UTC)Reply
Another 2p in favor of including, well, all languages under Category:All languages. I'm actually a bit surprised this is even being discussed. The category name is, after all, all languages, and the category page states quite clearly that This category contains, indeed, all languages, or rather, all language names (in English). -- Bemused, Eiríkr Útlendi | Tala við mig 19:50, 9 February 2011 (UTC)Reply
I agree with User:Eirikr and User:Daniel. — it is counter-intuitive to include languages only somewhere other than Category:All languages. - -sche 08:31, 10 February 2011 (UTC)Reply
Above all, most of the categories from Category:Appendix-only constructed languages should be deleted, per Wiktionary:Votes/pl-2010-10/Disallowing certain appendices. For instance, Category:Klingon language contains Appendix:Klingon/ghommey, which should not exist per the vote. After the subpages are deleted, the category for Klingon gets pointless. --Dan Polansky 09:14, 10 February 2011 (UTC)Reply
I was under the impression that that vote was about fictional universe appendices, not fictional language appendices. --Yair rand (talk) 09:26, 10 February 2011 (UTC)Reply
Oops, you are right. So maybe we should have another vote that extends the treatment also to appendix-only languages. --Dan Polansky 12:17, 10 February 2011 (UTC)Reply
The treatment of fictional universe appendices is still relatively unclear. However, I can safely assume that, if appendix-only language appendices eventually should follow the rule of being defined in lists and never in individual entries, then the following facts about Category:Klingon language should be taken into consideration:
  1. We would still eventually have many lists of Klingon words, possibly alphabetically (Appendix:Klingon/A, Appendix:Klingon/B, Appendix:Klingon/C...), per part-of-speech (Appendix:Klingon/List of nouns, Appendix:Klingon/List of adjectives), and/or per subject (Appendix:Klingon/List of animals, Appendix:Klingon/List of clothing), thus justifying the existence of Category:Klingon language.
  2. We would still eventually have appendices for certain pieces of information that are common for other languages as well, such as Appendix:Klingon Swadesh list and Appendix:Klingon given names, thus justifying the existence of Category:Klingon language.
  3. We would still perhaps have Klingon templates (for example, to display multiple scripts), thus justifying the existence of Category:Klingon templates and Category:Klingon language.
  4. Not to mention Category:Klingon derivations and Wiktionary:Requested entries (Klingon).
  5. Category:Klingon language fits an intuitive and organized category tree of "all languages", and displays, or should display, relevant information such as script, family, a link to an entry, links to policies, subcategories, a link to a Wikipedia article and a useful warning about Klingon being forbidden in entries, so it is justified.
That's it. --Daniel. 14:00, 10 February 2011 (UTC)Reply
I think the problem is that Category:All languages is misnamed: it doesn't contain entries like English and French, as its name implies, but rather categories like Category:English language and Category:French language. A more accurate name might be "Category:All entries by language", or "Category:All language categories". —RuakhTALK 23:41, 10 February 2011 (UTC)Reply
In my opinion, "Category:All languages" is not a bad name, but for the sake of clarity...
--Daniel. 23:45, 10 February 2011 (UTC)Reply
Category:Klingon language is in this category, albeit not directly, but via a subcategory. Mglovesfun (talk) 23:53, 10 February 2011 (UTC)Reply
Category:Klingon language is (and always has been as far as I remember) a direct member of Category:All languages; check again. --Daniel. 23:57, 10 February 2011 (UTC)Reply
Re: "Category:Portuguese language, Category:English language, Category:French language, etc. contain not only entries but much more information": O.K., but let's focus on one problem at a time. ;-)   —RuakhTALK 00:06, 11 February 2011 (UTC)Reply
@Ruakh: "Category:All entries by language" and "Category:Entries by language" sound good to me. These categories mostly contain entries; the only other thing they contain are indexes and appendixes, but these are very few compared to entries, so the misnaming is not too bad. If we want to be more accurate, we can have "Category:Content by language"--Dan Polansky 14:24, 11 February 2011 (UTC)Reply
Language categories such as Category:English language and Category:Portuguese language also contain templates, relatively many pages of rhymes, few categories or pages of requests for attention, requested entries, etc., and citations. I appreciate accuracy, and I also appreciate your suggestion of "Category:Content by language". --Daniel. 15:47, 11 February 2011 (UTC)Reply
I like "Content by language".​—msh210 (talk) 15:57, 11 February 2011 (UTC)Reply

Poll: Choosing topical categories

I would like to know the opinions of other Wiktionarians about the problem mentioned in WT:BP#How to choose topical categories:

Most entries fall into the scope of various redundant topical categories simultaneously. For example, German Shepherd may be categorized into Category:Herding dogs, Category:Dogs, Category:Canids, and so on. Should "German Shepherd" be a member of all these categories? Or, perhaps, should it simply be a member of only the narrowest one, that is, Category:Herding dogs, and all the others would be merely implied?

This poll is not about deciding categorization of topical categories, or deciding names of topical categories, because I believe these are complex and separate problems to be discussed eventually. Nonetheless, I also believe our category tree is good enough to undergo this project of becoming more consistent by actually deciding and letting editors know where they should categorize entries.

Feel free to make other proposals and comments. Thank you for your attention and your input. --Daniel. 18:04, 10 February 2011 (UTC)Reply

Preference 1: Narrowest topical categories only

I prefer all entries as members of only the narrowest topical categories available.

If you agree 100% with this practice, or if you agree in essence with it but have ideas of some situations where it would be better to use less narrow categories, please vote for this option. Feel free to elaborate your thoughts.

Examples:

  1. Support Daniel. 18:04, 10 February 2011 (UTC)Reply
    I, personally, prefer the system of using only the narrowest category, because:
    1. It avoids redundant superpopulated categories such as Category:Nature with thousands of terms from Category:Animals, Category:Plants and Category:Weather. Feel free to correct me, but I believe they don't have any practical use; that is, I don't remember or assume that any particular target group of Wiktionary users would need, want or appreciate the existence of superpopulated topical categories. (And these possible wide lists have the potential to be scanned by external tools anyway, if anyone bothers to mention or create such a tool.)
    2. It is easier to visually scan the list of categories from an entry that avoids that redundancy. For example, the list of categories of the current revision of "dog" randomly contains various versions of Category:Canids, Category:Mammals, and Category:Animals, and I wouldn't call it the most comfortable piece of text to be read and understood.
    3. It helps to organize existing categories, by keeping a balance of populated "narrow" categories, and underpopulated "wide" categories. For example, with this system, if any editor finds any name of a specific animal on Category:Animals (or Category:fr:Animals, Category:pt:Animals, etc.), he/she would automatically know that these entries need to be recategorized under Category:Insects, Category:Mollusks, and so on. --Daniel. 18:04, 10 February 2011 (UTC)Reply
  2. Support Mglovesfun (talk) 18:18, 10 February 2011 (UTC)Reply
  3. I agree that, if we are to have topical categories at all, then entries should be only in the narrowest available. Otoh, if we are to have topical categories, IMO the categories should be as broad as feasible, so that the examples given for this option are not in accord with my view. (Even the wording of the option, "I prefer all entries as members of only the narrowest topical categories available", while technically correct, is misleading in that someone might mistakenly infer that I prefer having topical categories.)​—msh210 (talk) 18:40, 10 February 2011 (UTC)Reply
  4. Support Internoob (DiscCont) 00:38, 11 February 2011 (UTC) I actually thought that that was (more or less) already one of those de facto policies. —Internoob (DiscCont) 00:38, 11 February 2011 (UTC)
  5. Support, but you need to keep in mind that a term like 'narrow' only really works if the categories actually behave like a tree. But they don't always, some categories are nested in strange ways and there isn't really a clear definition of what's 'narrower' than something else. —CodeCat 14:10, 11 February 2011 (UTC)
  6. As phrased, I cannot agree with either option, but I'm certainly NOT disinterested. I have to agree that there needs to be some organisation to categories. Some are simply a mess, while some are redundant. But the great majority are useful, and so could be made to be more useful with a bit of organisation. However, I can easily think of cases where an entry might need to be at two different levels on the category tree, without implying that the lower level is therefore redundant. -- ALGRIF talk 17:02, 11 February 2011 (UTC)Reply
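The "narrowest categories only" rule can be sketched in Python as follows. This is an illustration under stated assumptions, not a description of any existing bot or tool: the PARENTS mapping is a hypothetical slice of the category graph built from the category names mentioned in this poll, and the function names are made up. Each category maps to a list of parents, so the sketch does not require the categories to form a strict tree, which is the concern CodeCat raises above.

```python
# Hypothetical parent links; the real category graph may differ.
PARENTS = {
    "Herding dogs": ["Dogs"],
    "Dogs": ["Canids"],
    "Canids": ["Mammals"],
    "Mammals": ["Animals"],
}


def ancestors(category, parents):
    """Collect every category reachable by following parent links upward."""
    seen = set()
    stack = list(parents.get(category, []))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents.get(current, []))
    return seen


def narrowest(categories, parents):
    """Drop every category that is already implied by a narrower one in the set."""
    implied = set()
    for category in categories:
        implied |= ancestors(category, parents)
    return sorted(c for c in set(categories) if c not in implied)


# The German Shepherd example from the poll introduction.
print(narrowest(["Herding dogs", "Dogs", "Canids"], PARENTS))  # ['Herding dogs']
```

Under Ruakh's alternative in the discussion below, where important terms also stay in the broader category, a filter like this would need an exception list rather than being applied blindly.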

Preference 2: All topical categories

I prefer all entries as members of all the topical categories available.

If you agree 100% with this practice, or if you agree in essence with it but have ideas of some situations where it would be better to disregard certain topical categories, please vote for this option. Feel free to elaborate your thoughts.

Examples:

Preference 3: Indifference

I am indifferent or indecisive about the choice of topical categories.

Discussion

  • I oppose the two simple options offered in this poll. The second option seems unworkable; it would lead to huge top-level categories. The first option is not too bad, just that I can imagine wanting to have an entry both in a finest-level category and its supercategory. The first option is clear and simple, but not necessarily the best one. If I were asked which of the two options I prefer, I would clearly prefer option 1 over option 2, but that does not mean that I want the wording of option 1 to become a policy for Wiktionary practice. --Dan Polansky 18:30, 10 February 2011 (UTC)
  • I, too, oppose the two simple options offered in this poll. [[German Shepherd]] should be in Category:Dogs. I think a better approach might be to think about how many entries a topical category should have in order to be useful; Category:Dogs should have (say) the 500 most common/important/whatever dog-itive terms, most of which will also be in subcategories, but less important words should only be in subcategories. (That won't be perfect, because there might be some dog-related words that it's hard to categorize any more narrowly than that, such that they have to go into Category:Dogs no matter how unimportant they are; but I think that's the kind of logic that needs to be applied.) —RuakhTALK 16:28, 11 February 2011 (UTC)Reply

Poll: Deprecation of topical categories

As of this revision, DCDuring has created an additional option on the poll above, to deprecate topical categories. This option since then has been supported by two people.

Since the deprecation of topical categories is a separate subject, and there was no space to formally oppose or just comment on this proposal in the poll above, I'm moving it to this additional poll, naturally with the additional options.

Once more, feel free to make other proposals and comments. Thank you for your attention and your input. --Daniel. 21:00, 10 February 2011 (UTC)Reply

Option 1: Support deprecation

Vote here if you believe that topical categories should be deprecated at this time, with no further topical categories to be created after 20:37, 10 February 2011 (UTC).

  1. Support DCDuring TALK 20:37, 10 February 2011 (UTC)Reply
  2. Support I have to wonder if anyone even used these. They don't seem to be worth the trouble to me. -- Prince Kassad 20:43, 10 February 2011 (UTC)Reply
  3. Support strongly. See my comments with the same timestamp in the next subsection.​—msh210 (talk) 06:36, 11 February 2011 (UTC)Reply
  4. Support Ƿidsiþ 16:45, 11 February 2011 (UTC) I would quite like a few top-level topical cats -- but if it came to it I would rather have none than have the crazy proliferation of them that we currently have.Reply

Option 2: Oppose deprecation

  1. Oppose Daniel. 21:00, 10 February 2011 (UTC)Reply
    Topical categories serve the purpose of subdividing Wiktionary into various lists of words related by their subjects, including lists that would normally be found as various types of dictionary: a dictionary of medicine, of law, of technology, etc.
    If I want to know terms used in psychology, I use Category:Psychology. If I want to know the "vulgarities" of English and other languages, I navigate through Category:Vulgarities. If I want to know terms used in chess, I use Category:Chess.
    Appendices, Wikisaurus pages, and topical categories are valuable tools for the creation and maintenance of various lists of words by their contexts; each of these methods has its own scope and qualities, and I personally like them all. Topical categories are not only dynamic, but easy to populate and easy to find at the bottom of each entry, and readers are used to them after all these years of creating and keeping topical categories. Unlike Wikisaurus, the scope of topical categories is simple and flexible enough to often allow adjectives, nouns, adverbs, etc. together, and unlike appendices, they do not necessarily carry additional information that may not be required, such as definitions of each word, or very detailed explanations of usage and existence of groups of words. Topical categories also usually don't need frequent updates to keep up with entries.
    Furthermore, the appearance of categories is consistent among all projects, so if I find an interwiki link from Category:Childish to fr:Catégorie:Langage enfantin and sv:Kategori:Barnspråk, I know I will be able to navigate them without having to learn much; and, if I compare the same topical category in two Wiktionaries, I may find the terms that are missing from them. In addition, it's easy to find or create tools to browse topical categories and gather specific contents (such as downloading specifically the dictionary of chess). For those reasons, I oppose their deprecation.
    I almost forgot to mention that categories get praised or criticized occasionally, but regularly, at WT:Feedback, so other people are aware of them and use them too. --Daniel. 21:33, 10 February 2011 (UTC)Reply
    Terms found in a specialized dictionary (and not generally defined the same way in a general dictionary) are jargon terms, which are properly tagged with an appropriate jargon context tag like {{mathematics}} and categorized in the appropriate category like Mathematics. When I voted in support of getting rid of topical categories, it was in support of getting rid of a category that contains all mathematics-related terms: that's what I think of as a topical category. I oppose getting rid of jargon categories. Thus, the Mathematics category would remain (under my proposal) but contain only jargon terms (and the category description on the category page would indicate as much). Category:Herding dogs would presumably go (unless there's a jargon of the field of herding dogs specifically). So, Daniel, Psychology and Chess will remain categories and will contain, as you put it, "terms used in psychology" and "terms used in chess" — but not terms used outside of but about psychology or chess. Childish and Vulgarities are not topical categories at all: they're categories for registers, not topics. They'd of course be kept. (They should probably be renamed "English...", but that's a separate issue.)​—msh210 (talk) 06:36, 11 February 2011 (UTC)Reply
    With the distinction presented by msh210 in mind, let me clarify that, in my opinion, Category:Greens to find all names of shades of green of a language, Category:Dogs to find all names of breeds of a language, etc. are equally helpful as Category:Chess, Category:Psychology, etc. and for exactly the same reasons. --Daniel. 15:43, 11 February 2011 (UTC)Reply
    I may also use topical categories to check if certain terms already exist in Wiktionary and are properly categorized. For example, by reading Category:Chess, I know that we lack many names of strategies of this game; and, by reading Category:Dogs, I know that we lack many names of breeds of dogs. --Daniel. 21:37, 10 February 2011 (UTC)Reply
    You seem to have misread my proposed option in the above straw poll. I am not sure why. A category such as Category:Vulgarities is clearly linguistic in its scope. The truly topical categories are essentially part of the overall non-linguistic, encyclopedic tendency within Wiktionary. I consider this tendency destructive of Wiktionary as a language resource and incompetently duplicative of the superior work on encyclopedic topics that is carried out at Wikipedia. It seems to me that we need more work on attestation of old and new terms and their senses and usage, something which is not likely to be covered in WP. DCDuring TALK 22:57, 10 February 2011 (UTC)
    I believe I did not misread anything you wrote in this or the above thread. First of all, I've seen, more than once, people using the umbrella of "topical categories" for anything whose naming system basically consists of "language code, then colon, then label". Second, the proposal does not make a clear distinction of categories to be kept or deleted, and their reasons (though you gave some reasons now). And, third, even if I assume that your proposal never meant to attack Category:Vulgarities, my defense of the existence of that category does not contradict my opposition. Apparently you have misread yourself. :p --Daniel. 23:37, 10 February 2011 (UTC)Reply
  2. Oppose The proposal is crazy talk. Categories are a living part of any wiki. They are created by users when needed, and the bad ones can be deleted, given some criteria for good and bad. Banning the creation of categories is unnatural. --LA2 21:46, 10 February 2011 (UTC)Reply
  3. Oppose. --Yair rand (talk) 22:12, 10 February 2011 (UTC)Reply
  4. Oppose Mglovesfun (talk) 23:08, 10 February 2011 (UTC). I don't oppose topical categories. I get annoyed by our lack of regulation, essentially meaning that it's a free-for-all, but I wouldn't want all topical categories deleted. Mglovesfun (talk) 23:08, 10 February 2011 (UTC)Reply
  5. Oppose Internoob (DiscCont) 00:45, 11 February 2011 (UTC) Per Daniel.
  6. Oppose Dan Polansky 08:17, 11 February 2011 (UTC) I oppose deprecation of topical categories. For clarification of what I mean by "topical category": "vulgarities" is not a topical category; "Category:de:Latin derivations" is not a topical category; "mammals" and "vehicles" are topical categories; "geography" can be seen as a topical category, which contains "mountain", but it could be designed and regulated as a category of terms that are only used in geography, in which case "mountain" would not belong to "Category:Geography". While I do support having topical categories, I do not support any arbitrary level of their granularity: Category:Herding dogs could be too specific. --Dan Polansky 08:17, 11 February 2011 (UTC)Reply
  7. Weak oppose. I don't think we're doing a great job with topic categories, but they do fall in our remit as a dictionary-cum-thesaurus. If we allow ===Synonyms=== and ===Hypernyms=== and [[Wikisaurus:foo]] and so on, then clearly we're not just mapping from words to meanings, but also from meanings to words; and a topic-category system, done well, is a key component of that. —RuakhTALK 13:55, 11 February 2011 (UTC)Reply
  8. Oppose. I find categories a very useful tool. For instance, I sometimes use a topic category to fill the gaps .. leading to lots of new vocab. (as stated above). I feel that I must remind users that the lemmings also do this. But with paper there are restrictions, so they end up with all the usual limited range of stuff Weights and Measures, Clothes, Parts of the body, Cars, Garden tools, and whatever else comes into their heads. I think the categories can be really useful. Maybe a bit of culling from time to time, but with care. One contrib's idea of a bad category is almost sure to be another's useful portmanteau. How about checking with the contrib who set a category up before culling it? -- ALGRIF talk 16:51, 11 February 2011 (UTC)Reply
  9. Oppose. They are useful for contributors, but also for readers, especially when they want to find a word they have known, but forgotten. Lmaltier 21:22, 11 February 2011 (UTC)Reply
    However, if they have to be useful to readers, category names should be very easy to understand (not only by specialists). Names such as fr:Dogs should be changed to something such as Dogs in French. And categories such as Chaetodontidae would be useless (but a category such as Butterflyfish might be useful). Lmaltier 22:16, 11 February 2011 (UTC)Reply
  10. Oppose. I am of the same opinion as User:Mglovesfun and User:Ruakh. We should discuss, improve or delete single categories, but not delete all categories categorically. - -sche 21:42, 12 February 2011 (UTC)Reply
  11. Oppose. Mend it, don't end it. bd2412 T 16:42, 14 February 2011 (UTC)Reply

Preference 3: Indifference

I am indifferent or indecisive about the existence or deprecation of topical categories.

Discussion

If this turns out to get enough support, we should start a formal VOTE to implement this change. -- Prince Kassad 21:14, 10 February 2011 (UTC)Reply

What we really need is to standardize these. Right now, anyone can create a topical category; there aren't really any 'rules' or even 'guidelines'. Also, the discussion below is productive - it would be nice to move some things out of the topical realm which aren't topics in the way 'biology' and 'art' are. Mglovesfun (talk) 12:04, 14 February 2011 (UTC)

Flood Flag

In the spirit of getting things done, I propose we remove the whole Flood Flag "process" for administrators. I propose that administrators be able to use the flag at their discretion, with the understanding that while it is enabled they will make only edits which are non-controversial, and that they give a verbose reason when they flag themselves, e.g. "adding lots of glosses to trans sections". I don't see any reason why we need to have two people agree and wait 48 hours before someone is allowed to do some work; many requests get done before the approval process can complete, flooding RC needlessly. - [The]DaveRoss 22:02, 10 February 2011 (UTC)

Since I'm the most contrary person here, we could have a simple procedure: Check with User:DCDuring {;-)}. DCDuring TALK 23:01, 10 February 2011 (UTC)Reply
We can compromise and say that everyone must inform DCDuring, the preferred mechanism for doing so being an informative reason in the flag assignment. - [The]DaveRoss 23:30, 10 February 2011 (UTC)Reply
Yes totally agree, perhaps move Wiktionary:Requests for flood flag to Wiktionary:Flood flag/requests and create Wiktionary:Flood flag. Then admins could simply 'inform' others of their use of a flood flag, instead of 'requesting it'. Mglovesfun (talk) 00:15, 11 February 2011 (UTC)Reply
@TheDaveRoss: I think we might as well keep the page, and have admins comment there when they flag themselves, just so people who care can keep the page on their watchlist. We can easily drop that requirement at some point if we decide in future that it's too onerous and not necessary. —RuakhTALK 16:15, 11 February 2011 (UTC)Reply
That seems reasonable, a simple "post here before you flag yourself" is not an undue burden and will keep the usage somewhat transparent. - [The]DaveRoss 21:51, 11 February 2011 (UTC)Reply

placenames up for deletion

It was recently decided that placename entries that don't meet our CFI should be tagged, and then deleted after a month if they haven't been sufficiently improved. May I make the following queries, assumptions, suggestions...

  • When CFI says "Information about grammar, such as the gender and an inflection table." then just the gender is sufficient.
  • Two different translations that are "not spelled identically with the English form" count as the necessary two criteria.
  • It takes only a few seconds to tag an entry but several minutes to improve it such that it meets CFI. I would like the initial month to be extended, and for people tagging entries to limit the number that they tag each day. Some of us who want these entries to stay also have other things to do that are more useful to our users (such as adding every Latin word in the Vulgate).

Cheers. SemperBlotto 20:01, 11 February 2011 (UTC)Reply

AFAICT the policy is fairly clear on what entries need to meet CFI, but not what to do when they don't, since presumably the information needed for the entry to pass always exists. 30 day rule sounds good to me, and like RFV, if an entry is deleted then restored and such information is quickly added, then it will meet CFI. Mglovesfun (talk) 20:09, 11 February 2011 (UTC)Reply
If an entry is known not to meet our criteria for inclusion, as these are, then I think it's already pretty generous to tag them for a month in the hopes that someone will edit the entry so it does meet the criteria. —RuakhTALK 21:11, 11 February 2011 (UTC)Reply
If it can meet the inclusion criteria, but the appropriate content hasn't been added yet, I don't see why not to keep it there for at least a month. What's the difference between this and entries that don't have citations added yet? --Yair rand (talk) 21:14, 11 February 2011 (UTC)Reply
An entry that doesn't have citations hasn't been demonstrated to meet the CFI: maybe it meets the CFI, maybe not. A month of waiting for citations has been deemed sufficient time to rule out the former case. By contrast, a placename entry that doesn't have linguistic information is known not to meet the CFI; it's an entry that shouldn't exist. If it was entered before the new rule was passed, then some leniency may be justified, and some time given to interested editors to bring it up to par; if not, I think speedy-deletion is the best course. (Even in the former case — this is a low bar. If we give thirty days to track down citations for obscure words, then thirty days should be way more than enough to track down the gender of a common placename!) —RuakhTALK 21:29, 11 February 2011 (UTC)Reply
It's harder for English place name entries though. English has no gender. -- Prince Kassad 21:49, 11 February 2011 (UTC)Reply
I haven't heard any hue and cry about the discrimination against English place names. Let's let sleeping dogs lie. DCDuring TALK 23:36, 11 February 2011 (UTC)Reply
That's not how I see it. For citations, we care about whether the cites are listed in the entry, not whether they exist at all. We only have access to an extremely small amount of durably archived works, and a lot of the words that get deleted through RFV (especially those affected by WT:BRAND and such) almost certainly have the necessary cites somewhere in the world. I don't see how placenames are different. --Yair rand (talk) 21:52, 13 February 2011 (UTC)Reply
Re: "I don't see how placenames are different": They're different in that they were explicitly voted to be different. To quote from WT:CFI: "A place name entry should initially include at least two of the following: [] " (emphasis mine). —RuakhTALK 12:15, 14 February 2011 (UTC)Reply
The first is true. If a language has only gender but no inflection of proper nouns, the gender by itself is sufficient. The second isn't as far as I can tell, translations by themselves will never fulfill the criteria. -- Prince Kassad 21:24, 11 February 2011 (UTC)Reply
I agree with PK.​—msh210 (talk) 08:30, 13 February 2011 (UTC)Reply
  • I don't think alternative spellings, synonyms, and derived terms count as "information about grammar". By the way, note that the requirement is "Information about grammar, such as the gender and an inflection table" (emphasis mine). By my reading, the "such as" is not intended to suggest that you can provide some random sampling of grammar information and call it good, but rather, to acknowledge that different languages have different types of grammar information. If place-names in a given language have gender and case inflections, then you have to supply both to fulfill this requirement. Most English place-names don't really have grammatical information worth mentioning, but some do: I think a usage note at [[Sudan]] explaining usage with and without "the" would count for this. That's just me, though. —RuakhTALK 03:26, 13 February 2011 (UTC)Reply
  • Like Ruakh, I don't think alt.sp.s, 'nyms, and derived terms count as "information about grammar", but think a usage note about the for the Sudan should.​—msh210 (talk) 08:30, 13 February 2011 (UTC)Reply
Okay, but what's special about declension that's not special about a demonym, for instance? What makes a word's genitive form more valuable than a word's demonym? Some of them are not intuitive at all, like Malagasy or Filipino. And why is a translation into another language that's not spelled identically different than a synonym or alternative form that is not spelled identically in the same language? It seems a bit selective to me. —Internoob (DiscCont) 03:45, 14 February 2011 (UTC)Reply
Re: "It seems a bit selective to me": Indeed. That was one of the reasons I opposed it. —RuakhTALK 12:18, 14 February 2011 (UTC)Reply
  • Coming in late to the debate, this proposal to exclude place names unless they satisfy artificial criteria seems utterly silly. It looks like the kind of thing designed to engage contributors in unnecessary work. These are mostly places that can easily be encountered in a person's reading. There is no credible doubt that Bridgetown exists as a city. Etymology helps us to understand a word, and having it for place names is an interesting but not necessary historical detail, and there is no guarantee that two identical place names will have the same etymology. For grammar information it may be sufficient to know that a name is invariant, and in the absence of other information that should be implicit by default. Eclecticology 10:55, 13 February 2011 (UTC)Reply
  • This thread seems to be a response to the activity of msh210 of adding {{placename/box}}[[Category:Place names needing additional information February]] to entries of geographic names, an activity that seems to have started on 11 February 2011. {{placename/box}} that msh210 uses to tag entries for geographic names was created by DAVilla on 10 February 2011.

    I would like the tagging to stop, especially for entries that, while not currently meeting the CFI requirements, are likely to be able to meet them, such as "Barcelona". Alternatively, the tagger may, for each added tag, add the required information to at least one geographic name, thereby making a genuine contribution to the usefulness of Wiktionary for its users.

    If the tagging does not stop, we may need to modify CFI from saying "A place name entry should initially include at least two of the following:" to "A place name entry should be able to include at least two of the following:". See also Wiktionary:Votes/pl-2010-05/Placenames with linguistic information 2. --Dan Polansky 12:01, 13 February 2011 (UTC)Reply

    That's {{subst:placename}}, fwiw.​—msh210 (talk) 15:46, 14 February 2011 (UTC)Reply
  • Sorry, but no. This requirement was a bad one, but its intended purpose was to prevent useless place-name entries from being added, by adding a small burden on their creators to make them useful. Shifting that burden onto people who are merely trying to implement our primary, defining policy document doesn't make sense. Eliminating the requirement would be one thing, but that sort of "hacking around it" is just the worst of all worlds. —RuakhTALK 15:11, 13 February 2011 (UTC)Reply
    Re shifting: Right on.​—msh210 (talk) 15:46, 14 February 2011 (UTC)Reply
    I do not see how deleting the Dutch entry for Barcelona is doing useful work. The Dutch entry now says that (a) there is a term in Dutch "Barcelona", and that (b) it is translated into English as "Barcelona". If the policy gets modified as I propose, the person considering tagging an entry will have to ask themselves one question: is it likely that this entry can carry useful lexicographical information? In the case of Dutch "Barcelona", the answer is "yes, it is actually certain", and the person can proceed to other work. This question does not seem to put too much burden on the tagger. It is like with the attestation requirement: if a term is very likely to be attestable, no one should be tagging it for RFV; RFV should only receive terms whose attestability is questionable. --Dan Polansky 15:14, 13 February 2011 (UTC)
  • Please see my above comments. According to this rule, a place-name entry without enough information is like a term that has already failed RFV. I'm not saying that deletion is "useful work", just that it's implementing a policy that was approved by the community. (Personally, I'd be happy with deleting all place-names, but the community doesn't support that, so I don't.) —RuakhTALK 16:48, 13 February 2011 (UTC)Reply
    • I do admit that the tagging is consistent with the literal and strict reading of the current policy for geographic names. That is why I have mentioned the option of changing the policy from "A place name entry should initially include at least two of the following:" to "A place name entry should be able to include at least two of the following:". My previous post defended the usefulness of such a change against the charge that the change would place too much burden on taggers. --Dan Polansky 20:08, 13 February 2011 (UTC)Reply
I think the above discussion has pretty much proved that the WT:Votes/pl-2010-05/Placenames with linguistic information 2 vote was a mistake. Most of the voters probably assumed that no one would go out of their way to try to eliminate hundreds of valuable placename entries as the policy allows. (I'm starting to wonder how much longer before RFD is clogged by entries killed by the bucket problem...) The policy clearly needs to be modified. --Yair rand (talk) 21:52, 13 February 2011 (UTC)Reply
  • Ideally this project should include all place names. The policy amendment was clearly a terrible idea. As is typical with such bad policies there are always people for whom the literal interpretation of the policy is more important than the health of the project. In most cases the criteria in the added list could be easily met, but to insist that the originator include them does nothing but create dissension. If someone thinks these details are so important he should accept the responsibility. Better still, this excuse for a policy should be completely revoked. 07:21, 14 February 2011 (UTC) — This unsigned comment was added by Eclecticology (talk • contribs).
    • I assume I should take offense at that. Note that in tagging the city names I tagged, I had no intention (as I mentioned to SB on my talkpage) of deleting the pages after a month. In fact, within the month, I was planning to go (and would still be planning to go, except that recently proposed votes may make such action not as necessary) back to each entry and add pronunciation for any I knew the pronunciation for (which is very few), and if that made the entry meet the CFI then I would detag it. (If an entry remained so tagged for a while I'd certainly delete it, though.) The category of entries so tagged is also a cleanup category. I didn't think that my tagging the entries would create such a horrified response: I thought it was the natural outcome of our new policy, which certainly had community approval: frankly, I'm surprised.​—msh210 (talk) 15:46, 14 February 2011 (UTC)
  • One practice that is bound to create dissension is sticking a template and saying that the article will be deleted unless certain criteria are met. It calls up the apprehension that if it can be done on one article it can be done on any, and someday it may be about something that I care more about. Deletion without warning is a thoroughly unacceptable way of doing things, but generates a lot less noise. Natural outcomes depend on what you want to accomplish. Is it improving bad articles or deleting them? Eclecticology 00:32, 15 February 2011 (UTC)Reply
    Since I tagged them rather than deleting them without warning (which, as you say, might be less noisy, and which would be justified TBH), obviously I'm not solely interested in their deletion. If they can be improved so as to satisfy our criteria for inclusion, that's great. Some have been already.—msh210℠ on a public computer 17:53, 16 February 2011 (UTC)
Since no one seems to support this policy anymore, I've created a vote proposing that it be revoked: Wiktionary:Votes/pl-2011-02/Remove "Place names" section of WT:CFI. Please take a look, make any necessary improvements, etc. —RuakhTALK 14:19, 14 February 2011 (UTC)Reply
The policy seems fairly good to me. It explicitly programs on (refers to; is expressed in terms of) the rationale for inclusion of geographic names: their ability to carry useful lexicographical information. It says there is a consensus that geographic names should be largely included. A minor tweak to the policy should do, so I have created an alternative vote: Wiktionary:Votes/pl-2011-02/Relaxing CFI for geographic names. Proposals for improved wording are welcome, especially here in Beer parlour, and on the talk page of the vote. --Dan Polansky 14:49, 14 February 2011 (UTC)Reply
There is a distinction to be made between what Geographic names to include, and what is included in articles about geographic names. As long as it is about the latter it doesn't belong in CFI. Eclecticology 00:55, 15 February 2011 (UTC)Reply
In my opinion, the best thing to do to this policy would be to make what counts as "information about grammar" more comprehensive for the reasons I stated before. Also, the "multiple-word" thing seems to exclude multiple-word place names whose etymologies aren't intuitive. Other than that, this policy is okay IMHO. —Internoob (DiscCont) 04:46, 15 February 2011 (UTC)Reply
We can decide, with or without a vote, that entries that existed before the place name vote passed must be tagged longer than a month before deletion. 3 months, 6 months, a year even? Tagging them is useful to the project, and msh210 does not strike me as a person "for whom the literal interpretation of the policy is more important than the health of the project". The present Dutch entry for Barcelona is a needless stub, only repeating the information in the translation table: "Dutch: Barcelona n". The CFI can be amended to count demonyms or synonyms or two foreign translations.--Makaokalani 09:00, 15 February 2011 (UTC)Reply
We don't generally disallow entries just because they are repetitions of what's already shown in a translation table. Why should we do that for place names? --Yair rand (talk) 09:11, 15 February 2011 (UTC)Reply
Because a place name entry without linguistic information is encyclopedic. Because names are different from words that have a meaning. --Makaokalani 11:11, 15 February 2011 (UTC)Reply
How would it be without linguistic information, or encyclopedic? Such entries (those that are duplicates of information already in translations sections) are exclusively linguistic information containing no information about the place itself whatsoever. An English entry for a place name only containing information about the place itself could be encyclopedic, but I don't see how a translation could be. --Yair rand (talk) 11:18, 15 February 2011 (UTC)Reply
Saying that a place name entry without linguistic information is encyclopedic is a very narrow interpretation of the problem. There are certain features of a place name that overlap the two kinds of work. People will run into a place name in a course of ordinary reading and will ask some basic questions that do not require a detailed encyclopedic treatment, most importantly, "Where is it?" For many English names, where pronunciation is self-evident, the only other grammatical information may be that the name is invariant. Eclecticology 00:52, 16 February 2011 (UTC)Reply

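The two votes that have resulted from this thread have started:

  • Wiktionary:Votes/pl-2011-02/Remove "Place names" section of WT:CFI
  • Wiktionary:Votes/pl-2011-02/Relaxing CFI for geographic names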

--Dan Polansky 15:09, 22 February 2011 (UTC)Reply

I don't think it's all that likely that either of those will pass. We need a vote allowing all attested place names, in my opinion. --Yair rand (talk) 15:53, 22 February 2011 (UTC)Reply

Reading level or frequency for Wiktionary entries.

On the English Language and Usage - Stack Exchange Web Site the question: vocabulary - To what reading level does a specific word like 'verbose' belong? has no satisfying answers. Wiktionary.com seems like it would be the right place for this kind of information. A new section like Pronunciation could be added for Reading Level to each word's entry. Also, a Frequency section would be useful for each word's entry.

What measure would you use to evaluate such a statistic? I suppose the most obvious would be to divide the English corpus by target age level (somehow) and then look at the frequency in each sub-corpus for each word. That is not a simple task at all. - [The]DaveRoss 17:36, 12 February 2011 (UTC)Reply
I could see how computational linguistics could generate an approximation for word readability. Generate a readability index score for a passage. (Even better, generate multiple scores and check for consistency, discarding ratings and passages that were excessively discordant, combining the remaining measures into a single score.) Use that score as a score for each word in the passage, possibly excluding high-frequency words. Repeat until one has a statistically adequate number of data points for all words for which the readability score is desired. The process could be used recursively to assign passage readability scores based on the word readability scores to the passages and thence again to the component words.
This would be a substantial project. It would still miss context sensitivity, such as careful explicit definition of an otherwise hard word.
As with most frequency analysis this does not get at specific senses or analyze by lexeme rather than spelling.
Obviously this is a fairly ambitious project. DCDuring TALK 19:49, 12 February 2011 (UTC)
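
The passage-to-word scoring described above could be sketched roughly as follows. This is only an illustration under stated assumptions: the passage score is the Flesch-Kincaid grade level, the syllable counter is a crude vowel-run heuristic, and the names `count_syllables`, `fk_grade` and `word_scores` are hypothetical helpers, not part of any existing tool. It does a single pass and averages one index; the consistency check across several indices and the recursive refinement are left out.

```python
import re
from collections import defaultdict

WORD_RE = re.compile(r"[A-Za-z']+")

def count_syllables(word):
    """Crude estimate (assumption): number of vowel runs, at least 1."""
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

def fk_grade(passage):
    """Flesch-Kincaid grade level; the passage must contain at least one word."""
    words = WORD_RE.findall(passage)
    sentences = [s for s in re.split(r"[.!?]+", passage) if s.strip()]
    syllables = sum(count_syllables(w) for w in words)
    return (0.39 * len(words) / len(sentences)
            + 11.8 * syllables / len(words) - 15.59)

def word_scores(passages, stopwords=frozenset()):
    """Give each word the mean grade level of the passages it occurs in."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for passage in passages:
        words = {w.lower() for w in WORD_RE.findall(passage)}
        if not words:
            continue  # nothing to score in an empty passage
        grade = fk_grade(passage)
        # High-frequency words can be excluded via the stopword set.
        for w in words - set(stopwords):
            totals[w] += grade
            counts[w] += 1
    return {w: totals[w] / counts[w] for w in totals}
```

With a large enough corpus, `word_scores(corpus, stopwords={"the", "of", "and"})` would return one approximate grade level per spelling; as noted above, it cannot separate senses or lexemes.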
Thinking about this, another fun approach might be to crowdsource the whole scoring part. Set up a "survey" where people input their age and level of education, and are then presented with a series of words. Even a simple "I know this word" vs "I don't know this word" (perhaps an intermediate "I recognize but do not understand this word") response given by enough people could generate some pretty interesting data. I have seen this method used in other contexts, but I don't see why it wouldn't work for readability. Probably not a Wiktionary project but I assume that someone would be interested in what words people understand. The only measurement I am aware of for readability is the Flesch–Kincaid readability test and that doesn't work well for single words. - [The]DaveRoss 20:21, 12 February 2011 (UTC)Reply
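
The survey idea could be tallied along these lines. It is a minimal sketch, assuming each response is recorded as an (age band, word, answer) triple and that the intermediate "recognize" answer counts as half a point; both the record layout and the weights are assumptions made for illustration, and `recognition_rates` is a hypothetical name.

```python
from collections import defaultdict

# Assumed weights for the three possible answers.
ANSWER_WEIGHT = {"know": 1.0, "recognize": 0.5, "unknown": 0.0}

def recognition_rates(responses):
    """Mean recognition score per (word, age band) from survey responses."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for age_band, word, answer in responses:
        key = (word, age_band)
        totals[key] += ANSWER_WEIGHT[answer]
        counts[key] += 1
    return {key: totals[key] / counts[key] for key in totals}
```

Comparing the rate for the same word across age bands would give a rough picture of when a word becomes widely understood, provided enough people answer in each band.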

Poll: Categories of lexicons are "lexical categories"

The names of categories of Wiktionary normally follow a relatively clear distinction between "lexical categories" (including language names, among other characteristics) and "topical categories" (including language codes, among other characteristics). However, ironically, Category:Lexicons apparently is an exception by having a huge majority of subcategories that use the format of "topical categories", and a few exceptions that use the format of "lexical categories".

This situation leads to the awkward simultaneous existence of these category trees, among many others:

I believe it is possible to achieve logic and consistency (and, by extension, navigability) by using only one categorization system for all lexicons. For that reason, I developed this poll with an initial simple proposal and the common options to make decisions and comments about it. I expect to eventually develop this basic idea into other, more detailed, proposals, such as possibly decisions about individual categories and their individual names.

I would like to know the opinions of other Wiktionarians about it.

Thank you for your attention and your input. --Daniel. 06:40, 13 February 2011 (UTC)Reply

Preference 1: Support

I prefer categories of lexicons with language names instead of language codes

For example, Category:pt:Vulgarities may be replaced by Category:Portuguese vulgarities, Category:Portuguese vulgar terms or other category name without "pt" but with "Portuguese".
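The renaming described in the example above could be sketched roughly as follows. This is only an illustration, not an existing Wiktionary tool: the `LANGUAGE_NAMES` table and the function name `to_language_name_format` are hypothetical, and the rule of lowercasing the first letter of the topic is an assumption based on the "Category:pt:Vulgarities" to "Category:Portuguese vulgarities" example.

```python
# Hypothetical code-to-name table, for illustration only.
LANGUAGE_NAMES = {"pt": "Portuguese", "fr": "French", "sv": "Swedish"}


def to_language_name_format(category: str) -> str:
    """Turn 'Category:pt:Vulgarities' into 'Category:Portuguese vulgarities'.

    Titles that do not start with a known language code are returned unchanged.
    """
    prefix = "Category:"
    if not category.startswith(prefix):
        raise ValueError(f"not a category title: {category!r}")
    rest = category[len(prefix):]
    code, sep, topic = rest.partition(":")
    if not sep or code not in LANGUAGE_NAMES:
        return category
    # Assumption: only the first letter of the topic is lowercased.
    return f"{prefix}{LANGUAGE_NAMES[code]} {topic[:1].lower()}{topic[1:]}"


if __name__ == "__main__":
    print(to_language_name_format("Category:pt:Vulgarities"))
```

Whether the final name would be "Portuguese vulgarities", "Portuguese vulgar terms" or something else is exactly what the poll leaves open, so the output format here is just one of the options mentioned.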

If you agree 100% with this practice, or if you agree in essence with it, please vote for this option. Feel free to elaborate your thoughts.

  1. Support Daniel. 06:40, 13 February 2011 (UTC)Reply
    It is consistent with Category:English nouns, Category:English symbols, Category:English phrases and Category:English suffixes, among other lexical categories. --Daniel. 06:40, 13 February 2011 (UTC)Reply
  2. Support. The name should include "in Portuguese" (used without "in", "Portuguese" is ambiguous, as it may refer to the country or to the language). This should apply to all categories, including topical categories: for users, language codes are meaningless. Lmaltier 12:33, 13 February 2011 (UTC)
    I agree that language codes are bad. But I don't agree about "in", and I'm not sure that we should drop the idea of a naming-convention distinction between grammatical categories and topical ones. —RuakhTALK 13:05, 13 February 2011 (UTC)Reply
  3. Support Mglovesfun (talk) 21:54, 13 February 2011 (UTC) in fact I'd actually considered proposing this. Mglovesfun (talk) 21:54, 13 February 2011 (UTC)Reply
  4. Support I've always been very confused by these two systems. They didn't always make sense and seemed rather randomly chosen sometimes. So I'd be happy to see this confusion removed. I also think using language codes in category names is ugly, and gives preferential treatment to English. —CodeCat 15:59, 14 February 2011 (UTC)Reply
  5. I support Preference 1, but I also support the opposite proposal: that all such categories use the language code. As long as we're consistent: all lexical categories should look the same.​—msh210 (talk) 15:52, 14 February 2011 (UTC)Reply
    If we rename all lexical categories to imitate the format of "topical categories", then it would cause a conflict between Category:French parts of speech and Category:fr:Parts of speech, because both would fit the name "Category:fr:Parts of speech". --Daniel. 05:48, 15 February 2011 (UTC)Reply
    Good point. (Unless we use FR for lexical cats and fr for topical, but that's unwise IMO.)​—msh210 (talk) 17:22, 15 February 2011 (UTC)Reply
    It's a good point, but do you really think that users understand the difference? The solution is to make names clear (e.g. Parts of speech in French and Parts of speech categories in French). Lmaltier 20:43, 15 February 2011 (UTC)
    One way to make sure users would understand the difference would be linking from fr:Parts of speech to French parts of speech and vice-versa, and explaining the difference shortly in all categories. Parts of speech in French is a good suggestion, while Parts of speech categories in French is not; the latter does not convey well the concept it tries to address, and would require further context to be clearly understood. --Daniel. 20:53, 15 February 2011 (UTC)Reply
  6. Support Internoob (DiscCont) 04:51, 15 February 2011 (UTC) For usability reasons.
  7. Support Yair rand (talk) 17:39, 6 March 2011 (UTC)Reply
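The naming conflict raised in the thread under vote 5 can be illustrated with a small sketch. It assumes a hypothetical conversion rule (language name replaced by its code, first letter of the remainder capitalized) and a made-up `LANGUAGE_CODES` table; neither is an actual Wiktionary script.

```python
from collections import defaultdict

# Hypothetical name-to-code table, for illustration only.
LANGUAGE_CODES = {"French": "fr", "Portuguese": "pt"}


def to_code_format(category: str) -> str:
    """Turn 'Category:French parts of speech' into 'Category:fr:Parts of speech'.

    Titles that do not start with a known language name are returned unchanged.
    """
    prefix = "Category:"
    if not category.startswith(prefix):
        return category
    language, sep, topic = category[len(prefix):].partition(" ")
    if not sep or language not in LANGUAGE_CODES:
        return category
    return f"{prefix}{LANGUAGE_CODES[language]}:{topic[:1].upper()}{topic[1:]}"


def find_collisions(categories):
    """Group existing titles by converted title; keep targets claimed more than once."""
    by_target = defaultdict(list)
    for category in categories:
        by_target[to_code_format(category)].append(category)
    return {target: sources for target, sources in by_target.items() if len(sources) > 1}


if __name__ == "__main__":
    existing = [
        "Category:French parts of speech",  # lexical category
        "Category:fr:Parts of speech",      # topical category
        "Category:French nouns",
    ]
    print(find_collisions(existing))
```

Under these assumptions the lexical and the topical category both end up at "Category:fr:Parts of speech", which is the collision Daniel. describes.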

Preference 2: Oppose

I oppose the proposal described as the preference 1

Oppose. I think that the categories should be standardized in the other direction, for example Category:Swedish swear words becoming Category:sv:Swear words. --Yair rand (talk) 22:00, 13 February 2011 (UTC)Reply
If we rename all lexical categories to imitate the format of "topical categories", then it would cause a conflict between Category:French parts of speech and Category:fr:Parts of speech, because both would fit the name "Category:fr:Parts of speech". --Daniel. 00:54, 15 February 2011 (UTC)Reply

Preference 3: Abstain

I am indifferent or indecisive about the proposal described as the preference 1

  1. Abstain Ƿidsiþ 11:14, 14 February 2011 (UTC), but they should probably be standardised in one direction or another. Ƿidsiþ 11:14, 14 February 2011 (UTC)Reply
  2. Abstain I don't like the phrasing of the second option ("I oppose the proposal described as the preference 1"): the preference one is not a proposal but a statement of user preference of A over B, and the alternative preference would be one of B over A rather than an opposition. I don't like the title of the poll ('Categories of lexicons are "lexical categories"'), as it does not match the text of the preferences. I would rather not touch the poll, but abstaining cannot harm I guess. I surmise that categories of lexicons (as "Vulgarities") are lexical categories rather than topical ones. I further surmise that categories of lexicons should use the naming convention of lexical categories, which is currently along the lines of "Spanish vulgarities" rather than "es:Vulgarities", unless someone gives reasons against. --Dan Polansky 17:44, 15 February 2011 (UTC)Reply
    On second thought, maybe it would actually be better to emphasize "user preference" over "proposal" as the Preference 2, for example by creating one of these alternatives:
    • Preference 2: I don't prefer categories of lexicons with language names instead of language codes
    • Preference 2: I prefer categories of lexicons with language codes instead of language names
    One of these is a preference of "B over A", as you (Dan) suggested. Nonetheless, I'm happy with the existing headers, since there isn't any conflict between the concepts of proposal and preference, and "I oppose the proposal described as the preference 1" covers both alternatives. I see good logical choices here. Their wording and my reasoning could be simpler, though. --Daniel. 20:06, 15 February 2011 (UTC)

Discussion

There is one point that hasn't really been raised yet. What do we do with categories like Category:Agriculture? 'English Agriculture' just sounds odd... —CodeCat 15:46, 23 February 2011 (UTC)Reply

How is Category:Agriculture part of "Categories of lexicons"? --Yair rand (talk) 17:28, 1 March 2011 (UTC)Reply

"Cambodian" and "Khmer"

The language whose code is km is named Cambodian here and Khmer here. Shouldn't these pages use only one name to refer to that language? --Daniel. 03:52, 14 February 2011 (UTC)Reply

In a word, yes. Mglovesfun (talk) 11:59, 14 February 2011 (UTC)Reply

Wiktionary:Votes/pl-2010-12/Names of individuals

The vote has started. --Daniel. 07:25, 14 February 2011 (UTC)Reply

Renaming CFI section for spellings 2

Wiktionary:Votes/pl-2011-02/Renaming CFI section for spellings ends on 17 February 2011, in three days. Only 5 people have voted so far. It would be nice if some more people voted; even explicit abstains would be nice. I can imagine most people just don't care about the subject of the vote, which makes plenty of sense, as the proposal is really merely cosmetic. --Dan Polansky 12:27, 14 February 2011 (UTC)Reply

As I said on the vote page this change is misleading. The CFI page is about what to include, and this section discusses a particular type of entry, so "Spellings" does not address the problem. A better alternative would be to simply delete "common misspellings" from the title as a redundancy. Perhaps I should add this to the vote page as an alternative. Eclecticology 21:36, 15 February 2011 (UTC)Reply

Quotations - Proposed enhancements

TO: Wiktionarians

FROM: Geof Bard

RE: Quotations - Proposed enhancements


Feb 14, 2011

Summary: Date; Regionalize; Situate & Tag Quotes (More Better)

I propose that:

(1) Wiktionary guidelines be amended to strongly encourage dates on all quotations; without them it is impossible to "date" the term or word in question. Since language is a living and evolving creature of living persons, the lack of dates seriously impairs their value.

(2) Wiktionary guidelines be amended to strongly encourage editors/writers to identify the region associated with the quote: at a minimum, whether we are talking about Castilian or Latin American Spanish, or American or British English. Southern writers who write in Southern idiom should also be identified, particularly when using a word whose basic entry does not associate it with the region.

(3) A third area of concern is that there is seldom adequate identification of which slang subcultures words originate from. For instance, some slang is clearly associated, in its origination, with African American Northern urban USA. Is that not the case, for instance, with "homeboy"?

Other terms are affiliated with Chicago where electric blues developed its own nomenclature.

(4) Fourthly, it seems that there is need for a greater array of classification templates. At this point, I will not elaborate that issue pending greater familiarity with Wiktionary culture.

Thank you in advance for your thoughtful comments


Geof Bard


Geof Bard 02:08, 16 February 2011 (UTC)

You're very active to say you've been contributing less than 24 hours! Does it not occur to you to get a feel for how Wiktionary works before proposing this sort of thing?

Mglovesfun (talk) 11:41, 15 February 2011 (UTC)Reply

Active is bad? Some of the bots add 100 articles per day, I don't think I have added so many.

I've been using English and other languages a bit longer than that. Didn't notice any rules about reservation of opportunity to make proposals to old guard. Isn't the point of a beer "parlor" to loosen up? See alcohol, tavern, R and R. Sometimes someone with a fresh perspective notices things you may have long gotten used to, aside from anonymous sarcasm; to wit, many of the existing quotes are not representative of the sense, except, obviously, for some specific place and time. You might find it of interest to take a look at the idea expounded in the book Zen Mind, Beginner's Mind. At any rate, the following post seems to take my points regarding new context-based classifications, etc. I am surprised you aren't more open to the specific suggestions, as per your user page:

A dictionary is about quality and quantity, so correcting the entries we already have is very important.

But based upon your response, I should wait until my initial enthusiasm fades into jaded cynicism, and then proceed. That process is receiving a boost, so maybe I will implement the fourth item, below, and create a context template for a certain lingo or two. But, alas, my initial flush of enthusiasm is no longer at my disposal; well, seventy two hours is a long time in cyber-time, so I think I'll work on something else for a few bytes.

Geof Bard 02:02, 16 February 2011 (UTC)Reply

Your generous provision of management services is greatly appreciated.
  1. Dating quotations: Feel free to get started on the entries marked with {{rfdate}}. There are also many entries marked with {{webster 1913}} that have the same problem.
  2. Regional dialect ascription is usually important only if it leads to semantic, pronunciation, orthographic, or usage difference. Quotations should be associated with specific senses.
  3. Register/subculture. Feel free to improve any entry for which you have good reason to believe you can.
  4. A user formerly created new context-based classifications when there was a sufficient number of entries that used the {{context}} tag. Perhaps he or someone else will recommence that practice.
To make your comments more readable, try not to start any paragraph with a space. Also, it is greatly appreciated if contributors to this kind of page avail themselves of the Wikimedia software convention for generating a timestamped signature by typing ~~~~. DCDuring TALK 17:12, 15 February 2011 (UTC)
  1. RE "Your generous provision of management services is greatly appreciated."

Dropping a comment card in the comment box is not the same thing as a hostile buyout.

  1. RE: Feel free ... many entries have the same problem.

I'll take that for a positive feedback

  1. Register/subculture. Feel free...

I'll take that for a positive feedback

  1. RE: A user formerly created new context-based classifications

Maybe I will feel inspired to do that...

  1. RE: Readability... I kind of liked the layout before but you are probably more used to the standard layout so I modified it. Yeah, I forgot the ~~~~ on that post, but then doesn't the bot do it anyway? The problem being, I suppose, server calls = more traffic and the possibility of an edit conflict. So anyway, yeah, of course, whatever.

Geof Bard 02:31, 16 February 2011 (UTC)Reply

Plurals of proper nouns

Per WT:RFD#Qu'ran, but also in many other entries: when proper nouns can have plurals, does that make them 'common nouns', as with Coke, Pepsi, Mars Bar, etc.? I think not; proper nouns can have plurals, and there's no need for a separate PoS just to accommodate countable use, be it singular or plural. Mglovesfun (talk) 11:39, 15 February 2011 (UTC)

The entry Pepsi has only one English noun sense: "A portion of Pepsi." I'd say that sense is indeed a common noun, despite existing in context of a specific entity. There is already a proper noun definition to cover a similar but different concept. --Daniel. 13:03, 15 February 2011 (UTC)Reply
@Daniel.: Agreed. —RuakhTALK 13:16, 15 February 2011 (UTC)Reply
@Mglovesfun: If you say "a ____" or "____s", then you're not really using "____" as a proper noun. But almost every proper noun can be pressed into common-noun service in this way. Sometimes this sort of use is conventionalized/lexicalized enough that it's worth including as a separate sense; sometimes it isn't. When it is, ===Noun=== is the place for it. I think Bible and Bible both warrant common-noun definitions; I would imagine that Qur'an probably does as well. In the case of Coke and Pepsi, I'd actually be more inclined to include the common nouns (which denote the substances) than proper nouns (which denote the companies — not very common — or the brands — less common yet). In the case of Mars Bar, similarly, I'd be more inclined to include the common noun Mars bar. —RuakhTALK 13:16, 15 February 2011 (UTC)
In cases like these, proper nouns always seem to refer to specific instances of things. For example, Coke isn't a single thing, it's a type of thing that happens to have a trademarked name. The proper noun itself refers to the type of thing it is, not to that specific thing I might have in my hand. —CodeCat 13:34, 15 February 2011 (UTC)Reply
Yes, in a Coke, Coke is a common noun, this is clear. But in a Churchill, Churchill is not a common noun, either it's a standard use of surnames, or it's a standard use of proper nouns (figure of speech, metaphor). Lmaltier 19:10, 15 February 2011 (UTC)Reply
I noted last year that we lack a lot of plurals for proper nouns such as given names and surnames. Things like Martins, Stevens, Eves (etc.) Mglovesfun (talk) 13:59, 17 February 2011 (UTC)Reply
It's because of WT:Votes/pl-2008-06/Plurals from proper nouns. --Makaokalani 12:54, 19 February 2011 (UTC)Reply

Legality of བྱང་ཆུབ་སེམས་དཔའ

The original section heading before renaming: "བྱང་ཆུབ་སེམས་དཔའ is this legal? Search that, look at what is there, discuss,if so inclined."

བྱང་ཆུབ་སེམས་དཔའ

It is Tibetan script but some people might want an English language definition. But maybe that is not "legal" here, I am the new kid on the block. Please advise.

Also, for that matter, http://en.wiktionary.org/wiki/M%C5%ABlamadhyamakak%C4%81rik%C4%81

Another example is http://en.wiktionary.org/wiki/%E0%A4%B6%E0%A5%82%E0%A4%A8%E0%A5%8D%E0%A4%AF%E0%A4%A4%E0%A4%BE

The reason I find these useful is in part because if people use google, it is very difficult to find an English definition without finding a translator, translating, and then dealing with a cruddy "definition" of sorts. Wiktionary is more direct and can do it better with live wiki style editing.

Frankly, I would like to see longer, more specific definitions result from lengthy debate, but the time for that will probably be in a few years.

Geof Bard 05:50, 16 February 2011 (UTC) Geof Bard 06:24, 16 February 2011 (UTC)Reply

Yes, all definitions are written in English. It's generally best to 'translate' when a translation exists and explain the content at the English page. But when no direct English translation exists, lengthier definitions are the norm. Mglovesfun (talk) 06:36, 16 February 2011 (UTC)Reply
RE: "it is best to 'translate' when a translation exists": not sure what you mean...
I have simplified the section heading. --Dan Polansky 07:26, 16 February 2011 (UTC)Reply
That is, when a single word translation exists, or a phrase, basically anything that exists or could exist as an English entry, it's better to link directly to it and let the English entry explain its meaning. When there is no equivalent in English, you should try to provide a concise explanation of what the word means. Mglovesfun (talk) 13:12, 17 February 2011 (UTC)Reply

Wiktionary:Votes/pl-2011-02/Romanian orthographic norms

Mglovesfun (talk) 12:57, 16 February 2011 (UTC)Reply

Putting proto-languages into subpages

Currently we list the entries for proto-languages in the appendix in a 'flat' format: Appendix:Proto-Germanic *handuz. I think it would be better if we changed this to a subpage: Appendix:Proto-Germanic/handuz. This would make templates work a lot better, because the actual entry name can be easily extracted from the full name. It also fits better with how we have entries in English that can't go in mainspace. —CodeCat 12:10, 17 February 2011 (UTC)Reply

I don't oppose it; {{proto}} would need to be changed, quite a minor change, though. Mglovesfun (talk) 13:50, 17 February 2011 (UTC)Reply
It looks like this is getting enough support to pass. Would any of you like to help me move the pages once the changes to the templates have been made? —CodeCat 17:54, 17 February 2011 (UTC)Reply
Pywikipediabot/movepages.py, FYI. This would be extremely easy to do. Any exceptions to watch out for? And should we keep the redirects? Nadando 19:20, 17 February 2011 (UTC)Reply
I don't think there are any exceptions. But a few of the pages are redirects... so we'll need to move them too. I think we should have redirects from the old names to the new ones too. —CodeCat 19:38, 17 February 2011 (UTC)Reply
I tried moving some of the pages by hand but it became really tedious after a while. But I don't know how to use that bot, either... —CodeCat 21:09, 17 February 2011 (UTC)Reply
If someone else hasn't jumped on it by then I'll do it next time a dump comes out. Nadando 04:52, 18 February 2011 (UTC)Reply
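The renaming discussed in this thread, and the claim that the entry name becomes easy to extract afterwards, could be sketched as plain string handling. This is a hedged illustration only: the function names are hypothetical, it assumes every flat title has the form "Appendix:Proto-X *entry", and it is not the Pywikipediabot script mentioned above.

```python
def flat_to_subpage(title: str) -> str:
    """Turn 'Appendix:Proto-Germanic *handuz' into 'Appendix:Proto-Germanic/handuz'.

    Titles that do not match the assumed flat pattern are returned unchanged.
    """
    head, sep, entry = title.partition(" *")
    if not sep or not head.startswith("Appendix:Proto-"):
        return title
    return f"{head}/{entry}"


def entry_name(subpage_title: str) -> str:
    """Return the part after the first slash, i.e. the reconstructed form itself."""
    _, sep, entry = subpage_title.partition("/")
    return entry if sep else subpage_title


if __name__ == "__main__":
    new_title = flat_to_subpage("Appendix:Proto-Germanic *handuz")
    print(new_title)
    print(entry_name(new_title))
```

With the subpage layout the entry name is simply everything after the slash, which is presumably what would let templates pick it up (for instance via something like the `{{SUBPAGENAME}}` magic word, assuming subpages are enabled for the Appendix namespace).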

Poll: Sorting representative entries of topical categories

More often than not, a topical category may contain a representative entry: For example, Category:History may contain history. When this happens, people usually try to emphasize the entry in question by placing it right at the start of the list of members of the topical category. For example, the first entry listed as a member of Category:Time is "time", and the first of Category:Sex is "sex". On the other hand, the entry "biology" follows the alphabetical order, by appearing below the "b" header of Category:Biology.

For what it's worth, "history" is not a member of Category:History, "geography" is not a member of Category:Geography and "chemistry" is not a member of Category:Chemistry, but they can be categorized anytime and fit into one of the two systems described above.

As noted above, that emphasis is not applied every time it is possible. As a result, I would like to know the opinions of other Wiktionarians about how to deal with this specific inconsistent aspect of topical categories.

Thank you for your attention and your input. --Daniel. 06:18, 18 February 2011 (UTC)Reply

Poll: Sorting representative entries of topical categories — Preference 1

When possible, I prefer representative entries organized among other entries normally.

If you agree 100% with this practice, or if you agree in essence with it, please vote for this option. Feel free to elaborate your thoughts.

  1. Support Daniel. 06:18, 18 February 2011 (UTC)Reply
    • I support this decision, for the following reasons:
    1. Emphasizing a representative entry by placing it as the first item of the category is redundant, at least in categories of English words, because the category already has (or should have) a good description linking to it. The description is where people will most certainly always be able to find links to representative entries.
    2. The entries to be emphasized should not always be members of the category in question.
      • For example, day and week are not members of Category:Days of the week, so a reader who searches for representative entries among the first few items of that category will fail to find them. On the other hand, both day and week can be listed within the description.
    3. There is inconsistency of where to sort the entries in question to emphasize them.
      • The entry şah is sorted into Category:ro:Chess by a space, but échecs is sorted into Category:fr:Chess by an asterisk, among countless other similar examples. (This is a minor issue that can be fixed relatively easily, regardless of the decisions of this poll.)
    4. It breaks the alphabetical order.
      • For example, if I am navigating through the "W" section of Category:Weather, I would like to see weather listed there. If "weather" is not listed there, then it gives the impression that it is absent from the category (i.e., either not defined yet or just uncategorized). Contrariwise, if one believes that "weather" should be the first item of the category, but it is actually listed below the "W" header, then the initial impression would be equally that the term is absent from the category.
      • As another example, it feels unnatural to see "psychohydraulics", "psychological refractory period", "psychologist", "psychometrician", "psychometrics" listed together, without "psychology" among them, as members of Category:Psychology.
  2. Support for above reasons, but also because:
    1. Making an exception to the alphabetical order adds complexity (see KISS principle).
    2. Users have no reason to read a category page when they want to read the page. Categories are useful mainly when you want to find words you don't know, or words you have forgotten, or words you cannot enter with your keyboard.
  3. I don't like topical categories, but if we're to have them then I prefer this option.​—msh210 (talk) 18:24, 21 February 2011 (UTC)Reply
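The effect of the two sorting practices, and the space-versus-asterisk inconsistency mentioned under reason 3, can be shown with a simplified model. It assumes category members are ordered by plain code-point comparison of their sort keys, which is a simplification of what the wiki software actually does; the function name `listing_order` and the sample members are made up for illustration.

```python
def listing_order(members):
    """Order (entry, sort_key) pairs the way a simplified category listing would.

    An entry without an explicit sort key is sorted under its own name.
    """
    def effective_key(member):
        entry, sort_key = member
        return sort_key if sort_key is not None else entry

    return [entry for entry, _ in sorted(members, key=effective_key)]


# Preference 1: the representative entry is sorted like any other member.
alphabetical = [("wind", None), ("weather", None), ("barometer", None)]

# Preference 2: the representative entry gets a sort key that precedes all letters.
emphasized = [("wind", None), ("weather", " "), ("barometer", None)]

if __name__ == "__main__":
    print(listing_order(alphabetical))
    print(listing_order(emphasized))
    # A space and an asterisk both sort before letters, but not at the same position.
    print(" " < "*" < "a")
```

In this model either a space or an asterisk moves the entry to the top, but mixing the two conventions would place space-keyed entries ahead of asterisk-keyed ones.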

Poll: Sorting representative entries of topical categories — Preference 2

When possible, I prefer representative entries listed as the first items of their respective topical categories.

If you agree 100% with this practice, or if you agree in essence with it, please vote for this option. Feel free to elaborate your thoughts.

  1. Support. This provides a basic means of making sure that the title of the category maintains some connection with the ordinary meaning of the words used. This seems especially important as so much of the naming and structure of our categories is the product of a narrow base of users who are not native speakers of English. Moreover, where the name of the category does not correspond to an entry name or a specifically sanctioned combination of entry names, the category should be renamed. DCDuring TALK 19:11, 21 February 2011 (UTC)Reply
    Do you have any example of category that should be renamed because its name does not correspond to an entry name or a specifically sanctioned combination of entry names? --Daniel. 19:57, 21 February 2011 (UTC)Reply
    Or, perhaps, do you have any example of topical category that was renamed for this reason? Or an example of a topical category that never existed, but would have to be renamed for this reason? --Daniel. 08:47, 22 February 2011 (UTC)Reply

Poll: Sorting representative entries of topical categories — Preference 3

I am indecisive or indifferent about where to sort representative entries of topical categories

Feel free to elaborate your thoughts.

  1. I don't think topic categories should include the "representative entries" at all. --Yair rand (talk) 20:06, 21 February 2011 (UTC)Reply
    How come?​—msh210 (talk) 20:15, 21 February 2011 (UTC)Reply

Poll: Sorting representative entries of topical categories — Discussion

Vote: Deprecating less-than symbol in etymologies

I have created Wiktionary:Votes/pl-2011-02/Deprecating less-than symbol in etymologies, as a follow-up on the recent poll #Poll: Etymology and the use of less-than symbol, February 2011.

The vote is planned to start on 22 February 2011 and last 14 days.

The vote is supposed to be a mere formality after the poll showed a supermajoritarian preference of one of the discussed options. What could theoretically be controversial about the vote is the proposed use of "From A, from B, from C" rather than "A from B from C" (comma rather than no comma). If this turns controversial before the start of the vote, I will remove this from the vote and leave it open instead. --Dan Polansky 09:03, 18 February 2011 (UTC)

Indefinitely block User:Daniel.

Discuss. Have fun y'all. Mglovesfun (talk) 22:54, 18 February 2011 (UTC)Reply

Reason? -- Gauss 23:04, 18 February 2011 (UTC)Reply
Disruptive edits; creating hundreds (maybe) of non-dictionary entries and categories to match. I've discussed it in private with various admins and to be honest, there's more support for it than I imagined. I think the thing is that the people involved don't realize how much support there is, so they keep quiet. Mglovesfun (talk) 23:08, 18 February 2011 (UTC)
And to think a vote on this not too long ago failed... -- Prince Kassad 23:15, 18 February 2011 (UTC)Reply
For the record, that vote was about desysopping and ended 5-9-6. -- Gauss 23:51, 18 February 2011 (UTC)Reply
So they keep quiet? What a pity. I happen to like to be praised once in a while. --Daniel. 23:40, 18 February 2011 (UTC)Reply
Daniel., I can't help the impression that you might serve the project better by adding Portuguese words rather than fictional dogs and other characters. (Just a thought because I discovered a few days ago to my surprise that Category:Polish nouns seems to contain more entries than Category:Portuguese nouns (5052 vs. 4277), while I would have expected a different relation.) Better in terms of global benefit, I mean. -- Gauss 23:52, 18 February 2011 (UTC)Reply
Right now, I can help by adding and attesting fictional characters mainly because this is a controversial subject that should be organized sooner or later, especially since there are active discussions everywhere about it. There are other ways to help too, but I really don't feel like adding 775 random Portuguese nouns right now just to keep up with Polish nouns. --Daniel. 05:33, 19 February 2011 (UTC)Reply
But wouldn't it be enough to add just one fictional dog and wait until that entry has been discussed? Or to create a vote? I am irritated by your strategy of flooding the project with controversial entries, as if you were wishing that nobody will have the energy to rfd/rfv them all. As if the other editors were rivals and you were trying to outfox them. --Makaokalani 12:51, 19 February 2011 (UTC)
I created and participated in numerous discussions and few votes about this subject. I kept fictional characters within Category:Fictional characters for the convenience of anyone who wants to see them, nominate them for attestation, etc. as a whole. Most of the proper nouns from that category were not originally created by me, though I searched for them and organized them into something as logical and consistent as I could for now. --Daniel. 13:02, 19 February 2011 (UTC)
  1. Oppose Daniel. 23:24, 18 February 2011 (UTC)Reply
"Indefinitely" seems like overkill. His edits are highly controversial, but they're not harassment, not quite vandalism, and probably not trolling. Even in such extraordinary numbers, I don't think they add up to an indef-blockable offense, at least without trying less severe remedies first. (That said, if this came to a vote, I might "support" it. An indef-block might be better than doing absolutely nothing.) —RuakhTALK 01:41, 19 February 2011 (UTC)Reply
Or you (Wiktionarians as a whole) could as a community decide to support my actions, or oppose my actions. Both forms of consensus already have been working well. --Daniel. 01:58, 19 February 2011 (UTC)Reply
As before, wild accusations leveled, but no evidence offered. —Stephen (Talk) 05:16, 19 February 2011 (UTC)Reply
I think you should first try to find support for a weaker statement. The statement you are now seeking support for is "Daniel. should be indefinitely blocked", a statement that is executable and has severe consequences for the user. Weaker statements, not immediately executable, include "Some behavior of Daniel. is unworthy of an admin", "Daniel. does many controversial edits", or "I wish Daniel. would change some of his behavior". I would sign all three statements.
To build at least a minute chance that a vote on infinite block could succeed, you probably need to build a case, which is a lot of work, I am afraid. --Dan Polansky 07:01, 19 February 2011 (UTC)Reply

Some entries created by Daniel are not words and should be deleted, e.g. Clifford the Big Red Dog or Hound of the Baskervilles, but I think the simple solution would be to clearly state the principle all words are accepted (whatever their meaning), but only words. (this includes phrases that belong to the vocabulary of the language and can be studied from a linguistic point of view). Lmaltier 13:28, 19 February 2011 (UTC)Reply

Among all the personal positions on criteria for inclusion of Wiktionary, yours is usually presented as very simple, yet it is subjective. Where does one draw the line between a word and a nonword?
Per our current policies, both entries "Clifford the Big Red Dog" and "Hound of the Baskervilles" would be idiomatic (they have characteristics that can't be inferred from the parts of their names), but dependent on their universes (because one has to understand their stories to recognize the characters), then their independence would be introduced by certain citations (and they're cited already). --Daniel. 13:54, 19 February 2011 (UTC)
While this argument sounds plausible, such activity is a waste of time (mainly yours, admittedly) because pretty much no practical person is going to look for such content in a dictionary - for someone finding a reference to a dog named Clifford and suspecting it to be a fictional character the information given is hardly useful in any respect (except perhaps the link to WP). On the other hand, if I wanted to know, say, the word for subtitle in whichever foreign language or related grammatical information on it then it would be natural to look for it in a dictionary. The success of this project, like the success of most others, depends on how useful it is, not how many entries it contains. In the long run, too many entries which just barely do not fail CFI are detrimental to WT's reputation, I guess. -- Gauss 14:42, 19 February 2011 (UTC)
First of all, while I don't consider this task a waste of time, I still feel surprised by the fact that the existence of entries of fictional concepts apparently is attributed mainly to me; while, by comparison with other editors, my main role was just of organizing these entries. (It would be like saying that I created most of the members of Category:Appearance just because I populated that category.)
Mickey Mouse, for example, contains a pronunciation section and various translations; I believe other languages would include declensions as well. Citations:Lassie contains a considerable amount of examples of usage of the name of a certain dog, among few explanations of its characteristics. Wikipedia is not a very good place to find these pieces of information, except in certain cases; in my experience, "certain cases" are translations only back to the original language when different from English and pronunciations only of entities who merit individual pages, and mentions mainly by "reliable sources" by linking back to them. While the pronunciation of "Mickey Mouse" can be derived from those of "Mickey" and "Mouse", that hardly would be the case for Pikachu or Hulk. --Daniel. 15:00, 19 February 2011 (UTC)Reply
The distinction between something which is a word and something which is not a word is not always obvious, but, in many cases, it is obvious and consensual. It's clear that Lassie and Clifford are words you can find in English texts, and that Clifford the Big Red Dog is a title, but is not considered as a word by anybody (not even you, I think), unlike New York. If it can be argued that CFI consider that this is a word, then that means CFI should be revised (they're much too complex anyway, this is why there are such discussions; see KISS principle). Lmaltier 15:18, 19 February 2011 (UTC)Reply
Our rules as currently interpreted don't seem to exclude Clifford the Big Red Dog. If you object to the rules, then you might want to propose amending them. IMO, prime candidates for reconsideration are WT:FICTION, WT:BRAND, the toponym vote, and the vote on attributive use/names of specific entities. The chickens have been coming home to roost -- and they look like Daniel.. I dislike such entries, especially with the encyclopedic definitions and misleading way in which they are purportedly cited, but those are quality issues, not rule violations. Daniel. might make a good attorney. He seems very much inclined to take our rules and push them to their limit. I think of the creation of the one-page per word appendices for words from fictional universes. DCDuring TALK 16:32, 19 February 2011 (UTC)
I'll try to propose a general umbrella for CFI, basic principles acceptable by everybody. Lmaltier 17:30, 19 February 2011 (UTC)Reply
Such an undertaking is a heroic effort, in the nature of drafting the Napoleonic Code. DCDuring TALK 18:37, 19 February 2011 (UTC)Reply
Nah! Trivial! All you have to do is unambiguously define the words all, word and language. The "all words in all languages" does the rest. SemperBlotto 18:42, 19 February 2011 (UTC)
I've already suggested defining "word" and "language" unambiguously, at least. :p This might help. --Daniel. 19:05, 19 February 2011 (UTC)Reply
There are far too many obstacles to getting a usable CFI. First, we have a sizable population of users who don't want proper nouns at all, or at least want to exclude most proper nouns that we currently do allow. Second, we have a much larger group of users who have "CFI shouldn't allow Pokemon (or similar)" (or, more accurately, "Wiktionary needs not to look silly") as a very high priority, regardless of whether it's possible to have a general CFI that disallows "unprofessional" entries without making overly specific additions. Third, Wiktionary's scope has a lot of areas, and many people are looking at CFI from the point of view of any of these individual areas. (Looking at it from a definition perspective and thinking whether anyone needs a definition for it, looking at it from a translating perspective and thinking whether translations could be useful, looking at it from a Rhymes perspective and thinking whether anyone would be surprised at seeing the word in a rhymes list.) Fourth, "attestation" is impossible in way too many situations, and we don't have any other way of verifying anything. Fifth, we want neologisms and we don't want protologisms, and no one has a cutoff point in mind. Sixth, when to include brand names, trademarks, company names, and anything else sufficiently commercial is a nightmare to figure out. (Anyone want to continue the list?) --Yair rand (talk) 05:29, 20 February 2011 (UTC)
So, what to do? Things can change (the project is still very young). I think that the best way is to come back to objectives and basic, founding, principles, to clarify them, and to agree on them. This would make a consensus on detailed criteria much less difficult. Lmaltier 11:23, 20 February 2011 (UTC)Reply

Thank you guys, for all comments in opposition to the original idea of this thread. (: --Daniel. 10:20, 21 February 2011 (UTC)Reply

While I think it is clear that, in general, most people don't want to block you, I don't think you should flippantly overlook the fact that there are a large number of contributors who are upset or annoyed to one degree or another with the way you behave on this project. In your behavior and attitude (according to what I have read from you) you seek out controversy for the sake of pushing boundaries. This is not something which I believe fosters community and I don't think it helps further the project.
I would hope that a smiley face and an utter disregard for the serious concern that other contributors have raised about your activity are not your response, but rather that you would evaluate what you are doing and how you are doing it and perhaps concede that picking fights is not the best way to change policy here. Don't consider this a victory for your methods; if you do, then my vote here is an unambiguous support for blocking. We need more people willing to propose a change, discuss a change, find a compromise that meets widespread support, and then effect that change; we need fewer people who take a position and refuse to change despite widespread concern over that change.
If I didn't think you had the potential to contribute in an unquestionably positive way I would just vote support here and not really qualify it. Your contributions to date however are riddled with lots of low- to no-value entries which will probably end up being deleted when your crusade to change the CFI by drowning it in borderline or over-the-line dross is ended, either by you getting tired of it or the community getting tired of it and ending it for you. Take a few minutes (or longer) to think about how best you can serve the goal of Wiktionary and whether it is in a flood of fictional characters or perhaps some other method which you can enjoy and is also widely accepted. - [The]DaveRoss 11:39, 21 February 2011 (UTC)Reply
I don't seek out controversy for the sake of pushing boundaries; I believe the closest to this I have said is: "I can help by adding and attesting fictional characters mainly because this is a controversial subject that should be organized sooner or later, especially since there are active discussions everywhere about it." I can also discuss about them; for example, right now. I can also create, delete or refine entries based on discussions, as I have done before.
Seriously, I don't think "flippant" behavior is an issue. So, I don't complain about this discussion being opened with "Discuss. Have fun y'all."
Surely I may have overlooked or disregarded something important; however, it would be more productive to talk about disagreements in plain English instead of replacing facts by the simple proposal of indefinitely blocking me. Anyone, feel free to criticize my actions, and even feel free to point out ways of improving my behavior as an editor, though I'd appreciate if you could demonstrate that you've read my replies, either by replying directly to them, or perhaps by merely not repeating a particular criticism if it has been proven wrong, and you aren't willing to counterargue.
If I may continue giving advice on how to interact with me (and possibly with other people), then let me point out that, when you disagree with me, I will most certainly disagree with you too. That seems a basic logical reasoning, though it apparently has been neglected by some people. When a Wiktionarian says they disagree with an entry, a policy, or an action, they're making a point. When they say "you're so stubborn, so you should get out of here indefinitely, then my opinion will prevail", they're just sounding ridiculous. For example, when I discussed about the creation of a category for individual people, then created Category:Individuals out of consensus and populated it, then a sense of Jesus Christ that was created in 2004 suddenly became the subject of an RFD discussion under the argument that it was part of "[seemingly] wilfully anti-community" "mass addition of entries contrary to consensus". That just didn't make sense in various levels.
I've been an editor since 2006 and an administrator since 2009; of course I've discussed my own and others' actions many times. While "we need fewer people who take a position and refuse to change despite widespread concern over that change" is subjective enough to be impractical, as a rule of thumb, I don't even qualify for that specific criticism because I like to open discussions with people to know their opinions; when I am engaged in editing entries, templates and categories, I often take suggestions from other people (though there are some non-implemented suggestions yet, so I apologize for making anyone wait for my help); when I discuss, I often comment on others' opinions.
I don't have any plans of irritating people or picking fights; on the contrary, more than once when people attacked me or my ability to be an editor, I discussed with them until reaching an amicable agreement. I guess I made two or three friends that way. However, I don't feel obliged to change my opinions to keep up with others' opinions, even when I agree to follow them.
Personally, I'd hardly call 152 entries of fictional characters a "flood". (Maybe it was just a "flood" when people saw streaks of related edits in their recent changes; or, perhaps, my definition of flood differs from theirs, especially if they possibly want a Wiktionary devoid of fictional characters.) And I hardly would attribute most of their existence to me, because... I didn't create most of them. If you compare the categories of fiction with the ones of mythology, you would most certainly notice that the latter is much messier; it seems no one took the time to clean the latter up, while I organized the former (though it isn't perfect yet, mainly due to inconsistencies of topical categories as a whole, so I've been creating discussions to gather consensus on this subject). I believe categorizing names of fictional characters into Category:Fictional characters is already "unquestionably positive", regardless of whether or not their members are subject of controversy. On the other hand, narrower categories such as Category:Fictional people and Category:Fictional dogs may be less likely to meet consensus, so naturally I've been discussing them too. --Daniel. 13:45, 21 February 2011 (UTC)

Dating rfts

I don't come that often, so this may be old news, but it seems to me that the column on the right of the Tea Room listing all rfts is new. I like it, but lots of these discussions are stale and have been archived, so that clicking on the word takes you to the page and then clicking on the rft just takes you back to the Tea Room without helping you find the discussion. If the datestamp of the rft were added automatically when the rft is added, that would make finding the original discussion easier. Not sure this is clear. Let me know if not.--Brett 16:33, 19 February 2011 (UTC)Reply

Tea Room discussions should be copied to the talk page for the entry, IMO. DCDuring TALK 16:50, 19 February 2011 (UTC)Reply
Not new, just newly overprominent: in the new version of MediaWiki it becomes a ginormous <pre> for some reason. I've found a hack to make it look better, which I applied the other day on WT:RFV and just now on WT:TR.
To find stale archived discussions, you can use [[Special:WhatLinksHere/foo]].
I agree with DCDuring, except that I think they should actually be moved to the entry's talk-pages rather than copied there. I don't see a need for a central Tea-room archive at all.
RuakhTALK 17:03, 19 February 2011 (UTC)Reply
Copying to the discussion page makes a lot of sense. Not sure if there should be a central archive of discussions. Duplication wouldn't be good, but at least a list of discussed words with links to the discussions on the individual words' pages.--Brett 17:07, 19 February 2011 (UTC)Reply
The archive is automatically there in history through the magic of Wiki software if we really need it. (But why would we?) Having the substance at the talk page is more useful to users, passive and contributing, than an archive of discussions whose existence is unknown to a user of the entry. I'd be happy enough with no archive, other than in the form of history. But links to the archive would also be acceptable if that is easier to do or, better, to automate than moving the content to appropriate talk pages. Perhaps someone could process the history from a full dump into something more accessible. More usefully perhaps someone could process the TR archive into talk page links (avoiding duplication, if possible). DCDuring TALK 18:20, 19 February 2011 (UTC)Reply
I've started on Wiktionary:Tea room/Archive 2010/March (so chosen because the oldest-tagged request for tea was on that page). One problem I've quickly run into is that many tea-room discussions aren't really about one individual entry, or even about a small, explicitly delineated clutch of entries. I'm archiving those that I can; those that I can't, I guess will remain on that page. —RuakhTALK 16:31, 21 February 2011 (UTC)Reply
Question: I've been using {{rft-archived}} for these, but it occurs to me that such an archive-box may not really be helpful for these in the way that they are for RFD and RFV discussions (where we want to preserve the exact discussion that led to whatever decision). I'm wondering if I should just move these to the talk-page with whatever header seems appropriate to me, and a hatnote explaining where the discussion started? I mean, if someone wants to reply (belatedly) to an old tea-room comment, there's no reason they shouldn't just reply in the normal threaded-discussion fashion, right? —RuakhTALK 00:23, 24 February 2011 (UTC)
Not IMO. It can, at least sometimes, make it appear as though those commenting in the TR discussion might have done so on the talkpage (or that later responses might have been made in the TR discussion), and so might be expected to respond to responses to what they said, and a "silence is acquiescence" argument might be made. (Not that that argument is particularly strong on a wiki anyway, but still.)​—msh210 (talk) 16:50, 24 February 2011 (UTC)Reply

Prefacing verbs with to in Template:en-verb

Currently, Template:en-verb automatically adds "to" before the base form of the verb. I suggest that we do away with this for the following reasons:

  1. The word to is no more part of the verb than the is part of the noun. Rather it is a subordinator that marks the following verb phrase (VP) as subordinate and infinitive, similar to the way that that marks the clause as subordinate in ...that he arrive on time.
  2. It is not the infinitive form of the verb that is being shown but rather the base form, which happens to be used in the infinitive, the subjunctive, and the imperative. These are all types of clause or VP, depending on your definition of clause, not verb forms. Of these, only the infinitive employs to.
  3. Even if it were the infinitive, and not the base form, there is the marked to infinitive (e.g., I want to go), and the bare infinitive (e.g., make me go).
  4. The standard among other English-language dictionaries is to present the verb without the to.

--Brett 17:04, 19 February 2011 (UTC)Reply

The principal reason for retaining the "to" is that it is yet another backstop against users mistaking a verb entry for another PoS and that some of our term template glosses retain "to" to make clear that the etymon is a verb, not a noun or other PoS. I suppose the backstop is redundant where inflection is shown. But we also have many entries for phrases that are headed by a verb that do not show any inflection. IOW, "to" in an entry or entry section serves as a marker that the entry/section concerns a verb. In some cases (eg, glosses) it is not redundant. The uses in the inflection line are redundant because of the PoS header and sometimes the content of the inflection line. DCDuring TALK 18:33, 19 February 2011 (UTC)Reply
It is in the inflection line that I'm suggesting it be removed. It is both redundant and misleading. I don't know what a "term template gloss" is, but I agree that you couldn't remove the to in cases like crawl: to move slowly unless you moved to full sentence explanations such as If something crawls, it moves slowly, a change I'm not advocating.--Brett 18:48, 19 February 2011 (UTC)Reply
Keep the 'to' and keep the 'a' in {{ro-verb}}. Mglovesfun (talk) 22:05, 20 February 2011 (UTC)Reply
Why?--Brett 00:58, 21 February 2011 (UTC)Reply
Just feels right. Mglovesfun (talk) 14:08, 21 February 2011 (UTC)Reply
In an inflection line, "to" seems like a pretty clear way of indicating that the following word is the base form of the verb. But is such an indicator necessary or helpful? I don't know. —RuakhTALK 14:11, 22 February 2011 (UTC)Reply
You have some good points there, including the precedent of English dictionaries. However, it seems that most of the points also justify the removal of "to" from the definition lines, which is not customary. I don't really know; interesting points, anyway. --Dan Polansky 14:42, 22 February 2011 (UTC)
The other thing is that our inflection line displays the principal parts for a verb, making it a bit like a kind of grammatical table rather than just a dictionary lemma. And while the to-form is not that common in dictionary headwords, it's very common in declension tables and the like. Ƿidsiþ 14:52, 22 February 2011 (UTC)
I agree that removal from the definition lines is not customary. Of the one-look dictionaries, only COBUILD learners and Wordnet do so. COBUILD uses complete sentences. Only Wordnet uses to-less infinitive clauses. But I don't agree that the arguments are the same. The word forms should show the word on its own. The definitions, however, are typically infinitive clauses: words used with other words. And in English, when we use infinitive clauses as subjects or complements of linking verbs, it is always marked with to (e.g., subj: To join a group is..., comp: ...is to join a group.) Notice that this is also an issue of mention vs use, where when you mention a word, the typical syntactic properties it has don't apply.
I also think the argument regarding declension tables is misleading. First of all, for the reasons I pointed to above, I think this practice is a mistake even in those tables that employ it. Secondly, again as I pointed out above, many dictionaries list all the forms. These could be equally said to resemble declension tables, but the major dictionaries don't use to here. Finally, more modern declension tables often don't list to. For example, the Azar English grammar series (Pearson Longman), one of the most popular ESL grammars in the world, simply gives the verb alone in its lists of irregular verbs.--Brett 16:28, 22 February 2011 (UTC)Reply

This issue doesn't seem to generate much interest, but I'll give it one more shot with another analogy. Putting to in front of the verb is like putting be in front of the present participle. Yes, it commonly appears there, but it isn't part of the word form, and it's inaccurate to include it.--Brett 15:52, 24 February 2011 (UTC)Reply

I find your reasoning rather convincing. --Dan Polansky 16:30, 24 February 2011 (UTC)Reply

OK, since there don't seem to be strong opinions about this, should I "be bold", change it, and see what happens, or should we put it to a vote?--Brett 13:19, 26 February 2011 (UTC)Reply

I've made the change and copied this discussion to the template's discussion area.--Brett 12:20, 28 February 2011 (UTC)Reply
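
For illustration only, the difference between the two styles of inflection line discussed in this thread can be sketched in Python. This is not the wikitext of Template:en-verb; the function name `headword_line`, its parameters, and the exact wording of the labels are assumptions made for the example.

```python
def headword_line(base, third, present_participle, past,
                  past_participle=None, with_to=False):
    """Build an English verb inflection line (hypothetical helper)."""
    # The only difference under discussion: is the base form prefixed with "to"?
    lemma = f"to {base}" if with_to else base
    # Collapse the two past labels when both forms are identical.
    if past_participle is None or past_participle == past:
        past_part = f"simple past and past participle {past}"
    else:
        past_part = f"simple past {past}, past participle {past_participle}"
    return (f"{lemma} (third-person singular simple present {third}, "
            f"present participle {present_participle}, {past_part})")


# Old style (with "to") versus the proposed bare base form.
print(headword_line("crawl", "crawls", "crawling", "crawled", with_to=True))
print(headword_line("crawl", "crawls", "crawling", "crawled"))
```

In this sketch the "to" is a display choice on the lemma alone; the other listed forms are unaffected either way.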

Basic principles for CFI

Here are a few simple principles to be included as the beginning of CFI, for comments and improvements before starting a vote. Of course, detailed CFI should be made consistent with these principles. They don't solve everything (detailed CFI are still required), but they should help much. They don't deal with inclusion criteria for languages: this is a very important, but distinct issue.

1. The objective of the Wiktionary project is to give people information required when they want to understand a language, or to speak (or write) in a language. More precisely, learners of a language may have to learn:

  • encyclopedic knowledge about the culture of native speakers
  • linguistic knowledge about:
    • the grammar of the language
    • the vocabulary of the language.

Encyclopedic knowledge is provided by Wikipedia, not by Wiktionary. Main space Wiktionary pages focus on the vocabulary part, i.e. lexical items (including set phrases) that learners may have to learn if they want to understand, to speak and to write the language as well as native speakers, even in very specialized domains (note that this also applies to obsolete words and dead languages, despite the way the rule is expressed).

2. Lexical items as defined above are called words for the purpose of these CFI. Wiktionary describes words from a linguistic point of view, i.e. it provides information of linguistic nature only, in addition to definitions (and possibly pictures). Definitions must be as succinct as possible, but clear, and sufficient to fully understand the meaning of the word. When present, pictures must help to understand the meaning of the word. In addition to words, Wiktionary also describes some other items of linguistic interest: affixes, characters, proverbs (list possibly to be completed).

3. All words of all languages are accepted for inclusion. All forms of all words may be included too (unlike other dictionaries), except in the case of phrases (restrictions may apply to this case for practical reasons).

4. A language section for a word may be created if and only if this word is used in this language. Used means that some people have used the word (not only mentioned it) and expected other people to understand it; this excludes typographic errors, misspellings, errors made by people learning the languages, etc. (there may be exceptions when interesting and useful linguistic information may be provided about such errors). This rule does not exclude words which are not fully naturalized in the language (e.g. an English section is allowed for autoroute, despite the fact that it's difficult to consider this word as an English word). When the existence of a word in a language is disputed, attestation rules apply. These attestation rules may be more or less strict for different kinds of words in order to prevent the creation of useless entries (e.g. brand names just coined by a newly-created company).

5. Rarity and recentness are not considered, the only important thing is that the word must exist in the language (actually, pages dedicated to rare words and to recent words are likely to be very useful to readers, because they are likely to be absent from other dictionaries).

6. When it is obvious to everybody (or almost everybody) that something is a word, and that this word is used in the language, the creation of a section of this language for this word is accepted (detailed CFI are not applicable).

7. When it is obvious to everybody (or almost everybody) that something is used in the language, but cannot be considered as a word as defined above (element of the vocabulary of the language), this item is not accepted (detailed CFI are not applicable).

8. In other, less obvious, cases, detailed CFI explain in a more detailed way how the present basic principles should be applied in specific cases.

Lmaltier 22:04, 19 February 2011 (UTC)Reply

Why? It seems to push principles like "all words of all languages" that recently failed a vote, and pointless obviousness principles--if it's obviously a word, then cite it; if everyone agrees it's obviously not a word (and I'm not sure what you mean there), then the RfD should be short. And I really don't know why it's hard to consider autoroute an English word. "the growth of the suburbs, despite its uneven internal distribution in the period 1951-75, was intimately linked to the growth of the network of autoroutes and bridges surrounding Montreal." is just one Google Book hit that's clearly using the word in English. Combine that with a general oververbosity and unnecessity, and I don't see the point.--Prosfilaes 04:17, 20 February 2011 (UTC)Reply
I may be wrong about autoroute, please change the example if needed. But of course, the word is used in English, this is why there is an English section. I just wanted to insist on the fact that the question should not be Is this an English word? but Is this word used in English?.
I don't know what vote you refer to. This principle all words, all languages is not new, it's present from the origin of the project, and it's the first sentence of CFI. The goal is to clarify what it means.
Something new is that people need a dictionary not only to be able to understand, but also to speak or write. This fact has important consequences on which phrases are acceptable (you cannot guess that something is the set phrase to be used to express what you want if you have not learned this set phrase and you've never heard it).
But the main point is that criteria can never be formulated perfectly. This is the reason why there are so many discussions based on the letter of rules, but the spirit of rules, basic principles, common sense, are forgotten. When basic principles are sufficient to take a decision, there is no need to enter into details. In other terms, basic, sound, principles are more important than imperfect detailed rules, which are required only in some cases. You might compare what I propose to a Constitution and detailed CFI to laws detailing this Constitution. Lmaltier 09:23, 20 February 2011 (UTC)Reply
There was an attempt at Wiktionary:Purpose to try and define the reason "why" we should have such rules — it's all very well stating them (and I mainly agree with what you've put, though think they are far from "basic" principles :) but if there's no "why" then it's completely open to debate.
I like the distinction between cultural-knowledge and linguistic knowledge. It may be the case that this can be used to decide whether or not something is a "word". Given a sentence like "Go past Draycot Foliat and take the first on the left", I don't need to ask "What does Draycot Foliat mean?", it's clearly just a name for something; rather I would ask "What is known about Draycot Foliat?" (lest I fail to recognize it) — a question Wikipedia is more suited to answer. On the other hand, if I have a sentence like "Draycot Foliat is a tithing" — I need to ask "What does tithing mean?", I could also ask Wikipedia "What is known about tithings?" if I wanted more in-depth discussion.
I also like the OED's FAQ [6], and [7] is also interesting. Conrad.Irwin 02:48, 21 February 2011 (UTC)Reply
I was assuming that the "why" was explained clearly enough. You are right, this is fundamental. Lmaltier 07:07, 21 February 2011 (UTC)Reply

German CFI

Input needed
This discussion needs further input in order to be successfully closed. Please take a look!

I've started a draft here. The problem with German is that it writes most words together, so the community is split on whether such words can be technically sum of parts. These rules are supposed to clarify the situation. Feel free to suggest things. -- Prince Kassad 00:04, 20 February 2011 (UTC)Reply

I believe that we should accept all German words (i.e. everything considered as a word in German) provided that their use can be attested. Some time ago, there have been many comments on the longest German word used in official texts. These comments clearly show that even very long compound words are considered as words in German (but this kind of word is exceptional). The attestation constraint is sufficient to limit the inclusion of such words. Lmaltier 09:32, 20 February 2011 (UTC)Reply
Well, we don't think the same for any of the East Asian languages (Chinese etc.) so why should German get a special treatment? -- Prince Kassad 09:53, 20 February 2011 (UTC)Reply
Is this the same case? I want to accept words that are considered as words by people speaking German. Are Chinese... "words" you refer to considered as words by Chinese? Lmaltier 09:57, 20 February 2011 (UTC)Reply
Do note, in any case, that it is not ultimately my goal to forbid legitimately useful entries which people will want to look up. Instead, my intent is to prevent editors from adding entries like neuntausendneunhundertneunundneunzig, which are not useful to anybody and just waste time and resources which could be better used elsewhere. Attestation criteria are insufficient for restricting these because of the large available German corpus, which allows even the most improbable compounds to be cited (including this one). -- Prince Kassad 14:37, 20 February 2011 (UTC)
I agree on this one: nine thousand nine hundred and ninety-nine is equally a waste of time and resources. So this is not necessarily a question of German CFI only. --Hekaheka 07:56, 22 February 2011 (UTC)
This is a very special case. I agree that this is not very useful, and I think that such entries (in German or in other languages) should not be created by bot (creating billions of such entries by bot would be a waste of resource, you are right). They should be created manually, and only with several real citations. Don't worry, nobody will be willing to try to find millions of citations and to create manually millions of such entries. But such entries may be useful nonetheless in some cases, especially to people not knowing the language at all and trying to decode a text (e.g. a message they received), this is why they should not be forbidden. Lmaltier 14:56, 20 February 2011 (UTC)Reply
So is there now going to be a rule that someone has to provide citations for German words before entering them? A polysynthetic language would be hopeless to document word-by-word; every new document would have new words. Including every word we could find three times would produce an arbitrary mess. In certain languages, a reader has to be assumed to be able to build and dissect SoP words with us just providing the Ps. It seems like German is one of those languages.--Prosfilaes 18:47, 20 February 2011 (UTC)Reply
You reason as somebody speaking English. de.wikt does not reason like you, it includes words such as Tanzschule. Yes, these words should be accepted, but they should be attested, just like any word. Lmaltier 22:01, 20 February 2011 (UTC)
How is it that my reasoning is connected to my language? And what about polysynthetic languages where virtually every word is nonce? As a practical matter, very few of our words are actually attested; the requirement in practice is that they be attestable, not attested.--Prosfilaes 01:57, 21 February 2011 (UTC)Reply
I meant that, sometimes, the equivalent phrases in English are not considered as words, but that does not mean that German words are not words. You are right, languages with many words are likely to produce many pages here. When I write "attested", I mean "words that we know have been used". Lmaltier 07:11, 21 February 2011 (UTC)
If having a million pages still doesn't let you reliably look up words, then there's a real question of what the value of having a million pages is, if we need to approach the problem another way. I don't see why we should produce a bunch of pages by hand, if we can produce equally high quality pages by bot, and if entries like neuntausendneunhundertneunundneunzig are acceptable, then we should produce them by bot, which should have no problem verifying the existence of three cites before creating the page.--Prosfilaes 07:40, 21 February 2011 (UTC)Reply
  • I see no reason to exclude any orthographic word in German, no matter how sum of parts (obviously if it can be attested). Why worry about it? What "resources" does it waste exactly? It's a bit like saying let's exclude all English plurals in -s because their meaning is obvious. I don't think it's the same as East Asian languages, just because words are not spaced at all there; in German there are spaces between (orthographic) words and so SOP compounds are felt more as distinct words. They certainly are by non-German speakers. Ƿidsiþ 09:29, 21 February 2011 (UTC)
I feel roughly the same. We all give our time voluntarily here; I suppose editors could be viewed as wasting their own time by creating these. I agree it would be like not allowing things like chlorineless (discussed in the tea room) or researchers because they are obvious from the sum of their parts. The principle goes beyond German; especially to Dutch but also to English, as pairs of short words often become single-word compounds in English like faceguard. So we would logically have to refuse some of these too. Mglovesfun (talk) 14:05, 21 February 2011 (UTC)Reply
I agree with Widsith's stance, and, mostly, with his reasoning.​—msh210 (talk) 16:13, 21 February 2011 (UTC)Reply
I agree with Ƿidsiþ. - -sche 00:16, 22 February 2011 (UTC)Reply
The only risk is for numbers or other infinite series: what would you think if a bot begins to create billions of entries for Italian and German numbers, and if Random entry returns a number for 99.999999 % of clicks? This is why I propose to exclude bots and to require the presence of several citations for entries belonging to infinite series. Lmaltier 18:46, 21 February 2011 (UTC)Reply
I don't see why we should do by hand what could be done by bot. A bot can certainly check Google Books for sufficient hits on a word. If a word is valid, and a bot can make a suitable entry, then let it make an entry. If there are several billion attestable words for Italian and German numbers, then there should be an entry for each of those words, and better done by bot than hand.--Prosfilaes 00:54, 22 February 2011 (UTC)Reply
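The rule under discussion (a bot creates an entry only after confirming enough citations) can be sketched roughly as below. This is only an illustration: `count_citations` is a hypothetical callable standing in for whatever lookup a real bot would use, the threshold of three follows the "three cites" mentioned above, and the counts in the example are invented placeholders, not real search results.

```python
from typing import Callable, Iterable, List

REQUIRED_CITATIONS = 3  # the "three cites" threshold mentioned in the thread


def attestable(words: Iterable[str],
               count_citations: Callable[[str], int],
               required: int = REQUIRED_CITATIONS) -> List[str]:
    """Keep only the words whose citation count reaches the threshold.

    `count_citations` is a hypothetical stand-in: a real bot would have to
    supply its own lookup and would still need human review of the hits.
    """
    return [word for word in words if count_citations(word) >= required]


# Invented counts, for illustration only.
fake_counts = {"Tanzschule": 12, "Beispielwort": 1, "Erfundenwort": 0}
candidates = attestable(fake_counts, lambda w: fake_counts.get(w, 0))
```

Passing the lookup in as a parameter keeps the decision rule separate from the question of where citations come from, which is the part the thread disagrees about.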
If we want "Random entry" to return an English word / a "real" word, why do we allow so many conjugated forms of Spanish and Italian verbs? Their meanings are easy to guess, verb stem + inflection suffix. - -sche 23:32, 23 February 2011 (UTC)Reply
(Disclaimer: I don't speak German, so all of my knowledge of it is secondhand. Some of this comment may well be misinformed, and it will be shocking if every detail of it is exactly correct. Hopefully folks who know better will correct my errors.) Overall I think Prince Kassad's proposed CFI are the way to go, but German seems to be a difficult case, because on the one hand:
  • It seems obvious that (pace some commenters above) something like "neuntausendneunhundertneunundneunzig" is not a single word. Anyone who knows any German at all will instantly recognize that it's a compound that happens to be written solid.
  • Old-fashioned writing, of the sort that uses long s (ſ) medially, clearly recognizes that there is a distinction between this sort of compound and a true word, in that short S (s) is used at the ends of words even in the middle of compounds.
  • Due perhaps to influence from English, there's a tendency among some younger speakers to write such compounds with spaces between the words (though not so strong a tendency as in some of the North Germanic languages).
  • A dictionary can only do so much to help readers figure out where one word ends and the next begins. In the end, any sort of comprehensive help can only be achieved by true software using a dictionary as its back-end database, not by a look-up dictionary alone.
while on the other hand:
  • Even someone who knows some German may not be able to tell where the word-breaks are. Since our target audience is people who speak better English than German, it seems a bit unhelpful to say that a certain word-sequence is NISOP and users have to go elsewhere for help identifying the P.
  • As Widsith points out, German is not like East Asian writing where the orthography is syllable-oriented rather than word-oriented. These compounds are not "words", but they are "orthographic words" in a writing system where that concept is fairly meaningful.
  • Such phrases are at least constituents — they're noun phrases — so are at least conceivably possible to create entries for. (I'm thinking here, by contrast, of Hebrew, where several prepositions and conjunctions are proclitics, attaching to whatever word happens to follow. At least the German compounds are syntactic and semantic phrases rather than meaningless word-sequences.)
  • Compounds often have internal modifications at the word boundaries. But of course, such internal modifications should be documented at the entries for the individual words, so maybe that's a non-issue.
All told, as I said, I think Prince Kassad's proposed CFI are the best approach; but I definitely see where other commenters are coming from!
RuakhTALK 01:37, 22 February 2011 (UTC)Reply
Just to clarify: German compounds are words rather than phrases (not just in an orthographical but also in a linguistic sense). The whole problem is that German allows for arbitrary combinations of concepts to be incorporated into one (compound) word. Just because many of them will be attestable doesn't mean that they're actually "lexicalized" words of German -- it's perfectly possible that most of those will just be ad-hoc creations that happen to have been created multiple times. I believe this is what makes German different from most polysynthetic languages where complex letter strings, even though written without spaces in between, are not considered words, but phrases (I might be wrong here, though). In German orthographic words and actual words almost always coincide. So, since there are potentially arbitrary amounts of unlexicalized words, I doubt that the "all words" rule can be successfully applied to German. Where to draw the line between includable and not includable words, though, unfortunately I don't know. Longtrend 20:52, 22 February 2011 (UTC)Reply
Arbitrary combinations may be allowed, but I don't think there are so many attested compound words meeting attestation criteria. Lmaltier 21:26, 22 February 2011 (UTC)Reply

I am fairly certain that I do not want this text to become part of any official CFI; it should be removed from Wiktionary:About German, as it is supported only by a minority of editors, from what I can see:

Criteria for inclusion are currently strongly debated among the community. The issue is on which compound words are legitimately useful for a dictionary such as ours.

In English language, anything written together is automatically permissible, but this rule does not adapt well to the spelling conventions of the German language. Instead, inclusion of compound words should be based on a number of criteria. These are not binding, but fulfillment or non-fulfillment of these criteria can determine the worthiness of any compound word for being included:

  1. If the meaning of the compound word is not obvious by just looking at the individual compound members, the entry should almost certainly be included. This should account the knowledge expected from both native German speakers and to a limited extent learners of German language, since these will be using the dictionary the most. For example, the meaning of the German term Baumschule cannot be guessed by knowing the two words Baum and Schule.
  2. Certain terms have specific definitions in specialized dictionaries. These should therefore be included in this dictionary. This applies to terms such as Eigentumswohnung.
  3. The following changes to a word do not affect their inclusion in any way:
    • Usage of a filler phoneme like e or s, like in Bilderbuch
    • Usage of merely the stem of a compound member, like in Wanderweg

I disagree with the core of what the text says: with treating, when determining idiomaticity, all German closed compounds the same way as English open compounds. In particular, I want to see "Kopfschmerz" included, rather than being treated as English "head ache" would be if there were no "headache". --Dan Polansky 09:26, 23 February 2011 (UTC)

On de.Wikt, we try to exclude "Spontanbildungen" (spontaneous constructions, nonce words) like Scheißkind, but we include compounds, because they are words. It is unfriendly to non-fluent speakers to delete words only because they are compounds. Consider Dachterrasse: is it Dachter + Rasse, or Dach + Terrasse? Consider Wachstube: a non-fluent speaker could try splitting it W + Achstube and see that this was not an intelligible split. She could then split it Wa + Chstube, then Wac + Hstube, both also visibly unintelligible. She could split it Wach (guard) + Stube (room) and, because this is intelligible, never understand that the word was in her context however truly "tube of wax" (Wachs + Tube). - -sche 23:32, 23 February 2011 (UTC)Reply
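The splitting ambiguity described above can be made concrete with a small sketch. It assumes a toy lexicon (the `LEXICON` set below is invented for the example and is nowhere near a real word list) and simply enumerates every way to cut a compound into known parts; it does not handle filler letters such as -s- or -e-.

```python
from typing import List, Set


def split_compound(word: str, lexicon: Set[str], min_len: int = 2) -> List[List[str]]:
    """Return every segmentation of `word` whose parts are all in `lexicon`."""
    word = word.lower()
    results: List[List[str]] = []

    def walk(start: int, parts: List[str]) -> None:
        if start == len(word):
            results.append(parts)
            return
        # Try every candidate piece beginning at `start`.
        for end in range(start + min_len, len(word) + 1):
            piece = word[start:end]
            if piece in lexicon:
                walk(end, parts + [piece])

    walk(0, [])
    return results


# Toy lexicon, assumed for illustration only.
LEXICON = {"wach", "wachs", "stube", "tube", "dach", "terrasse"}

wachstube = split_compound("Wachstube", LEXICON)
dachterrasse = split_compound("Dachterrasse", LEXICON)
```

With this lexicon, Wachstube has two valid segmentations and Dachterrasse has one, which is the point being made: a reader who stops at the first intelligible split may never see the other reading.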

Of course words like Wachstube which can be composed differently would be included. (Vollzug, with two vastly different pronunciations and meanings, is another one of these.) I just need to amend the proposed rules to reflect this. -- Prince Kassad 23:46, 23 February 2011 (UTC)Reply
I above all think that you should not be posting proposed rules to a page that tries to track community consensus: Wiktionary:About German. You would do well to move your proposal somewhere else. --Dan Polansky 09:25, 24 February 2011 (UTC)Reply
I have removed the offending section. For the sake of further discussion of the proposal, the section is still available in this revision. You could develop further proposals at User:Prince Kassad/German CFI or the like. --Dan Polansky 09:28, 24 February 2011 (UTC)Reply
For a further track of community consensus on the subject, see also Talk:Zirkusschule to which a RFD on the term is going to be archived. --Dan Polansky 09:49, 24 February 2011 (UTC)Reply
Model dictionaries: I have found a list of Duden entries on German Wiktionary: de:Benutzer:Ivadon/Duden/5. It includes such entries as "Himbeergeschmack", "Himbeerlimonade", "Himbeermarmelade", "kälteempfindlich" and "kokainsüchtig", all of which are semantic sums of parts with respect to the words contained in the compounds. --Dan Polansky 10:04, 24 February 2011 (UTC)Reply
@-sche: Nobody proposes to exclude compounds altogether, I think. At least opaque compounds such as Angsthase (coward, literally "fear-rabbit") would of course be included. Longtrend 21:32, 24 February 2011 (UTC)Reply

Why not cling to the attestability criteria? They would prevent made-up compounds from being entered and would keep guesswork out of interpreting German words and figuring out their English equivalents. After all, the number of compound nouns is not astronomical, and many of them would fulfil the "set term" criterion. See also the RfD discussion for Zirkusschule. --Hekaheka 19:44, 24 February 2011 (UTC)

The attestability criteria sound very reasonable at first, however this would still include words that I'm sure nobody would want to include. I just made up the word Rechtsfenster, transparently meaning "window on the right side". Would anyone want to see that included as an entry? I can't imagine that. Checking Google Books, though, it turns out that this word seems to be attestable indeed. It gets 5 hits, all extremely context-dependent ad-hoc creations that are purely results of a totally productive word formation mechanism in German and thus not dictionary-worthy at all. Longtrend 21:32, 24 February 2011 (UTC)Reply
I still agree with Ƿidsiþ (09:29, 21 February 2011) and Mglovesfun (14:05, 21 February 2011). Wiktionary's remit is "all words in all languages". A child learning to speak German natively or an adult learning German as an additional language may look up a long word. Wiktionary is not paper, so these words do not take up space that we could use for other entries — we can have all entries. What harm, then, does it do to allow all attestable words, including compounds? Some have expressed fear that the "Random entry" function will return only German numbers if someone adds a flood of these: but already the "Random entry" function returns only Spanish and Italian verbs, which are perhaps all "sum of parts" by the logic expressed above, because they are almost all clearly "verb stem + inflection suffix", but which are in any case no better random words than German numbers. And if we do want to forbid numbers, numbers are only a subset of all compound words, and should be discussed specifically (and perhaps without regard to language: Hekaheka points out English can have as many numbers as German). Why forbid general compound words like Tanzschule? Why forbid Rechtsfenster? Some have suggested that compounds are not words, but as Dan Polansky hints, the authorities on the German language disagree: even [the unattestable] Rindfleischetikettierungsüberwachungsaufgabenübertragungsgesetz is (possibly) the longest word (das längste Wort) — it is a word — it is not "the longest words". If we are to make an exception to our policy of "all words in all languages" to forbid compounds, I would think there must be some exceptional reason; Tanzschule must do us some exceptional harm. What is that reason, what is that harm? - -sche 09:55, 7 March 2011 (UTC)Reply
Short answer to your last question: I wouldn't exclude Tanzschule, since it's a conventionalized word. I would exclude Rechtsfenster, since it's nothing but an ad-hoc creation.
Do we have any policy about polysynthetic languages? I think that the "all words in all languages" rule, even if we assume that it works for German, will fail for those languages, unless you want to have entries for whole propositions that happen to be incorporated into one word. Not even the attestability proposal will work here since it will be hard to attest words from unwritten languages (which is true for many polysynthetic languages). IMHO, the argument that non-native speakers of German, when they see the whole compound word, are unable to comprehend which parts it's composed of, doesn't hold water either: Even now we assume the reader to know that the single parts of a multi-word term (e.g. put one's money where one's mouth is), i.e. the words, are meaningless parts of that term rather than contributing compositionally to the meaning of the term. Longtrend 23:23, 7 March 2011 (UTC)Reply

Deprecating Category:English plurals and the like

On WT:RFDO#Category:Catalan noun forms it has been suggested to always use noun forms rather than plurals. See also Category talk:Plurals by language. One problem with plurals is that not only nouns can have plurals - this is true even of English, but even more so of many other languages such as French (adjective plurals) and even more so those with case systems like Russian, Latin where there are different kinds of plurals - nominative, genitive, instrumental, etc. Mglovesfun (talk) 12:13, 22 February 2011 (UTC)Reply

I suggest renaming from Category:English plurals to Category:English noun plural forms. --Daniel. 12:21, 22 February 2011 (UTC)Reply
Yes, not to disagree but to comment, that wouldn't align English with Category:Latin nouns forms[sic] and the other languages that use noun forms, not 'noun plural forms'. Mglovesfun (talk) 15:34, 23 February 2011 (UTC)Reply
(I think you mean Category:Latin noun forms :p)
I'm not sure what the current position of other users is on detailed categories of inflections, but Category:English noun plural forms would at least align with Category:German verb plural forms.
I, personally, oppose the existence of "Category:English plurals" per Mglovesfun's reasons and support the creation of either "Category:English noun plural forms" or "Category:English noun forms". --Daniel. 15:44, 23 February 2011 (UTC)Reply

Translations for inflected forms

Do we want translations for cars, fights, paints etc.? If so, why? Mglovesfun (talk) 15:08, 23 February 2011 (UTC)Reply

In my opinion, translations for inflected forms are not worthwhile, because, among other reasons... There isn't a one-to-one correspondence between English and foreign inflections. For example, a verb in simple past or past participle (among other inflections) may be translated into various cases, grammatical persons, etc. of other languages. It would not be very helpful to state that stopped can be translated into Portuguese parado, parada, parados, paradas, parei, paraste, parou, paramos, parámos, parastes, etc. To indicate the correct translation among them for each context, we would have to duplicate portions of inflection tables, that I expect to be more readable at entries of lemmas. --Daniel. 15:32, 23 February 2011 (UTC)Reply
See User talk:Stephen G. Brown#Translations of inflected forms and page history of "translations". Mglovesfun (talk) 15:36, 23 February 2011 (UTC) IFYPFY.​—msh210 (talk) 15:56, 23 February 2011 (UTC)Reply
I fixed the erroneous gender of the Portuguese translation of translations. Another reason for preferring the absence of translations for inflections is to avoid keeping track of the same information in various places like this (though "keeping track of the same information in various places" is typically a task for templates).
Also, that translation table is incomplete, maybe deliberately. Where are the declensions of German and Swedish within the English entry? I currently have to go to Übersetzungen, then to Übersetzung to find a German plural genitive from translations. --Daniel. 15:55, 23 February 2011 (UTC)Reply
What Daniel said (15:32, 23 February 2011 (UTC)), and to some extent what Ruakh said (15:45, 23 February 2011 (UTC)). I used to think having translations on a form-of entry was a good idea, but have come around. --msh210 (talk) 16:11, 23 February 2011 (UTC)
We should be translating lexemes, not wordforms. "Car" and "cars" are a single lexeme, and [[car]] is the appropriate place to provide translations for it. —RuakhTALK 15:45, 23 February 2011 (UTC)Reply
Exactly. Ƿidsiþ 15:57, 23 February 2011 (UTC)Reply
You could go further with the non-lemma forms; for example play could also mean "I play, you play, we play, they play, or play!", in which case possible French translations would be joue, joues, jouent and jouez. For -ir and -re verbs it could be more, as you could include subjunctive forms. Mglovesfun (talk) 15:59, 23 February 2011 (UTC)
Gets more complicated than that; traductions is the French translation for one sense, but also translations for the mathematical sense! So you'd need to change the definition from {{plural of|translation}} to "more than one end result of translating text." Mglovesfun (talk) 11:52, 24 February 2011 (UTC)

When stopped needs translations (and synonyms), maybe this is a sign that it has gone from being just a past participle (a verb form) and really become an adjective in its own right. --LA2 13:13, 24 February 2011 (UTC)Reply

My examples of translations of stopped were Portuguese verb forms, thus translations of an English verb form. --Daniel. 13:17, 24 February 2011 (UTC)Reply

I split the entry translations after User:Stephen G. Brown reverted my initial removal, stating that there was "no consensus" for me to do so. Which makes me wonder what this discussion shows, just nothing? Mglovesfun (talk) 12:07, 28 February 2011 (UTC)Reply

Ergo Wiktionary:Votes/pl-2011-02/Disallowing translations for English inflected forms.
Nobody is ever obligated to translate anything. We have always allowed people to translate the word, sense, and form they wanted and this has worked out well. There are very few translations of inflected forms here because people generally do not want to expend their energy on them. There are a few; for example, more, most, parents, including, bombing, interesting, barring, painting, dwelling, Ten Commandments, tired, accused, men, broken, Tamil Tigers, translations, spades, clubs, diamonds, hearts, Killing Fields, and people. These entries are all the more valuable as a result of their translations; some languages like French and Spanish have simple and predictable inflected forms, other languages are not so easy. The few inflected-form entries that we have where there are translations are the only place on the Internet where that stuff can be found in most cases.
I have a lot more faith in the judgment and experience of our linguists who decide for themselves which words, senses and forms they want to address than I have in Mglovesfun who just enjoys deleting useful pages and has no sense of their value. Deleting well-made pages that some people use or need always has bad effects, either on the users or on the contributors or both, even though the effects of lost work and resources are not easy to discern; on the other hand, deleting correct and well-formatted pages never brings any advantage at all, unless you count the thrill of deleting good information and hard work. --Stephen (Talk) 11:45, 7 March 2011 (UTC)
Most of the entries you cite have plural-only senses; you can't have 'a Ten Commandment' so that's not an inflected form, it's a plural only form. Mglovesfun (talk) 12:00, 7 March 2011 (UTC)Reply

Lorem ipsum

Are all the words of lorem ipsum suitable as dictionary entries? If so, why? If not, why not?

Examples: lorem ipsum dolor sit amet consectetuer adipiscing elit aenean commodo ligula eget

--Daniel. 19:22, 23 February 2011 (UTC)Reply

The community has already decided that they are not: see the old discussion.​—msh210 (talk) 19:28, 23 February 2011 (UTC)Reply
Some of them are words in languages, some of them seem not to be, as lorem ipsum isn't actually written in a 'language' as such. Mglovesfun (talk) 11:46, 24 February 2011 (UTC)Reply
Well, of course, some of them are actual words. I assume Daniel was asking about those that are only in the lorem ipsum, or having a sense line because they're in it.​—msh210 (talk) 16:45, 24 February 2011 (UTC)Reply
That's what I meant too; I just couldn't be bothered coming back to expand on it. Mglovesfun (talk) 16:48, 24 February 2011 (UTC)Reply

Norwegian headings

This was also asked on no:Wiktionary:Tinget#Norsk i en.wiktionary

We have many templates that make anchored links to entries based on the lang= parameter, e.g. {{form of|...|uno|lang=it}} will create a link to [[uno#Italian]]. But Norwegian (lang=no) has two standard orthographies, Bokmål (lang=nb) and Nynorsk (lang=nn), and three different headings. ==Norwegian== is used for sections describing words that are common to both Bokmål and Nynorsk, whereas ==Norwegian Bokmål== and ==Norwegian Nynorsk== are used for sections describing words that are unique to one variant. This creates a problem for form entries like husi, which is unique to Nynorsk, appears under a ==Norwegian Nynorsk== heading and links back to the main entry with {{inflection of|hus|lang=nn}}. The problem is that the main entry is common to both variants and thus appears under ==Norwegian==. Therefore, the template call needs to be {{inflection of|[[hus#Norwegian|hus]]|lang=nn}}.
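The linking problem described above amounts to a fallback rule: prefer the heading for the requested variant, and fall back to the common ==Norwegian== heading when the target page lacks it. A rough sketch of that rule follows, written as it might appear in a bot or checker. The function and table names are made up for this illustration, and this is not how the existing templates work, since a template cannot inspect the target page's headings.

```python
from typing import Set

# Language code -> L2 heading, as described in the thread.
HEADINGS = {
    "no": "Norwegian",
    "nb": "Norwegian Bokmål",
    "nn": "Norwegian Nynorsk",
}
# Variant codes that may fall back to the common heading (assumed rule).
FALLBACK = {"nb": "no", "nn": "no"}


def link_target(page: str, lang: str, headings_on_page: Set[str]) -> str:
    """Pick the section anchor to link to for `lang` on `page`."""
    wanted = HEADINGS[lang]
    if wanted in headings_on_page:
        return f"{page}#{wanted}"
    fallback = FALLBACK.get(lang)
    if fallback is not None and HEADINGS[fallback] in headings_on_page:
        return f"{page}#{HEADINGS[fallback]}"
    return page  # no suitable section: link to the page itself


# husi (Nynorsk only) linking back to hus, which has only ==Norwegian==.
target = link_target("hus", "nn", {"English", "Norwegian"})
```

Here `target` is the link that currently has to be written out by hand as [[hus#Norwegian|hus]].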

As an experiment, I introduced a new template {{Norwegian}} that can be used for headings, always creating HTML anchors for all three variants, so you can link to bil#Norwegian, bil#Norwegian Bokmål and bil#Norwegian Nynorsk and always end up at the same place. It is currently used in the main entry bil#Norwegian and in the Nynorsk inflected form bilane. Is this an acceptable solution? Would it for example break language statistics if headings use a template like this? --LA2 11:38, 24 February 2011 (UTC)Reply

Not a big fan of templated headers myself. Mglovesfun (talk) 11:47, 24 February 2011 (UTC)Reply
This particular template, {{Norwegian}}, has at least three downsides: (1) it links to a draft policy (Wiktionary:About Norwegian) that contradicts the template, by saying "Norwegian entries should have a common L2 header, ==Norwegian=="; (2) it doesn't fix the discrepancy of identification of varieties of Norwegian that LA2 mentioned above; (3) all headers without templates also have anchors automatically, thanks to MediaWiki: see bil#Icelandic and bil#Faroese. --Daniel. 12:01, 24 February 2011 (UTC)Reply
Yes, all headings have anchors. The problem is that ==Norwegian== in the page hus doesn't have the anchor for hus#Norwegian Nynorsk. In the most recent XML database dump, there were 6553 ==Norwegian== headings, 2401 ==Norwegian Nynorsk==, and 694 ==Norwegian Bokmål==. Should we enforce the draft policy and change these 2401+694 to the single standard heading? --LA2 12:16, 24 February 2011 (UTC)Reply
While I'm not a speaker of Norwegian, the idea of merging these three headers into "Norwegian" seems good to me.
FWIW, we don't have language headers named "Hiragana Japanese", "Simplified Mandarin" or "American-spelled English" to indicate these written standards of these other languages. --Daniel. 12:32, 24 February 2011 (UTC)Reply
I wrote some stuff some time ago on the policy talk page. I am not convinced that merging them under a common header is a particularly good solution - in fact, it is in some respects easier to separate them completely, as tagging e.g. which synonym belongs to which standard can be a real pain in the rear end at times. Three different headers is a compromise between excessive tagging of words (one header) and creating two almost identical entries (two headers).
I should also point out that most of the differences between Nynorsk and Bokmål are not simply different spellings of the same words, as the use of the word orthography could indicate. For instance, the two words eg (Nynorsk) and jeg (Bokmål, both meaning 'I') do not have a common ancestor before the Proto-Norse word *eka! Njardarlogar 13:41, 24 February 2011 (UTC)
I see the reason for having a template generate anchors for all three languages. I don't see the point of having it link to About Norwegian, a page for editors. If you drop the link to About Norwegian, I'd like this idea, iff our major bot operators and dump analyzers assure us that they will still be able to do what they do even though the header is a template and not a raw language name. Otherwise, just use {{inflection of|hus|lang=no}} even in a Nynorsk entry.​—msh210 (talk) 16:43, 24 February 2011 (UTC)Reply
Don't the two standards also differ in vocabulary somewhat? I think it might be better to have just two headers. The duplication would make it explicitly clear that the word exists in both standards, because a lot of external sources treat 'Norwegian' as meaning the same as 'Bokmål', and so editors might assume the same. —CodeCat 22:40, 24 February 2011 (UTC)Reply
(1) On no.wiktionary, only one heading (Norsk = Norwegian) is used, and the differences between Bokmål and Nynorsk are indicated next to each word (e.g. no:bil), just as our draft policy Wiktionary:About Norwegian suggests.
(2) But on nn.wiktionary, two separate headings are used (Bokmål and Nynorsk), even if the two sections have identical content (e.g. nn:kinesisk).
(3) What my template experiment tried was to preserve the current mess with three different headings. I'm starting to conclude that this is a bad choice.
(4) The fourth possibility is to have three more years of inconclusive discussion, and I think that is the worst outcome.
I think we should either have one or two headings, but not three. In support of one heading (1, the no.wiktionary model) is the existing draft policy and a pretty advanced template {{no-noun-infl}} that handles both variants in one common inflection table. Myself being a Swede and frustrated that Scandinavia has too many languages already, I'm tempted to go for a two heading solution, where the main alternative is Bokmål, using the heading Norwegian. This would reduce Nynorsk to a second rate dialect, similar to Scots, Limburgish, or our historic entries for Old English. Even though this would be an Alexandrian solution to this Gordian knot, I'm sure such a suggestion would make many Norwegians bitter or angry, so perhaps one common heading is the better alternative? --LA2 01:12, 25 February 2011 (UTC)Reply
The last "solution" that you put up there is totally unacceptable in many ways (it's equivalent to labeling Swedish as 'Scandinavian' due to its size and the rest as 'regional dialects').
CodeCat: yes, the vocabulary is also different. Many important/frequent words are different (though they often have the same Old Norse roots). Njardarlogar 10:12, 25 February 2011 (UTC)Reply
I think taking into account what would make people angry would be a violation of NPOV. What do you mean by "second rate dialect"? I don't see anything in common between our coverage of the languages you listed, so I don't understand what you're referring to. --Yair rand (talk) 11:57, 25 February 2011 (UTC)
I think we need to look at this from a usability point of view. Most people who look up words on Wiktionary already know two things. They know the word they're trying to look up, and the language it's in. So, someone who wants to look up German das will first find the entry 'das' and then look for a German section. But for Norwegian it's a bit more complicated right now, because someone who wants to look up a Nynorsk word will have to look not just for a 'Norwegian Nynorsk' section but also possibly a 'Norwegian' section. And to make matters more confusing, entries like som contain both... --CodeCat 13:47, 25 February 2011 (UTC)

For a solution with two headings (2 above), there is a question of what those headings should be. As Njardarlogar says above, using "Norwegian" as a heading for Bokmål is unacceptable to many, so we would have to ban that heading, only using "Norwegian Bokmål" and "Norwegian Nynorsk". I think this is yet another argument for the one heading solution (1 above). Do we have consensus for solution (1)? --LA2 14:09, 25 February 2011 (UTC)

Please do not use templates in headings — it makes it significantly harder to work out what's going on (particularly if they take arguments and do magic — the worst offender is the "abbreviation" thingy). If we want to have one unified heading for Norwegian, then the text "Norwegian" does that fine [there are much better ways of distinguishing dialects, Template:UK and Template:US for example]. In common English parlance (and this is a dictionary for English speakers) both languages are called Norwegian. This ties closely into index creation — at the moment I'm using the headings to determine which words belong to which language, and to find translations by looking for * [language name]: {{t|. It's not the end of the world, and I could write code to fix it for Norwegian [and then for BCS [and then for ...]]; but every single person who wants to get information out of Wiktionary will have to fix it too. On a related note, what should Index:Norwegian contain? All of Norwegian Nynorsk, Norwegian Bokmal and Norwegian; or just Norwegian and have Norwegian Nynorsk and Norwegian Bokmal as separate indexes? Conrad.Irwin 19:33, 26 February 2011 (UTC)Reply
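To illustrate the kind of extra code every data consumer would need, here is a minimal sketch of heading-based language detection with a normalisation table for the Norwegian variants. It assumes plain ==Language== L2 headings in the wikitext; the `CANONICAL` mapping and the sample text are invented for the example, and this is not the actual index-generation code.

```python
import re
from typing import List

# Matches level-2 headings such as "==Norwegian==" but not "===Noun===".
L2_HEADING = re.compile(r"^==[ \t]*([^=\n].*?)[ \t]*==[ \t]*$", re.MULTILINE)

# Assumed normalisation: fold both written standards into one index name.
CANONICAL = {
    "Norwegian Bokmål": "Norwegian",
    "Norwegian Nynorsk": "Norwegian",
}


def languages_in(wikitext: str) -> List[str]:
    """List the (normalised) languages that have a section in `wikitext`."""
    found: List[str] = []
    for match in L2_HEADING.finditer(wikitext):
        name = CANONICAL.get(match.group(1), match.group(1))
        if name not in found:
            found.append(name)
    return found


SAMPLE = (
    "==English==\n===Noun===\n# ...\n"
    "==Norwegian Nynorsk==\n===Noun===\n# ...\n"
    "==Norwegian==\n===Conjunction===\n# ...\n"
)
sample_languages = languages_in(SAMPLE)
```

A templated heading such as {{Norwegian}} would not match the pattern at all, so each consumer would also need a rule for every heading template in use.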

On no.wiktionary (which uses model 1), the category:Norwegian nouns has subcategories for nouns in Bokmål and Nynorsk, but the vast majority of nouns are in the main category. I think one common Index would be the way to go. --LA2 01:23, 27 February 2011 (UTC)

We shouldn't reach a final conclusion before we have setups properly defined. The consequences of a merger are at present largely unknown. If we are going to settle the issue "once and for all", we have to do a decent job. No need for the haste.
@CodeCat: I do not see how entries like som are more confusing than having to e.g. check out different etymologies. Njardarlogar 16:49, 3 March 2011 (UTC)Reply

What do you mean by "setups properly defined"? What is missing? The proposal is to follow what the existing draft policy says, and not create any new entries with headings other than ==Norwegian==. It's your creation of non-standard entries that should be slowed down. --LA2 20:22, 3 March 2011 (UTC)
That policy is indeed a draft. The details should be discussed - do people actually agree with them? Do they follow the standard framework of the English Wiktionary?
Furthermore, I do myself oppose a one header solution, I think it is more accurate to include the label "Bokmål" or "Nynorsk" in the header rather than beneath it. The two written forms do, after all, not fully converge on some sort of middle point. Njardarlogar 13:30, 4 March 2011 (UTC)Reply
Ehm, discussing this issue is exactly what we have been trying to do here. I think we all agree that having just one heading, which is what the draft policy says, is very well in line both with the rest of English Wiktionary and with the Norwegian (Bokmål) Wiktionary. As far as I can see, you are the only one against this. But except for your personal opinion, what arguments do you have? Earlier, when other users have asked on your user talk page that you should follow the existing draft policy, you have also refused to do so by only referring to its draft status and not by contributing any good arguments for not accepting it. --LA2 00:58, 5 March 2011 (UTC)Reply
My question is how many people have actually read WT:ANO? Getting support for the number of headers is one thing, but there are more details to consider. For instance, does layout no. 3 look good? (I suppose that is the layout that is actually going to be used.)
Here's another test of the proposed layout. It may not illustrate all of the consequences of the draft policy setup, however; one important thing is that there are quite a few entries that will receive this vague tagging on the inflection line. The Nynorsk part is almost invisible. Parallels have been drawn to British English versus American English; however, I do not think it is too grave an offense to write customize in a text otherwise using British spelling. In comparison, using certain Nynorsk words in a Bokmål text or vice versa would, in the minds of many, be equivalent to using Swedish words in a Danish text. I therefore find the vague tagging that follows from the one-header solution to be a poor solution, as people may more easily misinterpret our entries.
I was asked to follow a draft policy which nobody has agreed upon! Naturally, I would not comply (please also notice the dates of these two edits [8] [9], and who is making them). Conclusion: things are not as set in stone as has been claimed. This is also clear from reading the talk page at WT:ANO. Njardarlogar 09:30, 5 March 2011 (UTC)
As you can see from the first line of this discussion, I have tried to invite opinions from the Norwegian (Bokmål) Wiktionary to this discussion, but their reaction was to ask what the problem is, because their Wiktionary follows just what our draft policy proposes (one single heading) and it works great for them. I didn't ask the Nynorsk Wiktionary, because it is almost dead with no daily activity. You are of course free to invite more opinions.
Yes, I think your example of "layout 3" looks great. The examples (tjørn, vatn, draum) are somewhat extreme, just like a good example should be, in illustrating differences between Bokmål and Nynorsk. If (Bokmål) and (Nynorsk) are sprinkled all over the example, it is not going to be much worse than this, because many words are common to both variants of Norwegian. In many cases, quotations and example sentences will need to be taken from Ibsen, Bjørnson and older writers which predate any standardized Riksmål/Bokmål, and from Vinje/Aasen which predate modern standardized Nynorsk, some bordering on either pure Danish or dialects, so the year of the quotation will say more than the Bokmål/Nynorsk label. This is no different from Danish and Swedish, which also use old spelling and grammar in some example sentences.
Your attitude of "I was asked to follow a draft policy which nobody has agreed upon! Naturally, I would not comply" gives me the impression that you will on principle disobey any proposed policy. So perhaps I should just update it to recommend two headings, and you will voluntarily start to use a single heading? This is now all about you and your attitude, and not about what's best for the language and Wiktionary. --LA2 13:32, 7 March 2011 (UTC)

No, it isn't. You keep dragging personal elements into this, also known as ad hominem. Stop doing it, and stick to the bloody arguments that are being put forward.
If someone creates a draft, then it does not automatically follow that people should follow it; it is a draft, after all. What follows is debate, which is where we still are almost 3 years on. Very little argumentation has been put forward in this debate that explains what the problems with a three-header solution are. What are the problems, actually? A three-header solution is the simplest way to accurately label the two language forms without having to create almost duplicated entries.
I am, by the way, worried by the fact that primarily non-native speakers of Norwegian have participated in the debate here so far. Njardarlogar 17:09, 7 March 2011 (UTC)Reply

Good, let's start all over with the very basics. The drawback with having three different headings is that other articles link to pagename#Norwegian (either explicitly as [[pagename#Norwegian]] or through some template with {{...|lang=no}}) but suddenly the page changes and it no longer has that heading, but separate headings for Norwegian Bokmål and Norwegian Nynorsk, so all other articles linking to it need to be updated. The same happens when a red link goes to pagename#Norwegian_Nynorsk but the article is later created with a common heading for ==Norwegian==. Most mechanisms on Wiktionary assume one heading for one language. For example, that is how we organize categories and count statistics on how many words we have per language. With three headings we need to count Bokmål as the sum of Norwegian + Norwegian Bokmål and Nynorsk as the sum of Norwegian + Norwegian Nynorsk. The linking problem was what I tried to solve by introducing a template (top of this discussion) that created HTML anchors for all three names, so a link would always find something. All agreed that this was a bad solution and the template is now deprecated. All seemed to agree that we should go for either one (Norwegian) or two headings, but not three. I have repeatedly asked you why two headings are better, and all you have responded is that the existing draft policy is still a draft. This is not an argument for why two headings is better than one. The only consistent criticism is that you are against a single heading. That summary is not an ad hominem attack.
I agree that more Norwegians should enter this discussion. It's sad that so few do. I have tried to invite more people, and I suggest you do the same. --LA2 19:18, 7 March 2011 (UTC)Reply
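The counting rule for three headings mentioned above (each standard is the shared ==Norwegian== count plus its own heading's count) can be sketched as follows. The function name countsPerStandard and the shape of the input object are assumptions made for illustration only.

```javascript
// Hypothetical helper: derive per-standard entry counts from per-heading counts
// under a three-heading scheme. Entries under the shared "Norwegian" heading
// count towards both standards.
function countsPerStandard(headingCounts) {
  const shared = headingCounts['Norwegian'] || 0;
  return {
    bokmal: shared + (headingCounts['Norwegian Bokmål'] || 0),
    nynorsk: shared + (headingCounts['Norwegian Nynorsk'] || 0),
  };
}
```

With one or two headings, the per-language count is simply the count for that heading, which is what most existing statistics assume.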
I have presented my main argument several times without it receiving any feedback, and it has nothing to do with drafts. Yes, I am aware of the problem with linking. It may, however, be solved through other methods. One problem, though, is that the string functions are not going to be implemented, making very simple tasks challenging. Regardless, people having to scroll down the page rather than arriving right at the correct entry is not the end of Wiktionary.
The second problem is also one of practicality, one that does not greatly affect the usability of the English Wiktionary (not to mention that if we only used one header, we would get huge counts for Norwegian compared to Danish and Swedish; and for no good reason, remember). What is certainly confusing, though, is the vague labelling at the inflection line that so many entries would receive with only one header. If someone not familiar with the fact that there are two written standards of Norwegian sees (Bokmål) or (Nynorsk) at the inflection line, would they even know what the tag means or why it is there? If Bokmål or Nynorsk is in the header, it will become clear that we are dealing with two separate written standards.
To sum up: a complete separation is impractical, whereas just one header is too vague. In my view, we must either separate them completely or partially. Njardarlogar 20:05, 7 March 2011 (UTC)
Fine then, do we have consensus for splitting Norwegian into two separate headings ==Norwegian Bokmål== and ==Norwegian Nynorsk==? It can certainly be done, since we already do separate headings for Swedish, Danish, and Icelandic. After the split, lang=nn will refer to Norwegian Nynorsk and lang=nb to Norwegian Bokmål. But what should we do about entries that refer to lang=no? Should that be treated as an error?
There are currently 6,500 entries for Norwegian, 80,000 for Swedish and 400,000 for Italian, so I wouldn't worry too much about flooding Wiktionary with too many Norwegian entries. If we get enough contributors, we should have half a million entries (including form entries) for each of these languages and maybe two million for Finnish (which has more inflected forms). --LA2 21:05, 7 March 2011 (UTC)Reply
I think both could work, but it's certainly easier to split them (for any Wiktionary other than no.wikt). no.wiktionary.org is also the only Norwegian dictionary that doesn't split Bokmål and Nynorsk; every other Norwegian dictionary I know of is written for and in one of the standards. So, I think you should do the same here at en.wikt.
I also think that you should use no and not nb as the language code. I think it will be the most logical for those who are not very interested in the politics around this, and just want to contribute (especially new users). Norwegian Bokmål is in many ways the major standard in Norwegian (for example, if you learn Norwegian as a second language, you will most probably learn this standard), and no is a code that many know or can guess, since it's a common code for both Norwegian and Norway. Also, nn is common for Norwegian Nynorsk, and most people, especially those who contribute in Nynorsk, will know this code for Nynorsk. Mewasul 09:54, 8 March 2011 (UTC)
If we end up splitting them, we'll need an easy way to make sure that any Bokmål terms that also exist in Nynorsk get their own entries too. Maybe a bot could periodically check entries that have only Bokmål and make a list of them, so that any Nynorsk users can add the Nynorsk forms of those words as well? —CodeCat 11:13, 8 March 2011 (UTC)Reply
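A rough sketch of the list such a bot could build, assuming it already has, for each page title, the list of level-2 language headings on that page. The function name bokmalOnly and the input shape are hypothetical, and the sketch does not try to decide whether a Nynorsk form actually exists.

```javascript
// Hypothetical helper: given a map from page title to its language headings,
// return the titles that have a Bokmål section but no Nynorsk section.
function bokmalOnly(pages) {
  return Object.keys(pages)
    .filter(function (title) {
      const headings = pages[title];
      return headings.includes('Norwegian Bokmål') &&
             !headings.includes('Norwegian Nynorsk');
    })
    .sort();
}
```

The output is only a list of candidates for human review, since a word with a Bokmål section may simply not exist in Nynorsk.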
First: the codes that are used need to represent their languages, anything else is not NPOV in this context. Same goes for language names. Furthermore, the dialects would have to be treated as "Norwegian" - one would think.
There are quite a few words and spellings that are unique to either language form, so creating such a list would largely be useless; not to mention that its efficiency will continuously drop as time passes - if I understand your idea correctly. Njardarlogar 12:42, 8 March 2011 (UTC)Reply
Is it possible for the bot to know in advance which words are shared by both languages? If it is, then it could use that to make its list. —CodeCat 13:31, 8 March 2011 (UTC)Reply
I'm not sure how we could tell a bot that. Also be aware that words that are both Nynorsk and Bokmål could be used much more in one of the language forms than the other, meaning that the effort may best be spent on words exclusive to one language form. The fact that a lot of the most common/important words already exist under "Norwegian" headers anyway, which simply have to be split up if we were to do this, further strengthens the idea. With all that said, it may well be possible and worth the effort if it is done the right way, but I don't know how that would be. Njardarlogar 18:08, 8 March 2011 (UTC)
I think you are mistaken, CodeCat. If we are to treat Norwegian Bokmål and Norwegian Nynorsk as two separate languages, then they don't need to be synchronized any more than Danish and Swedish. One Bokmål contributor writes entries for all kinds of fruit, another Nynorsk contributor writes entries for all kinds of fish, a third Swedish contributor writes entries for all kinds of bread. Isn't that how Wiktionary works? --LA2 01:42, 12 March 2011 (UTC)

I conclude from this lengthy discussion that two separate headings should be used: ==Norwegian Bokmål== and ==Norwegian Nynorsk== and with time all existing ==Norwegian== sections should be split up or changed into these two. I intend to update the existing draft policy Wiktionary:About Norwegian with this information. It will still be a draft policy until we decide to make it formal. But for now, the trend is going towards two separate headings instead of one common. --LA2 20:36, 14 March 2011 (UTC)Reply

The outcome is pretty vague for now; I do not think it makes sense to touch older entries before we've finally settled on something (though of course, all the entries marked as Norwegian but with no further specification need to be fixed, which is something that I have been working on for a while). I invited the users user:EivindJ, user:Kåre-Olav and user:Meco in an attempt to get more feedback. Njardarlogar 09:18, 15 March 2011 (UTC)
Sure it is vague, but we can change that by taking action. We need to lift Norwegian from being the 34th biggest language here to some more prominent position. Let's say Bokmål (just like Danish) should be among the 20 biggest and Nynorsk (just like Icelandic) among the 30 biggest. That is a huge lift for both variants, but we can do that. I'll track the statistics at Wiktionary talk:About Swedish#Statistics. --LA2 15:46, 15 March 2011 (UTC)Reply

No italics for Latin/French/Japanese words?

Hello, I am here to put my two cents' worth of knowledge: In my native language, Spanish, we often put Latin/French in italics, but not Japanese, so I was shocked when I read the following sentence: "Loanwords and borrowed phrases that have common usage in English—Gestapo, samurai, vice versa, esprit de corps—do not require italics. A rule of thumb is not to italicize words that appear unitalicized in major English-language dictionaries." I would change the wording to "Some words do require italics" because, for example, many people nowadays know what samurai is, but not what Gestapo or esprit de corps mean. Thus I will seek either removal or rewording of that section of the Wikipedia Manual of Style. — This unsigned comment was added by Fandelasketchup (talk • contribs).

WF? Mglovesfun (talk) 13:37, 24 February 2011 (UTC)Reply
Probably. --Daniel. 13:40, 24 February 2011 (UTC)Reply

Appendix:Elfen Lied, etc.

I redesigned some appendices of fictional terms to display lists of terms, their definitions, and their inflections. For example, see Appendix:Elfen Lied. --Daniel. 19:12, 24 February 2011 (UTC)Reply

Deleting categories for derived terms

I would like to see the categories for derived terms deleted, and the concept discontinued. These are categories located in Category:English derived terms, such as Category:English words derived from: horse or Category:English words derived from: cube (noun). Related templates include {{derv}}. These seem to be a result of an initiative of DCDuring from September 2010. IMHO derived terms are better placed directly in the section "Derived terms", as is the prevailing practice. Many of these categories have very few members, often just one. Does anyone else feel the same? --Dan Polansky 17:43, 25 February 2011 (UTC)

The category Category:English derived terms is meaningless. Category:English words derived from: horse is not meaningless, but useless if the Derived terms section is present. Lmaltier 17:46, 25 February 2011 (UTC)
I don't agree; these categories are relatively empty because they are new with few users working on them. It's not what I do however; for related terms/derived terms, I often point to a 'central' entry, for example for homeless I would do
====Related terms====
* see {{term|home|lang=en}}
Thus bypassing the category altogether. Mglovesfun (talk) 17:50, 25 February 2011 (UTC)
I am not saying that they should be deleted because they have only few members. I am saying that they are redundant to the section "Derived terms", unless the section is emptied and its content is moved to a category. I want to see the list directly in the section "Derived terms"; I do not want to see the section emptied. --Dan Polansky 17:56, 25 February 2011 (UTC)Reply
Sure, delete 'em, won't bother me. Mglovesfun (talk) 17:58, 25 February 2011 (UTC)
The problem with populating Derived and Related terms manually is that the result bears no relation to the entries that we have: the lists contain many red links and miss many entries that we do have. Etymology-section templates could populate categories which would be available for use in creating derived and related terms lists on demand. The effort foundered because it surfaced the still-unresolved issue of what we mean by derived terms and, to a lesser extent, related terms. Is derivation a synchronic or diachronic process for our purposes in English, in synthetic languages, in poorly attested languages? As long as the conceptual issues remain unresolved, we should probably not allow any automated procedure to populate the Related and Derived terms sections. Better that we let the varied judgments of contributors populate the sections, without guidelines and with uncertain results, but slowly.
If a user wants to see the Derived terms directly in the entry instead of in a category, we have {{rel-top}} to hide the heavily populated sections from those who don't want the section to take up the whole screen. DCDuring TALK 18:18, 25 February 2011 (UTC)Reply
For lots of Swedish words, I have used the templates {{compound}}, {{prefix}}, {{suffix}} and {{confix}} to explain how a word was put together. This goes under the Etymology heading. These templates categorize the page in e.g. Category:Swedish words suffixed with -sam, which is also featured under -sam#Swedish using {{suffixsee}}. But compounds are put in the huge and quite useless Category:Swedish compound words instead of separate categories for each component word. It is easy to imagine the alternative, that each compound component would get its own category, just like the prefixes and suffixes do. I don't believe this would be useful, however.
Now, words are formed in many steps. For example, ofullbordad = o- + ((full + borda) + -d). All three steps (1. fullborda, 2. fullbordad, 3. ofullbordad) contain "full", but only the first step uses {{compound|full|borda}}. The second step uses {{sv-verb-form-pastpart|fullborda}}. The third step uses {{prefix|o|fullbordad}}. Thus, if {{compound}} were to create individual categories, only fullborda would show up in the category for words derived from full. --LA2 21:11, 25 February 2011 (UTC)
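To make the multi-step point concrete, here is a small sketch that follows derivation steps transitively. It assumes a table mapping each word to the components named by its etymology or form-of template; the table format and the function name derivedFrom are illustrative assumptions, not an existing Wiktionary mechanism.

```javascript
// Hypothetical data: each word mapped to the components its templates name.
const steps = {
  fullborda: ['full', 'borda'],    // {{compound|full|borda}}
  fullbordad: ['fullborda'],       // {{sv-verb-form-pastpart|fullborda}}
  ofullbordad: ['o-', 'fullbordad'] // {{prefix|o|fullbordad}}
};

// Collect every word reachable from `root` through one or more steps.
function derivedFrom(root, table) {
  const found = new Set([root]);
  const result = [];
  let changed = true;
  while (changed) {
    changed = false;
    for (const [word, components] of Object.entries(table)) {
      if (!found.has(word) && components.some(function (c) { return found.has(c); })) {
        found.add(word);
        result.push(word);
        changed = true;
      }
    }
  }
  return result;
}
```

Looking only at direct components of {{compound}} would yield just fullborda for "full"; following the chain also reaches fullbordad and ofullbordad.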
The order in which morphemes have historically combined within a language can give one answer: the diachronic one. There are problems with a lack of sufficient historical evidence in many languages. Or one could break a word into all the morphemes that someone could use to reconstruct it, possibly using only currently productive affixes, affixes that have been productive at some time in the language, or affixes that users analyze as having a meaning, though the affix has not been productive in the language. Any of these latter are a more synchronic approach. The synchronic approach is better for generating comprehensive lists of related terms. The diachronic approach is better for lists of derived terms, but could be pressed into service for related terms, albeit awkwardly. DCDuring TALK 22:49, 25 February 2011 (UTC)Reply
As I often say, the question is not whether such terms exist or whether it is worth listing them, but whether it is worth listing them in a category. Mglovesfun (talk) 23:19, 25 February 2011 (UTC)Reply
As I didn't want to repeat so close to my last, the issue to me is whether we trust the process of manual creation of related terms and derived terms lists:
  1. They are often ridden with redlinks
  2. But remain incomplete.
  3. They sometimes have nothing whatsoever to do with a legitimate concept of what derived or related terms might be.
  4. At other times they are simply quirky.
The existing derived terms lists can be systematically used to visit the blue entries to make sure that they have an etymology section with the appropriate templates. We can decide whether the redlinked derived terms should be retained as is, converted to black links, or deleted. The process of adding missing etymology sections will lead to ever more complete derived and related terms. The definition of related terms can be refined and made more complete to include all words with shared stems and generated from appropriate categories.
The process of manual construction and maintenance of such lists is a quaint one, which I doubt is being undertaken by very many contributors. It is because I have attempted it in a few cases that I would favor the gradual implementation of a category-based population of Derived and Related terms over the do-nothing alternative, which is what the manual approach amounts to. DCDuring TALK 23:49, 25 February 2011 (UTC)Reply
It would really be nice if we could automatically generate lists on a page based on the contents of a category! But I don't think there is an easy way to do that... —CodeCat 00:25, 26 February 2011 (UTC)Reply
The templates {{prefixsee}} and {{suffixsee}} are quite good. --LA2 13:23, 26 February 2011 (UTC)Reply
Manual collection of derived terms has worked fairly well for me, with the help of the what-links-here function and the search function. I have populated many sections of derived terms in the past. I was not alone in doing so; I have seen many lists created by other people. From what I can see, the use of {{derv}} does not significantly simplify the process of identifying derived terms: you first need to know to which entries to add the template. The manual process can be enhanced using a bot, without the need to resort to categories. The manual process does not in any way amount to "doing nothing"; it is the easy way of setting up new technological devices and then waiting that amounts to doing nothing. If you say that the categories have proved very helpful to you, can you point me to several such categories that you have populated and that were not already populated before in the Derived terms section? --Dan Polansky 18:23, 3 March 2011 (UTC)Reply
I stopped after experimentation due to the lack of interest in clarifying the meaning of related and derived terms. In the absence of such clarification, there is much to recommend a hobbled manual process of populating such headings. I would think we would not want, for example, to alter {{compound}}, {{suffix}}, {{prefix}}, and {{confix}} to autopopulate the relevant categories. Such a modification would quickly (enough time to populate the missing category list) provide us with ample material to compare against our manually created derived- and related-terms lists. DCDuring TALK 19:42, 3 March 2011 (UTC)Reply
(unindent) Can you point me to several such categories that you have populated and that were not already populated before in the Derived terms section? Or is it true that there are no categories that demonstrate the usefulness of the technique that you have introduced? --Dan Polansky 09:06, 7 March 2011 (UTC)Reply

As a follow-up, I have tagged {{derv}} as in dispute. MG has sent it to RFDO, so the discussion and voting follows there: WT:RFDO#Template:derv. --Dan Polansky 16:04, 17 March 2011 (UTC)Reply

Disallowing templates that need languages from defaulting to English

Right now, there are quite a few templates that automatically assume the language at hand is English if nothing else is specified. I think this is a bit biased, but that's not really my point in this case. If the default is English, forgetting to specify the language will inevitably mean that the entry gets added to an English-specific category. This is of course not what we want, but because of the default option it's very hard to catch errors like that unless someone happens to spot the out-of-place entry. So for practicality alone I think it would be better if this default behaviour is removed, and an explicit lang=en is needed.

Linked to this proposal is the naming of topical categories. Currently, English categories do not get a language prefix, but other languages do. I think it would be better if English topical categories were prefixed with en:.

There are some templates such as {{term}} where the language is optional, but the default is not English but a generic case for all languages. This proposal does not affect those, such templates can keep working as they always did. —CodeCat 14:56, 26 February 2011 (UTC)Reply

To illustrate the above, 黃道吉日 is currently in Category:English idioms as it uses {{idiomatic}} with no language given. Mglovesfun (talk) 15:00, 26 February 2011 (UTC)Reply
I think that operating like {{term}} is ideal, I agree that defaulting to English is rarely the best option. There may be cases where it does make sense, but in general I agree. - [The]DaveRoss 15:03, 26 February 2011 (UTC)Reply
Not all templates can work like {{term}} though. There are some templates that add pages to a category, but what category should they add pages to if they don't know the language? In that case, some kind of requests category would be better than English. —CodeCat 15:15, 26 February 2011 (UTC)Reply
Ones that really require languages could default to a "needs a language" category. For cleanup. - [The]DaveRoss 15:18, 26 February 2011 (UTC)Reply
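The fallback suggested here could look roughly like the following. This is only a sketch in JavaScript of the decision a categorizing template would make; the function name categoryFor and the cleanup category name are hypothetical.

```javascript
// Hypothetical cleanup category for entries whose template omitted the language.
const NEEDS_LANGUAGE = 'Category:Entries needing a language';

// Pick a category for a template such as {{idiomatic}}: use the given language
// name if present, otherwise fall back to the cleanup category, not to English.
function categoryFor(baseName, languageName) {
  if (!languageName) {
    return NEEDS_LANGUAGE;
  }
  return 'Category:' + languageName + ' ' + baseName;
}
```

Under this scheme an entry that omits the language would land in the cleanup category, where the omission is easy to spot, instead of silently joining an English category.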
It is trivially easy to go to the end of any English category and identify a large number of items using other scripts that are misclassified. We also have had bots that have generated cleanup lists for items misclassified in many ways, including this. Why is this proposal necessary to solve the stated problem? This is the English-language Wiktionary, for which we still have hopes of recruiting English native contributors to expand our coverage of terms and update our creaky Webster 1913 entries. Making it easy for them seems like a good thing. DCDuring TALK 18:22, 26 February 2011 (UTC)
A reason for using the prefix en: is mentioned here. - -sche 19:48, 26 February 2011 (UTC)Reply
IMO {{context}} tags, specifically, should not require lang=en, as it takes up pre-definition real estate on the definition line, and KassadBot or a new bot should be employed to add lang to tags in non-English sections to prevent/remove miscategorization as English. However, IMO most other templates can and perhaps should do as proposed by CodeCat in this section.​—msh210 (talk) 15:12, 27 February 2011 (UTC)Reply

I've created a vote on this now, since it would probably be best to get a proper consensus before this is made policy: Wiktionary:Votes/pl-2011-03/Default language of templates that require a language. CodeCat 12:33, 11 March 2011 (UTC)

Pronunciation for Word of the Day for 2011-02-27 ("endeavour")

The audio file for today's Word of the Day, "endeavour", rhymes with "beaver". The audio file should be removed, or the audio file on that page should be changed to (the pronunciation from the page for "endeavor"). 68.9.112.195 00:58, 27 February 2011 (UTC)Reply

I see you've figured out how to do that yourself. :-)   —RuakhTALK 01:17, 27 February 2011 (UTC)Reply
Wiktionary:Word_of_the_day/February_27 is protected from editing, so I wasn't able to change the audio file for that page. I was able to change only endeavour because it wasn't protected. The parameter "audio=en-us-endeavor.ogg" should be added to the end of the wotd template on Wiktionary:Word_of_the_day/February_27. 68.9.112.195 01:36, 27 February 2011 (UTC)Reply
 Done. Internoob (Disc • Cont) 04:55, 27 February 2011 (UTC)
We really need to try harder with regard to WOTD; there have been lots of them recently which have had pretty glaring issues. Unless a word is reasonably complete and reasonably accurate it should not be made WOTD. Don't forget that for a number of people the WOTD is their portal to the project: it is the first thing they see, and if it is wrong it does not build confidence in the project as a whole. Wikipedia does this one a lot better than us, and while I don't think we need as complex a procedure as theirs, I do think we could do a bit more to ensure that the terms which are published in this way are as good as they can be. - [The]DaveRoss 17:30, 28 February 2011 (UTC)
In this particular case, the issue was that I added the word as WOTD and could not play audio. In general, a good practice (besides helping out on the WOTD project itself, which I understand most people won't want to do) is as follows IMO: Someone who anyway checks, or can check, pronunciations (or etymologies, or whatever) should do so for upcoming WOTDs. (The March entries are at [[Wiktionary:Word of the day/Archive/2011/March]] (or the appropriate year and month). But the way WOTD templates work, (for example) the March 11 template includes last year's word until someone updates it with this year's. So don't bother checking words on the March page that have not yet been updated for 2011, as they're last year's WOTDs. The status, indicating which is the last template updated, is at [[Wiktionary:Word of the day/Status]].) —msh210 (talk) 17:54, 28 February 2011 (UTC)

March 2011

Looking for some help at Simple

Over at the Simple English Wiktionary, we'd like to use the technique used here for citations (e.g., Template:quote-book) for synonyms and antonyms. Unfortunately, we are a small group and nobody there seems to have the expertise to implement this. Would somebody mind popping over and helping?--Brett 14:17, 1 March 2011 (UTC)Reply

I'm not sure what you're referring to. Do you mean the way that when you nest an unordered list under a definition,
  1. like
    • this,
it gets collapsed into a sort of show/hide thing?
RuakhTALK 16:52, 1 March 2011 (UTC)Reply
That's it.--Brett 18:17, 1 March 2011 (UTC)Reply
We use this javascript to hide quotes, but it is tailored for our formatting. If you use different formatting conventions it may not do what you expect.
Hidden Quotes

 
function setupHiddenQuotes(li)
{
   var HQToggle, liComp;
   var HQShow = 'quotations ▼';
   var HQHide = 'quotations ▲';
   for (var k = 0; k < li.childNodes.length; k++) {
      // Look at each component of the definition.
      liComp = li.childNodes[k];
      // If we find a ul, we have quotes or example sentences, and thus need a button.
      if (/^(ul|UL)$/.test(liComp.nodeName)) {
         HQToggle = newNode('a', {href: 'javascript:(function(){})()'}, '');
         li.insertBefore(newNode('span', {'class': 'HQToggle', 'style': 'font-size:0.65em'}, ' [', HQToggle, ']'), liComp);
         HQToggle.onclick = VisibilityToggles.register('quotations',          
            function show() {
               HQToggle.innerHTML = HQHide;
               for (var child = li.firstChild; child != null; child = child.nextSibling) {
                  if (/^(ul|UL)$/.test(child.nodeName)) {
                     child.style.display = 'block';
                     }
                  }
               },
            function hide() {
               HQToggle.innerHTML = HQShow;
               for (var child = li.firstChild; child != null; child = child.nextSibling) {
                  if (/^(ul|UL)$/.test(child.nodeName)) {
                     child.style.display = 'none';
                     }
                  }
               });
 
         break;
         }
      }
   }            
 
addOnloadHook( function () 
{
   if (wgNamespaceNumber == 0) {
      var li;
      // First, find all the ordered lists, i.e. all the series of definitions.
      var ols = document.getElementsByTagName('ol');
      for(var i = 0; i < ols.length; i++) {
         // Then, for every set, find all the individual definitions.
         for (var j = 0; j < ols[i].childNodes.length; j++) {
            li = ols[i].childNodes[j];
            if (li.nodeName.toUpperCase() == 'LI') {
               setupHiddenQuotes(li);
               }
            }
         }
      }
   });

I think Atelaes wrote it initially, but maybe someone can help you port it. Not me since I am .js inept. - [The]DaveRoss 20:07, 1 March 2011 (UTC)Reply

I've created a version that does what I think Brett wants; interested parties, see [[simple:User talk:Brett#Hiding 'onyms.]]. —RuakhTALK 23:13, 1 March 2011 (UTC)Reply
Just in case no-one else has thanked you: Thank you. The javascript in your common.js is well documented and looks great. Conrad.Irwin
Ruakh's script is working well at simple.wiktionary. Thanks very much for everyone's attention and especially to Ruakh!--Brett 12:07, 3 March 2011 (UTC)Reply

Tabbed Languages, Definition side boxes, and Sense IDs

I propose that User:Yair rand/TabbedLanguages.js and User:Yair rand/editor.js be enabled by default, and that the template {{senseid}} be used in entries. I'm proposing these at the same time because large parts of them are dependent on each other (TabbedLanguages "new language" button doesn't work without Editor.js, Editor.js's Add Synonyms/Antonyms buttons don't work without senseids, Editor.js's add/edit gloss buttons are useless without senseids, senseids are hard to add or modify without Editor.js, and a lot of Editor.js's features require TabbedLanguages).

  • User:Yair rand/TabbedLanguages.js (based on User:Atelaes/TabbedLanguages.js) changes language sections into "tabs", rather than stacking the sections on top of each other. The languages are listed at the top of the page, and clicking on a language displays that language section's content.
    • The tab open upon loading the page is the section that was linked to if the user arrived at the entry through a link, or the Translingual section or English section if there is one, or the tab of the most recently viewed non-English/Translingual section's language, or the tab of any language that has been targeted with TargetedTranslations (priority given to languages which were added to the user's preferences earliest), or otherwise the first language on the page. (Edited 08:03, 10 April 2011 (UTC))
    • At the right of the list of languages, there is a "+" symbol, for easily adding new language sections, relying on Editor.js's "Add part of speech" function for filling it with content. (We might also want to have this as an option for starting new entries, using something like User:Yair rand/NECV2.js.)
    • At the bottom of the page, the categories of the currently viewed language section are visible, along with HotCat-style buttons for adding/modifying/removing categories. These functions use WT:EDIT.
    • A TOC is not visible by default, but a per-language TOC can be enabled by clicking "Show TOC" in the Visibility section of the sidebar.
  • User:Yair rand/editor.js adds small arrows to the left of definitions. Hovering over an arrow expands a box of editing options, including "Edit definition", "Add gloss" (for senseids), "Edit gloss" (if the definition already has a senseid), "Add example sentence", "Add synonyms", and "Add antonyms":
    • "Edit definition" replaces the definition with an input box containing the definition's wikitext, with Preview/Cancel buttons on the right. Senseids do not appear in the definition wikitext.
    • "Add gloss" allows users to easily add senseids. The language code does not need to be added manually.
    • "Edit gloss" edits the definition's senseid. It includes an option to "Edit matching glosses", updating all {{trans-top}}s and {{sense}}s that match the old gloss with the new gloss.
    • "Add example sentence", for easily adding new usexes. All formatting (including script templates) other than bolding is done automatically.
    • "Add synonyms" and "Add antonyms" (requires the definition to have a senseid) adds a new section header of the 'nyms unless there already is one, and adds input fields for adding new 'nyms. Prefixes the 'nym list with a {{sense}} template. (Doesn't have a field for qualifiers, yet.)
    • The script also adds a "Add part of speech" button in the toolbox section of the sidebar.
    • Any definitions added through the "Add definition" function have an editing-options box added to them.
    • All the above functions use the WT:EDIT framework, so multiple additions and modifications can be made at once, and the standard "Page editing - Save changes - Undo - Redo" box is used.
  • The {{senseid}} template adds anchors to individual definitions, allowing linking directly to a definition (example). These templates, formatted like {{senseid|language code|definition gloss}}, also help machine parsing of Wiktionary content, making it possible to directly connect definitions to translations tables and such. The template {{l}} could be modified to add an id= parameter for linking to definitions.
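
The default-tab rule listed above is easier to follow as a chain of fallbacks. The following is only a rough sketch of that priority order, not the code in User:Yair rand/TabbedLanguages.js: the function name `pickDefaultTab` and the option names are hypothetical, and it assumes Translingual is checked before English because that is the order in which the proposal lists them.

```javascript
// Hypothetical sketch of the default-tab priority; not the actual script.
// languagesOnPage: language names in page order, e.g. ['English', 'French'].
// options (all assumed names):
//    linkedLanguage    - language of the section the user followed a link to
//    recentLanguage    - most recently viewed non-English/Translingual language
//    targetedLanguages - TargetedTranslations languages, earliest-added first
function pickDefaultTab(languagesOnPage, options) {
   var targeted = options.targetedLanguages || [];

   function onPage(name) {
      return name != null && languagesOnPage.indexOf(name) !== -1;
   }

   if (onPage(options.linkedLanguage)) return options.linkedLanguage;
   if (onPage('Translingual')) return 'Translingual';
   if (onPage('English')) return 'English';
   if (onPage(options.recentLanguage)) return options.recentLanguage;
   for (var i = 0; i < targeted.length; i++) {
      if (onPage(targeted[i])) return targeted[i];
   }
   // Otherwise fall back to the first language on the page (null if none).
   return languagesOnPage.length > 0 ? languagesOnPage[0] : null;
}
```

Under these assumptions, a page with only French and German sections opens on German for a reader who last viewed a German entry, while any page with an English section opens on English unless a specific section was linked to.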

The TabbedLanguages script and the Editor.js script are both available in Wiktionary:Preferences. (The options are labeled 'Tabbed browsing of language sections ("headers" style)' and 'Add expandable editing options side box next to definitions.', respectively). If these scripts get turned on by default, there should be an option for any individual user to turn them off, with both cookie-based preferences and gadget-based preferences. I think that Tabbed Languages will greatly improve usability and that YR/editor.js will make it much easier for new users to edit.

So whaddaya think? --Yair rand (talk) 20:25, 2 March 2011 (UTC)Reply

IMO having "add gloss" or "edit gloss" function is undesirable, as it's too tech-y: people will not know what it is, what it's for, what value they should put in it, that it should be stable, or when to select "edit matching glosses"; I oppose its implementation. I personally don't like using tabbed languages (I like the standard view), but I do think it would be better for newbies to Wiktionary than the current view, so support its implementation.​—msh210 (talk) 20:46, 2 March 2011 (UTC)Reply
The add gloss form, like the tgloss editor, has a link to Help:Glosses, which has pretty good explanation of what glosses are. Ordinarily, though, I think most users who don't already understand it will just not use it unless someone or something specifically asks them to. --Yair rand (talk) 20:56, 2 March 2011 (UTC)Reply
I've seen the link to the help page. Nonetheless, I disagree that most users who don't understand it won't use it without being asked, and I think it will lead to the problems I outlined.​—msh210 (talk) 05:48, 3 March 2011 (UTC)Reply
Do you have any ideas as to how this might be fixable? Or perhaps the gloss buttons should just be removed for non-autoconfirmed users? --Yair rand (talk) 19:52, 20 March 2011 (UTC)Reply
I think the gloss buttons should be removed for all users. At least nonadmins.​—msh210 (talk) 16:38, 7 April 2011 (UTC)Reply
While I think these are both excellent, I oppose making them the default at this point. This is a major shift in the overall appearance and functionality of MediaWiki, which means that people will have an even harder time just jumping in. I think we need to consider exactly how we go about this very thoroughly before we launch, including some serious functionality and usability testing by numerous people who give feedback. We also need to make sure that the code is ready for prime time usage, it has to be relatively well documented, and generally accessible in case Yair gets hit by a bus or something. Also this is at least three votes worth of changes, so we could certainly roll out some changes in the nearer future than others, but all of these require some thought and discussion about how they will be presented and how they will be received. - [The]DaveRoss 11:08, 3 March 2011 (UTC)Reply
If I understand it correctly — and please correct me otherwise — TabbedLanguages decides which categories belong to which language as follows. It goes down the page looking for categories, and applies them to the first language section on the page. It continues to do so until it sees a category "Foo bar" or "foo:Bar derivations", where Foo/foo is the second language section on the page, at which point it starts applying categories to that language and continues until it sees "Baz xyzzy" or "baz:Xyzzy derivations" where Baz/baz is the third language section on the page, at which point it starts applying categories to that language, etc. The problems with this AFAICT are at least three (perhaps someone will think of more), as follows. (1) If there is more than one language section on the page and one of them, not the last, has no categories (which shouldn't happen but does and will) then the algorithm will apply categories to the wrong language. (2) If a category "foo:Bar" (not "foo:Bar derivations") happens to be the first category in Foo's language section, it will be applied to the preceding section. (This, I'm informed, shouldn't occur. I imagine it does, and am confident it will.) (3) If categories are misplaced (as, if an English-language category is at the bottom of the wikitext although another language is on the page, as it sometimes is and will be), they will be applied to the wrong language.​—msh210 (talk) 21:48, 3 March 2011 (UTC)
The above description is correct. The first problem could be fixed by a bot adding maintenance categories to language sections without categories, which could be detected by the script. I started a section about this in the grease pit, and also asked Prince Kassad if KassadBot might be able to do it. This definitely needs to be dealt with before Tabbed Languages is enabled by default. Problems two and three require that categories get regularly sorted into the correct language sections and moved to the bottoms of sections, which Autoformat used to do. I don't know whether KassadBot does it now. --Yair rand (talk) 22:14, 3 March 2011 (UTC)Reply
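
For anyone trying to reproduce problems (1) and (2), here is a minimal sketch of the assignment rule as msh210 describes it. It is not taken from TabbedLanguages.js: the function names and the `{name, code}` shape of the section objects are hypothetical, and the matching is reduced to "category starts with the next language's name" or "category starts with the next language's code and ends in derivations".

```javascript
// Hypothetical re-implementation of the described rule, for illustration only.
// sections:   [{name: 'English', code: 'en'}, ...] in page order
// categories: category names in the order they appear on the page
function startsNextSection(category, section) {
   var byName = category.indexOf(section.name + ' ') === 0;
   var byCode = category.indexOf(section.code + ':') === 0 &&
                / derivations$/.test(category);
   return byName || byCode;
}

function assignCategories(sections, categories) {
   var result = {};
   if (sections.length === 0) return result;
   for (var s = 0; s < sections.length; s++) {
      result[sections[s].name] = [];
   }
   var current = 0;
   for (var c = 0; c < categories.length; c++) {
      // Only the *next* language section is checked, as in the description.
      var next = sections[current + 1];
      if (next && startsNextSection(categories[c], next)) {
         current++;
      }
      result[sections[current].name].push(categories[c]);
   }
   return result;
}
```

With English, French and German sections and the categories "English nouns" and "German nouns" (French having none), this sketch files "German nouns" under English, which is problem (1). A leading "fr:Animals" in the French section would likewise be filed under English, which is problem (2).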
Re problem 2, IIRC AF stopped moving cats to the bottom of the language section (and to the correct language section, perhaps) when people complained about its doing so, contending that not all cats belong at the bottom, as, for example, an etymology-related cat belongs in the etymology, or a cat related to one sense belongs at the end of that sense line. (That's a sentiment I agree with, incidentally.)​—msh210 (talk) 22:18, 3 March 2011 (UTC)Reply
Are you sure? That seems to directly go against Wiktionary:Votes/2007-05/Categories at end of language section. --Yair rand (talk) 22:35, 3 March 2011 (UTC)Reply
Apparently, my memory failed me. You're quite right. (Incidentally, I see Carolina wren was disturbed by the result of that vote in '09, as were Mzajac and I, but seemingly nothing ever came of it.)​—msh210 (talk) 22:47, 3 March 2011 (UTC)Reply
Finding sections that don't categorize seems nontrivial to me, if you use the dump or if you use a bot. Templates can add categories- you'd have to know which ones do and which ones don't. Templates can also transclude other templates which will add categories, so you can't just parse each template individually to see if it adds a category. Anyone have ideas on how to do this? Nadando 23:18, 3 March 2011 (UTC)Reply
That's an issue not only for finding sections that don't categorize but for assigning categories to the correct section. (I assume an offline version of special:expandtemplates would come in handy. But I know nothing about such things.)​—msh210 (talk) 23:24, 3 March 2011 (UTC)Reply
@Nadando: One of the dumps is a multimap between entries and their categories, which can be used as a starting-point. (See Wiktionary:Grease pit‎#Entries without categories.) —RuakhTALK 21:04, 6 March 2011 (UTC)Reply
I tend to think tabbed interface should not be enabled by default. It hides too much what is going on: it pretends that a wiki page contains only a single language when in fact it contains several. --Dan Polansky 06:40, 4 March 2011 (UTC)Reply
That sounds rather strange to me, as it displays large, noticeable buttons for each language name across the top of the page, which, in my opinion, are a lot harder to miss than another section all the way at the bottom of the page below all the content for other languages... --Yair rand (talk) 14:26, 6 March 2011 (UTC)Reply
I would support enabling tabbed languages by default. At the expense of making things a bit more unfamiliar for editors, we can make things much better for the primary audience, readers. A big scrolling page of potentially-unrelated words in different languages isn't the usual interface people expect from a dictionary. Rspeer 03:05, 9 March 2011 (UTC)Reply
I think you have it backwards, frequent editors would have little problem adjusting to the new layout, whereas readers would be looking at something unusual. The wiki "look" is very widespread and well-known at this point, thanks mostly to Wikipedia and in part to all of the other projects and Wikia. A departure from the conventional wiki layout is a bigger deal than a departure from a conventional dictionary layout, especially since most online dictionaries already require some scrolling to get past the ads. - [The]DaveRoss 11:11, 9 March 2011 (UTC)
Here's a button that enables the two scripts for testing:
(Please purge your cache so that this button will work.)
--Yair rand (talk) 01:40, 21 March 2011 (UTC)Reply
The tabs show up as black/grey text on a white background and with a larger font size in my browser, Firefox. I think they could be improved by highlighting or decorating them to make it more obvious to readers that they are clickable tabs or buttons, which isn't that clear with the current layout. Matthias Buchmeier 10:41, 1 April 2011 (UTC)
Given that they're rather large, and that hovering the mouse over them causes them to react like links, underlining and changing the cursor, it seems like most people would be able to tell that they're clickable. Do you have any specific suggestions as to how the display could be changed to make it more obvious that the tabs are clickable? --Yair rand 11:45, 1 April 2011 (UTC)Reply
What about some more catchy font and background color maybe together with horizontal lines, i.e. light blue or light cyan similar to the boxes on the main page? Matthias Buchmeier 12:23, 1 April 2011 (UTC)Reply
I'd rather the tabs look "clean" enough not to seem strange on single language pages, and background colors would probably bring in a lot of objections. Changing fonts (maybe Arial Black?) or adding boldness might be helpful, but the tabs shouldn't be so distracting that they cause confusion on single-language pages. If you know CSS you could experiment on your common.css; the CSS classes are .selectedTab and .unselectedTab. --Yair rand 21:47, 1 April 2011 (UTC)Reply
I've turned on some of the stuff: Definitions now have side boxes containing "Edit definition" and "Add example sentence" buttons. Click the button below if you want to disable these features.
Please purge your cache to enable/disable definition side boxes.
--Yair rand 02:32, 7 April 2011 (UTC)Reply
  • I have a couple of questions:
    1. Why is any of this enabled by default for all users?
    2. Why is this heading used to introduce another user-interface change that is unrelated? DCDuring TALK 05:23, 7 April 2011 (UTC)Reply
  1. To make Wiktionary easier to edit.
  2. This isn't unrelated. This section is about "Tabbed Languages, Definition side boxes, and Sense IDs". The newly implemented feature is Definition side boxes (with a couple things missing by default), as explained at the top. --Yair rand 05:27, 7 April 2011 (UTC)Reply

I'm supportive. It looks like you've worked pretty hard. I have turned it all on, and I think it is pretty and efficient. It will attract more edits by casual passers-by, if made default. The language names across the top could be misunderstood as an advertisement that says, "Wiktionary is available in these languages, too!" If there are a lot of languages, then they add another row across the top, which will not help my mom, whose browser is already so stuffed with toolbars that she can only read one line at a time. For those of us with usable screen space remaining, and given the amount of scrolling that the language tabs eliminate, it is a positive net benefit. I don't know how you could make the tabs explicitly understandable, except with more text that says "Define what foobar means when used in Albanian|Crimean Tatar|Ido|Guaraní|Quechua." It's a decision between aesthetics and usability. I already know what it means, so please, for my benefit, choose aesthetics. :) ~ heyzeuss 07:20, 7 April 2011 (UTC)

I am in favour of the definitions editor being enabled for everyone. If it turns out that it's a large source of vandalism or confusion, it is easy to turn it off (just as the initial deal was with the translations editor). I think that making things easier to use (even in small steps, though I think this is a reasonably large step) is about the most important thing Wiktionary can do right now. As a somewhat sporadic contributor of late, I keep having to look up the documentation for all the templates I need to use (to the extent where I often just don't bother), it'd be great if this could be abstracted away so that I can concentrate fully on the actually hard part of making dictionary entries — i.e. writing definitions. I think Yair Rand has done and is doing great work on the UI there.
I am undecided on the gloss editor (and the features that depend on it) — I think it needs to be presented in a clearer way (though I am at a loss to explain what that clearer way would be). Perhaps hiding the gloss, synonyms and antonyms behind an "advanced" button would make it less confusing for new users. I really doubt a new user could fill in the gloss correctly.
I am not (yet) in favour of enabling tabbed languages. I found when I had them enabled that everything would jump noticeably every time I loaded a page — that gets very irritating — and it's not yet clear enough what the tabs are. The best part of the tabbed languages feature is the "+" button that lets you add a new language — we should definitely make it easier to do that, perhaps a "+" tab at the top of each page (or y'know, something that does that more clearly). The wording on the failed search results page could also be better. Conrad.Irwin 07:46, 7 April 2011 (UTC)
About the "jump", if it actually became the default to have tabbed languages on, the extra bit of CSS that styles the tabs could be added to Common.css instead of importing them from User:Yair_rand/TabbedLanguages.css after everything else, which would help the situation, I think. Additionally, if Mediawiki was changed so that the JS started loading before the rest of the content, it would help things significantly, and I think (though I can't remember where I read this, and I might be mistaken) that someone is working on that. --Yair rand 07:56, 7 April 2011 (UTC)Reply
Oppose. --Dan Polansky 08:31, 7 April 2011 (UTC)Reply
Support Tabbed languages and definitions editor, Oppose sense-id editor, which seems too complicated for the occasional user to me. Matthias Buchmeier 09:27, 7 April 2011 (UTC)
Support Randolph's changes. I like tabbed browsing in particular. --Vahag 11:16, 7 April 2011 (UTC)Reply

Proto langs in topical categories

I just found Category:gem:Light. When did that become consensus? Normally, proto languages are not supposed to be in topical categories. -- Prince Kassad 08:38, 6 March 2011 (UTC)Reply

I don't see why they shouldn't have their own topical categories, really. As long as they have topics, there might be a need for topical categories too, I think? —CodeCat 11:15, 6 March 2011 (UTC)Reply
It makes them appear together with "languages" in the topical category tree. But they're not "languages", they're reconstructed. -- Prince Kassad 10:02, 7 March 2011 (UTC)Reply
They're languages whose terms have been reconstructed. But they are real languages. —CodeCat 12:17, 7 March 2011 (UTC)Reply
There may have been a language that those people spoke; but it was not the language that has been reconstructed. A reconstructed language that people don't actually speak is basically a "toy language", not as real as Klingon or Elvish.--Prosfilaes 18:58, 7 March 2011 (UTC)Reply

Votes on geographic names

As a reminder, there are two votes on geographic names running, with fairly low participation:

Both votes started on 7 March 2011 and have 15 days to go. Participation is low compared to Wiktionary:Votes/pl-2010-05/Placenames with linguistic information 2, in which 23 voters participated. If the votes were closed today, the mentioned vote on geographic names from May 2010 would be undone. --Dan Polansky 08:28, 7 March 2011 (UTC)

Romanian vote in need of voters

The vote Wiktionary:Votes/pl-2011-02/Romanian orthographic norms has only three days to go, with the currently winning option "Proposal 2: use ș and ț (comma forms)" having 6:3:0 for support:oppose:abstain, whereas the other option "Proposal 1: use ş and ţ (cedilla forms)" has 3:6:0 for support:oppose:abstain. If some more people support proposal 2, it may be a clear-cut winner, while if more people oppose, the vote will be a more clear no-consensus one. As it is now, proposal 1 fails while proposal 2 is on the borderline between pass and no consensus. --Dan Polansky 08:47, 7 March 2011 (UTC)Reply

User:MalafayaBot for operation in Article namespace (2)

Hi, all.

I just wanted to let you know I opened a vote to request permission to run my bot in the Article namespace in order to update interwikis locally. If you would like to cast your opinion and/or vote, please do so at Wiktionary:Votes#User:MalafayaBot_for_operation_in_Article_namespace_.282.29. Thanks, Malafaya 22:59, 7 March 2011 (UTC)Reply

Vote: Deprecating less-than symbol in etymologies 2

FYI Wiktionary:Votes/pl-2011-02/Deprecating less-than symbol in etymologies is ending today, currently 9:3:1 for support:oppose:abstain, meaning 75% supporting the vote. 13 people have voted, out of 38 people who took part in the poll that preceded the vote, which is 34% of the pollers. --Dan Polansky 06:20, 8 March 2011 (UTC)

Update: it is now 9:4:1, so 69% supporting. --Dan Polansky 08:06, 8 March 2011 (UTC)Reply
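
The percentages quoted above can be checked with a few lines of arithmetic. This is just a sketch with made-up function names; it assumes, as the 75% figure implies, that abstentions are left out of the support percentage and that figures are rounded to the nearest whole percent.

```javascript
// Support share among support + oppose votes (abstentions excluded),
// rounded to the nearest whole percent.
function supportShare(support, oppose) {
   return Math.round(100 * support / (support + oppose));
}

// Share of poll participants who also voted, rounded the same way.
function turnoutShare(voters, pollers) {
   return Math.round(100 * voters / pollers);
}

var before = supportShare(9, 3);    // tally 9:3:1
var after = supportShare(9, 4);     // tally 9:4:1
var turnout = turnoutShare(13, 38); // 13 voters out of 38 pollers
```

Here `before` is 75, `after` is 69 and `turnout` is 34, matching the figures given in the two comments.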

Place names are under threat again

Just making you aware of my question regarding possible future deletions of place names. Fixing all entries to match CFI may take some time, but threatening flags put pressure on editors and discourage them: "fix or I delete them". The entries all have useful translations, and etymology, pronunciation, etc. can be added at a later stage; I strongly disagree with threats to delete entries that don't conform 100% with CFI. --Anatoli 00:00, 9 March 2011 (UTC)

See also the thread #placenames up for deletion, 11 February 2011, on the subject that you are discussing in Wiktionary talk:Criteria for inclusion#Delete or improve?, 8 March 2011. See also the two votes that address the disagreement, in #Votes on geographic names. --Dan Polansky 11:27, 9 March 2011 (UTC)Reply

Newspeak

I'm not sure that this is the right place to propose this, or that the proposal is actually workable. Basically, I'd like to propose adding a bunch of newspeak words (proper nouns like Miniluv I don't know, but crimestop and goodthink might be good?), and labelling them as such with a template like {{dated}}. Kayau 13:46, 9 March 2011 (UTC)

We don't include Newspeak words just as we don't include entries in other constructed languages like Klingon, Quenya, etc... They aren't going to be useful to the majority of people since this language is hardly ever used for actual communication. -- Prince Kassad 16:22, 9 March 2011 (UTC)Reply
But... some such words might be attestable in English. Mglovesfun (talk) 09:21, 10 March 2011 (UTC)Reply
Is 1984 a well-known work? Including these would then be automatic. As far as I am concerned, this would be a reason to not allow the well-known work exception to our normal attestation. DCDuring TALK 09:54, 10 March 2011 (UTC)Reply
It's undoubtedly a well-known work. The problem is, the "well-known work" rule clashes with the "fictional universe rule" in WT:CFI. Let's quote them:

Usage in a well-known work

and

Terms originating in fictional universes which have three citations in separate works, but which do not have three citations which are independent of reference to that universe may be included only in appendices of words from that universe, and not in the main dictionary space.

Now CFI doesn't say it specifically, but I'd take the "[u]sage in a well known work" criterion to mean 'includable in the main namespace'. Furthermore, I don't know what a fictional universe is. Not in any detail, anyway. It's easy to see how 1984 can only be considered fictional, but we've had this rule applied to The Da Vinci Code as well, which is set in modern-day Europe! So it seems that the 'usage in a well known work' is only cut-and-dry when the well known work is non-fictional.

Seriously though, what is a fictional universe? Do we have any sort of definition of it? Mglovesfun (talk) 11:02, 11 March 2011 (UTC)Reply

I would propose that it's only terms referring to things in a fictional universe; w:crimestop arguably has a definition independent of the universe, so the fictional universe rule wouldn't apply to it, but Miniluv is only defined in terms of that fictional universe, so the rule would apply to it.--Prosfilaes 20:56, 11 March 2011 (UTC)Reply

As a note, the entries for ungood, crimethink and doublethink already exist and there may be more. Kayau 12:24, 11 March 2011 (UTC)Reply

Sure, but ungood is well-cited; crimethink is citable outside the range of 1984, etc. If you can find two citations that don't refer to Orwell or 1984, then the argument about well-known work and fictional universe is moot.--Prosfilaes 20:56, 11 March 2011 (UTC)Reply
Correct, but that wouldn't help for our James Joyce stuff. One possible interpretation at WT:RFD#bababadalgharaghtakamminarronnkonnbronntonnerronntuonnthunntrovarrhounawnskawntoohoohoordenenthurnuk is that words used in well-known works are includable, but only in an appendix. That is, I assume that the CFI "[u]sage in a well-known work" does not apply only to the main namespace, but also to other ones, such as Concordance and Appendix. Cryptex already failed RFV entirely without being moved to an appendix. --Mglovesfun (talk) 15:47, 15 March 2011 (UTC)
Personally, I'm not a huge fan of appendixes. I'd rather put in everything we know about the word in a page in the main namespace.--Prosfilaes 19:19, 15 March 2011 (UTC)Reply

Pejorative, offensive, obsolete, colloquial, jargon, scientific terms etc.

I realize that there are many different kinds of users who contribute to this wiki. For that reason, the number of entries are very robust and thorough. However, after a certain point it begins to break down and the wiki's legitimacy as a valuable resource starts to fall apart, because specialized words are intermixed with what would be considered more proper words.

The idea I have put forth to the wiki webmaster, and they asked me to post it here, was that perhaps it would be nice if the wiki users could tag various terms with different labels. These words still participate in the full database and would not be excluded for any reason, but being able to tag words would give the users higher power over accessing the database and their usage. As anything wiki, these labels would be subject to the wiki revision process of the community. These would be similar to labeling things as a noun, verb, adjective, etc. but would provide a higher degree of context for how the word is utilized.

Once that feature is implemented, the webmasters could show the labels in the index view and allow users to filter out words to make usage of the database easier. For example, perhaps you know that the word you are looking for is scientific jargon, so you might be able to filter for these labels and only show a subset of the full index.

Thoughts from the community? — This unsigned comment was added by Imagic.designer (talkcontribs) at 21:27, 9 March 2011.

Unless I'm missing something, these already exist, see Category:Context labels. Also I don't see how including specialist vocabulary causes anything to 'break down' or 'fall apart'. If you don't want to know what something means, don't search for it! Mglovesfun (talk) 22:07, 9 March 2011 (UTC)Reply
Our aim is to include all words, as simple as cat, as specialised as catabolism (context label - biochemistry) or catalepsy (context label - pathology). We will continue to do so. SemperBlotto 22:13, 9 March 2011 (UTC)
Thank you very much for the clarification on this. Didn't mean to offend with my "fall apart" comment. You see, I use this site more for word studies and data mining of word groupings, not really for definitions of individual words. But I was unaware, and you have educated me on the fact, that this site is in fact equipped to handle such things. For example, let's say I want to not simply know what a square knot is, but to have a listing of all the different types of knots that are included in the database. Or perhaps a listing of all the prepositions in the English language.
What I was calling labels are done with categories on this site. My oversight, so again my apologies. There are a couple of limitations with this strategy, for example a category isn't all-inclusive, but that is overcome by the wiki nature of editing over time to place a category label on each word. Thanks again for alerting me to such things. — This unsigned comment was added by Imagic.designer (talkcontribs) at 10 March 2011.
I definitely wasn't offended; I just disagreed with you. As for all the English prepositions, see Category:English prepositions - there are probably some we don't have yet, or that we have and aren't categorized. We also have Category:Knots, same thing. Mglovesfun (talk) 14:20, 10 March 2011 (UTC)Reply
One last comment and not sure how to correct this, but here is an example of how it falls apart, because you get a lot of words (when data mining) that are hard to sift through. For example, red cunt hair is labeled in categories of slang and vulgarities but other related terms RCH and RPH are not included in these categories. Perhaps synonyms need some kind of category inheritance. Also categories need some standardization (or a way to cross-reference) because some words are ethnic slurs and categorized as offensive, but do not show up in the ethnic slur list. It is hard to maintain some consistency across the entire database. For those interested in word by word definitions this isn't a problem, but those who are interested in the database as a whole with word studies of groupings it is a major problem.
So carpetmuncher is labeled as vulgar, but the plural form carpetmunchers is not, nor is carpet muncher, nor the alternative rug muncher.
There is also some issue with errors in inputting the word entry. For example, the term acceptance lists it as obsolete when it points to acceptation, #8 in the definition portion. The term 'acceptation' is obsolete, but the word 'acceptance' clearly is not, and yet 'acceptance' shows up in the obsolete category, because it received that label within its definition page. User:Imagic.designer
Re your last point: You're right. Categories (in the software this Web site is based on) can only categorize entire pages, so using categories to mark obsoleteness marks the entire page as obsolete, even though only one sense of the word is really obsolete. Similarly, the entire page "la" is marked, using categories, as an English noun, even though only one sense of la is that of an English noun, and others are, for example, French. Arguably, then, categories (which must work this way, simply because of the software that the Web site uses) are not the best way to mark senses as obsolete or as English nouns — yet I really don't think there's a better way. Alternatively, arguably, we should have a separate page for each sense, and that page can then be categorized unambiguously. Someone's suggested this in the past, but it would vastly decrease usability (and editability) of the site as a whole.

Re marking rug muncher as vulgar: I agree: it should be. This is a collaborative Web site, and you can help by marking vulgar senses as such. Edit the page using the "Edit" link atop it and look for the sense in question. Then put {{vulgar}} at its start, right after the #. Thanks!​—msh210 (talk) 16:11, 10 March 2011 (UTC)Reply
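For anyone data mining the dump, the page-level behaviour of categories described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Entry` and `Sense` classes, the function names, and the toy glosses are all hypothetical and do not reflect how the site software actually stores anything.

```python
from dataclasses import dataclass, field
from typing import List, Set, Tuple


@dataclass
class Sense:
    gloss: str                      # placeholder text, not a real definition
    labels: Set[str] = field(default_factory=set)


@dataclass
class Entry:
    title: str
    senses: List[Sense]

    def page_categories(self) -> Set[str]:
        """A page-level category only sees the union of all sense labels."""
        cats: Set[str] = set()
        for sense in self.senses:
            cats |= sense.labels
        return cats


def pages_in_category(entries: List[Entry], label: str) -> List[str]:
    """What a category listing gives you: whole pages."""
    return [e.title for e in entries if label in e.page_categories()]


def senses_with_label(entries: List[Entry], label: str) -> List[Tuple[str, str]]:
    """What a word study usually wants: the individual labelled senses."""
    return [(e.title, s.gloss) for e in entries for s in e.senses if label in s.labels]


# Toy data (hypothetical): one obsolete sense among current ones.
entries = [
    Entry("acceptance", [
        Sense("sense 1 (current)"),
        Sense("sense 8 (see acceptation)", {"obsolete"}),
    ]),
]

print(pages_in_category(entries, "obsolete"))
print(senses_with_label(entries, "obsolete"))
```

The first call lists the whole page "acceptance" as obsolete; the second shows that only one sense carries the label, which is the distinction a page-level category cannot express.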

I certainly will look to see and edit those entries that seem to be mislabeled or incomplete. A couple of things that the website administrators could do might help. One is to merge redundant categories (or at least cross-reference them). For example, "pejorative" and "derogatory" mean the same thing, and yet each has its own category. Also, the webmasters might be able to have entries inherit category listings from their synonyms or plural forms, without the user having to add the category manually to each entry. It makes perfect logical sense that if an entry is included in a category, then the plural or participle form of the word would also be part of the same category. — This comment was unsigned.
Re automatically labeling forms of vulgar entries as also vulgar: While the software cannot do such a thing, it is doubtless possible to write a script that someone can run on his own computer that will do it. But consider: a Romance-language or Semitic-language verb can have numerous forms. (Even an English one has a few; for example, go has go, went, gone, goes, and going.) If a verb is vulgar, and you want all those forms to be marked as vulgar also, then the category will quickly fill up with verb forms and it will be impossible to glance through the category and pick out the forms one wants. (On the other hand, the category will be more complete. It's a trade-off.) This is IMO different from categorizing both carpet muncher and carpetmuncher as vulgar: those are different spellings rather than forms of the very same word, and both should IMO be categorized as vulgar. And it's certainly IMO different from categorizing both carpet muncher and rug muncher: those are mere synonyms.​—msh210 (talk) 16:43, 10 March 2011 (UTC)Reply
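A script of the kind mentioned above might look roughly like the following Python sketch. It is not an existing tool: the function name, the shape of the input data, and the relation kinds ("plural", "alternative spelling") are assumptions made for illustration, and it works on plain in-memory data rather than on the live site.

```python
from typing import Dict, Iterable, List, Set, Tuple


def propagate_categories(
    categories: Dict[str, Set[str]],
    relations: List[Tuple[str, str, str]],
    kinds: Iterable[str] = ("alternative spelling",),
) -> Dict[str, Set[str]]:
    """Copy each lemma's categories to related entries of the chosen kinds.

    categories: entry title -> set of category names
    relations:  (related title, lemma title, kind of relation)
    Returns a new mapping; the input is left untouched. Categories are
    copied one step only (from the lemma's original set), not transitively.
    """
    wanted = set(kinds)
    result = {title: set(cats) for title, cats in categories.items()}
    for related, lemma, kind in relations:
        if kind in wanted:
            result.setdefault(related, set()).update(categories.get(lemma, set()))
    return result


# Toy data taken from the examples in this thread.
categories = {"carpetmuncher": {"vulgar"}}
relations = [
    ("carpetmunchers", "carpetmuncher", "plural"),
    ("carpet muncher", "carpetmuncher", "alternative spelling"),
]

spellings_only = propagate_categories(categories, relations)
with_forms = propagate_categories(
    categories, relations, kinds=("alternative spelling", "plural")
)
print(spellings_only)
print(with_forms)
```

The `kinds` parameter is where the trade-off shows up: by default only alternative spellings inherit the label, and passing "plural" (or other form kinds) as well makes the category more complete at the price of filling it with inflected forms.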
Don't forget that we are all volunteers here. If you want something done, you normally have to do it yourself. SemperBlotto 16:34, 10 March 2011 (UTC)Reply
Seconded. The feedback you are giving is exactly the kind of thing we work with every day (by we, I mean regular contributors). Despite our progress, Wiktionary will never be "complete" simply because the content is limitless (as languages are). When in doubt, be bold, or ask an admin. ---> Tooironic 12:53, 11 March 2011 (UTC)Reply
We tend not to categorize inflected forms like fucks, fucked, fucking (etc.) so carpetmunchers isn't categorized because the singular is. --Mglovesfun (talk) 15:49, 15 March 2011 (UTC)Reply

Malagasy Wiktionary (mg.wikt)

Does anyone know what the story is with the Malagasy Wiktionary? Each time we get a new dump, and I find new interwiki-links to add, the great majority are to mg.wikt entries. For example, the current batch includes 144,623 edits, of which 109,922 edits (76%) are just to add links to mg.wikt. And those are all links to new pages, obviously, since we already had links to all the previously existing pages. Are they just incredibly industrious, or are they bot-adding pages? If the latter — are they useful pages, with real content, or are they just stubs? And, if the latter — should we be adding interwiki links to pages that don't have actual content?

(I seem to recall a similar question coming up once with ru.wikt entries, but I don't completely remember the conclusion; and anyway, it's been long enough since then that people's views could easily have changed.)

Thanks,
RuakhTALK 14:35, 14 March 2011 (UTC)Reply

Yep, that wiki uses bots to artificially increase the pagecount. It seems to mostly steal inflected forms from other wikis. -- Prince Kassad 14:43, 14 March 2011 (UTC)Reply
Here are Recent changes in Malagasy Wiktionary. Here the same thing but with robotic contributions shown. A look at http://mg.wiktionary.org/wiki/in suggests this page has been imported from en:wikt without any postprocessing: en:wikt templates are all over the page. The page was created on 13 February 2011 by User:Bot-Jagwar (contributions).
The pages that they have and that en:wikt does not have may be raw imports from other language dictionaries; this could be discernible by the templates that have been left in the imports. --Dan Polansky 14:59, 14 March 2011 (UTC)Reply
Here is the talk page of the bot owner: Jagwar. --Dan Polansky 15:02, 14 March 2011 (UTC)Reply
Thanks, both of you. So, those pages are basically useless. Even a native Malagasy speaker would be better off using our entry than using an mg.wikt entry. Onto my next question: should our entries have interwiki-links to theirs? I'm leaning toward "yes" — just as a matter of diplomacy, we don't pass judgment on other wikis' content — but part of me feels that we include interwikis because they're useful, so when they're not useful, we shouldn't include them. —RuakhTALK 15:25, 14 March 2011 (UTC)Reply
I don't really know. It seems okay with me if en:wikt interwiki bots don't care to update Malagasy interwikis; going out of our way to delete interwikis seems unnecessary. I don't find their way of procedure perfectly okay. It also comes down to what it costs to add these interwikis, in terms of attention and other resources. I do not know how sensitive the issue is, diplomacy-wise. --Dan Polansky 16:44, 14 March 2011 (UTC)Reply
See also Wiktionary:Beer parlour archive/2010/October#Malagasy Wiktionary. An interesting quote from there: "They do the same from the French wiktionary (without any change to the pages). Some templates have an mg version, making entries more or less readable, but some templates are not supported, and definitions, notes, etc. are in French. Lmaltier 19:54, 4 November 2010 (UTC)". --Dan Polansky 16:53, 14 March 2011 (UTC)Reply
Re: "going out of our way to delete interwikis seems unnecessary": Oh, definitely. I wouldn't even contemplate that. I more meant, should I even be bothering to add them? Currently I treat the mg.wikt-only edits as a low priority (I don't do them until after I've done all others), but maybe I should just skip those entirely.
Re: "what it costs to add these interwikis, in terms of attention and other resources": Not free, but not huge. From my standpoint: three-quarters of my interwiki edits are just to add these; but when you consider "fixed costs" vs. "variable costs", they probably only triple the cost of running an interwiki-bot.
Re: "I do not know how sensitive the issue is, diplomacy-wise": I doubt it's a particularly sensitive issue, just, we don't want to get in the business of deeming sister projects useless. We're so consistent in our use of interwiki-links that it's something that can be (and is) performed by bots, so pointedly excluding certain sister projects would be a bit . . . pointed.
Thanks for the link to that past discussion; I don't know if I wasn't paying attention at the time, or if I simply forgot about it. (This didn't affect me at all until I started the interwiki bot.) I guess mg.wikt is simply the current worst offender.
RuakhTALK 20:09, 14 March 2011 (UTC)Reply
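The "fixed costs" vs. "variable costs" estimate above can be made concrete with a small back-of-the-envelope model in Python. The linear cost model and the function name are assumptions for illustration only; the edit counts are the ones quoted at the start of this thread (144,623 edits in the batch, 109,922 of them mg.wikt-only).

```python
def cost_ratio(fixed: float, per_edit: float, other_edits: int, mg_edits: int) -> float:
    """Cost of a bot run with the mg.wikt-only edits, relative to a run without them.

    Assumes a simple linear model: cost = fixed + per_edit * number_of_edits.
    """
    without_mg = fixed + per_edit * other_edits
    with_mg = fixed + per_edit * (other_edits + mg_edits)
    return with_mg / without_mg


total_edits = 144_623
mg_edits = 109_922
other_edits = total_edits - mg_edits      # edits that touch other wikis' links

share = mg_edits / total_edits            # fraction of mg.wikt-only edits
print(f"mg.wikt-only share: {share:.0%}")

# With no fixed cost at all, the run costs a little over four times as much.
print(cost_ratio(fixed=0, per_edit=1, other_edits=other_edits, mg_edits=mg_edits))

# With a fixed cost worth half of the other edits (an arbitrary guess),
# a 3:1 split between mg-only and other edits exactly triples the cost.
print(cost_ratio(fixed=50, per_edit=1, other_edits=100, mg_edits=300))
```

Under this model the ratio is at most about four and shrinks as the fixed share of the cost grows, which is consistent with the rough "triple" figure given above.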
Actually, I was looking at the heading of this discussion ("Malagasy Wiktionary (mg.wikt)") and admired its design: it contains two keywords by which someone might be looking for a discussion on this subject: "Malagasy Wiktionary" and "mg.wikt". Then I wondered what I would find if I searched for "Malagasy Wiktionary" in BP archives, and so I found the other discussion. I had forgotten the other discussion long ago. --Dan Polansky 20:48, 14 March 2011 (UTC)Reply

The whole point of having Wiktionary in different language versions is to have content in different languages. Report 'em to wikipolice. Njardarlogar 19:51, 14 March 2011 (UTC)Reply

Yeah, I might do that — not because their content is pointless (though it is), but because they seem to be violating our copyright to do it. If they want to mass-import pages like this, they should be using the import tool, not a CopyVioBot. —RuakhTALK 20:09, 14 March 2011 (UTC)Reply
We might drop a note to Jagwar on that Wiktionary. Various cultures have various attitudes towards copyright, so it may seem to him that he is doing nothing wrong. He may be unaware of the issues related to multiple authors in revision histories. I have seen no post to his talk page that would talk about this practice. He may turn out to be responsive. I am myself not very confident about the issue, so what I could write is something like this: "Hello Jagwar, I and some other people in English Wiktionary think that it is not a good idea to copy pages from other dictionaries using a bot. Every page in, say, the English dictionary is copyrighted by its authors; if you want to import it to Malagasy Wiktionary in a way that respects copyrights and contributions of other people, you have to use the import tool: see TODO:LINK-TO-GUIDE-TO-IMPORT-POOL. Best regards, Dan Polansky". This could be extended with something like this: "Actually, merely importing pages without the intention to turn them into pages written in Malagasy seems to be a really poor idea: Malagasy Wiktionary should be written in Malagasy." I release this post into the public domain, so feel free to tweak it and repurpose as you see fit. --Dan Polansky 20:58, 14 March 2011 (UTC)Reply
Sorry, too late! What I did was leave a comment at w:mg:Dinika amin'ny mpikambana:Jagwar#Importation par les bots., in rusty French. Hopefully it asks the bot owner to stop, for copyright reasons, and points him/her at the import tool. It doesn't address the question of whether such edits are at all useful, but I didn't feel it was my place to raise that too strongly. (It does say, « il ne me semble pas utile d'importer de texte non-malgache à Wikibolana », which is hopefully intelligible with the meaning "it doesn't seem useful to me to import non-Malagasy text to Wikibolana", but that's it.) It also doesn't address the question of what to do with the existing copyright violations. (But what I did doesn't seem too different from what you were preparing to suggest.) —RuakhTALK 21:02, 14 March 2011 (UTC)Reply
Cool! I was just trying to push myself to put something together in spite of being really uncertain; it is great to see someone know what to do, and actually do it. --Dan Polansky 21:10, 14 March 2011 (UTC)Reply

The Malagasy Wiktionary now has more than 1,000,000 entries, so I'm placing it together with French and Chinese in our list of Wiktionaries in other languages. If this number drops due to concerns of quality or copyright, feel free to revert this change. --Daniel. 21:26, 14 March 2011 (UTC)Reply

What makes you think what you did was a good idea? --Dan Polansky 22:28, 14 March 2011 (UTC)Reply
I am with Dan here, we want to show the number of entries in the target language, not the number of English and French entries mirrored on the target URL. Let's hold off on including that wiki until a determination has been made about what is to be done. - [The]DaveRoss 10:07, 15 March 2011 (UTC)Reply

I have posted to Jagwar in English on his WP page and then on his WT page: mg:Dinika_amin'ny_mpikambana:Jagwar. On his WT talk page, people have already been trying to talk to him about the imports. Sorry for posting a link to his WP page; his WT page would have been a better link it seems. --Dan Polansky 13:53, 15 March 2011 (UTC)Reply

He tells me that he no longer imports pages from other Wiktionaries, due to the copyright concerns I had mentioned. (In fact, he says that he had stopped even before my comment, after other users raised similar concerns.) The current bot edits are "form of" entries, and they're actually in Malagasy. Personally, I don't know how useful a bot-generated "form of" is when the lemma form is a redlink, but a Malagasy speaker coming across one of our form-of entries might at least benefit from the grammar explanation of the Malagasy entry, even if for lexicosemantic information they need to come back to our lemma entry. So, I think it's clear that we should keep adding the interwikis. Do you agree? —RuakhTALK 18:52, 15 March 2011 (UTC)Reply
Coming from here, speaking in English, we can't really dictate how the mg.wiktionary should be run. Whether bot entries in Volapük or copyright concerns, the deciding factor should be the local community of Malagasy-speaking volunteers, who can decide if the entries are good and useful or not. But is there such a community? And what should the Wikimedia Foundation do with projects that instead of a community only have one or two free-wheeling bot operators? None of us is proud of the Volapük Wikipedia. It's a stain on our reputation as Wikimedia volunteers. I'm bewildered here, trying to restrain myself from creating too many of the Swedish entries in English Wiktionary for lack of a larger community. --LA2 19:09, 15 March 2011 (UTC)Reply
We can't dictate how mg.wikt should be run, but we can dictate how we are run. We're within our rights to link to them or not. And since they're violating our contributors' copyrights, we probably have an obligation to notify the Foundation so that this can be dealt with. (I give Jagwar full credit for stopping the bot-imports when the issue was pointed out to him; existing copyvios still need to be deleted somehow.) Unfortunately, I don't know whom exactly to notify, or how. —RuakhTALK 19:17, 15 March 2011 (UTC)Reply
I do not know whether we should keep adding the interwikis, and I do not oppose adding interwikis to mg:wikt, or at least not yet. Adding inflected forms that do not point to real content is pitiable IMHO, but possibly still basically acceptable. The high number of entries is driven in considerable part by the category for inflected forms for Volapük, which has 701,121 entries. I have picked one entry from that category, and it has zero Google hits. We have no idea whether these entries are at least correctly constructed; many of them are probably unattestable. Which is a great trick: go creating inflected forms for an artificial language, for which one might want the attestability requirement lifted. I have no idea how that guy has created these inflected forms, but I think he had better stop. I have posted more to his talk page (mg:Dinika_amin'ny_mpikambana:Jagwar), but to unknown effect. --Dan Polansky 19:55, 15 March 2011 (UTC)Reply
This is the problem: we think this is bad and want it to stop. But we speak neither Volapük nor Malagasy, so who are we to speak up? I think the Foundation needs to have some policy in place, similar to the policy for starting new projects. --LA2 23:18, 15 March 2011 (UTC)Reply
Must have not noticed an edit conflict earlier as I tried to post in reply to this; as long as the Malagasy Wiktionary has some entries with content, we should keep the interwikis. The second question about adding new interwikis; are any 'good' entries being created there at all? If so, keep 'em and do as LA2 suggests; go to Meta about it. Mglovesfun (talk) 00:19, 16 March 2011 (UTC)Reply
I think it might be a disservice to our users to have interwiki links to content we are very skeptical of. The only real benefit the interwiki links may have is to point other mg speaking folks to a project where they can contribute, but for our users the links are not so beneficial. - [The]DaveRoss 02:13, 16 March 2011 (UTC)Reply
Re: "This is the problem: we think this is bad and want it to stop. But we speak neither Volapük nor Malagasy, so who are we to speak up?": The guy does not speak Volapük either; Volapük is a constructed language. When I suspect that something is running astray, I feel entitled to speak up on a sister Wiktionary project, no matter whether I speak their language. Furthermore, the guy really has no mandate to do what he does: he is the sole admin on the project, so there is no Malagasy opposition that could stop him. If you actually believe what he is doing is wrong, you have to tell him. His inflation of numbers has a side effect that actually affects all Wiktionary negatively, albeit in a minor way: it completely skews the statistics. It is ridiculous to skew the statistics with 700,000 mostly unattestable entries from a constructed language. --Dan Polansky 07:48, 16 March 2011 (UTC)Reply
Dan, I do applaud your effort. However, I fear that it might be without effect since this guy can do what he likes in the absence of a local community. As an admin he has done nothing wrong that calls for revoking his rights, he only displayed poor judgement in creating worthless entries. We can't call for the closing of the project, since Wiktionary in Malagasy is a perfectly legitimate project. Maybe we need some kind of all-language Wiktionary committee that can coordinate projects and act (with some kind of authority) as the missing community on smaller projects. --LA2 11:10, 16 March 2011 (UTC)Reply
My effort may have little effect but it achieves at least one thing: Jagwar can no longer claim that his action went undisputed. I am already part of a larger Wiktionary community, without needing a specific formal title for that. So are you. Another effect is that it is now clear that Jagwar understands and speaks English fairly well, so English can be used as the language of communication to make things traceable to the largest audience. --Dan Polansky 13:23, 16 March 2011 (UTC)Reply

Category:Time

This is a relatively small project of reorganizing Category:Time. Here is a list of subcategories and a few examples of members. It's open to suggestions.

--Daniel. 23:53, 16 March 2011 (UTC)Reply

What is the problem you are trying to solve? There are 415 articles in Category:Time today. Is that too many? I don't think so. On the contrary, too many intermediate level categories (e.g. References of time) only make the category tree deep (tall) and narrow. We already have Category:Months as a more natural intermediate there. Many languages only have one set of month names, e.g. Swedish doesn't have words for the Persian calendar months, so there is no need for subcategories for each calendar. --LA2 01:29, 17 March 2011 (UTC)Reply
Does anyone agree with LA2? Category:Time is simply one of many messy categories; the lack of logic and consistency of topical categories has been mentioned multiple times by many people.
Here are some problems regarding the specific category tree in question, as it exists today (that can be fixed by various means, including but not limited to implementing the suggested category tree above):
I'm happy to collapse the tree into Time.​—msh210 (talk) 06:14, 17 March 2011 (UTC)Reply
Some of your subcategories make good sense, so you could just create them and start adding articles. I wonder if time travel has enough articles to make a good subcategory. As for the months and week days, I think the current categories are better than your proposal. Especially, I don't see the point in the rather abstract intermediate category "references of time". To me, that is a synonym of "time". --LA2 15:12, 17 March 2011 (UTC)Reply
LA2, you said that "references of time" is synonymous to "time". I disagree with your statement. Chronos, wristwatch and Novikov self-consistency principle are related to time but are not references, so they might be categorized into Category:Time but not into Category:References of time.
As I mentioned in a previous message, we currently have Category:Islamic months and Category:Hebrew calendar months. The word "calendar" should be either present or absent from both categories, for consistency. Relatedly, we currently have Category:Gregorian calendar months but Category:Persian months.
How many entries do you expect to be enough for a good category named Category:Time travel? --Daniel. 23:42, 17 March 2011 (UTC)Reply
If you can think of 20 terms, that's a good reason to create a category, in my opinion. (This rule of thumb would disqualify "months" as a category, if there are only 12 month names. But I'm not claiming to be logical.) --LA2 07:29, 28 March 2011 (UTC)Reply
OK. Category:Time travel would meet the threshold of 20 terms by having time travel, time machine, time-traveller, wormhole, Novikov self-consistency principle, timeslip, flux capacitor, chronoclasm, tesseract, grandfather paradox, clock paradox, closed time loop, bootstrap paradox, closed timelike curve, time dilation, Roman ring, chronovisor, retrocausality, suspended animation and autoinfanticide. --Daniel. 09:41, 28 March 2011 (UTC)Reply
I am mentally busy elsewhere so only a sketchy response: Overall, I do not like the proposal made in the introduction of this thread, although there are some parts of it that seem good. One pick in particular is "Category:Duration (new, old, recent)": new, old and recent do not come under the head of duration. "new" and "old" have several meanings in some of which they come under the head of "age"; "recent" comes under the head of "distance in time in the past direction", which I do not know how to say in one word, other than perhaps "recentness". Another particular pick: "Category:Time-measurement devices" is redundant to "Category:Clocks". --Dan Polansky 11:42, 18 March 2011 (UTC)Reply
Dan, I would support merging "Category:Time-measurement devices" with "Category:Clocks" to make it simpler and avoid that redundancy. Also, "Category:Duration" may be removed from the proposal, and replaced by one or more categories to fit the multiple meanings of the words in question. For example, possibly Category:Past might contain old, new, recent, preterite, memory, retrocognition, and so on. --Daniel. 12:13, 18 March 2011 (UTC)Reply

I created a vote on this subject: Wiktionary:Votes/2011-03/Category:Time. --Daniel. 21:38, 28 March 2011 (UTC)Reply

Category:Etymology -- Quechua derivations and Quechuan derivations

Hi! In Category:Etymology ([10]) I see the following sub-cats:

  • Quechua derivations
  • Quechuan derivations

As far as I know both are about the same language. I would like to fix that but I'm not sure what's correct here. (By the way: the topical categories without language prefix are terribly hard to see; they drown in the prefixed ones.) --MaEr 19:01, 18 March 2011 (UTC)Reply

I seemed to think that Category:Quechuan derivations was nominated for deletion, and passed. Though, I don't know why. Mglovesfun (talk) 14:53, 20 March 2011 (UTC)Reply
Seems that Quechua is a language, and Quechuan is the language family. Hmm. --Mglovesfun (talk) 13:05, 22 March 2011 (UTC)Reply
Thank you for the information. This looks a bit complicated. I'll leave it to a quechuanologist. --MaEr 18:46, 22 March 2011 (UTC)Reply

I still think it should go. WT:LANGTREAT comes to my mind. -- Prince Kassad 18:48, 22 March 2011 (UTC)Reply

Entries with two etymologies

The Greek word μα represents 2 parts of speech with separate etymologies. To date I have been entering these with 2 Etymology headings (1 & 2), with POS headings at level four instead of three. An anonymous editor has used a different way of solving the problem - by giving the 2 etyms. under one heading and using the sense (not used but intended) to differentiate the POS (now promoted to the normal level 3).
This seems a clear way of doing things but not one I'm accustomed to.

1. I see that the French and Greek wikts use the one etym head format (at least they do for fr:μα & el:μα). Do we have house rules?
2. Is the one etym format used elsewhere in the English wikt?
3. Is this new format acceptable - and should it have wider use?
Any thoughts? —Saltmarshtalk-συζήτηση 12:29, 19 March 2011 (UTC)Reply
I would prefer that layout, because it's a lot less confusing than the layouts we have now. Some of them might have four etymologies and it is very confusing for people who want to look up a word and don't care for its origin at all. —CodeCat 12:36, 19 March 2011 (UTC)Reply
I think I agree —Saltmarshtalk-συζήτηση 12:49, 19 March 2011 (UTC)Reply
I think I agree with that as well - so perhaps both methods should be permissible - perhaps they already are? —Saltmarshtalk-συζήτηση 06:05, 20 March 2011 (UTC)Reply
French Wiktionary until recently only used this method. The downside is it can get very confusing, for example when there's more than one noun, you get things like noun #1, noun #2 (etc.). I still prefer to split by ety, but for shorter entries it's ok I think. This entry is short enough that it would all fit on a single page with no scrolling anyway... Mglovesfun (talk) 14:52, 20 March 2011 (UTC)Reply
A reason for putting homographs in different sections is that they are different words which just happen to be written the same. We split different spellings of the same word (colour/color) rather than handling them on the same page like Wikipedia does; it is counter-intuitive to combine unrelated words. - -sche (discuss) 19:30, 20 March 2011 (UTC)Reply
It would also be counter-intuitive and confusing to have two systems of etymology; thus we should not permit both methods. As Ƿidsiþ and Prince Kassad suggest, using the single etymology header in complex entries (like race) we would need to distinguish six noun senses from five noun and three verb senses from one noun sense, and to be sure any added senses were correctly listed in the etymology; thus it is best to keep homographs separate. - -sche (discuss) 19:39, 20 March 2011 (UTC)Reply
See also [[User:Msh210/ELE]].​—msh210 (talk) 16:44, 21 March 2011 (UTC)Reply

Unblocking

Can someone please unblock me? I have been unjustly blocked by Mglovesfun. --Dan Polansky; --188.138.84.132 14:37, 20 March 2011 (UTC)Reply

To quote myself "I'm actually tempted to block you for longer for "intimidating behavior/harassment". That's my main regret. I'm just hoping to avoid it." Mglovesfun (talk) 14:48, 20 March 2011 (UTC)Reply
If you check my talk page, you will find not a single recent complaint about my behavior that would be blockable. The single complaint that I have received that suggested blockability was by SemperBlotto on his talk page. I cannot adjust behavior if no one complains about it. SemperBlotto in particular has failed to complain about my behavior that he disliked: my posting notifications for votes. Instead of telling me on my talk page that he does not want any further notifications, he has decided to sabotage a unification effort. --Dan Polansky --188.138.84.132 14:54, 20 March 2011 (UTC)Reply
endorse blocking. You do not seem to be able to engage in discussions in a civil manner. This short block was more than overdue. -- Prince Kassad 15:09, 20 March 2011 (UTC)Reply
Prince Kassad, you are making a blank accusation of my incivility. Do you care to provide a diff or two that shows what you consider my incivility? --Dan Polansky 15:30, 21 March 2011 (UTC)Reply
Could someone link to some of the edits in question? I think I've missed something. —RuakhTALK 15:14, 20 March 2011 (UTC)Reply
I have to finish something offline so this message may be a bit rushed, and I intend to come back online later — This unsigned comment was added by Mglovesfun (talk • contribs) at 16:24, 20 March 2011 (UTC).Reply

I just wanted you to leave me alone to make some edits. Yes edits, we make lots of them here. I don't object to your calling my actions into question, so I answered your question and then hoped you'd drop it and I could keep on editing.

MediaWiki has no solution whatsoever to this problem, leaving me with the options

  1. Do nothing
  2. Protect my talk page, sysops only
  3. Log out and edit under IP or create a second account
  4. Block you (that is, Dan Polansky)

I went for #4 based also on other people's comments.

Furthermore I think you should get a further block for abusing multiple accounts; I find it highly hypocritical to complain on user's talk pages about not following Wiktionary policy, but breaking said policies when it suits you. Mglovesfun (talk) 16:24, 20 March 2011 (UTC)Reply

It is bad to evade discussions by blocking. On the other hand, it is just as bad to continue fruitless (but entirely civil!) discussions on possible minor violations of bot operation policies. In such cases, I'd personally opt for #1. -- Gauss 16:40, 20 March 2011 (UTC)Reply
I'm evading the harassment, not the discussion. I'll be the judge of whether I feel intimidated or harassed, thank you. Mglovesfun (talk) 16:48, 20 March 2011 (UTC)Reply
I get the impression that I'm missing (or haven't paid enough attention to) much of the context; but going just based on the discussion at Mglovesfun's talk-page, it seems that Dan Polansky's initial concern was valid, but that his approach was hostile and confrontational, and that it quickly grew more so, even after Mglovesfun tried to ignore the hostility (per WT:AGF) and to take his questions at face value. Indeed, it looks in retrospect as if Dan Polansky's original comment may have been intended as somewhat of a rhetorical "trap" for Mglovesfun: asking Mglovesfun to defend a position, while preparing to "pounce" with contrary evidence.
I don't know whether Mglovesfun's later comments and subsequent blocking of Dan Polansky were justified, but they certainly seem understandable. :-/
RuakhTALK 19:07, 20 March 2011 (UTC)Reply
I admit I have been confrontational. I came to the talk page with the intent to expose MG as a policy-flouting oligarch, who performs mass changes in the mainspace contrary to consensus and previous unanimous practice, yet opposes mass changes in the mainspace that are preferred by a large majority of the editors. I have chosen the rather aggressive method of asking unpleasant yet relevant questions, trying to get him to concede that the principles that underlie his actions are unacceptable. I feel I have treated him no more harshly than he has treated EncycloPetey, so he does not really get to complain.
I have not abused multiple accounts: I have created no other account but rather edited anonymously; IP addresses are not user accounts. When I get unjustly blocked, I feel justified in making anonymous edits that ask for my unblocking (using anonymous IP addresses). I have made no other edits using anonymous IP addresses; in particular, I have created no new entries but rather filed them into a personal list of entries to be created. If there really is a policy that forbids anonymous asking for unblocking, this policy should be revised for its being unjust. --Dan Polansky 15:30, 21 March 2011 (UTC)Reply
One more thought on "abusing multiple accounts": I have acted in good faith that I am violating no policy. Can someone point me to a policy that forbids using multiple accounts? This search does not find anything for me. --Dan Polansky 05:51, 22 March 2011 (UTC)Reply
IP addresses are accounts and no; we don't have policies on loads of things. WT:BOT is toothless and WT:BLOCK was a major step in the wrong direction, as it has a lot less specific detail than its predecessor. Mglovesfun (talk) 11:17, 22 March 2011 (UTC)Reply
Yeah, I'm not exactly known for subtlety myself. Mglovesfun (talk) 11:18, 22 March 2011 (UTC)Reply
Finally, "[or] I will make sure that the task is so innocuous that no one could possibly object." Surely no such edit exists; surely anyone can object to any edit at any time; hell, I could object to my own bot edits and cause myself to be in violation of WT:BOT. What I also want to know is what Dan's objection is. I suppose re-reading my talk page, technically he hasn't objected to the edits yet so I'm not in violation of WT:BOT until he does or someone else does (including myself). So what is this objection? Mglovesfun (talk) 11:23, 22 March 2011 (UTC)Reply
Let us sum it up: you have made several false statements, such ones that accuse me of things that I have not done, but you do not show any remorse. You have failed to explicitly warn me that you are about to block me. You have given a false reason for my block. Then you have threatened me with extending the block even further. Finally, you falsely accused me of hypocrisy; when asked to provide evidence and links to policies that I have allegedly broken and whose violation should suffice to extend my block even further, you switch the subject and talk about how WT policies are actually broken. Instead of responding straight to the questions I have asked and admitting that you have goofed a bit, you have evaded the question. Also an option. --Dan Polansky 11:52, 22 March 2011 (UTC)Reply
You're confusing 'falsely' with the fact we just have differing opinions on the matter. To be honest, all I can say in reply is stuff that I've already said here or on my talk page. I think we have more important things to do than repeat ourselves ad nauseam. Like I said, to be a violation of WT:BOT someone would have to object, and nobody has. --Mglovesfun (talk) 12:00, 22 March 2011 (UTC)
This is not a matter of opinion but of fact. Your threat still remains unretracted: "I think you should get a further block for abusing multiple accounts; I find it highly hypocritical to complain on user's talk pages about not following Wiktionary policy, but breaking said policies when it suits you". Instead of admitting that you did something wrong, you continue to switch the subject. A new fallacy that you have introduced consists in your reclassifying matters of fact as matters of opinion, which makes it possible for you to claim of any disagreement that we just have a differing opinion. You imply that I have "abused multiple accounts", when in fact I have only used an IP address to ask for unblocking, which could hardly be construed as "abuse". You have conceded not a single point. Now go, and call EP a bully. --Dan Polansky 12:08, 22 March 2011 (UTC)
How can any statement that starts "I think" be a matter of fact? You can't 'force' me to agree with you. We disagree, period, end of. Now let's do some editing. The dictionary, remember? --Mglovesfun (talk) 12:22, 22 March 2011 (UTC)Reply
First off, if you do not want to respond to me, just stop responding. I do not feel like stopping responding to you. You seem to be entitled to a last word, but I do not think you are.
I did not introduce my statements with "I think". I am wholly certain that no one has pointed to a single policy that I have broken, so, to the best of my knowledge, I have broken no policy and your accusation is false. You do not admit that much, instead resorting to what might be called the mere-opinions-no-facts fallacy. --Dan Polansky 12:30, 22 March 2011 (UTC)
I introduced my statement with I think. The only impression I really get is that you're angry and upset. Presumably not about MglovesfunBot, which you're not even commenting now, even when I bring it up. My best guess is whatever it is is not Wiktionary related at all, but this is a 'safer' outlet for your anger. --Mglovesfun (talk) 13:03, 22 March 2011 (UTC)Reply
I now see what I have indeed overlooked: the first part of the threat, the one before semicolon, was indeed secured by "I think". That has made the threat a little bit more covert and defensible. Nonetheless, the part after semicolon was not so guarded: "I find it highly hypocritical to complain on user's talk pages about not following Wiktionary policy, but breaking said policies when it suits you", unless you construe "I find" as yet another guard, yet the part "breaking said policies when it suits you" is not so guarded. But even I admit that this is getting really technical. The essence is that you are evading responsibility for your threats, and you show no remorse when you are proven wrong; you do not admit things until you get pushed very far by very targeted questions, which of course is nothing else but interrogation, a rather unpleasant thing to be exposed to. --Dan Polansky 13:23, 22 March 2011 (UTC)Reply
Let me ask another question that I have not asked: do you believe that I have "abused multiple accounts"? --Dan Polansky 13:29, 22 March 2011 (UTC)Reply
Re: "Presumably not about MglovesfunBot, which you're not even commenting now, even when I bring it up." Wait a minute. Where have you brought up MglovesfunBot? Is there something you said about MglovesfunBot that you want me to respond to? (Fact is, there is still one blatantly false statement of yours that I still have not mentioned, sitting in the pipeline. I am much more enraged by your having unjustly blocked me, though, and I do not want to discuss too many things at once.) --Dan Polansky 13:33, 22 March 2011 (UTC)
Because Dan Polansky is angry (or enraged as he puts it) and wants people to know about it. Nothing to do with Wiktionary. --Mglovesfun (talk) 13:46, 22 March 2011 (UTC)
Fact is, I have posted on your talk page today, but you have referred me to Beer parlour. If you want to continue on your talk page, we can do that. --Dan Polansky 13:57, 22 March 2011 (UTC)Reply
Fact is, you're being selfish. This debate says to me "I'm upset; everyone look at me until I feel better" while what people are actually doing is concentrating on the dictionary-side of our project. Yes dictionary, remember that? --Mglovesfun (talk) 14:20, 22 March 2011 (UTC)Reply
You have made yet another inaccurate statement about me pushing things to Beer parlour, but do not retract. If I am being selfish, what is it that makes you respond to my objection? What prevents you from ignoring me and going to build the dictionary, as you have suggested several times? (You don't need to answer. I am as okay as you are to have the last word, and win the bear-fight described by László Mérő. I still have some resources to waste. But I am inclined to leave your next response unanswered in order that you can "go building the dictionary", and collect your bear-points. ) --Dan Polansky 14:26, 22 March 2011 (UTC)Reply
You're determined to turn this into some sort of macho pissing contest. It ought not to be. You're confusing "inaccurate" with "opinionated". I have an opinion. I've often been told not to present my opinions as facts, but this is the first time I've had it the other way around; I present an opinion and the other person tries to present them as facts against my will. --Mglovesfun (talk) 14:35, 22 March 2011 (UTC)Reply
See the bracketed part of my previous post :). And enjoy your mere-opinions-no-facts fallacy. --Dan Polansky 14:41, 22 March 2011 (UTC)Reply

I declare this discussion over. -- Prince Kassad 15:02, 22 March 2011 (UTC)Reply

Prepositional phrases

Since Wiktionary:Votes/pl-2010-01/Allow "Prepositional phrase" as a POS header passed some time ago, I assume that we should be going through Category:Prepositional phrases and changing the headers of all the ones that don't have ===Prepositional phrase=== with a bot/AWB. Am I right? —Internoob (DiscCont) 18:11, 22 March 2011 (UTC)Reply

There are cases where the phrase is only used adverbially, so an adverb category and heading is appropriate. When I add something as a prepositional phrase I try to make sure that it is used both as adjective and as adverb. So it is not a good bot candidate. AWB also makes for a rush to judgement, but could work with sufficient self-restraint. DCDuring TALK 19:35, 22 March 2011 (UTC)Reply
Could you provide some examples, DCDuring?--Brett 12:25, 23 March 2011 (UTC)Reply
I'll try later to provide some candidates. I would look among phrases with non-spatial prepositions and those that are idiomatic. Another approach to finding them is to use the category-intersection capability of CatScan on the tool server. The intersection of English prepositional phrases, English adverbs, and NOT English adjectives would provide some. But not all such phrases would be hard-categorized as prepositional phrases. DCDuring TALK 16:26, 23 March 2011 (UTC)Reply
From the first 30 or so generated by catscan prep phrases + adverbs: against the clock, against time, at a glance, at a pinch (I think, not part of my idiolect), at all, at any rate, at heart. Let me know how many candidates you would like. DCDuring TALK 17:13, 23 March 2011 (UTC)Reply
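The category intersection described above (prepositional phrases AND adverbs AND NOT adjectives) can be sketched with plain set operations. This is only an illustration: the membership data below is made up for the example and does not reflect the actual state of any Wiktionary category, and the function name is hypothetical; a real run would take its data from CatScan or a database dump.

```python
# Illustrative category membership only; NOT real Wiktionary data.
categories = {
    "English prepositional phrases": {
        "against the clock", "at a glance", "in front", "out of order",
    },
    "English adverbs": {
        "against the clock", "at a glance", "in front", "quickly",
    },
    "English adjectives": {"out of order"},
}


def adverb_only_prepositional_phrases(cats):
    """Return entries in both the prepositional-phrase and adverb
    categories that are not also in the adjective category."""
    in_both = cats["English prepositional phrases"] & cats["English adverbs"]
    return sorted(in_both - cats["English adjectives"])


candidates = adverb_only_prepositional_phrases(categories)
for entry in candidates:
    print(entry)
```

Each candidate would still need a manual check, since an entry can be missing from the adjective category simply because nobody has categorized it yet.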
Ruakh has provided a helpful list (below) of entries that are at once (!) idiomatic and categorized as prepositional phrases and adverbs, but not adjectives. in front is an example of an entry that has IMHO been improperly so categorized, but which now has prep phrase as header and category using {{infl}}. As the list is dynamic, it no longer appears on the list. It was #10. The items preceding it seemed more likely to be adverbial only, IMHO. DCDuring TALK 18:56, 23 March 2011 (UTC)

Thank you for the examples. And could you explain what you mean by preposition functioning as an adjective? I get the adverb idea (you mean it answers the questions when, where, why, and how, right?) --Brett 19:37, 23 March 2011 (UTC)Reply

After forms of "be" and other copulas. DCDuring TALK 20:03, 23 March 2011 (UTC)Reply
Compare: This item costs next to nothing vs. The cost of this item is next to nothing. —Internoob (DiscCont) 21:21, 23 March 2011 (UTC)Reply
There are many expressions where a great deal of expertise, work, consultation of references, or arbitrary decisiveness is required to classify something as "really" adjectival or adverbial. I don't and won't have CGEL in front of me, don't have the expertise, am not decisive about such matters, and am unwilling to work to a resolution. I am simply interested in preserving the result of the labors of those who have gone before and may have correctly classified a given prepositional phrase into one of the traditional PoSes. When a good usage example shows use of the prepositional phrase after a copula, that suggests that they were not so careful about PoS (as was the case with in front). As to my own personal preference, I wouldn't mind everything being called a prepositional phrase, but I'm not willing to impose that preference on others. DCDuring TALK 22:08, 23 March 2011 (UTC)
If you're wondering about what the CGEL would say, one of its key principles is the separation of category and function. In the CGEL there is no question of something functioning as an adjective or an adverb. These are categories, and dictionaries label things primarily by their category. Thus something is an adjective (or AdjP) or it is not. The adjective may or may not function as the complement of a linking verb, and if it doesn't (e.g., drunken) a note to the effect would be a useful kindness, but that has no bearing on its category. What you are calling "adverbial" is a common but unfortunate label which leads people to conflate the category of adverb with a function that is commonly performed by adverbs. As far as the CGEL is concerned, if something is headed by a preposition, it's a prepositional phrase. Period.--Brett 23:39, 23 March 2011 (UTC)Reply
And I don't remember seeing a single entry in any dictionary that specifies a PoS for a phrase, except for NPs and phrasal verbs. Nevertheless, when adopting the Prepositional phrase header was discussed, there was an explicit statement of conservatism used to sell it. Let's not break faith with voters, however common that practice might be elsewhere. DCDuring TALK 00:27, 24 March 2011 (UTC)Reply
I missed the vote, but I went back and read the discussion. I don't find such a promise. Sorry for being a pain, but could you point it out for me?--Brett 00:42, 24 March 2011 (UTC)Reply
See the talk page for the vote and Wiktionary:Beer_parlour_archive/2010/January#Preposition_forms_and_prepositional_phrases.. Also note the tiny number of voters. DCDuring TALK 00:51, 24 March 2011 (UTC)Reply
Yes, I've been through it. I see no explicit statement of conservatism.
Regardless, I'm not sure why we would make the careful category-function distinction between true adjectives and nouns used as modifiers but when it comes to various uses of prepositional phrases, take the other route. The fact that one is a lexical category and one a phrasal category should have no bearing on the matter.--Brett 01:06, 24 March 2011 (UTC)Reply
That particular example — next to nothing — seems like a pronoun to me. You can refer to it not only with how much ("next to nothing is how much I make") but also with what ("next to nothing is what I make"). But what do I know.​—msh210 (talk) 22:17, 23 March 2011 (UTC)Reply
Yes, and no. Here, next is not the head, and nor is to. The head is the determiner no, but it is part of a compound determiner: nothing, and it's being modified by the prepositions. Nothing is not a pronoun as pronouns cannot be modified by adverbs (e.g., almost nothing but not *almost I; yes you can have that's almost it, but almost here is part of the VP, and doesn't modify it; cf. *almost it was there.)--Brett 23:39, 23 March 2011 (UTC)Reply
So I'm not really sure what it is, but I'd say it's SOP.--Brett 01:12, 24 March 2011 (UTC)Reply

I'm creating a Kindle dictionary based on Wiktionary

I have produced a first version of a free ePub/Kindle dictionary based on Wiktionary and GCIDE v.0.46 XML, i.e. Webster's 1913 Unabridged with additions from WordNet. See: http://personal.inet.fi/koti/korhoj/

Now my goal is not to sell this, it's freely available. For convenience, I want to publish it in Amazon, and due to distribution costs they require a minimum price of $2.99 for files over 10 MB! They are considering whether public domain books (yes, I know that Wiktionary is not PD) might be published for free. At any rate, my dictionary is freely available from my webpage. Anyone have anything against this?

I read the Jan 2011 article Wiktionary:Beer_parlour#Publishing_a_Wiktionary. I'm not overly concerned about the legal side as I'm not selling it as such and I trust you don't want to sue me. I merely linked to the English Wiktionary site and included a reference to the license. There's really no good way to include a list of authors in Amazon book details. The best I could come up with was to list "Wiktionary Volunteers" as an Editor. The book itself is less of a problem. I suppose we could produce a compilation of all the volunteers and I could link to it, although I don't think it's worth the effort.

Interestingly enough, GCIDE is covered by yet another (free) license so I had to license entries from GCIDE under that license. Since I'm not a lawyer (I'm an IT expert) I'm not sure if this is the way to do it but should be good enough.

So far I have mostly tackled the GCIDE XML, using an XSLT2 transformation. For Wiktionary I have so far created only the most crude transformation, based on entries in the single-line format. The goal was to just quickly include the bulk of the definitions. I have yet to tackle most of the niceties such as inflections. For GCIDE I have already included most of them though.

As I do have other engagements, I was wondering, is there any interest in participating in, say, SourceForge-based development of the Java program? It is really crude at the moment. I only read the editing rules this morning...

Korhoj 06:30, 24 March 2011 (UTC)Reply

Might I recommend you get hold of an IP lawyer before proceeding further?​—msh210 (talk) 16:08, 24 March 2011 (UTC)Reply

Boldface in image captions

I think there should be no boldface in image captions. For boldface in image captions, see this revision of "grillwork"; for no boldface, see most images in Wiktionary and this revision of "grillwork", which was reverted by a local oligarch without a summary. If anyone has an idea how to achieve no boldface in image captions, please let me know. --Dan Polansky 10:33, 24 March 2011 (UTC)Reply

In a sense, an image is a usage example, so the headword should be in bold shouldn't it? —CodeCat 10:54, 24 March 2011 (UTC)Reply
For one thing, I have a proposal in the pipeline to remove boldface from usage examples, as I do not see what useful function it serves and it highlights what should not be highlighted. Definitions contain no boldface and are more important than usage examples. Other dictionaries do not highlight the terms in usage examples using boldface, from what I have seen (lemmings). This proposal stands little chance of passing through a supermajority, though, so I have planned to let the proposal sit in the pipeline indefinitely.
For another thing, image captions are not really usage examples. It is true that the images themselves are examples of things to which the word is used to refer, so, informally, the images themselves are something like usage examples, and yet, images are not exactly usage examples. In any case, the current hugely prevalent practice is to have no boldface in image captions, and I like this practice.
As regards processes, I do not see why a minority of editors should overthrow the common practice in the main namespace. --Dan Polansky 11:06, 24 March 2011 (UTC)Reply
Oops, the images themselves are not examples of things to which the word refers; they are mere images of the things to which the word refers. --Dan Polansky 11:08, 24 March 2011 (UTC)Reply
Argument 1: That captions are running text
  1. All usages of a word in running text on its entry page have been highlighted, including principally usage examples and usage notes.
  2. Captions are a form of running text.
Argument 2: That imperfect, yet useful, images require relatively longer explanatory captions.
If one believes that the only images we have or should have are those that provide perfect self-explanatory images that need no caption other than the headword itself, then one might believe that there is no need for captions that make explicit the link between an image and its caption. We do not have access to a sufficient range of images to provide those perfect images. And yet even imperfect images can illustrate a definition of a headword. If a caption has a less-than-obvious relationship to a headword sense it may be essential to make clear what that relationship is, which almost always means a longer caption. Bolding the headword in the caption reminds the user of what is supposed to be important about the image.
Argument 3: That a caption is as good a usage example as most.
An explanatory caption may be a phrase or a sentence. It is not likely to be particularly unusual or unrealistic if it actually explains something in an image.
Argument 4: That images may only relate to a single sense of a headword requires bold to draw attention to the sense.
Polysemic words may have an image relating to but a single sense. It is necessary to make sure that users can understand which sense is illustrated. Whether it is a sense number or a gloss, the bolded headword draws attention to the immediate vicinity of the sense number or gloss.
Argument 5: A consistent user interface requires that if the headword is sometimes to be bolded in captions, that it always be bolded.
Both for contributors and for users, a consistent rule is desirable.
This summarizes arguments favoring bold for image captions. DCDuring TALK 15:00, 24 March 2011 (UTC)Reply
Ad 1: Usage notes do not always highlight the headword: our practice is inconsistent. Etymology sections, when they mention the headword (which is, admittedly, not often), do not generally boldface it. Derived-terms lists which list phrases including the headword do not boldface it. Nor should any of those.
Ad 2: The caption should clearly enough point to the relevant sense that there be no need to emphasize the headword. A close-up picture of someone serving a tennis ball, used to illustrate racquet, can read "Someone holding a racquet while playing tennis"; this is clear enough without boldfacing.
Ad 3: Granted. This may be sufficient reason to boldface the headword in an image caption. Still, personally, I don't like doing so.
Ad 4: I don't understand this argument. How does boldfacing the headword in an image caption specify which sense is meant?
It draws attention to the headword, which immediately adjoins the gloss or "sense" + sense number. More bolding strikes me as excessive and creating deception, confusion, and distraction. — This unsigned comment was added by DCDuring (talkcontribs) at 16:40, 24 March 2011.
Ad 5: Granted, I suppose, but I've answered your arguments (i.e., 1 and 2) that relate to only some entries.​—msh210 (talk) 16:06, 24 March 2011 (UTC)Reply
I agree with all or most of what msh210 said. I will pick some highlights.
Statement 1-1 is false, as msh210 has pointed out: not all usage notes have the headword in boldface, and IMHO no usage notes should have the headword in boldface.
Argument 5 is a glaring fallacy. The argument that says that consistency requires "... if the headword is sometimes to be bolded in captions, that it always be bolded" can be easily rephrased as "... if the headword is sometimes to be not bolded in captions, that it always be not bolded"; the structure of the argument does not support bolded over nonbolded at all. But this is very plain: the requirement of consistency of formatting of captions does not in any way indicate preference of one formatting over another.
To hear "Both for contributors and for users, a consistent rule is desirable" from a person who has helped to ruin the recent vote on unification of formatting of etymologies (Wiktionary:Votes/pl-2011-02/Deprecating less-than symbol in etymologies, also diff) is curious indeed. Nonetheless, I agree that unified formatting of image captions is desirable. There already is almost unified formatting, without boldface. --Dan Polansky 07:34, 25 March 2011 (UTC)Reply

I'm against bold face as it doesn't work well in all scripts (see CJK for example). -- Prince Kassad 15:30, 24 March 2011 (UTC)Reply

I tend to use bold face like I would for a headword. I suppose when boldface is not used for the head word then it shouldn't be used elsewhere, such as CJK and Hebr. Mglovesfun (talk) 15:33, 24 March 2011 (UTC)Reply
I would not think that we would necessarily impose, prefer, or allow such a practice for scripts where the harm was greater than the benefit. I am only speaking for Latin script, really English. Should this be a matter for Wiktionary:About English? DCDuring TALK 16:40, 24 March 2011 (UTC)Reply

CFI and vandalism

Now this is a section CFI could do well without:

==Vandalism==

From time to time, various parties will insert material into Wiktionary which clearly has nothing to do with Wiktionary's purpose or practices. Such activity is considered vandalism and will be undone at the first opportunity. If the vandalism consists of an edit to an existing page, that edit will be reverted. If the vandalism consists of a new article, that article will be removed. This is done at the discretion of the administrators and does not require discussion, even if the vandalism consists of a new article for a term which would otherwise meet these criteria but has not yet been entered legitimately.

Any supporters of removal of the section from CFI? And if removal is not supported, are there any proposals for abbreviation of the section?

The section first appeared in CFI in this revision, on 11 August 2005. --Dan Polansky 10:22, 25 March 2011 (UTC)Reply

I support its removal.​—msh210 (talk) 15:52, 25 March 2011 (UTC)Reply
No strong feelings but it seems to me this is relevant and correct so it could stay. It's stating the obvious but I don't mind that in policies; I don't think you can have too much clarity. --Mglovesfun (talk) 15:59, 25 March 2011 (UTC)
There is a risk of meta:instruction creep and excess bureaucracy if everything is explained to a high level of detail. However, in this case it is relevant and helpful to outline guidelines for dealing with vandalism. I do think it could be shortened a bit though. Tempodivalse [talk] 19:41, 25 March 2011 (UTC)Reply

Here is a shorter version of the text:

==Vandalism==

Edits and pages which clearly have nothing to do with Wiktionary's purpose or practices will be undone or removed at the first opportunity, at the discretion of the administrators.

--Daniel. 21:55, 25 March 2011 (UTC)Reply

Much better! It says the same thing in a more concise fashion. Tempodivalse [talk] 22:55, 25 March 2011 (UTC)Reply
That is a nice and succinct restatement of the original; thanks! Yet, I still think it does not belong in the criteria for inclusion. CFI governs (a) what terms should be included, (b) what senses of the terms should be included. CFI does not govern which other material should be included: it does not govern the inclusion of etymologies, pronunciation, example sentences, quotations, derived terms, and other material. Furthermore, CFI does not govern the process details of removal of material, such as via what process (RFD, RFV, speedy deletion), within what discussion time frame, and under whose discretion things should be removed from Wiktionary. See also WT:Vandalism. --Dan Polansky 05:40, 26 March 2011 (UTC)
You're welcome. I agree with Dan Polansky's reasons to remove the Vandalism section from CFI. Perhaps the succinct version of that section should be moved to another policy. The current Wiktionary:Vandalism is not a good place to do that, because it is worded as a help page, rather than a policy; so I believe it should be renamed to Help:Vandalism. --Daniel. 10:38, 26 March 2011 (UTC)
I also agree that the CFI does not need and should not have a section on vandalism. Vandalism isn't content. - [The]DaveRoss 03:14, 27 March 2011 (UTC)Reply
Well... it is until it's removed. But I take your point. Mglovesfun (talk) 22:11, 28 March 2011 (UTC)Reply

Name of the language codes osx, gml and nds

The names that these three codes expand to are a bit inconsistent. {{osx}} expands to Old Saxon, {{nds}} expands to Low Saxon. But the code {{gml}}, for the descendant of Old Saxon and the ancestor of modern Low Saxon, expands to Middle Low German. Shouldn't that last one be Middle Low Saxon then? —CodeCat 15:05, 25 March 2011 (UTC)Reply

No, because it's known as Middle Low German. That's the code: German, Middle Low. Middle Low Saxon gets 13 hits on Google Books; Middle Low German gets 350. Middle Low German also gets 30 times as many hits on Google. It's the standard name for the language.--Prosfilaes 17:03, 25 March 2011 (UTC)Reply
I agree; but we might consider changing {{nds}} to "Low German" (which is much more common, and which Ethnologue does identify as an alternative name). —RuakhTALK 17:23, 25 March 2011 (UTC)Reply
There is also the code {{nds-nl}} which expands to Dutch Low Saxon. Maybe the other code is Low Saxon because of that... otherwise you'd have Dutch Low Saxon and Low German. Or even Dutch Low German? :p —CodeCat 18:30, 25 March 2011 (UTC)Reply
Like I stated in the deletion discussion, {{nds-nl}} is pointless because there's already {{gos}}, {{twd}}, {{drt}} etc. -- Prince Kassad 21:46, 25 March 2011 (UTC)Reply

Psychology and psychiatry

Has there been any discussion on when these two categories should be used? It seems arbitrary now. I would suggest putting official diagnoses and psychiatric drugs in Category:Psychiatry but putting all the more general words in Category:Psychology. Sorry if this has already been discussed, I couldn't find it. WurdSnatcher 06:08, 26 March 2011 (UTC)Reply

CFI and protologism

I propose to remove the section for protologisms from CFI:

===Protologisms===

The designation protologism is for terms defined in the hopes that they will be used, but which are not actually in wide use. These are listed on Wiktionary:List of protologisms, and should not be given their own separate entries.

see discussion for exclusion of the words in lists - Protologisms, Wikisaurus, concordances etc, from application of the CFI to each individual listed word.

My rationale is that protologisms are already excluded by failing the attestation criterion specified at the beginning of CFI, so this section reads like a comment rather than an additional regulation.

While my favorite option is removal, rewording the section would also be better than doing nothing. Dropping the text that is small and in italics seems also better than nothing.

The section is a result of the following edits to CFI (I could have overlooked very minor tweaks):

  • diff, Richardb, 26 February 2006, addition of the small text
  • diff, Eclecticology, 23 May 2005, minor tweak in wording
  • diff, Dmh, 2 May 2005, creation of a dedicated section while the text remains unchanged
  • diff, Dmh, 12 April 2005, initial wording of a paragraph

Any supporters? --Dan Polansky 06:20, 26 March 2011 (UTC)Reply

Afaik, WT:LOP is actually deprecated - it is no longer linked to from the pre-defined deletion summary, for example. Therefore, I support removing the section wholesale. -- Prince Kassad 08:43, 26 March 2011 (UTC)Reply
Yes, remove it and re-nominate WT:LOP for deletion. Mglovesfun (talk) 22:09, 28 March 2011 (UTC)

CFI and formatting of alternative spellings

One more proposal; I hope to be excused for the series.

This subsection of the section "WT:CFI#Spellings" could be removed or simplified:

====Formatting====

Once it is decided that a misspelling is of sufficient importance to merit its own page the formatting of such a page should not be particularly problematical. The usual language and part of speech headings can be used, followed by this simple entry:

# {{misspelling of|[[...]]}}

An additional section explaining why the term is a misspelling should be considered optional.

Layout of formatting is not within remit of CFI. The part "the formatting of such a page should not be particularly problematical" seems outright worthy of removal.

Any supporters? --Dan Polansky 07:46, 26 March 2011 (UTC)Reply

Yes, support; I've spotted this before, it's non-CFI material; formatting is not CFI, it should be in ELE (and I think it isn't). So I'd support such a removal, and coverage in WT:ELE. Mglovesfun (talk) 22:08, 28 March 2011 (UTC)Reply
Good one! Support. -- Prince Kassad 22:14, 28 March 2011 (UTC)Reply
This section tells us not only how to format the entry (which certainly belongs elsewhere and not in the CFI) but also that the entry should be a mere 'soft redirect' that clearly spells out that the misspelling is just that, rather than a real entry or "alternative spelling" entry. I think that that belongs in the CFI. So unless and until we agree on wording that specifies that without specifying formatting, I say keep the section as it is.​—msh210 (talk) 06:14, 29 March 2011 (UTC)Reply

Headword-line templates

Suddenly changing their names with a "move" operation, in particular when that change is accompanied by a mass-substitute operation committed by a bot under the command of the mover, is generally undesirable. Every template name is a form of API that can be used both by humans to type and by external programs to parse. In the case of Serbo-Croatian, I utilize such external tools, which process the regular dump for inconsistencies, help me create new entries etc. And I'm sure I'm not the only one that relies on such tools for regular editing activity. Each time you change a template name, template parameter name or functionality, without making an effort to inform the concerned parties affected by it, you're introducing breaking changes into the system. This is especially douchey if you're doing it for languages that you're not editing.

Also: the headword-line template names and their pointless abbreviations. They are not standardized. Any standardization is de facto, not de iure. Why on earth would I have to scratch my head each time, manually looking up the documentation, to find out whether the pron in xx-pron refers to a "pronoun" or "pronunciation", or whether the conj in xx-conj refers to "conjunction" or "conjugation", and so on? So far the existing headword-line template abbreviations in wide use (adj, adv, conj and such) are leftovers from the old days when editors were hoping to spare some keystrokes, and were also IMHO biased by similarly coined abbreviations that have a long use in certain computing environments where the length of a filename was an issue. I see no use for them today. They are not reducing the footprint of the database (both xx-adj and xx-adjective will compress equally, or almost equally), they are introducing unnecessary ambiguities as outlined above, and they are hardly sparing editors much typing (which, if it really is an issue, can be solved by creating redirects that can be resolved by bots at a delay, from abbreviations to full names). So I propose that we get rid of them completely and use xx-<full part-of-speech name> Wiktionary-wide.
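To make the ambiguity concrete, here is a minimal sketch of how a bot might resolve abbreviated template names to full ones. Everything in it is an assumption for illustration: the abbreviation table, the function name, and the idea that the part of speech is the last hyphen-separated piece of the template name are not taken from any existing Wiktionary bot.

```python
# Hypothetical abbreviation table; only the ambiguities named above are
# modelled. A real bot would need a per-language, reviewed mapping.
ABBREVIATIONS = {
    "adj": ["adjective"],
    "adv": ["adverb"],
    "pron": ["pronoun", "pronunciation"],
    "conj": ["conjunction", "conjugation"],
}


def expand_template_name(name):
    """Return the possible full names for a headword-line template name.

    A single-element result means the expansion is unambiguous; more than
    one element means a human has to decide.
    """
    # Split on the LAST hyphen so codes such as "nds-nl" stay intact.
    lang, sep, suffix = name.rpartition("-")
    if not sep:
        return [name]
    expansions = ABBREVIATIONS.get(suffix)
    if expansions is None:
        return [name]  # already a full name, or unknown to this table
    return [f"{lang}-{full}" for full in expansions]


for template in ("en-adj", "xx-pron", "xx-conj", "en-noun"):
    options = expand_template_name(template)
    status = "ok" if len(options) == 1 else "AMBIGUOUS"
    print(f"{template}: {options} ({status})")
```

A bot working from such a table could safely rewrite only the unambiguous cases and leave the rest for an editor.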

Also, if any of the local Javascript wizards is up to it, I'd love to have some Intellisense-like capabilities for templates. E.g. you type {{abc, and a drop-down list appears showing a list of available template names, and once a name has been fully typed/selected, its documentation taken from Template:template-name/doc is shown as a tooltip. That would greatly enhance the typical template usage scenario IMHO, which all too often relies on guessing the name and clicking the documentation page manually. -Ivan Štambuk 09:59, 27 March 2011 (UTC)
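The lookup behind such a drop-down can be sketched roughly as follows. This is only an illustration of the prefix-matching idea, written in Python rather than as an actual browser gadget; the sample name list and the function names `complete_template` and `doc_page_title` are assumptions made for the example, not anything that exists on the wiki.

```python
from bisect import bisect_left

# Hypothetical sample of template names; a real tool would load the full list.
TEMPLATE_NAMES = sorted(["en-adj", "en-adv", "en-noun", "en-verb", "fr-noun"])


def complete_template(prefix, names):
    """Return every name in the sorted list `names` that starts with `prefix`."""
    start = bisect_left(names, prefix)  # first position where prefix could appear
    matches = []
    for name in names[start:]:
        if not name.startswith(prefix):
            break  # sorted input: once the prefix stops matching, we are done
        matches.append(name)
    return matches


def doc_page_title(template_name):
    """Build the documentation subpage title described in the comment above."""
    return f"Template:{template_name}/doc"


if __name__ == "__main__":
    print(complete_template("en-a", TEMPLATE_NAMES))
    print(doc_page_title("en-noun"))
```

Keeping the names sorted lets the lookup jump straight to the first candidate instead of scanning the whole list on every keystroke.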

  1. Support Daniel. 05:34, 28 March 2011 (UTC)Reply
    • I support the standardization effort of having {{en-adjective}}, {{en-noun}}, {{en-adverb}}, {{en-verb}}, {{en-interjection}}, {{en-pronoun}}, {{fr-noun}}, {{es-noun}}, {{fr-adjective}}, {{es-adjective}} and so on.
    • My main concern is consistency. It would be nice to have an easily recognizable pattern like "xx-conjunction for all conjunctions of all languages, always, no exceptions". For that goal, I would support either using full names of parts-of-speech like "en-adjective" or abbreviations like "en-adj". However, full names aren't bad and are common: Wiktionarians often are trained to type full language names as L2 headers, full context labels, full category names and so on.
    • In addition, at least from a technical point of view, since these are templates with multiple functions, every (reasonably short) name can be seen as an "abbreviation". {{en-adjective}} is a very, very short way of telling the software to format the headword bold, display links to other entries, categorize the current entry, etc. --Daniel. 05:34, 28 March 2011 (UTC)Reply
I like to blame my tiredness rather than something more inherent in me, but I do not understand the point of your first paragraph, Ivan. But I agree with your second (standardizing template names) so long as the name an editor in a particular language is used to remains as a redirect.—msh210 (talk) 07:56, 28 March 2011 (UTC)
The point was to remind editors not to be too casual about innocent-looking template-rename operations, on widely-transcluded and long-term-stabilized templates, as these could have some unforeseen repercussions that break things. I undid several such renames recently by multiple editors, so instead of issuing explanations on multiple talkpages, I decided to bring it up here instead. "If it ain't broke, don't fix it" as Americans would say. --Ivan Štambuk 09:35, 28 March 2011 (UTC)Reply
I strongly agree with the ain't-broke-no-fix point as a general rule for widely used templates, such as the inflection-line family. Specifically I greatly prefer short template names where they are unambiguous. In English conjugation isn't used so "en-conj" is unambiguous. OTOH it is not so widely used that we would be wasting many keystrokes to have it spelled out. Similarly for "en-pronoun" and "en-contraction". But "en-adv", "en-adj", and "en-prep" do save keystrokes and don't have the same ambiguity risk AFAIK. We don't have generally discussed and accepted templates for headers like "Proverb" and "Phrase" (ambiguous among use for phrasebook entries and other uses as for phrases that might have multiple or uncertain grammatical functions or be non-constituents). These could probably stand to be spelled out or even extended to resolve the ambiguity (eg, en-phrasebook or en-phrasebk), at least once we have more consensus on Phrasebook generally. DCDuring TALK 20:01, 28 March 2011 (UTC)
Support consistent naming of such templates; I don't care what name that turns out to be; NB the redirects would be kept, for example {{en-adj}} redirecting to {{en-adjective}} (currently it's the opposite). Mglovesfun (talk) 22:06, 28 March 2011 (UTC)
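For illustration, the bot pass that resolves abbreviated redirects to full names, as proposed earlier in this thread, might look something like the sketch below. The mapping, the regular expression and the function name `expand_template_names` are all hypothetical; "conj" and "pron" are deliberately left out of the mapping because, as noted above, they are ambiguous and could not be expanded safely without knowing the language.

```python
import re

# Hypothetical mapping from abbreviated part-of-speech suffixes to full names.
# Ambiguous suffixes such as "conj" and "pron" are intentionally omitted.
ABBREVIATIONS = {"adj": "adjective", "adv": "adverb", "prep": "preposition"}

# Matches the start of a template call such as {{en-adj| or {{fr-noun}}.
TEMPLATE_CALL = re.compile(r"\{\{([a-z]{2,3})-([a-z]+)(?=[|}])")


def expand_template_names(wikitext):
    """Rewrite known abbreviated headword-template names to their full form."""

    def replace(match):
        lang, pos = match.group(1), match.group(2)
        full = ABBREVIATIONS.get(pos, pos)  # unknown suffixes are left untouched
        return "{{" + lang + "-" + full

    return TEMPLATE_CALL.sub(replace, wikitext)


if __name__ == "__main__":
    print(expand_template_names("{{en-adj|more}} and {{fr-noun|m}}"))
```

Because the abbreviated names would remain as redirects, such a pass could run at a delay without breaking entries or external tools in the meantime.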

Attestation from Google Groups

Google Groups contains various non-Usenet groups. Are they "durably archived"? For example, can pages like this and this serve as sources to attest a word? --Daniel. 05:23, 28 March 2011 (UTC)Reply

Interesting question. At the moment, they seem to be as durably archived as any Usenet post archived by Google. I suspect, though, that others will care more about Usenet posts than about them, and scrap them when buying Google's Usenet archive whenever Google divests itself of Groups or folds. But that's all conjecture, and not necessarily useful conjecture for a discussion of our CFI. One point that might be worth considering is that Usenet hits are already weaker than edited works' and many regulars dislike using them; extending the weakness of the CFI to further fora may be out of order. Do note that certainly (AFAICT at least) the status quo is not to allow them.​—msh210 (talk) 08:06, 28 March 2011 (UTC)Reply
What he said. Furthermore I still don't know what "durably archived" means in a Wiktionary context; I don't think we have anything on the issue at all. The way we use the term seems to me to be circular; it's durably archived if we considered it to be so, otherwise it isn't. --Mglovesfun (talk) 14:17, 28 March 2011 (UTC)Reply
It's not circular, though there is ambiguity at some broad margins. We consider what we find at Google Books and Scholar to be durably archived because it is almost without exception from print (though I am not sure about Scholar). News is almost always from print, but there are many hits from newspaper blogs and online-only newspapers of uncertain archival status. Usenet is the only content we have been accepting that exists purely in digital form.
We are following a very conservative approach in that we do not accept electronic media unless they are in the custody of multiple long-lived institutions (eg, public libraries, university computing centers). The digital archives that offer snapshots of the Web at frequent intervals don't seem to count because of their uncertain long-term future. Similarly with Google itself. DCDuring TALK 19:43, 28 March 2011 (UTC)Reply
w:Usenet#Archives does not list any archiver of Usenet except Google. I can't find any evidence of any now in operation, either, except for the archives of specific newsgroups.​—msh210 (talk) 20:01, 28 March 2011 (UTC)Reply

Linking to sister projects

Please see these two pages:

They don't exist.

When one accesses a page on Wikipedia that doesn't exist, they can see a good-looking box of alternative sister projects that use "Special:Search" to go directly to the page or automatically begin a search if the page doesn't exist in the other project too.

However, when one accesses a page on Wiktionary that doesn't exist, the only suggestion of sister project is a rather hidden link to Wikipedia that is spelled exactly like the word in question (that is, it doesn't use the aforementioned Special:Search).

I suggest copying the box from Wikipedia to Wiktionary, to enhance the user experience by giving them more places to look for whatever they are looking for, and starting the search automatically when the page is not present in the other project as well. --Daniel. 06:40, 28 March 2011 (UTC)Reply

While someone who doesn't find what he wants in WP might find it better here or at b: or s: or q: or commons, I really do think that we're best off sending people to w: when we lack what they seek; the additional links would just lower the ratio of signal to noise. That said, I agree we should link to the search page rather than the article page, so as to increase the likelihood of being helpful.​—msh210 (talk) 08:00, 28 March 2011 (UTC)Reply
Good catch, Daniel.. I agree with Msh210's specific recommendation as a default. Perhaps we could have a specific template deployed something like {{only in}} for cases where the w: default isn't good enough and there is evidence that users mistakenly come here or a contributor has a lot of enthusiasm and a good theory for anticipating such user behavior. DCDuring TALK 19:30, 28 March 2011 (UTC)Reply

Thank you for your input. Yet, I disagree with your reasons to send people only to Wikipedia, either always or as a default procedure. Let me explain why:

Wikipedia, Wiktionary, Wikiversity and Wikibooks are very close to each other in contents (they are supposed to provide information on relatively everything, after all); in fact, they differ mainly by coverage and presentation. Therefore, it would be helpful to, at the very least, link between these 4 projects. If one wants to know what "water" is, then their thirst of knowledge may be relatively satiated by an encyclopedic article, a dictionary definition, a "learning resource" and/or a "textbook".

Differently, Commons, Wikiquote, Wikinews and Wikisource have a greater focus on coverage or presentation, in detriment of providing information. The existence of sister projects with different focuses and goals is a Good Thing™. However, again, if one wants to know what "water" is, then media, quotations, news and sources are not inherently or easily going to be helpful.

That said, I suggest linking between all eight projects, under the rationale of Wikipedia, Wiktionary, Wikiversity and Wikibooks being strongly related with each other; and Commons, Wikiquote, Wikinews and Wikisource being merely complementary material. --Daniel. 23:52, 28 March 2011 (UTC)Reply

Gothic entries in Latin script

Right now we have quite a few entries in Gothic. Most of them use the Gothic script, but a few (those from Crimean Gothic) are in Latin script. I was wondering if it would be desirable to have Latin script entries for the others as well, and have them redirect to the Gothic script entry as 'alternative spelling of' or something like that. It would be very useful for looking up Gothic words, because support for Gothic script is generally poor, and most actual sources on Gothic only use Latin script. —CodeCat 13:42, 28 March 2011 (UTC)Reply

I think the others are in Latin script because w:Crimean Gothic is only attested in Latin script. The other works are written in Gothic script, so why should we not use it (it's not like it's not in Unicode)? -- Prince Kassad 14:08, 28 March 2011 (UTC)Reply
I'm not saying we shouldn't have entries in Gothic script. What I'm suggesting is that we allow entries in Latin script as alternative spellings of the Gothic script entries. Because as far as attestation is concerned, there are a lot more sources with Latin script than there are in Gothic. —CodeCat 14:20, 28 March 2011 (UTC)Reply
I see. What you're saying (IMO) is that Gothic words (some, at least) can be attested in the Latin script in running text (that is, not purely as transliterations). --Mglovesfun (talk) 14:34, 28 March 2011 (UTC)Reply
Well, it depends on the interpretation of CFI and whether people think it only takes historical sources into account for extinct languages. -- Prince Kassad 14:42, 28 March 2011 (UTC)
I think it's more a practical consideration. People who are studying Gothic generally study it in the Latin alphabet, and read transliterated texts. It would be rather inconvenient to ask them to transliterate back into the Gothic alphabet before they can find terms. Even though Gothic is only natively attested in the Gothic script, Latin transliterations are far more abundant than texts in the original script. —CodeCat 15:27, 28 March 2011 (UTC)Reply
Note that Special:Search already finds terms if you enter the transliteration (provided it's given in the entry, of course). -- Prince Kassad 16:03, 28 March 2011 (UTC)Reply
That's true, but search is very inconvenient for terms that appear in many entries. Try searching for the Gothic entry 'aba' and you'll see what I mean... —CodeCat 16:11, 28 March 2011 (UTC)Reply
Then you use "Gothic aba" in the search box, and in general "<language name> transliteration". You can use ====Transliterations==== section to provide other less-used transliteration schemes. --Ivan Štambuk 20:26, 28 March 2011 (UTC)Reply
Personally, I find doing Gothic in the Gothic script to be a silly affectation. As far as I know, Gothic has never been published in the Gothic script. Whenever you find it in real life, it's in the Latin script, unless you're actually looking at an ancient artifact or manuscript, in which case I don't think you're using Wiktionary.--Prosfilaes 19:56, 28 March 2011 (UTC)
But I think CFI only allows the ancient artifacts or manuscripts to be used...? -- Prince Kassad 20:03, 28 March 2011 (UTC)Reply
I don't see where you're seeing that. CFI suggests using printed sources, not manuscripts, whether that be of Stephen King (who, for all I know, might have terrible spelling that gets normalized by his editors) or Bishop Wulfila. I find that interpretation silly; is it out of line to cite Pepys, because he wrote in shorthand, not Latin script? Are we really going to have words not listed under the spellings that they're being published with because those spellings don't match manuscripts that few get a chance to look at?--Prosfilaes 20:13, 28 March 2011 (UTC)
Publishing is irrelevant. What matters is the attestation in the primary sources, regardless of whether they are hand-written or printed. Gothic written in Gothic script is the only (apart from Crimean Gothic which gets special Latin-script treatment, and which is silly to even mix with centuries-older Gothic proper, but that's another issue) "real-life" source of Gothic lexemes that we can harvest, as it was written and used by native speakers of the Gothic language. Enabling additional functionality of soft-redirecting transliterations to main entries (similar to what we use for e.g. misspellings) should be done at the level of MediaWiki software, by selecting desired checkboxes which would confine search to parts of the page that list transliterations for a specific language. Adding Latin script transliterations as mainspace entries for Gothic would inevitably lead to the same argument being applied for Sanskrit, Mandarin and many other languages, each of which utilizes a gazillion disparate transliteration schemes. We don't want that kind of maintenance problem, unless it can be 100% automated. Many ancient languages, and especially in case of those such as Gothic where we're dealing with a rather small corpus, have primary sources digitized and freely available as high-resolution images on the web, so there is no problem with that. --Ivan Štambuk 20:34, 28 March 2011 (UTC)
Publishing is rarely if ever a primary source. I'll ask the question again; can we not cite Pepys's Diary because the manuscript is written in shorthand, not the Latin script? Gothic written in Latin script was good enough for generations of lexicographers and linguists, as that is how they received every bit of information about the language. Why is the Gothic script suddenly important?
My argument is that the spellings in which people in the modern world read texts should be recorded. When people buy a book of Middle English texts, they should be able to look up the spelling in the book on Wiktionary. When people buy a book with Gothic in it, they should be able to look up the spelling in the book on Wiktionary. 100% of the time, that Gothic will have been written in the Latin script. That's not true for Sanskrit, Mandarin, or any of these other languages.
Let's talk about a modern language community which your policy hurts. 19th century Turkish was written in Arabic; if there's some words only used by Ziya Pasha and his contemporaries, your policy would demand we only have them in Arabic script (and useless transliterations), not modern Turkish spelling, even if that's the script that our users would be most likely to find the text in, even though most Turkish users couldn't read or enter Turkish in the Arabic script.
Wiktionary should not be an idle or scholarly project. We're here to produce a dictionary for the average user, and the average user--in fact all users--looking up Gothic words do so in Latin script.--Prosfilaes 21:31, 28 March 2011 (UTC)Reply
Re Turkish, that's what we're doing already. See Category:Ottoman Turkish language. Also, Wiktionary should not be an idle or scholarly project - why not? I think we will attract many good contributors if we apply scholarly standards upon ourselves. Otherwise, we would be just another unreliable website. -- Prince Kassad 21:34, 28 March 2011 (UTC)Reply
Not having both spellings for 19th century Turkish means that we are making it impossible to use modern editions of 19th century Turkish texts. It's like stopping people from looking up words in Twain unless they can transcribe the words into Deseret. It's fine to apply scholarly standards, but it's not fine to target only a scholarly audience. In the case of Gothic, having only the Gothic script makes it hard for everyone to look up Gothic; everyone trained in Gothic is trained to use a Latin spelling of Gothic, not the Gothic script.--Prosfilaes 21:42, 28 March 2011 (UTC)Reply
we are making it impossible to use modern editions of 19th century Turkish texts - Exactly. I personally see no problem with that. If there are some old Turkish words attested (as used) only in the 19th century (and earlier) Arabic-script spellings, they must not be added in the modern Turkish Latin script. Although, this is again a different type of problem: transliteration of a script to Latin that was never originally written in Latin, but only artificially by language specialists in order to facilitate its study (none of them actually being a native speaker of it), and reprinting a known work in another alphabet because standard alphabet has officially changed, with the language itself being very much alive, are two very different things. Personally I'd allow for exceptions such as when the alphabet was officially changed, or when there are multiple used alphabets for a language but a word can be attested only in one, to add it in the other ones too in order to maintain symmetry, and similar. --Ivan Štambuk 16:05, 29 March 2011 (UTC)
You see no problem in preventing some real world users from using Wiktionary to look up words found in textbooks for theoretical reasons? I see no reason why we shouldn't serve our users.--Prosfilaes 18:21, 29 March 2011 (UTC)Reply
I'm not familiar with the shorthand notation used by Pepys so I cannot comment on that. But in general no - shorthand spellings cannot be used as sources of attestation of any word in any language, if they cannot be added in their original form as Unicode-encoded words. But that is another problem that has nothing to do with the issue of Gothic language transliterations in Latin script.
Gothic written in Latin script was good enough for generations of lexicographers and linguists - But it is not good enough for us. We have Unicode, digitized sources, programming skills - tools that enable us to be orders of magnitude more productive than any poor drudge of the analog era. There are no technical excuses for us except in a few corner cases (e.g. languages not yet in Unicode and so on).
My argument is that the spellings in which people in the modern world read texts should be recorded. When people buy a book of Middle English texts, they should be able to look up the spelling in the book on Wiktionary. When people buy a book with Gothic in it, they should be able to look up the spelling in the book on Wiktionary. 100% of the time, that Gothic will have been written in the Latin script. - Anyone who bothers enough to learn Gothic grammar can waste another half an hour (if that much) getting accustomed to Gothic alphabet, and half a minute installing Gothic Unicode fonts and a keyboard mapping. The inability to search transcriptions and confine search results to specific languages should be solved at the level of MediaWiki software, and not by introducing Latin-script clones of entries.
Wiktionary should not be an idle or scholarly project. We're here to produce a dictionary for the average user, and the average user- - Yes it should be a scholarly project. Every dictionary is a scholarly project. Anyone who bothers to pick up a dictionary from time to time in order to look up new words that he has encountered or expand one's vocabulary is already in some 10% of population. We must not be confined by the parameters of average human's sphere of (dis)interest, stupidity and laziness. Dumbing down in general is undesirable. In case of ancient languages such as Gothic it doesn't make much sense either, since we're already dealing with an obscure topic, and those who study it are accustomed to separate set of high standards. --Ivan Štambuk 16:05, 29 March 2011 (UTC)Reply
Shorthand is exactly related to this case; do we worry about what Stephen King actually wrote in his manuscripts, or do we worry about the printed text our users will actually be looking up? We've cited Pepys a couple dozen times, despite the fact that his diary was kept in shorthand, not Latin script; are we going to delete those citations? Gothic users have made an active choice not to use the Gothic script; see Don't Proliferate; Transliterate!. Why should they spend any time at all figuring out how to set up their computer to use a website that ignores that choice? To add another script to the equation doesn't make anyone any more productive.--Prosfilaes 18:21, 29 March 2011 (UTC)
Can I quote the whole of the above?
[quote] Personally, I find doing Gothic in the Gothic script to be a silly affectation. As far as I know, Gothic has never been published in the Gothic script. Whenever you find it in real life, it's in the Latin script, unless you're actually looking at an ancient artifact or manuscript, in which case I don't think you're using Wiktionary.--Prosfilaes 19:56, 28 March 2011 (UTC)[end of quote]
I don't see how 'Gothic in Gothic script' is any odder than Latin in Latin script. I don't really know what you mean by published; a lot of old works weren't published as publishing didn't exist yet. Furthermore I'm not sure why someone who's looking at an ancient artefact or manuscript wouldn't use Wiktionary. On the contrary really. Mglovesfun (talk) 21:53, 28 March 2011 (UTC)
Why do we have English in Latin script, instead of English scripts like Shavian or Deseret or a shorthand? Because English is actually published in the Latin script. Why do we have Japanese in Chinese script, instead of only Japanese scripts? Because Japanese is actually published with Han ideographs. What I mean by published is published; and all old works that we use have been published, and rarely in photocopies or any close copies of the original, but in modern fonts with lower-case letters with ligatures and abbreviations expanded out. Every time Gothic has been published, it's been published in the Latin script. Nobody cared about encoding Gothic, because no one was printing or working with Gothic in Gothic script.*
If there are people looking at ancient artifacts or manuscripts instead of the printed form who would choose an incomplete source like Wiktionary, given that they were trained to read Gothic in the Latin script, and probably transliterate the manuscripts or artifacts before reading them, I suspect the use of Gothic script would be a turn off. If they had a program to look up words they were typing in, it wouldn't work with Wiktionary, because they would be entering the words in the script their publishers and audience would expect, the Latin script.
* Gothic is beyond the 2-byte limit, meaning early versions of Windows and Java and other programs couldn't handle it. It was one of the first batch of scripts so encoded, and it, like Deseret and Shavian, was chosen because nobody wanted their script to be the first to be placed there; so some scripts that nobody actually using Unicode was using were encoded there, to encourage support and future encoding there.--Prosfilaes 23:19, 28 March 2011 (UTC)
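The "2-byte limit" in the footnote above can be checked directly. The short Python sketch below assumes nothing beyond the standard library; the helper name `utf16_code_units` is made up for the example. It shows that a Gothic letter has a code point above U+FFFF and therefore needs two 16-bit code units (a surrogate pair) in UTF-16, which is what software limited to single 16-bit units could not handle.

```python
GOTHIC_AHSA = "\U00010330"  # GOTHIC LETTER AHSA, the first letter of the Gothic block


def utf16_code_units(ch):
    """Return the list of 16-bit code units used to encode `ch` in UTF-16."""
    data = ch.encode("utf-16-be")  # big-endian, no byte-order mark
    return [int.from_bytes(data[i:i + 2], "big") for i in range(0, len(data), 2)]


if __name__ == "__main__":
    print(hex(ord(GOTHIC_AHSA)))                             # code point of the letter
    print([hex(u) for u in utf16_code_units(GOTHIC_AHSA)])   # two surrogate code units
    print([hex(u) for u in utf16_code_units("a")])           # one code unit for Latin "a"
```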
We do support Shavian for pronunciation sections, if you were wondering. See {{shavian}} and water as an example. -- Prince Kassad 07:13, 29 March 2011 (UTC)Reply

Support Gothic in Latin script. Agree that Gothic in Gothic script is a silly affectation. --Vahag 06:15, 29 March 2011 (UTC)

I have no opinion on whether use of the Gothic script is silly or an affectation, but if Gothic works are primarily or exclusively published in the Latin script, then we should definitely have entries for the Latin spellings, either as main entries or as alternatives. Questions of languages and script have a lot of awkward edge cases, but this doesn't seem to be one of them. —RuakhTALK 17:31, 29 March 2011 (UTC)Reply
I will not support this without a VOTE. -- Prince Kassad 17:42, 29 March 2011 (UTC)Reply
Again, what part of WT:CFI says that we can't cite printed works?--Prosfilaes 18:21, 29 March 2011 (UTC)Reply
While we do have rules against transliterations, doesn't this become a different matter if the entire work is a transliteration, and this transliterated version is far more common than the original? Gothic texts are published in just that way. —CodeCat 19:29, 29 March 2011 (UTC)Reply
We have (toned) pinyin. Even if we keep the Gothic script entries as the main entries, we should have Latin script entries like our pinyin entries. - -sche (discuss) 05:01, 31 March 2011 (UTC)Reply

Allowing Romanized Korean as entry titles

Do we allow Romanized Korean as entry titles, as opposed to only as transliterations? Should we? Is Romanized Korean used as a 'language', that is, are there published works in it, or whole paragraphs written in it in CFI allowable sources? If not, we should just treat them as transliterations. We don't have too much Romanized Korean so getting rid of it wouldn't be too painful. --Mglovesfun (talk) 14:22, 28 March 2011 (UTC)

To answer your questions: no, no and no. Korean is a syllabic and very regular script (unlike Japanese kana and Chinese ideographs) so there's no need to have romanizations as separate entries. -- Prince Kassad 14:27, 28 March 2011 (UTC)Reply
This kind of relates to the topic above this one. I think romanisations should be allowed, for the sake of convenience since most of our users will only be able to read or type the Latin alphabet. And there is also our mission statement, to include all words in all languages. Isn't a transliteration also still a word, just a different representation of it? —CodeCat 19:47, 28 March 2011 (UTC)Reply
There are many transliterations from Korean to the Latin script; are we going to have all of them? All transliterations from all languages to Latin will do good for our entry count, but won't make it easy to find anything, especially if two Korean words transliterate to the same spelling using different transliterations. This isn't like the Gothic case; Korean is always written in the Korean script, and users who expect to work with Korean will know how to enter it.--Prosfilaes 20:04, 28 March 2011 (UTC)Reply
CFI line one "As an international dictionary, Wiktionary is intended to include “all words in all languages”." Transliterations don't meet CFI as they aren't words. Conversely, if we're going to allow Gothic and Korean in the Latin script, we should allow English, Dutch, Finnish (whatever) in Gothic and Han script too. Mglovesfun (talk) 21:48, 28 March 2011 (UTC)
Not necessarily. This is a dictionary aimed at English-speaking users, and the only script we can expect English speakers to know is Latin. So just as we treat English in a privileged way, we might want to give the Latin script a special place too. —CodeCat 21:51, 28 March 2011 (UTC)Reply
Not necessarily, no. But why not? Mglovesfun (talk) 21:55, 28 March 2011 (UTC)Reply
So then, you expect the Russian Wiktionary to transliterate all English words in Cyrillic script? Or should the Mandarin Wiktionary transliterate all English words in Han script? Or should the Ancient Egyptian Wiktionary transliterate all English words in Egyptian Hieroglyphs? Or should the ASL Wiktionary transliterate all English words in SignWriting? -- Prince Kassad 22:03, 28 March 2011 (UTC)Reply
The point is, for me that is, WT:CFI#Attestation allows words/terms to meet CFI in three different ways; use in a well-known work, clear widespread use or three independent durably archived citations. Transliterations won't meet CFI if they're only transliterations, and I don't think we want to extend CFI to allow words that aren't used in a language, like philein not being used in Ancient Greek texts. Mglovesfun (talk) 22:19, 28 March 2011 (UTC)Reply
I think the problem wouldn't be nearly as bad if the search function always showed possible transliterations first, or could be made to search for only a given language... —CodeCat 22:44, 28 March 2011 (UTC)Reply
I plan on addressing that on the Beer Parlour when it's a bit quieter. Concerning Korean and Korean only, we've had quite a good argument above why Romanized Korean shouldn't be allowed; would anyone argue the opposite? Is it worth rfv'ing some Romanized Korean entries to see if even one of them can pass or, if there are no objections, do we just delete the lot? I certainly don't mind rfvs providing there's a realistic chance that even one such entry could pass. If Prince Kassad's right, there isn't. --Mglovesfun (talk) 15:27, 29 March 2011 (UTC)
WT:RFV#Hunmin Jeongeum, my initial search came up with absolute zero for the Korean entry. Mglovesfun (talk) 15:44, 2 April 2011 (UTC)Reply

WT:CFI#Company names

Company names

Being a company name does not guarantee inclusion. To be included, the use of the company name other than its use as a trademark (i.e., a use as a common word or family name) has to be attested.

Um, that's it? Any company name that has another meaning can be included? What about Suzuki and Honda then? Mglovesfun (talk) 22:35, 28 March 2011 (UTC)Reply

As I've said elsewhere, and in much the same words, I don't think that section is terribly well phrased, but I take it to mean that company names aren't included, period. I think it's saying that if a company name is also a family name, then that's included; and if a company name is also a regular word, then that's included, but in no case will the company name itself be included (except in the loose sense that a word might be included while also, simultaneously, being a company name). —RuakhTALK 00:26, 29 March 2011 (UTC)Reply
Fwiw, I agree with Ran.​—msh210 (talk) 06:17, 29 March 2011 (UTC)Reply
Now I took about 45 seconds to discover or refresh my memory about the fact that Ruakh's name is Ran. :p Well, I agree with his interpretation of the policy too. --Daniel. 06:40, 29 March 2011 (UTC)Reply
I just want to see the unvoted-on section on company names removed from CFI. (I have said this before, I know.) I don't care all that much about what the section is intended to mean exactly; I do not assign it any authority. I would like to see "Verizon" included, if only for the pronunciation. A regulation dedicated to geographic names has been deleted recently; company names (names of other specific entities) could follow suit. --Dan Polansky 08:00, 29 March 2011 (UTC)
Regarding above sentence "but I take it to mean that company names aren't included, period". Really? I could never interpret it like that. For example, to me Ford as a vehicle manufacturer would meet CFI as it's attested as a family name. Actually, not many company names would be attested with other meanings, so in reality 99% of companies would be excluded. But, if I started up a company called Smith it would meet CFI as long as I could satisfy WT:CFI#Attestation as it clearly meets the criterion above. --Mglovesfun (talk) 12:51, 29 March 2011 (UTC)Reply
You're looking at it from the perspective of a veteran RFVer: such-and-such sense is included, now we have to see if that sense meets the CFI, hey look, WT:CFI has rules for whether a company-name sense is included, hey that's weird, the rules say we include a company-name sense if there's any other sense at all. But I don't think it was written from that perspective. I think it was mostly written from the perspective of, we don't include company names, hey but wait a sec, we don't want to exclude "Smith" or "Apple" just because they're company names, O.K., so to clarify, we don't include company names that are just company names. (Actually, if you look at the history of that section, it's more complicated than that; in its original version there was a parenthetical side-note that clearly assumed my perspective, but much of the other text could be taken either way; but years of refactorings have changed the parenthetical part from a side-note to a key defining element of the section, such that IMHO the only way to read it now is the way that I described.) —RuakhTALK 14:23, 29 March 2011 (UTC)Reply
I can't do much more than repeat what I said above; I could never read it that way. I could possibly ignore the text itself and take a guess at what the person (or persons) who wrote it intended and come up with your interpretation. Frankly, that's it. --Mglovesfun (talk) 15:04, 29 March 2011 (UTC)Reply
It was almost certainly intended to be read with the attributive-use rule. Thus it remains as a testament either to our limited ability to draft proposals and conduct votes that anticipate significant consequences, even those involving explicitly discussed topics in the same document, or to specific defects in the vote and discussion about removing attributive use. DCDuring TALK 15:33, 29 March 2011 (UTC)
Problems arise from complexity. Clearly stating the rule "all words, all languages" (whatever the meaning of the words) would make things simpler: company names should be included, but only when they can be considered as a single word rather than as a sequence of several words. However, words created by somebody and not used by anybody else should not be included, and company names are not an exception: several attestations independent of the company should be required. Lmaltier 20:46, 29 March 2011 (UTC)
If I'm talking about a company, and I use its name in reference to it, is that considered "independent"? (I mean, assuming I have no relationship with the company other than talking about it?) —RuakhTALK 21:13, 29 March 2011 (UTC)Reply
Of course, it's a use using the normal sense of the word. But uses found in advertising, etc. should not be considered as independent attestations.
In my opinion, Société Nationale des Chemins de fer Belges or IBM, Inc should not be included (not single words), but SNCB or IBM should be included. The boundary is not always obvious. Lmaltier 05:31, 30 March 2011 (UTC)Reply

Unless I have overlooked something, there are only two edits to CFI that have led to the current text for company names:

  • diff, by DAVilla, 21 November 2007, creation of a section dedicated to company names, with today's phrasing
  • diff, by Uncle G, 22 May 2005, a new paragraph on trademarks and company names

The current intended meaning has to be known by DAVilla. But I still do not see how the intended meaning matters to anything, given the section has close to zero authority. Fact is, there is no consensus on which rules should govern the inclusion of names of specific entities, although there was a consensus at some point on the rules that currently govern brand names. --Dan Polansky 09:36, 30 March 2011 (UTC)Reply

Seems like an atypically clear case where the passage should be removed. Mglovesfun (talk) 10:04, 1 April 2011 (UTC)Reply
Thanks for the diff! I agree with the interpretation above, and that it isn't very well written. The original text below, before my edit, had a lot of corporate propaganda between those two sentences. It was removed because the brand names vote meant it no longer applied. We did stuff like that then, voting on principles rather than whether to make the wording consistent. Previous discussions about trademarks heavily sided on including genericized terms at the very least. The term xerox is listed as a verb in the OED, a slightly higher authority on linguistic matters than Xerox Corp. DAVilla 06:29, 6 April 2011 (UTC)Reply

Being a trademark or a company name does not guarantee inclusion. (Of course, some company names are derived from family names, and are included on that basis.) Although some words are trademarks and company names, not all trademarks and company names are words. (Indeed, trademark holders will vigorously defend their trademarks against becoming words. According to Adobe Systems, there is no such word as Photoshopped, since Photoshop® is a trademark and not a common verb that can have a past participle; according to Xerox there is no such word as xerox, since Xerox® is a trademark and not a common verb; according to Sony there is no such word as Playstationize since there’s no word Playstation at all and PlayStation® is a trademark and not a common verb.) Many trademarks and company names are deliberately protologisms. To be included, the use of a trademark or company name other than its use as a trademark (i.e., a use as a common word) has to be attested.

WT:AN and WT:NFE

What is the difference between the purposes of these two pages?

  • WT:AN (Wiktionary:Announcements)
  • WT:NFE (Wiktionary:News for editors)

--Daniel. 23:02, 28 March 2011 (UTC)Reply

Asked and answered: see [[Wiktionary talk:News for editors#Wiktionary:Announcements]].​—msh210 (talk) 06:17, 29 March 2011 (UTC)Reply
Thanks, but the answer from the discussion that you linked is either inaccurate or simply not always followed by people. Both pages have been used to inform about changes of templates and changes of rules. WT:AN is more comprehensive, listing technical aspects of Wiktionary such as Bug #5033 and a particular instance of the site being temporarily offline; it also lists a small number of admins and 'crats. And it often duplicates parts of WT:Milestones.
My main concern is that I don't get why we need these two pages. Shouldn't one of them be RFDO'd or RFC'd, perhaps the bigger one? --Daniel. 06:32, 29 March 2011 (UTC)Reply
I tend to think we don't need both; a merge sounds good, at least worth a proposal. --Mglovesfun (talk) 15:33, 29 March 2011 (UTC)Reply
I don't mind getting rid of AN so long as NFE and Milestones are kept.​—msh210 (talk) 15:48, 29 March 2011 (UTC)Reply

CFI and a double negative

I suggest getting rid of the double negative found at the top of WT:CFI:

  • It should not be modified without a VOTE.

And replacing it by:

  • A VOTE is required to modify it.

--Daniel. 09:00, 29 March 2011 (UTC)Reply

Neutral, it's not a grammatical double negative as without isn't grammatically negative. The two phrases seem to be exactly equivalent to me. --Mglovesfun (talk) 12:53, 29 March 2011 (UTC)Reply

Sourced policies

I suggest revising our two major policies as follows:

The "sourced" versions have sources, in the form of a "References" section at the bottom, linking to various relevant votes. I think I could list them all, but feel free to correct me. Thoughts? --Daniel. 10:28, 29 March 2011 (UTC)

I haven't checked to see whether your sourced versions are absolutely identical to the current policies, except for the inclusion of references; however, if all you're doing is adding footnoted links to supporting VOTEs, then I wholeheartedly support your proposal. Are the linked-to VOTE pages protected against alteration? — Raifʻhār Doremítzwr ~ (U · T · C) ~ 15:19, 30 March 2011 (UTC)Reply
I think all the linked-to VOTE pages aren't protected against alteration. They naturally may be protected anytime in the future. Yes, the sourced versions are absolutely identical to the current policies, except for the inclusion of footnoted supporting VOTEs. --Daniel. 17:48, 30 March 2011 (UTC)Reply
As long as the VOTEs are protected, this could only be a good thing. — Raifʻhār Doremítzwr ~ (U · T · C) ~ 13:57, 31 March 2011 (UTC)Reply
OK. I protected all the 17 VOTEs. --Daniel. 14:21, 31 March 2011 (UTC)Reply
I'm unprotecting them. This is a wiki, and those vote pages are informative. People may have good reasons for editing them, and that is not forbidden. —RuakhTALK 14:25, 31 March 2011 (UTC)Reply
Personally I'd have left them protected, but I don't feel strongly enough to do anything about it. Mglovesfun (talk) 14:44, 31 March 2011 (UTC)Reply
The vote pages would be better left unprotected: I don't recall any vandalism targeting the vote pages, while I recall sensible adjustments made by non-admins, such as updating a link to a BP discussion, and adding the percentage of votes. As long as each vote page contains the revision history, I do not see why protection should be needed for the purpose of sourcing. --Dan Polansky 15:06, 31 March 2011 (UTC)

User:Pilcrow for rollbacker

Per User talk:Pilcrow, he seems to spend a lot of time fighting vandalism. I think it's too early to nominate him for admin, but for rollbacker seems ok (or else I wouldn't be doing it, now I think about it). Rollbackers can quickly revert all edits to a single page with one click; this also marks the vandalized edits as patrolled; right now I'm marking a lot of these as patrolled as Pilcrow can't do it. Mglovesfun (talk) 15:46, 30 March 2011 (UTC)Reply

Have all his reversions hitherto been justified? — Raifʻhār Doremítzwr ~ (U · T · C) ~ 15:54, 30 March 2011 (UTC)Reply
No idea, but, surely asking for 100% is asking a bit much? Mglovesfun (talk) 16:02, 30 March 2011 (UTC)Reply
I wasn't really; I was just wondering. I think the nomination's a great idea. — Raifʻhār Doremítzwr ~ (U · T · C) ~ 14:01, 31 March 2011 (UTC)Reply
I have noticed (on my daily pre-breakfast vandalism hunt) that I am constantly following vandalism that he has removed. So I am in favour of this change in status. 16:06, 30 March 2011 (UTC) — This unsigned comment was added by SemperBlotto (talkcontribs).
¶ I will admit, I do not always include my reasoning in the summaries, as I often thought it was unnecessary, example: replacing a page with ‘crap lol’ does not usually warrant a comment of mine, or so I hope. Sometimes I am afraid I will misspell my summaries as well, and I am sensitive about my spelling. ¶ I will write better summaries next time. --Pilcrow 16:07, 30 March 2011 (UTC)Reply
Using the rollback tool eschews edit summaries anyway, so don't worry about it. — Raifʻhār Doremítzwr ~ (U · T · C) ~ 14:01, 31 March 2011 (UTC)Reply
Support. The only other time we approved someone for rollbacker, it was at [[WT:WL]] and using that page's procedures (a nomination and second from two admins with no dissents): diff. I think that that would work in the (present and) future also. We currently then have enough to make Pilcrow a rollbacker; but since it's here at the BP already, I won't act yet.​—msh210 (talk) 05:26, 31 March 2011 (UTC)Reply
Made it so. It can be easily undone. SemperBlotto 07:41, 31 March 2011 (UTC)Reply
Great! Congratulations to Pilcrow! — Raifʻhār Doremítzwr ~ (U · T · C) ~ 14:01, 31 March 2011 (UTC)Reply
Congratulations. --Daniel. 14:22, 31 March 2011 (UTC)Reply
¶ Thank you very much for your congratulations and thank you for this promotion! I promise I will not let it go to waste. --Pilcrow 19:20, 31 March 2011 (UTC)Reply
Don't forget to check your next paychecks for the customary percentage increase. DCDuring TALK 19:42, 31 March 2011 (UTC)Reply

{{transterm}}

Sometimes in English entries we give citations which are translations of things not originally written in English. Many translations in fact are key English texts (various Bibles for example). And often it's useful to know, for a given English term being cited, which original FL term it's being used to translate. When I know or can check, I've been adding this in by hand, as it were, but I decided to create a simple template to do it. See for example at night-bat or Great Turk. Any comments, thoughts, violent attacks etc, let me know. Ƿidsiþ 06:01, 31 March 2011 (UTC)

I've tweaked it a little. Revert ad lib.​—msh210 (talk) 06:21, 31 March 2011 (UTC)Reply
Interesting. I've been trying to find a few but it's very difficult when you don't speak the language (for example, throw has a quotation translating the Heimskringla in Old Norse which is available online)- maybe a request template is needed as well. Nadando 06:38, 31 March 2011 (UTC)Reply
Both {{transterm}} and the request template seem like excellent ideas. DCDuring TALK 06:54, 31 March 2011 (UTC)Reply
{{transterm req}} Nadando 07:01, 31 March 2011 (UTC)Reply

Oxford Modern Grammar

Oxford has recently published the Oxford Modern English Grammar by Bas Aarts. It is a relatively slim and accessible reference grammar at a reasonable price. It is modern both in that it looks at modern English and that it takes account of modern linguistic description. If you find the CGEL daunting, this might be a good alternative for you. Although Aarts has not adopted all the innovations in the CGEL, he does follow it in many respects. --Brett 14:29, 31 March 2011 (UTC)Reply

Poll: adj vs adjective

We have headword-line templates for adjectives whose name includes either "adjective" or "adj". Following are some examples. For more results, feel free to check Special:Search.

Can we standardize them, to use only one of these two naming systems?

Feel free to support multiple options.

Thank you for your attention and your input. --Daniel. 15:34, 31 March 2011 (UTC)Reply

Poll: adj vs adjective — Preference 1

I prefer using only the short version: en-adj, es-adj, sh-adj, ru-adj, etc.

  1. Support Daniel. 15:34, 31 March 2011 (UTC)Reply
  2. Support CodeCat 15:39, 31 March 2011 (UTC)
  3. Support Prince Kassad 15:59, 31 March 2011 (UTC)Reply
  4. Support DCDuring TALK 17:27, 31 March 2011 (UTC)Reply
  5. Support — Raifʻhār Doremítzwr ~ (U · T · C) ~ 19:19, 31 March 2011 (UTC)Reply
  6. Support Panda10 19:45, 31 March 2011 (UTC)Reply
  7. Support Matthias Buchmeier 10:23, 1 April 2011 (UTC)Reply

Poll: adj vs adjective — Preference 2

I prefer using only the complete version: en-adjective, es-adjective, sh-adjective, ru-adjective, etc.

  1. Support --Pilcrow 17:19, 31 March 2011 (UTC)Reply

Poll: adj vs adjective — Preference 3

I prefer allowing names of templates to freely include either "adj" or "adjective".

  1. Support Ƿidsiþ 17:16, 31 March 2011 (UTC)Reply
  2. Support. Constantly changing things annoys infrequent editors. Consistency would have been nice, but changing existing templates to make them consistent across languages is not so nice. —RuakhTALK 20:03, 31 March 2011 (UTC)Reply
    What about using redirects from older names to newer, consistent names? --Daniel. 20:08, 31 March 2011 (UTC)Reply
    I'm fine with redirects, though redirects from the consistent names to the legacy names might be more clear. (Otherwise someone is sure to ask what the redirects are for.) —RuakhTALK 20:13, 31 March 2011 (UTC)Reply
    If someone asks what the redirects are for, surely someone else is going to answer easily. :) Now here is a rhetorical question: if redirects from the consistent names to the legacy names are to be created, which names are the consistent ones? --Daniel. 00:49, 1 April 2011 (UTC)Reply
    Re: "which names are the consistent ones?": Either way. I have no strong preference. —RuakhTALK 00:55, 1 April 2011 (UTC)Reply
    Disregarding laboriousness of entry, templates named -adjective are better, but I'd prefer to save myself six keystrokes per template. — Raifʻhār Doremítzwr ~ (U · T · C) ~ 12:23, 1 April 2011 (UTC)Reply
  3. Support Saltmarsh talk-συζήτηση 06:30, 2 April 2011 (UTC)

Poll: adj vs adjective — Preference 4

I am indecisive or indifferent about adj vs adjective.

  1. Informally speaking, I am hesitant about which of the two discussed schemes I like better: one is concise, the other one is explicit.

    However, this poll is poorly designed. A preference is always of something over something else, but the options fail to indicate what is preferred over what. The designer of the poll even agrees at the same time with "I prefer using only the short version: en-adj, es-adj, sh-adj, ru-adj, etc" and "I prefer using only the complete version: en-adjective, es-adjective, sh-adjective, ru-adjective, etc", which would mean that he is indifferent about which of the two options is chosen. Thus, there is a confusion of the notions of support and preference, which is admittedly suggested by the use of the icons and wording for "support" taken from votes; having some other template that says such things as "I agree", "Holds for me", "True of me" or whatever sounds best to natives would be nice. --Dan Polansky 09:18, 1 April 2011 (UTC)Reply

    I've fixed my former vote of implied indifference; thanks for pointing that out. Yet, the word only in "only the complete version" and "only the short version" already states that they are preferences over everything else. The concept of supporting a preference is similar enough with agreeing with a preference, and the entry support doesn't suggest otherwise. However, for greater accuracy, I'm inclined to support a new template such as " I agree". --Daniel. 09:54, 1 April 2011 (UTC)Reply
    I have created {{agree}} with a green icon. A blue icon does not indicate agreement to me. --Dan Polansky 14:52, 1 April 2011 (UTC)

Poll: adj vs adjective — Discussion

If {{en-adj}} redirected to {{en-adjective}}, and we could rely on a bot to change all human additions of {{en-adj}} to {{en-adjective}}, while human users were still allowed to use {{en-adj}}, I would become indifferent to the outcome of this poll. — Raifʻhār Doremítzwr ~ (U · T · C) ~ 19:29, 31 March 2011 (UTC)
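The bot pass described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the `RENAMES` table and the function name `normalize_templates` are hypothetical, and a real bot would need to handle nested templates and other edge cases that this regular expression ignores.

```python
import re

# Hypothetical mapping from the short names humans may type to the
# long names the bot would normalize them to.
RENAMES = {"en-adj": "en-adjective", "es-adj": "es-adjective"}

def normalize_templates(wikitext: str) -> str:
    """Rewrite {{short-name}} or {{short-name|...}} to the long template name."""
    def swap(match: re.Match) -> str:
        name = match.group(1).strip()
        if name not in RENAMES:
            return match.group(0)  # leave unknown templates untouched
        return "{{" + RENAMES[name] + match.group(2)
    # Capture the template name up to the first "|" or closing "}}".
    return re.sub(r"\{\{([^{}|]+)(\||\}\})", swap, wikitext)

print(normalize_templates("{{en-adj|more}} and {{en-noun}}"))
```

Editors could keep typing the short form; only the stored wikitext would change.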

Similarly, I am indifferent, as long as humans can use either and one redirects to the other. (If I'm not mistaken, that's the current state of affairs.)​—msh210 (talk) 20:22, 31 March 2011 (UTC)Reply
Wholly agree with this. Mglovesfun (talk) 10:00, 1 April 2011 (UTC)Reply

The wording of the preferences is not parallel. For preferences one and two the question is what one prefers using. For preference three the question is what one would allow. Option four is consistent with either preference or rule. Is this a survey about one's personal preferences or about what rules one would wish imposed on others? DCDuring TALK 14:44, 1 April 2011 (UTC)Reply

What happened to the logo?

When did the logo get that unsightly addition of a red caret and schwa? Heck, we all know the logo's not great, but that's hardly an improvement. Moreover, that correction of the pronunciatory transcription is certainly not the most urgent — whereas many accents do omit the schwa, none (AFAIK) pronounce an alveolar trill instead of an alveolar approximant. — Raifʻhār Doremítzwr ~ (U · T · C) ~ 19:26, 31 March 2011 (UTC)Reply

It's April Fool's Day. Some idiot thought that was funny. --Vahag 19:28, 31 March 2011 (UTC)Reply
Oh, I see. It's still the 31ˢᵗ of March where I live for another three hours and twenty-six minutes. — Raifʻhār Doremítzwr ~ (U · T · C) ~ 19:34, 31 March 2011 (UTC)Reply
Anyway, who uses a schwa? Americans use /ɛ/, and I thought Brits use nothing.​—msh210 (talk) 19:36, 31 March 2011 (UTC)Reply
support changing the addition to an /ɛ/ — lexicógrafa | háblame 19:38, 31 March 2011 (UTC)
Where is Dan Polansky when he is needed? He is good at organizing polls. --Vahag 19:40, 31 March 2011 (UTC)Reply

¶ Please, this is seriöus business. --Pilcrow 19:31, 31 March 2011 (UTC)Reply

I agree with the new logo. Let it stay. -- Prince Kassad 19:36, 31 March 2011 (UTC)Reply

¶¶ It wasn't me, Pilcrow/Wonderfool. Somebody must have used my computer when I was in the bathroom. I think it was Ruakh: when I came back my keyboard was covered in matzo crumbs. --Vahag 19:37, 31 March 2011 (UTC)Reply

¶ I do not find your ethnocentric foolery amusing in the slightest. Such immature remarks only make me feel less comfortable volunteering for this website. --Pilcrow 19:45, 31 March 2011 (UTC)Reply
@Vahag: As a Sephardic Israeli, I don't eat matzo, but matzah. Sheesh, what a n00b. —RuakhTALK 19:56, 31 March 2011 (UTC)Reply
LOL, pwnt! — Raifʻhār Doremítzwr ~ (U · T · C) ~ 20:06, 31 March 2011 (UTC)Reply
For the record, despite Ruakh's good humour, I don't find this sort of thing remotely funny. Or appropriate. Ƿidsiþ 09:22, 1 April 2011 (UTC)Reply
You wanna lollipop? --Vahag 11:16, 1 April 2011 (UTC)Reply
I think he wants less of your overt racism displayed on this project, and I am right with him on that. I am sure there are plenty of great venues available on the internet for you to be disparaging of Jews or whomever else you would like to insult, no reason to bring it here. - [The]DaveRoss 15:00, 1 April 2011 (UTC)Reply
If you think my joke was racist, you should see a doctor. --Vahag 16:19, 1 April 2011 (UTC)Reply
Vahag's remark didn't strike me as necessarily racist. Though we can decide this matter easily by Wiktionary:Votes/2011-04/Ethnocentric joke policy. --Daniel. 02:10, 2 April 2011 (UTC)Reply

April 2011

Sorting topical categories

It's a little hard to spot "Category:Mechanics" and "Category:Energy" within the list of members of Category:Physics due to the new order that involves sorting everything under uppercase letters. I suggest reverting to the old order, if possible. --Daniel. 07:49, 1 April 2011 (UTC)Reply

An easier idea: use a sortkey for the language subcategories that makes them sort somewhere else, preferably near the end. -- Prince Kassad 07:58, 1 April 2011 (UTC)Reply
I think our ideas are virtually identical. I never meant to literally revert the change by undoing the whole update to MediaWiki 1.17 that apparently caused it, or something like this. --Daniel. 08:09, 1 April 2011 (UTC)Reply
Or: Move all English topic categories to begin with en:, so we don't have to deal with English topic cats having other language subcategories at all. --Yair rand 08:11, 1 April 2011 (UTC)Reply
Uh, and how does this solve the problem? -- Prince Kassad 08:12, 1 April 2011 (UTC)Reply
Hm? If we didn't have other language categories inside the English categories, the lower-down topic categories wouldn't have anything to get lost in. We might still have the non-prefixed categories, but they wouldn't really be used. --Yair rand 08:31, 1 April 2011 (UTC)Reply
In this case, non-prefixed categories would be used to link between categories of various languages. (Say, if someone wants to go from Category:ja:Biology to Category:ru:Biology.) This use would be in conflict with the "Other languages" box, of course, but I'd like to be one step ahead and think that the box could be automatically populated by the non-prefixed category for better coverage, forming a nice symbiosis. --Daniel. 08:45, 1 April 2011 (UTC)Reply
But then, we'd have other language categories inside the top-level topical categories. It doesn't fix the problem at all, it just moves it to another location. -- Prince Kassad 08:39, 1 April 2011 (UTC)Reply
The problem I mentioned in the first message is the confusing mixture of language versions of one category (Category:pt:Physics, Category:es:Physics) and English subcategories (Category:Mechanics, Category:Energy). It could be entirely solved by either Kassad's idea or Yair's. --Daniel. 08:45, 1 April 2011 (UTC)Reply
One solution that came to mind some time ago is to remove non-English categories from Category:Physics, leaving them nonetheless accessible from the collapsible table at the top entitled "Other languages", as they already are. Thus, de:Category:Physics would no longer be a subcategory of Category:Physics. This could be executed by a single edit to a template, {{topic cat}} I think. --Dan Polansky 09:01, 1 April 2011 (UTC)Reply
But the table is incomplete. It will never be complete either, as it's infeasible to add 7,000+ languages there. -- Prince Kassad 09:05, 1 April 2011 (UTC)Reply
Dan Polansky's idea is good in essence, yet I too disagree with replacing a complete list with an incomplete list; Category:ase:Colors is not linked from Category:Colors, among countless other examples. --Daniel. 09:12, 1 April 2011 (UTC)Reply
The table can be made complete by editing {{nav table}}. Why is it infeasible to add more than 7,000 languages to the table? Consider that, theoretically, the same number of languages is meant to be added to translation sections in great many English entries. --Dan Polansky 09:35, 1 April 2011 (UTC)Reply
Well, the table would take a sizable amount of time to expand, and about twenty seconds to scroll through, so it could be a bit of an inconvenience. It might add a bit to the loading time, too, but it is a possibility we should consider. The translations sections of English entries aren't all that large a problem since it can be helped by targeted translations. --Yair rand 09:58, 1 April 2011 (UTC)Reply
When an entry has many translations, they are already relatively difficult to navigate. However, with the advent of targeted languages, it's now thankfully easier to spot the right ones. In addition, the simple fact that the many translations are useful information that was proudly added by our contributors is a good enough reason to make people wait for long pages to load, at least until the next big idea (Separate translation pages with parts accessible with JavaScript, anyone?) to handle lots of translations.
The great majority of foreign-language versions of topical categories that could exist don't exist, so you (Dan) apparently are literally asking for each instance of the {{nav table}} to contain thousands of black redlinks whose only purpose (apart from teaching what are the languages that exist, or merely the majority that we accept as categories) is reassuring that, yes, most topical categories weren't created yet. We only have little more than 850 language categories, though the current nav table of approximately 270 languages (if I could count right) is already infeasible enough, and would be much, much worse if expanded to 850 or 7000 categories. --Daniel. 10:25, 1 April 2011 (UTC)Reply
I support adding en: to English topical categories, and placing them in a parent category for all languages, so that the topical categories of other languages no longer appear in the categories for English. —CodeCat 09:55, 1 April 2011 (UTC)Reply
I too support this. We voted on it in 2009 (or early 2010) and it was a no consensus with something like 65% support rate. --Mglovesfun (talk) 19:11, 1 April 2011 (UTC)Reply
Ditto. 65% sounds like a high support rate; isn't the requisite threshold really close to that? Something like ⅔ (~66.7%) or ⁷⁄₁₀ (70%)? — Raifʻhār Doremítzwr ~ (U · T · C) ~ 20:39, 1 April 2011 (UTC)Reply
I agree: 65% sounds like a high support rate. However, for comparison, this vote failed with ~68% of support. On the other hand, if another vote on the proposal of adding "en:" to English topical categories were created, well... Notably, most of the opposers of Wiktionary:Votes/pl-2009-08/Add en: to English topical categories (12 "Support", 7 "Oppose", 1 "Abstain") have been absent from Wiktionary, or have been making few contributions lately, so their probability of voting again is uncertain. They include EncycloPetey, Carolina wren, Razorflame and Robert Ullmann. --Daniel. 07:34, 3 April 2011 (UTC)
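For reference, the support rate implied by the tally quoted above can be worked out as below. The sketch assumes that abstentions are left out of the denominator, which is an assumption about how the rate is counted; the ⅔ and ⁷⁄₁₀ figures are only the candidate thresholds floated earlier in this thread, not a confirmed rule.

```python
from fractions import Fraction

def support_rate(support: int, oppose: int) -> Fraction:
    """Share of support among support + oppose votes (abstentions excluded)."""
    return Fraction(support, support + oppose)

rate = support_rate(12, 7)            # tally quoted above: 12 support, 7 oppose
print(f"{float(rate):.1%}")           # 63.2%
print(rate >= Fraction(2, 3))         # False: below a two-thirds threshold
print(rate >= Fraction(7, 10))        # False: below a 70% threshold
```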

Restructuration of foreign languages

We are always striving to make Wiktionary easy to use for anyone. Of course, I do realize we're currently quite far away from that goal, which is why I, and you too, strive to make suggestions that make Wiktionary easier to use.

Now, foreign languages have a unique problem. Since words occurring in multiple languages are lumped in one entry, finding the language you want can be difficult, especially for newcomers. Take, for example, the entry der: a Swedish speaker, or perhaps learner, wants to find out what "der" means. For that, he first has to scroll past the table of contents, then he will arrive at the Danish, Dutch, etc. entries, which is absolutely not what he is interested in. To find his language, Swedish, the reader would have to scroll all the way to the bottom, but most people have already closed the browser window at this point. (Using the ToC does not make it better, as that will make the screen end up somewhere between Latin and Limburgish.)

My suggestion is revolutionary, requires major changes, but ultimately makes Wiktionary easier to use: I propose to split up entries by language, using prefixes for the target language. So you'd have Danish:der, Dutch:der, Latin:der, etc. Now, let me tell you why this is a brilliant idea:

  • Finding your favorite entry will be much easier, as long as you specify language and word, you'll always end up at the information you want, and not at a page with several languages you're not actually interested in.
  • The prefix system makes adapting the search function pathetically easy. It just has to look for words starting in the language name.
  • This system allows us to make "language portals" of some sort, akin to the French Wiktionary portals. These could have useful grammar information, helpful entries, and other informational material to readers, maybe even tips on how to contribute.
  • Large pages, like a, would benefit a lot from this. This also helps 56k readers (of which there are still many in the world, mind you).

Of course, this would first require a lot of work to split up the old entries. But the end result is totally worth it and will help us to become a more popular resource on the Internet and possibly even in print.

-- Prince Kassad 09:59, 1 April 2011 (UTC)Reply

I think the problems with the current system could be fixed by tabbed languages. How exactly do you suggest that the search system work in your proposal? And would each of these be separate namespaces? --Yair rand 10:04, 1 April 2011 (UTC)Reply
Awful suggestion. I would leave the project. SemperBlotto 10:06, 1 April 2011 (UTC)Reply
I certainly oppose the suggestion.
I don't see how the "French portal" could be better than the simultaneous existence of WT:AFR and a good set of appendices; and I don't see why the French portal would necessarily benefit from the proposal as a whole.
In addition... If your hypothetical Swedish speaker can't take the time to notice that a Swedish section of der is available below the Danish, Dutch, etc. sections of that entry, he most certainly won't take the time to find out the difference between the individual pages Danish:der, Dutch:der and Swedish:der. --Daniel. 10:42, 1 April 2011 (UTC)Reply
I had a similar idea a while ago, but my idea was to use subpages: der/sv. —CodeCat 11:21, 1 April 2011 (UTC)Reply
How is typing [Language]:[term] more convenient than typing [term]#[Language]? Bear in mind that the vast majority of pages, I would wager, have only one language section, yet every page would require the language prefix. Also, this change would make it much more difficult to compare the same word when it is shared by multiple languages (e.g., lapsus linguae). I oppose this "revolutionary" suggestion that "requires major changes" when the benefits seem so small when compared with the burdens. — Raifʻhār Doremítzwr ~ (U · T · C) ~ 12:18, 1 April 2011 (UTC)
Oppose, I think that this is a very bold change with a high potential to break a lot of things. On the other hand the tabbed browsing approach seems harmless, can easily be switched off, and will essentially provide much the same improved readability effect. Matthias Buchmeier 13:04, 1 April 2011 (UTC)Reply
Tabbed browsing doesn't reduce the total size of the page though. It still all has to be downloaded, because the filtering is done in the browser. —CodeCat 14:17, 1 April 2011 (UTC)Reply
If you want to reduce the total size of the page, filtering out the undesired languages doesn't help much. If you look at the page source, you will immediately see that the page size is mainly bloated by advanced HTML formatting and scripting features such as CSS class tags, long MIME-type lists for the audio files, and so on. So, to improve speed on slow network links, the most effective solution is to provide some minimalistic rendering option switch inside the MediaWiki software, rather than splitting the pages by language. Matthias Buchmeier 15:04, 1 April 2011 (UTC)
Don't forget that providing information about all languages in the same page allows an easy comparison of senses between languages, detection of false friends and of small differences in uses, etc. This is something very useful, and not found anywhere else. The main drawback is that some pages will become too large to be loadable (imagine a description for 5000 languages in the same page). This case will be exceptional, but solutions will have to be found. I think that, in these exceptional cases, a solution might be subpages by language (or subset of languages). Lmaltier 18:55, 1 April 2011 (UTC)
I can imagine a description for 5000 languages, but I don't think it will happen, given the limited number of languages in the world, our practice of not just copying dictionaries, and the distribution of spellings among languages. Simply listing letters--i.e. a is used as a letter in English, Afrikaans, Danish, German, etc.--instead of having an entry for each letter, would massively shrink some of our biggest pages.--Prosfilaes 19:20, 1 April 2011 (UTC)Reply
Yes, letters are the typical case. And each language must be studied separately: for pronunciation, but also for the gender (e.g. h used to be feminine in French, but is now masculine), for derived terms (from A to Z), etc. Lmaltier 20:52, 1 April 2011 (UTC)

Categories in etymology chains

Currently, if an English word was inherited from Indo-European, we add the entry to the derivation categories of Middle English, Old English, Proto-Germanic and Proto-Indo-European. This creates a lot of redundant categories, even though nothing 'special' really happened throughout the word's history. It was simply retained over thousands of years of evolution into the modern language. So I would like to propose that we only add entries to the most recent categories, and those of languages from which it originates through a means other than natural evolution. So the word kettle for example would be added to Middle English derivations, but not Old English or Proto-Germanic (because this is the natural evolution of the word) but it would be added to Latin derivations because the word was borrowed from Latin into Germanic. A completely 'native' word like foot would only be in Middle English derivations. The entire chain of etymology would be shown in the entry as usual, but only the most recent ancestor would actually be categorised. —CodeCat 15:52, 1 April 2011 (UTC)Reply
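The rule proposed above can be sketched in a few lines of Python. This is only an illustration of the proposal as worded, not an existing template or bot: the function name, the chain format and the "inherited"/"borrowed" labels are assumptions made for the example, and the two chains simply restate the kettle and foot cases from the comment.

```python
def derivation_categories(chain):
    """Return the derivation categories under the proposed rule.

    `chain` lists the ancestors of a word from most recent to oldest.
    Each item is (language, how), where `how` says how the word passed
    on from that language: "inherited" (natural evolution) or "borrowed".
    Both the format and the labels are hypothetical.
    """
    categories = []
    for index, (language, how) in enumerate(chain):
        # Keep the most recent ancestor, plus every language the word
        # came from by some means other than natural evolution.
        if index == 0 or how == "borrowed":
            categories.append(f"{language} derivations")
    return categories


# kettle: inherited down from Proto-Germanic, which borrowed it from Latin.
kettle = [
    ("Middle English", "inherited"),
    ("Old English", "inherited"),
    ("Proto-Germanic", "inherited"),
    ("Latin", "borrowed"),
]

# foot: inherited all the way from Proto-Indo-European.
foot = [
    ("Middle English", "inherited"),
    ("Old English", "inherited"),
    ("Proto-Germanic", "inherited"),
    ("Proto-Indo-European", "inherited"),
]

print(derivation_categories(kettle))
print(derivation_categories(foot))
```

Under these assumptions kettle lands in the Middle English and Latin derivation categories and foot only in Middle English derivations, matching the examples in the proposal. The full chain would still be displayed in the entry; only the categorisation changes.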

I immediately see a drawback: some modern English words were present in Middle English and in Old English before that, others were present in Middle English — but are of unclear etymology before that. If I today want a list (or category) of English words derived from Old English, I look at Category:Old English derivations. If your proposal were instituted, I could look at Category:Middle English derivations, but this category would be adulterated with words of unclear derivation (possibly not Old English) — and words derived from languages other than Old English. A modern English word might also clearly derive from an Old English word, but be entirely unattested in Middle English. What would happen in this case? Assume natural evolution and place the word into Category:Middle English derivations, although it is unattested in Middle English? Said briefly, I see several drawbacks. What are the benefits?
I find it also a bit arbitrary to place the words into the category of the most recent parent language as opposed to for example the oldest common ancestor (Old English for English, Old High German rather than Middle High German for German) or oldest known ancestor. Finally, recall the discussions that have happened over whether we should even have Middle English as separate from modern English. - -sche (discuss) 03:12, 2 April 2011 (UTC)Reply
Yeah, the problem is that not very many people seem to care about how many of our words were used in Middle English (though I'm sure some do), whereas the question of how much of our vocabulary is inherited from Old English is a very active one. The change from Middle to Modern English is gradual and relatively uneventful, whereas the change from Old to Middle English is drastic and marked by the imposition of a whole new stratum of borrowings, the loss of gender and case system etc. Hence, the fact that a modern word ‘came from’ Middle English is relatively trivial compared to the question of whether or not it survived the Norman invasion. Plus I agree with pretty much all of what -sche says above. Ƿidsiþ 12:13, 2 April 2011 (UTC)Reply
I see your point... do you think it might be more useful then to categorise only the oldest language in which it was created and from which it naturally evolved? So that foot would be categorised in Proto-Indo-European derivations, but become in Proto-Germanic derivations? Loanwords would then be categorised as derivations of both the languages they were borrowed from and into. —CodeCat 12:36, 2 April 2011 (UTC)

Upcoming votes

I've created a few upcoming simultaneous votes based on discussions created by me in the last few months. The first one of these is scheduled to start in three days.

--Daniel. 02:04, 2 April 2011 (UTC)Reply

Move Romanian entries with cedillas

I've started moving Romanian entries with cedillas to new names with commas according to community decision (Wiktionary:Votes/pl-2011-02/Romanian orthographic norms). I applied for the bot flag (Wiktionary:Votes/bt-2011-04/User:Flubot for bot status) and the tools I'm going to use are here. If there are any comments by anyone I'd be happy to listen to them on WT:GP#Move Romanian entries. --flyax 07:35, 3 April 2011 (UTC)Reply

Serbo-Croatian edits

Very frequently I see Serbian or Croatian terms changed to Serbo-Croatian and vice versa. I never know if this is vandalism, well-meaning error or the correct thing to do. Normally, I neither revert the edits nor mark them as patrolled. I have read the introduction to Wiktionary:About Serbo-Croatian but know nothing of these languages. Am I doing the right thing? (I don't want to start another war) SemperBlotto 10:46, 3 April 2011 (UTC)

Horribly hard to say; the current consensus among editors of the English Wiktionary - I choose those words very carefully by the way - is that Serbo-Croatian is a single language and that Bosnian, Serbian and Croatian are really regional variations. I say "editors of the English Wiktionary" as I'm not convinced that this is the consensus among outsiders, such as people who live in the region and linguists.
Not unique to {{sh}}, could be applied to Catalan/Valencian, Urdu/Hindi, English/Scots and many more; too many to mention without missing a lot more out. Mglovesfun (talk) 14:14, 3 April 2011 (UTC)
Wiktionary:Votes/pl-2009-06/Unified Serbo-Croatian asked whether only sh should be used as a language (to the exclusion of bs, sr, and hr), and ended in no consensus but a majority supporting. So there's IMO nothing wrong with converting a bs, sr, and/or hr entry to a sh one (assuming no information is lost during the conversion). Also IMO nothing wrong with doing the reverse. At some point we'll need to have a rule, I suppose, but right now AFAICT editors can do what they will. OTOH, the usual rule of "keep an entry the way it is if there's no specific reason to change it" would seem to apply.​—msh210 (talk) 14:17, 3 April 2011 (UTC)Reply
Just undo or mark with {{attention|sh}} if there was a genuine content modification (i.e. something other than removal or multiplication). --Ivan Štambuk 14:46, 3 April 2011 (UTC)Reply
Agree with msh210 on this. Mglovesfun (talk) 15:36, 4 April 2011 (UTC)Reply
As there is no consensus for excluding such language sections, there is something wrong with removing sections for accepted languages: no Serbo-Croatian section should be removed, but no Croatian, Serbian, Bosnian section should be removed either. If you want to add a section, just add it, but other existing valid sections must be kept. Lmaltier 17:44, 4 April 2011 (UTC)Reply
Your suggestions would lead to needless editing anarchy and content redundancy, not to mention inevitable multifarious inconsistencies that would emerge from maintenance of such disorderly pile of what is more or less the same lexical data. We have enough problems cleaning up after those poor souls who took your advice on this matter in the past. Please don't encourage new "recruits" by such public display of support for futile cloning efforts. --Ivan Štambuk 22:34, 4 April 2011 (UTC)Reply
More Lmaltier idealism that shouldn't be applied to the real world. Mglovesfun (talk) 23:28, 4 April 2011 (UTC)Reply
This is the only way to discourage edit wars. You encourage taking a position in a very polemic subject and you forget NPOV. A few days ago, we had a comment on fr.wikt stating that Serbo-Croatian entries should not be allowed, because it's not a language. I replied that we accept Serbo-Croatian sections as well as Serbian, Croatian, Bosnian sections, because Wikimedia also accept them in the list of wiktionaries. And this policy does work. Tolerance in the real world is a good thing. Lmaltier 05:51, 5 April 2011 (UTC)Reply
It's not about tolerance, Lmaltier (how about being tolerant to Serbo-Croatian speakers?). There is no phobia against Serbs, Croats, Bosnians or Montenegrins and there is no point in using Serbo-Croatian as an umbrella to cover all of these languages if the redundancies are not removed. The sh contributors have been removing the duplicates to make the maintenance easier. --Anatoli 06:30, 5 April 2011 (UTC)
NPOV is a bad thing to bring up, as you have to have some view on the topic, or else completely ignore it. Saying that Bosnian, Croatian, Serbian and Serbo-Croatian should coexist is a point of view. Mglovesfun (talk) 07:28, 5 April 2011 (UTC)Reply
My point of view is to allow Serbo-Croatian (while describing special cases and language specific usages) and disallow the split into separate BCS languages. We have language specific policies created and maintained by the actual contributors. Let them decide. In my observation, people who bring up the coexistence don't actually work on these languages or are rare visitors and don't know what's involved in maintaining this and how much lexical similarity there is. For example, creating an entry in Bosnian alone may cause users to think that the Serb and Croatian word is different, so a mirror entry would be required with all the flections, examples, whatnot. I mentioned "tolerance" meaning it has little to do with the recognition of the official status of Bosnian, Croatian and Serb. Let's use these terms for politics, for linguists, let's use "Serbo-Croatian". BTW, in Europe there is a demand for Serbo-Croatian interpreters, they no longer cringe at the word. --Anatoli 08:21, 5 April 2011 (UTC)Reply
As I understand it, Wiktionary makes no claims about what is and isn't a "language", it has simply decided to treat all relevant languages/dialects from the area under a Serbo-Croatian header. Ƿidsiþ 08:27, 5 April 2011 (UTC)
The whole notion of a language is completely subjective and arbitrary. There are no languages in the real world: only words with meanings that are used by individuals to convey information. Languages are convenient abstractions fabricated by linguists in order to describe macroscopic properties of verbal interactions among individuals (i.e. how words change in their meaning or certain phonetic properties). As an all-inclusive dictionary, the first of its kind, Wiktionary's focus is on words alone, and not on divides drawn among groups of them by external subjects. We draw our own "borders" on the basis of criteria such as pragmatics and usefulness of presentation. The problem are the folks who relate the arbitrary and convenient abstraction of a language to the also arbitrary and inconvenient abstraction of a nation-state. Borders between languages are as imaginary as those between countries - if you try really hard, you can perhaps convince yourself that they really exist - it's just that you cannot see them or touch them, and the only way to enforce them is with violence (what does one see at borders? Men with guns). That approach of course does not work in the cyberspace because there is no overarching authority to enforce the decisions (except, perhaps, a decree from the Foundation :). It's most unfortunate that certain groupings and splits of words into languages affect the feelings of some, but that's a necessary rather than a deliberate and premeditated evil. --Ivan Štambuk 08:46, 5 April 2011 (UTC)Reply
Yes, roughly like that. From Wiktionary:About_Serbo-Croatian: "Today, each of those states regulates its own standard language, the prestigious literary idiom, termed Croatian, Bosnian, Serbian and Montenegrin, respectively. Those 4 different standard languages are however based on the same dialect - Neoštokavian - are mutually intelligible, and have almost identical grammar and most of the lexis. ...
The term Serbo-Croatian on Wiktionary acts as a generic container to all 4 national varieties". --Anatoli 08:35, 5 April 2011 (UTC)Reply
That bit of Wiktionary:About Serbo-Croatian is so polemically prescriptivist that the only word for it is "mistaken". The "standard language" or "prestigious literary idiom" of each state is termed Standard Croatian, Standard Bosnian, Standard Serbian, Standard Montenegrin. People who don't use the term "Serbo-Croatian" feel no compunctions about applying the terms "Croatian", "Bosnian", "Serbian", and "Montenegrin" to non-standard varieties based on other dialects besides Neoštokavian. —RuakhTALK 18:01, 5 April 2011 (UTC)Reply
Isn't this kind of the same as the difference between the two varieties of Norwegian? Maybe we should look to them to see how this can be solved. —CodeCat 10:41, 5 April 2011 (UTC)Reply
Perhaps both sides are irreconcilable. There have been endless fights. As mentioned above, we had votes but no agreement was reached. --Anatoli 12:32, 5 April 2011 (UTC)Reply
They are not irreconcilable. There is a controversy on this subject (not only here). Choosing one of the options is taking a position, which is forbidden by the NPOV principle. The only way not to take a position is to allow the addition of sections for all these languages and to forbid the removal of correct sections for all these languages. Lmaltier 17:03, 5 April 2011 (UTC)Reply
Wiktionary:About Serbo-Croatian is being changed, I can sense some smoke already... --Anatoli 12:34, 5 April 2011 (UTC)Reply
In case of two forms of standard Norwegian, it was decided a few years ago (there was a long discussion in the BP, look it up if you're interested) that these two are "sufficiently distinct", so it's best to keep them separate. Despite the fact that there do exist dictionaries that cover both forms in a single volume. The problem is that we provide comprehensive coverage, including details such as pronunciation and inflection, and subsuming both under a single entry would introduce enough problems that it's best to keep them separate. In case of Serbo-Croatian varieties, the differences between variant words (variant as in color/colour) are minor and very regular (predictable), and since we're dealing with 4 literary standards (Bosnian/Croatian/Serbian and newly-invented Montenegrin) redundancy is much greater than in case of Norwegian. Interestingly some folks that were pro unified Norwegian were contra unified Serbo-Croatian, despite the fact that the latter makes much more sense than the former. --Ivan Štambuk 14:09, 5 April 2011 (UTC)
I don't touch Serbo-Croatian, but if I did, I would be very unhappy about having to maintain four different language sections independently. Not to mention it's ridiculous to have both Serbo-Croatian and each of BCS. Imagine if a reader came by and saw a Croatian section, a Serbian section and a Serbo-Croatian section all for the same word? What impression would this give them of how professional we are? And what if they appeared to contradict each other, heaven forbid!? For this reason, I support unified Serbo-Croatian. If we think that NPOV is being violated, we could use the ==Bosnian, Croatian, Serbian(, Montenegrin)== header that somebody has proposed before. —Internoob (DiscCont) 04:10, 6 April 2011 (UTC)
Again, I don't think there is a neutral stance on this issue apart from "not touching it". Mglovesfun (talk) 09:51, 7 April 2011 (UTC)Reply

How to grow faster

In the last half year, I have tried to improve the Swedish entries here. I have manually added main entries for many frequently used words, and semi-automatically I have added what I believe to be all missing form entries listed in these main entries. For example, in förtroende (a Swedish noun) all eight slots in the blue declension box are now links to existing form entries. There are now 20,000 Swedish gloss definitions and 80,000 form-of definitions. The ratio 1:4 is reasonable for Swedish. For any given Swedish text, I think 85-90 % of all words are now covered by Wiktionary. But to be a useful Swedish dictionary, the gloss definitions would need to be around 100,000 or five times more than today. If volunteers spend 3 minutes on each entry (which would be really fast), we can theoretically add 80,000 entries in 4000 man-hours (100 fulltime weeks of 40 working hours, or two full man-years). So apparently, this would take a lot more time than we have volunteers for. What methods are there for working faster? How were so many gloss definitions created for Italian, Mandarin, Serbo-Croatian and Finnish? Did they start out with existing dictionaries that could easily be uploaded? --LA2 02:43, 4 April 2011 (UTC)Reply
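The workload estimate above can be checked with a short calculation. This is just a sketch of the arithmetic in the comment; the function names are made up for illustration, and the figures (20,000 existing glosses, a 100,000 target, 3 minutes per entry, 40-hour weeks) are the ones quoted.

```python
def hours_needed(entries, minutes_per_entry):
    """Total working hours to write `entries` entries."""
    return entries * minutes_per_entry / 60


def fulltime_weeks(hours, hours_per_week=40):
    """Number of full-time weeks that `hours` of work corresponds to."""
    return hours / hours_per_week


existing_glosses = 20_000
target_glosses = 100_000
missing = target_glosses - existing_glosses  # entries still to be written

hours = hours_needed(missing, minutes_per_entry=3)
weeks = fulltime_weeks(hours)

print(f"{missing} entries, {hours:.0f} hours, {weeks:.0f} full-time weeks")
```

The result agrees with the figures in the comment: 80,000 entries, 4000 hours and 100 full-time weeks. Reading that as two man-years assumes roughly 50 working weeks per year.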

There is no magic, just lots of work. If you have a list you want to add, just save the basic template and add your entries, here's a template for nouns:
==Swedish==

===Noun===
{{sv-noun|g=c}} <!-- change gender -->

# [[]] <!-- add translation(s) into English -->
It's more important to add the basic lemma forms, the other forms are optional but the method would be the same. Chinese gloss definitions provide links to individual characters, which was a lot of work for some people in the past, currently more work is done on complete words made of multiple characters. Talk to User:Tooironic about Chinese. Not sure now who created the majority of Hanzi entries. Check with User:Hekaheka if she has any special tricks for Finnish. --Anatoli 05:34, 4 April 2011 (UTC)
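For anyone working from a word list, the noun template above can be filled in by a small script. The following is a minimal sketch only: the function name and the sample word are hypothetical, it merely builds the wikitext shown above (the {{sv-noun|g=...}} call is copied from the template), and it does not upload anything. Gender and translations still have to be supplied and checked by a human.

```python
def swedish_noun_stub(gender, translations):
    """Build the wikitext of a basic Swedish noun entry.

    `gender` fills the g= parameter of {{sv-noun}} ("c" in the template
    above); `translations` is a list of English glosses.
    This helper is illustrative, not an existing tool.
    """
    gloss = ", ".join(f"[[{word}]]" for word in translations)
    lines = [
        "==Swedish==",
        "",
        "===Noun===",
        "{{sv-noun|g=" + gender + "}}",
        "",
        "# " + gloss,
    ]
    return "\n".join(lines)


# Hypothetical sample data: page title -> (gender, English translations).
word_list = {
    "bok": ("c", ["book"]),
}

for title, (gender, translations) in word_list.items():
    print(f"--- {title} ---")
    print(swedish_noun_stub(gender, translations))
```

Generating the text is the easy part; the three minutes per entry are mostly spent deciding on the right gloss, which a script like this does not help with.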
The inflected forms of French, Italian etc. nouns, adjectives and verbs are mostly added by bots. The same could be done (not by me) for Swedish (and many other languages). SemperBlotto 07:08, 4 April 2011 (UTC)Reply

language/family conflict

We have both Category:Egyptian language and Category:Egyptian languages. One category is of a language, the other is of a family.

What other languages share their names with a family? --Daniel. 08:25, 5 April 2011 (UTC)

At least none that are represented here on Wiktionary. I have looked around our categories and did not find another such pair. -- Prince Kassad 08:54, 5 April 2011 (UTC)Reply
Arguably both Chinese and English could be called both languages and language families, but we don't call either both. - [The]DaveRoss 10:34, 5 April 2011 (UTC)Reply
We have, however, Category:Sinitic languages, which is an exact synonym of "Category:Chinese languages" with a name that doesn't conflict with the Chinese language. Whether we should have a Category:Chinese language is subject of a separate controversy. --Daniel. 05:10, 7 April 2011 (UTC)Reply
There's also {{etyl:hyx}} and {{etyl:euq}}, I guess. -- Prince Kassad 12:59, 5 April 2011 (UTC)Reply
Thankfully the codes hyx (Armenian family) and euq (Basque family) aren't used, so they don't cause conflicts with languages! On the other hand... That makes our coverage of categories of families technically incomplete; Category:Armenian languages may hypothetically be created eventually to contain Category:Armenian language and Category:Old Armenian language. --Daniel. 08:53, 6 April 2011 (UTC)
Don't forget Category:Middle Armenian language. I would have created that category long ago, if it weren't for that conflict. -- Prince Kassad 09:51, 6 April 2011 (UTC)Reply
Addendum: maybe we could coin a protologism to solve that problem? -- Prince Kassad 20:35, 7 April 2011 (UTC)Reply
What protologism? Egyptic (Egyptian), Basquish (Basque), Anglean (English) and Hayerenian (Armenian)? Doesn't sound like a good idea to me. --Daniel. 20:43, 7 April 2011 (UTC)Reply
Well, it would at the very least solve the language conflict problem. It isn't a nice idea, but people have done it in the past and some of these terms went on to be used by other authors afterwards.
Basque does not need a proto, it has an internationally accepted alternate name (Vasconic). For the others, we would have to be creative: either Hayic or Arminan for Armenian, and possibly Kemetic for Egyptian. -- Prince Kassad 20:50, 7 April 2011 (UTC)Reply
Successful prior experience would be a good argument in favor of protologisms. Kassad, please provide an example of this being done in the past. --Daniel. 20:57, 7 April 2011 (UTC)Reply
Examples are abundant. For example, when Greenberg reformed the classification of African languages, he coined the new terms Afro-Asiatic (replacing the older term Hamito-Semitic), Niger-Kordofanian, Nilo-Saharan and Khoisan, among many others. Most of these are still used today. -- Prince Kassad 21:01, 7 April 2011 (UTC)Reply
addendum: I just created Category:Vasconic languages since, like I said, that name is already established. -- Prince Kassad 08:58, 8 April 2011 (UTC)Reply

There's also {{etyl:sqj}}, which also has a potential for conflicts. What to do with this one? -- Prince Kassad 12:24, 10 April 2011 (UTC)Reply

Simplification of brand names rules

The title is the 10% inspiration. You are to provide the 90% perspiration. Here's a starting point: Wiktionary:Votes/pl-2011-04/Commercial terms. DAVilla 08:42, 6 April 2011 (UTC)Reply

I strongly oppose the proposal in its current form, as in this revision. (But you probably intend someone to modify the proposal to actually achieve the simplification that you speak of.) You are proposing to extend the section that governs brand names to also apply to company names. I have never understood why a term of commercial interest should per default be excluded. I have never seen a plausible argument for the claim that Wiktionary is exposed to a significant risk of being used for commercial promotion via inclusion of brand names and company names. I do not understand the phrase "X has entered the lexicon"; it means nothing to me. --Dan Polansky 09:05, 6 April 2011 (UTC)Reply
If what you want to achieve is a simplification of brand name rules, you should do just that instead of trying to cover brand names and company names under the same rules in the same vote. The task of simplifying the rules for brand names is hard enough (because of the likely opposition); adding more to the plate seems unwise. I am all for simplifying rules for brand names, but I oppose governing company names by the complex rules for brand names, even after their simplification. I think we are lucky that company names are as yet not regulated by a formal vote. --Dan Polansky 09:31, 6 April 2011 (UTC)
Okay, now we're at a legitimate starting point. It could definitely use more eyes.
Part of the process of simplification is to unify different categories. The current splintering is going to lead to a CFI longer than the tax code. What I'm most worried about is that trying to pin down the criteria for company names will lead to the whole thing being rejected. DAVilla 09:37, 17 April 2011 (UTC)Reply

Poll: Including company names

I would like to ask your opinion in an informal poll on whether at least some companies should be included in Wiktionary. This is not so much about the inclusion of a company name alone but rather about the inclusion of a company together with its name: Microsoft is now included as a common noun but not as a proper noun, so Microsoft corporation is now excluded from Wiktionary. Detailed comments on which companies and which company names should be included are welcome: this is a combination of a poll and a discussion.

Currently included are IBM (sense line International Business Machines), BMW (sense line Bayerische Motoren-Werke [...], a manufacturer of motor vehicles), Intel (sense line Intel Corporation, a US-based multinational corporation that is best known for designing and manufacturing microprocessors and specialized integrated circuits; has an etymology), Nokia (sense line Finnish mobile telephone company and brand name), Boeing (sense line An American aerospace company, which created many commercial airplanes.), Sony (sense line An international electronics and media company based in Tokyo, Japan; has an etymology), and more. Some deleted companies include Microsoft (see Talk:Microsoft#Deletion debate). --Dan Polansky 09:53, 6 April 2011 (UTC)Reply

Some companies should have dedicated sense lines in some entries.

  1. Agree Daniel. 10:12, 6 April 2011 (UTC)Reply
  2. Support --Yair rand 10:22, 6 April 2011 (UTC)Reply
  3. Support --Anatoli 10:40, 6 April 2011 (UTC)Reply
  4. Agree. In particular, I want the following entries to have the company included on a sense line: IBM, BMW, Intel, Nokia, Boeing, Sony, Verizon, Motorola, Google, Microsoft. I am not so sure about General Motors; I tend to see it excluded. --Dan Polansky 18:16, 6 April 2011 (UTC)Reply
  5. Agree. In particular, those that are single words (or acronyms). SemperBlotto 18:58, 6 April 2011 (UTC)Reply
  6. Agree Ivan Štambuk 06:11, 7 April 2011 (UTC)Reply
  7. Agree. --Vahag 11:22, 7 April 2011 (UTC)Reply

No company should have a dedicated sense line in any entry.

  1. I can't think of any exceptions offhand, but will be glad to move this to the preceding subsection if I do.​—msh210 (talk) 18:09, 6 April 2011 (UTC)Reply
  2. Agree I suppose this poll is meant to lead to a possible change to the CFI? I'm not necessarily opposed to such a change. But the poll is worded in such a way as could imply support for breaking the CFI, and I do not support that. (I mean, in general there are times that the CFI are just wrong, and we have to break them; but on this point specifically, I'm O.K. with the CFI, and think we should follow them until and unless we change them.) —RuakhTALK 18:18, 6 April 2011 (UTC)Reply
    The poll does not take any stance on whether CFI should be broken. The purposes to which the poll can serve are in part left open. The purposes include informing further proposed changes to CFI. The findings of the poll may be used by various people in various ways. In case of doubt, anyone can state that they do not support breaking current CFI until it is changed via a vote. I am not going to state that, as I have little respect for top-down unvoted-on regulations whose introduction to CFI is supported neither by a vote nor by a poll nor even, it seems, by a discussion in Beer parlour. (Correct me if I am wrong and there is in fact a discussion that proposed the current regulation of company names.) --Dan Polansky 18:48, 6 April 2011 (UTC)Reply
    You explicitly choose not to oppose breaking the current CFI; and I think this poll could easily be read, if option #1 wins, as "See! This rule doesn't have support! We should ignore it!". In fact, you seem to be pre-emptively taking the view that we should ignore it, and viewing this poll as an alternative to following the CFI. If anything, this just reinforces my "Agree" vote. —RuakhTALK 19:26, 6 April 2011 (UTC)Reply
    You seem to be confusing what I am likely to do with what the poll does. The poll asks questions about what people would actually like Wiktionary to do, rather than speculating about what an unvoted-on passage of CFI means. The poll makes certain actions easier, but it does not call for these actions. The poll does not ask "Should CFI for company names be ignored?" but rather "[Do you agree that ...] Some companies should have dedicated sense lines in some entries [?]". Nonetheless, I have called for ignoring CFI for company names several times before the poll, and I feel neither guilty nor ashamed for accepting the consensus principle and rejecting the things-sneaked-into-CFI-reign-supreme principle. --Dan Polansky 07:55, 7 April 2011 (UTC)Reply
    You seem to be operating under the principle that there is some purpose to this poll, that this poll will have some sort of effect: otherwise, why have it? I, therefore, am operating under the same principle. To me, there seem to be two possible intended effects: (1) this could lead to some sort of vote to change the CFI; or (2) this could be used as justification for flouting the CFI. Since you refuse to disavow the latter, I have to take that possibility into account in my "vote". —RuakhTALK 12:40, 7 April 2011 (UTC)Reply
    As option 1 reads "Some companies should have dedicated sense lines in some entries" and most people choosing it are doing so without clarifying what companies and what entries they mean, I don't see how it's useful at all.​—msh210 (talk) 19:34, 7 April 2011 (UTC)Reply
  3. -- Prince Kassad 09:04, 7 April 2011 (UTC)Reply
  4. Support - -sche (discuss) 22:22, 6 April 2011 (UTC)Reply
  5. Agree DCDuring TALK 00:51, 7 April 2011 (UTC)Reply

I have other preference or position, such as indifference or indecision.

Discussion.

This is not the right question. We should include all words, and this should not depend on the sense of the words (except that some attestation rules might be slightly different depending on the sense, in order to prevent abuse). Company names should be included when they can be considered as attested words, when enough attestations independent of the company can be found (e.g. IBM). All senses of these words should be included (e.g. for IBM: the company, or a computer from the company). But they should not be included when it's difficult to consider them as words (e.g. Société nationale des chemins de fer belges) or when the only uses found originate from the company. Lmaltier 17:08, 6 April 2011 (UTC)Reply
See my comment under #Serbo-Croatian edits, before I get told off again. Mglovesfun (talk) 17:11, 6 April 2011 (UTC)Reply
I think it is a good question. It makes it possible for you to agree with the first option as long as at least one company name is a word according to your understanding of "word", and as long as you want that company included on a sense line. From what you have written, it seems that you actually want at least one company included on the sense line--IBM--so you actually agree with the first option, just that for some reason you do not want to indicate this under the corresponding option. The question asked in the poll does not prevent you from stating a tentative decision procedure for inclusion in your comment, which you have done. You may provide details, such as whether you think "Nokia" should be included on a sense line in "Nokia". If a majority of editors thought that no company should have a dedicated sense line, that would be a significant discovery made via this poll. --Dan Polansky 18:04, 6 April 2011 (UTC)Reply
For company names which are words in the typographic sense, it's easy. Nokia is a word in this sense, and it's very easy to find external attestations. Lmaltier 20:23, 6 April 2011 (UTC)Reply
If that is so, I do not see what prevents you from agreeing that "Some companies should have dedicated sense lines in some entries". As you can see from the poll, some people want to see every single company excluded, and this is also what current CFI for company names seems to do. You cannot so easily learn this by asking other questions. --Dan Polansky 08:06, 7 April 2011 (UTC)
Lmaltier, you keep saying this but it doesn't solve anything. All you are doing is shifting the debate to how we define a ‘word’, which is exactly the same debate as we are already having over what to include. This is a semantic quibble. Ƿidsiþ 20:16, 6 April 2011 (UTC)Reply
Asking the right questions makes answers much easier. Lmaltier 20:23, 6 April 2011 (UTC)Reply
Yes, and I agree with Widsith that "what is a word" is not the right one: it merely shifts the debate without addressing it.​—msh210 (talk) 20:31, 6 April 2011 (UTC)Reply
Remember our basic principle : all words, all languages (1st sentence of CFI). I believe that this is the right question: which company names may be considered as words. Semperblotto does not state anything else above. Lmaltier 20:56, 6 April 2011 (UTC)Reply
Sure, that's the basic principle, and sure, the question is then what's a word. But that question has, rightfully, devolved into issues of attestation and idiomaticity. Shifting back into "what is a word" mode is restarting with naivete with no point that I can see.​—msh210 (talk) 21:20, 6 April 2011 (UTC)
In my opinion, if the final result doesn't end up with a policy that does fit with "Is this a word?", there's something wrong with it. ("It's not really a word, it's just a name made up by the company and used by people around/connected to the company. If that's not the case (ie it's used among people not connected to the company), then it is a word.") --Yair rand 21:45, 6 April 2011 (UTC)Reply
Exactly. And the question addressed here is focusing on the total exclusion (or not) of some words/senses of words, because of the sense. This is why I would reword it to focus on attestation. Lmaltier 05:49, 7 April 2011 (UTC)Reply
If I understand you right, you agree with the first option, but disagree strenuously with the second option? And the reason this is "not the right question" is that it doesn't already presuppose your first-option-supporting point of view? —RuakhTALK 12:44, 7 April 2011 (UTC)Reply
Yes, I agree with the first option, but what is important is the reason for including or not including them. Lmaltier 17:59, 7 April 2011 (UTC)Reply

Edit definition gadget

Can someone please undo the "Edit definition" thing that appears at each definition? --Dan Polansky 08:28, 7 April 2011 (UTC)Reply

Um, did something go wrong with the script? It's supposed to be just a tiny grey triangle next to the definition that can expand into a menu. Did something break? --Yair rand 08:34, 7 April 2011 (UTC)Reply
At the very least, I support the possibility of making it easier for me to edit definitions. However, was that change approved by the community? It caught me by surprise, but I may have missed discussions. --Daniel. 08:40, 7 April 2011 (UTC)Reply
It seems to do what it is intended to do, yet I find it outright annoying. I find this sort of interactive-for-dummies interface annoying; I accidentally move the mouse over the little gray triangle and things start popping up in front of me. I want this to be undone. Wiktionary should be a wiki rather than a WYSIWYG interactive thing, IMHO anyway. --Dan Polansky 08:41, 7 April 2011 (UTC)
There is a button to disable it in the WT:BP#Tabbed Languages, Definition side boxes, and Sense IDs section. --Yair rand 08:44, 7 April 2011 (UTC)Reply
Wiktionary should be a wiki rather than a WYSIWYG interactive thing, not only for me but also for other editors, per default and not only when I choose a setting. Let's see what other editors think. --Dan Polansky 08:49, 7 April 2011 (UTC)Reply
The current "WYSIWYG interactive thing" is good enough for me, but a lack of "things popping in front of me" may hypothetically be better. Currently, I have to move the mouse to the arrow, then click on the "Edit definition" that is in a different place; thus the status quo is not simple enough. Not to mention that there is an annoying bug: once you edit a definition, you have to reload the page if you want to edit it again. --Daniel. 08:59, 7 April 2011 (UTC)Reply
I don't understand what you mean. The same definition? You mean if one decides that they don't want the definition they inputted, and want to have it as something different? There's an "undo" button for that in the top left. Or do you think that the ability to press the "Edit definition" button more than once for the same definition is necessary? --Yair rand 09:04, 7 April 2011 (UTC)
The "undo" button stops working after one clicks on "Save Changes". I think that the ability to press the "Edit definition" button more than once for the same definition, after saving a change, is necessary. In other words, it is unnecessary to have the additional feature of disabling the "Edit definition" button just because a definition was edited. For comparison, there is an [edit] button in every entry and discussion page of Wiktionary, including the ones changed by the editor. My programming language teachers were prone to repeatedly remind their students that users don't always take the most practical path, which in this case would be editing a definition only once in one's life, perfectly. (If everyone were able and willing to do that, we wouldn't need that button anyway.) --Daniel. 09:34, 7 April 2011 (UTC)
Okay, I've made the change. --Yair rand 09:45, 7 April 2011 (UTC)Reply
Thanks. --Daniel. 09:54, 7 April 2011 (UTC)Reply

I disabled the script because of the problem seen to the right. -- Prince Kassad 12:56, 7 April 2011 (UTC)Reply

  • I think that the gadget is a definite improvement to the Wiktionary interface, both viewing and editing-wise. When the remaining bugs are fixed, it should be enabled for some representative period of time (e.g. 1 week) so that one can gather statistics (from edit summaries) on how many new users (IPs) edited with it, so that we can have hard numbers quantifying its potential benefits. --Ivan Štambuk 17:19, 7 April 2011 (UTC)
I've fixed the bug and re-enabled the tool. --Yair rand 17:39, 7 April 2011 (UTC)Reply
This edit summary seems to be correct.​—msh210 (talk) 17:50, 7 April 2011 (UTC)Reply
I didn't detect any consensus yet, even for the experiment which Ivan has proposed. User interface changes should not occur without some kind of consensus. Significant changes should first be experimental, once authorized. Once the experiment is complete and results analyzed and discussed, we can see whether there is a consensus that does not need a vote or determine whether a vote is appropriate. Frankly, I don't see why this needs to be said. DCDuring TALK 18:21, 7 April 2011 (UTC)Reply
What do y'all think of maybe (1) enabling this for admins by default and (2) in the little popup-y box, including a link to an information/discussion/disablement page? That way we can test out the change, Yair rand can get feedback, and we can all determine how and whether it can be improved to the point that we want to enable it for all users by default. (Obviously any non-admins who want to try it out could still "opt in", but only admins would be forced to "opt out".) By the way, to make clear — I'm not suggesting that people should feel free to make user interface changes "opt out", even just for admins, without discussion and agreement first. I'm only suggesting a way forward for this specific change. —RuakhTALK 18:39, 7 April 2011 (UTC)Reply
How hard is it to put it on the gadgets page instead of having a one-off method for enabling/disabling it? An alpha test among some set of users as Ruakh suggests seems desirable to make sure that the worst bugs have been caught before beta-testing it on a wider population of users as Ivan suggested. We should first try eating the dog-food before giving it to the dogs. DCDuring TALK 19:04, 7 April 2011 (UTC)
(Sigh...) I've re-added the option for definition side boxes to WT:PREFS, added a gadget to turn it on, and removed the gadget to disable it. In my opinion, the script (at least the parts of it that I enabled by default, which was only definition editing and example sentence adding) has already gotten enough testing for it to be used by default. I've received a lot of positive feedback about the tool over IRC over the past month, and I think the tool is ready for use. Dan Polansky's objections seem to be against the concept of having tools that make it simpler to edit without requiring people to go to the edit screen as a whole ("Wiktionary should be a wiki rather than a WYSIWYG interactive thing"), which the overwhelming majority of the community disagrees with. Short-term analyses of anonymous use of the tool aren't likely to work, since it can take up to thirty days before the old cached JS is cleared and the new JS is actually displayed. (Note: Just in case some people weren't aware, discussion of the script began at WT:BP#Tabbed Languages, Definition side boxes, and Sense IDs.) --Yair rand 02:34, 8 April 2011 (UTC)
Sigh, indeed. There is no amount of discussion among Wiktionary insiders that conveys information about how users respond to the interface. Hypothetical user advantages are mere bits of rhetoric. Thus, until we have an acceptable means of testing, we will always be faced with the risk that our interface changes will only cause bad results. We lack the most basic things, like baseline user satisfaction and behavior data. Thus, until this lack of user data and a valid testing method is corrected, or we find a substitute for such data, such as accepting other websites as models or accepting the authority of user-interface design experts, interface changes should be limited to those our core user group can opt into or out of. DCDuring TALK 04:22, 8 April 2011 (UTC)Reply
By "interface changes should be limited to those our core user group can opt into or out of", do you mean they should be limited to those that are opt-in options and opt-out default-on options, or that they should be limited to default-off options that can be turned on or off by the core user group? --Yair rand 04:30, 8 April 2011 (UTC)Reply
Default-off seems like the best option unless there is a contrary vote. Votes are unfun so I can't imagine that we would ever do default-on in practice. Once we have contributor data that says a high share (1/2, 3/5, 2/3, 7/10, 3/4) of active users have opted in, then it might make sense to help new contributors by making it default-on. DCDuring TALK 04:47, 8 April 2011 (UTC)Reply
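The opt-in rule suggested above can be sketched in a few lines of Python. This is only an illustration: the function name, the example counts, and the choice of "at least" as the comparison are assumptions, and only the five candidate shares come from the comment itself.

```python
from fractions import Fraction

# Candidate shares named in the comment above; no single one was chosen there.
CANDIDATE_THRESHOLDS = [
    Fraction(1, 2),
    Fraction(3, 5),
    Fraction(2, 3),
    Fraction(7, 10),
    Fraction(3, 4),
]


def thresholds_met(opted_in, active_users):
    """Return the candidate thresholds that the opt-in share reaches.

    Hypothetical helper: treats "a high share have opted in" as
    share >= threshold, which is an assumption about the intended rule.
    """
    if active_users <= 0:
        raise ValueError("active_users must be positive")
    share = Fraction(opted_in, active_users)  # exact, no float rounding
    return [t for t in CANDIDATE_THRESHOLDS if share >= t]


# Made-up numbers: 130 of 200 active users opted in (a share of 13/20).
met = thresholds_met(130, 200)
print([str(t) for t in met])
```

With these made-up numbers the share clears 1/2 and 3/5 but not 2/3, which shows how much the outcome depends on which threshold is picked.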
Um, do you think your suggested process for improvements is likely to result in a successful, easy to use/edit interface? --Yair rand 08:07, 10 April 2011 (UTC)Reply
Re: "Dan Polansky's objections seem to be against the concept of having tools that make it simpler to edit without requiring people to go to the edit screen as a whole": Not exactly: I am really happy that Mediawiki makes it possible to edit a single section rather than the whole screen. But with a bit of correction and clarification, yes, I want to edit a wiki rather than a thing whose data structures are frozen because some sort of GUI interface depends on them.
Re: "..., which the overwhelming majority of the community disagrees with": Any evidence please? --Dan Polansky 10:03, 8 April 2011 (UTC)Reply
I still edit using the wiki syntax, but browsing is much more efficient without those monstrous TOCs and a bunch of sections in languages I'm not interested in. The GUI interface generated by the javascript isn't that good, I agree, and needs some polishing. This type of interface is the natural next step of evolution whether you like it or not. IMO all users should, after browsing the site for some time (e.g. more than 10 entries), be presented with some sort of options dialog enabling them to filter languages they're interested in, choose between wiki and GUI editing, and similarly. Then, we can make default whatever most folks choose to use, without needless voting bias of our close-knit community of veterans. The current state of the interface reminds me of the old MSDN: it used to list examples of usage of a certain API in all the programming languages it was available to at once. Later they added check-boxes to filter languages you're not interested in, and now they have tabs that remember your preferred language of choice. --Ivan Štambuk 07:34, 10 April 2011 (UTC)
  • This probably isn't going to influence anything here, but I figure I should point out that the Wikimedia Foundation Board just approved a resolution to (among other things) "urge the Wikimedia community to promote openness and collaboration, by ... Supporting the development and rollout of features and tools that improve usability and accessibility". --Yair rand 08:07, 10 April 2011 (UTC)Reply
    The resolution should influence things. All we need to do is decide on what gives us reason to believe that any one purported improvement will in fact be good for usability among infrequent registered users, unregistered users, new users, and new would-be contributors. Unfortunately almost all we know is about users like us. If we are to take the resolution seriously, we should devote our efforts to getting user information, to learning about usability in general and about wiki usability in particular, and to identifying improvements that have been clearly successful at other wikis. We might also seek to learn from non-wiki sites, especially the social media sites. I'd be interested in what others have found from reviewing the usability initiative concerning tools or data that help in these regards.
    In the absence of knowledge, then presenting choices along the lines Ivan suggests, which can only work for registered users, is an excellent step. If that is all we can do, then we should devise means of encouraging users to register. DCDuring TALK 13:47, 10 April 2011 (UTC)Reply
    I disagree that this would be the course of action most likely to result in an improved interface at this point. While it might make sense at some later time to figure out how to fine-tune the usability and editability of the project and make sure it isn't so much affected by what Wiktionary regulars think of as being simple to use for everyone, right now we're dealing with a system that is virtually impossible to edit and navigate, and you don't need a team of usability experts and a pile of studies and user data to be able to tell that an Edit definition button makes it easier to edit definitions than requiring a user to navigate through kilobytes of incomprehensible gibberish. The new (and old, for that matter) WT:EDIT modules will probably undergo significant changes over time, but we should at least have some way of doing the more basic editing actions, without involving enormous amounts of wikitext. --Yair rand 18:07, 10 April 2011 (UTC)Reply
    Re: "...right now we're dealing with a system that is virtually impossible to edit and navigate, ...": This sort of implausible statement makes me even more distrustful of what you are trying to do. I have been navigating and editing Wiktionary using wiki markup for years, and it worked quite well for me. Looking at your mainspace contribution, though, you seem to prefer to play around with Javascript instead of contributing content and editing content. I really do not think that your lack of content contribution has to do with "virtual impossibility to edit Wiktionary". Re: "... we should at least have some way of doing the more basic editing actions, without involving enormous amounts of wikitext ..." What? Do you imply that, currently, editing definitions requires editing "enormous amounts of wikitext"? Does editing etymologies require editing "enormous amounts of wikitext"? Your picture of how hard it is to edit Wiktionary seems really distorted. --Dan Polansky 10:16, 12 April 2011 (UTC)
    Mm-hm. Sure. See, this conversation is getting really difficult as I seem to be going under the delusion that there's not a single person on this project who agrees with you that Wiktionary is easy to edit and that the interface doesn't need real improvement, and that no one even comes close to having this view. What craziness is going through my head. Let's fix this whole mess by proving me wrong, shall we? (See box below.) --Yair rand 10:30, 12 April 2011 (UTC)Reply
    You apparently do not see any problem with your extravagant claims. You equate my opposition to your extravagant claims with "Wiktionary is easy to edit and needs no improvement", which my opposition to the claims does not amount to. I see no ability on your side to critically evaluate what you say. Now you start some sort of a poll in a thread that has already been running for five days, a slightly deceptive technique if you ask me. If you want to create a fair poll, do so: create a new section heading that starts "Poll: " or the like, formulate a clear question, and let us see what the responses are going to be. --Dan Polansky 11:06, 12 April 2011 (UTC)
    What is your position, then? You said above that "Wiktionary should be a wiki rather than a WYSIWYG interactive thing", which, as far as I can tell, equates "wiki" with the edit screen. Should the edit screen continue to be the primary method of editing, in your opinion, or not? If so, you seem to think that the editing system is satisfactory and needs no improvement. Am I wrong about this? (And re the box below, that's not a poll. I just want to know if there is at least one person who agrees with you that the WT:EDIT project should not be pursued. Should I re-post it at the bottom of the BP?) --Yair rand 11:14, 12 April 2011 (UTC)Reply
    My position is that not each section and each type of content should be turned into a thing edited per default via interactive GUI. I have supported the interactive editing of translations, which was enabled, created or co-created by Conrad Irwin. (I don't remember which is the case.) I do not know what the "WT:EDIT" project amounts to. The shortcut "WT:EDIT" refers to "User talk:Conrad.Irwin/editor.js", which does not describe any project and its scope. If you can refer me to a section of that page that describes a project that I allegedly oppose, I can have a look at it. I have opposed your proposal in another thread (#Tabbed Languages, Definition side boxes, and Sense IDs, 2 March 2011), in which you have proposed introducing (a) tabbed interface for a tab per language, (b) the editing tool discussed in this thread (User:Yair rand/editor.js), (c) sense IDs. --Dan Polansky 11:30, 12 April 2011 (UTC)
    The script associated with WT:EDIT (User:Conrad.Irwin/editor.js) contains a framework for a Wiktionary semantic editor, with a lot of code to make it simple to add modules to the editor. Currently enabled modules include the translations adder, the translations balancer, the tgloss editor, the definitions adder, and the rhymes editor. (WT:EDIT#Coding lists a bunch of ideas for modules, one of which I've successfully built into User:Yair rand/editor.js.) The aim is to make it possible to edit without wading through piles of wikitext on the edit screen. --Yair rand 11:41, 12 April 2011 (UTC)Reply
    If there is a project that you want to implement and that you seek agreement for, you should describe the project a bit at some location that belongs to the project. "User:Yair rand/WT:EDIT project" might be such a location, or "User:Yair rand/Easy editing project" or whatever. The first heading of User:Conrad.Irwin/editor.js is "Usage"; the table of contents can only be found three page-downs below, so going to sections that are located below is a bit painful. The section "Coding" contains a subsection "Features", which does not speak of editing of definitions, but what it speaks of is "(Add example sentence/quotations buttons)". I do not see any statement of aim to turn each Wiktionary section into a GUI editable thing. So again, if there is a project that you are picking on and pushing, it should be better described on a subpage in your user space; some 200 words should do the first job, and should not be too much work, especially compared to the amount of coding that you already did. --Dan Polansky 12:04, 12 April 2011 (UTC)Reply
    I'm not really sure what you're saying; WT:EDIT already exists. Re the name, Cirwin suggested a while ago that WT:EDIT could be moved to Wiktionary:Editing UI, but I don't really see any problem with the current location. --Yair rand 13:20, 12 April 2011 (UTC)Reply
    I really don't like the location "User talk:Conrad.Irwin/editor.js". I have started editing there, but the namespace suggests I actually should not. If this is the project page for WT:EDIT, the page should still state that (a) there is a project, (b) what the project is trying to do, soon and eventually, (c) do so under a dedicated heading. --Dan Polansky 14:28, 12 April 2011 (UTC)
No, I don't think it's 'easy' just by the nature of what Wiktionary is. Wiktionary is complicated. That said, I don't think we should make these tools default but rather an opt-in thing, raise awareness about them, and this is the best place to do that. --Mglovesfun (talk) 10:51, 12 April 2011 (UTC)Reply
Are the objections to this tool specifically, or to javascript tools on by default in general? Are you hoping that at some point down the line someone will make a better tool for editing definitions? I would like to know if the community is opposed to the WT:EDIT project, and if it is, then I should cease putting in effort to making it better. --Yair rand 10:58, 12 April 2011 (UTC)Reply
Re: "...Dan Polansky's view that Wiktionary is simple to edit and that the interface does not need improvement": I have not said this, but my view does come within two shooting distances of the statement. The interface may benefit from an improvement here and there; Wiktionary is reasonably simple to edit. Phrasing good definitions and finding the right slicing into senses and their definitions is hard, much harder than using Wiktionary's current editing interface. --Dan Polansky 11:12, 12 April 2011 (UTC)Reply

Just poking my head in here as a relatively new user: Wiktionary is quite difficult for the average Joe-not-a-linguist to add information to, possibly even more so than Wikipedia, in respect to formatting and the use of templates. If one's contribution isn't completely correct and formatted, there's a good chance it'll be reverted with no explanation in the following thirty minutes. With that said, I am wholeheartedly in favor of making features which unambiguously make it easier to edit at least optional, if not default. — lexicógrafa | háblame 12:30, 12 April 2011 (UTC)

Hm? It already is optional, this is about whether it should be on by default. --Yair rand 13:20, 12 April 2011 (UTC)
Are you even listening to yourself? Not three inches up the page you've put a huge boxy div, font-size 18px, line-height 32px, asking people to say whether they agree with your straw-man presentation of Dan Polansky's general view on the Wiktionary edit interface, which is supposedly "that Wiktionary is simple to edit and that the interface does not need improvement". Nothing there about this specific script, nothing there about whether such improvements should be on by default. If you don't want people to respond to that, then maybe you shouldn't have made it so huge? —RuakhTALK 14:47, 12 April 2011 (UTC)Reply
Oh. I kind of figured that the box would look like an invitation to add one's name to the list inside the box, and didn't realize that the comments after were in direct response to that. Well, I feel ridiculous. I've collapsed the box, it was probably a bad idea. I guess a proper poll could be started, as soon as I figure out what the disagreement is about, since it's starting to look like I'm completely misunderstanding what Dan Polansky's objections are, though I'm not sure about that... Dan Polansky, are you objecting to any further expansion of WT:EDIT (ie adding more modules on by default), or just these specific tools, or just to having tools in certain areas, or ...? --Yair rand 15:15, 12 April 2011 (UTC)
How about describing in 200 words on the pseudo-home page of the project what the WT:EDIT project is trying to achieve? I am scared of the idea of etymologies and definitions being editable using a GUI gadget by default. I do not oppose any such expansion that is not enabled by default. I will oppose expansions considered for default enablement on a case-by-case basis, yet currently I am inclined to oppose further GUI gadgets. However, given that I am only a single person, a quick poll could show that I am in a very small minority, which would end this discussion anyway. --Dan Polansky 15:30, 12 April 2011 (UTC)

Wiktionary:Votes/2011-04/Lexical categories

The vote Wiktionary:Votes/2011-04/Lexical categories is scheduled to start in 3 days, after a number of tweaks by multiple editors, and postponings. Feel free to double-check the set of categories to be affected by this vote before it starts. --Daniel. 09:27, 7 April 2011 (UTC)Reply

The glossary and language codes

The page WT:Glossary says that "en" is English, "dat" is the Darang Deng language, "est" is Estonian, "pl" is Polish, among few other language codes.

I can think of a number of possible reasons to display this information there, but frankly WT:Glossary simply isn't a good place to look for language codes. Most aren't listed anyway, making it thankfully incomplete. I suggest removing all the language codes from there to make it cleaner. --Daniel. 13:54, 7 April 2011 (UTC)

Yeah support and add to internal links, or see also, or whatever, links to such appendices/WT pages. The page could do with a spring clean. Mglovesfun (talk) 14:00, 7 April 2011 (UTC)Reply
yup, support. -- Prince Kassad 14:03, 7 April 2011 (UTC)Reply
Yeah.​—msh210 (talk) 16:25, 7 April 2011 (UTC)Reply

Palindromes by language

I have a proposal for organizing our coverage of palindromes. Since this involves converting two huge appendices into dozens of smaller big appendices, I thought it would be better to explain it here, rather than at WT:RFM or WT:RFC. The idea is:

Thoughts? --Daniel. 14:14, 7 April 2011 (UTC)Reply

Actually it's already been at RFC and RFM (in that order) with no input. Support. Mglovesfun (talk) 14:17, 7 April 2011 (UTC)Reply
Sounds good to me. The current pages can link to the new appendices IMO. (Even better IMO, can redirect to Appendix:Palindromes (currently a redirect to Appendix:Palindromic words), which can, in turn, link to the new appendices.)​—msh210 (talk) 16:23, 7 April 2011 (UTC)Reply
Sure, Appendix:Palindromes can link to Appendix:Portuguese palindromes, Appendix:English palindromes, etc. However, a Category:Appendices of palindromes would do the same job unless I'm missing something, so the appendix of the list of links probably would be unnecessary. --Daniel. 16:41, 7 April 2011 (UTC)Reply
Good point. I agree. The deletion summary for the current appendices should mention such category, so people looking for the old page can find the new ones.​—msh210 (talk) 17:18, 7 April 2011 (UTC)Reply

Poll: Playing cards

There are certain terms defined as playing cards, including queen of spades, ace of spades and queen of diamonds. Simultaneously, certain names of playing cards are actually redlinks, including two of spades, two of diamonds and ten of hearts.

Notably, the 16 playing cards that begin with "ace", "jack", "queen" and "king" are defined while the 36 ones that begin with "two", "three", "four", "five", "six", "seven", "eight", "nine" and "ten" are not.
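The counts above can be checked with a short Python sketch that builds the names as "rank of suit". The list and function names are illustrative assumptions; only the ranks, the suits, and the 16/36 split come from the discussion.

```python
from itertools import product

SUITS = ["spades", "hearts", "diamonds", "clubs"]

# Ranks whose cards currently have entries, per the paragraph above.
DEFINED_RANKS = ["ace", "jack", "queen", "king"]

# Ranks whose cards are currently redlinks, per the paragraph above.
REDLINKED_RANKS = [
    "two", "three", "four", "five", "six",
    "seven", "eight", "nine", "ten",
]


def card_names(ranks, suits=SUITS):
    """Build entry titles of the form '<rank> of <suit>' for every pair."""
    return [f"{rank} of {suit}" for rank, suit in product(ranks, suits)]


defined = card_names(DEFINED_RANKS)
redlinked = card_names(REDLINKED_RANKS)

print(len(defined), len(redlinked), len(defined) + len(redlinked))
# prints: 16 36 52
```

The two lists together cover the full deck of 52 (4 suits, 13 ranks), with the joker left out as in the poll question.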

A more-or-less recent RFD discussion listed and deleted a number of them. However, it didn't list them all and didn't delete them all. Another, slightly less recent RFD discussion kept ace of hearts in the end. There are opposing arguments.

I would like to know the opinions of people on what to do with them, so I have one specific question:

In your opinion, how many of the following 52 entries (4 suits, 13 cards per suit; joker isn't included) should be defined as English names of playing cards?

Thank you. --Daniel. 18:06, 7 April 2011 (UTC) Edited for legibility.​—msh210 (talk) 19:39, 7 April 2011 (UTC)Reply

All of them

  1. Agree Daniel. 18:06, 7 April 2011 (UTC)Reply
  2. Agree. No really, I do. Mglovesfun (talk) 18:19, 7 April 2011 (UTC)Reply
    Clarification, because of Talk:ace of diamonds. Sets a precedent which has not yet been overturned. Mglovesfun (talk) 13:06, 11 April 2011 (UTC)Reply
  3. Agree. --Anatoli 22:09, 7 April 2011 (UTC)Reply
  4. Agree WurdSnatcher 02:44, 8 April 2011 (UTC)Reply
  5. Agree Ivan Štambuk 07:14, 10 April 2011 (UTC)Reply

None of them

  1. Except that any that has a secondary meaning (two sense lines in English) should keep that sense line and should have {{&lit}} for the playing-card sense. (I suppose this really belongs in the "Some of them" section.)​—msh210 (talk) 19:39, 7 April 2011 (UTC)Reply
    • The basic idea of "None, except..." is clear enough for the purposes of this poll, but it is ambiguous in practice. If someone cares to prove that every listed card has a secondary meaning, then your choice naturally would automatically be read as "All of them"; however, now it logically means "Some of them" indeed. --Daniel. 19:48, 7 April 2011 (UTC)Reply
  2. Agree -- Prince Kassad 20:11, 7 April 2011 (UTC)Reply
  3. Agree Ƿidsiþ 21:18, 7 April 2011 (UTC) per Msh210Reply
  4. Agree with msh210 (and presumably Ƿidsiþ). —RuakhTALK 21:34, 7 April 2011 (UTC)Reply

Some of them

  1. Agree per Msh210. ~ heyzeuss 07:07, 8 April 2011 (UTC)Reply

The current sixteen

  1. Agree DCDuring TALK 18:14, 7 April 2011 (UTC)Reply
    What are the "current twelve"? Currently, we have sixteen bluelinked.​—msh210 (talk) 19:56, 7 April 2011 (UTC)Reply
    DCDuring, as the creator and only current voter of this section, please change the header or clarify what it means. In the meantime, it would be better if other people who want exactly 12 or 16 bluelinks voted for "Some of them", which covers everything from 2 to 51 entries. --Daniel. 20:07, 7 April 2011 (UTC)Reply
  2. Agree? DAVilla 08:55, 17 April 2011 (UTC)Reply

I am indifferent or indecisive

  1. Agree: I don't know. --Dan Polansky 11:14, 8 April 2011 (UTC)Reply

(Discussion)

  • I would include them if they can carry some useful lexicographical information, yet they either are semantic sums of parts or they border on being so. For instance, "king of spades" is translated into Finnish as "patakuningas" and into Czech as "pikový král", so the order of words is reversed. Having "king of spades" is certainly a convenience for a translator, who does not need to look up some appendix for how names of cards are formed in various languages. The thing might be expressed in English as "spade king", but it is not. I might lean to inclusion also considering that 52 entries is not all too many, but I do not know what flood of other entries made on this model this would open. I have documented the formation of this sort of phrase by placing examples into fragments of example sentences, as "piková dáma" in the pikový entry. --Dan Polansky 11:14, 8 April 2011 (UTC)
To help decide what should be included to override the CFI, we could have little votes for specific groups of words, expressions and combinations, even if they are SoP, like this with the result attached to the category page. What do you people think? --Anatoli 06:15, 21 April 2011 (UTC)

Idiomaticity of individual words.

Wiktionary:Criteria for inclusion#Idiomaticity gives megastar as an example of an idiomatic expression, because it (mostly) only uses one specific sense of mega- and one specific sense of star. The implication is that even individual words are subject to the "idiomaticity" criterion, and could be deleted as NISOP.

I think this is more or less false; there are various edge cases where we've decided to exclude certain NISOP wordlike entities, but always the core of the argument was that the entities weren't actually single words. (For example, we exclude brother's on the grounds that it's not a single word, but rather a word (brother) plus a clitic ('s).) I don't think that applies to megastar, which would be a single word even if it were NISOP.

I think this needs to be fixed, but I'm not sure exactly how. Anyone have any thoughts?

RuakhTALK 20:31, 7 April 2011 (UTC)Reply

Our current solution has just been to ignore this section of CFI altogether, at the very least since I started contributing in 2009. I'd like it changed just because it's plain wrong, but ignoring it is a good old-fashioned simple solution, and it's worked fine until now. Mglovesfun (talk) 20:40, 7 April 2011 (UTC)
I dunno, I think most of that section is O.K.; it's just the "megastar" part that bothers me. I mean, even without that part it could be taken as applying to individual words, but that's the only part that implies that it should be taken that way. (Unless there are other problems I haven't noticed . . .) —RuakhTALK 21:49, 7 April 2011 (UTC)Reply
In my ideal dictionary certain words would not be included except in some kind of lesser status, like in an appendix or as a run-in entry. Examples would be many of the words created by affixation (fewer by compounding) using productive affixes and combining forms. Only if a meaning were not decodable from the components would it merit a full-fledged entry. Accordingly, I find the idea behind the section appealing. Why we need to have all of the members of Category:English words prefixed with un-, Category:English cardinal numbers, and Category:English ordinal numbers is beyond me. However, I doubt that we could implement it in our processes without a further increase in unproductive discussions and more inconsistent implementation. Further, we seem to always find advocates for the principle at least of, say, including all cardinal numbers, however impossible that is, and for including all attestable cardinal numbers, as a fallback position.
If someone else would take the trouble to propose a scheme for efficiently excluding "non-idiomatic" words, I would be happy to entertain the possibility that it would not lead to endless unproductive debate. DCDuring TALK 04:00, 8 April 2011 (UTC)Reply
I'm actually O.K. with a hypothetical idea of sometimes excluding words that are NISOP, but current, vote-approved practice is not only to include all NISOP words, but even to include all NISOP phrases that are sometimes written as single words (link), and all regular inflected forms of single words (link). I consider it more important that the CFI reflect accepted common practice than that the CFI or accepted common practice be perfect. —RuakhTALK 14:27, 8 April 2011 (UTC)Reply
I would just remove this from CFI altogether:
For example, mega- can denote either a million (or 2^20) of something or simply a very large or prominent instance of something. Similarly star might mean a celestial object or a celebrity. But megastar means "a very prominent celebrity", not "a million celebrities" or "a million celestial objects", and only rarely "a very large celestial object" (capitalized, it is also a brand name in amateur astronomy).
The most telling part of CFI#Idiomaticity is the first sentence: "An expression is “idiomatic” if its full meaning cannot be easily derived from the meaning of its separate components." While this sentence is not without problems, the discussed paragraph does not help to fix these problems. --Dan Polansky 10:40, 8 April 2011 (UTC)
My general beef with CFI is that where there's a consensus among editors, often CFI doesn't reflect this and says, in fact, we should be doing something else. I think that's to a large extent because votes are long and getting 70% is often going to be difficult, hence our solution of just ignoring whatever CFI says and carrying on regardless. Which I seem to dislike more than most editors. Mglovesfun (talk) 16:40, 8 April 2011 (UTC)Reply
The threshold of 2/3 is being discussed alongside of 70%, for votes that are not about voting and similar meta-things. While at least one admin has mentioned even the threshold of 80%, that does not make any sense to me. An admin that would use the threshold of 2/3 when closing a vote would be doing nothing too revolutionary, AFAICT. I was mentioning the threshold of 70% alone some time ago, but after some research I realized there is not much precedent for pushing that threshold as the common practice, so now I am usually mentioning the pair of 2/3 and 70%. --Dan Polansky 17:06, 8 April 2011 (UTC)Reply
Probability arithmetic would suggest that the smaller the number of voters the higher the voting threshold to achieve the equivalent level of certainty about the "will of the people". DCDuring TALK 18:05, 8 April 2011 (UTC)Reply
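The arithmetic behind this point can be sketched roughly as follows. This is only an illustration under stated assumptions, not a proposal for how votes should be closed: it treats the voters as a random sample, takes "certainty" to mean that the observed support would be unlikely (below 5%) if the wider community were actually split 50/50, and the function names `binomial_tail` and `min_support_share` are made up for this example.

```python
from math import comb

def binomial_tail(n, k, p=0.5):
    """Probability of at least k supporting votes out of n if each voter supports with chance p."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

def min_support_share(n, alpha=0.05, p=0.5):
    """Smallest share k/n that would occur with probability below alpha under p.

    Returns None when even a unanimous vote is not unlikely enough
    (this happens for very small n, e.g. n = 4 at alpha = 0.05).
    """
    for k in range(n + 1):
        if binomial_tail(n, k, p) < alpha:
            return k / n
    return None

# Assumed example electorate sizes; the required share shrinks as n grows.
for n in (5, 10, 20, 50):
    print(n, min_support_share(n))
```

Under these assumptions a vote with few participants needs a much higher share than one with many participants to give the same level of confidence, which is the direction of the effect described above.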
The general principle is "all words", but common sense should prevail: if a bot creates trillions of pages for infinite series (such as numbers), the project will be closed. Therefore special attestation criteria are required for such cases. Another case is about words that anybody can create for their own use, such as brand names, company names, etc. In such cases, precise attestation rules are also required.
But we should keep the principle "all words". If most dictionaries exclude many words, especially those easily understandable from their components, it's only because their limited space available is better used for other words. These considerations don't apply here.
About the idiomaticity criterion: I would remove it altogether, and replace it with the question "is the word or phrase part of the vocabulary of the language?": blue bicycle is not an element of the vocabulary, while good morning, even if not idiomatic, is an element of the vocabulary of English, because this phrase must be learned by people learning English. Lmaltier 20:19, 8 April 2011 (UTC)
I'm not hypothetically against the exclusion of unidiomatic single words like Zirkusschule (Talk:Zirkusschule) but I can't see how it would work in practice. How do you exclude some and yet keep others? Do people want headache and birdcage to not overtly meet CFI? I sincerely doubt it. So I'd like to see unidiomatic words meet CFI. That was in my amendment to CFI that I proposed in 2010, which failed. Mglovesfun (talk) 13:11, 11 April 2011 (UTC)Reply

Feature request: Reverse translation button

When I hover my cursor over a translation, I would like a button in the popup that sends me to an edit page with the reverse translation prefilled. ~ heyzeuss 06:55, 8 April 2011 (UTC)Reply

Where would the headword-line wikitext come from? --Yair rand 07:05, 8 April 2011 (UTC)Reply
'''{{subst:PAGENAME}}'''?​—msh210 (talk) 08:29, 8 April 2011 (UTC)Reply
But then the entry would be uncategorized. --Yair rand 08:30, 8 April 2011 (UTC)Reply
True; is that terrible? Note also that the prefill couldn't have a POS header (or it could, from the English, but it would be very tentative, since the FL word might not have the same POS as the English). Presumably, no one would click the link unless he was sure doing so would yield a correct entry; and such a person would usually know some more about the word that he could add in, such as an inflection line or POS category.​—msh210 (talk) 09:06, 8 April 2011 (UTC)Reply
Good point. I've made User:Yair rand/transnewentry.js, which pulls the POS header from the English and uses {{infl}} as the default headword-line. --Yair rand 09:45, 8 April 2011 (UTC)Reply
Thanks, that was pretty fast. I tested it on ajatusvirhe, and it did what I wanted. I have some other ideas for it, too, but I'm pretty happy with what I've got. ~ heyzeuss 11:48, 8 April 2011 (UTC)Reply

Language categories

I don't know whether it has been already discussed or not but I think that language categories are not as useful as they could be. Here is an example: I want to find all Romanian words written here with ş, ţ (cedilla characters) in order to move them to new names with ș, ț (comma characters). So I am obliged to go through each and every separate Romanian category (nouns, adjectives, verbs etc). My job would be much easier if I could find all Romanian words in just one category. And it's not just me, I suppose. Why should users wait for the next dump and spend time and energy to build indexes of Romanian words when it's so easy to have a full list of this (or any) language's words? Other Wiktionaries (Greek, Romanian, French etc) categorize entries in both categories, as nouns, adjectives, verbs and as words of a certain language. Couldn't we do the same? --flyax 09:04, 10 April 2011 (UTC)

Indexes of all words of a language are more useful than categories of all words of a language, unless I'm missing something. Indexes have audio examples, definitions and more space.
It would be very easy to make a query to look for what entries contain ş or ţ in the title and a Romanian section in the body, wouldn't it?
In addition, the possibility of finding all Romanian words that include a certain letter with a certain diacritical mark could be implemented with the creation of Romanian counterparts of categories like these:
--Daniel. 10:17, 10 April 2011 (UTC)Reply
I didn't mean to suggest abandoning indexes. They are good for anyone who needs them. Maybe it's easy to make a query in them but I don't know how. On the other hand it's easy to search through a certain category using API commands. Alternatively, go type the category's name on Special:Export, click add, and have the words in it (it will work perfectly for categories with less than 5000 members). So, for me, it's much easier to have comprehensive language categories - and it costs nothing. --flyax 10:42, 10 April 2011 (UTC)Reply
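For anyone who wants to try the API route mentioned above, here is a minimal sketch in Python. It only builds the request URL for the standard MediaWiki `list=categorymembers` query and filters a list of titles for the cedilla letters; it does not fetch anything itself. The helper names `category_members_url` and `titles_needing_move` are invented for this example, and "Category:Romanian nouns" in the usage lines is just an illustrative category name.

```python
from urllib.parse import urlencode

API_ENDPOINT = "https://en.wiktionary.org/w/api.php"

# Legacy cedilla letters: S/s with cedilla, T/t with cedilla.
CEDILLA_LETTERS = set("\u015e\u015f\u0162\u0163")

def category_members_url(category, continue_token=None, limit=500):
    """Build a categorymembers query URL; pass the cmcontinue value to get the next batch."""
    params = {
        "action": "query",
        "list": "categorymembers",
        "cmtitle": category,
        "cmlimit": limit,
        "format": "json",
    }
    if continue_token:
        params["cmcontinue"] = continue_token
    return API_ENDPOINT + "?" + urlencode(params)

def titles_needing_move(titles):
    """Keep only the titles that contain at least one cedilla letter."""
    return [title for title in titles if CEDILLA_LETTERS & set(title)]

# Usage (illustrative category name and titles):
print(category_members_url("Category:Romanian nouns"))
print(titles_needing_move(["\u0163ar\u0103", "\u021bar\u0103", "casa"]))
```

With one comprehensive language category, a single loop over such URLs would cover every entry; with the current per-part-of-speech categories the same loop has to be repeated for each category.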
I like flyax's idea better than Daniel.'s; Category:English is simple, does not proliferate categories, and allows things Category:English terms spelled with A, Category:English terms spelled with B etc would not. On de.Wikt, users have used it to add translations systematically to every page. - -sche (discuss) 04:22, 15 April 2011 (UTC)Reply
In theory, both Category:English and Category:English terms spelled with B can exist simultaneously as they serve different purposes. However, the usefulness of the latter is obscure, because B is an extremely common letter in English. --Daniel. 04:27, 15 April 2011 (UTC)

Since my job with Romanian entries is almost done (I'm just waiting for the vote to get the bot flag) let me add another example. The next step is to change cedillas to commas in English entries, i.e. find all instances of {{t|ro|something}} and modify them. Suppose that I have prepared the fix and want to run it. The command would be:

python replace.py -fix:ro-something -cat:'English something'

You see, it will be much easier for the bot to run in just one category that would contain all English words. --flyax 11:16, 10 April 2011 (UTC)Reply
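To make the kind of fix meant here concrete, the replacement itself could look roughly like the sketch below. This is not the actual "ro-something" fix used with replace.py, only a standalone illustration; the function name `fix_romanian_translations` is invented, and the pattern deliberately touches only `{{t|ro|...}}` templates so that other languages that legitimately use cedillas (Turkish, for example) are left alone.

```python
import re

# Map the legacy cedilla letters to the comma-below letters used in Romanian.
CEDILLA_TO_COMMA = str.maketrans({
    "\u015e": "\u0218",  # capital S with cedilla -> capital S with comma below
    "\u015f": "\u0219",  # small s with cedilla   -> small s with comma below
    "\u0162": "\u021a",  # capital T with cedilla -> capital T with comma below
    "\u0163": "\u021b",  # small t with cedilla   -> small t with comma below
})

# A {{t|ro|...}} template with no nested templates inside it.
RO_TRANSLATION = re.compile(r"\{\{t\|ro\|[^{}]*\}\}")

def fix_romanian_translations(wikitext):
    """Replace cedilla letters with comma-below letters inside {{t|ro|...}} only."""
    return RO_TRANSLATION.sub(
        lambda match: match.group(0).translate(CEDILLA_TO_COMMA), wikitext
    )

# Usage: the Romanian template is changed, the Turkish one is not.
line = "* Romanian: {{t|ro|\u0163ar\u0103|f}} * Turkish: {{t|tr|\u015fehir}}"
print(fix_romanian_translations(line))
```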

In reply to "Indexes of all words of a language are more useful than categories of all words of a language, unless I'm missing something." If we had such indexes, yes. Currently indexes like Index:Old French don't include inflected forms like plurals, verb forms, adjective forms, etc. Using AutoWikiBrowser you could do "Category (recursive)" and "Category:Romanian parts of speech". That would get everything apart from uncategorized Romanian entries. Mglovesfun (talk) 12:54, 11 April 2011 (UTC)Reply
That's interesting. I've never tried using AWB so far. Thanks for the tip--flyax 13:11, 11 April 2011 (UTC)Reply
Now I see that there is a similar possibility for replace.py: use subcat instead of cat. My fault; please ignore my last example. --flyax 14:22, 11 April 2011 (UTC)Reply

Wiktionary:About Latin#Orthography for Latin entries

Do we want to keep this as it is? For example, we have Latin entries which use j. I don't know about ligatures; certainly there are attestable Latin words using ligatures, for example pæninsulae, according to WT:RFV#pæninsulae. I'm inspired by the debate over vp meaning up. It seems to me that the Latin i/j situation is the same; i and j may not have been considered separate letters for a long time, perhaps a thousand years or even 1500, but they are now. I also find the section horribly inconsistent; do not use ligatures "as these do not appear in Classical Latin", but use v and u separately despite the fact that they don't appear in Classical Latin. Mglovesfun (talk) 12:58, 11 April 2011 (UTC)

You should possibly ask EncycloPetey (talkcontribs) (perhaps on Wikipedia [11], since he seems to be more active there now), considering the fact that he has written most of WT:ALA. Caladon 10:51, 12 April 2011 (UTC)Reply
I was hoping for a consensus, so no. Mglovesfun (talk) 13:57, 12 April 2011 (UTC)Reply
Yes, EP has left the project. We should decide for ourselves. Personally, I won't be adding any ligatures - I don't know how to type them. — This unsigned comment was added by SemperBlotto (talkcontribs) at 14:10, 12 April 2011 (UTC).Reply
These are the modern rules for spelling Latin, which is a good orthography to normalize to.--Prosfilaes 15:20, 12 April 2011 (UTC)Reply
I see two issues there: the first is whether to recommend one spelling/form over another, like paeninsula over pæninsula; the second is whether to exclude the non-recommended spellings/forms in spite of any attestation one might find. I can't see how to do that without bypassing CFI. It goes way beyond Latin, in terms of consistency. Would anyone want to exclude an English term with a ligature that's attested, such as præsent instead of praesent? There are so many comparisons that I could make, I'd like to stop there for fear of going too far off topic. --Mglovesfun (talk) 15:32, 12 April 2011 (UTC)
English is easily attested. Considering words with completely regular variations of orthography different words can make it much more difficult to attest words, for little to no gain.--Prosfilaes 17:56, 12 April 2011 (UTC)Reply
IMO we should include forms with i and forms with j, as attested. One should of course be a form-of entry linking to the other.​—msh210 (talk) 16:02, 12 April 2011 (UTC)
I think the question then is which one we treat as the 'main' lemma... —CodeCat 16:07, 12 April 2011 (UTC)Reply
I think you're skipping ahead a bit there CodeCat, WT:ALA still says not to include ligatures and j spellings at all. To Prosfilaes, these to me are two separate issues. WT:ALA isn't intended to supersede WT:CFI. I'm not advocating creating Latin ligatured spellings and j spellings even when not attested. WT:ALA right now says they should be deleted no matter how well attested, even if we've got 100 citations. That's the issue I want to tackle. NB (third separate point of this comment) the French Wiktionary counts Latin as a living language because it's the official language in one country (The Vatican), so j spellings are preferred to i spellings when both are attested. Make of that what you will. Mglovesfun (talk) 16:02, 13 April 2011 (UTC)Reply
This is a wiki, I'm gonna make some changes and y'all are free to change those changes. Mglovesfun (talk) 12:17, 16 April 2011 (UTC)Reply

Wikisaurus def subpages

I would like to have the Wikisaurus pages named ".../def" deleted. There are 11 such pages. The slash-def pages and their content are listed at Template talk:ws refer; an example: Wikisaurus:tiny/def - "small in size". I would like to replace all uses of {{ws refer}} in the mainspace with something like "See also [[Wikisaurus:entry]]". Anyone who has an idea how best to proceed with this, please let me know. I am about to proceed to replace {{ws refer}}, and maybe let the slash-def subpages be without trying to get them deleted; I don't know. If you have any questions, don't hesitate to ask: I will try to answer them. --Dan Polansky 13:36, 11 April 2011 (UTC)

Related: WS:penis/translations seems like an odd idea. Good idea or not? These translations should be in the main namespace; or, I personally would favor WS:penis/French, the French Wiktionary allows all languages to have Wikisaurus entries; I'd be happier for us to do that than have this approach. Mglovesfun (talk) 13:40, 11 April 2011 (UTC)Reply
I created WS:penis/translations back on 7 September 2008 as a place to which I could move the translations section of WS:penis. Years later, as far as I am concerned, WS:penis/translations could be deleted, but it might better stay until the issue of naming of non-English Wikisaurus entries--an open and rather challenging subject--is resolved a bit. One demo entry that I have created is WS:příbuzný, which could alternatively be called "WS:cs:příbuzný", "WS:relative (Czech)" or "WS:relative/Czech". The subject of naming of non-English entries seems rather unrelated to slash-def pages, though; the considerations that lead to its resolution are rather distinct. --Dan Polansky 14:54, 11 April 2011 (UTC)
I agree with all of this. Mglovesfun (talk) 14:55, 11 April 2011 (UTC)Reply

Update: I have removed all remaining uses of {{ws refer}} from the mainspace (there were around 50 AFAIR) and have sent the template for deletion, together with the slash-def subpages: Wiktionary:Requests_for_deletion/Others#Template:ws_refer. --Dan Polansky 17:42, 19 April 2011 (UTC)Reply

WOTD

What's happening with Word of the Day these days? Who's running it since EP left? All the current words seem to be from April 2010. Ƿidsiþ 06:27, 13 April 2011 (UTC)Reply

IIRC, Lexicografía took over in early January and completed January, I did February, and no one's touched it since, so March and, so far, April's entries were recycled from last year. You can see the status at [[WT:WOTDN#statusboxofwotdupdates]]. Lexicografía has expressed interest in managing it, but has not been doing so. It's a big job; I'd be glad to split it somehow with someone. The hard part is finding entries good enough to be WOTD, or bettering them to that point; the easy part is the actual setting of them as WOTD, which is almost rote.​—msh210 (talk) 07:31, 13 April 2011 (UTC)Reply
I'm okay with either taking it on or splitting it with someone else; I've just been quite busy recently. I should definitely be able to pick it up by the beginning of May. — lexicógrafa | háblame 12:11, 13 April 2011 (UTC)

ASL Orthography

On Wiktionary, the main index is to find a word based upon spelling by people who either 1) heard a word and know the language's pattern of sound-to-text transcription or 2) saw a word in print and want to know the meaning. ASL has no standard phoneme-to-text transcription, and thus ASL is never seen in print. The chicken and the egg syndrome.

I have developed a rich orthography for ASL over the last five years. It is documented, but not yet tutorialized, at www.aslsj.com. Apparently no one would be willing to learn it unless they had to. It is similar to sheet music. Most people don't need to read sheet music if they aren't going to practice other people's musical compositions.

I would like to start creating ASL entries on Wiktionary using my own orthography for the titles. It is consistent and concise. The best part is that, since it is spoken-language-neutral, French or German Sign Language Wiktionary could refer to it without also having to translate the titles. As there isn't any real activity in the ASL section, I don't think anyone would even notice.

Example: The entry "Claw5@TipFinger-Claw5@CenterChesthigh Claw5@SideChesthigh-Claw5@SideChesthigh" would be written as "5bts-jxr". The texts of entries would still be in English. I would just like to use my own entry titles. - Positivesigner 08:14, 14 April 2011 (UTC)Reply

The ASL entries were all created by User:Rodasmith, who is apparently now inactive. I'd like a different transcription system if it's documented somewhere (on the wiki, that is), because the current one is very unwieldy (you can see on some entries that their name is insanely long) and it is not any more or less intuitive than yours. -- Prince Kassad 09:43, 14 April 2011 (UTC)Reply
I am pretty strongly opposed. Frankly I would prefer the flavors of sign languages be handled exclusively in translation tables, via image strings or videos. These systems for writing down scripts for signs overlook the fact that the majority of those who sign read and write exactly like everyone else, barring those who use braille. We obviously should have sign language in Wiktionary, but I don't like the current or proposed handling of it. - [The]DaveRoss 10:23, 14 April 2011 (UTC)Reply
Why would you exclude sign languages from the mainspace? Isn't that a bit racist? -- Prince Kassad 13:17, 14 April 2011 (UTC)Reply
I guess. I am pretty racist against things which don't make sense to me, like inventing an inelegant solution to a problem which already has an elegant one. Also I don't want to exclude them from the main namespace. - [The]DaveRoss 19:10, 14 April 2011 (UTC)Reply
I'm surprised that you find forbidding entries for terms in non-written languages to be an "elegant solution". Doubly surprised, actually: surprised that you find it to be elegant, and surprised that you find it to be a solution. —RuakhTALK 21:07, 14 April 2011 (UTC)Reply
Let's say that I were to create another encoding scheme for English, perhaps one which had fewer letters or had a 1:1 letter-phoneme structure. This may be an interesting, even useful encoding, but I would expect it to be rejected from Wiktionary because it isn't English. Obviously ASL and the other varieties of sign language are different than what I described, but the encoding scheme for them above is not. I don't think that Wiktionary is really well equipped to handle ASL right now; I think there are plenty of elegant solutions which we could implement rather than implementing the inelegant one which was suggested. That is what I meant. If I want to find out how to sign a particular word I would love to see that sign available in video or at least picture format in the translations at the applicable entry. - [The]DaveRoss 21:28, 14 April 2011 (UTC)Reply
Let's take a closer analogy, and say that several thousand people in North Dakota speak an indigenous language called Fake among themselves, but English with the outside world. They don't have a writing system for Fake: about half are literate in English, but none are "literate in Fake". I suppose that, if a linguist came up with a sensible writing system for Fake, and wanted to start documenting the language here, you would reject it "because it isn't Fake". If they wanted to add audio files to translations tables for English entries, fine, but no Fake entries would be allowed, and there would be no way to search for a Fake word unless you already know its English translation. Do I have that right? —RuakhTALK 21:42, 14 April 2011 (UTC)Reply
Sounds right, I would reject that as well. If the new orthography actually gained some currency and the Fake speakers adopted it that would be the written form of their language and we should include it. In the case of ASL the written form of the language is English. The proposal is for a "phonetic" transcription of the "verbal" form (quoted words used to parallel sign language to spoken languages) which is completely artificial. It would be great if there were a sensible way to search for signs, this isn't it. - [The]DaveRoss 23:00, 14 April 2011 (UTC)Reply
Displaying ASL visually on a computer or mobile could soon become a possibility, perhaps using a series of .svg images, and a user might be able to choose ASL preferentially just as we do other languages. It seems to me that the main barrier is the lack of a concise and consistent computer-friendly notation. If we had one, I think it would soon be feasible to write a "transliteration" script to make this work. I doubt that "Claw5@TipFinger-Claw5@CenterChesthigh Claw5@SideChesthigh-Claw5@SideChesthigh" could be made useful, but "5bts-jxr" seems reasonable, and Positivesigner states that it is consistent and concise. If that is true, I think it is a good idea to go ahead. If a better system ever comes along, it should be a simple task to convert all of the "5bts-jxr" entries to the new system. —Stephen (Talk) 23:47, 14 April 2011 (UTC)Reply
There is some push to get SignWriting Unicode codepoints, which would give us canonical pagetitles for SLs.​—msh210 (talk) 03:15, 15 April 2011 (UTC)Reply
Fortunately, the current system is phonetically accurate and complete according to current ASL linguistics, so it should be bot-convertible to any eventual SignWriting Unicode mapping. I can't wait for that day!  :-) —Rod (A. Smith) 01:59, 18 April 2011 (UTC)
See this. If this turns out to be mature, we may soon see SignWriting in Unicode. -- Prince Kassad 02:19, 18 April 2011 (UTC)Reply
Re: "In the case of ASL the written form of the language is English": That's not true. ASL signers use written English as their written language, just as Fake speakers use written English as their written language, but that's a cultural fact, not a linguistic one. Note that English has very different syntax from that of ASL, and that historically, ASL is related to French Sign Language (and not, for example, to British Sign Language). Various ASL lexemes have been influenced by English writing (e.g., by incorporating English letters), but I'm sure that Fake speakers use a lot of English loanwords, too. Re: "It would be great if there were a sensible way to search for signs, this isn't it": For the record, I'm not actually endorsing this proposal. So far, I haven't seen any really great proposals for how to deal with this problem. I'm just objecting, very strenuously, to your claim that your proposal, that of forbidding entries for attested-but-unwritten languages, is a "solution" to the problem, let alone an "elegant" one. If you had described it as "arguably the least terrible, by a small margin, of the various proposals", then I'd probably be on board. —RuakhTALK 00:19, 15 April 2011 (UTC)Reply
The lack of clarity here is my fault, I am not saying _my_ solution is the elegant one, I am saying that the various media-based sign language dictionaries out there are better than the proposed shoehorning into Wiktionary. I have seen a number of them which have videos or images and I find them much better than this alternative. - [The]DaveRoss 01:52, 15 April 2011 (UTC)Reply
[Indented under Ruakh, as it sort of is in response to what he said; but the question at the end is more for TDR.] No need to go to Fake. There are real indigenous languages that have never been written by native speakers and that are written by linguists using either IPA or transliteration. I believe we even have some entries of the sort. Why are SLs less worthy?​—msh210 (talk) 03:17, 15 April 2011 (UTC)Reply
On the subject of inelegant things, the language name "Proto-Central New South Wales". On the subject at hand, perhaps Positivesigner can enter words as user subpages or an appendix? To msh210: do the speakers of PCNSW or Darling communicate in writing in another language, the way ASL-signers write in English? - -sche (discuss) 06:37, 15 April 2011 (UTC)Reply
In response to TheDaveRoss, you mentioned that the users of signed languages actually use a regional spoken language's transcription system to communicate in writing. This does not document their own language's words; only that some of their words happen to be similar to some other language's words and phrases. Therefore they never truly write down what they are thinking / feeling.
The general rule for inclusion in Wiktionary is any natural language word "if it's likely that someone would run across it and want to know what it means," even if it isn't attested to by Daniel Webster. The problem here is that there is no standard way to find out what a sign word means. And THAT has irritated me since age 12. I have devoted the last five years to documenting ASL phoneme patterns. Now I can write down ASL and read it months later without watching the video again. Even if I do not remember the word, I can sound it out using rules of pronunciation.
According to your comments, apparently your main objection to using my plain-text transcription system on Wiktionary is that it hasn't been adopted by any group of native-ASL speakers. Deafness is almost always one generation thick. 90% of Deaf people are born into hearing families; 90% of the children from Deaf families are born with normal hearing and are bi-cultural, bi-lingual. Due to 100 years of oralism which punished Deaf children for using sign language, historically Deaf enculturation and learning sign language happened after the age of maturity. Therefore there is no "Deaf Region" where the people could adopt a written language of daily commerce. 50% of Deaf persons today graduate without attaining a 4th grade English reading level.
I want to document ASL, not for those Deaf who cannot read it, but for the hearing people who are trying to find out what it means. They won't be able to type in the proper spelling using my system without study. But they can browse the words and see that there is structure. They can see the kinds of phonemes used and how they are put together. They can look up an English word's "See Also" section and notice how the signed word has slightly different inflection in its definition. They can read about the word's etymology, synonyms, antonyms, words for which it might be mistaken and common misspellings.
I have to personally remember how to spell the entry titles so I can make these links. The current encoding system is incredibly unwieldy in this respect. Since I have already done this much work, it seems better to use my transcription system than to have a personal index back to the current system. Especially since there is no one around willing to help me learn the current system.
About your comment, "I think there are plenty of elegant solutions which we could implement rather than implementing the inelegant one which was suggested." What I read in this is you would prefer to have no information rather than incomplete information. Unless you are deciding to start implementing a more elegant solution in which I could participate, I do not feel that my making ASL entries would be wasted effort. There is no one else beating me to the punch, so to speak.
I would just like to use my own entry titles to create English Wiktionary entries about ASL words. As I am not very familiar with the Wikimedia power structure, I am looking to see what barriers there are to my doing so and by what means I may address those barriers. If no real barriers are presented, then I can assume to proceed just like any other new word entry. - Positivesigner 06:42, 16 April 2011 (UTC)Reply
My view is that ASL entries are all by their very nature unsupported by MediaWiki. So, they should be in appendices. Mglovesfun (talk) 12:19, 16 April 2011 (UTC)Reply
So is Jurchen or Minoan. Did that stop us from having entries in these languages? -- Prince Kassad 12:54, 16 April 2011 (UTC)Reply
I think your proposal is fine. The biggest problem I see with it is that the current naming system for SL page titles was enacted by vote so needs a vote to be so heavily emended.​—msh210 (talk) 04:47, 17 April 2011 (UTC)Reply

Re TDR and Mglovesfun, we've already voted in a naming scheme for SLs, thereby also explicitly allowing SL entries; the horse has left the stable already. The only question here is whether the naming scheme should be changed. (Of course, if you want to change the topic of conversation to be about revisiting whether we should include SLs also, you're welcome to; but I suspect (and hope) that the proposition that we not include them won't garner much support.)​—msh210 (talk) 04:53, 17 April 2011 (UTC)Reply

Comparing the current transcription system with the one Positive proposes here, two differences stand out to me:

  • The current system favors phonetic transparency for readers, while the proposed one favors ease of writing for editors.
  • The current system represents the phones as analyzed and described in professional publications, and has apparently broad consensus in the field of ASL linguistics, while the proposed one is based on private analysis.

I can empathize with the expressed difficulty in writing entry titles, but perhaps a less O.R. resolution would be to adopt editing tools that simplify writing the typographically lengthy phone names. —Rod (A. Smith) 17:12, 17 April 2011 (UTC)

I agree with Rod's analysis and conclusion. - -sche (discuss) 03:11, 18 April 2011 (UTC)Reply
Out of curiosity — on what sort of timescale are we expecting SignWriting Unicode codepoint assignments to be approved? How confident are we that it will happen, and how much do we know about what it will be like? When that happens, will it be possible for a bot to effect the conversion from the "Claw5@TipFinger-Claw5@CenterChesthigh Claw5@SideChesthigh-Claw5@SideChesthigh" style to SignWriting? —RuakhTALK 02:14, 18 April 2011 (UTC)Reply
Nice to see that serious Unicode SignWriting submission. I'd be curious about the approval timeframe, but it looks solid enough. After approval, I say we forge ahead with adoption/migration. I suspect I could write a migrator bot in a week, unless I'm busy with work. Hmm, would MediaWiki need an update to accept an expanded Unicode range in entry titles? —Rod (A. Smith) 04:48, 18 April 2011 (UTC)Reply
Probably not, MediaWiki should be fully Unicode aware. And if I try something like 򡅔򪪉 in a tentative "plane 10" (which will probably never exist), I see that the link works correctly and I could theoretically create an entry there. So by the same means, SignWriting should work as well. -- Prince Kassad 12:52, 18 April 2011 (UTC)Reply
Reading through communications that led to the Unicode SignWriting submission, and then re-reading the submission, I see that there are a couple of known deficiencies in the submission. A minor one relates to normalization/canonicalization. In the existing SignWriting software like SignPuddle, two authors trying to compose the same sign often encode it through different typographical sequences. People are working on the rules that could yield a single canonical form of each sign, but those rules have not yet been created, let alone tested for acceptance by SignWriting users. We could ignore that problem because we could impose our own canonicalization standards. The major deficiency, though, is that the SignWriting Unicode submission omits positioning details. Unfortunately, there are multiple glyph-positioning systems in current use (absolute freeform layout, cartesian coordinates, polar coordinates, and variations with restricted layout rules), and it's not yet clear which one should be made standard. The authors of the Unicode submission acknowledge this as a key deficiency, and say it will be at least a couple of years before sufficient testing can be done to determine which system should be adopted as the standard. The three universally accepted classes of sign language phones are handshape, position, and movement, so without glyph-positioning details in the script, entry titles based on SignWriting would be a largely unintelligible sequence of handshape and movement phones. —Rod (A. Smith) 23:53, 18 April 2011 (UTC)Reply
Although the SignWriting Unicode submission lacks the specification of glyph positioning, it does include symbols that represent the phonetic positions of signs. See http://www.signbank.org/bsw/iswa/386/386_bs.html for some examples. So, although entry titles would not show pleasingly-arranged glyphs, they would include the same details our current system transcribes (e.g. "...@CenterChesthigh..."). Looking forward to its adoption. —Rod (A. Smith) 06:26, 19 April 2011 (UTC)Reply

In response to Rod (A. Smith), you are correct in stating that my primary concern is the entry length, and also that my research is my own original work, not published, reviewed, approved, etc. Now, while I type this, you are unable to hear my voice and understand the subtle nuances of tone that writing does not capture. So please realize that the following statements are made with curiosity, and not as demands. I am always glad to be filled in where I have missed something.

Specific to Wiktionary entries, the information documented here is intended to be a reference for use by the other Wikimedia projects. It may not simply be a re-wording of existing, copyrighted definitions. It can delve into areas of possibly non-accredited research created by professionals and non-professionals alike. No one ever gets to make a final decision on any page unless it gets deleted; they can always be updated or reverted. "Being bold is generally a defensible position."

The intention of creating American Sign Language entries is to encourage other projects to use those entries. How they find the entry they want will be a guess-and-check process until ASL does get an approved orthography by a linguistic group or a political body. They would create approved ASL dictionaries and books on how to read and write in ASL. Native speakers would create scholarly and entertaining works. And so on.

Until that happens, I would like to use the information I've gathered to ease the re-use of Wiktionary ASL entries. The professional publications of which you speak are not created with concise communication in mind. I imagine that most contributors of ASL definitions will not have read those publications. Therefore they should not have the last say on this topic. My research has been solely directed toward organizing ASL words into writing patterns. Although I have not attained any degrees in sign language, I have been a software systems analyst for 20 years. My ASL encoding system is based upon my many and varied experiences in how software designers have solved language encoding problems.

Your mention of adopting "editing tools that simplify the writing of entry titles" applies equally to my system and the current system. All of your work with the ASL phone descriptions, entry structure, categories, templates, and more has laid a great groundwork. There are still other issues to address, but this issue is really at the heart of the entire project. What are our end users' needs for the ASL entries? Copy-paste works fine with long text, but associating words to ideas, placing a lot of links in tables, or searching for an entry you have already seen does not.

I'm not trying to say that I want to "have it my way." At the same time, I do realize that Wikimedia projects are designed to aggregate "accepted knowledge." That no one uses my new orthographic design is proof that no group has accepted it, i.e. championed it. I'm not trying to get involved in politics. I am trying to solve a human-software interaction issue with an encoding solution. My suggestion is not meant to supplant SignWriting, discourage the use of Unicode, or ignore others' feelings. I simply feel that the current titling system can be improved and that I have a means to readily do so. If there are other ways to address it, I'd be happy to look into those as well. -- Positivesigner 06:17, 18 April 2011 (UTC)Reply

In reply to Prince Kassad, I don't like the idea of transcriptions as entries. See almost all the entries in Category:Egyptian nouns for example. If something's unsupported, put it in an appendix and hope that one day it becomes supported. --Mglovesfun (talk) 06:41, 18 April 2011 (UTC)Reply
That's probably where I'm jumping the gun. msh210 stated that "the current naming system for SL page titles was enacted by vote so needs a vote to be so heavily emended." Perhaps we could start with a vote about whether the current naming system for SL needs to be changed at all. If that were decided, then proposals could be made as to alternatives, timelines, etc., and each of those voted upon. -- Positivesigner 07:03, 18 April 2011 (UTC)Reply
Re Gloves, those need to be moved to hieroglyphs anyway, since we now have 'em in Unicode. Just needs someone to invest some time. -- Prince Kassad 12:52, 18 April 2011 (UTC)Reply
Positivesigner, do you acknowledge SignWriting-in-Unicode transcription as the ultimate goal for our entry titles? If not, why does your Sign-Jotting transcription system seem better? (Perhaps because it's easier to type on a Qwerty keyboard?) If you do acknowledge SignWriting as the ultimate solution, then your proposal to change from the system in Wiktionary:About sign languages to yours seems like a temporary solution to a barrier to the creation of ASL entries, right? If so, I suggest you be bold with creating ASL entries here. Feel free to create them with temporary page names, and add an {{rfc|Rename this entry per WT:ASGN}} tag at the top of your entry. Once renamed, the page should theoretically be human-assisted-bot-migratable to its ultimate SignWriting-based name. —Rod (A. Smith) 18:44, 18 April 2011 (UTC)Reply
So could the temporary page names be "ASL000001" and sequential? That way anyone could add what they know with the renaming done by admins at a later date. I'll go with your suggestion of a temporary page name using the Rename template. Thanks for the input. -- Positivesigner 21:33, 18 April 2011 (UTC)Reply
Yes, I think (hope) most editors here would welcome your ASL entry contributions that are temporarily named like that, especially if you include {{rfc|Please rename per WT:ASGN}} at the very top of the entry. I'll try to review and rename them, but if I go MIA for a while, anyone can drop me a line to remind me. —Rod (A. Smith) 21:59, 18 April 2011 (UTC)Reply
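The temporary naming suggested above could be generated mechanically. The following is only a minimal sketch, assuming names of the form "ASL" plus six digits as in the "ASL000001" example; the function names `next_temporary_title` and `with_rename_tag` are hypothetical, and the list of existing titles is assumed to be supplied by the caller rather than fetched from the wiki.

```python
import re

# Matches temporary titles such as "ASL000001" (assumed format: "ASL" + six digits).
TEMP_NAME = re.compile(r"^ASL(\d{6})$")

# Tag quoted from the discussion above; it asks for a later rename per WT:ASGN.
RENAME_TAG = "{{rfc|Please rename per WT:ASGN}}"


def next_temporary_title(existing_titles):
    """Return the next unused sequential temporary title."""
    used = []
    for title in existing_titles:
        match = TEMP_NAME.match(title)
        if match:
            used.append(int(match.group(1)))
    number = max(used, default=0) + 1
    return f"ASL{number:06d}"


def with_rename_tag(wikitext):
    """Put the rename request at the very top of the entry, once only."""
    if wikitext.startswith(RENAME_TAG):
        return wikitext
    return RENAME_TAG + "\n" + wikitext
```

Taking the highest number in use, rather than filling gaps, keeps a title from being reused after an earlier temporary entry has been renamed away.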

Romanian entries

Are entries like școală also being fixed internally? If you open the conjugation table you'll still see the cedilla forms. --Ooswesthoesbes 18:32, 17 April 2011 (UTC)Reply

Fixed. The plural form used the old cedilla form. --Dijan 01:42, 18 April 2011 (UTC)Reply

new "show/hide" design

Really don't like the new design for the "show/hide" buttons. It makes it harder to break up the page into sections and the buttons are less visible too. ---> Tooironic 01:22, 18 April 2011 (UTC)Reply

The ± sign isn't even showing correctly in my browser. It shows under the "]". It's just plain ugly. --Dijan 01:26, 18 April 2011 (UTC)Reply
Is common.css even being loaded? Maybe it's some issue with Mediawiki? --Yair rand 01:35, 18 April 2011 (UTC)Reply
If it is loading, it's not loading properly. The specified font sizes for Arabic and Devanagari and I'm sure various other customized scripts aren't showing properly now. --Dijan 01:41, 18 April 2011 (UTC)Reply
Thanks. --Dijan 01:47, 18 April 2011 (UTC)Reply
Please change it back ASAP! One should check for compatibility with other browsers before changing. Mozilla Firefox look-and-feel and functionality for translations is disgusting. --Anatoli 01:51, 18 April 2011 (UTC)Reply
I agree. --Dijan 01:53, 18 April 2011 (UTC)Reply
A recent edit to common.css had a typo in it, which broke the CSS. I fixed the typo a few minutes ago. --Yair rand 01:54, 18 April 2011 (UTC)Reply
Thanks for the fix! --Anatoli 06:30, 18 April 2011 (UTC)Reply

Category:ko:Elements

Why is there the different category Category:ko:Elements? I believe all pages in this category should be moved to Category:ko:Chemical elements. Right? Malafaya 16:43, 20 April 2011 (UTC)Reply

You're correct, that category should not exist. -- Prince Kassad 16:45, 20 April 2011 (UTC)Reply
Based on the interwikis, no, empty and delete (speedy, that is). Mglovesfun (talk) 16:47, 20 April 2011 (UTC)Reply
I have doubts in the two remaining articles. They are Hanzi characters with no meaning provided in Korean and yet they are categorized as "ko:Elements". Can someone please help? Malafaya 16:53, 20 April 2011 (UTC)Reply
Yes, I just removed the categorization; if the translingual meaning is correct, in both cases it means oyster, though it could have additional meanings. Categorization can be added back to match the meanings, if/when we have some. Mglovesfun (talk) 19:31, 20 April 2011 (UTC)Reply

CFI, attestation and ISBN

I would like to see this CFI sentence go: "When citing a quotation from a book, please include the ISBN." Finding citations and adding them to the entry is laborious enough without ISBN. I have never included ISBN in the quotations that I add, and I am not ashamed of it. --Dan Polansky 08:11, 22 April 2011 (UTC)Reply

The ISBN is the most reliable identifier we have of one specific edition of a book. It's only useful for certain modern books, but in those cases, I have no problem with CFI saying that it should be included. I don't get the laborious part; it's just 10 or 13 digits to be added, easily found on Google Books or on the physical book.--Prosfilaes 19:27, 22 April 2011 (UTC)Reply
I agree with Prosfilaes. Incidentally, I also think the SBN should be added for older books, the ISSN for periodicals, and the DOI for anything that has one. I'd also love it if a bot could go through our entries to see where the ISBN, SBN, or ISSN does not have the right checksum, marking such instances for human attention. English Wikipedia used to have such a bot (perhaps still does), but I can't seem to find it.​—msh210 (talk) 06:54, 24 April 2011 (UTC)Reply
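As a rough illustration of what such a checksum pass might test, here is a minimal sketch using the standard check-digit rules (weighted sum modulo 11 for ISBN-10 and ISSN, modulo 10 for ISBN-13, and a nine-digit SBN treated as an ISBN-10 without its leading zero). The function names are hypothetical, and this is not the code of the Wikipedia bot mentioned above; extracting the identifiers from entries is left out entirely.

```python
import re


def _clean(identifier):
    """Drop hyphens and spaces; upper-case a possible 'x' check character."""
    return re.sub(r"[\s-]", "", identifier).upper()


def _weighted_mod11(chars):
    """Weights run len(chars), ..., 2, 1; 'X' counts as 10."""
    digits = [10 if c == "X" else int(c) for c in chars]
    weights = range(len(digits), 0, -1)
    return sum(w * d for w, d in zip(weights, digits)) % 11


def valid_isbn10(identifier):
    s = _clean(identifier)
    return bool(re.fullmatch(r"\d{9}[\dX]", s)) and _weighted_mod11(s) == 0


def valid_isbn13(identifier):
    s = _clean(identifier)
    if not re.fullmatch(r"\d{13}", s):
        return False
    # Weights alternate 1, 3, 1, 3, ...; the sum must be divisible by 10.
    total = sum((3 if i % 2 else 1) * int(c) for i, c in enumerate(s))
    return total % 10 == 0


def valid_issn(identifier):
    s = _clean(identifier)
    return bool(re.fullmatch(r"\d{7}[\dX]", s)) and _weighted_mod11(s) == 0


def valid_sbn(identifier):
    # A nine-digit SBN becomes an ISBN-10 when a leading zero is added.
    s = _clean(identifier)
    return len(s) == 9 and valid_isbn10("0" + s)
```

A bot along these lines would only flag a failing identifier for human attention, since a bad checksum shows that something was mistyped but not which digit.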

External links

Perhaps we should remove the "External links" section from WT:ELE (and possibly delete Wiktionary:External links). There has never been a clear policy or description of practice for linking to external sites in a dictionary, except via other systems that already cover the topic, like WT:Citations and WT:References. In the rare cases where an external link might warrant inclusion, the section might not even be needed. TeleComNasSprVen 04:13, 24 April 2011 (UTC)Reply

I use ===External links=== all the time to link to Wikipedia articles related to the Wiktionary entry. I abhor the usage of the {{wikipedia}} template, which is IMHO too conspicuous. Simply removing the "External links" section from WT:ELE and deleting [[Wiktionary:External links]] won't fix existing usages of ===External links=== in the entries, which number in the thousands, or change editors' preference for its usage. Without a plan for how to change established practice under a new policy, and how to migrate existing usages of the section under a new proposal (we cannot simply make all the entries having it non-conformant to ELE layout), I don't see how your suggestion is feasible. --Ivan Štambuk 07:12, 24 April 2011 (UTC)Reply
I use ===See also=== and {{pedialite}} and agree that {{wikipedia}} makes pages look a mess, particularly when images and table-of-contents are present. I try to look at pages occasionally with the default "look" - the one which the average user will see. —Saltmarshtalk-συζήτηση 09:58, 24 April 2011 (UTC)Reply

Saltmarsh's suggestion seems like a viable idea, using the WT:See also system to produce links to Wikipedia articles instead. Perhaps we can have a bot run to replace/change existing uses of "External links" headers to "See also", and merge the two if both are present. TeleComNasSprVen 05:12, 26 April 2011 (UTC)Reply
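To make the proposed bot run concrete, here is a minimal sketch of the text transformation only. It assumes the function is handed the wikitext of a single language section, that headers are written as `===External links===` or `===See also===` at any level, and that when both exist the links are simply appended to the existing "See also" list. The function names are hypothetical, and nothing here reads from or writes to the wiki.

```python
import re

# A heading line such as "===See also===": equal runs of "=" around a title.
HEADING = re.compile(r"^(=+)\s*(.*?)\s*\1\s*$")


def split_sections(wikitext):
    """Split into [heading_line_or_None, body_lines] pairs, in page order."""
    sections = [[None, []]]
    for line in wikitext.splitlines():
        if HEADING.match(line):
            sections.append([line, []])
        else:
            sections[-1][1].append(line)
    return sections


def heading_title(line):
    match = HEADING.match(line) if line else None
    return match.group(2) if match else None


def external_links_to_see_also(wikitext):
    sections = split_sections(wikitext)
    see_also = next(
        (s for s in sections if heading_title(s[0]) == "See also"), None
    )
    kept = []
    for section in sections:
        if heading_title(section[0]) != "External links":
            kept.append(section)
        elif see_also is None:
            # No "See also" yet: rename the header in place, keeping its level.
            level = HEADING.match(section[0]).group(1)
            section[0] = f"{level}See also{level}"
            kept.append(section)
            see_also = section
        else:
            # Both present: move the links in before any trailing blank lines.
            body = see_also[1]
            insert_at = len(body)
            while insert_at > 0 and not body[insert_at - 1].strip():
                insert_at -= 1
            body[insert_at:insert_at] = [l for l in section[1] if l.strip()]
    lines = []
    for heading, body in kept:
        if heading is not None:
            lines.append(heading)
        lines.extend(body)
    return "\n".join(lines)
```

Whether a link to a non-Wikimedia site belongs under "See also" at all is a policy question this sketch does not try to answer; a real run would presumably need a human to review such cases.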

multi-word entries

I was just putting a Greek translation in "give rise to" - so we have "führen zu", "οδηγώ σε" etc - are we wanting all these multi-word foreign entries? Would "führen zu", "οδηγώ σε" be better? —Saltmarshtalk-συζήτηση 06:45, 24 April 2011 (UTC)Reply

Need help with removing vandalism, pic of vagina

On the das (NSFW) page, you can see there is a picture of a vagina on the bottom right corner of the page. I think the vandal edited a css page to do it, and I can't figure it out. Perhaps someone with more expertise could try and fix it. Thanks! Xxpor 03:23, 25 April 2011 (UTC)Reply

Follow up: it seems to have been removed, but could someone tell me the change they made so I can fix the same problem in the future? Xxpor 03:25, 25 April 2011 (UTC)Reply
I have absolutely no idea what's going on. See the WT:ID, where two people have complained of this. —Internoob (DiscCont) 03:51, 25 April 2011 (UTC)Reply
What browser are you using? Is there anything in your browser's downloads? —Internoob (DiscCont) 03:54, 25 April 2011 (UTC)Reply
It was Template:langnamex. Problem solved. —Internoob (DiscCont) 04:54, 25 April 2011 (UTC)Reply
The same vandal also hit Wikiversity. - -sche (discuss) 19:17, 25 April 2011 (UTC)Reply
Yes, the user is Meepsheep (talkcontribs) using various meepsheep related user names, see the history of {{defdate}}. To administrators, block anyone with an expiry time of indefinite if they match these two criteria (the first criterion being blatant vandalism). Mglovesfun (talk) 21:37, 25 April 2011 (UTC)Reply

Vote on removing usage in a well-known work

I have created a vote: Wiktionary:Votes/pl-2011-04/CFI: Removing usage in a well-known work. I don't know how it turns out; let us see. --Dan Polansky 14:57, 25 April 2011 (UTC)Reply

Where's the prior discussion? DCDuring TALK 20:33, 25 April 2011 (UTC)Reply
I share DCDuring's question. However, I know this has been discussed some before; it's just a question of finding the discussions. - -sche (discuss) 20:49, 25 April 2011 (UTC)Reply
Yes, and I said I'd oppose it in that aforementioned discussion. Mglovesfun (talk) 21:30, 25 April 2011 (UTC)Reply
I linked to a previous discussion from the vote when I created the vote, as is customary. A link to the discussion, anyway: Beer parlour: CFI: Removing usage in a well-known work, January 2011.
Also of interest, if you go to the top of the Beer parlour page, enter "well-known work", and press "search in the archives of Beer parlour", you get these search results. I should have used this link before and looked more carefully, as I would have found this:
--Dan Polansky 09:23, 26 April 2011 (UTC)Reply

Quick poll: plurals to noun forms

We currently have a mix of [[Category:<language> plurals]] and [[Category:<language> noun forms]]. A couple of languages use both, most notably Catalan, which double categorizes plurals as plurals and noun forms. This is a quick poll to ask whether we should eliminate the contents of Category:Plurals by language altogether and replace them with Category:Noun forms by language, that is to say, [[Category:<language> noun forms]]. Mglovesfun (talk) 14:07, 30 April 2011 (UTC)Reply

I support using only Category:<language> noun forms

  1. Support for consistency. It's a real mess currently as every language uses a different system. -- Prince Kassad 14:24, 30 April 2011 (UTC)Reply
  2. Support for the same reason. —CodeCat 14:28, 30 April 2011 (UTC)Reply
  3. Support for convenience. The unspoken (or spoken) rule of "Let's type noun forms everywhere, except when the only noun form of a language is the plural, then we'll type plurals!" results in having two category trees for the same scope. Moreover, a category named "Category:English plurals" is very likely to be interpreted as a good place to have entries defined as plurals of verbs and of pronouns too, listed together, and (AFAICT) no one wants that. --Daniel. 15:10, 30 April 2011 (UTC)Reply
    @Daniel., yes, we already have some proper noun plurals in Category:English plurals. Mglovesfun (talk) 16:20, 30 April 2011 (UTC)Reply

I oppose using only Category:<language> noun forms

  1. Oppose. There's no reason that different languages should do this the same way. Both Category:Plurals by language and Category:Noun forms by language should be deleted as meaningless, since the terms "plural" and "noun form" both apply differently to different languages. ("Noun form", especially, means "noun form that isn't the lemma" — but we use different lemmata for different languages. Latin accusatives and Old French nominatives do not have anything interesting in common.) —RuakhTALK 14:57, 30 April 2011 (UTC)Reply
    Latin accusatives and Old French nominatives do have something in common! They are nouns, and they are not lemmas (I suppose). I find that pretty interesting. If you are looking for closer linguistic comparisons, I'm open to suggestions regarding floods of well-explained and strict categories like "Category:French noun nominative forms", which is akin to "Category:Spanish verb imperative forms"; however, "Category:French nominatives" or "Category:Spanish imperatives" would be bad names to me. --Daniel. 11:07, 1 May 2011 (UTC)Reply
  2. Oppose. AFAICT there is no good reason to reduce the amount of information conveyed to normal English users by the name Category:English plurals. Trying to achieve translingual consistency in the category structure seems foolish and linguistically naive or tendentious. DCDuring TALK 16:13, 30 April 2011 (UTC)Reply
    Well (perhaps this merits a separate section below), do you think all plurals should be in Category:Plurals? For example, we do have plurals of proper nouns. For example, Jameses. Mglovesfun (talk) 21:32, 30 April 2011 (UTC)Reply
    But what about plurals of adjectives? —CodeCat 21:37, 30 April 2011 (UTC)Reply
    As I mentioned above, the name "Category:English plurals" is actually of poor linguistic value, either by possibly allowing too many unrelated words such as "we" (a first-person plural personal pronoun), "understand" (the lemma of a verb, that technically can be used as a plural too), "beautiful" (the lemma of an adjective, with the same technicality), "dogs" (a plural of a common noun) and "Bermuda shorts" (a plurale tantum) together, or by not clearly conveying the fact that it is supposed to contain only non-lemma plurals of common nouns. I don't remember mentioning before that a category named "Category:English plurals" may contain pluralia tantum, but it's true as well. --Daniel. 21:41, 30 April 2011 (UTC)Reply
    My question was to DCDuring, and I was thinking specifically about English plurals as that's what he's mentioned. Mglovesfun (talk) 21:49, 30 April 2011 (UTC)Reply
    I knew that your question was to DCDuring, thanks to indentation. I, personally, would oppose the suggestion of leaving all plurals in Category:Plurals for fear of having either Category:es:Plurals, Category:pt:Plurals, etc. or unnecessary inconsistency between languages. --Daniel. 22:06, 30 April 2011 (UTC)Reply
    Damn, did I write that? I meant Category:English plurals. Mea culpa. Mglovesfun (talk) 22:33, 30 April 2011 (UTC)Reply
    @Daniel.: I believe you are mistaken. In my opinion:  "We" is plural, but is not "a" plural. "Understand" and "beautiful" are neither plural nor plurals. "Dogs" is plural, and is a plural (the plural of "dog"). "Bermuda shorts" is plural, and I could go either way on whether it's a plural. (google books:"inherent plurals" finds some instances referring to pluralia tantum, so some people clearly feel it's O.K.) —RuakhTALK 23:28, 30 April 2011 (UTC)Reply
    I am not convinced, so I took a look at some relevant pages: The entry plural is generic enough to support my explanation of what a "plural" is. Moreover, Wiktionary's descriptive approach makes it easy to attest nuances of it by finding sources that agree with the idea I described. For example, page 32 of this book contains "There are words that some doctors use when talking to patients — curious words such as 'pop' as in 'pop up on the couch', or plurals such as 'we' in 'We'll just have a look at you'." --Daniel. 11:07, 1 May 2011 (UTC)Reply
  3. Neither alone will work for all languages.​—msh210 (talk) 04:19, 1 May 2011 (UTC)Reply
    This is a pretty strong claim! Can you (or anyone else) please mention one language that won't fit the category tree of "Category:<language> noun forms" and another language that won't fit the category tree of "Category:<language> plurals"? --Daniel. 11:07, 1 May 2011 (UTC)Reply
    Not sure what you mean by "category tree" here. The categories "Hebrew plural forms" and "English plural forms" are not parallel, as the former should have a plural-noun-form, a plural-adjective-form, and a plural-pronoun subcategory, whereas the latter should have a plural-noun and a plural-pronoun subcategory. But for another language, say one that has plural noun forms but no other plural forms, it is kind of silly to have a plural-form category just to hold the plural-noun-form category. (Otoh, that hypothetical language's plural-noun-form category should be in the plural-forms-by-language category so it can be found.) I don't know enough languages to name such a language.—msh210℠ on a public computer 14:10, 1 May 2011 (UTC)Reply
  4. I am hesitant, so I cannot say that I support. I support this: "For each language, if the language has plural forms for several parts of speech such as for nouns and adjectives, there should be no [[:Category:<language> plurals]]." In Czech, nouns, adjectives, and verbs have plurals, and it seems pointless to list all plurals in one category. There could be languages for which getting rid of the category for plurals would cause trouble, but is this merely hypothetical? I don't know. --Dan Polansky 09:44, 2 May 2011 (UTC)Reply

Related discussions

May 2011

Romanized Korean, again

Hunmin Jeongeum, by the letter of the law at least, has failed RFV. We normally give entries more than a month, as people don't spend much time trying to cite them, and the backlog is so big that it takes two months just to get to an entry. But this entry does seem totally unattestable. In fairness, we should arguably RFV all Romanized Korean before adding to WT:AKO that we don't accept Korean in Latin script. My practical side says that's just a waste of time; delaying the inevitable. Thoughts on this specific issue?

FWIW Tatar has a similar issue, where its Latin forms, which are listed in the Russian and Tatar Wiktionaries, don't seem to be attestable, or at least not very easily. That's a separate issue; let's deal with Korean first and have a separate discussion on Tatar later if we want to. Mglovesfun (talk) 13:33, 2 May 2011 (UTC)Reply