Wiktionary:Beer parlour/2012/November

This is an archive page that has been kept for historical purposes. The conversations on this page are no longer live.

‘market’ terms

I expanded the derived terms of market here, but some of them contain verbs now, and it seems that these particular derived words are for the nouns. Can this still be included? --Æ&Œ (talk) 03:41, 1 November 2012 (UTC)[reply]

My practice has been to split Derived terms tables by PoS, with duplication where necessary. DCDuring TALK 20:03, 4 November 2012 (UTC)[reply]

Poll: Etymology nested under PoS header?

This has come up several times before, and I think that it would make sense in a lot of cases because users expect the definition to come first, not the etymology. Rather than starting a whole discussion about it I'd like to hold a simple poll to see if there is enough support for this change, and if not, what the objections are. Please note that I am only asking whether it is better in principle, so please do not say no only because it would be too difficult to implement. Once a decision has been made, we can always discuss how to make it happen later; this poll is not about that. I'm also not asking where under the PoS header it should be nested (although presumably it would go below the definition), that is also a detail that would be discussed later, when a more concrete vote to modify WT:ELE comes along.

Agree: Ideally, etymology would be nested inside the part-of-speech section, at level 4

Support —CodeCa t 20:56, 1 November 2012 (UTC)[reply]
Could you clarify what you mean, exactly? If there are multiple parts of speech with the same etymology, would you duplicate the sections? --Yair rand (talk) 21:09, 1 November 2012 (UTC)[reply]
Not duplicate; usually one of them is derived from the other. The etymology of, for example, the noun feed can specify that it derives from the verb feed. —CodeCa t 21:13, 1 November 2012 (UTC)[reply]
Tentatively support. It is true that most people come for a definition (we see many annoyed comments on the feedback page where an attention-deficient user couldn't find the definition!), and in fact I think most mainstream dictionaries put the ety last (e.g. Merriam-Webster and Chambers). Equinox ◑ 23:54, 1 November 2012 (UTC)[reply]
support in principle. --Anatoli ^{(обсудить}/^вклад) 00:48, 2 November 2012 (UTC)[reply]
Support putting the etymology after the definitions, provided such a major change is technically feasible. -- Gauss (talk) 00:55, 2 November 2012 (UTC)[reply]
I think it's feasible. Note that, unlike many changes, it wouldn't need to be made to all pages at once; even if it took us a full year to migrate — and I can't imagine it taking anywhere near that long — the inconsistency wouldn't cause any serious problems in the meantime. —Ruakh_TALK 02:36, 2 November 2012 (UTC)[reply]
Tentatively support. Even for cases where we currently split by etymology and would want to continue to do so, I'm not sure that the ===Etymology n=== structure is really all that clear to the typical reader; the different header-levels just don't look different enough to make it consistently obvious which sections are nested in which. We would, of course, need to come up with some sort of replacement structure. —Ruakh_TALK 02:36, 2 November 2012 (UTC)[reply]
Addendum: Actually, the replacement structure is easy: we can use ===Noun 1===, etc., in such cases. We already do that in a bunch of entries where split-by-etymology is unsatisfactory. —Ruakh_TALK 02:49, 2 November 2012 (UTC)[reply]
Re current practice: well, almost. We currently have some entries with consecutive ===Noun=== ===Noun=== sections, but they're not numbered. (If they're numbered, AutoFormat/KassadBot tags them as nonstandard and I fix them.) - -sche (discuss) 02:56, 2 November 2012 (UTC)[reply]
Is the numbering really necessary, though? —CodeCa t 03:04, 2 November 2012 (UTC)[reply]
I think it's necessary, yes. And we do still have many like that, even if you've "fixed" others. See google:"Noun 1" site:en.wiktionary.org. —Ruakh_TALK 03:13, 2 November 2012 (UTC)[reply]

I would also be in favour of increasing the differences between header levels by changing the CSS, but that's a different story. (Personally I like the way French Wiktionary styles things) —CodeCa t 02:41, 2 November 2012 (UTC)[reply]
One benefit of moving the etymology down, if we do, will be that header-levels will be somewhat more consistent; for example, "Noun" will only ever be at L3. (Some headers will still vary, in that e.g. "Derived terms" is sometimes put inside a POS section and sometimes just appended to the whole entry, but not as many, and anyway those headers aren't as fundamental.) Once we do that, I'd support such a CSS change. —Ruakh_TALK 02:49, 2 November 2012 (UTC)[reply]
Under one possible way of structuring entries that would be the case, yes. But as I noted to -sche below, it's also possible to retain the current nesting structure while moving the Etymology section below the definition. —CodeCa t 02:54, 2 November 2012 (UTC)[reply]

(@CodeCat, after e/c) Assuming you're talking about the way fr.Wikt handles multiple etymologies, can you link to a fr.Wikt page you like? I checked fr:rape, but it lacks all the etymology sections we have except the sexual violation one; I checked fr:fast, but it lumps the homographs together and doesn't distinguish anything by header level AFAICT. - -sche (discuss) 02:56, 2 November 2012 (UTC)[reply]
According to fr:Aide:Étymologies, French Wiktionary uses a single etymology section, with qualifiers and such to indicate which words a particular etymology applies to. That may not be ideal for long etymologies, but if it can work for them it may work for us, too. It could also be an incentive for us to keep our etymologies short, although I'm not sure if that's a good incentive... —CodeCa t 03:02, 2 November 2012 (UTC)[reply]

Disagree: Ideally, etymology would remain above the part-of-speech header, at level 3

Oppose because I often navigate to a page just for the etymology, which I really don't do very often for any other part of the entry except the definition itself. Given the layout, I find the def to be easy to pick from the page, but pushed down from the top, etymologies are hard to find quickly. (Off topic: I wouldn't mind if we moved Alternative Forms down, though.) —Μετάknowledge^{discuss/deeds} 23:51, 1 November 2012 (UTC)[reply]
I support the current arrangement, which puts etymology first. It allows us to cleanly distinguish homographs (unrelated words which merely happen to have convergent spellings) from words which need multiple POS sections for the same etymology (see e.g. de:scheren, though our entry is deficient at the moment). Splitting the noun and verb senses of "feed" into separate, related etymology sections (see CodeCat's comment above) is something that I've hitherto seen undone, i.e. treated as wrong and corrected. - -sche (discuss) 00:18, 2 November 2012 (UTC)[reply]
While that is true, the question remains whether the etymology should be the first item displayed. We could also opt for a compromise solution where senses are still grouped by etymology, but the etymology of each part of speech is nested within the PoS header (which would then be at level 4, and the etymology section at level 5 like derived terms). I think it is important to display the etymology of every part of speech individually, because the fact is that they always do have separate etymologies: one of the parts of speech was always "first" and the others then derived from it. For example, the adjective and adverb senses of fast have separate etymologies tracing back at least to Proto-Germanic, and it is clear that feed was not a noun until modern English, so the noun must be derived from the verb. Perhaps we could introduce numbered ===Sense 1=== ===Sense 2=== etc for this purpose, or maybe ===Root 1=== and ===Root 2=== since that is what we really group words by. The Catalan diccionari.cat follows a similar scheme, see [1]. —CodeCat 00:32, 2 November 2012 (UTC)
What would be the point of splitting by root using Root or Sense headers but putting the other etymological info elsewhere? - -sche (discuss) 04:13, 2 November 2012 (UTC)[reply]

I think that two parts of speech within one etymology belong to two words. For example, "noun:paper" and "verb:paper" are two words, even more so as, in English, they have distinct sets of inflected forms. Being two words, "noun:paper" and "verb:paper" are two homographs within one etymology. Thus, structuring by etymology first does not enable us to cleanly distinguish homographs; the number of etymology section is not necessarily the number of homographs on the headword. However, what bothers me is that I do not have any academic source for this. OTOH, my view is consistent with Merriam-Webster's presentation of "sound": they have 7 indexed sections for "sound", some of which have the same part of speech (³noun, ⁵noun), and some of which have the same etymology (³noun, ⁴verb). --Dan Polansky (talk) 08:55, 3 November 2012 (UTC)[reply]
Oppose. In a single word, there exist multiple senses which are quite different from other senses, such as bark, rape, hack, hail, et cetera, and these are explained by etymologies. I am of the opinion that we should emphasise the causes of these distinctions, which should emphasise the distinctions themselves, and not provoke confusion. The supposed ‘readers’ who complain to us often have very stupid complaints which are hardly worth taking into consideration. Wiktionary doesn’t need to be dumbed down. --Æ&Œ (talk) 01:32, 2 November 2012 (UTC)
I'm not sure what you mean. I'm not suggesting that we lump all "noun" senses together. In fact I would oppose such a change as well. My proposal is only to move the etymology to another part of the page, not to merge any sections together or remove information. —CodeCa t 01:53, 2 November 2012 (UTC)[reply]
Oppose. My impression from other Wiktionaries is that it looks like a pocket dictionary when the etymology is below the definitions. A better solution, in my opinion, would be making the etymology collapsed by default (like quotations) but having a preference for making the default collapsible or not. — Ungoliant ^(Falai) 03:46, 2 November 2012 (UTC)[reply]
Oppose nesting Etymology under Part of Speech, but I might possibly support some other way of splitting up distinct "words" on the page...it can't be done by part-of-speech though, because there are often several parts of speech for each Etymology. Ƿidsiþ 06:29, 2 November 2012 (UTC)[reply]
Oppose. As others have said, keeping the etymology above the POS is the best way of distinguishing homographs. —An gr 11:36, 2 November 2012 (UTC)[reply]
Why is it the best way? Or rather, why are other options not the best way? —CodeCa t 13:33, 2 November 2012 (UTC)[reply]
Because it's the only way to group all the POSes with a single etymology together. Otherwise you just wind up repeating yourself. —An gr 15:39, 2 November 2012 (UTC)[reply]
Oppose for many of the reasons given above (homographs, the definition itself being easy to spot anyway, etc.). Also, I've looked at other Wiktionaries and when the etymology is hidden down below it always seems out of place. Wiki Tiki 89 16:30, 2 November 2012 (UTC)[reply]
Oppose. The structure that Etymology affords English entries seems absolutely essential for English. Of all the things we have that are somewhat over the head of many of our English users, this one seems the easiest to justify.
Etymologies often take up too much landing-screen space, but so do Pronunciation sections. {{rel-top}} can conceal some of the excess and other approaches could conceal more, even all, of the contents of these sections. I'd favor more concealment of both Etymology and Pronunciation section contents, preferably by default, with expanded display available for all users during a session and for registered by expressed preference. DCDuring TALK 18:08, 2 November 2012 (UTC)[reply]
Oppose. As others have said, there are plenty of words where the adjective, noun, verb, etc. all come from the same place, and this change would result in a lot of duplication of information for little gain (there are very few articles where the etymology is so long it floods the page, and there are better ways of dealing with them than this). The etymology itself could be placed below the definitions, but the Etymology heading needs to stay at level 3. Smurrayinchester (talk) 21:13, 2 November 2012 (UTC)
Can you name an example where different parts of speech share the same etymology throughout all of their history, rather than one being directly derived from the other? —CodeCa t 21:21, 2 November 2012 (UTC)[reply]
Loanwords like abseil and tattoo seem to have had multiple parts of speech created from the same source. In general, I think that "cash (verb): verbing of cash (noun)" is over pedantic, but I'll happily cede to actual linguists on that point. Smurrayinchester (talk) 10:05, 3 November 2012 (UTC)[reply]
Oppose. @ Codecat There are many examples from Egyptian where one etymology will produce a noun, a verb and an adjective (e.g. jmn). Egyptian can derive verbs from nouns, but it can also do vice versa, and often (usually, even) the evidence is too poor to say which PoS came first. Furius (talk) 05:12, 3 November 2012 (UTC)[reply]
Oppose. Looking at the mockup it takes significantly more space to impart the same information in a less useful manner - it is much harder to see the etymological relationships between words - which senses are related and which are homographs. As noted above, this would incentivise very short etymologies, which is the opposite of what we want in most cases. Thryduulf (talk) 18:50, 13 November 2012 (UTC)[reply]

Other

This section was added after many of the votes and comments had been posted.

I think etymology'd be best as an L3 header, but below the definitions not (as it is now) above them. That said, IMO the proposal on the table here is probably better than what we have now. But any vote to effect it should be preceded by a fuller poll than this one: a poll that allows other possibilities also, like the one I prefer. Or not, but then it should be considered a vote without a preceding poll. What I mean to say is that this is not a binary issue.—msh210℠ (talk) 06:15, 2 November 2012 (UTC)[reply]

Abstain

Need more information (perhaps an example page for a polysemous word like fast) showing how this works. Equinox ◑ 22:41, 1 November 2012 (UTC)[reply]
I've created an example mockup at User:CodeCat/fast, which shows one possible way of writing the entry. All I've done is changed the nesting so that the PoS sections are at level 3, the sections within them at level 4, and the etymology also within them. Notice that there are no more level 5 headers; under this scheme, a header like "noun" is always at level 3, and "derived terms" is always at level 4. —CodeCa t 23:18, 1 November 2012 (UTC)[reply]
At WT:FEED, there are complaints about not being able to find definitions, but other users complain about the lack of etymologies. Not sure this would actually constitute a step forward. Mglovesfun (talk) 21:53, 2 November 2012 (UTC)

I don't know. --Dan Polansky (talk) 09:26, 3 November 2012 (UTC)[reply]

Comments

Is there an option of neither and/or both of the above? Although a nice, clean logical hierarchy makes it easier to find things and to decide where to put them, the reality we're trying to model is anything but clean and hierarchical. There are a variety of dimensions that make up language, and the speakers of a language will make changes in one or more of those to accomplish what they collectively need or want: a change in the position of the accent (phonology) might cause inflectional endings to be reduced and sound alike (phonology), which causes other parts of speech such as determiners to be pressed into service to carry the same syntactic information (syntax), which causes a single word to split into two, one free and one dependent, which causes the dependent word to change one way, and the free one to change another, because of the predominant phonological environments for each. New technologies and physical items are introduced, along with the words used by the foreigners who introduce them, or changes in political and military factors mean that some groups become more important, and the words they use, and the way they talk becomes more prestigious and displaces inherited ones (etymology), which causes phonological and syntactic change, and so on.

Which dimensions take the changes and which remain the same is often an accident of history. That means that sometimes things are best grouped by etymology, other times by part of speech, sometimes by other factors. Each language will have its bias toward one or the other, but anomalies can crop up anywhere: think about suppletive verbs, with multiple etymologies within the same paradigm, or the English indefinite article, which is etymologically identical with the number one (at least up to the point they split).

Having the etymology under the POS won't work in some cases, and having the POS under the etymology won't work in others. Both sides can point to clear-cut cases where their approach works and the other doesn't, and there are a whole lot of cases where both are equally good. That means that we can go back and forth between mutually exclusive but equally good approaches until someone gets tired of it, or until the universe slowly winds down after countless millennia, whichever comes first.

The question is, how can we accommodate this variability without tying ourselves into knots and letting chaos get in the way of usability and clarity? Right now we have a rigid, ironclad hierarchical setup which we play with or ignore when it seems right, but which too often doesn't provide what users want where or how they want it. Flipping it from one half-right arrangement to another will solve some problems and create others. Is there any way to use either as needed, without making a mess of things? Chuck Entz (talk) 03:58, 2 November 2012 (UTC)

Problem with French categories

Good evening (sorry for my English),

there is a problem with Category:hy:Grammar (for example; I have already corrected Category:en:Grammar and several others). The correct interwiki would be fr:Catégorie:Lexique en arménien de la grammaire (= Armenian terms related to grammar). But it actually links to fr:Catégorie:Grammaire en arménien, which is in fact Category:Armenian parts of speech (= Armenian verbs, adjectives, nouns, etc.). It is probably the same with many others (and even if the French problem is eventually settled, I am sure other interwikis are wrong). So what are we supposed to do?

Regards, --Fsojic (talk) 21:23, 2 November 2012 (UTC)[reply]

You can just fix them... —CodeCa t 21:35, 2 November 2012 (UTC)[reply]

Well, there are about a hundred that need checking! Seriously, using AWB I reckon I can fix this in a few minutes. Mglovesfun (talk) 21:36, 2 November 2012 (UTC)[reply]

Something's wrong: I can't get AWB to do even the simplest replacement. I can't even get it to replace a with b right now! Mglovesfun (talk) 21:50, 2 November 2012 (UTC)

Proposal to eliminate all Goguryeo words in the main namespace

I have proposed that all Goguryeo words be deleted. See Wiktionary:Requests_for_deletion#All_Goguryeo_words_in_the_main_namespace. --BB12 (talk) 21:22, 3 November 2012 (UTC)[reply]

We evidently have a block evader. See Special:Contributions/71.191.95.5. —Μετάknowledge^{discuss/deeds} 02:20, 4 November 2012 (UTC)[reply]

Can Goguryeo be blocked? --BB12 (talk) 04:36, 4 November 2012 (UTC)[reply]

Yes, if that's what we want to do.—msh210℠ (talk) 06:06, 4 November 2012 (UTC)[reply]

-sche has deleted all of the Goguryeo entries. Perhaps we should wait and see for now. If more inappropriate ones get added, this could be taken as a countermeasure (that could be undone if someone wants to add appropriate entries). --BB12 (talk) 21:54, 4 November 2012 (UTC)

Translations into Low German dialects

User:129.125.102.126 has been adding a few translations into some of the Dutch Low German dialects. They have their own language codes, but since we already use {{nds}}, what should we do with them? This is a bit like Serbo-Croatian; either we treat them as one language or many, but we can't do both. One difference with SC is that Low German does not have sub-codes for all dialects, only some, so if we do decide to add each dialect individually, we would necessarily have to continue to also use {{nds}} for those Low German dialects that have no code of their own. Which seems a bit silly to me. —CodeCa t 04:17, 4 November 2012 (UTC)[reply]

In the Netherlands, Low German tends to be influenced by Dutch and by Dutch orthographical ideas, while in Germany, it tends to be influenced by High German and by German orthographical ideas. Therefore, combining the Dutch Low German lects under the umbrella of the {{nds-nl}} and the German(y) Low German lects under {{nds}} could be useful. It would be arbitrary, though: every lect pronounces and spells things slightly differently; the macro-codes would still represent a mess. There is also disagreement on whether to capitalise nouns, and on how to indicate long and short vowels, in any lect. (I know you know this; I say it for others' benefit.) I am in favour of combining the lects, either into nds-nl and nds, or all into nds. However, we should require that every entry have a {{context}} tag specifying which lects or portions of the Sprachraum use it. - -sche (discuss) 07:15, 4 November 2012 (UTC)[reply]

Requiring a context tag might be a bit overboard, because there will be many words that are the same in all of them. Then again, being explicit isn't bad either? Also thank you for including previous discussions. —CodeCa t 13:51, 4 November 2012 (UTC)[reply]

I think I've seen Low German entries using things like {{context|in many dialects}} and {{context|in many eastern dialects|_|including|_|Foobar}} already. I'd like to require entry-creators to think in those terms—about what dialect(s) a term (and spelling!) is used in, even if the answer is "most of them"—lest we end up with some entries in Netherlandic orthography, some with long and short vowels marked, some with long but not short vowels marked, some with only short vowels marked, some with neither, some capitalised, some not, some using öö(ff) for a sound that in another word (in another dialect) is spelt ü(nf), without giving readers any clue that the differences are dialectal and orthographical. (Imagine having color, favour, hain't, realise, fellar, 'alf, and bastardize without indicating that some are standard US spellings, some are standard UK, some are from h+ dialects, some are from h-less dialects, etc.) Btw, I wouldn't delete context-less entries; I would merely put visible cleanup tags on them.

Re previous discussions: Thank you for continuing to prod the community to do something about the issue. I hope we can reach a decision and do something this time. I think I've been too nauseated by how much of a total mess Low German is to take a position, in the past... but you're right, merging them is the most sensible thing to do. - -sche (discuss) 23:02, 4 November 2012 (UTC)[reply]

Let's make a list of all the Low German dialect templates, so we can see if there are any that shouldn’t be merged, and so we can start merging the ones with the fewest transclusions first. For example, some users argued in past discussions that {{pdt}} should be kept separate. (And the mess which is {{frs}} vs {{stq}} has to be sorted out semi-separately.) I think a merger of the Dutch dialects into {{nds-nl}}, a merger of the German dialects into {{nds}}, and a retention of {{pdt}} would be sensible and feasible. - -sche (discuss) 22:41, 14 November 2012 (UTC)[reply]

OK, if we do this:

{{pdt}} (Plautdietsch) → keep as-is
~~{{wep}} (Westphalian)~~ → merge into {{nds}}
~~{{act}} (Achterhoeks)~~, {{drt}} (Drents), {{gos}} (Gronings), {{twd}} (Twents), ~~{{sdz}} (Sallands)~~, ~~{{stl}} (Stellingwerfs)~~, ~~{{vel}} (Veluws)~~ → merge into {{nds-nl}}

is anything missing? - -sche (discuss) 22:53, 14 November 2012 (UTC)[reply]
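
The merger listed above can be sketched as a simple lookup table. This is only an illustration of the proposal as stated at this point in the discussion, not anything that was actually deployed: the names `MERGE_TARGETS`, `merged_code`, and `group_by_target` are hypothetical, and the targets reflect the list above ({{nds}} and {{nds-nl}}), not the later {{nds-de}} suggestion.

```python
# Hypothetical lookup table for the merger proposed above.
# Keys are the dialect codes named in the discussion; values are the
# code each one would be merged into under that proposal.
MERGE_TARGETS = {
    "pdt": "pdt",     # Plautdietsch: keep as-is
    "wep": "nds",     # Westphalian
    "act": "nds-nl",  # Achterhoeks
    "drt": "nds-nl",  # Drents
    "gos": "nds-nl",  # Gronings
    "twd": "nds-nl",  # Twents
    "sdz": "nds-nl",  # Sallands
    "stl": "nds-nl",  # Stellingwerfs
    "vel": "nds-nl",  # Veluws
}


def merged_code(code: str) -> str:
    """Return the proposed target code; codes not in the table pass through unchanged."""
    return MERGE_TARGETS.get(code, code)


def group_by_target(codes):
    """Group dialect codes by the code they would be merged into."""
    groups = {}
    for code in codes:
        groups.setdefault(merged_code(code), []).append(code)
    return groups
```

Grouping the codes this way makes it easy to see which templates would end up under each umbrella code before any transclusions are changed.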

I think that is all of them, and I agree with your proposal. But we would need to make it clear somehow that {{nds}} is only the German variety. Either that, or use {{nds-de}}. Dutch Low Saxon speakers might otherwise decide to add their terms under Low German instead. —CodeCa t 23:04, 14 November 2012 (UTC)[reply]

{{nds-de}} is a good idea.

Should nds-de just be named "Low German", as {{nds}} currently is, or should we make the name (as used in L2 headers and translations tables) different, too? "German Low German" is unschön...hm, I suppose plain "Low German" is best.

Do you think we should group nds-nl and nds-de next to each other in translations tables, under an empty "Low German"/"Low Saxon" header, the way we put "Mandarin" under an empty "Chinese" header? Should {{pdt}} also be in that grouping? - -sche (discuss) 00:11, 15 November 2012 (UTC)[reply]

I agree on the grouping, but I'm not sure where to put pdt. People who aren't in the know might look for it under Dutch instead, and get confused. Concerning grouping, there is a discussion about it further down, and at Wiktionary talk:Translations. —CodeCa t 00:48, 15 November 2012 (UTC)[reply]

I've added a notice to WT:NFE now. —CodeCa t 02:54, 15 November 2012 (UTC)[reply]

What should we do with {{nds}}? Delete it? We can't have two codes for the same language, that will break things. —CodeCa t 03:02, 15 November 2012 (UTC)[reply]

I suppose deprecating {{nds}} would be best, but that should be done slowly, giving people plenty of time to suggest alternatives, since {{nds}} is used on some 1000–1500 pages. I deleted wep, act, sdz, stl, and vel rather speedily only because they all had <5 transclusions. - -sche (discuss) 03:26, 15 November 2012 (UTC)[reply]

FYI, I have now orphaned {{drt}} (Drents) and {{twd}} (Twents), but not yet deleted them. - -sche (discuss) 04:50, 15 November 2012 (UTC)

Mention quotations in the mainspace and Citations namespace

I am inclined to remove mention quotations (as opposed to use quotations) from Streisand effect, and from Citations:Streisand effect. Should I feel free to do so? If not, why? (I have posted the same question to the Tea room, but then I realized the Beer parlour seemed better suited.)

An example of mention quotation is IMHO this: "Let's call it the Streisand Effect." --Dan Polansky (talk) 11:47, 4 November 2012 (UTC)[reply]

I agree. The main entry should only contain usage examples and citations that demonstrate how the term is used. Citations that merely demonstrate the existence of the term belong on the Citations page. --Wiki Tiki 89 11:52, 4 November 2012 (UTC)[reply]

Maybe, but not removed from the cites page.—msh210℠ (talk) 03:12, 5 November 2012 (UTC)[reply]

I think it might be a good idea to keep at least one or two of the mentions, but at the bottom under a separate header, with a brief explanation of why they're no good for CFI. Call it an educational exhibit, or call it salting against more mention-cites; either way, it should be beneficial. Chuck Entz (talk) 09:41, 5 November 2012 (UTC)

Like I said in the Tea Room discussion, I've removed the mention quotations. ~ Röbin Liönheart (talk) 19:49, 4 November 2012 (UTC)[reply]

Dubious citations should remain on the citations page, or else what's the citations page for? Some mentions can be useful in terms of chronology, such as a term that's coined as a mention then actually gets used after that. Mglovesfun (talk) 10:46, 5 November 2012 (UTC)[reply]

And in the case of neologisms, we of course want Citations to note their original sources, even if the first recorded instance is a mentiony proposal like "Let's call it the Streisand Effect". Frex:

2005 October 17, The Colbert Report‎^[2], Stephen Colbert (actor):
And that brings us to tonight's word: "truthiness".

would not belong in the main namespace, but is of historical interest in Citations. ~ Röbin Liönheart (talk) 20:42, 5 November 2012 (UTC)[reply]

Comment: For the most part, I agree with Wikitiki89 (talk • contribs), msh210 (talk • contribs), Mglovesfun (talk • contribs), and Robin Lionheart (talk • contribs), above. And I'm quite thankful to Robin Lionheart (talk • contribs) for the help at the page with selecting the best cites from the Citations page for the main entry page. But yes, especially for chronological history and development of terminology over time and subsequent usage, it's good to have a broad scope for the Citations page, agreed. :) Cheers, -- Cirt (talk) 04:14, 6 November 2012 (UTC)

Religion related goals for 2013

This is probably more than a bit presumptuous on my part, but I have started a discussion at w:Wikipedia talk:WikiProject Religion#Goals for 2013? in the English wikipedia asking what if any sort of goals we might be able to reasonably set for the next year, in wikipedia and other WF sites as well. I figured the wikipedia probably gets more attention, which is why I started the discussion there. But I would be very interested in seeing any input regarding what the editors here think might be the areas of wiktionary most in need or meriting additional attention. Maybe, and at this point it is just a maybe, maybe we might be able to get some input on such topics if we have some idea what it is we really need to work on. Anyway, I would welcome any input anyone here might have. John Carter (talk) 19:33, 4 November 2012 (UTC)[reply]

Editors here tend to focus on a language as a whole, instead of a specific topic. But of course, anyone is more than welcome if they want to work solely on religion. Just don’t make the mistake of assuming rules here will be the same as WP rules, and understand that there might be some controversy when editing such a sensitive topic (such as the recent discussion about the {{biblical character}} template). If you’re interested, you can use the Category:en:Religion category and its subcategories as a starting point to see what’s missing. — Ungoliant ^(Falai) 22:22, 6 November 2012 (UTC)[reply]

Disruptive IP user(s)

Lately there is a user who is adding a lot of unattested Gothic words and removing correct information from entries. It appears they are acting in good faith, but they completely ignore any messages I send and even revert editors who revert their changes. So far they have used several IPs: Special:Contributions/2.9.125.57, Special:Contributions/109.211.108.157, Special:Contributions/90.59.167.191, Special:Contributions/2.9.253.236. Two of these have been blocked for disruptive editing, but I doubt it will help because they just keep coming back. I'm not sure what to do about this but I'd appreciate it if others could keep an eye out; any IP editing Gothic or Proto-Germanic related information could be this user again. —CodeCat 13:25, 6 November 2012 (UTC)

He is from the French wiktionary, and has already been blocked several times : http://fr.wiktionary.org/w/index.php?title=Spécial:Journal/block&page=Utilisateur%3AX . He is currently translating pages of the Appendix (the ones I am linking to myself) in French. According to other users he sometimes does good job, but is impossible talking (to talk ? sorry for my English) to. --Fsojic (talk) 20:34, 6 November 2012 (UTC)[reply]

Yes I noticed, the appendix pages on the French Wiktionary look ok though. What have they been blocked for on French Wiktionary? And why unblocked? —CodeCa t 20:44, 6 November 2012 (UTC)[reply]

For the same reasons as here : he sometimes removes correct information, doesn't listen to remarks, etc. But he seems to act in good faith, and does create correct articles. --Fsojic (talk) 20:58, 6 November 2012 (UTC)[reply]

They're back again: Special:Contributions/2.9.254.133. Still not replying and still adding unattested words. Should we just ban for all eternity? —CodeCa t 20:19, 9 November 2012 (UTC)[reply]

I'm not sure perma-blocking someone who just keeps changing accounts to circumvent shorter blocks will have any effect. I'm also not sure it's appropriate, given Fsojic's comment. I've blocked them for two weeks for showing no sign of willingness to work constructively and listen to advice. - -sche (discuss) 21:55, 9 November 2012 (UTC)[reply]

What I meant was should we just keep forever blocking all the new IPs as they appear. It's like... we're damned if we have to keep chasing after them, but also damned if we let them continue to insert bad content into Wiktionary... :/ —CodeCa t 22:06, 9 November 2012 (UTC)[reply]

I'm not so sure they're changing IPs just to evade blocks: even with wildcard searches, I haven't been able to find any address that has been used on Wiktionary longer than about a week. The only time I saw them come back to the same IP was after a 3-day block- it was still within the usual 1-week time period. It looks to me like we're going to have to play IP whack-a-mole, but I doubt the blocks will have to be more than a week. I saw one edit on the French wiktionary where they claimed δαίμων (daímōn) and Proto-Slavic *bogъ were from the same PIE root! Chuck Entz (talk) 00:03, 10 November 2012 (UTC)[reply]

Caught again: Special:Contributions/2.9.254.76. —CodeCa t 17:38, 15 November 2012 (UTC)[reply]

Another: Special:Contributions/109.211.64.183, blocked. I'm listing these here just to keep track of what this user is doing, and also so that maybe we can figure out what IP range to block to keep them out. —CodeCa t 15:29, 20 November 2012 (UTC)[reply]

Keeping a list now: User:CodeCat/disruptive IP. Please help to spot this user and add their IP to the list! —CodeCa t 15:35, 20 November 2012 (UTC)[reply]
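As a rough illustration of the range question raised above, the following Python sketch computes the smallest single network that covers a group of the addresses reported in this thread. The helper name `covering_network` and the grouping by leading octets are assumptions made for the example; nothing here reflects how blocks are actually configured on the wiki, and a covering range may well be too broad to block in practice.

```python
import ipaddress

def covering_network(addresses):
    """Return the smallest single network containing every address given."""
    ips = [ipaddress.ip_address(a) for a in addresses]
    if len({ip.version for ip in ips}) != 1:
        raise ValueError("all addresses must share one IP version")
    # Start from the first address as a single-host network and widen it
    # one prefix bit at a time until every address fits.
    net = ipaddress.ip_network(f"{ips[0]}/{ips[0].max_prefixlen}")
    while not all(ip in net for ip in ips):
        net = net.supernet()
    return net

# Addresses taken from the comments above, grouped by hand (an assumption).
reported = {
    "2.9.x.x": ["2.9.125.57", "2.9.253.236", "2.9.254.133", "2.9.254.76"],
    "109.211.x.x": ["109.211.108.157", "109.211.64.183"],
    "90.59.x.x": ["90.59.167.191"],
}

for label, group in reported.items():
    print(label, "->", covering_network(group))
```

The size of each resulting range gives a sense of how many unrelated users a range block would catch, which is the trade-off against chasing each new IP individually.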

That is hilarious (seriously) --Fsojic (talk) 17:59, 20 November 2012 (UTC)[reply]

Could Wiktionaries have an influence on ISO639?

Hi everybody there.

I really think Wiktionaries are an entity which can do that : help to decide what is a language and what is not.

Yes, I understand it's far too early. But imo I really think that it will become reality in some future.

Just think about that : Wiktionaries being part of ISO639 council. Imagine a world... --GaAs 21:49, 6 November 2012 (UTC)[reply]

That would be nice. The problem is that many people here are merely interested in linguistics, but are not linguists. Trusting people like me with such important linguistic decisions wouldn’t be very wise. — Ungoliant ^(Falai) 22:10, 6 November 2012 (UTC)[reply]

Also, we tend to be influenced by ISO rather than the other way around. --Wiki Tiki 89 08:19, 7 November 2012 (UTC)[reply]

You can submit new requests to the ISO 639-3 maintenance body at their website. I'm not sure what you want; is there any thing you tried to communicate with them on but didn't feel you were listened to, or is this a generic "we want power"?--Prosfilaes (talk) 09:08, 7 November 2012 (UTC)[reply]

I once wanted to email them to suggest merging {{xno}} into {{fro}}, but I couldn't find a way to do that. Mglovesfun (talk) 10:20, 7 November 2012 (UTC)[reply]

I think ISO-639 needs better transparency than the Wiktionaries can provide. Language is politics, and most Wiktionarians are anonymous. It's not a good fit. —Ruakh_TALK 16:12, 7 November 2012 (UTC)[reply]

I think Wikileaks has anonymous leakers yet their voice is heard and even propagated. Also, for the Wiktionarians that are onymous the last sentence appears quite pessimistic. --biblbroks_{дискашн} 22:16, 7 November 2012 (UTC)[reply]

More like: Liliana-60 being part of ISO639 council. It may sound like a self-advert, but I don't really think many people know about languages around here other than me. -- Liliana • 21:55, 10 November 2012 (UTC)[reply]

Wiktionaries should not have an influence on ISO639. Instead, NPOV should lead us to accept at least all languages with an ISO code. But I think that, as the Wiktionary project is a tool, and this tool is available to people making decisions, it could (unfortunately) have an influence, especially in the future, when projects will be much more complete. Lmaltier (talk) 14:48, 18 November 2012 (UTC)[reply]

Shouldn't NPOV also lead ISO639 to accept only languages? Also, if projects are to be much more complete in the future any influences they could have should be regarded as anything but unfortunate. --biblbroks_{дискашн} 22:55, 18 November 2012 (UTC)[reply]

A stock reply to Lmaltier's stock comment. That would be ok if ISO 639 didn't get it wrong, but they do. Their codes aren't designed to be used for Wiktionary, so it's perfectly acceptable to them to give two codes (or more) to the same language, or to give codes to dialects which aren't languages in their own right. Also, sometimes they just get it totally wrong. NPOV doesn't seem like the right term, you mean following ISO 639 blindly like a robot no matter what the consequences. Mglovesfun (talk) 23:02, 18 November 2012 (UTC)[reply]

Not at all. "Following ISO 639 blindly" or "like a robot" or "no matter what the consequences" would entail accepting all and only languages with an ISO code. Lmaltier is saying that, whatever point(s) of view ISO 639 represents, NPOV forbids us from forbidding it/them. After all, despite what you say, it's not really possible to speak in objective terms about their "getting it wrong", or about "dialects which aren't languages in their own right"; it's all a matter of viewpoint. (For the record, I don't really agree with Lmaltier about this. True NPOV would require us to acknowledge the viewpoint that X is a language and Y is not, as well as the viewpoint that Y is a language and X is not; but Lmaltier proposes to acknowledge only half of each viewpoint, the "___ is a language" half, while studiously ignoring the "___ is not a language" half. Personally, I actually think I'd rather that we followed ISO 639 blindly than that we followed Lmaltier's conception of NPOV.) —Ruakh_TALK 02:59, 19 November 2012 (UTC)[reply]

But ISO does not state that its list is complete and definitive, and that anything not in the list is not a language. This is why several versions of the standard have been designed. Lmaltier (talk) 18:27, 27 November 2012 (UTC)[reply]

The same applies to words: when a classical dictionary includes a word, we can safely include it (except when this is a mistake, but this is very rare). This does not mean that a word we cannot find in classical dictionaries should not be included. Lmaltier (talk) 18:31, 27 November 2012 (UTC)[reply]

I'm not sure that the w:Congregation for the Causes of Saints compiles a list of miracles but if it did, I'm almost certain that such a list wouldn't entail a statement that the list is complete and/or definitive, nor that anything not in this list is not a miracle. Lack of any such statements cannot be an argument to accept these lists as authoritative enough for every case. Anyway, if there are several versions of the standard that differ in the list of languages, which one are we to follow and by which criteria shall we decide which one to follow?

The difference between dictionaries and ISO639 is that dictionaries are many and come from different sources, while ISO639 is (AFAIK) one and centrally managed and also that you usually don't say lexicography is politics. --biblbroks_{дискашн} 23:40, 27 November 2012 (UTC)[reply]

Subdivisions for different stages of a language in translations

As it is now, most old languages are listed entirely separate from their modern counterparts. For example, Old High German is sorted under O, while German goes under G. However, I think Ancient Greek is listed as a sub-item of Greek. I think sub-items are useful, personally. Would it be desirable to make such sub-items standard practice? We should probably list them based on the name though, rather than the genetic family tree. After all, Luxembourgish is every bit as much "modern Old High German" as is German itself, and Scots likewise descends from Old English too. So we would sort Old High German under German, because it has German in the name. Old languages that are not clearly related to a modern language by name, such as Old Norse, would be sorted as usual. —CodeCa t 21:38, 10 November 2012 (UTC)[reply]

See water, it does it like this:

German: blah
- Middle High German: blâh
- Old High German: blahaz

This is what it should be like. -- Liliana • 21:51, 10 November 2012 (UTC)[reply]

Yes, that is what I meant. —CodeCa t 22:38, 10 November 2012 (UTC)[reply]

Fully support, everywhere possible. (I, personally, also wish that Scottish Gaelic was next to Irish instead of Scots, and a lot of stuff like that, but I understand that de-alphabetizing is not user-friendly.) —Μετάknowledge^{discuss/deeds} 04:04, 11 November 2012 (UTC)[reply]

I thought we were already doing this and had been for some time. When I use the little rapid-entry form to add Old Irish translations, it automatically lists them as an indented entry below Irish. There is a problem with Old English and Middle English, though: ideally they should be listed as subheadings under English, only of course we don't have English lines in translation tables. —An gr 09:38, 11 November 2012 (UTC)[reply]

Maybe we should, what if someone wants to know how to say something in English?

But seriously here's what I think: If OHG is a subdivision of German then Old English is a subdivision of English, meaning that it would be listed in the entry rather than as a translation, which I don't think is that bad of an idea if only I knew where to put it. It could go in as a synonym with {{qualifier|Old English}} or it could have its own heading like ====In older forms of English====. --Wiki Tiki 89 09:56, 11 November 2012 (UTC)[reply]

I'd rather keep them in the translation table, but maybe we could have a line that says "English (older forms)" that doesn't contain any translation itself but under which Middle English and Old English can be listed on their own lines. And maybe Old Norse could be listed as "Norse, Old" and alphabetized under N. —An gr 12:27, 11 November 2012 (UTC)[reply]

I agree, we should keep Old English in the table. Changing the sorting order of certain languages also seems like a good idea, but I'm not sure how to do that. Someone who has better knowledge of how the translation adder works should have a look. —CodeCa t 13:20, 11 November 2012 (UTC)[reply]
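The sorting idea discussed above could be sketched roughly as follows in Python. This is only an illustration of the proposal, not of how the translation adder actually works: the function name `translation_sort_key`, the list of stage prefixes, and the set of "known" modern language names are all assumptions made for the example. Stage names such as "Old High German" are nested under a modern language whose name they end with, and stages with no such counterpart are alphabetized in inverted form ("Norse, Old").

```python
# Hypothetical stage prefixes; the real list would need discussion.
STAGE_PREFIXES = ("Old ", "Middle ", "Ancient ")

def translation_sort_key(name, known_languages):
    """Build a sort key that nests older stages under a same-named language."""
    for prefix in STAGE_PREFIXES:
        if name.startswith(prefix):
            words = name[len(prefix):].split()
            # Try "High German", then "German", and so on.
            for i in range(len(words)):
                candidate = " ".join(words[i:])
                if candidate in known_languages:
                    # 1 places the stage after the modern language itself.
                    return (candidate, 1, name)
            # No modern counterpart by name: sort as "Norse, Old".
            return (" ".join(words) + ", " + prefix.strip(), 0, name)
    return (name, 0, name)

modern = {"Dutch", "German", "Greek", "Norwegian"}
names = ["Old Norse", "Old High German", "German", "Greek",
         "Ancient Greek", "Middle High German", "Norwegian", "Dutch"]

for n in sorted(names, key=lambda n: translation_sort_key(n, modern)):
    print(n)
```

Because the nesting is decided purely by name, Old High German lands under German even though, as noted above, other languages descend from it just as much; that is a property of the proposal, not of the sketch.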

Personally I am against translations for different stages of a language. There are no living persons that can verify meaning in older stages, although from an older stage of language we can "assume" (even be 99% or 100% certain) how it was used (of course only under well-known, at the moment, circumstances). So translations will remain "unverified" for ever. Maybe a special old-language-table, including all older languages, can do the trick, in case someone wants to give it a try and translate something in previous-stage-of-language (I do not know why someone wants that, but anyway...), or some of us want an easier way to find how a word "probably" was spoken. But it still is a language mangling. It does not correctly translate the lemma since we do not know for sure if there was a special separate word for "water-under-sun" or yellow-water (for example) which makes our translation unworthy (or funny?-:). Someone can also say "who cares if translation is not exact since no one can verify it" but...

The exact opposite: what the exact ("pagename") word meant a hundred or two hundred years ago could do a benefit. Saying just obsolete is not enough. When it started having that "obsolete" meaning? And when it stopped? This is crucial for someone who reads an XXXX year book and wants to search dictionary meanings of a specific word. For Ancient Greek or Chinese the era is crucial and I think that the same goes for older stages of other languages (fr, de etc). Wikifriendly --Xoristzatziki (talk) 13:25, 11 November 2012 (UTC)[reply]

See Wiktionary talk:Translations, where I've started off a list of languages that could have subdivisions by period or variety. —CodeCa t 16:49, 11 November 2012 (UTC)[reply]

To virtually everyone, yes the translation adder already nests a lot of Old and Middle languages, but a lot of translations predate the translation adder or were added without it. The point of this thread seems to be to standardize what we do, as the practice isn't official, I don't think it's even unofficial per se, it's just what Conrad.Irwin's translation tool does. So having a policy or a guideline to me seems like a very good idea. Mglovesfun (talk) 17:16, 11 November 2012 (UTC)[reply]

I disagree with this nesting. Someone who sees the Yiddish or Dutch translation of a word and wants to know the ancestor is going to look under "Old High German" not "German" I think. Moreover, OHG is no more German than it is any other descendant and listing it as such does naive readers a disservice by implying it's German.—msh210℠ (talk) 17:59, 16 November 2012 (UTC)[reply]

Several of our entries already nest Ancient Greek under Greek, though. Should that be undone, and Ancient Greek sorted under A? —CodeCa t 18:15, 16 November 2012 (UTC)[reply]

Yes, it should absolutely be undone: Ancient Greek starts with A. Even consistent use of "Greek" to mean "Modern Greek" is already confusing, there's absolutely no reason to make matters even worse by being inconsistent about it. —Ruakh_TALK 19:21, 16 November 2012 (UTC)[reply]

Reconstructed pronunciations

I just noticed CodeCat adding pronunciations to Middle Dutch entries and it made me think: Shouldn't we somehow mark reconstructed pronunciations? Like with an asterisk or something? We can create a version of {{IPA}} that puts an asterisk before the transcriptions and displays a tooltip when moused over with some sort of warning message ("Caution: This pronunciation is not approved by the FDA. Pronounce at your own risk."). --Wiki Tiki 89 20:02, 13 November 2012 (UTC)[reply]

I created a quick example: {{:User:Wikitiki89/template:IPAr|/jæːɑr/|/jɑːr/|lang=ang}}:

IPA^(key): */jæːɑr/, */jɑːr/

--Wiki Tiki 89 20:17, 13 November 2012 (UTC)[reply]

I think in the case of a dead language we don't need to specify, it is kind of redundant. Unless we have a reason to suspect some of our users will believe we actually have access to native speakers of Middle Dutch here. :) —CodeCa t 21:08, 13 November 2012 (UTC)[reply]

If I'm not mistaken, the earliest articulatory/phonetic descriptions of Classical Arabic are about a thousand years old. Just because no Wiktionarian has access to any native speaker of a language, that doesn't automatically mean that pronunciations of it are necessarily reconstructed . . . But that said, I don't think this is a big deal. Even for English, many of our pronunciations seem to have been added by literate native speakers who know a word and don't realize that they don't know how to pronounce it. :-/ —Ruakh_TALK 22:18, 13 November 2012 (UTC)[reply]

That's also true. Even in a language with a spelling system as disastrous as that of English, you can still make an accurate guess about most words. In fact, I think nowadays most people use the spelling as a pronunciation guide rather than the other way around. The same can be done with dead languages too, with a bit of help from historical linguistics to distinguish minimal pairs or rule out impossible reconstructions. (As an example, although in Middle Dutch both sien and vrien have the <ie> digraph, comparison to Germanic cognates and to modern Dutch shows that the first had one syllable with an i-diphthong while the second had two syllables and a long i.) —CodeCa t 22:58, 13 November 2012 (UTC)[reply]

In many (most?) reconstructions, the pronunciation is basically self-evident, based on the same comparative evidence that provided the term itself if not attested. I personally trust the output of {{grc-ipa-rows}} more than a pronunciation added by a native speaker on a word that even specialists have trouble with (like chlorenchyma or something). —Μετάknowledge^{discuss/deeds} 01:43, 14 November 2012 (UTC)[reply]

I will agree that we usually know which phonemes were present, but don't know the realization of the phonemes. Like for ġēar, how do we know it was [jæːɑr] and not [jeːɑr] for example? Also, we often don't know for sure whether a vowel was long or short since it was unmarked. Yes, it's not that important but we ought to let people know that it's not necessarily accurate. I see this being more useful for languages like Hebrew, where the pronunciations from before the Masoretes invented nikkud must be reconstructed from Greek and Latin transliterations. And especially languages like Egyptian where we just don't know at all. (For what it's worth, I never mentioned anything about having native speakers as the only way to know a pronunciation.) Also, I think {{grc-ipa-rows}} is a perfect place to use this as it is extremely unlikely that the pronunciation of a word remained systematic for two thousand years. It is very likely that many of these words were not colloquially pronounced as they were spelled. --Wiki Tiki 89 08:11, 14 November 2012 (UTC)[reply]

Concerning gēar, I believe it's because the same diphthong is found as an allophone of ǣ, and because in Middle English it evolved into [ɛː] and only became [eː] with the great vowel shift. —CodeCa t 13:31, 14 November 2012 (UTC)[reply]

In other words, it is reconstructed. But I wasn't talking specifically about gēar. Honestly I see no downside to this other than having to go back and fix everything. --Wiki Tiki 89 13:46, 14 November 2012 (UTC)[reply]

Well, if you consider that long vowels are reconstructed in Latin, should we indicate this in front of all Latin terms, or drop the use of macrons altogether? —CodeCa t 14:13, 14 November 2012 (UTC)[reply]

Well, that's more of an argument for making an exception for Latin, rather than for not marking reconstructed pronunciations. Also, I'm not an expert on Latin but w:Apex (diacritic) claims that the apex was "actually quite widespread during classical and postclassical times" to mark long vowels. --Wiki Tiki 89 14:41, 14 November 2012 (UTC)[reply]

I've always disliked reconstructed pronunciations on Wiktionary as guesswork, albeit scientific guesswork. Of course there's no way of telling if we're wrong or right. Honestly I'd happily get rid of the whole lot, but as far as I know I'm the only person that thinks that. Mglovesfun (talk) 22:06, 14 November 2012 (UTC)[reply]

I don't think that it's correct to refer to it as guesswork. Guesswork implies that there is no basis for it whatsoever, which isn't true. Rather, reconstructions are accurate to a certain amount of uncertainty, depending on each word individually, and are the "best fit" with the given facts. That is why I don't think there is much harm in them - Wiktionary is descriptive, so why not describe what we know? As long as they are not too overly detailed (to account for the margin of error); broad phonemic pronunciations are enough. —CodeCa t 22:20, 14 November 2012 (UTC)[reply]

Also, if you think about it, our RFV process is similarly guesswork: we have a word in a certain context and have to figure out the meaning from that. It doesn't seem that different, really. —CodeCa t 22:23, 14 November 2012 (UTC)[reply]

Not blind guesswork, but based on assumptions which aren't testable. Mglovesfun (talk) 22:38, 14 November 2012 (UTC)[reply]

Are you sure? It is still considered a science, not pseudoscience. I mean, we can't test what is inside the sun, or how far away stars are, nobody has ever actually seen a quark or a dinosaur. But we can infer it based on the evidence we have. Reconstructing words or pronunciations is no different. (And if you want to make it interesting, I have no evidence that suggests you are not a robot :p) —CodeCa t 22:44, 14 November 2012 (UTC)[reply]

Can someone please answer me this: If a given pronunciation is reconstructed, why shouldn't we mark it as such? --Wiki Tiki 89 08:45, 15 November 2012 (UTC)[reply]

Because for any language that isn't spoken today, it's assumed that it is reconstructed. And for a language that is spoken today, we don't include pronunciations. Now if we got into the habit of reconstructing pronunciations for languages that are spoken today, then it would be different. But as it stands now, dead language implies reconstructed pronunciation, so it is somewhat redundant and not really worth the effort of fixing tens of thousands of entries. —CodeCa t 13:37, 15 November 2012 (UTC)[reply]

Well you're wrong about that. Hebrew is spoken today and we do include reconstructed ancient pronunciations for it (whenever we can be bothered to that is). --Wiki Tiki 89 13:47, 15 November 2012 (UTC)[reply]

If you look at it that way, Latin is still spoken today, except we call it French, Spanish, Italian etc. As far as I'm aware, ancient and modern Hebrew are very different from one another, and can't really be considered one and the same language, except that we do because the words are spelled the same in both (I think?). But if we indicate that such pronunciations are ancient (with {{qualifier|ancient Hebrew}} or something) then that already implies that we didn't get it from a native speaker. And I'm sure that we don't want ancient Hebrew pronunciations without such a qualifier, so there is no chance of ambiguity. —CodeCa t 14:23, 15 November 2012 (UTC)[reply]

Ok, in that case, why do we mark words such as Proto-Germanic *jēran with an asterisk if it is already obvious that it is reconstructed since it is Proto-Germanic? --Wiki Tiki 89 14:33, 15 November 2012 (UTC)[reply]

Because it's not already obvious to someone who isn't familiar with the concept of a proto-language, and because that is a long-standing, pretty-much-universally-used convention in references that use reconstructed forms. It's really a matter of where to draw the line: any term (perhaps even senses of some currently-used terms) that became obsolete before pronunciation was documented up to modern standards has a reconstructed pronunciation to some extent, and yet all the dictionaries which are consistent in giving pronunciations will provide the pronunciation for each of those terms. If you decide to mark pronunciations as reconstructed, you have to have clear and plainly-stated criteria as to when to do so, or risk misleading people in cases where you don't mark it. There's just so much gray area which isn't dealt with in references- why bother? Chuck Entz (talk) 15:28, 15 November 2012 (UTC)[reply]

Here's a thought: how about we mark reconstructed pronunciations as such by including that information in the accent? E.g. {{a|reconstructed}} for Old English, or {{a|Classical (reconstructed)}} for Classical Hebrew. —Ruakh_TALK 17:00, 15 November 2012 (UTC)[reply]
Not a bad idea, but what is the advantage of that over asterisks? --Wiki Tiki 89 18:49, 15 November 2012 (UTC)[reply]

Firstly — asterisks have several different meanings. They're used for reconstructed forms, but I've never seen them for reconstructed pronunciations of attested forms; so if I saw your examples outside the context of this discussion, I don't think I'd immediately understand what they're about. (I realize that you included hover-text, but I don't think it would occur to me to look for it.) Secondly — I think putting it in the {{a}} makes it easier to address the possibility of multiple reconstructions. To take your original example — is the OE word reconstructed as having had two different pronunciations? Or are there two competing reconstructions of its pronunciation? In the latter case, they could be on separate lines, treating the two reconstructions as different "accents", e.g. {{a|Hallimer reconstruction}} vs. {{a|Gregson reconstruction}} or whatnot. —Ruakh_TALK 19:37, 15 November 2012 (UTC)[reply]
Now there's an argument I'm willing to concede to. --Wiki Tiki 89 19:58, 15 November 2012 (UTC)[reply]

Odd compound template use

I was looking at à and found this: {{gloss|{{non-gloss definition|used to express something not completed}}}}. It's paradoxically sensical and kinda poetic in that way, but it still looks awful in the code, not to mention that it's confusing as hell. Isn't this usage pretty much what we created {{qualifier}} for (although admittedly for translation sections)? Would using that template be okay by the community or should a forked version be created for use in definitions? Circeus (talk) 20:24, 14 November 2012 (UTC)[reply]

I would just remove {{gloss}} from instances like that. - -sche (discuss) 20:38, 14 November 2012 (UTC)[reply]

My issue is that it seems to me the parentheses in those definitions should be as easily hooked into with CSS as those created by {{sense}}, {{context}} or {{qualifier}}. If you have a gloss, the non-gloss component should be a single unitary template. That's why we created the various italic and/or parentheses templates in the first place!

Note the difference between senses 6, 7 and 9. Sense (6) is probably an appropriate use of {{gloss}} (though it's not currently using it), sense (9) is clearly a non-gloss. Sense (7), however, has a gloss component and non-gloss one. There is no reason for the parentheses of (6) to be templated, but not those in (7). I'm going to use SOMETHING to accomplish that and what I want to know is whether people think {{qualifier}} is appropriate or whether something new should be coined for that case. Circeus (talk) 20:58, 14 November 2012 (UTC)[reply]

You're right, use {{qualifier}}. - -sche (discuss) 21:05, 14 November 2012 (UTC)[reply]

Sorry about the rantiness. I think I've been getting kinda cranky recently...

As a side note, I'll probably try and add some css hooks to {{gloss}}: y'see I've been using the hooks in these templates (context, qualifier, non-gloss definition...) to detect untemplated instances (I use a teal color, so when I see black parenthesized or italicized text, I know there should be a template). Circeus (talk) 21:52, 14 November 2012 (UTC)[reply]

Category:English locatives

Out of curiosity, why isn't this category named Category:English locative adverbs? Isn't that the traditional term for this set of words? Is this maybe a question of theoretical preference? --Pereru (talk) 09:24, 15 November 2012 (UTC)[reply]

Category:Units of measure

Shouldn't this be a subcategory of Category:Metrology? If so, how can I make this happen? Should some category template be changed? --Pereru (talk) 14:43, 15 November 2012 (UTC)[reply]

Never mind, I've just figured out how to make Category:Units of measure have Category:Metrology as one of its parents. --Pereru (talk) 14:46, 15 November 2012 (UTC)[reply]

Category:Entries with translation table format problems

Down to 200 so they all now fit on one page. If a few people could work on this at once, we could have this done in a couple of days. Use [3] if you want to convert Latin to Cyrillic, or vice-versa. Please participate if you can, thank you. Mglovesfun (talk) 11:00, 15 November 2012 (UTC)[reply]

I've looked at a couple of terms, but I don't see what is wrong with their translation tables. What is the problem? --Pereru (talk) 14:49, 15 November 2012 (UTC)[reply]

The ones I looked at seemed to be all with an extra Serbo-Croatian entry placed alphabetically where one would expect Croatian (of course, one could just as easily say that the other entry is the extra one, but Serbian alphabetizes the same as Serbo-Croatian, so it's at least in the right place). Bots can easily substitute language headers, but they're not as good at merging translations from separate lines into one line. Chuck Entz (talk) 15:50, 15 November 2012 (UTC)[reply]

Down to 20. Mglovesfun (talk) 14:36, 18 November 2012 (UTC)[reply]

Using etyl with "-" as language

(The following discussion uses "la" as an example code, but it is analogous for every other code) There are currently many uses like this: {{etyl|la|-}}. The "-" serves to suppress categorization in cases where it is not desirable to treat it as an etymology, so this is basically equivalent to just {{la}}, except slower/more complex and less obvious. I'm not sure what the reasoning behind this usage is, but it doesn't make sense to me. If the point of using "-" is to not treat it as an etymology, why even use {{etyl}}? Why not just write {{la}} instead, or even just "Latin"? Why make {{etyl}} more complex to support such usage? So... should we abandon this practice in favour of using {{la}} or writing out "Latin"? —CodeCa t 14:28, 16 November 2012 (UTC)[reply]

Time was, {{etyl}} would generate a link to the Wikipedia article on a language, except for several dozen major languages for which such links were deemed unnecessary. Eventually, the language-templates were changed in such a way that no languages generate Wikipedia-links, with two exceptions:

There was always an option in WT:PREFS to generate those Wikipedia-links for all languages, using JavaScript. That option still works.
Languages with their own etyl: templates, such as Classical Hebrew, still have Wikipedia-links. For example, {{etyl|hbo|-}} produces Classical Hebrew.

I suggest that we simply change {{etyl}} to always generate those links. If nothing else, it's a helpful visual cue. In fact, we've discussed this before, and I don't think anyone objected. So, I'll make that change shortly, unless someone objects now.

—Ruakh_TALK 15:19, 16 November 2012 (UTC)[reply]

I would object, if nothing else because it is inconsistent with our practice of not linking languages elsewhere. Why are etymologies special? —CodeCa t 20:59, 16 November 2012 (UTC)[reply]

Usually, a language-name is only relevant to someone looking for language content. For example, if I'm looking for a French entry or a French translation, then I just need the French L2 header or the translation-label. I don't need French to be linkified, because I already know what it is, and I don't need other L2 headers and translation-labels to be linkified, because they're not relevant to me. But if I'm reading an English entry, and its etymology says the word came from French, that is relevant to me, and I don't know what it is. (O.K., so obviously in the case of French I do actually know; but sadly, we have no way to distinguish a case like French from a case like Guugu Yimidhirr.) Another case that may be like etymologies in this respect is descendants-sections; I'm not sure. —Ruakh_TALK 23:41, 16 November 2012 (UTC)[reply]

Personally, I rarely bump into a language I'm not at least superficially familiar with. That said, I support automatic 'pedia-linking in etyl as long as there is JS available to disable it if one so chooses once logged in. —Μετάknowledge^{discuss/deeds} 00:59, 17 November 2012 (UTC)[reply]

That is why I asked why etymologies are special. There are many situations where a user would come across a language name and want to know what it is, not just in etymologies. I don't think we should be selective in when we link them, because users will not be selective in where they encounter them; it could be in an English etymology, an English descendants section, a translation table, in an etymology or descendants section of another language (English speakers might visit a Proto-Indo-European page and wonder what Pashto is), or even just as the name of a language section on a page. So we should be consistent and either link none, or link all. I think linking none is the easier option because that's what we already do. —CodeCa t 01:05, 17 November 2012 (UTC)[reply]

I'm confused. Do you disagree with all of what I wrote, or just part? If just part, then — which part? —Ruakh_TALK 02:40, 17 November 2012 (UTC)[reply]

I agree with your reasoning about providing information when users encounter unfamiliar terms, but I don't agree with the conclusion that we should linkify languages in (only) etymologies because of that. —CodeCa t 02:44, 17 November 2012 (UTC)[reply]

Support enabling linking in {{etyl}}. But I find it worth noting that links are not the only way to navigate wikis (there is also a search bar, you may have noticed it). --Wiki Tiki 89 08:46, 17 November 2012 (UTC)[reply]

Using {{etyl|la}} (or at least {{la}}) rather than spelling out the language name also makes it a lot easier to change the name of a language later, without having to wade through thousands of pages by hand or try to write code that can tell an instance that should be changed from one that shouldn't be. - -sche (discuss) 08:41, 20 November 2012 (UTC)[reply]
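The maintenance point about using a code rather than a spelled-out name can be illustrated with a small sketch. It is not the implementation of {{etyl}}; the table contents, the function name and the regular expression are assumptions made for illustration only. The idea is that the display name lives in one place, so renaming a language is a single edit to the table.

```python
import re

# Hypothetical code-to-name table (an assumption for this sketch);
# on the wiki the names are stored in the language-code templates.
LANGUAGE_NAMES = {
    "la": "Latin",
    "fr": "French",
    "stq": "Saterland Frisian",
}

# Matches {{etyl|CODE}} and {{etyl|CODE|anything}}; simplified on purpose.
ETYL_PATTERN = re.compile(r"\{\{etyl\|([a-z-]+)(?:\|[^{}|]*)*\}\}")


def expand_etyl(wikitext: str) -> str:
    """Replace each {{etyl|code}} call with the language name from the table.

    Unknown codes are left untouched so that nothing is silently lost.
    """
    def replace(match: re.Match) -> str:
        return LANGUAGE_NAMES.get(match.group(1), match.group(0))

    return ETYL_PATTERN.sub(replace, wikitext)
```

Changing the value stored under "stq" would change every rendered occurrence at once, which is the advantage described above.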

Template:frs

In the first of two previous discussions of {{frs}}, all participants assumed that {{frs}} and {{stq}} referred to the same thing, and debated merging them. The second discussion investigated whether {{frs}} referred to a Frisian lect or a Low German one: ultimately, it seems it's unknowable. Fortunately, we don't have to use Template:frs at all:

After several discussions, all Dutch Low German dialects were merged into {{nds-nl}}, all German Low German dialects merged into {{nds}}. (We'll have a discussion elsewhere later about whether to rename nds {{nds-de}}.) If {{frs}} referred to a Low German dialect, it has been subsumed into {{nds}} and should be deleted. OTOH, if it refers to a Frisian lect—which is how we've always actually used it—it should be deleted for being both largely redundant to {{stq}} and entirely too ambiguous.

What I want to know is what to do with the small number of translations of it which cannot easily be switched to Saterland Frisian and stq. I have two suggestions:

convert them to {{stq}}, and let stq stand also for the non-Saterlandic kinds of East Frisian that existed in the past, the way we let {{he}} stand also for Biblical Hebrew, or
create and use an unambiguous exceptional code in its place, something like Template:gmw-efr or Template:gmw-fre.

- -sche (discuss) 10:28, 17 November 2012 (UTC)[reply]

I have edited six of the seven entries which were formerly in Category:Terms derived from Eastern Frisian: the Latvian entries āķis, brūns, lode, šķīvis, tornis and the English entry clever. If we decide to design an exceptional code for non-Saterland East Frisian, I'll edit those entries again. I have not edited breeze.

A number of appendices listed Seeltersk forms but used the code frs; I have changed these to stq. Approximately 145 pages still use frs. - -sche (discuss) 22:39, 21 November 2012 (UTC)[reply]

Special:AbuseFilter/9

Currently, any time a new editor makes a vandalistic edit with so-called "bad words" in it (generally "penis", although there is somewhat of a variety), the edit is marked by AbuseFilter 9 with the tag (bad-word), but the edit isn't touched. It takes a human patroller to revert the edit for the vandalism to go away, which is ridiculous, because this is something that even a machine can spot. Given the extremely low false positive rate (I have yet to see any edits which actually added a definition, translation, or any other feature that this filter stopped), I advocate that we set this filter to prevent new users from being able to make those edits in the first place. Can we get consensus for this change? —Μετάknowledge^{discuss/deeds} 02:28, 19 November 2012 (UTC)[reply]

What if it's a legitimate use of a bad word? —CodeCa t 02:32, 19 November 2012 (UTC)[reply]

Can't we personalize the message the editor gets when the edit fails to go through? It could mention using WT:ID (or the relevant talkpage) to request the edit. Such good naughty edits are so rare that to do so would not represent much of a strain on us, while allowing good faith editors to do their work. —Μετάknowledge^{discuss/deeds} 03:34, 19 November 2012 (UTC)[reply]

Oh, and Special:AbuseFilter/12 as well. I can't find any false positives there. —Μετάknowledge^{discuss/deeds} 05:11, 19 November 2012 (UTC)[reply]

I have actually seen at least two false positives here over the past months, but I can't remember any details. I would leave things as they are. SemperBlotto (talk) 08:28, 19 November 2012 (UTC)[reply]

Rather than jumping straight from "allowed" to "forbidden", we might try the intermediate step of "allowed with warning" (using the "Trigger these actions after giving the user a warning" checkbox). —Ruakh_TALK 14:37, 19 November 2012 (UTC)[reply]

IIRC I copied this filter from enWP with the single change that forbidding the action was changed to merely tagging it for the simple reason that we allow all vulgarity in citations (even of non-vulgar words). I don't think forbidding is a good idea. Warning sounds okay.—msh210℠ (talk) 18:15, 19 November 2012 (UTC)[reply]

True — and obviously I agree with the warning — but still, I'd be suspicious of an edit that replaced something with a vulgar quotation. (The filter specifically targets edits where the page-size after the edit is only slightly larger than the page-size before the edit, so adding a complete quotation should never trigger it, unless something else is removed at the same time.) —Ruakh_TALK 01:49, 20 November 2012 (UTC)[reply]

Good point. Maybe forbidding is indeed the right thing to do. Still, IMO warn awhile and see how it goes, as you said.—msh210℠ (talk) 14:37, 20 November 2012 (UTC)[reply]

@All: How does warning for AF 9 and forbidding for AF 12 sound? —Μετάknowledge^{discuss/deeds} 00:55, 20 November 2012 (UTC)[reply]

Good to me.—msh210℠ (talk) 14:37, 20 November 2012 (UTC)[reply]

I would support blocking edits in mainspace that lack a level 2 or level 3 header. —CodeCa t 19:50, 20 November 2012 (UTC)[reply]

Hm, make an exception for pages containing {{only in}} or {{only-in}}, though; there are thousands of L2- and L3-less such pages, and it's not wrong to create more (we probably still don't have all the SI compounds). - -sche (discuss) 21:24, 20 November 2012 (UTC)[reply]

We could make an exception, but are such pages ever created by new users? I think that the set of users who leave out L2 and L3 headers, and the set of users who add only-in entries are entirely distinct. —CodeCa t 21:26, 20 November 2012 (UTC)[reply]

I oppose blocking page creations whose only problem is formatting. Quite a few good entries were written by newbies, often IPs, who didn't know our formatting: the entries needed cleaning up, of course, but we'd lack them were the edits blocked. (We do already tag some such edits, using abuse filters 1 (tag no-L3) and 15 (tag bad-lede).)—msh210℠ (talk) 04:41, 22 November 2012 (UTC)[reply]
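For readers unfamiliar with the header levels under discussion, the check proposed above (with the {{only in}} exception) might look roughly like the following sketch. It is not the code of any actual abuse filter; the function name and the regular expressions are assumptions, and "lacking a level 2 or level 3 header" is read here as "missing either one".

```python
import re

# A level 2 header is a line such as ==English==, a level 3 header ===Noun===.
L2_HEADER = re.compile(r"^==[^=\n]+==[ \t]*$", re.MULTILINE)
L3_HEADER = re.compile(r"^===[^=\n]+===[ \t]*$", re.MULTILINE)

# Pages that use {{only in}} or {{only-in}} legitimately have no headers.
ONLY_IN = re.compile(r"\{\{only[ -]in\b")


def lacks_required_headers(wikitext: str) -> bool:
    """Return True if the page text would be caught by the proposed check."""
    if ONLY_IN.search(wikitext):
        return False  # exception suggested in the discussion
    has_l2 = L2_HEADER.search(wikitext) is not None
    has_l3 = L3_HEADER.search(wikitext) is not None
    return not (has_l2 and has_l3)
```

Whether a positive result should block, warn or merely tag the edit is exactly the policy question debated above; the sketch only covers detection.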

For info, the commonest "rude word" used by vandals is poop by a long shot, not penis. Equinox ◑ 20:05, 22 November 2012 (UTC)[reply]
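The behaviour described earlier in this thread (a new editor, a "bad word" added, and a page that grows only slightly) can be sketched as below. This is an illustration only, not the rule set of AbuseFilter 9: the word list, the growth threshold and the function name are all assumptions.

```python
import re

# Hypothetical word list and threshold, chosen only for this sketch.
BAD_WORDS = re.compile(r"\b(?:poop|penis)\b", re.IGNORECASE)
MAX_GROWTH = 50  # characters; "only slightly larger" is not quantified above


def flags_edit(old_text: str, new_text: str, is_new_user: bool) -> bool:
    """Return True if the edit matches the pattern described in the thread."""
    if not is_new_user:
        return False
    growth = len(new_text) - len(old_text)
    # A full quotation adds a lot of text, so large growth is not flagged.
    if growth <= 0 or growth > MAX_GROWTH:
        return False
    # Flag only if the edit increases the number of bad-word occurrences.
    before = len(BAD_WORDS.findall(old_text))
    after = len(BAD_WORDS.findall(new_text))
    return after > before
```

Under these assumptions, adding a complete vulgar citation would pass because of the size limit, while replacing existing text with a similarly sized vulgar passage would not be caught by the size test alone, which is the caveat raised above.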

link duplications

Is it unwanted to have a disambiguation at the top and an alternative form section that carry the exact same spelling? (See praetor.) --Æ&Œ (talk) 04:27, 19 November 2012 (UTC)[reply]

I'm not sure what you're referring to. If you mean the "see also" at the top, that's not specific to any language- it's to help people who may have entered something similar to what they want in the search box, for instance. The alternative forms section within the entry is specific to the language: if I'm looking at the English entry, I may not care if there are alternative forms for Latin-speakers to investigate. Chuck Entz (talk) 04:40, 19 November 2012 (UTC)[reply]

1. --Æ&Œ (talk) 05:51, 19 November 2012 (UTC)[reply]

No it's not unwanted. Mglovesfun (talk) 10:40, 20 November 2012 (UTC)[reply]

Highlighting current senses for highly polysemic terms with many rare/obsolete senses

I detect a conflict in sense-listing between historical and common-usage principles; that is, the historical principle is that senses ought to be listed by date of first attestation (to show a word's historical, and also usually logical, development), whereas the common-usage principle is that senses ought to be listed in order of decreasing currency (i.e., from most to least common in corpora of the contemporary language — intended, amongst other things, to save a given dictionary-user time by maximising the probability that the first sense(s) listed are the one(s) that he's looking for). For monosemic and most oligosemic terms, these principles don't, in practice, conflict; however, for highly polysemic terms with many rare and/or obsolete senses, these principles can (and sometimes do) conflict.

I believe that the historical principle should dictate the order of senses; however, that shouldn't mean that we have to neglect completely the value of highlighting the (likely) most sought-after senses. To that end, I propose that we create and use a template (call it {{currentsense}} or {{mainsense}} or suchlike) to enclose the entire definition-lines of the few current senses in the entries of terms that are highly polysemic and that have many rare and/or obsolete senses. That template would highlight the text of the senses that its transclusions enclose by emboldening it, giving it a distinctive background colour, or by displaying it in a larger font size. This highlight could be toggleable by a button in the sidebar similar to the Show/hide quotations toggle. Does this sound like a good idea? Has such a thing been proposed and/or attempted before? I'm so meta even this acronym (talk) 12:47, 19 November 2012 (UTC)[reply]

I very much like this idea. --Wiki Tiki 89 15:09, 19 November 2012 (UTC)[reply]

After three days, having received one response — and a very positive one at that — I've gone ahead and created {{currentsense}}. It's a very simple template; the entirety of its code is: <font size=3>{{{1}}}</font> — is it satisfactory? I don't know how to give it toggleability; could someone who does know, please do so? I'm so meta even this acronym (talk) 13:56, 22 November 2012 (UTC)[reply]

I like the idea of making common senses more prominent, but I oppose listing senses in historical order if the original sense is now obsolete or very uncommon/archaic. Ideally, the first sense listed should always be a common/current one. —CodeCa t 14:37, 22 November 2012 (UTC)[reply]

Put me firmly with CodeCat on this. Such an approach is only acceptable in dictionaries (e.g. OED) where the explicit aim is to have senses listed in chronological order. Although we do include obsolete and historical senses, this is not how the consensus (as far as I understand it anyway) on Wiktionary stands. Circeus (talk) 20:23, 22 November 2012 (UTC)[reply]

There actually is no consensus on this and it has been debated many times. What I like about this suggestion is that it is a compromise between the two that seems like it will work quite well. Even if obsolete senses are listed first, the highlighted senses will still stand out more. --Wiki Tiki 89 09:28, 24 November 2012 (UTC)[reply]

I oppose highlighting some senses by means of a larger font or by means of color. I oppose embracing some senses with a template. --Dan Polansky (talk) 11:21, 24 November 2012 (UTC)[reply]

trihalomethanes

Needs fixing. I put it here in case there is a Bot out there that needs fixing too. -- ALGRIF talk 14:03, 19 November 2012 (UTC)[reply]

Looks like it was a temporary glitch that happened at the same time the edit was saved. I did a null edit and the problem went away.Chuck Entz (talk) 14:18, 19 November 2012 (UTC)[reply]

Wiktionary Day: tenth anniversary

WD10 is nigh. How about putting suggestions forward for a way to commemorate this? Possibly a new competition, new project, renaming the project, publishing the news on Wikimedia, awards for the greatest contributors, revisiting the oldest pages, new banner, Hunger-Games style reaping of the youngest users, new logo, or something like that. --Adding quotes (talk) 21:45, 19 November 2012 (UTC)[reply]

Can't we just delete the main page for 24 hours? DTLHS (talk) 21:47, 19 November 2012 (UTC)[reply]

I'm working on that. --Adding quotes (talk) 21:53, 19 November 2012 (UTC)[reply]

We could add some new languages to WT:STATS. — Ungoliant ^(Falai) 21:58, 19 November 2012 (UTC)[reply]

We can think bigger than that, comrade. --Adding quotes (talk) 22:03, 19 November 2012 (UTC)[reply]

I've just brought Plautdietsch up to snuff, though I agree with AQ/WF. I've also set a commemoration for December 12th's WOTD (feel free to suggest better words; I wanted to use something like yearday, but the relevant "anniversary" sense of yearday doesn't seem to meet CFI). - -sche (discuss) 22:22, 19 November 2012 (UTC)[reply]

It's not eierlegende Wollmilchsau is it? --Adding quotes (talk) 22:37, 19 November 2012 (UTC)[reply]

Ha, given our mission creep (we have a phrasebook now! and placenames!) that might be apt. - -sche (discuss) 22:46, 19 November 2012 (UTC)[reply]

Another thing we could do is to create a brag list of the best things we've come up with. Like, if we have the best web resource for e.g. Italian, it'd be good to announce it to the world. We could try to tie in the decade with reaching 1000 entries in Category:English vulgarities too. I'd be keen on playing a part in that. --Adding quotes (talk) 23:33, 19 November 2012 (UTC)[reply]

Hi again, W.F. I am thinking about a new motto. How about ‘Wiktionary: the nepotistic, unchecked, bullshit dictionary that anybody can afford.’ --Æ&Œ (talk) 00:23, 20 November 2012 (UTC)[reply]

Ideas I like and/or will help with: new competition, announcement to W(M)F, new logo, adding trolly easter eggs around the site in celebration. By the way, the input of anyone familiar with foreign languages is requested at Wiktionary talk:Foreign Word of the Day/Nominations#Wiktionary Day. —Μετάknowledge^{discuss/deeds} 00:52, 20 November 2012 (UTC)[reply]

ё in Russian

We seem to use the letter ё in Russian entry titles, e.g. [[ёж]]. Why? I was under the impression that ё is used only in learners' materials and maybe children's books and that normal written Russian just uses е. So shouldn't we treat this like the macrons in Latin and Old English, and use it in headword lines but not entry titles? WT:About Russian doesn't address this issue. —An gr 16:46, 22 November 2012 (UTC)[reply]

I think that modern Russian has admitted use of the letter in all circumstances, but that actual usage is still somewhat divided between e and ë. —CodeCa t 17:47, 22 November 2012 (UTC)[reply]

For what it's worth, the Russian Wikipedia uses ё in at least some article titles, e.g. w:ru:Ётуны, w:ru:Бёрдо. - -sche (discuss) 20:25, 22 November 2012 (UTC)[reply]

@-sche. The Russian Wikipedia made letter ё mandatory. --Anatoli ^{(обсудить}/^вклад) 21:25, 22 November 2012 (UTC)[reply]

Using ё is considered formal, and it is always used in Russian dictionaries; it's the standard way for dictionaries. It's OK to write texts for adult native speakers both ways - using ё or replacing the letter with е (with no effect on pronunciation). Also, the letter ё is written to distinguish homographs (все and всё) and, as Angr said, in learners' materials and children's books. WT:About Russian needs to have some info on this. The currently established practice is to have a redirect from the spelling with е to the entry with ё (веревка -> верёвка), unless the е spelling is also a different Russian word or a word in a different Cyrillic-based language: самолет (bg) and самолёт, береза (uk) and берёза.

There were many attempts to make the letter mandatory and other attempts to remove it from the alphabet altogether. Note that Belarusian ё is mandatory, so is Tajik, etc. --Anatoli ^{(обсудить}/^вклад) 20:55, 22 November 2012 (UTC)[reply]
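The redirect practice described in this comment can be sketched as follows. The function names and the way existing titles are passed in are assumptions for illustration; this is not how any bot on the wiki is known to work.

```python
import unicodedata


def fold_yo(title: str) -> str:
    """Return the title with ё/Ё replaced by е/Е (the 'relaxed' spelling)."""
    # Normalise first so that е + combining diaeresis is treated as ё.
    title = unicodedata.normalize("NFC", title)
    return title.replace("ё", "е").replace("Ё", "Е")


def redirect_plan(yo_title: str, existing_titles: set):
    """Return (redirect_from, redirect_to), or None if no redirect applies.

    No redirect is proposed when the title has no ё, or when the е spelling
    already exists as a page of its own (another Russian word, or a word in
    another Cyrillic-script language such as Bulgarian самолет).
    """
    plain = fold_yo(yo_title)
    if plain == unicodedata.normalize("NFC", yo_title):
        return None
    if plain in existing_titles:
        return None
    return (plain, yo_title)
```

For example, with no existing page at веревка the sketch proposes a redirect from веревка to верёвка, while самолёт gets none if самолет already exists as a Bulgarian entry.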

ё is a letter in Russian. This is similar to the use of E instead of É in French : accented capitals are normal, and accents are part of the spelling, but the writing is often simplified, especially because of keyboards, as they propose é but not É, etc. I think that soft redirects explaining the issue would be appropriate. Lmaltier (talk) 22:40, 27 November 2012 (UTC)[reply]

I would support moving all entries with 'ё' to the equivalent with 'е', but keep 'ё' in headwords on the page itself (just like Latin macrons and Hebrew nikkud). Note that Belarusian and Tajik are completely irrelevant here, as is the policy on other projects such as the Russian Wikipedia. The usage of 'ё' is analogous to nikkud in Hebrew, in that it is used in dictionaries (along with stress marks), in children's books, in certain formal contexts, and for disambiguation, but it is not used in the vast majority of written text, and that is what we should base the decision on. --Wiki Tiki 89 23:11, 27 November 2012 (UTC)[reply]

I was meaning soft redirects from the word without ё to the word with ё. Following the majority of texts is not always a good argument, it all depends on the reason for the omission. What is this reason? This issue has nothing to do with macrons in Latin, as macrons don't belong to the spelling. In the present case, ё is a specific letter in Russian, not the same letter as е, belonging to the Russian alphabet, taught as such in Russian schools and used as such in dictionaries. Letters with stress marks don't belong to the Russian alphabet. We should follow the Russian wiktionary: ru:самолёт. I feel that it's the same case as É in French. And, in French, the right spelling is clearly with É for some words (e.g. Écrammeville, a village) even if the word is almost always written with E on Internet for practical reasons, and is never used in books. Lmaltier (talk) 23:02, 29 November 2012 (UTC)[reply]

I strongly oppose moving entries with 'ё' to the equivalent with 'е'. It is an established practice to have words with 'ё' as the main and correct spelling, and entries with 'е' as alternatives, in dictionaries and encyclopedias (including Wikipedia). That's the established practice in the Russian Wiktionary (you may want to check with administrators there) as well. The Russian Wikipedia adopted a rule to "ёфицировать" ("yoficate") all articles (change all occurrences of 'е' to 'ё' when appropriate), not just the titles but all the contents. There is some similarity with the Arabic and Hebrew vocalisation symbols tashkil and nikud but this is not the same, it's more like Arabic letters hamza, which is used more often, e.g. أ and إ is used more often on non-elidable alif (ا), not just in dictionaries but in informal writings (excluding Egypt where they have different practices). Adding word stresses to Russian words (and writing 'ё' throughout) would be an equivalent of the Arabic tashkil or Hebrew nikud.

Russian ё is a letter of the Russian alphabet (not a letter with diacritic) (just like й is a letter, not и with a diacritic), which has 33 letters, without it, it would be 32 letters. There are many advocates of making the letter mandatory and there was a period when that was the case. Adult native written texts can have both 'ё' and 'е', even though the latter is more common. Besides, writing formally is educational and we are educating people. Having a usage note about spelling or having full-blown entries rather than redirects at 'е' is an option I might agree on. Also, I think the choice should also be with people who actively work with Russian, not people who only have casual interest. Russians don't consider words with 'е' as alternative forms of words with 'ё', like English cafe vs café. Conceptually, the letter is 'ё' in берёза, even if people may choose to write береза because of the big difference in pronunciation. It's more like ى in the final position, when ي is meant (the former is conceptually a different letter with a different pronunciation but relaxed spelling allows using the former when the latter is meant (especially Egypt)). The big difference though, is that Arabic ى is not part of the Arabic alphabet, but ё is part of the Russian alphabet. --Anatoli ^{(обсудить}/^вклад) 23:28, 29 November 2012 (UTC)[reply]

I think you are incorrectly assuming that nikkud are used the same way as tashkil. Theoretically, they serve exactly the same purpose, but in reality Hebrew nikkud are much more common, probably even more common than Russian "ё". You find full or partial nikkud very frequently on billboards, street signs, logos, etc., which I suspect is much more common than Russian "ё". --Wiki Tiki 89 11:57, 6 December 2012 (UTC)[reply]

Greek high frequency words?

The Greek list is based on frequent subtitle words. Does anyone know if there is a Greek frequency list available that is the equivalent to the Longman Corpus Network for English? — This unsigned comment was added by Forevergreece (talk • contribs) at 21:07, 24 November 2012.

"Rico Suave" and 'vandalism'

An entry I created which directly linked to the definition from an external source, was called "vandalism". I'd like to know why it is considered vandalism, since the admin who deleted it has not responded to my inquiry as to why it is considered vandalism, or why it qualifies for speedy deletion, after being restored at WT:RFD#Rico_Suave. The linked discussion is WT:RFV#Rico_Suave. SemperBlotto (talk • contribs) and Metaknowledge (talk • contribs) both considered it speediable, while Metaknowledge called it vandalism. The linked to definition source is

The Australian, "'Suave bandit' has day in court", Sallie Don, 4 February 2011
The phrase "Rico Suave" has a number of origins, being slang in the US for a cool, confident Latino and ladies' man who dresses sharply
(select that phrase and do a find on the webpage)

which was what the entry said before being deleted. I can see that you might want examples of usage, but I cannot see how this qualifies as vandalism, or being speedily deleted.

Can someone explain this situation?

-- 70.24.250.26 09:01, 25 November 2012 (UTC)[reply]

Do we need a separate discussion for this? This isn't a 'policy' matter, I think we should keep this all in one place at WT:RFV#Rico Suave. Mglovesfun (talk) 11:52, 25 November 2012 (UTC)[reply]

While I agree that it's better to keep the discussion at rfv, I definitely think that alleged admin misconduct is a policy matter. That said, although I would never use such strong language myself and consider the premature deletion ill-advised, it doesn't seem to me quite up to the level of actionable misconduct- more a matter of simple courtesy and professionalism (for lack of a better word). Chuck Entz (talk) 15:56, 25 November 2012 (UTC)[reply]

Good point. I thought Rico Suave merited a full RFV with up to 30 to cite it. It doesn't seem to me to be so improbable as to merit immediate deletion. Mglovesfun (talk) 16:18, 25 November 2012 (UTC)[reply]

What "strong language" did I use exactly? I ran a Google Books search, you know, before deleting, and I didn't find anything of use in citing the entry. Seeing that, combined with an anonymous editor (unfortunately, anons are associated with vandalism) and the suspicious fact that the same editor was adding the same thing to Wikipedia (a common vandalistic method, remember the RFV we had for a term in the fake Ança language that had a similar pattern?) I thought it was a perfect candidate for speedy deletion. The anon seems not to have been aware of the WT:CFI, trying to use a reference as a citation, but he's dredged up enough cites over at the RFV that I assume at least three good ones are in there. I'm not a professional vandal-whacker and I do make mistakes. I think I failed to distinguish vandalism and good-faith efforts in this case, but as so much vandalism is composed of good-faith efforts, I don't think you can do much better than some solution where we make some false positives, but at least we don't RFV everything questionable that comes in. I'm glad that the anon decided to defend his entry, and thus saved us an entry, but I sincerely hope that this doesn't turn into a witchhunt against patrollers. —Μετάknowledge^{discuss/deeds} 16:31, 25 November 2012 (UTC)[reply]

I was looking for an explanation as to why it qualified as vandalism. I would have thought the reference supplied would have at least meant it would travel through a deletion process instead of being speediable, as the supplied reference provided the definition used, and a claim that it is slang found in use, neither of which were creations of my own (therefore not made up in my own mind). (also, the Australian is a dead tree reference broadsheet newspaper) Also, the edit comment saying that I was vandalizing w:Rico Suave and that it needed administrator attention was not friendly. -- 70.24.250.26 08:27, 26 November 2012 (UTC)[reply]

Why not work on A free translation software?

Source: http://he.wiktionary.org/wiki/%D7%95%D7%99%D7%A7%D7%99%D7%9E%D7%99%D7%9C%D7%95%D7%9F:%D7%9E%D7%96%D7%A0%D7%95%D7%9F#.D7.9C.D7.9E.D7.94_.D7.9C.D7.90_.D7.A2.D7.95.D7.91.D7.93.D7.99.D7.9D_.D7.A2.D7.9C_.D7.AA.D7.95.D7.9B.D7.A0.D7.AA_.D7.AA.D7.A8.D7.92.D7.95.D7.9D_.D7.97.D7.95.D7.A4.D7.A9.D7.99.D7.AA.3F

Translated by Google:

I'm talking about software like: WordPoint (if possible with the OCR as Babylon - all the better). Thus, it will be possible to download a file with the vocabulary free, and use translation software to translate as normal. Only the translation will also be linked if required On-Line, the information center Wiktionary adding values.

For example:

You can put buttons X or V under translation. So that clicking on the V - validate the proposed Translator. Clicking on the X - deny the proposed translation. Thus the formation of proper statistics, we can bring the many translations of the many values.

You can also make a button "Donate" translation - so that the user does not even have to log Wiktionary to offer translations, but everything is done through software linked Wiktionary Information Center.

What do you say? — This unsigned comment was added by 46.120.14.81 (talk) at 23:27, 25 November 2012.
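The V/X buttons proposed above amount to collecting approve and reject counts per translation and ranking by them. A minimal sketch of that bookkeeping follows; the class and method names are hypothetical, and nothing here reflects existing Wiktionary or WMF software.

```python
from collections import defaultdict


class TranslationVotes:
    """Tally of V (approve) and X (reject) clicks per proposed translation."""

    def __init__(self):
        # (source phrase, translation) -> [approvals, rejections]
        self._votes = defaultdict(lambda: [0, 0])

    def vote(self, source: str, translation: str, approve: bool) -> None:
        """Record one click: V if approve is True, X otherwise."""
        counts = self._votes[(source, translation)]
        counts[0 if approve else 1] += 1

    def ranked(self, source: str):
        """Return (translation, approvals, rejections), best net score first."""
        rows = [
            (translation, up, down)
            for (src, translation), (up, down) in self._votes.items()
            if src == source
        ]
        return sorted(rows, key=lambda row: row[1] - row[2], reverse=True)
```

Ranking by net score is only one possible choice; a real service would probably also need to guard against repeated votes from the same user.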

I think this could be a good idea. But should this be a software or a service? From a service (similar to Google Translate), the operators (WMF) could collect statistics and feedback more efficiently than from a free software. --LA2 (talk) 01:37, 26 November 2012 (UTC)[reply]

Google Translation is available only Online. By a free software we achieve both goals: Offline Translation and Statistics+Feedback.

Bilingual offline dictionaries for some languages are already available here in ding-format (essentially ASCII) and dictd-format. (talk) 09:04, 12 December 2012 (UTC)[reply]

Are these dictionaries available by Mouse Hovering/Click? Software has many advantages without a doubt.

Wiktionary:Votes/2012-11/Bureaucrats and de-privving

A new vote, on whether to retain a configuration change that was made a few years ago without our say-so. (A meta user has been campaigning heavily for this change to be revoked on all projects that didn't explicitly ask for it, so I think we now have to make an explicit decision on the subject.) —Ruakh_TALK 00:28, 27 November 2012 (UTC)[reply]

Thanks for setting up the vote! It has now started. - -sche (discuss) 20:48, 4 December 2012 (UTC)[reply]

proto-indo-european word spellings

While I know these words are reconstructed and thus not real, I was still wondering what the best way to write out the words is. Specifically, I'm talking about the discrepancy between certain spelling conventions used for some of them. Some have an initial "h₁" while others don't, and just begin with the vowel. For example *oḱtṓw as the entry on here doesn't, but under the Wiktionary List of Proto-Indo European roots it's listed as "*h₁oḱtṓw". While these are essentially the same, which way is best to use universally here so there's consistency between etymologies and reconstructed words/appendices? On there, a word for "fire" is listed as "*h₁eh₂-ter-", while on another appendix, the list of Proto-Indo-European nouns it's written as "*eh₂ter-", and many of the links in wiktionary go to this form (none of the pages are created yet though). Same for "*h₁eǵʰs" vs. "*eǵʰs". Just wanted to know so the dictionary can be made more uniform. I guess both forms can be listed in the etymologies, but which should actually be created? So far, there are some existing pages that use a preceding h₁ and others that don't, so there isn't a standard to base it on. Wikipedia seems to prefer using forms with h₁ in most cases. Word dewd544 (talk) 21:48, 27 November 2012 (UTC)[reply]

We do have a standard spelling outlined at WT:AINE, but that doesn't help much here. I'd say that a spelling that has a source is better than a spelling that doesn't. But I'd also say that modern up-to-date sources are better than old outdated ones (like Pokorny). The consensus among linguists is that most roots began and ended with a consonant. However, it could be tempting to assume that all of them do, when there is no evidence to support it other than "other roots begin with a consonant too". That is of course a fallacy and partly circular reasoning. If you were to take only those roots and words that we know for sure (through direct evidence in descendants or related forms) that they began with a consonant, you'd still be left with quite a substantial amount of roots and words for which we can't be sure. So the answer is really "we don't know", and both forms are potentially correct. —CodeCa t 22:30, 27 November 2012 (UTC)[reply]

For the case in point, there are even more options: I've often seen the PIE 'eight' word reconstructed as *h₃eḱtṓw, on the argument that all the other numerals from 3 to 10 take e-grade. —An gr 17:39, 28 November 2012 (UTC)[reply]

Yeah I figured there would be many ways to reconstruct them. And as far as sources, I've seen many divergent ones; I can of course tell what word they're referring to but in many cases they use slightly different ways of expressing it. The e-grade vs o-grade is another issue; wasn't sure whether to make the word for "eye" one or the other, but since most etymologies on here linked to the e-grade I used that (though apparently many descendants probably came through the form "h₃okʷ-"). I've noticed that some exclude the initial neutral laryngeal for forms or roots that were later derivations of the original PIE one, for occasions between the proto-[something] family and PIE proper, sometimes described as pre-[whatever]. And I agree that most linguists seem to agree that most started with some kind of consonant, so I guess I'll go with including the h₁ laryngeals when there's a toss-up between them, and mention the other form alongside it possibly. Word dewd544 (talk) 03:52, 30 November 2012 (UTC)[reply]

I think the most important thing is not which spelling one chooses to create a PIE page with -- there are cases with good arguments for variant spellings, in some cases we'll have to choose more or less subjectively -- but to make sure that the links to said pages are to the right word. So, if the reconstructed form adopted here is *h₃eḱtṓw but your source happens to have a different spelling, I think you should keep the original spelling but make sure it links to the *h₃eḱtṓw page -- there are, after all, simply different theories on what exactly the very same word looked like in PIE times. --Pereru (talk) 16:34, 8 December 2012 (UTC)[reply]

Category:English phrasebook/Needs

Do we really want a few thousand more of these - do we need more than one? — Saltmarsh^{απάντηση} 07:07, 28 November 2012 (UTC)[reply]

WE don't need any at all. Nuke the entire phrasebook. SemperBlotto (talk) 08:18, 28 November 2012 (UTC)[reply]

Please stop attacking the phrasebook. If you're not interested in the project, just stay away from it. --Anatoli ^{(обсудить}/^вклад) 02:45, 30 November 2012 (UTC)[reply]

Having only one would be completely useless. How large the phrasebook should be has yet to be determined, but I doubt it will end up with thousands of phrases beginning with "I need". --Yair rand (talk) 08:21, 28 November 2012 (UTC)[reply]

I agree with SemperBlotto. — Ungoliant (Falai) 08:41, 28 November 2012 (UTC)[reply]

Personally, I think the phrasebook is useful but does not really belong in the main namespace. I think we should move it to the Appendix. But I agree that having thousands of phrases beginning with "I need" would be useless. I think we should look at real phrasebooks in print and have the same sorts of entries. --Wiki Tiki 89 08:47, 28 November 2012 (UTC)[reply]

What if there's another Wikimedia project, called "Wikiphrase(book)", that could accommodate our phrasebook? --Lo Ximiendo (talk) 08:50, 28 November 2012 (UTC)[reply]

Why would we move it to a separate project? On this project, we already have translation templates and associated scripts (translation adder, targeted translations), pronunciation/dialect templates, a bot that adds audio from Commons... I'd say the phrasebook overlaps quite well with Wiktionary. --Yair rand (talk) 08:59, 28 November 2012 (UTC)[reply]

We can't move the entire phrasebook to the appendix namespace, because some of the entries in it are idiomatic, CFI-compliant terms. For example goodbye. Concerning the "needs": I think that it would make more sense to give translations with the term being asked for left out and replaced with "..." and a grammar note where applicable. So it would become just one entry I need ... and one of its translations would be for Dutch: ik heb ... nodig. We may want to distinguish "I need ..." from "I need a ..." though, as this can differ between languages. —CodeCa t 16:53, 28 November 2012 (UTC)[reply]

goodbye isn’t in the phrasebook. — Ungoliant (Falai) 20:04, 30 November 2012 (UTC)[reply]

I too agree with SemperBlotto - since the phrasebook is in our space we're entitled to take a view. If every possible phrase is included, there could be quite a few thousand "need" entries under the policy advocated. But then we shall need a few thousand more "I would like ", and then "please may I have " - the combinations could be endless. — Saltmarsh (απάντηση) 05:38, 30 November 2012 (UTC)[reply]

We could delete a few. I would leave I need a dictionary, I need an interpreter and I need a doctor, and maybe some more; any new entry with "I need ..." would be deleted. I'm worried about the quality of the phrasebook and I like the project, so a cleanup is in order. Actually most of these phrases reflect what printed phrasebooks may have. We won't have thousands of entries, but entries that are too similar might be redundant. --Anatoli (обсудить/вклад) 05:46, 30 November 2012 (UTC)[reply]

I would leave I need water, I need food, I need clothes, I need shelter and maybe I need money, and perhaps add I need light and I need help. I'm not sure about I need fire, but something along those lines is needed too, IMO. --biblbroks (дискашн) 22:42, 30 November 2012 (UTC)[reply]

Maybe I need some fire instead of I need fire. --biblbroks (дискашн) 22:44, 30 November 2012 (UTC)[reply]

Is there really a need to argue for separate entries for each of them, rather than a common one as I described above? The point of the phrasebook is to describe, in simple terms, how to say something in another language. If we can give a simple description of how to insert any word into the template "I need ...", including a short note about the grammar, then doesn't that do the job too? —CodeCa t 22:53, 30 November 2012 (UTC)[reply]
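As an illustration of the single-entry idea described above, here is a minimal Python sketch. It is not part of any existing Wiktionary template or script: the `ENTRY` dictionary and the `fill` helper are hypothetical names, and only the English and Dutch patterns quoted in this thread are used. It also assumes the simplest case, where the inserted word needs no grammatical adjustment.

```python
# Hypothetical phrasebook entry: one pattern per language, with "..." marking
# the slot for the thing needed (patterns taken from the discussion above).
ENTRY = {
    "en": "I need ...",
    "nl": "ik heb ... nodig",
}


def fill(pattern: str, word: str) -> str:
    """Replace the single "..." slot in a pattern with the given word."""
    if pattern.count("...") != 1:
        raise ValueError("pattern must contain exactly one '...' slot")
    return pattern.replace("...", word)


print(fill(ENTRY["en"], "water"))  # I need water
print(fill(ENTRY["nl"], "water"))  # ik heb water nodig
```

A real entry would still need the grammar note mentioned above (articles, case, word order), which a plain string substitution like this does not capture.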

The best way is to organize it thematically, in another space. I agree with WikiTiki89. Page titles could look like Phrasebook:At the hotel, Phrasebook:In a shop, etc. This would solve this issue. Lmaltier (talk) 22:59, 30 November 2012 (UTC)[reply]

Do we really need to make phrasebook entries exempt from the SOP rule? Phrasebooks are used by visitors (e.g. a Greek visiting the UK or an Englishman visiting Greece). In either case he would go to the Greek phrasebook, where (for example) he will find μπορώ να έχω (boró na écho, “can I have”). Doesn't that work for both users? — Saltmarsh (απάντηση) 07:37, 2 December 2012 (UTC)[reply]

I am in favor of putting them in a separate namespace, maybe even their own. But if we were to leave some, I would leave those mentioned. --biblbroks (дискашн) 14:39, 2 December 2012 (UTC)[reply]

WOTD/FWOTD for December 12

I was intending on having a word related to the number twelve as the WOTD for December 12, as it will be 12/12/12, the last such date this century — or for the rest of eternity, if the Mayans turn out to be correct. ;) But -Sche beat me to the punch and set that day to "decade" to mark the 10th anniversary of Wiktionary's founding (I wasn't aware of this occasion). So perhaps we could schedule a foreign word for "twelve" as the FWOTD for December 12 to mark both events simultaneously? Astral (talk) 09:41, 28 November 2012 (UTC)[reply]

I don't mind changing "decade"; I'd like to mark the twelve-ness of 12.12.12, too. I'm sure we can find a word that lets us mention both things—because we can work both things into a usex under the flimsiest of pretexts, lol. For example, if we used [[twelve]] or [[Twelfth cake]], after the definition could be the usex the twelve of them passed around pieces of Twelfth cake to mark 12.12.12 and the tenth anniversary of Wiktionary's founding in 12.12.02. - -sche (discuss) 16:46, 28 November 2012 (UTC)[reply]

Is there any word that ambiguously means either 10 or 12? I'm sure there is. --Wiki Tiki 89 16:49, 28 November 2012 (UTC)[reply]

The best I can think of is Old Norse hundrað (which we don't even have an entry for), which unlike all of its descendants and cognates means not 100 (i.e. ten tens) but 120 (i.e. twelve tens). —An gr 17:33, 28 November 2012 (UTC)[reply]

In English there's [[long hundred]] ([[short gross]]), the same number Angr describes. There's also "short dozen" (=10), which seems perfect. It's uncommon, but perhaps we could tolerate that just this once, under the circumstances. - -sche (discuss) 17:48, 28 November 2012 (UTC)[reply]

December is the twelfth month, but literally the tenth. Unfortunately, there's no decembade. 46.115.99.214 18:09, 28 November 2012 (UTC)[reply]

I think it would be a bit awkward to try to press a single English word into service to mark two separate events. That's why I think using the WOTD to mark Wiktionary's 10th anniversary and FWOTD to mark 12/12/12 is an elegant solution. We could of course have usexes noting the relevance of each word to that particular day (the FWOTD usex would have to have an English translation as well).

Some words for tenth anniversary: decennial, decaversary, tin anniversary. Astral (talk) 22:52, 28 November 2012 (UTC)[reply]

experiencing problems with https on en.wiktionary

Does anyone else's browser render https strangely? There's a pic to show what I am talking about.

The fr, ru, sr and hr Wiktionaries' pages, as well as en.wikt's pages over HTTP, were rendered OK. --biblbroks (дискашн) 21:38, 28 November 2012 (UTC)[reply]

It's not just HTTPS; I had that on regular HTTP as well. But it seems to be an intermittent problem, and a hard-refresh fixed it (for now). —Ruakh_TALK 22:18, 28 November 2012 (UTC)[reply]

I logged in over HTTP in parallel with HTTPS and couldn't reproduce the problem. But I had a similar experience on Commons: all the top-right links except one produced the strange rendering. The "Contributions" page wasn't affected. Anyway, when I booted into another OS the problem went away. --biblbroks (дискашн) 22:29, 28 November 2012 (UTC)[reply]

I had it too, but I tried the hard refresh (ctrl+F5) and it worked. —CodeCa t 22:32, 28 November 2012 (UTC)[reply]

More spambots being created

e.g. User:Kevin13Kevin1977, User:Jamie6Jamie1988, and User:Louis18Louis1973. Can this evidently automated creation of formulaically-named accounts be halted? Equinox ◑ 23:35, 28 November 2012 (UTC)[reply]

I've fiddled with Special:AbuseFilter/16. Someone else should check it, though, who knows regexes.—msh210℠ (talk) 09:22, 30 November 2012 (UTC)[reply]

It looks good to me, though I suddenly wonder why we're using \b instead of ^ and $? (Both for the original test, which I think we got from a Wikipedian, and for the new one you just added.) —Ruakh_TALK 15:25, 30 November 2012 (UTC)[reply]
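To make the `\b` versus `^`/`$` question concrete, here is a small Python sketch. The actual pattern in Special:AbuseFilter/16 is not quoted in this thread, so `CORE` below is an assumed stand-in (a capitalized name, one or two digits, the same name again, then a four-digit year) modelled only on the three account names listed above. The point is just how the two kinds of anchoring differ.

```python
import re

# Assumed pattern for illustration only, NOT the real filter rule:
# Name + 1-2 digits + the same Name + a year starting with 19 or 20.
CORE = r"([A-Z][a-z]+)\d{1,2}\1(?:19|20)\d{2}"

word_bounded = re.compile(r"\b" + CORE + r"\b")  # may match inside a longer name
anchored = re.compile(r"^" + CORE + r"$")        # must match the whole name

names = [
    "Kevin13Kevin1977",
    "Jamie6Jamie1988",
    "Louis18Louis1973",
    "Fan of Kevin13Kevin1977",  # hypothetical name that merely contains the pattern
]

for name in names:
    print(name, bool(word_bounded.search(name)), bool(anchored.search(name)))
```

With `\b`, a username that only contains the formula as one of several words is still caught; with `^` and `$`, only usernames consisting entirely of the formula are. Which behaviour is wanted depends on how strict the filter is meant to be.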

WT:COALMINE

Assuming that a compound like coal mine is SOP (some may argue it isn't, but that's what RFD is for), why should we keep it just because coalmine exists? The argument for keeping coalmine is that some people might not know where to break the words apart, so why couldn't we just delete coal mine and have something like User:Wikitiki89/coalmine for coalmine? --Wiki Tiki 89 05:51, 29 November 2012 (UTC)[reply]

I support overturning COALMINE, as did 52% of voters last time it was discussed. As WT:VOTE says, "failure of a vote does NOT mean that it cannot become a new vote in the future"; if you're making a new proposal to overturn it, I support you. But COALMINE is like Rush Limbaugh: many people like it because it does their thinking for them. In the last vote, some people expressed that they might support substituting other rules for COALMINE, but not just allowing people to use their own judgement in RFD debates.

COALMINE also helps us in, as Ruakh put it once, our war with the French. - -sche (discuss) 06:34, 29 November 2012 (UTC)[reply]

I support COALMINE because it counters the tendency of editors to treat "part" as meaning "surrounded by punctuation". I believe this is incorrect because whether something is SoP or not should not depend on orthography. It gives English compounds special treatment compared to other languages purely because of the spaces between the parts, which doesn't make much sense at all. That's the point that COALMINE tries to make: coalmine and coal mine are one and the same compound word, so either they both are SoP or neither is. —CodeCa t 14:05, 29 November 2012 (UTC)[reply]

My point was if it is SOP, we can still include the single word compound (because many people complain that "some people" have trouble breaking up compounds into parts), while not having to keep the one with a space. --Wiki Tiki 89 15:23, 29 November 2012 (UTC)[reply]

And I think that makes no sense; why include some common alternative spellings of a word but not others? The actual term is /ˈkoʊlˌmaɪn/, which can be spelled both coal mine and coalmine. What makes one of those two spellings more SoP than the other, if they're really two alternative spellings of one and the same term? —CodeCa t 15:38, 29 November 2012 (UTC)[reply]

Disliking COALMINE is like Rush Limbaugh, too: the need to bloviate on subjects and foment dissent when it would be more productive to decide quickly on as many words as possible and continue on to improve the dictionary.--Prosfilaes (talk) 19:48, 30 November 2012 (UTC)[reply]

I think there's a definite need for it, but we need to fix it, somehow. The biggest problem is the obsession by some with creating borderline entries for single-word terms to use as a get-out-of-rfd-free card for the multi-word equivalent. It would seem to me that the emphasis should be on whether the term really is a multi-word compound, with the single-word form serving as evidence in that determination rather than as a way of bypassing it. We especially need better tests to spot cases like blackbird vs black bird where the single-word and multi-word forms aren't the same term at all.

The way I look at it, single-word forms can be SOP in English, but we should allow an exemption for them because they're what people will tend to look up first before analyzing the terms into separate morphemes. That does not mean that the exemption should be extended to multi-word equivalents of those SOP single-word terms. WT:COALMINE should only apply in those cases where the single-word form isn't SOP and the multi-word form is the same term, but with spaces in it. Chuck Entz (talk) 16:17, 29 November 2012 (UTC)[reply]

You're correct in that a blackbird is not the same as a black bird, but what is more important is that blackbird is a compound, just like coal mine is. English is unusual among Germanic languages in that it sometimes spells compounds with spaces between the parts, even though it pronounces them the same (with the same intonation) as the other languages. coalmine vs coal mine is really a result of this confusion within English, but this confusion only happens with compounds, not with adjective + noun phrases like black bird. And I think the intent of COALMINE was to allow us to avoid this confusion by stating that a compound can be included if there is an alternative spelling that demonstrates it to be a compound (by leaving out the space, coalmine is unambiguously a compound, therefore coal mine is too). And since we already include compounds in other Germanic languages, it makes sense to include them in English too, despite its deviant spelling. So I believe that we can't see the issue of coal mine separately from, say, kolenmijn in Dutch, as they are syntactically the same term; either neither is SoP or they both are. —CodeCa t 16:53, 29 November 2012 (UTC)[reply]

Re: "either neither is SoP or they both are": I think you've misunderstood Chuck's point. When he writes that "single-word forms can be SOP in English, but we should allow an exemption for them", he's not denying the possibility that both are SOP. He's simply saying that even if they are, we still might want to include one and not the other. —Ruakh_TALK 20:35, 29 November 2012 (UTC)[reply]

Yes, and I'm arguing that if coalmine is SoP, then kolenmijn is as well. And I don't think it would make sense to include one possible spelling of the term, but not another. Especially not if, as COALMINE indicates, the one we are leaving out is the more common spelling.

I think, to come to a solution that makes the most sense (to me at least), we'd have to distinguish between lexemes, variant forms ({{alternative form of}}) and variant spellings ({{alternative spelling of}}). A single lexeme can have one or more variant forms (forms that are phonologically and orthographically different, but morphologically identical), and each variant form can have one or more variant spellings (forms that are orthographically different, but phonologically and morphologically identical). A lexeme can be SoP, but variant forms and spellings can't be because they are representations of a single lexeme (i.e. different forms of the same lexeme are either all SoP, or not at all). So when we argue for or against SoP-ness, we would have to do so irrespective of spelling or variant forms. This has already been addressed before in relation to Chinese languages (which don't use spaces). But there are other problems that CFI doesn't solve unless this distinction is made. Imagine that the form coal mine were attested exactly twice, and coalmine once. It's obvious that the lexeme, which is represented by these two distinct spellings, has three attestations and therefore meets CFI. But each spelling variation individually does not. So what do we do? My intuition says that there should be an entry, but neither spelling has the 3 attestations necessary to allow an entry to be created for it. The current CFI doesn't solve this. —CodeCa t 21:16, 29 November 2012 (UTC)[reply]
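The attestation problem in the hypothetical above can be shown with a short Python sketch. The counts are the made-up ones from the comment (two for coal mine, one for coalmine), the lexeme label `"COAL MINE"` is just an illustrative key, and the threshold of 3 is the figure used in this thread. Nothing here reflects how CFI is actually enforced.

```python
from collections import defaultdict

THRESHOLD = 3  # attestations required, per the discussion above

# Hypothetical citation counts: (spelling, lexeme label, attestations).
COUNTS = [
    ("coal mine", "COAL MINE", 2),
    ("coalmine", "COAL MINE", 1),
]


def passing_spellings(counts, threshold=THRESHOLD):
    """Spellings that meet the threshold when counted on their own."""
    return [spelling for spelling, _, n in counts if n >= threshold]


def passing_lexemes(counts, threshold=THRESHOLD):
    """Lexemes that meet the threshold when all their spellings are pooled."""
    totals = defaultdict(int)
    for _, lexeme, n in counts:
        totals[lexeme] += n
    return [lexeme for lexeme, total in totals.items() if total >= threshold]


print(passing_spellings(COUNTS))  # []
print(passing_lexemes(COUNTS))    # ['COAL MINE']
```

Counted per spelling, nothing qualifies; counted per lexeme, the term does, which is the gap described above.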

Re: "I don't think it would make sense to include one possible spelling of the term, but not another": That is, you support COALMINE. The thing is, many people don't. Telling them "I don't think your desired policy makes sense", with no supporting argument, is not likely to convince them. (My apologies if you are, in fact, offering an argument underlying that view. If so, I'd appreciate if you could clarify it.) —Ruakh_TALK 22:32, 29 November 2012 (UTC)[reply]

I suppose my argument was somewhat implicit. I think they should be included because I see no reason not to include them. We include all words in all languages, as long as they are attested and idiomatic, and I have argued that only lexemes can be idiomatic, not spellings. If there is a specific reason why the spelling coal mine should not fit CFI, but coalmine should, and that we should amend CFI to make that explicit, then I'd like to know what it is and why CFI should have an exception specific to English compounds in it. Unless of course you argue that it should apply to compounds in all languages, but I'd have my doubts about the utility of such a measure. —CodeCa t 01:13, 30 November 2012 (UTC)[reply]

I think Chuck was very clear about what his specific reason was for wanting to keep "coalmine" even if it and "coal mine" are SOP: in English, such single-word sums-of-parts "[a]re what people will tend to look up first before analyzing the terms into separate morphemes". That is — even if neither "coal mine" nor "coalmine" is idiomatic, our users will be harmed by our OCD purism if we exclude "coalmine", while no such consideration applies to "coal mine". —Ruakh_TALK 01:35, 30 November 2012 (UTC)[reply]

While that may be true, I hardly see any harm in including more entries. I am certainly not arguing to exclude coalmine, but it seems a bit silly if we include that, but not a more common alternative spelling. Is it really such a bad idea to include coal mine, if that's the term most people actually use to refer to that thing? Translations are also a point to consider: (let's say for this example that coalmine didn't exist, only coal mine) if we exclude coal mine based on it being a sum of parts, how do we tell people that the proper term for a coal mine in Dutch is kolenmijn? Translating coal and mine separately doesn't help, since kool mijn or kolen mijn are nonsense in Dutch. So we really need to consider the needs of Wiktionary not just from an English monolingual perspective (in which coal mine may not be includable), but also from an English-to-other translation perspective. I just checked an English-Dutch dictionary and it has separate entries for coal basin, coal bed, coal black, coal box, coal bunker, coal dust, coal face, coal fish, coal gas, coal heaver, coal hod, coal hole, coal house, coaling station, coal measure, coal measure, coal mining, coal mouse, coal pit, coal scuttle, coal seam, coal shed, coal shovel, coal tar, coal tit, coal truck, coaly. If a paper dictionary can include all of these terms, why can't we? —CodeCa t 02:29, 30 November 2012 (UTC)[reply]

Has anyone ever argued that "coal mine" should be excluded? As far as I know, the point has always been that we should use our judgement to decide what is idiomatic or inclusion-worthy and what isn't, rather than being bound by an inflexible rule that shuts down debate as soon as it is shown that 3+ people have left out the spaces in spelling something. Perhaps it would be clearer if that rule were referred to by a better name, such as HOUSE_WALL. - -sche (discuss) 02:39, 30 November 2012 (UTC)[reply]

Re: "Has anyone ever argued that 'coal mine' should be excluded?": Yes; it was RFD'd, with the nominator and one other editor explicitly supporting deletion, and one other editor apparently supporting deletion but expressing it in a very roundabout way. (Against three editors voting "keep", each giving a different reason, one of which being what later became WT:COALMINE.) See Talk:coal mine. —Ruakh_TALK 15:34, 30 November 2012 (UTC)[reply]

Let me elaborate on what I said above: I think coalmine and coal mine should be kept, since they're both forms of a multi-word compound. There are other cases where neither the form with a space nor the single-word form is a true compound. In those cases, we should keep the single-word form and delete the other.

As I see it, WT:COALMINE arose because of a real problem: multi-word forms were in danger of being deleted simply because they had a space in them, without consideration of the evidence a single-word form might provide that they were really compounds. That would be throwing the baby out with the bathwater. Unfortunately, the response was too rigid the other way: never throw out the bathwater, because there might be a baby in it. The obvious question to ask in this endless argument about bathwater is: what about the baby? And why doesn't anyone know whether it's in the bathwater- who's watching the baby? The focus in this debate should be on the true nature of the different forms, not whether they have a space in them.

Let me break it down by (mostly hypothetical) type:

Multi-word compound with a single-word compound variant- keep both.
Single-word compound with a Multi-word compound variant- keep both.
SOP multi-word term with a few stray single-word occurrences- delete the multi-word term, and seriously consider whether the single-word occurrences merit an entry.
SOP single-word term with a multi-word variant (if such exists)- delete both.
SOP multi-word term where a single-word compound exists that coincidentally looks the same as a single-word variant for the multi-word might be, but is distinct (e.g. black bird and blackbird)- delete the multi-word term, keep the single-word one.
SOP multi-word term coexisting in usage with the multi-word variant of a single-word compound (my interpretation of the Chinese man / Chineseman / Chinaman mess)- keep the multi-word variant as an alternative-form entry, but delete the SOP sense or convert it to {{&lit}}, and keep the single-word compound

What I think we need is language added to WT:COALMINE that says that a multi-word term may be deleted if the single-word form is either a different term or SOP. The emphasis should be on whether it's really a multi-word compound. To make this work requires better ways to nail down which of the above types (or of any other types I may have missed) a given problem term belongs to. I therefore see this as a goal to work toward rather than something to implement immediately. Chuck Entz (talk) 00:39, 1 December 2012 (UTC)[reply]

In my opinion, WT:COALMINE should be kept. Many dictionaries include terms written with spaces, e.g. grammatical, technical or medical terms, that are considered single words. A weak compromise has been reached between "keepers" and "deleters". I don't know why this discussion has started again. Words that have been kept so far are not loosely joined words, like "black car" or "car key". Although we have a majority of "deleters" here, this topic has been discussed ad nauseam. Please don't open this can of worms; let's continue discussing words case by case. We have RFD and RFV for this.

@Wikitiki89. Open any bilingual dictionary: no dictionary creator tries to match words one to one by count. Dictionaries provide correct terms and correct translations; they are not supposed to keep words atomic or avoid spaces. "Coal mine" is a word and is a more common spelling than "coalmine" (also a word). The "why" question has been answered numerous times when coal mine and many other similar words went through the RFD process, and I don't think we need to review it again. --Anatoli (обсудить/вклад) 23:03, 29 November 2012 (UTC)[reply]

Maybe coal mine should be included, but my point is that the decision to include coal mine should have nothing to do with the existence of coalmine and that is what WT:COALMINE is all about. --Wiki Tiki 89 09:15, 30 November 2012 (UTC)[reply]

The decision is related to the fact that it belongs to the vocabulary of the language. Only to this fact. But if coalmine is a term of the language, this is evidence that other spellings of the term are also terms of the language ("termness" clearly does not depend on the spelling). This is what the rule is about. Lmaltier (talk) 23:06, 30 November 2012 (UTC)[reply]

On the contrary, I think the existence of coalmine is evidence that should be considered when looking at coal mine- but it shouldn't be grounds for ignoring coal mine altogether and keeping it no matter what. Chuck Entz (talk) 00:54, 1 December 2012 (UTC)[reply]

Template:lv-conj

For some reason, this template was deleted a couple of weeks ago; but I was working on it, so the deletion actually made me lose quite a lot of work. Is there some way to bring it back? (I note that, when I look at templates that transclude it, like template:lv-conj-2, I can still see the table as it should be, but template:lv-conj itself is gone, and when I use template:lv-conj-2 in an entry, I only get a red link to template:lv-conj (see tulkot).) --Pereru (talk) 11:46, 29 November 2012 (UTC)[reply]

You already posted this at WT:GP. Try not to post things twice as it tends to decentralize the conversation. --Wiki Tiki 89 12:16, 29 November 2012 (UTC)[reply]