Wiktionary:Beer parlour/2013/September: difference between revisions

From Wiktionary, the free dictionary
: Thanks for bringing this here. Personally, I actually ''don't'' support any long-term block: CodeCat obviously enjoys running a bot, and as long as she's using it to do things that the community has agreed should be done, I think that's great. My blocks were under the assumption that she would quickly fix the issue and then unblock it (I think my block-summary even said as much); I had no idea how much of a trial this would be. She's taken it personally, so has started making it personal herself, casting aspersions on my intentions, so now I'm annoyed enough that I'm half-tempted to support a long-term block, :-P &nbsp; but my best current judgment is that I should trust my earlier, non-annoyed judgment. —[[User: Ruakh |Ruakh]]<sub ><small ><i >[[User talk: Ruakh |TALK]]</i ></small ></sub > 02:43, 15 September 2013 (UTC)
* I '''support''' temporarily blocking User:MewBot for bot actions made without first gaining consensus for them via an appropriate channel such as the Beer parlour. Whenever a dispute arises over whether there is consensus for bot actions, CodeCat should provide links that show there is consensus for their actions. Only after the blocking admin is satisfied that the actions are supported by consensus can User:MewBot be unblocked, on a case-by-case basis. --[[User:Dan Polansky|Dan Polansky]] ([[User talk:Dan Polansky|talk]]) 09:48, 15 September 2013 (UTC)
** And what is [[Wiktionary:Consensus|consensus]]? Yes, this is a redlink. As far as I know, we never practised "consensus" here. [[User:Kephir|Keφr]] 10:32, 15 September 2013 (UTC)


== Eliminating adjective PoS for Ainu ==

Revision as of 10:32, 15 September 2013


Term only citable with different spellings counting

I can find one hit for Copenhagenisation and two for Copenhagenization (meaning “(sociolinguistics) the process of Danish speakers beginning to use the dialect of Copenhagen”). Not enough citations for either, but they’re just different ways of spelling the same word, so should they be included? — Ungoliant (Falai) 12:35, 2 September 2013 (UTC)

Our entries are for spellings. DCDuring TALK 13:41, 2 September 2013 (UTC)
There is support for it (Wiktionary:Information_desk/Archive_2012/July-December#Request for clarification: How strict is WT:CFI regarding attestation of spellings which vary slightly?). — Ungoliant (Falai) 14:05, 2 September 2013 (UTC)
Why do you call that support? DCDuring TALK 15:52, 3 September 2013 (UTC)
Support creating both entries. I don't think there is much point in gerrymandering the CFI to exclude terms merely because of spelling differences. It's the same word. —CodeCat 13:58, 2 September 2013 (UTC)
Agreed. If it were a regional term or an alternative spelling, where the spelling is what's in question, it might be different, but -ize and -ise are substituted into words by an extremely regular and mechanical process analogous to inflection (most of the time, we don't even notice we're doing it). If we accept plurals for singular lemmas, or past for present lemmas, we should accept these. Chuck Entz (talk) 14:41, 2 September 2013 (UTC)
Yup, though not without exception. (Consider compromise and exercise and advertise, whose counterparts in <-ize> are quite rare by comparison. And, for that matter, consider matrices and hypotheses and phalanges, whose regularly-backformed singulars matrice and hypothese and phalange are, similarly, quite rare compared to the standard singulars. So we do need to exercise caution.) —Ruakh TALK 21:49, 2 September 2013 (UTC)
I agree with CodeCat. —Ruakh TALK 21:49, 2 September 2013 (UTC)
Yet another step that means increase in quantity and decrease in quality of entries. DCDuring TALK 15:52, 3 September 2013 (UTC)
You’re just being a concern troll. — Ungoliant (Falai) 18:34, 3 September 2013 (UTC)
I'm not sure what "step" you're referring to. Are you implying that hitherto we have not allowed entries in cases where a word meets the CFI but has not had any individual spellings/forms that do? —Ruakh TALK 20:29, 3 September 2013 (UTC)
Exactly. Am I wrong? I know I am not wrong about the poor quality of our definitions, both English and other. It's hard to say whether they are getting worse or not as we have no metrics (not that we could readily develop any, except on a sample basis). I'm quite sure that our definitions are not rapidly improving and that we are constantly adding FL terms with ambiguous glosses. DCDuring TALK 23:58, 3 September 2013 (UTC)
Support including the term, but I would like to see one proper entry for a lemma, plus a form-of page.
Our pages are for spellings, but many of our full entries are for lemmas, with form-of references for inflections and spelling variations. The latter is a much better arrangement for the reader, and also for integrity of the dictionary, per the w:DRY principle. Our citation practices encourage me to think that we cite terms, not spellings: “Unlike the main space, inflected forms and alternate spellings should be redirected to the primary entry. Variations in case should be on the same page, with the other(s) redirecting, even if the definitions are distinct” (from WT:CITE#Naming).
Some of these are also citable: Copenhagenize/Copenhagenise, Copenhagenized/Copenhagenised, Copenhagenizes/Copenhagenises, Copenhagenizing/Copenhagenising. Michael Z. 2013-09-06 04:33 z
Lightly object. To generalize from this example, we're talking about words that have two spellings in English, which means you can't come up with 5 examples in English--with Google Books, that's not usually a huge hurdle. You're also usually talking about words that are predictable variations on words we should have; if we have Copenhagen, Copenhagenization should be pretty clear. I don't see the benefits as being huge.--Prosfilaes (talk) 05:52, 6 September 2013 (UTC)

Wiktionary's definition of a word is spelling-based, and I don't see why we should make an exception for Copenhagenisation and Copenhagenization. If both can be independently cited they both deserve a separate entry, with one being a lemma and the other an alternative form, misspelling, or whatever. --Ivan Štambuk (talk) 17:22, 12 September 2013 (UTC)
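
The two positions in this thread (attest each spelling separately, or pool spelling variants of one word) can be compared with a small sketch. It is only an illustration: the threshold of three citations, the function name `attestation_report`, and the explicit variant grouping are assumptions made for this example, not a statement of what WT:CFI prescribes. Variants are grouped by hand rather than by a mechanical -ise/-ize rule, because of exceptions such as compromise and exercise noted above.

```python
# Hypothetical sketch: compare per-spelling attestation with pooled attestation.
# REQUIRED_CITATIONS is an assumed threshold for illustration only.
REQUIRED_CITATIONS = 3


def attestation_report(citation_counts, variant_groups, required=REQUIRED_CITATIONS):
    """For each hand-made group of spellings, report whether each spelling
    alone, and the group pooled together, reaches the citation threshold."""
    report = {}
    for word, spellings in variant_groups.items():
        counts = {s: citation_counts.get(s, 0) for s in spellings}
        pooled = sum(counts.values())
        report[word] = {
            "per_spelling_ok": {s: c >= required for s, c in counts.items()},
            "pooled_total": pooled,
            "pooled_ok": pooled >= required,
        }
    return report


# Citation counts taken from the opening comment of this thread.
counts = {"Copenhagenisation": 1, "Copenhagenization": 2}
groups = {"Copenhagenization": ["Copenhagenisation", "Copenhagenization"]}
result = attestation_report(counts, groups)
print(result["Copenhagenization"])
```

Under these assumptions neither spelling passes on its own, while the pooled word does, which is exactly the case being argued over.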

I'd've thought this was a good idea for things that Wiktionary:Entry layout explained either doesn't mention or doesn't give an unambiguous verdict on. For example, definitions may be formatted as sentences, or not. There's very little consistency. Even with two consecutive definitions in a single entry, the first will sometimes have an initial capital and a full stop, and the second will have neither. Mglovesfun (talk) 11:11, 3 September 2013 (UTC)

Hi,

Is there a system of good or featured articles here, like on Wikipedia (Wikipedia:Featured articles/Wikipedia:WikiProject Good articles)?

Thanks in advance, Automatik (talk) 13:18, 3 September 2013 (UTC)

But we do have WT:WOTD. DCDuring TALK 13:25, 3 September 2013 (UTC)
Thanks for your answer. Automatik (talk) 15:35, 3 September 2013 (UTC)
But WT:WOTD isn't really comparable. There are no quality requirements on WOTD and no process for "bringing an entry up to WOTD level". The non-English WOTD must have a pronunciation and at least one citation (at least one mention for a limited-documentation language), but the requirements on English WOTD all have to do with the nature of the word itself, not with the quality of the entry. —Angr 21:46, 3 September 2013 (UTC)
That's true, but I think it's in part because we can improve an English entry quickly once it's announced as an upcoming word of the day. —Ruakh TALK 04:14, 7 September 2013 (UTC)

Hello, I came from the French Wiktionary too. We are trying to create a system for quality evaluation, and it seems no other Wiktionary has a system like that. Do you want to join the discussion? If so, we need a few more weeks of work, and then we can translate it into English to share the ideas with you. Eölen (talk) 22:52, 3 September 2013 (UTC)

Can you please provide a link to the relevant discussion on French wiktionary? --Ivan Štambuk (talk) 17:16, 12 September 2013 (UTC)

Since I saw it on a "needed badly" list somewhere, I decided to start this page. It has been brewing on my disk for some time. I loosely based it on WT:ACS, while trying to explain some grammatical features, and highlight a few gaps in current practices. Please tell me what you think, whether anything is missing, needs change or an explanation. Keφr 10:34, 6 September 2013 (UTC)

Good work! Not much to criticise or suggest at this point. I'll watch this project and may use something to add to Wiktionary:About Russian. I'd like to see more treatment of verbs, including perfective/imperfective (not just entries but translations), abstract/concrete, semelfactive. Also interested in the policy for reflexive verbs, which seem to be handled differently across languages (separate entries or separate senses?). Polish could perhaps use more etymology info, which can often be looked up at Serbo-Croatian (or sometimes Russian) entries with Proto-Slavic derivations. --Anatoli (обсудить/вклад) 23:52, 8 September 2013 (UTC)
I do not remember ever encountering a semelfactive aspect which would be distinct from perfective. Translations - noted, will write something up. I tried to be descriptive of current practices rather than prescriptive, so if you people want to discuss how the policy ought to be, feel free. Not sure what you mean by the abstract-concrete distinction. Remember, this is not a complete guide to Polish grammar, just a quick summary to explain how it is relevant to presenting terms in Wiktionary. Keφr 07:25, 9 September 2013 (UTC)
Re: semelfactive vs simply perfective, example: "krzyknąć" and "pokrzyczeć" are both perfective, the former is semelfactive (instantaneous, momentive), the latter is not. Abstract vs concrete (verbs of motion only): "chodzić"/"iść". I've added some categories for a few Slavic languages other than Russian before. Your project page doesn't have to describe all that, of course. --Anatoli (обсудить/вклад) 12:24, 9 September 2013 (UTC)
And I used to think that aspect is an easy language… aspect. Did you notice the mention of frequentatives? Any idea whether and how this aspect mess should be handled? (I think I remember Russian having a similar feature.)
The abstract-concrete gave me some idea, but I am not sure I got it right. I think you will not find a good translation of the verb go in all its generality. The verb iść still sort-of refers to using feet, even if the main focus is on something else.
And this page is not "mine" by any standard. If you think you have something to add, go ahead. In the worst case you will get reverted once or twice. Keφr 13:54, 9 September 2013 (UTC)
Although I would not mind having the former type of page somewhere in here, to be honest. There are a few languages I would like to learn, but have something of a hard time finding good resources. A brief grammar reference would be helpful. Keφr 07:28, 9 September 2013 (UTC)

As far as I can tell, Wiktionaries can be classified into four groups:

  1. Regular Wiktionaries, like fr.wikt and es.wikt. Except for various annoying edge cases that aren't the subject of this discussion, these work just fine, and exactly as you'd expect.
  2. Nonexistent Wiktionaries that redirect to the Wikimedia Incubator, like vep.wikt. I'm not sure quite how we should handle these, but I think we can basically do whatever we want; we just need to decide what we want to do with them, and then do it. Interwiki-links to [[:vep:...]], for example, work fine, linking to vep.wikt URIs that redirect to Incubator URIs.
  3. Nonexistent Wiktionaries that don't redirect to the Wikimedia Incubator, like zza.wikt. With these we can do whatever we want for translation-links (we just have to link directly to the Incubator entry if we want that), but interwiki-links are uglier (we'd have to add them JavaScript-ically).
  4. Closed/locked Wiktionaries, like aa.wikt and dz.wikt. (I suppose these could be considered a subset of the previous.) These are annoying, because they have some existent pages, and they have database-dumps, but redlinks to them are rather pointless (since content can't be added), even bluelinks to them are rather dubious (since problematic content can't be fixed or removed), and in some (most? all?) cases there's at least as much content on Incubator as on the Wiktionary domain itself.

Group #1 needs no discussion, but how do we want to handle each of groups #2–4?

Ruakh TALK 06:21, 7 September 2013 (UTC)

Since no one's weighed in yet, here are my own views:
  • we should never link to closed/locked Wiktionaries — not as interwiki-links, and not as translation-links.
  • we should never link to non-existent pages on Incubator — not as interwiki-links (obviously), and not as translation-links.
  • when a translation has an appropriate-language Wiktionary entry on the Wikimedia Incubator, we should link to it using {{t+}}. (Note: since e.g. [[zza:...]] and [[aa:...]] don't work properly, this will require a change to the translation-templates. Actually these templates are already a bit broken when it comes to languages without Wiktionaries — {{t|zza|foo}} links to a page named zza:foo on en.wikt — so we'll want to make some sort of change to them regardless.)
  • when an interwiki-link would appropriately link to a redirect to an existent entry on the Wikimedia Incubator, we should use it. For example, [[April]] should include [[vep:April]] among its interwiki-links.
  • when an entry exists on the Wikimedia Incubator, but an interwiki-link wouldn't work, should we hack up some JavaScript to make it work? I'm not sure.
Ruakh TALK 19:47, 7 September 2013 (UTC)
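
The four groups and the linking views above can be condensed into a small decision function. This is a rough sketch of one reading of the proposal, not existing template or bot code: the names `WiktStatus` and `link_plan` are invented for the example, and it assumes that "link to Incubator when an entry exists there" also applies to languages whose Wiktionary is closed. The open JavaScript question for group 3 is modelled simply as "no interwiki link".

```python
from enum import Enum
from typing import Optional, Tuple


class WiktStatus(Enum):
    """The four groups from the discussion (example codes from the thread)."""
    REGULAR = 1             # e.g. fr.wikt, es.wikt
    INCUBATOR_REDIRECT = 2  # e.g. vep.wikt
    NO_REDIRECT = 3         # e.g. zza.wikt
    CLOSED = 4              # e.g. aa.wikt, dz.wikt


def link_plan(status: WiktStatus, incubator_entry_exists: bool) -> Tuple[bool, Optional[str]]:
    """Return (add_interwiki_link, translation_link_target).

    translation_link_target is "wiktionary", "incubator", or None (no link).
    """
    if status is WiktStatus.REGULAR:
        # Group 1 works as usual; entry existence on that wiki is not modelled here.
        return True, "wiktionary"
    # Groups 2-4: never link to a closed wiki or to a missing Incubator page.
    translation_target = "incubator" if incubator_entry_exists else None
    # Interwiki links only work where the domain redirects to Incubator.
    interwiki = status is WiktStatus.INCUBATOR_REDIRECT and incubator_entry_exists
    return interwiki, translation_target


print(link_plan(WiktStatus.INCUBATOR_REDIRECT, True))
print(link_plan(WiktStatus.CLOSED, False))
```

Keeping the policy in one function like this would make it easy to change a single branch once the community settles each group.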
Sounds all reasonable to me on the face of it. As for the Javascript question in the last item, my instinct would be to avoid adding Javascript unless it generates significant added value, which does not seem to be the case. --Dan Polansky (talk) 20:14, 7 September 2013 (UTC)

IMHO, apart from top-X (where X < 5), other Wiktionaries are so much inferior in quality that linking to them in both interwikis and translation tables seems like a waste of time, database space and edit counts. --Ivan Štambuk (talk) 17:10, 12 September 2013 (UTC)

Number forms

Based on the Category:Inflections, I believe Wiktionary needs a new category called Numeral forms because some languages have inflections for their cardinal numbers. I hope this isn't a difficult suggestion. --KoreanQuoter (talk) 18:08, 7 September 2013 (UTC)

But we already have one? —CodeCat 18:26, 7 September 2013 (UTC)
I tried to make a separate page for одно (neuter form of один) and I think Numeral forms is more appropriate for a category. --KoreanQuoter (talk) 18:51, 7 September 2013 (UTC)
I still don't understand. What is wrong with the existing numeral forms category? —CodeCat 19:24, 7 September 2013 (UTC)
Wait. There was an existing numeral forms category? --KoreanQuoter (talk) 05:47, 8 September 2013 (UTC)
…yes? Keφr 06:02, 8 September 2013 (UTC)
Oh. Silly me. Thank you. --KoreanQuoter (talk) 06:18, 8 September 2013 (UTC)

CFI and Wiktionary is not an encyclopedia

I have created vote Wiktionary:Votes/pl-2013-09/CFI_and_Wiktionary_is_not_an_encyclopedia. I propose to remove or at least trim WT:CFI#Wiktionary is not an encyclopedia section.

Let us postpone the vote as much as the discussion needs. --Dan Polansky (talk) 08:08, 8 September 2013 (UTC)

Let's keep the comments on the talk page of the vote. Mglovesfun (talk) 09:06, 8 September 2013 (UTC)

CFI and trimming the Idiomaticity section

I have created vote Wiktionary:Votes/pl-2013-09/CFI and trimming the Idiomaticity section.

Let us postpone the vote as much as the discussion needs. --Dan Polansky (talk) 09:00, 8 September 2013 (UTC)

Underestimating idiomaticity of Finnish translations

While going around fixing translation lists, I noticed that very often, the Finnish translations are marked up as if they were sum-of-parts. At first I thought, "well, I guess Finnish is weird", but recently I started to doubt the accuracy of that characterisation. Take door-to-door. The Finnish translations listed look like simple inflections of the Finnish word for "door" (ovi). The English meaning of "door-to-door" is apparently idiomatic, so I have quite a hard time imagining how the Finnish entry, which breaks down into constituents pretty much the same way, would be sum-of-parts. I am also suspicious of entries where Finnish translations are broken into roots and affixes.

Do you think we should go over these? Keφr 11:54, 8 September 2013 (UTC)

"ovelta ovelle" translates literally as "from door to door". ovelta is the ablative case of ovi and means "(away) from a/the door", while ovelle is the allative case and means "to/towards/onto a/the door". —CodeCat 12:19, 8 September 2013 (UTC)
I've noticed this too and I think it's just the way they've been added (by a human editor) and nothing to do with the language itself. Mglovesfun (talk) 19:58, 8 September 2013 (UTC)
So in that case, should we not have an entry for the whole phrase and link to it in the translations list? Keφr 20:55, 8 September 2013 (UTC)
I would trust Hekaheka's judgement on how to translate into Finnish. We should invite her if in any doubt. Translations may be done as "solids" (if they are idiomatic in the target language) or using "sum of part" methods. Using "{{tø}}" allows you to see individual components and the grammar but "ovelta ovelle" requires the entry to exist or at least interwiki. --Anatoli (обсудить/вклад) 22:30, 8 September 2013 (UTC)
I agree (w/Anatoli). Even if it is just a case where the Finnish-speaking editors have made a different decision than most of the non-Finnish-speaking editors would have made . . . well, that doesn't seem like a big deal to me. There are a lot of things that it's important to be consistent about across languages, but I'm not sure this is one of them. —Ruakh TALK 23:16, 8 September 2013 (UTC)
There are pros and cons in both approaches but it's often safer to use the "SoP" approach, even if it's more time-consuming. It usually draws less criticism. It's the first time I've seen criticism of the SoP approach. --Anatoli (обсудить/вклад) 04:32, 9 September 2013 (UTC)
Safer? Maybe. Though note that [[ovelta ovelle]] does exist in this case. The specific problems I see with overmarking as sum-of-parts are that 0) this discourages creation of entries for terms which may be non-trivial to translate into English; 1) these terms will not be picked up by Yair rand's gadget on the search page, and it does not seem obvious to me how it could be extended to do so; leading into 2) the translation of such a term will be harder and more time-consuming, especially when the entries for the constituent words are missing some meanings. So, marking translations as idiomatic can be beneficial even when that makes them redlinks. Keφr 07:09, 9 September 2013 (UTC)
By "safer" I mean in terms of someone disputing idiomaticity. I've translated double-decker bus and death camp as SoP двухэтажный автобус m (dvuxetážnyj avtóbus) and Template:tø as tomorrow someone may dispute both the English terms and the Russian translations. It's still educational to see the grammar of the translations, showing the individual parts and how the translation is made. --Anatoli (обсудить/вклад) 13:00, 9 September 2013 (UTC)
Educational — I am not denying that. Having these translations listed as SoPs is helpful, even if only by the virtue of it being better than having no translations at all. But the grammatical structure of a multi-word term can be also analysed in the whole term's entry, and can also be inferred by hand when the constituent word pages are reasonably complete, so this is not hugely relevant. Creating pages for these I find rather easy. Though granted, I do not always create these myself.
All I wanted to know is whether everyone is okay with treating such terms as SoPs. There are many other examples, they often land in Category:Translations to be checked (Finnish), because I usually just leave them there when xte suggests reviewing the translation. (Hekaheka would probably like to call me a perkeleen vittupää because of that, but oh well. Cannot please everyone.) Keφr 14:10, 9 September 2013 (UTC)

Merging Mari and Buryat varieties

Can we or should we merge some varieties of Mari and Buryat at Wiktionary?

  • Mari (chm, mhr, mrj):
Hill/Western Mari (mrj) can probably stay separate, it has a few more letters than the standard or Eastern (Meadow) Mari. Extra Cyrillic letters in Western Mari: Ӓ, ӓ and Ӹ, ӹ and they don't use standard Mari letter Ҥ, ҥ. This variety has about 30 thousand speakers. It's still possible to merge if Western Mari has context labels and the additional letters are handled. Anyway, chm and mhr can be merged safely.
Language codes with names:
  • chm - "Mari", "Standard Mari"
  • mhr - "Eastern Mari", "Meadow Mari"
  • mrj - "Western Mari", "Hill Mari"


  • Buryat (bxr, bxu, bxm, bua):
Russian and Mongolian Buryat use the same alphabet. The Mongolian and Vagindra scripts are hardly used. The overwhelming majority of Buryats live in Buryatia, some in Mongolia, even fewer in China.
Language codes with names:
  • bxr - "Russia Buryat"
  • bxm - "Mongolia Buryat"
  • bxu - "China Buryat"
  • bua - "Buryat", "Buriat"
(it's obvious that at least one is redundant)

--Anatoli (обсудить/вклад) 04:53, 10 September 2013 (UTC)

Meadow and Hill Mari have separate written standards so they should be kept separate (and the chm code deleted). Buryat, easy thing, merge them. There is no difference between these 'lects, except perhaps for some loanwords. -- Liliana 07:39, 10 September 2013 (UTC)
Seems like we have an agreement on Buryat. So we can delete bxr, bxm and bxu and make bua the only code for Buryat.
With Mari, I would rather delete mhr and leave the name "Mari". Standard Mari is "Eastern Mari" or "Meadow Mari" and chm is more common. OK, let's leave mrj but I'll make a transliteration page and a module, which works for both alphabets. --Anatoli (обсудить/вклад) 22:42, 10 September 2013 (UTC)

Overriding manual transliteration

This has been discussed in other pages, but no consensus was reached.

Automated transliteration works perfectly for several languages, such as Armenian. Some suggest always overriding manual transliteration for these languages, because many manual transliterations are incorrect (due to human error) or inconsistent (due to changes to the transliteration system, etc.). Others say we should always let editors use |tr=.

Another solution is to remove the old manual transliterations for terms in these languages, and not override manual transliterations after that. (We can put the pages with |tr= for terms in these languages in a category to keep track of them.) --Z 13:29, 10 September 2013 (UTC)

Wouldn't we want to be able to let that be decided on a language-by-language basis? What about also allowing the overriding of everything with tr=, but allowing tr0= to override bad automatic transliterations, also on a per-language basis? DCDuring TALK 14:47, 10 September 2013 (UTC)
It is being decided language-by-language. See the override_translit section of Module:links. --Vahag (talk) 15:15, 10 September 2013 (UTC)
I support overriding manual transliteration for languages whose automatic transliteration works perfectly, e.g. Armenian, Georgian. For such languages manual transliteration will be redundant in the best case and wrong in the worst case. --Vahag (talk) 15:15, 10 September 2013 (UTC)
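
The per-language override described in this section could look roughly like the following. It is a hedged sketch only: the real logic lives in Module:links (written in Lua) and is not reproduced here, and the names `OVERRIDE_TRANSLIT` and `choose_translit`, as well as the choice of language codes, are assumptions for illustration.

```python
from typing import Optional

# Assumed set of language codes whose automatic transliteration is trusted
# enough to override |tr= (hy = Armenian, ka = Georgian, as named above).
OVERRIDE_TRANSLIT = {"hy", "ka"}


def choose_translit(lang: str, manual: Optional[str], automatic: Optional[str]) -> Optional[str]:
    """Pick the transliteration to display.

    manual    -- the editor-supplied |tr= value, if any
    automatic -- the module-generated transliteration, if one is available
    """
    if lang in OVERRIDE_TRANSLIT and automatic is not None:
        # Trusted languages: the automatic value always wins.
        return automatic
    # Everywhere else the manual value wins, with automatic as a fallback.
    return manual if manual else automatic


print(choose_translit("hy", "manual-value", "auto-value"))
print(choose_translit("ru", "manual-value", "auto-value"))
```

The tr0= idea from the reply above would amount to one more parameter that is consulted before the override check; it is left out here to keep the sketch minimal.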

Great language game

I know that this isn't a forum, but there isn't really anywhere else to put it. And it would be a shame not to share it because I think many people on Wiktionary will like it. There's a new website called the Great Language Game where you can see how well you can tell different languages apart by ear. I seem to do pretty well with it, I hope it's fun to others as well. —CodeCat 12:32, 12 September 2013 (UTC)

Love it, thanks. Only 800 for me... And I got lucky, I kept ending up with Slavic languages. --Fsojic (talk) 20:16, 12 September 2013 (UTC)

What can be done to improve quality?

The more my wanderings take me to visit a wide range of non-English entries, the more I think that the English-language entry quality problem is not our only quality problem.

For non-English entries the problems range from the near-incoherent terseness of our copyings of a 110-year old Sanskrit dictionary to the frequent presentation of non-idiomatic calques as glosses and the use of terms that simply don't belong in a definiens of a contemporary dictionary due to the age, rareness, or unglossed polysemy of the term or terms used.

For English entries the quality problem includes the obsolete language of definiens and the poor coverage of polysemic terms, especially uses that developed in the 20th century and remain common today. The entries for polysemic terms contain many important definitions that are buried and lost in visual clutter. The definiens of many terms includes words that are rare and/or technical when neither characteristic is necessary.

Are there technical means that could help? An example might be processing the dumps to identify uses of terms labeled rare, obsolete, archaic in definiens. Or words used only once in any definiens.

What can we do to get more effort by existing and past editors devoted to entry improvement?

Are there helpful ways to more actively recruit or develop contributors? DCDuring TALK 12:56, 12 September 2013 (UTC)
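
The dump-processing idea raised in the comment above (find definitions that use words labeled rare, obsolete, or archaic, and words used only once in any definition) can be sketched as follows. This is a toy illustration under stated assumptions: it works on already-extracted dictionaries rather than on a real dump, the function name `audit_definitions` is invented, and the sample entries and labels are made up for the example.

```python
import re
from collections import Counter

FLAGGED_LABELS = {"rare", "obsolete", "archaic"}


def tokenize(text):
    """Lower-case a definition and split it into simple word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def audit_definitions(definitions, labels):
    """definitions: {entry: [definition strings]}
    labels: {word: set of usage labels on that word's own entry}

    Returns (flagged, hapaxes): entries whose definitions use a flagged
    word, and the sorted words that occur once across all definitions.
    """
    flagged = {}
    counts = Counter()
    for entry, texts in definitions.items():
        for text in texts:
            words = tokenize(text)
            counts.update(words)
            bad = sorted({w for w in words if labels.get(w, set()) & FLAGGED_LABELS})
            if bad:
                flagged.setdefault(entry, []).extend(bad)
    hapaxes = sorted(w for w, c in counts.items() if c == 1)
    return flagged, hapaxes


# Made-up sample data, not taken from any real entry.
sample_definitions = {
    "named": ["Yclept; called by a name."],
    "call": ["To name; to give a name to."],
}
sample_labels = {"yclept": {"archaic"}}
flagged, hapaxes = audit_definitions(sample_definitions, sample_labels)
print(flagged)
print(hapaxes)
```

On a real dump the once-only list would be dominated by ordinary words, so some frequency filter from an outside corpus would probably be needed before it became a useful worklist.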

I do think that we should avoid using obscure terms in definitions, but sometimes there just happens to be that one word that describes it so much better than anything else. In such cases I usually prefer to show both. I often include multiple glosses if it helps to narrow the meaning down more.
I'm not sure if there is much we can do to increase the effort. People will work on what they feel like working on. We can raise awareness, but that's about all we can do. Wiktionary is pretty decentralised and we have no central announcement system that everyone is guaranteed to see, except for WT:NFE which a lot of people ignore regardless. So if we want to raise awareness of issues we first need some kind of global platform to raise them on to begin with. Beer Parlour isn't really enough.
As for visual clutter, I think this is a real problem and I think it could be improved substantially by adopting a visual style similar or identical to the French Wiktionary. Their use of colours, borders and icons is far easier on the eye and does a lot to direct the user's attention to certain parts of the page. It makes things stand out more and gives visual structure to the page which is pretty much essential. —CodeCat 13:19, 12 September 2013 (UTC)
For foreign language entries, heavy use of glosses or listing several possible translations is a must. For example, in the Serbo-Croatian entry vȇz, a definition given is “binding”. The reader is left to guess which sense of binding it refers to. When I add a Portuguese entry, I always try to add enough information via glosses and possible translations so the user won’t need to follow any link nor rely on guesswork to understand precisely what the term means.
Shortcut glosses like “(all senses)” should be avoided as well, IMO, as they can lead to error. — Ungoliant (Falai) 13:28, 12 September 2013 (UTC)
"For foreign language entries, heavy use of glosses or listing several possible translations is a must." Very much this. I already do this when adding entries in Polish, and I put a similar recommendation at WT:APL#Definitions. This should be a project-wide policy, because the reasons I gave there are not exclusive to Polish at all. This is a no-brainer for anyone who deals with translations, really.
As for the visual side, I will disagree. I actually like our, shall I say, ascetically colourless style. I think the best solution would be to convert pages to some kind of semantic markup so that we do not have to enforce any particular style at all. Dislike a style? Switch your skin.
Regarding the lack of a central propaganda tube, I would add an "add N4E to watchlist" link to the welcome template, and maybe streamline the template to the most essential bits. Could help. And the Beer parlour does not cut it, I presume, partly because the main BP page is just too damn big, loads slowly, and you have to keep adding and removing per-month pages to your watchlist to keep being updated, which is tedious. Wikipedia's archive pages system is better in this regard, although it has its own flaws. I cannot wait for mw:Flow to solve all our wiki discussion problems. In the meantime, why not convert the central discussion pages to LiquidThreads? Keφr 15:29, 12 September 2013 (UTC)
By comparing other meanings it is obvious that the binding sense of the Serbo-Croatian noun vȇz refers to "A finishing on a seam or hem of a garment". Sometimes using a dictionary requires a minimum amount of intelligence on the reader's part. Ditto for what DCDuring calls "near-incoherent terseness" of the Monier-Williams Sanskrit dictionary, the most comprehensive Sanskrit dictionary compiled by the most authoritative Sanskrit lexicographer in the West. --Ivan Štambuk (talk) 17:04, 12 September 2013 (UTC)
Easy to say that when you’re a native speaker and already know the word. And the best one can do is figuring out that “A finishing on a seam [] ” is the most likely meaning, but without a gloss there is no certainty. Even then, people expect a dictionary, not a test on their figuring-out-the-most-likely-meaning-of-words skills. — Ungoliant (Falai) 17:14, 12 September 2013 (UTC)
@Ivan: I don't think that those providing support for this project intend that it be usable only by an intellectual elite. The intellectual elite that uses and contributes to this wiki needs to also serve the general population of those who need dictionaries.
I don't doubt the underlying quality of the Sanskrit dictionary in terms of its coverage of Sanskrit. It seems like an outstanding basis for good Sanskrit Wiktionary entries. I just don't think that it is very usable for a non-specialist, partially because the style and wording of the Wiktionary entries resulting from the copying is not similar to that of other Wiktionary entries. The problem is not unlike the problem of copying Webster 1913 definitions, except the stylistic difference is even more dramatic. As it stands our Wiktionary Sanskrit entries are, in many ways, worse than the underlying dictionary because of omissions. DCDuring TALK 17:37, 12 September 2013 (UTC)[reply]
But Sanskrit and other extinct and classical languages are only used by intellectual elite. The only way a common person is going to come across a Sanskrit entry is through some etymology.
I don't recall having seen Sanskrit entries that are formatted radically different than entries in other languages. The only problem is the abundance of meanings that words have, and which sometimes get grouped by eras or sources, and not by semantic closeness as they are normally - but that's a particular issue of classical languages that have been used over a long period of time, which other "normal" languages don't have. But users looking up Sanskrit words expect such layout which enables them to quickly isolate set of meanings appearing in a particular work that they are reading.
I don't understand what omissions you are referring to. If you have some constructive proposal of how to change some user-unfriendly entry of your choice I'd be happy to hear it. --Ivan Štambuk (talk) 22:02, 12 September 2013 (UTC)[reply]
With other meanings being embroidery and needlework, I can't imagine anyone construing the binding sense in any definition of [[binding]] other than the one that has sewing context label attached. I don't think that the average reader is that stupid. --Ivan Štambuk (talk) 22:02, 12 September 2013 (UTC)[reply]
This is more of an observation than a suggestion, but when I was experimenting with using links to senses with sense IDs using {{senseid}} when writing definitions for foreign-language terms, I found that the sense that I wanted to link to was missing--about half the time, actually--and that there were many senses of the English term that I hadn't thought of, and which forced more disambiguation on the foreign language end than I had realized. The process of matching senses exposed gaps on both ends, so if that could be integrated in the editing process, it would be a big aid to editors. That would really tap into the collaborative power of the project. This is not to push or oppose sense IDs, just my experience with them. --Haplology (talk) 16:05, 12 September 2013 (UTC)[reply]
I would expect contemporary non-English terms to often need well-worded, contemporary senses of common English terms that the English entries lack. That is one of the biggest problems for English entries. {{rfdef}} helps us identify the need, but a note in the template to explain the need would be helpful in prioritizing work on the English entry. Do we have a tag to mark the FL definition as waiting for a suitable English definition to be provided? Would a use of {{sense-id}} in the English and FL entries help by providing a way to find the original FL entry problem (missing gloss)? DCDuring TALK 16:36, 12 September 2013 (UTC)[reply]
We also have {{gloss-stub}}. I add it to entries whenever I find that the definition doesn't identify the meaning specific enough. —CodeCat 16:46, 12 September 2013 (UTC)[reply]
That presumably goes in the FL entry and {{rfdef}} goes in a new line at the English L3/L4 section. How could those be linked more or less automagically by the use of {{senseid}} in each? DCDuring TALK 17:25, 12 September 2013 (UTC)[reply]
Install Wikidata. DTLHS (talk) 20:30, 12 September 2013 (UTC)[reply]
And I realize that saying "install wikidata" isn't helpful- just pent up frustration about trying to implement features of a database in something that very much isn't. DTLHS (talk) 20:49, 12 September 2013 (UTC)[reply]
Speaking of Wikidata — I often see statements in the Wikipedia metaspace that there are plans to deploy Wikidata on Wiktionary in some form. At the same time I am yet to see anybody from Wikidata approaching the community here about this. Trying to explain how it would work, how to handle existing dictionary content, and such. I smell a disaster. Keφr 20:55, 12 September 2013 (UTC)[reply]
d:Wikidata:Wiktionary. --Yair rand (talk) 22:06, 12 September 2013 (UTC)[reply]
Huh, I just realized I'm the only Wiktionary admin who's also a Wikidata admin. We're probably going to need some more Wiktionarians paying attention to Wikidata's progress if WD use is going to turn out well here. ... --Yair rand (talk) 22:20, 12 September 2013 (UTC)[reply]
@Ivan. I think we would want to be more than a wikisource for a 110-year old dictionary, no matter how good that dictionary may be, especially as it is already available: e.g., [1]
We split proper nouns senses from common noun senses, but many, many Sanskrit sections do not. Not all Sanskrit sections include the link to the underlying dictionary, which is itself an omission, and not every bit of explanatory note in the original dictionary seems to have survived. A glossary for the abbreviations used does not seem to be included. The language used is not contemporary English and the definitions lack glosses. DCDuring TALK 23:23, 12 September 2013 (UTC)[reply]
For an example of a problem see WT:RFC#सह and join the discussion there. DCDuring TALK 23:25, 12 September 2013 (UTC)[reply]
MW dictionary is perfectly valid even today (because we're dealing with an extinct language, doh), some of the entries were created before the online version of MW dictionary was available, Sanskrit grammar tradition doesn't make the distinction between proper and common nouns, 98% of the words in its definitions are perfectly valid contemporary English as far as I recall, and anyone studying Sanskrit doesn't need a meaning gloss. In other words, there are no problems with Sanskrit entries. --Ivan Štambuk (talk) 16:02, 13 September 2013 (UTC)[reply]
The problem with the dictionary is not the definienda, it's the definiens. We need to convert the definitions to a more contemporary English, at least removing the archaisms and obsolete terms. The entries also need formatting to Wiktionary standards, e.g., Proper noun sections. Including references to the underlying dictionary to aid the work wouldn't hurt. Having excellent coverage of Sanskrit is certainly an important goal for Wiktionary, which perhaps subsequent contributors will achieve. DCDuring TALK 16:42, 13 September 2013 (UTC)[reply]

Pages with protolanguage information?

CodeCat and I have discussed a couple of times the question of reconstructed forms without references in Etymology sections (the most recent discussion is here). One conclusion now seems to be that it would be a good idea to have pages (perhaps in the Appendix) with more detailed historical information, including perhaps original research by Wiktionarians, on specific topics, which could then be linked to from individual words. Case in point: Proto-Baltic vs. Proto-Balto-Slavic. The current tendency goes in the direction of Proto-Balto-Slavic, but there are not many published reconstructions of words out there, whereas Proto-Baltic has clearer sources. Now, if Wiktionarians want to add Proto-Balto-Slavic etymologies, or simply replace the Proto-Baltic label ({{etyl|bat-pro|LANG}}) with the Proto-Balto-Slavic one ({{etyl|ine-bsl-pro|LANG}}) on the assumption that most PB reconstructions will be acceptable PBS reconstructions as well, wouldn't it be nice to have a page (called, say, "Appendix:Proto-Baltic and Proto-Balto-Slavic") that discusses this in detail, with correspondences, derivations, and clear statements of what things in PB we think will remain the same in PBS, and why? In this way, any changes of PB to PBS can be referred to this page: it will be the basic source for the reconstruction, and the interested reader can read it to see on what grounds we have an (as yet unpublished) PBS etymology rather than the (already published) PB one. Also, Appendix pages with reconstructed PBS words could be linked to it. One objection is that this page would contain "encyclopedic" information. Yet I feel that this kind of information is quite vital for someone who is navigating the thorny area of Indo-European etymology and wants to feel sure the etymological information given at Wiktionary is correct and accurate -- as vital as having, say, a page on IPA and its symbols, or a page with the definitions of all grammatical terms used to tag words. What do you guys think? 
--Pereru (talk) 20:25, 12 September 2013 (UTC)[reply]

I think you missed a few important parts of the original discussion. To me, the point of having a special page for this is to act as a repository of sourced knowledge related to the reconstruction of a given language, but it's also the place that we as Wiktionarians would use to collect our own conclusions about certain minor issues surrounding them. Specifically I called it a way to allow original research, while keeping it both contained and publically accessible as a reference for etymologies and reconstructed entries on Wiktionary. Basically, to enable peer review of Wiktionary's reconstructions. I also think that the last two posts in our conversation are important:
Me: My main objection with really big stuff is that it is the kind of area where even professional linguists get things very wrong, so that makes it even more likely for amateurs to miss things. I don't have any professional schooling in linguistics, just a lot of curiosity that made me want to look for things and learn more. So I know a bit I think but what I know is not at a professional level and I don't think it is for anyone else here either. The limitations are mainly there to protect ourselves, Wiktionary and its users from our own incompetence. :P
Pereru: I am a professional linguist, though not an Indo-Europeanist (I work on South American indigenous languages). But one of the things I've learned is to stick to logic and good arguments, because (a) big stars with famous diplomas often think their fame is all they need to justify something, and (b) non-big-stars, without any diplomas, surprisingly often contribute really intelligent, insightful ideas that deserve recognition.
CodeCat 21:03, 12 September 2013 (UTC)[reply]
That is an interesting thing, and I certainly support it. But I do see the main point of having such pages in a dictionary (as opposed to a research journal) in being able to add references to specific reconstructions -- be they in ===Etymology=== sections, be they independent pages on PBS reconstructed forms. --Pereru (talk) 06:48, 13 September 2013 (UTC)[reply]
No we can not add original research by Wiktionarians in etymologies. Neither as reconstructions nor as speculations on word origins. Etymologies are like small encyclopedic articles and all of the Wikipedia policies on no OR and maintaining NPOV apply to them as well. If we allowed original research Wiktionary would become worthless as an etymological dictionary because there would be no way to differentiate among credible sources. We might as well restore H&M's Chinese phonosemantic interpretations. If you want to make up theories on word origins go write a blog or paper. It is not up to us to deem sources "right" or "wrong", but simply to collect all of the competing theories from established authorities and present them to the reader in the most appropriate fashion, taking into account issues such as neutrality, acceptance, and newness.
Proto-Baltic is an obsolete theory and it's quite irritating to see you intentionally replacing Proto-Balto-Slavic reconstructions that can be cited with the ones based on the 1980s scholarship. I don't think that there are linguists today (apart from some Russophobic Baltic nationalists) that dispute PBSl. There is no "tendency", it's a settled matter. There are many details that need to be settled, but the grouping itself is not a point of contention. --Ivan Štambuk (talk) 21:26, 12 September 2013 (UTC)[reply]
But where is it actually cited as policy that we don't allow original research? We constantly do original researching when we document definitions, why is this different? If Wiktionary editors can be lexicographers, why not also etymologists? Pereru explicitly encourages the matter and he is a professional linguist himself, so he understands what is involved. I understand that you want to differentiate reliable theories from bogus ones and that is exactly what this proposal is supposed to prevent, as the idea is just that: to collect all of the competing theories and build up a body of peer reviewed research that can be used to support reconstructions in Wiktionary articles. I wonder if you even understand what has been suggested? —CodeCat 22:10, 12 September 2013 (UTC)[reply]
Writing definitions on the basis of attestations is "original research" in much the same way that writing Wikipedia articles based on cited sources is. We do not invent new meanings, but rather collect the ones attested in usage on the basis of our CFI (which are really "criteria for attestation"). The original part there is to word the definition in a manner that doesn't coincide with any of the existing dictionaries (unless they are out of copyright). No original research is one of the pillars of Wikipedia that protects users from obscure theories, and the project itself from being a propaganda machine for every fringe group that sees the lack of an editorial or peer-review process as an opportunity to present its fringe view.
Pereru is just someone nicknamed "Pereru" (what is a "professional linguist" BTW? Somebody paid by taxpayers to produce work hidden from general populace behind paywalls and costly volumes?). I don't care if he is de Saussure reincarnated.
You seem to be conflating two separate points: 1) etymologist as somebody writing a paper on a word origin, postulating reconstructions and speculating on word origins 2) etymologist as somebody writing an etymological dictionary, which is usually done by every single headword having references to various scholarly opinions, with etymologist then choosing what he thinks is the "best" explanation. We can only do OR in the second sense, by being a synthetic work of the most recent scholarship. Not invent reconstructions and deep theories of word origins based on our own opinions of how languages evolved. Which is what you have been doing and seem to be keen on getting a community approval. --Ivan Štambuk (talk) 08:26, 13 September 2013 (UTC)[reply]
When we apply existing principles known to linguistics to come to a reconstruction that nobody has published before, is that not just applying the same science that linguists do? My intention was specifically to allow our own reconstructions while at the same time have every detail of that reconstruction accounted for by sources. This is currently an area that is lacking like Pereru points out below; we either have references to the whole reconstruction verbatim, or none at all. For example take *dūmas. I don't need a source to tell me that it's a sound reconstruction, because I can see that it fits perfectly with all the relevant sound laws in Balto-Slavic and its descendants. Yet it has no source because no source happens to attest this word in Balto-Slavic, even though every single phoneme of the reconstruction can be accounted for by established and sourced sound laws. Also I'm not sure why you think there would be a lack of peer review. I specifically noted that the whole point of this is peer review, and wikis as a whole are founded on the principle of peer review. So fringe theories would be rejected because there is no consensus for them on Wiktionary. As long as we assume that Wiktionary editors are knowledgeable about the area, there would be peer review of new reconstructions to ensure that the science has been applied correctly according to the most mainstream theories. I have done this with many Germanic reconstructions in the past, and it has worked well. —CodeCat 12:09, 13 September 2013 (UTC)[reply]
By creating a reconstruction you are ipso facto making statements "these words are inherited" and "this is the proto/ancestral form" and "these are the sound laws that have occurred". These statements constitute true original research. Specifically, Proto-Balto-Slavic *dūmas "smoke" is reconstructed by Kortlandt, Derksen and others from Leiden as *dúʔmos, with segmental laryngeal merger as glottal stop, and without the change PIE *o > PBSl *a. *dūmas is far from being a sound reconstruction if radically different alternatives are given by reputable authorities in the field. And I think that I have seen the forms *duHmos or *duHmas as well in the literature.
Wikis are based on the peer review of content that is itself based on solid evidence. There is no peer review of original research. What is susceptible to discussion are issues such as "is this wording neutral" or "is that prominent opinion or theory sufficiently represented". Not completely new and original interpretations of ex-wiki facts that are repeatedly revised by wiki editors.
That you have done such original research with Proto-Germanic - i.e. postulating reconstructions not found anywhere - only demonstrates that urgent action is needed to stop you from turning this project even more into your personal playground. I don't care about Germanic languages much, but some Balto-Slavic reconstructions and paradigms that you've been making are nothing but original research.
I do support however going beyond traditional etymological dictionaries which are constrained by space, by making extended etymologies describing every sound change that has occurred, and have even proposed how these should be formatted the last time PBSl. was discussed in the BP. But not creating our own reconstructions that cannot be found anywhere. There are thousands of published works that deal with proto-forms, and if no reference can be found for a particular reconstruction that doesn't necessarily mean "this reconstruction is unreferencable not because it's implausible, but because no linguist has yet studied it" but rather "this reconstruction is unreferenced because it is implausible, and nobody authoritative has wasted time with it". Formally there is no way to distinguish the two cases, absence of evidence and of counter-evidence. You can combine countless theories on the development of particular properties in proto-languages, yielding dozens of equally "valid" reconstructions that individually cannot be attested, but with each sound change within being attested. --Ivan Štambuk (talk) 16:29, 13 September 2013 (UTC)[reply]
I'll have to agree with CodeCat here: where is it said that there can be no original work? To me, it seems every time you add new definitions to words -- definitions not previously published in other dictionaries --, you are doing original work. Where is it said that original work is not OK on Wiktionary, and why? (All I've seen is references to "Wiktionary is not Wikipedia".)
On your objections:
(a) If we allowed original research Wiktionary would become worthless as an etymological dictionary because there would be no way to differentiate among credible sources -- Why not? All you need to do is make accurate references. If you're taking something from a published source, by all means refer to it! (Shall we make it official policy that reconstructed protoforms are only allowed here with references?) If you're proposing one, write a page here with the details not yet found in published sources and refer to it! In what way is this confusing, and how would this make it impossible to differentiate among credible sources? If anything, references would make it easier to differentiate among these sources... (On the subject of original research, I refer to published etymological dictionaries, in which the authors often advance original contributions and ideas for specific words, always carefully labeling them -- in the LEV, with a letter "K" at the end -- as the author's own work).
(b) It is not up to us to deem sources "right" or "wrong", but simply to collect all of the competing theories from established authorities and present them to the reader in the most appropriate fashion, taking into account issues such as neutrality, acceptance, and newness -- I agree fully. But note that most etymologies thus far presented here at Wiktionary are not like that: they are given without a source, and the casual reader has no way of judging whether they were presented "appropriately", with attention to "neutrality, acceptance, and care". It seems to me that adding a page in which things like PBS vs. PS etymologies could be explicitly discussed would be a great step forward in the direction of achieving precisely the goal you state. (In fact, here is another suggestion: how about a page, maybe in the Appendix, discussing precisely the good and bad points of all published sources for PIE etymologies that are used at Wiktionary, and why we trust some of them more than others? In the interest of full transparency and disclosure, wouldn't this increase the level of precision, as well as trustworthiness, of Wiktionary etymologies as a whole?)
(c) Proto-Baltic is an obsolete theory and it's quite irritating to see you intentionally replacing Proto-Balto-Slavic reconstructions that can be cited with the ones based on the 1980s scholarship. -- If they can be cited, why is (almost) nobody doing that? I've seen a couple of good citations of PBS forms (usually by you, actually), but most PBS forms proposed here have no support in published sources and, as per your own policy (in (b) above), should not be here at all. So why are they, and why is it wrong to remove them and replace them with sourced ones?
I don't care how "well established" you think PBS is (and a couple of Leiden specialists I've talked to -- both Dutch, not "Russophobic Baltic nationalists", whatever that is -- would beg to differ from you): the issue here is "what published source does a given reconstructed form come from"? Currently, almost nobody is adding sources to reconstructions here. If you have a good, published source for PBS etymologies, by all means refer to it! Heed your own advice! But when I see PBS forms being added without supporting evidence, and that in a world, no matter how well established you think PBS to be as a hypothesis, in which published PBS reconstructions are still few and far between, I think that the best policy is -- as you yourself propose! -- to trust the published sources, in which PB is still much more frequent. And, to follow this policy -- which, again, you yourself explicitly subscribe to! -- I delete, and will go on deleting, unsourced PBS etymologies and replacing them with sourced PB ones. After all, in a dictionary, sourced should always defeat unsourced. If a PBS etymology is sourced, it stays. If it isn't, it doesn't. I honestly don't see how you can subscribe to the "honesty and neutrality" policy you described above, and still disagree with that. Unless you simply want to push your personal vision of "what's right" in PBS reconstructions -- in which case, how is this NPOV?
Alternatively, you can do what CodeCat suggests: write a page in which YOU say why it is that PB reconstructions should be relabeled as PBS even in the absence of a published source that explicitly states that PBS = PS. You can sketch arguments, give examples, correspondences, etc... and then cite this page as your source.
How on earth would this be confusing, and how would this create trust problems for Wiktionary? Please riddle me that! If anything, what we're recommending is that things be done more responsibly, and with more references. Don't you think that the current etymologies-without-references bonanza creates a much, much worse trust problem than any PBS-vs-PS page would?
I end up having to agree with CodeCat above: I think you didn't understand what it is you're disagreeing with. There is no contradiction between what is proposed here and any of the principles you espouse. Please read it again. --Pereru (talk) 06:48, 13 September 2013 (UTC)[reply]

Just out of curiosity, because I still don't understand what is at stake here: what's exactly the difference between Proto-Balto-Slavic and Proto-Baltic? Does Proto-Balto-Slavic theory say that there was simply no Proto-Baltic language, but that Latvian and Lithuanian evolved from Proto-Balto-Slavic exactly the same way that Proto-Slavic did? I've just drawn this (sorry for the probably simplistic view) so... which tree represents best the actual Proto-Balto-Slavic theory? The second or the third one? --Fsojic (talk) 13:18, 13 September 2013 (UTC)[reply]

The first has been more or less discredited, although some still hang on to it, maybe for political reasons. The second is how linguists generally saw it in the past. Newer research suggests that there are really three branches of Balto-Slavic (not the same as your third image): East Baltic, West Baltic, and Slavic. Each of those, it is supposed, had its own proto-language, but the proto-language of East and West Baltic together (what is called "Proto-Baltic") is not demonstrably different from Proto-Balto-Slavic itself. That is, if you try to find out what the common ancestor of all Baltic languages was, then you end up with a language that Slavic can also descend from. —CodeCat 14:30, 13 September 2013 (UTC)[reply]
But West Baltic evidence is limited. If one reconstructs a word from Latvian and Lithuanian - or East Baltic in general - alone because there is no known corresponding word in Old Prussian - or West Baltic in general - and label it as Proto-Baltic rather than Proto-East-Baltic (and I suppose some do this; well, I don't know), can we be sure it's the root for Proto-Slavic as well? --Fsojic (talk) 15:17, 13 September 2013 (UTC)[reply]
It's a matter of applying knowledge of how each language evolved, and then making all the ends fit together. Linguists formulate the phonetic evolution of a language through a series of ordered rules called "sound laws", which each act to change the pronunciation of words in some specific way according to certain rules. The sound laws for the Balto-Slavic languages are all more or less known, with some difficulty in the details still, but the general picture is clear. This means that it's fairly easy to find out if a given form can be an ancestor for a given Slavic term. All you need to do is apply all the Balto-Slavic-to-Slavic sound laws and see if the result you get matches what is actually found in attested Slavic or in reconstructed Proto-Slavic. An example: you start with Proto-Balto-Slavic *dūmas. There are two sound laws that apply in this particular case. The first is Balto-Slavic *ū > Slavic *y, the second is masculine nominative singular Balto-Slavic *-as > Proto-Slavic *-ъ. Applying these two rules together gives *dūmas > *dymъ. And that is the form that is actually found in Slavic (see *dymъ). Thus, the reconstruction is correct for Slavic. The same can then be applied to all the other Balto-Slavic languages, and if it matches all of them, then you have successfully reconstructed a Proto-Balto-Slavic term. —CodeCat 15:27, 13 September 2013 (UTC)[reply]
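The procedure described in the comment above (apply an ordered list of sound laws to a proto-form, then compare the result with the attested or reconstructed daughter form) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the list contains just the two laws mentioned in the comment, the regular expressions are a simplification that ignores accent and phonetic context, and the names SOUND_LAWS, apply_sound_laws and can_be_ancestor are made up for this sketch rather than taken from any Wiktionary module.

```python
import re
import unicodedata

# Ordered sound laws as (description, regex pattern, replacement).
# Assumption: only the two laws quoted in the discussion are listed;
# a real Balto-Slavic-to-Slavic rule set would be far longer.
SOUND_LAWS = [
    ("Balto-Slavic *ū > Slavic *y", "ū", "y"),
    ("masc. nom. sg. *-as > Proto-Slavic *-ъ", "as$", "ъ"),
]


def apply_sound_laws(form, laws=SOUND_LAWS):
    """Apply each law in order and return the resulting form."""
    # Normalise so that precomposed and decomposed 'ū' are treated alike.
    form = unicodedata.normalize("NFC", form)
    for _description, pattern, replacement in laws:
        form = re.sub(pattern, replacement, form)
    return form


def can_be_ancestor(proto_form, daughter_form, laws=SOUND_LAWS):
    """True if the laws turn proto_form into daughter_form."""
    expected = unicodedata.normalize("NFC", daughter_form)
    return apply_sound_laws(proto_form, laws) == expected


if __name__ == "__main__":
    print(apply_sound_laws("*dūmas"))
    print(can_be_ancestor("*dūmas", "*dymъ"))
```

A check like this only shows that a form is compatible with the listed laws; it says nothing about whether competing reconstructions would pass as well, so it does not replace a published source.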
Except that not everybody accepts PIE *o > Proto-Balto-Slavic *a. You can get both Baltic and Slavic forms independently from Post-PIE *d(ʰ)ūmos. What is important here is w:Hirt's law yielding Balto-Slavic acute accent with fixed (columnar) paradigm on the root, and which is an exclusive Baltic-Slavic isogloss not found in other branches. Superficially, Lithuanian dūmas is more similar to Sanskrit dhūmás, but "under the hood" it's really not. --Ivan Štambuk (talk) 16:41, 13 September 2013 (UTC)[reply]

I'm starting to think that maybe our Translations sections should only link to target-language-Wiktionary entries that are actually known to exist (just like how we only have interwiki-links to existent pages). Under such an approach:

  • {{t}} would behave like {{tø}} does now.
  • {{t-}} and {{tø}} would redirect to {{t}}, and presumably eventually be eliminated.
  • various tools and bots (Conrad's translation-editor, Kephir's {{t}}-ifier, Rukhabot, etc.) would only deal in {{t}} and {{t+}}.

If y'all are on board with this, I think we'd probably want some sort of vote — the current system, give or take, has been endorsed by votes — but I figured I would start a discussion first, to see (1) if y'all are on board, and (2) if y'all have any alternative/additional ideas.

So . . . any thoughts?

RuakhTALK 20:18, 13 September 2013 (UTC)[reply]

It will not simplify anything for the tools for the same reason the move to {{g}} will not simplify anything until it is completely done, which will not be very soon: in the meantime, we have to deal with both unconverted and converted pages. Complexity in fact at best stays at the same level. My tool always generates {{t}} anyway and will have to recognise existing uses of {{t-}} and {{tø}} (which it currently does not touch at all). For other tools, including bots, it should be similar.
I am mildly opposed, actually. Contributors from foreign Wiktionaries might be actually looking for redlinks into their native Wiktionaries simply to create the missing entries. With the current approach, it takes two middle-clicks, two keyboard shortcuts, some typing and tab switching to copy our entry into their native Wiktionary, or just two clicks to start the entry from scratch. Although now they would have a somewhat hard time actually finding these. Categorising usages of {{t-}} would be useful for this. Maybe not the best use case, but… I can see some value in this.
So why, really? I fail to see any advantage in the above-mentioned… characteristics of this approach, for lack of a better word. Keφr 20:57, 13 September 2013 (UTC)[reply]
I'll start with your third paragraph ("So why, really? [] "), since I think that's the crux of your comment. (I didn't actually give my reasons for thinking we shouldn't link to nonexistent FL-wikt entries; I guess I should have.) The reason is, I think such links are useless clutter:
  • In the case of {{t-}}, they're bright red, like redlinks within en.wikt, but unlike redlinks within en.wikt, there's little chance that readers and editors here will be able to help with them, and they're likely to be not-very-useful for en.wikt readers even once they exist. Note that we don't add red interwiki-links, for example, because the goal is to indicate what FL-wikts information can be found in.
  • In the case of {{t}}, the links aren't bright red, but in a way, that's even worse: it's hard to tell at a glance that it's linking to a non-existent FL-wikt entry (because the external-link blue is so similar to the bluelink blue), so it's a link to trick readers into thinking they're going to get more information, when in fact they're not.
That out of the way . . .
Re: first paragraph ("It will not simplify anything [] "): I'm not sure I completely agree with your literal statement, but I think we can agree on a key point, say, "we shouldn't do this because it's a simplification": you because you don't think it is a simplification, me because I don't think a small technical simplification (even if real) can justify a much-larger functionality change.
Re: second paragraph: Thanks for weighing in. For the specific use-case you mention (contributors from an FL wikt looking for our redlinks to them), I'd be happy to generate language-specific lists, which I think would work better for that use-case than searching for entries with {{t-}}. (And of course, even that use-case doesn't recommend {{t}}'s current behavior.) But if you can think of any other relevant use-cases, I'd be interested to hear about them.
RuakhTALK 03:26, 14 September 2013 (UTC)
Okay, I am fine with that. You can go ahead as far as I am concerned. Keφr 08:37, 14 September 2013 (UTC)
I support this. —CodeCat 01:51, 14 September 2013 (UTC)
  • Support. As for "Contributors from foreign Wiktionaries might be actually looking for redlinks into their native Wiktionaries simply to create the missing entries": I don't think it is en.wikt's job to act as a worklist for other Wiktionaries, presenting the editors of en.wikt with redlinks that they cannot turn blue by editing en.wikt. --Dan Polansky (talk) 08:28, 14 September 2013 (UTC)

X-system and H-system in Esperanto

Discussion moved to Wiktionary talk:About Esperanto#X-system and H-system.

Block of User:MewBot

Ruakh blocked MewBot for updating {{it-noun}} quite profoundly without any prior discussion. This seems to violate WT:BOT#Policy. CodeCat has been unblocking her own bot. Since both the blocking and the unblocking are unilateral, I thought I'd bring it here. I also support an indefinite (but presumably not infinite) block both for this issue and the fact that CodeCat can't always act alone on updating things in her grand vision of things without discussing it first. Mglovesfun (talk) 21:10, 14 September 2013 (UTC)

What are you talking about? You even took part in the discussion, and it wasn't even the only one that took place; there's more on SemperBlotto's talk page. —CodeCat 21:13, 14 September 2013 (UTC)
I must admit that CodeCat can be very annoying, particularly when modifying heavily-used modules/templates without testing them. But in this case, the modifications were discussed with me (the major editor of Italian nouns) in advance, and they seem to work OK. SemperBlotto (talk) 21:21, 14 September 2013 (UTC)
Thanks for bringing this here. Personally, I actually don't support any long-term block: CodeCat obviously enjoys running a bot, and as long as she's using it to do things that the community has agreed should be done, I think that's great. My blocks were under the assumption that she would quickly fix the issue and then unblock it (I think my block-summary even said as much); I had no idea how much of a trial this would be. She's taken it personally, so has started making it personal herself, casting aspersions on my intentions, so now I'm annoyed enough that I'm half-tempted to support a long-term block, :-P but my best current judgment is that I should trust my earlier, non-annoyed judgment. —RuakhTALK 02:43, 15 September 2013 (UTC)
  • I support temporarily blocking User:MewBot for bot actions made without first gaining consensus for them via an appropriate channel such as the Beer parlour. Whenever a dispute arises over whether there is consensus for bot actions, CodeCat should provide links that show there is consensus for their actions. Only after the blocking admin is satisfied that the actions are supported by consensus can User:MewBot be unblocked, on a case-by-case basis. --Dan Polansky (talk) 09:48, 15 September 2013 (UTC)

Eliminating adjective PoS for Ainu

Although John Batchelor included adjectives for Ainu in his works of about 100 years ago, according to scholars such as Tamura and Kumagai [2], Ainu has no adjectives; that category of speech is best characterized as intransitive verbs. Wiktionary has four adjectives, listed at Category:Ainu_adjectives. Are there any objections to changing all of these to verbs? Since these words include the inchoative sense (become X), a possible way to gloss them is "To be/become X." BB12 (talk) 08:19, 15 September 2013 (UTC)

No objection to changing them to verbs. As to the rest: I would say a better way to categorize them is as stative verbs. See Category:Hawaiian stative verbs for one way of handling these without resorting to "be/become" in every definition. Chuck Entz (talk) 09:56, 15 September 2013 (UTC)