Wiktionary:Beer parlour/2014/January

Requesting rights to edit with AWB

I would like to be able to utilise AutoWikiBrowser to do some occasional editing, and am requesting that right as stated at AutoWikiBrowser. I have utilised AWB wiki and have over 100k edits with the tool. Please {{ping}} me whether successful or not, and that will allow me to know when to pop back. Thanks. — billinghurst sDrewth 15:27, 2 January 2014 (UTC)

Added. Please take care not to use the default settings, as Wiktionary pages have different layouts from Wikipedia pages, which AWB tends to want to order in Wikipedia style. Cheers! bd2412 T 21:20, 2 January 2014 (UTC)[reply]
Thanks. As the bulk of my edits are enWS, I fully understand. Primarily some global links replacements at the moment as we tidy some interwiki links. — billinghurst sDrewth 00:50, 3 January 2014 (UTC)[reply]

Translation targets

Per Dan Polansky on the current RFD on vegetable garden: "Keep at least as a translation target. My tentative personal inclusion criterion: The term has to be useful for translation into at least three languages and the three translated terms (i) must be single-word ones and (ii) they must not be closed compounds. The three single-word non-compound translations: French: potager, Russian огоро́д, Italian ortale."

How useful are translation targets in determining whether something should be included in our dictionary or not? Should they be included as a criterion in CFI next to idiomaticity and attestability? Or do we just leave a footnote as to what translation targets are? Do we have ways of measuring the usefulness of translation targets and whether their prevalence (for example, measured by the number of Google hits) merits inclusion in our dictionary, and by extension CFI? TeleComNasSprVen (talk) 21:16, 2 January 2014 (UTC)

Our entry for ortale says vegetable garden. Unless you want a page that tells you how "ortale" translates to French and Russian, I am dubious about the utility. In theory, we could pick a language, put the translation template on the single-word entry for that language, and put a note on the entries for other languages pointing to that initial entry (i.e., put all non-English single-word translations meaning "vegetable garden" on ortale, and put a note at огоро́д and potager saying, "for translations, see ortale"). bd2412 T 21:26, 2 January 2014 (UTC)

People liked to throw around the term "translation target" a lot on our current RFD page as one of the main reasons to keep what would ostensibly be otherwise unidiomatic entries, or in addition to arguments to keep idiomatic but imprecisely defined entries. So I thought, if translation target is brought up so much, especially as a reason to keep such entries (reason to keep == criteria for inclusion), why not formally discuss it in the broader perspective and in terms of its relationship to CFI? TeleComNasSprVen (talk) 00:22, 3 January 2014 (UTC)[reply]

This has been discussed before, but people in favour of deletion usually brush off such discussions with something like "I don't care if language X has a special word for "yellow car" or "elder brother"; in English it's SoP". There's a category, Category:English non-idiomatic translation targets, and a template, {{translation only}}. --Anatoli (обсудить/вклад) 10:58, 4 January 2014 (UTC)

I suppose it does make sense to have an entry for an otherwise SOP term like "elder brother" where a large enough number of languages have a term for it, but I would set the bar much higher than three. bd2412 T 14:52, 4 January 2014 (UTC)[reply]

The problem with our current CFI is that it only looks at one of two use cases for English. People don't look up English words just to find out what they mean, but also to find out how to say them in another language. SoP as a criterion makes sense in the former case, but not in the latter. I pointed this out before: English has a special role that other languages don't have on Wiktionary, and so its requirements should be different too. Using CFI as an argument for deletion doesn't make sense if it's the CFI itself that's broken. It really needs to be fixed so that it allows entries to be used for the second use case (translation) without meeting the criteria for the first (English-only). —CodeCat 14:58, 4 January 2014 (UTC)

I'm not sure it's worth having an entry purely for translations (I am not a fan of elder brother for instance), however, I do think that the existence of single-word translations is sometimes a clue that the referent is a single lexical concept and it can thus be a clue that something is a set term in English. Preferably not the only clue, but I definitely think it's suggestive. Ƿidsiþ 15:03, 4 January 2014 (UTC)[reply]
- I think that it's worth having an entry purely for translations because being a "set term" in English is completely irrelevant for the task of looking up a translation for something. Set terms might be different in other languages, and just because a term is not idiomatic in English doesn't mean that translations are suddenly irrelevant. I definitely think that the CFI for translations should be based on the includability of those translations in other languages, not in English. If our dictionary doesn't offer people a way to find out how to say elder brother in a language where this may be idiomatic or otherwise significantly different from its English equivalent, then there is a gap in our coverage of lexicographical information that we need to fill. —CodeCat 15:11, 4 January 2014 (UTC)[reply]
  - I think that's a good point. I am more concerned with general criteria for inclusion regardless of translations, but if you are interested in translations specifically then there are certainly ‘unidiomatic’ collocations that you might want to include (and all bilingual dictionaries have them). Ƿidsiþ 15:13, 4 January 2014 (UTC)[reply]
    - That's exactly what I mean. Our CFI for English is geared towards a monolingual English dictionary, so any translations are kind of treated as extras. In essence, Wiktionary is monolingual for English, and it's foreign-to-English only for non-English languages, but not English-to-foreign due to the lack of suitable translations for many things that are idiomatic in the target language. —CodeCat 15:18, 4 January 2014 (UTC)[reply]
      - The entry at anh, for example, is used to refer to elder brother, but its meaning has since been extended to refer to the general pronoun you when addressing someone with approximately the same age with respect. It necessarily connotes a level of affection that is difficult to describe in English, but can be best described at our entry friend or friendliness; thus, that is what makes acceptable a term in everyday use. TeleComNasSprVen (talk) 17:54, 4 January 2014 (UTC)[reply]

I'd like to be able to find Vietnamese anh directly, since I know that many Asian languages have single words for older and younger siblings. If we don't have an entry for "commit suicide", it would be hard for me to find tự sát, which is also defined at tự sát@vdict.com and gives a translation to English - "to commit suicide". --Anatoli ^{(обсудить}/^вклад) 06:45, 5 January 2014 (UTC)[reply]

Interesting, because other translations just give "to suicide", not "to commit suicide". TeleComNasSprVen (talk) 07:17, 5 January 2014 (UTC)[reply]

"To suicide" is a synonym, but "to commit suicide" is much more common. 自杀 (zìshā) is translated with the more common verb, "to commit suicide". Translations into foreign languages shouldn't go to a less common term just because the more common term is in danger of being deleted. --Anatoli (обсудить/вклад) 12:04, 5 January 2014 (UTC)

I support the existence of translation targets, but they should not be full entries; they should only have the POS heading, the definition lines containing {{translation only}} and a gloss if other unidiomatic senses are listed, and the translations. No etymology, pronunciation, inflection, categories other than the one added by {{translation only}}, etc. In other words, translation targets should be translation targets, not an excuse to keep unidiomatic entries. — Ungoliant (falai) 15:33, 4 January 2014 (UTC)

Can translation targets that are not idiomatic in English still have multiple senses, and so also multiple translation tables? —CodeCat 15:35, 4 January 2014 (UTC)

I can’t think of any case but I expect we will eventually have some. Suppose some languages have idiomatic equivalents for long table (furniture) and some for long table (grid of data), there would need to be two translation tables at long table. — Ungoliant ^(falai) 15:45, 4 January 2014 (UTC)[reply]

I actually think that's the worst of all worlds. Surely that's just making it less useful to users – if we're going to include it at all, at least write a basic definition. Ƿidsiþ 15:39, 4 January 2014 (UTC)[reply]

If they need a definition, they are probably idiomatic. Otherwise, linking to the individual words of SOP should be enough. — Ungoliant ^(falai) 15:45, 4 January 2014 (UTC)[reply]

I oppose forcing all translation targets into {{translation only}}. For instance, apple tree is good as is. If {{translation only}} were deleted, I would not give a damn. On a side note, CFI's curious redefinition of the word "idiomatic" creates an undesirable rhetorical effect: no-one wants to keep "unidiomatic entries" in the sense of "entries phrased in a way that sounds foreign to native speakers"; check Collins: unidiomatic. --Dan Polansky (talk) 15:50, 4 January 2014 (UTC)

I think that this should not be a criterion. If you assume that elder brother and vegetable garden are not set terms of the language, do not belong to the English vocabulary, then the translation for elder brother could be provided in brother, the translation for vegetable garden could be provided in garden, etc. But vegetable garden should be included as a set term (note that Wikipedia provides a definition and synonyms for this phrase). Of course, apple tree belongs to the English vocabulary, like all other tree names. The change I would support would be to forget sum of parts? in the reasoning. Many set terms belong to the vocabulary, it's useful to learn them when learning the language, even when they are SOP. Lmaltier (talk) 15:57, 4 January 2014 (UTC)[reply]

I agree completely with Lmaltier. We should put these translations where people would think to look. An English speaker would never think "I wonder what languages have idiomatic translations of elder brother." If someone wants to translate elder brother, they will look up [[elder]] and [[brother]]. If [[brother]] contains a translation such as "XYZ (older), ZYX (younger)", then that's what this person would use. The translations at [[elder brother]] would never be looked at. --Wiki Tiki 89 16:16, 4 January 2014 (UTC)[reply]

"They will look up elder and brother" – you seem very sure. I translate things every day as part of my job, and that is not how I work. I would expect a good dictionary to give me a set translation for ‘vegetable garden’ (to take one recent example) and I would never expect to get a suitable foreign-language translation by looking up ‘vegetable’ and ‘garden’ separately. Because it doesn't work. A vegetable garden to me is not just a garden with vegetables in it, it's a single specific idea. This is not something that can be easily quantified, it comes from the instincts of native speakers hearing the way phrases are used and thrown around in conversation, the way they are stressed, how they are qualified etc. etc. Everyone has slightly different ideas about them, which is why RFD is always going to be discursive and will never be reduced to a set of algebraic criteria. ‘Elder brother’ is not a set term to me (adjective + noun always feels less idiomatic to me than noun + noun), but it may be to some people and I wouldn't be so quick to decide how other users are likely to approach this sort of thing. Ƿidsiþ 17:52, 4 January 2014 (UTC)[reply]

"it comes from the instincts of native speakers hearing the way phrases are used and thrown around in conversation, the way they are stressed, how they are qualified..." What you are probably referring to is pragmatics, the study of the social contexts in which language is used. The thing is, most dictionaries only cover syntax, semantics and translations; do we want to include pragmatics as part of our scope? TeleComNasSprVen (talk) 18:02, 4 January 2014 (UTC)

Naturally pragmatics has an impact on how lexicographers choose their headwords, that is hardly controversial. What makes it controversial here is that we have no space restrictions and can therefore be much less arbitrarily restrictive. Ƿidsiþ 18:07, 4 January 2014 (UTC)[reply]

The issue is not space restrictions, but duplication of content. Whenever you duplicate content, you make it harder to maintain. If you need to make a change to the definition of "garden", you might now have to go and make the same change at every type of garden (vegetable garden, flower garden, etc.). --Wiki Tiki 89 18:15, 4 January 2014 (UTC)

One of the most interesting usage scenarios is this. I start at Italian ortale and want to get to other languages from there. I can do that using a translation target, since the following clickable path exists: Italian ortale --> English vegetable garden --> Russian огород. Or Czech jabloň --> English apple tree --> Russian яблоня. Lmaltier and you seem to be saying that you do not care about this usage scenario. --Dan Polansky (talk) 16:40, 4 January 2014 (UTC)[reply]
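As a rough illustration of the clickable path described in this comment, the following minimal sketch treats each English entry as a hub whose translation table links the foreign terms to one another. The `TRANSLATIONS` table and the function names are hypothetical and exist only for this example; the terms themselves are the ones quoted in the thread, and nothing here reflects how Wiktionary actually stores its data.

```python
# Hypothetical miniature of English entries acting as translation targets.
# Keys are English headwords; values map a language code to one translation.
TRANSLATIONS = {
    "vegetable garden": {"it": "ortale", "ru": "огород", "fr": "potager"},
    "apple tree": {"cs": "jabloň", "ru": "яблоня"},
}


def english_targets(word, lang):
    """Return the English headwords whose table lists `word` for `lang`."""
    return [en for en, table in TRANSLATIONS.items() if table.get(lang) == word]


def pivot(word, src, dst):
    """Follow the path: src-language word -> English entry -> dst-language word."""
    results = []
    for en in english_targets(word, src):
        target = TRANSLATIONS[en].get(dst)
        if target is not None:
            results.append((en, target))
    return results


print(pivot("ortale", "it", "ru"))
print(pivot("jabloň", "cs", "ru"))
```

If the English hub entry is missing, `english_targets` returns nothing and the path breaks, which is the situation this usage scenario is concerned with.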

Sorry? I support the presence of vegetable garden and of apple tree. When the translation is not a set term (which is not very usual), this does not work, but there is a solution nonetheless: providing a translation table in all pages (even for non-English words). This is what de.wikt does, and I think they are right. But each translation must be added by someone knowing both languages: large translation tables should not be copied from one page to another, as this would lead to errors. Lmaltier (talk) 19:24, 17 January 2014 (UTC)

You may support the inclusion of "vegetable garden" and "apple tree", but not specifically in order to support the usage scenario. Whenever the scenario applies but a sum-of-parts term is not "part of vocabulary" (whatever you mean by that), you do not support the inclusion. Instead, you support putting translation tables on non-English term pages, a proposal that has very few supporters, from what I have seen. The translation target criterion is an alternative to that proposal, one introducing much less duplicate content. --Dan Polansky (talk) 20:05, 17 January 2014 (UTC)

I think we should think about incorporating common collocations into the page structure, including giving translation tables for them. In cases where a collocation is idiomatic, the collocation would consist of a soft redirect to that entry. After all, most translation dictionaries do something along those lines. That would make even SOP phrases searchable without forcing us to decide between diffusing our content further and omitting it. Chuck Entz (talk) 21:43, 4 January 2014 (UTC)[reply]

I agree with that too. I have no problem listing things like "vegetable garden" and its translations at [[garden]]. --Wiki Tiki 89 21:49, 4 January 2014 (UTC)[reply]

In Russian сад (sad, “garden”) and огоро́д (ogoród, “vegetable garden”) are not considered synonyms and are often used in contrast. Even "the grounds at the front or back of a house" may fit various types of garden but it's very vague in reference to огоро́д (ogoród), which is never a garden with trees or flowers but a garden for vegetables, sometimes berries.

"Translation target" entries make it possible to look up words that can normally be found only by looking up foreign-language words, and they allow translations into multiple languages, where the term may be idiomatic in some languages and non-idiomatic in others, as in English.
Entries considered non-idiomatic in English are used in bilingual dictionaries, see "vegetable garden" as a translation into English in the following example dictionaries: in Mandarin: 菜园@Nciku, Japanese: 菜園@Goo dictionary, Russian: огород@ABBYY Lingvo.
No one suggests keeping ridiculous collocations like "yellow car" or other silliness. Examples like "vegetable garden" are more typical, and there are numerous languages with single-word translations of the term. Non-translatable words, such as почему́чка (počemúčka), don't need translation target entries: there's no translation, and the term is explained rather than translated - "a person, often a child, who asks a lot of questions".

Not really trying to change some people's opinion, just expressing my own. --Anatoli ^{(обсудить}/^вклад) 06:33, 5 January 2014 (UTC)[reply]

Do you really want entries such as small bird, because there are words meaning small bird in other languages? If rules are clear, translators would look at bird to find translations of small bird. Personally, I'm convinced that vegetable garden or apple tree must be included (and maybe elder brother, but I'm not sure at all), but as set terms of the language, as parts of the vocabulary, not only as translation targets. Lmaltier (talk) 08:18, 5 January 2014 (UTC)[reply]

It's a fine line, no doubt. I'm glad you asked about small bird. I was going to say that we perhaps shouldn't include diminutives, because you can make diminutives for almost everything in e.g. Slavic languages, German, etc. It's part of a language. So, Russian пти́чка (ptíčka) and German Vögelchen are forms of пти́ца (ptíca) and Vogel, not "little bird", but we have "birdie", which is similar in meaning. I agree that vegetable garden, apple tree and elder brother should be included for other reasons as well. We could define CFI for translation target entries to filter out some silly possibilities, but there's too much opposition at the moment to the inclusion of even obviously idiomatic words in other languages if the English equivalent seems not to be idiomatic. --Anatoli (обсудить/вклад) 08:34, 5 January 2014 (UTC)

Russian сад (sad) and огород (ogorod) are definitely not synonyms, but they both translate to English garden. The SOP phrase "vegetable garden" is a way to specify what type of garden you mean. There is nothing wrong with listing them both as translations of garden, because anyone who wants to use these terms would have to click on them anyway to see which is better. Translation tables are not supposed to give all the information needed to use a word, they are only supposed to point you to the entries of words you need to consider. --Wiki Tiki 89 17:04, 5 January 2014 (UTC)[reply]

There are also terms which are morphologically the same as the English word, but meet CFI while the English translation does not. This happens because of the strange way we consider spacing in CFI. So while desk drawer doesn't seem to meet CFI (from how I think others interpret it), its Dutch morphological synonym bureaula does. If we can't create desk drawer, then how would we best direct people to the entry bureaula? If we list it at drawer, then we would have to list all derived terms of la there too if their translations into English don't meet CFI themselves. That seems like a very bad idea, because there may be some words which have dozens of derived terms, and none of them have CFI-compliant English translations. —CodeCat 17:47, 5 January 2014 (UTC)

Translators need to know how to form simple compounds anyway. bureau la is just as SOP as desk drawer, just that we include it because people might not know how to parse it since there are no spaces. --Wiki Tiki 89 17:59, 5 January 2014 (UTC)[reply]

Imagine a company that makes hand-illustrated catalogs for furniture sellers. Now imagine that the company has many employees who specialize in drawing specific kinds of furniture. Now imagine that one of these employees is tasked with drawing desks. There is your desk drawer. bd2412 T 19:18, 5 January 2014 (UTC)[reply]

Which is further proof that it is SOP. --Wiki Tiki 89 19:24, 5 January 2014 (UTC)[reply]

You're talking about a specific example, but my point was general. What do we do when a compound term meets CFI but its English translation, that is a compound of the same terms, does not? If you need more examples, you can start with Category:Finnish compound words and Category:Dutch compound words. This issue applies to all terms in those categories whose English translation does not meet CFI. —CodeCat 19:27, 5 January 2014 (UTC)

What I said applies to all compound terms in other languages. --Wiki Tiki 89 23:39, 5 January 2014 (UTC)[reply]

Bihari language(s)

Hi,

The Module:languages/data2 currently considers that Bihari is a language—not a language family. Same Category:Bihari language. However, Wikipedia considers that Bihari is a family of languages (w:Bihari languages).

Therefore, the Wiktionary does not consider Maithili language as a language of the "family" Bihari, and similarly for Bhojpuri language, for instance.

Why the difference? — Automatik (talk) 22:18, 2 January 2014 (UTC)[reply]

RFV due diligence

I would like to propose that when editors start a new thread at RFV they should be required to do at least a cursory search themselves before posting and state in the post where they have already looked. There have been numerous RFVs where appropriate cites just jump out of the screen on the first page of gbooks results when I looked for them. Ultimately, if no one at all bothers to look for cites, perfectly good entries will end up being deleted. Also, it is helpful to those following on with some research to know what corpora (gbooks, scholar, usenet, jstor, coca etc) have already been looked at, what search terms were used, and, if a lot of results were obtained, how many of them were checked. This requirement will help keep the RFV page shorter with only items that really do need some research and will better focus our limited resources. Spinning Spark 12:18, 4 January 2014 (UTC)

I strongly support this proposition. Saying "I doubt this exists" should not be enough, when a quick search will reveal that it unequivocally does. bd2412 T 19:19, 5 January 2014 (UTC)[reply]
I don't feel comfortable with this idea. It creates a certain elitism and would discourage new users or users who don't know how to find proper sources. RFV is an essential part of our peer review system and we shouldn't add conditions to it. It should be open to everyone regardless of their ability or experience. —CodeCat 19:23, 5 January 2014 (UTC)[reply]
- It is not that difficult to explain to any editor how to do a quick search for proper sources. Go to Google Books, type the word, click search. Look at what comes back. If there are several hits using the word as defined, no need for RfV. If that initial search doesn't turn up anything, send to RfV. bd2412 T 20:04, 5 January 2014 (UTC)[reply]
  - We can suggest that as part of the RFV process, but I don't agree with making it a definite requirement to be allowed to submit RFVs. —CodeCat 20:06, 5 January 2014 (UTC)
- We could allow trivially-citable definitions to be speedily kept instead. The effect will be the same (time not being wasted unnecessarily searching for citations), and we won’t need to berate newcomers for breaking the rules. Care must be taken to prevent abuse though. — Ungoliant ^(falai) 20:33, 5 January 2014 (UTC)[reply]
  We already can - and do - claim "widespread use" as a way of moving some RfVs along. Most of the really hard RfVs are for senses of polysemic terms for which this kind of change does us no good. The worst consequence of RfVs is that they distort our priorities for inserting citations into definitions and take some editors away from other efforts they find more interesting.

Perhaps we need to make it easier to keep definitions that are close to the ones other dictionaries have. How about setting up {{RfV}} to insert {{R:OneLook}} and (an updated version of) {{googles}} onto the RfV page, onto the bottom of the entry's talk page, or onto some temporary subpage of it? Even for experienced editors it might provide a useful workspace. DCDuring TALK 22:17, 5 January 2014 (UTC)

For a template that implements something like the easy part of the idea, see {{taxlook}} in use at Wiktionary:Requested_entries_(Scientific_names)/A#Ag. DCDuring TALK 22:21, 5 January 2014 (UTC)[reply]
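The quick check proposed earlier in this thread (search a corpus, look at the hits, and only open an RFV when too few of them use the word as defined) could be sketched roughly as follows. This is only an illustration: the `Hit` record, the function names, and the threshold of three supporting hits are all assumptions (the thread itself only says "several hits"), and judging whether a hit really uses the sense is left to the editor.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    source: str       # corpus the hit came from, e.g. "gbooks" or "usenet"
    uses_sense: bool   # editor's own judgement: is the word used as defined?


def needs_rfv(hits, min_supporting=3):
    """True when too few hits support the sense (threshold is an assumption)."""
    supporting = sum(1 for h in hits if h.uses_sense)
    return supporting < min_supporting


def search_note(hits):
    """Summarise what was checked, for pasting into an RFV post."""
    corpora = sorted({h.source for h in hits})
    supporting = sum(1 for h in hits if h.uses_sense)
    return (
        f"checked {len(hits)} hits in {', '.join(corpora) or 'no corpora'}; "
        f"{supporting} support the sense"
    )


hits = [Hit("gbooks", True), Hit("usenet", False)]
print(needs_rfv(hits))
print(search_note(hits))
```

The note produced by `search_note` covers the information the original proposal asks posters to state: which corpora were looked at and how many results were checked.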

socialism

Any chance the page socialism could be unprotected? Lots of translations to be checked... -WF

I made you an autopatroller, hope you don't mind. You can edit it now. --Wiki Tiki 89 16:43, 4 January 2014 (UTC)[reply]

Period at the end of pedia template

The {{pedia}} template and the like now show text that ends in a period. That was recently not so, and is not so in {{R:Webster 1913}}, {{R:Century 1911}} and the like. Please, if you know how, remove the period again. --Dan Polansky (talk) 10:48, 5 January 2014 (UTC)[reply]

The most recent edit to that template was in June which still has the period, and I don't think it introduced the period... TeleComNasSprVen (talk) 10:53, 5 January 2014 (UTC)[reply]

Template:borrowed

What's the difference between this and Template:borrowing? I'm currently trying to find a template that would support a term borrowed from a borrowed term, and I'm unsure as to which of these to use. The entry I'm looking for is phim which is a term borrowed from the French term film which in turn was borrowed from English film. TeleComNasSprVen (talk) 22:55, 5 January 2014 (UTC)[reply]

Please try this format - phim]. You can add other info, if you wish. --Anatoli ^{(обсудить}/^вклад) 02:03, 6 January 2014 (UTC)[reply]

Indicating gender of agents with ♂ and ♀

What do you think of this diff and using {{qualifier|♂}} vs {{qualifier|male}} and {{qualifier|♀}} vs {{qualifier|female}} in translations in general? --Anatoli ^{(обсудить}/^вклад) 01:51, 6 January 2014 (UTC)[reply]

I guess if we wanted to distinguish between grammatical gender and the gender of the physical things being referred to this might make sense- but I would still prefer "male" and "female" to ♂ and ♀. DTLHS (talk) 01:58, 6 January 2014 (UTC)[reply]

I prefer "male" and "female" to ♂ and ♀ as well, that's why I'm asking. --Anatoli ^{(обсудить}/^вклад) 02:06, 6 January 2014 (UTC)[reply]

I also prefer "male" and "female", especially since I can never remember which of ♂ and ♀ is male and which is female. --Wiki Tiki 89 02:15, 6 January 2014 (UTC)[reply]

Remember the pointy bit! ♂ is supposed to be the shield and spear of Mars. Equinox ◑ 02:21, 6 January 2014 (UTC)[reply]

In theory I prefer the idea of language-independent icons, but in practice they don't look great, and seem to suggest human sex rather than grammatical gender; so perhaps not. Equinox ◑ 02:21, 6 January 2014 (UTC)[reply]

Well they are supposed to represent natural gender, not grammatical gender. For grammatical gender we use m and f (masculine and feminine). The natural and grammatical genders do not always match. --Wiki Tiki 89 02:24, 6 January 2014 (UTC)[reply]

I wonder how widespread font support is for those characters- If they're not universally supported, they shouldn't be used to convey meaning in entries. This contributor tends to come up with "bright ideas" and to massively implement them without thinking them through, stubbornly resisting any change or removal. Chuck Entz (talk) 02:29, 6 January 2014 (UTC)[reply]

I have noticed that they are pretty common. It can't be only him adding them. I always thought that that was how it was supposed to be. --Wiki Tiki 89 02:32, 6 January 2014 (UTC)[reply]

They are very common in Volapük translations, whoever it is that edits in this language.

Not sure if I need to revert those edits - manicurist and pedicurist, especially because Russian had {{qualifier|male}} and {{qualifier|female}}. --Anatoli ^{(обсудить}/^вклад) 02:43, 6 January 2014 (UTC)[reply]

And they've no doubt contributed most of them. As for reverting them: like I said, they tend to massively implement things. If they didn't just start doing this, you'd need a bot. Chuck Entz (talk) 02:54, 6 January 2014 (UTC)[reply]

I have invited the editor to this discussion. IMHO, we should have some consistency in marking the sex of the agents. --Anatoli (обсудить/вклад) 03:40, 6 January 2014 (UTC)

All stubborn people must be exterminated. --Æ&Œ (talk) 04:12, 6 January 2014 (UTC)[reply]

We're not Nazis :) Hans-Friedrich Tamke (talk • contribs) seems otherwise a good editor. --Anatoli ^{(обсудить}/^вклад) 04:17, 6 January 2014 (UTC)[reply]

Just imagine how much better this project would be if all of the stubborn people were gone forever. --Æ&Œ (talk) 04:25, 6 January 2014 (UTC)[reply]

I have been using the symbols ♂ and ♀ to indicate the actual sex of animals in Volapük translations because this language does not have grammatical gender. I originally used the words "male or female", "male", "female" for words such as pijun/jevod, hipijun/hijevod, jipijun/jijevod (cf. German Taube f/Pferd n, Täuber m/Hengst m, Täubin f/Stute f). I prefer the symbols ♂♀ to indicate the sex of animals because they are shorter, but I can go back to using the full words. Hans-Friedrich Tamke (talk) 04:29, 6 January 2014 (UTC)[reply]

Danke, Hans. Æ&Œ was already going to kill, which would be sad. --Anatoli ^{(обсудить}/^вклад) 04:33, 6 January 2014 (UTC)[reply]

Korean 하다-verbs

Korean 하다-verbs are analogous to Japanese する-verbs, which are all converted to verbal noun entries and are categorised here: Category:Japanese type 3 verbs. Korean 하다 (hada) means the same as Japanese する (suru) - "to do" - but it's used to form verbs from nouns; both are extremely productive. Japanese suru-verbs are all placed in the verbal noun entries, see e.g. 挨拶, which has both noun and verb sections.

Does anyone object to converting Korean hada-verbs to a similar structure? That way all hada-verbs will appear in the entries they are derived from.

I'm picking the first hada-verb from the list of Korean verbs: 구별하다 (gubyeol-hada)

I'm suggesting to move the verb section to 구별 (gubyeol) (noun, verbal noun) and convert it as follows:

Verb
구별 + 하다 (gubyeol-hada)

Inflection follows. --Anatoli ^{(обсудить}/^вклад) 06:14, 6 January 2014 (UTC)[reply]

Does it make sense or is no one interested? Technically it is not very challenging but will require significant changes and update work. --Anatoli (обсудить/вклад) 23:08, 6 January 2014 (UTC)

If you do so for Japanese verbs then you can do so also for Korean verbs, though the online dictionary of the National Institute of the Korean Language lists them separately. There are some inseparable 하다-verbs such as 속하다 (sokhada) just like 属する in Japanese. There are also 하다-adjectives such as 건강 (geon'gang) (health) → 건강하다 (geon'ganghada) (be healthy), which are analogous to Japanese adjectival nouns such as 健康. — TAKASUGI Shinji (talk) 00:48, 7 January 2014 (UTC)[reply]

Yes, I'm aware of this but thanks for reminding me. The fact that some dictionaries treat them differently made me think that we don't need to delete existing 하다-verbs but make a soft redirect, the Japanese する-verbs could also use soft redirects with a descriptive template, since some dictionaries (even if not the best or quality dictionaries) and textbooks treat them as separate words as well and it would also benefit English users by the logic "if there's a verb for "to study" in English then there should be such a verb in Japanese" (勉強する (べんきょうする, benkyō suru)) and Korean (공부하다 (kongbu-hada)). I'd like to also apply the method of translations, so that translations into ko and ja, like in the preceding examples link to verbal nouns, rather than 하다/する- verb entries. I have been using this for Japanese but not for Korean, always hesitating if verbs 공부하다 should link to 공부하다 or 공부. --Anatoli ^{(обсудить}/^вклад) 01:06, 7 January 2014 (UTC)[reply]

BTW, there's an important RFD discussion about Vietnamese non-lemma forms: Wiktionary:Requests_for_deletion#Non-idiomatic_Vietnamese_words. Specifically, it concerns nominalised verb forms (verbs turned into nouns), which are very similar to the Japanese and Korean verbs and nouns above, only the situation is reversed (the verb is the lemma, the noun is the derivation). --Anatoli (обсудить/вклад) 01:13, 7 January 2014 (UTC)[reply]

I think they are best kept separate. -hada is extremely productive on native roots as well. For native morphemes unused on their own, -hada is just one of the derivational suffixes, alongside -seureopda, -hi, -doeda, -daeda, -i, -georida etc.

eg. 삐딱 (ppittak, “*inclinedness”) should be a root page containing links to 삐딱삐딱 (ppittakppittak), 삐딱이 (ppittagi), 삐딱하다 (ppittakhada), 삐딱대다 (ppittakdaeda), 삐딱거리다 (ppittakgeorida), as well as etymologically related terms 빼딱 (ppaettak), 비딱 (bittak), 비스듬 (biseudeum) and 비뚤다 (bittulda), and should not be a page where contents from all these pages are combined.

eg2. 두근 (dugeun, “*palpitation”) should contain links to 두근두근 (dugeundugeun), 두근두근하다 (dugeundugeunhada), 두근거리다 (dugeun'georida), 두근대다 (dugeundaeda), and 도근 (dogeun). Wyang (talk) 08:10, 9 January 2014 (UTC)[reply]

{{look}} What does everyone think of the 하다-verb section at 구별? Haplogy (話) 13:13, 15 January 2014 (UTC)[reply]

Great edit, as usual, thanks, Haplology. I have added a conjugation section and a usage example from 구별하다. I'm not sure we can rely on automatic transliteration yet, though. Perhaps a hangeul and rv (transliteration) should be provided for the stem? If it works for Japanese, it will also work for Korean without any information loss and without duplication. Also, "gubyeol-hada" should probably be "gubyeor-hada" due to phonetic changes, but the Korean transliteration method is discussed here: Module_talk:ko-translit/testcases. --Anatoli (обсудить/вклад) 22:29, 15 January 2014 (UTC)[reply]
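The l/r question above turns on the syllable-final ㄹ of 별 being followed by the initial ㅎ of 하. A small Python sketch can make that boundary visible by decomposing each syllable block into its jamo with Unicode normalization. It only exposes the letters; it does not implement Module:ko-translit or any particular romanization rule, and the helper names are made up for this example.

```python
import unicodedata


def jamo(text):
    """Decompose precomposed Hangul syllables into conjoining jamo (NFD)."""
    return list(unicodedata.normalize("NFD", text))


def jamo_names(text):
    """Unicode character names of the jamo, e.g. 'HANGUL JONGSEONG RIEUL'."""
    return [unicodedata.name(ch) for ch in jamo(text)]


for syllable in "구별하다":
    print(syllable, jamo_names(syllable))
```

For 별 the last name printed is the final (jongseong) RIEUL and for 하 the first is the initial (choseong) HIEUH, which is the sequence a transliteration rule would have to decide how to render.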

What about my comment above? Should 삐딱 (ppittak) have all the subsections of 삐딱삐딱 (ppittakppittak), 삐딱이 (ppittagi), 삐딱하다 (ppittakhada), 삐딱대다 (ppittakdaeda), 삐딱거리다 (ppittakgeorida)? Wyang (talk) 22:19, 15 January 2014 (UTC)[reply]

You have legitimate concerns, but nothing will happen without a prior discussion. There is no special treatment for Japanese -になる (- ni naru) verbs, for example. The above terms can be mentioned in the "related terms", though. I also think that the existing hada-verbs don't need to be deleted but can be made (soft or hard) redirects to the lemma, e.g. 구별되다 -> 구별 --Anatoli (обсудить/вклад) 22:29, 15 January 2014 (UTC)[reply]

That was sort of a demo edit intended to get discussion going again. I respect everybody's opinion. I thought Wyang's comments might actually be in support: I took "keep them separate" to mean keeping the lemma and -hada separate, not merging them into their own term with its own page. I admit I don't know much about Korean, but as a general rule, if a term has a trivial etymology it should have at most a soft redirect to a more primary term. -suru verbs had {{ja-suru_etyl}} and liberal use of {{ja-see-also}}. "See also" is a formatting hack to plug holes in the existing entry layout, not something to use for every single term. Keeping things on the same page helps enormously with Japanese so I expect it would help Korean editors on the day when we finally get some. Haplogy (話) 00:47, 17 January 2014 (UTC)[reply]

using /a/ vs /æ/ for the 'trap' vowel in RP

The Received Pronunciation of the vowel of trap, cat and apple is sometimes given as /æ/, and other times given as /a/. Until recently, I had only encountered /æ/ here (and on Wikipedia), but then one user began using /a/; another user objected, leading to this thread. Would anyone else like to give their input on what the trap vowel is in RP and how it should be transcribed? Also: is the vowel the same (and should it be transcribed the same) in RP and in the accent we designate {{a|UK}}? - -sche (discuss) 08:17, 6 January 2014 (UTC)[reply]

I think I'm the "one user" you mention, although I don't think I'm the only one to use /a/. Let me make the case though by pointing to three major sources – the first is the Upton system of transcribing RP as used by Oxford Dictionaries including the OED (3rd Edition) and the Oxford Dictionary of Pronunciation of Current English; you can see that here. The second is the huge survey of RP vowel sounds carried out by the British Library, which is here. (This source refers to both phones, but classes /æ/ as ‘conservative RP’, the same category as using /e/ for DRESS and /ɔː/ for CLOTH. This is extremely old-fashioned.) The third is Hughes, Trudgill and Watt 2005, English Accents and Dialects, which uses /a/ throughout for what they call ‘Standard RP’. I also note that IPA for English uses /a/. Obviously we can choose whatever symbol we like for this phoneme, but I strongly believe that /a/ is simpler and more accurate and allows us to represent the clear difference in this vowel between the UK and the US. Ƿidsiþ 08:43, 6 January 2014 (UTC)[reply]

... after edit conflict ... Whenever I see /æ/ for a British pronunciation, I always mentally convert this to /a/ for my northern pronunciation. I notice that the 3rd edition of the OED is using /a/, not /æ/ (though the 2nd edition on-line entries still have /æ/). My personal opinion is that we should use "General British" or "BBC English" rather than some claim to represent outdated Received Pronunciation, but I struggle to reproduce RP so I'm not an expert on this subject. Having read Widsith's expert comment above, I support his view. It is a useful distinction to make to show that there is a difference between the British and the American vowel. The British older RP /æ/ is not at all the same as the American version. Dbfirs 08:52, 6 January 2014 (UTC)[reply]

There is barely a difference between the RP vowel in trap and the American vowel in trap among Americans who don't use the so-called "tense æ". Using /æ/ in our transcriptions of RP is as accurate as using /a/ and allows us to keep down the proliferation of transliterations: it's much preferable to write "(General American, Received Pronunciation) IPA(key): /tɹæp/" than "(General American) IPA(key): /tɹæp/ / (Received Pronunciation) IPA(key): /tɹap/", not only because the former is shorter but also because the latter implies a greater phonetic distinction than actually exists. —Aɴɢʀ (talk) 09:14, 6 January 2014 (UTC)[reply]
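The practical effect of the symbol choice on entry layout can be illustrated with a short Python sketch that groups accent labels sharing an identical transcription. The function name and the output format are assumptions made for this example; it is not how any Wiktionary pronunciation template is implemented.

```python
def merge_pronunciations(entries):
    """Group accent labels that share the same transcription.

    entries: list of (accent_label, transcription) pairs, in display order.
    Returns one display line per distinct transcription.
    """
    grouped = {}  # dicts keep insertion order in Python 3.7+
    for accent, ipa in entries:
        grouped.setdefault(ipa, []).append(accent)
    return [
        "({}) IPA(key): {}".format(", ".join(accents), ipa)
        for ipa, accents in grouped.items()
    ]


same = [("General American", "/tɹæp/"), ("Received Pronunciation", "/tɹæp/")]
different = [("General American", "/tɹæp/"), ("Received Pronunciation", "/tɹap/")]

print(merge_pronunciations(same))       # one shared line
print(merge_pronunciations(different))  # two separate lines
```

With /æ/ for both accents the entry collapses to a single line; with /a/ for RP it needs two. That is the layout cost being weighed here against phonetic accuracy.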

P.S. We should never use {{a|UK}} since there is no single monolithic UK accent. For that matter we should never use {{a|US}} either. —Aɴɢʀ (talk) 09:17, 6 January 2014 (UTC)[reply]

That's funny, for me even with lax /æ/ I think it's one of the most obvious differences in the two accents. If you ask a Brit to "put on an American accent" and say "man", basically what they do is shift from /a/ to /æ/. Ƿidsiþ 09:20, 6 January 2014 (UTC)[reply]

To respond to the wider point, though: well it depends what you want to do. If you want consistency between dialects, we could easily find a phonetic system which works for ALL of them. We've chosen not to do that though; we've chosen to transcribe different countries' accents separately. That being the case, I believe it's better to make each dialect consistent with its own phonetics rather than try to match up symbols artificially. Ƿidsiþ 09:26, 6 January 2014 (UTC)[reply]

Well, man is a different case because æ-tensing in AmEng is much more common before nasals than it is before voiceless stops like /p/. But even there it isn't as widespread in AmEng as many Brits think it is. It's what they notice because it's so different from what they say themselves, but what they don't notice is the large number of Americans who don't have æ-tensing even before nasals. (And in my experience when Brits put on an American accent and say a word like man, they don't shift to /æ/, they shift to /eə/.) I'm not advocating a pandialectal transcription here like the one used at Wikipedia, but I don't see any point in using different symbols for sounds that are practically indistinguishable. The RP pronunciation of trap is much closer in quality (not only duration) to the GenAm pronunciation of trap than it is to, say, the Texan pronunciation of tripe (/tɹaːp/). —Aɴɢʀ (talk) 09:37, 6 January 2014 (UTC)[reply]

Yes, the American /æ/ is one of the distinguishing marks of an American accent to us Brits, but I agree that we exaggerate it to /eə/ . The shorter (outdated?) British /æ/ sounds quite different to me. I suppose it is impossible to represent all variants. /a/ is standard in northern English. Dbfirs 09:45, 6 January 2014 (UTC)[reply]

(And in southern English.) Ƿidsiþ 09:46, 6 January 2014 (UTC)[reply]

That's what I thought, but I don't know enough southerners to be sure how universal it has become. Dbfirs 09:49, 6 January 2014 (UTC)[reply]

I'd say the RP trap vowel is closer to the Nebraska trap vowel than it is to the Yorkshire trap vowel. If we're going to transcribe the Nebraska vowel as /æ/ and the Yorkshire vowel as /a/ it only makes sense to transcribe the RP vowel as /æ/. (Incidentally, in California /æ/ before nonnasal consonants is itself being lowered toward /a/, so the Los Angeles trap vowel may sound like the Yorkshire one.) —Aɴɢʀ (talk) 09:55, 6 January 2014 (UTC)[reply]

The problem with this kind of discussion is that it inevitably lapses into anecdotal evidence and personal interpretations. That is why in my first comment I restricted myself to citing sources, which I think show the general consensus that the RP TRAP-sound is now best represented as /a/. I think we should follow these sources, however I do recognise your general point that this to some extent downplays similarities between UK and US English. Ƿidsiþ 09:59, 6 January 2014 (UTC)[reply]

Yes, there are wide variations on both sides of the pond. If the OED has changed, should we follow? Dbfirs 10:05, 6 January 2014 (UTC)[reply]

The pronunciation dictionaries of Gimson and Wells use /æ/ for RP, as does the Collins dictionary. I don't think there is any "general consensus that the RP TRAP-sound is now best represented as /a/". —Aɴɢʀ (talk) 10:44, 6 January 2014 (UTC)[reply]

To my American ears, the sound file illustrating "conservative RP /æ/" in jam at the BL link Widsith gave above sounds more like my /ɛ/, while the sound file illustrating "/a/" in platform sounds basically identical to my /æ/. So maybe we should switch to /a/ for GenAm too? —Aɴɢʀ (talk) 10:49, 6 January 2014 (UTC)[reply]

Maybe; I don't really have an opinion on US transcription. As for your references though, Gimson's RP transcription system was developed in 1962; he (and Wells) also recommended /e/ for DRESS and /ɔː/ for CLOTH, but we have rightly rejected those. The outdatedness is exactly why Upton was commissioned to update RP. Wells himself has commented on the changes here: "It is well known that the quality of the RP bat vowel has changed since the 1930's. It is now more similar to "cardinal [a]" than it used to be. Hence Upton's choice of the [a] symbol. A more conservative line is to stick with the familiar symbol [æ], but to redefine it as appropriate. […] A further argument in favour of retaining the symbol [æ] is that it preserves the parallelism with American and Australian English, in which the movement towards an opener quality has not taken place." This is somewhat defensive (it was Wells's system that was being replaced) but even there he acknowledges the change in UK pronunciation and the difference with US/Australia. Ƿidsiþ 11:12, 6 January 2014 (UTC)[reply]

Yet in spite of his acknowledgment that the rendering of TRAP is closer to /a/ than it used to be, he still argues for using /æ/. And it's not as if we're required to follow what other dictionaries do. After all, we render the English r-sound as /ɹ/, which no other English dictionary on the planet does. —Aɴɢʀ (talk) 12:20, 6 January 2014 (UTC)[reply]

Vowels seem to be still shifting on both sides of the pond. We don't want to get stuck in the first half of the last century. I still support /a/ but I admit that I'm biased, coming from the northern half of England where /æ/ is rare. I suppose I can continue to mentally convert Wiktionary's /æ/ to the OED's /a/. Dbfirs 16:08, 6 January 2014 (UTC)[reply]

Having listened to the two audio samples at apple, I can now hear the difference that Widsith is talking about. While the US sample has a clear [æ], the British one has something in between [æ] and [a], which I would transcribe narrowly as [æ̞]. I still think we should transcribe it broadly as /æ/. There are other differences between UK and US English that we never transcribe, such as the British [l] versus the nearly universal American dental [l̪]. In many American and British accents, /uː/ is diphthongized as [ʉʊ̯], and/or /iː/ is diphthongized as [ɪi̯]. We don't need to indicate such differences. --WikiTiki89 16:32, 6 January 2014 (UTC)[reply]

I was reading w:Diaphoneme just now. Maybe such an approach would be useful here? —CodeCat 16:43, 6 January 2014 (UTC)[reply]

I have always supported at least a partially diaphonemic approach. We want to show users the phonetic structure of a word, and only second to that the actual pronunciation in various dialects. --WikiTiki89 16:47, 6 January 2014 (UTC)[reply]

I sort of agree with that, inasmuch as I think we should aim for a quasi-phonemic approach rather than very narrow transcriptions. However it's quite long-established that we do choose different phoneme-symbols for each dialect, and in my opinion (and that of many modern phoneticians), /æ/ has drifted far enough from the underlying reality in Britain to be somewhat confusing. (Another solution might be to include a top-level diaphonemic transcription, and then separate country ones under that.) Ƿidsiþ 17:30, 6 January 2014 (UTC)[reply]

I don't see what's confusing about it. There is no other /æ/ in any dialect of English. In fact I think /a/ is more confusing, because it can be confused with the [aː] of Australian and some American accents (e.g. Boston), which corresponds to British /ɑː/ (and there wouldn't have been anything wrong with that if British [a] and Australian/American [aː] were the same vowel sound, but they are not; the Australian/American [aː] is a central vowel, while the British [a] is a front vowel just lower than [æ]). --WikiTiki89 19:01, 6 January 2014 (UTC)[reply]

We only choose different phoneme symbols for each dialect if there's a good reason to, and in this case I just don't think there is a good reason. I rather suspect that Upton et al. use /a/ for the TRAP vowel as much for typographical convenience as for phonetic precision. —Aɴɢʀ (talk) 19:29, 6 January 2014 (UTC)[reply]

No, he chose it specifically because it's more phonetically accurate. As I've said, most other recent surveys have come to the same conclusion. (Wells says as much in the quote above: he suggests that if you keep /æ/ you have to ‘redefine’ it for UK use.) Although I suppose it does have the advantage of typographic convenience as well. Ƿidsiþ 19:52, 6 January 2014 (UTC)[reply]

Incidentally, it's interesting to see that the two British editors contributing to this discussion both favour /a/ while the two American editors favour /æ/. I'm kind of surprised you care so much; I don't think I'd really have an opinion on how to transcribe American English. That's not meant to sound rude!, I'm pleased you're interested. Ƿidsiþ 20:06, 6 January 2014 (UTC)[reply]
- Maybe I care so much because for the last 20 years or so I've been transcribing RP the way John Wells does and it would rub me the wrong way to have to switch to some different method, especially when (as I see it) there is absolutely zero benefit in doing so. —Aɴɢʀ (talk) 21:01, 6 January 2014 (UTC)[reply]
  - Yeah I get that. The thing is for me as a speaker of RP it's the opposite: I've been staring at dictionaries for so long that use this symbol which I never hear in real life, and the fact that so many lexicographers are finally catching up and using /a/ is such a huge sense of relief. So I guess the benefit is partly emotional. Ƿidsiþ 21:19, 6 January 2014 (UTC)[reply]
    - Well let's not make emotional decisions. Decisions should be based on reason. If we change /æ/ to /a/, then why not change /uː/ to /ʉː/, since that's how it's usually pronounced in many parts of the world? --WikiTiki89 23:01, 6 January 2014 (UTC)[reply]
      - My pronunciation of /uː/ is a closing diphthong, like /ʉu/, or maybe even /yu/. And I pronounce /a/, not /æ/. —CodeCat 23:15, 6 January 2014 (UTC)[reply]
      - We could, but no one else has. The changes I am advocating are not major innovations – it's just recognising what has been adopted as phonemic by the major authorities on UK English (British Library, OED, textbooks). You say you've been using the old system for 20 years – but WikiTiki, accents have changed a lot since then. Using /a/ for /æ/ is a conservative change, all things considered. But moving away from the ‘conservative RP’ of the 1950s is hugely important, because it enshrines a minority dialect as standard – not just a minority dialect, but an elitist one. Ƿidsiþ 06:59, 7 January 2014 (UTC)[reply]
        That was me, not WikiTiki, who said that about 20 years. But I think you're overstating how widespread the recent shift to the /a/-symbol has been: some authorities have switched, but there is by no means a broad consensus that such a shift is appropriate. And when I listen to the sound files you linked to above, while it's clear that the TRAP vowel has become more open, it's less clear that it's really [æ] in the conservative variety and [a] in the new variety. The conservative pronunciation of jam in the sound file sounds like [dʒɛm] and the contemporary pronunciation of platform sounds like [ˈplætfɔːm]. But because people were used to using the æ-symbol for that [ɛ] sound, they're left with no choice but the a-symbol to stand for [æ]. I'd say the language has shifted in such a way as to catch up with the symbols used for it (the sound traditionally transcribed /æ/ didn't use to be pronounced [æ], but now it is). —Aɴɢʀ (talk) 08:39, 7 January 2014 (UTC)[reply]
        So it was, sorry. It's true that in Britain /æ/ used to be higher than [æ], but now and for the last 30 years at least, it is definitely lower. See for example: ‘there has been a noticeable tendency for a lowering of the TRAP vowel which is currently quite close to Cardinal Vowel 4 for advanced speakers of Received Pronunciation’ (Raymond Hickey, ‘Ebb and Flow’, Sounds, Words, Texts and Change); ‘It is a striking fact that the current trend in pronunciation of this vowel is towards a closer, longer, perhaps tenser or diphthongal quality in the United States, but towards an opener, [a]-like, monophthongal quality in England’ (John Wells, Accents of English, 1982). The point is that many speakers of RP ‘hear’ an American a as an e, i.e. they interpret US [æ] as /ɛ/ (as has been noted several times by phonologists, although it's interesting that Americans apparently don't hear a big difference when they hear English [a]s). We can argue about the precise phonetics but it doesn't really matter, the point is that the vowels are different in the UK from the US and very audibly so (to us at least). This is compounded by historico-social considerations: a ‘truer’ [æ] is characteristic of a particularly old-fashioned ‘posh’ accent in Britain, as heard for example by listening to any speech by Prince Philip, and it is desirable to stop representing this accent as standard when in fact it's barely used any more. Ƿidsiþ 09:26, 7 January 2014 (UTC)[reply]
        Even today there are still many people who use [æ] in Britain without sounding posh. Eric Idle is a very good example. So while there's a definite trend towards [a], it hasn't completely ousted [æ] in general speech yet. —CodeCat 17:55, 7 January 2014 (UTC)[reply]

Yes, that's correct ― I've started listening for /æ/ now and I've heard a few, but they sound "posh" to me, and the numbers are reducing. Dbfirs 20:47, 28 January 2014 (UTC)[reply]

Everyone who was interested has presented arguments above. Now we need to vote. I have never drafted a vote from scratch before, so I would prefer if someone else set it up first. --WikiTiki89 16:27, 12 January 2014 (UTC)[reply]

Since no one else volunteered to do so nor objected, I have drafted a vote, which starts in a week: Wiktionary:Votes/2014-01/Representing the short-a phoneme of Received Pronunciation. --WikiTiki89 19:50, 17 January 2014 (UTC)[reply]

The vote has started.

"singular" in some form-of templates

Currently there's a set of form-of templates like {{accusative of}}, {{genitive of}} and so on, which really show "singular" in the text even though it's not part of the template name. Contrast that with {{feminine of}}, where there is no "singular". Many of these also have equivalents with "singular" in the name, like {{accusative singular of}}, {{genitive singular of}}, which show the same text so they are really duplicates. (Note there is no {{feminine singular of}} though.) Should these templates be redirected and merged, or should we do the opposite: remove "singular" from the text of the templates that don't have it in the name? —CodeCat 23:49, 6 January 2014 (UTC)[reply]

Asturbot

Do you think there'd be any chance I'd be allowed to run my Asturian conjugation bot again this year? --Back on the list (talk) 13:08, 7 January 2014 (UTC)[reply]

No. Blocked users are not allowed to run bots. SemperBlotto (talk) 08:23, 8 January 2014 (UTC)[reply]
Why did you block him? He was not abusing multiple accounts, but (supposedly) lost his password to User:ElisaVan. There is no evidence that he's lying about this, since Special:Contributions/ElisaVan and Special:Contributions/Back on the list do not overlap timewise. Blocking him will just encourage him to create more accounts. --WikiTiki89 14:10, 8 January 2014 (UTC)[reply]
Well, blocking any known long-term vandal might encourage them to create more accounts, but that suggests there is something inadequate in the blocking mechanism, rather than that we should give up on it entirely. In practice, most of what WF does is actually legitimate (I think) so I don't know how much the bot would hurt. Equinox ◑ 03:52, 9 January 2014 (UTC)[reply]

Lol, I was blocked! No hard feelings, hey... --Back on the list (talk) 12:09, 9 January 2014 (UTC)[reply]

give a man a fish and you feed him for a day. Teach a man to fish and you feed him for a lifetime - Bengali translation

There appears to be a formatting problem with the Bengali translation on this page, can someone knowledgeable take a look? ---> Tooironic (talk) 13:32, 7 January 2014 (UTC)[reply]

It's too long to be a valid page title (255 characters is the limit). I've linked the individual components of the phrase. DTLHS (talk) 18:02, 7 January 2014 (UTC)[reply]

If the long thing is a desired entry, see [[Appendix:Unsupported titles#Unsupported length]] for comparison.—msh210℠ (talk) 07:33, 12 January 2014 (UTC)[reply]
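One detail worth keeping in mind about the length limit mentioned above: as far as I understand, MediaWiki counts the 255 against the UTF-8 encoded title in bytes rather than in characters, so a Bengali phrase runs out of room after far fewer than 255 letters. A minimal Python sketch of the check follows; the constant and function names are made up for this example, and the sample string is filler, not the actual translation.

```python
MAX_TITLE_BYTES = 255  # assumed limit, counted on the UTF-8 encoded title


def fits_as_title(title):
    """True if the UTF-8 encoding of the title is within the byte limit."""
    return len(title.encode("utf-8")) <= MAX_TITLE_BYTES


# Filler text: the Bengali word মাছ ("fish") repeated; three code points,
# each of which takes three bytes in UTF-8.
sample = "\u09ae\u09be\u099b" * 40

print(len(sample))                  # number of code points
print(len(sample.encode("utf-8")))  # number of bytes
print(fits_as_title(sample))
```

The sample has 120 code points but occupies 360 bytes, so it would be rejected even though it is well under 255 characters.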

`{{audio-list}}`

This template is used in situations where there is little space for the larger audio icon (e.g. Index:Hungarian/gy). The problem is that it opens a new page when it plays the audio. The regular {{audio}} template plays the sound file without opening a new page. Can this behavior be applied to the small icon? Thanks. --Panda10 (talk) 17:15, 7 January 2014 (UTC)[reply]

Last monolingual Chickasaw speaker has passed away

The last monolingual Chickasaw speaker has passed away. More details are available through this article at the NPR website. --EncycloPetey (talk) 03:06, 9 January 2014 (UTC)[reply]

Genders in noun form entries

There doesn't seem to be a clear practice on how genders are handled on noun form entries. Sometimes they're included, sometimes they're not. For adjectives it should be clear; they shouldn't be included because adjectives don't have genders, only their individual forms do, and in that case the gender is indicated in the definition, not in the headword line (which would be redundant and incomplete anyway). I don't think genders should be indicated in noun forms either, because that's duplication of information, which leads to maintenance problems. For example, if we included genders in all 10 forms of domus, then if someone ever wanted to change the gender of the lemma, all the form-of entries would go out of sync and would no longer have the correct gender. This is not a common occurrence, but it does happen and it is a problem. We can and should prevent it, by not duplicating information that is already on the lemma form. Consider also that we don't duplicate any other information from the lemma, like the definition, the inflections, etc. So why would we treat genders any differently? —CodeCat 15:06, 9 January 2014 (UTC)[reply]

But sometimes gender is the most convenient way of distinguishing homographs. German Bands, for example, is the genitive singular of the masculine and neuter nouns Band but the plural (all cases) of the feminine noun; Kiefern is the dative plural of the masculine noun Kiefer but the plural (all cases) of the feminine noun. Currently our entries for Bands and Kiefern don't show that; but IMO they'd be easier to understand if they did. —Aɴɢʀ (talk) 19:32, 9 January 2014 (UTC)[reply]

Form-of templates have (or should have) a parameter to give a gloss to the link, in the same way that {{term}} does. In cases like this, that can be used to tell the different nouns apart. That's much more effective than just using the gender as the disambiguation, because there could just as easily be two different nouns with the same gender, and then that doesn't work anymore. —CodeCat 19:46, 9 January 2014 (UTC)[reply]

Fair 'nuff. Can you edit the form-of templates so that the parentheses surrounding the glosses aren't in italics? That looks really ugly. —Aɴɢʀ (talk) 19:53, 9 January 2014 (UTC)[reply]

I've been thinking about how to do that, but I don't really know. It's really a CSS issue; everything inside a "use-with-mention" class becomes italic. If someone knows how to work around this please help? —CodeCat 20:09, 9 January 2014 (UTC)[reply]

The format for the template should be rebuilt. Currently, the whole thing is italicized with .use-with-mention { font-style: italic; } in MediaWiki:Common.css, and the italics are neutered for individual parts. The code would be simplified by just formatting what should be formatted.

The non-gloss definition text could just be surrounded with <i>..</i>. —Michael Z. 2014-01-10 21:13 z
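The "format only what should be formatted" approach could look roughly like the Python sketch below, which builds the HTML for a form-of line with italics applied to the non-gloss text alone, so nothing has to be un-italicized afterwards. This is an illustration of the idea, not the actual template code: the function name, the output markup and the example gloss are all assumptions.

```python
from html import escape


def form_of_html(relation, lemma, gloss=None):
    """Build a form-of line: italic relation text, bold lemma, upright gloss."""
    parts = [
        "<i>{}</i>".format(escape(relation)),  # only this part is italic
        "<b>{}</b>".format(escape(lemma)),
    ]
    if gloss:
        # Parentheses and gloss sit outside any italic element.
        parts.append("(“{}”)".format(escape(gloss)))
    return " ".join(parts)


print(form_of_html("genitive singular of", "Band", "ribbon"))
print(form_of_html("plural of", "Band"))
```

Because the parentheses are never inside an italic element, the ugly italic brackets mentioned above do not arise, and the blanket .use-with-mention rule is not needed for this part.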

Wiktionary talk:Page deletion guidelines#Proposal: mandatory notification

Is this proposal relevant to our policies any longer? I personally don't support making notifications completely mandatory; perhaps they could be encouraged when it comes to notifying newer editors that their entries might be deleted, but RFD nominators should not be required to notify outright.

Speaking of which, it looks like our general deletion guidelines page could use a little updating by now. I'd like to include a clause or two about old anon talk pages, which should be deleted so new users don't get confused by messages that are no longer relevant, and which I think are regularly deleted in practice anyway. I'd suggest wording like "Old talk pages of IP editors, if they haven't edited or been blocked in a year, may be deleted." We can adjust the length of time before deletion accordingly, but I chose a year since I think that's a reasonable time in which to determine whether a particular IP is static and belongs to one particular editor, and we have year-long IP blocks for some. Anyway, any changes to this page can be reflected in the automatic dropdown list at MediaWiki:Deletereason-dropdown. TeleComNasSprVen (talk) 18:26, 10 January 2014 (UTC)[reply]

The asterisk indicates that this character is attested

These 275 entries include Middle Chinese sections in which at least one reading/romanization is preceded by an asterisk, which is linked (in no visible way) to a hidden HTML comment which contains a note to the effect of "note: asterisk indicates that this character appears more than four times in the Tang Dynasty corpus of literature", sometimes followed by an explanation that /ɑ/ is an IPA symbol and is thus different from "a". Elsewhere on our site and in the linguistic world, an asterisk preceding a term indicates that the term is not attested, so it seems rather confusing to have an asterisk indicate here that a term is attested — and it seems unhelpful to have the comment explaining the asterisk be both (a) untemplatized and thus quite variable in content, and (b) hidden. Should these entries be changed? In what way? - -sche (discuss) 19:23, 10 January 2014 (UTC)[reply]

An implication of this would seem to be that at least some of the other Middle Chinese characters either are known NOT to be attested or are not known to be attested. How many of those do we have?

Without waiting for that answer:

Why not just have an invisible, possibly even non-categorizing template that contains the comment and direction to a more extensive explanation, perhaps at the template's documentation? That way editors would be warned that such characters probably don't warrant challenge and why. Is there a source for the attestation claim, I wonder? What-links-here is about as good as a category and is less distracting for this kind of maintenance-related information. Incorporating the information into a Middle-Chinese-character-specific inflection-line template might work, if so burdening the character template seemed worthwhile. DCDuring TALK 20:21, 10 January 2014 (UTC)[reply]

Not quite, this is the Unicode note on the kTang field: "The Tang dynasty pronunciation(s) of this character, derived from or consistent with _T’ang Poetic Vocabulary_ by Hugh M. Stimson, Far Eastern Publications, Yale Univ. 1976. An asterisk indicates that the word or morpheme represented in toto or in part by the given character with the given reading occurs more than four times in the seven hundred poems covered." Thus it is not the "Tang Dynasty corpus of literature", just the seven hundred poems covered by the book "T’ang Poetic Vocabulary".

This whole Middle Chinese / Old Chinese thing is a joke really. The MC reconstruction is some marginal transcription scheme used in "T’ang Poetic Vocabulary" by Hugh M. Stimson, and there is no explanation of how various initial and rhyme classes are transcribed anywhere, resulting in a system that perhaps only Stimson himself truly comprehends. Either convert it into a table like below or eliminate everything on Middle/Old Chinese here - I wouldn't say there is anything useful at present.

Middle Chinese (reconstructions)
Zhengzhang (Shangfang)	Karlgren	Li (Rong)	Pan (Wuyun)	Pulleyblank	Wang (Li)	Shao (Rongfen)
/ȵiɪᴴ/	/ȵʑiᴴ/	/ȵiᴴ/	/ȵiᴴ/	/ȵiᴴ/	/ȵʑiᴴ/	/ȵʑjɪᴴ/

Old Chinese (reconstructions)
Karlgren	Li (Fang-kuei)	Wang (Li)	Zhengzhang (Shangfang)	Pan (Wuyun)	Baxter-Sagart
/*ȵi̯ær/	/*njidh/	/*ȵiei/	/*njis/	/*njis/	/*nij-s/

Sino-Tibetan etymology
Tibetan cognate: གཉིས (gnyis) Burmese cognate: နှစ် (hnac) Proto-Sino-Tibetan root: *g/s-ni-s

Wyang (talk) 20:35, 10 January 2014 (UTC)[reply]

Well, if you're going to bring specific knowledge into this discussion.... DCDuring TALK 20:52, 10 January 2014 (UTC)[reply]

The asterisk, used exactly this way, means that a term is reconstructed, and therefore explicitly that it is not attested. Can we choose a different notation for this? —Michael Z. 2014-01-13 17:18 z

It's actually the opposite in a way... * means not attested/found/used, so it's used for reconstructions, but it's also used to indicate incorrect usage for example. I've also seen ** sometimes, but I don't know what that is. —CodeCa t 17:26, 13 January 2014 (UTC)[reply]

Is there any objection to me removing the asterisks from these entries?
Is there any objection to me replacing the inaccurate, hidden HTML comments about the characters with the templatized qualifier Template:Stimson 4x, like this? - -sche (discuss) 19:19, 13 January 2014 (UTC)[reply]
Or should I convert the asterisk to a trailing superscript ¹, with a note then explaining the ¹, like this? - -sche (discuss) 19:25, 13 January 2014 (UTC)[reply]

Verification of Reconstructed Terms

Challenging proto-language terms is a bit awkward: they technically should be nominated for deletion at Requests For Deletion/Other, though I've also seen them posted at Requests for Verification (I believe elsewhere, as well, but I'm not sure). It just doesn't seem like a good fit.

I would like to propose they be nominated and voted on instead at the Etymology Scriptorum, since they're completely an artifact of etymology and thus are best evaluated by those who are familiar with the subject. A template should be developed for them with a name like "rfv-proto" or "rfv-pro", along the lines of {{rfv-etymology}}.

I'm not sure about reconstructed forms in attested languages. If we include them in the same process, then a name for the template like "rfv-recons" might be better. Chuck Entz (talk) 23:01, 10 January 2014 (UTC)[reply]

Recently there have been more discussions about reconstructions in WT:ES, and I think we can keep it there. It would also emphasize that the process is different for reconstructions than it is for attestations. A term is either attested or it isn't, but for reconstructions it's a much more intricate process of comparing and evaluating known sound changes, attestations and so on. I also don't think that something like "passed" or "failed" is really useful for reconstructions, something like consensus or no consensus would probably work better, and be more in line with the scientific process. I don't see why reconstructions in attested languages can't be treated the same, they're reconstructions too after all, although they're often much more secure than reconstructions in totally unattested languages. I think {{rfv-recons}} might work, but it might confuse some people if the template says "rfv" when it's really not an RFV. —CodeCa t 23:23, 10 January 2014 (UTC)[reply]

I don't think the process needs to be formalized since there is not enough volume of requests to verify etymologies or reconstructions. The way it works currently is perfectly fine (that is, just bringing it up at the WT:ES and discussing it). --Wiki Tiki 89 23:29, 10 January 2014 (UTC)[reply]

Protection level of hi

Is there a good reason for hi being blocked for editing not only for new users, but also autoconfirmed (and thus regular) ones? Tried raising this on the talk page but got no response. --Njardarlogar (talk) 11:38, 11 January 2014 (UTC)[reply]

I would ask Equinox (talk • contribs), since he is the one who protected it. --Wiki Tiki 89 15:08, 11 January 2014 (UTC)[reply]

Its history shows repeated vandalism. SemperBlotto (talk) 15:12, 11 January 2014 (UTC)

But only by anons, not by established users. Surely semiprotection would be sufficient. —Aɴɢʀ (talk) 15:23, 11 January 2014 (UTC)[reply]

That's true. Changed accordingly. SemperBlotto (talk) 15:29, 11 January 2014 (UTC)[reply]

Request for using AWB

I would like to request permission to use AutoWikiBrowser (abbreviated AWB) for my account User:Kc kennylau on English Wiktionary. I already have permission to use AWB on Chinese Wiktionary, if that improves my chances. By my manual count, I have so far made 721 edits to the English Wiktionary, 620 of them in the main namespace. --Kc kennylau (talk) 15:36, 11 January 2014 (UTC)

What will you be doing with it? --Wiki Tiki 89 15:58, 11 January 2014 (UTC)[reply]

You're already a regular contributor to Category:Candidates for speedy deletion with your use of acceleration. I wonder if you should have even more error-propagation capability so early in your Wiktionary career (yes, I realize you've been around since 2012, but with only 22 edits before this week). Chuck Entz (talk) 16:12, 11 January 2014 (UTC)[reply]

I would be adding the Mandarin level tags to the respective entries for the moment, and I may still use it in the future. If I ever produce any error, I will correct all my errors manually. --Kc kennylau (talk) 16:18, 11 January 2014 (UTC)[reply]

And I am making fewer and fewer errors now. --Kc kennylau (talk) 16:25, 11 January 2014 (UTC)

The way we usually do this is you have to do some test edits first, so we can see what you're doing. --Wiki Tiki 89 16:26, 11 January 2014 (UTC)[reply]

[1] [2] --Kc kennylau (talk) 16:32, 11 January 2014 (UTC)[reply]

Before you can do that, you should ask here at the beer parlor (in a separate section) whether we want to have these tags and how we would want to display them. Doing mass edits without asking first could get you into trouble. --Wiki Tiki 89 16:44, 11 January 2014 (UTC)[reply]

OK, I promise not to do mass edits on the aforementioned topic until consensus is reached, but I may still need it for other purposes now or in the future. --Kc kennylau (talk) 16:49, 11 January 2014 (UTC)

@Chuck Entz: please notice that I often use the accelerator but there are only a few mistakes. Nobody is perfect, huh? --Kc kennylau (talk) 16:53, 11 January 2014 (UTC)[reply]

@Wikitiki89: And the conclusion is? --Kc kennylau (talk) 00:47, 12 January 2014 (UTC)[reply]

It depends on the discussion below and on input from a few other more senior administrators. --Wiki Tiki 89 03:20, 12 January 2014 (UTC)[reply]

Mandarin level tags

Is this and this the correct way of adding Mandarin level tags, or is the consensus different? --Kc kennylau (talk) 16:49, 11 January 2014 (UTC)

I think there is a two-part question here:

Do we need them?
Where should we put them?

Either way, I don't think they belong as context tags. I think it might be better to display them in floating boxes on the right. --Wiki Tiki 89 17:02, 11 January 2014 (UTC)[reply]

I notice that many pages have already done so. --Kc kennylau (talk) 17:10, 11 January 2014 (UTC)[reply]

That doesn't mean it can't be changed. --Wiki Tiki 89 17:13, 11 January 2014 (UTC)[reply]

It looks like there's some confusion about the categorization (see Category:Mandarin by difficulty level), but that doesn't affect your part of it. Chuck Entz (talk) 17:15, 11 January 2014 (UTC)[reply]

How should templates respond to bad input?

A while ago there was a discussion (Wiktionary:Beer parlour/2013/October#Proposal: All script errors are bugs.) where a lot of people expressed support for treating script errors as bugs. If something goes wrong internally in a template/module then I agree, the user can't fix it so there's little use in notifying them. But I think a rather serious point was overlooked in that discussion: a computer program can only give correct output if it's given correct input. Sometimes the template/module isn't the problem, but the user is, because they gave parameters that didn't make sense. An example is when you write a link but don't say to what term, or to what language. Or if you give a language code that doesn't exist. In this case, the user should be notified that they did something wrong, and also be given information on where the problem is and how it could be fixed. The current "script error" messages are kind of scary to users, but at the same time you can't notify editors of a problem if you don't notify them at all. You can't have your cake and eat it too. So I do think that templates should do something obvious to show that an editor made a mistake. How else would they know? I certainly would want to know, hiding script errors from me would definitely make it more difficult for me to edit Wiktionary. So how do we show it? What can we do to make user-made errors less scary/discouraging, while still making them obvious enough? —CodeCa t 00:57, 12 January 2014 (UTC)[reply]

Could we make things that only show up while being edited? That way editors would see things, but readers wouldn't be bothered by them. I'm a big fan of having and eating my cake. -Atelaes λάλει ἐμοί 01:11, 12 January 2014 (UTC)[reply]

That would work, I think, and I think it also achieves the point well: someone who uses preview does so to check for errors, so we want to show them then. We can also try to make seeing script errors on pages (while normally viewing the page) an opt-in feature, so that by default they're not visible on the page. That could easily be done through CSS. I don't know if there is a specific CSS class that the page content is wrapped in when previewing, but if there is, we could make the show-only-when-previewing feature with CSS as well. Ruakh and I are also talking about ways to make our modules more error-tolerant, by allowing parts of our code to handle errors internally instead of giving errors everywhere. That would give us more control over where errors can appear, and over which ones we want to show and how we want to show them. —CodeCa t 01:36, 12 January 2014 (UTC)[reply]

I like the idea of secondary modules passing error information back to the main module so that errors can be dealt with uniformly, gracefully and without undue drama. It would be nice to have some kind of visible sign that there's an error, which would prompt editors to investigate further, but not to panic. Chuck Entz (talk) 02:10, 12 January 2014 (UTC)[reply]
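The idea of helper modules handing error information back to the caller, rather than aborting, can be illustrated with a minimal sketch. It is written in Python only for brevity (Wiktionary modules are written in Lua), and every name in it (Result, language_name, make_link, render, and the sample language table) is hypothetical and not taken from any existing module.

```python
from dataclasses import dataclass, field
from typing import List

# Hypothetical sample data; a real module would consult the full language list.
KNOWN_LANGUAGES = {"en": "English", "tr": "Turkish", "de": "German"}


@dataclass
class Result:
    """Output text plus any problems found while producing it."""
    text: str
    errors: List[str] = field(default_factory=list)


def language_name(code: str) -> Result:
    """Secondary helper: report an unknown code instead of raising."""
    name = KNOWN_LANGUAGES.get(code)
    if name is None:
        # Fall back to the raw code so the caller still has something to show.
        return Result(text=code, errors=[f"unknown language code '{code}'"])
    return Result(text=name)


def make_link(term: str, code: str) -> Result:
    """Main helper: collect errors from the helpers it calls."""
    errors = []
    if not term:
        errors.append("no term given")
    lang = language_name(code)
    errors.extend(lang.errors)
    return Result(text=f"{term} ({lang.text})", errors=errors)


def render(term: str, code: str, previewing: bool = False) -> str:
    """Show a short notice only to someone previewing an edit."""
    result = make_link(term, code)
    if result.errors and previewing:
        return f"{result.text} [error: {'; '.join(result.errors)}]"
    return result.text
```

With this shape, render("sevmek", "xx") still produces usable output for readers, while render("sevmek", "xx", previewing=True) appends a short notice naming the problem, so the decision about where and how to show errors sits in one place.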

+1 —Ruakh_TALK 02:14, 12 January 2014 (UTC)[reply]

That's the approach that I took with Module:Quotations.......though I have to admit it's not perfect. I haven't been able to get it to display at all, so I have to look at the html. :-/ -Atelaes λάλει ἐμοί 02:41, 12 January 2014 (UTC)[reply]

I've made a change to MediaWiki:Common.css so that script errors are not displayed when viewing a page, but they're displayed in all other cases. If you want them to be displayed always (opt-in), add this to Special:MyPage/common.css:

.scribunto-error
{
	display: inline !important;
}

—CodeCa t 02:28, 12 January 2014 (UTC)[reply]

That's such an obvious and elegant solution that I can't believe we didn't think of it before. Nicely done. -Atelaes λάλει ἐμοί 02:41, 12 January 2014 (UTC)[reply]

Hold it. If a script error is not displayed, then what is? I think an empty space where there should be content is worse than a big red ScriptError. --Wiki Tiki 89 03:23, 12 January 2014 (UTC)

Well then what do you suggest? —CodeCa t 03:54, 12 January 2014 (UTC)[reply]

Well if there were a way to display a link to a documentation page instead of just "ScriptError", that might be handy. Errors are bad no matter how they're displayed, so we need to focus on making it easier to fix them rather than on hiding them. --Wiki Tiki 89 03:57, 12 January 2014 (UTC)[reply]

I know, but some people think that all errors are evil and scary and therefore worth hiding. Anyway, we can't change the content of a page through CSS, so we'd have to change the message that is displayed instead of "Script Error". —CodeCa t 04:00, 12 January 2014 (UTC)[reply]

One reason they are big and scary is because we make them literally big and scary. Maybe we should display "An error occurred", red but in a normal font size. --Wiki Tiki 89 04:04, 12 January 2014 (UTC)

I agree that a visible indication that there was an error is better than "an empty space where there should be content". And I agree that we should reduce the font size of the script error message; such a reduction in size has been suggested before but never implemented. I think displaying "script error" in red in normal-size font or even the small-size font we use for some template errors would be good. - -sche (discuss) 04:19, 12 January 2014 (UTC)[reply]

How is this? Script error —CodeCa t 04:21, 12 January 2014 (UTC)[reply]

That or even just Script error (which for me shows up at normal font size) is fine. - -sche (discuss) 04:48, 12 January 2014 (UTC)[reply]

I don't think we should use the word "script". It can easily be confused with "script" meaning "writing system", especially in the context of templates and modules. I think Module Error or An error occurred is much better. --Wiki Tiki 89 04:53, 12 January 2014 (UTC)[reply]

How about: I'm sorry, Dave. I'm afraid I can't do that. Chuck Entz (talk) 05:10, 12 January 2014 (UTC)[reply]

Why would any error message at all have to be displayed to a passive user? Ever.
What is the bad consequence of an omitted script or language name, or of a language name being used instead of a language code or vice versa? Very few of these mistakes bring the site to its knees. Some lead to a user being taken to the wrong L2 section, some lead to miscategorization. Some could easily be corrected by the software if we chose to. DCDuring TALK 04:13, 12 January 2014 (UTC)

I agree that errors don't need to be shown to people who aren't going to fix them. But we do need to show them to the people who do fix them. In principle that means every editor should see the errors they themselves caused. But that's not possible with our software. So we have to find a middle ground. And I don't understand your second point. —CodeCa t 04:17, 12 January 2014 (UTC)[reply]

As a wiki, I think we should treat every casual user as someone who could potentially fix an error. I also agree with DCDuring that modules should be programmed to still provide some functionality even when the language code is missing or wrong, which seems to be the most common type of Script Error. --Wiki Tiki 89 04:25, 12 January 2014 (UTC)[reply]

Editors can see them with preview. What are the actual bad consequences to normal users of the most common errors? They certainly don't cause problems like the site not functioning. If an unregistered user makes a mistake, isn't it simpler to just correct it? Aren't there obvious patterns to the errors? Don't we have some ability to track the patterns? If we have thousands of errors of a given type, isn't that an indication of defective software design rather than defective users?

What is the system by which bad software design features are communicated to the software designer? If that is not easily done or is ignored, then one can expect contributors to bypass the ideal system to achieve the result they believe to be right, using whatever means they can. DCDuring TALK 04:44, 12 January 2014 (UTC)[reply]

Re: "one can expect contributors to bypass the ideal system to achieve the result they believe to be right": Absolutely. In fact, we do this all the time with MediaWiki features we dislike. —Ruakh_TALK 05:32, 12 January 2014 (UTC)[reply]

Category:Turkish form-of templates to be deleted

MewBot's too busy to take this on, so I thought I'd see if anyone else is interested:

I've been chipping away at these, but there are eight left that have more than 50 transclusions apiece (I'm ignoring the note templates for the moment). I was wondering if MewBot could take care of them. Since I don't know much Turkish, my method has been to replace the templates with {{conjugation of}} or {{inflection of}} configured to provide the same information, but without the categories. I've worked out the before and after versions for the remaining templates, and tried them on one or two entries for each (X is, of course, the first parameter, consisting of the lemma form):
# {{first-person singular possessive of|X|lang=tr}}
# {{inflection of|X||first-person|s|simple present|possessive|lang=tr}}

# {{second-person singular imperative of|X|lang=tr}}
# {{conjugation of|X||2|imp|lang=tr}}

# {{second-person singular negative imperative of|X|lang=tr}}
# {{conjugation of|X||2|negative|imp|lang=tr}}

# {{second-person singular possessive genitive of|X|lang=tr}}
# {{inflection of|X||second-person|s|simple present|possessive|gen|lang=tr}}

# {{second-person singular possessive of|X|lang=tr}}
# {{inflection of|X||second-person|s|simple present|possessive|lang=tr}}

# {{third-person negative singular of|X|lang=tr}}
# {{conjugation of|X||3|negative|s|simple|pres|ind|lang=tr}}

# {{third-person singular possessive genitive of|X|lang=tr}}
# {{inflection of|X||third-person|s|simple present|possessive|gen|lang=tr}}

# {{third-person singular possessive of|X|lang=tr}}
# {{inflection of|X||third-person|s|simple present|possessive|lang=tr}}

I'm sure the formatting and/or punctuation could be tweaked a bit, but this at least preserves all the morphological data.
As for the notes templates: they contain useful information about accentuation, but I'm not sure the usage notes are the right place for it, or I would have substed all the "pronounciation" template's transclusions by now. At any rate, we can at least orphan all of the inflected-form-of templates before we get to that.

Thanks! Chuck Entz (talk) 02:20, 12 January 2014 (UTC)[reply]
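A bot run over the eight before/after pairs listed above could be sketched roughly as follows. This is an illustrative Python sketch, not MewBot's actual code: the function name convert is hypothetical, and it assumes every transclusion has exactly the shape {{name|X|lang=tr}}. Anything with extra or reordered parameters is left untouched for manual review.

```python
import re

# Old template name -> new template body, taken from the pairs listed above.
# "{lemma}" marks where the first parameter (X) goes.
REPLACEMENTS = {
    "first-person singular possessive of":
        "inflection of|{lemma}||first-person|s|simple present|possessive|lang=tr",
    "second-person singular imperative of":
        "conjugation of|{lemma}||2|imp|lang=tr",
    "second-person singular negative imperative of":
        "conjugation of|{lemma}||2|negative|imp|lang=tr",
    "second-person singular possessive genitive of":
        "inflection of|{lemma}||second-person|s|simple present|possessive|gen|lang=tr",
    "second-person singular possessive of":
        "inflection of|{lemma}||second-person|s|simple present|possessive|lang=tr",
    "third-person negative singular of":
        "conjugation of|{lemma}||3|negative|s|simple|pres|ind|lang=tr",
    "third-person singular possessive genitive of":
        "inflection of|{lemma}||third-person|s|simple present|possessive|gen|lang=tr",
    "third-person singular possessive of":
        "inflection of|{lemma}||third-person|s|simple present|possessive|lang=tr",
}

# Assumed shape: {{template name|lemma|lang=tr}} with no nested templates.
TEMPLATE = re.compile(r"\{\{([^|{}]+)\|([^|{}]+)\|lang=tr\}\}")


def convert(wikitext: str) -> str:
    """Replace the deprecated Turkish form-of templates; leave the rest alone."""
    def replace(match):
        name, lemma = match.group(1).strip(), match.group(2)
        body = REPLACEMENTS.get(name)
        if body is None:
            return match.group(0)  # not one of the eight templates
        return "{{" + body.format(lemma=lemma) + "}}"
    return TEMPLATE.sub(replace, wikitext)
```

Because the mapping is purely mechanical, it preserves whatever the old template said, including any cases where the old template was applied to the wrong lemma.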

A lot of these really need more thorough work. The "negatives", for example, aren't really verb forms; they're verbs in their own right with a separate lemma form (infinitive). So there should really not be a "negative" form; instead, the infinitive that the definition links back to should be changed. For the possessives, you can use {{tr-possessive form of}}, but a lot of those entries have been confused as well. The possessor can be plural, but so can the possessed thing, and these templates were apparently used without regard to that distinction. —CodeCat 02:32, 12 January 2014 (UTC)

infinitive	negative infinitive
-mak	-mamak
-mek	-memek

Re: The "negatives" for example aren't really verb forms:

Actually, they are just simple verb forms. Verbs ending in -mak have -mamak as negative infinitive; verbs ending in -mek have -memek. The verb forms can be made negative by adding the suffix -ma/-me/-mı/-mi/etc. to them. Adding the "negative suffix" to a verb is really nothing else than adding the word not to an English sentence.

For example, the definition line of sevmemek should just be: "negative infinitive of sevmek", or maybe even shorter: "negative of sevmek". -- Curious (talk) 22:26, 26 January 2014 (UTC)[reply]

You just proved my point though. If there is a separate infinitive, then it's a distinct verb lemma. We do the same with Latvian negative verbs; they're treated as distinct verbs that link back to the positive, and they have their own inflection table. —CodeCa t 22:33, 26 January 2014 (UTC)[reply]

I'm not sure if I fully understand what you mean. In your proposal, what exactly should be the text on page sevmemek? And, let's say, on page sevmiyorum (=I do not love / I am not loving)? Does your proposal mean that you would like to split up Template:tr-conj? Linking to a Latvian example could also be helpful. Btw, I wouldn't call it a separate infinitive, I'd call it an infinitive form. ;-) -- Curious (talk) 20:12, 27 January 2014 (UTC)[reply]

Proposal

I, kc_kennylau, hereby propose:

I am aware that you "are gradually trying to remove certain categories from form-of templates, so that there's less uncertainty about when a category should or should not be added by the headword template." [source] Therefore, I suggest that the {{head}} template and the related templates categorize a page in a certain category if the second parameter is missing, for the ease of maintenance and following up. It will also ease the use of AWB and bots to add the forms.

--kc_kennylau (talk) 14:52, 12 January 2014 (UTC)[reply]

I think it's too early for that, if we do that right now then that category would probably contain at least 100000 entries. We also don't know for sure whether there aren't legitimate reasons to leave out the category. I mean, there could be terms that have no part of speech at all? —CodeCa t 15:57, 12 January 2014 (UTC)[reply]

Plus, of course, there are plenty of form-of templates that do categorize. —Ruakh_TALK 00:57, 14 January 2014 (UTC)[reply]

I don't see much benefit to such a category, since it's easy to get this information from a database dump. Even if we agreed that all such entries were a problem, it would be easier to write a bot using that information than using a category. (A category is more useful for long-term continual monitoring, when we're currently consistent and want to ensure that new problems don't crop up.) —Ruakh_TALK 01:00, 14 January 2014 (UTC)[reply]

Two proposals about `{{de-decl-adj}}`, `{{de-decl-adj-notcomp}}`, `{{de-decl-adj-notcomp-nopred}}`

I apologize if I am too active these days, but I have two suggestions of rework:

First suggestion: I would like to build 3 templates: {{de-decl-adj-pos}}, {{de-decl-adj-comp}}, {{de-decl-adj-sup}}. The first template will contain the positive form of {{{1}}}, the second template will contain the comparative form of {{{1}}}, and the third template will contain the superlative form of {{{1}}}. The three templates will have one more parameter, {{{nopred}}}, which, if not empty, will disable the predicate. Then {{de-decl-adj}} can call all three templates, {{de-decl-adj-notcomp}} can call the first template, and {{de-decl-adj-notcomp-nopred}} can call the first template with {{{nopred}}} turned on.

Second suggestion: I would like to build only 1 template: {{de-decl-adj-table}}, with three parameters. {{{1}}} will be the stem of the positive/comparative/superlative form. {{{form}}} will be whether the adjective is in positive form, comparative form or superlative form. {{{nopred}}} will disable the predicate when turned on. {{de-decl-adj}} can call the template thrice, {{de-decl-adj-notcomp}} can call the template once, and {{de-decl-adj-notcomp-nopred}} can call the template once while turning {{{nopred}}} on.

Since SemperBlotto left a comment on {{de-decl-adj}} about me informing him, I have already mirrored this on his talk. --kc_kennylau (talk) 04:02, 13 January 2014 (UTC)[reply]

Why? —CodeCa t 04:16, 13 January 2014 (UTC)[reply]

Because I am aware that the three templates use separate tables, and I want them to use the same table(s) so that debugging will be easier, as will adding new features in the future. --kc_kennylau (talk) 04:19, 13 January 2014 (UTC)

You could just have one table template, and call it three times: once to give the forms of the positive, once for the comparative, and once for the superlative. Then when there is no comparative/superlative, the calling template can decide to just not make the second two calls. I don't really know why you're making it so difficult... —CodeCa t 04:24, 13 January 2014 (UTC)[reply]

That is my second suggestion. --kc_kennylau (talk) 04:27, 13 January 2014 (UTC)[reply]

It's not, because in my suggestion there aren't 3 parameters, but as many parameters as there are forms to show in the table. That is, the table template itself would be "blank", and you tell it what goes in each cell. That's how most inflection tables work on Wiktionary; they separate the table from the template that tells it what to put in. —CodeCa t 04:39, 13 January 2014 (UTC)[reply]

I apologize for my unclear explanation, and I will use my sandboxes to demonstrate. I need thirty minutes. --kc_kennylau (talk) 04:48, 13 January 2014 (UTC)[reply]

I get what you mean now, but if I were to build on your suggestion, I would like to have two layers, table->pos/comp/sup->main. Furthermore, I failed to build my example, so I will stick to your suggestion. Therefore here is my renewed suggestion based on yours:

I would like to build {{de-decl-adj-table}} which contains the table as well as 54 parameters, and {{de-decl-adj-table-nopred}}. Then, I would build {{de-decl-adj-pos}}, {{de-decl-adj-pos-nopred}}, {{de-decl-adj-comp}} and {{de-decl-adj-sup}} based on that. After that, I would build the existing three templates based on that.

--kc_kennylau (talk) 05:14, 13 January 2014 (UTC)[reply]

You wouldn't need those templates, you could just have two layers: {{de-decl-adj}} and {{de-decl-adj-table}}. You would give certain parameters to {{de-decl-adj}} depending on whether you want a predicative form, whether the adjective is comparable or not, and so on. Something like {{de-decl-adj|pred=-}} (or maybe nopred=1, I don't know if the predicative form is ever "irregular"), or {{de-decl-adj|nocomp=1}} or similar. Look at how it's done with {{nl-decl-adj}} to compare (that template uses a module instead, but it works on the same principle: the module first creates all the forms, and then fills in the table). —CodeCa t 14:36, 13 January 2014 (UTC)[reply]
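The two-layer principle described here (first create all the forms, then fill in a "blank" table) can be sketched as below. This is a deliberately simplified Python illustration, not the code of {{de-decl-adj}} or {{nl-decl-adj}}: it covers only the strong nominative endings and regular adjectives, and the names make_forms, render_table and decl_adj are hypothetical.

```python
# Strong-declension nominative endings only; a full table has many more cells.
STRONG_NOMINATIVE = {"m": "er", "f": "e", "n": "es", "pl": "e"}


def make_forms(stem, pred=None):
    """Layer 1: work out what goes in each cell."""
    forms = {cell: stem + ending for cell, ending in STRONG_NOMINATIVE.items()}
    if pred is not None:
        forms["pred"] = pred
    return forms


def render_table(title, forms):
    """Layer 2: a 'blank' table that only displays the cells it is given."""
    lines = [title]
    if "pred" in forms:  # the predicative row is omitted entirely when absent
        lines.append("er ist " + forms["pred"])
    for cell in ("m", "f", "n", "pl"):
        lines.append(f"{cell}: {forms[cell]}")
    return "\n".join(lines)


def decl_adj(stem, comparable=True, nopred=False):
    """Entry point: decide which degrees to show, then call the table layer."""
    degrees = [("positive", stem, stem)]
    if comparable:
        # Regular formation only; umlauting and -est- adjectives are ignored.
        degrees.append(("comparative", stem + "er", stem + "er"))
        degrees.append(("superlative", stem + "st", "am " + stem + "sten"))
    tables = []
    for title, degree_stem, pred in degrees:
        forms = make_forms(degree_stem, None if nopred else pred)
        tables.append(render_table(title, forms))
    return "\n\n".join(tables)
```

Here decl_adj("klein") yields three tables, decl_adj("klein", comparable=False) yields one, and nopred=True drops the "er ist" row rather than displaying a placeholder in it.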

I tried disabling the predicate optionally, but I failed to do so. --kc_kennylau (talk) 15:05, 13 January 2014 (UTC)[reply]

I changed | to {{!}}, now there is no problem. --kc_kennylau (talk) 15:23, 13 January 2014 (UTC)[reply]

I've created {{de-decl-adj-table}} and changed {{de-decl-adj}} to use it. If you give - as the second parameter, it doesn't show the comparative and superlative forms. It could probably be simplified further... right now it still contains the same declension information 3 times. —CodeCa t 15:37, 13 January 2014 (UTC)[reply]

The definite articles and the indefinite articles are all lost. --kc_kennylau (talk) 02:39, 14 January 2014 (UTC)[reply]

That can be added back easily. It was more to show the general idea, it can be fine tuned if needed. —CodeCa t 02:42, 14 January 2014 (UTC)[reply]

The edit you just made means that if it's ever necessary to not show a predicative form, the table will now display "er ist —"... —CodeCa t 03:30, 14 January 2014 (UTC)[reply]

I have already fixed it as well as added the other articles, and I am aware of the message left by SemperBlotto. --kc_kennylau (talk) 03:55, 14 January 2014 (UTC)[reply]

Modified {{de-decl-adj-notcomp}} and {{de-decl-adj-notcomp-nopred}} to make them use {{de-decl-adj}}. --kc_kennylau (talk) 05:35, 14 January 2014 (UTC)[reply]

I'm still not really sure about all the subtemplates (/comp, /sup etc) that you created. Are those really necessary? —CodeCa t 14:08, 16 January 2014 (UTC)[reply]

Wiktionary:Be bold in updating pages

Should this be marked {{inactive}}? I've frequently seen others tell newcomers to Wiktionary that, unlike on Wikipedia, being bold is not encouraged here, and that one should study the markup and formatting before creating or changing entries. TeleComNasSprVen (talk) 06:12, 13 January 2014 (UTC)

No, it should not be marked {{inactive}}. I think the page is pretty accurate, and if it should be improved in some respect (e.g. by noting our rigid formatting conventions), then it should be improved in that respect, rather than discarded wholesale. Who's been telling people not to be bold? —Ruakh_TALK 01:02, 14 January 2014 (UTC)[reply]

Here and here are a couple of examples. Plus, I always thought it was common practice to delete malformed entries, even ones containing correct or somewhat useful definitions, if they are so badly formatted that they are not worth saving, and to point the creator to WT:ELE as appropriate. TeleComNasSprVen (talk) 02:16, 14 January 2014 (UTC)

The first is from 2005, so it predates all of the current content of WT:BOLD (see Wiktionary:Be bold in updating pages?oldid=258321); so it can't possibly be taken to mean that said content is now obsolete. (And anyway, it says "the mantra 'be bold' pertains to entering new, undefined words first and foremost", which while not perfectly consistent with the current WT:BOLD, is not IMHO a total rejection of it.)

The second was created in September 2009 by {{subst:pediawelcome}} (or the equivalent); that template was rewritten a few months later by Dominic (talk • contribs) to be "actually welcoming, hopefully" (see Template:pediawelcome?diff=7797753). So it's the welcome-message that's now obsolete, not the policy that it's out of harmony with.

As for deleting malformed entries — that's not at all in conflict with WT:BOLD. See w:WP:BRD.

—Ruakh_TALK 02:38, 14 January 2014 (UTC)[reply]

Ruakh's edits to Module:links

Ruakh recently made a change to this module which removed a piece of code that checks whether terms in reconstructed languages are indicated appropriately with a *. So that means that it has now become possible to omit the * and make it look like it's an attested term when it's not. This has already caused some problems with people (including myself) forgetting the asterisk, so I reverted the change but it was immediately reverted back again. I put the check back in place, and asked him to solve the problem another way, but he just reverted again and has now threatened to block me if I restore the original situation again. Ruakh bases his decision on this discussion, which is ridiculous because that discussion neither addressed how to report errors to users (which is why I brought it up a few days ago) nor explicitly said that we should ignore mistakes just for the sake of not showing errors anywhere. Like I said, I asked Ruakh to find another way to report the error, but he ignored that request. So now we have buggy code that will probably cause problems in the future, and I'm being threatened with a block for trying to fix it? —CodeCat 17:40, 15 January 2014 (UTC)

How about this: ~~Fix all the missing asterisks and then change the module.~~ --Wiki Tiki 89 17:43, 15 January 2014 (UTC)[reply]

What I meant was: Use Ruakh's version for now, fix all the missing asterisks, and then change the module back. --Wiki Tiki 89 17:48, 15 January 2014 (UTC)[reply]

There aren't any pages with missing asterisks, thanks to the check, which has been there for months now. Ruakh changed code that was valid, and which worked well for a long time, merely out of the principle of "script errors are bugs", but without actually looking at why the errors were there to begin with. The edit didn't actually fix any script errors, it merely prevented them from occurring when users did something they shouldn't do. That's why I brought it up in the discussion above: how do we report mistakes to users? Ruakh, implied by his edit, seems to think that silently ignoring mistakes without any way to track them down and fix them, is the best way to go. —CodeCa t 18:02, 15 January 2014 (UTC)[reply]
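One possible middle ground between raising a script error and silently ignoring the mistake is to render the link anyway and record the page in a tracking category. The sketch below is Python for illustration only and is not what either editor implemented in Module:links; the function name format_link and the category name are hypothetical, and the set of language codes is a small sample.

```python
# Sample codes for reconstructed languages (Proto-Germanic, Proto-Indo-European).
RECONSTRUCTED_LANGUAGES = {"gem-pro", "ine-pro"}

# Hypothetical tracking category name, invented for this sketch.
TRACKING_CATEGORY = "Category:Links to reconstructed terms without asterisk"


def format_link(term, lang_code):
    """Return (display_text, tracking_categories) without ever raising."""
    categories = []
    if lang_code in RECONSTRUCTED_LANGUAGES and not term.startswith("*"):
        # The link still renders; the page is only flagged for later cleanup.
        categories.append(TRACKING_CATEGORY)
    return term, categories
```

Under this approach a missing asterisk shows no error to readers, but editors can find and fix the affected pages through the category.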

I have blocked you for one day, due to your restoration of the bug. You say that your intentional bug will prevent accidental mistakes, but that is no justification — especially since you have recognized that there are other ways to find and prevent accidental mistakes. —Ruakh_TALK 17:52, 15 January 2014 (UTC)[reply]
I have unblocked CodeCat for a couple reasons. 1. You shouldn't be the one issuing the block since you are directly involved in the conflict. 2. We need to have a discussion here about what to do and CodeCat needs to participate. --Wiki Tiki 89 17:59, 15 January 2014 (UTC)[reply]

I'm involved in the conflict, but the block was not about the conflict. (Are you saying that if you warn a vandal, you can't also block him/her?) Your point #2 holds, but only as long as CodeCat refrains from reinstating her beloved bug. —Ruakh_TALK 20:20, 15 January 2014 (UTC)[reply]
But CodeCat wasn't being a "vandal" by any definition I know of. Essentially you blocked CodeCat because CodeCat reverted an edit you made, and that's how you're involved, not because of the warning. --Wiki Tiki 89 20:27, 15 January 2014 (UTC)[reply]

I blocked CodeCat because she was repeatedly re-adding code to generate script errors, after I pointed out that the community had discussed this and agreed that script errors are bugs that deter readers and editors. That is what blocking is for: preventing damage from being done. The fact that I'm the one who removed the bug is not relevant. (Since then, her comments have made me furious, and I agree that someone else should take on the mantle of blocking her if she persists in her disruptive edits. But at the time, I think my block was reasonable, and I don't think there's a problem with my being the one who issued it.) —Ruakh_TALK 20:38, 15 January 2014 (UTC)
- I'm furious about you too, and I would love it if someone else could take over the "mantle" in dealing with you, too. I love how you say it though... as if blocking me is a responsibility that you take on because nobody else does. That just confirms my suspicions that you see yourself as my personal babysitter. —CodeCa t 20:42, 15 January 2014 (UTC)[reply]

And regarding the main point — you cannot, cannot, cannot prevent people from forgetting the asterisk on reconstructed terms. You lack that ability. As long as someone can write From Proto-Germanic *''wer'', they can write From Proto-Germanic ''wer''. So if you think that destroying data and driving away contributors is a good way to prevent mistakes, then we should just close down the project: destroy all the data, drive away all the contributors, and prevent all the mistakes. Problem solved. —Ruakh_TALK 17:58, 15 January 2014 (UTC)[reply]

You misunderstand. It's your edit that is the bug, because it removes a check on something that we do not want, and did not make any attempt to write code that checks for the condition any other way. So I would have to conclude that you don't actually intend to fix any bugs, and are rather misinformed about the discussion you keep using as the justification of your edit. To make it clear: I do not see the discussion as supporting your edit, and until it's clarified that there is support for it, it's out of line for you to keep reinstating it, and definitely out of line to block me and force your changes through after I call for broader discussion of the problem, effectively preventing me from taking part in that discussion. —CodeCa t 17:59, 15 January 2014 (UTC)[reply]

I understand just fine. The top priority is to remove intentional bugs, which — yes — your edit is, per the previous discussion. A second priority is to find better ways to solve the problems that your intentional bug was trying to solve. But your priorities are out of whack. You're trying to use broken technical means to impose your will on all editors. It's hypocritical to talk about me "forc[ing] [my] changes through", when this entire discussion is about how forceful you should be allowed to be in forcing your will. —Ruakh_TALK 20:12, 15 January 2014 (UTC)[reply]

Like I said below, and on my talk page, no amount of consensus can make the reporting of user errors a bug. That's just sticking your head in the sand. So no, my edit was not a bug, the discussion you keep citing was just misinformed and also misinterpreted by you. And as far as enforcing will goes, I'm not the one who blocked someone. I'm also not the one who blocked someone more than once to force an issue; you have a history of blocking me to try to force your will. It's rather frustrating when I have to deal with having you as my personal babysitter when I can't do anything about it without going down to your level. —CodeCa t 20:33, 15 January 2014 (UTC)[reply]

I have restored the version of 19:01, 14 January 2014, on the basis that WT:BP#How_should_templates_respond_to_bad_input.3F from 12 January evinces enough support for the idea that 'showing some kind of error message, even if we need to refine what kind, is better than silently throwing out content' to counter whatever "consensus" it is being claimed was formed in the October 2013 BP discussion (for which several of the editors who participated in the January BP discussion had not been around/involved). I myself was recently confused to land on a page (as I deleted the language family code aus per RFDO) that said its etymology was "from language family". I was a bit confused at first, and that despite the fact that the nature of my task made me aware that I should be looking for content (namely a template calling aus) that no longer worked. A less-adept user who simply mistypes a language code and finds that nothing shows up in the entry is likely to be even more bewildered. (But that is getting off topic. Perhaps I should move the latter part of this comment to the 12 Jan thread.) - -sche (discuss) 21:03, 15 January 2014 (UTC)[reply]

Longer term solution

We need to have a way of both displaying an error message and displaying useful text. For example something like this: Proto-Germanic wer^(Error). --Wiki Tiki 89 18:05, 15 January 2014 (UTC)[reply]

It's certainly possible to do that, and that's what the discussion above is about. But even so, a reasonable makeshift way of reporting mistakes, one that would have allowed us to track mistakes without showing script errors, would be to add a tracking category. I even asked Ruakh to do that as a way of solving the problem, but he didn't. —CodeCa t 18:07, 15 January 2014 (UTC)[reply]

I don't see what's wrong with a hidden clean-up category. --Wiki Tiki 89 18:14, 15 January 2014 (UTC)[reply]

Neither do I, I have made plenty of them myself. I do like your solution better though, that way it shows that there is a mistake to the user as well. —CodeCa t 18:17, 15 January 2014 (UTC)[reply]

Oh, fuck you. If you're saying that you're not capable of creating a tracking category for this, and that it's therefore my responsibility to do so, then by all means, please permablock yourself. —Ruakh_TALK 20:07, 15 January 2014 (UTC)[reply]

(By the way, for the benefit of those who haven't looked at the relevant code: the reason it doesn't add a tracking category is that it's set up in a way that makes that impossible. Currently, the only way it can report a problem is to generate a script error. So first we need to figure out what it should do instead, then we need to change all the code that uses it to actually support that, and lastly we need to change it to do that. So far as I'm aware, no one has objected to having a tracking category, it's just that no one has done the sizeable amount of work involved in making that happen.) —Ruakh_TALK 20:15, 15 January 2014 (UTC)[reply]

Right, like it doesn't matter when I said that we should wait until we have a more permanent solution. I actually let the change go through for a while, because I figured we would come up with a solution soon enough and the problem wouldn't be too bad. But when I noticed that the errors were already piling up, I realised that if we didn't put the check back in, we'd be swamped in pages to fix once we did come up with a proper solution for tracing them. So I figured, it's better to delay things for a bit, rather than make it worse for us in the future. I think it's rude to say that just because I think tracking is absolutely necessary, I have to deal with your impatience in removing errors. The whole premise of this is wrong to begin with; errors are not bugs if the editors cause them. No amount of discussion or consensus can change that. —CodeCa t 20:33, 15 January 2014 (UTC)[reply]

I agree that "errors are not bugs if the editors cause them" and I also agree script errors are annoying and should be avoided if errors are too common. I think both CodeCat and Ruakh should calm down and work on a longer term solution. A hidden tracking category and an error message, as suggested by Wikitiki89, sound good. If an editor's error is quite common it's better to avoid script errors and add some code to avoid it. --Anatoli (обсудить/вклад) 22:08, 15 January 2014 (UTC)[reply]

Hidden tracking category and an error message is almost what we already have. The only difference is how we display the error, and what else we display alongside it. There are some cases where an error should just be an error. In this case, the error is not specifying reconstructed terms with *. The module could work around that, but I don't think it should, because that would make it work inconsistently: {{l|ine-pro|x}} and {{l|ine-pro|*x}} would then do the same thing, while {{l|la|x}} and {{l|la|*x}} would do different things. That would probably just confuse people. I think it's better if {{l|ine-pro|x}} is disallowed altogether, like it has been so far. Then the module would work consistently and there are no surprises. —CodeCa t 22:15, 15 January 2014 (UTC)[reply]
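For illustration, here is a minimal sketch of the kind of check being described: a term in a reconstructed language is rejected unless it starts with an asterisk, while other languages are left alone. The function names are hypothetical, and treating any code that ends in "-pro" as a reconstructed language is an assumption made for this sketch, not a description of the actual module.

```lua
-- Assumption for this sketch: a language code ending in "-pro" marks a
-- reconstructed (proto-)language. Both function names are hypothetical.
local function is_reconstructed(code)
    return code:sub(-4) == '-pro'
end

-- Returns true when the term is acceptable, or false plus a message
-- when a reconstructed-language term lacks the leading asterisk.
local function check_term(code, term)
    local starred = term:sub(1, 1) == '*'
    if is_reconstructed(code) and not starred then
        return false, 'term in ' .. code .. ' must start with *'
    end
    return true
end
```

With this check, `check_term('ine-pro', 'x')` fails while `check_term('ine-pro', '*x')` and `check_term('la', 'x')` pass, which is the "disallowed altogether" behaviour rather than a silent workaround.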

Re: "errors are not bugs if the editors cause them": pure bullcrap; I hope you are not a programmer. A program that dumps on user input that it considers invalid is faulty. --Dan Polansky (talk) 18:11, 17 January 2014 (UTC)[reply]

And silently failing is better? That would be even worse, and nobody would write a program like that. Programs do often dump when you give them invalid input. If I type "cp" on my terminal by itself, it spits back an error too. That's the same as what our script errors do. —CodeCa t 19:32, 17 January 2014 (UTC)[reply]

Whether silently failing is better than dumping depends on what the failing involves, and on your application domain. Furthermore, "failing" in any serious sense is certainly not the only alternative to "dumping", and it is not even a proper contrast term, since dumping is a species of failing. Programs that dump when given invalid input are faulty. Let me emphasize that I comment on the general comment "errors are not bugs if the editors cause them", an arrogant Unix geek stance ("users are morons; let's get rid of them") that usability people have been trying to extirpate for years. --Dan Polansky (talk) 19:42, 17 January 2014 (UTC)[reply]

As I see it, the only problem here is that the template throws a big, red, scary, ugly error to the user's face. It should be more discreet, like a simple ^⚠ + tracking category (and if the error is not important, then only a tracking category would be used). As said in the conversation about script errors, the big red script error should never occur in pages, as these would be real bugs. There are better ways to mark input errors than to shriek at the users. Dakdada (talk) 20:34, 18 January 2014 (UTC)[reply]

NB : It's the same difference between "cp" gently telling you that it needs an argument (with a link to more help), and a program that segfaults and dumps unintelligible gibberish at you. The first one is aimed at the user, the second at the programmer. Dakdada (talk) 20:37, 18 January 2014 (UTC)[reply]

What he said. DCDuring TALK 22:10, 18 January 2014 (UTC)[reply]

In every discussion of this so far, there has been broad agreement on making the error message's font smaller. So... where is its size controlled? Let's at least take that first step and make its font smaller. Various users have had different ideas about whether the error message should be made longer ("an error occurred") or shorter ("⚠"), but at least we could make the font smaller. - -sche (discuss) 00:18, 19 January 2014 (UTC)[reply]

I tried looking for it in the MediaWiki namespace, but I couldn't find it. --Wiki Tiki 89 00:23, 19 January 2014 (UTC)[reply]

It's in MediaWiki:Common.css. But you can only change how errors are displayed there, not what should be displayed alongside the error. To change that, we'd need some thorough changes to our modules, which would require a lot of thought and discussion beforehand. That's why I got so upset with Ruakh's actions here... he implemented a "decision" that talked only about some vague desired end result, and completely ignored whether it actually achieves what we want best. "If the people desire no error messages, then we declare that error messages shall be eradicated by any means necessary", is what it sounded like to me. —CodeCa t 00:30, 19 January 2014 (UTC)[reply]

Where does the text "Script error" come from? It's not in MediaWiki:Common.css. --Wiki Tiki 89 00:36, 19 January 2014 (UTC)[reply]

Oh, no. The text itself is in the system messages. If you search the messages for "scribunto" you'll see them. —CodeCa t 00:40, 19 January 2014 (UTC)[reply]

Where are the system messages located? --Wiki Tiki 89 00:43, 19 January 2014 (UTC)[reply]

Special:AllMessages —CodeCa t 00:47, 19 January 2014 (UTC)[reply]

Would anyone object to changing "Script error" to "Module error"? --Wiki Tiki 89 01:00, 19 January 2014 (UTC)[reply]

No objection from me. PS I just set the error message's font size to 50%. - -sche (discuss) 01:29, 19 January 2014 (UTC)[reply]

These seem like desirable short-term improvements that will cause fewer of the user complaints. We'll almost certainly not have any longer term solution if the complaints abate even a little. — This unsigned comment was added by DCDuring (talk • contribs).

So you're saying that while we're coming up with the long-term solution, we should make Wiktionary look as bad as possible? --Wiki Tiki 89 02:14, 19 January 2014 (UTC)[reply]

No. I'm predicting behavior of the relevant contributors based on sad and continuing experience. I'd be very happy for even a modest decrease in unusability. DCDuring TALK 15:38, 22 January 2014 (UTC)[reply]

Error messaging module

Here is an idea: we could write a simple error module (here in "err") that would be used like this:

-- "err" is the proposed error module: each check records a message
-- for the reader and a tracking category for clean-up
if main_parameter == nil then
    err.add_error('Main parameter is missing')
    err.add_category('Template X without main parameter')
elseif main_parameter == '' then
    err.add_error('Main parameter is empty')
    err.add_category('Template X with empty main parameter')
end

And at the end of a script, we would have:

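-- "text" is the wiki text the script has built so far
local errors = err.make_errors_message()  -- all recorded messages, concatenated
local categories = err.make_categories()  -- all recorded tracking categories
local final_text = text .. errors .. categories
return final_text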

The return value would be the wiki text, followed by a concatenated error message (right now we use error() for this), and finally all the (error) categories that need to be created to mark errors in pages. Or something like that. Dakdada (talk) 14:12, 21 January 2014 (UTC)[reply]
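To make the proposal concrete, here is a minimal sketch of what such an "err" module might look like. Only the four function names come from the proposal; the internals, the superscript markup and the "module-error" class name are assumptions for illustration. It also assumes the collected state does not persist from one invocation to the next.

```lua
-- Hypothetical sketch of the proposed "err" module. In a real module
-- this table would be returned at the end of the file.
local err = {}

local messages = {}    -- error texts collected so far
local categories = {}  -- tracking category names collected so far

function err.add_error(message)
    table.insert(messages, message)
end

function err.add_category(name)
    table.insert(categories, name)
end

-- Join the collected messages into one small inline marker.
-- The class name "module-error" is made up for this sketch.
function err.make_errors_message()
    if #messages == 0 then
        return ''
    end
    return '<sup class="module-error">(' .. table.concat(messages, '; ') .. ')</sup>'
end

-- Turn each collected category name into a wikitext category link.
function err.make_categories()
    local links = {}
    for _, name in ipairs(categories) do
        table.insert(links, '[[Category:' .. name .. ']]')
    end
    return table.concat(links)
end

-- Usage, mirroring the snippets above with an empty main parameter.
local main_parameter = ''
if main_parameter == nil then
    err.add_error('Main parameter is missing')
    err.add_category('Template X without main parameter')
elseif main_parameter == '' then
    err.add_error('Main parameter is empty')
    err.add_category('Template X with empty main parameter')
end
local final_text = 'text' .. err.make_errors_message() .. err.make_categories()
```

The useful text is still returned, and the message and category simply ride along after it.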

I have thought of that, but it has a serious drawback that all modules have to call that module at their entry and exit points, to make sure that the categories are added to the output. If someone forgets it (which is likely), it all falls apart. This is why script errors are really so useful; they work without having to write code to support them. We really need a way to specify that a certain module should be used as a "constructor" and "destructor", before and after the main module runs. That way, we could insert extra code before and after every module. But there is currently no support for this in Scribunto. —CodeCa t 14:35, 21 January 2014 (UTC)[reply]

I'm not sure I understand. How would you write what I wrote above, if there was a constructor/destructor? Dakdada (talk) 16:13, 21 January 2014 (UTC)[reply]

@CodeCat, I don't see why this wouldn't work. Any module that does not use the error module simply won't report errors, all other modules will work fine. The important thing is that the originally invoked module returns the error text. --Wiki Tiki 89 16:18, 21 January 2014 (UTC)[reply]

What do you mean with "use" the error? —CodeCa t 16:26, 21 January 2014 (UTC)[reply]

I meant "error module". --Wiki Tiki 89 17:00, 21 January 2014 (UTC)[reply]

Ok, but let's say a module invokes another module, and that second module triggers an error. If the first module doesn't handle that error, the error will go completely unreported. It's not like with exception handling where an unhandled exception will cause the program to terminate, in this case not handling it will mean ignoring it. That's not really desirable. —CodeCa t 17:07, 21 January 2014 (UTC)[reply]

Yes, but that only applies to the originally invoked module. If the originally invoked module returns the errors that it should then even if a module somewhere up in the chain doesn't handle errors, the errors from all other modules will be returned. So as I just said before, the important thing is that the originally invoked module returns the error text. --Wiki Tiki 89 17:11, 21 January 2014 (UTC)[reply]

That would mean that every function that could potentially be invoked from a template (an "entry point" function) would have to explicitly call some function to handle the errors and add the error categories, before the result is returned back to wiki-space. I don't think that's workable at all. That's why I said we need something like a destructor; a function that is called automatically by Scribunto itself, after the invoked module returns its value back to template space. A kind of automatic postprocessor. —CodeCa t 17:16, 21 January 2014 (UTC)[reply]
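Short of an automatic postprocessor in Scribunto itself, one possible workaround is a small wrapper that every entry point is passed through once, at the place where it is exported. The sketch below is only an illustration: `wrap_entry_point`, the markup and the category name are hypothetical, and it assumes `pcall` is available in the module sandbox. It does not remove the objection that wrapping can be forgotten; it only reduces the work to one line per entry point.

```lua
-- Hypothetical helper: wraps an entry-point function so that any error
-- raised inside it is shown as a small marker plus a tracking category
-- instead of aborting the whole invocation.
local function wrap_entry_point(f)
    return function(frame)
        local ok, result = pcall(f, frame)
        if ok then
            return result
        end
        -- "result" holds the error message when pcall reports failure.
        return '<sup class="module-error">(' .. tostring(result) .. ')</sup>'
            .. '[[Category:Pages with module errors]]'
    end
end

-- Example entry point; level 0 keeps position info out of the message.
local show = wrap_entry_point(function(frame)
    if frame.args[1] == nil then
        error('Main parameter is missing', 0)
    end
    return 'ok: ' .. frame.args[1]
end)
```

Deeper functions can keep calling `error()` as they do now; only the outermost layer decides how the failure is displayed.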

I think that is workable. Every entry point already has to do special things at the beginning and end, adding another won't be too bad. --Wiki Tiki 89 17:24, 21 January 2014 (UTC)[reply]

It just adds more things to check. If someone forgets it, someone else has to clean it up. It should be automatic so that it can't be forgotten. —CodeCa t 17:25, 21 January 2014 (UTC)[reply]

Well it's not the end of the world. If someone forgets to handle errors, it would probably produce a script error anyway. --Wiki Tiki 89 17:28, 21 January 2014 (UTC)[reply]

No, it will ignore the error. That's the point I'm trying to make. It means everyone has to cooperate with handling errors, and they will be silently ignored if forgotten. That's why I think script errors are better; they don't let you ignore them unless you really want to. —CodeCa t 17:31, 21 January 2014 (UTC)[reply]

But it can't just "ignore" the error. An error means something went wrong, and since it wasn't fixed, something else will continue to go wrong. For example, if a language code is not found, presumably Module:languages will report the error and return nil. So then when the module tries to use the result, it will fail. --Wiki Tiki 89 17:35, 21 January 2014 (UTC)[reply]

That's true, but let's say Module:links then tries to use that result, which it can't because it's nil. So it checks to see if the result was nil, and because it is, it reports the error to "err" in Dakdada's example. And then what? Module:links can do different things now; it can either return nil itself, or it can decide to use "und" as a fallback. In any case, the error is now "hanging in limbo" inside the "err" module. It's now up to whoever called Module:links (for example Module:headword) to report that error, which could easily be forgotten. —CodeCa t 17:41, 21 January 2014 (UTC)[reply]

Then let's have Module:links return nil as well. --Wiki Tiki 89 17:48, 21 January 2014 (UTC)[reply]

But then it would no longer be able to return anything useful, which is the whole point of handling errors non-fatally. If the module is going to give up and do nothing anyway, it might as well just throw a script error. —CodeCa t 17:52, 21 January 2014 (UTC)[reply]

That only means that it won't return anything useful if it doesn't know how to handle the error. If a module does handle the error, it will notice that the result was nil and try again with the und code. --Wiki Tiki 89 17:55, 21 January 2014 (UTC)[reply]

I suppose that can work... but it's kind of cumbersome. Also, returning nil just means that there was an error, but there's no way to tell which. So it's possible that trying again with "und" is not going to make a difference at all. —CodeCa t 18:01, 21 January 2014 (UTC)[reply]

Well maybe the error module can help with figuring that out. --Wiki Tiki 89 18:02, 21 January 2014 (UTC)[reply]

What do we do when a function is expected to return nil as a normal value? How do we distinguish that from an error? —CodeCa t 18:03, 21 January 2014 (UTC)[reply]

Then it probably won't matter if there was an error. --Wiki Tiki 89 18:06, 21 January 2014 (UTC)[reply]
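One common Lua convention that may help with the two questions above (which error occurred, and how to tell an error from an ordinary nil) is to return a second value that describes the problem. The sketch below uses a made-up lookup table and hypothetical function names in place of the real language data; it is an illustration of the convention, not of any existing module.

```lua
-- Stand-in data for this sketch; the real language data lives elsewhere.
local known_languages = { de = 'German', la = 'Latin', und = 'Undetermined' }

-- Returns the name, or nil plus a description of what went wrong.
local function get_language_name(code)
    local name = known_languages[code]
    if name == nil then
        return nil, 'unknown language code: ' .. tostring(code)
    end
    return name
end

-- A caller that knows how to recover: fall back to "und", but pass the
-- problem along so that it can still be reported further up.
local function language_name_or_fallback(code)
    local name, problem = get_language_name(code)
    if problem then
        local fallback = get_language_name('und')
        return fallback, problem
    end
    return name, nil
end
```

Because the problem travels in its own return value, a function whose normal result may be nil can still signal an error unambiguously.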

"Ok, but let's say a module invokes another module, and that second module triggers an error." : if the second module triggers an error when it's been fed valid arguments, it means that it has not been properly tested: unit-tests are meant for this case. To feed valid arguments is the main purpose of the functions I proposed: they are therefor only meant to be used in Modules that process a template parameters. Dakdada (talk) 20:25, 21 January 2014 (UTC)[reply]

That's only doable to some extent, because there can be many different "pathways" to a common function. It makes more sense for that single function to validate its parameters, rather than have each "entry point" function validate them separately in an identical way. An example is Module:links#full_link, which requires at least one of the term, alt or transliteration to be not nil. Yet that function is called from many different other modules, such as Module:headword, Module:translations and Module:form of, as well as through {{term}}. It would be impractical for all of those modules and templates to make the same "one of these three is not nil" validation; it's easier to let full_link do that, because it knows what it needs. —CodeCa t 20:46, 21 January 2014 (UTC)[reply]

In doing so, only a programmer would understand what is wrong with the error that is reported. We want to inform the user what is wrong with the input for the template in use: is the first argument invalid? Is there a missing argument? It sounds like you're saying that there should be no input checks in template modules. Again, it is like letting "cp" return an unintelligible internal error when it should just say that such or such argument is missing or wrong. Also, what about categorising errors? Module errors are all dumped in a single category: this is just not usable if the error only stems from invalid arguments to a template. Dakdada (talk) 09:51, 22 January 2014 (UTC)[reply]

Of course there should be input checks. I just don't think that it should necessarily be the entry point function that does all the checking. Sometimes it's more straightforward to let a "deeper" level function like full_link do the check. The problem, then, is how that deeper function should report back to the entry point function what went wrong. That's really what we're solving here: finding a way to make errors propagate outward so that the entry point is aware of them and can take measures to report them on the page in some way. —CodeCa t 14:27, 22 January 2014 (UTC)[reply]

Every called module should return an array containing error information obtained by either creating the array if it's not calling anything, or by combining the error arrays from the modules it calls and its own error information. In cases where there's no error, this can be empty, otherwise it would be the module where the error occurred and the error that occurred. The error module would then sort through the data and give a suitable output. We would need to make this protocol a more-or-less-mandatory practice, but it seems like the best way to lend some sanity to error-handling. Chuck Entz (talk) 14:51, 22 January 2014 (UTC)[reply]
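A rough sketch of this protocol, under assumptions: each function returns its normal result together with an array of error records, and a caller merges the arrays of everything it calls into its own. All function names and the record fields (`module`, `message`) are hypothetical and only loosely modelled on the modules mentioned in this thread.

```lua
-- Append every record in "from" to "into".
local function merge_errors(into, from)
    for _, e in ipairs(from) do
        table.insert(into, e)
    end
    return into
end

-- Stands in for a deep function: knows only "de" in this sketch and
-- falls back to "und" while recording what went wrong.
local function get_language(code)
    local errors = {}
    if code ~= 'de' then
        table.insert(errors, { module = 'languages', message = 'unknown code ' .. code })
        return 'und', errors
    end
    return code, errors
end

-- Stands in for a mid-level function: combines the callee's errors
-- with its own before returning.
local function full_link(code, term)
    local errors = {}
    local lang, lang_errors = get_language(code)
    merge_errors(errors, lang_errors)
    if term == nil then
        table.insert(errors, { module = 'links', message = 'no term given' })
        term = ''
    end
    return lang .. ':' .. term, errors
end

-- What the error module might do with the combined array.
local function format_errors(errors)
    local parts = {}
    for _, e in ipairs(errors) do
        table.insert(parts, e.module .. ': ' .. e.message)
    end
    return table.concat(parts, '; ')
end
```

An empty array means no error, so the entry point can always call `format_errors` on whatever it receives.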

The "mandatory practice" is what undermines its reliability. It's easy to forget, either through inexperience or inattention. But such omissions are hard for us to track down and fix. We need a system that's impossible to ignore, something that doesn't rely on cooperation from each module to work. —CodeCa t 15:01, 22 January 2014 (UTC)[reply]

As our lead Lua/Scribunto expert, you could model the good behavior that you seek. Perhaps take a simple case, then harder cases, then the really common ones. Or you could simply say that you won't do it - and won't dump all over the person who does - so someone else may do what so many of us hope for. DCDuring TALK 15:43, 22 January 2014 (UTC)[reply]

If only it were that easy. Scribunto has limitations too, and this is one of them... —CodeCa t 15:48, 22 January 2014 (UTC)[reply]

Then it seems like we have a bad strategy: we are asking a system to do something it can't do in a way that maintains an adequate level of usability, one sufficient to encourage new contributors and not discourage regular contributors. What are some alternative strategies? DCDuring TALK 15:59, 22 January 2014 (UTC)[reply]

Either work around it (which we're still trying to figure out), or request a change in Scribunto. —CodeCa t 16:09, 22 January 2014 (UTC)[reply]

I think a good solution would be (if it is possible) to have the Scribunto developers introduce a new feature that could wrap all Module invocations with the output of the error module. --Wiki Tiki 89 17:49, 22 January 2014 (UTC)[reply]

Request for comment on Commons: Should Wikimedia support MP4 video?

I apologize for this message being only in English. Please translate it if needed to help your community.

The Wikimedia Foundation's multimedia team seeks community guidance on a proposal to support the MP4 video format. This digital video standard is used widely around the world to record, edit and watch videos on mobile phones, desktop computers and home video devices. It is also known as H.264/MPEG-4 or AVC.

Supporting the MP4 format would make it much easier for our users to view and contribute video on Wikipedia and Wikimedia projects -- and video files could be offered in dual formats on our sites, so we could continue to support current open formats (WebM and Ogg Theora).

However, MP4 is a patent-encumbered format, and using a proprietary format would be a departure from our current practice of only supporting open formats on our sites -- even though the licenses appear to have acceptable legal terms, with only a small fee required.

We would appreciate your guidance on whether or not to support MP4. Our Request for Comments presents views both in favor and against MP4 support, based on opinions we’ve heard in our discussions with community and team members.

Please join this RfC -- and share your advice.

All users are welcome to participate, whether you are active on Commons, Wikipedia, another Wikimedia project -- or any site that uses content from our free media repository.

You are also welcome to join tomorrow's Office hours chat on IRC, this Thursday, January 16, at 19:00 UTC, if you would like to discuss this project with our team and other community members.

We look forward to a constructive discussion with you, so we can make a more informed decision together on this important topic. Keegan (WMF) (talk) 06:46, 16 January 2014 (UTC)[reply]

Second request to use AutoWikiBrowser

I, kc_kennylau, request again to use AutoWikiBrowser (abbreviated AWB). I am requesting it the second time because the first time had no result. I have done 1831 edits up till now, and am a user permitted to use AWB in the Chinese Wikipedia. I would like to use it to generate declension forms from the links that SemperBlotto gave me as well as performing other repeating actions in the future. I have read through the list of rules very well, and promise not to abuse it in any way. --kc_kennylau (talk) 11:18, 16 January 2014 (UTC)[reply]

I have no objection to granting this permission. Does anyone? bd2412 T 14:04, 17 January 2014 (UTC)[reply]

Just to be on the safe side with a relatively new user here, shouldn't we know some specifics, like what languages? Just German? Shouldn't SB, who is working with kennylau, or someone else with German knowledge be the one to OK this? DCDuring TALK 14:14, 17 January 2014 (UTC)[reply]

I have added his username to Wiktionary:AutoWikiBrowser/CheckPage and will keep an eye on him. SemperBlotto (talk) 15:57, 17 January 2014 (UTC)[reply]

The main problem I have is that we're doing the process backwards here: no one has seen fit to nominate this user to be an autopatroller, which is like giving someone a commercial trucking license before they have a regular driver's license instead of a learner's permit. Right now their edits make up at least 75% of the unpatrolled Recent Changes; I had to manually edit the URL from the maximum of 500 entries to 1000 entries just to see anything previous to 9 hours ago. Chuck Entz (talk) 19:27, 17 January 2014 (UTC)[reply]

Well, I was on RC patrol and was going to ask him myself to throttle the edits/minute until I saw this thread, but I think it's ok to give him a try with the new tools while having someone like Semp watch over him. Besides, it happened with BD2412's mass creation of "great-great-" redirects while I was on RC patrol, albeit that was much easier to manage because the process in an instant burst took less than ten seconds to finish, and did not obscure too many edits in between. TeleComNasSprVen (talk) 19:52, 17 January 2014 (UTC)[reply]

Please note that I am not unfamiliar with the functions of the AutoWikiBrowser, as I have been using it on Chinese Wikipedia, so this is not a new tool for me. --kc_kennylau (talk) 01:27, 18 January 2014 (UTC)[reply]

Regional terms in otherwise relatively homogeneous languages

See also: Talk:мазган

I had a little disagreement with Anatoli about whether or not to tag the Israeli Russian term мазга́н (mazgán) as "rare". This word is borrowed from Hebrew and although very common among Russian speakers in Israel, it is relatively unknown among Russian speakers outside of Israel, including in Russia itself. Israel has a very large community of native Russian speakers (the third largest outside of the former USSR) making up about 20% of its population. There are many Russian newspapers in Israel; books are written and published in Russian in Israel; etc. On the other hand, the population of Israel in total is very small compared to the rest of the Russian speaking world. The consequence is that even though Israeli Russian words such as мазга́н (mazgán) are very common in Israel, they are relatively unknown in the majority of the Russian speaking world.

In English, things like this are not a problem. When you see a context tag such as "Scottish", the reader will immediately know that this word is unlikely to be understood outside of Scotland. This is because English is a pluricentric language and anyone who deals with English knows this. However, Russian is a very homogeneous language and people don't generally expect to find words in a dictionary that people in Russia would not understand, unless of course this word is tagged with "rare", "obsolete", or something like that. Since exclusively regional words are uncommon in Russian, a reader might not expect to find a tag such as "Israel", the way we have "US", "UK", "Australia", etc. for English. Anatoli is worried that the reader would not think to assume that the word is rare outside Israel and wants to add an additional "rare" tag to the definition. I, however, am worried that if there is a "rare" tag, then the reader might assume that the term is rare even in Israel.

So... Does there need to be a "rare" tag or usage note in cases like this, in any language where this situation applies? --Wiki Tiki 89 00:04, 17 January 2014 (UTC)[reply]

Actually, many of the words that Wiktionary tags as "Scottish", and even some that claim to be exclusively "Scots", are well known in northern England (and possibly parts of America?). I've altered a few. Dbfirs 20:42, 28 January 2014 (UTC)[reply]

Well that was just an example, but I'm sure that there are many Scottish words that are only known and used in Scotland (and we therefore probably don't have many of them here). --Wiki Tiki 89 20:45, 28 January 2014 (UTC)[reply]

I think just tagging it as "Israel" should be enough. Adding "rare" would just mean that it's rare even in places where it's used. That's what {{context|US|rare|lang=en}} would mean too, wouldn't it? —CodeCa t 00:09, 17 January 2014 (UTC)[reply]

That's what Wikitiki89's concern is but as he also mentioned, Russian is considered quite a homogenous language, despite the large territory of Russia and ex-USSR. There are usually no Israeli, Ukrainian, Chinese, American or German Russian defined, despite sizeable Russian speaking communities. From a native speaker's point of view, I'd really like to mark a word "rare", even if it's well-known in a region. Like I mentioned on the talk page, бу́шма (búšma) is both regional and rare, буря́к (burják) is regional but not rare, it doesn't mean that бу́шма (búšma) is rare in a particular region where it's used. Overseas Russians, including those living in Australia, use Кри́смас (Krísmas) for "Christmas", Russians in Germany use гаште́т (gaštét) (Gaststätte) for "cafe". --Anatoli (обсудить/вклад) 00:28, 17 January 2014 (UTC)[reply]

I think "rare" should mean "rare within the area where it is known/used" and not "rare compared to all places that speak the language". If we use it in the latter sense, then a lot of regional terms would suddenly have to be marked rare for no real reason. And besides, the label "Israel" for a Russian term already implies that something is rare, if the user knows how many speakers there are in Israel compared to Russia. There's no need to specify it. For comparison, a label "Luxembourg" on a German entry wouldn't need "rare" either. —CodeCat 00:33, 17 January 2014 (UTC)[reply]

I agree. The automatic categorization that comes from the label doesn't differentiate between universally rare and locally common, so that label should be avoided if a sub-lect label is available. Otherwise we would be forced to change labels such as "Scouse" or "Jamaica" to always categorize to Category:English rare forms, since there are so few speakers of those varieties compared to those in the entire Anglosphere. Chuck Entz (talk) 04:35, 17 January 2014 (UTC)[reply]

OK, I'm convinced now, if there is no dispute about the existence of "Israeli Russian" (new category Wiki has created). Thanks, everyone. --Anatoli (обсудить/вклад) 04:40, 17 January 2014 (UTC)[reply]

There’s another issue to consider, which is the precision of our labelling. I believe all labels applied to a term or sense should be considered independently (as much as makes sense), so rare just means “rare,” and is just categorized as rare. It is a stretch to infer that when accompanied by Israel it means either “rare only in Israel” or “rare everywhere” or “rare outside of Israel.” In practice, labels’ juxtaposition does carry various nuances, but their interpretation is often ambiguous (professional dictionaries suffer from the same problem).

We can also explicitly establish relationships between them, for example {{cx|rare|_|in|_|Israel|lang=ru}} = (rare in Israel), but these only convey information to the reader and may not be reflected in categorization. In the case of negative qualifiers, categorization may even reflect the opposite: {{cx|Israel|rare|_|in|_|Russia|lang=ru}} = (Israel, rare in Russia) would incorrectly be labelled as a Russian regionalism (hm – actually, it is categorized identically to the first).
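
To make the point about independent labels concrete, here is a minimal sketch in Python. It is not the actual label module code: the table, the function names, and the category name for "rare" are assumptions made for illustration. It only shows why the two template calls above can display differently yet end up categorized the same way, when every label is looked up on its own.

```python
# Hypothetical label-to-category table for Russian entries. "Russia" is left
# out on purpose, on the assumption that it has no data entry of its own.
LABEL_CATEGORIES = {
    "rare": "Russian rare terms",  # assumed category name
    "Israel": "Israeli Russian",
}


def render_labels(labels):
    """Join labels with ", ", except where "_" asks for a plain space."""
    text = ""
    separator = ", "
    for label in labels:
        if label == "_":
            separator = " "
            continue
        text += (separator if text else "") + label
        separator = ", "
    return "(" + text + ")"


def categorize(labels):
    """Look up each label on its own; word order and "_" play no role."""
    return sorted({LABEL_CATEGORIES[l] for l in labels if l in LABEL_CATEGORIES})


first = ["rare", "_", "in", "_", "Israel"]
second = ["Israel", "rare", "_", "in", "_", "Russia"]
print(render_labels(first), categorize(first))
print(render_labels(second), categorize(second))
```

Under these assumptions both calls land in the same two categories, because the connecting words ("in") and the unlisted label ("Russia") contribute nothing to categorization.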

Do we need stronger qualifiers, to express and categorize an idea like “rare outside of Israel”? —Michael Z. 2014-01-25 16:02 z

{{cx|Russia}} does not currently categorize since no entry was added to Module:labels/data for "Russia". But you bring up a good point, although I think it is irrelevant to this discussion and you should start a new discussion about it. --Wiki Tiki 89 16:14, 25 January 2014 (UTC)[reply]

BTW, how can we expect someone to know that the label "Israel" is not a topic, but a regional division of Russian? If we are depending on users clicking on or hovering over the label, we are counting on a behavior known not to occur in the majority of cases. DCDuring TALK 00:42, 8 November 2014 (UTC)[reply]

That's another reason why we shouldn't put topics in the same place as contexts. --Wiki Tiki 89 18:39, 10 November 2014 (UTC)[reply]

Proposal: Use Lemming principle to speed RfDs

In the interest of speeding up RfDs for English terms and senses (and also possibly preventing some), would it make sense to give a version of the 'lemming principle' the same status as WT:COALMINE? That is, once it was established that some set of published lexicographers had deemed something worth including in their dictionaries, the RfD for the term, PoS, or sense could be summarily closed, except possibly for deciding whether the item should be included only as a redirect. Initially, I would suggest that we include only general monolingual dictionaries and exclude idiom dictionaries, phrasebooks, technical glossaries, and WordNet. We can subsequently have discussions on which classes of lexicographic resource or which individual works should be added. DCDuring TALK 17:09, 17 January 2014 (UTC)[reply]

I support this idea in principle, if we can agree on which sources we consider reputable. I don't think there are words people can look up in other English dictionaries that we wish to deliberately exclude from ours (though past RFD discussions suggest other editors disagree). Ƿidsiþ 17:20, 17 January 2014 (UTC)[reply]
I support this in principle too. While we are under no obligation to include a term just because some other dictionary has it, when multiple reputable general monolingual dictionaries include a term, it's a strong hint we should probably be including it too. —Aɴɢʀ (talk) 17:41, 17 January 2014 (UTC)[reply]
I support the principle, but probably not in any fixed form. Thus, I support that whether published lexicographers had deemed something worth including should be a consideration in WT:RFD and WT:CFI. I do not support excluding idiom dictionaries and phrasebooks; I think even technical glossaries can be taken into consideration. However, because of the absence of the fixed form of an inclusion criterion, the detail of the actual inclusion criterion has to be figured out by each voter in each RFD, on a case-by-case basis. And this is what some of us have been doing in RFD for years now. --Dan Polansky (talk) 17:50, 17 January 2014 (UTC)[reply]
Oppose. Instead, I suggest trying to figure out why the other dictionaries include the term and use that reason as an argument in the discussion, if it is a good one. Doing something solely because everyone else does it is never a good idea. — Ungoliant (falai) 17:55, 17 January 2014 (UTC)[reply]
- And how do you propose we figure out why other dictionaries include a term? Should we send an e-mail to the editors and ask? —Aɴɢʀ (talk) 18:14, 17 January 2014 (UTC)[reply]
  - Discussion among ourselves. — Ungoliant (falai) 18:16, 17 January 2014 (UTC)[reply]
    - Discussion among ourselves will never allow us to read the minds of people not involved in the discussion. —Aɴɢʀ (talk) 18:19, 17 January 2014 (UTC)[reply]
      - If not one of the many people who participate in RFD discussions can think of a reason why a term has been included in the other dictionaries, it is reasonable to doubt its includibility and to at least listen to the arguments of those who think it shouldn’t be included. — Ungoliant (falai) 18:24, 17 January 2014 (UTC)[reply]
        Ungoliant, it's folly to speculate why a particular entry is in a particular dictionary. Many lexicographers are long dead, others are faceless, and neither of those archetypes can be contacted easily. However, there are still a number of overarching reasons that words make it into dictionaries; the chief one being that it is a word in common parlance at the time of the dictionary's writing. Purple backpack89 21:16, 7 November 2014 (UTC)[reply]
        
        Furthermore, Ungoliant, your line of reasoning seems to suggest a belief that us Wiktionary editors have a better grasp of what should be in a dictionary than professional lexicographers of days of yore. Purple backpack89 23:53, 9 December 2014 (UTC)[reply]
        “us Wiktionary editors” lol. — Ungoliant (falai) 00:19, 10 December 2014 (UTC)[reply]
I oppose this for several reasons. If we have a word that is included in most reputable dictionaries and someone, knowing this, still wants to bring it to RFD and have a discussion about deletion, then the issue is probably worth discussing. Using the lemming test to quickly close this discussion is basically like saying, "Well, this word is in all these dictionaries, therefore I don't have to consider your legitimate points about why it should be deleted." Of course we should consider the fact that many dictionaries include this term, and we should consider why they include this term, but we should not use it as an excuse not to have a discussion. --Wiki Tiki 89 18:04, 17 January 2014 (UTC)[reply]
Comment: I see that the lemming test has been acknowledged as a test of idiomaticity at WT:IDIOM for over six years already (though originally under a different name). In other words, the lemming principle already has the same status as WT:COALMINE. Maybe we just need a separate WT:LEMMING page to remind people of the fact. —Aɴɢʀ (talk) 18:14, 17 January 2014 (UTC)[reply]
No, the lemming principle does not have the same status as WT:COALMINE, since coalmine has been voted on. It can easily turn out that the lemming principle will not even have a plain-majority support, let alone supermajority support. Moreover, the lemming principle cannot be used as a test of whether an expression is such that "its full meaning cannot be easily derived from the meaning of its separate components" (WT:CFI#Idiomaticity), so its presence in Wiktionary:Idioms that survived RFD is wrong. --Dan Polansky (talk) 18:20, 17 January 2014 (UTC)[reply]
I agree with Dan. IDIOM does not have the same status as COALMINE. COALMINE was voted in as a hard-and-fast prescriptive rule, and has therefore been ignored and overruled by only a few RFDs. IDIOM is a descriptive list of terms that have passed RFD, grouped by the reasons they passed; the precedents it lists can be cited, but can also be (and often are) ignored. Trying to make IDIOM a rule would make it less useful, I think: I prefer the list of "categories of things that have often passed RFD" to be more full rather than less full, listing even tests that users decide to ignore in some cases. If IDIOM were an invariant rule like COALMINE, many of its "tests" would have to be deleted, because they wouldn't be able to gain sufficient majorities to become invariant rules. - -sche (discuss) 20:34, 17 January 2014 (UTC)[reply]

Another point here is that other dictionaries don't have an encyclopedia that they can just redirect their readers to for an encyclopedic term. We do have such an encyclopedia, and therefore can afford to redirect our readers there rather than duplicate unnecessary content. --Wiki Tiki 89 18:38, 17 January 2014 (UTC)[reply]

If it can't be enacted into something virtually automatic, it wouldn't do the job I was looking to have it do. I could imagine circumscribing it to not apply to certain types of words, such as proper nouns, or to definitions, rather than terms (ie, Ety-PoS units). We could be more specific about the nature or even identity of the dictionaries whose authority as to inclusion we accept. For example, we could decide to exclude Collins on some grounds such as their being too quick to include neologisms. We could exclude the OED with respect to attributive use of nouns. It would certainly be possible to create a list of acceptable dictionaries and have some reasonable process for accepting new dictionaries and even for merely conditional acceptance of inclusion therein as sufficient evidence. DCDuring TALK 19:17, 17 January 2014 (UTC)[reply]
- I would actually support a stringent fully automatic or algorithmic rule if the understanding was that it forms a black core, with the gray areas open to discussion. So "only general monolingual dictionaries and exclude idiom dictionaries, phrasebooks, technical glossaries, and WordNet" would be okay for a black core. --Dan Polansky (talk) 19:21, 17 January 2014 (UTC)[reply]
  This is only intended to facilitate inclusions, providing a means of reducing the amount of needless gum-flapping. There is plenty of gum-flapping needed where other "real" dictionaries don't have the terms. WordNet and technical glossaries include many terms that seem SoP to me. Some terms are too new, too old, too rare, too technical, etc to be included in any reference. We have no choice but to discuss them. DCDuring TALK 17:49, 25 January 2014 (UTC)[reply]

I doubt the effect of this for English would be very dramatic, but some accelerated processing of RfDs should result and some RfDs will not occur because of the ease of checking other dictionaries. DCDuring TALK 17:49, 25 January 2014 (UTC)[reply]

I would support such a notion; I see no reason to exclude technical glossaries, but would accept that a phrase appearing only in such glossaries might belong in an appendix of technical terms specific to a particular field. bd2412 T 20:34, 17 January 2014 (UTC)[reply]
Oppose per WikiTiki. - -sche (discuss) 20:43, 17 January 2014 (UTC)[reply]
Comment Haven't decided either way, but I believe my quote is appropriate here:

Do we still consider ourselves less authoritative than other dictionaries that we have to look to them to determine what we should or should not keep, when we're about the fifth website hit that comes up when Googling for a word (e.g. google:granule)? Have any other dictionaries ever invoked the argument "that dictionary has that term, so we should have it too"? This question is not limited to the particular term at hand, however. TeleComNasSprVen (talk) 23:04, 16 January 2014 (UTC)

Other dictionaries are written by professional lexicographers. Wiktionary is written by amateurs, some of whom are almost proud of their complete ignorance of lexicographic principles. Ƿidsiþ 06:16, 18 January 2014 (UTC)[reply]

@Widsith: So I take your reply to mean "yes"? That's fair enough, but if you want higher quality you can make the wiki more exclusive to more linguistics-minded people; you can take the approach our concurrent project on Wikispecies took in being geared towards biologists and other scientists. Of course, the downside to that would be to exclude people who are less linguistics-minded but can and want to still contribute translation entries in a bilingual sense. But anyway, we can discuss pros and cons of Wiktionary's overall model later. TeleComNasSprVen (talk) 07:36, 18 January 2014 (UTC)[reply]

It is only for inclusion that we would rely on other dictionaries under the proposal. Other references exclude items for reasons that are irrelevant to us, including their limited-space print heritage. They provide less useful guidance for exclusion. Only the most comprehensive of dictionaries, eg, Century 1912, OED, and the Merriam-Webster unabridgeds provide any guidance as to exclusion and they cannot be considered definitive about whether an item can be excluded. Our automatic exclusions (as of possessive forms of nouns) required a policy decision. DCDuring TALK 17:49, 25 January 2014 (UTC)[reply]

Professional dictionaries do poach each other’s entries under competitive pressure. See w:Fictitious entry, and the example of w:esquivalience. —Michael Z. 2014-01-25 16:48 z

Is that an excuse for us to do the same? --Wiki Tiki 89 16:56, 25 January 2014 (UTC)[reply]

That is an answer to a question above. My mouth is fine with just my words in it, thanks. —Michael Z. 2014-01-25 19:13 z

As far as I'm concerned we're not talking about poaching other dictionaries' entries. A large proportion of our RFDs concern the issue whether or not a certain phrase is SOP with many arguments both in favor of and against SOP-ness. Checking whether other dictionaries that also avoid SOP phrases include a certain phrase is a good idea because it allows us to see whether professional lexicographers consider the phrase SOP or not. (To some extent—the presence of an entry obviously means that dictionary's editors consider it inclusion-worthy, but the absence of an entry does not necessarily mean they consider it too SOP for inclusion, as there could be other reasons for its exclusion.) I think it would be irresponsible of us not to check what other dictionaries do and allow that to play a role in our decision-making process. —Aɴɢʀ (talk) 17:12, 25 January 2014 (UTC)[reply]

Did anyone suggest that we should not consider what other dictionaries do? --Wiki Tiki 89 17:15, 25 January 2014 (UTC)[reply]

I think your argumentative style gives the impression that you are taking such positions. — This unsigned comment was added by DCDuring (talk • contribs).

Even though I have clearly explicitly stated this? See my first comment above. --Wiki Tiki 89 18:35, 25 January 2014 (UTC)[reply]

You seem to be ignoring information packaging considerations. If you take a stance of disagreement, folks will naturally assume that you disagree with what was just said, related to the main topic of the discussion. They may not re-read the entire thread to get your entire current position. They may also take short-cuts based on their perception and memory of your position on matters related to the topic at hand. I have found myself often confusing people in this way, to the detriment of the effectiveness of my discussion contributions. DCDuring TALK 20:16, 25 January 2014 (UTC)[reply]

So I should repeat my entire stance in every post I make? I often do that when I think it's necessary, but in this case I don't think anyone at all was saying that we should entirely ignore other dictionaries so I didn't think that anyone would confuse that with my position. --Wiki Tiki 89 23:35, 25 January 2014 (UTC)[reply]

It isn't just you; all of the "oppose" votes here seem to be saying basically that we should pay no attention to other dictionaries and rely entirely upon ourselves and our own intuitions of what is and isn't SOP, which I think is very dangerous considering that most of us have no experience in lexicography or even linguistics. —Aɴɢʀ (talk) 07:35, 26 January 2014 (UTC)[reply]

Which ones exactly? I count three oppose votes: Ungoliant, me, and -sche. Ungoliant and I both say that we should consider the reasons that other dictionaries include the terms, and -sche says that he agrees with me. So who is it that is saying that we should ignore other dictionaries? --Wiki Tiki 89 07:47, 26 January 2014 (UTC)[reply]

Ungoliant suggests "trying to figure out why the other dictionaries include the term" which means prioritizing our own speculations over professional lexicographers' expertise. Otherwise my impression of opposition on this basis relies not so much on arguments made in this thread as ones I've seen in RFD discussions in the past, where one person says "Dictionaries X, Y, and Z have this term" and someone else replies with "So what? Who cares what other dictionaries say?" or words to that effect. —Aɴɢʀ (talk) 08:07, 26 January 2014 (UTC)[reply]

But that works both ways. A lot of times the person only says "Dictionaries X, Y, and Z have this term" in order to circumvent the argument being made in the RFD discussion. One particular point that I strongly agree with Ungoliant about is his comment with the timestamp "18:24, 17 January 2014 (UTC)". --Wiki Tiki 89 08:15, 26 January 2014 (UTC)[reply]

That said, were this to pass I would oppose extending this to technical glossaries and other dictionary types. I acknowledge that some of the foremost linguists of our time have covered certain words in other dictionaries that are common enough for inclusion, words for everyday speech or conversation, for example, but these do not appear often enough in RFD to be challenged. I do not see how this argument could be used for lots of the entries at RFD given they only touch upon them in a broad sense and in fact I do not see how their inclusion is undoubtedly determinative of ours. Angr's stance is correct on this one: "While we are under no obligation to include a term just because some other dictionary has it, when multiple reputable general monolingual dictionaries include a term, it's a strong hint we should probably be including it too." TeleComNasSprVen (talk) 22:06, 17 January 2014 (UTC)[reply]
I think we consider ourselves more inclusive than other dictionaries - and we are in competition with them for users. Having entries that users can easily find that provide what they want to know and more requires that we have no omissions. If other lexicographers believe that including a term helps them in the competition for users we can benefit from their efforts. I have yet to see a term included in the true general-purpose dictionaries both on OneLook and elsewhere that would not be included here, at least as redirects. (There are some occasions when a dictionary may include something as an adjective where we have the appropriate sense as a noun and can't find support for its adjectivity.)

Perhaps I should just make it a point of noting what my suggested version of LEMMING would indicate for some RfDs to test it out. DCDuring TALK 22:50, 17 January 2014 (UTC)[reply]

Support in principle. --Anatoli (обсудить/вклад) 10:22, 18 January 2014 (UTC)[reply]

Support in practical terms. I think there's already an informal agreement that if a term is in a general, mainstream, published Japanese dictionary such as 広辞苑, it's appropriate to be here too. If a term is in JimBreen (talk • contribs)'s online dictionary, which is very generous in its inclusion of terms, it may be ok but may not meet our CFI. If it's not included, that's a sign that it's very new, very rare, or just very bogus.

I heard somewhere that English lexicographers refer to the OED for confirmation... Haplogy (話) 02:50, 19 January 2014 (UTC)[reply]

Support in principle, but: I think that technical glossaries and specialized dictionaries are especially important, because they are written by people knowing the subject, people who feel when something is a term worth inclusion (of course, when it's clear that entries are not supposed to be terms of the language, e.g. Charles Darwin or list of ants living in Germany, they should not be included). My second comment is that even reputable dictionaries can make mistakes, and they are copied to other dictionaries. This principle should apply only when it's clear that this is not an error. Lmaltier (talk) 21:34, 26 January 2014 (UTC)[reply]

Support: If a dictionary has it, we should too. Purple backpack89 21:16, 7 November 2014 (UTC)[reply]
- No, that goes against CFI. See Appendix:English dictionary-only terms. If you want to be able to include things from dictionaries even if they don't meet CFI, there should first be a vote to modify CFI. —CodeCat 21:35, 7 November 2014 (UTC)[reply]
  - I was under the impression that this discussion entailed a modification of CFI, but, FWIW, I do want CFI modified in that respect. Purple backpack89 21:46, 7 November 2014 (UTC)[reply]
    - I intended that a lack of attestation trump any other consideration. Speaking solely about English entries, if an item is not attested, it should not matter how many dictionaries have it. I suppose we can have discussions about terms that appear in dialects of English. I think most others share that view, because it is a deep-rooted part of our practice. DCDuring TALK 22:15, 7 November 2014 (UTC)[reply]
      - Common sense would tell us that things that are in print dictionaries are most likely attestable. The main reason that entries make it into dictionaries is use, and use suggests attestability. Purple backpack89 22:33, 7 November 2014 (UTC)[reply]
        Appendix:English dictionary-only terms. — Ungoliant (falai) 22:37, 7 November 2014 (UTC)[reply]
        Notice how there are only a couple hundred words on that list, a very small drop in a very large bucket that is all English-language entries. I honestly think that list should be nuked and each thing listed there be given an entry, because I believe that sources exist somewhere for each one of them. The problem is that somewhere is archaic non-digitized documents. The very idea of that list is highly presumptuous; it suggests that we know more than people who created print dictionaries. I think we should have deference to people who created past dictionaries. Purple backpack89 22:43, 7 November 2014 (UTC)[reply]

Older dictionaries didn't go by usage: they were trying to set standards of what would be correct to use in the language. It was quite common for them to incorporate lists of technical terms from various scholars, who may have seen a term proposed in the literature or may have even coined it themselves, but never seen it actually used to convey meaning. Dictionaries also have tended to copy from other dictionaries: there's a certain amount of commercial pressure to not seem less complete than competitors (see w:Fictitious entry and w:Ghost word). Chuck Entz (talk) 23:49, 7 November 2014 (UTC)[reply]

Believing sources exist isn't good enough. Either they do or they don't. Until we find them, for each word — and in the case of that appendix, I think we have already tried and failed — we should not defer to earlier dictionary compilers; they're also only human and sometimes wrong. Equinox ◑ 11:07, 8 November 2014 (UTC)[reply]

We could use both the Lemming principle and attestability rule, at least for English terms, in other words, if a term is included in a dictionary (from an approved list of dictionaries) and the term is attestable, then we can include it. --Anatoli T. (обсудить/вклад) 11:27, 8 November 2014 (UTC)[reply]

First parameter of `{{de-adj}}` missing

If the adjective has a comparative form, the first parameter will be the comparative form. If the adjective does not, the first parameter will be set to "-". Therefore, the first parameter must be present. Since the first parameter needs to be present lest it generate an error, I suggest adding the following to the template's code to track:

{{#if:{{{1|}}}||[[Category:German adjective headwords lacking the first parameter]]}}

--kc_kennylau (talk) 06:24, 18 January 2014 (UTC)[reply]

The first parameter is usually identical to the page name though, isn't it? Couldn't it default to that? —CodeCat 12:20, 18 January 2014 (UTC)[reply]

No, the first parameter is never identical to the page name. The first parameter is the comparative form of the adjective. For example:

klein: {{de-adj|kleiner|kleinsten}}

arm: {{de-adj|ärmer|ärmsten}}

--kc_kennylau (talk) 14:47, 18 January 2014 (UTC)[reply]

Then make it default to the page name + er? —CodeCat 14:55, 18 January 2014 (UTC)[reply]

But then the incomparable adjectives will not want you to do so, and people may accelerate the incorrect comparatives. For example, portabel->portabler, no e between b and l. --kc_kennylau (talk) 15:02, 18 January 2014 (UTC)[reply]

That danger exists anytime someone provides the wrong parameter to a template. I'm not sure what you mean about incomparable adjectives though. Incomparable adjectives have "-" as the first parameter. —CodeCat 15:04, 18 January 2014 (UTC)[reply]

Your plan is potentially feasible, if the transclusions lacking the first parameter tell others to check the comparative/superlative forms. Otherwise, portabeler may be generated and accelerated. Or it may add a question mark after the comparative forms, like portabel (comparative portabeler?, superlative portabelsten?), because both auto-generated forms are wrong. (They should be portabler and portabelesten.) I meant by incomparable adjectives that the adjective with transclusion lacking the first parameter might be an incomparable adjective, and if the forms are auto-generated and accelerated, they would have no meaning. --kc_kennylau (talk) 15:07, 18 January 2014 (UTC)[reply]

Is there a rule that adjectives ending in -el or -bel always have a comparative that loses the -e-? If so, we can improve on the template by including that in the default rule. —CodeCat 15:19, 18 January 2014 (UTC)[reply]

I don't know much about German, but in my knowledge, yes, and the superlative takes one more -e- as well. Adjectives ending in -t will take an -e- in the superlative form. And the adjective may go through umlaut change in the comp/sup forms. Only in my humble opinion. --kc_kennylau (talk) 15:23, 18 January 2014 (UTC)[reply]
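
The defaulting idea discussed in this thread can be sketched roughly as follows. This is a Python illustration only, not the template's actual code: the function names are hypothetical, only the comparative is handled, and the only special case is the -el rule mentioned above. It cannot predict umlaut (arm → ärmer), so an explicitly supplied first parameter always wins, and "-" still marks an incomparable adjective.

```python
def default_comparative(pagename):
    """Guess a German comparative from the page name (hypothetical rule)."""
    # Adjectives in -el drop the -e- before the ending: portabel -> portabler.
    if pagename.endswith("el"):
        return pagename[:-2] + "ler"
    # Everything else simply gets -er: klein -> kleiner.
    return pagename + "er"


def comparative_for(pagename, first_param=None):
    """Resolve the comparative the way a defaulting headword line might."""
    # "-" marks an incomparable adjective: there is no comparative to show.
    if first_param == "-":
        return None
    # An explicit form always overrides the guess (needed for umlaut cases).
    if first_param:
        return first_param
    # Missing parameter: fall back to the guess; such entries could also be
    # tracked so that someone checks the generated form.
    return default_comparative(pagename)


print(comparative_for("klein"))         # guessed from the page name
print(comparative_for("portabel"))      # guessed with the -el rule
print(comparative_for("arm", "ärmer"))  # explicit form, umlaut not guessable
print(comparative_for("tot", "-"))      # incomparable
```

A guess of this kind would still be wrong for adjectives that take umlaut or are incomparable but lack the "-" marker, which is the risk raised above about accelerating incorrect forms.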

Actually I just noticed that this is {{de-adj}} we're talking about, not {{de-decl-adj}}. That does change things a bit. {{nl-adj}} generates the comparative and superlative automatically, though. —CodeCat 15:46, 18 January 2014 (UTC)[reply]

Discussions on user talk pages

Dan Polansky has recently reverted the move I made of my talkpage discussion to the relevant RFD section (see links: my move to RFD, his reversion) and I wondered whether or not it is considered good form to have two split discussions on the same issue, which is harder to track. I've also recently posted to his talkpage about my other concerns as well. At the risk of further edit warring and disruption, I've allowed his post to stay as it is on my talkpage. But I would like some advice on what is best practice in this situation, in that I prefer the discussion moved to RFD instead of confined to my relatively low-traffic talkpage.

To do so, I've tried to read up on what our policies have to say about this. I note that the Wiktionary:Usernames and user pages guideline, which is actually a draft proposal, had this to say:

User talk pages (namespace User talk:) are for other users to leave messages about the contributions of the user concerned (and for their responses). They are also used to ask questions of the user, to make them aware of activity elsewhere in the wiki, and for community discussion of issues relevant to certain users. Discussions should not contain unnecessarily malicious or offensive material. Off topic discussion should generally be done in private, such as by email or over a social networking website.
User talk pages may be archived using subpages (see Help:User and user talk pages) or content may be deleted entirely. It may be considered bad etiquette to delete a discussion on a user talk page while the discussion is still active. Such a removal may be undone or reverted to allow the discussion to continue.

This is a bit too vague for me, and it seems to run contrary to some other people's practices on this website concerning removal of contents from their talkpages at "their discretion", so to speak. Does this guideline also extend to the moving of discussions, which does not outright delete such discussions, but preserve them in their original form and transfer them to another venue? I'd appreciate some answers. Thanks, TeleComNasSprVen (talk) 08:55, 18 January 2014 (UTC)[reply]

It is your talk page, and within reason you should be able to organize it as you see prudent. If you want to move the content to a community talk page, I would recommend leaving Dan's original post at the very least and also linking to where the discussion has migrated. Please restore the section on Copyright, as I do believe it is helpful and relevant, and email me if you continue to experience trouble. DAVilla 06:15, 20 January 2014 (UTC)[reply]

Alright, I had planned to store that section in my 2014 archives later on, but if you happen to find it relevant in the now, I have restored it for easy access and linking. Specifically, the reason it was brought up was because I wanted to ask why this site was licensed under CC-By-SA 3.0 rather than any of the other free licenses, and I have also noticed a trend among editors including me who have released their content into Public Domain using {{MultiLicensePD}}, see the associated category on it. TeleComNasSprVen (talk) 07:56, 20 January 2014 (UTC)[reply]

Tracking categories

Tracking categories were invented (I think) by CodeCat (talk • contribs). Their contents are generated automatically by templates or modules, and their intended use was to identify those entries that would be affected by a change in the parameters of a template.

They have since been used to identify entries that could be improved, e.g. by adding a missing inflection table.

I think it is time that these useful categories were put on a more formal basis, especially their nomenclature.

I propose the following:-

They should all belong to the same parent category - maybe "Category:tracking categories"
They should be hidden categories.
Their introductory text (generated by a "catboiler" template?) should include the following:-
- The purpose of the category.
- If it is temporary (to solve a problem) or permanent.
- What to do with entries (or if they should be left to the category's "owner" to deal with).
Their names should reflect their purpose, especially if permanent, and should start with a language name.
- The names of temporary categories could optionally start with a username - their "owner".

What do you think? SemperBlotto (talk) 15:03, 18 January 2014 (UTC)[reply]

I don't think I invented them, there were some before, but they were kind of scattered. I put them all together and created more. I do think your proposal is good, but making them always start with a user name might discourage others from helping out with their contents. It also doesn't make much sense because nothing on Wiktionary is really "owned" by anyone. We can't use a category boilerplate template because those are meant for generating standardised messages for lots of languages, but these categories would all need custom messages and there is only one of each, so there is no benefit. The current categories are often organised by some kind of purpose, and they then have subcategories named with the name of a template. For example, Category:Template with raw link/wlink. I think that's a nice practice because the template name makes it immediately clear where to go to find the "culprit" for filling that category. —CodeCat 15:10, 18 January 2014 (UTC)[reply]

We've always had stuff like Category:Hebrew terms needing transliteration and Category:Hebrew noun entries missing plural construct forms. --Wiki Tiki 89 21:39, 18 January 2014 (UTC)[reply]

Bad Sources

Is there any way we could develop a list of sources that shouldn't be used, or that require specialized knowledge to avoid known serious flaws? To a contributor who doesn't have any background, a published source is a published source, and any online dictionary looks as authoritative as any other. Also, some of the Wikipedias are well known to make up terms when they translate articles from other Wikipedias- this is especially bad in the case of Wikipedias in extinct languages such as Gothic and Old English.

For instance, w:John Bellenden Ker Gawler was an English botanist who did good work in that field, but he also wrote some truly awful work trying to explain nursery rhymes and other sayings as derived from what he called "Low Saxon", which he derived from Dutch (see v.1 and v.2). See Talk:a little bird told me for a case where this was actually used as the source of an etymology.

Another example is this Gothic dictionary, which is a constant source of protologistic terms that get added to etymologies and descendants sections of Proto-Germanic entries.

Then there are online sources such as Blust's Austronesian Comparative Dictionary, and The Tower of Babel, which have lots of useful data, but also have reconstructions based on methodology and theories that are disputed by most mainstream linguists.

That's not to mention early and amateurish sources that are all we may have for some extinct languages, but need quite a bit of scholarly work to convert to something that makes sense.

At the very least, About pages for the languages and proto-languages in question should have lists of references that shouldn't be used, and of references that require special consideration to use properly.

It would be nice, though, to have a general, dedicated page with its own shortcut that could be used in revert/deletion comments as well as on talk pages to educate contributors about what they did wrong, and/or what they can avoid doing wrong in the future. Chuck Entz (talk) 18:56, 18 January 2014 (UTC)[reply]

Sounds good. Especially this: "At the very least, About pages for the languages and proto-languages in question should have lists of references that shouldn't be used, and of references that require special consideration to use properly." --Dan Polansky (talk) 07:53, 19 January 2014 (UTC)[reply]

Google Groups is now broken in some browsers

Just a heads up: I started getting a 404 Not Found error on all Google Groups searches using Opera. However, it still works in Internet Explorer. Apparently, they have dropped support for Opera and are giving this completely stupid and misleading error message. Might affect other browsers. It will certainly be a huge handicap for me in citing Usenet, since I rely on a lot of shortcuts and features that IE lacks. Equinox ◑ 03:00, 19 January 2014 (UTC)[reply]

You really still use Opera? I recommend considering Firefox or Google Chrome. --Wiki Tiki 89 03:02, 19 January 2014 (UTC)[reply]

I have used them all and by far prefer Opera. (For one thing, it's about twice as fast on my computer.) The solution to something not working in a (currently maintained and released) browser, due to the site maker deliberately blocking the browser, is never to change the browser. It is a fault of the site maker. I can try faking the user-agent string and pretending to be Firefox etc. but how lame that is. Equinox ◑ 03:09, 19 January 2014 (UTC)[reply]

Ok, I didn't realize that you prefer to use Opera. But I agree that blocking a browser on purpose is bad behavior. I was very annoyed back in the day when some sites didn't work for Google Chrome because it was too new of a browser and websites didn't trust it to work properly. In those cases I had no choice but to use Firefox on those websites. Sometimes sites only work in IE, then I have to switch to IE to use them. My point is that it's worth having multiple browsers so you can switch to one that works when your favorite one doesn't. --Wiki Tiki 89 03:15, 19 January 2014 (UTC)[reply]

Welcome to post-net neutrality. DAVilla 05:59, 20 January 2014 (UTC)[reply]

Concerning incomparable adjectives

Is there a simple way to judge whether an adjective is incomparable or not? --kc_kennylau (talk) 06:44, 19 January 2014 (UTC)[reply]

Am I correct that any adjective with the prefixes dis-, un-, in-, miss- as well as the suffixes -ful, -less, -istic are all incomparable? --kc_kennylau (talk) 07:24, 19 January 2014 (UTC)[reply]

No, that isn't right at all. These are all comparable: disobedient, unruly, insolent, mistaken, beautiful, hopeless, and communistic. Equinox ◑ 07:31, 19 January 2014 (UTC)[reply]

I don't think you can identify comparability from the spelling. It's just a case of which kinds of thing are purely binary (either true or false, nothing in between). For example, non-smoking is incomparable (you either smoke or not). But even mostly binary things, like main and dead, can be comparable in some circumstances ("this is the deadest nightclub I've ever been to"). Equinox ◑ 07:33, 19 January 2014 (UTC)[reply]

@Equinox Then how do I know if the adjective is binary? Prerequisite: I have no common sense. --kc_kennylau (talk) 08:45, 19 January 2014 (UTC)[reply]

You could look at Google books etc to see if you can find instances of "...er", "...est", "more ..." or "most ...". SemperBlotto (talk) 08:52, 19 January 2014 (UTC)[reply]
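The check described above can be sketched in code. This is a minimal Python sketch, assuming a hypothetical helper named comparative_queries (it is not part of any Wiktionary or Google tool) that builds the four search strings for a given adjective. The suffixed forms are naive concatenations, so spellings such as "redder" or "happier" would still have to be worked out by hand.

```python
def comparative_queries(adjective):
    """Return candidate search strings for testing comparability.

    Hypothetical helper for illustration only. The "-er"/"-est" forms are
    naive concatenations and ignore spelling rules (consonant doubling,
    final "e", "y" -> "i"), so they are only a starting point.
    """
    word = adjective.strip().lower()
    return [
        word + "er",
        word + "est",
        "more " + word,
        "most " + word,
    ]


if __name__ == "__main__":
    # Print the strings one would search for in a corpus such as Google Books.
    for query in comparative_queries("dead"):
        print(query)
```

Finding hits for any of these strings only suggests comparability; each hit still needs to be read in context, since a string like "deader" may turn up as a surname or a scanning error.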

I know that all colors are incomparable, and am I correct that "-able"s are incomparable? And can you help me find more categories (literal categories) of incomparable adjectives? --kc_kennylau (talk) 09:18, 19 January 2014 (UTC)[reply]

At least in languages like German, color adjectives are comparable, and I guess in English as well. I would strongly advise against using some automated mechanism to identify incomparable adjectives; each one needs to be dealt with separately. Longtrend (talk) 09:21, 19 January 2014 (UTC)[reply]

bluer, redder, greener at the Google Books Ngram Viewer; bluer, redder, greener. Conjectures and refutations. --Dan Polansky (talk) 09:29, 19 January 2014 (UTC)[reply]

Colours are comparable: maybe my shirt is redder than yours. And -ables can be comparable: "he is the most intolerable child I've ever met"; "how reliable are those employees"? Equinox ◑ 09:43, 19 January 2014 (UTC)[reply]

Thanks. Now I know how to do it. "How divisible by 7 is 14?" doesn't make sense, while "How red is this shirt" does. --kc_kennylau (talk) 09:56, 19 January 2014 (UTC)[reply]

Continuing the dead discussion, RfD of `{{de-form-adj}}`

This is taken from Wiktionary:Requests for deletion/Others#Template:de-form-adj.

German adjectives have inflected forms that can be used in a bunch of ways, but displaying them on separate lines is just plain stupid. For example, at the adjective section of rechten, there are 26 definition lines using this template. Instead, we should switch over all German entries to use the format we already use for closely related languages like Yiddish, and which SemperBlotto already uses for German, which can be seen at a page like einzigen. —Μετάknowledge^{discuss/deeds} 05:42, 9 May 2013 (UTC)[reply]
We've had this discussion before and I still agree: einzigen is the right way to do this; the current version of rechten is the wrong way. The inflection tables at einzig and recht are sufficient to show which forms exactly end in -en. Beyond seven lines or so it's an information overload and becomes unusable for the reader. —Angr 09:44, 9 May 2013 (UTC)[reply]

I note that de.Wikt does spell out all of the forms each string constitutes (see e.g. de:einzigen). I have no strong opinion on whether en.Wikt should or not. What Angr suggests is the easiest thing to do. - -sche (discuss) 18:40, 16 May 2013 (UTC)[reply]

Oppose. How is having a list with 26 definitions worse than forcing users to look for the word in a table? As I’ve suggested before, if the clutter is too troublesome, it’s better to merge definitions (e.g. “weak masculine singular genitive, dative and accusative form of recht.”) — Ungoliant ^(Falai) 18:59, 16 May 2013 (UTC)[reply]
I'm fine with a merger like that, but it would still amount to the complete reconstruction of this template and editing of all of its uses (about the same as deleting it). Angr, what do you think about that suggestion? —Μετάknowledge^{discuss/deeds} 00:22, 17 May 2013 (UTC)[reply]
Maybe something along the lines of {{got-nom form of}} would work? It has special parameters that allow you to combine cases and such. Aside from that it can be used for all nominal parts of speech, not just adjectives. —CodeCat 00:56, 17 May 2013 (UTC)[reply]

I don't mind merging some of the senses if we can get it down to a maximum of seven lines. More than that and the reader's eyes will start to glaze over. One thing I think we can always eliminate is the "mixed" forms since these are always identical to either the strong form or the weak form. —Angr 14:05, 17 May 2013 (UTC)[reply]

Please express your view either here or there, while I personally prefer there. --kc_kennylau (talk) 06:59, 19 January 2014 (UTC)[reply]

RFP and RFE in multi-word entries

I ask that I be allowed to remove {{rfp}} and {{rfe}} from the likes of big picture, entries for multi-word terms, entries that are included since some of the meanings are idiomatic. I do not object to such entries having pronunciation or etymology, but I do not believe these are entries that should be prioritized as needing pronunciation and etymology, and prioritization is what rfp and rfe do. Merriam-Webster, for instance, has an entry for big picture, with no etymology and no pronunciation in it, while it has both etymology and pronunciation in picture. Again, I do not oppose pronunciations and etymologies in such entries, but I oppose RFP and RFE. --Dan Polansky (talk) 09:00, 19 January 2014 (UTC)[reply]

RFE seems appropriate for multi-word terms because the sense development of any idiomatic sense is sometimes not obvious. One of the pieces of evidence for claimed idiomaticity is that the stress pattern is distinct, not 'normal', for a truly idiomatic term. This might be a sensible idea for those terms marked as for translation only. DCDuring TALK 23:01, 19 January 2014 (UTC)[reply]

I don't really know what's normal for stress, but there is a difference in stress between adjective-noun phrases and compounds. This difference exists in other Germanic languages as well, but in English there's no orthographic difference while in the others there is. An example is "black hole", which could be a phrasal noun or a compound if all you know is the parts it's made of (in reality, of course, it's the former). The stress is different depending on which you choose. In German, these would be distinguished as schwarzes Loch and Schwarzloch respectively. So really, the pronunciation of any multiple-word term is ambiguous as soon as the distinction between phrasal noun and compound isn't clear. And that can happen whenever the first part can be both a noun and an adjective. —CodeCat 23:14, 19 January 2014 (UTC)[reply]

Overall, I agree with Dan in this instance. These pronunciation changes as described by DCDuring and CodeCat are not so common, IMHO. Nobody will stop anyone from adding etymology and pronunciation if they're not obvious but requesting them is probably a waste of time. (I wouldn't fight over those RFP's and RFE's, like Dan did, if somebody insists on having them but those requesting might try and add etymology and pronunciation themselves, I agree with that.) E.g. I have removed some requested translations on keep up with the Joneses because I think it's not fair to request something too culturally specific and related to a story other cultures may not have. --Anatoli ^{(обсудить}/^вклад) 23:35, 19 January 2014 (UTC)[reply]

I don't think you should remove those translation requests either. What does "fairness" have to do with the matter? After all it might be considered "unfair" to delete the {{rft}} without leaving an indication that you consider it untranslatable, with absolutely no cultural possibility for translation.

If someone inserts an RfP or RfE or ~~RfT~~TrReq, then they evidently want to know, possibly whether the requested item exists. Why would we ever delete an expressed request for normal information expressed by a user? If you can't provide it, then don't. What harm does it cause? DCDuring TALK 00:42, 20 January 2014 (UTC)[reply]

Perhaps one could insert cat=alt into the requests that seem too hard, with the effect of removing the item from the main set of requests and putting it in the too-hard pile (as my Australian acquaintances would say), an alternative category for refractory cases. DCDuring TALK 00:46, 20 January 2014 (UTC)[reply]

It's actually not such a bad idea. Such as Category:difficult translation requests but I've noticed people often use {{trreq}} lightly, often adding bulk-requests by copy pasting {{trreq}} with various language codes. (I don't mean you, who like to have translations for living things.) --Anatoli ^{(обсудить}/^вклад) 01:59, 20 January 2014 (UTC)[reply]

I know what you feel about {{trreq}} [redacted] but having them doesn't necessarily help getting a translation, they just sit there indefinitely. Theoretically, everything is possibly translatable or explainable, if a term doesn't exist in a given language. As I said in a previous discussion, I won't delete {{rft}} if there is a slight chance those requests get filled. I can say that numerous entries, real words deleted didn't cause any harm either but they got deleted, anyway - e.g. викифицировать, წიგნის მაღაზია, розовая слизь, Linux. --Anatoli ^{(обсудить}/^вклад) 00:55, 20 January 2014 (UTC)[reply]

I think you meant {{trreq}}. {{rft}} is used to tag an entry for the Tea Room. Chuck Entz (talk) 01:32, 20 January 2014 (UTC)[reply]

Yes, I did, thanks. Redacted my post for clarity. --Anatoli ^{(обсудить}/^вклад) 01:59, 20 January 2014 (UTC)[reply]

I never remove request templates unless they've been placed someplace they don't belong, such as a form-of entry. Different people have different interests, so different people will add requests to different entries. I don't think we should be removing other people's requests unless they're being abusive- for instance, adding dozens or even hundreds a day, or stuffing a particular entry full of requests for no apparent reason. I may think some requests are a bad idea, and I may ask the requester (nicely!) to reconsider, but I'm not about to start imposing my values on others. These are, after all, requests- not demands. I would recommend you follow the same practice. Chuck Entz (talk) 01:22, 20 January 2014 (UTC)[reply]

If you check Lo Ximiendo's numerous bulk requests, including for obscure languages, you would start removing them too; they are often both ungenuine and unrealistic. In the past I removed some genuine but unrealistic requests (they do clutter entries and interfere with real translations because of the bug with the tool), but not any more. If a particular removal is really important, like keep up with the Joneses for which most languages won't have equivalents, I can restore it but I'd prefer to discuss its necessity first. --Anatoli ^{(обсудить}/^вклад) 01:37, 20 January 2014 (UTC)[reply]

I think Lo Ximiendo's bulk requests fall under "stuffing a particular entry full of requests for no apparent reason" (I had her in mind when I wrote that). Another type of request I might remove would be one that simply can't be fulfilled, such as asking for a translation for something unknown to the speakers of an extinct language such as Gothic, or that would require words not present in the entire corpus of such a language. Chuck Entz (talk) 02:11, 20 January 2014 (UTC)[reply]

Yes, and when you actually work on those translations and know the situations with contributors and available resources, you'll see that "unrealistic" covers more things than just extinct languages. A lot of unnecessary trreq's for unlikely or rarely used words also put off regular contributors. I don't see much value in having all these trreq's in [[life's a bitch and then you die]], for example. --Anatoli ^{(обсудить}/^вклад) 02:25, 20 January 2014 (UTC)[reply]

My belief has always been that these request templates send the message "If you have an interest in etymology, you can start contributing to Wiktionary right here!" rather than "See how disorganized Wiktionary is? It's completely broken and you have an obligation to fix it!" We're all volunteers here, and I think inviting more volunteers to partake in our work has never been a bad idea. I'm also afraid I might get the wrong etymology or pronunciation key for example, and I usually don't add unless it's something obvious like frequently being {{suffix|frequent|ly|lang=en}} or I use one of Wyang's better coded IPA templates. It's a collaborative effort: I can supply the definitions, and you can supply the etymology, and this makes the entry better overall. If you want to have a separate category for more "urgent" (even though Wiktionary is always WIP) requests for etymology that can also be arranged, but the current setup is how I've been using it. TeleComNasSprVen (talk) 05:57, 20 January 2014 (UTC)[reply]

For multiword entries such as big picture the etymology is needed especially for the context in which the word enters the lexicon. By itself 'big picture' is a sum of parts, and it basically means a 'very big picture or painting'. But the meaning of the word has since been, by way of metaphor or figurativeness, extended to mean the totality of a situation, as if the situation itself was a 'very big picture'. I'd like to know when it was first used in this way, rather than its usual intended sum-of-parts meaning. The pronunciation is needed because words often don't blend naturally together in English either, and even for other 'multi-word' entries in other languages like Chinese the pronunciation would still be supplied. Pronunciation for 'big picture' isn't as obvious as 'big' + 'picture', as indicated by the above concerns about stress and intonation (cf. that's what she said). In 'big picture' for example, the 'g' in 'big' becomes silent, which is not always obvious to a nonnative speaker. TeleComNasSprVen (talk) 05:57, 20 January 2014 (UTC)[reply]

The "g" in big picture is not necessarily silent; its presence is just reduced (I'm sure there's a term for that). However, if one says "look at the big picture", and carefully articulates the separate words, "big" and "picture", the result is neither incorrect nor different in meaning. What is important is that the stress in the phrase goes to the first syllable of "picture". bd2412 T 19:05, 20 January 2014 (UTC)[reply]

I don't know if this is universal or just my idiolect, but when I say "big picture" meaning "picture that is large" then I stress the first syllable of "picture", but when I say "big picture" meaning "broader view" I stress "big" and either put a secondary stress or a second primary stress on the first syllable of "picture". --Wiki Tiki 89 19:20, 20 January 2014 (UTC)[reply]

Yes, I believe that there are two primary stresses. That means that big and picture, in that instance, are independent words and not parts of a compound. Chuck Entz (talk) 21:33, 20 January 2014 (UTC)[reply]

Yes, and the reason is the implied contrast with "small picture". If you were talking about a "big picture" with the meaning "picture that is large", and were contrasting it with a "small picture", the stress would be the same as in the expression "big picture" with the meaning "broader view". --Wiki Tiki 89 21:56, 20 January 2014 (UTC)[reply]

Some {{rfp}} insertions are due to non-compliance with our formatting practices. If there are multiple etymologies in an L2 section and a pronunciation header occurs only after the etymology section, our AF-type bots flag it as a possible cleanup. As I don't know whether there is a distinct pronunciation for the different etymology sections, I add an {{rfp}} if one is missing or remove the rfc if there are pronunciation sections under every etymology. If I can see that the pronunciations are identical in every way, I consolidate them above the first etymology. I don't see how the program of resolving these RfCs could be completed if someone is removing the RfPs. DCDuring TALK 22:43, 22 January 2014 (UTC)[reply]
- In that case, couldn't you use {{attention}} to ask whether the different etymologies have the same pronunciation? —Aɴɢʀ (talk) 08:06, 23 January 2014 (UTC)[reply]
  - The item was already placed in the generic cleanup category for the language involved, which is often ignored for long periods of time, as are most cleanup lists that are "too long", which seems to mean containing more than about 50 items. I had pledged to Ullmann a long time ago that I would work to resolve the structure problems identified by Autoformat (now its descendants) and also the translation table problems it identifies. I resolve the ones I can as described above. I convert them to more specific requests when I cannot. If the users with the expertise to solve the problems are unwilling to do so, Wiktionary is doomed. DCDuring TALK 08:34, 23 January 2014 (UTC)[reply]

reference templates

I would like to know if there's some kind of pattern to follow in naming these templates. Because I created Template:R:Beekes 2010 this morning, not knowing it already existed under another name (Template:R:grc:Beekes), but following in this CodeCat who had created Template:R:De Vaan 2008 (who herself didn't know Template:R:ine:deVaan2008 was already there). Wiktionary:Reference templates ain't of any help. --Fsojic (talk) 19:31, 19 January 2014 (UTC)[reply]

I don't know of any official guidelines on this. Many of them have a language code in them, but that's sometimes a bad idea because so many reference works cover more than one language. And some of them have a suboptimal language code, like {{R:ine:Matasovic2009}}, which is focused on Proto-Celtic rather than Proto-Indo-European. I think you just have to trawl through Category:Reference templates and its subcats before creating a new one, just in case it already exists. —Aɴɢʀ (talk) 19:52, 19 January 2014 (UTC)[reply]

I would have looked in Category:Ancient Greek templates and searched the title "Etymological Dictionary of Greek" in Wiktionary. --Vahag (talk) 20:35, 19 January 2014 (UTC)[reply]

I don't think there is an expressly agreed on naming. I don't like the language code in the name; {{R:Webster 1913}}, {{R:Century 1911}}, {{R:LSJ}} and {{R:L&S}} don't have it and don't miss it, IMHO. As for searching, beekes in Template: namespace would have found what you were looking for. Accessible from Advanced search. --Dan Polansky (talk) 21:00, 19 January 2014 (UTC)[reply]

Twitter

"This entry is here for translation purposes only."

Seems like a terrible move to me. The entry somehow failed RFD after the rules on brand names were in place. It should have been RFVed instead, in my opinion. Can we agree not to translate brand names that haven't passed RFV, and not to delete brand names without applying those rules first? DAVilla 05:55, 20 January 2014 (UTC)[reply]

I imagine the user who created the entry in 2013 did not notice that it had previously failed RFD. The entry could either be deleted immediately, per the boilerplate RFD message "Failed RFD, RFDO; do not re-enter", or be sent to RFV to be (a) deleted if citations cannot be found that pass BRAND, or (b) given a definition if citations can be found that pass BRAND. - -sche (discuss) 07:15, 20 January 2014 (UTC)[reply]

Rhymes

I am aware that the rhyme pages contain phonetic symbols that cannot be typed on ordinary keyboards with any kind of input method. I do not have a solution yet, but I am looking for suggestions or proposals. I request the help of every single Wiktionarian. --kc_kennylau (talk) 10:12, 23 January 2014 (UTC)[reply]

But they can be inserted in the edit box using "Special characters" in the toolbox or the IPA insertion box below the edit field. And users of Firefox can install the Transliterator add-on, allowing them to type in XSAMPA characters and have the corresponding IPA characters appear (that's what I do). —Aɴɢʀ (talk) 10:47, 23 January 2014 (UTC)[reply]
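The X-SAMPA approach mentioned above amounts to a lookup from ASCII characters to IPA characters. Here is a minimal Python sketch of that idea, assuming a deliberately partial table and a hypothetical function named xsampa_to_ipa. It is not the code of the Transliterator add-on, and it does not handle multi-character X-SAMPA symbols (those written with a backslash or an underscore).

```python
# Partial X-SAMPA to IPA table; only a handful of single-character symbols.
XSAMPA_TO_IPA = {
    "@": "ə",
    "{": "æ",
    "A": "ɑ",
    "E": "ɛ",
    "I": "ɪ",
    "O": "ɔ",
    "U": "ʊ",
    "V": "ʌ",
    "D": "ð",
    "N": "ŋ",
    "S": "ʃ",
    "T": "θ",
    "Z": "ʒ",
    ":": "ː",
    '"': "ˈ",
    "%": "ˌ",
}


def xsampa_to_ipa(text):
    """Convert character by character; unknown characters pass through unchanged."""
    return "".join(XSAMPA_TO_IPA.get(char, char) for char in text)


if __name__ == "__main__":
    print(xsampa_to_ipa("tSi:z"))
```

A per-language keyboard or converter would presumably need a fuller table than this, but the principle of typing plain ASCII and mapping it to IPA stays the same.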

But they cannot be typed in mobile and in the search box (Alt+Shift+F), nor in Chrome. --kc_kennylau (talk) 11:27, 23 January 2014 (UTC)[reply]

You could make a pronunciation keyboard like this: [3], that I use for the French Wiktionary. Dakdada (talk) 11:31, 23 January 2014 (UTC)[reply]

Of course each language would have its own keyboard. Dakdada (talk) 11:33, 23 January 2014 (UTC)[reply]

Really I don't see what you (kc_kennylau) expect to be done. It isn't just an issue on the Rhymes pages, and it isn't just an issue with IPA. We use dozens of scripts here that can't be typed with ordinary keyboards and our users work around it as best they can. There are lots of character pickers around the web, and virtual keyboards that can be downloaded, and converters like the one Darkdadaah mentioned, and so forth. —Aɴɢʀ (talk) 12:03, 23 January 2014 (UTC)[reply]

Deleting the language templates

We kept these because we wanted to make sure everything is fixed. It's been a while now, and I don't think anything is using them anymore, it has all been moved over to Lua. Or at least it should be. I think we can do one of two things:

Delete them all outright.
Replace their contents with a script error or some other kind of tracking mechanism. That way, any time they're used, we can immediately track it down and either notify who did it, or fix the script/bot. We can wait another month or so after that before deleting them finally.

—CodeCat 22:56, 23 January 2014 (UTC)[reply]

I like the second idea, but without the script error. We can just have them return the name as usual but also add a tracking category. --Wiki Tiki 89 23:12, 23 January 2014 (UTC)[reply]
That's not possible if the output of the template is used as the input of something else, like as part of a link or as the parameter to some other template. The added category would break that. So it's better if we make the breakage very obvious. I don't think there are many cases left where this would happen anyway, it's more a security measure. —CodeCat 23:34, 23 January 2014 (UTC)[reply]
I do still use them, just not unsubsted in entries. If I encounter a language code somewhere (even outside Wiktionary) and want to know what language it is, I come here and look for the template. I know there are other ways to find out that information, but that's the quickest and easiest way to do it. Also, after I check a translation in a translation table I change {{ttbc|xyz}}: to {{subst:xyz}}: to quickly and easily insert the language name. If the templates are doing no harm, I see no reason not to keep them. —Aɴɢʀ (talk) 08:50, 24 January 2014 (UTC)[reply]

I also still look up templates as a way of checking which language a code refers to. It's a useful trick!
On the other hand, I have been working for a year now to make our list of languages (and their names and family info, etc) complete and up-to-date, and the work is still far from finished — which I mention only to establish that I know how much work it is to maintain one list of 7700 languages, and thus I know how quixotic it is to attempt to maintain two lists of 7700 languages, one in the template namespace and one in a module. They are already out of sync; over time, they will continue to fall further out of sync. Because only the module is used by our infrastructure, only it is updated. Languages are removed from the module, or added to it, or renamed, without the templates necessarily being updated, which means anything that still uses the templates gets out-of-date information.
One could try to vigilantly keep the templates in sync with the module, but any system that relies on volunteers maintaining vigilance over 7700 discrete and for the most part boring things is unlikely to succeed for long. It might be possible to have a bot update the templates periodically, but they are also sitting on a lot of valuable real estate that it would be nice to be able to re-assign. We could use {{lb}} and {{lbl}} as shortcuts for {{label}}, and we could use {{law}} instead of {{legal}}, and we could use {{now}} in contexts without worrying that it might cause Nyambo-related problems down the road, and... - -sche (discuss) 10:24, 24 January 2014 (UTC)[reply]

Couldn't the templates be edited so they invoke the module? That way there would be only one list of 7700 languages, but {{subst:xyz}}: would still work. —Aɴɢʀ (talk) 11:21, 24 January 2014 (UTC)[reply]

We did something like that on fr: (e.g. fr:Modèle:fr), although we still have lots of language templates in use (like en: actually). Since there are a lot of different uses (e.g. things like [[Category:Foo in {{fr}}]]), we can't have them do anything but write a language name.

Now that I think about it, there is a way to track all those pages: just create a dedicated module to host the function that returns the language name for those templates, and look at the pages that use the module (or, we can just track the use of each template). Dakdada (talk) 11:56, 24 January 2014 (UTC)[reply]
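The two ideas in this part of the thread (one authoritative list, plus a dedicated entry point whose use can be tracked) can be modelled roughly as follows. This is a Python sketch under stated assumptions, not Wiktionary's actual Lua module or template code: the names LANGUAGES, legacy_uses, language_name and legacy_language_template are all hypothetical, and the three codes are only sample data.

```python
# Single authoritative list of codes (a stand-in for the module's data).
LANGUAGES = {
    "en": "English",
    "fr": "French",
    "got": "Gothic",
}

# Pages seen going through the old entry point (a stand-in for tracking).
legacy_uses = set()


def language_name(code):
    """Look up a language name in the single authoritative list."""
    return LANGUAGES[code]


def legacy_language_template(code, page):
    """Old-style entry point: record the calling page, then delegate."""
    legacy_uses.add(page)
    return language_name(code)


if __name__ == "__main__":
    print(legacy_language_template("fr", "Example page"))
    print(sorted(legacy_uses))
```

Because the old entry point only delegates, there is a single list to maintain. Note that this kind of tracking only models calls made at render time; it says nothing about text that has already been substituted into a page.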

The tracking of pages that use templates by transcluding them is not a problem. The templates have already been orphaned. The problem is people or Javascript gadgets that subst them into entries. Those don't show up as transclusions, but they'll break if the templates go missing. —CodeCat 13:46, 24 January 2014 (UTC)[reply]

Right. All the more reason not to delete them. —Aɴɢʀ (talk) 13:57, 24 January 2014 (UTC)[reply]

That's why I suggested adding script errors. They'll still break, but we'll know immediately when they do, and can fix the cause. —CodeCat 14:09, 24 January 2014 (UTC)[reply]

We should do our best to never show a Script error to the users, so this would be a last resort. Before doing something so drastic, we should first hunt the templates that still invoke language templates (e.g. {{exthomophones}}). Dakdada (talk) 14:22, 24 January 2014 (UTC)[reply]

I think you misunderstand. This is meant as a last resort, because there's nothing left to do. —CodeCat 14:31, 24 January 2014 (UTC)[reply]

I did find some leftover templates (User:Darkdadaah/Lists/Templates calling language templates), although they are thankfully not used much, from what I can see. There are still some pages which appear to use language templates, but it appears that they are simply out of sync. If we estimate that their number is small, then I would be ok to put a tracking category/error in their place, before deprecating them. Dakdada (talk) 14:56, 24 January 2014 (UTC)[reply]

Another set, this time language templates that are still used in articles: User:Darkdadaah/Lists/Leftover language templates (I suppose I could've missed some cases). Dakdada (talk) 16:13, 24 January 2014 (UTC)[reply]

Apparently all of those were done recently by a single user, User:White whirlwind. They're still a new user, so it's strange that they didn't see the big "deprecated" notice on those templates. —CodeCat 13:07, 25 January 2014 (UTC)[reply]

OK, assuming those templates are cleaned up, if we consider that the language templates should never be used anymore, then we come back to the first proposition: deletion or a warning. The first allows templates to be recycled. The second would still help some people to check languages, but that's not what those templates are for. On fr: we chose to keep all the templates for "History" purposes (when we check older versions of pages), although I would rather have them recycled, for practical reasons. Dakdada (talk) 16:22, 25 January 2014 (UTC)[reply]

@Angr, a quicker and easier way to check what a language code means is to use Kephir's "Experimental Translator's Extension", enablable at WT:PREFS. It puts a dropdown menu at the top of the screen that, among other things, lets you search for language codes. You can also search the other way by inputting a language to find out the code. --WikiTiki89 16:26, 24 January 2014 (UTC)[reply]

There's also WT:LL, although that takes a while to load. —CodeCat 16:54, 24 January 2014 (UTC)[reply]

Which is why WT:LL doesn't count as faster. --WikiTiki89 17:25, 24 January 2014 (UTC)[reply]

A category for all words or lemmas in a language

At WT:ID#English words, a user noted that we currently don't have an easy way to get an ordered list of all words, in the same way that you would find it in a paper dictionary. We group words by their part of speech, but we have no category for all words regardless of part of speech. We have indexes, but those are often incomplete and badly maintained, if they exist at all for a language. Several other Wiktionaries do have categories like this, and I have often wondered why we don't. I think this may improve the usability of Wiktionary as well. If we want to do this, I think there are two ways we could implement it:

One category that contains absolutely everything that has an entry in that language. This should be easy to do, but it would mean that inflected forms also end up in the category. For English that wouldn't be too bad, but for languages with highly inflected lemmas (like most Romance verbs, or Finnish nominals), the forms would swamp the main lemmas and make the category useless.
One category for lemmas alone. This is harder to do, because we'd probably want templates like {{head}} to put entries into it automatically, but that template can't easily tell what's a lemma and what isn't, except by looking at the category name. So in this case, we would probably want to make a clearer separation between lemma and non-lemma categories, maybe put forms in an entirely separate category tree (I think that may be a good idea even if we don't have a category for all lemmas). So perhaps we would rename Category:English parts of speech to Category:English lemmas, and remove all non-lemmas from that category and its subcategories, placing them in a new-to-be-named category.

This would be a major change, of course, so it would need good support. Should we do this, and if so, how would we implement it? (And maybe, if you don't support having a category for all lemmas, would you support separating non-lemmas into a separate category tree anyway?) —CodeCat 00:51, 25 January 2014 (UTC)[reply]

I support the second. Switch on the third parameter in {{head}}: if it is noun, verb, adjective, adverb, interjection, idiom, preposition, proper noun, pronoun, number, ordinal number, or cardinal number, put it into Category:English entries. Otherwise, put it into Category:English lemma. A problem arises if the entry does not use any headword template, i.e. it uses neither {{head}} nor {{en-noun}} and just bolds the pagename. I've created an example in my sandbox. After hours of searching, I found a real-life example. (added 10:45, 25 January 2014 (UTC)) --kc_kennylau (talk) 07:51, 25 January 2014 (UTC)[reply]
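The switch described above can be sketched roughly as follows. This is only an illustration in Python, not the template logic itself (that would live in {{head}} or a module). The function name `category_for` is made up, the lemma category uses the name proposed earlier in this thread (Category:English lemmas), and the non-lemma category name is a placeholder, since no name has been chosen.

```python
# Parts of speech treated as lemmas, taken from the list in the comment above.
LEMMA_POS = frozenset({
    "noun", "verb", "adjective", "adverb", "interjection", "idiom",
    "preposition", "proper noun", "pronoun", "number",
    "ordinal number", "cardinal number",
})


def category_for(language_name: str, pos: str) -> str:
    """Return the category an entry would be placed in, given its part of speech.

    The name "non-lemma forms" is a placeholder assumption, not an agreed name.
    """
    if pos.strip().lower() in LEMMA_POS:
        return f"Category:{language_name} lemmas"
    return f"Category:{language_name} non-lemma forms"


if __name__ == "__main__":
    print(category_for("English", "noun"))
    print(category_for("English", "verb form"))
```

Entries that only bold the pagename never reach such a switch at all, which is the gap described above.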

You're right, that is a bit of a problem (and it's just one reason why bolded headwords are bad! It comes back to bite you!), but there is a way to fix those entries, or at least find them. Once the category is created and filled with entries, we can compare the contents of Category:English lemmas with Category:English nouns, and make a list of all entries in the latter that are missing from the former. —CodeCat 12:50, 25 January 2014 (UTC)[reply]

Some Wiktionaries have the first option (all words, including inflected ones), which I find useful. Once you have such categories for all entries in a certain language, you can use some interesting tools too, e.g.:

[A] "Recent changes" in a certain language: Well, sort of... It's not perfect: it catches edits to other languages on the same page too, but it's better than having nothing.
[B] Random word in a certain language (e.g. having "random word in English" in the sidebar is possible).

Demo at fr.wikt (let's say I'm interested in the Dutch language):

[A] "Recent changes" in Dutch: fr:Special:RecentChangesLinked/Catégorie:néerlandais.
[B] Random page in Dutch: fr:Special:RandomInCategory/néerlandais. -- Curious (talk) 15:28, 25 January 2014 (UTC)[reply]

I do agree that's a useful feature too, but it's more useful to editors, while a category of lemmas is much more interesting for users. Maybe we can do both, separately? Category:English lemmas and Category:English terms/English entries side by side? —CodeCat 15:35, 25 January 2014 (UTC)[reply]

How would you compare the categories? Be aware that AWB can only eat 25000 pages. And I was also thinking of doing both. Category:English terms for the first project, and ~~Category:English entries~~ Category:English lemmas for the second project. --kc_kennylau (talk) 15:48, 25 January 2014 (UTC)[reply]

I use Python, with pywikipedia and mwparserfromhell. It doesn't have that limit. —CodeCat 15:56, 25 January 2014 (UTC)[reply]
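The comparison described above (entries in Category:English nouns that are missing from Category:English lemmas) comes down to a set difference. A minimal sketch, assuming the page titles of both categories have already been fetched into plain lists; the fetching step is left out, the function name `missing_from` is made up, and the titles below are sample data, not real category contents.

```python
from typing import Iterable, List


def missing_from(lemmas: Iterable[str], pos_members: Iterable[str]) -> List[str]:
    """Return titles found in a part-of-speech category but not in the lemma category."""
    lemma_set = set(lemmas)
    # Deduplicate the part-of-speech titles, keep those absent from the lemma set.
    return sorted(title for title in set(pos_members) if title not in lemma_set)


if __name__ == "__main__":
    english_lemmas = ["cat", "dog"]          # sample data only
    english_nouns = ["dog", "cat", "bird"]   # sample data only
    for title in missing_from(english_lemmas, english_nouns):
        print(title)
```

Because the titles are held in ordinary Python sets, the size is bounded by memory rather than by a fixed page limit.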

I see. I still support using both projects. --kc_kennylau (talk) 16:01, 25 January 2014 (UTC)[reply]

From my own experience, the first option including inflected forms is also useful to people (not very often...), not only to bots. And both options make the project usable the same way as a paper dictionary. Lmaltier (talk) 21:13, 26 January 2014 (UTC)[reply]

I suppose that "all words including inflected" can be obtained by recursively gathering from some category (the only caution needed is with translations). So it is not as useful for editors as "all lemmas" (except the above-mentioned use cases [A] and [B], which can be programmed anyway). --Infovarius (talk) 13:17, 12 February 2014 (UTC)[reply]

I would like some more input on this before going ahead. —CodeCat 13:33, 3 February 2014 (UTC)[reply]

`{{borrowing}}`

I don't really understand exactly what this template is for. In the case of English, for example, is it intended that every word not inherited from Old English should use it? Or is it just for words that ‘look’ foreign, and if so how is that judged? Ƿidsiþ 13:09, 25 January 2014 (UTC)[reply]

It's for loanwords. —CodeCat 13:10, 25 January 2014 (UTC)[reply]

In the case of (Modern) English, it's specifically for loanwords that were borrowed during the Modern English era. If Modern English has inherited a word from Middle English that is a loanword from, say, Anglo-Norman French, then the Modern English entry would not use {{borrowing}}, though the Middle English entry would. —Aɴɢʀ (talk) 13:12, 25 January 2014 (UTC)[reply]

It just seems a bit weird to specify that something's been borrowed when everything from any other language has, by definition, been borrowed. Ƿidsiþ 13:20, 25 January 2014 (UTC)[reply]

Yes, it seems to convey no information not conveyed by the text of the etymology. If it were restricted to cases where the spelling was identical or differed only in diacritics, i.e. not in other aspects of orthography, in meaning, or in PoS, it might save some keystrokes in providing the links and categorizations. DCDuring TALK 13:36, 25 January 2014 (UTC)[reply]

I can see the value for (let's say) Romance languages, where there are many doublets, some inherited from Latin and some borrowed from the same source. Ƿidsiþ 13:37, 25 January 2014 (UTC)[reply]

But wouldn't an {{etydoublet}} have the advantage of a name that gave a clue, the potential for creating a category whose content was not redundant, and text (including links, possibly to an Appendix or to WP) that was particular to the phenomenon? DCDuring TALK 13:46, 25 January 2014 (UTC)[reply]

I think the essential problem here is that we don't usually specify when the word was borrowed. If the {{borrowing}} template had a date= parameter, it might be more useful. --WikiTiki89 14:57, 25 January 2014 (UTC)[reply]

Geordie dialect words

Once upon a time, a list of Geordie words was promulgated on Wikipedia by someone. It was good, but after much wikifying (by Encycloshave) and wokifying, and many hands making provenance much more interesting, it was determined that a word list belonged on Wikibooks, and so it was transwikified. Some items clearly come from elsewhere, so some of it may be Wikisourcified, but only if discarded, as the Wiki~~pedia~~books list is one of many 'selective' word lists online[4] but it is also a derivative and amalgamation, and Wikisource isn't Selective Derivatives Amalgamated .. so .. umm .. can Wiktionary into this wordlist as-is? John Vandenberg (talk) 12:45, 26 January 2014 (UTC)[reply]

I would personally think it would not hurt at all to at least import it here into the "Appendix" space. Discussion of importing it as actual entries could then take place at a leisurely pace. — hippietrail (talk) 07:50, 27 January 2014 (UTC)[reply]

Slav(on)ic

Slavic and Slavonic are exact synonyms. One of these things is not like the others:

Category:Balto-Slavic languages
Category:Slavic languages
Category:East Slavic languages
Category:South Slavic languages
Category:West Slavic languages
Category:Proto-Balto-Slavic language
Category:Proto-Slavic language
Category:Old Church Slavonic language (also includes the modern Church Slav[on]ic language)
Category:Old East Slavic language

Why don’t we rename that one Old Church Slavic in our dictionary? This would avoid confusion about the meaning and use of terminology among readers and editors. It would also make us look less confused. —Michael Z. 2014-01-26 21:23 z

We try to use the most common name that a language goes by. In this case, "Old Church Slavonic" is the most common name. --WikiTiki89 21:26, 26 January 2014 (UTC)[reply]

How did we determine that this is the most common name?

And why do we do so: to communicate clearly with our readers? It would be clearer if we didn’t use two names for Slavic. —Michael Z. 2014-01-26 21:51 z

One way is like this. But maybe you also want to rename German to High Germanic to avoid using the two different terms German and Germanic? --WikiTiki89 22:13, 26 January 2014 (UTC)[reply]

Is there a discussion you can link to, or are you just guessing about this? —Michael Z. 2014-01-26 23:00 z

From WT:LANG: "Language names are chosen by consensus. Whenever possible, common English names of languages are used, and diacritics are avoided. Attested names (names which meet CFI) are strongly preferred." --WikiTiki89 23:05, 26 January 2014 (UTC)[reply]

Both Slavonic and Slavic are common English names. Nothing mandates using the most common name. And the guideline discourages using multiple names in different places. Thanks. —Michael Z. 2014-01-26 23:28 z

We are not using "Slavonic" as a separate term. It's part of the name of the language. The words "Slovene" and "Slovenian" also mean the same thing as "Slavic" and "Slavonic", but they are also used as the name of a language. And anyway, "Old Church Slavic" is not a common English name of the language. In Google Books search, "Old Church Slavic" gets 15 pages of results while "Old Church Slavonic" gets 97 pages of results (after paging through all of them). I guess you're right that nothing mandates using the most common name, but it does say that the name should be chosen by consensus, and every discussion I've seen about which name to use for a language placed the commonness of the name as the most important criterion. --WikiTiki89 23:54, 26 January 2014 (UTC)[reply]

I don’t think Slovene and Slovenian do mean “Slavic” in English. Nor German “Germanic.” They share etymologies. If etymologies interest you, Slavic and Slavonic are exact synonyms, except that the latter is based on a historical misconstruing of the province of Slavonia for the Slavic world. —Michael Z. 2014-01-27 04:48 z

I guess we’d better change Old East Slavic to Old Russian, because Google Ngrams. —Michael Z. 2014-01-27 04:48 z

That one's harder to judge because of the lack of parallel structure in the names. This Ngram seems to suggest the opposite. Anyway, I would actually personally have no problem calling the language "Old Russian". --WikiTiki89 05:00, 27 January 2014 (UTC)[reply]

Yes, even Ukrainian nationalists are no longer able to deny common roots with Russians, and terms давньору́ський (davnʹorúsʹkyj) (Old East Slavic/Old Russian), ру́ський (rúsʹkyj) (Rusian, Kievan Rusian) are used, even though these terms are sometimes translated as "(Old) Ukrainian" (what about Russians and Belarusians?) in some dictionaries I saw before. --Anatoli (обсудить/вклад) 01:35, 28 January 2014 (UTC)[reply]

If we had a category named Category:Old Church Slavonic nouns and a category named Category:Old Church Slavic verbs, that would be inconsistent. But between Category:Balto-Slavic languages and Category:Old Church Slavonic language there is no inconsistency, and no potential for confusion, except perhaps the same contrived confusion someone could feel upon seeing (as Wikitiki points out) Category:West Germanic languages vs Category:German language. It so happens that the Slavic languages are most often called the Slavic languages. Meanwhile, the Old Church Slavonic language is most often called Old Church Slavonic. That's just the way it descriptively is. There is no shortage of other cases where related languages are known by more- or less-different names, e.g. Old Norse vs the North Germanic languages, Dutch Low Saxon vs (German) Low German, Old Saxon vs Middle Low German, Unserdeutsch vs Plautdietsch, ...
The only circumstance I know of under which Wiktionary intentionally calls a language by a name other than its most common name is when the most common name is ambiguous because it is the name of more than one language. When that happens, "inconsistency" is specifically desired and, if possible, implemented; see e.g. "Riang language" vs "Reang language". - -sche (discuss) 00:20, 27 January 2014 (UTC)[reply]

Okay fine, guys, if everyone likes a bit of Slavonic. It really adds some smashing 19th-century philology flavour to the project. —Michael Z. 2014-01-27 04:48 z

Not really. According to the Ngram, "Old Church Slavonic" is still the most common name. --WikiTiki89 05:00, 27 January 2014 (UTC)[reply]

Indeed, based on the actual evidence (ngram), it's "Old Church Slavic" that is dated. Usage of "Old Church Slavic" has been declining (and has been below the level of "Old Church Slavonic") since the 1950s, just as usage of "Slavonic languages" dropped off (in favour of "Slavic languages") after the 1940s. - -sche (discuss) 05:16, 27 January 2014 (UTC)[reply]

About including zhuyin / bopomofo in inflection lines for Mandarin entries

After two months in mainland China I've now arrived for a month in Taiwan and I'm learning as much about local language stuff as I can. Including zhuyin/bopomofo and how to use IMEs based on it.

Do we already have a way to include bopomofo in inflection lines for Mandarin along with pinyin, cangjie, etc?

Or have we discussed this somewhere in the past?

I would like to boldly just start adding it to the entries for characters and words I'm learning or re-learning in traditional script. But I better ask about the issues here first (-:

I'm aware pinyin is official in Taiwan now and bopomofo no longer is. But it is still prevalent here and I have the impression it's the most popular way to type here. — hippietrail (talk) 07:46, 27 January 2014 (UTC)[reply]

Include Bopomofo on inflection lines, you say? Neatly enough, Anatoli and James have been working on a way to do just that here: WT:Grease_pit/2014/January#Converting_numbers_to_some_other_symbols_in_Lua/. - -sche (discuss) 08:05, 27 January 2014 (UTC)[reply]

I won't take the credit for most of the work but yes, the work is underway. Kephir, Wyang, CodeCat have helped a lot. Waiting for Lua and template experts to help further.

There's no community approval yet but if nobody complains and everything is working, we should just implement it and hopefully get more users and editors on board - those who prefer Zhuyin over Pinyin or who just want to learn it. Well-known Chinese dictionaries already use Zhuyin or give it as an option. Zhuyin (or "bopomofo" - ㄅㄆㄇㄈ) looks much more complicated than Pinyin but for Taiwanese kids it's the first script they learn and many children's books and books for learners have Zhuyin shown as ruby, which better integrates with Han characters, especially with the vertical script, as is common in Taiwan. --Anatoli (обсудить/вклад) 23:00, 27 January 2014 (UTC)[reply]

Requesting bot flag for User:Kennybot

I, User:kc_kennylau, am requesting a bot flag for my bot User:Kennybot to do this job. I have already done a test run, and it was a success. I have read the bot policy carefully, and pledge to follow every rule in the policy. --kc_kennylau (talk) 12:31, 27 January 2014 (UTC)[reply]

One more test done, also a success. --kc_kennylau (talk) 12:34, 27 January 2014 (UTC)[reply]
- The bot policy page says to do a test run of 10–50 entries, not just two. —Aɴɢʀ (talk) 12:54, 27 January 2014 (UTC)[reply]
  - Done testing on all the pages on Category:Kennybot testing category --kc_kennylau (talk) 13:07, 27 January 2014 (UTC)[reply]
    - You left a message on my talk page about this, but I know nothing about bots so I have nothing more to add. What do others who do know about bots think? There's actually supposed to be a vote, isn't there? —Aɴɢʀ (talk) 14:12, 28 January 2014 (UTC)[reply]
      - Well, I had to do that with User:Buttermilch. On the other hand, we might not have a real quorum for a vote to be very meaningful. I have no objections to the bot itself, so assuming SemperBlotto will watch it, count me as "support". Keφr 16:32, 28 January 2014 (UTC)[reply]

I'm going to allow this, and keep an eye on it. I'll ask the bot owner to throttle-back the bot so it can't do too much damage. SemperBlotto (talk) 15:31, 28 January 2014 (UTC)[reply]

Requesting bot status for Wonderfulbot

I'd like to request bot status for Wonderfulbot (talk • contribs). The bot's job will be exactly the same as Asturbot's, that's to say, making conjugated forms of Asturian verbs. --Back on the list (talk) 11:52, 28 January 2014 (UTC)[reply]

Interesting. Kenny stops editing at 11:21, WF edits from 11:41 to 11:52, Kenny restarts at 12:18. I wonder.... 193.63.86.253

And an IP from WF's home country pops up to speculate about it as their only edit, Hmmm...

Seriously, though, Kenny's command of Chinese lects is better than I would expect of WF in his forays into other languages. By all accounts, Pofficer (and PofficerBot) did some pretty shoddy work in Polish, for instance. And the style is quite different (WF has never been that good at faking style). I would worry more about details overlooked in haste to get things done quickly than any kind of deceit. Kenny certainly has the raw technical skills to run a bot, and is more than willing to clean up his mistakes, but a bot can make an awful lot of those mistakes before anyone catches them. It's one thing to be bold, but another to be bold with hundreds of entries at a time. Chuck Entz (talk) 15:03, 28 January 2014 (UTC)[reply]

A new part of speech for Mandarin

We need a new part of speech for Mandarin - "attributive noun", for 第二手 (dì-èr shǒu), 公共 (gōnggòng), 食用 (shíyòng). How do we go about it, so that new headers/templates are not picked up by KassadBot and patrollers? --Anatoli (обсудить/вклад) 12:35, 28 January 2014 (UTC)[reply]

It's been a long time, and I only took first-year Mandarin, but isn't it possible to use virtually anything attributively as a modifier? If such blurring of POS is allowed, wouldn't it be better to go with function rather than strictly following canonical POS (the current practice in such entries)? Chuck Entz (talk) 13:22, 28 January 2014 (UTC)[reply]

Many Chinese words can ONLY be attributive nouns and can't be any other PoS; they are neither nouns nor adjectives. It's not unusual for languages to have specific parts of speech. --Anatoli (обсудить/вклад) 13:38, 28 January 2014 (UTC)[reply]

FWIW Burmese has these too. The names of countries and fruits, for example, always have to form compounds with other nouns. I hadn't heard the term "attributive noun" before, but it's the perfect description of things like အိန္ဒိယ (indi.ya.) and ဒူးရင်း (du:rang:). —Aɴɢʀ (talk) 15:00, 28 January 2014 (UTC)[reply]

But do we need a new PoS header? Can't we just call it a noun and tag it with (attributive) or something like that? --WikiTiki89 15:27, 28 January 2014 (UTC)[reply]

I suppose, but at the moment that link goes to Appendix:English nouns, but the nouns under discussion here aren't English. —Aɴɢʀ (talk) 16:14, 28 January 2014 (UTC)[reply]

That's strange, but it can be fixed. What should it link to instead? --WikiTiki89 16:20, 28 January 2014 (UTC)[reply]

We could add "attributive noun" to Appendix:Glossary and then link there. —Aɴɢʀ (talk) 16:49, 28 January 2014 (UTC)[reply]

A similar concept in English is words like no-flight zone, no-loss guarantee, where no-flight and no-loss are used only as attributive nouns, but in Chinese they are much more common. They can be used as predicatives but would require the particle 的, which is only used with attributive nouns in such cases. Compare also with adjectival nouns in Japanese, e.g. 便利, which has an "Adjectival noun" header but for some reason uses the adjective template. --Anatoli (обсудить/вклад) 22:30, 28 January 2014 (UTC)[reply]

Oy, Japanese and 形容動詞 (keiyō dōshi). After chewing on this one for a long time, I am moved to inveigh that "adjectival noun" is also not a great descriptor -- many of these terms are not nouns, and cannot be used as such. 便利 (benri, “convenient, handy”) is a good example of the exception, but here, JA dictionaries tend to list the term as both keiyō dōshi and meishi (noun). But more "simple" or "common" keiyō dōshi, such as 静か (shizuka, “quiet, silent”) or 平凡 (heibon, “ordinary; mediocre”) or 貴重 (kichō, “rare, precious, valuable”) cannot be used as nouns, and function purely to modify another noun -- i.e., as adjectives.

But that's a bit of a tangent, given that the rest of this thread is focused on Chinese. :)

Suffice it to say, perhaps it makes the most sense to step back from the temptation to be super-specific in our POS header labeling, and consider what might make the most sense to our user base (who ostensibly would be familiar with the grammatical labels used to describe English)? Start by asking how a term is used. If a term is used as the name of an action, then that's probably a “verb” for POS labeling purposes. If that same term is also used as the name of a person, place, or thing, then that's probably a “noun”. Etc., etc. I see no reason that a single term could not appear under multiple POS headers, much as the English terms run or foot or pair. ‑‑ Eiríkr Útlendi │ Tala við mig 00:22, 29 January 2014 (UTC)[reply]

Welcome back, Eirikr, back in the game? :) The decision about PoS is often subjective in different dictionaries but as long as they are consistent and common, that's fine. I regretted the removal of Japanese の-adjectives/no-adjectives (like we have な-adjectives/na-adjectives) because from the English point of view "病気の人" (byōki no hito) means "a sick/ill person", not "a person of the illness", even if some linguists argue that nouns + の are not adjectival nouns but normal nouns.

I will search for more definite descriptions about Chinese attributive nouns to make the case stronger. Tooironic (talk • contribs) spurred me to remember them but hasn't posted himself yet. He might add some value to the discussion later. --Anatoli (обсудить/вклад) 00:34, 29 January 2014 (UTC)[reply]

Back at it for the moment, but I don't have the kind of time I'd like. :)

Anyway, re: Japanese and の (no, possessive particle), *any* noun can be used attributively by putting a の after it. For example, 犬の人 (inu no hito, “dog” + no possessive + “person”). This could be parsed as “a dog's person” (a person belonging to a dog), or alternately as “a dog person” (a person who likes dogs). Which meaning to use depends on the context. Similarly, 病気の人 (byōki no hito, “disease” + no possessive + “person”) could be parsed as “a sickness's person”, or alternately as “a sick person”.

Possessives can be similarly multivalent and ambiguous in English. Take, for example, “the Danes' fear” or “the fear of the Danes”. Does this mean that the Danes are afraid (the Danes have the fear), or does this instead mean that someone else has a fear of the Danes (the Danes frighten someone else)? Which meaning to use depends on the context.

Anyway, that's a long-ish way of saying that Japanese の implies that the preceding term is a noun. (In those cases where it's actually a verb, like 行くのは [iku no wa, “to go” + no possessive + wa topic marker], that verb is being used nominally -- “going; the act of going”.) Moreover, noun-ness does not necessarily disallow use of a term in an attributive way (i.e., like an adjective). As such, I concur with JA dictionary editors that の adjectives don't really exist as a class -- what look like の adjectives are instead nouns being used attributively.

Cheers, ‑‑ Eiríkr Útlendi │ Tala við mig 22:47, 29 January 2014 (UTC)[reply]

Don't get me wrong, I only regretted it but didn't protest. There is an alternative, popular opinion on "no-adjectives", that's all. We could have soft redirects for those (and back-translations into English) or more usage examples. The controversy includes the Korean possessive 의 (ui), which is identical to the Japanese の (no) in forming this type of collocation. E.g. 기초의 (기초 (gicho, 基礎) + 의 (ui)) "basic" is formed similarly to the Japanese 基礎の (kiso no). (A large portion of translations still need to be converted so that "no" adjectives point to the lemma.) The English adjective "basic" is translated as 基礎の and 기초의 into Japanese and Korean, even though the lemmas don't include の and 의. --Anatoli (обсудить/вклад) 23:22, 29 January 2014 (UTC)[reply]

Applying European PoS to Chinese is a bad idea, eg. 给力. Wyang (talk) 22:59, 28 January 2014 (UTC)[reply]

I see your point. However, it's always a challenge but not impossible: ideas and concepts can be translated or explained. Both the Chinese and English points of view should be taken into account. The fact that Chinese adjectives can also be nouns and verbs doesn't make them non-adjectives. Users just need to know how PoS work in a target language. The Arabic masdar is also a very specific PoS, which is loosely translated as "verbal noun" but often works like the English infinitive. Having said this, I sometimes struggle to decide which part of speech Chinese idioms belong to. --Anatoli (обсудить/вклад) 23:24, 28 January 2014 (UTC)[reply]

It doesn't even have to be outside Europe. Dutch and German both treat all adjectives as adverbs as well, so anything labelled "adjective" should be understood as "adjective or adverb". —CodeCat 23:36, 28 January 2014 (UTC)[reply]

Exactly. We can still use "adjective" header for German and Dutch nouns. --Anatoli (обсудить/вклад) 23:40, 28 January 2014 (UTC)[reply]

Not really, eg. bereitwillig, äußerlich, praktisch. Wyang (talk) 23:44, 28 January 2014 (UTC)[reply]

We can but it's not always the way it is. I'm neutral on this, especially if adverbs add an extra meaning or more translations. --Anatoli (обсудить/вклад) 23:48, 28 January 2014 (UTC)[reply]

I am in total support of this new part of speech. It's useful for Wiktionary users to understand that these kind of words do not function as adjectives the way other adjectives in Chinese do. Yet they're not simply nouns used attributively either. They are something else. ---> Tooironic (talk) 01:18, 29 January 2014 (UTC)[reply]

Thanks. Why didn't you bring in the examples? Anyway, the Chinese examples on your talk page helped me make another analogy in English - "large-scale", "small-scale", which are attributive nouns and separate words in Chinese. 大型, 小型 ("large-scale", "small-scale") are included in all Chinese dictionaries. --Anatoli (обсудить/вклад) 02:13, 29 January 2014 (UTC)[reply]

Re: 公共 (gōnggòng) and 食用 (shíyòng), these seem more like verbal phrases: 共 (gòng) and 用 (yòng) are both verbs, not nouns...? ‑‑ Eiríkr Útlendi │ Tala við mig 22:47, 29 January 2014 (UTC)[reply]

The first one - 公共 (gōnggòng) is definitely an attributive noun, despite the components, which also have adjectival senses. 用 (yòng) in 食用 (shíyòng) acts as a suffix, similar to Japanese 用 (yō), which is derived from Chinese.

Question: would this also apply to Vietnamese? Vietnamese nouns can take almost any other nouns as attributive modifiers and it would still make sense, both semantically and syntactically, which is why I also have problems with Việt Nam labeled "adjective" when it's not. TeleComNasSprVen (talk) 01:41, 2 February 2014 (UTC)[reply]

Delete unused templates

~~I request that the unused and recurring categories below be deleted:~~

All topics and List of topics contain all of the above list (including themselves) and Communication contains Language. All topics and List of topics are solely recursion. --kc_kennylau (talk) 18:09, 28 January 2014 (UTC)[reply]

@Kc kennylau, just to let you know: the Beer Parlour is the Supreme Court of Wiktionary. It's meant to be for important, far-reaching discussions that may have a serious effect on Wiktionary. Small things like this one are supposed to be brought up on their designated pages, in this case WT:RFDO. --WikiTiki89 18:25, 28 January 2014 (UTC)[reply]

spelling pronunciations

From User_talk:Angr#Spelling_pronunciation.

[...] as of yet, categories which would gather words that underwent this phenomenon don't exist. Do you think it would be useful to create it? --Fsojic (talk) 19:07, 19 January 2014 (UTC)[reply]

I don't know. It would be a little weird to have, say, Category:English spelling pronunciations since our entries are for words, not pronunciations. Category:English words that have spelling pronunciations is a bit better, but I have no idea whether other people will think it's necessary. [...] —Aɴɢʀ (talk) 19:28, 19 January 2014 (UTC)[reply]

I'm split. If we had such a category, I would be very interested in looking through it. However, I don't think it would be feasible to maintain (i.e. to add entries to). --WikiTiki89 23:03, 28 January 2014 (UTC)[reply]

This sounds more like a nuanced quality than black-and-white.

Adding historical pronunciations and mentioning influences would be a good start to capturing this kind of information. Bring some etymology into the pronunciation. Or is it pronunciation into the Etymology? —Michael Z. 2014-01-28 23:37 z

Can someone give an example of an entry that would go in this category? Is this for something like vittles (spelling based on the way victuals is pronounced)? Or is it for Arctic /ˈɑɹktɪk/ (pronunciation based on the spelling, different from the original pronunciation /ˈɑɹtɪk/)? - -sche (discuss) 05:08, 29 January 2014 (UTC)[reply]

It would make more sense to have misspellings based on the pronunciations - iland, ile, wot, etc. (island, isle, what)?

A phonetic respelling is perhaps better understood by less educated users. Some good Russian children's dictionaries provide phonetic re-spellings for words pronounced irregularly, e.g. что [што], чего [чево], сегодня [севодня], пожалуйста [пожалуста]. I noticed that kids in Australia are not familiar with IPA at all but seem to understand phonetic re-spellings. --Anatoli ^{(обсудить}/^вклад) 05:27, 29 January 2014 (UTC)[reply]

In case you didn't realize, that is completely off-topic. But since you mention it, very few people who are not linguists are familiar with IPA. It's not just kids and not just in Australia. --Wiki Tiki 89 05:56, 29 January 2014 (UTC)[reply]

It's not off-topic. -sche seems to have asked the exact same question as well - "what will go in this category". --Anatoli ^{(обсудить}/^вклад) 06:10, 29 January 2014 (UTC)[reply]

The phenomenon we're discussing is the exact opposite. It's the pronunciation change based on spelling, not spelling change based on pronunciation. --Wiki Tiki 89 06:12, 29 January 2014 (UTC)[reply]

I see, thanks. What's the phenomenon called when "Pinochet", "Beijing", "adagio" are pronounced as if they were French words, when they should be pronounced more like the English way? Hyperforeignism? Maybe we should have categories for them as well? --Anatoli ^{(обсудить}/^вклад) 06:17, 29 January 2014 (UTC)[reply]

Except for "Pinochet", I think saying that they are pronounced as if they were specifically French is an oversimplification. But more to the point, I think the same thing that I thought about the spelling pronunciations (see my first post in this section). --Wiki Tiki 89 06:22, 29 January 2014 (UTC)[reply]

Yes, I'm being silly and oversimplify the following way: "Everything foreign is French for English speakers who don't speak foreign languages, so "ch", "g" and "j" must be pronounced the French way." Yes, there could be different reasons for the spelling to affect how words are pronounced. Russian examples are "дождь" and "скучный", which are now often pronounced the way they are spelled. --Anatoli ^{(обсудить}/^вклад) 06:29, 29 January 2014 (UTC)[reply]

Actually, in the case of Pinochet, I think the Spanish pronunciation is the spelling pronunciation, while the French one is more original (or you could call it a spelling pronunciation on top of another spelling pronunciation). Also, are you saying that "дождь" is supposed to be pronounced /doɕː/? --Wiki Tiki 89 06:36, 29 January 2014 (UTC)[reply]

Pinochet is pronounced /-ˈtʃɛt/ in both English and Spanish (cognate sounds); saying /-ˈʃeɪ/ is hyperforeignism. "дождь" used to be /doɕː/ (which roughly matches other Slavic languages), especially in Moscow; the plural was /dɐˈʑːi/. It still is, but spelling pronunciation has happened. --Anatoli ^{(обсудить}/^вклад) 06:43, 29 January 2014 (UTC)[reply]

What I meant was Pinochet is a "descendant of a French Breton immigrant from Lamballe" (according to Wikipedia), and therefore the French pronunciation is not as wrong as you make it seem. And I always thought that /doɕː/ was just a colloquial alternative (that became the standard spelling in Ukrainian). --Wiki Tiki 89 06:51, 29 January 2014 (UTC)[reply]

How to pronounce names of people moving from country to country is always a big question but Chileans and Latin Americans pronounce the name as it's spelled /pinoˈtʃet/, Russian: Пиноче́т (Pinočét) and non-Roman languages used the Spanish pronunciation to refer to him in their language. Knowing how poorly English usually retains the original pronunciation, IMHO, this is more likely a hyperforeignism, not a case of retaining the French pronunciation. The majority of announcers wouldn't have a clue he was of French origin. --Anatoli ^{(обсудить}/^вклад) 23:57, 29 January 2014 (UTC)[reply]

And that is why I said that it's a spelling pronunciation on top of another spelling pronunciation that leads back to the original pronunciation. --Wiki Tiki 89 00:20, 30 January 2014 (UTC)[reply]

Agreed. It's a funny mixture of both. --Anatoli ^{(обсудить}/^вклад) 00:23, 30 January 2014 (UTC)[reply]

To clarify some terms: a spelling pronunciation is when the pronunciation is changed to reflect the spelling, such as pronouncing a /k/ sound in Arctic because there's a c in it (the pronunciation /ɑrtɪk/ is the older pronunciation). Things like "vittles" are pronunciation spellings. Pronouncing Beijing, adagio, and Taj Mahal with /ʒ/ (the "zh" sound of measure) instead of /dʒ/ (the "j" sound of juice) is hyperforeignism. —Aɴɢʀ (talk) 07:36, 29 January 2014 (UTC)[reply]

When and why did everyone start saying Beizhing for Beijing? I’m sure this didn’t used to be the case.

I also squirm when I hear about “onchiladas” and pasta cooked “el dontay.” Is there a name for a misplaced foreignism, like when the first George Bush called the city “Sarahayvo,” as if from Spanish? —Michael Z. 2014-01-29 17:01 z

I've never heard "onchiladas" or "el dontay". But now that I think about it, isn't "enchilada" a spelling pronunciation, since it should be pronounced "enchilatha" (/ˌɛntʃɪˈlɑːðə/)? I find it strange that even though English has the /ð/ phoneme, it is never preserved in foreign words. --Wiki Tiki 89 17:16, 29 January 2014 (UTC)[reply]

Maybe it's because dental fricatives are perceived as typically English, and people can't imagine that they exist in other languages too. And when people hear /ð/ but see <d> written, they assume that the Spanish are just not pronouncing their <d> right? Things like this happen even within spelling alone. It's not uncommon for people to write byoux instead of bijoux in the Netherlands, because people think the combination <ij> is "too Dutch" to be right for a French word. —CodeCa t 17:22, 29 January 2014 (UTC)[reply]

English speakers are remarkably unable to pronounce /ð/ outside English. When I was taking Welsh as an undergraduate, one of my classmates was completely incapable of pronouncing /ð/ in Welsh words even though she had no difficulty pronouncing it in English words. She could say them like the native English speaker she was, but couldn't pronounce ddim to save her life; it always came out dim. —Aɴɢʀ (talk) 21:30, 29 January 2014 (UTC)[reply]

/ð/ and /θ/ and the emphatic variety of /ðˤ/ are also used in Arabic but words with "th" and "dh" are not perceived as having those sounds. Same with Spanish s, c, d, which are pronounced the English way. --Anatoli ^{(обсудить}/^вклад) 23:57, 29 January 2014 (UTC)[reply]

Actually words spelled with "th" usually are pronounced with /θ/ (even when they shouldn't be, such as with Siddhartha). --Wiki Tiki 89 00:20, 30 January 2014 (UTC)[reply]

Perhaps "dh" is a better example, always pronounced /d/, when it should (or could) be /ð/ (Arabic /ð/ or /ðˤ/, like Abu Dhabi). --Anatoli ^{(обсудить}/^вклад) 00:27, 30 January 2014 (UTC)[reply]

There's also the example of Ahmadinejad, where the j is pronounced as /dʒ/ when it should be /ʒ/. --Wiki Tiki 89 00:31, 30 January 2014 (UTC)[reply]

Yes, his name should be spelled Ahmadinezhad, as Persian has both /dʒ/ and /ʒ/ and "zh" is used to transliterate Cyrillic "ж". --Anatoli ^{(обсудить}/^вклад) 00:45, 30 January 2014 (UTC)[reply]

(idiomatic)

{{context|idiomatic}} has always really annoyed me. It adds no information to the entry (if something weren't idiomatic, we generally shouldn't have it anyway). Therefore, I propose that we remove this tag from all entries and definitions. (Not to mention it seems to scream at me in a high-pitched voice, “Please don't delete me! I swear I'm idiomatic, I swear,” which just makes it irresistible for me to nominate it for RFD.) --Wiki Tiki 89 04:15, 29 January 2014 (UTC)[reply]

I've often wondered about this myself. As you note, if something weren't idiomatic, we wouldn't include it. And as you note, it is very often used on senses that aren't idiomatic but that someone is trying (consciously or not) to insulate against RFD. I don't know if we should get rid of it entirely. Possibly. There are certainly a lot of places it doesn't belong and yet finds itself in. - -sche (discuss) 05:02, 29 January 2014 (UTC)[reply]

I've been scanning pages with my bot, but I am still unable to find one single page using {{context|idiomatic}}. Please provide me one page that uses {{context|idiomatic}} so that I can know if my script contains some problem. --kc_kennylau (talk) 10:27, 29 January 2014 (UTC)[reply]

Most if not all of the pages in Category:English idioms (and corresponding categories for other languages) use it. —Aɴɢʀ (talk) 11:29, 29 January 2014 (UTC)[reply]

OK, I'll fix my script. --kc_kennylau (talk) 12:21, 29 January 2014 (UTC)[reply]

List generated: User:Kc kennylau/pages using idiomatic tag, 7156 pages use it. --kc_kennylau (talk) 10:46, 30 January 2014 (UTC)[reply]
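For anyone repeating the scan above, here is a minimal sketch of how a script might detect the tag in a page's wikitext. It is not kc_kennylau's actual script. It assumes (without confirmation from this thread) that the tag can appear alongside other labels or named parameters such as lang=en, which a search for the literal string {{context|idiomatic}} would miss. The names CONTEXT_TEMPLATE and uses_idiomatic_tag are made up for illustration, and the sketch does not handle nested templates or any alias templates.

```python
import re

# Matches a {{context|...}} call and captures its parameter list.
# Assumption: parameters contain no nested templates (no braces inside).
CONTEXT_TEMPLATE = re.compile(r"\{\{\s*context\s*\|([^{}]*)\}\}")


def uses_idiomatic_tag(wikitext: str) -> bool:
    """Return True if any {{context}} call lists 'idiomatic' as a label."""
    for match in CONTEXT_TEMPLATE.finditer(wikitext):
        params = [p.strip() for p in match.group(1).split("|")]
        # Named parameters (e.g. lang=en) are not labels, so skip them.
        labels = [p for p in params if "=" not in p]
        if "idiomatic" in labels:
            return True
    return False


if __name__ == "__main__":
    sample = "# {{context|idiomatic|lang=en}} To die."
    print(uses_idiomatic_tag(sample))
```

Splitting on the pipe and comparing whole labels, rather than searching for the substring, keeps a label such as "unidiomatic" (hypothetical) from being counted by accident.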

Help with German verb conjugations

Category:German verbs needing inflection now has a hundred entries. I do not have the confidence to add conjugation tables to any but the simplest German verbs, so I would be grateful if anybody else could do so. The bot SemperBlottoBot can now add the conjugated forms of regular German verbs, though it was very buggy at first (especially with strong verbs having strange combinations of parameters). I believe that it is now bug-free, but I shall keep an eye out for errors. SemperBlotto (talk) 12:29, 29 January 2014 (UTC)[reply]

I'm starting to do some now. Maybe you can help by adding documentation for all the German verb inflection-table templates and headword-line templates that don't yet have any, and by making sure they all create links using {{l-self}} ({{de-conj-strong-7}}, for example, doesn't create links). —Aɴɢʀ (talk) 13:39, 29 January 2014 (UTC)[reply]

Hmm. The only templates that I have looked at (and the only ones that the bot understands) are {{de-conj-strong}} and {{de-conj-weak}}. I think they have reasonable documentation - but the strong one made my brain hurt. I'll have a look at the others. Thanks very much for your contributions. SemperBlotto (talk) 16:01, 29 January 2014 (UTC)[reply]

A lot of those parameters could be removed by converting it to Lua. I'm also not sure what the point of parameter 8 is on {{de-conj-strong}}, since strong verbs by definition don't add -te. —CodeCa t 16:32, 29 January 2014 (UTC)[reply]

A general overhaul of the German templates would be great. I think the idea behind parameter 8 is to allow {{de-conj-strong}} to be used on rückumlautende verbs like kennen and rennen, which have vowel alterations like strong verbs even though they're actually weak. —Aɴɢʀ (talk) 21:34, 29 January 2014 (UTC)[reply]

{{de-conj-strong}} is also used on other irregular weak verbs like bringen. —Aɴɢʀ (talk) 10:55, 30 January 2014 (UTC)[reply]

Remove protection

@Connel MacKenzie, CodeCat, SemperBlotto I request that the protection level of Wiktionary:Grease pit/2007 be lowered. --kc_kennylau (talk) 12:42, 29 January 2014 (UTC)[reply]

Why? Keφr 14:07, 29 January 2014 (UTC)[reply]

@Kephir Why not? --kc_kennylau (talk) 16:37, 29 January 2014 (UTC)[reply]

Because there’s no reason to? —Michael Z. 2014-01-29 16:42 z

Well, all other similar pages have no protection level, and neither should this one. --kc_kennylau (talk) 16:51, 29 January 2014 (UTC)[reply]

Csörföly D (talk • contribs)

This user has made only one edit outside their userspace since April 2011. Since then, their only contributions here are bizarre user pages: Special:PrefixIndex/User:Csörföly D, Special:PrefixIndex/User talk:Csörföly D. The contents seem vaguely relevant for a dictionary, but are hardly comprehensible or generally useful in their current form. I am not sure what to do. Keφr 14:06, 29 January 2014 (UTC)[reply]

My suggestion is to do nothing at all. --Back on the list (talk) 14:11, 29 January 2014 (UTC)[reply]
I have puzzled over these myself several times. Personally, I would delete them all. Has anyone tried to communicate with the user? SemperBlotto (talk) 15:54, 29 January 2014 (UTC)[reply]
- Thanks to the new Echo notifications, they should be notified of this thread and hopefully come here soon enough. Keφr 16:56, 29 January 2014 (UTC)[reply]
  - Or they might not click on the notification and not come to this thread. --WikiTiki89 17:00, 29 January 2014 (UTC)[reply]
    - The pages that he has created are nothing more than tables and poems. --kc_kennylau (talk) 17:07, 29 January 2014 (UTC)[reply]

Competition time

In breaking with the traditional Easter competition, I'm announcing this year's competition to start in February, on Feb 1 to be precise. Have a look at Wiktionary:February Competition 2014, suggest any modifications to the gameplay, format, wording etc. on the page. And we shall start playing on Feb 1. --Back on the list (talk) 14:39, 29 January 2014 (UTC)[reply]

Capital letters in transliterations of languages that do not have capital letters

I've been thinking about this for a long time. Should we use capital letters in transliterations of languages that have no case distinction in their letters? I've noticed this being done both for proper names and for sentences. It has always bugged me, because among the languages that do have capitalization, there are many different systems of rules for what to capitalize and when. By capitalizing transliterations of languages without capitalization, we are imposing the English capitalization system on these languages. I cannot think of any arguments for doing this, so I think we should not. --Wiki Tiki 89 02:52, 30 January 2014 (UTC)[reply]

I agree. I've been thinking we should do the same for reconstructions as well. What about old languages that were written before modern capitalisation rules? For example, should Caesar or Óðinn be capitalised? —CodeCa t 02:59, 30 January 2014 (UTC)[reply]

For Latin and Ancient Greek, I would have said that they should not be capitalized, but they do have a long tradition of being capitalized so it's not as much of an issue. --Wiki Tiki 89 03:03, 30 January 2014 (UTC)[reply]

There are different situations with different languages.

1. Standard Japanese rōmaji, Korean romaja, Mandarin Chinese pinyin officially allow or even recommend capitalisation. It's not always easy to implement technically when transliteration is automatic, though a Japanese module does capitalise proper nouns. Yes, capitalisation is influenced by English and other languages. Note that rōmaji and pinyin (not sure about romaja but probably the same) are used to transliterate place names in Japan, China, etc. You could see the Latin spelling just about anywhere. It's especially the case with Japan, as it uses macrons, which make them different from the English spellings of the same place names.
2. A bunch of Indic languages and Arabic use a variant capitalisation to distinguish sounds (emphatic, retroflex consonants, long vowels, etc.). Until these cases are standardised, it's not a good idea to have a bot replace capitals with lower-case letters.
3. Yes, I don't see any good reason to capitalise Persian, Georgian transliterations either, unless there is some standard, similar to #1. There could be some other reason why Persian is capitalised; we need to ask some editors. I haven't seen capitalised Thai, Lao, Burmese, Khmer. Not sure about others; they may fall under #2 or #3. --Anatoli ^{(обсудить}/^вклад) 03:09, 30 January 2014 (UTC)[reply]

Regarding Arabic and Indic languages: I am aware of that, and I was not suggesting blindly removing them. Regarding Japanese, Korean, and Chinese: there is a difference between transliteration for the purposes of usage in English (or other) language text and transliteration for the purposes of a dictionary. Obviously if you are writing an English guide-book or something, you're gonna want to capitalize transliterated names of cities and other things, but do we still need to do that when the transliterations are not being used as English words? --Wiki Tiki 89 03:16, 30 January 2014 (UTC)[reply]

I thought you would ask about guide-books. No, capitalisation is used in dictionaries, textbooks and other materials. See w:Pinyin#Capitalization_and_word_formation. I've seen rules on rōmaji and romaja as well. --Anatoli ^{(обсудить}/^вклад) 03:22, 30 January 2014 (UTC)[reply]

I see. Well then, I think the general rule should be not to capitalize anything, but for a given language such as Mandarin Chinese, we can decide to override it and allow capitalization. --Wiki Tiki 89 03:30, 30 January 2014 (UTC)[reply]

So, "Xí Jìnpíng" is the pinyin transliteration (with tone marks) and "Xi Jinping" is the English spelling (without tone marks). The guide above also has sort of an answer to your question about what makes a "word" in Chinese in an earlier discussion. Although Chinese count what they say in 字 (zì, “character”), they certainly have a concept of 词 (cí, “word”). Pinyin in solid forms are what is considered words, separated by spaces. I agree, we should clean up. ZxxZxxZ (talk • contribs), Dijan (talk • contribs) might contribute about Persian, Hindi, etc. --Anatoli ^{(обсудить}/^вклад) 03:41, 30 January 2014 (UTC)[reply]

The Tajik variant of Persian (written in Cyrillic and Latin) already has specific capitalization rules (similar to those of Russian). I think we should either follow those rules for transliterations of Persian for the sake of consistency, or not capitalize transliterations at all (currently we are following English rules of capitalization for transliterations of Persian). --Z 20:31, 30 January 2014 (UTC)[reply]

Most Persian speakers are probably not familiar with the capitalization rules of Tajik, so I don't think following them is a good idea. --Wiki Tiki 89 20:38, 30 January 2014 (UTC)[reply]

In answer to CodeCat's question: for languages like Latin and Old Norse where nouns like Caesar and Óðinn etc are almost always capitalized thusly in modern editions of old works, and in new works, we should certainly continue to capitalize thusly. For Old Norse it would be strange to normalize the orthography except the capitalization. For Latin, following original capitalization would just make entries hard to read; SCRIBIMVS DICTIONARIVM, NON ARCVM TRIVMPHALEM.
In answer to WikiTiki's question: IMO, we shouldn't add capitals when transliterating languages that don't have capitals, unless it is common/standard to do so (as Anatoli says it is in Chinese). - -sche (discuss) 07:04, 30 January 2014 (UTC)[reply]

I've been capitalizing the romanizations of proper nouns because people capitalize the first letters of their own given names and surnames, and because signs, publications, road signs, etc. capitalize the romanizations of place names. 99% of romanized Japanese is uncitable. It simply does not exist on its own. Yet the 1% which does exist consists mainly of the names of people and places, and those are capitalized. Haplogy (話) 07:40, 30 January 2014 (UTC)[reply]

Re: "99% of romanized Japanese is uncitable". Perhaps "citable" is not the right word for transliteration but the standard transliteration shows in published dictionaries, which use transliteration. --07:56, 30 January 2014 (UTC)

The JA-EN dictionaries I've looked at all use initial capitalization for transliterated proper nouns, such as 夏目漱石 --> Natsume Sōseki, or 東京 --> Tōkyō. On a similar tack, I would also not argue for omitting spaces in transliterated Japanese just because Japanese written in its native script doesn't use spaces. Different scripts and different contexts often demand different conventions. My 2p, anyway. ‑‑ Eiríkr Útlendi │ Tala við mig 08:39, 30 January 2014 (UTC)[reply]

What would be the reason for avoiding capitalisation? Shouldn't a transliteration into English look like English? As Eirikr says above, we don't copy other aspects of the language that differ from English layout. (I know little about these language conventions, so perhaps my question is naive?) Dbfirs 08:53, 30 January 2014 (UTC)[reply]

For one thing, we are not transliterating into English. We are transliterating into a more familiar alphabet in order to aid people who cannot read a particular script, and in some cases to show morphological features that are not represented in the original script. Therefore, the transliterations should not be bound by the English rules for capitalization. --Wiki Tiki 89 18:00, 30 January 2014 (UTC)[reply]

Oh, I see. So the same would be used for a french transliteration and a German Transliteration etc? Dbfirs 15:23, 1 February 2014 (UTC)[reply]

I fail to understand your question. --Wiki Tiki 89 23:00, 1 February 2014 (UTC)[reply]

Sorry, it was a rather obscure joke on the French practice of under-capitalisation and the German practice of over-capitalisation (compared with English), and its main purpose was to acknowledge that my original question is irrelevant because you were talking about transliteration into the Roman alphabet, not specifically for English readers. Dbfirs 16:57, 2 February 2014 (UTC)[reply]

Capitalisation should especially be avoided if capital letters are used for something else, e.g. to render different sounds or when it's not standard and there is no convention to use upper case letters. Even if capitalised, transliteration doesn't need to look like English, e.g. months, days of the week are not capitalised in romanised Chinese or Japanese.--Anatoli ^{(обсудить}/^вклад) 11:52, 30 January 2014 (UTC)[reply]

That's one of my points. In Chinese we know that we should not capitalize months and days of the week because there is a standard that says so. For languages without a standard for capitalization, we have no unbiased way to decide how to capitalize these corner cases. For example, should capitalization in Yiddish transliteration follow the rules of German and capitalize every noun? --Wiki Tiki 89 18:00, 30 January 2014 (UTC)[reply]

I haven't seen capitalised Yiddish and I think it shouldn't be capitalised but I don't know if there are other standards for it. Hebrew shouldn't be capitalised either. --Anatoli ^{(обсудить}/^вклад) 20:18, 30 January 2014 (UTC)[reply]

Well, quotations of Yiddish within an English context generally do use capital letters (according to the English rules), for example here. But not always, for example here. My point is not that Yiddish never appears with capital letters, but that inferring our own capital letters from Hebrew-script text does not have any advantages. --Wiki Tiki 89 20:32, 30 January 2014 (UTC)[reply]

Well, added capitalization may well have an advantage when its purpose is to aid comprehension for English readers who may be somewhat familiar with spoken Yiddish, for example. But our intent is different: to demonstrate usage by faithfully replicating the original text. —Michael Z. 2014-02-02 17:46 z

When transliterating the majority of languages with complex script, the faithful replication is immediately broken when spaces, hyphens are inserted, missing in the original text, because the result becomes impossible to use otherwise. E.g. for abjad based languages, writing out short vowels, which are not written in the original text is also breaking that rule. E.g. a phrase "where do you live?" in Persian کجا زندگی می‌کنید؟ (kojâ zendegi mikonid?) would become "kjâ zndgi miknid" without unwritten short vowels and Thai คุณอยู่ที่ไหน (kun yòo têenăi?) would be "kunyòotêenăi" without word spaces. So, there are legitimate changes to the original text. Persian also seems to consistently use a hyphen before the unwritten "e" for ezâfe, e.g. برادر بزرگ (barâdar-e-bozorg) ("the big brother"), Tajik: "бародари бузург" (бародар + и). As I mentioned above, CJK languages use capitalisation, hyphenation and spacing in the standard transliteration for their languages. I won't be surprised if such convention is also used for Persian, Yiddish or other languages. Perhaps someone can check how standard dictionaries transliterate proper nouns in Persian and Yiddish. در تهران زندگی می‌کنم "dar Tehrân zendegi mi-konam" (I live in Tehran) is easier to understand if "Tehrân" is capitalised and unlike Arabic, capital letters don't cause any possible misreading. For Indic and Arabic languages, capital letters are used for other sounds, so capitalisation may harm, rather than help the understanding. --Anatoli ^{(обсудить}/^вклад) 22:47, 2 February 2014 (UTC)[reply]

If a transliteration is dictionary working text, only meant to help the reader interpret foreign script, then we shouldn’t be enhancing it with capitalization to suit our sensibilities. Certainly transliteration of a quotation should usually remain true to the original, to demonstrate the original usage.

Anyway, this shouldn’t be that much of an issue. The written expression of a term is defined by its spelling and not capitalization, and we should lemmatize accordingly. The Old Norse, whether we style it Óðinn, óðinn, ÓÐINN, or óðinn, should appear on the same web page as the Icelandic word Óðinn.

Romanization used in living languages, like romaji and pinyin, is not just dictionary text. Perhaps it should follow its language conventions. —Michael Z. 2014-01-30 19:33 z

Are you entirely missing the point here? We are discussing the romanization of languages such as Chinese and Arabic, which don't have capital and lower case letters. They don't have their own "language conventions" for this. In the case of Pinyin, capitalization is defined as part of it. In the case of Arabic, there is no one standard for transliteration, let alone for capitalization. --Wiki Tiki 89 19:40, 30 January 2014 (UTC)[reply]

Are you entirely missing any discussion skills? —Michael Z. 2014-01-30 20:36 z

I'm not even going to answer that, but just clarify that my objection was to your sentence "Perhaps it should follow its language conventions.". --Wiki Tiki 89 20:40, 30 January 2014 (UTC)[reply]

Though Chinese and Japanese, by way of examples, don't use capital letters when written in their native scripts, there are ample examples of Latin transcriptions, wherein native speakers / writers of these languages have clearly developed capitalization preferences.

I may be stretching here, but I believe that these capitalization preferences are what Michael was referring to with his mention of “Romanization used in living languages, like romaji [for Japanese] and pinyin [for Chinese]”. I'm most familiar with Japanese, and there are indeed clear preferences for broadly hewing to English capitalization standards when writing Japanese in the Latin script. The JA editors I'm aware of here follow these standards when writing Japanese in the Latin script on the EN WT. From what I've seen in the world at large, there are similar preferences for Chinese written in the Latin script, and I think the CMN editors here likewise follow those preferences. I have no real idea about Arabic, that being wholly outside the purview of my studies to date, but I suspect a little research could uncover whether or not Arabic written in the Latin script by Arabic speakers / organizations / authorities has any capitalization conventions. ‑‑ Eiríkr Útlendi │ Tala við mig 22:43, 30 January 2014 (UTC)[reply]

I checked an online Quran which provided Arabic text, a transliteration, and English text, but the transliteration made use of capitals to transliterate different letters than the minuscules, which made it difficult to tell whether or not it was also capitalizing proper nouns, etc. Perhaps the mere danger that if we used capitals it would be assumed that we were using them to convey transliteration data is reason enough for us not to use them for Arabic. - -sche (discuss) 22:55, 30 January 2014 (UTC)[reply]

That's right, capitals in Arabic are used for different sounds, so they shouldn't be used to capitalise proper nouns. E.g., طهران is "ṭihrān" and may be "Tihraan" in chat or less standard transliteration because the first letter ط (ṭ) is emphatic /tˤ/, different from ت (t) /t/. --Anatoli ^{(обсудить}/^вклад) 23:23, 2 February 2014 (UTC)[reply]

So, by not including capital letters in the Arabic script transliterations, are we doing it to accommodate various transliteration methods? Or should we take a second look at the dotted transliteration systems and adopt them on Wiktionary? --Dijan (talk) 23:34, 2 February 2014 (UTC)[reply]

We are using the dotted system, although there are a lot of entries that still use the capital letter system. But even with the dotted system, it can create confusion. --Wiki Tiki 89 23:38, 2 February 2014 (UTC)[reply]

I am quite aware that some of us are using the dotted letters. But, some letters have as many as 4 "valid" variations according to Wiktionary:About Arabic. For that reason alone, we really should agree on one system. And no, I don't see how it can cause confusion. We have rules and policies for transliterating every other language (which may have multiple transliteration schemes, but for which we have adopted only one), so why should Arabic be any different? --Dijan (talk) 00:33, 3 February 2014 (UTC)[reply]

Using non-standard transliteration with Arabic is just too common. So, even if we fully standardise Arabic transliteration at wiktionary, capital letters H, S, D, T and Z in any position may be interpreted as transliteration for letters ح‎, ص‎, ض‎, ط‎ and ظ‎ (standard or "dotted" system: ḥ, ṣ, ḍ, ṭ and ẓ), when we actually mean ه‎, س‎, د‎, ت‎ and ز‎ (h, s, d, t and z). Such a confusion should be avoided. --Anatoli ^{(обсудить}/^вклад) 00:07, 3 February 2014 (UTC)[reply]
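The confusion described above can be illustrated with a short sketch. It uses only the five letter pairs named in that comment (H, S, D, T, Z in the capital-letter scheme against ḥ, ṣ, ḍ, ṭ, ẓ in the dotted system). The function names are made up for illustration, and this is not how any Wiktionary module works.

```python
# The five correspondences listed in the comment above: a capital letter in
# the informal scheme stands for the emphatic (dotted) consonant.
CAPITAL_TO_DOTTED = {"H": "ḥ", "S": "ṣ", "D": "ḍ", "T": "ṭ", "Z": "ẓ"}


def read_as_capital_scheme(translit: str) -> str:
    """Show how a reader used to the capital-letter scheme would interpret
    a transliteration: every H, S, D, T, Z becomes a dotted letter."""
    return "".join(CAPITAL_TO_DOTTED.get(ch, ch) for ch in translit)


def may_be_misread(translit: str) -> bool:
    """Return True if the string contains a capital that doubles as a
    transliteration symbol in the capital-letter scheme."""
    return any(ch in CAPITAL_TO_DOTTED for ch in translit)


if __name__ == "__main__":
    # "Tihraan" is the chat-style spelling mentioned earlier in this thread.
    print(read_as_capital_scheme("Tihraan"))
    # A name capitalised only because it is a proper noun would trip the check.
    print(may_be_misread("Salaam"), may_be_misread("salaam"))
```

Under this reading, a capital added purely to mark a proper noun is indistinguishable from one that marks an emphatic consonant, which is the risk raised in the comment.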

I suppose that using English capitalization rules would make transliterations more natural and understandable for monolingual English-speakers to read, but it would make it harder to produce the transliterations in the first place, and there would be more ways for editors to get things wrong.

I think the best policy would be to make lower-case the default, with the option for editors in any language to decide whether to use capitalization without having to ask for permission. In other words, it should be a language-by-language decision with minimal interference from the community as a whole. Chuck Entz (talk) 00:34, 3 February 2014 (UTC)[reply]

It's pretty much the case in most cases. CJK languages are capitalised - it's standard and editors use it. Thai, Lao, Khmer, Burmese are not capitalised by editors and most standards. Arabic and Indic standard transliteration is not capitalised, so editors NOW follow it. No resistance with lower case Georgian. With the rest it is still debated, especially those, which can't be transliterated automatically. --Anatoli ^{(обсудить}/^вклад) 00:59, 3 February 2014 (UTC)[reply]
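The "lower-case by default, decided language by language" idea proposed above could be sketched as a simple lookup. This is an illustration only: the table contents are assumptions based on the languages named in this thread as having a capitalising standard (Japanese, Korean, Mandarin), the codes ja, ko and cmn are used as keys, and the function names are hypothetical.

```python
# Assumed policy table: only languages said in this thread to have a
# standard that capitalises proper nouns are listed. Everything else
# falls back to the default of not adding capitals.
CAPITALISES_PROPER_NOUNS = {"ja": True, "ko": True, "cmn": True}


def adds_capitals(lang_code: str) -> bool:
    """Per-language decision, with 'no added capitals' as the default."""
    return CAPITALISES_PROPER_NOUNS.get(lang_code, False)


def romanise_proper_noun(lang_code: str, translit: str) -> str:
    """Capitalise the first letter only where the language's policy says so.
    Otherwise the string is returned untouched, so capitals that carry
    transliteration data are never altered."""
    if adds_capitals(lang_code):
        return translit[:1].upper() + translit[1:]
    return translit


if __name__ == "__main__":
    print(romanise_proper_noun("ja", "tōkyō"))
    print(romanise_proper_noun("fa", "tehrân"))
```

Leaving the input unchanged for unlisted languages, rather than forcing it to lower case, matches the caution voiced earlier about not letting a bot strip capitals blindly.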

Adding Balochi (bal) to wiktionary

We are a group of Balochi speakers and linguists, and we want to have our own version of Wiktionary for the Balochi language. The definitions are going to be Balochi/English, English/Balochi. Thank you for your cooperation. — This unsigned comment was added by Unibal88 (talk • contribs).

Hi. Do you mean you wish to work on the Balochi (Baluchi) Wiktionary or to add Baluchi contents to the English Wiktionary? We have some examples, e.g. Category:Baluchi nouns, other parts of speech. --Anatoli ^{(обсудить}/^вклад) 11:57, 30 January 2014 (UTC)[reply]

BTW, the name Baluchi/Balochi is used inconsistently at Wiktionary. If Baluchi is an alternative name of Balochi, then parts of speech shouldn't use the alternative name, or Balochi should be the alternative name. --Anatoli ^{(обсудить}/^вклад) 12:01, 30 January 2014 (UTC)[reply]

See this page. Good luck. — Ungoliant ^(falai) 12:04, 30 January 2014 (UTC)[reply]

Hi. We wish to work on the Balochi (Baluchi) Wiktionary. Can we please be guided on how to do so? — This unsigned comment was added by Unibal88 (talk • contribs).

Pronunciations 1 and 2

KassadBot doesn't like the headers "Pronunciation 1" and "Pronunciation 2", but how else are we to distinguish German forms like ˈdurchˌdringen vs. durchˈdringen or ˈdurchˌströmen vs. durchˈströmen? Dividing them by etymology doesn't make sense since both pronunciations are formed by adding the same prefix to the same root. —Aɴɢʀ (talk) 12:58, 30 January 2014 (UTC)[reply]

I support enumerated pronunciations. I never understood why etymologies should be enumerated but not pronunciations. — Ungoliant ^(falai) 13:08, 30 January 2014 (UTC)[reply]

It does not make sense to talk about her bot if we don't notify her. It's like talking about something behind someone's back. Therefore, here is the ping: User:Liliana-60. Hopefully she'll see this conversation. --kc_kennylau (talk) 13:12, 30 January 2014 (UTC)[reply]

I take it for granted that Liliana keeps the Beer parlour on her watchlist. I don't take that for granted about everyone, but someone as active here as she is, yes. —Aɴɢʀ (talk) 13:33, 30 January 2014 (UTC)[reply]

No, they have different etymologies. One comes from the prefix durch-, the other comes from the adverb durch. So separate etymologies make very much sense here. -- Liliana • 13:21, 30 January 2014 (UTC)[reply]

So are all separable verbs to be treated as compounds of adverbs (or other POSes) with verbs rather than as having prefixes? That's how I've been treating the ones where the separable prefix isn't really a prefix at all, like freigeben. But I notice that durch has no ===Adverb=== header, just Preposition and Postposition. —Aɴɢʀ (talk) 13:33, 30 January 2014 (UTC)[reply]

Prefixes are by definition inseparable, that's why they're called prefixes. Separable verbs are thus derived from the respective adverbs. -- Liliana • 13:48, 30 January 2014 (UTC)[reply]

This is how they work in Dutch too, see doordringen. —CodeCat 18:04, 30 January 2014 (UTC)[reply]

Hard categorize them into Category:Entries with Pronunciation n headers, with the 1800 others. EP was supposed to come up with proposed rules for how these headers were to be used, enabling Ullmann to revise the Autoformat bot. Entries that have such headers and are members of the category are apparently ignored by AF/KassadBot. The behavior of Kassadbot in effect assists in tracking the existence of such headers.

There are other approaches to the problem:

a single Pronunciation header with all pronunciations for all PoSes, which also helps make the distinctions apparent.
forcing distinct Etymology sections, even though the content of the sections would be identical or even empty.
revising Kassadbot's logic.

That is all that I know about this. DCDuring TALK 13:22, 30 January 2014 (UTC)[reply]
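The tracking described above amounts to spotting numbered pronunciation headers in an entry's wikitext. The following is a minimal sketch of such a check, not the actual logic of AF/KassadBot; the function name and the sample entry text are made up for illustration.

```python
import re

# Matches a wikitext header such as "===Pronunciation 1===" at any header level.
# The backreference \1 requires the same number of "=" on both sides.
PRON_N_HEADER = re.compile(r"^(=+)\s*Pronunciation\s+(\d+)\s*\1\s*$", re.MULTILINE)

def numbered_pronunciation_headers(wikitext):
    """Return the numbers used in 'Pronunciation n' headers, in page order."""
    return [int(m.group(2)) for m in PRON_N_HEADER.finditer(wikitext)]

# Illustrative entry skeleton (hypothetical content, not copied from a real page).
sample = """==German==
===Pronunciation 1===
* (first pronunciation)
====Verb====
===Pronunciation 2===
* (second pronunciation)
====Verb====
"""

print(numbered_pronunciation_headers(sample))
```

An entry for which this returns a non-empty list would be a candidate for the tracking category; a plain `===Pronunciation===` header is not matched.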

EP is EncycloPetey (talk • contribs), right? —Aɴɢʀ (talk) 13:33, 30 January 2014 (UTC)[reply]

@DCD, if the pronunciations are different, and that difference has an impact on the meaning, then separate etym sections makes the most sense to me -- how did the different pronunciations arise? how did that difference affect the semantics of the term? That kind of history seems (to me, anyway) like it would belong under an ===Etymology=== header. No? ‑‑ Eiríkr Útlendi │ Tala við mig 18:38, 30 January 2014 (UTC)[reply]

Apparently often the differences are inflectional. Also often reliable etymological information is not available. I certainly have tried to force different etymologies and have sometimes failed to find any good basis for the etymological distinction I was hoping to make. It would probably be nice to break Category:Entries with Pronunciation n headers into language-specific categories so that those with language knowledge could apply their skills to identifying possible solutions for the presentation problem that arises. I don't think it comes up in English, certainly not often. DCDuring TALK 19:05, 30 January 2014 (UTC)[reply]

You don't have to actually give the etymology in order to split it into two etymology sections. --Wiki Tiki 89 19:07, 30 January 2014 (UTC)[reply]

“Etymology x” headings don’t actually head different etymologies. They head entire entries for homographs. Anyway, Eirikr’s rationale makes enough sense to me: different terms spelt the same should be headed “Etymology x.” (We have redundant pronunciations on many pages with multiple etymologies, so why not have redundant etymologies on a few pages with multiple etymologies?)

But it might be better if the first H2 header on don was “English 1.” Or if the header below “English” was “don 1.” —Michael Z. 2014-02-03 01:57 z

In the case of the Latin ones at least, the different pronunciations often have the same etymology and orthography, often differing by PoS and/or inflection only.

The category provides a list of lexicographic opportunities to apply and validate all linguistic theories about how the entries should be corrected. DCDuring TALK 12:54, 3 February 2014 (UTC)[reply]

Take a look at how מלך solves this problem. (I've seen it done like this plenty of times, but couldn't find an example right now, so I made one.) --Wiki Tiki 89 19:08, 3 February 2014 (UTC)[reply]

Another way of doing that, which I've seen in English entries, is:

* (Modern Israeli Hebrew) IPA^(key): /ˈme.leχ/ (noun)

* (Modern Israeli Hebrew) IPA^(key): /maˈlaχ/ (verb)

- -sche (discuss) 19:24, 3 February 2014 (UTC)[reply]

I like the structure. But the headword line should remain the most visually prominent element on the page. —Michael Z. 2014-02-03 19:34 z

@DCD, separate etym headers could still be put to helpful use in explaining how the different inflection patterns / POSes came about.

@all, single spellings with multiple inflections etc is extremely common in Japanese. Earlier today, I expanded the entry for 牙 (“fang, tusk, tooth”), which has six different readings, each with different origins, derived terms, and usage contexts. I can do this more clearly because the JA source texts I'm working from already explain most of this; I certainly sympathize with the difficulties described by DCD above, where the background for differences is left unclear by the available literature. But in cases like durchdringen, we should do our best to explain the differences in derivation.

(Incidentally, someone better versed in German should expand the etymologies currently given there to include more detail -- Liliana mentions above that one is from the prefix and one is from the adverb, but that information is currently missing from the entry.)

Cheers, ‑‑ Eiríkr Útlendi │ Tala við mig 05:29, 4 February 2014 (UTC)[reply]

I have created a dozen language subcategories for Category:Entries with Pronunciation n headers. Latin, Aramaic, Lithuanian, Bulgarian seem to have a lot of these, but it will take quite some time (in principle there seems no guarantee it would ever be complete) before the template fully populates the categories. I think we will find that there are some languages that could easily eliminate such headers in almost all cases, eg, English, but there are some languages for which it is very difficult to see any good way to dispense with them, eg, Latin. DCDuring TALK 17:40, 11 February 2014 (UTC)[reply]
- There are now 27 language categories that probably include all of the entries marked by AF and its successors as having Pronunciation n headers. The categories seem close to fully populated. The largest is Category:la:Entries with Pronunciation n headers. (I know: It should be Category:Latin entries with Pronunciation n headers.) Three (~10%) of the languages (Latin, Bulgarian, and Aramaic) have 90% of the entries. DCDuring TALK 02:22, 13 February 2014 (UTC)[reply]
  - There are now categories for 32 languages, 2 of which are now empty. It would not be hard to empty the category for English (2 members) and 20+ of the others. If the headings are indispensable for Bulgarian, Latin, Aramaic, etc, so be it, but the Pronunciation n headers do not seem essential for most languages. The existence of the headers seems to be a license to use them even when they are not appropriate, as I found for 2 Malagasy entries that had Etymology 1 and Etymology 2 headers, but also had Pronunciation 1 and Pronunciation 2 headers. DCDuring TALK 23:50, 19 February 2014 (UTC)[reply]

Proposal to change how translation checks and requests are formatted

Discussion moved from Wiktionary:Grease pit/2014/January#trreq breaks translation adding with Conrad Irwin's tools.

This problem has been up for quite a while. Whenever there is a {{trreq}} entry in a translations box right before the language one attempts to add then User talk:Conrad.Irwin/editor.js will fail with some error message like Could not find translation entry for 'iso:blabla'. Please reformat. Maybe someone knowledgeable could try to fix these problems. Matthias Buchmeier (talk) 22:38, 29 January 2014 (UTC)[reply]

In the past, I suggested that we could change how {{ttbc}} and {{trreq}} behave, so that they don't add the language name anymore, but are placed after it. {{ttbc}} would become a "real" translation template, and could be replaced with {{t}} when the translation is checked. So like this:

Dutch: {{ttbc|nl|vertaling|f}}
French: {{t|fr|traduction|f}}
German: {{trreq|de}}

That would solve a lot of complexity in these scripts when it comes to sorting and finding out where to place things. Would that be an option? —CodeCat 22:51, 29 January 2014 (UTC)[reply]

That sounds like a good solution. Is there anyone with a bot that can handle the conversion of the translation table lines? Changing the {{trreq}} template to show please add this translation if you can instead of Language: please add this translation if you can should be very easy. Matthias Buchmeier (talk) 23:27, 29 January 2014 (UTC)[reply]

Would the categorization remain the same? DCDuring TALK 23:51, 29 January 2014 (UTC)[reply]

Why not? Matthias Buchmeier (talk) 00:01, 30 January 2014 (UTC)[reply]

I don't think we could keep using the same names, because we can't break our existing entries. Then again, there's no reason to keep using the same names anyway. We could use {{t-check}} for translations to be checked, and something like {{rft}}, {{t-needed}} or similar for requests. —CodeCat 00:09, 30 January 2014 (UTC)[reply]

We could use new template names, and then once all entries are converted redirect the old templates to the new ones. That way people still using the old templates won't break anything. Matthias Buchmeier (talk) 00:40, 30 January 2014 (UTC)[reply]

They would break things, if they used the new templates (or any redirects to them) the same way as the old ones were used. They're not compatible. —CodeCat 00:42, 30 January 2014 (UTC)[reply]

I guess you're right. Matthias Buchmeier (talk) 00:53, 30 January 2014 (UTC)[reply]

I've just created {{t-needed}} which should hopefully work as expected. We should probably start to manually convert a couple of entries to see if it breaks anything (bots etc.) and if it's already compatible with Conrad Irwin's tools. Matthias Buchmeier (talk)

Don't you think more people should have a chance to voice ideas and opinions? —CodeCat 01:21, 30 January 2014 (UTC)[reply]

Of course this is only a check to see if this solution might work. AFAIK you cannot test it with non-mainspace translation tables as Conrad Irwin's tools would detect it and refuse to work. Matthias Buchmeier (talk) 01:33, 30 January 2014 (UTC)[reply]

Ok, I created {{t-check}} as well. I'm not sure if it looks nice right now, so please feel free to make or suggest improvements to it. —CodeCat 02:40, 30 January 2014 (UTC)[reply]

Should we move this to the BP? —CodeCat 02:48, 30 January 2014 (UTC)[reply]
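As a rough sketch of what a bot conversion of translation-table lines might look like, the following rewrites old-style lines into the proposed layout, where the language name comes first and the template follows it. Several things here are assumptions rather than facts from this discussion: the exact shape of the old lines (`* {{trreq|de}}` and `* {{ttbc|nl}}: {{t|nl|...}}`), the leading `* ` bullet, and the small `LANGUAGE_NAMES` table, which stands in for whatever language data a real bot would consult.

```python
import re

# Hypothetical lookup table; a real bot would use Wiktionary's own language data.
LANGUAGE_NAMES = {"nl": "Dutch", "de": "German", "fr": "French"}

# Assumed old formats: "* {{trreq|de}}" and "* {{ttbc|nl}}: {{t|nl|vertaling|f}}".
TRREQ_LINE = re.compile(r"^\* \{\{trreq\|([^|}]+)\}\}\s*$")
TTBC_LINE = re.compile(r"^\* \{\{ttbc\|([^|}]+)\}\}: \{\{t\|([^}]*)\}\}\s*$")

def convert_line(line):
    """Rewrite one translation-table line; return it unchanged if unrecognised."""
    m = TRREQ_LINE.match(line)
    if m and m.group(1) in LANGUAGE_NAMES:
        code = m.group(1)
        return "* %s: {{t-needed|%s}}" % (LANGUAGE_NAMES[code], code)
    m = TTBC_LINE.match(line)
    if m and m.group(1) in LANGUAGE_NAMES:
        code, params = m.group(1), m.group(2)
        # Only convert when the inner {{t}} uses the same language code.
        if params.split("|")[0] == code:
            return "* %s: {{t-check|%s}}" % (LANGUAGE_NAMES[code], params)
    return line

for old in ["* {{ttbc|nl}}: {{t|nl|vertaling|f}}",
            "* French: {{t|fr|traduction|f}}",
            "* {{trreq|de}}"]:
    print(convert_line(old))
```

Lines that do not match either assumed pattern are passed through untouched, so anything unusual would be left for manual conversion.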

You should. It looks like a net improvement, but some might not see it that way and, in any event, folks will need to be notified. Also, maybe someone will have a good idea. Can the number of keystrokes be reduced? DCDuring TALK 03:32, 30 January 2014 (UTC)[reply]

What does the number of keystrokes have to do with it? —CodeCat 03:47, 30 January 2014 (UTC)[reply]

The less we have to type when adding a new entry, trreq, etc, the easier it is to add the entry, trreq, etc, and the quicker we can do it and move on to adding or improving another entry. - -sche (discuss) 09:42, 30 January 2014 (UTC)[reply]

Do you think the names {{t-check}} and {{t-needed}} are too long? If we want to decide a name we better do it now before they start being used. —CodeCat 14:03, 30 January 2014 (UTC)[reply]

{{tchk}} and {{tneed}} would be the clearest short versions. Chuck Entz (talk) 14:35, 30 January 2014 (UTC)[reply]

I think {{t-check}} / {{t-need}} (no need for the "ed", is there?) are short enough, but we could always have redirects that were even shorter, couldn't we? (Or would redirects not work?) After they've been out of use for long enough that we're sure no-one's still using them, even ttbc and trreq could be made into redirects to the new templates. - -sche (discuss) 16:20, 30 January 2014 (UTC)[reply]

I like the suggestion of changing things like "{{ttbc|xpy}}" to "Buyeo: {{trreq|xpy}}". It would make it slightly easier to fulfil trreqs (you just replace the trreq with the translation, rather than having to remember which language has the code 'xpy'), and if it would fix the bug that stops Conrad's trans-adder from being able to add translations next to trreqs, that's even better. - -sche (discuss) 09:42, 30 January 2014 (UTC)[reply]

Support. — Ungoliant ^(falai) 14:26, 30 January 2014 (UTC)[reply]

Support. Chuck Entz (talk) 14:35, 30 January 2014 (UTC)[reply]

Support. Please notify me when there is agreement to migrate to the new templates. (I remember "{{t?}}" being proposed some time ago in a discussion with a certain someone.) Keφr 16:04, 30 January 2014 (UTC)[reply]

Support. --Wiki Tiki 89 18:05, 30 January 2014 (UTC)[reply]

It looks like there is widespread support for this. But we can't make any changes until WT:EDIT is updated to handle both the old and the new format. I don't feel comfortable making big edits to that script, it's rather complex and things didn't go too well the last time I tried. Could someone else please do it? —CodeCat 16:30, 30 January 2014 (UTC)[reply]

Template:m is orphaned, convert Template:term and Template:term/t to it?

We wanted to orphan these to make room for a short and convenient name to use instead of "term". The "m" stands for "mention", and contrasts with "l" which stands for "list" or "link". Now that it's orphaned, we can do this. Would it be ok for me to move/redirect {{term/t}} to {{m}}, and use a bot to convert all instances of {{term/t}} and {{term}} to {{m}}? —CodeCat 22:01, 30 January 2014 (UTC)[reply]
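A minimal sketch of the kind of conversion being proposed, under stated assumptions: {{m}} takes the same parameters as {{term/t}} (language code first), {{term}} carries its language in a named lang= parameter, and only flat calls are handled (no nested templates and no [[a|b]] links inside the call). This is an illustration, not the bot's actual code.

```python
import re

# Only flat calls are matched: no nested templates and no [[...]] links inside.
FLAT_CALL = re.compile(r"\{\{(term/t|term)\|([^{}\[\]]*)\}\}")

def _convert(match):
    name, body = match.group(1), match.group(2)
    if name == "term/t":
        # Assumed to be a pure rename: the parameters stay as they are.
        return "{{m|" + body + "}}"
    # For {{term}}, pull out lang= and move it to the first positional slot.
    lang = None
    rest = []
    for param in body.split("|"):
        if param.startswith("lang="):
            lang = param[len("lang="):]
        else:
            rest.append(param)
    if not lang:
        return match.group(0)  # no language given: leave for manual review
    return "{{m|" + "|".join([lang] + rest) + "}}"

def convert_mentions(wikitext):
    """Convert flat {{term/t}} and {{term}} calls in wikitext to {{m}}."""
    return FLAT_CALL.sub(_convert, wikitext)

print(convert_mentions("from {{term/t|fr|être||to be}}"))
print(convert_mentions("from {{term|jeer|jeered|lang=en}}"))
```

Calls to {{term}} without lang= are deliberately left alone, since the sketch has no way to guess the language.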

Support. --Vahag (talk) 22:08, 30 January 2014 (UTC)[reply]

Support. --Wiki Tiki 89 22:15, 30 January 2014 (UTC)[reply]

Support. --Z 22:24, 30 January 2014 (UTC)[reply]

Support. — Ungoliant ^(falai) 12:30, 31 January 2014 (UTC)[reply]

Oppose, without this going through a proper full vote; voting in Beer parlour is more exclusive, since those who do not monitor Beer parlour on a daily basis thereby get excluded. Needless to say, orphaning "m" was illegal, AFAICS, in violation of WT:BOT. I have not chosen CodeCat to be the benevolent free-wheeling dictator of English Wiktionary, but at least we can guess who the supporters of this dictator are. --Dan Polansky (talk) 08:30, 2 February 2014 (UTC)[reply]

In the RFDO discussion, no one objected to orphaning the template. There were two objections to deleting the template, but they were only temporary (both essentially saying that it should not be deleted until people stop using it). --Wiki Tiki 89 08:52, 2 February 2014 (UTC)[reply]

Well, CodeCat (talk • contribs) has already begun bot-orphaning before the RFDO discussion, as shown in your link. --kc_kennylau (talk) 08:54, 2 February 2014 (UTC)[reply]

Even if CodeCat started before the RFDO discussion, there were no objections to it before or after. Also, the vast majority of the orphaning seems to have been going on just now, well after the RFDO discussion. As far as I know, everyone agrees that {{g}} is a better template. --Wiki Tiki 89 08:59, 2 February 2014 (UTC)[reply]

In Wiktionary:RFDO#Template:m (to be archived at Template talk:m), there is e.g. this vote: "Keep until people demonstrably have quit using it.—msh210℠ (talk) 08:54, 13 September 2013 (UTC)". This I do not read as "use bots to orphan and then delete"; "people demonstrably have quit using it" is not "all uses were removed by an unauthorized run of a bot". So no, I cannot confirm your summary of the vote. --Dan Polansky (talk) 09:02, 2 February 2014 (UTC)[reply]

Well, people will not quit using a template that they see in use everywhere. I took msh210's comment to mean that he agrees with eventually deleting the template and with taking actions that will get people to stop using it (such as orphaning). Orphaning templates is common practice when a new template is created to supersede the old one, even if there is no intention to delete the old one. --Wiki Tiki 89 09:09, 2 February 2014 (UTC)[reply]

Why don't we summon him here to explain? Hey User:msh210! Come here now! Be quick! --kc_kennylau (talk) 09:12, 2 February 2014 (UTC)[reply]

Thanks for the page. I'd not have read this otherwise — or not for a while. What I meant was, check edits for a few months, and if people haven't been adding uses of the template, then delete it. An easy way to do that is to have a bot replace all uses of it once, then sit back and watch (without replacing it at all, or with logging every time it's replaced) for a few months.—msh210℠ (talk) 03:32, 4 February 2014 (UTC)[reply]

The very same discussion contained "Orphan and then delete" posted by another user above the post by msh210. If this is what msh210 meant, he could have posted "Orphan and then delete". There is no need to apply the fraudulent method of "reinterpreting" of what people say. --Dan Polansky (talk) 09:17, 2 February 2014 (UTC)[reply]

I'm not re-interpreting, I'm just interpreting. That is how I understood his words. Plus "orphan and delete" is not (by my interpretation) what he meant. By my interpretation, he meant "orphan, but keep until people have demonstrably stopped using it". --Wiki Tiki 89 09:20, 2 February 2014 (UTC)[reply]

Comment: Having an opinion quite the opposite of Dan's, I oppose holding a "full vote" on this. Holding a full vote on the question of what to name a template / where to store a template's functionality (since the only thing changing here is whether we store "term-link" functionality in {{term}} or {{m}}, and whether "masculine gender" functionality is in {{g|m}} or {{m}}) would be a new level of (as Dick Laurent once put it) "bureaucratic masturbation" and micromanagement. It would also achieve the opposite of Dan's stated goal of being more inclusive. The main WT:BP page is on the watchlists of 464 users, the main WT:VOTE page is on the watchlists of only 92 users. The December, January, and February BP subpages are on the watchlists of 458, 194 and 189 users, respectively, and they have been edited by 46, 53, and 7–8 distinct users, despite the February subpage being only two days old. The pages of the first three votes which are still listed on WT:VOTE, viz. "Obsolete forms heading", "Proto-Altaic", "Jyutping", are on the watchlists of only 9, 15, and 10 people, respectively, and those votes have been edited by only 10, 15, and 13 distinct users. (Someone could check, but I imagine all or almost all of the users who edited the VOTE pages have also edited BP pages.) Voting on WT:VOTE is also more exclusive than discussion in the BP due to the fact that anyone can participate in BP discussions, whereas certain categories of users (such as those with fewer than 50 edits) are not allowed to vote in WT:VOTE-votes. - -sche (discuss) 09:34, 2 February 2014 (UTC)[reply]

These are spurious arguments. Watching WT:VOTE with technical means is needless, since there are usually only few items on it and the items sit there at least for a month. The fact that someone has Beer parlour in the watchlist does not guarantee that the person actually reads things posted to Beer parlour; there are too many for that. And watching individual vote pages is not worthwhile for many editors: they post and go, with no need to watch. Putting WT:BP on the watchlist is something you do once and are done with it forever, so this having been done by many people is no evidence of anything. Moreover, I have not seen significant number of vote-ineligible people discussing in Beer parlour. Furthermore, there are many votes in which the number of participants dramatically exceeds the numbers seen in Beer parlour discussion. On a minor note, the use of figurative vulgar terms to advance an argument is as much of a logical fallacy as anything can be. --Dan Polansky (talk) 09:45, 2 February 2014 (UTC)[reply]

If we had to hold a WT:VOTE for every template that went through WT:RFDO, then what is the point of WT:RFDO? --Wiki Tiki 89 09:51, 2 February 2014 (UTC)[reply]

This thread is on a proposal to switch the long-established {{term}} to {{m}}. This is the sort of change that needs a vote, since it impacts a huge number of entries; the entry markup is important, since the entry markup is the user interface, and the entry markup is consumed by Wiktionary reusers and should not be changed often and on a whim. I don't understand why you are switching the subject. -Dan Polansky (talk) 09:57, 2 February 2014 (UTC)[reply]

As of now, I don't think we should delete {{term}}, and it won't be deleted without a proper discussion at WT:RFDO. The discussion here is about whether we should reuse {{m}} as a shorter name for {{term/t}}. Yes, the intent is that it would eventually replace {{term}} and {{term/t}}, but that would be a slow process that may come later, if at all, and not without more discussion. Anyway, this is not a policy, and since there is widespread agreement, I don't think there needs to be a vote. --Wiki Tiki 89 10:22, 2 February 2014 (UTC)[reply]

This is the title of the thread: "Template:m is orphaned, convert Template:term and Template:term/t to it?". It is incompatible with what you just said; it directly follows from it that {{term}} (and not just {{term/t}}) should be converted to {{m}}. I find your repeated creative reinterpretation of what is being written truly frustrating. --Dan Polansky (talk) 10:29, 2 February 2014 (UTC)[reply]

Are you just doing this on purpose, Dan? You're doing the very same thing you accuse Wikitiki of doing, because now you're creatively re-interpreting my words. Has it ever occurred to you that you could just ask what I intended by what I wrote in the discussion header? To clarify that... no, I am not deleting {{term}} or {{term/t}}; I'm only converting any uses to {{m}} (well, I will; I haven't actually started yet because this discussion is only a few days old). That's a huge difference, and I don't understand why you missed that. I think you're confused between the decision to convert {{m}}, {{f}} and such to {{g}} (which has broad support and has had for a while, as shown in WT:NFE), and the decision to convert {{term}} and {{term/t}} to {{m}}, which is actually the one being made here in this discussion. —CodeCat 14:15, 2 February 2014 (UTC)[reply]

Just one example: compare Wiktionary:Beer_parlour/2013/November#Jyutping_syllable and Wiktionary:Votes/2013-11/Jyutping for who participated and what some of them wanted to do without a vote. Also note how long it took on the vote before some people posted: the vote started on 11 December 2013 and three people posted after 26 December 2013. --Dan Polansky (talk) 10:21, 2 February 2014 (UTC)[reply]

Oppose making Wiktionary infinitesimally easier for old farts, at the expense of new editors. Changing all self-explanatory names of common templates into single letters makes our morass of inscrutable code even worse. Cripes, like it’s hard to type “term”!

The rest of the discussion about streamlining the template is good, however. —Michael Z. 2014-02-02 17:38 z

"... at the expense of new editors" Any editor is supposed to be familiar with the basic templates of the wiki ({{l}}, {{term}}, and several others), so it's ok not to keep their names self-explanatory. --Z 17:10, 17 March 2014 (UTC)[reply]

Oppose per Mzajac; and strongly oppose and protest if people have not demonstrably quit using {{m}} for its old use.—msh210℠ (talk) 03:36, 4 February 2014 (UTC)[reply]

Parameter to use for alternative display of links

This discussion was split off from the previous one

If this passes, can we get rid of the third parameter and use alt= instead? Automatic removal of lexicographic diacritics obsoleted its most common use. — Ungoliant ^(falai) 12:42, 31 January 2014 (UTC)[reply]

Why not write {{m|lang|[[lemma|alt]]}}? And by the way, how can I tag mentions without linking to anything? Keφr 13:01, 31 January 2014 (UTC)[reply]

By passing it to the third parameter: {{term/t|lang||unlinked_word}} --Z 13:25, 31 January 2014 (UTC)[reply]

Yes, but not if we get rid of the third parameter. --Wiki Tiki 89 16:52, 31 January 2014 (UTC)[reply]

Of course. --Z 17:11, 31 January 2014 (UTC)[reply]

Which is one reason I'd prefer to keep the third parameter. It is also used to allow e.g. jeered to link to the lemma jeer. What would be the advantage of dropping it? Leaving it blank when you don't need it only requires you to type one extra |, which is shorter than typing alt= whenever you do need to set an alternative display form. - -sche (discuss) 19:42, 31 January 2014 (UTC)[reply]

That can also be done with {{m|en|[[jeer|jeered]]}}. The advantage is you wouldn't have a blank third parameter every time you wanted to add a gloss: {{m|fr|être||to be}}. --Wiki Tiki 89 19:49, 31 January 2014 (UTC)[reply]

Which is especially annoying when you don't know the script and have to count pipes, {{term|tr=āfrīnāmi|||gloss|lang=ae}}. I don't like counting pipes. --Vahag (talk) 20:16, 31 January 2014 (UTC)[reply]
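To make the pipe-counting problem concrete, here is a small sketch that splits a flat template call into positional and named parameters. It is a simplification of how MediaWiki handles parameters (it ignores nested templates and links), and the function name is made up for illustration.

```python
def parse_flat_call(call):
    """Split a flat template call into (name, positional list, named dict)."""
    inner = call.strip()[2:-2]          # drop the surrounding {{ and }}
    name, *params = inner.split("|")
    positional, named = [], {}
    for param in params:
        key, sep, value = param.partition("=")
        if sep:
            # Anything containing "=" is treated as a named parameter.
            named[key.strip()] = value.strip()
        else:
            positional.append(param)
    return name, positional, named

print(parse_flat_call("{{m|fr|être||to be}}"))
print(parse_flat_call("{{term|tr=āfrīnāmi|||gloss|lang=ae}}"))
```

In the second call the named tr= parameter does not take up a numbered slot, so the three pipes before "gloss" leave the first two positional parameters empty and put the gloss in the third. That is the counting a reader has to do by eye.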

I think the only widely-used linking template that uses alt= instead is {{t}}. I proposed changing that a while ago but there wasn't much interest for it. —CodeCat 19:47, 31 January 2014 (UTC)[reply]

A rare string could be used as a placeholder, to explicitly mark an unused field, and make it easier to count pipes: {{term|tr=āfrīnāmi|~~|~~|gloss|lang=ae}}. —Michael Z. 2014-01-31 20:59 z

Or we could have the gloss be a named parameter {{m|ae|tr=āfrīnāmi|g=I praise}}. --Wiki Tiki 89 22:56, 31 January 2014 (UTC)[reply]

The gloss= parameter already exists, actually. It's a leftover from when {{l}} didn't allow the fourth parameter. We could also introduce t= on the model of {{compound}} and {{suffix}}. —CodeCat 22:58, 31 January 2014 (UTC)[reply]

Yeah, but gloss= is too long. --Wiki Tiki 89 23:00, 31 January 2014 (UTC)[reply]

I support using |t= instead of |gloss= or the fourth parameter, and keeping the third parameter for alternative displaying (and also mentioning terms without linking to anything). Let's keep |g= for genders/numbers only. --Z 11:13, 1 February 2014 (UTC)[reply]

Actually |t= (for "translation") and |tr= (for "transliteration"/"transcription") is not a wise naming either, especially when we're going to use them at the same time, though we are used to it. Maybe |gl= is a good alternative? --Z 11:34, 1 February 2014 (UTC)[reply]

How about getting rid of the third parameter, and adding |text= for the rare case I mentioned above? Keφr 15:02, 1 February 2014 (UTC)[reply]

Or even |= (i.e. empty string as the parameter name). And when it is present, shift the other numbered parameters by one. So to avoid any linking, you can just add an equals sign. Keφr 17:20, 1 February 2014 (UTC)[reply]

Using |= is not a good idea, people may assume |=word as an input mistake by editors or something. "text=" is a bit long, BTW. --Z 18:18, 1 February 2014 (UTC)[reply]

What about {{term/t|en|[[|word]]}}? (it doesn't work now, it needs a minor change to language_link) --Z 18:31, 1 February 2014 (UTC)[reply]

Now that is what I would probably see as a markup error. Also, [[|word]] works differently in plain, template-less wiki markup. (Whereas |= is completely orthogonal to it.) Keφr 18:42, 1 February 2014 (UTC)[reply]

Practically anyone can guess what [[|word]] does (there's no target page, so there would be no link), but nobody except you can say what the hell |=word is supposed to do.

I've changed my mind: I think we should get rid of the third parameter. One advantage is that we would consistently specify the alternative display via pipe and square brackets everywhere. It's a bad idea to have more than one unnamed optional parameter (which would cause problems like the current problem for alternative display and gloss in {{term}}). People mostly talked about {{term}}, but we should keep linking templates similar to each other as far as we can. It's true that |gloss= is used more frequently than the parameter for alt in Template:term, but note that it's not used at all in {{t}}, and is rarely used in {{l}}. In {{l}}, it makes more sense to keep the third parameter for alt. But now that alt is not used that much, it doesn't make much difference, so I think we should get rid of any parameter for alt in all linking templates and use [[...|...]] consistently everywhere, and keep the only unnamed optional parameter for gloss (in {{t}}, however, there's no need for gloss, so let's keep using it for genders).

Regarding mentioning terms without linking to anything, I support the idea of introducing a new parameter, |text= or |tx=. --Z 19:31, 1 February 2014 (UTC)[reply]

[[|word]] expands to [[word|word]] during the pre-save transform, even when inside template calls (reverse pipe trick).

The mnemonic for |= could be "equal ≈ identical ≈ verbatim". It might be surprising to see at first, but proper documentation should address any confusion. No strong opinion on the matter, though. I can agree to |text=. Keφr 21:44, 12 February 2014 (UTC)[reply]

The problem is that, in my experience, most uses of {{term/t}} have a gloss and few have an alt. display, leading to extra typing and accidents. Another possibility is to just switch the gloss and alt. display parameters around, so the third is for the gloss and the fourth for the alt. display, but requiring a named parameter for the gloss will be detrimental, as it is frequently used. — Ungoliant ^(falai) 13:19, 1 February 2014 (UTC)[reply]

I agree, gloss is used too frequently and preferably should be the third parameter. --Vahag (talk) 16:54, 1 February 2014 (UTC)[reply]

English vocabulary-knowledge game

You can test your knowledge of English words with this neat web game: http://vocabulary.ugent.be/wordtest/test . It asks for personal info because it's part of some scientific research, but it gives you the option of leaving those fields blank. You then have to guess whether a series of words are real or made-up. I just learned a bunch of words through it, and learned that yorn is not a word. - -sche (discuss) 04:50, 31 January 2014 (UTC)[reply]

Just to spite it, need to create yorn and get it to pass RFV. --Wiki Tiki 89 04:56, 31 January 2014 (UTC)[reply]

Don't think I hadn't thought about it. ;) But aside from yorn, their non-words are impressively unattested. Even things like rudicose that seem like they should have been used in some language at some point are Googlenopes. - -sche (discuss) 05:03, 31 January 2014 (UTC)[reply]

Apparently cowchop isn't a word either. --Wiki Tiki 89 05:04, 31 January 2014 (UTC)[reply]

I got 34%-0%=34% :) "This is the level of a high proficiency second language speaker." :D --kc_kennylau (talk) 06:05, 31 January 2014 (UTC)[reply]

I tried it twice. The first time, I said no to all of the nonwords, but also to one of the words (bibliopegy), for a score of 97%. The second time I said no to another word (capelin), and yes to what they said was a nonword: fobber. It's certainly attested to our standards, and it also can be derived as the agent noun of the verb fob using normal English morphology. I reported it as a problem. They gave me 94%, which would have been 97% if they had accepted fobber as a word. Chuck Entz (talk) 07:46, 31 January 2014 (UTC)[reply]

I got 71% ("a high level for a native speaker"), partly because I accidentally hit "yes" for ilvinably though I intended to hit "no", and partly because they consider headbound to be a nonword, but it's a technical term in linguistics, so I reported it. There were 13 real words I didn't know, though some I hesitated on (it took me 4.7 seconds to decide against electrophorus, though I would have decided in favor of electrophorous). —Aɴɢʀ (talk) 08:45, 31 January 2014 (UTC)[reply]

PS, for anyone who speaks Dutch, or doesn't ;) lol, the same university offers a Dutch edition of the game: http://woordentest.ugent.be/ . - -sche (discuss) 09:59, 31 January 2014 (UTC)[reply]

"You said yes to 96% of the existing words. You said yes to 0% of the nonwords." I missed trunkfish, malison, and estipulate. I suspected that estipulate was a real word but couldn't honestly say I knew it. Equinox ◑ 10:09, 31 January 2014 (UTC)[reply]

Did it a second time: different words, but exactly the same score! Equinox ◑ 10:16, 31 January 2014 (UTC)[reply]

I knew 83% of existing words, and said yes to 0% of nonwords. —CodeCa t 15:34, 31 January 2014 (UTC)[reply]

I did it three times, each time with 0% nonwords, but only up to 60% yes (I played conservatively). They say I'm something close to high level for a native speaker, but I think the score can mostly be explained because some words are either directly from (old) French or scientific, in which case they are often similar in various languages (and also, English is the current universal language of scientific papers). I'm curious what my score would be if I tried to actually guess which words are English. Dakdada (talk) 15:49, 31 January 2014 (UTC)[reply]

Request that `{{deftempboiler}}` be edited

Request edit: line 20: "{{{3|}}}sc" → "{{{3|}}}|sc"

--kc_kennylau (talk) 06:53, 31 January 2014 (UTC)[reply]

Done --Wiki Tiki 89 07:08, 31 January 2014 (UTC)[reply]

And next time do this at the WT:GP. --Wiki Tiki 89 07:09, 31 January 2014 (UTC)[reply]

Infinite-duration blocks of IP addresses

According to the list of active blocks (//en.wiktionary.org/w/index.php?title=Special:BlockList&wpTarget=&wpOptions[0]=userblocks&wpOptions[1]=tempblocks), several hundred IPs are subject to permanent (infinite-duration) blocks. In general, permanently blocking IPs is a bad idea, because IPs are reassigned from time to time. Some of the active permablocks are of suspected open proxies, which we should perhaps be more cautious about unblocking, but others are of IPs who vandalized a few entries back in 2006 or earlier. Those IPs probably don't even belong to the people who were blocked anymore. I propose that we lift any blocks of IPs from before January 1, 2010, and reduce the length of any blocks which were issued after that date to some finite length of time, such as 4 years. If it's possible, I suggest we do this by bot/script (using an admin account, obv), since we are talking about several hundred blocks. If that's not possible, well, if each of us processed two blocks a day, and five of us were active on any given day, it'd only take a matter of months to go through them all. - -sche (discuss) 07:34, 31 January 2014 (UTC)[reply]

Support --Wiki Tiki 89 07:37, 31 January 2014 (UTC)[reply]

Support --kc_kennylau (talk) 07:39, 31 January 2014 (UTC)[reply]

Support. Chuck Entz (talk) 07:49, 31 January 2014 (UTC)[reply]

Support. I have posted a list of indefinite IP blocks whose reasons do not contain "tor" or "proxy" at User:Kephir/blocklist. Keφr 08:49, 31 January 2014 (UTC)[reply]

Support This was also a question raised on other wikis and the general reasons for support for each and every one of them was that 1) permanent blocks should almost never be used against IPs, 2) in the event they do warrant permanent blocks such as open proxies the Meta stewards already have them permanently globally blocked. TeleComNasSprVen (talk) 01:46, 2 February 2014 (UTC)[reply]

Comment(s): There are certain blocks of IPs which are not, in fact, available to anyone. Incoming IPs supposedly using these blocks are sometimes known as bogons, and might without harm remain blocked. Note also the number of globally blocked IPs; these often overlap with some of the indef-blocked IPs on the site, such as TOR exit nodes, but the WMF makes exceptions for certain users who have justified to the stewards their use of such IPs. For such users the local blocks prevent their participation in the project even though they have been cleared by community Stewards to do so. In other words, a more nuanced assessment should probably be considered if reasonably possible, with tor blocks being especially targeted for unblocking. - Amgine/^t·e 07:14, 2 February 2014 (UTC)[reply]

Follow-up: I've lifted all of the 2005, 2007 and 2008 blocks, and started lifting the 2006 blocks. One of the 2005 blocks was of an "invalid" IP address, perhaps a bogon of the sort described by Amgine: 663.19.150.52. - -sche (discuss) 20:46, 12 February 2014 (UTC)[reply]

663. ??? That's not just an invalid IP, it's not even possible to put that into an IP packet header. The highest value of each byte is 255. I really have no idea how that got through. —CodeCa t 21:01, 12 February 2014 (UTC)[reply]
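
For illustration, the octet limit mentioned above can be checked with Python's standard `ipaddress` module. This is only a minimal sketch; the helper name `is_valid_ipv4` is hypothetical, and it says nothing about how the block itself was originally entered.

```python
import ipaddress


def is_valid_ipv4(text: str) -> bool:
    """Return True if text parses as a dotted-quad IPv4 address."""
    try:
        ipaddress.IPv4Address(text)
    except ipaddress.AddressValueError:
        # Raised, for example, when an octet is greater than 255.
        return False
    return True


print(is_valid_ipv4("663.19.150.52"))  # False: 663 does not fit in one byte
print(is_valid_ipv4("66.19.150.52"))   # True
```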

Clearly it's the residual power of evil -- 663, nearby neighbor of the Beast. :-P ‑‑ Eiríkr Útlendi │ Tala við mig 21:08, 12 February 2014 (UTC)[reply]
The-guy-who-lives-over-the-road-and-up-the-street of the Beast. --Catsidhe (verba, facta) 21:12, 12 February 2014 (UTC)[reply]

Is it possible someone could have created an account with a fake IP for a username? --Wiki Tiki 89 22:27, 12 February 2014 (UTC)[reply]

Lies. Keφr 22:59, 12 February 2014 (UTC)[reply]
Very funny. --Wiki Tiki 89 23:20, 12 February 2014 (UTC)[reply]

I have now lifted all blocks from 2010 or earlier, and shortened all post-2010 blocks to 4 years from the year of issue (so, 2011 blocks got shortened to 1 year left, 2012 to 2 years left, etc). I did not unblock IPs with "Tor", "Proxy" or "OP" as the block reason, nor IPs that were also subject to global bans/blocks/locks. - -sche (discuss) 06:50, 2 March 2014 (UTC)[reply]
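
The rule described above can be sketched in Python as follows. This is a hedged illustration, not the procedure actually used: the function name `plan_block`, the whole-word matching of block reasons, and the `globally_blocked` flag are all assumptions made for the example.

```python
import re
from datetime import date
from typing import Optional, Tuple

# Assumption: a block is treated as a proxy block if its reason contains
# one of these as a whole word (case-insensitive).
EXEMPT_WORDS = {"tor", "proxy", "op"}


def plan_block(issued: date, reason: str,
               globally_blocked: bool = False) -> Tuple[str, Optional[date]]:
    """Return (action, new_expiry) for one indefinite IP block."""
    words = set(re.findall(r"[a-z]+", reason.lower()))
    if globally_blocked or words & EXEMPT_WORDS:
        return ("keep", None)
    if issued.year <= 2010:
        return ("unblock", None)
    try:
        expiry = issued.replace(year=issued.year + 4)
    except ValueError:
        # Issued on 29 February and the target year is not a leap year.
        expiry = issued.replace(year=issued.year + 4, day=28)
    return ("shorten", expiry)


print(plan_block(date(2006, 5, 1), "vandalism"))
print(plan_block(date(2011, 3, 2), "vandalism"))
print(plan_block(date(2008, 1, 1), "Tor exit node"))
```

Under these assumptions a 2011 block expires in 2015, which matches the "1 year left" figure given for 2011 blocks. Whole-word matching is used so that a reason such as "stop vandalizing" is not mistaken for an "OP" block; a reason worded as "open proxies" would need extra handling.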

Globally blocked IP addresses are typically TORs/OPs, so keeping blocks on them on Wiktionary is rather redundant. Also, even those addresses get reassigned elsewhere, so the blocks might also have collateral damage. TeleComNasSprVen (talk) 06:03, 17 March 2014 (UTC)[reply]

If someone wants to write a script that'll remove our local blocks of globally-blocked IPs, I don't mind. (There are too many to unblock them all by hand.) - -sche (discuss) 17:37, 17 March 2014 (UTC)[reply]

If there's still anybody watching this page, I managed to create a Python script that can go through and remove these IP blocks. It takes two parameters, "checkgbl" and "pre2010", as it was made specifically for Wiktionary. There are still existing blocks from before 2010, so I'd like to know if we ought to remove them all, or only the ones that are also subject to global blocks ("checkgbl"). If an admin knows a fair bit of Python and has Pywikipediabot installed on their machine, I can email the script to you and you can double check it for errors before running it on your account. Alternatively, I could request adminship to run it from my account and have another admin check the first few unblocks I make before doing the rest. TeleComNasSprVen (talk) 20:14, 21 May 2014 (UTC)[reply]

Bot permission request

I, on behalf of my bot Kennybot (talk • contribs), request permission to embark on the cleanup of the category Category:term cleanup. My bot will add the language code back to the template usages lacking a language parameter. My bot will not change anything in the etymology section, to ensure that the language code is correct. If there are any further limits, feel free to state them. I will try to impose as many limits as possible to avoid adding a wrong language code. --kc_kennylau (talk) 15:24, 31 January 2014 (UTC)[reply]

How is your bot going to be able to tell what languages the terms are in? —CodeCa t 15:28, 31 January 2014 (UTC)[reply]

My bot will search for L2 headers and record their position. Then, it will check the position of the template usage to see if it is just after one section. My bot will also search for L3 headers to make sure that the template usage is not in the etymology section. --kc_kennylau (talk) 16:09, 31 January 2014 (UTC)[reply]
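
The heuristic described above might look roughly like the following Python sketch. It is not Kennybot's code: the function name `guess_term_languages` and both regular expressions are assumptions, and the result is only a list of guesses (by language name, not code) that would still need review.

```python
import re
from typing import List, Tuple

# Assumed patterns: "==Language==" style headings, and {{term|...}} uses
# without nested templates.
HEADING = re.compile(r"^(={2,6})\s*(.+?)\s*\1\s*$")
TERM = re.compile(r"\{\{term\|([^{}]*)\}\}")


def guess_term_languages(wikitext: str) -> List[Tuple[str, str]]:
    """Pair each {{term}} lacking lang= with the enclosing L2 language name."""
    language = None  # title of the current L2 (language) header
    section = None   # title of the most recent L3-or-deeper header
    guesses = []
    for line in wikitext.splitlines():
        heading = HEADING.match(line)
        if heading:
            if len(heading.group(1)) == 2:
                language, section = heading.group(2), None
            else:
                section = heading.group(2)
            continue
        # Skip text before any language header and anything under Etymology.
        if language is None or (section or "").startswith("Etymology"):
            continue
        for match in TERM.finditer(line):
            params = match.group(1).split("|")
            if not any(p.strip().startswith("lang=") for p in params):
                guesses.append((language, match.group(0)))
    return guesses


sample = """==English==
===Etymology===
From {{term|foo}}.
===Noun===
# Alternative form of {{term|bar}}.
# See {{term|baz|lang=de}}.
"""
print(guess_term_languages(sample))  # [('English', '{{term|bar}}')]
```

Note that the sketch treats everything under an "Etymology 1"-style header as off limits until the next L3-or-deeper header appears, and it assumes the section's language is the term's language, which is exactly the assumption that would need checking.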

But the language header doesn't have to be related to the language in {{term}}. There's no guarantee that your method won't generate false positives. —CodeCa t 16:11, 31 January 2014 (UTC)[reply]

I cannot think of any template usage for other languages outside the etymology section. Please name me an example. Please also tell me if there is any way to solve it, no matter whether it will generate false negatives or not. --kc_kennylau (talk) 16:15, 31 January 2014 (UTC)[reply]

I don't have an example other than (obviously) etymologies, but that doesn't mean there aren't any. You're assuming that {{term}} is only used where you think it should be, but reality is often very different. You can verify this quite easily by doing a search for any uses of {{term}} in an English section that link to a term in a script that is not Latin. Clearly those uses can't be English, so that would give you some examples. You may be surprised. —CodeCa t 16:20, 31 January 2014 (UTC)[reply]

I wonder how I would scan through all {{term}} usages, as there will be many. Each page.get() requires at least 3 seconds. --kc_kennylau (talk) 16:44, 31 January 2014 (UTC)[reply]

My point is, you can't solve a problem with a bot until you have a clear idea of the problem you're dealing with. Any unknowns should be treated as part of the problem, you can't "assume them out of existence". —CodeCa t 16:50, 31 January 2014 (UTC)[reply]

Some entries link to foreign-language terms from See also sections (e.g. veer, but I know some English entries link to Russian and Ukrainian words this way, too). Some entries link from Usage notes, e.g. because a word is a loanword and the note is explaining how the borrower-language use of the term is different from the source-language's use. Until I cleaned it up just now, í húð og hár was an Icelandic term with usage notes that linked to English terms. Some entries even link to / mention foreign-language terms in their definitions, e.g. ἔμβολον. Indeed, most non-English entries which use {{term}} in their definitions without setting lang= probably mean en, not whatever language the linking entry is. In all of these cases, I would hope that lang= would have been set, but there is no way of knowing. Term cleanup is a complex task. - -sche (discuss) 19:23, 31 January 2014 (UTC)[reply]

Competition begins

Without further ado, I hereby announce the commencement of the new WT:FUN competition. Be sure to check in every 24 hours to see the scores and the new board. May the best Wiktionarian win. --Back on the list (talk) 16:00, 31 January 2014 (UTC)[reply]

Is the time in UTC? It may be a good idea to put a clock or something. Dakdada (talk) 16:41, 31 January 2014 (UTC)[reply]

True. I'll put a clock up there. --Back on the list (talk) 18:55, 1 February 2014 (UTC)[reply]