Wiktionary:Beer parlour


Welcome, all, to the Beer Parlour! This is the place where many a historic decision has been made and where important discussions are being held daily. If you have a question about fundamental Wiktionary aspects—that is, about policies, proposals and other community-wide features—please place it at the bottom of the list (click on Start a new discussion), and it will be considered. Please keep in mind the rules of discussion: remain civil, don't make personal attacks, don't change other people's posts, and sign your comments with four tildes (~~~~), which produces your name with timestamp. Also keep in mind the purpose of this page. There are various other discussion rooms which may serve the idea behind your questions better. Please take a look to see which is most appropriate.

Sometimes discussion identifies an issue as an idea for policy development or rewriting. Such discussions may be taken out of the Beer parlour to a relevant page, or a brand new page may be created. Usually, the active policy pages will be listed in one of the sections below. See also the policy development page and the votes page.

Questions and answers will not remain on this page indefinitely, as it would very soon become too long to be editable. After a period of time with no further activity (usually a couple of weeks), information will be moved to the archives. We make a point to preserve all discussions that were started here in the archives. However, talk that is clearly not intended for this page may be moved and will not end up in the archives. Enjoy the Beer parlour!



October 2014

What makes a single word idiomatic?

I think it would be nice if we took WT:CFI a bit more seriously. I mean, de facto there's no problem because nobody's forcing us to apply our own rules; there's no 'court of appeal' if there's a deletion decision that goes against WT:CFI. Anyway.

Under General rule:

"A term should be included if it's likely that someone would run across it and want to know what it means. This in turn leads to the somewhat more formal guideline of including a term if it is attested and idiomatic."

Under Idiomaticity:

"An expression is idiomatic if its full meaning cannot be easily derived from the meaning of its separate components."

So, all terms have to be idiomatic. (Well, it's a 'somewhat more formal guideline'; viewed from that perspective, it does make it sound like attested and idiomatic aren't in the rules, they're just in the guidelines!) But CFI only gives guidance on what idiomatic means for an expression. Given that all terms have to be idiomatic, what's the test for, say, hat or reenter?

I know it's hard work, but I just think it would be nice if we could take ourselves a bit more seriously. Renard Migrant (talk) 11:14, 4 October 2014 (UTC)

For hat it's obvious because its meaning cannot be easily derived from its phonemes /h/, /æ/, /t/, since phonemes do not have any meaning to convey. For reenter it's less obvious because its meaning can be derived from the meaning of re- and the meaning of enter, but we seem to have an (unwritten?) agreement here that everything written together without a space is eligible for an entry. That convention breaks down, however, for languages that are not usually written with spaces; and it has been controversial for polysynthetic languages that may write whole phrases like "he had had in his possession a bunchberry plant" as one word without spaces. For English, the only real ambiguity is in expressions that are written with spaces, because there is no unambiguous criterion to distinguish idiomatic ones from unidiomatic ones. Probably everyone agrees that hot dog is idiomatic and hot lightbulb isn't, but between those two extremes there's a continuum, not a clearly defined split. —Aɴɢʀ (talk) 12:53, 4 October 2014 (UTC)

Make Categories Show Where They're Defined

I would like to propose that the category templates be modified to show the name of the data (sub) module where the information for the category resides. This would make it easier to make changes, and also make it easier to figure out where a new category analogous to existing ones could be added.

Adding documentation pages to modules is helpful, but it still takes a bit of wandering the maze of modules and sub-modules and data sub-sub-modules to figure out where category information resides. This shouldn't be too hard, since the modules have to have this information at some point- it's just a matter of developing protocols for passing it back to the templates.

It might also be nice to give instructions on where to go to get changes made, but that may not even be settled yet. This is all part of a larger problem with our newer Lua-based architecture, which is that things are centralized in data modules and impossible for non-admins to access, but I'll leave that for a separate topic. Chuck Entz (talk) 16:43, 4 October 2014 (UTC)

All the categories have an "edit" button already, and it's been there for a few years maybe. You never noticed? —CodeCat 16:47, 4 October 2014 (UTC)
Why are you so surprised? It's not what one would expect from how many other things work. Human attention works that way. Given that the question of category documentation and editing has been asked before without answer, Chuck probably assumed that it must be a policy matter. DCDuring TALK 18:47, 4 October 2014 (UTC)
And, once the edit button is clicked on, then what? DCDuring TALK 18:51, 4 October 2014 (UTC)
I wrote three paragraphs. You never read them?
If you click on Edit for Category:English colloquialisms, you get:
  1. {{poscatboiler|en|colloquialisms}}. poscatboiler contains:
  2. {{#invoke:category tree|show|template=poscatboiler|code={{{1|}}}|label={{{2|}}}|sc={{{sc|}}}}}, so we go to that module.
  3. Module:category tree refers us to:
  4. Special:PrefixIndex/Module:category tree. The logical next step is:
  5. Module:category tree/poscatboiler. This refers us to:
  6. Module:category tree/poscatboiler/data, which refers us to:
  7. Special:PrefixIndex/Module:category tree/poscatboiler/data, which contains dozens of submodules. Fortunately, I've been working with categories long enough to spot:
  8. Module:category tree/poscatboiler/data/terms by usage as the most likely choice.
And there it indeed is. What I'm proposing is a line at Category:English colloquialisms that refers you to Module:category tree/poscatboiler/data/terms by usage without your having to go through all the steps above. I've worked a lot with categories, and I know something about templates and modules, and there are times when I have to look at several data sub-sub-modules before I can find where the configuration is for a given category. Sure- it's simple! Chuck Entz (talk) 18:58, 4 October 2014 (UTC)
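The lookup in steps 1 to 8 above, and the one-line pointer being proposed, can be sketched roughly as follows. This is only an illustration in Python, not the actual Lua code of Module:category tree: the `SUBMODULE_LABELS` table, the function names, and the placeholder submodule are assumptions made for the example; only the "terms by usage" submodule and the "colloquialisms" label come from the discussion.

```python
# Root under which the poscatboiler data submodules live (from the discussion).
DATA_ROOT = "Module:category tree/poscatboiler/data"

# Hypothetical index of which labels each data submodule defines.
# Only "terms by usage" -> "colloquialisms" is taken from the thread;
# the second entry is a placeholder to show that several submodules exist.
SUBMODULE_LABELS = {
    "terms by usage": ["colloquialisms"],
    "example submodule": ["example label"],
}


def defining_module(label):
    """Return the full name of the data submodule that defines `label`, or None."""
    for submodule, labels in SUBMODULE_LABELS.items():
        if label in labels:
            return f"{DATA_ROOT}/{submodule}"
    return None


def definition_notice(label):
    """Build the one-line pointer that a category page could display."""
    module = defining_module(label)
    if module is None:
        return "This category's definition could not be located."
    return f"This category is defined at {module}"


print(definition_notice("colloquialisms"))
```

The point of the sketch is that the module already has to resolve the label to a data submodule in order to render the category, so passing that name back to the page is a small addition rather than a new lookup.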
CodeCat was referring to the small edit button next to the text. You were referring to the edit tab at the top, which is the first place one would look to edit something other than a section. Someone introduced a non-standard positioning of the edit option and expected it to "of course" be noticed by anyone with half a brain. But that is simply not true: habits that are reinforced by thousands of successful repetitions are not easily overcome and cause attentional blindness to such things as small edit buttons in unexpected places. DCDuring TALK 19:11, 4 October 2014 (UTC)
Ah, that explains it! No, I never noticed it. I was wondering how she could have so completely missed my point. That feature does, indeed, make my proposal rather redundant- but it might still be useful for those who are trying to figure out how the categories work, but aren't going to be editing data modules. Perhaps a combination would be a good idea, such as "This category is defined at Module:category tree/poscatboiler/data/terms by usage" with the edit link at the end. Chuck Entz (talk)
That would be a bit too long to fit where the edit button currently is. Do you know where else it could be placed? —CodeCat 19:52, 4 October 2014 (UTC)
How is what happens after one clicks the edit link self-explanatory? Some kind of help (colored green?) to click on next to the edit button would both make the edit button more visible and afford an opportunity to explain further. DCDuring TALK 20:42, 4 October 2014 (UTC)
It's new to me too. Here are some ideas for making it more visible:
  1. Add a hidden category to the category pages, e.g. Category:Categories defined by Module:category tree/terms by usage. (And that category can then explain more in its description, and link more obviously to the module). Editors are more likely to have hidden categories showing, so may notice.
  2. Change the text to something more descriptive, such as "[Edit category definition]", and/or perhaps an even more wordy hover text, e.g. "Edit the module which defines this category's description, category parent, and category text."
  3. Add an item to the left nav under "Tools". (Though that would probably be even less noticed)
Also, pages like Module:category_tree/poscatboiler/data/terms_by_usage could really use some docs to say what is and isn't safe to edit, how to propose or add new categories, and how to test that your edits aren't going to break everything (especially as the [Edit] buttons encourage users to edit it). Even if you know Lua and something about Wiktionary, you still don't know what can be edited safely on that page.
Perhaps a whole other conversation, but the docs on each category page really should say (or link to) how a regular editor can add a page to that category, e.g. which template or group of templates is used in the article space to add the category tag and whether it needs additional parameters to cause it to be added, etc. It is perhaps a thankless task to document properly, though. Pengo (talk) 11:52, 5 October 2014 (UTC)
How about in an editnotice? --Yair rand (talk) 14:13, 5 October 2014 (UTC)


Considering that so much time has been wasted on rfv's/rfd's due to misspellings (especially in hyphenation) resulting from scannos, should we expand our criteria for inclusion page with notifications/warnings or something? Just a suggestion. Zeggazo (talk) 20:15, 4 October 2014 (UTC)

Category:Arabic definitive nouns???

First of all, shouldn't these be "definite nouns", not "definitive nouns"? Second of all, four of the five entries in this category are simply the definite equivalents of Arabic lemma nouns (which are always in the indefinite). The definition itself specifies this. The definite equivalents are formed simply by appending "al-" (or rather, the Arabic equivalent) to the noun. I thought there was a policy not to include such forms unless they have an idiomatic definition? I'm going to add {{delete}} tags soon but I want to make sure others don't disagree.

BTW the fifth of five entries is the word العَرَبِيَّة (al-ʿarabiyya), which has a special meaning ("the Arabic language"), separate from the word عَرَبِيَّة ("carriage" or "female Arab"), so it should be kept. Benwing (talk) 08:38, 5 October 2014 (UTC)

Perhaps nouns and proper nouns where ال (al-) is always used should still be categorised as "definite nouns"? It's useful for readers to know that a term is formed by al- + the stem. Not sure if ALL such terms should be redirected to terms without the definite article. --Anatoli T. (обсудить/вклад) 23:36, 5 October 2014 (UTC)
OK, so no one answered my question. For the ones that are simply the definite equivalents of existing lemmas, with no special meaning, should I delete them, or keep them and use something like {{definite of}}? I think we should delete, since otherwise we're setting a precedent for creating definite equivalents of every single noun out there, which is crazy, since they're all formed trivially in exactly the same fashion by just adding "al-" (actually ال (al-), in the Arabic script) onto the beginning of the noun. It would be comparable to creating entries for the car and the boat and the kumquat, etc. etc. Any objections to me deleting them? Benwing (talk) 10:31, 8 October 2014 (UTC)
Normally terms should be RFD'ed for deletion but since they are definitely just "definite article + noun" entries, yes, delete all, except العربية and اللغة العربية. If you don't have the rights to delete, I'll delete them for you. العربية and اللغة العربية should probably be RF-ed or RFV-ed, not sure. --Anatoli T. (обсудить/вклад) 22:29, 8 October 2014 (UTC)
I would also keep الأمين as one of the names of Muhammad and also a given name after that. --WikiTiki89 22:49, 8 October 2014 (UTC)
Also, {{ar-proper noun}} should automatically add to Category:Arabic definite nouns. --WikiTiki89 22:52, 8 October 2014 (UTC)
Yes, keep الأمين. Agree about proper nouns as well. --Anatoli T. (обсудить/вклад) 22:59, 8 October 2014 (UTC)
Definite forms in Arabic are not written with a separating space, as far as I know, so they closely parallel the definite forms of the Scandinavian languages. Since we have separate entries for those (dag, dagen, dagar, dagarna), we should probably also have separate entries for the definite forms of Arabic nouns. —CodeCat 23:20, 8 October 2014 (UTC)
Arabic grammar doesn't consider definite articles part of the word. Exceptions are proper nouns. Also, monosyllabic prepositions consisting of one consonant and a short (unwritten) vowel are spelled together with the following word, but they are separate words, unless they are adverbs (debatable), e.g. بِسُرْعَة (bisurʿa) -quickly (lit.: "with speed"), preposition بِ (bi-) + سُرْعَة (surʿa) (speed), enclitic pronouns بَيْتِي (baytī) "my house", بَيْت (bayt) + my - "ي" (-ī). Scandinavian, Bulgarian/Macedonian, Albanian definite forms are also debatable but they should be considered separately. Korean particles and copulas are also written without a space but they are considered separate words. 도서관에 (doseogwane) "to the library" = 도서관 + 에. --Anatoli T. (обсудить/вклад) 23:43, 8 October 2014 (UTC)
Arabic, Hebrew, and Aramaic have a lot of clitics and we have a consensus generally not to include words with clitics. The definite article is arguably one of these clitics, although in Aramaic the definite form is actually the lemma form. However, we do seem to have a status quo of generally not including the definite forms for Arabic and Hebrew. --WikiTiki89 02:52, 9 October 2014 (UTC)

The Latin word com has no entry.

The Latin word com, a component of commodus, does not have an entry. GHibbs (talk) 08:06, 6 October 2014 (UTC)

Is it ever a free-standing word? As a prefix we have com- (and con-, col-, cor-, and co-). DCDuring TALK 10:39, 6 October 2014 (UTC)
The free-standing word corresponding to com- is cum. —Aɴɢʀ (talk) 16:37, 6 October 2014 (UTC)

Transliterations for headword-line inflections

Previous discussion: Wiktionary:Beer parlour/2013/October#Transliterations for inflected forms in headwords?

This was discussed before a while ago, but didn't reach much of a conclusion. The question is how to deal with transliterations of inflected forms that are displayed in headwords. Module:headword, and by extension many of our current headword-line templates, do not support this at all. But for Arabic we've always displayed transliterations for inflected forms, and the templates therefore had to be custom-made to handle this.

I imagine it's best to have a single common behaviour for all languages. So the question is, should we include them for all languages, for none, or for some subset? And if only for some subset, then based on what criteria? —CodeCat 16:08, 7 October 2014 (UTC)

  • My 2p is on all. As the EN WT, our user base can be assumed to read English. If an entry is in a non-Latin script, we cannot assume that our users can read the headword, and as such, for the sake of usability (among other factors), we should provide transcriptions. ‑‑ Eiríkr Útlendi │ Tala við mig 17:26, 7 October 2014 (UTC)
I thought that our "ground rules" said that all non-Roman texts should (eventually) be transliterated - and that this could be by means of "pop-up" text if necessary or wanted. — Saltmarshαπάντηση 17:44, 7 October 2014 (UTC)
Transliterate all. --Vahag (talk) 18:42, 7 October 2014 (UTC)
Don't transliterate Russian inflected forms or some other languages having irregular pronunciations. It may also look quite messy if there are a lot of forms in the header. Arabic editors want to transliterate all, so be it. I don't object to Arabic transliterations. --Anatoli T. (обсудить/вклад) 22:36, 7 October 2014 (UTC)
I'm not sure I understand your reasoning. If I understand correctly that by "irregular pronunciation" you mean "pronunciation not fully predictable from spelling", then it seems to me that those cases are exactly the ones where a transliteration would be useful. Then again, we've already established that many editors here disagree with the practice of using pronunciation as a guide to transliteration in phonemic scripts such as Cyrillic. —CodeCat 22:49, 7 October 2014 (UTC)
I agree with Atitarev that we should transliterate inflected forms only for languages for which the transliteration is essential to understand the structure of the inflected form. For languages such as Arabic, for which transliterations could be considered superfluous when the words are fully vowelated, there is another consideration: It may be difficult for some readers to see the vowel diacritics, making the transliterations essential to these readers. For languages like Persian, for which we do not indicate vowels at all in the native script, transliterations are absolutely essential. --WikiTiki89 22:47, 7 October 2014 (UTC)
What about users who want to know what is written, but are not learned in reading it? Arabic looks like nonsensical squiggles to me, and without transliterations the forms might as well not be there at all. For Cyrillic or Greek the consideration is no different, except that I just happen to be able to read those scripts. But there will of course be many users that can't. —CodeCat 22:51, 7 October 2014 (UTC)
Someone who cannot read a language is unlikely to need to know how a word inflects. --WikiTiki89 00:04, 8 October 2014 (UTC)
@Wikitiki89: I guess I'm unlikely then? —CodeCat 00:33, 8 October 2014 (UTC)
Yes, you are one of the few. Keep in mind that our inflection tables usually do have transliterations. But if you are interested enough in Arabic, I suggest you learn the alphabet. Otherwise you would be comparable to someone wanting to learn chemistry without learning the chemical element symbols or someone wanting to learn calculus without learning mathematical notation. --WikiTiki89 11:30, 8 October 2014 (UTC)
  • Does adding a romanization to inflected forms harm the project in any way? It seems to me instead that it would add value. Perhaps I happened across the term რეჰანმა (rehanma) and simply wanted to know roughly how to read it, without any knowledge of the Mkhedruli script. Thankfully, this entry for an inflected form already includes a romanized spelling. Would you advocate for removing romanizations from inflected forms? If so, why? ‑‑ Eiríkr Útlendi │ Tala við mig 05:29, 8 October 2014 (UTC)
@CodeCat, many editors doesn't mean there's a consensus. If you haven't noticed, there are a lot of languages with irregular pronunciations and transliterations (exceptions). There's no practice in published dictionaries to transliterate Russian or Greek, hence an in-house (Wiktionary) transliteration method is used. "narodnovo" and "narodnogo" are equally attestable transliterations of the genitive form of наро́дный (naródnyj) - наро́дного (naródnovo). Japanese and Korean exceptions are partially handled by smart modules (some Korean exceptions still need to be transliterated manually, such as 십육) but Russian is not, こんにちは is "konnichi wa", not "konnichi ha". Do I need to bring up that argument again? Hindi, Thai, Lao, Greek also have irregularities, which are reflected in standard or Wiktionary transliterations. Automatic transliteration would cause, e.g., ру́сского to appear as "rússkogo", which should be "rússkovo" (gen. of русский) --Anatoli T. (обсудить/вклад) 23:03, 7 October 2014 (UTC)
Cyrillic, Greek, Armenian, Georgian vs Hangeul, Arabic, Hebrew, Thai, Devanagari, etc. The former are considered "easy" by dictionary publishers, although Devanagari is very phonetic. Since dictionaries usually don't use transliterations for the former, we have this argument that those should reflect the spelling, letter-by-letter whereas the difficult ones use phonetic transliterations or transcriptions, mixture of literal and phonetic. You can learn about transliterations for complex scripts and see that they are full of exceptions, most are documented ("standard" or "scientific"). --Anatoli T. (обсудить/вклад) 23:13, 7 October 2014 (UTC)
  • Reading the above, I think it would be useful for us to be clear about transcription -- changing one script for another, such as “ру́сского” → “rússkogo” -- versus transliteration -- which would include phonetic considerations, such as “ру́сского” → “rússkovo”.
Anatoli, do you (or any others) have any objection to transliteration? ‑‑ Eiríkr Útlendi │ Tala við mig 23:29, 7 October 2014 (UTC)
@Eirikr: You seem to have gotten transcription and transliteration backwards. Transcriptions are phonetic while transliterations are (supposed to be) graphemic. --WikiTiki89 00:04, 8 October 2014 (UTC)
  • Fair enough, I may have gotten it backwards. But the point stands -- are we worried about orthographic fidelity, or phonetic? Or do we even want both? ‑‑ Eiríkr Útlendi │ Tala við mig 05:29, 8 October 2014 (UTC)
@Eirikr:, have you read all of my posts above? Would you agree to transliterate こんにちは as "konnichi ha" and 십육 as "sibyuk"? Modern standard transliterations go far beyond just representing words simply letter-by-letter. They use a lot of phonetic considerations, call them transcriptions, if you wish but they are not. "rússkovo" is not 100% phonetic, only shows irregular pronunciation of "г", it's pronounced [ˈruskəvə] (the phonetic respelling is "ру́скава"). --Anatoli T. (обсудить/вклад) 23:37, 7 October 2014 (UTC)
BTW, fully automated Arabic transliteration will affect irregular Arabic words, such as إنْجِلِيزِيٌّ (ʾinjilīziyyun), which is pronounced the "Egyptian" way - "ʾingilīziyyun" and other loanwords and dialectal pronunciations. It's probably fine, just need to be aware of this. --Anatoli T. (обсудить/вклад) 23:46, 7 October 2014 (UTC)
Just to make sure, you realise that if we do have transliterations for inflections on headword lines, there will also be parameters on {{head}} to override any default ones? —CodeCat 23:49, 7 October 2014 (UTC)
I suspected there would and should be but the task is too big. All adjective-like nouns will be affected first (-ого, -его/-ёго genitive endings), all words where (Cyrillic) "е" is pronounced as "э" (the largest group of exceptions). --Anatoli T. (обсудить/вклад) 23:55, 7 October 2014 (UTC)
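The trade-off discussed above (a letter-by-letter default that gets ру́сского wrong, plus a manual override) can be sketched as below. This is a minimal Python illustration, not the behaviour of Module:headword or any Wiktionary transliteration module: the mapping table, the function names, and the `tr` override parameter are all assumptions made for the example.

```python
import unicodedata

# Illustrative letter-by-letter table (an assumption for this sketch,
# not copied from any Wiktionary module). Lowercase letters only.
CYRILLIC_TO_LATIN = {
    "а": "a", "б": "b", "в": "v", "г": "g", "д": "d", "е": "e",
    "ё": "jo", "ж": "ž", "з": "z", "и": "i", "й": "j", "к": "k",
    "л": "l", "м": "m", "н": "n", "о": "o", "п": "p", "р": "r",
    "с": "s", "т": "t", "у": "u", "ф": "f", "х": "x", "ц": "c",
    "ч": "č", "ш": "š", "щ": "šč", "ъ": "ʺ", "ы": "y", "ь": "ʹ",
    "э": "e", "ю": "ju", "я": "ja",
}


def auto_translit(text):
    """Map each Cyrillic letter to Latin; stress marks pass through unchanged."""
    text = unicodedata.normalize("NFC", text.lower())
    latin = "".join(CYRILLIC_TO_LATIN.get(ch, ch) for ch in text)
    # Recompose base letter + combining acute into a single character.
    return unicodedata.normalize("NFC", latin)


def headword_translit(form, tr=None):
    """Use the manual override `tr` when given, else the automatic default."""
    return tr if tr is not None else auto_translit(form)


# The automatic default keeps the written "г", as the thread points out.
print(headword_translit("ру\u0301сского"))
# An editor-supplied override records the irregular reading instead.
print(headword_translit("ру\u0301сского", tr="rússkovo"))
```

The sketch makes the cost visible: every irregular form needs a hand-written override, which is the workload objected to above, while regular forms need none.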
  • @Atitarev, Wikitiki89: I'm left unsure -- do you two oppose the addition of romanizations on inflected forms, or do you instead oppose an automated approach that might introduce errors? ‑‑ Eiríkr Útlendi │ Tala við mig 05:29, 8 October 2014 (UTC)
I oppose the addition of romanizations on inflected forms for two reasons (for Russian) - 1. The irregular words will need to be transliterated manually or might introduce errors. 2. The headwords get cluttered. (genitive sg., nom. plural, feminine form - are the possible inflected forms for Russian). It doesn't have to be for all languages like that. --Anatoli T. (обсудить/вклад) 05:34, 8 October 2014 (UTC)
  • Your mention of "clutter" led me to look into Russian entry format. Here's a sample headword line from the entry for русский:

ру́сский (rússkij) m anim, m inan (genitive русского, nominative plural ру́сские, feminine ру́сская)

This looks like a bit of a mess to me; all of the additional headword information for inflected forms is already given, as expected, in an Inflected forms table contained within the entry.
Redundancy aside, I think русский (russkij) is already fine -- there's a romanization of the headword, and the Inflected forms table provides romanizations of all other forms.
My current understanding of general policy, and this proposal, is that we want to make sure that all entries in non-Latin scripts include romanizations. So I'm really not worried so much about the lack of romanization for the link to русская (russkaja) in the headword line for the русский (russkij) entry. (For that matter, I think the headword line should be simplified to remove the redundant and visually cluttered inflected forms, but that might just be me.) I'm more concerned about whether there is any romanization given in the actual entries for inflected forms. Gladly, русская (russkaja) does provide a romanization.
Would you be amenable to ensuring that all entries have romanizations? ‑‑ Eiríkr Útlendi │ Tala við mig 07:11, 8 October 2014 (UTC)
I'm going to add my 2 cents to transliterating all inflections in all languages, but I think it's most important for languages like Persian and Arabic where vowels may not be written, and is important for Arabic even when vowels are written because of the difficulty that the average user will have in reading the script. So far it looks like Anatoli is opposed to transliterating inflections for Russian but not Arabic, Wikitiki might be similar, and everyone else is OK with transliterating inflections in all languages. Is this right?
I do think it's possible to make an argument that there's something qualitatively different and more "foreign" about Arabic or Devanagari or Thai vs. Greek or Cyrillic. Certainly this is the case for me. However, keep in mind, Anatoli, that you're a native Russian speaker whereas the majority of users of the English Wiktionary will not be, and might well be trying to learn a foreign language and so care about the inflections, but not be very comfortable with the script.
BTW as for the clutter issue, the same "issue" should theoretically appear in Arabic, but IMO the previous way of doing things (before CodeCat changed it), which did display transliteration of all Arabic inflections, didn't look especially cluttered. The trick here I think is to put the inflections outside of the parens, so that you don't end up with nested parens when you display the transliterations. Benwing (talk) 08:20, 8 October 2014 (UTC)
I agree that we put too much information on the inflection lines of Russian nouns. There is absolutely no need for the genitive or plural in the headword line, unless the form is irregular. The feminine form is useful, however. If the argument is about showing the stress pattern, then the genitive is needed only for nouns ending in a consonant (or ь). But I still don't see why the declension table isn't enough for this. --WikiTiki89 11:30, 8 October 2014 (UTC)
Just to clarify my position on Russian headwords. I don't oppose the information (it's helpful, can help quickly identify stress patterns and declension types and plural forms) but I don't think it's a good idea to transliterate inflected forms. --Anatoli T. (обсудить/вклад) 00:33, 9 October 2014 (UTC)
The genitive only helps identify the stress pattern for nouns that end in a consonant, and only the singular stress pattern at that. It is completely useless for nouns that end in consonants, as the singular stress pattern is apparent from the nominative, except for nouns ending in -а, which may need the accusative (but certainly not the genitive). The nominative plural is insufficient to identify the plural stress pattern. You additionally need one other plural form other than the plural genitive and also the plural genitive in some cases. At that point, there is too much information in the headword line and we already have declension tables with all of this information. --WikiTiki89 03:02, 9 October 2014 (UTC)
I disagree (please review your post, you have two contradicting statements - the first two sentences, so I don't know what you mean there). There are 6 stress patterns: Appendix:Russian stress patterns - nouns + some nouns that are irregular.
Consonantal endings:
  1. до́ктор - до́ктора - доктора́
  2. ди́ктор - ди́ктора - ди́кторы
Ь or "hissing" sounds:
  1. ле́карь - ле́каря - ле́кари (stress pattern 3 is also acceptable)
  2. сле́сарь - сле́саря - сле́сари/слесаря́ (то́карь is the same)
  3. глуха́рь - глухаря́ - глухари́
  4. врач - врача́ - врачи́
  5. това́рищ - това́рища - това́рищи
Do I need examples for vowel endings? For people mastering the basics of Russian, including native speakers, this info is usually sufficient without looking at the full declension table. --Anatoli T. (обсудить/вклад) 03:30, 9 October 2014 (UTC)
Maybe you misunderstood my post. For nouns that end in consonants (including ь), I agree that the genitive singular helps determine the stress pattern for the singular. For nouns that end in vowels, the genitive singular is of no help at all, since the stress is always in the same place as in the nominative singular. Furthermore, for nouns that end in -а, the accusative might have a different stress from the nominative, yet for some reason we do not include it. For the plural, the nominative plural is insufficient to determine the full plural stress pattern. More information is needed as I explained above, and that would completely overwhelm the headword line and defeat the purpose of having inflection tables. --WikiTiki89 03:43, 9 October 2014 (UTC)
-а nouns are only one portion of nouns, large but not huge. You still need to know that plural and gen. sg for ка́ша is ка́ши, not ка́шы (beginner level) and томоды́ is a form of томода́. Animacy helps determine the accusative. Well, yes, it's not comprehensive but sufficient in MOST cases. Apart from stress patterns, there are other things - колесо́ -колеса́ - колёса, огонёк - огонька́ - огоньки́, и́мя - и́мени - имена́. Knowing that "-а" nouns (NOT ALL VOWELS, just "а"!) are predictable is a blessing but there are too many other declension and stress patterns. I want to reiterate that gen. sg. and pl. nom. forms are sufficient to determine THE FULL STRESS PATTERN (usually). --Anatoli T. (обсудить/вклад) 04:38, 9 October 2014 (UTC)
Someone who does not know the rules for ы vs и will probably need the full declension table anyway to figure anything out. Can you give me an example of a noun that ends in a vowel (not including ь or й) whose stress pattern for the singular cannot be determined from the nominative? (I don't believe there are such nouns, but if you can prove me wrong, go ahead.) Note that I am all for including the genitive for nouns ending in consonants. As for the plural, the "usually" part is exactly my point. If there are exceptions, then you can't say that the full stress pattern can be "determined", but only "guessed". I noticed other Russian dictionaries tend to include the genitive and/or the dative for the plural in cases where there could be confusion. But the more we include, the more we get back to the question of why isn't the declension table enough? --WikiTiki89 05:26, 9 October 2014 (UTC)
Haven't I already, with колесо, имя, голова, борода (unlike simple ones like женщина with stress pattern 1)? What about о́блако - о́блака - облака́ ? --Anatoli T. (обсудить/вклад) 05:37, 9 October 2014 (UTC)
Perhaps you should re-read which forms I am referring to: nominative singular (колесо́, голова́, борода́) and genitive singular (колеса́, головы́, бороды́). Although you did remind me that the n-stems such as имя are possible exceptions; we should definitely include the genitives for them. --WikiTiki89 05:49, 9 October 2014 (UTC)
And here's a good one for you: with "-а": голова́ - головы́ - го́ловы, борода́ - бороды́ - бо́роды. So it's not absolutely useless, even for this type of nouns. :) --Anatoli T. (обсудить/вклад) 05:01, 9 October 2014 (UTC)
Umm... Yes it is useless. Unless you're blind, you can see that the genitive singulars you just gave have the same stress as their corresponding nominative singulars. --WikiTiki89 05:26, 9 October 2014 (UTC)
Hmm, what?! Have you read it carefully? голова is not like most nouns ending in "-а" and stress patterns can be determined not just by genitive sg but gen. sg + nom. pl in combination! See the table again. It's pattern 6, not 1, example given: полоса́ (same pattern as голова and борода). --Anatoli T. (обсудить/вклад) 05:37, 9 October 2014 (UTC)
Perhaps you should re-read which forms I am referring to. My point is that in these cases, if you have the nominative singular and the nominative plural, then the genitive singular adds no new information (since the singular pattern is determined from the nominative singular and the plural pattern has nothing to do with the genitive singular). --WikiTiki89 05:49, 9 October 2014 (UTC)
Displaying genitive sg just shows that it's "as expected", treating vowel and consonant endings differently doesn't make much sense. --Anatoli T. (обсудить/вклад) 05:58, 9 October 2014 (UTC)
Then instead of treating the vowel and consonants differently, let's use this simple rule: if the stress in the genitive is in a different place from the nominative (or if the stem itself is different, such as for день/дня or имя/имени) then we include the genitive, otherwise it is "as expected" and we exclude it to avoid clutter. If the user is still unsure, then they can check the declension table. --WikiTiki89 06:07, 9 October 2014 (UTC)
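The rule proposed just above can be sketched in code. This is only a minimal Python illustration, not the logic of any existing module: the function names and the `stem_changes` flag are hypothetical, stress is assumed to be marked with a combining acute accent (U+0301) or by ё, and an unmarked word with a single vowel is assumed to be stressed on that vowel.

```python
import unicodedata

VOWELS = set("аеёиоуыэюя")
ACUTE = "\u0301"  # combining acute accent used to mark stress


def stressed_syllable(word):
    """Return the 1-based index of the stressed vowel, or None if unknown."""
    text = unicodedata.normalize("NFC", word.lower())
    count = 0
    for i, ch in enumerate(text):
        if ch in VOWELS:
            count += 1
            # ё is always stressed; otherwise look for an acute right after the vowel.
            if ch == "ё" or text[i + 1:i + 2] == ACUTE:
                return count
    # Assumption: an unmarked word with exactly one vowel is stressed on it.
    return 1 if count == 1 else None


def show_genitive(nominative, genitive, stem_changes=False):
    """Hypothetical helper: should the headword line display the genitive?"""
    if stem_changes:
        # Stem differs (e.g. день/дня, имя/имени); flagged by the editor, not detected.
        return True
    return stressed_syllable(nominative) != stressed_syllable(genitive)
```

Under these assumptions, колесо́/колеса́ would omit the genitive (the stress stays on the third vowel), while стол/стола́ would show it, because the stress moves to the ending.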
The modules are complicated as is. I don't see the need to change the Russian noun headword. The Russian headword style was discussed and agreed on a while ago. Even if genitive is hardly the crucial case, it's an example of a case and shows how nouns may change. --Anatoli T. (обсудить/вклад) 01:32, 10 October 2014 (UTC)
Who exactly "agreed" on this, just you and CodeCat? I don't think there is anything wrong with using the genitive as opposed to another case, I just don't think we need to include it for every word. --WikiTiki89 11:21, 10 October 2014 (UTC)
Right, I too favour not including inflected forms in Russian headword lines, but practices for Russian are usually determined by a minority here. Refer to the transliteration debate. --Vahag (talk) 12:46, 10 October 2014 (UTC)
If transliteration for the headwords is chosen I'd favour removing inflected forms from the Russian headword altogether. That way, there won't be any additional reasons for arguments or newly introduced discrepancies with the existing transliteration practice. @Wiki, having genitive in some terms and not the others will be confusing. Also, if you don't like something, don't do it. You're under no obligation to edit in Russian and genitive sg. and plural forms are optional. I've manually added genitives and plurals on many entries, CodeCat did it with a bot and did the headword changes, no conspiracy here. @Vahagn, you can direct your anger at all other languages where transliteration is not 100% graphemic. Transliterating English into Armenian or Russian graphically wouldn't be very useful, would it? --Anatoli T. (обсудить/вклад) 13:22, 10 October 2014 (UTC)
The question isn't about whether the transliteration is graphic, but whether it represents the written expression of the word rather than the spoken one. For example, a reasonable Cyrillization of English that aims to represent the written language would transliterate colonel as колонел rather than as кёрнел, but bite would still be байт rather than the silly бите. --WikiTiki89 14:10, 10 October 2014 (UTC)

Consensus on transliteration of headword inflections?

Irrespective of the question of how much info to include in Russian headwords, can I propose a consensus around the following?
  1. For Cyrillic (and maybe also Greek), don't include transliterations of inflections in headword lines.
  2. For other non-Latin scripts, do so. This info comes either from an explicitly given transliteration or, failing that, from auto-transliteration when it is available and is able to succeed.
My preference would be to transliterate all inflections, but I can accept this compromise for the purpose of consensus. The logic here might be something like this: Cyrillic and Greek are similar enough to Latin script, and easy enough to learn, that there's a reasonable likelihood that someone interested in the inflections of a foreign word has a decent command of these scripts, whereas other scripts are generally much harder to learn and especially to master fluently to the point where a transliteration isn't helpful. This is certainly my experience: I've learned Arabic script and tried to learn Thai script and Devanagari, and my experience with all of these is that it takes a lot more work to become comfortable reading these fluently than it does with Cyrillic or Greek, both of which I learned easily. Even after a lot of work with Arabic I still sometimes stumble over the letters, and find the transliteration very helpful. An additional consideration for Arabic script is that some of the vowels are typically omitted, making transliteration essential. Even when vowels are present, they're often hard to read properly because of font considerations (the vowels are displayed above or below the letters and frequently get drawn over letter descenders or other diacritics, or sometimes a vowel below the line can be confused with a vowel above the next line below). Benwing (talk) 04:19, 9 October 2014 (UTC)
I have already expressed my opinion. Yes, splitting "easy" and "complex" scripts sounds reasonable to me. I have to ask about Korean inflected forms (verbs and adjectives). @Wyang:, what do you think, do we need to transliterate Korean inflected forms in the headword? Vahagn wants Armenian (and probably Georgian) to be fully transliterated. --Anatoli T. (обсудить/вклад) 04:38, 9 October 2014 (UTC)
I think the idea of compulsorily applying headwords to all languages is silly, and a lot of languages would be much better off without it, including the non-inflecting languages and some agglutinative languages. I think the headword is being overused in two aspects: 1) pronunciation; 2) inflection. For Korean, the inflection information in the headword more properly belongs in the conjugation section, and it can be moved to the top of the conjugation table as another table (identifying the key forms) alongside the stems table. The romanisation in the headword is redundant and should be removed. There is then no need for information or parameter duplication as in the cases of 십육 (rv=) and 아름답다 (irreg=y). In the division of "easy" and "complex" scripts, Korean would definitely be classified as an "easy" script, especially according to the Hangul supremacists. It's also called "morning script", as "a wise man can acquaint himself with them before the morning is over; a stupid man can learn them in the space of ten days". Wyang (talk) 22:35, 9 October 2014 (UTC)
This isn't a question of whether to have info in headwords but whether to transliterate them. I personally see Korean as a bunch of random squiggles, so for me it's not that easy. I have also heard that romanization of Korean involves various considerations beyond mere transliteration, i.e. the transcription shows various sorts of assimilations. I think one problem here is that people are thinking in terms of their own expert knowledge rather than the likely audience, which is someone who is a native English speaker and foreign language learner who may not have much experience with a foreign script. Benwing (talk) 00:19, 10 October 2014 (UTC)
I also used to look at Korean and Arabic as a bunch of squiggles, until I started learning these languages. Changes in the Korean transliteration make perfect sense when its phonology is understood. And learning a foreign script without learning a bit of a language using it doesn't make much sense. So, learning a script in a day or in a few days is applicable to people speaking that language. Arabic was somewhat easier for me (with good fonts only) and I still think Arabic script is easier and would be quite easy if vowel points were always written (I'm not suggesting it should). I think some info in the Korean headword is useful but for me the important bits are not those currently appearing there. --Anatoli T. (обсудить/вклад) 01:20, 10 October 2014 (UTC)
OK, consensus appears to be:
  • No translit for Cyrillic, Greek or Korean scripts.
  • Yes for others.
  • @CodeCat:, can you implement that? We can always add additional exceptions later if needed. Benwing (talk) 04:11, 10 October 2014 (UTC)
Sorry guys, wrench-thrower here --
What constitutes a "simple script"? Who decides what is "simple"?
Again, I must note that, as the English Wiktionary, our only safe consideration we can make when it comes to scripts is that our user base can read the Latin script. I reiterate my position that I believe we should provide romanizations for all headwords not written in the Latin script.
One argument against including romanizations for certain non-Latin scripts seems to be that the scripts are "simple". Sure, any script (or anything at all, really) can be viewed as simple, once you've already learned it. Many other scripts are also pretty straightforward, with charts providing straightforward phonetic conversions. Are we to no longer provide romanizations for Mkhedruli? Gothic? Amharic?
An undercurrent appears to be that we shouldn't include romanizations because doing so would be difficult. That said, this whole project of creating a multilingual dictionary is itself an enormous amount of work. Is such a relatively small amount of additional work really so much of a hurdle? Romanizations are a very simple way to greatly increase the usability of Wiktionary as a whole.
As with everything here, those who don't want to do the work don't have to. But as far as policy or goals are concerned, I feel very strongly that deciding to not include romanizations for non-Latin-script headwords does us, as a project, a grave disservice. ‑‑ Eiríkr Útlendi │ Tala við mig 04:55, 10 October 2014 (UTC)
@Eirikr:, a few points.
  1. This issue concerns only the inflected forms in headwords. The headword itself is always transliterated, as are links.
  2. I agree with you. I would rather see transliterations (transcriptions or romanizations, more correctly) of inflections for all non-Latin scripts.
  3. I don't think it's an issue of how difficult it is but rather that some people seem to think it's "cluttering" the display.
  4. My main concern for the moment is to find some workable compromise so that CodeCat is willing to put back auto-transliteration of Arabic inflections in headwords; I'd do that myself but I don't have permission to edit Module:headword. (Can I request such permission on a page-by-page basis or do I have to become an admin?)
Here's another possible compromise:
  1. For scripts where there's no objection to transliterating inflections in headwords, we go ahead and put the transliteration there after the native-script inflected form, whether it's explicitly given or auto-transliterated. Let's say this will currently apply to all scripts except for Cyrillic and Korean, maybe Greek as well.
  2. For scripts where people think doing this will "clutter" the headword line, include the transliteration in a mouse-over -- I think this is feasible. (It could be said that we should use mouse-over for all scripts, but I'd rather have the transliteration directly visible whenever possible -- it is faster to read that way, and users might not realize that the transliteration is present on mouse-over.) Benwing (talk) 12:15, 10 October 2014 (UTC)
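The two-mode compromise above might look roughly like the following. This is a hedged Python sketch for illustration only, not how Module:headword works: the policy table, the function name and the generated markup are all assumptions, with scripts identified by their ISO 15924 codes.

```python
from html import escape

# Assumed per-script display policy; any script not listed here is shown inline.
TRANSLIT_MODE = {"Cyrl": "hover", "Kore": "hover"}


def render_inflection(form, translit, script):
    """Return HTML for one inflected form together with its transliteration."""
    mode = TRANSLIT_MODE.get(script, "inline")
    if mode == "hover":
        # The transliteration is only visible as a tooltip on mouse-over.
        return '<span title="{}">{}</span>'.format(
            escape(translit, quote=True), escape(form)
        )
    # Default: transliteration in parentheses right after the native-script form.
    return "{} ({})".format(escape(form), escape(translit))
```

Keeping the policy in a single table means an exception for another script could be added or removed later without touching the rendering code.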
I've added a temporary exception to Module:headword so that Arabic inflections are always transliterated. This will hopefully alleviate your immediate concerns, but I do hope that you'll continue to participate in the wider discussion. —CodeCat 13:00, 10 October 2014 (UTC)
Thanks, and I will stay in the discussion. I wish more people would contribute; it's hard to form a consensus when only a small number of people speak up. Benwing (talk) 13:16, 10 October 2014 (UTC)
I realised I haven't stated my own opinion. I mostly follow Eirikr's reasoning, and think that transliterations should accompany all non-Latin-script terms in some form, wherever they are. Exceptions can be made in cases where terms generally appear paired with Latin-script alternatives, such as in Serbo-Croatian. —CodeCat 13:18, 10 October 2014 (UTC)
  • I support transliteration of all forms listed in the headword line in all scripts other than Latin, preferably automatically generated, even if this means certain Russian forms will appear to end in -ogo instead of -ovo. Some people might say that's easy for me to say, since the only non-Latin-script language I spend much time on is Burmese, and Burmese doesn't have inflections. Nevertheless, I think it's preferable to transliterate them all rather than to try to decide which scripts are "simple" enough that they don't need it. —Aɴɢʀ (talk) 13:53, 10 October 2014 (UTC)
    • It reminds me a bit of a debate we had some time ago, considering whether languages were "well known" and "major" enough to not be linked in translation tables and in {{etyl}}. Eventually we gave up on the debate and just made translations never link, and {{etyl}} always link. —CodeCat 14:05, 10 October 2014 (UTC)
One thing we seem to be forgetting here: why are the inflections included in the headword line in the first place? They're included for those who know the rules of the language to figure out the inflection without looking through the tables. In other words, they're a shorthand for people who mostly don't need transliterations. For someone who sees the letters as scribbles, an inflected form is most likely just decoration, anyway- whether it's transliterated or not. That means that this isn't a matter of substance (with a few exceptions such as Arabic), but of style. Chuck Entz (talk) 16:58, 10 October 2014 (UTC)
But many languages don't have tables, so we include the forms on the headword line. And even in cases where there are tables, the forms we include on the headword line are sometimes not in those tables. —CodeCat 17:00, 10 October 2014 (UTC)
Certainly for Arabic, this is exactly correct. The inflections list basic and very important things, like feminines and plurals for nouns and adjectives. For nouns and adjectives we don't currently have any inflection tables. There are other languages that are similar. I took a look at other non-Latin-script languages with inflections, and I can find only Russian and Georgian for nouns, and they also list basic things like the plural (and in the case of Russian, the genitive singular). I can easily imagine a situation where a learner has some concept of grammar -- doesn't take much to want to know how to form the plural -- but a shaky grasp on the native script. Benwing (talk) 23:29, 10 October 2014 (UTC)
For Russian, the genitive and plural forms are also in the tables. But for adjectives, there's the comparative forms, which are not in any table. For verbs, the imperfective and perfective counterparts are not in the table either. —CodeCat 23:47, 10 October 2014 (UTC)
The world of language learners is not neatly divided into those who can read the script and those who can't read the script. If push comes to shove, I can read Sanskrit in Devanagari, but I'd rather read it in transliteration because it's easier. I don't know if our Sanskrit headword lines currently include principal parts or not (our coverage of Sanskrit is not great), but if they did, I would want to have translits on each form listed. Devanagari is not just scribbles for me, but it does take me about 10 times longer to read than transliteration. —Aɴɢʀ (talk) 08:28, 11 October 2014 (UTC)


OK, a majority seems to want to see translit of inflections in all languages. This consists of (at least) me, CodeCat, Angr, Eirikr, Vahag, perhaps also Saltmarsh. A minority seems to either want translit of inflections in only some languages, or wants fewer headword inflections in certain languages, or both. This consists of Anatoli (doesn't want translit in Russian, is OK with the rest, is OK with headword inflections in general), Wyang (doesn't want translit in Korean, wants fewer headword inflections in Korean), and Wikitiki (seems to want fewer headword inflections in general, has expressed particular opinions about Russian, might also want less transliteration although I'm less sure about that).
So, we can do two things, it seems:
  1. Take a vote.
  2. Find some compromise that will satisfy both camps. I've proposed above the idea that we can transliterate the headword inflections of most non-Latin-script languages the traditional way (in parens or something similar, after the native-script word), and for the ones where people object (Korean, Russian), transliterate using a mouse-over popup.
I'd like each person who has expressed an opinion, and any others who want, to comment indicating whether they find #2 reasonable and whether they'd accept it, and if not, do they think #1 is the way to go, and if not, what do they think is the way to go? Benwing (talk) 09:02, 12 October 2014 (UTC)
I don't feel super strongly about this, so I'm open to finding a compromise. —Aɴɢʀ (talk) 15:22, 12 October 2014 (UTC)
I really like the idea of the mouse-over popup (or tooltip) transliteration; however, MediaWiki is imposing their own "preview" popup, which does not even work properly in any useful way on Wiktionary and I really wish we could get rid of it and make room for our own popups. --WikiTiki89 14:30, 14 October 2014 (UTC)
  • What I'd like to see for transliterations is 1) the most common scheme used by default, for all languages 2) ability to switch between all of the popular transliteration systems available by clicking on a link placed near the headword, opening a popup menu with options 3) selected choice remembered when browsing other entries in the same language. 4) Ability to hide/show all transliterations for languages that use them. No "one true transliteration scheme" and no "one true transliteration display option". I believe that all of the necessary data can be generated in Lua, and selectively displayed/hidden using JavaScript. We should give users options not cripple them. --Ivan Štambuk (talk) 00:27, 15 October 2014 (UTC)

Phrasal verbs whose lemma is not the infinitive

I noticed that there are some phrasal verb entries in English that are conjugated, but the infinitive is not used as the lemma. An example I noticed just now is all hell breaks loose. This verb certainly does have an infinitive, all hell break loose. This is clear when you add auxiliary verbs: I want all hell to break loose or may all hell break loose. So I think we should move these entries to the infinitive. —CodeCat 22:24, 11 October 2014 (UTC)
We usually don't bother with inflecting phrasal verbs, as it just clutters the entry for no real gain. This kind of a case probably warrants it, however. DCDuring TALK 22:38, 11 October 2014 (UTC)

The problem is it just sounds funny when the subject of the verb is included. I know we moved there is to there be a while back, but it has the same problem: with the subject (even just a dummy subject there) present, the bare infinitive just sounds really odd. —Aɴɢʀ (talk) 22:42, 11 October 2014 (UTC)
It does, but you can't deny that the infinitive exists. So either we should make a specific rule for these cases, or we should continue to use the infinitive, right? —CodeCat 23:13, 11 October 2014 (UTC)
Among OneLook dictionaries only Cambridge Idioms actually covers this and they do it at all hell breaks loose. DCDuring TALK 23:54, 11 October 2014 (UTC)
Whichever form we make the lemma, there should be redirects from the other forms. - -sche (discuss) 03:38, 12 October 2014 (UTC)
  • This isn't what I would call a phrasal verb, nor is it so categorized. It is a full sentence. As is the case with virtually all other full English sentences (See Category:English sentences.), the verb and sometimes the noun within can be inflected. (It is trivial to show it to be a full sentence and to show it or any other sentence to occur with an infinitive.) Sentences are usually shown in their canonical form (present indicative tense). DCDuring TALK 05:49, 12 October 2014 (UTC)

use–mention distinction in reference templates

As happened seven months ago, Dan Polansky and I are currently in disagreement about reference-template formatting; this time, we disagree about whether {{R:L&S}} should enclose the cited entry title in quotation marks. I believe that such quotation marks are necessary in order to mark the use–mention distinction, and that quotation marks create a more legible presentation than italicising the entry title would. I don't know why Dan Polansky disagrees, and nor do I know why he reverted the addition of {{documentation}} to the template in the same edit. — I.S.M.E.T.A. 01:37, 13 October 2014 (UTC)

To explain, I come here in the hope that I shall find or obtain consensus to use quotation marks in {{R:L&S}}. — I.S.M.E.T.A. 01:57, 13 October 2014 (UTC)

Just ignore him. Keφr 11:14, 13 October 2014 (UTC)
@Kephir: Forgive me; does "him" refer to Dan Polansky or to me? — I.S.M.E.T.A. 17:02, 13 October 2014 (UTC)
Polansky. He is going to be obstructionist just because he can. But for the sake of having anything said on-topic, I agree with you about the quotation marks. On the other hand, some consistency in formatting mentions would be nice, which would favour italics instead. But either way, bare external link formatting seems rather unfitting to me. Keφr 15:34, 14 October 2014 (UTC)
Thanks, Keφr; I thought you meant him, but I wanted to make sure. I've made the change again; hopefully it'll stick this time round. — I.S.M.E.T.A. 18:28, 14 October 2014 (UTC)
FWIW, I agree with quotation marks, since we are referring to a piece of a larger work: "qua" (for example) is more-or-less a section title. (This is not exactly the same as the use–mention distinction. We are neither using nor mentioning the word qua, we're just citing a source that mentions the word qua. Perhaps a subtle distinction, but IMHO a useful one to keep in mind in cases where the reference work uses a different citation form than we do, or when it assigns a few lemmata to a single entry for whatever reason.) —RuakhTALK 04:56, 15 October 2014 (UTC)

Empowering WingerBot

I filled out a vote request to empower my new bot WingerBot, here:

Wiktionary:Votes/2014-10/Request for bot status: WingerBot

This is my first bot.

It gives a 30-day vote period, which seems excessive. For example, JackBot had a 7-day window, which seems reasonable. If that can be applied here, can someone fix up the start and end times appropriately?

Thanks. Benwing (talk) 07:20, 13 October 2014 (UTC)

FYI, the voting is going on now (and has been for a few days).
My bot's source code is available on github: [1]
See also Wiktionary talk:Votes/2014-10/Request for bot status: WingerBot.
Benwing (talk) 11:29, 23 October 2014 (UTC)
It's been several days since this vote has finished ... could someone close it? Thanks. Benwing (talk) 01:24, 4 November 2014 (UTC)

Compound lists for Japanese entries (and possibly CJK in general) -- are these really needed?

With the advent of User:Haplology's various categories for Japanese entries, which compile lists of terms using each kanji (such as Category:Japanese terms spelled with 赤 read as あか, or Category:Japanese terms spelled with 幸 read as こう), it occurs to me that the potentially *huge* lists of compounds that could be compiled and included within each kanji entry are actually redundant and obsolete. Rather than laboriously compile these lists by hand, I think it makes a lot more sense to leverage the categories to do the hard work for us.

Comparing the categories and the manually created lists, the only additional information that the manual lists provide is a possible reading, and a gloss. This leads me to two things:

  • As a proposal: I posit that this information, while potentially helping to improve usability slightly, also represents a sizable negative potential for mistakes and inconsistencies. I therefore propose that we no longer include such lists in Japanese entries, referring users instead to the categories. I also submit for consideration that Chinese and Korean editors might do the same for hanzi and hanja compound lists.
  • As a request: Does anyone familiar with the inner workings of categories know if there might be some technically feasible way to get readings to display automatically in category listings? For instance, 幸運#Japanese is added to category Category:Japanese terms spelled with 幸 read as こう, with the sort argument こううん (kōun). Looking at the list on the category page, we see that 幸運#Japanese is there, but its sort argument is lost -- other than the sorting itself, the sort argument doesn't appear on the page as any kind of useful information. Is there any way of capturing sort arguments and getting them to display somehow in category lists?

I look forward to hearing what others think. ‑‑ Eiríkr Útlendi │ Tala við mig 18:19, 13 October 2014 (UTC)

I find them useful. They are not hard to create. Ideally, a bot should make those categories.--Anatoli T. (обсудить/вклад) 10:09, 14 October 2014 (UTC)
  • Sorry, which them did you mean in I find them useful? Did you mean the categories that list compounds (which are already auto-generated once the appropriate templates are added to an entry), or the in-entry lists of compounds (which so far have to be created by hand)? ‑‑ Eiríkr Útlendi │ Tala við mig 19:11, 14 October 2014 (UTC)
I find categories useful, such as Category:Japanese terms spelled with 飢 read as う. Yes, the template auto-generates cats but they have to be created manually if they are missing. --Anatoli T. (обсудить/вклад) 21:42, 14 October 2014 (UTC)

Rethinking Babel boxes

I did some minor editing at WT:Babel recently, which made me wonder whether it would make sense to rewrite {{Babel}} in Lua. My initial motivation was to integrate it with our central list of languages (maybe even into the category boilerplate system which User:CodeCat developed) and get rid of inline styles on the way. While planning this out, some other ideas emerged in my head:

  • To have the blurbs ("This user speaks Elbonian at an advanced level") in English, and English only. On one hand, this is contrary to how Babel boxes look in other Wikimedia projects. On the other, not only will it massively simplify the code, it also makes the most sense: English is the one language in which English Wiktionary's (duh) definitions, boilerplate and meta-content are written and in which discussions are (usually) conducted, and the only language which can be assumed to be understood by all users. If I am looking at a Babel box of an advanced speaker of Cantonese, I can recognise it only because I remember yue to be the code for Cantonese, and that the number 3 means advanced level. The blurb tells me nothing; I do not know nearly enough Hanzi to recognise a single character.
  • To rename the user categories. "User si-3" is rather terse and again forces me to remember language codes. "Wiktionary:Advanced speakers of Sinhalese" would be more elegant and descriptive.
  • To deprecate {{#babel:}}, as was suggested in Wiktionary:Beer parlour/2014/September#Can we disable the #babel parser function? (I see the English blurb issue was brought up there too). I think some page in the MediaWiki namespace can be edited to point users to the template instead.
  • To suggest that users add themselves to interest groups (in Module:workgroup ping/data) when they speak a certain language at a high enough level.
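As an illustration of the category renaming idea in the list above, here is a small Python sketch. It is not an existing module or template: the level labels other than "Advanced" for 3, the two-entry language table and the function name are assumptions made for the example.

```python
# Assumed labels for Babel levels; only "3 = advanced" is stated in the proposal.
LEVEL_NAMES = {
    "1": "Basic",
    "2": "Intermediate",
    "3": "Advanced",
    "4": "Near-native",
    "5": "Professional",
    "N": "Native",
}

# Hypothetical excerpt of a language-code table, for illustration only.
LANGUAGE_NAMES = {"si": "Sinhalese", "yue": "Cantonese"}


def descriptive_category(old_name):
    """Turn a terse name such as 'User si-3' into a descriptive category name."""
    prefix, _, rest = old_name.partition(" ")
    code, _, level = rest.rpartition("-")
    if prefix != "User" or code not in LANGUAGE_NAMES or level not in LEVEL_NAMES:
        raise ValueError("unrecognised user category: {!r}".format(old_name))
    return "Wiktionary:{} speakers of {}".format(
        LEVEL_NAMES[level], LANGUAGE_NAMES[code]
    )
```

A name whose code is missing from the table (for example a Wikimedia-only code) raises an error rather than producing a guessed category, which matches the concern about codes that do not map cleanly to local ones.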

Some considerations:

  • Integration with our central languages list would mean that, for example Template:User en-us-N would have to be folded into Template:User en (I see Template:User sr-4 already redirects to Template:User sh-4)
  • I think some users may expect Wikimedia language codes to work in our Babel boxes (they may simply copy the Babel template across projects). I think we should generally not break that expectation; however, I worry about some Wikimedia codes not mapping perfectly to local ones.


Keφr 17:08, 14 October 2014 (UTC)

I support translating the Babel boxes into English. Their very purpose is defeated when they are incomprehensible. — Ungoliant (falai) 17:18, 14 October 2014 (UTC)
I agree with this too, and I definitely agree with converting to Lua to eliminate the unmaintainable mess of templates we currently have. —CodeCat 17:21, 14 October 2014 (UTC)
I support translating to English. I oppose converting to Lua because once we translate to English it will be very easy to turn it into a small maintainable template without Lua. I also oppose, as before, deprecating {{#babel:}}. --WikiTiki89 17:30, 14 October 2014 (UTC)
The Module:workgroup ping integration and (maybe) validation would be much harder to do from a bare template. And I think so would be Eirikr's suggestion to avoid nested tables (while maintaining all current functionality, at least). Keφr 08:35, 17 October 2014 (UTC)
I enjoy seeing the other languages and would be sad to see them go, but I understand and generally agree with the rationale for changing the Babel boxes to be all-English. If we're going to have them redone, my 2p request would be to not use nested tables, and to make sure that the columns actually line up properly. I'm one of those visually oriented people for whom the jagged inconsistencies of the current Babel infrastructure are so jarring that I deconstructed the tables and rebuilt them to line up properly on my own user page. ‑‑ Eiríkr Útlendi │ Tala við mig 19:08, 14 October 2014 (UTC)
Does the text really matter, other than the English, as well as native, language name? Wouldn't luacizing the templates mean that, as a practical matter, the text could only be in English? A new person with a new language could not be assumed capable of adding the text required in their language in a standard-conforming way, unless there were a particularly obvious way to add the text. DCDuring TALK 08:20, 15 October 2014 (UTC)
Well, maybe; translating into every language would be a bit of work (just create a huge data table… the only problem is that it would probably grow even larger than Module:languages, so we would have to split it, and it might become hard to navigate…), but could be done in principle. Though I think we could abuse the Scribunto i18n library to reuse messages provided by mw:Extension:Babel, and have every single Babel box in any language the reader desires (just add ?uselang= to the URL). Though that would put mw:Extension:Babel in a weird limbo of "deprecated but depended upon by its replacement"; and I have no idea how this interface could be exposed. Or we could just use that facility to maintain the status quo (pardon the Polanskyism) of having them in the target language. Keφr 08:35, 17 October 2014 (UTC)
Proof of concept: {{#invoke:User:Kephir/test1|babble|ast|5}} gives
{{GENDER:USER|Esti usuariu|Esta usuaria}} tien un conocimientu [[LEVEL LINK|profesional]] d'[[LANG LINK|asturianu]].
This user has [[LEVEL LINK|professional]] knowledge of [[LANG LINK|Asturian]].
This user has professional knowledge of Asturian.
. Try also viewing this page in Chinese. Keφr 14:38, 20 October 2014 (UTC)
I always thought that the purpose of having the blurbs in the target language was to help non-English speakers or English language learners to find users with whom they might be able to communicate if they needed help. I think it is beneficial to see the name of the language in English so that English speakers can easily recognize which language the box indicates. - TheDaveRoss 20:35, 16 October 2014 (UTC)
I did not consider this. This is a good argument. Keφr 08:35, 17 October 2014 (UTC)
On that note, I wouldn't be opposed to the option of adding English text to existing Babel boxes-- but WITHOUT taking away the foreign language text. This would allow them to serve the purpose of helping foreign language users find people with whom they might communicate and talk to, and still make it easier for English speakers to make sense of it. (Note also, I'm one of the people who got so fed up with the babel templates and their alignment and such, that I made my own table rather than deal with them, as well as because there were a number of babel templates missing at the time when I set mine up. This isn't as uncommon as one might think, and thus while changing the templates is well-intentioned, it won't necessarily reach every instance.) --Neskaya sprecan? 17:32, 29 October 2014 (UTC)
It would be hard to fit all of that text neatly in a small box. We could work around that, though, if we shorten the text. Something like "near-native level English speaker" is short enough. —CodeCat 17:43, 29 October 2014 (UTC)
Or even more brief: "Basic Russian", "Intermediate Russian", "Advanced Russian" and "Native Russian". - TheDaveRoss 18:06, 29 October 2014 (UTC)

Rough initial version: Module:Babel. Does not categorise, and I did not aim for pixel-for-pixel replication. Try it by expanding {{Babel/x}} at Special:ExpandTemplates. Supports script, language and "coder" boxes; {{User time zone}}, {{User Wikipedia}}, {{User SUL}} and {{User bot owner}} are not supported. Using an unsupported box specification makes it fall back to the old Babel template. Keφr 19:25, 1 November 2014 (UTC)
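The shortened labels suggested above, together with the fall-back behaviour for unsupported boxes, can be sketched roughly as follows. This is only an illustration in Python, not the code of Module:Babel (which is a Lua module); the function name `babel_label` and the assignment of the four labels to the levels 1, 2, 3 and N are assumptions made for the example.

```python
# Hypothetical mapping of Babel levels to the short labels proposed in the
# discussion. Which label belongs to which level is an assumption.
SHORT_LABELS = {
    "1": "Basic",
    "2": "Intermediate",
    "3": "Advanced",
    "N": "Native",
}


def babel_label(language_name, level):
    """Return a short box label, or None for an unsupported level.

    Returning None stands in for "fall back to the old Babel template";
    the caller decides what the fall-back actually does.
    """
    label = SHORT_LABELS.get(level)
    if label is None:
        return None
    return f"{label} {language_name}"


print(babel_label("Russian", "2"))
print(babel_label("Russian", "coder"))
```

Keeping the labels in one table means a new level or a reworded label is a one-line change, and anything missing from the table takes the fall-back path instead of raising an error.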

Spaces in alphabetization of language names[edit]

How do we treat spaces when we alphabetize language names? Specifically, does "Lower Sorbian" precede or follow "Low German"? If we ignore spaces, then "LowerSorbian" precedes "LowGerman", but if we treat spaces as preceding A in alphabetical order, then "Low_German" precedes "Lower_Sorbian". —Aɴɢʀ (talk) 19:01, 14 October 2014 (UTC)

There are pros and cons to both options. What do dictionaries that list multi-word phrases as separate entries do? --WikiTiki89 01:35, 15 October 2014 (UTC)
I just checked six print dictionaries (two British, four American) and they all ignore spaces (hotchpot before hot dog before hotel). —Aɴɢʀ (talk) 06:12, 15 October 2014 (UTC)
w:Alphabetical order#Treatment of multiword strings is relevant.​—msh210 (talk) 12:30, 15 October 2014 (UTC)
That page basically outlines the question, but does not provide an answer. --WikiTiki89 12:39, 15 October 2014 (UTC)
Both treatments are valid; the question is, which do we want to use? Dictionary headwords apparently usually follow the "ignore the space" rule, but other lists may follow the "treat the words separately" rule. —Aɴɢʀ (talk) 13:53, 15 October 2014 (UTC)
Internet-based sorting, including our own categories, generally treats a space as being ordered before any other character. So that would place Low German before Lower Sorbian. —CodeCat 18:27, 17 October 2014 (UTC)
Some paper dictionaries, too, use this ordering, e.g. the Routledge dictionary of historical slang: have a look at http://books.google.fr/books?id=JRuNMHNcu5cC&pg=PP12&lpg=PP12&dq=%22something+before+nothing%22+dictionaries&source=bl&ots=6iDNPNRHjr&sig=S8mC2Wqar5xb4FCC2zWaw4itGG8&hl=fr&sa=X&ei=yXJBVNebNMnDPPWkgIgG&ved=0CCMQ6AEwAA#v=onepage&q=%22something%20before%20nothing%22%20dictionaries&f=false This is the better ordering for our kind of dictionary. Lmaltier (talk) 19:52, 17 October 2014 (UTC)
This dictionary calls it something before nothing. Do you understand why? Lmaltier (talk) 20:12, 17 October 2014 (UTC)
On what basis do you say "This is the better ordering for our kind of dictionary."? I happen to be leaning the other way. --WikiTiki89 21:30, 17 October 2014 (UTC)
The reason is the number of multi-word phrases, etc. here. When entries in a dictionary are almost always single words (without spaces, etc.) and phrases are defined in these basic entries, the strict alphabetical order is the logical choice. When each phrase has its own entry, it's much better to get all phrases beginning with the same word together when using a category. An example: you expect boulanger-pâtissier (probably addressed in boulanger in most paper dictionaries) after boulanger but before boulangerie; the order boulanger, boulangerie, boulanger-pâtissier is not what you would expect. Lmaltier (talk) 15:48, 19 October 2014 (UTC)
For the most part, we don't have to worry about alphabetization here; our entries are on separate pages that aren't ordered with respect to each other. Our categories alphabetize automatically, and I see that Category:en:Languages has Low German >> Low Prussian >> Low Saxon >> Lower Lusatian >> Lower Silesian >> Lower Sorbian >> Lower Wendish, meaning that our automatic alphabetization does treat spaces as ordered before any other character. The only alphabetization we have to do manually is the ordering of the languages in entries like se, which is where I first encountered the problem of where to put Lower Sorbian with respect to Low German. My immediate instinct was Low German >> Lower Sorbian, but then I second-guessed myself and asked here. After discovering that dictionary lemmas treat spaces as nonexistent, I went back to se and switched the order to Lower Sorbian >> Low German. But now that I've looked at how our categories alphabetize, I'm gonna go back again and switch it back to my first instinct, Low German >> Lower Sorbian. —Aɴɢʀ (talk) 20:07, 19 October 2014 (UTC)
We have to worry about categories only, but this is very important. They alphabetize automatically, but we must ensure that they alphabetize the best way for readers. For languages: the result is disputable for Lak'ota. Lmaltier (talk) 05:41, 20 October 2014 (UTC)
We have some control over sorting in categories, though I'm not sure if that includes treatment of spaces. As for "Lak'ota", that's not a good example- we call the language Lakota. Chuck Entz (talk) 12:24, 20 October 2014 (UTC)
It was a real example: see Category:en:Languages and look at L. Lak'ota is before Lake Miwok. Lmaltier (talk) 20:17, 20 October 2014 (UTC)
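The two orderings discussed in this thread can be compared with a small Python sketch. It is not how MediaWiki sorts categories; the key functions `ignore_spaces_key` and `space_first_key` are names invented for this illustration, and the second one relies only on plain code point comparison, in which a space and an apostrophe both sort before any letter.

```python
def ignore_spaces_key(name):
    """Letter-by-letter order: spaces are dropped before comparing."""
    return name.replace(" ", "").casefold()


def space_first_key(name):
    """Word-by-word order: the space is kept and, by code point,
    compares lower than every letter."""
    return name.casefold()


names = ["Lower Sorbian", "Low German", "Low Saxon", "Lower Silesian"]

# Print-dictionary style (hotchpot before hot dog before hotel).
print(sorted(names, key=ignore_spaces_key))

# Category style (Low German before Lower Sorbian).
print(sorted(names, key=space_first_key))

# The apostrophe also sorts before letters under code point comparison,
# which is why Lak'ota lands before Lake Miwok.
print(sorted(["Lake Miwok", "Lak'ota"], key=space_first_key))
```

Whichever rule is preferred for the manually ordered language sections, applying one key function throughout would at least keep them consistent with each other.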

Extended etymologies[edit]

I came upon this website illustrating an idea that I had in mind for a while (click on the blue links in the leftmost column). We could extend the < "derives from" operator used in etymologies to generate a drop-down table illustrating intermediate steps between pairs in the derivational chain, i.e. all of the sound changes involved. Short descriptions could link to appendices where more details are available. This would be applicable to both reconstructions and attested etymons, including borrowings (which often undergo some special rules that can nevertheless be described and cataloged). A chronologically inverted list would be used in the descendants sections of the corresponding source word/reconstruction. Support could be added for multiple sequences of derivation, and even multiple sources or different reconstructions reflecting different protolanguages. It would however require some non-trivial investment in the groundwork to make it work, so it should best be approved (or better: not disapproved) first before people waste time. I've seen some recent works that use this method but they use numbers instead of descriptions to explain what's going on, so one has to manually look up what each of the numbers used means, and the layout is horizontal not vertical. --Ivan Štambuk (talk) 00:13, 15 October 2014 (UTC)

Support, although I recognize that there should be a lot of discussion about the specifics of the layout. --WikiTiki89 01:36, 15 October 2014 (UTC)
How would it work, on a technical level? How would you share data between entries? DTLHS (talk) 04:46, 15 October 2014 (UTC)
Support. I had a vague idea about having such lists in appendices somewhere, but never developed it. Filling out the details would seem to go beyond the limits of published sources without resorting to the kind of extrapolation that you've been berating CodeCat for- are you ok with that? Chuck Entz (talk) 13:30, 15 October 2014 (UTC)
Support. Categorization based on sound change could also be added, such as Category:Old Armenian terms derived by Meillet's law. Or such terms could appear on the appendix dedicated to Meillet's law. --Vahag (talk) 10:29, 16 October 2014 (UTC)
  • I think this might overwhelm normal entries, especially if people do it for every morpheme in a polymorphemic word, but it would be nice to do this somehow on reconstructed-form appendix pages. —Aɴɢʀ (talk) 13:49, 15 October 2014 (UTC)
    • It wouldn't be too bad if we restricted it to the rules between a term and its nearest parent (i.e., an English etymology would only have the steps between it and Middle English or maybe Old English), and hid the list so that only those who choose to look at it would see it. Chuck Entz (talk) 13:57, 15 October 2014 (UTC)
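As a rough idea of the data such a drop-down table would need, here is a minimal Python sketch that applies an ordered list of sound changes and records each intermediate form with a short description. Everything in it is hypothetical: the rules, the sample form and the names `SoundChange` and `derive` are toy inventions for illustration and do not describe any real language or any existing module.

```python
import re
from typing import NamedTuple


class SoundChange(NamedTuple):
    description: str   # short text that could link to an appendix
    pattern: str       # regular expression matching the affected sound
    replacement: str


def derive(form, changes):
    """Apply the changes in order, keeping every form that differs
    from the one before it, paired with the rule that produced it."""
    steps = [(form, "starting form")]
    for change in changes:
        new_form = re.sub(change.pattern, change.replacement, form)
        if new_form != form:
            steps.append((new_form, change.description))
            form = new_form
    return steps


# Toy rules, not taken from any real language.
TOY_CHANGES = [
    SoundChange("p > f word-initially", r"^p", "f"),
    SoundChange("loss of final vowel", r"[aeiou]$", ""),
    SoundChange("t > d between vowels", r"(?<=[aeiou])t(?=[aeiou])", "d"),
]

steps = derive("patake", TOY_CHANGES)
for form, description in steps:
    print(f"{form:10} {description}")

# A descendants section would show the same steps in inverted order.
inverted = list(reversed(steps))
```

Storing the rules once per language pair, rather than per entry, is one possible answer to the question of how data would be shared between entries.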

Categories for words that have pronunciations marked in the form of IPA[edit]

Should we create such categories? I believe that it is convenient to go to Special:WhatLinksHere/Appendix:Italian pronunciation for the above information. --kc_kennylau (talk) 09:53, 15 October 2014 (UTC)

What's the general consensus view on handling abusive editors?[edit]

I stumbled across the activities of a new editor and have been quite impressed at how abusive they can be -- foul language, name-calling, lawyering, basically the kind of trollish behavior that drove me from Wikipedia years ago. I analyzed their total contributions, only a short list so far, and found that more than a quarter have been on talk pages, where this editor has mostly argued about editing decisions, illustrated their profound ignorance of the consensus here, and berated other users. Another more-than-quarter has been in this user's own userspace. 40% has been actual constructive mainspace edits, mostly in January-March this year. Out of the total, more than a quarter has been confrontational and even outright abusive.

For what it's worth, this editor has not yet had any direct dealings with me.

How would other admins approach this? ‑‑ Eiríkr Útlendi │ Tala við mig 18:05, 17 October 2014 (UTC)

I would post a warning on his/her user page along the lines of "Start being nice to people, or I will block you." (but in a more polite way). --WikiTiki89 18:10, 17 October 2014 (UTC)
I have to agree here. If you don't want to post the warning to the user, feel free to post a note on my talk page or use EmailUser to contact me, and I'll be happy to deal with it and whatever incivility comes of it (as well as happy to watch them for a few weeks to see if they improve or need some time off to think). That sort of attitude isn't what we need from editors here. --Neskaya sprecan? 17:26, 29 October 2014 (UTC)

Proposal: use quotation marks to mark headwords cited in reference templates for Latin-script languages[edit]

Further to §: use–mention distinction in reference templates above, may I suggest that we use quotation marks in our R:-prefixed reference templates to mark the headwords cited by those templates? So, for example, the standard format (at least where the headword is concerned) would be:

  • “foo, n.” in Some Big Dictionary

(Because of potential problems with using quotation marks with other scripts, I make this proposal for Latin-script languages only.) Does that seem sensible to everyone? Is there consensus? Shall I prepare a vote? — I.S.M.E.T.A. 18:35, 17 October 2014 (UTC)

  • It's also worth noting that all three of these changes to remove the quotes were in 2009, now half a decade ago. Attitudes and ideas change over time. I suggest we check the opinions of the relevant people here. That said, Ullman is no longer with us, and Spangineer's last edit was in 2010. @DCDuring: do you have any input on this quote issue? ‑‑ Eiríkr Útlendi │ Tala við mig 19:32, 17 October 2014 (UTC)
  • I support adding quotes. It's the only way we can make the cited part stand out without changing text style like the italic "n.". —CodeCat 19:16, 17 October 2014 (UTC)
    The only way to stand out? That is obviously untrue. The text of the word stands out by the use of a different color for the hyperlink, as in cat in Webster’s Revised Unabridged Dictionary, G. & C. Merriam, 1913. --Dan Polansky (talk) 19:27, 17 October 2014 (UTC)
    Not all people can see such colours. —CodeCat 19:35, 17 October 2014 (UTC)
    You mean color blind (are there such that cannot distinguish blue vs. black)? Or people with a simple browser that does not distinguish a piece of text with a link from a piece of text without a link? Even assuming some people do not see such colors, will they miss the link because of the missing quotation marks? If so, will they miss links in general, since in general links are not surrounded by quotation marks? --Dan Polansky (talk) 19:38, 17 October 2014 (UTC)
    Surprisingly, I agree with Dan. Color or other link distinction seems sufficient. Quotation marks, especially double, add visual clutter IMO.
We use quotes for glosses, so any need for glosses in such templates — quite possible IMO — would require multiple quotes.
If we resort to further distinction, I would strongly oppose ever using italics as it makes it impossible to maintain the appropriate typographic contrast for the taxonomic names that are supposed to have it. DCDuring TALK 19:51, 17 October 2014 (UTC)
  • Re: links, are there any cases where a term might not be linked in such a template call? ‑‑ Eiríkr Útlendi │ Tala við mig 20:27, 17 October 2014 (UTC)
    It certainly might not always be the pagename. In some cases having a named link might be misleading, as it implies that it is possible to go to a page that is directly related to the term, rather than, say, a general search-form page. The more I deal with these, the more I appreciate such refinements. Also: optional italics for the taxonomic names that need them ("i=1") and an optional gloss ("gloss="). Not every template needs such options, but they are handy. DCDuring TALK 22:08, 17 October 2014 (UTC)

Redesign-Redefine of Russian Entries[edit]

I'm going towards a large redo of many Russian pages, translating swathes from Russian Wiktionary with a focus to layout consistency, definition intuitiveness/coverage, and relevant design/coding.

Info on en-Wiktionary is generally inadequate for translating literature and often confusing for basic words (e.g. 'весь', see below). We already have all the necessary info, but only on Wiktionary-ru, hence inaccessible to casuals (many definition examples cited there derive from literature). I started translating Dostoevsky ( https://github.com/icarot/bk ), which was when such inadequacies became more obvious.


1) Collaborate with Grease Pit to try to normalize the data layout as consistently as possible, for parsing by robots. A parser/morphological analyzer needs quality, open data. Hacky consistency = hacky parse.

2) Improve word-count and definition count immensely. On the order of a few thousand for one of them. Even ru-Wiktionary is occasionally lacking in this department.

3) Clean messy pages, i.e. 'весь' (which confound the novice with the unintuitive concept that Russian uses declensions to represent irregular meaning on an unusually multi-purpose [pronoun-adjective] word), and does not represent all of the critical meanings.

4) Pronunciations from ru-Wiktionary as well. Ours are sufficient but different (we use phonemic vs. narrow transcriptions). In my opinion, the narrow transcriptions are better since they reveal useful subtleties of pronunciation without adding obscure IPA symbols. The main changes would be notating non-phonemes when ru-Wiktionary decided to do so and we did not, such as replacing our alveolar approximants with velarized allophones, and notating unusual instances of vowel allophony, or secondary stress. In short, copying the more precise and still friendly transcriptions from ru-Wiktionary. Consensus?

What are desired improvements I've missed for Russian translations which can be directly bettered from conventions and the scope of information on Russian Wiktionary? Looking for criticisms, guidance, etc. I wouldn't just run rampant without letting the community know what was going on, or asking for help.

Main Points Noted

  • Ivan I can help generate stubs [..] — that would be brilliant! I'd do the same, using lemmas from Dostoevsky. I'll use the corpora from ru-Wiktionary (i.e., National Russian Corpus) because if it's there, logically I assume it's license-compatible. I agree with you about Google Translate — they can't possibly have the copyright on that data. But we should verify to make sure.
  • Ivan German article in Spiegel and there were like 2-3 missing words in every single sentence. I can imagine that the statistics for Dostoevsky are even worse. It has become embarrassing. We should have some kind of stubs for statistically top 20k words in every language IMHO — I think this is a fantastic idea. And you're absolutely right about your inference about Dostoevsky. It's the English equivalent of reading the word 'snicker' and having no entry whatsoever. This is middle- or high-school vocabulary, and is a large problem as a whole for practical use as a dictionary. Can we reach a consensus for doing this specifically for ru-articles?
  • Wikitiki89 do not change the layout without discussing it first. — Main change wanted is inflection tables. These on en-Wiktionary waste huge amounts of space. We should copy ru-Wiktionary's approach: a clean, uncluttered overview of an inflection pattern. While we're on the topic of morphology, I want Andrey Zaliznyak's inflection descriptions from ru-Wiktionary as well. He uses one number and one letter for each word to comprehensively cover the morphology and stress pattern of Russian. I'll work on translating the description from ru-Wiktionary when I get a moment.

Icarot (talk) 00:18, 18 October 2014 (UTC)

We have seen you talking but we haven't seen you working :). You're welcome to demonstrate your ideas. Yes, we need more Russian entries and some entries may need fixes or improvements but you can't make major changes without a prior agreement. --Anatoli T. (обсудить/вклад) 05:20, 18 October 2014 (UTC)
  • Just a heads-up: Any automatic transmission of data from Russian Wiktionary into English Wiktionary has to clearly indicate the source of the data in the edit summary to prevent copyright violation. --Dan Polansky (talk) 05:34, 18 October 2014 (UTC)
  • Feel free to make any changes you want to content, but do not change the layout without discussing it first. --WikiTiki89 14:25, 18 October 2014 (UTC)
  • @Icarot: I can help you generate stubs for Russian nouns, adjectives, verbs and adverbs (the rest are a closed category and mostly covered). Stubs would be entries like in this category - the only thing they are missing are definitions. I could help extract a list of missing lemmas from a particular work. We could also pregenerate a list of examples for every entry and format them using the {{usex}} template, by taking them from ru Wiktionary, glosbe, parallel corpora databases, subtitles, google translate and so on, that editors could easily copy/paste into entries that are missing them. Don't worry about associations (derived terms, *nyms, morphological etymologies etc.) - those can be largely automated once entries with definitions are created. The primary focus should be on coverage. --Ivan Štambuk (talk) 07:32, 19 October 2014 (UTC)
    • Not sure why you are not continuing with this crap in Serbo-Croatian Wiktionary. It already has more than 100 000 Serbo-Croatian definitionless entries. If Wiktionary users are so hungry after such content as you posit, Serbo-Croatian Wiktionary could become one of the most visited Wiktionaries soon. Unless it gets shut down due to copyright violation, that is, such as because of automated lifting of data from Google translate as you seem to suggest above. --Dan Polansky (talk) 07:58, 19 October 2014 (UTC)
      Inflections cannot be copyrighted, the databanks such as HJP are completely free. Besides, I fixed many errors in them, and used two others as well. Definitions on the other hand can be copyrighted, and are nevertheless abundantly stolen by many FL Wiktionaries without anyone so much raising an eyebrow. Don't worry Polansky, soon I'll add many such stubs for Czech as well. --Ivan Štambuk (talk) 08:07, 19 October 2014 (UTC)
      • As you know from a previous discussion on the subject with copious participation, there is no consensus supporting your mass creation of definitionless entries. There is no consensus for blocking that behavior either, though. You may get blocked in the process nonetheless; if I were a crat, I would have blocked you by now for entering definitionless rubbish. You may also get blocked for the above cynical utterances of disrespect toward copyright; if I were the operator of this website, I would block you for that. In the meantime, I will take this opportunity to register my annoyance. --Dan Polansky (talk) 08:16, 19 October 2014 (UTC)
        A wide consensus is not necessary for language-specific work (The original discussion was for all languages). A few editors agreeing and working together is enough. The rest can complain about it all day for all I care. (It seems to be the only thing that you do anyway.) Just looking at the content of Category:Czech nouns: We have 13k Czech nouns and 95% of them don't have inflection and pronunciation. I can guess the meaning of 90% of them and I've never studied Czech in my life. I know it's hard to accept that most of your work has been futile, but such is life. Google Translate is based on statistical correlation in parallel corpora not owned by Google and its translation pairs are uncopyrightable, and can completely substitute all of the work you've done. Working smart not stupid is the way to go, using bots and free databases for heavy lifting and not wasting time on typing wiki syntax. --Ivan Štambuk (talk) 08:31, 19 October 2014 (UTC)
        • Re: 'The rest can complain about it all day for all I care. (It seems to be the only thing that you do anyway.)': That is obviously untrue; it suffices to inspect my mainspace contribution to see otherwise. I propose you use your blocking tools to block yourself for that remark. --Dan Polansky (talk) 08:36, 19 October 2014 (UTC)
        • Re: "I can guess the meaning of 90% of them": Very unlikely. --Dan Polansky (talk) 08:38, 19 October 2014 (UTC)
          Well I took a look at the last 50 contribs of yours, and the only novel mainspace edit is some English misspelling. Anyway, my point was that you've invested too much time into easily replicable manual labor so that you oppose stubbing not by reason but principle. See: neo-Luddite. We have too few editors to do everything manually, and after 10 years we're still missing thousands of top words in many major languages. The other day I was reading a German article in Spiegel and there were like 2-3 missing words in every single sentence. I can imagine that the statistics for Dostoevsky are even worse. It has become embarrassing. We should have some kind of stubs for statistically top 20k words in every language IMHO (including translations). Regarding blocking - using words such as crap or rubbish when referring to other people's work is considered impolite and could be a cause for a block. --Ivan Štambuk (talk) 08:56, 19 October 2014 (UTC)
          Are you semantically challenged? Which part of "the only thing that you do" do you fail to understand? Some recent contributions are [2] and [3]. Your ridiculous insults and inaccuracy are just tiresome. --Dan Polansky (talk) 09:19, 19 October 2014 (UTC)
          You've made ~500 mainspace edits in 4 months, most of which are translation pairs. I could in a few hours write a script that would generate both those and inflections and pronunciations. 4 months of work reduced to a few hundred lines of code. I can even extract context labels from dicts. I understand your anger but there is no need to project it towards others. Behave yourself. --Ivan Štambuk (talk) 09:36, 19 October 2014 (UTC)
          • My point is that what you said was clearly false. I still see no "I stand corrected". Actually, when one rereads your posts above, they are full of obvious inaccuracies. I am not sure why I care to respond to that sort of communication style that is inaccurate by design, and whose author never says "I stand corrected, I was wrong". --Dan Polansky (talk) 09:44, 19 October 2014 (UTC)
            Natural languages are too primitive to convey the nuances of meaning representative of the real world. Nature is stochastic and statistical, and there really exists no such thing as true or false, right or wrong. In practice "never" means "almost never/in 0.something % of cases", and "all" means "100% for all practical purposes". It's real life 101. But I digress. If you don't have anything to say regarding my points I suggest that we terminate this interlocution.--Ivan Štambuk (talk) 10:05, 19 October 2014 (UTC)
            • Re: "Natural languages are too primitive to convey the nuances of meaning representative of the real world." No one should be allowed to get away with this sort of continental nonsense. The relevant distinctions are very easy to express in natural language: there is a clear, easy to understand difference between "The only thing you do is X", "You do almost nothing but X", and "Most of what you do consists of X". No rocket science, nothing to do with stochastic nature of the real world. As I said, remind me of the occasion on which you admit you made an error rather than blaming natural language for lack of expressive power. Your sort of response to clear refuting examples is the sort of behavior which Popper's philosophy of falsificationism was intended to combat. --Dan Polansky (talk) 17:51, 24 October 2014 (UTC)
          Re: 'using words such as crap or rubbish when referring to other people's work is considered impolite and could be a cause for a block': That's utter rubbish. You can hear "rubbish" all the time, used by well-educated and generally polite people. These words are not the most polite forms available, but fit well to describe the sort of content that dominates the Russian Wiktionary. --Dan Polansky (talk) 09:23, 19 October 2014 (UTC)
          I'm not sure what kind of polite people you socialize with, but referring to other people's work as crap and rubbish and them as challenged (a jocular pejorative) is generally reserved for intimate contexts where they would not perceive it as an insult (e.g. family or close friends). Russian Wiktionary is doing fine, thanks for asking. And so will the Serbo-Croatian Wiktionary. Not so long ago the SC Wikipedia was ridiculed on similar grounds, and it is now bigger than any of the hr/bs/sr pedias, with the highest growth rate. --Ivan Štambuk (talk) 09:36, 19 October 2014 (UTC)

@Icarot: Feel free to add definitions to Category:Russian entries needing definition, generated by User:Ivan Štambuk, which I have been working on. Plenty of work to do! I'll repeat what was said before: please don't change the design without a prior agreement. As I said before, we haven't seen you working yet. --Anatoli T. (обсудить/вклад) 02:37, 24 October 2014 (UTC)
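The idea of extracting missing words from a particular work, raised earlier in this thread, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `existing_entries` stands for a set of page titles obtained by some other means (for example from a database dump), the function name `missing_words` is invented here, and the sketch counts inflected word forms as they appear in the text, since reducing them to lemmas would need a morphological analyzer that is not shown.

```python
import re
from collections import Counter


def missing_words(text, existing_entries, top_n=10):
    """Return the most frequent word forms in `text` that have no entry.

    `existing_entries` is assumed to be a set of casefolded page titles.
    Forms are counted as found; no lemmatization is attempted.
    """
    # Runs of Unicode letters only (no digits, underscores or punctuation).
    words = re.findall(r"[^\W\d_]+", text.casefold())
    counts = Counter(word for word in words if word not in existing_entries)
    return counts.most_common(top_n)


sample = "весь весь весь мир мир дом и"
print(missing_words(sample, existing_entries={"и"}))
```

Sorting by frequency puts the forms whose absence is felt most often at the top, which fits the suggestion that coverage should be the primary focus.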

IPA, language code and error message[edit]

Whatever changes were made to IPA modules to make older pages (2013) have conspicuous red error message in the IPA section should be undone. Example: this revision. Old revisions should look as legible and sane as possible; this is not. In general, IPA templates should not require the language parameter; filling-all-the-fields concerns should be delegated to editors with a shovel who have no real interest in building the dictionary. --Dan Polansky (talk) 05:31, 18 October 2014 (UTC)

I agree that the lack of a lang parameter shouldn't result in an error message, but we don't have any editors who have no real interest in building the dictionary. People with no interest in building the dictionary don't become editors. —Aɴɢʀ (talk) 07:00, 18 October 2014 (UTC)
I completely agree that there shouldn't be an error message. A cleanup category would be sufficient. --WikiTiki89 14:27, 18 October 2014 (UTC)
I was gonna say exactly what Wikitiki89 said. Renard Migrant (talk) 11:49, 24 October 2014 (UTC)
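The alternative favoured in the replies above, a cleanup category instead of a visible error, amounts to logic like the following Python sketch. It is not the code of the IPA modules (which are written in Lua); the function name `render_ipa` and the category name are hypothetical and chosen only for the example.

```python
def render_ipa(transcription, lang=None):
    """Return the displayed text and a list of tracking categories.

    A missing language code never raises an error and never changes the
    displayed text; it only adds a cleanup category.
    """
    categories = []
    if lang is None:
        # Hypothetical category name, for illustration only.
        categories.append("IPA pronunciations with no language code")
    return f"IPA: {transcription}", categories


print(render_ipa("/kæt/", lang="en"))
print(render_ipa("/kæt/"))
```

With this shape, old revisions that lack the parameter render exactly as before, and editors who want to fill in the missing codes can work through the category.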


There's a lot about this entry that makes me nervous: the word was apparently coined in a journal article published in mid August, with some or all of the authors working at Alabama State University in Montgomery, Alabama. The Wiktionary article was created at the beginning of September by an anonymous contributor whose IP is assigned to ASU. A variety of IPs from the same southern Alabama/northern Florida area as ASU, as well as an account that seems to bear the name of one of the authors, have been adding references, which are all articles/blurbs about either the research program at ASU or about the original article itself. It's tagged as a hot word, but it looks to me to be lukewarm at best: a Google search does show the word in a blog or news article here or there, but this isn't the kind of strong, widespread adoption we saw with olinguito.

I can't escape the impression that we're being used for promotional purposes, and I feel we need to do something- but I'm not sure whether to tag this for cleanup to prune out all the PR from the references, or to rfv it, or something else. It certainly doesn't meet the letter of the CFI, since it's only 2 months old, but how do we decide whether this is "hot" enough to keep it provisionally as a hot word? Chuck Entz (talk) 05:04, 19 October 2014 (UTC)

Some use outside of the group promoting it would be nice. I'd RfV it for starts. DCDuring TALK 12:44, 19 October 2014 (UTC)
It's hard to say which of the "references" have print counterparts or can otherwise be considered to be durably archived. At least one is a self-proclaimed blog. Nothing in CFI says we have to include something as a hot word, especially when it is not at all clear that use would ever get beyond the field of forensic science and practice. I think that means that it would in the end come to a vote, which usually takes place at RfD. And then there's the increasingly important question of how we address the decline of print media.
This particular case seems to me to be part of a campaign by a university PR office. RfC seems inappropriate as the entire issue is with the attestation. I'd RfV it to get a slow clock started. We need to have properly formatted attestation to facilitate wide participation in review. Why should each participant have to click through to each website? DCDuring TALK 13:17, 19 October 2014 (UTC)

Headwords for reconstructed languages[edit]

So I'm putting in the first steps towards an appendix for Proto-Samic, a fairly well-reconstructed proto-language. I'm however wondering what would be a good choice of headword for verbs?

  1. Use just the bare verb stem. This is what the main published sources, including the 1989 dictionary by Lehtiranta [1], seem to do: e.g. *ëstë (to be in time). However this is not an actual wordform by itself.
  2. Use the verb stem, marked by a hyphen to be just a stem and not an actual wordform: e.g. *ëstë-.
  3. Follow the standard for the modern-day Samic languages (and, for that matter, our PF and PGmc appendices) and use the infinitive: e.g. *ëstëtēk. These are not directly listed in the source literature, but they are simple enough to assemble, and the ending itself is uncontroversial.

Worth noting is that some otherwise homophonic roots would be distinguishable under options #2 and #3 (e.g. *ćēkćë 'osprey' ~ *ćëkćë- 'to kick'). OTOH there also exist roots for which it is not clear if the original meaning was nominal or verbal (*teampō 'to become wet / seaweed'), and their placement would end up arbitrary if we strictly separated verbs and nominals by citation form.

(Discussion on further matters perhaps ought to go at Wiktionary talk:About Proto-Samic. [EDIT] 15:34, 24 October 2014 (UTC): Page now up.)

--Tropylium (talk) 20:46, 19 October 2014 (UTC)

[1] Lehtiranta, Juhani. 1989–2001. Yhteissaamelainen sanasto ('Common Sami Vocabulary'). Suomalais-Ugrilaisen Seuran Toimituksia 200. Helsinki: Suomalais-Ugrilainen Seura. ISBN 951-9403-23-X.

I would choose option 3 mainly because it lines up better with modern terms and makes comparisons easier. It also matches our treatment of Proto-Finnic, which also uses the infinitive as the lemma. —CodeCat 21:21, 19 October 2014 (UTC)

On proper nouns

Previous discussions: Wiktionary:Information desk/2014/July#Are names always proper nouns (or proper names)?, Wiktionary:Beer parlour/2014/July#Proper nouns

Why do we treat proper nouns as a separate POS from nouns? Proper nouns are just a specific type of noun; having separate headings and categories for "Proper nouns" as opposed to "Nouns" is a bit like having separate headings and categories for "Transitive verbs" as opposed to "Verbs". Merging proper nouns in with nouns would solve a lot of ambiguity problems, such as words like Friday and Christmas that can be used both as a proper noun and as a common noun, not to mention the problem that there is no real clear cross-linguistic definition of what constitutes a proper noun. (Most attempts at defining the difference I've seen apply only to English and don't necessarily work for other languages.) —Aɴɢʀ (talk) 16:26, 20 October 2014 (UTC)

I definitely support this. Furthermore, even if this does not pass, I would like to propose categorising all proper nouns as nouns as well, and merging Category:Proper noun forms by language into Category:Noun forms by language. —CodeCat 16:59, 20 October 2014 (UTC)
As a general rule if something (eg, a classification, attribute) is reasonably well researched and documented in a given language and has lexical implications, then we should have it in that language. If other languages don't have the distinction or don't have it documented then we shouldn't have it for those languages. I don't see why we should dumb down presentation of any language, let alone the host language, for the sake of uniformity or the convenience of translators or Lua practitioners.
For English and for taxonomic names, the notion of proper nouns is well-documented and useful. We could make the presentation simpler by acknowledging that large classes of English proper nouns have perfectly predictable (ie, effectively syntactical) patterns of common-noun use. I always wonder whether we can prevent contributors from adding "missing" information such as an Adjective PoS section to cover attributive use of an English noun, but that problem seems to be declining. DCDuring TALK 17:21, 20 October 2014 (UTC)
But we don't have to indicate the "propriety" of nouns by having "Proper noun" considered a separate POS. We could tag nouns {{lb|en|proper}} or {{lb|en|common}}, for example, the way we already label verbs {{lb|en|transitive}} or {{lb|en|intransitive}}. It isn't "dumbing down" the presentation of the language to aim for accuracy as well as precision. —Aɴɢʀ (talk) 18:34, 20 October 2014 (UTC)
You have now taken a position that is better defined than your initial posting, which expressed opposition to proper noun headings and categories. And your initial posting included "the problem that there is no real clear cross-linguistic definition of what constitutes a proper noun", which seems like the kind of cross-linguistic uniformitarianism that is often proposed here and which is probably what has won you CodeCat's support.
Your statement above that 'having separate headings and categories for "Proper nouns" as opposed to "Nouns" is a bit like having separate headings and categories for "Transitive verbs" as opposed to "Verbs"' implies that you are opposed to such headers and categorization in the case of entries that are now proper nouns. But we have categorization of "Intransitive verbs". Are you really opposed to that as well? The ratio of English proper noun entries to total English noun entries is even smaller than the ratio of intransitive English verbs to total English verbs, so the category is arguably more useful. Given our current "efficient" method of implementing labels, we cannot use "what links here" and a template to construct a list of items so labeled, leaving us with only categories, programs run on dump runs, and text searches as ways of constructing such lists from labeled definitions. Speaking from extensive and recent experience, I can say that text searches are not fully satisfactory and that programs run on the XML dumps are inconvenient for many ad-hoc purposes.
Are you opposed to the proper noun category as well as to the proper noun heading? Are you in favor of proper labeling of individual definitions before the proper noun heading is eliminated? Are we sure that proper labeling does not require manual review? Who do you propose do the checking and conversion? DCDuring TALK 19:15, 20 October 2014 (UTC)
I'm not proposing anything yet; at this point all I want is discussion. I do want to consider getting rid of the L3 header, but you're right that parallelism with transitive and intransitive verbs does suggest retaining Category:English proper nouns as well as creating Category:English common nouns. As for a cross-linguistic definition, I'm not even talking about languages that aren't considered to have the proper/common distinction (though I'm not aware of any languages that don't), I'm talking about a definition that would apply to all languages that are considered to have both kinds of nouns. Even for such syntactically similar languages as English, French, and German I don't know how to define "proper noun" in a way that will apply to all three languages. And if each language has to have its own language-specific definition, that's a good indication to me that the concept of "proper noun" has no linguistic basis at all and is useful only for pedagogy. And if it turns out there is no adequate definition of "proper noun", then we shouldn't use the label template or the category at all. What do other dictionaries do? Do other dictionaries label proper nouns separately? What criteria do they use? For that matter, what criteria do we use? Why are AB-yogurt and air chief marshal proper nouns? —Aɴɢʀ (talk) 19:36, 20 October 2014 (UTC)
I'd be willing to stake my reputation as a linguist on there being massive overlap among the sets of things considered as proper nouns in all languages. Many folks don't act as if taxonomic names are proper nouns, but most theoretical taxonomists seem to. And then there is the proper name/proper noun distinction. DCDuring TALK 21:48, 20 October 2014 (UTC)
But "being considered a proper noun" isn't a definition. And I'm not sure there's even always overlap within the same language. For example, we call language names like Latin and Sanskrit proper nouns, just like names like Noah and London. But the American Heritage Dictionary, which gives no part of speech info for Noah and London, labels Latin and Sanskrit "n.", which they otherwise do only for common nouns. So are language names proper or common? What usage of taxonomic names indicates that theoretical taxonomists treat them as proper nouns? (That's an actual question, not a rhetorical one.) Considering our first definition of [[proper name]] is "proper noun", I wonder what the distinction between the two is supposed to be. —Aɴɢʀ (talk) 22:12, 20 October 2014 (UTC)

Support tentatively (but I will see how the discussion goes), even if it causes Japanese, Chinese (only Mandarin, Min Nan/Min Dong and Hakka) and Korean transliterations to become lower case (various dictionaries use different standards for capitalisation in these languages; place and personal names are usually capitalised, but not by all dictionaries). There's definitely no need to treat language names, demonyms, month and weekday names as proper nouns. Various languages here just follow English when using proper nouns. Transliterations, which are never capitalised, don't need and don't benefit from this distinction at all. E.g. Arabic nouns are just nouns. --Anatoli T. (обсудить/вклад) 22:59, 20 October 2014 (UTC)

@Atitarev: Actually, in Arabic there is a very important distinction between proper and common nouns. Proper nouns are automatically definite and never take the definite article الـ (al-) or possessive suffixes, and usually do not take nunation, in which case they also have a slightly different declension pattern. For example: مِصْرُ الْقَدِيمَةُ (miṣru al-qadīmatu, Ancient Egypt) and فِي مِصْرَ الْقَدِيمَةِ (fī miṣra al-qadīmati, in Ancient Egypt). The same applies to Hebrew and Aramaic. --WikiTiki89 21:05, 21 October 2014 (UTC)
Proper nouns never take the definite article in Arabic? So العراق‎, السعودية and الإسكندرية are common nouns? People sometimes make the same claim about English, that proper nouns never take the definite article, but then Netherlands, Gambia, and Philippines (not to mention Ukraine and Crimea in more old-fashioned varieties) would have to be called common nouns. —Aɴɢʀ (talk) 22:29, 21 October 2014 (UTC)
Well in those cases, they don't take another definite article because the definite article is part of the proper noun. For your English examples, I would say that "the Netherlands" is the proper noun, while just "Netherlands" is an incomplete proper noun (or the plural of "Netherland"). --WikiTiki89 22:45, 21 October 2014 (UTC)
Yes, some proper nouns may become diptotes but this probably has to do with their definiteness, rather than the fact that they are proper nouns. The thing is also, not ALL proper nouns are diptotes, e.g. (with full vowelisation) مُحَمَّدٌ (muḥammadun) and, as Angr mentioned, they can also take a definite article, as in العِرَاق (al-ʿirāq) "Iraq" and الأُرْدُنّ (al-ʾurdunn) "Jordan", although the nisba doesn't have it: عِرَاقِي (ʿirāqī) "Iraqi" and أُرْدُنِي (ʾurdunī) "Jordanian". There are some rules about which proper nouns can be diptotes - the length, whether they are loanwords or native Arabic, the endings, certain patterns (e.g. "fuʿal"). --Anatoli T. (обсудить/вклад) 12:35, 22 October 2014 (UTC)
My whole point was that their definiteness (more so, the fact that they cannot be made indefinite and cannot take possessive suffixes) is what makes them proper nouns. Nisbas are not proper nouns, so I don't see how they are relevant. You cannot, for example, say مِصْرُكَ (miṣruka, your Egypt) or عِرَاقُكَ (ʿirāquka, your Iraq); or if you do say that, then you are turning it into a common noun. As for مُحَمَّدٌ (muḥammadun), I did use the word "usually" for a reason. --WikiTiki89 12:56, 22 October 2014 (UTC)
Well, it's obvious that proper nouns, like unique place names, are definite but I personally don't see this really as a grammatical difference, to separate them as proper nouns, they can sometimes take a definite article, they can also take possessive suffixes (converting to common nouns, if you wish), they can sometimes be triptotes (and common nouns can be diptotes). These features are not reliable (also hard to verify, since ʾiʿrāb is seldom written, not so often pronounced in full). I found some rules for diptotes for proper nouns but my source doesn't mention how many are triptotes, so, not sure if the list is big. My nisba examples were just to show that الـ (al-) is not part of the word. Since Arabic grammarians do mention Arabic proper nouns, I'll drop this point specific to Arabic. I only think that language names and nationalities should be common nouns in Arabic, reserve proper nouns for place, people's and company names. --Anatoli T. (обсудить/вклад) 14:24, 22 October 2014 (UTC)

A much simpler solution would be to rename the current Noun to Common noun. I would strongly oppose the introduction of an Intransitive verb POS, but I think it's very helpful to readers to keep both POSs when they are meaningful in the language, these two kinds of nouns being used very differently. The precise limit between proper nouns and common nouns only depends on tradition in each language (e.g. we consider italien (the language), septembre or Parisien, a capitalized word, as common nouns in French). Note that, generally speaking, all proper nouns can be used as common nouns (but this does not make them common nouns), and common nouns can be used as proper nouns, this cannot be considered as ambiguity. Lmaltier (talk) 20:26, 21 October 2014 (UTC)

  • We need to indicate whether a noun is common or proper in some way. Whether this is in the POS heading or somewhere else makes little difference, but it seems that the POS heading is the most obvious and best place for it. Verbs do not need transitive/intransitive distinctions as much because it is usually obvious from the definition. --WikiTiki89 21:05, 21 October 2014 (UTC)
    • What's the evidence that the two kinds of nouns are "used very differently"? They seem to be used exactly the same way to me: as the subject or direct object of a sentence, as the object of a preposition, etc. Why do we need to indicate this apparently undefinable and artificial distinction? And if we do, why is the POS heading the most obvious and best place for it? To the extent the distinction actually exists, it's usually obvious from the definition too. —Aɴɢʀ (talk) 22:29, 21 October 2014 (UTC)
      • In some languages, it's clear that they are used very differently, and that they are very different from the reader's point of view. In French, the article is usually used with common nouns, not with proper nouns (it's much less simple, e.g. the definite article is normal with most country names, but this is the general idea). Lmaltier (talk) 05:49, 22 October 2014 (UTC)
        • The only thing that's clear to me so far in this discussion is that many languages have nouns that are definite without the markers of definiteness that are usual in that language, such as being governed by a definite article, a possessive determiner or the like. But in none of the languages discussed so far is that set of nouns exactly coterminous with a set of nouns that can be defined by a semantic property such as being the name of a person, geographical location, language, etc. In English, Arabic, and German, most geographical names don't use the definite article, but some do, and statements like "the definite article in the Netherlands is part of the name" are simply begging the question. In Irish, most language names do use the definite article except in certain constructions, but at least one (Béarla (English)) never uses it. So if we want to label nouns by this property at all, we should label them as being definite even without a definiteness marker, rather than implying that there is some sort of semantic property that causes nouns to be "proper nouns" and that their syntactic behavior results from that. —Aɴɢʀ (talk) 15:13, 22 October 2014 (UTC)
          • No, no, not at all, the only possible criterion is the tradition in the language. It was only an example to show that being a proper noun often has a major impact on the use, including grammatical rules to be used. Lmaltier (talk) 17:29, 22 October 2014 (UTC)

Another argument is that paper dictionaries including both common nouns and proper nouns sometimes have a fully separate part for proper nouns (it's the case of a best-seller dictionary for French: Petit Larousse Illustré). Readers may be used to this clear separation. Lmaltier (talk) 17:35, 22 October 2014 (UTC)

  • I don't think "tradition in the language" is a reason at all, especially since the vast majority of the world's languages don't have a tradition about it one way or the other. If the distinction between common nouns and proper nouns is linguistically real, it must be possible to come up with a definition that applies to all languages regardless of traditional grammars. —Aɴɢʀ (talk) 18:42, 22 October 2014 (UTC)
    • I don't think so. In any case, stating that the French nouns poker, septembre or arménien (the language) are proper nouns would clearly be wrong. They are not proper nouns in French. Lmaltier (talk) 18:58, 22 October 2014 (UTC)
      • But why not? What definition of "proper noun" are you using to determine that? Capitalization alone? Because if that's the only criterion that can be used to distinguish proper nouns from common nouns, then the distinction is definitely nonlinguistic. —Aɴɢʀ (talk) 19:40, 22 October 2014 (UTC)
        • No, we consider Parisien as a common noun in French, too, despite capitalization. When I refer to tradition of the language, I mean that the general meaning is always the same (see proper noun), but how it's interpreted precisely may depend on languages in some cases (in most cases, it's the same in all languages recognizing proper noun as a word category). Lmaltier (talk) 19:52, 22 October 2014 (UTC)
          • So the distinction is made on the basis of native speakers' intuitions? A noun is a proper noun because it feels like a proper noun? —Aɴɢʀ (talk) 19:59, 22 October 2014 (UTC)
            • This intuition is based on the tradition of the language, on how specialists of the language usually consider the word. In French, traditionally, proper nouns are names of places, people (and peoples), companies, brands, historical events, works of art or books, not much more. Sometimes, we hear about proper adjectives in English (seemingly according to capitalization), this word is meaningless in French. Lmaltier (talk) 05:59, 23 October 2014 (UTC)
              • So still no definition, just an appeal to authority. I'm becoming more and more convinced there's no such thing as a proper noun. —Aɴɢʀ (talk) 12:05, 23 October 2014 (UTC)
                In French the definition is simple: a proper noun is used to describe a unique being or thing. Every modern French dictionary unambiguously distinguishes proper nouns from common nouns: Larousse, Robert, TLFi, Dictionnaire de l'Académie française... Just because you can't find a universal definition for a proper noun doesn't mean that you can ignore this distinction when it is part of a language like French. Dakdada (talk) 13:10, 23 October 2014 (UTC)
                • If my family owns one dog and my mother says "Have you fed the dog?", then "the dog" refers to a unique being; does that make it a proper noun? What about language names like arménien mentioned above? Is that not a unique thing? Then why is it not a proper noun in French? Just because dictionaries invent distinctions to make life easier for language teachers, that doesn't mean those artificial distinctions are actually part of the language. —Aɴɢʀ (talk) 13:49, 23 October 2014 (UTC)
                  That's just the + dog. It doesn't change the fact that dog is a common noun. Language names are debatable, but obviously I can't convince you if you really don't see any difference between e.g. city and London. Dakdada (talk) 16:30, 24 October 2014 (UTC)
The kind of thing that a proper name names can include a lineage (real, hypothetical, or conventional), as a Roman gens or a taxon. It can include a people, race, tribe, breed, family?, etc, even when they are not lineages. All of these can be plural in form, but they are considered to be referring to a single entity. Such a word, whether singular or plural, when referring to an individual member or subset of any such grouping, seems to me to be a common noun.
More generally it is a question of convention, as almost all actual language is, as opposed to part of some ephemeral rational scheme, purported to be universal and timeless, but actually just a hypothesis.
If a given definition has exceptions, that does not invalidate the definition, which is usually of the typical member of the class. Wittgenstein's discussion of game (or was it Spiel?) should be informative. DCDuring TALK 14:29, 23 October 2014 (UTC)
A distinction can be made for analytical purposes (not, IMO, for Wiktionary presentation purposes) between proper names and proper nouns. Mary is a proper noun, sometimes serving as a proper name (where the context makes it sufficient to uniquely identify the individual) and sometimes as part of a noun phrase (Mary Ellen Smith) that serves as a proper name in other contexts (but not necessarily all possible contexts). That White House is a proper name, which we present as a proper noun, does not make House a proper noun or proper name. House is a proper noun by virtue of its use as a surname.
It is hard for me to believe that the request for a definition is anything but a rhetorical ploy, as such definitions are abundant and adequate for most purposes. If we need something more for purposes of knowing what goes under a given language's Proper noun heading or into the category, we can impose the host language's conventions, either universally or by default, allowing exceptions for the conventions of other languages. We already allow orthographic departure from English usage and certainly don't impose English grammar (eg, use of determiners) on other languages, not even PoS headers, useful though they may be. If someone would like to document the proper noun/proper name practices of a language in an appendix, they would be doing the project a service. DCDuring TALK 14:29, 23 October 2014 (UTC)
No, the request for a definition is not a rhetorical ploy. I'd genuinely like a definition because I am often uncertain whether to label a particular noun as a ===Proper noun=== or not, especially in languages other than English. Usually I simply have to rely on how the English equivalent is labeled. Most conventional definitions seem to be circular and therefore useless, as in: "When is a noun capitalized in English? When it's a proper noun. OK, so when is a noun a proper noun in English? When it's capitalized." Either that or hopelessly vague, as in "a proper noun is the name of a specific, unique being", which doesn't explain why The Hague is a proper noun that just happens to include the word the, but the dog is a common noun made definite by the presence of the definite article. —Aɴɢʀ (talk) 17:23, 24 October 2014 (UTC)
The Hague is the name of a particular city. The dog is not the name of a particular dog (just the + dog). It has nothing to do with the definite article or the capitalization, which are secondary and language related. If you want definitions, what about w:Proper noun? Dakdada (talk) 17:53, 24 October 2014 (UTC)
If a distinction can be made between definite and indefinite reference, then it's a common noun. Otherwise it's a proper noun. If both, then it's both. --Ivan Štambuk (talk) 18:38, 24 October 2014 (UTC)
I can't venture anything about languages other than English.
Not all capitalized words or expressions in English are proper nouns. You should discard with prejudice any reference that says otherwise.
The Hague (sometimes the Hague) is a proper noun because of its definition. I expect that it has the attached because it is a calque of Den Haag.
The is attached to Netherlands in running text (but not in mailing addresses, etc.), probably because of the historical Nether Lands, whether factual or imagined.
In English it is usually not too hard to distinguish in current and recent usage between a definite expression (usually with the) that describes or characterizes something and a proper name that includes the. But it was not too long ago that an expression like "John, sawyer" served to uniquely identify someone on parish rolls.
In English the incompatibility of a proper name with a or any or every seems more indicative than the presence of the.
In English the hand of history and fashion is very visible. Usage dictates. How each usage gets started or terminates can be a very particular story. As a result I don't think there is a short list of rules and exceptions that covers all the cases. That is why WP needs a style sheet that documents its decisions about capitalization and why the taxonomic naming authorities have explicit rules. And why users need dictionaries and style guides. Wiktionary can do a better job of providing such lexical information than other references if we continue to be willing to do so. We can check corpora and style guides so users who trust us don't have to. DCDuring TALK 18:44, 24 October 2014 (UTC)

Small, doable modification to WT:CFI#Idiomaticity

WT:CFI#Idiomaticity sentence #1:

An expression is idiomatic if its full meaning cannot be easily derived from the meaning of its separate components.

Change this to

A multi-word term is idiomatic if its full meaning cannot be easily derived from the meaning of its separate components.

The changed part here is A multi-word term.

Rationale: WT:CFI does not define what an expression is, and the Wiktionary entry expression isn't any help either. Some multi-word terms like come in may not be considered expressions. Multi-word term is vastly better than term, because term could include single words with transparent meanings, like improvable, points (plural of point), reenter (enter again) and so on. I'm canvassing to see if there's enough support to make a vote of it. Renard Migrant (talk) 12:01, 23 October 2014 (UTC)

The sentence is better, but is it really useful anyway? Idiomaticity of multi-word terms should not be a condition for inclusion. ice hockey cannot be considered as idiomatic. Nonetheless, it's a term of the English language, and including it is therefore normal. Lmaltier (talk) 16:54, 23 October 2014 (UTC)
Support. I think it is useful. — Ungoliant (falai) 17:16, 23 October 2014 (UTC)
Lmaltier, I appreciate your input, but we also know from past experience it's just you that thinks this. Also I do consider ice hockey idiomatic. It has very different rules to hockey. Like, is table tennis merely tennis played on a table? I certainly don't think so! Renard Migrant (talk) 11:52, 24 October 2014 (UTC)
Of course. Nonetheless, the meaning can be easily derived from the meaning of its separate components (provided you know the sport, even without knowing its name). I copy the definition of idiom: An expression peculiar to or characteristic of a particular language, especially when the meaning is illogical or separate from the meanings of its component words. table tennis is not something peculiar to English or characteristic of English, and its meaning is not illogical nor separate from the meanings of its component words. You understand why I don't like this sentence as a criterion. Lmaltier (talk) 18:18, 24 October 2014 (UTC)
  • I have to oppose. I think the term "expression" was intended to cover both single words and multi-word terms. The new wording would not do that. Therefore, the new wording would no longer define what CFI:idiomatic means for single words like "redefine". Right now, "redefine" is idiomatic because its components are not separate enough. --Dan Polansky (talk) 17:41, 24 October 2014 (UTC)
    • But that interpretation is not the status quo. The status quo, although it's an unwritten rule, is to accept all single words (for varying interpretations of "word") as idiomatic regardless of morphological transparency. Or to say it another way, idiomaticity is not a factor in the inclusion of single "words". —CodeCat 18:23, 24 October 2014 (UTC)
      • What I have written is consistent with current common practice. For instance, we include "blueness", since while "blueness" is clear from "blue" and "-ness", the two are not separate, which matters for "An expression is idiomatic if its full meaning cannot be easily derived from the meaning of its separate components." --Dan Polansky (talk) 22:39, 24 October 2014 (UTC)
        • I can see this interpretation, just it wouldn't be my interpretation; blue and -ness are separate. "Separateness" doesn't mean "separated by a typographical space". Renard Migrant (talk) 13:15, 25 October 2014 (UTC)
          • I like your interpretation very much, and a good spot, well done! However if you think of a cake made of eggs, flour, sugar and butter, when you've made the cake are eggs, flour, sugar and butter separate ingredients or not? Of course they are! Just because you've used them to make one cake doesn't mean they aren't separate concepts. Renard Migrant (talk) 19:23, 26 October 2014 (UTC)
            • Then please extract eggs, flour, sugar and butter from your cake. If they are separate as you claim, this should be easy. Also: talking to yourself is a bad habit for a dungeoneer. Keφr 19:33, 26 October 2014 (UTC)
            • Another thing to note: a criterion like WT:COALMINE makes much more sense if single-word terms are automatically presumed idiomatic. Keφr 19:47, 26 October 2014 (UTC)

Dialect context labels - adjective, dialect name or place name?

There's something vaguely weird on croggan: The first sense is described as "Cornish" while the second is "Scotland", and the mixing of parts of speech stands out a bit. This isn't an isolated thing - among other British Isles dialects, we have "Wales", "Ireland", "Teesside" and "Yorkshire", but "Geordie" (rather than "Newcastle-upon-Tyne" or "Tyneside"), "Bristolian" (rather than "Bristol"), "Manx" (rather than "Isle of Man"), "Northumbrian" (rather than "Northumbria") and "Liverpudlian" (rather than "Liverpool", "Merseyside" or "Scouse").

I understand why we can't use (for example) Welsh or Irish as context labels in English-language entries (and by that logic, "Manx" is probably inappropriate too since there's a Gaelic Manx language), but the mishmash is a bit strange. Would people object to changing the labels to follow this pattern?

We use the proper name of the city/region that spawned it, except in the handful of cases where the dialect has a widely-understood name that is not etymologically related to its origins (Geordie, Pitmatic, Cockney, Scouse - possibly Cajun, although I don't know whether everything currently tagged "Louisiana" is actually Cajun English.)

It just seems a bit cleaner that way. "croggan" would then be (Cornwall, Scotland), "mam" would be (Scouse, Northumbria). Smurrayinchester (talk) 16:40, 23 October 2014 (UTC)

I think that using the adjective could be more practical. It would allow us to distinguish terms used in a place from terms used in the context of discussing a place. —CodeCat 16:45, 23 October 2014 (UTC)
I prefer using placenames. Using placenames in context labels for senses discussing the place is usually confusing and can always be improved by removing the label and amending the definition (i.e., at ABC “(Brazil) [] cities [] that form the most important industrial area in the country.” → “(geopolitics) [] cities [] that form the most important industrial area in Brazil.”). — Ungoliant (falai) 17:11, 23 October 2014 (UTC)
When we had context labels rather than a module, we used to redirect things like {{Scottish}} to {{Scotland}} so that both displayed (Scotland). I see no reason to discontinue this. Having said that an adjective is better if it's more accurate or easier to understand, so Geordie rather than Tyneside, I'm fine with that. Renard Migrant (talk) 11:55, 24 October 2014 (UTC)
We still do that, only everything is within the module. --WikiTiki89 14:52, 24 October 2014 (UTC)
Good, then let's keep doing that, unless people don't want to. Renard Migrant (talk) 13:17, 25 October 2014 (UTC)
I've noticed this inconsistency myself. The other 'restricted register' labels I can think of are adjectives ("dated", "archaic", "obsolete", "uncommon", "rare", etc), whereas the labels I can think of that are nouns indicate restricted topical contexts ("mathematics", "aviation", etc). Context labels should indicate when a word is restricted to a certain place's dialect, while definitions should indicate when it's topically connected to a certain place, IMO. Ungoliant has a good example of how to clear up a misuse (or at a minimum a confusing use) of "(Brazil)". So, my inclination would be to make all the 'dialect' labels adjectives, noting that "UK" and "US" are adjectives and so can stay as they are. - -sche (discuss) 04:26, 27 October 2014 (UTC)

Black's Law 2d going up at Wikisource

Just a heads up - I am currently creating OCR pages of Black's Law Dictionary, 2d Edition (1910) at Wikisource, and would eventually like to bring as much of it as is useful over here. Cheers! bd2412 T 21:02, 23 October 2014 (UTC)

Cool! Maybe you should make a Template:Black's 1910 or something, similar to {{Webster 1913}}, for entries taken from it. —Aɴɢʀ (talk) 21:26, 23 October 2014 (UTC)
Yes, that is a very good idea. bd2412 T 21:27, 23 October 2014 (UTC)
@BD2412: That is excellent news. Thank you for your efforts. — I.S.M.E.T.A. 23:27, 23 October 2014 (UTC)
Cool. Even if you don't bring it here. DCDuring TALK 00:27, 24 October 2014 (UTC)
Maybe you could link all the terms here, like [4] (a lot of work!) DTLHS (talk) 00:53, 24 October 2014 (UTC)
  • Where is it? --Ivan Štambuk (talk) 18:31, 24 October 2014 (UTC)
    • s:Index:Black's Law Dictionary (Second Edition).djvu. —Aɴɢʀ (talk) 19:05, 24 October 2014 (UTC)
      • Here is a treasure: "HALYWERCFOLK. Sax. In Old English law. Tenants who held land by the service of repairing or defending a church or monument, whereby they were exempted from feudal and military services". bd2412 T 15:57, 25 October 2014 (UTC)
        • Sadly, having done a bit of digging in the hope of creating an entry, it looks like the concept of halywercfolk/hailworkfolk/Holyworkfolk/holy-work-folk was only ever invoked once, when the Bishop of Durham tried to get the men who maintained the shrine to St. Cuthbert to fight the Scots. I've created an entry here, but all the citations seem to be about the same group of people. Smurrayinchester (talk) 08:50, 26 October 2014 (UTC)

Meta RfCs on two new global groups

Hello all,

There are currently requests for comment open on meta to create two new global groups. The first is a group for members of the OTRS permissions queue, which would grant them autopatrolled rights on all wikis except those who opt-out. That proposal can be found at m:Requests for comment/Creation of a global OTRS-permissions user group. The second is a group for Wikimedia Commons admins and OTRS agents to view deleted file pages through the 'viewdeletedfile' right on all wikis except those who opt-out. The second proposal can be found at m:Requests for comment/Global file deletion review.

We would like to hear what you think on both proposals. Both are in English; if you wanted to translate them into your native language that would also be appreciated.

It is possible for individual projects to opt-out, so that users in those groups do not have any additional rights on those projects. To do this please start a local discussion, and if there is consensus you can request to opt-out of either or both at m:Stewards' noticeboard.

Thanks and regards, Ajraddatz (talk) 18:04, 26 October 2014 (UTC)
I think you mean 'requests for comment'; here 'RfC' usually means 'request(s) for cleanup'. Renard Migrant (talk) 19:19, 26 October 2014 (UTC)

Mari terminology

Is there any particular reason why the two literary standards of Mari (the Uralic one) have been titled "Eastern Mari" and "Western Mari"? Following Ethnologue? I would suggest that "Meadow Mari" and "Hill Mari" are preferable, for at least two reasons:

  • The traditional subethnic self-designations are specifically "Meadow Mari" and "Hill Mari"
  • There exists an "Eastern dialect" (spoken in Bashkortostan) distinct from standard Meadow Mari. Hence the term "Eastern Mari" is ambiguous.

--Tropylium (talk) 14:46, 27 October 2014 (UTC)

In the case of Western Mari, yes, the name was just imported from the ISO / Ethnologue along with the code. Eastern Mari was previously called just "Mari", until "Mari (Sepik)" and "Mari (Austronesian)" were added to Module:languages and disambiguation became necessary. At that time (see the archived discussion; skip the first half, which is about Buryat) I went with "Eastern Mari" over "Meadow Mari" so as to conform to "Western Mari", and because "Eastern Mari" seemed to be more commonly used than "Meadow Mari". Oddly enough, Andrej Malchukov and Anna Siewierska's Impersonal Constructions: A cross-linguistic perspective (ISBN 9027287163), page 397, suggests that "Eastern" and "Western Mari" are the linguistic self-designations: "Mari has two literary variants Hill and Meadow Mari (or Western and Eastern Mari according to their own terminology)". OTOH, the difference in commonness is not large if you cut out the exceptional year 2003 (compare [5] to [6] and [7]), and there is the ambiguity you note: a few references refer to three or four Mari dialects and distinguish "Eastern" from "Meadow". And "Hill Mari" seems to be more common than "Western Mari". So I wouldn't object to renaming them both. - -sche (discuss) 20:21, 27 October 2014 (UTC)
I support such renaming. — I.S.M.E.T.A. 01:46, 1 November 2014 (UTC)

WMF grant request for a "Kids Visual Dictionary"

Hello all, I co-designed a Wikimedia outreach project to get a group of Indian kids to learn computer graphics while creating a real Wikipedia picture dictionary for basic English which they could be proud of! The whole team will be under the management of a professional graphic designer lady who previously worked at Yahoo Inc India. The IEG proposal is detailed there on meta. We are obviously thinking of illustrating the Wiktionaries for the most frequent words, and since the data will be structured, it could also help to build up further resources for various languages. As we are competing with other great projects as well, please take a look; your support for this Kids Visual Dictionary is also very welcome (here). Yug (talk) 18:45, 29 October 2014 (UTC)

Hello. You might like to take a look at the existing Wiktionary:Picture_dictionary. I don't think anybody has been actively working on that for a while, but a certain amount of work was done. Equinox 21:36, 29 October 2014 (UTC)


Is there a Wiktionary policy on which dialects to include in pronunciations and in rhymes? For example, your has pronunciations that would be non-standard on this side of the pond (including a recent addition that is common in my home dialect, but which I would expect to see only in a dialect dictionary). If we include every dialect variation, then the pronunciation section will take up the whole initially displayed page for many entries. Do we include all regional vowel mergers in rhymes? Personally, I would prefer to see only the "standard" pronunciations and rhymes, as given in major dictionaries, but I realise that we will probably not agree on "standard". Dbfirs 11:07, 30 October 2014 (UTC)

The pronunciation /jɝ/ is actually very common in American English and has nothing to do with vowel mergers, but rather with the re-stressing of a previously unstressed and reduced vowel. It seems strange to me, however, to include rhymes at all for your, since as far as I know this word can never occur in a rhyming position: it must always be followed by a noun, since otherwise it becomes yours (or its homophone you're is separated back into its parts you are). The only sort of rhyme I can imagine for it is something like "your chin" and "urchin". But I would definitely include /-ɜː(ɹ)z/ as a rhyme for yours. --WikiTiki89 12:09, 30 October 2014 (UTC)
So does your rhyme with year and were in many parts of America? (Strangely, though that rhyme exists in my local dialect, I can think of no part of England where yours is pronounced /jɜː(ɹ)z/. Maybe in Ireland?) Do words like insure, secure and mature also rhyme with refer and deter in those parts of America, and is demure a homophone of demur? Dbfirs 13:03, 30 October 2014 (UTC)
With were, yes (although, keep in mind that the pronunciation /jɔɹ/ is still used interchangeably with /jɝ/), but certainly not with year, which is pronounced /jiːɹ/. Words like insure, secure, and mature do rhyme with refer (unless mature is pronounced /matuːɹ/, which is rare even in dialects with otherwise regular yod-dropping), but in some dialects they rhyme with core instead; demure and demur still differ by the /j/ sound. --WikiTiki89 16:04, 30 October 2014 (UTC)
Does the /j/ mean that they do rhyme or not? If we include every dialect world-wide, we will end up with lots of rhymes that are nonsense to the majority of speakers of English. Dbfirs 16:34, 30 October 2014 (UTC)
Tough one. Show has {{rhymes|əʊ}} but not {{rhymes|oʊ}}. I can see why it's preferable to have only one rhyme, but how do you pick? How is this different from colour and color (that is, in terms of an 'alternative form' template, which neither has)? Renard Migrant (talk) 15:25, 30 October 2014 (UTC)
That's the way we standardized it, since RP /əʊ/ always corresponds to GA /oʊ/. --WikiTiki89 16:04, 30 October 2014 (UTC)
Yes, {{rhymes|oʊ}} doesn't exist because it would be identical to {{rhymes|əʊ}}. Personally, I'd prefer the former, but 1950s RP has the latter. Should one redirect to the other or should we just add a note to {{rhymes|əʊ}}? ... and it would be {{rhymes|oː}} in my dialect! Dbfirs 16:31, 30 October 2014 (UTC)
I would prefer "oʊ". It's more neutral because it's in the middle between the extremes (oː on one side and əʊ/əʉ on the other). —CodeCat 16:47, 30 October 2014 (UTC)
That makes sense to me, though, as a courtesy, I'd like to get the agreement of the creator of the rhymes section who put a lot of work into it. It would be the same rhymes page, just a different heading. Perhaps the heading could include both /əʊ/ and /oʊ/, then we wouldn't need to change all the entries. I wasn't seriously suggesting {{rhymes|oː}} because I don't think we need to include hundreds of regional variations. Dbfirs 18:36, 30 October 2014 (UTC)
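The standardization discussed above (one rhymes page per sound, with RP /əʊ/ and GA /oʊ/ treated as the same rhyme, and possibly a heading that shows both) could be sketched like this. It is a rough Python illustration only; the mapping and the names `canonical_rhyme` and `heading_for` are hypothetical and do not describe how {{rhymes}} is actually implemented.

```python
# Hypothetical mapping from a variant notation to the notation used as the
# page key. Only the /oʊ/ -> /əʊ/ pair comes from the discussion.
CANONICAL_RHYME = {
    "oʊ": "əʊ",
}


def canonical_rhyme(ending: str) -> str:
    """Return the key under which a rhyme ending is filed."""
    return CANONICAL_RHYME.get(ending, ending)


def heading_for(ending: str) -> str:
    """Build a heading that lists the canonical form plus its known variants."""
    key = canonical_rhyme(ending)
    variants = sorted(v for v, k in CANONICAL_RHYME.items() if k == key)
    return ", ".join(f"/{form}/" for form in [key] + variants)
```

With a heading built this way, switching the canonical form from /əʊ/ to /oʊ/ would mean changing the mapping rather than every entry.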

November 2014

Inappropriate capitalization of nouns

I've been working on clearing up some missing entries and I've noticed that many of the entries that are redlinked are, in fact, present, but under a capital letter. For instance admiraless is redlinked, but Admiraless is not. This is the case for a good number of nouns and probably ought to be corrected. The list of capital letter English nouns should be culled of all non-proper nouns. —Yellowhen (talk) 18:34, 1 November 2014 (UTC)

Look at the citations listed at Admiraless. It really is spelled with a capital letter. — Ungoliant (falai) 18:38, 1 November 2014 (UTC)
"Admiraless" seems to be a truly exceptional case. Equinox 18:45, 1 November 2014 (UTC)
The citations page does give one example of lower-case admiraless. Examples of lower-case usage are hard to come by since it's almost always as a title or part of a title. However, culling the list of capital letter English nouns of all non-proper nouns would be a bad idea since there are plenty of common nouns that are always capitalized in English, first and foremost demonyms like Englishman and Spaniard. —Aɴɢʀ (talk) 19:00, 1 November 2014 (UTC)
All those capitalized admiralesses are so because they are honorifics, or proper nouns (usually in an archaic writing style), or in one case in an article title. This is absolutely clear where several of them accompany the term admiral, likewise with initial cap.
There are now three l.c. citations (although one is a reference to the word). This entry should be moved. See Wiktionary:Requests for moves, mergers and splits#Admiraless. Michael Z. 2014-11-24 02:27 z

Any way to force WT:CFI to be applied?

I'm getting a bit annoyed that entries that don't meet WT:CFI keep passing a vote at WT:RFD. Is there any way to force WT:CFI to be applied? Renard Migrant (talk) 16:50, 3 November 2014 (UTC)

I think one thing we need to stress is that RFD/RFV discussions are supposed to be about whether the entry meets CFI and not whether we personally want to keep the entry. This is easier to apply when there needs to be a unanimous decision because people are compelled to convince others of their reasons, as I have experienced when serving in a criminal jury. However I'm not sure how it should work when there is a vote and thus less of an obligation to think critically, since I have not experienced a civil jury. (Pardon me if my jury analogy does not hold outside the US.) --WikiTiki89 17:08, 3 November 2014 (UTC)
Whether an entry meets WT:CFI is often a matter of subjective interpretation, not objective fact. While it's true that sometimes people say "keep in spite of the fact that it doesn't meet WT:CFI", much more frequently it's a matter of one side saying "this entry does meet WT:CFI" and the other side saying "no it doesn't". When I vote "keep" at RFD for a term some people consider to be SOP, it's because I disagree with them that it's SOP, not because I think that its SOPness should be ignored. —Aɴɢʀ (talk) 18:17, 3 November 2014 (UTC)
A valid concern. It could be addressed by creating more objective criteria of what makes a term idiomatic, like we did with WT:COALMINE. Keφr 19:33, 3 November 2014 (UTC)
  • Oppose: If people want to keep an entry, it should be kept. This proposal would essentially give deletionists a supervote, and it would damn near make CFI a criterion for speedy deletion. Purplebackpack89 18:52, 3 November 2014 (UTC)
    • It already is a de facto criterium for speedy deletion. —CodeCat 19:01, 3 November 2014 (UTC)
      • And it shouldn't be. CFI is too subjective for that. Speedy deletion should be for junk or vandalism entries. Most entries with RfD votes aren't junk or vandalism entries. Renard wants any entry that doesn't pass CFI to be automatically deleted. The problem is the only way CFI is determined is for somebody to say "this passes CFI" or "this fails CFI", with certain permutations such as "this passes SOP" or "this fails SOP". Therefore, if Renard got his way on this proposal, if any ONE editor said "this fails CFI", the article would have to be deleted. That's regardless of whether or not he's in the minority, or whether or not somebody gives a good reason as to why it doesn't. That seems patently ridiculous to me. Purplebackpack89 19:06, 3 November 2014 (UTC)
        • That's not what I want. The problem is that entries that nobody thinks meet CFI get kept because they win the vote. Right now the vote is everything and policy is nothing. It wouldn't be a 'supervote' for deletions any more than it would be for inclusionists. Interpretation of CFI would matter if we applied it at all. We don't. Like I said, 100% voting, 0% policy. Renard Migrant (talk) 12:30, 6 November 2014 (UTC)
    • You misspelled "support". Keφr 19:38, 3 November 2014 (UTC)
      • @Kephir:, I didn't though. I don't like the ramifications of the enacting of what Renard wants, so I opposed the proposal. Since this is a discussion about interpreting or changing policy, I am fully entitled to oppose it for any reasons I see fit. Purplebackpack89 21:14, 3 November 2014 (UTC)
  • Could someone list a few examples of what this would affect, perhaps even recent RFD listings? It sounds like we have two camps:
    1. Delete anything that doesn't meet CFI.
      • Frankly, that sounds reasonable to me, and seems to have been our MO for quite some time.
      • Angr brings up concerns about subjectivity and how CFI is applied. These strike me as reasonable concerns, and also as the underlying issue in many of the CFI disputes I have witnessed over the past few years. An effort to further clarify CFI could be warranted.
    2. Include anything just because someone wants it included.
      • I must admit that this sounds terrible. I understand that WT is intended to be descriptive and not prescriptive, but part of that descriptivism necessitates some evidence that a given term is actually in use in the language. Including an entry for fleemkaboddinal just because I happen to like the way it rolls off the tongue doesn't strike me as a sound basis for building a dictionary (pun not intended).
      @Purplebackpack89: could you clarify your statement? Are you arguing that any entry should be kept whenever any single editor wants to keep it? Or is your argument intended to be narrower in scope -- do you instead intend to state a position more specifically about entries deemed to be SOP, or some other more limited area? ‑‑ Eiríkr Útlendi │ Tala við mig 19:26, 3 November 2014 (UTC)
      No, @Eirikr:, I'm arguing we should keep any entry that at least 50% of RfD participants want kept. Also, the problem with your first camp is "who determines it"? Purplebackpack89 19:32, 3 November 2014 (UTC)
      • When you say "who determines it", what is the it? CFI itself? SOP-ness? Some other aspect? ‑‑ Eiríkr Útlendi │ Tala við mig 20:14, 3 November 2014 (UTC)
      The 50% of participants is a bad idea because not everyone will be around to participate in every RFD. 20:16, 3 November 2014 (UTC)
      When I said 50% of participants, I meant 50% of the people who participated in a given RfD... Purplebackpack89 21:14, 3 November 2014 (UTC)
      So did I! (Sorry, not logged in.) Having rules like CFI allows the consensus opinion (as established by policy votes) to be applied without every user having to repeat their opinions on every vote. Equinox 22:07, 3 November 2014 (UTC)
      So you'd be OK with an article being deleted after four users vote keep and Renard votes "Delete. Fails CFI"? Deleting something even if 70-80% of people in that particular discussion said keep? That seems to be what Renard wants. I think that would be a bad idea. Purplebackpack89 22:14, 3 November 2014 (UTC)
      I would be okay with deletion for things that fail CFI. Renard has nothing to do with that. Equinox 22:37, 3 November 2014 (UTC)
      Even in the circumstance I outlined? Purplebackpack89 22:46, 3 November 2014 (UTC)
      Renard does not have a supervote, so your outlined circumstance is not really relevant. Only the failing of CFI is relevant. Renard might point out that something fails CFI but he does not decide whether it fails. Equinox 19:43, 4 November 2014 (UTC)
      I honestly can't see how you can divorce Renard's premise from supervoting. Renard is upset that votes are being closed based on consensus. If you don't close on consensus, the closer is giving more weight to some opinions (perhaps his own) than to others. The people whose opinions get undue weight hold supervotes. BTW, where is Renard anyway? He started this thread and hasn't been heard from since yesterday. Purplebackpack89 20:04, 4 November 2014 (UTC)
      It shouldn't take much thought to note that there are many possible procedures to help enforce CFI. All votes could be required to present a reasoned argument or consent to someone else's reasoned argument and be invalidated if the argument were shown to be wrong. This would only require some more explicit rules for inclusion and exclusion. People who vote without reasons could be allowed only a fixed number of votes per month (proportional to their contributions?) before being disenfranchised or blocked or whatever. We could have votes to disenfranchise contributors for a time or indefinitely. The franchise to delete could be limited based on some explicit criteria, to those with a degree in linguistics, employment in language teaching or professional lexicography, veteran status. Or those could be deemed to disqualify. We could simply formalize the lemming criterion. I'm sure you could think of others. DCDuring TALK 21:15, 4 November 2014 (UTC)
      I don't really want to think of any others, because I honestly believe that the whole premise isn't really a problem, and certainly not one worth solving. And each of the counter-solutions you propose are solutions I cannot stomach. People should be entitled to participate in as many discussions as they see fit without any penalty whatsoever or the need to present bona fides. That's how a Wiki project works. Your counter-solutions are pretty clearly designed to prevent people from participating, which I find wrong and in violation of the "anyone can edit" ethos of Wiki projects. It's even worse because the proposal seems to be singling out people who Renard/you disagree with. Purplebackpack89 21:26, 4 November 2014 (UTC)
      • You note, “People should be entitled to participate in as many discussions as they see fit without any penalty whatsoever or the need to present bona fides. That's how a Wiki project works.” I must disagree -- that's how some Wiki projects work. In my entire time participating in Wiktionary, that is not how Wiktionary works.
      Anyone is welcome to edit. Anyone is welcome to participate in discussions. But when it comes to the outcome of discussions, bona fides of some sort are very much part of how the community consensus comes together. Bona fides could be something as simple as being a community member (i.e. editor) in good standing. In fact, that's probably the most important bona fide here.
      But suggesting that anyone and everyone can and should have equal weight in the outcome of any discussion is in error, and is decidedly *not* how Wiktionary operates. Moreover, I cannot support any move to make Wiktionary operate that way. ‑‑ Eiríkr Útlendi │ Tala við mig 21:53, 4 November 2014 (UTC)
      @Eirikr:, "anyone can edit" is a pillar of all Wiki projects. I dislike the term "good standing" because who the hell determines "good standing"? I'm sorry, but the things that have been written here smack of disenfranchisement of Wiktionary editors, including potentially myself and Dan for at least the next few weeks. And the problem is that the people who are pushing this proposal happen to fall on the deletionist side of things. If this was being pushed equally by keepist and deletionist editors, I wouldn't have any problem with it. If it was being pushed primarily by people who didn't participate in RfDs, I wouldn't have a problem with it. This is a proposal started by a deletionist who's upset that articles aren't being deleted, trying to make an end run around the consensus of RfD discussions to get more articles deleted. Purplebackpack89 22:02, 4 November 2014 (UTC)
      I, for one, would be happy with any reasonable explicit inclusionist criteria that reduced the total amount of blather on this page. DCDuring TALK 22:34, 4 November 2014 (UTC)
      @Eirikr:, "it" is whether or not it is CFI Purplebackpack89 22:48, 3 November 2014 (UTC)
  • I formerly thought that we just needed to make explicit more criteria for inclusion and exclusion by having them voted on so as to reduce the scope for debate. The apparent lack of any desire to adhere to any such criteria as well as the miserable experience of most votes makes me think that this is no solution. No one seems to feel the need to make principled arguments (whether or not based on CFI), let alone develop explicit criteria. Even something as simple as criteria for differentiating an adjective from a noun used attributively was never made a policy. Actually applying it to all the cases where it should be applied would probably generate a firestorm of opposition whining. DCDuring TALK 19:45, 3 November 2014 (UTC)
  • My point of view as a frequent participant in discussions, and as a frequent closer of discussions, is that RfD is a very heavily trafficked page, and every editor has ample opportunity to participate in every discussion on the page. Therefore, if there are twenty editors participating in discussions on the page, and only five of them weigh in on a given point, then fifteen don't have a strong or certain enough opinion to bother expressing it in the discussion. The obvious cases generally come out with overwhelming support for the obvious position. In other words, if four editors say "keep", and one editor says "delete" on the grounds that the word doesn't (in their view) meet the CFI, the fact that fifteen other editors participating in other CFI discussions above and below didn't think it important to agree with the proposed deletion speaks volumes about whether the presence of the word is seen as a serious breach of our standards. bd2412 T 01:37, 4 November 2014 (UTC)
    I completely disagree. People tend to ignore RFD discussions about words outside of their field of interest. If many editors ignore a particular RFD discussion, it may not be because they have no strong opinion on whether it should be kept, but because they have no interest in that word at all. --WikiTiki89 02:39, 4 November 2014 (UTC)
    • I agree with Wikitiki89. I generally only weigh in on RFD discussions pertaining to Japanese terms. I may put in my 2p on the stray English term, but as far as monitoring the RFD page as a whole, I often skim through for Japanese and move on if nothing presents itself. —This unsigned comment was added by Eirikr (talk • contribs) at 08:09, 4 November 2014 (UTC).
    • I am somewhere in between those two. For some terms I do not care, and for others I simply do not feel competent to judge. But I also have other reasons: one of the things that really discourage me from participating in RFfoos is the sheer volume of these pages: having to wait sometimes half a minute every time I post anything on those pages is just too frustrating. It would help if there were fewer discussions, shorter discussions (both in terms of duration and amount of text) and the discussions were promptly archived and removed. Right now we only have a solution for the last problem. Keφr 12:32, 4 November 2014 (UTC)
      One thing we can do to reduce page size is to split WT:RFD into two separate pages, one for English entries and one for foreign entries. --WikiTiki89 16:08, 4 November 2014 (UTC)
      Yes! And the same for RfV too, please. I've actually been holding off nominating a few entries 'cuz they're foreign and might make it hard for most people around here to find the relevant, i. e. English terms. -- Liliana 00:18, 7 November 2014 (UTC)
      Please don't create a separate page for RFD or RFV for non-English terms. Let us keep processes simple. The non-English RFV nominations are a fraction anyway. Let those who post nominations to RFD or RFV in larger volumes also help 1) close and 2) archive the nominations. --Dan Polansky (talk) 09:54, 30 November 2014 (UTC)
    • I formerly cared more. I often disagree with keep decisions, but have come to be resigned to the fact that many contributors find multiword terms easier to wrap their heads around than single-word terms. Adding silly entries is less destructive than definitions omitted because of insufficient breadth of participation from contributors with special domain knowledge, excessive reliance on MW 1913, poor organization of definitions for highly polysemous words, and incorporation of polysyllabic, rare, obsolete, and archaic words in definitions and glosses when better words are available. DCDuring TALK 13:33, 4 November 2014 (UTC)
      In my experience, the words that tend to draw the sharpest divisions of policy interpretation are common, everyday things, like fat as a pig, have an affair, and devalueing. I grant that when it comes to truly esoteric stuff like arfer dda, there are likely fewer editors who feel qualified to offer an opinion. bd2412 T 14:01, 4 November 2014 (UTC)
    • I'd say I participate about as much as Kephir does. I certainly don't weigh in on every RfD; I seldom if ever weigh in on foreign words and there are plenty of English words I take a pass on as well. I don't think the solution is fewer discussions; I think we have the right number of discussions and I would oppose the suggestion that discussions be replaced with speedy deletions. I say the problem is discussions (particularly those trending keep) are dragged on for months and months and months, even if consensus is clear. All RfD discussions should be closed within a month, and if there's no consensus, that means there's not enough support for deleting them. It's also flummoxed me that we don't break up RfDs by month the way we break up this page. Purplebackpack89 14:14, 4 November 2014 (UTC)
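The arithmetic behind the "at least 50% of RfD participants" rule and the "four keeps, one delete" scenario raised above can be made concrete with a small sketch. It is illustrative only: the function names are hypothetical, the 0.5 threshold is the figure proposed in the thread, and nothing here reflects an adopted closing procedure.

```python
def keep_share(keep_votes: int, delete_votes: int) -> float:
    """Fraction of participants in one discussion who voted keep."""
    total = keep_votes + delete_votes
    if total == 0:
        raise ValueError("no votes cast")
    return keep_votes / total


def outcome_by_majority(keep_votes: int, delete_votes: int,
                        threshold: float = 0.5) -> str:
    """Close purely on the tally: keep if the keep share reaches the threshold."""
    if keep_share(keep_votes, delete_votes) >= threshold:
        return "keep"
    return "delete"
```

A tally like this covers only the vote-counting position; the opposing position in the thread is that a close should also weigh whether the entry meets CFI, which no count can express.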

Add "via" parameters to Template quote-news

Can someone please add "via" parameters to Template:quote-news, as is used at en.wikipedia for w:Template:Cite news?

This way, we can specify what database archive may be used to verify the material, for example: NewsBank, LexisNexis, Westlaw, InfoTrac, etc.

Thank you,

-- Cirt (talk) 20:17, 3 November 2014 (UTC)

Aren't some of those sources behind paywalls?
You can include the url now. DCDuring TALK 22:30, 4 November 2014 (UTC)

Converting RfD to monthly subpages

Previous discussions:

User:BD2412 has expressed a wish for splitting WT:RFD into monthly subpages. It seems like a good idea, but probably requires change to {{rfd}}, {{rfd-sense}}, and {{rfd-redundant}} [others?]. I don't think it will need a vote, but it certainly needs an opportunity for discussion to make sure. DCDuring TALK 18:09, 4 November 2014 (UTC)

I wouldn't object. When we've done this to other pages, it seems that any newly created month page gets automatically added to my watchlist, if I am already watching the parent page. I would want that to happen here too. Equinox 19:45, 4 November 2014 (UTC)
No such thing. Keφr 19:55, 4 November 2014 (UTC)
While I might dislike it, the current single-page set-up has one advantage — it makes sure no discussion slips through unresolved (even though it sometimes takes ridiculously long to close some of them). Which is what often happens to Tea Room discussions now. Does anyone remember why succumb was tagged with {{rft}} and whether the issue was resolved? I do not. Sure, I can check backlinks in the appropriate namespace and find out, but it is quite tedious. Keφr 19:53, 4 November 2014 (UTC)
@Kephir: How does the use of subpages lead to items falling between the cracks? Is that a big contributor beyond the other contributors to requests being neglected?
Requests of all kinds fall between the cracks for several reasons. We have items that are tagged, but not added to the appropriate page for rfc, rfd, rft, and rfv. The absence of any time limits or dramatic consequences for rft and rfc means that such items are not closed, let alone archived (at least for rft) to the appropriate page. Tags are not removed from many of the above requests and also rfi, rfp, and rfe. DCDuring TALK 20:59, 4 November 2014 (UTC)
Because the main pages of discussion rooms using the monthly subpages system display only discussions from last three months, and there is no good way to view all unresolved discussions in order to assess and properly close them. RFI, RFP and RFE are irrelevant — no debate is usually started for those, because none is needed; those requests are considered resolved simply when someone fulfils them. Keφr 21:36, 4 November 2014 (UTC)
Can the discussion room main page structure be changed to include more time periods/subpages? How about three months per page? And more for the oldest? What about templates to categorize items as "open", "closed", "look"? It is possible to have tables that present the newest or oldest X members of such categories. DCDuring TALK 22:20, 4 November 2014 (UTC)
  • Oppose, would just lead to nominations getting forgotten. DTLHS (talk) 19:39, 5 November 2014 (UTC)
  • Oppose. The risk of rubbish being kept because a discussion disappears from the page before it is closed is not worth it IMO. — Ungoliant (falai) 01:07, 6 November 2014 (UTC)
  • Support, of course. We can always have a single transcluded page where editors can go if they feel like having a long wait while the page loads, and keep the discussions on shorter pages for those who prefer faster loading. Better yet, we could go to the system used by Wikipedia and Wikiquote, where each discussion is a subpage transcluded into the page for the month. bd2412 T 14:13, 7 November 2014 (UTC)
  • We've been doing this on the French Wiktionary for years, but it's different because those pages don't get archived to talk pages, and ours do. Probably oppose for that reason. Renard Migrant (talk) 16:25, 11 November 2014 (UTC)
  • I oppose splitting RFD to monthly pages. RFD can get shorter if editors who make most nominations also help close old nominations. Closing old nominations includes providing a boldfaced disposition, deleting the nominated page if appropriate, and striking out the heading. Splitting the page to months will not make it any less stale, and will remove the long-page-displeasure incentive to start closing old discussions or start posting fewer new ones. --Dan Polansky (talk) 23:07, 28 November 2014 (UTC)

SPAM or Spam?

The entry on Wikipedia is titled "Spam (food)," not "SPAM (food)," so which one should be the main (as opposed to the alternative form of) listing? Right now, it is "SPAM." WikiWinters (talk) 20:48, 4 November 2014 (UTC)

Hormel Foods calls it The SPAM® family of products, SPAM® brand, SPAM Classic, Great American SPAM® Championship, SPAM® Musubi, SPAM® Tocino, The SPAM Museum, and #SPAMCAN. Apparently the all-caps style is part of the logo. —Stephen (Talk) 23:04, 5 November 2014 (UTC)
It looks like the company is trying to protect their brand name from w:Trademark erosion by using a spelling that's less likely to show up in non-brand usage. If you think about it, we shouldn't be interested in how the company decides the brand name should be spelled, but in how the term is spelled when used for non-brand senses. I think the main entry should be at spam (as it is now), and I have my doubts as to whether we should even have an entry for SPAM. Chuck Entz (talk) 02:32, 6 November 2014 (UTC)
  • As editors of a descriptive dictionary, should we not include an entry for SPAM if the term is indeed used with that capitalization?  :) ‑‑ Eiríkr Útlendi │ Tala við mig 06:26, 6 November 2014 (UTC)
@Chuck Entz: The main entry currently is SPAM, not spam. Do you suggest changing this? WikiWinters (talk) 23:56, 6 November 2014 (UTC)
For future reference WT:TR is the discussion room for individual entries where there are no policy issues. Google Ngram Viewer gives a slight edge to spam even before the Internet meaning appears so I'd go with that. Renard Migrant (talk) 16:34, 7 November 2014 (UTC)
Done. (Got it. Also, I corrected the entries.) WikiWinters (talk) 20:02, 7 November 2014 (UTC)

Eliminating Template:trans-mid, etc.

I recently created a pair of very simple templates {{col-top}} and {{col-bottom}} that create auto-balancing columns of text (for an example, see WT:Wanted entries). If we integrate these templates into pairs like {{trans-top}}/{{trans-bottom}}, {{rel-top}}/{{rel-bottom}}, etc. we will no longer need to manually balance their columns with {{trans-mid}}, {{rel-mid}}, etc. Assuming we test this for browser compatibility, is this something we would want to do? --WikiTiki89 17:11, 5 November 2014 (UTC)

  • It seems like a worthwhile goal. Can you tell anything about resource consumption before testing? Assuming it is not a resource hog and passes on all major browsers, it would seem that it could be initially deployed by having the existing templates call it. Is that right? DCDuring TALK 18:15, 5 November 2014 (UTC)
    It's CSS-based, so it's all on the client side, so no effect on server load. Even on the client side, I would think all the browser needs to do is a simple division of the number of lines by the number of columns, which should be completely insignificant. The only potential issue (as with all CSS features) is browser compliance. As far as deploying it, yes, we just need to have the existing templates call it and have the mid-templates do nothing. --WikiTiki89 19:22, 5 November 2014 (UTC)
    Which browsers will have trouble with it? DTLHS (talk) 19:27, 5 November 2014 (UTC)
    I'm not expecting that any will, but we still have to test it. Maybe some outdated browsers or versions of browsers will not support it. Just to be clear, it works perfectly in the latest Chrome, Firefox, and IE. --WikiTiki89 19:29, 5 November 2014 (UTC)
    Are there sites for testing using older browser versions? Does MW have copies or a testing suite or insight? DCDuring TALK 20:16, 5 November 2014 (UTC)
    CSS columns are not supported by Internet Explorer 9 and lower and Opera 11 and lower. --Yair rand (talk) 22:17, 5 November 2014 (UTC)
    Sigh. Are higher versions part of automatic updates for IE 9 and Opera 11? What share of users have the old versions? I suppose it's too much to expect that it fails gracefully. DCDuring TALK 23:07, 5 November 2014 (UTC)
    Re browser share: The most recent Wikimedia Traffic Analysis Report shows the following usage share: IE9 - 2.25%, IE8 - 2.20%, IE7 - 0.89%, IE6 - 1.53%, IE5.5 - 0.22%, Opera<11 - 0.3%. Browsers that don't support CSS columns will display the content all in one column. --Yair rand (talk) 00:56, 6 November 2014 (UTC)
    Thanks. That doesn't seem fatal. Also, isn't it possible to change template/CSS/etc. behavior based on the browser? That would at least dramatically diminish the importance of balancing the tables, as 90% of users would see the table as balanced even if the various "mid" templates were misplaced. Isn't such balancing done by a bot? (Autoformat did it.) DCDuring TALK 01:07, 6 November 2014 (UTC)
  • (After e/c) This sounds brilliant. It always puzzled me that we had no auto-balancing, given how simple the math is. ‑‑ Eiríkr Útlendi │ Tala við mig 20:17, 5 November 2014 (UTC)
  • I would recommend to use a column width as parameter (with a set default, like 20em) instead of a number of columns, so that the number of columns would adapt to the screen width. Dakdada (talk) 10:12, 6 November 2014 (UTC)
    @Darkdadaah: Would that work with more browsers? DCDuring TALK 13:52, 6 November 2014 (UTC)
    No more, no less: same support (see here and here). Dakdada (talk) 15:00, 6 November 2014 (UTC)
    But that would be a drastic layout change for our translation tables. I'm not against it, but we would probably need to vote on it. --WikiTiki89 15:50, 6 November 2014 (UTC)
  • One thing that occurs to me. In some cases, JA editors (and probably others) have been using the {{mid}} family of templates in semantic ways -- in my case, specifically by splitting up derived term tables to have derived terms starting with the headword on one side, and derived terms ending with the headword on the other side. (This is common and useful for Japanese entries.)
Is this proposal intended to entirely scrap the {{mid}} family of templates? Or is this proposal more limited in scope, and targets only some of the {{mid}} templates? ‑‑ Eiríkr Útlendi │ Tala við mig 18:28, 6 November 2014 (UTC)
For that kind of situation I would suggest using something other than {{mid}} to delineate the split. That way it's clear that the split is not just there for balancing purposes. —CodeCat 18:45, 6 November 2014 (UTC)
What would you suggest? A sample usage is here on the 刀 entry. I added column headers here to try to clarify the table organization. In either layout, though, I have no idea what to use to split the columns other than the various {{mid}} templates. ‑‑ Eiríkr Útlendi │ Tala við mig 19:02, 6 November 2014 (UTC)
I don't know. We probably don't have templates specifically for this kind of thing, but it would be a good idea. I am a proponent of using templates in a way that signifies intent/meaning, rather than just using whatever template "looks right". —CodeCat 19:08, 6 November 2014 (UTC)
One thing to think about is whether that is actually better than just having two separate tables as in this edit. --WikiTiki89 20:06, 6 November 2014 (UTC)
  • Having just the one collapsible div seems like less clutter and better usability. Perhaps some other template tweaking would do the trick? We could create something like {{der-col-top}} etc, or even just {{der-head|header text}}, which would fit between {{der-top}} and {{der-bottom}}. ‑‑ Eiríkr Útlendi │ Tala við mig 20:12, 6 November 2014 (UTC)
    But having two separate collapsible tables makes it easier for readers to expand only what they want to see. It's not really a lot of clutter to have two tables. --WikiTiki89 20:27, 6 November 2014 (UTC)
So what's the verdict? Can we do this? --WikiTiki89 20:53, 14 November 2014 (UTC)
I'd think it needed a vote, because it is a bit rough on those with older browsers. DCDuring TALK 04:02, 15 November 2014 (UTC)
In what way is it rough? Manual column breaks are rough on all browsers, for all readers with very narrow or wide viewports (which is a large proportion these days). Michael Z. 2014-11-28 15:50 z
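
For illustration, the arithmetic behind the two layout options discussed in this thread (a fixed number of columns versus a fixed column width) can be sketched in Python. This is only a rough model and not how any browser or the {{col-top}} template is implemented: the function names are hypothetical, the 20em default is simply the figure suggested above, and real CSS layout also accounts for column gaps and varying line heights.

```python
def balance_columns(items, n_columns):
    """Split items into n_columns runs whose lengths differ by at most one."""
    base, extra = divmod(len(items), n_columns)
    columns, start = [], 0
    for i in range(n_columns):
        # The first `extra` columns each take one additional item.
        size = base + (1 if i < extra else 0)
        columns.append(items[start:start + size])
        start += size
    return columns


def columns_for_width(viewport_em, column_width_em=20):
    """Number of columns that fit when each needs column_width_em (gaps ignored)."""
    return max(1, int(viewport_em // column_width_em))
```

With a width-based rule the number of columns follows from the viewport, so a narrow screen falls back to a single column while a wide one gets several, and no manual mid-point marker is needed in either case.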

Multiple etymologies=mess?

Is the use of the whole etymological chain of a word necessary? For example, see the entry for the French word "démocratie", which derives from the Latin "democratia". The origin of the latter is Greek, but should this be presented in the etymology of the French word or only for the Latin one? And why is this exhaustive etymological analysis through the Proto-Indo-European roots presented, which applies only to the Greek word? In the categorization, the French word is presented as deriving from all these languages, Latin, Greek and Proto-Indo-European, while it's only a loanword from Latin. Actually, the Latin comes from the Greek word, and the Greek comes from the PIE. In this way (which isn't used for most of the words in Wiktionary), all loanwords come from a very first proto-something language, but the point is to present the language from which a word derived, e.g. the French word derived from Latin and that's all. Anybody who wants to see the origin of the Latin word should go to its entry and so on.--Ymaea (talk) 16:48, 6 November 2014 (UTC)

Does it really make sense to only go back one step? If you put borrowed from Latin, then you click on the Latin it says borrowed from Ancient Greek. You click on the Ancient Greek it says from PIE. That's a lot of clicking. If you get a chain of seven languages in an etymology you're going to have to click 7 times to get all the etyma. Renard Migrant (talk) 17:11, 6 November 2014 (UTC)
Another issue is that sometimes the intermediate etyma don’t exist. For example, there are 1809 words listed at Category:Portuguese terms derived from Old Portuguese but we only have 460 Old Portuguese entries. And sometimes the etymon immediately preceding the word is not the most important; people who want to know the origin of French words are more likely to want to know the Latin or even Old French etymon than the Middle French one. — Ungoliant (falai) 17:21, 6 November 2014 (UTC)
Without a proper etymology backend this is all just pissing in the wind. DTLHS (talk) 18:17, 6 November 2014 (UTC)

The problem here is mainly the automatic categorization according to the etymologies used. The French word "démocratie" is listed in three categories: a) "French terms derived from PIE", b) "French terms derived from Ancient Greek" c)"French terms derived from Medieval Latin". My objections:

  1. What is this word? It cannot be PIE, Greek and Latin simultaneously.
  2. Especially the first category (PIE) is totally weird, as it indicates a straight connection between the French and the PIE words. But originally only the Greek word was formed from PIE.
  3. The mess becomes more chaotic when we want to describe the origin of the French "démocratie". We should say that it has a Greek origin, it's a Greek influence, which was passed in French through Latin. This "through" doesn't indicate the etymology, but the route of the word. So, categories which indicate that this word derives from PIE and from Greek and from Latin, are obviously wrong.

To sum up, imagine a category "French terms derived from Ancient Greek through Medieval Latin". It's much more accurate and totally different from this coexistence of the three categories above.--Ymaea (talk) 18:53, 6 November 2014 (UTC)

Being in three categories does not imply that the etymon existed in three different languages. French démocratie is derived from both Medieval Latin and Ancient Greek, though I would say it is not derived from PIE since the compound was coined in Ancient Greek and didn't exist yet in PIE. Even the Greek word is not derived from PIE; it was coined within Greek from two words that were themselves independently derived from PIE. Categories like "French terms derived from Ancient Greek through Medieval Latin" sound like a good idea in principle, but in practice I think they would quickly become unmanageable. —Aɴɢʀ (talk) 21:31, 6 November 2014 (UTC)
"It cannot be PIE, Greek and Latin simultaneously." No and we're not claiming it is. It's a bit like saying a word can't be a verb and a noun simultaneously. Not simultaneously no, but separately, yes! Renard Migrant (talk) 22:55, 6 November 2014 (UTC)
"So, categories which indicate that this word derives from PIE and from Greek and from Latin, are obviously wrong."
"So, genealogies which indicate that I am descended from my father, my great-grandfather, and my grandfather, are obviously wrong."
--Catsidhe (verba, facta) 22:59, 6 November 2014 (UTC)
  • I'm not really seeing the need for a determination on this. a) I'm generally OK with long etymologies, and b) how long an etymology should be should be dictated by common sense. Purplebackpack89 22:17, 6 November 2014 (UTC)

Regarding this genealogy case, yes, you are descended from these three persons, but it would be weird to put you in the category of each one without giving these vertical kinship ties. So, when the Latin and the French word are both categorized as deriving from Greek, one could assume that we talk about two separate formations with a common ancestor, the Greek one. When a dictionary says "100 French words derive from Latin and 100 more from Greek", is this word double-counted? I don't think so. But in Wiktionary, yes, it's double-counted; you can see it in both categories. My point is very clear when we compare the present situation with a category like the one I proposed, "French terms derived from Ancient Greek through Medieval Latin". On the other hand, even this solution would sound weird in other cases, and I have some in mind. Indeed it would be very complicated and possibly we couldn't handle a situation like this. But I just wanted to point out that there is a strong lack of clarity with the categories in the way they are constructed. Thank you all!--Ymaea (talk) 01:47, 7 November 2014 (UTC)

"... derived from ..." ≠ "... directly derived from...". I certainly belong in a category of "people descended from {my grandfather}", and in "people descended from {my great-grandfather}", but not in "children of {my grandfather}". Similarly, démocratie is derived from δημοκρατία, but is not directly derived from it. --Catsidhe (verba, facta) 02:08, 7 November 2014 (UTC)
Category:French terms derived from Ancient Greek through Medieval Latin would fail the utility test, i.e. not useful to anyone. It has no advantages whatsoever; it's not more useful and it's not more accurate. And there isn't a lack of clarity, quite the opposite. We include all relevant truthful etymological categories. So if something's derived from Latin, we include that. If something's derived from Ancient Greek, we include that. We don't pick one or the other. With something like mock you end up with Category:English terms derived from Proto-Indo-European via Proto-Germanic, Old Saxon, Middle Low German, Middle Dutch, Middle French and Middle English. If that's your idea of clarity, I'll take obscurity thanks! Renard Migrant (talk) 16:28, 7 November 2014 (UTC)
Yes, with Wanderwörter the chains can get quite complicated. It wouldn't be difficult to list e.g. a dozen words in Inari Sami whose etymology is roughly "from Finnish < from Swedish < from Low German < from French < from Latin < from Greek < from Persian". To keep this in hand, the useful stages to indicate would seem to be
  1. Direct loan origin.
  2. Ultimate loan origin.
Both indicate an action: the loaning by French (or by Inari Sami, etc), and the word's derivation in Greek (or Persian, etc). Anything else is not that necessary.
Note that by "derivation" I do not only mean the morphological composition, though. Sometimes a specific semantic or phonological mutation may have occurred in a specific language (say, between Greek and Persian), and this is also relevant info for the etymology of a word.
On the other hand, indicating the reconstructed proto-roots from which a word was derived in some other language entirely is largely superfluous IMO; while for inherited words, though, these clearly ought to be mentioned. Pretty much everything in English comes in some way from PIE (sometimes thru quite a few detours) — the purpose of a category like "English terms derived from PIE" would be mainly for indicating what exactly has been inherited from that far back.
I suppose this gets more difficult with French vs. Latin, or Hindi vs. Sanskrit, where one might want to distinguish inherited vs. learned vocab. But maybe things like "French terms derived from Medieval Latin" versus "French terms derived from Vulgar Latin" suffice for the job?
Worth mentioning here as well BTW: we currently have a somewhat inconsistent system where e.g. some Germanic words' etymology is discussed right under the modern words (as is proper), while others' is discussed under the corresponding Old Norse or Old English or Old High German words. --Tropylium (talk) 01:08, 9 November 2014 (UTC)
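
To make the distinction drawn above between "derived from" and "directly derived from" concrete, here is a small Python sketch. It is purely illustrative and does not describe how Wiktionary's etymology templates actually generate categories; the function names are hypothetical, and the chain for démocratie uses only the two source languages that the discussion above accepts.

```python
def derivation_categories(language, chain):
    """One category per source language in the chain (nearest source first)."""
    return [f"{language} terms derived from {source}" for source in chain]


def direct_source(chain):
    """The language the word was immediately taken from, if any."""
    return chain[0] if chain else None


# French démocratie: borrowed from Medieval Latin, which took it from Ancient Greek.
chain = ["Medieval Latin", "Ancient Greek"]
categories = derivation_categories("French", chain)
immediate = direct_source(chain)
```

Under this model the word sits in every "derived from" category along its chain, while only the first element answers the question of where it was borrowed from directly.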


User:Ready Steady Yeti found out the hard way that we don't include this in our data modules as a language. We do have Guernésiais and Jèrriais, which are lects spoken nearby, and likewise often considered to be dialects of Norman. According to w:Sercquiais, the island was settled in the 16th century by speakers of Jèrriais, and has archaic features that have been lost in that language, along with considerable Guernésiais influence. Should we create a language code for this, or treat it as a dialect of Jèrriais/Norman? Chuck Entz (talk) 00:12, 9 November 2014 (UTC)

Why do we have separate codes for Guernésiais and Jèrriais to begin with? I'd say all three should be dialects of Norman. —Aɴɢʀ (talk) 13:51, 9 November 2014 (UTC)
I, too, am not convinced Jèrriais and Guernésiais are languages distinct from Norman. — Ungoliant (falai) 15:42, 9 November 2014 (UTC)

Indication of different pronunciations of English words of shared etymologies

Many English multi-syllable words that are used in different functions, especially as a noun and a verb, have the stress on a different syllable according to which part of speech they occur as. Examples include increase, reject, excerpt, defect. The meanings and etymologies of such a word are usually related and the pronunciations are all grouped under the Pronunciation header. Many pages use various templates, some of which are clearly wrong, to indicate the part of speech a pronunciation pertains to. WT:ELE doesn't prescribe any template, instead hinting that parts of speech should be separated under multiple Pronunciation headers. Sound files often lack part of speech information. They may be indented under the relevant IPA description, but that requires a knowledgeable linguist, and to my knowledge isn't suggested on any policy page. The template {{qualifier}} seems to be the closest to what is intended. Is there a preferred way? Kumiponi (talk) 18:55, 11 November 2014 (UTC)

I'd use {{sense}} myself, I don't know about other editors. —Aɴɢʀ (talk) 19:12, 11 November 2014 (UTC)
That template's documentation doesn't allow such usage. Kumiponi (talk) 19:29, 11 November 2014 (UTC)
We've discussed this before (does anyone remember when/where?). Some people say that if a word has different pronunciations, then the two pronunciations actually have different etymologies. --WikiTiki89 22:07, 11 November 2014 (UTC)
At the very least, they're distinct words, so maybe something like this? Etymology at the top, in level 3, followed by Pronunciation 1 at level 4, then the words with that pronunciation, then Pronunciation 2 at level 4, etc. Of course this should only be done if we're really sure the words have the exact same etymology and came into existence at the same time from the same origin. —CodeCat 23:12, 11 November 2014 (UTC)
But how do you explain the different pronunciations? It should be part of the etymology. --WikiTiki89 23:20, 11 November 2014 (UTC)
You could say that the etymologies are different, but that sets up an inconsistency with the way we divide terms without pronunciation differences: we currently show the verb perfect as derived from the adjective perfect, but we don't show the verb mouse as derived from the noun, nor do we show the computer mouse as derived from the rodent mouse. Chuck Entz (talk) 00:03, 12 November 2014 (UTC)

Module:template utilities

Could someone please show me where the renaming of Module:template utilities to Module:ugly hacks was discussed at WT:RFM and where the deletion of Module:template utilities was discussed at WT:RFDO? As far as I can tell, the first was only mentioned in passing while discussing the fate of other templates, and the second wasn't discussed at all.

I can understand why User:Kephir might want to discourage people from using the module, but I can't understand why he didn't discuss it in the appropriate places first. At the very least it would have given people a chance to point out any potential problems, and to get people thinking about alternatives (this reminds me of a line in w:Dr. Strangelove about keeping deterrents secret, but I can't remember it offhand).

I'm sure he was very diligent in orphaning template utilities from everything that links to it. He wasn't so diligent, however, in checking Category:Templates that must be substituted. This is analogous to demolishing a bridge without closing the roads on either end. I've updated my own templates to use the other module, but there may be others. Chuck Entz (talk) 01:09, 12 November 2014 (UTC)

Yup. DCDuring TALK 02:33, 12 November 2014 (UTC)
How incredibly self-serving (or what a sign of youth?) it is to be able to say "I won't bother writing documentation for this module because you shouldn't use it: you should use Lua instead (and to hell with you if you don't and I won't answer your questions on my talk page if I don't feel like it)." DCDuring TALK 02:41, 12 November 2014 (UTC)

AWB rights or task request

I'm looking to be granted rights to use AWB on this project. I noticed that Category:English simple past forms was supposed to be empty and I was going to correct the pages to point to Category:English verb simple past forms as prescribed. If granting AWB rights is not advisable because I'm an infrequent editor on this project, I'd ask that someone else please perform this task. Thanks, Ost316 (talk) 18:26, 13 November 2014 (UTC)

On it. bd2412 T 19:51, 13 November 2014 (UTC)
Done. Cheers! bd2412 T 20:01, 13 November 2014 (UTC)
Not that I object but, was this perhaps done unilaterally by CodeCat with no prior discussion? I don't think verb is really needed as other parts of speech don't have simple past forms. We don't have Category:English verb past participles for the same reason, 'verb' is implied by 'past participle'. Renard Migrant (talk) 13:30, 15 November 2014 (UTC)
That may be so, but we had two categories with duplicated intent, one containing about 75 entries and the other containing about 20,000. I have no problem recategorizing the 20,000, but it is definitely an easier task to recat the 75 with AWB. If we go the other way, it will be a bot task. bd2412 T 17:43, 15 November 2014 (UTC)
  • Let us fill Category:English simple past forms, and discontinue Category:English verb simple past forms. --Dan Polansky (talk) 18:37, 5 December 2014 (UTC)
    • If that is what consensus favors, I am up to the task (or a bot could do it in a day). bd2412 T 15:38, 6 December 2014 (UTC)
      • I don't agree with it. —CodeCat 17:51, 6 December 2014 (UTC)
        • @CodeCat: You could take the trouble to give reasons, instead of forcing us to beg you for them. Your opinion is one that I would recommend be ignored if you fail to participate usefully in discussions. DCDuring TALK 19:32, 6 December 2014 (UTC)
        • Well ok, you can ignore me and Dan. I'm just giving a counterweight to Dan because BD2412 suggested that might be the consensus. —CodeCat 19:33, 6 December 2014 (UTC)
          • I did not intend to suggest that there was a consensus for anything. I am volunteering to carry out the task if there is a consensus. bd2412 T 23:24, 6 December 2014 (UTC)
          • My main argument is Renard's: "I don't think verb is really needed as other parts of speech don't have simple past forms." My secondary argument is that status quo ante should prevail unless a performer of mass undiscussed changes can explain their reasoning and demonstrate consensus for their change. --Dan Polansky (talk) 12:23, 7 December 2014 (UTC)
  • I agree with Renard about the apparent redundancy of "verb" in the category name. Is there any language where a word class other than 'verb' has a past tense? DCDuring TALK 00:45, 7 December 2014 (UTC)
There are many languages where adjectives act like verbs, but one could quibble whether they're verbs or adjectives in such cases. There are similar issues with participles. There are also a number of agglutinative languages which have verb affixes on nouns. Given that the latter group are mostly not Indo-European, and tenses are rare outside of the Indo-European languages, I'm not sure whether there are any that have tense as opposed to aspect.
At any rate, I suspect that the main reason for "verb past tense forms" would be to maintain a uniform naming scheme between parts of speech and between languages. Given that many assumptions we make about what kinds of things are limited to which part of speech are wrong somewhere in the world (Modern Hebrew verbs, for instance, can have gender as well as person and number), it would make things easier in general to be explicit about such assumptions. I'm still trying to figure out where I come down on this particular case, though. Chuck Entz (talk) 01:32, 7 December 2014 (UTC)
WT:RFM I guess. Renard Migrant (talk) 13:20, 7 December 2014 (UTC)


I came across a recent edit which replaced a use of the non-existent Template:Bibleref with a direct wikilink to the relevant Biblical book, which is certainly of no use. There used to be on WP a template Bibleverse, which used "mediatools"; when that went belly up, it failed for a while, and now it has been replaced with a same-named template that works again. It is absolutely magnificent: the template creates an external link to any of a dozen off-site Bibles.

I assume such a template existed, and was deleted when mediatools disappeared. Either way, a Template:Bibleref that works like the WP template would be quite useful. That's certainly much better than "fixing" the Templates to something hard-coded. Choor monster (talk) 15:36, 14 November 2014 (UTC)

What does the WP template do? Does it link to one or multiple translations? Does it 'find' the citation? DCDuring TALK 16:15, 14 November 2014 (UTC)
On a partially related issue, I find quotes that just say 'Bible' and not which edition a bit irritating. Modern English Bibles span at least 500 years, so which edition it is really does matter. Renard Migrant (talk) 13:24, 15 November 2014 (UTC)
  • Good grief, it worked on Friday—linking to the asked-for particular Bible translation—and now it doesn't work at all, trying to link to tools.wmflabs.org and coming up empty. Well, it half-way works, creating a properly formatted Biblical reference. As an example, in Shuah, there are numerous instances of the template, with the parameter HE added, that originally and last Friday when I checked, provided a link to this stable line-by-line English/Hebrew Bible. This was particularly useful for this article, since there are four distinct Hebrew spellings that KJV transliterated into "Shuah". In contrast, the template did not provide links for any on-line LXX, so the relevant links had to be hand-coded, which is, of course, a nuisance. And in theory, this is less robust, but in practice, it has turned out to be more robust. Choor monster (talk) 16:39, 17 November 2014 (UTC)

Proposal to allow breves for Latin words in certain exceptional cases

Some Latin words have one or more vowels with variable length (for example, agrimensor/agrīmensor, Galilaea/Galīlaea, Lūcipor/Lūcipōr, Moȳsēs/Mōȳsēs, patruus/pātruus, Pharisaeus/Pharīsaeus, -por/-pōr, Publipor/Pūblipor/Pūblīpōr, redux/rēdux, succisīvus/succīsīvus, etc.). Allowing only macra, and no breves, accurate presentation requires something like this. To me, that seems like a crazy amount of duplication to account for variation in the length of one vowel; far better, in my opinion, would be presentation like this. Right now, however, such presentation is problematic, because the links generated point to page titles with ĭ in them, but this is easily fixed by automatically stripping macra–breves from Latin links in the same way that standalone macra are currently stripped from Latin links; to achieve this, the following change would need to be made to Module:languages/data2:

This sort of double-diacriticking is standard practice, and can be seen in Lewis & Short's Latin Dictionary and Félix Gaffiot’s Dictionnaire Illustré Latin-Français. L&S and Gaffiot both use standalone breves to mark short vowels anyway, but the Oxford Latin Dictionary, which only uses macra and never breves to mark fixed-length vowels, also uses macron–breve double-diacriticking to mark vowels with variable length, as in the case of “sibī̆¹, sibe” (p. 1,753/1 in the 1st ed.), its headword for sibi, the dative of the reflexive pronoun .

Does allowing the use of breves on Latin words in these exceptional cases seem like a good idea to everyone? — I.S.M.E.T.A. 18:15, 14 November 2014 (UTC)

Regardless of whether we should use them, the module should definitely strip them. I will make the change (and your particular suggested edit will not work). --WikiTiki89 18:21, 14 November 2014 (UTC)
@Wikitiki89: Thanks for that. And I'm sorry that I was wrong with my suggested edit. Could you explain what the u(0x0304), u(0x0306), u(0x0308) in the text you added does, please? — I.S.M.E.T.A. 21:05, 14 November 2014 (UTC)
@I'm so meta even this acronym: First of all, the reason your suggestion would not have worked is that the combining breve is treated as a separate character, thus "Pharī̆saeus" would have become "Phariasaeus" rather than the intended "Pharisaeus". u(0x0304), u(0x0306), and u(0x0308) are respectively the combining macron, the combining breve, and the combining diaeresis, which are replaced by nothing. The u(...) function converts a number representing a Unicode codepoint into the character itself (if you look at the top of the module, you will notice that u is just a shortcut for mw.ustring.char), the 0x indicates that the following number is in hexadecimal notation, and the number is the Unicode codepoint of each respective character. --WikiTiki89 21:27, 14 November 2014 (UTC)
@Wikitiki89: Hugely illuminating. Thank you very much. :-)  — I.S.M.E.T.A. 21:55, 14 November 2014 (UTC)
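
As a rough illustration of the stripping described above, here is a Python sketch rather than the actual Lua in Module:languages/data2. It assumes the text is first decomposed (NFD) so that precomposed letters such as ī split into a base letter plus a combining mark; the real module may handle this differently. The function name is hypothetical, and chr(0x0304) plays the role that u(0x0304) plays in the module.

```python
import unicodedata

# Combining macron, combining breve, combining diaeresis.
STRIPPED_MARKS = {chr(0x0304), chr(0x0306), chr(0x0308)}


def strip_latin_diacritics(text):
    """Return text with macrons, breves and diaereses removed."""
    # Decompose so that e.g. U+012B (i with macron) becomes "i" + U+0304.
    decomposed = unicodedata.normalize("NFD", text)
    kept = "".join(ch for ch in decomposed if ch not in STRIPPED_MARKS)
    # Recompose whatever other diacritics remain.
    return unicodedata.normalize("NFC", kept)


# i with macron followed by a combining breve, as in the headword form.
page_title = strip_latin_diacritics("Phar\u012b\u0306saeus")
```

Because both marks are removed as separate characters after decomposition, a vowel carrying a macron and a breve together reduces to the plain letter, which is what the link target needs.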

Template usage policy (generic vs. FL)

Do we have a written template usage policy? I thought FL-specific templates were allowed whenever there was a need, and the use of generic templates was encouraged when it was possible, especially when no special FL functionality was needed. But the opposite is happening in two cases and it's confusing.

  1. {{hu-proper noun}} is linked to {{head|hu|proper noun}}. It does not have any extra functionality. I was planning to manually move every entry that is using {{hu-proper noun}} to {{head|hu|proper noun}}, and eventually delete the template. So I try to change {{hu-proper noun}} to {{head|hu|proper noun}} whenever I see it, but there are other non-Hungarian editors who do the exact opposite: they change {{head|hu|proper noun}} back to {{hu-proper noun}}.
  2. {{hu-suffix}} was developed to provide functionality specific to Hungarian entries. The template was changed to point to {{suffix}} which does not provide the same functionality. Requests for adding back the unique original functionality are treated with resistance.

What is the policy that should be followed? --Panda10 (talk) 17:42, 15 November 2014 (UTC)

Now that many templates simply call on {{head}}, we should only use the ones that have additional functions or have a realistic prospect to have added functions in the future. For example {{fr-noun}} has a few extra functions, such as the automatic plural and the gender. {{fr-adv}} has no added functions (when compared to {{head|fr|adverb}}) and no realistic prospect of them, which is why I've nominated it for deletion. Renard Migrant (talk) 18:30, 15 November 2014 (UTC)

Changing the alternative display form parameters, again

Previous discussions: Wiktionary:Beer parlour/2014/January#Parameter to use for alternative display of links, Thread:User talk:CodeCat/Why the rush?

Now that we have Lua to automatically strip diacritics, we don't need the third parameter of {{l}}, {{m}} and similar templates nearly as often as before. Some people have brought this up before and suggested that we could rename this parameter to alt= and "shift" the gloss parameter (the fourth) downwards to take its place. But it's far from clear whether more entries use the fourth parameter than use the third. So before we make this change, I would like to add some tracking to these templates so that we can more easily judge which of the two parameters gets used more, and make a decision based on that. Is this ok? —CodeCat 20:56, 16 November 2014 (UTC)

I still think we should add |text= instead. Keφr 21:51, 16 November 2014 (UTC)
…by which I meant that instead of having {{m|en|A|alt=B}} we would write {{m|[[A|B]]}}. And instead of {{m|en||X}}, {{m|en|text=X}} (I preferred {{m|en|=X}} initially, though). Keφr 22:14, 16 November 2014 (UTC)
Yes it's a good idea, and I prefer |alt= because it's more established here. Renard Migrant (talk) 21:54, 16 November 2014 (UTC)
  • I support doing this, and I support naming the parameter |alt=. —Aɴɢʀ (talk) 22:06, 16 November 2014 (UTC)
    • I actually wanted to do this before deciding on whether to rename it. To see if it's needed. Keep in mind that there are many instances where {{m}} is used with only the alt display and no linked term. These cases turn up often in etymologies where you might want to show an intermediate reconstructed form without linking to it. Renaming the parameter would make such cases longer to type. —CodeCat 22:16, 16 November 2014 (UTC)
      Maybe we could create two more templates, such as {{l*|en|foo}} and {{m*|en|foo}} that would not automatically link the parameters (we don't have to actually go with my scheme-influenced asterisk usage)? --WikiTiki89 00:56, 17 November 2014 (UTC)
      Your suggested {{l*}} seems almost the same as {{lang}}. —CodeCat 01:13, 17 November 2014 (UTC)
      Except that {{lang}} doesn't support transliterations or language linking. --WikiTiki89 02:10, 17 November 2014 (UTC)
      It should probably support the latter. And maybe the former too. —CodeCat 02:29, 17 November 2014 (UTC)
      But regardless, we'd need a mention version of it as well. --WikiTiki89 03:31, 17 November 2014 (UTC)
      {{l*}} makes me rather think of reconstructed/unattested terms than Scheme. Keφr 12:18, 18 November 2014 (UTC)
      Then maybe you haven't used Scheme enough. The asterisk is used similarly to the way the prime symbol is used in mathematics. Compare functions such as let*, list*/cons*, and map*, and see this question. --WikiTiki89 16:55, 18 November 2014 (UTC)
      Quite probably so. Though why not {{l'}} if you just meant to use a prime? Or hell, even {{l′}}? (That is U+2032 PRIME.) Keφr 18:05, 18 November 2014 (UTC)
      The apostrophe is a special character in LISP-derived languages; the asterisk is not. The Unicode prime symbol would only work in some implementations and would be inconvenient to input anyway. As for why I didn't use an apostrophe here, I didn't actually connect the Scheme asterisk with the mathematical prime symbol until I was writing that response. --WikiTiki89 20:14, 18 November 2014 (UTC)
I don’t think the tracking is necessary. Even if it shows that there are more uses of {{m}} with an alternative display than with a gloss (which would not correspond to my personal experience), all that proves is that we need to start adding more glosses. But if that’s what it takes for people to support the parameter change, I see no harm in it. — Ungoliant (falai) 05:41, 17 November 2014 (UTC)
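
The kind of count discussed in this thread could be approximated offline with a short Python sketch. It is only an illustration under stated assumptions: the function name count_param_usage and the sample text are hypothetical, only {{l}} and {{m}} are matched, and invocations containing nested templates, piped links, or "=" inside a positional value are not handled. It does not represent how tracking would be added to the templates themselves.

```python
import re
from collections import Counter

# Matches simple {{l|...}} and {{m|...}} calls without nested braces.
TEMPLATE_RE = re.compile(r"\{\{(l|m)\|([^{}]*)\}\}")


def count_param_usage(wikitext: str) -> Counter:
    """Count non-empty third (alt display) and fourth (gloss) parameters."""
    counts = Counter()
    for match in TEMPLATE_RE.finditer(wikitext):
        # Keep positional parameters only; named ones (tr=...) are skipped.
        positional = [p for p in match.group(2).split("|") if "=" not in p]
        # positional[0] = language, [1] = term, [2] = alt display, [3] = gloss
        if len(positional) > 2 and positional[2].strip():
            counts["alt_display"] += 1
        if len(positional) > 3 and positional[3].strip():
            counts["gloss"] += 1
    return counts


sample = "From {{m|la|amo||I love}}, via {{m|la||*amare}}; see {{l|en|foo|Foo|tr=x}}."
print(count_param_usage(sample))
```

For the sample text, two invocations use the alternative display and one uses the gloss.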

Wiktionary:Votes/2014-11/Entries which do not meet CFI to be deleted even if there is a consensus to keep

Let's start applying our own rules! Otherwise I will be nominating Wiktionary:Criteria for inclusion for deletion as de facto it isn't being used anyway. Let's go one way or the other; apply our own rules or get rid of them altogether. Renard Migrant (talk) 18:49, 17 November 2014 (UTC)

Oh, you know I'm voting oppose! Purplebackpack89 20:32, 17 November 2014 (UTC)
I don't know how useful this vote is. It just means we'd be squabbling over "interpretations" of CFI. It has to come down to common sense in the end. To paraphrase Renard under another name, if the lunatics take over the asylum (single-issue propagandists, fringe kooks, etc.) we are not going to defeat them with rule-mongering. Equinox 22:32, 17 November 2014 (UTC)
Oh, it's not useful at all. There is an enormous crisis of implementation if this vote passes. Which it shouldn't, because entries should be kept or deleted because of consensus. I'm also worried about the motivations of this vote: it and the discussion above grew out of Renard's complaint about not enough articles getting deleted. I'd feel much more comfortable if this was coming from a neutral third-party rather than an ardent deletionist. Purplebackpack89 22:52, 17 November 2014 (UTC)
CFI is consensus in itself. It consists of the rules that all of Wiktionary has agreed to work with. If the rules don't suffice that doesn't mean we should just override them when we feel it's necessary. It means we need better rules. I'm also rather surprised that you, as a proponent of applying Wikipedia practices here, do not agree with the suggestion that we apply the common Wikipedia practice of using policies as rationales. Personally I think this is a problem here and if policies were enforced more strictly then they'd actually mean something. In particular I would love to see Wikipedia-style deletion messages used here, in which the name and a link for the relevant policy stating rationale for deletion is always included. —CodeCat 23:09, 17 November 2014 (UTC)
Since you brought up Wikipedia, my experience is that admins there don't tend to close an AfD against consensus, even if consensus is in one direction and policy is another. The exception to that is AfDs that have a lot of IPs or new editors, who are usually discounted. Purplebackpack89 23:32, 17 November 2014 (UTC)
That may be true, but I don't know what you're interpreting as "consensus" in this case. I've seen administrators close discussions with option 1 even though the majority of votes was for option 2. The rationale given for this was that the people who wanted option 1 gave better rationales, in particular rationales that had more merit with respect to Wikipedia policy. That consensus depends on the quality of arguments over numerical superiority is a Wikipedia policy in itself, as can be read at w:WP:Consensus. And I would definitely favour a similar approach on Wiktionary. The only problem is that with our much smaller number of users and administrators, it's harder to find someone who is not involved and who can therefore be trusted to view the given arguments impartially. So we have the problem of "no consensus over the consensus"... —CodeCat 00:03, 18 November 2014 (UTC)
I am interpreting "consensus" as the preponderance of opinions on the matter. I honestly don't think it occurs as often as you do. The decisions you cite where that's true almost always fall in the 60-40 range. Admins almost never close a discussion one way when more than 65% of the votes are the other. Purplebackpack89 00:05, 18 November 2014 (UTC)
It's not common, no. But the fact that it can and does happen does mean something. The fact that Wikipedia even has an official policy saying it must be done that way is even more telling. While Wiktionary is certainly not Wikipedia, there are some things we can learn from and I think this is one case. —CodeCat 00:08, 18 November 2014 (UTC)
Purple, that's absurd: claiming that Renard operates on how many entries get deleted. He only wants to delete the ones that are actual SoPs. I know you don't understand SoP, as you have often made evident. Equinox 00:23, 18 November 2014 (UTC)
I stand 100% by that. Prior to the beer parlour discussion above, Renard had expressed dismay that a number of entries were closed as keep despite his believing them to fail CFI. He mentions this dismay in the beer parlour discussion above. Purplebackpack89 00:30, 18 November 2014 (UTC)
If anyone is operating purely "by the numbers" and not by thought or logic, it is you, Purple, who want to decide everything with a transient "keep" or "delete" vote, rather than deciding why, and formulating rules based on the reasoning. Equinox 00:24, 18 November 2014 (UTC)
I object vociferously to what you've just said. I give reasons for every vote I make. And I'm nowhere near the most voting person here. I also understand what SOP is; I believe it to be a ridiculously restrictive policy that should be eliminated to allow us to have worthy entries that many other dictionaries have. It's not that I have no reasons, it's mostly that you and Renard don't like my reasons. Purplebackpack89 00:28, 18 November 2014 (UTC)
There won't be any point in making any entries (of more than one word) at all if this is passed. Migrant's policy would ruin the dictionary and it would be in danger of stagnation, IMO. There are more receptive dictionaries around on the Internet. Donnanz (talk) 23:29, 17 November 2014 (UTC)
Yep, what Donnanz said. Purplebackpack89 23:32, 17 November 2014 (UTC)
I'm not given to the expression of strong opinions around here, but, frankly, this proposal seems like an absolutely terrible idea to me. It isn't always clear-cut whether a term meets CFI or not. Sometimes the determination of CFI compliance depends upon making a subjective/qualitative judgment (SOPness) rather than simply ensuring that certain objective criteria are being met (three non-mention citations spanning a year). The RfD process exists to resolve such cases. We discuss the matter and reach a consensus. If we're not going to respect the outcome of these discussions — if we're going to allow admins to become judge, jury, and executioner, deleting entries at their sole discretion based on their own personal interpretations of CFI - then the RfD process will become nothing more than a meaningless song and dance. Discouraging discussion and disrespecting consensus in such a manner would be contrary to the collaborative spirit of this project. -Cloudcuckoolander (talk) 00:01, 18 November 2014 (UTC)
On Wikipedia, as I noted above, AfD (article for deletion) discussions are not the same kind of back-and-forth that is often seen here. Instead, each person gives arguments and then a third party will judge those arguments based on policy and decide which view has more merit. The fact that a third party is involved means that powers are somewhat separated. Perhaps we should do something similar here, by requiring that whoever closes an RFD discussion must not have taken part in it. —CodeCat 00:06, 18 November 2014 (UTC)
We absolutely should do that! Purplebackpack89 00:09, 18 November 2014 (UTC)
This looks like something that should be obviously desirable and nice in theory, but I do not think the CFI as it is written right now is a good policy to enforce to the letter. For starters, I would like WT:CFI amended to accommodate "hot word"/"hot sense", {{translation only}} and phrasebook entries first. But even if we do that, there will still be too much room for subjectivity in interpreting CFI, which at best means that RFDs will keep being votes — or worse: as one person on TOW put it, "AfD is not a vote" means "AfD is a vote that administrators are allowed to count any way they like.". Keφr 08:22, 18 November 2014 (UTC)

This is a joke proposal, right? It’s honestly difficult for me to tell. --Romanophile (talk) 16:03, 18 November 2014 (UTC)

Your comment is a joke, right? You think applying rules we already have is a joke? Renard Migrant (talk) 20:08, 19 November 2014 (UTC)
Maybe the joke is that we need to vote to approve rules that have been already formally enacted. Keφr 20:22, 19 November 2014 (UTC)
No, I believe Renard is solid in his convictions. Solid enough that the vote starts on Monday. Purplebackpack89 16:42, 18 November 2014 (UTC)
Purplebackpack89 is wrong about what he says about me, for the reasons he gives. It's nothing to do with the number of entries being deleted, but which ones. Also, Purplebackpack89, if you agree with what CodeCat says about deletion debates not being about counting votes but about the overall arguments made, why would you oppose this vote? It's what I'm proposing, after all. Renard Migrant (talk) 18:16, 18 November 2014 (UTC)
"Which ones". The ones I've seen you complain about the most vociferously are those where you voted delete, others voted keep, the discussion was closed as keep, and you claim the entry should be deleted on CFI grounds. What you're essentially proposing is to shut out a line of argument, which coincidentally tends to be one you don't agree with. Purplebackpack89 18:46, 18 November 2014 (UTC)
But if Renard's argument is rooted in CFI whereas nobody else's are, why should we ignore CFI? I support this vote because it means people will be forced to come up with better arguments - specifically, arguments that follow consensus-established Wiktionary policy - if they want their views to be taken into account. I think that's a good thing. —CodeCat 19:54, 18 November 2014 (UTC)
More than that, any problems that are in CFI, currently we've no reason to fix them because editors are free to ignore CFI as much as they choose. Forcing CFI to be applied will put it under much greater scrutiny and therefore it will get amended. What's the reason to improve it right now? Renard Migrant (talk) 22:05, 18 November 2014 (UTC)
Because participating in RfD is easy and changing CFI is hard. And CFI can never cover everything. Sometimes, you just have to use the Potter Stewart test. Purplebackpack89 22:08, 18 November 2014 (UTC)

Wiktionary:Votes/pl-2014-11/Require third-party closures of RfD and RfV discussions

Starting one week from today, there is going to be a vote on whether RfD and RfV discussions must be closed by uninvolved editors. Purplebackpack89 00:37, 18 November 2014 (UTC)

We generally do this anyway, no need to vote on it. --WikiTiki89 00:39, 18 November 2014 (UTC)
There's no policy that says we have to, though. Purplebackpack89 00:41, 18 November 2014 (UTC)
We don't need a vote to enforce something that we already do. --WikiTiki89 00:42, 18 November 2014 (UTC)
We need a policy to make sure we keep doing it. —CodeCat 00:52, 18 November 2014 (UTC)
It is not that clear that we should. It is naïve to think that requiring "uninvolved editors" to close discussions will eliminate biased closures — maybe even the opposite. Also, this requirement is especially superfluous on RfVs, where the existence of citations is usually not a matter of any interpretation. Keφr 08:30, 18 November 2014 (UTC)
I have removed RfVs from the vote, though I still harbor reservations that there's nothing stopping the same editor from starting and closing an RfV. Purplebackpack89 16:49, 18 November 2014 (UTC)
I think as it's worded, no, because it excludes anyone who voted. If it excluded just the original nominator and the entry creator, I could go with that. Excluding anyone who votes (though I note, not anyone who comments) could rule out practically everyone for really well discussed entries. Also why would someone who hasn't voted necessarily be 'unbiased'? Also, you could abstain from voting in order to be able to close, which means if you had a bias you'd be free to impose it onto the entry, because you hadn't voted. Renard Migrant (talk) 18:20, 18 November 2014 (UTC)
This. Keφr 18:33, 18 November 2014 (UTC)
I have closed dozens of deletion discussions that I have participated in - mostly because it seems (to me at least) that discussions tend to languish on the page long after they have run their course. If I hadn't closed those discussions, someone else would have had to pick up the ball, which wasn't happening. If consensus differs from my views, then I go with the consensus. The closed discussion remains on the page for a week following the closure, so if there are objections to the closure they can be raised. bd2412 T 17:07, 19 November 2014 (UTC)

Harassment by User:Kephir

Another admin, Kephir, is harassing me. He removed comments I made on another user's talk page, here and here. When I asked him not to do that, he deleted the message on my talk page, claiming it was vandalism here (commenting on another person's talk page is clearly not vandalism nor graffiti, as he labeled one of my edits). There are many other instances of harassment of me by this editor in months past, including a number of unwarranted personal attacks on talk pages and in edit summaries (this is a good example). Could someone PLEASE get him to stop? Purplebackpack89 22:56, 18 November 2014 (UTC)

If the abuse is long-term I recommend starting User:Purplebackpack89/Kephir. Subsequently write all the instances you can think of where he has misused his tools so that everything is in one place. That way you could remember every negative interaction that has occurred between you and if others agree with you then they may chime in where Kephir could possibly eventually get his admin tools revoked. Zigguzoo (talk) 23:05, 18 November 2014 (UTC)
Kephir has been an admin for 10 months only. Maybe he's experiencing powertrips of some sort but I get the feeling that if he continues on his current trajectory of misleading edit summaries and misuse of the tools he may face desysopping soon. Zigguzoo (talk) 01:12, 19 November 2014 (UTC)
Oh, he oughta. The deletion of talk page threads and hiding of edits he did was just seeing what he could get away with with the tools he's proving he shouldn't have. While it's acceptable to clear one's talk page, user warnings should not be tagged as vandalism or graffiti. What compelled him to remove good-faith edits from your talk page, I do not know. Purplebackpack89 03:30, 19 November 2014 (UTC)
I think he's making legitimate criticisms, and I also think you're happy to make legitimate criticisms yourself, but when someone does it to you, you accuse them of harassment. Why do you think it's ok for you to act that way, but when other people do it to you, it's awful and horrendous? Renard Migrant (talk) 18:02, 24 November 2014 (UTC)
@Renard Migrant:, because I don't delete the criticisms with the summary "vandalism", nor do I remove other people's comments on talk pages other than my own. It's not so much what he says to me, it's the misleading edit summaries. Purplebackpack89 17:31, 28 November 2014 (UTC)

And he's doing it again

When I expressed concern about his deletion of two templates, User:Kephir deleted my comment with the summary "Incomprehensible, meaningless or empty: please use the Sandbox". The sandbox is not the appropriate place for user issues. When I told him that, he just deleted that with the edit summary "No usable content given", which is a CSD for articles, not user comments. Can somebody please tell him to stop the misleading edit summaries? Purplebackpack89 19:04, 28 November 2014 (UTC)

Proposal to start WT:Courthouse

I propose we start WT:Courthouse, a page to discuss user conduct, blocks, reverts, etc. without disrupting the WT:BP. --WikiTiki89 16:09, 19 November 2014 (UTC)

Definitely support. But I wonder if it wouldn't be clearer if we just named the page after its function. —CodeCat 16:11, 19 November 2014 (UTC)
Support the page Not a big fan of the name, though. Purplebackpack89 16:13, 19 November 2014 (UTC)
Is "courthouse" not indicative of its function? --WikiTiki89 16:17, 19 November 2014 (UTC)
Not in the same way that WT:Vandalism in progress is. —CodeCat 16:29, 19 November 2014 (UTC)
Courthouse may actually be a little too indicative. I'd just call it Wiktionary:User conduct Purplebackpack89 16:31, 19 November 2014 (UTC)
Yes that's a good idea, although that might be mistaken for a page describing how users should behave. —CodeCat 16:35, 19 November 2014 (UTC)
WT:Vandalism in progress is a different story, since it's for emergencies. I chose the name WT:Courthouse by analogy to pages such as WT:Beer parlour, WT:Tea room, etc., except that it is actually more indicative of what the page is about. I'd put it at the same level of self-explanatory-ness as WT:Information desk. --WikiTiki89 16:47, 19 November 2014 (UTC)
I realise that. I'm not really that fussed about the names, I'm just wondering if the fancy names like "Beer parlour" and "Grease pit" aren't too confusing to new people who don't already know what they are. —CodeCat 16:50, 19 November 2014 (UTC)
Of course they're confusing, but only for the minute before they read the description on the page. But "Information desk" is not confusing, and I don't think "Courthouse" would be either. --WikiTiki89 17:09, 19 November 2014 (UTC)
What about Wiktionary:Dispute resolution? —CodeCat 17:19, 19 November 2014 (UTC)
That is currently a redirect to Help:Dispute resolution, which says, among other things, that if you have a dispute with a user, come here. That page will have to be reworded when the user conduct page is created. FWIW, I think the user conduct page would be more expansive than just dispute resolution. Purplebackpack89 17:37, 19 November 2014 (UTC)
"Dispute resolution" is something that can be done on userpages. This new page would be for when the dispute resolution fails. --WikiTiki89 18:17, 19 November 2014 (UTC)
...like above, when another editor arbitrarily decides any attempt to communicate with him is vandalism and immediately deletes it. Purplebackpack89 18:26, 19 November 2014 (UTC)
How about Court of last resort? —Stephen (Talk) 19:40, 19 November 2014 (UTC)
As a five-time administrator, I put my name forward for being the judge in the courthouse. --Type56op9 (talk) 14:36, 20 November 2014 (UTC)
In my experience the accusations of harassment and evidence gathering and posse forming which go along with it often result in more abuse than the instigating incident. I don't think adding a venue for tong wars will do anybody any favors. I would go so far as to say that disputes between users are better settled off of the site. - TheDaveRoss 23:09, 19 November 2014 (UTC)
I would have agreed, except that some users are always going to complain and giving them a place for it will prevent the Beer parlour from being disrupted by the complaints. It's like creating designated smoking areas versus banning smoking altogether. --WikiTiki89 23:16, 19 November 2014 (UTC)
I support the creation of Wiktionary:Courthouse, and with that name (and, consequently, with the shortcut WT:CH), if only to get all the recent whining off my watchlist via WT:BP and WT:RFD. — I.S.M.E.T.A. 21:52, 20 November 2014 (UTC)
I suspect this will become some sort of sensationalistic, exhibitionistic bullshit like Judge Judy. Let's do it. Equinox 02:58, 21 November 2014 (UTC)
I support this idea. It will be a place for just average block discussions, rather than for just emergency vandal reports. WT:VANDAL has in the past also been used for regular block discussions, which gives another advantage to WT:Courthouse. Rædi Stædi Yæti {-skriv til mig-} 03:36, 21 November 2014 (UTC)
  • How about calling it WT:Theater since it will be for nothing but drama? —Aɴɢʀ (talk) 21:18, 24 November 2014 (UTC)
    That could be a redirect. --WikiTiki89 15:26, 25 November 2014 (UTC)
  • I oppose this proliferation of discussion forums. User conduct is hardly ever discussed in public forums; user talk pages are usually enough for that. We already have too many forums. --Dan Polansky (talk) 01:23, 29 November 2014 (UTC)
    • Dan, what do you do if somebody refuses to engage? Or if you're at an impasse that can only be solved by a third set of eyes? Purplebackpack89 02:02, 29 November 2014 (UTC)



According to Wiktionary:Criteria for inclusion#Inflections, entries like keeps one's options open are to be deleted? (cf. keep one's options open).

Regards, — Automatik (talk) 20:23, 20 November 2014 (UTC)

Links to Appendix:Glossary in Template:inflection of

I've added a feature to this template (in its module Module:form of) that automatically links to the glossary definition of a given grammar tag, if it exists. For example:

Not all of the recognised grammar tags have glossary entries yet. I hope this is useful in any case. This feature would make it more desirable to use this template instead of {{form of}}, at least as long as it's possible. I will probably also add this feature to the "shortcut" templates like {{plural of}}, {{accusative of}} and so on. —CodeCat 18:12, 21 November 2014 (UTC)
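
The behaviour described above (link a grammar tag only when a glossary definition exists) can be sketched in Python. This is an illustration, not the code in Module:form of: the names GLOSSARY_ENTRIES, link_tag and format_tags are hypothetical, the set of glossary entries is made up, and the anchor format Appendix:Glossary#tag is an assumption.

```python
# Hypothetical set of tags that have a glossary definition.
GLOSSARY_ENTRIES = {"nominative", "plural"}


def link_tag(tag, glossary=GLOSSARY_ENTRIES):
    """Return a wikilink to the glossary if the tag is defined there."""
    if tag in glossary:
        # Assumed anchor format: one section anchor per tag.
        return f"[[Appendix:Glossary#{tag}|{tag}]]"
    # Tags without a glossary entry are left as plain text.
    return tag


def format_tags(tags, glossary=GLOSSARY_ENTRIES):
    """Join the (possibly linked) tags into one display string."""
    return " ".join(link_tag(t, glossary) for t in tags)


print(format_tags(["nominative", "dual"]))
# [[Appendix:Glossary#nominative|nominative]] dual
```

Because the lookup falls back to plain text, tags that do not yet have a glossary entry still display normally.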

Oddity in Template:context

(Not sure if this is BP or GP material...)

I was just adding context labels to two Chinese entries (traditional spelling 庫納 and simplified 库纳) to clarify that the meanings here are about a currency. I added {{context|lang=zh|currency}}. Confusingly, and incorrectly, this adds (numismatics) to the visible page, although it does add the page to Category:zh:Currency as expected.

I've left the context labels in place on those two entries. Could someone more familiar with the context infrastructure see about changing this behavior? Numismatics is specific to coinage, whereas currency is about more than just coins. TIA, ‑‑ Eiríkr Útlendi │ Tala við mig 19:30, 21 November 2014 (UTC)

This is all handled in Module:labels/data. You can remove the following lines, if you think that is the right thing to do:
aliases["currency"] = "numismatics"
deprecated["currency"] = true
--WikiTiki89 19:50, 21 November 2014 (UTC)
Some unelected person made the unsanctioned decision to deprecate the currency topical tag. Is it clear to anyone what the logic of these labels is? Is this deprecated because some believe topic categories are not a good idea or because this one is not to their taste? Is it explained anywhere accessible? It certainly doesn't fit the documentation of {{context}}.
In the absence of any particular sanction for such, I guess you can do whatever makes sense to you. Even if something ends up broken, it wouldn't be all bad: it might create some pressure to make some actual community decisions about this kind of thing. DCDuring TALK 19:55, 21 November 2014 (UTC)

Actually, the label only adds to the "Money" category. This is because the "numismatics" label is defined twice:

labels["numismatics"] = {
	display = "numismatics",
	topical_categories = {"Currency"} }
-- ...
labels["numismatics"] = {
	topical_categories = {"Money"} }

The following entries use the "currency" label as of now:

Use that list as you please. Keφr 21:15, 21 November 2014 (UTC)
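
Why the duplicate definition matters can be shown with a small Python sketch that mirrors the quoted lines using dictionaries. It is not the module's Lua code: the resolve function is hypothetical, and the fallback of displaying the canonical label name when no display text is given is an assumption.

```python
labels = {}
aliases = {}
deprecated = {}

# The two lines quoted earlier in this thread.
aliases["currency"] = "numismatics"
deprecated["currency"] = True

# First definition of the label.
labels["numismatics"] = {
    "display": "numismatics",
    "topical_categories": ["Currency"],
}
# A later assignment to the same key replaces the first one entirely,
# so the "Currency" category is lost.
labels["numismatics"] = {
    "topical_categories": ["Money"],
}


def resolve(label):
    """Follow an alias, then return (display text, categories)."""
    canonical = aliases.get(label, label)
    data = labels.get(canonical, {})
    # Assumed fallback: show the canonical name if no display is set.
    display = data.get("display", canonical)
    return display, data.get("topical_categories", [])


print(resolve("currency"))  # ('numismatics', ['Money'])
```

Under these assumptions the "currency" label displays as "numismatics" and categorises only into "Money", which matches the behaviour described above.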

Note that the {{context}} and {{label}} tags aren't supposed to be about clarifying which meaning of a word is being referenced. That's what {{gloss}} is for. The context tags are for labeling technical terms within a certain field. Numismatics is a field that has technical terms, but currency isn't. —Aɴɢʀ (talk) 21:19, 21 November 2014 (UTC)
We have many other labels like that, including "cardinal", "ordinal", "personal" and so on. We should probably get rid of those. —CodeCat 22:50, 21 November 2014 (UTC)
Those seem less clear cut than the more purely topical labels. Like other dictionaries we use beginning-of-the-line labels to convey grammatical information that qualifies and clarifies the definition. The labels mentioned above sometimes convey grammatical information. For example, in English, ordinals normally fit only in certain slots relative to determiners and adjectives in NPs. DCDuring TALK 15:56, 22 November 2014 (UTC)
But does that mean they should be a context label? "Ordinal" is not a context, it's a description of the semantic function of the word. —CodeCat 16:00, 22 November 2014 (UTC)
So are the labels about transitivity and countability. The convention among dictionaries is generally placement at the beginning of the line. (We rejected the other convention of having a separate header for transitive and intransitive.) Dictionaries that have information on complements (eg, head of following PP), semantic restrictions of what is modified, and orthography also place that at the beginning of the definition line. DCDuring TALK 16:27, 22 November 2014 (UTC)
Transitivity is contextual though. Verbs can have different meanings depending on the presence of an object. —CodeCat 18:38, 22 November 2014 (UTC)
Why should we care about what you or I think is or is not 'contextual' or 'grammatical'? We are discussing a user interface. The sole question of importance is what and where would users expect certain classes of information. I was simply arguing against another case of jumping to - and acting on - premature conclusions.
I like the idea of putting topical information after the definition, if we have it at all. Are you saying that this is information that belongs after the definition or on the inflection line or in a usage note or that it doesn't belong in the entry at all? DCDuring TALK 19:34, 22 November 2014 (UTC)
"cardinal" and "ordinal" don't need to be displayed because that should already be obvious from the definition and the part of speech. "personal" could just be a gloss after the definition, but only if it's needed to qualify it (for example who alone could be both a personal and relative pronoun). —CodeCat 20:37, 22 November 2014 (UTC)
I might agree with this conclusion, in most applications, but the evidence of how you reach it scares me. "Obvious" to whom?
Is it obvious from the PoS header "Adjective" or from the PoS header "Numeral"? Both of them are applied to English ordinal numbers. BTW, do we still have runs of entries with unusual or no-longer-conforming headers? DCDuring TALK 23:18, 22 November 2014 (UTC)
@CodeCat: At the very least next and last are ordinals that depart from the numeral pattern. Presumably, next to last, ultimate, penultimate, antepenultimate are also ordinals. My imagination and memory both fail to provide me with others, but I suspect there are more, some probably used only in special contexts. I think all of these have common characteristics in terms of order in noun phrases: '[det] [ord] [card] [adj]s [N]' (much less often '[det] [card] [ord] [adj]s [N]'), not *'[det] [adj]s [card] [ord] [N]', nor *'[det] [card] [adj]s [ord] [adj]s [N]', etc. Ie, 'the last six red cars' (or ?'the six last red cars'). I think this kind of characteristic suggests that we need to keep the 'ordinal' label. I have the strong suspicion that it is only by ignoring some of the fine-grained features of syntax that we can feel free to ignore this kind of essentially lexical information. I wonder whether we shouldn't actually apply an RfD process to any deletions of data from the modules. DCDuring TALK 01:25, 25 November 2014 (UTC)
I don't see why the meanings of those terms can't be expressed without putting {{lb|en|ordinal}} in front. —CodeCat 01:50, 25 November 2014 (UTC)
It is a question of SYNTAX, not semantics. That is exactly the sort of thing that we put in front-of-definition labels. By labeling them properly we can help folks use them properly without having to have a common usage note in every English entry that has an ordinal sense. Wiping out content that you don't seem to understand does not engender trust in your decisions. DCDuring TALK 04:46, 25 November 2014 (UTC)
What Angr said (at 21:19, 21 November 2014 UTC). "Currency" is a gloss, not a context. - -sche (discuss) 02:01, 25 November 2014 (UTC)

Wiktionary:Votes/sy-2014-11/User:ObsequiousNewt for admin

Hello all. I'm just posting this here to notify everyone that the vote to confer administratorship on User:ObsequiousNewt began twenty-five minutes ago; the vote will end at 24:00, 10 December 2014 (UTC). — I.S.M.E.T.A. 00:31, 25 November 2014 (UTC)

Proposed new page: Wiktionary:Errors and omissions

We often get feedback about things that are incorrect or missing in our entries. This is very useful, but having it on the feedback page puts it out of view of some of the editors who might fix it. Furthermore, the feedback is only for anonymous users; registered editors can't use it (the link simply doesn't show up). So I believe it would be beneficial to have a single page, somewhat like a discussion page but without any real "discussion", where IPs and logged-in users alike can report mistakes in entries. This would primarily be used by people who don't have the linguistic (might not be fluent enough to be sure) or technical (might not understand our code) knowledge to fix it themselves. —CodeCat 22:41, 25 November 2014 (UTC)

Maybe... though to me this feels slightly redundant to individual entries' talk pages (though I know they can be overlooked) and the Tea Room. Equinox 22:52, 25 November 2014 (UTC)
My intent was to create a page that is focused more on reporting errors and less on discussing them, which the Tea Room is more about. It's definitely not obvious to me (or to many anonymous users, I imagine) that the TR should be used to report mistakes. —CodeCat 22:59, 25 November 2014 (UTC)
But it could be made obvious, e.g. by having a link on every mainspace entry, visible to both registered and unregistered users, saying "Find an error or omission? Please report it to the Tea Room." —Aɴɢʀ (talk) 23:02, 25 November 2014 (UTC)
I like the idea. Putting the items in TR could overwhelm the place.
Could we do something to really encourage timid or uncertain users to report errors and omissions? For example, imagine a tab or something on the entry page with an invitation to report an error, with the report either on the talk page with a link thereto on the proposed page or directly on the proposed page. DCDuring TALK 01:05, 26 November 2014 (UTC)
I definitely like the idea of the "report" button, I was thinking of something like that as well. It would streamline the process for the user and make it more likely that they will report things, which helps us in turn. —CodeCat 01:10, 26 November 2014 (UTC)
I agree, let's use the tea room and talk pages rather than an entirely new page. WT:FEED is not ideal but still better than nothing. Remember a lot of these corrections will turn out to be wrong or unprovable. Renard Migrant (talk) 18:20, 26 November 2014 (UTC)
I agree with RM about the likely low yield (less than 50%, possibly much lower) of usable 'reports'. That seems to me to be a reason not to put such items on WT:TR. I do think we need a single page that has all the instances of such reports. Whether it would be populated with the 'reports' and links to the entry or with links to the section of the entry talk pages that contained the 'report' is worth discussion. I suppose leading users to the talk page has some advantages, especially as the issue may have been discussed before.
Is there any benefit to a limited test or is the best test full implementation with attentiveness to user response to the new invitation to 'report'? DCDuring TALK 15:32, 29 November 2014 (UTC)

Inflecting alternative forms

As an Ancient Greek editor, I have come across many words that have dialectal or alternative forms. For verbs, this can mean that different tenses have different alternative forms. Policy only states that the alternative lemma should not have a full entry, and what I have found is not clear as to which headers should be included. Therefore I ask: should inflection tables of alternative forms go only on the lemma entry, only on the alternative form entry, or on both?

I'd hold that they should at least appear on the lemma entry, if only for the reason that, in the case of GRC verbs, most alternative forms will be for non-present tenses. In fact, I would favor including nothing in an alternative form's entry but the Alternative forms, Pronunciation, and POS headers; the lemma should hold all of the verb's information, while alternative forms should be only a soft redirect. The argument against this, with regards to inflection at least, would be that the inflection of the alternative form is relevant to the alternative form and not the lemma, e.g. that the inflection of δᾶμος (dâmos) is relevant only to the alternative lemma δᾶμος (dâmos) and not the standard form δῆμος (dêmos). ObsequiousNewt (ἔβαζα|ἐτλέλεσα) 16:04, 28 November 2014 (UTC)

Inflections can be added to alternative forms. — Ungoliant (falai) 16:11, 28 November 2014 (UTC)
@Ungoliant MMDCCLXIV: Do you mean that this is the practice, or that you think it should be? And if so, should they be added to the lemma entry as well? ObsequiousNewt (ἔβαζα|ἐτλέλεσα) 16:43, 28 November 2014 (UTC)
It’s the common practice, at least for the languages that I read and edit (and also what I think it should be).
I don’t recall ever seeing the inflection of an alternative form displayed in the lemma. If the peculiarities of Ancient Greek mean it is better to list all the inflections in one place, you should convene the Ancient Greek contributors to see what they think, and write something in WT:AGRC if they agree it requires special treatment. — Ungoliant (falai) 16:54, 28 November 2014 (UTC)
I always include full inflection in alternative forms. —CodeCat 16:55, 28 November 2014 (UTC)

Reordering the years on Requested entries pages

There was a brief discussion at Wiktionary Talk:Requested entries (English)#Reordering the years of the benefits of having the years appear in descending order for each letter. The principal benefit would be the likely elimination of the need to move items to the current year from the first year listed when a requester accidentally places them in the wrong year. Users who click on the letter will find a list in which to place their request and will from time to time fail to notice the year section heading. Even users who do find the right year heading need extra pagedowns or scrolls to do so. Other than the work to do it and the modest change for habituated users of such pages I can't think of significant reasons not to do this. (The option of having separate sections for each year with alphabetical ordering thereunder would make it harder to find requests in different years that were duplicates or near duplicates.) DCDuring TALK 15:12, 29 November 2014 (UTC)

@DCDuring: Seems sensible. Go for it. — I.S.M.E.T.A. 22:57, 29 November 2014 (UTC)
Ordering by year before letter might make even more sense, so that people can focus on the oldest ones first. —CodeCat 23:13, 29 November 2014 (UTC)
@CodeCat: I assume you mean ascending order by year. Did you think the whole alphabet of requests for 2011 should appear before, say, those for 2015?
I am not concerned so much about what experienced contributors do, though I would not want to inconvenience them, as much as I am concerned about reducing needless user error, especially, but not limited to, newbies. Our veteran contributors can make decisions about the kind of requests they would like to fill (or delete) by whatever criteria they have, probably little influenced by how the page is presented. DCDuring TALK 00:02, 30 November 2014 (UTC)
@DCDuring: As a frequent editor of the page, I think CodeCat's idea is the most efficient. Year should be given priority over letter, as even the most experienced users would be much more likely to refocus their efforts on the older entries simply because they are further up the page, and they would thereby be completed more quickly. If they are being completed more quickly, the inexperienced users would be less likely to even stumble upon the entries from the older years as they would already be completed. I support the replacement of the year and letter sections with each other. WikiWinters (talk) 22:39, 30 November 2014 (UTC)
I am not entirely certain what you mean by priority. Are you saying that all 2011 items should precede all 2012 items etc, down to the present? I contrast your view with that of another frequent user of the page, Equinox, who seems to think that almost anything more than a year old is not worthwhile. I would have thought that new additions are most worthy of consideration. Consideration leads to entry creation, removal from the list, or kicking the can down the road. After a few contributors have given an item such consideration, presumably the item is too specialized for any current contributor. Why would we want to force folks who have already looked at the item to keep on looking at it or to expend keystrokes to avoid looking at it? DCDuring TALK 22:59, 30 November 2014 (UTC)
@DCDuring: I mean that, rather than having sections for each letter and sorting each letter's entries by year, I propose that there be sections for each year (only the years that contain entries currently) and that within the year sections there be smaller sections for each letter, so it would be the opposite of what it is now. Many of the older entries easily meet the CFI requirements, and, while standard protocol is for the entries to be open to discussion and for the community to leave entries as they are, their theoretical placement above all of the newer entries would make it easier for all users, regardless of level of experience, to notice them. It's obviously expected that people not immediately delete entries because they don't think they meet CFI, but with this new system, experienced users would not have to scroll down as much and would notice the entries that do not meet CFI more easily, and new users would be more likely to attempt to create the first entries they see, which would be the ones from the oldest year, at the top. Newer entries tend to meet CFI more easily, simply because the older entries generally stay only after users see that they are older and are therefore probably "there for a reason," while the newer ones are generally treated more at face value and less by seniority, so as to be dealt with accordingly at a more rapid and efficient rate. WikiWinters (talk) 23:32, 30 November 2014 (UTC)
SemperBlotto used to wipe the 12-month-old requests every year (around Christmas, in fact; I like to imagine he thrust them into a cheerful Dickensian fire). I felt this was a good idea, since important words would keep coming up again and again, while dross (like much of the WT:REE page) would be lost. But someone objected to this, which is why the page is now full of crap nobody will create, but nobody dares to delete. Equinox 02:59, 30 November 2014 (UTC)
Pushing the old year sections to the bottom of each letter section is somewhat in that direction, but I was mostly trying to reduce keystrokes without a lot of controversy. DCDuring TALK 03:39, 30 November 2014 (UTC)
@Equinox, SemperBlotto: Would SB now delete everything from 2011 and 2012? So, most of the time, except December, we would have two years of requests? DCDuring TALK 22:59, 30 November 2014 (UTC)
@Equinox: How about moving all requests from the calendar year before last to subpages (e.g., WT:RE:en/2012 and previous)? — I.S.M.E.T.A. 09:33, 30 November 2014 (UTC)
A possibility that would mean that duplicates would be harder to discover. I suppose one could argue that it is the more recent addition that better indicates degree of interest. DCDuring TALK 13:16, 30 November 2014 (UTC)
@DCDuring: My thinking was that such consignment to subpages would have the decluttering effect of SemperBlotto's Dickensian fire, but without the actual loss / deletion that some have objected to. The presence of duplicates is pretty unimportant; they can be cleared out as blue links every few months. — I.S.M.E.T.A. 14:58, 30 November 2014 (UTC)
I sometimes sign requests <small>~~~~</small>. It shows who made the request and when. Maybe get rid of the years altogether as it isn't when the request was made that matters. Renard Migrant (talk) 16:40, 1 December 2014 (UTC)
Signatures make the page larger and might force us to a subpage structure.
In the absence of signatures, the year conveys something: that none of those who fill requests have been able or willing to fill the specific request over a time period, nor have they felt free to delete it. It is obvious to me that fresh requests are more actionable (whether the action be fulfillment, deletion, or comment) than stale ones.
Perhaps we need some kind of workflow-related organization of requests: automatic linking to component terms (as in {{head}}?) and a timestamp would be the best initial presentation, possibly together with determination whether the item was a duplicate of a previous request. This would be a better base for subsequent decisions, such as fulfillment or deletion. Items not rapidly fulfilled or deleted could be manually labelled with "issue" tags like "attestable?", "SoP?", "spelling?", "language?", or tagged as topically specialized. Thus subsequent reviewers of the items would not be starting over and could pick items based on the issues or topical specialty involved. DCDuring TALK 17:17, 1 December 2014 (UTC)
If we absolutely must keep every addition to the page, then I think we need a somewhat "intelligent" script or database that can track all additions, and boost the popularity of a word when a second (third, fourth...) person requests it, etc.; and this will at least give us a numerical justification for deleting very old and unpopular requests. But it seems like massive overkill. I still prefer the idea of wiping the page every couple of years. Equinox 06:08, 17 December 2014 (UTC)
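
The tracking idea described above can be sketched roughly as follows. This is only an illustration in Python, not an existing Wiktionary script: the class name `RequestTracker`, its methods, and the thresholds (one year of age, at least two distinct requesters) are all assumptions made for the example.

```python
from datetime import date


class RequestTracker:
    """Hypothetical tally of entry requests; not an existing Wiktionary tool."""

    def __init__(self):
        # term -> set of requesters, so repeat requests by one person count once
        self.requesters = {}
        # term -> date of the most recent request for that term
        self.last_requested = {}

    def add_request(self, term, user, when):
        """Record that `user` requested `term` on date `when`."""
        self.requesters.setdefault(term, set()).add(user)
        previous = self.last_requested.get(term)
        if previous is None or when > previous:
            self.last_requested[term] = when

    def popularity(self, term):
        """Number of distinct people who have requested the term."""
        return len(self.requesters.get(term, ()))

    def stale(self, today, max_age_days=365, min_requesters=2):
        """Terms last requested long ago by fewer than `min_requesters` people."""
        return sorted(
            term
            for term, last in self.last_requested.items()
            if (today - last).days > max_age_days
            and self.popularity(term) < min_requesters
        )


tracker = RequestTracker()
tracker.add_request("foo", "UserA", date(2012, 3, 1))
tracker.add_request("foo", "UserB", date(2013, 6, 1))
tracker.add_request("bar", "UserA", date(2012, 5, 1))
print(tracker.stale(date(2014, 12, 17)))
```

Here "foo" survives because two different people asked for it, while "bar" is reported as a deletion candidate; whether such numbers should justify deletion is exactly the policy question under discussion.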

Html button

Hello everyone! I had an idea for a new button: an HTML button. It would fully allow HTML and give people the ability to make entries better.

Supersonic414-On Wikia 00:09, 30 November 2014 (UTC)

And what would it be used for exactly? —CodeCat 00:38, 30 November 2014 (UTC)
The only button that makes things better is the secret red one that nukes everybody on the planet. Allowing free HTML on here would just encourage stupid exploits like JavaScript. Equinox 03:10, 30 November 2014 (UTC)

Categorizing misspellings

Please comment: I went to categorize Grauniad under Category:English misspellings, but the category's introduction says that it is for unintentional misspellings. Do we have or should we have some scheme for navigating deliberate misspellings? cf. eye spellings like da and tha for "the". —Justin (koavf)TCM 20:08, 30 November 2014 (UTC)

We have {{eye dialect|...|lang=xx}}. But for deliberate misspellings, well, if you deliberately misspell something, you spell it a way that is not in the dictionary; so I don't see why we should expect to include these until/unless they become mainstream spellings. See the recent discussion on rediculous (which CodeCat spells that way on purpose). Equinox 20:18, 30 November 2014 (UTC)
Not all deliberate misspellings are eye dialect, though; Grauniad isn't. Category:English misspellings does already contain some terms labeled "deliberate misspellings", though, e.g. enuf. In fact, in that entry I see that {{deliberate misspelling of}} automatically categorizes into the Misspellings category. —Aɴɢʀ (talk) 21:04, 30 November 2014 (UTC)
What about "fictional" misspellings? I'm thinking of Jane Austen Persuasion, end of chapter VI: ‘... mentioning him in strong, though not perfectly well-spelt praise, as "a fine dashing felow, only two perticular about the school-master," ...’ (bolding added) [8]. Do these get automatic entries? They are fictional eye-dialect, but of course, deliberate by Austen. Choor monster (talk) 21:12, 30 November 2014 (UTC)
Perticular would probably be attestable as eye-dialect. I hear that pronunciation sometimes. DCDuring TALK 23:06, 30 November 2014 (UTC)
It's already an entry perticular, where it's marked as "obsolete". Choor monster (talk) 23:25, 30 November 2014 (UTC)
Change the category description. Renard Migrant (talk) 16:47, 1 December 2014 (UTC)
To what? The OED makes it clear that it's obsolete. I gave the Austen cite as a prominent example of a deliberate fictional misspelling, not eye-dialect, a distinct kind of misspelling from something like "rediculous" or "nucular". Usually these are one-offs, but because Austen is Austen, maybe it rates an entry. Similar, but not rating an entry would be from the end of Miss MacIntosh, My Darling: "She would hang a sign in the restaurant window. Owt to luntsch. Bee bak in a whale. For she could not spell either." Choor monster (talk) 17:21, 1 December 2014 (UTC)
No, change the category description, the written statement at the top of the CATEGORY. Renard Migrant (talk) 23:12, 2 December 2014 (UTC)

Eye spellings: Don't go too off the rails about eye spellings: this isn't one. I only mentioned them as another type of misspelling which is not accidental. —Justin (koavf)TCM 10:52, 2 December 2014 (UTC)

Koavf's change to the description is pretty good. But what do you think of making the wording even more compact, like:
  • "Common accidental (or sometimes deliberate) misspellings of {{{langname}}} terms."
 ? - -sche (discuss) 18:26, 9 December 2014 (UTC)

December 2014

Is 'label' the new 'context'?

As in [9] and [10]. If so, what has 'label' got that 'context' hasn't? Is it just a younger model? I can't keep track of the never-ending cycle of template changes. Kaixinguo (talk) 13:00, 1 December 2014 (UTC)

The language code goes in the first positional parameter (instead of named |lang=), which some find convenient. Also, there are some definition labels which are not strictly contexts, so the name is a bit more accurate. Keφr 14:10, 1 December 2014 (UTC)
Thank you. I need to look into when I should be using 'label', then. Kaixinguo (talk) 10:33, 2 December 2014 (UTC)
@Kaixinguo: The edits of Embryomystic are not supported by consensus. The use of {{label}} is not standard; it currently sees a tiny minority use. See also Wiktionary:Votes/2014-08/Templates context and label and also the talk page Wiktionary talk:Votes/2014-08/Templates context and label. --Dan Polansky (talk) 18:32, 5 December 2014 (UTC)
I'll be honest and say that I prefer {{label}} over {{context}}, but the main reason for those edits was to add language. embryomystic (talk) 23:40, 6 December 2014 (UTC)
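
For anyone trying to keep track of the difference, the parameter change described above (language code as first positional parameter instead of named |lang=) can be illustrated with a small Python sketch. It is not a bot anyone here runs and not a recommendation to convert entries, given the lack of consensus noted above; the function name `context_to_label` is made up for the example, and it deliberately ignores calls that contain nested templates.

```python
import re

# Matches a whole {{context|...}} call that contains no nested templates.
CONTEXT_RE = re.compile(r"\{\{context\|([^{}]*)\}\}")


def context_to_label(wikitext):
    """Rewrite {{context|a|b|lang=xx}} as {{label|xx|a|b}}.

    Calls without a lang= parameter are left untouched, because the
    language cannot be known from the template call alone.
    """

    def convert(match):
        lang = None
        labels = []
        for param in match.group(1).split("|"):
            if param.startswith("lang="):
                lang = param[len("lang="):]
            else:
                labels.append(param)
        if not lang:
            return match.group(0)  # no language given: do not guess one
        return "{{label|" + "|".join([lang] + labels) + "}}"

    return CONTEXT_RE.sub(convert, wikitext)


print(context_to_label("# {{context|informal|lang=en}} A thing."))
```

The only thing that moves is the language code; the labels themselves keep their order.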

Allow checking translations with the translation editor

I think it might be nice if the translation editor could somehow allow translations that need checking to be marked as "checked". That way people won't need to edit the entry anymore, which speeds things up. —CodeCat 19:14, 1 December 2014 (UTC)

New changes to Chinese entries

Input needed
This discussion needs further input in order to be successfully closed. Please take a look!

(Notifying Kc kennylau, Atitarev, Tooironic, Jamesjiao, Bumm13, Meihouwang):

Dear Chinese-language editors,

As the presence of Chinese entries grows rapidly on Wiktionary, I would like to propose some further changes to the format of Chinese entries. These changes aim to reduce the workload of Chinese-language editors (more specifically, avoid data reduplication and synchronisation hassles) and further neatify the code of Chinese entries. The changes include:

  1. Introduction of "lemma forms" for Chinese entries. This follows from sporadic suggestions and discussions raised before, such as at Talk:個. Copying my post there, the arguments for the introduction of lemma forms are:
... introducing the idea of lemma forms, so that information is all kept centralised and the trad-simp entries do not have to be synchronised. Currently the supposedly established practice of synchronisation is poorly maintained, from what I observe from my bot's sweeping edits. Conceivably the lemma forms should be traditional, since trad-to-simp conversion can be performed reasonably reliably. It's not because I discriminate against simp; I grew up with those characters too. That way we could just enable automatic trad-to-simp conversion in zh-usex, and all the information on the page (with the minor exception of the title) would be di-scripted.
In detail, this lemma form proposal would entail:
  • Centralising all information (etymology, pronunciation, definitions, "see-also" terms, compounds) at the traditional form, which is considered the lemma form of a Chinese word. If multiple traditional forms exist, the most common form is chosen as the lemma form.
  • All Chinese text under the Chinese header, including example sentences, related terms, synonyms/antonyms would include both scripts. Example sentences in both scripts can be generated automatically by the template {{zh-usex}}.
  • Other non-lemma forms will be converted to a soft-redirect - see User:Wyang/历史 and User:Wyang/语. The format of these redirects is negotiable but the principle is they should contain as little information as possible.
  2. Adopting a new neater version of the Hanzi box - {{zh-forms}} (backend is Module:zh-forms). Now that many complex editing tasks could be performed automatically with the Lua language, there is no need for partial manual coding of the Hanzi box as is currently implemented. Instead of the code
at 電腦, one can use

Please let me know what you guys think about these changes.

Thank you!

Wyang (talk) 11:08, 2 December 2014 (UTC)

(Not an editor of Chinese) I always thought it was interesting but perhaps inconsistent that content has been removed from so-called British entries and focussed on the American spelling, the reason given being that it would be too much work to maintain both entries, whilst having a far greater number of simplified and traditional Chinese entries to maintain. Kaixinguo (talk) 11:24, 2 December 2014 (UTC)
Interesting, but not true. There is no rule or practice to consolidate only in favor of American entries: it can go either way. Chuck Entz (talk) 13:13, 2 December 2014 (UTC)
The idea of centralization sounds logical. I have a couple of questions:
  1. Will the centralized information on the lemma page be too crowded? Since both formats will be displayed.
  2. Will the Hanzi-box still display the characters in the order of simplified-traditional? Would it make more sense to switch the order to trad-simp now that the lemma page will be the traditional?
  3. The soft redirect will not categorize the simplified entries. Is that ok?
  4. Instead of the soft redirect, what about a simple page similar to alternative spelling that we use for other languages?
  5. Will the lemma page visibly contain both formats at the same time or can users set preferences to see only one of them? --Panda10 (talk) 14:31, 2 December 2014 (UTC)
Ping didn't work for me, repeating here @Kc kennylau, Tooironic, Jamesjiao, Bumm13, Meihouwang:.
Support in principle. My preference is simplified for lemmas but if the rest decides traditional, I won't object. (I will give my reasons for preferring simplified over traditional later.)
Any dictionary - print or online uses just one form for articles, providing the other form for reference. Yes, keeping the info centralised is the main issue. --Anatoli T. (обсудить/вклад) 21:06, 2 December 2014 (UTC)


  • I also support this proposal in theory. It would save myself and other Chinese editors a lot of time dealing with the synchronising work. Using the traditional form as the lemma makes sense, since mapping from trad->simp can be easily automatised, while simp->trad would require the intervention of human editors. Are we at the stage now where we can see a model of this proposed change? ---> Tooironic (talk)
    • Another argument for traditional is that it's more widely represented within the time span of the language we call "Chinese". Simplified is only about... half a century old? —CodeCat 23:02, 2 December 2014 (UTC)
Pro-simplified arguments:
  1. Much more common today. Used in (mainland) China (also Singapore, Malaysia, Indonesia) as the obviously biggest user of the Chinese language (in any form). Not sure if there are statistics about it but you can guess that the explosion of Chinese on the Internet is due to China proper.
  2. Some will disagree but I doubt China will go back to traditional characters. Traditional characters are used only in Taiwan and Hong Kong. The status of both is less than a fully independent state. Taiwan government is working hard on preservation of traditional characters for a reason. Many Hong Kong citizens are fluent in English, so foreigners don't need to know Chinese (Cantonese or Mandarin) to get by.
  3. Overseas Chinese almost completely switched to simplified. Universities mostly use it in education. Also preferred by learners for practical reasons.
  4. Ancient books are all converted to simplified spellings, including works in Classical Chinese. How often do you need to read ancient books?
  5. Japanese shinjitai is also relatively new but the battle shinjitai vs kyūjitai (pre-reform spellings) is almost over. The simplification process wasn't agreed on with communities outside China, so the resistance is still big and anti-simplification propaganda affects some people. It shouldn't be politicised and there's no need to link simplified Chinese with communism. It has happened, compare with 1918 reform of the Russian spelling. Although there are flaws, there are obvious benefits.
  6. Simplified Chinese jiantizi is standardized much better than traditional Chinese fantizi. It's fantizi that still has more variants, obscure characters and IME (input methods) are better suited for simplified Chinese.
  7. Although traditional Chinese is the original form and links better to Sino-Xenic derivations, it's not always true. Japanese has its own simplification - shinjitai, which matches 30% (I guess) with jiantizi and has its own, Japanese specific forms. Korean and Vietnamese have their variants to some extent and for these languages, Chinese characters are no longer the writing system they use, especially Vietnamese.
  8. Most Chinese contributors are from mainland China, the majority of learners, IMHO, focus on simplified Chinese. I wonder how they feel when they find they have to click on the soft redirect links to get the info. --Anatoli T. (обсудить/вклад) 23:42, 2 December 2014 (UTC)
I wish to stress that I won't object to traditional over simplified, if everybody wants so, but the choice should be carefully weighed. Languages and scripts are like currencies: it's not what we like but what's more common and practical. --Anatoli T. (обсудить/вклад) 01:11, 3 December 2014 (UTC)
Question: Is there a one-to-many correspondence between simplified and traditional characters, or is there a many-to-many correspondence? In other words, are there any traditional characters that correspond to more than one simplified character? --WikiTiki89 01:24, 3 December 2014 (UTC)
Usually it's a one-to-many correspondence between simplified and traditional characters (in this order). So a conversion from simplified to traditional would be harder but if traditional characters are used as a source for usage examples, etc, then it would still work. It's quite rare but a simplified version should be manually fed into templates, when there are variants. Automatic conversion tools work better (but not 100%) from traditional to simplified. --Anatoli T. (обсудить/вклад) 01:38, 3 December 2014 (UTC)
So you're saying there are traditional characters that have more than one simplified character, but very few of them? --WikiTiki89 01:43, 3 December 2014 (UTC)
Apparently, there are 19 (?) such traditional characters: , , , , , , , , , , , , , , , , , , . Most common/notable is probably . Wenlin editor always asks how you wish to convert the character to simplified. Don't fully trust Wiktionary on this, single-character entries need a lot of attention. --Anatoli T. (обсудить/вклад) 01:53, 3 December 2014 (UTC)
The funniest character is , which, when incorrectly translated causes mistranslations, like "dry food" becomes "fuck food". is both simplified and traditional for this sense. --Anatoli T. (обсудить/вклад) 01:59, 3 December 2014 (UTC)
That explains some of those Chinese mistranslation memes. Anyway, that means that if we have an automatic conversion from traditional to simplified (whatever it is used for), then it should require a manual conversion whenever those characters are present, right? So I have an idea that would let us have the lemmas at the simplified character entries and still make use of automatic conversion: At simplified character entries, we can group the definitions by traditional equivalents that are indicated in the headwords and include HTML anchors. Then the traditional character entries will link to the correct anchor at the simplified character entries, bringing the reader directly to the definitions he was looking for. --WikiTiki89 02:17, 3 December 2014 (UTC)
With automatic conversion it always requires the knowledge/intervention of an editor, whether you create a simplified or traditional character term, character forms and pinyin. See 台湾 (Taiwan) or its traditional equivalents 臺灣 and 台灣. Or others like 什么. The simplified entry could contain a usage example, where traditional characters are used with parameters when a different conversion is required:
  1. 什麼 [Beijing Mandarin, MSC, trad.]
    什么 [Beijing Mandarin, MSC, simp.]
    Zhè shì shénme? [Pinyin]
    What is this?--Anatoli T. (обсудить/вклад) 02:30, 3 December 2014 (UTC)
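
The point made above, that traditional-to-simplified conversion is mostly automatic but needs an editor's choice for a handful of characters, can be sketched in Python as follows. This is only an illustration under stated assumptions: the two tables are tiny hand-picked samples (乾 is used as the ambiguous case, since it becomes 干 in the sense "dry" but stays 乾 in words like 乾坤), the function name and the `overrides` parameter are hypothetical, and this is not how {{zh-usex}} or any existing module is implemented.

```python
# Tiny illustrative tables; a real converter would need full character data.
UNAMBIGUOUS = {"電": "电", "腦": "脑", "歷": "历", "語": "语", "學": "学", "風": "风"}

# Traditional characters whose simplified form depends on the sense.
AMBIGUOUS = {"乾": ("干", "乾")}


def to_simplified(text, overrides=None):
    """Convert traditional text to simplified.

    `overrides` maps a character position to the simplified character an
    editor has chosen. A ValueError is raised when an ambiguous character
    has no override, so the ambiguity is never resolved silently.
    """
    overrides = overrides or {}
    out = []
    for index, char in enumerate(text):
        if index in overrides:
            out.append(overrides[index])
        elif char in AMBIGUOUS:
            raise ValueError(
                f"{char!r} at position {index} needs a manual choice "
                f"from {AMBIGUOUS[char]}"
            )
        else:
            # Characters not in the table are assumed identical in both scripts.
            out.append(UNAMBIGUOUS.get(char, char))
    return "".join(out)


print(to_simplified("電腦"))
print(to_simplified("乾燥", overrides={0: "干"}))
```

Refusing to guess is the design choice here: a wrong automatic choice is how "dry" ends up mistranslated, so the sketch forces a manual parameter in exactly the cases where one would be needed.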

Thank you for the replies. Answering Panda10's questions:

  1. Centralised information on the lemma page will not look crowded - the examples and derived terms will be in collapsed mode. Please see for an example.
  2. Good point. I have changed the order.
  3. Categorisation will be added.
  4. Personally I think the current format for alternative spellings will be excessive for non-lemma Chinese forms. Pronunciation, definitions, see-also terms will be 100% the same.
  5. Currently both are displayed, but envisageably some sort of gadget could be developed for this purpose, using what the Chinese Wikipedia and Wiktionary do.

@Tooironic: The new lemma forms would look like (with all Chinese text in both scripts), and the non-lemma forms would look like User:Wyang/历史 and User:Wyang/语 (or other formats if people prefer).

I entirely agree with CodeCat's reason that Traditional Chinese's long history is a factor not to be overlooked here. It is the reason that Hanyu Da Cidian, the most inclusive Chinese dictionary produced by PRC and in history, uses traditional forms as headwords and most of its citations. The main advantage would be the automation of the trad-simp conversions. Graphical etymologies and descendants would also be heaps easier when done at the traditional forms - see for example 風#Etymology and 學#Etymology.

The choice of script for the title is IMO not a crucial choice, since the title would be the only Chinese text that is not di-scripted on the lemma pages. (The title may even be di-scripted with gadgets.) Soft redirects shouldn't be too much of a problem - since information is not lost. Wyang (talk) 04:33, 3 December 2014 (UTC)

@Wyang: Let it be traditional then. Do we need a vote? If you set it up, I'll support it. I'm not sure if {{ping}} works, maybe we need to poll some editors manually. --Anatoli T. (обсудить/вклад) 21:24, 3 December 2014 (UTC)
This is Chinese so you need tone marks: {{pīng}}. —CodeCat 13:54, 4 December 2014 (UTC)
Is that or ? Chuck Entz (talk) 14:15, 4 December 2014 (UTC)
We can bombard those unresponsive with 乒乒乓乓. Wyang (talk) 03:13, 5 December 2014 (UTC)
OK, thanks Anatoli. Let's wait for a day or two, and we can start the vote then. Wyang (talk) 13:40, 4 December 2014 (UTC)
We have to work out details of the format (a template) and categorising (more detailed) of jiantizi, perhaps also about interwikis (Wiktionary:Grease_pit/2014/December#Interwiki_bots), since Chinese Wiktionary mainly uses jiantizi. It's a big job; hopefully it can be automated and simplified characters are not disadvantaged in usage examples, synonyms, etc. --Anatoli T. (обсудить/вклад) 21:47, 4 December 2014 (UTC)
OK, any suggestions for the template and the categorisation? Wyang (talk) 03:13, 5 December 2014 (UTC)
I think categorisation should possibly mirror traditional entries, if that's reasonable and possible (without having to edit the entry manually - done and forgotten). A simplified entry should have a definition line, and (IMHO) a short usage note (standard in PRC, Singapore, Malaysia, etc.). I'll give it a thought. --Anatoli T. (обсудить/вклад) 05:20, 5 December 2014 (UTC)
I added a note and categorisation - Please take a look at User:Wyang/历史 and User:Wyang/热爱. Wyang (talk) 07:20, 5 December 2014 (UTC)
I still find the dark gray template at the simplified entry too different from the current standards. Would it be too complicated to create a template with the below layout? How about adding the Hanzi box? Another topic: As the language develops over time, is there a chance that the simplified character will get a new meaning that the traditional character will not have?


# ''simplified form of'' '''[[歷史]]'''

====Usage notes====
* '''[[Simplified Chinese]]''' is mainly used in Mainland China and Singapore.
* '''[[Traditional Chinese]]''' is mainly used in Hong Kong, Macau and Taiwan.

--Panda10 (talk) 19:00, 5 December 2014 (UTC)

My preference would be to have minimal information on the non-lemma pages, i.e. link only. Definitions and other details (including parts of speech and hanzi box) are covered at the main entry, for clarity and ease of maintenance (e.g. 保险). Answer to the second question: No, they are strictly two versions of the same thing. Wyang (talk) 02:25, 6 December 2014 (UTC)

Support. I support this proposal. I was worried about categorisation but I read above that it will be handled. For the conflict between simplified and traditional, is it possible to have a template on the non-lemma page which will parse the lemma page, extract the parts corresponding to the non-lemma word, convert it to the non-lemma script and display it? This way, the lemma and non-lemma page will display the same information. Meihouwang (talk) 15:00, 6 December 2014 (UTC)

(I'm not a Chinese-speaker, but) I support this proposal. Compare how Swiss spellings like Strasse are soft-redirected. - -sche (discuss) 18:39, 9 December 2014 (UTC)

We can use an existing L3 header, which is legal! - ===Hanzi=== and probably not just for simplified but ALL Chinese entries, Wyang, you may get away with the ===Definitions=== idea after all but using ===Hanzi=== instead. Chinese may not need PoS headers:


# ''simplified form of'' '''[[歷史]]'''

...notes follow

An entry where simplified is also traditional for some senses could use a normal format. --Anatoli T. (обсудить/вклад) 00:42, 7 December 2014 (UTC)

  • ===Hanzi=== was only for single-character entries, no? Much like the ===Kanji=== header used in Japanese single-character entries? ‑‑ Eiríkr Útlendi │ Tala við mig 01:02, 7 December 2014 (UTC)
Yes, that was the original intention but hanzi (and kanji) are invariable nouns. --Anatoli T. (обсудить/вклад) 01:14, 7 December 2014 (UTC)
  • Sorry, I don't understand your comment. I think you mean that the singular and plural forms are identical, which is fine, but also irrelevant to my intended point.
To expand upon my earlier question, if ===Hanzi=== has been used primarily in single-character entries as a header indicating information about that specific character that does not belong under any of the other headers (such as character composition, Unicode chart links, historical development), then the sample use above (in a multi-character entry, and not as a header indicating info about these characters, but rather as a generic non-POS header) strikes me as misleading and potentially confusing. ‑‑ Eiríkr Útlendi │ Tala við mig 07:11, 7 December 2014 (UTC)
The information (non-lexical) you're referring to is stored under the ==Translingual== L2 header, not under the ===Hanzi=== L3 header. Everything lexical related to single or multiple character terms is the same - it's lexical; transliterations and pronunciations in the new format are under ===Pronunciation===. Unlike Japanese (also Korean, Vietnamese) - hanzi is the only writing system for standard Chinese, so whether it's a single character (even a component) or a long word, they can all be handled under one header. Rather than using ===Definitions=== (there is no agreement on this heading and an administrator may potentially remove it), I suggest using ===Hanzi=== (which is legal) or we need to promote ===Definitions=== and make it legal. --Anatoli T. (обсудить/вклад) 09:34, 7 December 2014 (UTC)
I'm neutral on the choice of header, if any. My feeling is that this general format gives too little emphasis on the actual link, in that it does not specifically inform readers of where to find definitions and other content instead. I'm quite sure that people might start complaining about "no definition, no pronunciation" in Wiktionary:Feedback, since the redirect appears no different from a normal link and is not conspicuous enough. Wyang (talk) 20:23, 7 December 2014 (UTC)
You can add the information after the 'simplified form of 歷史' - in the same line - as you planned in the original gray box. --Panda10 (talk) 21:05, 7 December 2014 (UTC)
First things first. We need descriptive templates for links to traditional and usage notes templates, which can mention that all the info is in the traditional form entry. It's better to split them into two. Simplified characters may be also traditional for some senses or alternative forms, like or . "Hanzi" header may be used for both traditional and simplified entries. --Anatoli T. (обсудить/вклад) 23:05, 7 December 2014 (UTC)
How about this? Wyang (talk) 07:56, 8 December 2014 (UTC)
Thanks, it looks good BUT it may not pass the requirements on WT:ELE#Definitions and someone may complain and we'll have to redo as it was with Wiktionary:Votes/pl-2013-03/Japanese Romaji romanization - format and content and Wiktionary:Votes/pl-2013-03/Romanization and definition line. Maybe the links should have an L3 header (not generated by a template), e.g. ===Hanzi===? The problem with ===Definitions=== header is that it has not been approved yet, so a change may not be supported because a new header is introduced, not because of the change itself. Not sure about colours either, maybe no colours should be present in the redirect, just text. I'm just thinking of things that can possibly cause problems. --Anatoli T. (обсудить/вклад) 11:42, 8 December 2014 (UTC)
Since I am not an editor of Chinese, my apologies for my questions. But when I click on the traditional link in the redirect box, where is it supposed to jump to? To the translingual entry? Or to the Mandarin entry? Currently, it just goes to the top of the page.
Do you need the collapse function for the 2-line simplified/traditional usage note? Could they be centralized? For example: placed into their own box on the right, only once for the entire entry. Or no box, just simple text in small font, right under the Chinese L2 header, before any of the L3 headers start. It's a central note, valid for all ety's.
I agree with Anatoli that the colors may not be necessary in the redirect, just the text, although I understand that the color would highlight the fact that this is a redirect, not just a usual ety. For the text itself, there could be several variations, depending on which part should be first and which second. Just reducing the number of double quotes might help to simplify the look. Two possible examples:

Etymology 1

See ('dry') for pronunciation and definitions of . ( is the simplified form of .)


Etymology 1

is the simplified form of . See ('dry') for pronunciation and definitions.

--Panda10 (talk) 18:06, 8 December 2014 (UTC)
Thank you both for the suggestions. My understanding is that Wiktionary:ELE serves as a format guide for normal entries (The first sentence on the page says "While the information below may represent some kind of “standard” form, it is not a set of rigid rules."). Wiktionary lacks policy or even precedents of soft-redirects, for situations where a multi-scripted language consistently redirects forms written in one script to the forms written in the other script. This is of a completely different nature to the "non-lemma forms" commonly mentioned before, which were mostly stylistic (naivety), compounding (sockpuppet), historical (anæmia) and erroneous (accomodation) variants, with very few being actual regional variants (compare the non-redirection of colour/color). As a consequence, there is no properly designed layout for this new category of soft-redirects, and the existing layout fails to give due emphasis to the link to information, misleading readers to think that the form is an uncommon (and unimportant) alternative variant. In the case of , the inconspicuity of the redirections does not create the impression that two thirds of the definitions of the character should actually be found at the respective lemma forms.
Answering Panda10: The link to traditional links to the Chinese section, if it exists. The merger of Chinese variants is ongoing. The box will only show the notes if the variant type is "simplified". It could also be "ancient", "obsolete" or "variant", in which case the notes would not be appropriate. Characters like 干 are very rare - most will have only one note displayed, thus having notes underneath the link might be more explanatory. Wyang (talk) 00:09, 9 December 2014 (UTC)
The closest thing we have for a single-language full-entry soft redirect is {{no entry}}. —CodeCat 19:06, 9 December 2014 (UTC)
I would say {{pinyin reading of}} or {{ja-romaji}} are close equivalents to soft redirects. I personally don't have strong objections to Wyang's suggested format but I'm almost sure there will be strong opposition to the new format once we start changing entries. The votes and discussions on the Japanese romaji entry format are a good example of that, despite the seeming triviality of romanisation entries. Specifically, the definition line (starting with #, not generated by a template) and a PoS header (including "Definitions" or "Hanzi") should be sorted first - they need to get some kind of legitimacy. The topic is mostly ignored now by people who will surely raise their voice later. I don't want to create obstacles, I fully support the change (despite my preference for simplified) but I'm worried about the time and effort that could be spent. --Anatoli T. (обсудить/вклад) 21:35, 9 December 2014 (UTC)
If KassadBot (or similar) is restarted, it will flag these entries as incorrectly formatted. --Anatoli T. (обсудить/вклад) 21:53, 9 December 2014 (UTC)
{{pinyin reading of}} and {{ja-romaji}} are not for native scripts, whereas these redirects will be for native scripted forms and therefore need to be more eye-catching. IMO the first step is to make sure the introduction of lemma forms is agreed upon, and further changes to formats can be discussed once the first is established. A vote shouldn't be necessary if there is overwhelming support from related editors - and it seems from the discussion above that we can presume that step 1 is accepted by most, if not all. Wyang (talk) 09:02, 11 December 2014 (UTC)
I know they are not for native scripts but the lack of "#" on a definition line on romaji entries caused quite a stir (even if it was generated by a template). The new format examples introduce the Definitions header as well (when simp./trad. is shared in some cases), for which we still have opposition. I've made some 字 entries with this header, anyway. I think you can proceed. --Anatoli T. (обсудить/вклад) 00:47, 12 December 2014 (UTC)
  • For the record, I disagree with removing definitions from simplified entries. Let those who want to concentrate on traditional entries do so; they should not be forced to create simplified entries at the same time. --Dan Polansky (talk) 12:13, 14 December 2014 (UTC)
The problem is not about having or not wanting to create simplified entries but about having them badly out of sync. Long-timers have been putting a lot of effort into keeping them in sync but newcomers often fail to do so, especially with entries with derived terms, see also's, etc. It's admittedly hard to synchronise large entries. Please note that the changes User:Wyang made will allow both forms to be shown in usage examples, etc. Editors only need to provide the traditional form. For the record, no single published or online dictionary has identical contents of both traditional and simplified Chinese, it's always one or the other. The other form is also provided. Our objective is to have both forms for each usage example, synonym, etc. so that there is no information loss and users who have difficulties with traditional Chinese could use simplified right next to it (or below in multiline usexes). The only disadvantage (at this moment) to simplified Chinese users is having to click through the link but even sophisticated electronic dictionaries are not able to show both forms in usexes, users have to set options as in Pleco or Wenlin. --Anatoli T. (обсудить/вклад) 10:48, 19 December 2014 (UTC)

FYI: Wiktionary:Votes/pl-2014-12/Making simplified Chinese soft-redirect to traditional Chinese

Demoting kyūjitai to stubs/soft-redirects

Somewhat similar to the discussion just above and Wiktionary:Tea_room/2014/December#社会 and 社會, I suggest making some changes to Japanese entries that are kyūjitai and not in current use (some kind of exception could perhaps be made for kyūjitai that are still in use). Even the format of 社會 is too much, IMHO. It shouldn't contain translations, romanisation, etc., just a one-line link to the lemma - 社会. I have no exact format at the moment, I just wish to mention it and get opinions. Calling @Eirikr, TAKASUGI Shinji, Haplology, Whym: (please add anyone I missed). --Anatoli T. (обсудить/вклад) 03:50, 3 December 2014 (UTC)

I agree. We use {{archaic spelling of}} for English. Why not for Japanese? — TAKASUGI Shinji (talk) 23:46, 3 December 2014 (UTC)
I agree that a one-line link would be sufficient for them. As for the wording, I would prefer saying something like "archaic" (as in the template Takasugi-san suggests) or "rarely used", than "not in current use". At least some words in kyūjitai such as 藝術 (art), (cherry tree) appear to me more like archaic than obsolete (in the sense that most people, if not all, can understand). This is perhaps because I keep seeing some institutions and people using those forms as part of their names. Whym (talk) 00:19, 4 December 2014 (UTC)
I also agree, and I think {{archaic spelling of}} sounds like a great idea. Many (most?) of these spellings aren't strictly obsolete, as Whym notes, and do get used intentionally from time to time. ‑‑ Eiríkr Útlendi │ Tala við mig 00:50, 4 December 2014 (UTC)
{{archaic spelling of}} is not the best solution, IMO, as there are archaic spellings, which are not kyūjitai. Kyūjitai merits a separate template, with categorisations. I also support removing all additional infos, definitions, examples to avoid duplications. --Anatoli T. (обсудить/вклад) 00:52, 12 December 2014 (UTC)

PoS filtering at OneLook

Here is a description of the new capability added to OneLook. It already had wildcard searches which allowed searches for words ending in "full" (which we don't as "full" is not a suffix and {{compound}} does not categorize). DCDuring TALK 17:02, 2 December 2014 (UTC)

Using templates to synchronize US-UK spelling

FYI: Wiktionary:Grease pit/2014/December#Revisiting the issue of English UK/US spellings and entry synchroni(s|z)ation. --Dan Polansky (talk) 22:07, 5 December 2014 (UTC)

Esperanto participles - markup in headword lines

FYI, an editor currently mass replaces "{{head|eo|participle}}" with the likes of "{{eo-part|alĝustig|ite}}", as in diff. I don't know the benefit of such a replacement; if you are an Esperanto editor (User:Mr. Granger?), you might want to have a look and see whether the change seems good to you. --Dan Polansky (talk) 11:03, 6 December 2014 (UTC)

{{eo-part|alĝustig|ite}} puts the term into Category:Esperanto adverbial participles, while {{head|eo|participle}} puts it into Category:Esperanto participles, so it's more specific. {{head|eo|adverbial participle}} would have the same effect, but by using {{eo-part}}, editors don't have to remember which kind of participle is associated with which suffix, as the template does the work for them. —Aɴɢʀ (talk) 11:14, 6 December 2014 (UTC)
Is the goal to have Category:Esperanto participles empty, having all participles classified in one of Category:Esperanto adjectival participles‎, Category:Esperanto adverbial participles‎ and Category:Esperanto nominal participles‎? --Dan Polansky (talk) 11:43, 6 December 2014 (UTC)
That sounds like a reasonable goal to me—every participle is either adjectival, adverbial, or nominal, so it's certainly possible to move all of them to the subcategories, and maybe that would be useful in some way. There's no need for User:Embryomystic to do it by hand, though—it could easily be done by bot. —Mr. Granger (talk • contribs) 14:12, 6 December 2014 (UTC)
Fair enough. Send in the bots. embryomystic (talk) 23:33, 6 December 2014 (UTC)
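As a rough illustration of the point made above about {{eo-part}} (the suffix alone determines which kind of participle it is), here is a small Python sketch that a bot could use to pick the subcategory. The function name `participle_category` is hypothetical, and this is not a description of how the template itself is implemented; it only encodes the regular Esperanto pattern of a participle marker (-ant-, -int-, -ont-, -at-, -it-, -ot-) followed by -a, -e or -o.

```python
# Final vowel -> kind of participle.
ENDING_KIND = {"a": "adjectival", "e": "adverbial", "o": "nominal"}
# Active and passive participle markers.
MARKERS = ("ant", "int", "ont", "at", "it", "ot")

def participle_category(suffix):
    """Map a participle suffix such as 'ite' to a category name.
    Falls back to the general category for anything unrecognised."""
    marker, ending = suffix[:-1], suffix[-1:]
    if marker in MARKERS and ending in ENDING_KIND:
        return "Esperanto " + ENDING_KIND[ending] + " participles"
    return "Esperanto participles"

print(participle_category("ite"))   # the suffix from the example above
```

With a helper like this, the editor (or bot) only supplies the stem and the suffix, as in {{eo-part|alĝustig|ite}}, and never has to remember which ending goes with which category.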

Mass or indiscriminate adding of RFE - requests for etymology

I noticed a user is mass adding RFE tags to Estonian entries; not using a bot but in considerable volumes anyway. My opposition to these request tags is probably known; I still oppose this practice. If there are other people who, like me, think that the tags are pointless, especially when being added indiscriminately, maybe we could do something to prevent the continued addition of these tags. For reference: Category:Estonian entries needing etymology, Recent changes in that category. --Dan Polansky (talk) 15:10, 7 December 2014 (UTC)

Are you suggesting that these entries do not need etymology? —CodeCat 15:43, 7 December 2014 (UTC)
All entries ought to have etymology ultimately, but the absence of an ety section already indicates that it is missing. RFE should be reserved for words that a user has a particularly keen interest in; otherwise we might just as well auto-add it to every entry, which is unhelpful for readers. Equinox 16:15, 7 December 2014 (UTC)
My experience is that a notice makes people more likely to add information. It has helped with inflections for example. Furthermore, the category is a to-do list, as it shows all entries that still lack an etymology. Keen interest is irrelevant; for every entry where a user adds a request template to indicate an interest, there are ten more where the user has simply left, disappointed, without adding a notice. —CodeCat 16:21, 7 December 2014 (UTC)
We also (judging from the feedback page) have users who leave disappointed because they literally can't find the definition among all the tables of contents and large ety and pron sections! Equinox 16:32, 7 December 2014 (UTC)
Tabbed Languages, people. Keφr 16:34, 7 December 2014 (UTC)
{{rfelite}} is less intrusive than {{rfe}}. DCDuring TALK 17:01, 7 December 2014 (UTC)
rfelite still requires an etymology section heading for what is not content, just a request. --Dan Polansky (talk) 17:30, 7 December 2014 (UTC)
But we're talking about a category that would contain millions of entries- entries which require individual attention by human beings with knowledge on how to do etymologies. It would never be cleared in your lifetime or mine- what kind of a motivator is that? Chuck Entz (talk) 22:43, 7 December 2014 (UTC)
The category name is misleading. Recently, the category name was Category:Requests for etymology (Estonian). The renaming happened via Wiktionary:Requests for moves, mergers and splits#Category:English definitions needed to Category:English entries needing definition discussion in August 2014 with very little participation; the only boldfaced support there was by Wikitiki. Now as before, I think Wiktionary:Requests for moves, mergers and splits should either be discontinued or limited to pages in the main namespace, since it is a positively harmful process.
I have seen no evidence that these RFE tags make people more likely to add etymologies, and I don't believe that to be the case. --Dan Polansky (talk) 17:03, 7 December 2014 (UTC)
FYI: Wiktionary:Votes/2014-12/Adding RFEs to all lemma entries where etymology is missing. --Dan Polansky (talk) 17:07, 7 December 2014 (UTC)
I think you may be right after all. If there are votes for every little issue you have, people will start to ignore those, too. I know I will. —CodeCat 17:14, 7 December 2014 (UTC)
It's not a little issue. It's a deviation from a previous practice. Up to now, we did not try to use RFE (which is still named a "request") to cover all missing etymologies; even right now, it is not our practice. Fact is, you are not very good at creating votes that result in support for your changes. Off the top of my head I don't remember any, but there probably is at least one such vote. One of your recent proposals has a vote which does not show consensus for your change: Wiktionary:Votes/2014-08/Migrating from Template:term to Template:m. --Dan Polansky (talk) 17:30, 7 December 2014 (UTC)
In my experience, a limited number of requests (of any type) is necessary and most editors and some users do this but when they are too many, it's demotivating (an exception is, obviously when there are known editors who do this on a regular basis or there is a previous agreement). I dislike when people add translation requests to any entry they edit, especially when it is a request of a non-trivial term and into a language we have few contributors for. --Anatoli T. (обсудить/вклад) 03:47, 8 December 2014 (UTC)

Template:attributive of under Adjective PoS

A significant number of the 675 uses of {{attributive of}} appear under the Adjective PoS header. The overwhelming majority of these are not adjectives, as the use of the template suggests and as tests for adjectivity would likely show.

  1. Do we want to keep these inane entries and add more to preempt the creation of shoddy, inane Adjective PoS sections by well-meaning contributors?
  2. Do we want to clean them out by RfV to test the validity of their adjectivity?
  3. Do we want to allow contributors to delete all of them that are not attested and do not have any other definition that might warrant inclusion?

Neither option 1 nor option 3 is in accord with CFI, but that may be more a suggestion than a policy anyway. DCDuring TALK 23:07, 8 December 2014 (UTC)

There can be no mass deletion. We can rfv them or rfd them all separately. Or change the header and templates to noun. That's the only one that can be done en masse. Renard Migrant (talk) 23:22, 8 December 2014 (UTC)
There usually already is a Noun PoS. So you think it would be OK to move the offending sense line to the existing Noun header?
Are you at all concerned about the likely re-creation of an Adjective PoS section after the changing-to or merging-into the Noun PoS header? DCDuring TALK 23:46, 8 December 2014 (UTC)

Translations of non-lemma forms (see newest)

Do we allow translations for English non-lemma forms? See newest. --Panda10 (talk) 15:16, 9 December 2014 (UTC)

  • Why would we not? bd2412 T 16:15, 9 December 2014 (UTC).
Would a superlative form be considered an "inflected form"? According to Wiktionary:Entry_layout_explained#Translations: "English inflected forms will not have translations. For example, paints will not, as it is the plural and third-person singular of paint. In such entries as have additional meanings, these additional meanings should have translations. For example, the noun building should have translations, but the present participle of build will not." --Panda10 (talk) 16:42, 9 December 2014 (UTC)
Yes, a superlative is considered an inflected form. —Aɴɢʀ (talk) 16:47, 9 December 2014 (UTC)
Not in all languages. Latin comparatives and superlatives are considered lemmas in Wiktionary. And in many other languages such as Slovene and Finnish, the comparative and superlative are what you might call a "half-lemma": they have a non-lemma definition, but they also have their own inflection table like a lemma. Participles are treated similarly in many languages. —CodeCat 19:40, 9 December 2014 (UTC)
Latin novissimus is in Category:Latin non-lemma forms, not in Category:Latin lemmas. —Aɴɢʀ (talk) 20:52, 9 December 2014 (UTC)
But compare Category:Latin superlative adjectives. —CodeCat 21:13, 9 December 2014 (UTC)
  • Because we presumably list the translation at the lemma form, and inflected forms of the foreign word at the foreign-language entry. If you want to know how to say "newest" in Hungarian or Polish, you go to [[new#Translations]], find the Hungarian or Polish word, go to that entry, and see what the superlative of it is. I'd be opposed to listing translations of nonlemma forms, because of the sheer quantity of translations that could theoretically be added, especially for verb forms. I don't relish the idea of seeing an entire translation table of third-person singular forms at [[walks]] and two entire translation tables (one for the past tense and one for the past participle) at [[walked]]; especially not for languages where those forms are not distinct from that language's lemma form in the first place, meaning the entries for those languages would be redundant to the entries at [[walk]]. We'd never be able to keep them coordinated with the translations tables at the lemma form, which is why we already use {{trans-see}} for near-perfect synonyms, alternative spellings, and the like. —Aɴɢʀ (talk) 16:47, 9 December 2014 (UTC)
    What Angr said. DCDuring TALK 16:53, 9 December 2014 (UTC)
    It seems to me that it would make it much easier for the reader if they could look up the translation by going to the exact word for which it is a translation. This would be doubly useful for words for which the lemma might have a dozen different meanings, but a particular inflection only occurs for one of those meanings. bd2412 T 17:26, 9 December 2014 (UTC)
    Let me introduce a radical consideration: resources, ie, contributors. Is this what we would like to either do ourselves, offer to new contributors as a task, or attempt to automate? DCDuring TALK 19:28, 9 December 2014 (UTC)
    There are eleven translation tables at [[new]], some with dozens of languages in them. Are we to repeat all of those translations at [[newest]], but using the superlative form, even for languages where the superlative is formed fully regularly, even periphrastically (e.g. le plus nouveau with each word linked separately)? And when someone comes along to [[new]] and adds a new language, say Marathi, to the translations, who's going to go to [[newest]] and add the superlative there? And many languages have multiple past tenses, while English only has one, not to mention multiple persons and numbers; should the French translation for [[walked]] list all of the following: marchais, marchait, marchions, marchiez, marchaient, marchai, marchas, marcha, marchâmes, marchâtes, marchèrent, ai marché, as marché, a marché, avons marché, avez marché, ont marché? And then the same thing for all of the polysemous verbs that have more than one French translation; shall we list all 17 forms for each verb that can be used to translate the English word? Lower Sorbian has at least four verbs that mean "go", two past tenses, three persons, and three numbers; should the translation table for [[went]] really list all 72 forms? I don't think that's going to make things any easier for the reader than simply going to [[go]], finding the Lower Sorbian lemmas, and then finding the appropriate inflected form on the Lower Sorbian lemma page. —Aɴɢʀ (talk) 20:52, 9 December 2014 (UTC)
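    The arithmetic behind the figures above can be checked with a short Python sketch. The verb and tense labels are placeholders (assumptions for illustration, not actual Lower Sorbian lemmas or tense names); only the counts of 4 verbs, 2 past tenses, 3 persons and 3 numbers come from the comment itself.

```python
from itertools import product

# Placeholder labels: only the sizes of these lists matter.
verbs = ["verb 1", "verb 2", "verb 3", "verb 4"]
tenses = ["past tense A", "past tense B"]
persons = ["1st", "2nd", "3rd"]
numbers = ["singular", "dual", "plural"]

# Every combination is one cell a translation table for "went" would need.
cells = list(product(verbs, tenses, persons, numbers))
form_count = len(cells)

# The French forms listed above for a single verb.
french_walked = [
    "marchais", "marchait", "marchions", "marchiez", "marchaient",
    "marchai", "marchas", "marcha", "marchâmes", "marchâtes", "marchèrent",
    "ai marché", "as marché", "a marché", "avons marché", "avez marché",
    "ont marché",
]

print(form_count)          # 4 * 2 * 3 * 3
print(len(french_walked))  # forms for one French verb
```

    The count grows multiplicatively with each grammatical category, which is the core of the maintenance concern: every extra verb or tense multiplies the number of cells rather than adding to it.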
    No one is proposing such a thing any more than they are proposing that "newest" should have eleven senses reflecting the senses at "new" (some of which seem redundant to me, like sense 12 merely being a special case of sense 3). As for conjugations, we can find other ways to deal with those. I am merely proposing that we should give the reader the shortest path to finding what they want. If contributors don't want to add such information, then it won't get added, but that doesn't mean it should be prohibited. We might as well say that etymologies or pronunciations of non-English terms should be prohibited because including them is too daunting a task for contributors to engage in. bd2412 T 21:17, 9 December 2014 (UTC)
    But indirectly that is what's being proposed, because if we remove the prohibition on inflected forms having translations from WT:ELE, there's nothing to stop someone from adding 72 Lower Sorbian forms in a translation table at [[went]]. And that would not help anyone, not even someone trying to figure out how to say "I went to Cottbus" in Lower Sorbian. Basically, I think it's an illusion that listing the translations for inflected forms will help the reader. It seems at first blush like it will, but in actual practice it won't. —Aɴɢʀ (talk) 21:44, 9 December 2014 (UTC)
    Can we draw a distinction between inflected verbs and inflected adjectives? Are we going to find 72 Lower Sorbian forms of "newest"? bd2412 T 22:22, 9 December 2014 (UTC)
    We can, but the original question was about English non-lemma forms in general, not English adjective forms specifically. There will only be 15 distinct Lower Sorbian forms of "newest". —Aɴɢʀ (talk) 22:57, 9 December 2014 (UTC)
    A tricky topic. The superlative form [[newest]] can have a lemma form as well in the Slavic languages - masculine, singular, nominative case, e.g. in Russian it's нове́йший (novéjšij) or са́мый но́вый (sámyj nóvyj); other genders, plural and cases should not be in the translation, if they are added. If I were to translate the past tense form [[went]] into Russian, then I would use masculine singular - шёл impf (šol), пошёл pf (pošól) - concrete of идти́ (idtí), ходи́л impf (xodíl), походи́л pf (poxodíl) - abstract of ходи́ть (xodítʹ) (verbs of movement can have concrete and abstract versions in Slavic languages). Plus, there are equivalent verbs to go by a vehicle (perfective, imperfective, concrete, abstract), so there could be at least eight translations into Russian. See go#Translations, e.g. translations into Russian. --Anatoli T. (обсудить/вклад) 23:36, 9 December 2014 (UTC)
To answer the question, no, it was formally disallowed by a vote. Renard Migrant (talk) 15:11, 10 December 2014 (UTC)
Did the vote explicitly define whether comparatives and superlatives are considered inflected forms? I would say they are, but there doesn't seem to be unanimity on that issue. —Aɴɢʀ (talk) 21:00, 10 December 2014 (UTC)
The Wiktionary:Votes/pl-2011-02/Disallowing translations for English inflected forms did not mention comparatives and superlatives. --Panda10 (talk) 14:57, 11 December 2014 (UTC)
In that case, what we need to decide is whether comparatives and superlatives are considered inflected forms in the sense of that vote or not. —Aɴɢʀ (talk) 15:09, 11 December 2014 (UTC)
I wrote that vote, and they are. Renard Migrant (talk) 18:22, 11 December 2014 (UTC)
Who wrote the vote is immaterial. Voters only voted on what it says in the vote, not on the unspoken intentions of the creator of the vote. --Dan Polansky (talk) 12:37, 14 December 2014 (UTC)
For the purposes of translations (and I think most others), I think they should be. Apart from "logistic" issues (i.e. keeping the lists in synch) and redundancy (you can already translate the ungraded form and look up the graded form in the target language's entry), different languages may handle gradation differently (e.g. elative form in Arabic, or several "workaround" constructs for Japanese, which has no proper adjectives to begin with), so often there will not be a good match in the target language. Keφr 18:24, 11 December 2014 (UTC)
If people want to amend the vote to explicitly cover comparative and superlative forms, fine. Basically the intention was Category:English non-lemma forms (which didn't exist yet). Renard Migrant (talk) 13:49, 14 December 2014 (UTC)
You didn't write that. User:Mglovesfun did. You can't speak for him.
In any event that is how I understood it to apply in English. I was somewhat aware of the complications in Latin with inflected forms of participles and gerunds, but, perhaps unrealistically, did not expect the vote to be applied mechanistically, even in English, let alone across all languages. AFAICT we have never been very good at drafting substantive proposals that anticipated even obvious matters such as this. (Not that we've been much better on procedural matters.) DCDuring TALK 14:33, 14 December 2014 (UTC)

@bd2412: the reason is simple: in most cases, it's meaningless, because inflection rules depend only on the language and on the context in the sentence. An example: if the feminine form of an adjective is used in Italian because the noun is feminine, it must be translated to the masculine form of the corresponding adjective in French if the feminine noun is translated to a masculine noun. This is why the definition should be Feminine form of, not a translation (and, anyway, the translation in English would be the same for all forms of the Italian adjective). For the translation section, this is the same issue. Sometimes, grammatical rules are similar enough (e.g. existence of a plural form meaning the same in two languages), but this is a special case. Lmaltier (talk) 08:56, 20 December 2014 (UTC)

Oxford Dictionaries word of the year 2014

Oxford Dictionaries chose vape (we had it in 2012) and runners-up bae (we had it in 2014), budtender (2012), contactless (2008), indyref (we don't have it), normcore (2014), slacktivism (2006). Equinox 23:55, 10 December 2014 (UTC)

  • Excellent, but we shouldn't dislocate our shoulder patting ourselves on the back. DCDuring TALK 00:45, 11 December 2014 (UTC)
  • indyref added December 2014 - not sure if it will have a lasting usage. SemperBlotto (talk) 08:44, 11 December 2014 (UTC)
    indyref is a word? I thought it was a hashtag. Renard Migrant (talk) 18:23, 11 December 2014 (UTC)
    • A Google News search turns up some reputable media outlets that appear to be using it as a word, though mostly as shorthand in headlines. bd2412 T 18:42, 11 December 2014 (UTC)

Request for Permissions

I would like to request to be able to delete pages and move pages without a redirect, please. Anglom (talk) 18:38, 11 December 2014 (UTC)

  • That would make you an administrator, as those are the only editors able to do so. Based on your length and duration of activity here, I would support you in an adminship bid if you make one. bd2412 T 18:44, 11 December 2014 (UTC)
  • @Anglom: Who are you? I do not recall you from the dramaboards. Keφr 18:53, 11 December 2014 (UTC)
@bd2412: Ah, I didn't realize. Thank you. An adminship is probably more responsibility than I'm willing to take on right now, but how might I go about that in the future? By requesting here?
@Keφr I'm sorry, I don't much make it over to discussion pages. Anglom (talk) 19:16, 11 December 2014 (UTC)
@Kephir: I know Anglom from his work on Germanic languages. He's a good and conscientious editor who largely stays away from drama. If this were Wikipedia, his almost complete avoidance of the project namespace would be problematic, but here at Wiktionary I don't think it is. @Anglom:, being an admin doesn't actually give you more responsibilities unless you want them. You're not obligated to go vandal hunting, or block people, or protect pages, or anything like that. I'd support you for adminship too. —Aɴɢʀ (talk) 21:09, 11 December 2014 (UTC)
I offer conditional support. Anglom needs to put up a Babel box, be emailable, and, most importantly, provide etymology and gender for Alle. DCDuring TALK 21:55, 11 December 2014 (UTC)
The gender is hard to find, I assumed it would be a neuter third declension i-stem, but going by the International Code of Zoological Nomenclature "30.2.3. If no gender was specified, the name takes the gender indicated by its combination with one or more adjectival species-group names of the originally included nominal species", I would have to say feminine based on Alca, yes? Anglom (talk) 00:05, 12 December 2014 (UTC)
Thanks for humoring me and for your diligence. I was hoping that there was an answer from the apparent language of origin. DCDuring TALK 01:12, 12 December 2014 (UTC)
Hold on, DCDuring. You already created the vote, but I have not yet seen Anglom say that he wants to be an admin. --WikiTiki89 03:17, 12 December 2014 (UTC)
He said he would accept the nomination. DCDuring TALK 03:19, 12 December 2014 (UTC)
Oh, I didn't notice that you asked him on his talk page. --WikiTiki89 03:21, 12 December 2014 (UTC)

Deletion of rfv-passed and the like

FYI, there is a proposal to delete {{rfv-passed}}, {{rfd-passed}} and the like and to replace it with a longer markup. It is here: Wiktionary:Requests_for_deletion/Others#Archive_templates. I oppose the proposal. If it ain't broke, don't fix it, and don't make the markup longer. --Dan Polansky (talk) 10:40, 14 December 2014 (UTC)

Converting classic talk pages to Flow

I oppose converting classic talk pages to Flow. This is a Beer parlour subject, IMHO. (It was raised at Wiktionary:Grease pit/2014/December#The process of converting classic talk pages to Flow.) --Dan Polansky (talk) 10:46, 14 December 2014 (UTC)

Have not seen Flow, but I oppose anyone doing it overnight without a long beta test and discussion. LiquidThreads was horrible. Equinox 11:59, 14 December 2014 (UTC)
Both of which are happening, on mw:Talk:Sandbox and mw:Talk:Flow (and on w:Wikipedia talk:Flow/Developer test page and w:Wikipedia talk:Flow because most Wikipedians cannot be bothered to visit other wikis). Yes, the page width has been mentioned, and yes, apparently it is here to stay. Keφr 12:14, 14 December 2014 (UTC)
@Kephir: Check out w:User:TheDJ/flowidth (a new userscript that allows drag-to-change width), which I'm currently urging the dev team to incorporate (or at least borrow ideas from) in the extension itself. Some sort of toggle or changer is/was planned for many months, but this has helped push it forward. :-) Quiddity (WMF) (talk) 02:57, 18 December 2014 (UTC)
I support conversion as it's much much much better than what we use now. —CodeCat 14:41, 14 December 2014 (UTC)
How? DCDuring TALK 14:50, 14 December 2014 (UTC)
Looks good, BUT how would it work for archives like deletion debates? Or would we be able (and willing?) to bypass flow for archive templates? Renard Migrant (talk) 16:36, 14 December 2014 (UTC)
Are you kidding me? The looks are the worst part of it. Oversized fonts, gratuitous animations, too much wasted screen space (between lines, padding, empty space to the right of the screen), poor visualisation of discussion structure (posts not clearly separated, who replies to whom only indicated by indentation). Also no pagination, missing basic functionality like deleting threads, and links like w:Topic:S22olnmzgd49twr0 (never mind finding this link in the first place was quite inconvenient). I am no fan of talk pages — they are crappy, monthly subpages here are marginally better, LiquidThreads has a few warts, but Flow is just horrible.
As for archiving, I think the original plan was to render archives obsolete. I think it was planned to make it possible for a single discussion to be visible from two talk pages at once. I have not seen anyone actually working on this, though. Instead WMF seems to concentrate on generating hype to make itself appeal to "OMGJAVASCRIPT" types, as most of its other software projects do. I mean, just look at Media Viewer, or the migration to Phabricator (the latter is not actually bad, but the improvement over Bugzilla is marginal). Keφr 17:28, 14 December 2014 (UTC)
(I hate Media Viewer too. It looks like a spammy "subscribe to our newsletter" popup, and hides all the useful metadata behind further clicks.) Just played in the Flow sandbox. Personally I could stand the UI but the performance is ridiculously poor, taking 20-30 sec to respond to a button press, with no visual cue that it's doing anything at all. I didn't think my computer was that old. Equinox 17:32, 14 December 2014 (UTC)
The looks can be changed, with enough sensible feedback (just like anything onwiki), and enough patience (there are only 3 devs working on Flow at the moment, and a voluminous list/backlog of requested features & changes). I (with my volunteer hat) want more density, too, and I'm pushing phab:M17 to help solve these subjective disagreements in the longterm.
(Tangentially: The benefits of phabricator over bugzilla include: A) it uses SUL (so no need for another account, and no more exposed email addresses), B) it replaces: Bugzilla/Trello/Mingle/RT/Gitblit, and eventually Gerrit, so it will be a lot easier for everyone (every community, and every wmf team) to collaborate on, and track, the various projects/extensions/code.)
Deleting threads/posts is available (to admins), and everyone else currently has a "hide" feature, which is equivalent to reverting but without being quite so opaque. There's room for improvement here, which will come with time and feedback and usage.
The performance does need to be improved, particularly for those of us with older machines. Examining that is constant, but attacking it vigorously is on the agenda.
The Topic URLs definitely need to be improved, and that's on the (long) to-do list. Quiddity (WMF) (talk) 02:57, 18 December 2014 (UTC)
  • I just half-skimmed, half-read the content at mw:Flow, and I found myself thinking that this is geeks seeking a new! improved! technical solution to something that 1) is at least partly a social problem (“New users on English Wikipedia have become less and less likely to participate in on-wiki discussions, in spite of a growing and mostly automated body of messages directed at them” -- sounds like capital-D Drama might drive some people away, and being increasingly nagged by automated stuff is just off-putting), and that 2) already has technical solutions for a number of the other issues they brought up, in the form of LiquidThreads (mooting the whole second paragraph of the Background section).
I do see in the Why not use LiquidThreads? section that they discuss some of the latter, but given Equinox's comments above, I'm unconvinced that we at EN WT have any need for Flow. This looks like the broader organization forcing something on all Wiki communities for the sake of ... I dunno, some ideal of consistency? The purported features of Flow that LiquidThreads does not offer don't look like anything we need, or would have much use for (globally unique identifiers, cross-wiki threads).
There are some other "features" of Flow that concern me. Flow will not directly support custom signatures. WTF? LiquidThreads does that just fine. Now we're going to be forced to use a different discussion mechanism on all talk pages, and we won't be able to use our sigs. The technical reasons they give all sound like “we're technically incompetent and it's too hard for us to figure out, so we're just going to throw out this one feature that all of our users have had for ages.” If they're too incompetent to figure out signatures, exactly how are we to trust that they can implement this entire system at all well?
Looking at the sample link Keφr provided, I am further horrified at this mess. Indentation stops after three levels, at which point I completely lose any ability to tell which post is in reply to what. The mw:Flow page says that indentation relies on “arcane wikicode knowledge”. Granted, a user needs to know a little bit to use wikicode effectively. I posit that this is much preferable to a patently broken and hard-to-use automated layout system. The justification for this visual crippling is that one quarter of WP viewership is on mobile devices. This reasoning is seriously flawed:
  • How much of that mobile viewership has any interest in Talk pages?
  • How much of that mobile viewership is using devices like iPads or Galaxys, which actually have pretty big screens?
  • How on earth is it acceptable, or even a good idea, to make design decisions for the PC based on what mobile phones are capable of? Microsoft spent billions on this boondoggle of an idea, and the early reviews of Windows 10 suggest that even this corporate behemoth has learned the error of its ways and is changing course (moving away from Metro, reinstating the Start menu, etc).
Oppose the current implementation of Flow. This is exceedingly poorly implemented, and I want no part of it until it is substantially improved and reworked. ‑‑ Eiríkr Útlendi │ Tala við mig 19:06, 14 December 2014 (UTC)
On other UI matters MW has imposed its will, despite a near revolt at de.wiki. I don't know what the final outcome was. As with many elites nowadays they probably believe that they have superior insight into the true needs of the people, which justifies ignoring everything that does not contribute to implementation of the plan. I'm fairly sure that we will get messages from MW staff tantamount to "You have to implement it in order to know what it is." DCDuring TALK 21:57, 14 December 2014 (UTC)
@Eirikr: Indent: The indentation limit is going to change - they've been postponing it for a long time, but everyone agrees that the current 3-indents limit is not good - it was meant to be an evolving experiment, and has remained static for too long. They're currently waiting for Design to provide a greater improvement than simply increasing the limit.
LQT: The LQT extension is no longer maintained and many of the highly active users (e.g. translatewiki) want to transition away, and there's a script for converting LQT to Flow (because for all LQT's problems, it is a logically structured system, at least). That part is fairly easy (comparatively speaking). It's how to deal with existing talkpages that Gryllida was originally requesting comment on. Gryllida was not suggesting converting Wiktionary any time in the near future. Everyone agrees that Flow isn't nearly ready for that.
Signatures: Our username-attribution will need some way of being altered/adapted, to enable the editors who want an alternate name to display, e.g. the many editors who include a Greek/Latin name in addition to their local/primary script name.
However, the current system — of allowing anyone to A) give their name multiple different colors, in bold/superscript/comicsans/etc (which is bad for equality, and bad for accessibility), and B) to add numerous links (can often be confusing, or soapboxy), and C) to add templates and circumvent size-limits "but please don't", and D) to obfuscate our primary username (the one which appears in History/Logs/etc) thereby complicating everything — could benefit from a few changes. Quiddity (WMF) (talk) 02:57, 18 December 2014 (UTC)
Oppose. If it ain't broke don't fix it (especially if it's anything like liquid threads). SemperBlotto (talk) 07:59, 15 December 2014 (UTC)

Straying off-topic a bit: I realise that LiquidThreads has a few quirks, but I think the amount of hate it gets from some users is quite disproportionate. Could those people illuminate me on what makes LQT so horrible to them? Keφr 14:33, 15 December 2014 (UTC)

Why does a bit from some old movies come to mind? This could get ugly... ;) Chuck Entz (talk) 14:59, 15 December 2014 (UTC)
Don't get me started. I have stopped watching the user pages of those who use LQT because of the annoying messages and the lack of inclusion in my regular watchlist. As a result I am effectively prevented from usefully communicating with users who use LQT. I assume it was done this way intentionally to force rapid adoption, in imitation of Facebook. DCDuring TALK 15:29, 15 December 2014 (UTC)
  • The watchlist issues DCD mentioned;
  • It messes up patrolling (not that this affects many people...)
  • It’s hard to read the history of an entire discussion.
  • It’s ugly.
  • The normal system is much more flexible.
Ungoliant (falai) 16:09, 15 December 2014 (UTC)
Oppose per Eiríkr. - -sche (discuss) 02:28, 17 December 2014 (UTC)
Hi, I've replied in a few places above. TL;DR: Flow is not in a final state by a long shot, and your feedback (throughout this thread) is appreciated. Gryllida was not suggesting converting Wiktionary any time in the near future. Everyone agrees that Flow isn't nearly ready for that. The dev team is concentrating on implementing the feature-set requests of the communities that are already actively/happily testing Flow, and they can only do so much at once. Personally, I'll be back to ask about and investigate more Wiktionary workflows (in all languages), in later months. Eventualism, and "slow and steady", are my constant mantras; they're (part of) the only reason any of this wonderful/aggravating/overwhelming/hopeful wikiverse works (imho, with apologies for using generalizations ;-) ). Hope that (all) helps. Quiddity (WMF) (talk) 02:57, 18 December 2014 (UTC)
I for one believe that it is either impossible or very difficult to implement something as useful and flexible as the current system of talk pages, which are just wiki pages in a different namespace. I for one am not interested in providing feedback to Flow so that Flow can be improved. I just dread the day on which Flow is going to be forced down our throats the way Media Viewer was forced on German Wikipedia. --Dan Polansky (talk) 08:31, 20 December 2014 (UTC)

Hyphenation linked to a language-specific appendix

Similarly to the IPA key, can we link the Hyphenation label to a language-specific appendix (such as Appendix:Hungarian hyphenation) where the rules of hyphenation would be described? The {{hyphenation}} already has the lang parameter to make this feasible. If the appendix does not exist, then no linking would take effect. --Panda10 (talk) 20:56, 16 December 2014 (UTC)
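The conditional linking proposed above can be sketched roughly as follows. This is only an illustration in Python, not the actual template or module code: the function name `hyphenation_label`, the `existing_pages` set, and the use of a full language name instead of the template's lang code are all assumptions made for the example.

```python
from typing import Set


def hyphenation_label(language_name: str, existing_pages: Set[str]) -> str:
    """Return wikitext for the "Hyphenation" label.

    Hypothetical helper: `existing_pages` stands in for whatever
    page-existence check the real template would use.
    """
    appendix = f"Appendix:{language_name} hyphenation"
    if appendix in existing_pages:
        # The appendix exists, so link the label to it.
        return f"[[{appendix}|Hyphenation]]"
    # No appendix: fall back to a plain, unlinked label.
    return "Hyphenation"


pages = {"Appendix:Hungarian hyphenation"}
print(hyphenation_label("Hungarian", pages))
print(hyphenation_label("Polish", pages))
```

With the sample set, the Hungarian label is linked and the Polish one stays plain text, which matches the "no linking if the appendix does not exist" behaviour described in the proposal.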

problems and errors in Latin diphthong info

Discussion moved to Wiktionary talk:About Latin.

Google 1gram.

I asked Jimbo to put in a word with Larry Page and Sergey Brin to see if we couldn't get a comprehensive list of words appearing in Google Books. Another editor suggested we look at "Google 1-grams", http://storage.googleapis.com/books/ngrams/books/datasetsv2.html :

"File format: Each of the files below is compressed tab-separated data. In Version 2 each line has the following format:
ngram TAB year TAB match_count TAB volume_count NEWLINE
As an example, here are the 3,000,000th and 3,000,001st lines from the a file of the English 1-grams (googlebooks-eng-all-1gram-20120701-a.gz):
circumvallate 1978 335 91
circumvallate 1979 261 91
The first line tells us that in 1978, the word "circumvallate" (which means "surround with a rampart or other fortification", in case you were wondering) occurred 335 times overall, in 91 distinct books of our sample."

Is this something we can use as a source of words? bd2412 T 02:08, 17 December 2014 (UTC)
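As a rough illustration of how the quoted file format could be turned into a word list, here is a minimal Python sketch. It assumes only what the quoted description states (four tab-separated fields per line, gzip-compressed files); the function names `parse_line`, `total_matches` and `read_gz` are made up for this example, and nothing here is claimed to be the method anyone actually used.

```python
import gzip
from collections import defaultdict
from typing import Dict, Iterable, Iterator, Tuple


def parse_line(line: str) -> Tuple[str, int, int, int]:
    """Split one 'ngram TAB year TAB match_count TAB volume_count' line."""
    ngram, year, match_count, volume_count = line.rstrip("\n").split("\t")
    return ngram, int(year), int(match_count), int(volume_count)


def total_matches(lines: Iterable[str]) -> Dict[str, int]:
    """Sum match_count over all years for each distinct 1-gram."""
    totals: Dict[str, int] = defaultdict(int)
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        ngram, _year, match_count, _volume_count = parse_line(line)
        totals[ngram] += match_count
    return dict(totals)


def read_gz(path: str) -> Iterator[str]:
    """Yield text lines from a gzip-compressed 1-gram file."""
    with gzip.open(path, "rt", encoding="utf-8") as handle:
        yield from handle


# The two example lines quoted above, with explicit tabs.
sample = [
    "circumvallate\t1978\t335\t91\n",
    "circumvallate\t1979\t261\t91\n",
]
print(total_matches(sample))  # {'circumvallate': 596}
```

For a real file one would pass `read_gz("googlebooks-eng-all-1gram-20120701-a.gz")` to `total_matches`; the keys of the result are the distinct 1-grams, which could then be compared against existing entries. A minimum-count threshold would probably be needed to filter out scanning errors.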

Yes, last year I did, User:DTLHS/googlebookscorpus/A. I think I still have the code somewhere, I could do more letters if that's not enough. DTLHS (talk) 00:41, 18 December 2014 (UTC)
Awesome. I notice some spurious things (e.g. "andthen", and "ation" which I assume is the suffix -ation widowed by a line break), but the list still seems quite useful. Question: if a book has, say, "Anna", is that in the 1-gram list as-is, or is everything downcased (to "anna")? - -sche (discuss) 05:04, 18 December 2014 (UTC)
There are very many typos, though, e.g. (in the first line) asterwards for afterwards, attomeys for attorneys. SemperBlotto (talk) 15:06, 18 December 2014 (UTC)
I am interested in getting a complete list of all words appearing in all Google Books. Can we get that from these lists compiled by Google? bd2412 T 17:53, 18 December 2014 (UTC)

Category:Numerals by language and Category:Numbers by language

Some of us talked about the two similar categories three years ago in Wiktionary:Beer parlour/2011/June#Numbers and numerals. Now that they are creating confusing interwiki links, how about unifying them? We should delete Category:Numbers by language and its subcategories because they are newer than Category:Numerals by language and its subcategories. — TAKASUGI Shinji (talk) 03:51, 17 December 2014 (UTC)

Numbers are not a part of speech, while numerals are. We should keep them separate. —CodeCat 14:17, 17 December 2014 (UTC)
I know we came up with a distinction, but I can't remember what it is. Renard Migrant (talk) 00:25, 18 December 2014 (UTC)
One point is that all the words in Category:English ordinal numbers are adjectives. That means they can't also be numerals, because numeral is already a part of speech. —CodeCat 00:32, 18 December 2014 (UTC)

Categorization of Polish participles

There are currently three categories for Polish participles:

  • Polish past participles
  • Polish present active participles
  • Polish present passive participles

These categories do not correspond at all to the categories of participles recognized in Polish grammar, which are:

  • adjectival participles:
    • active adjectival participles (imiesłów przymiotnikowy czynny), such as czytający
    • passive adjectival participles (imiesłów przymiotnikowy bierny), such as przeczytany
  • adverbial participles:
    • contemporary adverbial participles (imiesłów przysłówkowy współczesny), such as czytając
    • anterior adverbial participles (imiesłów przysłówkowy uprzedni), such as przeczytawszy

Currently, it seems that passive adjectival participles of perfective verbs are under "Polish past participles", active adjectival participles are under "Polish present active participles", and passive adjectival participles of imperfective verbs are under "Polish present passive participles". Anterior adverbial participles are nowhere to be found. This is clearly suboptimal and also misleading, since the entries under "Polish past participles" can also be used to form present tense sentences, such as Ten samochód jest już sprzedany, which means This car has already been sold.

I can migrate the entries to the correct categories (there are fewer than 300 of them). I added the correct categories to Module:category tree/poscatboiler/data/non-lemma forms, but the new POS names are unrecognized by Module:headword. Can anyone with admin access add them? --Tweenk (talk) 22:41, 18 December 2014 (UTC)

Never mind, I created a new headword template Template:pl-participle that obviates the need for modifications to Template:head. --Tweenk (talk) 01:28, 22 December 2014 (UTC)

German composed forms

Is there any reason we present German composed forms? As far as I'm aware, the only "irregularity" in German composed forms is whether a past participle takes "haben" or "sein", and that's already covered in both the heading line and the conjugation table. It seems at best sort of redundant, and at worst a little misleading (while there's nothing incorrect about the table at sein, for example, some of the composed forms aren't very idiomatic). On the other hand, German Wiktionary includes all this information, so it's entirely possible that I'm missing something. Smurrayinchester (talk) 14:48, 19 December 2014 (UTC)

I think this is very usual in conjugation tables (in all languages). Some conjugated forms are easy, some are less easy, but conjugation tables should be as complete as possible, this is useful to people wishing to use the verb and with little knowledge of conjugation rules. Lmaltier (talk) 17:59, 20 December 2014 (UTC)
I don't think it's necessary to show the full conjugation of all the auxiliary verbs. After all, the conjugation for those verbs can easily be looked up itself. For Slovene, I adopted the practice of showing only one form of the auxiliary. It keeps the tables much shorter while still giving an idea of the general formula. See kupovati for an example. For Latin, we don't list composed forms either, we show how to form them. Like on canto. —CodeCat 18:22, 20 December 2014 (UTC)
With such a reasoning, you could as well remove all regular forms and keep conjugation tables of individual verbs only for irregular verbs. But, once again, providing complete tables is useful to some readers. Lmaltier (talk) 18:34, 22 December 2014 (UTC)

User:JAnDbot for bot status

FYI, there is a new vote Wiktionary:Votes/bt-2014-12/User:JAnDbot for bot status. Previous failed vote: Wiktionary:Votes/bt-2012-06/User:JAnDbot_for_bot_status. --Dan Polansky (talk) 08:15, 20 December 2014 (UTC)

Free 'RSC Gold' accounts

I am pleased to announce, as Wikimedian in Residence at the Royal Society of Chemistry, the donation of 100 "RSC Gold" accounts, for use by editors wishing to use RSC journal content to expand articles/ items on chemistry-related topics. Please visit en:Wikipedia:RSC Gold for details, to check your eligibility, and to request an account. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 13:34, 20 December 2014 (UTC)

The idea of an open and free dictionary using content that is behind such elitist closed doors doesn't sit well with me. —CodeCat 14:20, 20 December 2014 (UTC)
There's probably not much useful for a dictionary in those journals anyway; I'm sure the words in those articles are found in plenty of more easily accessible places. Great news for Wikipedia, though! —Aɴɢʀ (talk) 14:28, 20 December 2014 (UTC)