Wiktionary:Beer parlour


Welcome, all, to the Beer Parlour! This is the place where many a historic decision has been made and where important discussions are being held daily. If you have a question about fundamental Wiktionary aspects—that is, about policies, proposals and other community-wide features—please place it at the bottom of the list (click on Start a new discussion), and it will be considered. Please keep in mind the rules of discussion: remain civil, don't make personal attacks, don't change other people's posts, and sign your comments with four tildes (~~~~), which produces your name with timestamp. Also keep in mind the purpose of this page. There are various other discussion rooms which may serve the idea behind your questions better. Please take a look to see which is most appropriate.

Sometimes discussion identifies an issue as an idea for policy development or rewriting. Such discussions may be taken out of the Beer parlour to a relevant page, or a brand new page may be created. Usually, the active policy pages will be listed in one of the sections below. See also the policy development page and the votes page.

Questions and answers will not remain on this page indefinitely, as it would very soon become too long to be editable. After a period of time with no further activity (usually a couple of weeks), information will be moved to the archives. We make a point to preserve all discussions that were started here in the archives. However, talk that is clearly not intended for this page may be moved and will not end up in the archives. Enjoy the Beer parlour!



December 2017

Etymology cleanup

Not sure how to approach the etymology in a coinage like panpygoptosis. If someone could give it a pair of eyes, that would be appreciated. Thanks. —Justin (koavf)TCM 07:05, 1 December 2017 (UTC)

Is it pan- +‎ pygo- +‎ ptosis? Not sure about the "opto" part in the current entry. Equinox 07:43, 1 December 2017 (UTC)
I think you're right. There are two meanings for Ancient Greek ὀπτός (optós), "roasted" and "seen", but neither makes any sense here. — Eru·tuon 08:31, 1 December 2017 (UTC)
Done. --Barytonesis (talk) 21:09, 1 December 2017 (UTC)

delete unnecessary intermediate pages

Situations like fukyo, which points to ふきょ, so that the reader only reaches the right entry, 不許, after 3 steps, should be avoided when possible. --Backinstadiums (talk) 16:04, 2 December 2017 (UTC)

No, it shouldn't. Learn a little about Japanese and why our structure is the way it is before you tell us to delete things. —Μετάknowledgediscuss/deeds 18:50, 2 December 2017 (UTC)
@Metaknowledge: Just suggesting, as usual, so I should've chosen a clearer modal verb. I know a little about Japanese phonology/phonetics. Wikipedia's encyclopedic approach is to redirect whenever it's desirable --Backinstadiums (talk) 21:08, 2 December 2017 (UTC)
We should absolutely not use hard redirects. fukyo could easily be a word in another language, and aren't other languages besides Standard Japanese (e.g. Ryukyuan languages) also written in hiragana? If so, then ふきょ could potentially also be a word in another language. And while fukyo/ふきょ may have only one kanji reading, consider cases like fu/ふ where there are five different associated kanji. —Mahāgaja (formerly Angr) · talk 21:28, 2 December 2017 (UTC)
We could still alleviate the issue by listing the kanji directly at fukyo. —Rua (mew) 22:14, 2 December 2017 (UTC)
@Rua: I agree --Backinstadiums (talk) 23:17, 2 December 2017 (UTC)
Absolutely NOT! Rōmaji entries link to kana readings - usually one but can be a few. The kana entry (hiragana and katakana) serves as a disambiguation page, it can link to a large number of kanji entries. kōsō -> こうそう (kōsō). Please read the policy on Japanese entries in WT:AJA and stop messing around. @Eirikr, TAKASUGI Shinji. --Anatoli T. (обсудить/вклад) 13:26, 3 December 2017 (UTC)
... or こうしょう (kōshō). Wyang (talk) 15:50, 3 December 2017 (UTC)
Actually, one should avoid going through the Roman transcriptions in the first place, shouldn’t one? You should probably learn to use an IME. But having to click twice when you use a transcription of a transcription reflects the true state of things.
Though why doesn’t the template {{ja-romanization of}} take an additional argument in the case that the hiragana is itself a transcription of the Chinese? I can imagine that it is regularly the case that users click twice in those cases. Palaestrator verborum (loquier) 23:45, 2 December 2017 (UTC)
The problem is that there's no guarantee that someone won't add more kanji for the pronunciation, and if they only add them to one or the other, but not both, the pages go out of synch. It's hard enough to get all the readings reflected where they should be without requiring updating of another page every time. Like many of Backinstadium's ideas, this is based on the premise that the dictionary is a finite, finished product that merely needs to be rearranged, rather than a dynamic entity that's constantly being edited by people who don't know what anyone else is doing on other pages. Chuck Entz (talk) 16:01, 3 December 2017 (UTC)
@Chuck Entz, Atitarev: I was told in past discussions, I think about the Simplified/Traditional redirect issue, that it was the structure used for the wikis that limited the implementation (in your example, just synchronizing two pages). Secondly, I've never thought of a lexicographic resource as a finished product; rather I prioritize enhancing the user experience, especially for beginners, even if that means extra work. --Backinstadiums (talk) 17:21, 3 December 2017 (UTC)
Dude, just compare the time you have talked here with the time lost by this disenhancement of you having to click twice. The editing work needed for the enhancement is completely disproportionate to the enhancement gained: Either editors would have to check for rearrangements constantly or complicated templates would have to be used on the Hiragana pages whereby the Romaji pages would autofetch the Chinese. It is better to create some new entries. Palaestrator verborum (loquier) 17:32, 3 December 2017 (UTC)
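The synchronization concern raised in this thread can be sketched with a toy model. This is only an illustration under assumptions: the page names fukyo, ふきょ and 不許 come from the discussion, while the dictionaries, the function names and the "PLACEHOLDER" entry are hypothetical and do not reflect how Wiktionary stores entries.

```python
# Toy model only: page names come from the thread, the data structures are assumed.
romaji_to_kana = {"fukyo": ["ふきょ"]}
kana_to_kanji = {"ふきょ": ["不許"]}

def resolve(romaji):
    """Follow romaji -> kana -> kanji, i.e. the two-click path."""
    return [
        kanji
        for kana in romaji_to_kana.get(romaji, [])
        for kanji in kana_to_kanji.get(kana, [])
    ]

# The shortcut suggested in the thread: list the kanji on the romaji page too.
romaji_direct = {"fukyo": ["不許"]}

def out_of_sync(romaji):
    """Kanji reachable through the kana page but missing from the shortcut list."""
    return sorted(set(resolve(romaji)) - set(romaji_direct.get(romaji, [])))

# An editor adds a (placeholder) kanji to the kana page only.
kana_to_kanji["ふきょ"].append("PLACEHOLDER")
print(out_of_sync("fukyo"))
```

With a single source of truth (the kana page), the stale list cannot arise; with the duplicated list, every addition has to be made in two places.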

Confusing cleanup category

What is Category:Needs cleanup supposed to be and how are terms included in it? —Justin (koavf)TCM 21:52, 2 December 2017 (UTC)

It's being generated by {{da-verb}}. DTLHS (talk) 21:54, 2 December 2017 (UTC)
I have already asked User:Gamren about this. It's something he will be working on eventually. DonnanZ (talk) 22:08, 2 December 2017 (UTC)


We currently only have one entry for the Hlai language, which is nom³. This entry is in some sort of IPA. However, Hlai does have a written language in the Latin script, based on Ha. Should we move nom³ to noms, based on the newest orthography (2005)? — justin(r)leung (t...) | c=› } 22:01, 2 December 2017 (UTC)

Yes, definitely. -sche has created many such entries, usually basing the orthography on whatever linguistic materials they have access to. Whenever I come across these and they need to be moved to an orthography that's actually used, I fix the entry (and the translation at water#Translations) and create a short about page that documents the orthography I have selected. —Μετάknowledgediscuss/deeds 22:10, 2 December 2017 (UTC)
@Metaknowledge Thanks! I've added a Hlai section to noms, but should nom³ be a redirect or deleted? — justin(r)leung (t...) | c=› } 00:25, 3 December 2017 (UTC)
I'd say that's at your discretion. Unless somebody else uses it, my instinct would be to delete it. —Μετάknowledgediscuss/deeds 01:02, 3 December 2017 (UTC)
@Metaknowledge: Alright, thanks again! — justin(r)leung (t...) | c=› } 01:07, 3 December 2017 (UTC)
No, thank you for working on minority languages! —Μετάknowledgediscuss/deeds 02:15, 3 December 2017 (UTC)
Yes, thank you! :) I try to find if there is any standard orthography for each language (at the time I create the entry and in periodic re-checks of previously-created entries), but sometimes I can't find any. For Hlai, searching for a standard orthography was complicated by the fact Hlai has many varieties. - -sche (discuss) 02:54, 4 December 2017 (UTC)

@-sche It is certainly an issue that Hlai has many varieties. With the standard orthography, I'm not sure how the various dialectal differences can be shown. On a related note, I've noticed that you have also created "water" entries for different varieties of Zhuang using IPA. Should these all be deleted, since most of our Zhuang coverage doesn't distinguish between dialects? The same issue occurs here with Zhuang, where I'm not sure how the standard orthography can be used to represent varieties other than the standard register and the Wuming dialect. — justin(r)leung (t...) | c=› } 04:39, 6 December 2017 (UTC)
Are the varieties of Zhuang so similar that they should not be considered separate languages, i.e. we should retire the separate codes we currently have for them? (WT:LT says as much, but there does not appear to have been any discussion of it — if you can confirm that we should treat Zhuang as one language, we can add a link on WT:LT to this discussion.) In that case, the entries we currently have, to the extent they show that e.g. Yang Zhuang uses ham⁴, could be migrated into the ===Pronunciation=== section of the "standard Zhuang" term, it would seem to me. Whereas if the varieties should be considered separate languages and the problem is just that only one has a standard orthography, it seems like a tolerable interim solution to leave the others at the best orthography we have access to, even if that is an IPA-ish orthography. - -sche (discuss) 00:20, 7 December 2017 (UTC)
@-sche: Traditionally, these varieties were considered to be one language by the Chinese government (since they are grouped under the Zhuang ethnicity). I think the situation is similar to Arabic or even Chinese, where the varieties have very limited to no mutual intelligibility but speakers generally think they are speaking the same language. That is why there is only one standard for all these varieties. See these two files for more information: [1] [2] — justin(r)leung (t...) | c=› } 02:21, 7 December 2017 (UTC)
@Justinrleung: I'm sorry for the late reply. As long as the dialectal differences (at least, the ones we already have, i.e. the dialectal pronunciations we currently have entries for) can be preserved / presented, like they are for Chinese, I don't see a problem with merging the Zhuang varieties, if that is how they are usually treated. (WT:LT already says to do that.) The entries we currently have could be migrated into the pronunciation sections of the "standard" Zhuang entries, as I have done with "water". - -sche (discuss) 19:58, 18 January 2018 (UTC)

Removing precomposed characters from MediaWiki:Edittools

A lot of the space in the edit tools is taken up by precomposed characters like á, ē, ằ, ѝ etc. These aren't actually needed on Wiktionary, because the software automatically converts everything to composed normal form whenever you save a page. It would be perfectly fine if you entered a regular letter a followed by a combining acute accent, the end result is exactly the same. So I propose removing these precomposed characters from the edit tools, and instead making a special tab with all combining diacritics. That would give a lot more flexibility and allow editors to enter combinations that were not previously possible. —Rua (mew) 13:16, 3 December 2017 (UTC)

the software automatically converts everything to composed normal form whenever you save a page – that’s important information, because the upcoming IPA keyboard layout heavily relies on combining characters. I thought before that maybe I have to use precomposed characters.
As for me, the whole bar can be disabled, as all is accessible via dead keys and the compose table and appropriate keyboard layouts or direct Unicode input and charmaps. A web table under an editing area for such seems like a relict from 2004 and a very lame way of input, also the selection is arbitrary, like Ð ð but not Đ đ; dictionary editors can be expected to use appropriate input methods. Or @Anglish4699, do you need that bar?
I can’t disable it just for me? I don’t see such a setting under the editing preferences. Palaestrator verborum (loquier) 15:37, 3 December 2017 (UTC)
For myself at least, I have never used that bottom tool bar, but I have used the top one above the edit workspace. That one does come in handy much of the time. I would not be against the more flexible diacritic table (as that may be more helpful to others), but I think more input from the users is needed here. Anglish4699 (talk) 20:26, 3 December 2017 (UTC)
I used to use the Greek menu, but now I use the template {{chars}}. I still sometimes use the menu for non-Greek characters, though.
I don't know if I've ever used the "Special characters" menu on top. It groups things by Unicode block (I guess), which is somewhat less helpful for scripts such as Greek that are located in multiple blocks, and it doesn't contain all the characters that I would want to type (for instance, punctuation-type characters such as §, ’, ◌). — Eru·tuon 20:55, 3 December 2017 (UTC)
Support suzukaze (tc) 21:15, 3 December 2017 (UTC)
Support --Barytonesis (talk) 21:16, 3 December 2017 (UTC)
Neutral I'd like to see a simulation. Will inserting combining characters work on virtually all the systems that people are using to edit Wiktionary? Especially with monospace fonts in the edit box, I'm not sure if the combining characters will show up correctly on many systems. I find the box quite useful, however.--Prosfilaes (talk) 11:44, 5 December 2017 (UTC)
They may not appear exactly right in the edit box, but that should go away once you save the page. It might even go away just by previewing the page, but someone will have to test that. —Rua (mew) 15:23, 5 December 2017 (UTC)
Yes, when I enter a + ◌́ (U+301, acute accent, minus the dotted circle) and press "show preview", this sequence changes to á (U+E1, Latin small letter a with acute) in the text box. — Eru·tuon 19:16, 5 December 2017 (UTC)
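The behaviour reported here can be reproduced outside the wiki. The following is a minimal sketch, assuming that the "composed normal form" mentioned in this thread corresponds to Unicode NFC; it uses Python's standard `unicodedata` module and says nothing about how the MediaWiki software implements the conversion internally.

```python
import unicodedata

# 'a' followed by U+0301 COMBINING ACUTE ACCENT (two code points)
decomposed = "a\u0301"

# NFC composes the pair into a single precomposed code point where one exists
composed = unicodedata.normalize("NFC", decomposed)

print(len(decomposed), len(composed))   # number of code points before and after
print(hex(ord(composed)))               # code point of the composed character
print(unicodedata.name(composed))       # its Unicode name
```

Under that assumption, an editor who types the base letter plus a combining diacritic ends up with the same stored text as one who picks the precomposed character.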
  • I would support some combining characters being removed. That is, when the menu is not for an orthography in which letter + combining character is considered a letter. For example, I think it would be an improvement if the Vietnamese menu gave the letters ă, â, ê, ô, ơ, ư as precombined characters, but listed the tones ◌̀, ◌̉, ◌̃, ◌́, ◌̣ as combining characters. And perhaps the Arabic menu should list the diacriticked letters that are used to transliterate Arabic consonants, but it should give a combining acute accent instead of vowel + acute combinations. (Either that or the acute accent should be removed.) The rationale is that the tonal diacritics and acute accent are separable symbols that represent a phonemic or phonetic unit in themselves, while the other diacritics do not represent a unit in themselves, but only in combination with a letter. — Eru·tuon 21:26, 5 December 2017 (UTC)
    • But for "Latin", everything is a diacritic. I don't think this is a particularly meaningful distinction. The point of the change is to make it easier for people to use diacritics and not having a list of precomposed characters. If everyone and their dog wants their own custom set of "letters" in the menu, then it defeats the point entirely. —Rua (mew) 21:35, 5 December 2017 (UTC)
      • My concern does not apply to the "Latin" menu, only to the ones for particular languages or other scripts. — Eru·tuon 21:39, 5 December 2017 (UTC)
        • Vietnamese is written in the Latin script. None of the precomposed characters in its menu are needed, all can be created with combining diacritics. —Rua (mew) 21:48, 5 December 2017 (UTC)
          • Of course, but it's less convenient to select from a huge long list of diacritics (which are in a difficult-to-understand order) than from a short list of diacritics or letter–diacritic combinations used in a particular orthographical system. — Eru·tuon 21:55, 5 December 2017 (UTC)
          • When visually similar diacritics are involved, it might help to have a menu specific to an orthographical system. Some people confuse háček or breve, circumflex or inverted breve, ogonek or cedilla or comma. Then again, we do not have a menu for every orthography that uses some of these diacritics and I'm not really arguing for that. And they would probably be confused by a minority among active contributors. — Eru·tuon 22:10, 5 December 2017 (UTC)
            • I've included Unicode names and codepoints in the title text of the new diacritics submenu. That will hopefully help people decide which is the right one. —Rua (mew) 22:17, 5 December 2017 (UTC)
              • That is a helpful feature. To be clear, I do support your proposal as far as the Latin menu is concerned. — Eru·tuon 22:35, 5 December 2017 (UTC)
All these confusions are a reason why such a menu should not exist at all 😒. If you use Gucharmap or Fcitx Unicode Input you see more information about the characters you are trying to enter, and these applications also update their data without manual intervention by a Wiktionary coder. Perhaps we should just disable the menu entirely for some time and see who complains. If it goes unnoticed, you can rub your hands because you have less bother and a new argument for that bug being fixed.
But as this extreme step will not be performed, it slowly becomes more desirable that that bug be fixed. There isn’t a specific script that I can block with uBlock Origin, I can only hide the bar with the element picker. Palaestrator verborum (loquier) 23:55, 5 December 2017 (UTC)
Because not everyone is using a Linux box with xkbconfig and without Gnome IBus overriding all other keyboard interfaces and ignoring ~/.XCompose configurability as too hard. And sometimes people need to use a character that's not in the keyboards they have configured, or for which they haven't memorised the <dead_circumflex><compose><shift><hyphen><j> dance to get ʲ. If you want "It works for me therefore everyone else is doing it wrong", you know where to find systemd. This "extreme step" of removing alternate and accessible ways of entering characters beyond 7-bit ASCII will not be performed because it is arrogant and wrong-headed. --Catsidhe (verba, facta) 01:36, 6 December 2017 (UTC)

Inviting IABot

While working on Wikipedia I noticed that they use IABot (InternetArchiveBot) to automatically replace dead links with a version stored on archive.org. The project is a joint effort between the Internet Archive and the WMF (more in this blog article). The project has funding, is developed in the open (github) and has already been deployed on a few Wikipedias. As far as I know it hasn't been used on a Wiktionary instance yet. We don't generate the same amount of external links / references as other Wikis but I think it doesn't hurt to make sure what we have is valid and usable. We need to establish a consensus first so I'd like to get your opinions. – Jberkel (talk) 10:25, 4 December 2017 (UTC)

Thanks, T181879 on Phab if you want to follow up. – Jberkel 11:33, 9 December 2017 (UTC)
Support Rua (mew) 12:01, 4 December 2017 (UTC)
Sounds good. Equinox 12:24, 4 December 2017 (UTC)
I like this idea. I would also like it if we had more external links, especially in areas like etymology, reconstructed languages, etc. - TheDaveRoss 14:32, 4 December 2017 (UTC)
Support I always make sure that the URLs of quotes I post are archived. It goes without saying as a thing of the durability ideal. Palaestrator verborum (loquier) 15:53, 4 December 2017 (UTC)
Support Μετάknowledgediscuss/deeds 19:19, 5 December 2017 (UTC)
Support — justin(r)leung (t...) | c=› } 19:31, 5 December 2017 (UTC)
Support --Barytonesis (talk) 19:37, 5 December 2017 (UTC)
Support French Wiktionary supports this proposal too! Noé 11:52, 13 December 2017 (UTC)

Chinese ordering

Are there any guidelines about the preferred ordering for Chinese? I cannot find any info. at About_Chinese, and after searching in Category:zh:Grammar I realized how chaotic radical ordering can be, especially when treating simplified and traditional versions individually. I would always include some (alternative) pinyin ordering, which has helped increase literacy a great deal since its implementation. --Backinstadiums (talk) 11:24, 6 December 2017 (UTC)

@Backinstadiums: There are many ways to order Chinese characters. Radical ordering was the common way before the advent of bopomofo and pinyin, and is still the common way for traditional Chinese dictionaries. In recent years, ordering by pinyin has become more common, especially in Mainland China. However, most dictionaries would also have an index ordered in alternative ways. For instance, a traditional Chinese dictionary with radical ordering often has a 難查字表 for characters whose radicals are not apparent. It may also include a pinyin/bopomofo index.
The reason why Chinese categories are ordered by radical here in Wiktionary is because Chinese is not only used for Mandarin, making pinyin/bopomofo sorting unreasonable. The categories for specific Chinese lects, e.g. CAT:Mandarin lemmas and CAT:Cantonese lemmas, are sorted according to their respective romanizations. — justin(r)leung (t...) | c=› } 17:37, 6 December 2017 (UTC)
@Justinrleung: Except if a true semantic conversion were intended, at least traditional and simplified forms should be treated as a single unit for purposes such as ordering, (currently) the traditional one being used as common lemma. Regarding Pinyin, since nowadays it's the most used romanization (and transliteration) system regardless of one's native Chinese lect, it would be truly beneficial to add it as much as possible to spread Chinese literacy. --Backinstadiums (talk) 18:01, 6 December 2017 (UTC)
Oppose
1. Someone looking for 刘 won't find the traditional lemma of 劉 because of the drastically different number of strokes. Sort them independently to accommodate users of both.
2. Not every character has a Mandarin pronunciation. Use Category:Mandarin lemmas to search by pinyin.
suzukaze (tc) 18:36, 6 December 2017 (UTC)
(Chiming in...)
One additional wrinkle is that the MediaWiki software backend appears to only allow sorting by one index key per category. Sorting a given Chinese character by each of the various possible pinyin representations would require a change in the underlying database software, one that the MW devs seem uninterested in tackling.
We've run into this issue with Japanese entries, where a given Chinese-based spelling often entails multiple phonetic realizations, and thus ideally multiple sortings -- see 柄 for instance, where we would want the entry to appear in Category:Japanese_nouns, etc., under all of these indices: え (e), かび (kabi), かい (kai), から (kara), つか (tsuka), つく (tsuku), and ほぞ (hozo). Due to how the MW software is programmed, this 柄 character only appears indexed under the last reading, hozo.
Categorizing Chinese characters separately by lect would work for indexing by pinyin reading. But categorizing Chinese characters in a single category, and trying to index by the pinyin readings for all of the lects at once within that one category, is currently barred by baked-in technical difficulties. ‑‑ Eiríkr Útlendi │Tala við mig 18:38, 6 December 2017 (UTC)
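The one-sort-key limitation described above can be pictured with a toy model. This is a sketch under assumptions, not MediaWiki's actual code or database schema: the `Category` class and its methods are hypothetical, and only the entry 柄 and its readings are taken from the discussion.

```python
from collections import defaultdict

class Category:
    """Toy category in which each page is stored under exactly one sort key."""

    def __init__(self):
        self._sort_key = {}  # page title -> single sort key

    def add(self, page, sort_key):
        # A later key replaces the earlier one, modelling the one-key limit.
        self._sort_key[page] = sort_key

    def index(self):
        # Group pages by sort key, as a category listing would display them.
        grouped = defaultdict(list)
        for page, key in self._sort_key.items():
            grouped[key].append(page)
        return dict(sorted(grouped.items()))

nouns = Category()
for reading in ["え", "かび", "かい", "から", "つか", "つく", "ほぞ"]:
    nouns.add("柄", reading)

print(nouns.index())
```

In this model the page ends up listed only under the last key assigned, which matches the behaviour described for 柄. Splitting the material into one category per lect, each with its own key, avoids the conflict because no category then needs more than one key per page.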

Category:English misnomers

I think that it may be of interest, especially for English learners, to see a compendium for words or locutions that don’t mean what they would imply. To start, we have: strawberry, Rhode Island, East River, funny bone, prairie dog, French braid, Boston cream pie and Hawaiian Pidgin (amongst many other candidates, but I don’t want to exhaust room here). I can’t think of any possible objections at the moment, though some may find the topic too trivial or too uninteresting to merit a category, but I’d like to see others’ thoughts here first. — (((Romanophile))) (contributions) 19:01, 6 December 2017 (UTC)

There will inevitably be disputes about what in particular goes into it but yes, I think that it's fine to have some listing of counter-intuitive words. It may be better suited to an appendix. —Justin (koavf)TCM 19:16, 6 December 2017 (UTC)
I would think that almost any noun phrase (any phrase of any kind?) that was also an idiom would necessarily be a misnomer. DCDuring (talk) 00:07, 7 December 2017 (UTC)
Perhaps even single words that are used in a non-literal sense, depending on how you define "what they would imply". — Eru·tuon 00:10, 7 December 2017 (UTC)
* What does strawberry imply? And how many berries would be exempt from this? cranberry (which do not come in crans), chokeberry, cowberry, hagberry, raspberry, gooseberry, hackberry? (The coffee bean is, of course, the true hackberry, and a real cowberry is of course bullshit.)
* A pidgin and a creole are technically distinct, but nobody but a linguist really cares; it's not really confusing.
* A prairie dog is a sort of dog-like creature that lives on the prairies. It's a wonder of clarity compared to the robin, which can refer to any number of distantly related species of bird.
* I don't see even what you're getting with French braid; it's a hair braid that is called French.
What I'm getting at, is that it seems pretty vague and broad as a category, and given the examples you gave, I don't think it would be terribly helpful to an English learner.--Prosfilaes (talk) 09:58, 7 December 2017 (UTC)
Strawberries aren’t berries, prairie dogs aren’t dogs (though all robins are at least birds), and French braids did not originate in France. Even so, I do see that it has the potential to be too big of a category unless we made up some special rules for it. — (((Romanophile))) (contributions) 03:40, 11 December 2017 (UTC)
berry, sense #1 is "A small fruit, of any one of many varieties.". Many things we call berries aren't botanical berries and some things, like grapes, that are botanical berries we don't call berries. French practically needs a definition of "fancy" (though apparently they're called tresse française in French.) Also, French doesn't necessarily mean "originated in France"; it could mean it was "popular in France" or even "associated with France". Also, cf. pineapple braid (another name for the Dutch braid or inverted French braid.) The extreme here is the "canoe wife", a name in English tabloids for a woman whose husband allegedly drowned in a canoe accident (body never found), who moved to Brazil with the insurance money and lived the good life with her husband, until she was extradited back to England for insurance fraud. Hence, "canoe wife". Adjectives have to communicate clearly to the audience, not necessarily make good literal sense.
I'd see a category of things like prairie dog, where the root word is clearly a misnomer in normal English--that is, most people would agree that a prairie dog is not a dog. That seems a reasonable enough and possibly useful category.--Prosfilaes (talk) 16:37, 11 December 2017 (UTC)

Unusual English pronunciations categorized as 1 syllable words

trousers, thousandth, straighten, Pirc Defence, cat's meow, read receipt. DTLHS (talk) 06:11, 7 December 2017 (UTC)

Pirc Defence is incomplete; cat's meow and read receipt used {{IPA}} instead of {{IPAchar}}. —suzukaze (tc) 06:24, 7 December 2017 (UTC)
In trousers, @Bcent1234 added the 1-syllable category, no doubt by mistake. In thousandth, someone didn't indicate that the n formed a syllable, which can be done with a syllabic diacritic or a schwa. — Eru·tuon 07:09, 7 December 2017 (UTC)

Alternative template that applies the same font formatting as Template:IPAchar

As part of ongoing maintenance and cleanup work, @Mahāgaja recently made this change to the Japanese 光 entry, replacing a call to {{IPAchar}} with {{l|und}} instead. This was due to the inclusion of <sub> tags in the string formerly contained by {{IPAchar}}. Talking things over with him, I understand his reasons and have no objection. (That thread is here for those interested.)

One issue that remains, however, is font formatting. {{IPAchar}} results in text at 15.4pt, while {{l|und}} produces 14pt. I could manually add in a <span> tag specifying class="IPA" to do that, but I wanted to ask first if anyone is aware of some other template that would apply a 110% font size?

(And if this query should go in the Grease Pit instead, I'm happy to move it.)

TIA, ‑‑ Eiríkr Útlendi │Tala við mig 22:12, 7 December 2017 (UTC)

I don't know if it is correct to consider this notation IPA, even though all but the subscripted vowels at least belong to the IPA character set. Perhaps it would be better to consider it as Old Japanese text (though it is a transcription system for the actual man'yougana used to write Old Japanese), tag it as Old Japanese ({{m|ojp||...}} or {{m|ojp|...}} depending on whether there will be entries), add Latin as a script for Old Japanese in Module:languages/data3/o, and assign the desired styles to Old Japanese written with Latin script in MediaWiki:Common.css? — Eru·tuon 22:59, 7 December 2017 (UTC)
Apologies, I'm not married to the idea of IPA-ness at all -- I merely want the text to appear in the same font and at the same scaling as the /slash-transcription phonemic string/ IPA text that follows it on the same line. The w:International Phonetic Alphabet article describes using ⟨angle brackets⟩ for non-/phonemic/ and non-[phonetic] transcriptions meant to indicate the ⟨original-language orthography⟩, which would seem to make sense for Old Japanese strings like this with the numeric subscripts -- but as Mahāgaja argued, we probably don't want to use {{IPAchar}} for angle-bracket orthographic strings (which makes sense to me).
I hope that clarifies -- I'm not looking for IPA per se, just something that formats similarly for visual consistency. ‑‑ Eiríkr Útlendi │Tala við mig 00:28, 8 December 2017 (UTC)
What are these pronunciations? What system do they follow? Does that system have a name? What symbols can and can't appear in them? If there are multiple answers to these questions multiple templates should be created. DTLHS (talk) 00:53, 8 December 2017 (UTC)
@DTLHS: One system, symbols used would be a subset of ASCII lower-case letters, whitespace, and subscripted 1 and 2 to indicate differences apparent in w:Jōdai Tokushu Kanazukai and apparently indicating phonemic differences lost in the Japanese language during the w:Heian period (roughly the 900s-1000s CE).
Not sure why we'd need multiple templates; all I need is something that formats the font, and that is ideally more elegant in the wikicode than <span class="IPA">using raw HTML</span>.
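The symbol inventory described in this reply is small enough to check mechanically. The following is a rough sketch, assuming the subscripts are written with `<sub>` tags as mentioned earlier in the thread; the function name and the sample strings are made up for illustration and are not attested transcriptions.

```python
import re

# Assumed encoding: lower-case ASCII letters, whitespace,
# and the subscripts written as <sub>1</sub> or <sub>2</sub>.
_ALLOWED = re.compile(r"(?:[a-z\s]|<sub>[12]</sub>)+")

def is_valid_transcription(text):
    """Return True if text uses only the symbols listed in the reply above."""
    return _ALLOWED.fullmatch(text) is not None

# Hypothetical sample strings, for illustration only.
print(is_valid_transcription("pi<sub>1</sub>kari"))   # allowed symbols only
print(is_valid_transcription("pi<sub>3</sub>kari"))   # subscript 3 is not in the set
print(is_valid_transcription("Pikari"))               # upper-case letter is not in the set
```

A check like this only documents which symbols may appear; it does not say how a string should be interpreted, which is the separate point raised in the following reply.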
You may only need something that formats the font, but other people may appreciate knowing explicitly how some string of symbols is to be interpreted and what its source is. DTLHS (talk) 01:11, 8 December 2017 (UTC)
Ah, yes, that I agree with -- I'd missed your intent earlier. The subscripted numerals are described in w:Old Japanese, which is linked to already by the {{inh}} template that should be right nearby, as at 光#Japanese, Etymology 1; is that not sufficient? ‑‑ Eiríkr Útlendi │Tala við mig 20:07, 8 December 2017 (UTC)
If no one is aware of any extant template (besides {{IPAchar}}) that does this, I'm happy to make a new one. ‑‑ Eiríkr Útlendi │Tala við mig 01:03, 8 December 2017 (UTC)
I'm pretty sure that {{IPA}} and {{IPAchar}} and other pronunciation templates are the only ones that add the IPA class. But once again, I do not think it is correct to label this transcription IPA. It would be better to tag it as Old Japanese written in the Latin script, and to apply the desired styles to that language–script combination. — Eru·tuon 01:27, 8 December 2017 (UTC)
I realize I've confused the issue. I'm happy using some other class -- class="IPA" just happens to be the only one I know of right now that I'm sure applies the font styling I want. Looking at the rendered HTML in Chrome's "Inspect" view, I don't think this CSS class does anything to label the text as IPA: it just sets font and line height styling. I have no interest in labeling the romanized OJP text as IPA; Mahāgaja's arguments against doing so have already convinced me of that.  :) My reservation about using something like {{m|ojp|...}} is that, in my specific use case (including romanized OJP to show phonetic development through to modern JA, as at 光#Japanese), I want the font to match the IPA strings that immediately follow on the same line, and this font configuration might not be the ideal for other use cases. ‑‑ Eiríkr Útlendi │Tala við mig 20:07, 8 December 2017 (UTC)
@Eirikr: It should be fine to assign an IPA-like style to {{m|ojp|<Latin text>}} if this is the only Latin-script orthography or transcription system for Old Japanese that will be used on Wiktionary. Then, the combination of lang="ojp" and class="Latn" can be given the style you desire in MediaWiki:Common.css, with no conflict. It occurs to me that a transliteration in {{m|ojp|<man'yougana>|tr=<Latin text>}} would have this combination of language and script. But it would be easy to select ojp transliterations in the CSS and disable the styles for them, if necessary. — Eru·tuon 06:49, 21 December 2017 (UTC)
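The language–script styling suggested here could look roughly like the rule generated below. This is only a sketch in JavaScript: `cssRule` is a hypothetical helper, the selector form `.Latn[lang="ojp"]` is one plausible way to target the combination of `lang="ojp"` and `class="Latn"`, and the declarations are placeholders, since the thread does not say which font settings the IPA class actually applies.

```javascript
// Hypothetical helper: build a CSS rule for one language–script combination.
// The selector targets elements that carry both class=<script> and lang=<lang>.
function cssRule(lang, script, declarations) {
  const body = Object.entries(declarations)
    .map(([property, value]) => `  ${property}: ${value};`)
    .join("\n");
  return `.${script}[lang="${lang}"] {\n${body}\n}`;
}

// Placeholder declarations; the real values would be copied from whatever
// the site stylesheet uses for IPA text.
const ojpLatinRule = cssRule("ojp", "Latn", {
  "font-family": "serif",
  "line-height": "1.2",
});
```

The resulting string would be pasted into the site stylesheet by hand; the sketch only shows that the rule can be scoped to a single language–script pair without touching other Latin-script text.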
I’d support the creation of such a template; it’s also needed for Egyptian entries like tꜣwj, where the reconstructed pronunciation is generally IPA but also includes a wildcard V for unreconstructible vowels. — Vorziblix (talk · contribs) 14:44, 11 December 2017 (UTC)
@Vorziblix -- In light of our combined use cases, I created {{IPAfont}}. (A very different template by the same name was deleted in 2010.) I also updated that Egyptian entry to use the template. HTH, ‑‑ Eiríkr Útlendi │Tala við mig 19:13, 21 December 2017 (UTC)
Thanks! I’ll go through and add it to the other Egyptian entries that need it. — Vorziblix (talk · contribs) 23:40, 21 December 2017 (UTC)
Actually, the error messages seem to suggest IPAchar is complaining about the numbers, not the HTML: invalid IPA characters (1). —suzukaze (tc) 00:36, 8 December 2017 (UTC)
For reasons of both aesthetics and consistency, I'd rather this ojp transliteration used the Unicode subscript numerals ₁ and ₂ rather than normal 1 and 2 with subscript markup, but since ₁ and ₂ are also not valid IPA characters, doing so won't solve the immediate problem. —Mahāgaja (formerly Angr) · talk 13:27, 8 December 2017 (UTC)
Hmmm, I find that the Unicode subscripts are too small -- I have trouble seeing them clearly, and my eyes aren't that bad. I also find they scan less well -- 1 extends below the line somewhat, thereby standing out better, whereas ₁ is mostly within the line of the text and thus (in my opinion) too unobtrusive. Either way though, as you note, we need to find something other than {{IPAchar}}. ‑‑ Eiríkr Útlendi │Tala við mig 20:07, 8 December 2017 (UTC)
@Eirikr: By chance I was reading Unicode subscripts and superscripts on Wikipedia. Apparently most fonts design the superscripts and subscripts for use in fractions, making them too small to be used for their intended purpose, chemical and mathematical formulas (and transcription systems). — Eru·tuon 06:53, 21 December 2017 (UTC)
@Erutuon: Ya, that page also has an informative image showing layout lines and where the different glyphs should ideally fall -- with superscript and subscript clearly extending beyond the cap line or baseline, while the fraction figures stay within those bounds. It seems the font used for Wiktionary is guilty of this same flaw, using fraction glyphs for the superscript (Unicode glyph vs. sup-tagged: ¹ 1) and subscript (Unicode glyph vs. sub-tagged: ₂ 2). For others reading this thread, one key quote from the WP article:

Most fonts that include these characters design them for mathematical numerator and denominator glyphs, which are smaller than normal characters but are aligned with the cap line and the baseline, respectively. When used with the solidus, these glyphs are useful for making arbitrary diagonal fractions (similar to the ½ glyph).

This was not the intended use of these characters when Unicode was designed. The intended use was to allow chemical and algebra formulas to be written without markup. Proper appearance of these requires true superscript and subscript. H2O with subscript markup may look better than with a Unicode subscript (H₂O) in a font that has repurposed the Unicode subscripts for fractions.

Yay, not following the specification... :-P   Oh, well. I'll keep using tags until the font developers get on board. ‑‑ Eiríkr Útlendi │Tala við mig 17:54, 21 December 2017 (UTC)

Mandarin tone-related phenomena

I'd like to know whether tone-related phenomena are explained somewhere, and if not I propose creating some notes for users. Secondly, 小姐 shows xiǎojie as a "toneless variant", yet A Grammar of Mandarin by Jeroen Wiedenhof states that in a bisyllabic word written with 2 characters, both with citation 3T, modern Mandarin varies: 小姐 Miss [2.0] xiáojie, 表姐 elder female cousin [2.3], 姐姐 elder sister [3.0]. Xiáojie is also shown by Colloquial Chinese A Complete Language Course --Backinstadiums (talk) 11:18, 8 December 2017 (UTC)

You are focusing too much on the details. It's great that you are looking up reference phonetic material on the pronunciation of Mandarin, but as a learner of Mandarin, please bear in mind that the '3-3 → 2-3' tone sandhi in Mandarin is a natural process that even native speakers may not be able to notice or characterise. For '3-3' words with non-obligatory toneless second syllables, there is nothing that will make you un-understandable by pronouncing the tones clearly without sandhi. In fact, if your tones are accurate (which is absolutely paramount for learners), pronouncing the words 小姐 and 表姐 slowly and clearly, without sandhi, will give the unfamiliar listener the impression that you are a learned speaker. The sandhi will come naturally when you have become sufficiently familiar with pronouncing full '3-3' sequences such as 表姐; it should not precede the latter. Actually, intentionally pronouncing '3-3' as '2-3' as a learner would lead to confusions. e.g. 網友 could be understood as , especially considering that learners generally tend to enunciate the syllables very slowly. When the second syllable of a '3-3' sequence is optionally reduced, both the 3-3 and 2-3 forms can be weakened, resulting in 3-· and 2-·, the former used in careful speech and the latter in fast speech. In normal speech it is probably somewhere in between the two. I don't really think it needs to be further explained- learners should try to use the unreduced pronunciations unless the word is obligatorily toneless like 姐姐. Plus please avoid the word 小姐 altogether because of its negative connotations. Wyang (talk) 14:03, 8 December 2017 (UTC)
@Wyang: Since the author said "historical compounds now highly lexicalized" I think it's worth classifying such a group (to which terms such as 小姐 Miss [2.0] belong) within Category:Mandarin_words_containing_toneless_variants --Backinstadiums (talk) 15:13, 8 December 2017 (UTC)

Happy Birthday Wiktionary!

Happy Birthday Wiktionary! Message from Katherine Maher, Executive Director of the Wikimedia Foundation

VGrigas (WMF) (talk) 14:39, 8 December 2017 (UTC)

Thank you! —Justin (koavf)TCM 23:12, 8 December 2017 (UTC)
Cheers! How many candles on the birthday cake? DonnanZ (talk) 00:56, 9 December 2017 (UTC)
15, I should have listened to the audio. DonnanZ (talk) 01:00, 9 December 2017 (UTC)
🎂 It depends on your platform! [3] Equinox 04:39, 9 December 2017 (UTC)

Proposal: public polls

Sometimes there are matters that strongly affect the user experience, and it would be valuable to ask the ordinary users what they want. I therefore propose that we implement an option to hold public polls, which are shown to all users who visit Wiktionary, whether logged in or not. In these polls, we would ask simple questions such as "which layout do you prefer" or "do you think this feature is valuable". There would be a maximum of one public poll at a time, and they would run for a duration of two months (subject to discussion of course). Starting a public poll should not require majority consensus, to prevent reactionary users from sabotaging the process.

I have no idea how this would actually be implemented, but I'm sure there are people who can help with that. —Rua (mew) 22:38, 8 December 2017 (UTC)

In principle, it's not hard to put a banner on the top of each page--we already have one: MediaWiki:Sitenotice. —Justin (koavf)TCM 23:12, 8 December 2017 (UTC)
But a poll is quite different. It would involve saving people's answers somewhere, and also remembering who voted. —Rua (mew) 23:14, 8 December 2017 (UTC)
Are you suggesting that the poll be in the header? I am suggesting that the site notice have a link to a poll and that is documented elsewhere. —Justin (koavf)TCM 23:42, 8 December 2017 (UTC)
Hmm. I figured that the poll would be directly located on the page somewhere, like in the bar on the left. That way users can just click their option without being taken away from where they want to be. —Rua (mew) 23:57, 8 December 2017 (UTC)
Ah. I see what you mean now. MediaWiki can certainly have polls and record user answers--e.g. see Wikia. Not sure if it's desirable or if we want some barrier to entry of some kind but it's definitely possible. —Justin (koavf)TCM 00:04, 9 December 2017 (UTC)
  • Good idea. Probably the best we can do and still continue to bend over backward to convey the idea that WMF projects respect and guard user privacy. Would we limit participation to registered users? Would a checkuser be available and willing to make sure there wasn't sock-puppetry skewing the results?
"Majority consensus" conflates two decision principles. Consensus usually means well more than a simple majority. In practice, we have had various levels for "consensus" here, varying over time and across types of decisions. I would think that a simple majority would suffice for something like this. We could start with no vote required and see how it goes. DCDuring (talk) 15:35, 9 December 2017 (UTC)

vulgar passages in quotes

I don't know if this has been discussed already but why do so many of the quotes for English terms have to reference very vulgar or obscene sexual material, often from homosexual novels and the like? Is this just meant to be provocative and "edgy" for the sake of it, or to show that it's okay and normal, or to push the envelope? Is it part of some agenda? Is it some statement about the nature of this site being one where people can freely contribute? I don't get it. Not that I have anything against homosexuality but some of these passages (equally including those that reference straight interactions) are a bit much for people to read if they're just casually looking at a dictionary, and I feel it may make some people take Wiktionary less seriously, making it rather juvenile and akin to something like Urban Dictionary. Word dewd544 (talk) 04:50, 9 December 2017 (UTC)

Any examples? If a term itself is vulgar the quotes will probably also be vulgar. DTLHS (talk) 04:51, 9 December 2017 (UTC)
In some cases Wonderfool has added this kind of thing for a joke (e.g. the erotic passages from Fanny Hill, and I remember a news article about a kitten being microwaved). In other cases (slang) some words just tend to appear in these contexts. I try to favour "neutral"-feeling citations where possible. Equinox 04:55, 9 December 2017 (UTC)
The Fanny quotes are all excellent! Especially for amorous/old-fashioned terms like straddle, indriven, unbonneted, house of accommodation, throb, supinely. But the sexual quotes for daily words like prove, unnecessary etc. are probably not the best ideas. As for the quotes about the cat in the microwave, that was pretty funny. This Wonderfool character seems like a cool type of chick (I assume she's a chick, anyhow...). --Lirafafrod (talk) 11:08, 12 December 2017 (UTC)
I definitely think we should give a strong preference for non-shocking quotations (cf. w:en:WP:EGG). There is no reason for vulgar or graphic quotations for words other than vulgar and graphic ones. —Justin (koavf)TCM 05:00, 9 December 2017 (UTC)
It might be worth saying: we're not a corpus, and we don't have an obligation to keep any particular quote if it can be replaced and if it's not notable in some way, like the first known usage of a word. DTLHS (talk) 05:01, 9 December 2017 (UTC)
Vulgarity seems to cast too wide a net. To clarify what I imagine is the intent: Don't we mean "obscene", "disgusting", and/or "not suitable for small children"? I've felt and mostly repressed the urge to remove such quotes.
A problem arises with certain polysemous terms which have many ordinary definitions but some that are disgusting or luridly sexual. For such problematic definitions a practice of not having any usage examples (which display by default) and no excessively disgusting citations (which are hidden by default) might suffice. DCDuring (talk) 15:50, 9 December 2017 (UTC)
I could add much more vulgar quotes, since I listen to rap music, as I have done at dog bone.
I think the general solution is to move quotations to citation pages – I don’t want a moral cleansing to be started because “we don't have an obligation to keep any particular quote”. Do not be a puritan. Sexual references form a great part of what people talk about and have useful subtle differences. The measurement line might be: A reference should not be moved because of being boorish, but if it is contra bonos mores, as would be enough for a contract to be void. Palaestrator verborum (loquier) 16:37, 9 December 2017 (UTC)
Ah yes, wrapping oneself in the flag of free expression. I think we lavish much more attention on some "subtle" differences for sexual terms than on, say, those involving a word like seem. I suspect that the entire reason for the differential attention is a differential in hormonal response within the contributors to the prospect of working on the two types of entries. I rather doubt that any consideration of the users of an entry, especially those of an age different from the contributor, becomes involved. DCDuring (talk) 18:23, 9 December 2017 (UTC)
I am not sure what you mean with those hormonal responses and with the “free expression”. If I hear rap music, I add quotes from rap music, and users who hear Pop add Pop music quotes. The former is even more valuable because there are the gaps of Wiktionary whilst there is else hardly an English word found not in Wiktionary (consider Grime or Drill with Multicultural London English). But I am not free to change what music pleases me and I doubt that you know much about how hormones lead the editing habits of Wiktionarians. And what I am particularly not free in is to choose the sources which I stumble upon. Better give that durable quote for dog bone than to leave it out because some people are more sensitive and there could be a less edgy one. There should be more cases when people should be thankful for seeing the quote than pitying themselves for having endeavored to read it – and disgust can also be mixed with thankfulness for having read an example. That is another way how you could discriminate the cases considering the user. Palaestrator verborum (loquier) 22:31, 9 December 2017 (UTC)
I also dislike censorship but I think the point here is to avoid vulgarity if it is unnecessary, e.g. when citing an everyday word like "umbrella". Equinox 06:22, 10 December 2017 (UTC)
Sorry, I meant excessive obscenity rather than vulgarity. That more accurately describes what I'm getting at here. I actually had a feeling much of it was part of some kind of practical joke by some user a while ago and then no one really bothered to change them. It doesn't really bother me that much personally (in fact I'm far from puritanical in my daily life and don't hesitate to use some of this language) but I could see how that would annoy some people and draw them away from the project. It almost seems like some of the editors were purposely going for shock value. I could also invoke the whole "not safe for work and not safe for kids" thing, but realistically, it's hard to keep kids away from seeing certain kinds of content online. And yes, of course words that are meant to be obscene by definition would obviously have these kinds of quotes, but I do recall coming across several where it was uncalled for. I'll have to go and find some specific examples. And I also get that there is freedom of speech, which is perfectly fine, but this is still a user-modified project, like Wikipedia, one in which a consensus of informed people generally decide on what ends up being presented. Word dewd544 (talk) 07:06, 12 December 2017 (UTC)

Doubt about recent Persian, Middle Persian, Yaghnobi, Tajiki and Arabic entries and edits

I'm really worried about recent edits and entry creations in these languages. Kaixinguo~enwiktionary (talk) 18:19, 9 December 2017 (UTC)

@Kaixinguo~enwiktionary Me too, even though I don't speak Persian, Hindi entries need good Persian entries for etymology. I think @Rajkiandris should explain himself, since he has added thousands of Yaghnobi, Tajik, and Persian stubs with incorrect formatting and etymologies, and now he doesn't respond to queries on his talk page. Some of his edits are, frankly, totally wrong.
As for Arabic, I actually think the entry quality has improved a lot recently, thanks to the better treatment of dialects and good work by new contributors. —AryamanA (मुझसे बात करेंयोगदान) 22:53, 9 December 2017 (UTC)
So what are you worried about in Arabic entries? The increase in quality I have felt for Arabic since I have registered two months ago has been according to my gut-feel 30%. I have always patrolled the edits in Arabic, corrected the template usage (even added and edited reference templates), added many good etymologies (see ت ر ج م (t r j m), this has not been seen before), removed the {{etyl}} completely from it, emptied Category:Arabic entries needing vocalization (though this some random IP did too), Category:Arabic terms needing Arabic script, Category:Arabic form-I verbs with missing non-past vowel in headword, and added about 400 additional entries of high quality belonging to the 9200 in Category:Arabic lemmas now, and almost 200 quotes with translations, and a significant number of images. I use all sources available and as a rule add only words I read.
I had on that occasion also cleaned up some Persian etymologies. Today you have changed the transcriptions of the Persian in إِبْرِيق (ʾibrīq) and its reborrowed descendant ابریق but I have only copied the transcription of the Persian word as it was found before, this plays surely a role in that you talk here now. I don’t know Persian to say what transcriptions are good and Wiktionary:About Persian#Transliteration does not say anything meaningful about it. Perhaps you should mention the minimum standards in it. I have just three days ago cleaned it up from some instructions that were wrong, and I have much improved Wiktionary:About Arabic. Also, who will clean up {{etyl}} in Persian? Now I have removed it from Arabic it can’t be used anymore, but Persian? You seemingly refer to edits by Rajkiandris (talk · contribs) but I have not seen him adding Arabic. @Kaixinguo~enwiktionary Palaestrator verborum (loquier) 22:55, 9 December 2017 (UTC)
I'm not referring to those romanisations; that kind of thing happens all the time. No one can monitor everything that's going on, perhaps it was wrong to include Arabic. I'm just talking about a general impression. I am sorry if I have given you the impression that I am referring to you. Kaixinguo~enwiktionary (talk) 23:26, 9 December 2017 (UTC)
So how have you formed the impression? I know what has been going on the last two months in Arabic and there hasn’t been anything needing reproval. Palaestrator verborum (loquier) 23:42, 9 December 2017 (UTC)
I literally just said maybe I shouldn't have included Arabic so let it go.Kaixinguo~enwiktionary (talk) 23:45, 9 December 2017 (UTC)
Ok, I let it go, but your submission is still underspecified. I don’t know what you want people to do. People probably prefer not to extract the notions you know but others don’t. You won’t get sudden contributors proficient in those languages who will perform some thorough reviews of the language treatment. Palaestrator verborum (loquier) 00:06, 10 December 2017 (UTC)
@Kaixinguo~enwiktionary: If you complain about quality in a public place, you have to give some examples. Which of the recent entries in CAT:Persian lemmas, CAT:Tajik lemmas, CAT:Arabic lemmas, etc. seem wrong? Or is it some translations from English? --Anatoli T. (обсудить/вклад) 00:26, 10 December 2017 (UTC)
@Atitarev: See User talk:Rajkiandris and Special:Contribs/Rajkiandris. While most of the entries and edits don't have wrong definitions per se, the quality of edits is not up to par (not using templates, wrong Proto-Iranian reconstructions, incorrect sorting of descendants, no transliterations for Middle Persian, etc.). Combined with the massive volume of edits that (s)he has done, the overall quality of Yaghnobi and Tajik entries especially has deteriorated. Also, much like Gfarnab, they edit in languages that they don't even know anything about; I had to revert a dozen Konkani entries of his a while ago. —AryamanA (मुझसे बात करेंयोगदान) 00:33, 10 December 2017 (UTC)
I see, thanks. Having an editor editing in a language nobody can check, like Yaghnobi, is even more dangerous. --Anatoli T. (обсудить/вклад) 00:38, 10 December 2017 (UTC)
I have left a warning on his talk page. I ask that any of you with expertise or resources in any of the languages he has worked on to help assess or clean up his entries. —Μετάknowledgediscuss/deeds 00:56, 10 December 2017 (UTC)
We should require @Rajkiandris to proffer some quotations with translations in Yaghnobi or at least Tajik where he should find enough – I mean, speaking to @Rajkiandris, you should add quotes with translations. If you do, you can affirm an impression of knowledge of these languages; if not, it is also good, for Wiktionary, because it only increases the notion that you only copy linguistic material without sufficient exposal to the language itself, and we can ostracize you. So add quotes if you want to be taken seriously and your entries to have a long life. For now it is in the realm of the possible that “created by Rajkiandris” will be an argument in itself in proposals for deletion. Palaestrator verborum (loquier) 01:34, 10 December 2017 (UTC)

Synonyms under corresponding senses - stop of switching

In Wiktionary:Beer_parlour/2017/May#Poll: putting "nyms" directly under definition lines, there is a preliminary consensus for using a new synonym format but only under the condition that the synonyms become hidden, meaning collapsible. There is no consensus, not even a supermajority, for non-collapsible synonyms under corresponding senses. An example entry that I think currently looks horrible due to the new format is cat.

I propose that switching to the new synonym format stops until the collapsing of synonyms is implemented. --Dan Polansky (talk) 13:24, 10 December 2017 (UTC)

I agree that cat looks horrible. However, it would not be a solution to use the synonyms section: it looks bad too and would look worse without {{syn}}. This template kin needs to be fixed. And before hiding can be implemented, it would help to decrease font-weight and font-size. But {{sense}} + synonyms looks worse than {{syn}} with arguments does.
For reference only I link here Wiktionary:Beer parlour/2017/November#Independent Synonyms section, or {{synonyms}} under each relevant sense?. Palaestrator verborum (loquier) 15:03, 10 December 2017 (UTC)
This revision of the cat entry (31 December 2016) looks okay as for synonyms, using the old/current format. What looks even better is this revision (26 December 2011), where some of the rather dubious synonym lines are absent. --Dan Polansky (talk) 15:15, 10 December 2017 (UTC)
If I apply the arguments consequently, I reason that we should not list synonyms at all in pages as long as they are not hidden in the Thesaurus namespace or in a drop-down menu. Thesauri are else separate products from dictionaries that are not included even in the greatest dictionaries. Palaestrator verborum (loquier) 15:29, 10 December 2017 (UTC)
I don't know what to say or where to begin; the above does not make any sense to me. --Dan Polansky (talk) 15:47, 10 December 2017 (UTC)
What’s so complicated? Currently both the synonyms section and {{syn}} look bad. If these things are hidden, they look better. Thus there is an argument for not adding synonyms at all as long as {{syn}} is not fixed (for one does not make a Thesaurus entry for all things either). Which does not mean that other reasons – the usefulness – outweigh the aesthetic argument leading to the conclusion to use a synonym section or {{syn}} anyway. Palaestrator verborum (loquier) 15:57, 10 December 2017 (UTC)
Re. "Thus there is an argument for not adding synonyms at all as long as syn is not fixed:" There is no such argument. I am not proposing to stop adding synonyms. I am proposing to continue the original practice of adding synonyms to their dedicated sections, as still codified in WT:ELE. I am proposing that all switching should stop until collapsibility is implemented, consistent with the referenced poll. I said these things above. --Dan Polansky (talk) 16:02, 10 December 2017 (UTC)
Help me test nym-hiding by adding the following line to your custom javascripts:
importScript("User:Ungoliant MMDCCLXIV/synshide.js");
Ungoliant (falai) 17:49, 10 December 2017 (UTC)
What do I have to do for it? I haven’t ever used custom javascripts, and when I search for it I find plenty. Palaestrator verborum (loquier) 18:31, 10 December 2017 (UTC)
@Palaestrator verborum: click here, and copy paste what Ungoliant wrote. --Per utramque cavernam (talk) 18:37, 10 December 2017 (UTC)
@Ungoliant MMDCCLXIV It works on cat, but not on Geck or anmachen – there the synonyms and antonyms just vanish. Palaestrator verborum (loquier) 19:06, 10 December 2017 (UTC)
Another issue: cottage cheese DTLHS (talk) 19:26, 10 December 2017 (UTC)
@Palaestrator verborum, DTLHS, thanks. It was a really dumb mistake... should be working now. Please use my talk page to report problems, and we’ll announce here when everything starts working properly. — Ungoliant (falai) 19:30, 10 December 2017 (UTC)
@Ungoliant MMDCCLXIV Thank you! Having synonyms displayed so prominently under definition lines was driving me crazy. Andrew Sheedy (talk) 22:09, 10 December 2017 (UTC)


Hello. When a French term is a borrowing of an English term, which itself is a borrowing of an Old French term, does it qualify as a "reborrowing"? I don't know if I can add French square to CAT:French twice-borrowed terms. --Per utramque cavernam (talk) 21:20, 10 December 2017 (UTC)

You don’t have to add to the category yourself. It happens when you use {{der|fr|fr|}}. Palaestrator verborum (loquier) 22:14, 10 December 2017 (UTC)
And if it doesn’t happen with {{der|fr|fro|}}, I think it is an error. Palaestrator verborum (loquier) 22:15, 10 December 2017 (UTC)
It seems like the collection of doublets and of reborrowed terms overlaps. Palaestrator verborum (loquier) 22:17, 10 December 2017 (UTC)
Yes, I think that these are reborrowings. The word reborrowing presupposes the identity of the language, because the prefix “re” signifies retreat to a state the subject had, i. e. when it existed as such. So words which English has from the Latin spoken in Gallia cannot be reborrowings because French does not represent the successor of Latin; but if Old English has taken a word from Old French it can be a reborrowing, because French represents the successor of Old French (the people then thought of Old French as French, it has the same identity, though we shall see it ex post) – not relevant if English is the sole successor of Old English –, and words which Indo-European has borrowed from Proto-Semitic cannot be “reborrowed” by Arabic. Palaestrator verborum (loquier) 22:26, 10 December 2017 (UTC)
That argument makes no sense. There is no notion of successors with languages the way there is with countries. Basing it only on the name is unlinguistic. There is nothing that entitles French to more than say, Walloon. Both are linguistically descendants of the same language. —Rua (mew) 23:00, 10 December 2017 (UTC)
Which means the things OP described are not reborrowings. Question solved. Palaestrator verborum (loquier) 00:13, 11 December 2017 (UTC)
But it looks like it is not easy to fix with the current system of language data to cause {{der|fr|fro|-}} to categorize into reborrowings but {{der|fr|itc|-}} not. Palaestrator verborum (loquier) 22:31, 10 December 2017 (UTC)
Which illustrates the difficulty of deciding what is the same language and what isn't. Proto-Italic was passed on from parent to child in an unbroken chain from about 3500 years ago to today. It picked up some changes along the way, so that the current language looks very different from the language of 3500 years ago. There is no point in any of this time where speakers started speaking another language. Our modern way of splitting things into stages and dialects and languages is completely arbitrary. And with this, the concept of a reborrowing is also fuzzy and arbitrary. I would consider borrowing of any PIE term from one PIE language into another PIE language a reborrowing. After all, the word once existed in the history of the borrowing language, and now it is entering the language also from another source. Thus, I would call genus a reborrowing, since this word once existed natively in the history of English too. —Rua (mew) 14:18, 15 December 2017 (UTC)
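The disagreement in this thread comes down to how far back along a language's ancestor chain a source may sit and still count as "the same language". A minimal JavaScript sketch of that choice follows. The ancestor table is an illustrative assumption (it is not the site's language data), and `isReborrowing` and its `maxDepth` parameter are hypothetical names introduced only for this example.

```javascript
// Illustrative ancestor chains, nearest ancestor first.
// Assumption: these codes and chains are simplified placeholders.
const ANCESTORS = {
  fr: ["frm", "fro", "la", "itc-pro"],
  en: ["enm", "ang", "gem-pro"],
};

// Returns true if a term that `target` took from elsewhere ultimately goes
// back to `source`, and `source` counts as an earlier stage of `target`.
// maxDepth limits how many ancestor stages are treated as "the same language";
// leaving it at Infinity gives the broad reading in which any ancestor qualifies.
function isReborrowing(target, source, maxDepth = Infinity) {
  if (source === target) return true;
  const chain = ANCESTORS[target] || [];
  const index = chain.indexOf(source);
  return index !== -1 && index < maxDepth;
}
```

With `maxDepth` set to 2, `isReborrowing("fr", "fro", 2)` holds while `isReborrowing("fr", "itc-pro", 2)` does not; with no limit, both hold. Where to put the cut-off is exactly the arbitrary decision the thread is arguing about, so the sketch leaves it as a parameter rather than picking a value.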

request to add to WT:AWB

user:rajasgored, because I don't like flooding Recent Changes. —suzukaze (tc) 07:11, 12 December 2017 (UTC)

Added, I hope I did it correctly. Let me know if it still doesn’t work. Wyang (talk) 03:50, 13 December 2017 (UTC)
There's also the option of temporarily giving your account a flood flag, though you're certainly experienced and responsible enough to have AWB enabled. Chuck Entz (talk) 04:02, 13 December 2017 (UTC)
Thank you!
Nearly forgot about Wiktionary:Bots#Process though, lol
I'll set up the votes page. —suzukaze (tc) 06:21, 13 December 2017 (UTC)
The account has been renamed: User:350bot. See the vote page: Wiktionary:Votes/bt-2017-12/User:350bot for bot status. --Per utramque cavernam (talk) 16:37, 16 December 2017 (UTC)
Now the entry in WT:AWB needs to follow the new name too... —suzukaze (tc) 06:50, 25 December 2017 (UTC)
Done — justin(r)leung (t...) | c=› } 07:15, 25 December 2017 (UTC)

Nicknames from surnames

Can and should we document these? I feel like Sully should link to Sullivan in some way, but I don't know how I'd format it. We have Murph (Murphy) as a diminutive surname, but every example I think of is (I believe) a distinctly masculine nickname. There's also Coop (Cooper), Murr (Murray), and probably a handful more. Ultimateria (talk) 19:30, 14 December 2017 (UTC)

Putting them under derived terms seems sufficient. DTLHS (talk) 04:50, 15 December 2017 (UTC)
What about the nicknames themselves? I can't really use "male given name". Ultimateria (talk) 15:55, 16 December 2017 (UTC)
I think "A diminutive form of the surname x." would suffice. Many of these are also not gender specific, in my experience. - TheDaveRoss 13:52, 18 December 2017 (UTC)
In my experience, it's simply more common for guys to use other guys' last names as a nickname than it is for girls/women to do so, or for men to call women by their surnames. That might be where the perception that they are masculine nicknames comes from, and it may well be that it's unusual for a female to be called Murph or Coop. (An example from my personal experience: I was often called Sheeds or Sheedster during my early teens, but it's hard to imagine my sisters receiving the same nickname.) Andrew Sheedy (talk) 15:13, 18 December 2017 (UTC)
Less common, perhaps, but it does happen. In Roman Holiday, Audrey Hepburn's character introduces herself as Anya Smith, and Gregory Peck's character calls her "Smitty". It happens in real life too, of course. —Mahāgaja (formerly Angr) · talk 16:21, 18 December 2017 (UTC)

"from en.wiktionary.org (Terms of Use)" at the bottom of the main page

What is the point of this line? What is from en.wiktionary.org? DTLHS (talk) 04:51, 15 December 2017 (UTC)

That is totally unnecessary and redundant to the title above and Terms below. Please delete that. —Justin (koavf)TCM 04:59, 15 December 2017 (UTC)
(watsuzukaze (tc) 05:10, 15 December 2017 (UTC))

Template for coinages

Should we have a template for coinages? I'm imagining one that, at järjestelmä for example, would look like {{coin|fi|{{w|Agathon Meurman}}|1867}} and output "Coined by Agathon Meurman in 1867", thus templatising the current phrasing. The advantages of doing this are that we could then categorise coinages by date (and have, say, Category:Finnish terms coined in the 19th century) or even by person, and that it would make our etymologies more standardised and machine-readable. —Μετάknowledgediscuss/deeds 07:09, 15 December 2017 (UTC)

I could well make use of it. For many culture languages it is a sport to replace foreign intruders with native coinages. For German, one can ascribe many words to the Fruitbearing Society, specifically Philipp von Zesen, not a few to Joachim Heinrich Campe, and I know Eduard Engel (Germany’s foremost, most thorough purist) well enough to know that he has invented a word when I read him. Many words in the administrative area have also been invented by the legislator, even basic ones like volljährig (earlier they said mayorenn, Mayorennität). It is surely also useful for Turkish, where coinages have been made after the fall of the Ottoman Empire to go back to Turkish, but others can tell you more about it. One just has to think about which entities one can ascribe words to. German legislator, Hessian legislator etc. seems workable, as one hardly ever finds out which official invented a word. Palaestrator verborum (loquier) 08:18, 15 December 2017 (UTC)
I like this idea. Some people are prolific coiners; being able to get a list of all their coinages could be useful. – Jberkel 10:40, 15 December 2017 (UTC)
As for German coinages, they are not rarely attributed to the wrong persons or attributed to different persons. E.g. Mundart (for Latin dialectus) is sometimes attributed to Zesen and sometimes to Schottel; and Gesichtserker, created by anti-purists, is incorrectly attributed to various purists like Zesen or Campe. Making secondary sources (where some people, especially linguists, write about older coinages or purism) mandatory wouldn't be enough to prevent such misattributions. Making primary sources (where e.g. Zesen, Schottel or Campe used or mentioned a word) mandatory would reduce misattributions, but wouldn't prove who coined it and would only prove that the person used/mentioned it (while others could have used/mentioned it before him).
Additionally, there could be the problem that the coiner didn't use the term. For example the coiner could have made a word list which only has mentions. The etymology section should then have a line like "Coined by Person A in Year X and first used by Person B in Year-(X+y)". - 15:45, 15 December 2017 (UTC)
This is nothing more of a problem than etymologies usually are; it is even better than the probability jugglery that has to be practised with ancient languages and is often really wild but necessary (consider the lack of resources mentioned in Wiktionary:About Arabic). And your examples even confirm that it is feasible to attribute words to coiners, as it is a well and long known misattribution that von Zesen invented Gesichtserker. Palaestrator verborum (loquier) 15:59, 15 December 2017 (UTC)
It's often more of a problem. It's easy to see that Mundart is Mund + Art and that it translates Latin dialectus. But to find out who coined it and when isn't so easy.
Gesichtserker is still sometimes attributed to Zesen (even by linguists), and even if it were universally known that he didn't coin it, what about the attribution to Campe? - 15:05, 16 December 2017 (UTC)
This is a good idea. Such categories would be useful. — Ungoliant (falai) 11:25, 15 December 2017 (UTC)
Why "coined"? I'd rather prefer Category:Finnish terms first attested in the 19th century. —Rua (mew) 14:10, 15 December 2017 (UTC)
To collect the terms consciously introduced by identifiable entities. Palaestrator verborum (loquier) 15:59, 15 December 2017 (UTC)
I meant to create a template for this a year ago but forgot all about it. I think it would be a good idea to add a parameter that would block categorization in cases where the entity coined only a single term. Crom daba (talk) 16:16, 15 December 2017 (UTC)
Indeed, first attestations are interesting but ultimately a very different matter from coinages made by a specific, known person. —Μετάknowledgediscuss/deeds 19:18, 15 December 2017 (UTC)
@Metaknowledge: I suggested a similar thing a few months ago. --Per utramque cavernam (talk) 13:16, 16 December 2017 (UTC)
Seems a good idea. We may need to define what we mean by "coinage" in one of the glossary appendices (if we don't already), to ensure it's used properly. Equinox 13:29, 16 December 2017 (UTC)
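
For reference, the behaviour proposed at the top of this thread can be sketched in a few lines of Python. This is only an illustration of the idea and not an existing template or module: the function names, the small language-code table, and the exact category wording beyond the one example given above are assumptions.

```python
from typing import Tuple

# Hypothetical lookup from language code to language name (illustration only).
LANGUAGE_NAMES = {"fi": "Finnish", "de": "German"}


def century_ordinal(year: int) -> str:
    """Return the ordinal of the century containing a CE year, e.g. 1867 -> '19th'."""
    century = (year - 1) // 100 + 1
    if 10 <= century % 100 <= 20:
        suffix = "th"
    else:
        suffix = {1: "st", 2: "nd", 3: "rd"}.get(century % 10, "th")
    return f"{century}{suffix}"


def render_coinage(lang_code: str, coiner: str, year: int) -> Tuple[str, str]:
    """Return the displayed etymology text and the category a {{coin}}-like template might add."""
    language = LANGUAGE_NAMES[lang_code]
    text = f"Coined by {coiner} in {year}"
    category = f"Category:{language} terms coined in the {century_ordinal(year)} century"
    return text, category


text, category = render_coinage("fi", "Agathon Meurman", 1867)
print(text)
print(category)
```

A per-person category, or a switch to suppress categorisation for one-off coiners as suggested above, would be additional parameters on top of this.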

Citation titles

I believe we need some kind of general guideline or policy on how to treat the titles of works in citation. It is insane to me that in some cases they are three or four lines long, much longer than the actual sentence being cited. I think we should be saying, for instance, "1722, Daniel Defoe, A Journal of the Plague Year, London: E Nutt, p. 247" – but that work is currently being given as follows:

  • 1722, [Daniel Defoe], A Journal of the Plague Year: Being Observations or Memorials, of the Most Remarkable Occurrences, as well Publick as Private, which Happened in London during the Last Great Visitation in 1665. Written by a Citizen who Continued all the while in London. Never Made Publick before, London: Printed for E[lizabeth] Nutt at the Royal-Exchange; J. Roberts in Warwick-Lane; A. Dodd without Temple-Bar; and J. Graves in St. James's-street, OCLC 745119358, page 247:

I have been trimming many of these as and when I come across them, but I am just getting reverted so perhaps we can decide – for instance – to cite works in most cases "without subtitles", or to specify that saying "printed for" is unnecessary, or in some other way to codify what I would have thought was a common sense idea of how to present the key information. Or alternatively to clarify if the current approach is what the community wants. Ƿidsiþ 12:27, 16 December 2017 (UTC)

I think the templates should get parameters for subtitles that format the subtitles so these are less space-taking. Palaestrator verborum (loquier) 12:39, 16 December 2017 (UTC)
Sometimes the subtitle is short and useful, for disambiguation or to suggest what kind of work is being cited. I generally try to use the WP title of the work if there is a WP article. I also skip many of the publication details, including only enough to sufficiently disambiguate editions. DCDuring (talk) 16:33, 16 December 2017 (UTC)
Some kind of JS that hides most of the citation details / displays a short form of the title would be ideal IMO. DTLHS (talk) 17:18, 16 December 2017 (UTC)
Even worse, the long cites are redundant if repeated in other entries. Once it is ready it might be a good application for WikiCite (the main use case is scientific publishing at the moment). – Jberkel 18:56, 18 December 2017 (UTC)


This needs updating. The two CUs we have aren't listed here, and 2 ex-CUs are. --Gente como tú (talk) 16:48, 16 December 2017 (UTC)

TheDaveRoss is still a CU, so I'm not sure of what you're saying. --Per utramque cavernam (talk) 17:02, 16 December 2017 (UTC)
Yup, looks right to me. - TheDaveRoss 13:49, 18 December 2017 (UTC)

Adding a language parameter to {{non-gloss definition}}

Currently links go to the English section by default. Does anyone object to adding a lang parameter to this template? See also Template_talk:non-gloss_definition#Language_parameter. – Jberkel 10:13, 18 December 2017 (UTC)

Yes, I wanted to suggest that at some point; maybe it would allow us to have CAT:Words with non-gloss definitions by language? --Per utramque cavernam (talk) 16:28, 18 December 2017 (UTC)
I don't see what the point of that would be. —Μετάknowledgediscuss/deeds 17:16, 18 December 2017 (UTC)
Yes, there is not even a strict line between gloss and non-gloss definitions. Palaestrator verborum (loquier) 17:37, 18 December 2017 (UTC)
I don't see the point. Nothing within the template is language specific. —Rua (mew) 18:00, 18 December 2017 (UTC)
Not to categorise. The last few times I used it for non-English entries I've never needed to link to English entries from within the def. But maybe that's not a common case, I'll leave it as is. – Jberkel 18:33, 18 December 2017 (UTC)
Sorry, I seem to have derailed your suggestion... --Per utramque cavernam (talk) 19:41, 18 December 2017 (UTC)
Then please explain the use of the lang parameter, why it “should” exist. I can so far only imagine categorization as a purpose, which is as already noted poor. Palaestrator verborum (loquier) 19:53, 18 December 2017 (UTC)
@Palaestrator verborum: it could be used as a maintenance category. --Per utramque cavernam (talk) 22:27, 18 December 2017 (UTC)
@Jberkel, I think you should be using {{m}} or {{l}} instead of plain links if the words you're linking to are not English. Those would link the words to the right section. A lang parameter makes it confusing if you're linking to both English words in the explanatory text and non-English words. — justin(r)leung (t...) | c=› } 20:01, 18 December 2017 (UTC)
That works, thanks. Not sure why I didn't try this first. I've updated the docs. – Jberkel 20:07, 18 December 2017 (UTC)

Cleaning up the various names and shortcuts to Template:non-gloss definition

This is not related to the discussion above, but it was inspired by it. When you look at how many other templates redirect to {{non-gloss definition}}, there's quite a few. I propose cutting this back to only one, simple shortcut. My preference goes out to {{ngd}} as the shortcut to use: a hyphen is a little bit more difficult to type than g, and ngd is also a perfect initialism of the full name, which makes it easier to memorise. A bot can then convert all existing uses of any of the other names, including the full {{non-gloss definition}}, to {{ngd}}. This is in line with what we already do for other templates with "official" shortcuts. —Rua (mew) 20:18, 18 December 2017 (UTC)

@Rua: I typically use {{n-g}}, but {{ngd}} is probably better because it is a full initialism. I would support it being chosen as the official shortcut. — Eru·tuon 21:11, 18 December 2017 (UTC)
Yes, please. – Jberkel 21:19, 18 December 2017 (UTC)
Somehow I find {{ng}} easier to memorize, and it is probably easier to read when scanning through wikitext. Palaestrator verborum (loquier) 21:41, 18 December 2017 (UTC)
Personally, I have been using {{n-g}}, but I wouldn't mind {{ngd}} being the "official" shortcut if people are in support of it. — justin(r)leung (t...) | c=› } 21:44, 18 December 2017 (UTC)
I’ve been using {{ngd}} myself and would support it. — Vorziblix (talk · contribs) 15:06, 19 December 2017 (UTC)
I don't see why we have to limit ourselves to a single short name. Why can't we just keep both {{ngd}} and {{n-g}} (and maybe even add {{ng}}) and allow people to use whichever one they like best? —Mahāgaja (formerly Angr) · talk 16:36, 19 December 2017 (UTC)
I cannot see it either. Maybe it is for having less code in bots, but still the dictionary is made for man and not for robots. Palaestrator verborum (loquier) 16:42, 19 December 2017 (UTC)
The more names we have, the more editors have to learn and memorise in order to understand our code. If our code and naming is kept straightforward, there's less of a mental load. I would like to stick by the Python principle: there should be one, and preferably only one, obvious way to do it. —Rua (mew) 18:06, 19 December 2017 (UTC)
That’s probably why they constantly change the syntax for Python. But one must be very silly if one struggles with “learning” that {{n-g}}, {{ngd}}, {{non gloss}}, {{non-gloss}} {{non gloss definition}} mean the same. I fear it is an exaggeration to speak of a mental load. In any case if you put all the abbreviations into the documentation this argument vanishes because one learns about all forms to call it, which are very vernacular ones, if one learns about the template itself. However I don’t care if you want to standardize the usage, it is also not much mental load to use one only. Palaestrator verborum (loquier) 18:30, 19 December 2017 (UTC)
>a hyphen is a little bit more difficult to type than g
Are we actually going to consider this kind of stuff when making shortcuts... —AryamanA (मुझसे बात करेंयोगदान) 19:32, 19 December 2017 (UTC)
Some people here think every character is one too much, clarity be damned. —Rua (mew) 19:40, 19 December 2017 (UTC)
{{ngd}} is only a little if at all more clear than {{n-g}}. IMO {{non-gloss definition}} would be the best since it's obvious what it means (and condensed wikitext often scares newbies away from here), but I sure wouldn't want to type all that. —AryamanA (मुझसे बात करेंयोगदान) 02:26, 20 December 2017 (UTC)
Can we keep the redirects but allow bots to convert templates to the primary name? DTLHS (talk) 03:11, 20 December 2017 (UTC)
That was already decided against when we explicitly voted to do the opposite for {{lb}}. —Rua (mew) 22:44, 22 December 2017 (UTC)
  • "Some people here think every character is one too much, clarity be damned." -- By contrast, some people here think the opposite, and are getting increasingly concerned by the "form-over-usability" mania for obfuscated short names for everything.
I'd like to also point out Rua's comment above about Python, and emphasize: "I would like to stick by the Python principle: there should be one, and preferably only one, obvious way to do it." Let's not forget the "obvious" portion here. {{ngd}} is very non-obvious to me, and I suspect to others as well. If we are to limit ourselves to only using one name for all templates, I propose we cleave to this obviousness principle and avoid ambiguous and obscure abbreviation. I.e., {{non-gloss definition}} instead of {{ngd}}, {{label}} instead of {{lb}}, {{compound}} instead of {{com}}, {{prefix}} instead of {{pre}}, etc. etc.
If we are to allow multiple names for any given template via redirection, then I see no reason to remove redirects, short of naming collisions. If we're worried about the multitude of template synonyms causing confusion and making the wikicode harder for humans to understand, then let's set a bot to work on a regular interval and task it with turning the synonyms into the canonical template name -- ideally something obvious and easily understandable, as noted above.
‑‑ Eiríkr Útlendi │Tala við mig 23:31, 22 December 2017 (UTC)
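
The bot task that several comments above allude to (rewriting template synonyms to one canonical name) could look roughly like the following Python sketch. It is not an existing bot: the alias list is taken from the names mentioned in this thread, the choice of canonical name is left as a parameter because that is exactly what is under discussion, and MediaWiki's case-insensitive first letter is deliberately not handled.

```python
import re

# Alias names mentioned in this discussion; a real run would read the redirect list instead.
ALIASES = ["n-g", "ngd", "non gloss", "non-gloss", "non gloss definition"]


def canonicalize(wikitext: str, canonical: str = "non-gloss definition") -> str:
    """Rewrite {{alias|...}} and {{alias}} calls to use the canonical template name."""
    # Longest aliases first so that "non gloss definition" is tried before "non gloss".
    names = "|".join(re.escape(a) for a in sorted(ALIASES, key=len, reverse=True))
    # The lookahead requires "|" or "}" after the name, so {{ngdx|...}} is left alone.
    pattern = re.compile(r"\{\{\s*(?:" + names + r")\s*(?=[|}])")
    return pattern.sub(lambda m: "{{" + canonical, wikitext)


print(canonicalize("# {{n-g|Used to express surprise.}}"))
print(canonicalize("# {{n-g|Used to express surprise.}}", canonical="ngd"))
```

Whichever name is chosen as canonical, the conversion itself is the same; only the `canonical` argument changes.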

Voting right in cases of vote extensions

Wiktionary:Voting policy § Voting eligibility couples the right to vote to the start time of the vote. A teleological interpretation of the prescriptions suggests that the provisions referring to the start time of the vote have to be applied by analogy to the times of vote extensions, the ratio legis of the provisions being to deter overruns of votes by meatpuppets as has happened with Yugoslavs, while extended votes are those which demonstratedly do not concern the interests of which the voting system is protected by the self-same provisions and each of such extended votes as well can be seen as two votes, the second one being a repetition of the first vote with the old castings of votes taken over with the rebuttable presumption of continuity.

In accordance with these reasonings, I consider myself who has performed his first edit on Oct. 4th 2017 and has collected fifty edits in an immaterial time after it eligible to vote in Wiktionary:Votes/2017-07/Templatizing topical categories in the mainspace 2 which started on Aug. 6th 2017 but has been extended on Nov. 26th 2017. However I want to point out this unfortunate wording lest votes that need voters have fewer legitimate participants than they could have because of legitimate voters being bedazzled by the wording of the provisions about voting eligibility. In case of inaction concerning the voting policy wording I have here at least laid down reasonings for reference. Palaestrator verborum (loquier) 17:31, 19 December 2017 (UTC)

The extension changes the end of the vote, not the start. Doesn't seem at all unclear to me. A vote which was scheduled to start at one time and had that start delayed for some reason is possibly open to interpretation, but not an extended vote. - TheDaveRoss 19:21, 19 December 2017 (UTC)
But the start of a vote has never been delayed. The votes just run without sufficient participants and then they are started again, with new attempts to publicize them. Palaestrator verborum (loquier) 19:37, 19 December 2017 (UTC)
They aren't actually started again, but extended. The start of the vote is still the original start date. — justin(r)leung (t...) | c=› } 19:40, 19 December 2017 (UTC)
You say this because it has become common to speak of extension in that thing and not of starting. But the concept of extension does not exclude a restart, nor does the appellation change the nature. This is why I say that the expressions do not help here, just the telos. Palaestrator verborum (loquier) 19:50, 19 December 2017 (UTC)
The practice, letter, and spirit of the rules are all in accord here; the start of the vote is the start, extensions do not alter the start time. - TheDaveRoss 19:58, 19 December 2017 (UTC)
Of course they don’t alter the start time – but they can add one, as the concept of extension does not exclude a restart. Meaning then there are two start times. I have already supra disproven that the spirit of the rules harmonizes with their wording. Palaestrator verborum (loquier) 20:04, 19 December 2017 (UTC)
I would be interested to hear if anyone else agrees with your interpretation; I think the concept of an extension is clear and does not alter the start time, or add additional start times, or change anything else about the start time or the eligibility criteria. - TheDaveRoss 20:20, 19 December 2017 (UTC)
Even if you do not say there is a second start time (which is implicit if it is) there is still the possibility of analogical application. Because why shouldn’t it be? As I have read the archives, the eligibility policy has been devised to prevent that the whole Balkan votes even though mostly never having appeared before in working on the dictionary and things like that. But if a vote is extended, there has been no voting rush and it is as if there were a clean start. Palaestrator verborum (loquier) 20:31, 19 December 2017 (UTC)
If it is unclear, I suppose that it could be reworded in some way to make it more clear, but in all similar circumstances that I can think of an extension does not involve a restarting, but a continuation. That is actually a fundamental component of the word extension as I know and use it. If this whole discussion is simply a very long-winded way of asking if you can be made eligible to vote in the vote you linked to above, I would say no. - TheDaveRoss 20:51, 19 December 2017 (UTC)
That I can be made eligible is of course not possible. Either I have been from the beginning or I am not. This is just an exposition of my thoughts favoring casting the vote which itself is not very relevant, but it shan’t be thought about too late, as the unclearness is seen and is not beseeming in such a matter. I have instantly thought what start time means on such occasions when I thought about the vote. If afterwards one talks about extension or whatever hardly matters because what matters are of course the factual occurrences to which the rules are applicable, not how they are called. We need to hear a bit more what people think about the clearness of the wording. Hey @Wikitiki89, I find your judgement interesting, for you have on some occasions complained about unclear wordings. Palaestrator verborum (loquier) 21:27, 19 December 2017 (UTC)
So if a professor gives a student an extension on a paper, does that mean that they've reassigned it? I think your interpretation of the word "extend/extension" is idiosyncratic, so there's no worry about ambiguity.... Andrew Sheedy (talk) 21:08, 19 December 2017 (UTC)
But if an administrative authority extends or modifies an administrative act, or even a different (higher) authority does it, it does never matter, the whole arrangements affecting a claimant are then attacked in court in any case. We can surely find many more similes and the matter becomes even more blurry. Usage differs across topics. And again, how it is called does not change its nature. An authority can issue a “notice” and it might be a legal act. Palaestrator verborum (loquier) 21:27, 19 December 2017 (UTC)
Extension or modification of an administrative act is distinct from extension of time. I cannot think of an extension of time where it would mean the event is restarted. — justin(r)leung (t...) | c=› } 05:32, 20 December 2017 (UTC)

Disabling PyWikiBot's cosmetic changes ('cleanUpSectionHeaders')

Here Framawiki requested that I gain community consensus in order for the task to be done. Basically, I want PyWikiBot to not "cleanup" headers by adding spaces to the beginning and end (==This== to == This ==). From what I have seen the convention of having no starting/ending spaces in the header is a universal convention across this wiki, so consensus should be easy to get. -Xbony2 (talk) 00:16, 20 December 2017 (UTC)

It's codified in WT:NORM. —Rua (mew) 00:36, 20 December 2017 (UTC)
That's roughly how I felt too. -Xbony2 (talk) 22:38, 21 December 2017 (UTC)
Agree with everything. - TheDaveRoss 13:02, 22 December 2017 (UTC)
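
To make the convention concrete, here is a small standalone Python sketch that rewrites padded headers (== This ==) into the unpadded form (==This==) described above. It is not part of PyWikiBot and does not show how to switch off its cosmetic changes; the function name is an assumption made for illustration.

```python
import re

# A header line: 1-6 equals signs, optional padding, the title, optional padding, same run of equals signs.
_HEADER = re.compile(r"^(={1,6})[ \t]*(.+?)[ \t]*\1[ \t]*$", re.MULTILINE)


def strip_header_padding(wikitext: str) -> str:
    """Remove spaces between the equals signs and the title of each section header."""
    return _HEADER.sub(lambda m: m.group(1) + m.group(2) + m.group(1), wikitext)


sample = "== English ==\n\n=== Noun ===\n# a thing"
print(strip_header_padding(sample))
```

Lines that are not headers, such as definition lines starting with #, are left untouched because the pattern only matches lines that begin and end with the same run of equals signs.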

Wiktionary:Christmas Competition 2017

FYI, this is happening now. —AryamanA (मुझसे बात करेंयोगदान) 23:40, 20 December 2017 (UTC)

Transcribing "The Art of Grammar"

Is anyone interested in helping to transcribe The Art of Grammar over at Wikisource? I'm happy to start the project but don't want to do it entirely on my own. I also need some help with Ancient Greek (it's an English translation but has many untranslated words in it). The transcription would make a nice complement to some of our entries here. – Jberkel 13:52, 21 December 2017 (UTC)

@Jberkel: I can help with the Greek terms. Is it on Wikisource yet? --Per utramque cavernam (talk) 18:01, 21 December 2017 (UTC)
@Jberkel: I can help too. I've long wished there was a translation easily accessible. (I had to translate some passages for Wikipedia in the past.) — Eru·tuon 18:14, 21 December 2017 (UTC)
Thanks for your offer to help, that's great. There's also a proofreading stage at the end for those who want to help but don't want to transcribe. @Per utramque cavernam: it's not on Wikisource yet, I'll create the project skeleton and let you know once it's ready. – Jberkel 23:35, 21 December 2017 (UTC)
@Per utramque cavernam, Erutuon: Here's the index: s:en:Index:The_grammar_of_Dionysios_Thrax.djvu. I've added headings and started with the first two pages, leaving all the Greek words as (GREEK) in the text. I think it makes sense to directly link them to the corresponding entries here. Also see s:Wikisource:Style_guide. – Jberkel 14:08, 22 December 2017 (UTC)
Wikisource has a template s:Template:greek missing which you can use to indicate missing Greek characters. Pages with that template are put into s:Category:Pages with missing Greek characters. —Mahāgaja (formerly Angr) · talk 15:00, 22 December 2017 (UTC)
That's very handy, didn't know about it. Jberkel 15:11, 22 December 2017 (UTC)
I've created s:Template:togrc to convert ASCII to Greek in the manner of {{chars}}. — Eru·tuon 20:35, 22 December 2017 (UTC)
Excellent! I was switching tabs between Wiktionary and Wikisource to do it. —Mahāgaja (formerly Angr) · talk 07:49, 23 December 2017 (UTC)


I think this user is adding bad pronunciations even after being alerted. That one at Eminem isn't right, is it? Equinox 12:35, 22 December 2017 (UTC)

Corrected. —Mahāgaja (formerly Angr) · talk 13:19, 22 December 2017 (UTC)

Template shortcuts

We could institute a grandfather clause: experienced users would be allowed to use whatever shortcut they're used to, which would be converted by bots as DTLHS suggested, but new users would be compelled to use the official shortcut or name. --Per utramque cavernam (talk) 16:16, 22 December 2017 (UTC)

That is utterly unenforceable. DTLHS (talk) 21:24, 22 December 2017 (UTC)
So we can keep WT:WF? Sweet! --Gente como tú (talk) 21:15, 23 December 2017 (UTC)

new namespace

hello. I created a new namespace for "related terms" and "derived terms": Related terms:Russian/говорить. it is not perfect, but it is useful to see all related words. Thank you. --2A02:2788:A4:F44:B529:D6B9:E82A:9F1F 23:50, 22 December 2017 (UTC)

Unless you are a server admin, you didn't create a new namespace. —Rua (mew) 00:02, 23 December 2017 (UTC)
Deleted. As Rua says, you've only created a sub-sub page of Related terms. Even if you could create a new namespace, you would want the structure to parallel that of the entries, which goes by spelling, then by language. On top of that, the only way such a scheme could possibly work is if there were only one "Related terms" section per language. Once you get into multiple "Etymology" sections, each with their own "Related terms" section, you run into the problem of "Etymology" sections being added, merged, deleted, rearranged and renumbered, without any way to guarantee that the corresponding "Related terms" entry will be moved, and with the messiness of moving without admin rights. The moral of the story: don't try to rearrange the structure of the site without discussing it here, first. Chuck Entz (talk) 04:07, 23 December 2017 (UTC)
@Chuck Entz that was a bit harsh to delete the page, it took me time to prepare, fortunately i had a notepad backup. I've created an account and put the page at User:Пикап/Related terms:Russian/говорить.
is that work useless and unwelcome? look at related terms of переговорный ("negotiations"): there is говорун, which is related (etym), but it means "person who talks a lot", so it has almost nothing to do there. this is problem with Russian where you have long lists of related (etym) words on every entry, because there is no centralised place. But they are distracting because the meaning is very different. --Пикап (talk) 13:19, 23 December 2017 (UTC)


Can someone help me with the definition line? "An" should be added before "island" but the definition ID code isn't letting me. ---> Tooironic (talk) 03:40, 24 December 2017 (UTC)

Fixed. DTLHS (talk) 03:45, 24 December 2017 (UTC)

Wikidata request for comment on the ideal data import process

Dear all

We are currently running a discussion on Wikidata about what the ideal data import process looks like. We want to get the thoughts of people who work on different Wikimedia projects, and who have different needs and knowledge of different kinds of data, to make our roadmap as inclusive as possible. Please take a look.

Many thanks

John Cummings (talk) 01:17, 25 December 2017 (UTC)

Formatting Serbo-Croatian cognates and cognates in other languages with more than one script

This edit raises the question about the proper formatting of cognates in languages that use more than one script. As a result of the edit, the designation of the language is currently duplicated on kabát, revealing it thereby as conducive to suboptimal appearance. Two approaches that crossed my mind were settling for nothing more than square brackets for the lemma with the second script (undone by the edit) or replacing in these cases Template:cog with Template:m which does not duplicate the designation of the language. Other ideas? Examples for such languages are Serbo-Croatian (Cyrillic and Latin), Kurdish (Arabic and Latin), Kazakh (Cyrillic and Latin) etc. The uſer hight Bogorm converſation 19:00, 25 December 2017 (UTC)

@Bogorm: The option I would choose is {{cog|sh|кавад}}/{{m|sh|kavad}}. — Eru·tuon 19:10, 25 December 2017 (UTC)
I prefer what I did in the first place, which is to just choose one script rather than redundantly showing both of them. Erutuon's solution is my second choice. —Mahāgaja (formerly Angr) · talk 19:15, 25 December 2017 (UTC)
I don't like the amount of duplication we have for Serbo-Croatian. Most entries are copies of each other. We should choose Latin script as the main one and turn Cyrillic into alternative forms or some other designation. Latin script is the most used, so it makes sense. —Rua (mew) 19:55, 25 December 2017 (UTC)
That would be somewhat politically incorrect, although we already redirect British spellings to American, so maybe it's not a big deal, it would simplify things massively. Crom daba (talk) 21:13, 25 December 2017 (UTC)
Chinese is a better example. We could also go the opposite way and split them into Serbo-Croatian Knjižni Jezik and Serbo-Croatian Novosrpskohrvatski. —Rua (mew) 21:17, 25 December 2017 (UTC)
We don't really redirect "British" spellings consistently, and that definitely isn't a policy, although some editors may choose to do so. DTLHS (talk) 21:19, 25 December 2017 (UTC)
I use {{cog|sh|ка̏ва̄д}}/{{m|sh|kȁvād}} consistently – don’t know how easy it is for normies to write both alphabets in quick succession (I have a rare setup), but that (currently least objectionable) usage is the reason why the Cyrillic is not automatically transcribed. Ignore politics, but aesthetically the Cyrillic script has no wrong to be ignored. Serbo-Croatian in Cyrillic is so fair! And for the reason that nj and lj can be нј and лј as well as њ and љ, though rarely, it could even be chosen as the main script for technical reasons. We could then develop some template that is placed in the Latin page and does nothing but convert the whole Cyrillic page for readers of Latin – no double-creating anymore, only creation in Cyrillic and after each instance of it placement of a template at the Latin entry; I have already suggested a conversion mechanism for the translation tables in Wiktionary:Grease pit/2017/December § Bring Serbo-Croatian entry and translation additions technically up to date to accelerate the treatment of the language. Something similar can be made with Ijekavian and Ekavian (too bad Serbo-Croatian does not just use glorious ѣ, then we would not have this problem). Redirects of course do not work as some spelling could be a word in another language.
I have in the case of the Kazakh alphabets said that the dictionary can of course make editorial decisions for alphabets so as not to be bothered by governmental whimsies, but this would not even be so radical because we would still show both alphabets but have less work for editors (unless one needs to learn to write Cyrillic first, which technically is necessary anyway for those who create Latin Serbo-Croatian entries without the corresponding Cyrillic ones [a huge annoyance]) Palaestrator verborum (loquier) 22:04, 25 December 2017 (UTC)
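
The technical point above, that Cyrillic determines the Latin spelling but not the other way round (Latin nj can stand for њ or for нј), is what makes a one-way conversion feasible. A minimal Python sketch of such a Cyrillic-to-Latin conversion follows; it is not an existing Wiktionary module, the function name is an assumption, and accent marks written as combining characters are simply passed through unchanged.

```python
# Standard Serbo-Croatian Cyrillic letters and their Latin counterparts.
_CYR_TO_LAT = {
    "а": "a", "б": "b", "в": "v", "г": "g", "д": "d", "ђ": "đ", "е": "e",
    "ж": "ž", "з": "z", "и": "i", "ј": "j", "к": "k", "л": "l", "љ": "lj",
    "м": "m", "н": "n", "њ": "nj", "о": "o", "п": "p", "р": "r", "с": "s",
    "т": "t", "ћ": "ć", "у": "u", "ф": "f", "х": "h", "ц": "c", "ч": "č",
    "џ": "dž", "ш": "š",
}


def cyrillic_to_latin(text: str) -> str:
    """Convert Serbo-Croatian Cyrillic text to Latin script, letter by letter."""
    out = []
    for ch in text:
        lower = ch.lower()
        latin = _CYR_TO_LAT.get(lower)
        if latin is None:
            out.append(ch)  # not a Cyrillic letter (space, accent mark, punctuation)
        elif ch != lower:
            out.append(latin.capitalize())  # uppercase input: Љ becomes Lj
        else:
            out.append(latin)
    return "".join(out)


print(cyrillic_to_latin("кавад"))
print(cyrillic_to_latin("Љубав"))
```

The reverse direction would need a list of words in which нј, лј or дж are separate letters, which is why this sketch only goes from Cyrillic to Latin.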

Emoji as translingual "translations" of normal words

Is this appropriate? e.g. [4], [5]: we are translating the words taco and banana into the symbols or depictions 🌮 and 🍌. Equinox 20:46, 25 December 2017 (UTC)

They are not wrong, but perhaps we want a template to show the emojis on a different place – except we already have a fitting one, which I do not know about. Or maybe we should use the emojis as parts of the glosses 😅? It is perhaps better than a {{wikipedia}}-style mention for cases where we want to translate words for feelings to emojis or else where there is no 1-to-1 match but a rough one. Though I warn that Wiktionary will look retarded if we will use emojis for glossing. On the other hand it is perhaps an accessibility plus. Simple English Wiktionary would probably do it. Palaestrator verborum (loquier) 22:19, 25 December 2017 (UTC)
One reason why they don't seem like "translations" is that a symbol and a word may exist for the same concept in the same language (e.g. blue letter P for "parking"): not all symbols are translingual. But it's also odd to call them "synonyms" when they are not (generally) substitutable into a written sentence. Equinox 22:30, 25 December 2017 (UTC)
That little word “generally” is revealing. It means they are substitutable if the rules of style permit it – yeah, oral speech falls completely out for these signs and much other speech because one does not generally want to use it, as long as one does not exert oneself to think in emoji. The only requirement is analytic grammar, though yeah, I like to put Russian endings after writing Latin words, so I could write: “Не ешь моего 🍍а.” But it appears to me that such dependence on endings inhibits usage of emojis at least somewhat, and that the Japanese not having many of them is the reason why emojis have gained the most prominence with them.
For the word “translation”, it is only the general idea that it takes place between different languages while we even fail to distinguish the borders of a language. This way one can translate J. M. E. McTaggart for analytical philosophers or similarly between sub-languages, to show an exaggerated example. Palaestrator verborum 🎢 sis loquier 🗣 22:51, 25 December 2017 (UTC)
I would say that that is wrong. Just because ideographs have become more common lately due to Emoji, it does not change the fact that one could theoretically draw a picture of anything to try and convey an idea, that is exactly not language. The cases where I think Emoji might merit inclusion are those where the image has taken on a secondary meaning, e.g. the eggplant. A picture of a taco as a translation of taco should be avoided. - TheDaveRoss 13:55, 27 December 2017 (UTC)

See Category:Translingual translations for all entries that have Translingual translations. I suggest removing all of these Translingual translations and keeping this category empty forever. Even the word Japan is currently "translated" as "🗾" (which is just a map of Japan encoded as an emoji). I also suggest using the "see also" section in normal word entries to link to emoji. I support using the "see also" this way in all entries where a word can be reasonably linked to an emoji. --Daniel Carrero (talk) 15:25, 27 December 2017 (UTC)

I support this removal as well. —Μετάknowledgediscuss/deeds 00:50, 28 December 2017 (UTC)
Oops! I wish to make a correction: some Translingual translations are about taxonomic names of species. The entry rufous-bellied kookaburra has this Translingual translation: Dacelo gaudichaud. I only support removing all emoji translations. In my opinion, the taxonomic translations can stay. --Daniel Carrero (talk) 01:01, 28 December 2017 (UTC)

Watchlist headers

I was initially bold about this, then decided it was probably better to ask first. There are a number of things included at the top of the watchlist page, including {{votes}}, {{smallest discussions}}, a bunch of "utility" links, and the "not counted" message. Personally I would prefer to remove all of these links from the default page, and allow people to add them to their own watchlist if they would like to. I feel that the watchlist is an "opt in" page, and should have as little content forced on the user as possible. Can we remove some or all of the extraneous content from the default page view? It would be easy for individuals who would like the content to restore it using their .js page or a gadget. - TheDaveRoss 16:38, 26 December 2017 (UTC)

I suggest keeping at least the {{votes}} there by default. Reason: votes can cause major changes in the website, so in my opinion it's best to let everyone know, by default, what are the present votes. I'm perfectly fine with letting people use CSS and/or JS to hide the box if they want. --Daniel Carrero (talk) 15:38, 27 December 2017 (UTC)
That template is already included on several other high-traffic pages. The watchlist is an opt-in page in all other regards, why should we impose what we think others should see on their own watchlists? - TheDaveRoss 16:03, 27 December 2017 (UTC)
I, for one, rarely see the top of RfD, RfV, TR, RfC and other similar high-traffic pages. As we obviously don't want too much democracy in our votes, we wouldn't want to have some kind of more aggressive push notification (eg, e-mail notices to all opted-in users for just-started or about-to-close votes). Thus, we need some form of conspicuous, yet discriminatory, notification. For this watchlist notification is pretty good, not being obviously undemocratic. DCDuring (talk) 16:21, 27 December 2017 (UTC)
You also have it on your user page, and it is at the top of BP as well. - TheDaveRoss 16:40, 27 December 2017 (UTC)
The votes are shown in the watchlist as per consensus in Wiktionary:Beer parlour/2015/September#Should we display the active votes in the watchlist?. Maybe it's correct that we are imposing what "we" think others should see on their own watchlists -- but "we" is not you and I; "we" is the community consensus, which may either change or persist. In my opinion, that's a good thing, in addition to the other reason I gave in my message above.
Dave, the template is not at the top of BP. What are the other high-traffic pages you mentioned? In Special:WhatLinksHere/Template:votes, I see the template is mostly linked in a few BP discussions. Or maybe would you suggest we should add the template in any other pages that don't have the template yet?
Plus I don't think it's actually true that the watchlist is an opt-in page in all other regards. These things are shown there by default: the "utilities" (logs, new, cleanup, etc.) and the wanted entries.
I support automatically sending an e-mail to all Wiktionary users whenever a vote page is created. Or just opted-in users. Who can make this happen? --Daniel Carrero (talk) 16:42, 27 December 2017 (UTC)
I think you would see the downside of populist democracy with that innovation. Who knows? We might be taken over by deplorables. DCDuring (talk) 16:45, 27 December 2017 (UTC)
As I'm sure you know, Wiktionary:Voting policy#Voting eligibility has some eligibility restrictions. Personally, I support that, which seems to be a measure against people who never edited Wiktionary just coming here to vote and then going away without making any actual contributions. That's the second best system for making decisions I can think of. The best one would be making me King of Wiktionary, enabling me to unilaterally make decisions just by issuing royal decrees. --Daniel Carrero (talk) 17:00, 27 December 2017 (UTC)
One alternative to populist democracy is that we try to put a good product out there, with a highly select, but mostly self-selected, group of active contributors being the participants in votes. We then let the users decide whether or not they like the product. That's pretty much what we do now. We could go the extra mile and take the trouble to actually find out what users, including lurkers and passive users, actually use, but we seem to like relying on our own whims most of all. DCDuring (talk) 17:32, 27 December 2017 (UTC)
But why actually find out what users want when we could instead spend our time whining about it and derailing discussions, right? —Μετάknowledgediscuss/deeds 17:48, 27 December 2017 (UTC)
We do have WT:FEED. --Daniel Carrero (talk) 18:19, 27 December 2017 (UTC)
No one, including me, spends much time looking at WT:FEED, though perhaps we should. What I think we would usually want is answers to the questions that might inform our efforts, as to both form and content. As surveys are nearly as selective as voluntary random feedback, they're not enough. Someone with the technical chops to count clicks using the squid(?) server could help. DCDuring (talk) 19:00, 27 December 2017 (UTC)
E-mails? It’s like those Wikipedias that automatically sent me an e-mail just when I visit a single page on one of them via interwiki links. Hugely annoying. Also not targeted enough. So only opt-in is acceptable. As for {{smallest discussions}}, I do not see any use in it. Meseems however that {{votes}} is at its most appropriate place. Palaestrator verborum sis loquier 🗣 17:16, 27 December 2017 (UTC)
@Daniel, sorry about the BP comment, it is linked to but not transcluded, my mistake. Re the "consensus" in the link you shared, it looks like a few people were OK with the idea and a few weren't, and it happened. I wouldn't call it consensus per se. I agree that there is a lot of stuff on the watchlist which is not opt-in, my suggestion is to remove all of that so that the watchlist returns to being in the control of the user. Put these types of things in the community space, on BP and TR etc. - TheDaveRoss 18:48, 27 December 2017 (UTC)

Enable "If you have time, leave us a note." for everyone, not just the anons

There's this message in the sidebar: "If you have time, leave us a note."

The message is linked to this specific webpage, where "example" is the current entry: https://en.wiktionary.org/w/index.php?title=Wiktionary:Feedback&action=edit&section=new&preload=Wiktionary:Feedback%2Fpreload&editintro=Wiktionary:Feedback%2Fintro&preloadtitle=%5B%5B%3Aexample%5D%5D

In other words, it enables the person to post a new message at WT:Feedback.
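For anyone who wants to see how that link is put together, here is a rough sketch that rebuilds it from its query parameters. The function names feedback_url and same_query are made up for this illustration; the parameter names and values are taken from the URL quoted above. Note that the generated string percent-encodes a few more characters (such as the colon) than the sidebar link does, but it decodes to the same parameters.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

INDEX_PHP = "https://en.wiktionary.org/w/index.php"


def feedback_url(entry: str) -> str:
    """Build a 'new section' edit link for WT:Feedback about one entry."""
    params = {
        "title": "Wiktionary:Feedback",
        "action": "edit",
        "section": "new",                            # start a new section
        "preload": "Wiktionary:Feedback/preload",    # initial text of the edit box
        "editintro": "Wiktionary:Feedback/intro",    # notice shown above the edit box
        "preloadtitle": f"[[:{entry}]]",             # section heading: link to the entry
    }
    return INDEX_PHP + "?" + urlencode(params)


def same_query(url_a: str, url_b: str) -> bool:
    """True if both URLs decode to the same query parameters."""
    return parse_qs(urlsplit(url_a).query) == parse_qs(urlsplit(url_b).query)
```

Calling feedback_url("example") should give a link equivalent to the one above; whether the sidebar script builds it exactly this way is an assumption on my part.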

My point is: I only see the sidebar message when I'm not logged in. When I'm logged in, the message disappears. Apparently, all logged users are unable to see that link to leave feedback, aren't they? I would suggest leaving the sidebar message on at all times. As we know, some users might have created accounts on Wikipedia and other projects and may not have any experience with using Wiktionary.

See the "WT:FEED" section at MediaWiki:Common.js to configure this. --Daniel Carrero (talk) 18:18, 27 December 2017 (UTC)

Support Rua (mew) 18:21, 27 December 2017 (UTC)
Support DCDuring (talk) 18:49, 27 December 2017 (UTC)
Support Yea, that argument about logged-in users from other Wikimedia projects is strong. Their opinions are likely even more constructive. Palaestrator verborum sis loquier 🗣 19:24, 27 December 2017 (UTC)
Support Μετάknowledgediscuss/deeds 00:49, 28 December 2017 (UTC)

Wiktionary:Votes/2017-07/Templatizing topical categories in the mainspace 2 and the increasing use of templates on Wiktionary

I thought it worth drawing attention to this vote which is due to end within the next 24 hours. If the vote passes, it will have an effect on most entries and yet only a small proportion of editors have voted on it so far.

I have only just started editing here again after a break, and I've noticed that there have been moves to add templates to entries where there was none before, with little obvious benefit but the obvious disadvantage of creating a backlog of thousands upon thousands of edits that need to be made to bring entries in line with whichever new standard has been introduced. It creates a situation where only a small proportion of words represent the latest standard and most will never be updated before yet another template is introduced, if ever at all. Kaixinguo~enwiktionary (talk) 21:50, 27 December 2017 (UTC)

@Kaixinguo~enwiktionary Do you see benefit in diff? The categories covered most of the editing field. I don’t see a backlog, it is said to be a bot job. Why do you esteem it possible that another template is introduced? Palaestrator verborum sis loquier 🗣 22:49, 27 December 2017 (UTC)
Lately I've started to place topical categories before the senses they apply to, like we already do with labels. It makes more sense that way. —Rua (mew) 22:55, 27 December 2017 (UTC)
OK, thanks for the example. It does demonstrate a potential advantage to the change to a template. I would still rather see the word 'category' used as I feel that 'C' alone is not self-explanatory. Regarding other templates, amongst the words on my watchlist I have noticed the recent introduction of a template for alternative forms (template:alter), template:l being used around English translations, the template for categories mentioned in this vote, and also template:syn suddenly introduced for synonyms with synonyms being moved in the page as well. Kaixinguo~enwiktionary (talk) 23:10, 27 December 2017 (UTC)
I know- if I can't keep up with the changes in the templates then I will just have to stop editing or choose a task that doesn't need any understanding of templates. That may be for the best, I just wanted to raise this point. Kaixinguo~enwiktionary (talk) 23:17, 27 December 2017 (UTC)
Looks like the vote isn’t going to pass anyway. Proper collation defeated once again! — Ungoliant (falai) 23:01, 27 December 2017 (UTC)
I would vote support if we didn't use the horrible {{cat}} name. —Rua (mew) 23:38, 27 December 2017 (UTC)
I find it very silly to block progress that you agree with on account of your preferred name being one that almost nobody else supports. —Μετάknowledgediscuss/deeds 00:48, 28 December 2017 (UTC)
On the contrary, I think it would be detrimental to use a template with a name that does not reflect its purpose. By voting against the name, I am helping to keep template naming clearer. —Rua (mew) 23:53, 28 December 2017 (UTC)
I think we have ended up with no consensus for both {{c}} and {{cat}}, so users should be free to choose, {{topics}} is doomed, and {{Category}} will be around for a long time yet. I hope DP doesn't ask for another extension. DonnanZ (talk) 00:10, 29 December 2017 (UTC)

Oi, I have an idea for another try, which could make all glad that the votes failed: Combine Wiktionary:Votes/2017-11/Placing Wikidata ID in sense ID of proper nouns and Wiktionary:Votes/2017-07/Templatizing topical categories in the mainspace 2 into one vote so that {{senseid}} (or a template of a different name, for distinction) categorizes instead of the [[Category:Stuff]] command or a template for wrapping categorization. This should work for almost anything, innit? Flora, fauna, locations on earth and in the sky, all the things in Category:en:All sets as distinguished by Category:en:All topics (or even in those? I do not understand yet how {{autocat}} and family have their category structures, I need a lecture on it as I just create those topic categories for Arabic that already exist for English). And to get the things in Category:en:All topics too we could make a master template: {{collect|en|Wikidata ID|s1=subsidiary categorization|s2=subsidiary categorization 2}}. This should reduce complexity? Palaestrator verborum sis loquier 🗣 23:30, 27 December 2017 (UTC)

Information desk header wording

I know this is a bit meticulous. But I was reading the header for Wiktionary:Information desk, and I noticed something that was a bit off. It says:

Welcome to the Information desk of Wiktionary, a place where newcomers can ask questions about words and about Wiktionary, ask for help, or post miscellaneous ideas that don’t fit in any of the other rooms.

I don't think this is necessarily true. The wording of the sentence suggests that only newcomers should ask questions about things at ID. I ask questions there a lot, and I've been here for almost 4 years. I saw Equinox ask a question in ID recently, for instance, too, and he's an admin and been here since 2009. I understand that it may be mainly newcomers coming to the ID, but the only problem I have with this wording is the sort of conclusion that some might come to with this wording that people who aren't newcomers shouldn't ask questions if they have trouble. I know it's not suggesting this necessarily, but still it gives off the wrong feeling.

I propose that it should be reworded to:

Welcome to the Information desk of Wiktionary, a place where users can ask questions about words and about Wiktionary, ask for help, or post miscellaneous ideas that don’t fit in any of the other rooms.

If it is necessary to mention newcomers, how about:

Welcome to the Information desk of Wiktionary, a place where newcomers or other users can ask questions about words and about Wiktionary, ask for help, or post miscellaneous ideas that don’t fit in any of the other rooms.

But I would not recommend this, because we probably want to keep it simple. PseudoSkull (talk) 23:49, 28 December 2017 (UTC)

"where new and experienced users alike"? —Rua (mew) 23:51, 28 December 2017 (UTC)
If you're an experienced user you'll know already that it's allowed to ask question there, thus the change is redundant. Crom daba (talk) 00:30, 29 December 2017 (UTC)

I have been bold and made the change from "newcomers" to "users". SemperBlotto (talk) 06:40, 29 December 2017 (UTC)

English irregular verbs

How can I get a list of verbs which are regular in standard modern English, but used to be strong verbs (i.e. have a past simple and/or past participle tagged as "archaic" or "obsolete")? An example: yield.

Proposed new rule: don't repeat things on the headword line

Right now there are lots of non-lemma entries that have gender and/or number information in the definition, but then also have this same information in the headword line for no apparent reason. I find this ugly and unnecessary, so I propose a new rule: if information is already presented in the definition, that information should not also be present on the headword line. The headword line isn't there to give definitions, and if we're already duplicating gender and number information there, what's next, cases? For the simple fact that we don't place cases in the headword line if they are in the definition, we shouldn't do so with gender or number either. —Rua (mew) 20:46, 31 December 2017 (UTC)

For anyone who wants example cases, I found a few from Rua's contributions. diff, diff.
As for my opinion, I without a doubt support this. I find this annoying myself, as it's unnecessary to mention this for inflected forms; only in the lemma entry. It's always been an afterthought for me, though, and I can't recall ever making a fuss about it, but huge thanks for bringing this up. A fuss needed to be made. PseudoSkull (talk) 22:52, 31 December 2017 (UTC)

Oppose I do find that including gender in inflections is useful, particularly when a noun has more than one gender. In Norwegian the inflections for words that are both masculine and neuter can be pretty confusing, so introducing a rule like this would be a backward step, and achieve absolutely nothing. DonnanZ (talk) 23:44, 31 December 2017 (UTC)

@User:Donnanz This is a valid point. Perhaps then you might consider a conditional support, for languages such as Norwegian where it is especially confusing as you say? PseudoSkull (talk) 00:03, 1 January 2018 (UTC)
I think we need to wait for other comments. DonnanZ (talk) 00:18, 1 January 2018 (UTC)
Note that the proposal only concerns cases where the gender is already mentioned in the definition. With noun forms this is not usually the case, so the gender would be kept there, although number would be removed. I'm not personally in favour of keeping gender in that position either, since it still duplicates information from the lemma. But I'm leaving it unchanged in this proposal. —Rua (mew) 00:51, 1 January 2018 (UTC)
What do you mean by number? DonnanZ (talk) 01:01, 1 January 2018 (UTC)
w:Grammatical number. In practice, it's mostly plural. —Rua (mew) 01:06, 1 January 2018 (UTC)
I somehow fail to imagine cases when the grammatical gender, i. e. the noun class and not the number is in the definitions.
But I can imagine the genders in the heads of plural entries of Arabic removed. أقطان says about the forms they are masculine plural though plurals of inanimate masculine nouns agree with the feminine singular in Arabic. Palaestrator verborum sis loquier 🗣 01:59, 1 January 2018 (UTC)

January 2018

Cleaning up after long-term abuse by Zeshan Mahmood

The user account Zeshan Mahmood is globally locked [6] because of cross-wiki abuse. In a recent sockpuppet investigations case on en.wiki, it turned out that they have extensively edited from IPs. IPs that geolocate to the same area and that follow the same editing pattern have been active here as well; the following IPs match the description: [7] [8] [9] [10]. It's likely there have been many more. This user's edits were occasionally helpful, but they have often added spurious content and created hoax articles (there's one obvious example). I'm leaving it to the community to decide what is the best way of dealing with their legacy.

Though it is not highly desirable content at this stage in Wiktionary's development, I don't see any 'obvious' problem with the example you highlight, the entry for Karachi-Bela Division. Could you explain how it is a hoax? DCDuring (talk) 17:51, 1 January 2018 (UTC)
It would seem there is no "Karachi-Bela Division" of Pakistan. Bela, Pakistan and Karachi both exist, but they're in separate provinces. Divisions of Pakistan does suggest there used to be a Karachi-Bela Division, but that info could have been added by this blocked user. —Mahāgaja (formerly Angr) · talk 18:10, 1 January 2018 (UTC)
The Wikipedia article doesn't exist, so I removed the reference. Probably a candidate for RFD. DonnanZ (talk) 19:05, 1 January 2018 (UTC)
I straight up deleted it. No such thing. It was also rv'd on Wikipedia: [11]. All the Google hits seem to be WP mirrors that haven't been updated. —AryamanA (मुझसे बात करेंयोगदान) 19:28, 1 January 2018 (UTC)
If you look at the block log for some of these accounts, you'll notice we dealt with this person back in 2013 & 2014- it was obvious at the time what they were up to. My impression at the time was that this was a typical expat wannabe hardliner trying to rewrite geographical terminology to fit their Pakistani/Islamist worldview. I think you'll find that the bogus geographical entities are what would exist if certain things like the Indian occupation of territories claimed by Pakistan hadn't happened.
I agree, though that we never properly cleaned up their edits- most of the problems were in content that's rather marginal for Wiktionary, so our review is rather hit-and-miss. Chuck Entz (talk) 22:04, 1 January 2018 (UTC)

January LexiSession: Happy New Year!

LexiSession is back! Ok, I didn't have time to write you a notice last month, sorry. We looked at tea.

This month, we are gonna be improving the pages describing words related to New Year celebrations, all around the globe. It could be interesting.

Well, for those who do not know LexiSession yet, it is a collaborative transwiktionary experiment. You're invited to participate however you like and to suggest next month's topic. If you participate, please let us know here or on Meta, to keep track of the evolution of LexiSession. I hope there will be some people interested this month, and if you can spread it to another Wiktionary, you are welcome to do so. Ideally, LexiSession should be a booster for every Wiktionary on the same agenda, to give us more insight into the ways our colleagues work in the other projects.

I hope that 2018 will be a year in which LexiSession increases in participants and page-creations! Noé 20:04, 1 January 2018 (UTC)

I created uvas de la suerte. --Gente como tú (talk) 14:59, 2 January 2018 (UTC)

Wiktionary:Votes/sy-2018-01/User:Nloveladyallen for admin

There's an adminship vote going on. --Per utramque cavernam (talk) 14:42, 2 January 2018 (UTC)

Wiktionary:Votes/2017-12/User:BukhariSaeed for rollbacker

I don't know what to make of this. --Per utramque cavernam (talk) 15:10, 2 January 2018 (UTC)

I deleted it; I think Aryaman is handling this user. —Μετάknowledgediscuss/deeds 23:57, 7 January 2018 (UTC)

News from French Wiktionary

December issue of Wiktionary Actualités just came out in English!

Actualités this month include an article about Trump censoring words, a presentation of a book, an investigation about the definition of peace, some words about the Tech survey, links to cool stuff, statistics, short news and nice pictures!

This issue of our regular journal was written by nine people and was translated for you by Pamputt and me. This translation could be improved by readers (wiki-spirit). We still receive zero money for this publication and we are not supported by any user group or chapter; it's just a way for us to show how cool our project and community are. Feel free to send us comments or to start your own journal (we're eager to read it and we can help you to start it!) Noé 16:59, 2 January 2018 (UTC)

Very nice! —Stephen (Talk) 19:06, 2 January 2018 (UTC)

RFD of Reconstruction pages

These are currently put in Wiktionary:Requests for deletion/Others, among rfd of templates, categories and the like, but I think they belong rather in Wiktionary:Requests for deletion/Non-English. Yes, reconstruction pages aren't in the mainspace, but they're still entries, which serve to present lexical items. --Per utramque cavernam (talk) 21:49, 2 January 2018 (UTC)

Listing of compounds under Derived terms or Related terms?

A while ago, I looked up the definition of Derived terms in the section Derived terms at Wiktionary:Entry layout. There, I was told that Derived terms list terms that are morphological derivatives. But what exactly are morphological derivatives? I looked it up at Wikipedia (Morphological derivation).

Under the section Derivation and other types of word formation the article clearly states that from a linguistic point of view compounds are not considered to be derivations:

Derivation can be contrasted with other types of word formation such as compounding. For full details see Word formation.
Note that derivational affixes are bound morphemes – they are meaningful units, but can only normally occur when attached to another word. 
In that respect, derivation differs from compounding by which free morphemes are combined (lawsuit, Latin professor). 
It also differs from inflection in that inflection does not create new lexemes but new word forms (table → tables; open → opened).

Since my editing is mostly confined to German language entries, I subsequently figured out that this also applies to German language compounds: Derivation_(Linguistik)

Derivation differs from compounding (composition) in that in the latter at least two words (base morphemes) each have an independent lexical meaning, whereas in derivation there is only one word, whose appendages (affixes) have no concrete (but an abstract) lexical meaning.
Example of a derivative: Frei-heit → frei is a lexeme (adjective), heit has an abstract lexical meaning, namely a state of being. Whole word: noun
Example of a compound: Haus-wand → Haus is a lexeme (noun), Wand is a lexeme (noun). Whole word: noun

The established practice at Wiktionary, however, is to include compounds under Derived terms, so this seems to me somehow contradictory. Again, W:EL clearly states that morphological derivatives should be listed under Derived terms, so there can be no doubt.

Those words that have strong etymological connections (like compounds) but aren’t derived terms should be listed under Related terms (-> Related terms).

For this reason, I changed my way of editing, starting to list compounds under Related terms, but my edits were reverted twice so far. To resolve this confusing situation, I need some kind of clarification regarding this issue. Thanks.-- 00:34, 3 January 2018 (UTC)

While we are at it, we could also decide whether terms that are historically (diachronically) derived from terms in other languages, but can be constructed equivalently (synchronically) from native morphemes, should be shown as Derived or Related or both. DCDuring (talk) 01:16, 3 January 2018 (UTC)
I'm in favor of considering compounding to be a form of derivation for Wiktionariographical purposes, even if it isn't as far as theoretical morphologists are concerned. For example, we consider German verbs with separable prefixes (e.g. ˈüberˌsetzen (to pass over)) to be compounds but verbs with inseparable prefixes (e.g. überˈsetzen (to translate)) to be affixed forms. It seems silly to me to consider the latter but not the former to be a derived term of setzen.
I'm also in favor of considering transparent root+affix units synchronic derived forms even when the affixation originally happened in another language: while heavily goes back to Old English, it can be (and is) coined afresh by any English-speaking child who has learned to affix -ly to adjectives to form adverbs, even if s/he has never actually heard the word heavily before. It is thus simultaneously an inheritance from Old English and a new formation in Modern English. —Mahāgaja (formerly Angr) · talk 09:13, 3 January 2018 (UTC)
I agree, and I'd even go a bit further. I suspect that words using very common affixes should, more often than not, really be seen as new coinages only: the force of analogy is so strong that all the sound changes they would normally undergo are warded off. --Per utramque cavernam (talk) 14:14, 3 January 2018 (UTC)

We are using “derived” in a more vulgar way. The section lists morphological derivations, but not only these, but as you see also compounds, and also we could list those Chinese formations mentioned in Wiktionary:Beer parlour/2017/November § Add pronunciation of chinese words in the table titled "Dialectal synonyms of", under the "Synonyms" header. which currently use a non-standard header. For this übersetzen example it could be advisable to separate those two kinds of derivations under two headers, or maybe even three to make the distinction to other kinds of compounds that do not look like containing a prefix: with prefix, with adverb, with other parts of speech. If the community had known all those problems before there would not have been a successful vote … but still you must see that Related terms is too loose a relation for the compounds you add, but if you don’t take WT:EL by the words it all looks good, because no reader can complain about seeing compounds under Derived terms. Palaestrator verborum sis loquier 🗣 11:01, 3 January 2018 (UTC)

My issue is that the common practice uses Related terms to mean words that share an etymon but are not derived or directly related. Compounds do not fall into this category as they are created directly from the 2 or more parent members. Related terms should always be used to represent a more distant genetic relation. —*i̯óh₁nC[5] 11:39, 3 January 2018 (UTC)
I agree. --Per utramque cavernam (talk) 14:14, 3 January 2018 (UTC)
Agreed. Far too many editors are including derived terms (consisting of two words) as related terms, which I think is wrong. I'm not sure what the logic is. And then there's hyponyms, yet another complication and open to misinterpretation. DonnanZ (talk) 17:02, 4 January 2018 (UTC)


I am a regular user of Wiktionary and I already have AWB access on English Wikipedia and Simple English Wiktionary. I would like to help with cleaning up some of the definitions on Wiktionary and I would like to help out with correcting typos and formatting. I have already done some work on cleaning up some pages on the Check Wiktionary page. Can I please be added to the AWB checkpage? Pkbwcgs (talk) 10:08, 3 January 2018 (UTC)

  • Seems reasonable request. Added. SemperBlotto (talk) 21:44, 4 January 2018 (UTC)

Disallow Template:l in glosses and definitions[edit]

Can we make a rule to disallow {{l}} in edits like diff? There's absolutely no need for it. —Rua (mew) 13:04, 6 January 2018 (UTC)

That there is ”no need” means it’s supererogation, not that it is bad. But it seems to me that the editors are generally quite happy with the anarchy. Sometimes I write square brackets, sometimes curly brackets. Both have their advantages. The syntax highlighting should display it better, though, for it seems to me that your dislike for the template in glosses arises mostly from that. And I don’t say you are overly reactionary; newer people have learnt to like it too (like me: it’s easy for me now that I have become used to it, though I see that for others it is easier to write four square brackets, which I sometimes do too).
Wasn’t there a vote on whether it should be made required to use {{l}}? Its failure had as a result: do what you want. Sometimes normalization is exuberant. Palaestrator verborum sis loquier 🗣 14:05, 6 January 2018 (UTC) This is the vote: Wiktionary:Votes/2016-07/Using template l to link to English entries. Palaestrator verborum sis loquier 🗣 14:28, 6 January 2018 (UTC)
Yes please. It makes it harder for editors and gives no benefit. Equinox 14:09, 6 January 2018 (UTC)
@Rua: Isn't it needed to slow down the pages? --Rerum scriptor (talk) 14:21, 6 January 2018 (UTC)
@Rerum scriptor: I don’t know what exactly you are asking, but {{l}} instead of square brackets slows down, so the square brackets are needed.
But there is a reason against the notion that the template makes the wikitext harder to read, the reason that the template adds some structure: Links to words are done by templates, other links, out of the mainspace for example, get square brackets. But I take all easy. Palaestrator verborum sis loquier 🗣 14:37, 6 January 2018 (UTC)
@Palaestrator verborum: I think he's making a joke; {{l}} can use a lot of memory if it's invoked on a page too many times. —AryamanA (मुझसे बात करेंयोगदान) 15:00, 6 January 2018 (UTC)
@AryamanA (or someone else): Have you got some numbers in the head about it? I will support the square brackets if there is a significance. Palaestrator verborum sis loquier 🗣 15:07, 6 January 2018 (UTC)
@Palaestrator verborum: I just tried it out. {{l|en|word}} uses 1.52 MB of memory, and each successive use of {{l}} uses ~0.11 MB. So it's not horrible, but it's still unnecessary memory usage. —AryamanA (मुझसे बात करेंयोगदान) 15:11, 6 January 2018 (UTC)
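As a rough illustration of those figures, here is a minimal Python sketch that extrapolates the memory cost of many {{l}} calls on one page. It assumes the cost stays linear (1.52 MB for the first call, about 0.11 MB for each further one), which is only an extrapolation from the two measurements quoted above; the function and constant names are made up for this example.

```python
FIRST_CALL_MB = 1.52  # measured cost of the first {{l|en|word}}
EXTRA_CALL_MB = 0.11  # approximate cost of each further {{l}} call


def estimated_l_memory_mb(calls: int) -> float:
    """Estimate memory used by `calls` invocations of {{l}} on one page.

    Assumes linear growth, which is an extrapolation, not a measurement.
    """
    if calls <= 0:
        return 0.0
    return FIRST_CALL_MB + EXTRA_CALL_MB * (calls - 1)


if __name__ == "__main__":
    for n in (1, 10, 100):
        print(n, round(estimated_l_memory_mb(n), 2))
```

Under that assumption, a page with 100 uses of {{l}} would come to roughly 12.4 MB, so the cost only starts to matter on pages with very long lists of links.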
Ok, some middling support from me for the square brackets. I ping @Profes.I. lest he find out too late: It looks like a rule is forming that you shall not use {{l}} anymore for English glosses; but you can say what you think about it. Palaestrator verborum sis loquier 🗣 15:35, 6 January 2018 (UTC)
I don't mind as long as we make an exception in cases where the gloss is spelled the same as the word it's glossing, for example accident#French needs to be glossed with {{l|en|accident}} so there's actually a link; using double square brackets would result in a linkless, bold-face gloss. —Mahāgaja (formerly Angr) · talk 16:35, 6 January 2018 (UTC)
I write [[#English|accident]] in such cases. —Rua (mew) 16:42, 6 January 2018 (UTC)
I don't like that solution any better than {{l}}, so I would like to continue having the choice to use either. I'm fine with it being banned in English definitions and cases where plain square brackets work just fine, but I oppose an absolute ban. Andrew Sheedy (talk) 17:03, 6 January 2018 (UTC)
Why is it "harder for editors"? New editors? I find [[#English|accident]] more difficult to parse. The fact that HTML fragments are used for language links is an implementation detail (and conveniently abstracted in templates) – Jberkel 17:15, 6 January 2018 (UTC)
Wasn't it supposed to be essential for the proper working of "Tabbed Languages"? I need it to work properly for references to Translingual terms in definitions to avoid "orange" links. Also I note that it seems to force a type size in a way that plain wikitext and plain links do not [but in Firefox, not Chrome]. (See Cryptomonada#Hyponyms for example. If you don't see a difference try changing your default text size on your OS or browser settings.) See Template talk:l for any responses to my complaint (made today). DCDuring (talk) 17:37, 6 January 2018 (UTC)
Also, if it is bad to use it for links to English words in definitions, why is it good in lists of English words in English L2 sections? DCDuring (talk) 17:47, 6 January 2018 (UTC)
We've been trying in the last few years to wrap plain links in {{l}} so that they work properly with TabbedLanguages. TL was modified recently so that it defaults to English. So that consideration no longer applies. However, for the sake of consistency {{l}} has been used in the same places that it would be used for other languages, and I would like to keep it this way. There is a conceptual difference between {{l|...|[[x]] [[y]]}} and {{l|...|x}} {{l|...|y}}. The former is a single term in which two individual words are linked. The latter is two separate terms, each of which is linked. —Rua (mew) 18:40, 6 January 2018 (UTC)
Another kind of consistency would result from eliminating all uses of {{l|en}} in all English L2 sections. DCDuring (talk) 18:45, 6 January 2018 (UTC)
We can't eliminate {{l|en}}. Have you considered that the template has other parameters? —Rua (mew) 18:48, 6 January 2018 (UTC)
There would be a problem with ===Alternative forms=== for a start. DonnanZ (talk) 18:51, 6 January 2018 (UTC)
I didn't mean to suggest that it should be forbidden in English, only that it be discouraged in English L2 sections where other parameters are not actually needed. Most legitimate uses of alternate display function can be readily accomplished with plain links with pipes. DCDuring (talk) 19:01, 6 January 2018 (UTC)
I strongly suspect that a very small proportion of uses of {{l|en}} in English sections use other parameters, numbered or named. DCDuring (talk) 19:08, 6 January 2018 (UTC)
Well regardless, this discussion is only about getting rid of uses of {{l|en}} in places where non-English does not appear. —Rua (mew) 19:11, 6 January 2018 (UTC)
That would include uses under Alternative forms, Related terms, Derived terms, Synonyms and other semantic relations. In Etymology and Usage notes there is not much point in having {{l}}, almost all instances being better handled with {{m}}. I doubt that See also is much different from Related terms etc in that regard, non-English terms not really being appropriate there as a general rule. DCDuring (talk) 20:23, 6 January 2018 (UTC)
{{m}} gives italics, {{l}} doesn't. DonnanZ (talk) 20:54, 6 January 2018 (UTC)
That's always (almost always?) what we want in Etymologies and Usage notes. DCDuring (talk) 21:30, 6 January 2018 (UTC)

Exactly. I also use {{m}} for species. e.g. Punica granatum. Some editors are using {{l}} with {{der3}} etc. when it is not necessary if the language, e.g. lang=en, is specified. Links from the {{der3}} will work just as well without {{l}} except when there's two links on one line, or where a note is added. Then you have to use [[]] or {{l}}. DonnanZ (talk) 22:02, 6 January 2018 (UTC)

Actually, that may be due to laziness when converting from the old template to the new one. DonnanZ (talk) 22:30, 6 January 2018 (UTC)

The taxonomic authorities apparently want folks to use a type style for taxonomic names that contrasts with italics whenever the taxonomic name appears in italicized text. I couldn't figure out any good way to implement that here. (How would that work with {{sense}} or {{a}} when the taxonomic name was the only item enclosed? What about a mention of a taxonomic name? Should it contrast with the surrounding text or with the way a normal word would appear?) Our existing practice seems good enough. DCDuring (talk) 23:10, 6 January 2018 (UTC)
Regarding "X does italics and Y doesn't": let's learn from CSS (cascading stylesheets in Web design), where the aim is to separate how it looks from what it means. If we can't do something because it would have the wrong visual style, that suggests we might need a new style/markup based on the semantics. (Frankly I still miss the old days of "{{cooking}} A [[pot]] used to [[cook]] food.", but while we rely on hacky markup I can see why we need it. And I do like to be able to edit markup manually.) Equinox 00:52, 7 January 2018 (UTC)
@Equinox: The usual typographic custom is to use roman whenever italicized text calls for something to be italicized. For example, I would be really scared if I saw a Tyrannosaurus rex outside my window right now. —Mahāgaja (formerly Angr) · talk 15:58, 7 January 2018 (UTC)
Oppose, at least until we can agree upon a simpler way to make links consistently work correctly. — Ungoliant (falai) 16:47, 7 January 2018 (UTC)

Documentation template for modules[edit]

Hey there, I'm an admin from Turkish Wiktionary and have been meaning to get documentation pages of modules right. As you can see on this page, there aren't any edit or see links. This makes it difficult for us to work with modules, but I couldn't figure out where to add a decent template for this. Can anyone help me? HastaLaVi2 (talk) 00:46, 7 January 2018 (UTC)

You need to create MediaWiki:Scribunto-doc-page-show and MediaWiki:Scribunto-doc-page-does-not-exist. --Vriullop (talk) 10:15, 7 January 2018 (UTC)
Thanks a lot! HastaLaVi2 (talk) 19:21, 7 January 2018 (UTC)

How much is the sc= parameter still needed?[edit]

Lots of our templates have a sc= parameter, but because we have script detection, I'm not sure we really need it. Are there any cases in which it's still used? Perhaps we can look at solving those cases. —Rua (mew) 21:14, 7 January 2018 (UTC)

@Rua: For what it's worth, I just surveyed 400 random English lemmas and found about 25 instances. I removed two and didn't see any difference in how the pages rendered. —Justin (koavf)TCM 22:43, 7 January 2018 (UTC)
Not even in the HTML? That's the part that matters. —Rua (mew) 22:47, 7 January 2018 (UTC)
@Rua: This edit "changed" line 365 from:
<li>Greek: <span class="Grek" lang="el"><a href="/wiki/%CF%97#Greek" title="ϗ">ϗ</a></span> <span class="mention-gloss-paren annotation-paren">(</span><span lang="el-Latn" class="tr Latn">ϗ</span><span class="mention-gloss-paren annotation-paren">)</span></li>
<li>Greek: <span class="Grek" lang="el"><a href="/wiki/%CF%97#Greek" title="ϗ">ϗ</a></span> <span class="mention-gloss-paren annotation-paren">(</span><span lang="el-Latn" class="tr Latn">ϗ</span><span class="mention-gloss-paren annotation-paren">)</span></li>
i.e. they are identical. —Justin (koavf)TCM 23:41, 7 January 2018 (UTC)
Templates that use an "sc" parameter and number of occurrences: User:DTLHS/sc, translation templates only: User:DTLHS/cleanup/translation sc. DTLHS (talk) 00:06, 8 January 2018 (UTC)
The |sc= parameter isn't needed if findBestScript from Module:scripts would give the same result. That is, if the template has text to work with, and a language code whose associated data file contains the script that the text is actually written in. I suspect that in the vast majority of cases the parameter is not needed. Rarely, it's actually doing damage, when Ancient Greek text is labeled as Grek (monotonic Greek) when it should be polytonic (polytonic Greek).
To actually determine if the parameter isn't needed, we need data on how often each script is actually used by each language on Wiktionary. If any script that is used is not in the language's data table, the |sc= parameter is needed, or the script needs to be added to the language's data table so that findBestScript will be able to automatically determine it. (This data would also be useful for determining which script should be first in the list for those languages that use multiple scripts.)
I suppose this could be done by bot, but it might be complicated. There are some pretty efficient Lua functions that could be translated into Python to do the actual script detection, though. — Eru·tuon 00:59, 8 January 2018 (UTC)
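To make the redundancy check concrete, here is a minimal Python sketch of the idea. It is not a port of findBestScript from Module:scripts: the character ranges, the per-language script lists and all function names below are simplified assumptions for illustration only.

```python
# Hypothetical, heavily simplified character ranges per script code.
SCRIPT_RANGES = {
    "Latn": [(0x0041, 0x005A), (0x0061, 0x007A), (0x00C0, 0x024F)],
    "Grek": [(0x0370, 0x03FF)],
    "polytonic": [(0x0370, 0x03FF), (0x1F00, 0x1FFF)],
    "Cyrl": [(0x0400, 0x04FF)],
}

# Hypothetical stand-in for the scripts listed in each language's data table.
LANGUAGE_SCRIPTS = {
    "en": ["Latn"],
    "el": ["Grek"],
    "grc": ["polytonic"],
    "ru": ["Cyrl"],
}


def count_in_script(text, script):
    """Count the characters of `text` that fall in the ranges of `script`."""
    ranges = SCRIPT_RANGES[script]
    return sum(
        1 for ch in text if any(lo <= ord(ch) <= hi for lo, hi in ranges)
    )


def find_best_script(text, lang):
    """Return the listed script matching the most characters, or None."""
    best, best_count = None, 0
    for script in LANGUAGE_SCRIPTS.get(lang, []):
        count = count_in_script(text, script)
        if count > best_count:
            best, best_count = script, count
    return best


def sc_is_redundant(text, lang, sc):
    """True if an explicit sc= value equals what detection would pick."""
    return find_best_script(text, lang) == sc
```

With these toy tables, `sc_is_redundant("ϗ", "el", "Grek")` is true, so the parameter could be dropped, while Ancient Greek text tagged `sc=Grek` is reported as not redundant because detection would pick the polytonic script instead. A real survey would have to use the actual script and language data.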
It's about time you got a bot account isn't it? DTLHS (talk) 01:19, 8 January 2018 (UTC)
@DTLHS: I like the idea, but I've found it difficult to come up with a way to start using the Python interface. — Eru·tuon 03:47, 8 January 2018 (UTC)
I'd rather do it with tracking templates. Module:links, Module:headword and others can be modified so that if sc is provided, check if it's identical to what you get from findBestScript. —Rua (mew) 11:12, 8 January 2018 (UTC)
I've done it now for Module:links. The following tracking templates are used:
Rua (mew) 11:23, 8 January 2018 (UTC)
I did the same for Module:headword. The tracking templates are the same, just with "headword" instead of "links". —Rua (mew) 13:02, 8 January 2018 (UTC)
Sometimes, mixed Japanese-English text needs to use |sc=Jpan. (could ja be made to never use Latn or something?) —suzukaze (tc) 03:49, 8 January 2018 (UTC)
In that case, I'd suggest wrapping the English and Japanese parts each in their own template. That way the language tags will be correct too. —Rua (mew) 11:13, 8 January 2018 (UTC)
I think that using the same font for Jpan and Latn text is nicer most of the time. —suzukaze (tc) 18:28, 8 January 2018 (UTC)
@Suzukaze-c: Latn is intended for roumaji. If this is what you mean, I agree that if text contains some Latin mixed in with kanji or kana, it should be tagged as Jpan; it would look weird if sequences of Latin characters in Japanese text were script-tagged as Latn. Maybe the Lua logic should be to assign Jpan if there are any Hani, Hira, or Kana characters at all in Japanese text, and if not to decide between Latn and Brai by counting characters. findBestScript in Module:scripts isn't quite that sophisticated, though. — Eru·tuon 21:44, 8 January 2018 (UTC)
Yes, what I meant is that if text contains any Jpan character, it should be marked as Jpan. (I forgot about romaji.) —suzukaze (tc) 00:12, 9 January 2018 (UTC)
@Erutuon But it's quite feasible to turn the scripts into an ordered priority list of some sort. Given that our script tagging is generally intended to make text more legible, it makes sense that Latn should only be used if none of the fancier scripts are found, in any language. —Rua (mew) 00:25, 9 January 2018 (UTC)
@Rua: I don't know how simple it will be to formulate a rule. Even in Japanese it may be more complex: probably Latin-script terms that are not roumaji transliteration (for instance, AT) should be tagged as Jpan. It's probably best to start small. — Eru·tuon 22:02, 12 January 2018 (UTC)
@suzukaze: What about sentences like ジス (this)?
If we code so that any sentence containing any JA text is marked in its entirety as JA, we might not get what we want.  :)
Also, it's worth noting that Japanese authors are occasionally prone to including English strings right in the middle of Japanese sentences. It's hard to search for, but this shows some examples where English appears in otherwise Japanese texts, and where the English is clearly English and not Japanese spelled in the Latin alphabet:
‑‑ Eiríkr Útlendi │Tala við mig 22:44, 12 January 2018 (UTC)
@Eirikr: Any ja sentence, not just any-language sentence :) "English strings in Japanese sentences" is exactly why I said what I said. Japanese fonts are often designed with consideration for English, but the reverse is almost certainly not true. —suzukaze (tc) 20:47, 13 January 2018 (UTC)
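The rule discussed in this thread (tag Japanese-language text as Jpan if it contains any kanji or kana, and otherwise choose between Latn and Brai by counting characters) could be sketched in Python as follows. This is only an illustration under assumptions: the code-point ranges cover just the basic Hiragana, Katakana, CJK Unified Ideographs and Braille Patterns blocks, and the function names are invented here, not taken from Module:scripts.

```python
def is_japanese_char(ch):
    """True for basic hiragana, katakana or CJK ideograph code points."""
    cp = ord(ch)
    return (
        0x3040 <= cp <= 0x309F      # Hiragana
        or 0x30A0 <= cp <= 0x30FF   # Katakana
        or 0x4E00 <= cp <= 0x9FFF   # CJK Unified Ideographs (basic block)
    )


def is_braille_char(ch):
    """True for code points in the Braille Patterns block."""
    return 0x2800 <= ord(ch) <= 0x28FF


def is_latin_letter(ch):
    """True for unaccented ASCII letters (a simplification)."""
    return ch.isascii() and ch.isalpha()


def guess_ja_script(text):
    """Guess a script code for text already known to be in language ja."""
    # Any kanji or kana at all wins, even if Latin letters are mixed in.
    if any(is_japanese_char(ch) for ch in text):
        return "Jpan"
    latn = sum(1 for ch in text if is_latin_letter(ch))
    brai = sum(1 for ch in text if is_braille_char(ch))
    if latn == 0 and brai == 0:
        return None  # nothing recognisable to decide on
    return "Latn" if latn >= brai else "Brai"
```

Under this rule a string such as ジス (this) comes out as Jpan as a whole, which is the behaviour suzukaze asks for; Eirikr's concern about genuinely English passages would need separate language tagging rather than script detection.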

Western Yugur orthography standardization[edit]

(Pinging @Anylai as the only other consistent Turkic editor, but I'd like wider input too)

Western Yugur is a Turkic language spoken in China. It has no writing traditions (as far as I know) and due to the small number of its speaker community it is unlikely to get an officially recognized orthography. It is only attested in (pseudo-)phonetic transcription which differs from author to author.

In order to unify these sources and express them in a form appropriate for Wiktionary, I'm proposing a transcription system. The table compares symbols used in sources dealing with Western Yugur, Proto-Turkic (as used here, adapted from Starling) and Eastern Yugur (as used here, adapted from Nugteren in Mongolic Languages 2003) with my proposition. There are comments in wikicode (they should be notes but I forgot how to format those). A few example words to compare given orthographies and a sample text written in this orthography.

There are some inconsistencies here that I couldn't straighten out:

  1. T/D difference sometimes implies pre- and sometimes post-aspiration, and sometimes h is used instead.
    1. This is because post-aspiration is very common at the onset of the word and very rare in medial position, while pre-aspiration is quite common medially.
    2. Also there is no intuitive way to represent s with preaspiration.
    3. Pre-aspiration may be found before a -RT- cluster, but it is always a function of the occlusive which can then be used to signify it.
  2. It uses both a digraph and diacritics.
    1. I could perhaps use ġ for unaspirated uvular plosive, but gh feels more intuitive and in synch with Eastern Yugur.
  3. Slavic and Turkish symbols might clash too much.
    1. I needed to use Turkish symbols to represent Turkish sounds, and I needed kreska and haček to represent two series of sibilants (Pinyin was out of the question).

I'd love to hear what everyone thinks of this. Is creating new orthographies beyond everyone's comfort zone? Do you hate how it looks? Would you prefer more consistency? Any suggestions (even cosmetic)? Crom daba (talk) 18:20, 8 January 2018 (UTC)

Thank you for the research into the various orthographies. Amongst the sources, Lei Xuanchun's dictionary is part of the dictionary series for ethnic minorities in China produced by the Chinese Academy of Social Sciences, which can pretty much be regarded as the standard (unless there is evidence to the contrary). In general, I think we should limit creating orthographies to cases where absolutely no attempt at writing the language exists. Wyang (talk) 13:07, 9 January 2018 (UTC)
Lei's transcription scheme is pretty good, but it's basically IPA, and I think it would be better to have something more abstract and intuitively clear to Turkologists. I don't know if Lei's orthography is used outside of his dictionary or if it's purely ad hoc; I haven't come across it in western literature. Are there any other Chinese sources using it? Crom daba (talk) 15:22, 9 January 2018 (UTC)
@Crom daba Sorry for the delay in reply. There are some papers citing the book and using his orthography:
  • 莊子儀(2011),回鶻文《金光明經》所反映的音韻現象,國立臺灣師範大學。
  • Yong-Sŏng Li (2014), “Some Star Names in Modern Turkic Languages-I” (and -II) (Çağdaş Türk Dillerinde Bazı Yıldız Adları-I, -II), Türk Dili Araştırmaları Yıllığı - Belleten, 62 (1): 121–156.
  • 徐丹(2015),从借词看西北地区的语言接触,《民族语文》,第2期。
  • 赤坂恒明(2016),<翻訳> 馬鈴?「哈薩克入甘続記」第一章第一・二節,埼玉学園大学紀要. 人間学部篇。
  • Li Yong-Sŏng (2016), “Finger Names in Modern Turkic Languages”, Central Asiatic Journal, 59 (1–2): 1–42.
Wyang (talk) 15:50, 12 January 2018 (UTC)
The digraph gh is a bit of an irregularity, when compared to its voiceless counterpart q. (Ts and dz are somewhat different, as they indicate affricates.) Does the sequence g + h also occur? If so, some method to distinguish the two would be needed. But consonant clusters don't look very common in the example given on the page. — Eru·tuon 21:47, 9 January 2018 (UTC)
The way I imagined it, h would only be used initially (where it is a phoneme), before sibilants to indicate pre-aspiration, and after medial stops to signify post-aspiration with the stop written as fortis. This makes it impossible to express the distinction between pre-aspirated and non-preaspirated, but I doubt that this difference is phonemic. In Lei I have found the following cases of preaspiration:
  1. [pəhltər], [buhrqan], [ɢahsqa] - but he also has [pəhldər], [buhrɢan], [ɢahsɢa] showing free variation.
  2. Words ending in -hT, this is simply because Lei treats every final stop as aspirated, but (post-)aspiration isn't distinctive here, and I couldn't find any word written with -(h)D.
  3. Words with -hTD- or -hTT-, here he uses a fortis stop because all stop clusters are treated as if containing an intervening aspiration, I couldn't find any words with -(h)DD- clusters.
  4. Words with -hT- clusters that are actually compounds of words ending with -hT, aka second case.
  5. Remaining cases are written -hD-, leading me to believe that pre-aspiration is not contrastive before post-aspirated stops.
So basically, there shouldn't be any cases where a gh might be used for anything other than the uvular. Crom daba (talk) 01:51, 10 January 2018 (UTC)

Removing Scots from Wiktionary:Criteria for inclusion/Well documented languages[edit]

I've been playing with Scots a little, but things are very hard to cite. If someone RFVd jeelie bean, I couldn't back it up. Which is not to say there's any other word in Scots for "jelly bean"; it's that Susan Rennie is way more dominant as an author/translator of children's books than anyone could be in most well documented languages.

I know we don't quote from Wikipedia, but I think that's a decent source of hard numbers on how well documented a language is. https://stats.wikimedia.org/EN/Sitemap.htm shows that there's fewer active editors than many other well-documented languages, and while the number of articles put it above Icelandic, slightly, a quick comparison of the two Wikipedias shows that Scots is full of stubs and Icelandic has long articles; I found various examples, but the current random article was Watter cycle versus Hringrás vatns.

Maybe I'm biasing it by comparing it to western languages. There's two Punjabi Wikipedias, and Western Punjabi is in about the same shape as Scots, whereas (Eastern) Punjabi has fewer articles and more active editors. The Xhosa and Zulu Wikipedias are nowhere near the shape of the Scots Wikipedia; they're not my field, but I don't see why they're considered well documented languages.

I'm sure Wikipedia stats are going to annoy some people; I didn't come to this conclusion based on those numbers. I'm interested in Scots and Estonian, and as an American, books in Scots should be easier for me to access than Estonian books; I can order direct from Amazon.co.uk, if nothing else. I've found Ben-Ben-A-Go, Sweetieraptors: A Book O Scots Dinosaurs and Everson's various translations of Alice in Wonderland for modern Scots, but I can find a huge selection of modern Estonian works, so much so that I see no point in trying to enumerate them. w:List_of_newspapers_in_Estonia is an amazing list of regularly published works in Estonian; Scots Leid Associe says "The Associe furthsets the bi-annual journal Lallans, a 124-page magazine o the best nui screivin in Scots, thare is nae ither journal 100% in Scots". (That is, their biannual journal is the only periodical 100% in Scots.)--Prosfilaes (talk) 06:10, 10 January 2018 (UTC)

I think many of your arguments are not the most relevant, but I see your overall point and admit that you may very well be right that we should remove it. I suspect the original intent was to avoid sneaking in extremely rare English dialect words used in Scotland as Scots, considering how the two languages are so indistinct that at RFV, we often struggle to determine which language a text is written in. —Μετάknowledgediscuss/deeds 06:26, 10 January 2018 (UTC)
It's hard to compare against a lot of languages without some hard numbers. I didn't want to just refer to Estonian versus Scots, since I have no reason to think that Estonian should be the least of the WDLs, or that other people think so.--Prosfilaes (talk) 12:47, 10 January 2018 (UTC)
While the issue Metaknowledge highlights is a serious and recurring one, in some ways it's orthogonal, in that we're going to continue having to figure out whether things are Scots or Scottish English either way. It does seem like Scots is not that much better attested than Irish too (which was also removed a while ago). The tendency to view Scots as a form of English (which the OED still does?) may also be influencing those who want it to be subjected to the same standards; OTOH, it does seem like Scots authors are liable to unilaterally create neologisms by just Scots-ifying English words; but on the third hand, meh. I don't object to removing it. - -sche (discuss) 16:48, 13 January 2018 (UTC)
It could be argued that "Scotsified" words are merely phonetic spellings in the Scots dialect, which normally aren't too difficult to separate from true Scots words. DonnanZ (talk) 12:48, 14 January 2018 (UTC)

Reciprocal label[edit]

Why does {{lb|en|transitive}} add the word in Category:English transitive verbs when {{lb|en|reciprocal}} doesn't add the word in Category:English reciprocal verbs? Jonteemil (talk) 16:46, 11 January 2018 (UTC)

Because Module:labels/data has pos_categories = { "transitive verbs" }, under labels["transitive"] = {, but doesn't have pos_categories = { "reciprocal verbs" }, under labels["reciprocal"] = {. We could change that, though, if it seems like a good idea. —Mahāgaja (formerly Angr) · talk 17:10, 11 January 2018 (UTC)
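The mechanism described above can be modelled in a few lines of Python. The real data lives in Lua in Module:labels/data; the dictionary below only mirrors the two entries quoted in this reply, and the helper function and the language-name table are hypothetical, written purely to show why one label categorizes and the other does not.

```python
# Python mirror of the two quoted label entries (the real module is Lua).
LABELS = {
    "transitive": {"pos_categories": ["transitive verbs"]},
    "reciprocal": {},  # no pos_categories, so nothing is categorized
}

# Hypothetical lookup from language code to language name.
LANGUAGE_NAMES = {"en": "English"}


def categories_for_label(lang_code, label):
    """Return the categories a label would add for the given language."""
    data = LABELS.get(label, {})
    lang_name = LANGUAGE_NAMES[lang_code]
    return [
        f"Category:{lang_name} {cat}"
        for cat in data.get("pos_categories", [])
    ]


if __name__ == "__main__":
    print(categories_for_label("en", "transitive"))
    print(categories_for_label("en", "reciprocal"))
```

In this model, adding `"pos_categories": ["reciprocal verbs"]` to the `"reciprocal"` entry is all it would take for the second call to return Category:English reciprocal verbs as well.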
There are possibly so few uses of such a label because one tends to split relevant words into multiple senses, compare fuck for an example where there is no label to put, apart from the unknownness of the term and the unlikeliness of the phenomenon in some languages, and maybe because it is at times hard to decide if a verb is reciprocal or just ambitransitive. However if one does use such a label I see no reason why it should not categorize. Palaestrator verborum sis loquier 🗣 17:36, 11 January 2018 (UTC)
I suggest a change since all verbs are transitive, intransitive or reciprocal (I think). Jonteemil (talk) 18:41, 11 January 2018 (UTC)
Are other parts of speech ever tagged with {{lb|foo|reciprocal}}? If so, we would have to weigh whether it is better to use a new label "reciprocal verb" to categorize such verbs, and risk that some verbs will not get categorized because people don't know better and just use "reciprocal", or else force other parts of speech to use other labels and risk that some will be miscategorized as verbs if people use bare "reciprocal" on them. - -sche (discuss) 19:53, 11 January 2018 (UTC)
@-sche: A search for insource:/lb\|[^\}]+\|reciprocal[}|]/ in mainspace yields 21 results, some of which are in Pronoun sections: се, си, միմյանց, фкя-фкянь. — Eru·tuon 20:31, 11 January 2018 (UTC)
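For anyone who wants to run the same pattern over a dump or a list of pages offline, here is a small Python sketch using the regular expression from the search above. The sample wikitext strings are invented for the example and are not taken from the actual entries.

```python
import re

# Same pattern as the insource: search: "lb|", then one or more characters
# other than "}", then "|reciprocal" followed by "}" or "|".
RECIPROCAL_LABEL = re.compile(r"lb\|[^\}]+\|reciprocal[}|]")


def uses_reciprocal_label(wikitext):
    """True if the wikitext contains an {{lb}} call with a bare 'reciprocal'."""
    return RECIPROCAL_LABEL.search(wikitext) is not None


if __name__ == "__main__":
    # Invented sample lines, for illustration only.
    samples = [
        "# {{lb|en|reciprocal}} To meet one another.",
        "# {{lb|bg|reflexive|reciprocal|clitic}} each other",
        "# {{lb|en|transitive}} To meet someone.",
        "# {{lb|en|reciprocal verb}} To meet one another.",
    ]
    for line in samples:
        print(uses_reciprocal_label(line), line)
```

Because the pattern requires `}` or `|` right after `reciprocal`, a separate label such as "reciprocal verb" would not be counted, which is convenient if that new label is introduced.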
If it's the verb itself rather than the sense that's reciprocal, then there shouldn't be a reciprocal label there. Sense labels are for sense-specific things. —Rua (mew) 20:40, 11 January 2018 (UTC)
Thanks for doing the search. It is as I suspected (used of more than one POS). Since the label is so rare, my preference would be to introduce a new label "reciprocal verb" for verbs that need it, but bear in mind Rua's point. - -sche (discuss) 20:42, 11 January 2018 (UTC)
Yep, that stuff at the non-glosses of pronouns is misuse, those labels as in си should be just removed, they just double what should be in the non-glosses (as non-glosses can contain grammatical information, these examples are exactly what they are for). The “clitic” word should be moved into the description, it seems to me. Palaestrator verborum sis loquier 🗣 21:15, 11 January 2018 (UTC)
@rua: Well, if what I think is correct there are reciprocal verbs and reciprocal pronouns. A reciprocal verb can express a reciprocal tense without the use of a reciprocal pronoun. So there are reciprocal verbs, tenses and pronouns. English has two reciprocal pronouns that happen to be synonymous - each other and one another. My mother tongue Swedish has two - varandra (each/one another) and sinsemellan (with each/one another). All reciprocal tenses can be expressed with ”I /verb/ (with) you and you /verb/ (with) me”. For example: We met each other = I met you and you met me. Here ”met” isn’t used reciprocally since the reciprocality is expressed with the pronoun ”each other”. In Swedish this is: ”Vi träffades här”. Here ”träffades” is used reciprocally since there is no reciprocal pronoun. Jonteemil (talk) 17:20, 12 January 2018 (UTC)

Proposal: Remove pre-1919 Chinese from well documented languages[edit]

User:Dokurrat created Template:zh-historical-ghost, indicating senses that are only found in one or more historical dictionaries. However, per Wiktionary:Criteria for inclusion, these mention-only terms are not considered attested.

Classical Chinese is essentially a dead language. Although there are plenty of texts in Classical Chinese (just like Latin), many texts in antiquity are irreversibly lost and many terms (including characters) can only be found in dictionaries. So I propose to exclude Chinese up to 1919, after which Classical Chinese was no longer widely in use, from the well documented languages.--Zcreator (talk) 13:19, 13 January 2018 (UTC)

I always opined that if a quote is old enough then that single quote is enough, by analogy, as English and German etc. are also separated into three stages of which only the latest are considered well-attested, so for example an Arabic quote from the eleventh century is always enough.
For Chinese one has special arguments again as the Chinese have a history of burning their own literature. The question is what you promise yourself from including characters that are only found in dictionaries. For English we have Appendix:English dictionary-only terms and the template {{no entry}} used. But the words you are concerned about are maybe, and likely, not ghost words but believed to have been used, just that the only thing left from the usage is a dictionary entry – a situation that the modern English language and the modern German language do not have but their old predecessors do: Many Old High German words are only attested in no better source than one or two word-lists; still mainspace entries for such words are accepted, it seems to me.
So I’d say if the dictionary is old enough, it seems a good solution for me to include the word in the mainspace and use {{zh-historical-ghost}}. The old dictionaries haven’t habitually invented characters, have they? Palaestrator verborum sis loquier 🗣 15:04, 13 January 2018 (UTC)

I think {{zh-historical-ghost}} should be turned into a language-agnostic template ({{historical-ghost}}), and use a language parameter. --Per utramque cavernam (talk) 15:10, 13 January 2018 (UTC)

Also a good point. Some people might ask the edgy question from which point in time such usage be appropriate, but answering that question would be comparing apples and oranges, for it depends on how history has unfolded itself for each language. Rigor that is appropriate with English attestations can well be brutish with another language that is superficially prominent, and the votes about such criteria were of course biased by the privileged position of English and detached from the reality of other languages. Palaestrator verborum sis loquier 🗣 15:22, 13 January 2018 (UTC)
Chinese is split into stages too: Category:Old Chinese language (och, en:w:Old Chinese including en:w:Classical Chinese) and Category:Middle Chinese language (ltc, en:w:Middle Chinese). Is it requested to split something like (New) Chinese in some way? If that's intended, how about splitting (New) English into Early New English (e.g. Shakespeare, KJB) and younger New English (e.g. Harry Potter), and (New High) German into Early New High German (until 1650, e.g. Luther) and younger New High German (after 1650, e.g. philosophy (Kant, Nietzsche)) too? - 15:41, 13 January 2018 (UTC)
No, it’s not intended. Languages are split if differences in grammar and core vocabulary create a barrier. And it seems like English and German have two such splits while Spanish has only one and Arabic and Chinese have none since their early days. If you look through the “Old Chinese lemmas” you see that they are entries under the header “Chinese” with Old Chinese pronunciations in the pronunciation section.
The question is where to put such terms that are presumably left only in dictionaries but nonetheless believed to have existed. Palaestrator verborum sis loquier 🗣 16:01, 13 January 2018 (UTC)
In my opinion terms in Appendix:English dictionary-only terms (and other dictionary-only terms), except coined protologisms and ghost words like esquivalience and zzxjoanw, should be moved to main namespace, with a notice template indicating that this is only a dictionary-only term.--Zcreator (talk) 17:07, 13 January 2018 (UTC)
@Zcreator We have {{no entry}} (ablocate for example). DTLHS (talk) 17:37, 13 January 2018 (UTC)
This is my proposed layout.--Zcreator (talk) 17:44, 13 January 2018 (UTC)
@DTLHS Yep, it looks like he knew this – I have said this supra, and what he wants is some in-between where the definitions are still in the mainspace but with proper warning around. Like: “the meanings given for this term are …” Palaestrator verborum sis loquier 🗣 17:46, 13 January 2018 (UTC)
I'm sceptical. I can think of a number of times Chinese terms have been RFVed and an editor has cited nothing but a dictionary or two (mentions) — sometimes the senses RFVed are quite elaborate or hard-to-parse, too, like Talk:坉 — and the terms have had to be failed for lack of evidence of actual use (such as would, among other things, clear up the meaning). There is more argument to be made, IMO, for allowing Middle-Chinese-and-older terms (analogous to allowing Middle English, etc), but for terms mentioned only in a dictionary from e.g. 1914, I see no compelling reason not to use the same approach as in other languages, with appendices for dictionary-only terms. - -sche (discuss) 17:08, 13 January 2018 (UTC)
IMO dictionary-only terms may have their own entries, but with a template indicating such.--Zcreator (talk) 17:30, 13 January 2018 (UTC)
I see, Chinese at wiktionary isn't really split into stages. Are there at least labels like {{lb|zh|Old Chinese}} similar to {{lb|la|Medieval Latin}} (at wiktionary Medieval Latin is part of Latin)? - 18:22, 13 January 2018 (UTC)
Oh well, 1919 is too late, that is visible; 1914 words aren’t that interesting either, but having entries for older badly attested terms and having stated the uncertainty (or non-existence) is wherefore people visit Wiktionary and appreciate it. I don’t know what a good date for Chinese is, but arguably it is one that is determined by the intrusion of Westerners and their economic possibilities for publishing texts – the same with Arabic. For senses, one can use {{uncertain}} – people have to see this if with the available material the semantics cannot be reduced to a denominator, which can happen as well with many cites. Consider plant names where many descriptions are needed for knowledge of the meaning; and also consider units of measures where in fact a mention can be more valuable than a use; and of course there are always problems with ideological and religious concepts – it is still unknown what فرقان means that appears seven times in the Qurʾān, and such terms continue to be created by obscurantists. We just need to evaluate if the term has existed widely, considering if people still search it, balancing the scientific and the market-oriented approaches. Palaestrator verborum sis loquier 🗣 17:33, 13 January 2018 (UTC)

Another proposal: Accept web.archive.org and WebCite etc. as a source

Previous discussion at Wiktionary:Votes/pl-2012-08/Citations from WebCite.

Currently accepting only Usenet, but not web.archive.org or WebCite, has some problems:

  1. Not all languages are well represented on Usenet, and Usenet is somewhat English-biased. This will cause a natural English bias in Wiktionary.
  2. Use of Usenet is declining. It may become more and more difficult to find attestations of neologisms on Usenet.
  3. The decentralization of Usenet is limited. It may be accessed through Google Groups, but if you think web.archive.org and WebCite will close one day, it's not impossible that Google will too (Google was founded after Internet Archive and WebCite). It's also not impossible that Google may take down some content because of the Digital Millennium Copyright Act. If you think WebCite had major outages, Google also had ([13]).

So, it may be a good idea to accept web.archive.org and WebCite etc. as a source, at least for webpages for which there's evidence that they are original works. For safety's sake, it may be required that a webpage be archived at at least two different archive websites. However, quality control for cites is a problem; we should discuss it in detail.--Zcreator (talk) 17:29, 13 January 2018 (UTC)

For a beginning, the quotation templates should support additional links of archived versions. Else when I use |archiveurl= in {{quote-book}} or whatever it says “archived from” though the original URL is still accessible. Though I just habitually ensure archive.org and archive.is archive versions and want to link three versions for attestation. For many words – say gamer words, Russian words used in Germany only, dialectalisms in Arabic … – to cite some forum posts plus archived versions is the best thing one can do. @Sgconlaw
Yes, archive.is can be used too though it shows ads – I believe in capitalism. Palaestrator verborum sis loquier 🗣 17:58, 13 January 2018 (UTC)
To clarify: My proposal is to accept all perennial web archiving services, but web.archive.org and WebCite are preferred as they are long-established.--Zcreator (talk) 18:08, 13 January 2018 (UTC)

Does WT:Translation requests need more rules?

First of all, I see that a lot of writers are keeping the [brackets] in when they submit their requests. I don’t think that that’s a major issue, but in any case it seems to be incorrect and either needs to be removed entirely or given a clarification.

Now more importantly, for months we’ve been receiving a lot of garbage requests, lines that when translated turn out to be bizarre nonsense, like ‘colourless green ideas sleep furiously’. I myself have made jocular or vanity requests on occasion, but these particular ones, aside from being excessive, seem completely pointless to make. They’re useless for communication, and I suspect that the lines were never written by sentient beings. Messages from amateur speakers would be one thing, but I think that these are nonsensical on purpose. As such, I propose that editors be allowed to erase them.

Nonetheless, I could see arguments against this, namely: ‘nonsense’ might be too subjective and up to interpretation, and nonsensical requests still aren’t exactly ‘harmful’, I guess. Keeping the mindless requests would annoy me, but I could deal with it in the long run. — (((Romanophile))) (contributions) 22:43, 13 January 2018 (UTC)

I’m for closing it. By its very nature it can only contain nonsense because nobody would post something personally valuable on such a high-visibility site for others to find that he has begged from others to translate it. There are also other communities more suitable for such, subreddits, Telegram groups, Discord groups, Tumblr, what not.
People might interject that sometimes it is amusing to translate, I have done it once – and only once – for this reason too, but there is no hardship with finding comparable delectations. Palaestrator verborum sis loquier 🗣 22:56, 13 January 2018 (UTC)
I see nonsense requests as abuse of a free resource. I suspect that the person(s) posting them are fully aware of the nature of their requests. I have removed them on sight.
I think that closing TRREQ is too drastic. —suzukaze (tc) 23:02, 13 January 2018 (UTC)
It can make sense to translate 'colourless green ideas sleep furiously'. For example, when translating the English wikipedia article into German, it could begin with "colourless green ideas sleep furiously (englisch für farblose grüne Ideen schlafen wütend) ist ein [englischer] Satz [...]". In case of other random words, it could be that the requester wants to have several words translated independently from each other. In case of strange English source sentences it's possible that it was translated from the user's native language to English, though not perfectly. Though of course it would have been better if both sources were provided, the non-English and the English translation. - 06:43, 15 January 2018 (UTC)

Hittite lemmas

Related previous discussion: Beer parlour / 2016 / March § Hittite lemmas.


Currently there are 115 Hittite entries in Wiktionary. Most of them are written in cuneiform except for the few I've created. I think that expanding the Hittite dictionary would be way easier if we wrote the lemmas in some romanization. There is absolutely no reason to keep the lemmas in cuneiform; it only makes them harder to find. All books and dictionaries transliterate or transcribe words. No reader is going to look up a word in cuneiform; they're most probably going to type the broad transcription. And if they want to see the word written in cuneiform, there's no problem, since it's shown in the declension tables (see attaš). Say a student who knows no Hittite wants to find a word: he can do one of two things, look up a cognate and hope that the word he's looking for is linked there, or go checking the entries one by one in the categories. We don't write Egyptian lemmas in hieroglyphs, so why should we write Hittite in cuneiform? Plus, the characters aren't visible in Chrome, or at least not to me, so even if the reader knew Hittite, he might not even see the signs.

Hittite has two romanization systems. The first is called the one-to-one transliteration (e.g. at-ta-aš < 𒀜𒋫𒀸); here each sign is written with its corresponding transliteration. Whenever a dictionary gives an inflection, it often gives it in this method of transcription, especially if the word is irregular. The second one is called the broad transcription, and because it is the most legible it's the one I propose to use for lemmas. Dictionaries list words according to this one. They often list them under stems, so if you wanted to find at-ta-aš you would need to look for atta-. Generally, to transcribe words, the hyphens are removed and adjacent repetitions of identical vowels are simplified (e.g., a-ša-an-zi > ašanzi, na-at > nat, but ši-uš > šiuš). Adjacent identical consonants are not simplified but remain geminate (ap-pa-an-zi > appanzi). Redundant vowels are expressed with a macron (e.g. e-eš-ḫar > ēšḫar), and silent vowels are written between brackets (e.g. at-ta-az, at-ta-za > attaz(a)). Using the broad transcription would be way more practical, for both the readers and the editors. --Tom 144 (talk) 00:41, 14 January 2018 (UTC)
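
For illustration only, the conversion rules described above can be sketched in Python. This is a minimal sketch, not an existing Wiktionary module: the function name broad_transcription is hypothetical, and it assumes that a "redundant" vowel means a bare vowel sign repeating the vowel of a neighbouring sign (as in e-eš-ḫar). It does not handle bracketed silent vowels such as attaz(a), logograms, or determinatives.

```python
import unicodedata

VOWELS = set("aeiu")
MACRON = "\u0304"  # combining macron, composed by NFC below


def broad_transcription(translit: str) -> str:
    """Turn a sign-by-sign transliteration into a broad transcription.

    Hypothetical helper. Assumption: when the same vowel meets across a
    sign boundary it is written once, and it gets a macron only if one of
    the two signs is a bare vowel sign (plene spelling).
    """
    # Normalise so that letters like š and ḫ are single code points.
    signs = [s for s in unicodedata.normalize("NFC", translit).split("-") if s]
    out = []          # list of [character, is_long]
    prev_sign = ""
    for sign in signs:
        first = sign[0]
        if out and first in VOWELS and out[-1][0] == first:
            # Same vowel on both sides of the hyphen: keep one copy.
            if len(sign) == 1 or len(prev_sign) == 1:
                out[-1][1] = True  # bare vowel sign, so mark as long
            rest = sign[1:]
        else:
            rest = sign  # consonants are never simplified (geminates stay)
        out.extend([ch, False] for ch in rest)
        prev_sign = sign
    text = "".join(ch + (MACRON if is_long else "") for ch, is_long in out)
    return unicodedata.normalize("NFC", text)


if __name__ == "__main__":
    for word in ["a-ša-an-zi", "na-at", "ši-uš", "ap-pa-an-zi", "e-eš-ḫar"]:
        print(word, ">", broad_transcription(word))
```

Run on the examples from the post, the sketch gives ašanzi, nat, šiuš, appanzi and ēšḫar. Anything beyond those examples would need checking against a dictionary.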

It is not true that else the entries cannot be found. One writes the transcription and insource:/==Hittite==/ into the search field.
There is no harm in creating soft redirects like for Japanese and Gothic, but do you really want to duplicate content? It can easily become out of sync, having invited incompetent people to create Hittite entries in romanization in masses without the cuneiform being found or to expand Hittite entries without expanding the cuneiform entries. I warn you that it is really annoying when people edit Serbo-Croatian entries in Latin spelling only and do not touch the corresponding Cyrillic entries. Palaestrator verborum sis loquier 🗣 10:20, 14 January 2018 (UTC)
Obviously, content shouldn't be duplicated; either the romanizations or the cuneiform should soft- (or hard-?) redirect to the other.
The problem of lemmatizing (and romanizing) Hittite has been discussed before, and is a bit tricky, I'll ping users who participated in that discussion: @ObsequiousNewt, JohnC5, Rua, DerekWinters. - -sche (discuss) 15:08, 14 January 2018 (UTC)
Thank you, @-sche:. After reading that discussion I would support listing words under stems, as Kloekhorst, the CHD, and Hoffner & Melchert do. I would oppose standardizing cuneiform, since then we'd be making a false claim. Concerning attestations, unattested words should be marked with an asterisk as reconstructions generally are (e.g. the ablative in 𒉺𒀪𒄯, which is partially attested). There are two issues with this method. The first is ambiguous characters, which are divided into two types: ambiguous voicing and ambiguous vowels. Ambiguous voicing is easy to solve: we can simply use the voiceless sign, just like Kloekhorst. Hittite used voiced and voiceless signs interchangeably and showed no voice assimilation, so it's unlikely voice was a distinctive feature (as Kloekhorst argues). Hoffner & Melchert say the following about the issue:
"Some cuneiform signs have more than one phonetic value, that is, they are polyphonous. Some CV type signs whose initial consonant is a stop can have either a voiced or voiceless interpretation: BU can be bu or pu. Signs of the types VC and CVC do not indicate whether the final stop is voiced or voiceless (b or p, d or t, g or k). For example, the sign AB can be read ab or ap, ID as id or it, UG as ug or uk. Moreover, when writing Hittite, the scribes do not even use contrastively those CV signs with initial stop that distinguish voicing in the Akkadian syllabary: a-ta-an-zi and a-da-an-zi ‘they eat’, ta-ga-a-an and da-ga-a-an ‘on the ground’, ad-da-as and at-ta-aš ‘father’ (§§1.84–1.86, pp. 35–36). Nevertheless, when transcribing syllabically-written Hittite words, Hittitologists normally transliterate the obstruent according to the value of the cuneiform sign most favored by the tradition of Hittitologists. Usually the favored transliteration is that which uses the number one value (pa, not bá; du, not tù; ga, not kà). Exceptions to this pattern are the preferred transliterations utilizing the voiceless stops such as pí or pé (instead of bi), tén (instead of din or den), pár (instead of bar), pád/t or píd/t (instead of be), tág/k (instead of dag/k). CV signs possessing a number-one value of both voiced and voiceless nature, e.g., BU = bu or pu, are normally rendered with the voiceless stop."
Concerning the ambiguous vowels, we have the sign 𒀪 that in both Akkadian and Hittite accounts for aḫ, eḫ, iḫ and uḫ. There seems to be a preference for aḫ. There are also various characters that cannot distinguish the i from the e; here the preference is i. In those cases, I would simply follow what the source has to say, and if authors happened to contradict each other, just list the alternative form in the page. After all, they will have already transcribed the word for us.
The second problem has to do with logograms (e.g. DUMU.MUNUS, "girl"). I'd say that whenever we can reconstruct the stem, we should do it (as in 𒆜𒀸) and use the one-to-one transliteration if not. --Tom 144 (𒄩𒇻𒅗𒀸) 16:24, 14 January 2018 (UTC)
I would not be opposed to having entries for both at-ta-aš and attaš whose only content is "Romanization of 𒀜𒋫𒀸" and for KASKAL-aš whose only content is "Romanization of 𒆜𒀸" (no Etymology section, no Pronunciation section, no Inflection section, etc.). But the main entries should remain at the cuneiform spellings. —Mahāgaja (formerly Angr) · talk 16:40, 14 January 2018 (UTC)
The cuneiform script can only be added if the authors cited show the transliteration of the word. Hoffner & Melchert have a vocabulary list in their book, but they only show the broad transcription, unless they are written with sumerograms. If we used the stems as lemmas as I proposed, we could create entries based on their list, which happens to be one of the most reliable sources today. And if we happen to find the transliteration, then we can add it along with the original script. Each script is optional on the declension tables for this very reason. But if we decide to use cuneiform as a lemma, then we would be restraining ourselves from expanding the already small set of Hittite words on wiktionary. --Tom 144 (𒄩𒇻𒅗𒀸) 18:07, 14 January 2018 (UTC)
I also want to add that even though logograms are common, we also happen to know the consonantal stem of most of them. --Tom 144 (𒄩𒇻𒅗𒀸) 18:12, 14 January 2018 (UTC)
I think the end goal should be to have all lemmas in cuneiform. But in the meantime, I agree with you: it'd be good to allow users to add full-blown entries in broad transcription (still bearing in mind that they will eventually be converted to simple romanisation entries, once all their info has been moved to the cuneiform lemma.)
Would that be messy, though? For an indeterminate amount of time, we would have some lemmas in end state (full-blown entries in cuneiform), and some in middle state (full-blown entries in broad transcription). I don't know if there's any precedent for that. We do have CAT:Gothic romanizations without a main entry, but these are (already) simple romanisation entries only, and all the info still has to be encoded at the main entry. --Per utramque cavernam (talk) 18:30, 14 January 2018 (UTC)
I think I would support broad transcriptions that are soft redirects. I think the extra information should be kept to a minimum. In reference to a question I asked in the previous conversations, determinatives should not be included. —*i̯óh₁nC[5] 21:47, 14 January 2018 (UTC)
Since it's almost consensual, I guess we'll just keep the lemma forms in cuneiform and create soft redirects for the romanizations; I'm still opposed to this solution though. I agree that the broad transcription shouldn't have logograms of any kind. Concerning the terms in Hoffner & Melchert's vocabulary lists, I guess the best thing to do would be to add the lists to some appendix or request list, and create the entries only once we have the cuneiform script for them. Unattested lemmas should be dealt with in the same way we do with (Vulgar) Latin. And btw, could anybody instruct me on how to use the Module:typing-aids for Hittite? --Tom 144 (𒄩𒇻𒅗𒀸) 05:07, 15 January 2018 (UTC)
@Tom 144: {{subst:chars|hit|a-ku}} produces 𒀀𒆪. That is, you type {{subst:chars|hit|[NAME OF CHARACTERS]}} to output the actual cuneiform. At the moment, there is a module for Hittite not for Sumerian for some reason, so a Sumerian term like "𒂼𒄄" (ama-gi) does not work with this template. —Justin (koavf)TCM 05:28, 15 January 2018 (UTC)
@Koavf: Thank you! --Tom 144 (𒄩𒇻𒅗𒀸) 05:37, 15 January 2018 (UTC)
@Tom 144: No problem. I'm assuming that you have at least a passing familiarity with Sumerian, so could you please take a look at my two most recent creations? —Justin (koavf)TCM 05:45, 15 January 2018 (UTC)
@Koavf:, I'm sorry, but I don't know anything about it. But I would certainly be interested to study the oldest written language if I got some reliable textbook. --Tom 144 (𒄩𒇻𒅗𒀸) 05:59, 15 January 2018 (UTC)
@Koavf: If Sumerian is not handled, it's probably because nobody has expressed a need for it yet. I suggest you post on the module talk page. --Per utramque cavernam (talk) 14:24, 15 January 2018 (UTC)
It would be useful for Hittite too; sumerograms are common. Btw, would it infringe copyright to add Hoffner & Melchert's vocabulary list into Wiktionary:Requested entries (Hittite)? I guess that if we just leave the stems but erase the definitions it would be fine. --Tom 144 (𒄩𒇻𒅗𒀸) 15:57, 15 January 2018 (UTC)
Also, how would we lemmatize morphemes such as -ant-, -iya-, -ili-, -ima-, -ir-, -talla-, -ul-, -att-, -ašti-, -ašha-, -ašša-? We could just use cuneiform too, it would look ugly though. --Tom 144 (𒄩𒇻𒅗𒀸) 16:31, 15 January 2018 (UTC)

Allowing IAST Romanisation entries for Sanskrit

I propose that the result of the "Wiktionary:Votes/pl-2014-06/Romanization of Sanskrit" vote be revisited, and that IAST romanisations be allowed as alternative-form entries of the Devanagari-script lemma entries, in a manner similar to how Gothic is handled.

My main incentive is that the issue brought up by Ivan Stambuk in the talk page of that vote, as well as in "Wiktionary:Grease_pit/2014/July#Sanskrit_transliteration", has, AFAICT, never been properly addressed: namely, that "Vedic Sanskrit uses special accent marks which we don't use in Devanagari, but which are indicated in IAST transcriptions."

This means that relying entirely on the automatic transliteration from Devanagari (by way of Module:sa-translit) actually leads to a loss of information.

One could argue at this point that I should get my facts right, and that it has never been suggested to rely entirely on the transliteration module; that manual transliterations are 1) entered whenever necessary, and 2) never removed when they're present. But is this the case? I genuinely don't know, but if yes, this seems like a huge overhead (unless the automatic transliteration is, for all intents and purposes, sufficient in 95% (arbitrary number) of cases?).

In any case, I think having dedicated Romanisation entries would allow us to relax and not worry about not having complete transliterations everywhere: we would know that they can be found somewhere, and where exactly that somewhere would be.

But one might say that we could provide the manual transliteration directly in the Devanagari-script entry. Yes we could, I guess?

(it has also been suggested that we could insert invisible stress marks in the Devanagari-script, so as to make the transliteration module attain the desired result; but I agree with Ivan Stambuk that "Devising an obscure secondary system with invisible stress marks and whatever in Devanagri is absurd", not to mention impractical)

I'm totally unqualified to contribute further in any meaningful way, and probably shouldn't get involved in the first place. Still, I thought it would be good to have a new discussion about this, now that we have many users knowledgeable in Sanskrit: @AryamanA, माधवपंडित, Kutchkutch, DerekWinters, JohnC5, Victar, Mahagaja. --Per utramque cavernam (talk) 01:56, 14 January 2018 (UTC)

Oppose: Unnecessary. --Victar (talk) 02:02, 14 January 2018 (UTC)
@Victar, for users without expertise in Devanagari input, do we (EN WT as a whole) have a means for users entering IAST to find the Devanagari entries? An analogy could be made to the use of romaji for Japanese, as a set of soft redirects to get users to the main entries in kana or kanji scripts. ‑‑ Eiríkr Útlendi │Tala við mig 02:09, 14 January 2018 (UTC)
Although this idea does sound fascinating, I agree with Victar that this is unnecessary. The Devanagari transcriptions of the Vedas do indicate the high and low pitch, by means of a horizontal line above and below the character respectively. We can have those symbols. In any case, googling the IAST transcription along with the pitch accent should give the Wiktionary entry, if it exists, as one of the first results. Lastly, IAST has the same symbol for two very distinct phonemes: ळ (ḷa), which is the retroflex /l/, and ऌ (ḷ), which is the syllabic liquid /l/. Although both sounds are very rare in Sanskrit, an IAST transcription kḷp can be ambiguous between कॢप् (kḷp) and क्ळ्प् (kḷp). The current active Sanskrit editors are seeing to it that information with regard to accentuation is not lost, and now with JohnC5's new declension module, even the declension tables record the accent. I personally don't see having to manually enter the accents as a hassle and enjoy working a bit more to make Wiktionary's information more accurate. -- माधवपंडित (talk) 02:53, 14 January 2018 (UTC)
The automatic Sanskrit transliteration is pretty reliable and can continue to be used. Sanskrit Devanagari is very phonetic. What is missing, from the point of view of some users, is the stress marks and some hyphens. I personally oppose the stress marks in the transliteration, since there's nothing in the native script to show the stress. The stress marks could be used in the pronunciation sections, if it's known. Hyphens are used to show the borders between compound words. I also think this is the job of the etymology sections. There won't be any loss of information if Sanskrit entries are maintained properly. I have the same opinion about Hebrew transliterations - if semi-automatic transliteration can be produced for about 70-80% of fully vocalised terms, we should use it and leave the stress marks for the entries with pronunciation sections. Alternatively, invisible symbols could be employed to mark stresses for both Sanskrit and Hebrew, which would only affect the translit, not the words in the native scripts. As it is, the automatic Sanskrit transliteration doesn't override the manual, so, if someone is not happy with the automatic one, can override it with the manual ("tr=") one but I maintain what should belong to entries, should be used there, not in every place Sanskrit terms are used. And I oppose IAST entries. --Anatoli T. (обсудить/вклад) 02:10, 14 January 2018 (UTC)
@Atitarev: There is a way to show accent in Devanagari: (), क॒ (ka), क॑ (). How else could we know where the pitch accent was if Sanskrit compilers of the Rigvedic-era texts didn't use such symbols? I think keeping these in headwords would be a good idea. —AryamanA (मुझसे बात करेंयोगदान) 04:45, 14 January 2018 (UTC)
@AryamanA:: Thanks, I am not familiar with this convention but I don't see why not, as long as everyone is happy with this particular method and there are no more common ones. It can also be made invisible in Devanagari, if purists objected. --Anatoli T. (обсудить/вклад) 04:56, 14 January 2018 (UTC)

@Atitarev: I think purists would be fine with it. There are some variants that are used only in certain texts (the Unicode block "Vedic Extensions" has them), but the ones I showed are the most common. —AryamanA (मुझसे बात करेंयोगदान) 16:37, 14 January 2018 (UTC)
Strong oppose As Madhavpandit has said, accent was in fact marked in Vedic Sanskrit, and it would make sense for us to have it as the |head= parameter on the headword-line templates. But not all Sanskrit words have a known pitch accent, and a lot of words that were borrowed later or first used in Classical Sanskrit just didn't have pitch accent (Classical Sanskrit had syllable weight-based stress). Automatic translit doesn't get rid of anything that is very necessary; pitch accent is really only useful to linguists who reconstruct PIE and priests who do Vedic chanting. As for Ivan Štambuk's comments, I don't have reason to believe he was much more knowledgeable in Sanskrit than, say, JohnC5 or Madhavpandit. (He also copied every entry he made for Sanskrit from Monier-Williams, so it's difficult to assess how much he knew about the language.) Anyways, all the active Sanskrit editors do add the accent when making entries, from my experience. I also add it in etymology sections for Hindi etc. now. —AryamanA (मुझसे बात करेंयोगदान) 04:45, 14 January 2018 (UTC)
@AryamanA: It's unrelated, but I must say I find his almost religious deference to Monier-Williams rather odd. This exchange and this message especially were pretty disconcerting. Saying that Monier-Williams is an exemplary piece of scholarship and saying that it's absolutely unimprovable on any account at all are two quite different things (I wouldn't see much point anyway in copying it verbatim; it's already online after all). But I still think he raised some important points. --Per utramque cavernam (talk) 16:22, 14 January 2018 (UTC)
@Per utramque cavernam: I am particularly surprised by "In other words, there are no problems with Sanskrit entries." I (and others) still are cleaning up the huge messes made by copying from Monier. Monier is also pretty old, and Sanskrit scholarship has advanced leaps and bounds in the past century. As for Sanskrit being a dead language, we still don't know the exact meanings of every Sanskrit word, and Monier didn't either; there's a lot of debate on what certain words in even the Rig Veda mean.
He also claims in the vote that IAST is a neutral way of transliterating Sanskrit and that Devanagari has a "pro-Hindu POV", which IMO is a pretty clueless thing to say. —AryamanA (मुझसे बात करेंयोगदान) 16:35, 14 January 2018 (UTC)
For all of his immense contributions to Wiktionary, Ivan Štambuk always has had problems with a battleground mentality. I think some of the more extreme things he said came from his perception that his judgment was being questioned, and the instinct to fight that off by any means available. Chuck Entz (talk) 01:42, 15 January 2018 (UTC)
Oppose, certainly to having accents on the transliteration. One major problem is that the CDSD version of MW doesn't distinguish between udatta and svarita, so a lot of people don't know about independent svaritas. The notion of correcting incorrectly accented forms isn't great. Also, a lot of academic literature will add accents to example forms of verbs that are not actually attested with accent marking (mostly because the finite forms appear in main clauses). So it's hard to know which accentuated forms are "real" without looking in Grassmann, and even with Grassmann and Whitney, you need to know to interpret things like “kanýā, kaníā” as kanyā̀. Overall, Rigvedic is obscure, difficult to get correct and very spottily attested, so I am opposed to using it in transcriptions. We could represent it in Devanagari, but several opposing and contradictory notational systems exist, so that isn't a good idea either. Though the current situation is annoying, all of the other options are way more prone to error. —*i̯óh₁nC[5] 05:17, 14 January 2018 (UTC)
Perhaps it's not worth marking accents if they are not confirmed by multiple sources, and they should be left out altogether if there is any doubt. We don't normally mark accents for word stresses in Old Church Slavonic or Old East Slavic, even if accents can be guessed in a large number of cases and confirmed with sources in a smaller number of cases. --Anatoli T. (обсудить/вклад) 05:24, 14 January 2018 (UTC)
Oppose One learns the script first before dealing with the language; it should not be that hard. I can’t see much value in people wanting to find Sanskrit entries without caring about the script. Also what the others said: Too many variant transcriptions, too inexact transcriptions, too bad sources, too high probability of errors. Palaestrator verborum sis loquier 🗣 10:33, 14 January 2018 (UTC)
I disagree with your argument that "One learns the script first before dealing with the language, it should not be that hard. I can’t see much value in people wanting to find Sanskrit entries without caring about the script.".
There are many possible reasons someone might want to look up entries in any non-Latin script, without having any intention of becoming a student of that language or of learning the script (such as when researching the etymologies of derived terms in other languages). And even if the user can read the script, that's not the same thing as being able to input that script easily.
This is separate from the issue of whether to include IAST entries. I simply wish to point out the potential for serious usability issues inherent in your assumptions. I am totally happy not having IAST entries, so long as users still have some means of getting to the Devanagari-spelled entries without having to search for the Devanagari strings. ‑‑ Eiríkr Útlendi │Tala við mig 11:26, 14 January 2018 (UTC)
And the “researching the etymologies of derived terms in other languages” is the only thing I could think about, I don’t see the “many possible reasons”. And those should be able to use the search, and maybe they should learn the language a bit because it is prone to errors if one adduces formations from a language without knowing anything about its morphological shapes and their frequencies.
Note that one does not “pick up some Sanskrit” to go to India, so the argument that one can make for Japanese that people might be interested in the oral language only is detached from reality.
Whatever cases you contrive, the issue here is that they need to constitute sufficient reason for the additional maintenance burden of romanization entries to be acceptable. Palaestrator verborum sis loquier 🗣 12:43, 14 January 2018 (UTC)
Yes, Devanagari is pretty much the standard script for Sanskrit now. Mediawiki has built in Devanagari input tools, hit "ctrl-m" in any text field and select Sanskrit. The popular INSCRIPT keyboard is available and so is a simple transliteration keyboard based on IAST. I use these all the time. —AryamanA (मुझसे बात करेंयोगदान) 14:41, 14 January 2018 (UTC)
@Palaestrator verborum, AryamanA, please note, I am not arguing that we need IAST entries. I am only arguing that we need to ensure that, whatever we choose to implement, we are not introducing barriers to usability.
For instance, Ctrl-M doesn't work for me at all (Chrome on Win 10), and I have no Devanagari input installed on my machine. When editing an entry, I could at least use Edittools to get Devanagari input that way. However, Edittools is not available for the search bar. Moreover, Devanagari input requires that the user know the script, which is a barrier to entry. Granted, anyone interested in Sanskrit over the long term will want to learn the script. However, everyone must start somewhere, and especially for casual users and beginning learners, we need to make sure that users can still find the Devanagari-script entries, even if they only search on Latin-script spellings. So long as that search feature works, I have no qualms. ‑‑ Eiríkr Útlendi │Tala við mig 20:30, 14 January 2018 (UTC)
@Eirikr: What you described is true for any language. We don't do this for Arabic, Persian, Hindi, Russian, etc. etc. ad nauseam even though there are plenty of learners who don't learn the Arabic script or the Cyrillic script at first. Frankly, Mediawiki's search function is good enough to locate the entries by searching for the transliteration.
I'm using Chrome on Mac (macOS Sierra) and Mediawiki's input tools work so well (and are fast enough) that I never bother using the built in input method. I don't know why they're not working for you, that's definitely a problem. —AryamanA (मुझसे बात करेंयोगदान) 20:43, 14 January 2018 (UTC)
@AryamanA: I assume by "this" in we don't do this, you mean creating romanized entries? Indeed. Searching for a term by language + romanized string does seem to work to some extent, and this thread is prompting me to re-evaluate the usefulness of romanized entries for Japanese. However, there are some hiccups: searching for "sanskrit karpasa" gives me lots of other Indian-language entries, but not the Sanskrit one at कर्पास (karpāsa). This is not the expected result. If I search just for "karpasa", the Sanskrit entry is the third one down for me. For other Latin-script strings with more overlap with other languages (say, "gola"), it's even harder to find the Sanskrit entries. Is there any way of improving the search functionality? ‑‑ Eiríkr Útlendi │Tala við mig 21:14, 14 January 2018 (UTC)
@Eirikr: Yes, I mean romanized entries, sorry if I was unclear. Adding incategory:"Sanskrit lemmas" to the search narrows down to searching only Sanskrit terms, but that isn't immediately obvious to a casual Wiktionary user. I think Japanese is a different case, because from what I know Romaji is used a lot in learner's material, whereas the books I've used to learn Sanskrit always have a unit on the Devanagari script. (I also think we should keep Pinyin redirects for Chinese, I use them a lot for learning Mandarin). —AryamanA (मुझसे बात करेंयोगदान) 21:23, 14 January 2018 (UTC)
@AryamanA, Eirikr:: When I joined Wiktionary, romaji and pinyin entries had full-blown entries, as if they were the proper native Japanese and Chinese scripts. Their status has been reduced to soft-redirects and Japanese kana entries work well for disambiguations. They still enjoy higher status than any other romanisation but it's not fair to other languages. If the search functionality is improved, we don't need romanised entries. --Anatoli T. (обсудить/вклад) 22:20, 14 January 2018 (UTC)
  • I would have no objection to including an entry, for example, for vṛka, that contains no information but "Romanization of वृक (vṛka)", much as we already have for Gothic. Accent marks (both Latin and Devanagari) could be included in headword lines and stripped from links, just as macrons already are for Latin and Ancient Greek. Incidentally, the ambiguity of "ḷ" is actually easy to resolve: ळ must (I'm pretty sure) always be adjacent to a vowel, while ऌ may never be. And even if both कॢ (kḷ) and क्ळ् (kḷ) really do exist, there's nothing stopping us from having an entry for kḷ that says "1. Romanization of कॢ (kḷ) <br/> 2. Romanization of क्ळ् (kḷ)". —Mahāgaja (formerly Angr) · talk 16:53, 14 January 2018 (UTC)
    @Mahagaja: मीळ्ह (mīḷha) exists at least. I don't think we really need romanizations though, because if you search for "vrka", वृक (vṛka) is in the results anyways. —AryamanA (मुझसे बात करेंयोगदान) 20:43, 14 January 2018 (UTC)
    And in मीळ्ह (mīḷha), ळ is adjacent to a vowel, so it's not a counterexample to my statement. (I'm not sure whether you intended it to be one, though.) When I search for "vrka", वृक (vṛka) is the sixth result listed, which isn't very good. And what if I'm looking for क (ka)? If I search for "ka", क (ka) doesn't appear until the fifth page of results. Not very useful at all. —Mahāgaja (formerly Angr) · talk 23:02, 14 January 2018 (UTC)
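    As a rough illustration of the accent-stripping idea raised in this sub-thread (diacritics kept in the displayed form but stripped to build a lookup key, as is already done for macrons in Latin), here is a minimal Python sketch. The names `strip_diacritics` and `group_by_key` are hypothetical and are not part of any Wiktionary module; the sketch assumes only Python's standard `unicodedata` library.

```python
import unicodedata
from collections import defaultdict


def strip_diacritics(text: str) -> str:
    """Remove combining marks so that e.g. an IAST form becomes a plain-ASCII key.

    Hypothetical helper for illustration only.
    """
    # NFD splits precomposed letters into base letter + combining mark(s).
    decomposed = unicodedata.normalize("NFD", text)
    # Keep only characters that are not combining marks.
    kept = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Recompose whatever is left into canonical form.
    return unicodedata.normalize("NFC", kept)


def group_by_key(romanizations):
    """Group romanized forms by their diacritic-free key (hypothetical helper)."""
    groups = defaultdict(list)
    for form in romanizations:
        groups[strip_diacritics(form)].append(form)
    return dict(groups)


if __name__ == "__main__":
    for form in ["vṛka", "karpāsa", "mīḷha"]:
        print(form, "->", strip_diacritics(form))
```

    Several distinct romanized forms can collapse onto the same stripped key, so any lookup built this way would probably need to list every candidate rather than redirect to a single entry.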
Support. The current method of using the search function is insufficient for finding entries reliably. I've had plenty of difficulty finding Russian entries, it needs to be easier. —Rua (mew) 20:45, 14 January 2018 (UTC)
Support. Redirecting people to the Devanagari entries wouldn't do any harm. To me the fastest way to find a Sanskrit entry is looking up a cognate and hoping the term I'm looking for is listed there. This would facilitate things. --Tom 144 (𒄩𒇻𒅗𒀸) 21:11, 14 January 2018 (UTC)
Another option is to browse CAT:Sanskrit lemmas, but that only works for people with a good reading knowledge of Devanagari. —Mahāgaja (formerly Angr) · talk 23:02, 14 January 2018 (UTC)
Support, without accent marks of course. I feel that @AryamanA and others are getting far too wrapped up in that instead of acknowledging that accentless IAST soft redirects could serve our users. —Μετάknowledgediscuss/deeds 23:41, 14 January 2018 (UTC)
It's my fault though, I shouldn't have presented this stuff about accents as the main reason for the proposal; in the end, it's probably the weakest of all. --Per utramque cavernam (talk) 23:46, 14 January 2018 (UTC)
@Metaknowledge: Do our users really not know about tools like this? —AryamanA (मुझसे बात करेंयोगदान) 23:49, 14 January 2018 (UTC)
FWIW, I didn't, and I think it's a fair bet that casual users of Sanskrit won't necessarily know about it either. ‑‑ Eiríkr Útlendi │Tala við mig 00:33, 15 January 2018 (UTC)
A thoroughly plausible scenario is someone seeing a romanized Sanskrit term in a dictionary's etymology or a linguistics article and wanting to find out more. Such people aren't going to know much about what tools are available, nor are they likely to bother with them if they're pointed to them.
I have no problem with romanization entries that are soft redirects, as in Gothic- as long as all the content is in the Devanagari entry. There are so many potential ways to represent Sanskrit that we need to have one designated standard to keep content from getting unmanageably scattered all over the place. Chuck Entz (talk) 01:42, 15 January 2018 (UTC)
@AryamanA: I didn't know about it either. I think you may be in too deep to realise what those of us who have never studied an Indian language are like when it comes to using a dictionary. —Μετάknowledgediscuss/deeds 03:05, 15 January 2018 (UTC)
@Metaknowledge: It would have helped me a lot to know the Persian script for Hindi etymologies, so I learned it. Before that, I used far more comprehensive dictionaries than Wiktionary to find Persian stuff.
Anyways, I would support this if it wasn't Sanskrit specific. There are many other languages (that aren't dead!) that learners could benefit from having transliteration redirects for. —AryamanA (मुझसे बात करेंयोगदान) 13:43, 15 January 2018 (UTC)
Yes, admittedly learners may have issues with foreign scripts, but there are many scripts much more complex than Devanagari, and we don't create soft-redirect entries for them. Why should Sanskrit be another privileged exception? --Anatoli T. (обсудить/вклад) 05:50, 15 January 2018 (UTC)
Oppose. Sanskrit may not have had an official script initially, but the modern convention is to use Devanagari. Sanskrit is adequately represented with Devanagari, and as an abugida the individual units of the Devanagari script in most cases have a direct relationship with their transliterations and transcriptions.
Even if Devanagari is given primacy, Anatoli: "[Romanized soft redirects] will mislead users that it's OK to write Sanskrit in Roman" at all times and that Romanized forms are as equally legitimate as the Devanagari forms. The romanized alternate forms could be confused with the lemmas themselves. It would probably be better as Anatoli suggested to "help users use Devanagari and other complicated scripts and help them find what they're looking for" such as Wyang's idea to "develop reverse transliteration modules". Kutchkutch (talk) 07:07, 15 January 2018 (UTC)
@Kutchkutch -- by way of examples of soft redirect entries, please view hōhō#Japanese, kawara#Japanese, and sukī#Japanese. You'll note that all of them have zero content -- just a note that this is a romanized spelling of a term, and a link to the non-romanized entry. There isn't really any reasonable way for users to confuse these with the full lemma entries. (Note: I'm not arguing for IAST entries, I'm just offering examples of what that might look like to address specific concerns.) ‑‑ Eiríkr Útlendi │Tala við mig 09:56, 15 January 2018 (UTC)
That's what I was going to say; I'm not suggesting that we should have anything more than this. The IAST entries would simply be soft redirects, really. --Per utramque cavernam (talk) 10:12, 15 January 2018 (UTC)
@Kutchkutch, Atitarev: There are grammar books and readers of Sanskrit written entirely in romanization, e.g. Wackernagel's grammar and Liebich's reader. Granted, it tends to be 19th- and early 20th-century scholars from Germany who use the Latin alphabet exclusively, but such works do exist. I really fail to see the harm in providing soft redirects from the romanized forms to the Devanagari forms. —Mahāgaja (formerly Angr) · talk 11:07, 15 January 2018 (UTC)

😁 How is it even easy for the users to write the signs needed in the IAST romanization? The Anglo-Saxons who are not tech-savvy even fail to write ñ or – and have to learn how to write characters outside ASCII. Though the software redirects, it is doubtful that people even think so far that transliterations could be entries and then use them for getting to Devanāgarī entries, because they would think that they cannot access the entries anyway because of not being able to write IAST. I have the impression that for Anglo-Saxons on the internet it is even easier to write Indian scripts than to use correct quotation marks … Palaestrator verborum sis loquier 🗣 10:08, 15 January 2018 (UTC)

Oppose as well, mostly because it is unnecessary. Writing in Devanagari online is quite easy for even those who barely try. Sorry I'm late, just got back from abroad. DerekWinters (talk) 02:40, 19 January 2018 (UTC)

'Palaestrator verborum'

'Causing our editors distress by directly insulting them or by being continually impolite towards them.' [14], [15] Done Kaixinguo~enwiktionary (talk) 12:47, 15 January 2018 (UTC)

I think this warrants a block already; this kind of behaviour of insulting an entire community, ethnic group, etc. should not be tolerated here. Wyang (talk) 13:35, 15 January 2018 (UTC)
Unless there are more statements which are harsher than the ones linked, I do not think this warrants a blocking. To my reading, the second statement is akin to "curse the Irish for inventing Guinness." It is worthwhile to let PV know that their comments were not well taken and that they should use more discretion in the future. If the behavior persists or worsens then, or if there are other comments which I have not seen, perhaps a block may be in order. - TheDaveRoss 13:51, 15 January 2018 (UTC)
This should not be acceptable around here... —AryamanA (मुझसे बात करेंयोगदान) 15:42, 15 January 2018 (UTC)
@TheDaveRoss Seriously? It might be worthwhile to read the entire sections that came after the linked edits. When I challenged him on his statement wishing death to all Christians, at which point you might have expected him to apologise or clarify that he was joking, he demonstrated that he was, in fact, entirely serious ("Why should I like Christians? Complimenting Christianity is tantamount to outspokenly support criminality." [16]).
I decided to ignore this deliberate and frankly childlike provocation in order to work towards establishing some of the meanings of the entry at hand, and trying to assist by providing information from Persian-only sources (Dehkhoda). I was only forced to communicate with him against my better judgment due to the fact that he had deliberately used an archaic word on that page ('wherewith') even though other editors have already had cause to warn him against using archaic or poorly-worded English in entries. It can only be assumed that this was deliberate, as he himself describes his English as being at 'near-native' level on his own user page. In the first example linked, he also mis-characterises my effort to help establish the correct translation of this word as being 'entitled that the whole world rotates around them; everybody knowing what they use in their ritual acts' due to my Christianity (real or otherwise). He has unleashed a tirade of bigoted abuse directed at a whole group of people and also at myself as an individual. Kaixinguo~enwiktionary (talk) 16:11, 15 January 2018 (UTC)
In any case, any block is irrelevant and purely symbolic, as he would certainly be back to edit afterwards. I think I just wanted to draw attention to how he has behaved. Kaixinguo~enwiktionary (talk) 16:14, 15 January 2018 (UTC)
You have decided to be offended. As I said, I did not know about “your Christianity”, and apparently I could not either.
The wording “bigoted” is very striking, for this is taken from religion, and therewith it is claimed that I have to adhere to Christianity. Here I could say that this warrants a block for Kaixinguo~enwiktionary because he tries to propagate his religion by removing those who are positioned against it.
What Wyang says “insulting an entire community, ethnic group” is beside the point. People hardly choose to belong to ethnic groups, but people choose to exercise Christianity, and Kaixinguo~enwiktionary chose to throw upon me expectations of being entangled in Christianity, so that I should know what happens inside of churches. Also there isn’t such a thing as “insulting a community”. The punishable offences of “insult” always protect the honour of individuals, and communities are not individuals and attacking the phenomenon of behaving in conformance with Christianity does not reach out to the honour. And the concept is contourless:
What if I spoke out against analphabetism, drug abuse, or gluttony? It is generally agreed that these are vices and it would not be edgy to take position against them, so why should Christianity get a special treatment? Or is being a druggie acceptable if one is a druggie in a community? People choose to memorize the deeds of Jesus of Nazareth and to sit down regularly at church pews as others engage in bulge-drinking or lechery, which looks equally freely decided, so why has Christianity to be regarded more favourable? Is it because there are so many Christians around? Nobody would look up if I cursed some died-out cult from Antiquity, and yet still what I am not interested in is the lives of the Christians – I would be content if the Christians all ceased to be Christians, and if I wish death to them nobody knows it and it does not matter because it does not matter what I wish. I can wish what I want and I can wish death to whom I want as long as I do not express incitement for forceful realization of it. (Though still it is a debatable question if it is allowed to invite someone to kill himself, because he has the right to do it, but perhaps not here.) And this digresses, I have not enticed nor have I even expressed a wish of death but I reported a wish that Christianity ceases to be; if this were an offence it would mean that it is an offence to tell the truth. People here fail to distinguish between assertive illocutionary acts and directive and expressive illocutionary acts.
It would for example be not improper to tell him that I wish Christians to be dead if he asked me what I want about Christianity, because then we are talking only about the true states of things. Directives on the same are generally harmful, whereas about expressions it must be weighed, for emotions may be desired as well as undesired; but I have always recommended not to have any emotions.
Not sure about wherewith. therewith is quite common and thus the intelligibility of wherewith is not lessened even by its falling out of use; but for me it has been just a translation of womit, and a German hardly notices anything when he reads that word, and I might use the whole collection of such words by influence from legalese. Palaestrator verborum sis loquier 🗣 17:48, 15 January 2018 (UTC)
I changed my mind, let's block him for being obnoxious. - TheDaveRoss 19:59, 15 January 2018 (UTC)
This kind of language is unacceptable. —Justin (koavf)TCM 20:23, 15 January 2018 (UTC)
This geezer usually has too much to say for himself. I will go along with a block if it's considered necessary. Has he been booted off somewhere else? DonnanZ (talk) 20:39, 15 January 2018 (UTC)
I agree with Koavf. I think a one week block would let Palaestrator verborum cool down. —AryamanA (मुझसे बात करेंयोगदान) 22:05, 15 January 2018 (UTC)
Don't do it on my account, and also don't expect him to change- he isn't going to. Kaixinguo~enwiktionary (talk) 22:10, 15 January 2018 (UTC)

This looks like a witch hunt by the PC police. Palaestrator is entitled to expressing strong opinions on Wiktionary, as long as that is not the only thing he does around here. Also, I like his archaic language. He sounds like Bogorm on steroids. --Vahag (talk) 21:44, 15 January 2018 (UTC)

"analphabetism" was as far as I got... —AryamanA (मुझसे बात करेंयोगदान) 21:58, 15 January 2018 (UTC)
"This looks like a witch hunt by the PC police. Palaestrator is entitled to expressing strong opinions on Wiktionary, as long as that is not the only thing he does around here." This is where you're mistaken: this is a dictionary. His "strong opinions" about religion or ethnic groups or coffee are irrelevant. So he's free to express them as long as he bears in mind that off-topic ranting that others find obnoxious and distracting from the project of making a dictionary is absolutely a good cause for blocking him. Why is it you think that the Beer Parlour is a free hosting service for flagrantly stupid bigotry? —Justin (koavf)TCM 22:30, 15 January 2018 (UTC)
Nor is the Beer Parlour a place for piling on a user and virtue signalling. --Vahag (talk) 22:55, 15 January 2018 (UTC)
Just ignore Vahag, he has a history of having "strong opinions". I think he's joking, but I'm never sure. —AryamanA (मुझसे बात करेंयोगदान) 23:32, 15 January 2018 (UTC)
I am not joking this time. I too have been on the receiving end of such an unfair witch hunt. It starts with a hysterical and insecure user taking offence from some harmless joke or rant and looking for protection in the mob. Then the mob takes turns in haranguing the accused, taking pleasure in “protecting” some minority group from this evil person. Usually they do not belong to the “wronged” group and have no idea if they are insulted (like the Christians would need any of your protection). They are simply virtue signaling.
Wiktionary editors are not your employees. They are not robots. They are supposed to have rants and express unusual opinions from time to time, even offensive ones. If you don’t like that, don't interact with the user.
@Palaestrator verborum, please don’t be discouraged from editing. Your high-quality contributions are very welcome. --Vahag (talk) 13:10, 16 January 2018 (UTC)
@Vahagn Petrosyan: I have no interest in "protecting" Christians, I just think you're forgetting this is a dictionary. Like, what possible reason is there to say that kind of stuff on a dictionary website? There's nothing so stressful about editing a dictionary that would lead to ranting (at least in my view). There's no doubt Palaestrator has great contributions, and I've gotten tremendous help from him when I've asked, but this kind of stuff is just not acceptable. Besides, it's just a week-long block, if he really does care so much about the dictionary (and I'm sure he does), he will come back. —AryamanA (मुझसे बात करेंयोगदान) 17:06, 16 January 2018 (UTC)
This isn't some harmless joke or rant. This is explicit religious profiling: death wishing and revilement in face of one who is clearly traumatised. There is no attempt of making the “joke” light, and User:PV only upped his tirade of abuse after seeing the other party has taken offence. This isn't being “odd” like he claims himself to be; this is being obnoxiously self-obsessed. Clearly he doesn't think any of what he has written was inappropriate ― the next target will just be a matter of time. Wyang (talk) 13:47, 16 January 2018 (UTC)

Let's draw a line under this and end this discussion here. I've never seen the like of it in more than ten years (on and off) here, not even when Crazy Yalda Guy threatened me with a dictionary. I'm taking a break, which I had decided before this morning and there should be no block of PV as it won't serve any purpose. That will be an effective end to the matter, as it's clear that the root cause is that he and I are two totally and utterly incompatible people. It happens. Kaixinguo~enwiktionary (talk) 23:20, 15 January 2018 (UTC)

@Kaixinguo~enwiktionary I wish you great relish! 💛 Palaestrator verborum sis loquier 🗣 00:32, 16 January 2018 (UTC)
I’ve blocked him for 1 week now, per the suggestions by other editors above.
@Palaestrator verborum What you have said on this page and other related pages is deeply insulting to User:Kaixinguo and many other editors in the Wiktionary community. You are entitled to your opinions, but using insults and profiling as such is immature and unacceptable. Please cool down during this period and realise that those comments are not welcome here. I suggest we hide the relevant revisions. Wyang (talk) 04:40, 16 January 2018 (UTC)

I suggested earlier today on his talk page unblocking him. It's best to just move on from this IMO. Kaixinguo~enwiktionary (talk) 21:12, 16 January 2018 (UTC)

His comments were inappropriate, regardless of who was and wasn't offended. I don't think his block should have been shortened. --Victar (talk) 09:39, 17 January 2018 (UTC)

@Victar: I only did it because of what Kaixinguo said. Honestly, I don't think he's going to change no matter how long the block is. —AryamanA (मुझसे बात करेंयोगदान) 15:03, 17 January 2018 (UTC)
@AryamanA: This block was beyond simply the matter with Kaixinguo, and he was the only person I saw wishing to remove the block. Shortening the block was premature, and though it may be symbolic, I think we should be clear that this sort of dialog is unwelcome to the project. --Victar (talk) 15:30, 17 January 2018 (UTC)
@Victar: That is a very good point. I've un-shortened the block. —AryamanA (मुझसे बात करेंयोगदान) 16:47, 17 January 2018 (UTC)
The block length now seems to be taken from when it was last changed instead of from the original start date. Kaixinguo~enwiktionary (talk) 17:19, 17 January 2018 (UTC)
As I have made clear, I think the block should be lifted now. The point has been made and I have offered to take a break and we had come to an agreement. Honestly, it's not like he's going to go on a crazy spree like some people who have been blocked have done in the past, and he hasn't had another go at me (that I can see), which is probably what I would have done if it were me in his position. 'It takes two to tango' and I didn't have to react to what was written, either. I could have closed the page and done nothing but I have a fiery temper and decided to respond. So I'd really appreciate it if he can be un-blocked. From a selfish POV, I feel compelled to keep on checking back to see what has happened and I just want to leave. Kaixinguo~enwiktionary (talk) 17:33, 17 January 2018 (UTC)
The block was not because you wanted him blocked, it was because Wyang looked at what had transpired and determined that a block was in order. I think you were right to raise your concerns, and I think Wyang made a reasonable determination. It is not your "fault" that the block occurred, you can feel free to move along from the issue. - TheDaveRoss 19:41, 17 January 2018 (UTC)
Oops, fixed. Anyways, TheDaveRoss is right, there were other reasons for such a block to have happened, and it wasn't your fault Palaestrator chose to say what he did. We can't let this kind of dialogue be acceptable here. —AryamanA (मुझसे बात करेंयोगदान) 19:50, 17 January 2018 (UTC)
This sort of behavior is totally unacceptable because saying such strong words is not only irrelevant to the dictionary, but can also very easily scare editors away from the project. We definitely don't want that! PseudoSkull (talk) 01:25, 19 January 2018 (UTC)

Proposal: adding elasticity/flexibility in Chinese entries

I'll be concise for those knowledgeable, and refer to brief and basic bibliography for those who are not.

Chinese elasticity/flexibility is a lexical property of Chinese terms, two sides of the same coin, which must be reflected in the very same entry for a given lemma.

Therefore, for example the fifth version of the prestigious XDHYCD (Xiandai Hanyu Cidian) applies mutual annotations in the respective entries, so that the entry for 煤 mei ‘coal’ reads "noun, … also called 煤炭 mei-tan ‘coal-charcoal’", and the entry for 煤炭 meitan ‘coal-charcoal’ is annotated as "noun, 煤 mei ‘coal’".

Unfortunately, Wiktionary currently reflects this wrongly: in the broadly termed 'compounds' section, as a synonym, or after 'see also', and only for the monosyllabic version.

Please, before commenting read the following brief article (and if necessary further references within it); if you still have any questions, I'll be glad to try and answer them.


Finally, elasticity from Xiandai Hanyu Cidian 2005 has been tabulated in the following open access thesis


I hope an enriching discussion ensues for this critical lexicographical issue. --Backinstadiums (talk) 15:33, 15 January 2018 (UTC)

The shadow of the Wikimedia Foundation

Hi all,

Just to let you know, an admin on French Wiktionary has been globally banned by the Wikimedia Foundation. There was no contact before the sudden change on his personal page, no explanation of the reasons behind it, no possibility of appeal, and no discussion about the procedure. Classiccardinal was never contacted by the people who decided this, and neither were our community members. We suppose this ban is based on some insult he wrote on French Wikipedia two years ago and a stupid joke he made on Commons. He was banned on those two projects but was a great contributor on Wiktionary, nice with newcomers and very helpful in answering questions politely. Sure, he used coarse language from time to time, but only with colleagues, and he was never offensive; it was just his manner of communication and we had adapted to it.

I am sharing this information here after reading two conversations about people causing problems. They may be judged by others if people here do not decide on appropriate ways to deal with them, and that can be very painful for everyone. Take care of each other, and I hope you never experience such an unfair procedure in your community. If you need assistance in a difficult situation, you can talk to stewards or discuss a global ban, but do not let some bureaucrats decide for you if there is no strong threat or harassment. We are still looking for options on how to modify this procedure, but it appears we are not welcome to take part in this aspect of the governance. So you may hear about this case again in the future, but I am not calling on you to do anything, as we are not supposed to. -- Noé 11:52, 16 January 2018 (UTC)

We've already experienced this phenomenon before at en.wikt, although the most prominent case (Liliana-60) was one where there was arguably due cause. I don't like it, and most of all I don't like that it is impossible to get them to discuss it after the fact. It bears remembering that, for better or for worse, democratic principles are not among the central ideas that inspire how the WMF works. —Μετάknowledgediscuss/deeds 15:56, 16 January 2018 (UTC)
Yes. "Shadow" is a good word for it! There is the WP:OFFICE problem where they sometimes hush things up due to legal arse-covering. ("One of the terms of the settlement was that we would not disclose any of the terms of the settlement"... where'd I see that?) Equinox 22:51, 18 January 2018 (UTC)

Nym-type in bold

I think having the nym-type in bold looks overbearing, often larger than the definition itself.

  1. mad
    Synonyms: angry

I would rather the nym-type be made normal and the whole thing be in italic.

  1. mad
    Synonyms: angry

@Rua, Erutuon --Victar (talk) 16:54, 16 January 2018 (UTC)

I didn't make it like that originally, so that reflects my preference. I don't see a reason to make it italic. —Rua (mew) 16:59, 16 January 2018 (UTC)
I agree that bold overemphasizes "Synonyms". But it's in the spirit of overemphasizing headings relative to content. DCDuring (talk) 17:44, 16 January 2018 (UTC)
True! But still, I would drop the bolding. - -sche (discuss) 18:47, 16 January 2018 (UTC)
@Rua: I'm not married to the italic suggestion. --Victar (talk) 18:45, 16 January 2018 (UTC)

To broach a larger question, why are we placing {{syn}} under the definition instead of under its own header, like we do Related terms? --Victar (talk) 18:55, 16 January 2018 (UTC)

Because synonyms are sense-specific, related terms aren't. —Rua (mew) 19:07, 16 January 2018 (UTC)
The header format is still allowed, though. I still use it sometimes, when it works for many senses. --Per utramque cavernam (talk) 19:11, 16 January 2018 (UTC)
@Rua: So are translations, but again, their own section. --Victar (talk) 19:12, 16 January 2018 (UTC)
Who says the current placement of translations is a good thing? DTLHS (talk) 02:17, 17 January 2018 (UTC)
I don't particularly like it when "Alternative forms" are regularly placed above "Etymology" by a certain bot. DonnanZ (talk) 10:24, 17 January 2018 (UTC)
@DTLHS, Donnanz There was a vote specifically allowing alternative forms to be placed below the definitions, if the bot is changing that it's in error and should be fixed. —Rua (mew) 20:15, 17 January 2018 (UTC)
@Rua: I can't remember the vote, can you pinpoint it? DonnanZ (talk) 20:25, 17 January 2018 (UTC)
Wiktionary:Votes/pl-2016-09/Placement of "Alternative forms" 2 (weaker proposal). —Rua (mew) 20:43, 17 January 2018 (UTC)
Yeah, I abstained, but that would be preferable to what's happening at the moment. DonnanZ (talk) 20:56, 17 January 2018 (UTC)
(chiming in...)
I agree with DonnanZ.
I missed both votes. For Japanese, neither of the suggested placements (above syns as a POS subsection, or at the top above everything) is appropriate. Alternative forms in Japanese are determined by etymology and pronunciation, not by POS. This is why I (and I believe other JA editors as well) have placed alt forms after the etym and pronunciation, and before POS sections. A single JA spelling might have multiple separate etyms and pronunciations -- see for one such example, showing how alt forms are tied to the etym + pr combination. Native monolingual dictionaries are structured in a similar fashion; I would be happy to supply screenshots. For consistency across JA entries, it makes the most sense to place alt forms in the same location even for JA spellings that only have one etym and pronunciation.
Mandating a single structure for all languages, without properly considering the impacts on all languages, doesn't strike me as the best way forward. ‑‑ Eiríkr Útlendi │Tala við mig 23:00, 17 January 2018 (UTC)
As an aside to that, in "Templates and Headers" we have ===Alternative forms===, not ====Alternative forms====. DonnanZ (talk) 12:48, 19 January 2018 (UTC)

How about:

  1. mad
    <small>Synonyms: angry</small>

Or is that too small? I also think that no matter what format we choose, the nyms should be made collapsible by default (using User:Ungoliant MMDCCLXIV/synshide.js). —AryamanA (मुझसे बात करेंयोगदान) 02:13, 17 January 2018 (UTC)

Looks good to me. DCDuring (talk) 03:41, 17 January 2018 (UTC)
I think it's too small, but definitely agree that it should be collapsed by default, similar to how quotations currently are. --Victar (talk) 06:49, 17 January 2018 (UTC)
I support dropping the bolding and wikification (at least for the well-known names: synonyms and antonyms) from the nym type.
A smaller font doesn’t seem necessary if they are collapsed by default, but it does look nice. — Ungoliant (falai) 21:06, 17 January 2018 (UTC)
You should all look at {{zh-syn}} as well, it looks pretty nice. —AryamanA (मुझसे बात करेंयोगदान) 21:58, 17 January 2018 (UTC)

Kazakh romanization

https://www.nytimes.com/2018/01/15/world/asia/kazakhstan-alphabet-nursultan-nazarbayev.html —Justin (koavf)TCM 17:53, 16 January 2018 (UTC)

@Koavf: We already had this conversation, at Wiktionary:Beer parlour/2017/October#Kazakh orthography, where we essentially concluded that we will wait for attestation. What do you have to add by posting this? —Μετάknowledgediscuss/deeds 17:58, 16 January 2018 (UTC)
Just, "this is neat, I think you might be interested". —Justin (koavf)TCM 18:23, 16 January 2018 (UTC)

Desysopping for inactivity

Per Wiktionary:Votes/pl-2017-03/Desysopping for inactivity, we can (should?) desysop the following users:

I suppose we should warn the admins who have been recently active that their status is liable to be removed right now? --Per utramque cavernam (talk) 20:30, 16 January 2018 (UTC)

Why not just get to work on the ones inactive since 2015 or earlier?
Have we had an actual problem with any admin account being hijacked? Have we had any signs of such trouble? Have any wikis, especially Wiktionaries, with our level of activity had such trouble? DCDuring (talk) 20:58, 16 January 2018 (UTC)
I think the results of the vote are pretty clear. Yes, we should de-sysop users who have not used their tools in the past five years. As to whether or not there have been issues, I don't think that matters. - TheDaveRoss 21:05, 16 January 2018 (UTC)
@TheDaveRoss The vote doesn't command us to desysop; it allows us to. I am asking whether there is any compelling reason to do so, especially in the case of those who are recently less active. DCDuring (talk) 02:23, 18 January 2018 (UTC)
@DCDuring I agree, it isn't written as a mandate. As it stands all it does is allow 'crats to change the user rights if they feel like it. I assumed that we would actually make that a practice as well, which I don't think was an unreasonable line of thought. With regards to compelling reason to do so, I think there are lots of good ones, none of them particularly urgent.
It is best if the administrator lists reflect the active administrators on the project. This helps people looking for help to more easily find it. If you leave a message on, say, Conrad.Irwin's talk page he is unlikely to respond quickly to assist you. An active list also helps us keep track of how many people are doing admin work, so that if the number dips particularly low we know to seek out more. There is also the small chance that an account gets compromised. This is unlikely and would not cause lasting harm, but more administrators means more surface area for attack. I don't give too much weight to that argument, but it has been made.
Finally, there is the question of what user rights represent. I consider user rights to be an expression of trust on behalf of the community to the particular user. After several years of inactivity there is a new community, with new people and practices. This is also an argument in favor of discrete terms in roles, which I would probably support if it didn't mean so much extra overhead in the form of voting and role changing and keeping track of duration. I find the automatic removal after a long period of inactivity to be a low-maintenance method of imposing this sort of term limit. It is not hard to become an administrator, so if a trusted user returns they would almost certainly have no difficulty regaining their rights. - TheDaveRoss 13:03, 18 January 2018 (UTC)
I think this user is a little overzealous. If a user was active last month they are not inactive, whether they use certain tools or not. DonnanZ (talk) 21:16, 16 January 2018 (UTC)
The vote was very specific, the measuring stick is use of tools. Also, if you have had admin rights for five years and have not used them, why do you need them? If someone has not used them but would like to keep them they can use them, there are consistently dozens of pages to be deleted, and there is a person to block every hour or two. - TheDaveRoss 21:47, 16 January 2018 (UTC)
You did yourself vote in favour of that rule, so I'm not sure I follow. --Per utramque cavernam (talk) 21:52, 16 January 2018 (UTC)
I voted in favour of desysopping for five years inactivity, but I glossed over the small print. DonnanZ (talk) 22:00, 16 January 2018 (UTC)
Actually, it's a shame Dvortygirl is no longer doing audio, she has a great voice. DonnanZ (talk) 15:49, 17 January 2018 (UTC)
I think they should be desysoped. Even if there are no immediate concerns about their accounts being compromised, we should practise the principle of least privilege. —Internoob 05:21, 20 January 2018 (UTC)
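
As a rough illustration of the criterion discussed above (no use of admin tools in the past five years, which is how TheDaveRoss reads the vote), the check could be sketched in Python. The account names, the dates, and the `inactive_admins` helper are all hypothetical; real data would have to come from the site logs, which this sketch does not query.

```python
from datetime import date
from typing import Dict, List, Optional


def years_before(day: date, years: int) -> date:
    """Return the same calendar day `years` earlier."""
    try:
        return day.replace(year=day.year - years)
    except ValueError:
        # 29 February has no counterpart in a non-leap year; fall back to the 28th.
        return day.replace(year=day.year - years, day=28)


def inactive_admins(
    last_tool_use: Dict[str, Optional[date]], today: date, years: int = 5
) -> List[str]:
    """List accounts whose last admin-tool use is older than the cutoff.

    `None` stands for an account with no recorded tool use at all.
    General edits are deliberately ignored: only tool use counts here.
    """
    cutoff = years_before(today, years)
    return sorted(
        name
        for name, last in last_tool_use.items()
        if last is None or last < cutoff
    )


# Hypothetical data, for illustration only.
example = {
    "ExampleAdminA": date(2012, 6, 1),
    "ExampleAdminB": date(2017, 12, 20),
    "ExampleAdminC": None,
    "ExampleAdminD": date(2013, 1, 16),
}
print(inactive_admins(example, today=date(2018, 1, 16)))
```

An account that edited last month but never touched the tools would still be listed, which is the distinction DonnanZ and TheDaveRoss are arguing about.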

Wyang playing a Lenin on the whole community

I like to keep bitching at around 1‰ (we don’t need more of that here whenever avertible), but in recent days Mr. Wyang has forbidden me to arrange my talkpage in archival fashion [1], ignored related questions in his [2] [3] and deleted messages from mine [4] (he claimed my arrangement was "vandalism" but he obliterates content therefrom and that's ok with him).

I respect his obsession with me (I know commoners marvel at extraordinaires), but it oughtn’t worsen Wiktionary. He has many times proved capable of more than pettiness, so if he could stop engaging in quarrels which, according to him [5], are a loss of time we will all win in the process. He was restored admin rights because, in his own words “It's incredibly frustrating to not be able to delete new user vandalism or delete the original as I move entries with wrong titles”, and less than 6 months after he has just today banned one of the most knowledgeable users we have here, thus exceeding the scope of his initial admin request.

Thanks in advance for taking the time to read my message!

—This unsigned comment was added by Gfarnabo (talk • contribs).

rofl --Per utramque cavernam (talk) 21:15, 16 January 2018 (UTC)
Hah, funny. Blocked. —AryamanA (मुझसे बात करेंयोगदान) 22:25, 16 January 2018 (UTC)

Quickly, Aryaman, ban this user too to attain Nirvana, this is your opportunity!!! talk

Wyang's edits are completely merited and the blocked user's (Gfarnabo) complaints are not. He continues to use sockpuppets to avoid the block. --Anatoli T. (обсудить/вклад) 23:21, 16 January 2018 (UTC)
A perfect example of why we could use local CheckUsers. —Justin (koavf)TCM 23:24, 16 January 2018 (UTC)
But we already have local CU... Or do you mean more local CU? --Per utramque cavernam (talk) 23:26, 16 January 2018 (UTC)
For what it is worth, I went to look at this and Chuck had already done so. - TheDaveRoss 23:29, 16 January 2018 (UTC)
Sorry if I was unclear here: this is in reference to our recent votes on CheckUsers. Some of the editors here felt the user rights were superfluous. —Justin (koavf)TCM 01:34, 17 January 2018 (UTC)
Wrong religion buddy :) —AryamanA (मुझसे बात करेंयोगदान) 23:34, 16 January 2018 (UTC)
You can spend time (in vain) trying to ban me or answer my grievances with more than adjectives.
"Grievances"? Oh, you mean you adding incorrect information in languages you don't know? —AryamanA (मुझसे बात करेंयोगदान) 23:48, 16 January 2018 (UTC)

I wish you a pleasant night either way! —This unsigned comment was added by (talk).

I saw the deleted revision, and I'll respond. First of all, the actual verse (see Wikisource):
अमुं च रोपितव्रणमिगुदीतैलादिभिरामिषेण शाकेनात्मनिर्विशेषं पुपोष ।
amuṃ ca ropitavraṇamigudītailādibhirāmiṣeṇa śākenātmanirviśeṣaṃ pupoṣa .
Second, I have never made any false claims to how much Sanskrit I know. I don't know enough Sanskrit to translate it. I just see a meaningless translation that probably needs way more context and finesse with Sanskrit than you or I have. So it's better to not have it at all and wait for someone more knowledgeable to deal with it, rather than have low-quality content. —AryamanA (मुझसे बात करेंयोगदान) 21:21, 18 January 2018 (UTC)

"From" in etymologies

I've been meaning to bring this up for a while now, but haven't had much time.

Wouldn't it be great if we didn't have to write "from"? Oh, wait, just don't write "from". An etymology by definition tells you where something comes from. - Equinox

I've always written "From" (until only recently) in etymologies because I've seen it done on so many other entries, and I just wanted to copy what was said. Equinox claims this is redundant. I have vague memories of him complaining about this before, but I can't remember exactly what happened.

I'm starting this topic because I can understand where he's coming from. I more specifically remember him also saying something like "The pronunciation doesn't say 'sounds like' before it, so why should the etymology say 'from' before it?"

So, for those reasons, I'm looking for an explanation of why we do this "from" thing here. Could it be because maybe other dictionaries do it, perhaps? That would be the only reason I can think of. I'd also like to propose to disallow etymologies to be worded this way since it is redundant, unless someone comes up with a good explanation of why to say "from".

And naturally, I should ping you, @Equinox. PseudoSkull (talk) 03:00, 19 January 2018 (UTC)

Etymologies are sometimes English sentences (seize) and sometimes formulas (de- + frog). I don't see why making all etymologies into formulas would be a good thing. DTLHS (talk) 03:07, 19 January 2018 (UTC)
The benefit of etys being templates is that we are then only storing the abstract details (X derived from Y) and we can render those details with or without a "from", depending on this week's whim, or a user's choice. The downside (as DTLHS says) is that you can't be discursive or mention anything quirky. Yeah, I hate the "from". (P.S. I want "I have vague memories of Equinox complaining" as my epitaph.) Equinox 03:20, 19 January 2018 (UTC)
Here is the etymology for English man:
From Middle English man, from Old English mann (“human being, person, man”), from Proto-Germanic *mann- (“human being, man”), probably from Proto-Indo-European *mon- (“man”) (compare also *men- (“mind”)).
Now here it is without "from"
Middle English man, Old English mann (“human being, person, man”), Proto-Germanic *mann- (“human being, man”), probably Proto-Indo-European *mon- (“man”) (compare also *men- (“mind”)).
See why we need "from"? Leasnam (talk) 03:23, 19 January 2018 (UTC)
At one point in time we were using <'s in place of "from", but then it began to feel impersonal and cold, so we reverted to using "from". I think "from" is easier to make sense of, especially in lengthy etymologies. If it's just: be- + glimmer, then it can do without the "from", but using the "from" in such circumstances increases consistency across all etymology formats. Leasnam (talk) 03:26, 19 January 2018 (UTC)
I don't see why we need the first one. As I've also said before, etys are a sort of "family tree" and we don't typically include the entire thing in every entry (e.g. we wouldn't/shouldn't include the entire history of "fragment" at "defrag"). I suppose we await better visualisation technologies where you can scan and zoom through a sea of floating words linked by lines or something. (I am serious.) Type "etymology of car" into Google to see their primitive (but quite nice) attempt, which does not use the word "from" at all. Equinox 03:27, 19 January 2018 (UTC)
I remember those ">" etymologies. Yuck. --Victar (talk) 04:03, 19 January 2018 (UTC)
  • @DTLHS: Just so you know that I amended your comment and added a link to make it clear that there is a real-life example that you are citing rather than a hypothetical. —Justin (koavf)TCM 03:29, 19 January 2018 (UTC)
[edit conflict x3...] Strong oppose. Cutting out technically unnecessary words usually results in something taking more brainpower to read, not less. It would also create new problems. For instance, if I understand the suggestion correctly, this...
Borrowed from French rendez-vous, from rendez, second person plural, imperative, of to go (to) + you.
...would become this...
Borrowed from French rendez-vous, rendez, second person plural, imperative, of to go (to) + you.
...implying that "rendez-vous" and "rendez" are simply forms of the same word somehow, the way various forms of the Middle English ancestor are listed at seize (in the case of rendezvous, it's easy enough to figure out, but there are plenty of cases where it would be more confusing). If this is only about removing the initial "from," I think that's not much of an issue, but I don't think there's any point to banning it. Don't fix it if it ain't broke, as they say.... Andrew Sheedy (talk) 03:30, 19 January 2018 (UTC)
Is it something that can be agreed upon that "From de- + frog." etymologies are not necessary, and should just be said as "de- + frog"? PseudoSkull (talk) 03:35, 19 January 2018 (UTC)
I agree that we should leave it as is. Currently, it is optional to leave off the "From" when it's clearly inferred, but it in no way is illegal to add it, because it really does belong there. Leasnam (talk) 03:38, 19 January 2018 (UTC)
I don't think we should actually ban it. But also it shouldn't be compulsory to stick "from" on the front of every simple templated ety. Someone used to do that; haven't seen it recently and can't remember who. Equinox 03:49, 19 January 2018 (UTC)
Personally, I think "from" or "of" at the start of an etymology should be compulsory. I find it makes the most grammatical sense, and I think the etymologies should be proper sentences, not just mechanical hierarchies. --Victar (talk) 04:00, 19 January 2018 (UTC)
But you must admit that if something is compulsory then it might as well be automated (why should users type the same thing every time? we don't have to type the page's HTTP headers). I am sure we can make templates like "compound" and "prefix" say "FROM" at the start of a line if we really want it. How many times do you type "From" in a year? I'm gonna get RSI six months early from typing "===Etymology===" half my life. Equinox 04:05, 19 January 2018 (UTC)
Even having "from" at the beginning doesn't make it a complete sentence. If simple etymologies being complete sentences becomes a thing, check this out: "The word defrog was formed by taking the noun frog and appending the prefix de- to the beginning." That sounds like way too much to write for just an etymology. That looks kind of like how the French Wiktionary does it, btw. PseudoSkull (talk) 04:09, 19 January 2018 (UTC)
If you want fully automated etymologies you should probably write a template that can accommodate all the steps in one go, plus add some JS hooks so we can convert between Leasnam style and Equinox style at a whim. DTLHS (talk) 04:13, 19 January 2018 (UTC)
I'm sure that would be lovely but that would solve the problem "Leasnam and Equinox want to see slightly different etymologies". It wouldn't solve any actual problem that affects most users. I'm also sceptical of "make it a user setting" in general because it tends to indicate some inherent flaw in the design. I could write about five paras on this but it's not necessary yet. Equinox 21:28, 19 January 2018 (UTC)
Actually, replying to Equinox, ===Etymology=== etc. can be added by accessing "Templates and Headers" when editing/creating an entry. Maybe "From" can be added the same way, by adding it to the available templates. DonnanZ (talk) 12:41, 19 January 2018 (UTC)
No, please let's not do that again. We finally got rid of that pesky automated text in front of {{bor}}, so let's not re-add the same kind of crap somewhere else. --Per utramque cavernam (talk) 12:59, 19 January 2018 (UTC)
I dislike {{bor}}, so I don't care what happens to it. DonnanZ (talk) 13:19, 19 January 2018 (UTC)
I would lose my mind if I had to use |nofrm=1 --Victar (talk) 14:01, 19 January 2018 (UTC)
I guess that means you dislike the use of "From", but that shouldn't stop other editors using it. It needn't be made compulsory. DonnanZ (talk) 14:18, 19 January 2018 (UTC)
Nope, the opposite. --Victar (talk) 17:35, 19 January 2018 (UTC)
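
Equinox's point above, that a templated etymology stores only the abstract chain (X derived from Y) and can then be rendered with or without "from", can be sketched roughly as follows. This is only an illustration in Python, not how any existing Wiktionary template or module works; the `Step` class and `render_etymology` function are hypothetical names, and the data is the chain Leasnam quotes for English man, without the trailing "compare also" note.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Step:
    """One ancestor in a derivation chain (hypothetical structure)."""
    language: str
    term: str
    gloss: Optional[str] = None
    qualifier: Optional[str] = None  # e.g. "probably"


def render_step(step: Step) -> str:
    """Format a single ancestor as 'Language term (“gloss”)'."""
    text = f"{step.language} {step.term}"
    if step.gloss:
        text += f" (“{step.gloss}”)"
    return text


def render_etymology(steps: List[Step], use_from: bool = True) -> str:
    """Render the same stored chain with or without the word 'from'."""
    parts = []
    for step in steps:
        words = []
        if step.qualifier:
            words.append(step.qualifier)
        if use_from:
            words.append("from")
        words.append(render_step(step))
        parts.append(" ".join(words))
    text = ", ".join(parts) + "."
    # Capitalise only the first character of the rendered line.
    return text[0].upper() + text[1:]


MAN = [
    Step("Middle English", "man"),
    Step("Old English", "mann", "human being, person, man"),
    Step("Proto-Germanic", "*mann-", "human being, man"),
    Step("Proto-Indo-European", "*mon-", "man", qualifier="probably"),
]

print(render_etymology(MAN, use_from=True))
print(render_etymology(MAN, use_from=False))
```

The two calls reproduce the two versions Leasnam compares, which also shows the trade-off DTLHS raises: a chain like this has no room for discursive remarks such as the "compare also" note unless the structure is extended.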

Wayback Machine

I found a discussion from 2012 on the Wayback Machine saying it wasn't durably archived, and I find the reasoning for this flimsy. "The Web Archive is an Internet company that can disappear at any time" - OK, but do you know how many books have been lost to the ages? A library can burn down any time taking the one copy of an obscure book along with it, or it could be stolen, etc. It's entirely possible Usenet could be lost to history. These are all big ifs. How likely is it that the Wayback Machine is going to disappear? It has lasted much longer than GeoCities and GeoCities was dying for much of its official lifespan already before they decided to put the final nail in the coffin, so it is not a fair comparison. GeoCities was considered to be a big deal back when the Internet was much smaller than it is now, and when the Internet was not nearly as old as it is now, the short time GeoCities was popular seemed longer than it was. Finsternish (talk) 23:09, 19 January 2018 (UTC)

WMF is now actively working with the Internet Archive. See Inviting IABot for a related BP discussion. Jberkel 15:27, 20 January 2018 (UTC)

Hittite pronunciation

Should we abstain from giving Hittite pronunciations? In that case we should probably delete this category. --Tom 144 (𒄩𒇻𒅗𒀸) 00:56, 20 January 2018 (UTC)