Wiktionary:Beer parlour

Definition from Wiktionary, the free dictionary



Welcome, all, to the Beer Parlour! This is the place where many a historic decision has been made and where important discussions are being held daily. If you have a question about fundamental Wiktionary aspects—that is, about policies, proposals and other community-wide features—please place it at the bottom of the list (click on Start a new discussion), and it will be considered. Please keep in mind the rules of discussion: remain civil, don't make personal attacks, don't change other people's posts, and sign your comments with four tildes (~~~~), which produces your name with timestamp. Also keep in mind the purpose of this page. There are various other discussion rooms which may serve the idea behind your questions better. Please take a look to see which is most appropriate.

Sometimes discussion identifies an issue as an idea for policy development or rewriting. Such discussions may be taken out of the Beer parlour to a relevant page, or a brand new page may be created. Usually, the active policy pages will be listed in one of the sections below. See also the policy development page and the votes page.

Questions and answers will not remain on this page indefinitely, as it would very soon become too long to be editable. After a period of time with no further activity (usually a couple of weeks), information will be moved to the archives. We make a point to preserve all discussions that were started here in the archives. However, talk that is clearly not intended for this page may be moved and will not end up in the archives. Enjoy the Beer parlour!



September 2017

September LexiSession: peace

An origami for peace!

The monthly suggested collective task is to make peace. September 21 is the International Day of Peace and October 2 the International Day of Non-Violence so it may be good to reinforce our content related to this topic.

By the way, LexiSession is a collaborative experiment without any guide or direction. You're free to participate however you like and to suggest next month's topic. If you participate, please let us know here or on Meta, to keep track of the evolution of LexiSession, because we started a year ago. I hope there will be some people interested in making some contributions! My plan is to try to draft a thesaurus on this topic, but picking good illustrations can be a nice challenge too. Noé 10:37, 1 September 2017 (UTC)

Hey, the International Day of Peace is today. I'm quite happy to give some publicity to the new thesaurus about peace in French! Noé 13:41, 21 September 2017 (UTC)

Addition to WT:Wikidata policy

I propose to add a 5th case in which pre-approval isn't necessary: in existing and in-use templates/modules, where the use of Wikidata does not have any effect on the output. This would allow us to do things like tracking and stuff, for testing and to explore potential new uses and the effects. —Rua (mew) 14:41, 1 September 2017 (UTC)

To repeat what I said in Wiktionary talk:Wikidata#A first experiment, I support adding this 5th exception. I think it's consistent with the spirit of the other exceptions, to allow the possibility to access Wikidata without affecting the actual content presented to readers. --Daniel Carrero (talk) 21:49, 9 September 2017 (UTC)
I've been bold and added it to the Wikidata policy page, since nobody else has shown an interest in this discussion. —Rua (mew) 19:03, 15 September 2017 (UTC)

Old Kurdish?

Please forgive my ignorance, but what is the generally accepted name for the parent form of all the Kurdish sublects? Old Kurdish? Proto-Kurdish? Is "Old Kurdish" attested and/or reconstructable?

Also, I noticed we have quite a few entries with the language code ku. Do these need to be sorted out and moved to their respective lect codes, or are these entries with identical orthography across all three lects? --Victar (talk) 20:55, 1 September 2017 (UTC)

We have discussed this before, although I'm not sure where. @-sche might have a link handy. Our entries in ku are nearly all Kurmanji in Latin script, although we also have a specific code for Kurmanji. I think we would be best off committing to a single approach, and use a modified version of {{fa-regional}} to link between dialects. —Μετάknowledgediscuss/deeds 21:15, 1 September 2017 (UTC)
Pinging @Calak, who seems to be knowledgeable in all of the Kurdish dialects. —Aryaman (मुझसे बात करो) 02:42, 2 September 2017 (UTC)
Proto-Kurdish is attested.
I always use ku code when a word is common in all of the Kurdish dialects. For example the common word for "goat" in Kurdish is "bizin"; why should we separate dialects and write "bizin" four times?! ku code means Kurdish language with all its dialects.--Calak (talk) 09:03, 2 September 2017 (UTC)
Because it can get confusing where there's Southern Kurdish in one place and Kurdish in another, and it's doubly confusing once someone has set computers at the job and there are just raw lists of which entries have Southern Kurdish translations and which don't, without any note of Kurdish in the vicinity.--Prosfilaes (talk) 22:52, 4 September 2017 (UTC)

Proposal: install mw:Extension:PageNotice

This extension makes it possible to add headers to pages. It would mean we no longer need to add {{reconstructed}} to every reconstruction page. —Rua (mew) 12:19, 2 September 2017 (UTC)

French Wiktionary August news



Hey! August issue of Wiktionary Actualités just came out in English!

What's up in French Wiktionary? And in the other Wiktionaries? What is a Magic Link? Are there statistics somewhere? German words that may exist? Videos? Details on tantum categories? Nice paintings from French artists? Clowns? Yes, all of this can be found in August Actualités!

As usual, it is translated into English by non-native speakers, in less than a day, and it is not perfect, but it can be improved by readers (wiki-spirit). We are very happy to celebrate a year of English translations! Twelve issues! That's not bad considering that we do not receive any money for this publication and we are not supported by any user group or chapter. It is only written by the community, and there were eleven participants for this issue! We all remain eager to receive your opinion on our publication! Noé 21:24, 2 September 2017 (UTC)

Egyptian hieroglyphics

Why are you using the HTML tag for Egyptian hieroglyphics instead of the Unicode characters? While looking at Help:WikiHiero syntax I read that the Unicode characters are only partially supported, so I guess that's why (I found the page while writing). What is missing for them to be fully supported, and when might those missing pieces be added or fixed? This turned out to be a more "I don't know anything" question than I meant... sorry 'bout that. Jonteemil (talk) 23:34, 2 September 2017 (UTC)

You answered your own question: Unicode can't support all of what we want to show. WikiHiero not only displays the characters correctly, it also does so regardless of whether the reader's computer can support special fonts (hint: most can't) and allows for flexibility in stacking, which lets us show how the language's native speakers chose to organise hieroglyphs spatially. Moreover, Egyptian dictionaries are conventionally organised by romanisation. There is no expectation that Unicode will ever fix this, which is unsurprising given that it is a long-extinct language with no use community, so our current solution is the best way to handle Egyptian going forward. —Μετάknowledgediscuss/deeds 23:45, 2 September 2017 (UTC)
Actually, positioning hieroglyphs properly might be in the works. —suzukaze (tc) 07:16, 3 September 2017 (UTC)
Fingers crossed! — Ungoliant (falai) 17:54, 4 September 2017 (UTC)
Oh, I hadn't seen this version. That's interesting, although I'm not sure it outweighs the other concerns. (E.g. that Egyptian spelling is so erratic that if people search by hieroglyph, we'd have to create entries for all the plentiful alternative spellings (and arrangements that aren't even truly alternative spellings, but have different Unicode control characters) because they'd never be able to guess what spellings we'd lemmatised.) —Μετάknowledgediscuss/deeds 18:02, 4 September 2017 (UTC)


Discussion moved from Wiktionary:Tea room/2017/September#User:TNMPChannel.

I just blocked them for three days for creating an entry in Vietnamese- a language they don't claim to know- by plagiarizing the definitions (without attribution) from a Chinese entry that shares the same character. This is not just dishonest, it's a copyright violation and a violation of our Creative Commons license.

It's also part of a pattern of poor judgement that I've been concerned about for a while: indiscriminate mass creation of articles from a single source without checking for attestability. Creating entries, then immediately rfding them (within minutes). Submitting one of their new entries to rfc because no one else had intervened to fix it yet. Moving a category without understanding enough about our categories to have a clue whether it was a good idea (it definitely wasn't). In general, doing stuff without knowing what they were doing, then expecting others to fix it.

I may be wrong, but to me this all looks like a child who's too young to understand the implications of what they're doing, and is used to grownups stepping in and fixing things. If not, then something is really, really wrong.

At any rate, we need to decide what to do about this- I've only blocked them for three days, and they will have read this by then. Wikis aren't all that good at dealing with contributors who sincerely believe they're helping, but don't know what they're doing. What do you think we should do? Chuck Entz (talk) 05:55, 3 September 2017 (UTC)

I think he/she should be unblocked for now. The first reminder or warning to the user about making errors in unfamiliar languages was by Justin on their talk page at 03:54, 3 September 2017 (UTC), and they haven't made similar edits after the message. From what I observed in their Chinese edits, he/she seems to be quite unfamiliar with our formatting system, though I can see they are trying to improve, and I have also received 'thanks' for my subsequent edits to their created entries. The entries have been quite useful too. It's not very often that we get new users who are native in E/SE Asian languages, so I'm more inclined to fix their new edits than discourage them. Of course, if it persists despite explanation and warning, then blocking would be indicated. Wyang (talk) 06:01, 3 September 2017 (UTC)
The plagiarism part merits at least a day. Chuck Entz (talk) 07:16, 3 September 2017 (UTC)
Sure. However, I suspect (or hope) that it may just be part of their cluelessness, rather than malicious disruption or infringement. I do hope that they could return, to help with Chinese idioms and Malay entries. Wyang (talk) 08:12, 3 September 2017 (UTC)
I find their constant page moves very worrying. They seem to misunderstand how everything works here. Maybe time should be taken to explain what's wrong on their talk page. —suzukaze (tc) 06:02, 3 September 2017 (UTC)
About 2.5 hours ago this user placed {{unblock|I promise I won't do it again.}} at User_talk:TNMPChannel#Unblock. I think we need some specific, extensive acknowledgement of what won't be done again. (Also something in the documentation that requests a complete confession or allocution before the user is entitled to the request being considered.) DCDuring (talk) 12:25, 3 September 2017 (UTC)
Also, please check if it's not an Awesomemeeos sock. --Anatoli T. (обсудить/вклад) 12:53, 3 September 2017 (UTC)
Completely different. Awesomemeeos gets all the technicalities perfect but has trouble with basic common sense. This person can't get either right. Awesomemeeos simply doesn't have the self-awareness and self-restraint to pull off an impersonation like this- for one thing, they're compulsive about upgrading templates. Chuck Entz (talk) 14:32, 3 September 2017 (UTC)

Translingual terms listed under descendants

Prompted by this seemingly innocent kerfuffle, it makes me wonder if it is I who am in the wrong, or if I'm onto something. Should translingual terms be listed as descendants? I would say no, because they're mainly taxonomic terms made up of Latin terms, not natural descendants. --Robbie SWE (talk) 17:49, 4 September 2017 (UTC)

Would you support the taxonomic terms being listed as ==Latin== instead of ==Translingual==? DTLHS (talk) 17:59, 4 September 2017 (UTC)
Hmm, I'm kind of slow today. Mind giving me an example? --Robbie SWE (talk) 18:02, 4 September 2017 (UTC)
Since you say that they are "taxonomic terms made up of Latin terms" and favor derived terms instead of descendants, it would be consistent to just call them Latin instead of "Translingual". DTLHS (talk) 18:04, 4 September 2017 (UTC)
But we already have translingual taxonomic terms. The examples I was given were bombyx#Descendants, accipiter#Descendants, aequoreus#Descendants and alauda#Descendants. I don't believe they should be listed as descendants. --Robbie SWE (talk) 18:19, 4 September 2017 (UTC)
You do not understand. If you want them to be derived terms why do you still want to call them "Translingual"? DTLHS (talk) 18:23, 4 September 2017 (UTC)

I don't want them there at all - for instance, Bombyx shouldn't be under descendants nor should it be under derived terms at bombyx. --Robbie SWE (talk) 18:47, 4 September 2017 (UTC)

Why shouldn't there be some link in the Latin section to the taxonomic term that is derived or descended from it? Why would we omit the connection? DCDuring (talk) 22:17, 4 September 2017 (UTC)
@DCDuring, I understand what you mean. The reason why I would opt for not listing them under descendants at all is that translingual isn't a language per se. According to our guidelines – [l]ist terms in other languages that have borrowed or inherited the word. The etymology of these terms should then link back to the page. – I don't think that translingual terms fall under this category. I looked through this category and a vast majority of them are not listed under descendants in their original Latin entries. --Robbie SWE (talk) 08:30, 5 September 2017 (UTC)
@Robbie SWE: In the case of CJKV characters clearly we are dealing with a script, not a language. In the case of other Translingual items we are dealing with items that are used in multiple languages. If we don't have Translingual descent shown for taxonomic names, then in principle we should show the taxonomic name as a descendant in each language in which the taxonomic name is used. This seems silly at best. DCDuring (talk) 14:34, 5 September 2017 (UTC)
"Should translingual terms be listed as descendants?"
By common practice it's done that way, and so you were "in the wrong". This now is with a "should", which is another question and topic. So what possibilities are there?
  • Don't list translingual descendants at all.
    --- This probably is not a good choice.
  • List them as derived terms.
    --- By WT:ELE#Derived terms that would only be possible if translingual terms would be mislabelled Latin or if Latin would be mislabeled translingual or if both would be merged into a single pseudo-language 'Translingualolatin' (or whatever the name would be). This probably is not a good choice either.
  • List translingual terms as descendants.
    --- Why not? I can't think of any contra reasons. Pro reasons: In a translingual entry it would also be for example "From Latin TERM", and descendant is descriptive.
  • List translingual terms at see also.
    --- This does also depend on the question of what can be listed at see also, and apparently there are different views about it. If different-language terms can be listed at see also: Why not? The only reasons I could think of would be that descendants (maybe cp. descendant#Noun) sounds fitting and is more descriptive and informative. On the other hand, as some translingual terms come with {{taxlink}} and link to the English wikispecies project and not to wiktionary, this would also be an acceptable choice.
- 02:52, 5 September 2017 (UTC)
{{taxlink}} "temporarily" links to Wikispecies (in all uses), with the hope that there will be a Translingual Wiktionary entry (unless it is decided that taxonomic names are Latin). The "See also" heading is for items that have no more specific heading, but in the past has been used for (true) external links and WM project links as well as for alternative forms, "coordinate terms", hyponyms, hypernyms, meronyms, etc. Placing the items now under "See also" under proper headings would be an excellent cleanup project (Augean stables?), but most of us are in pursuit of bright shiny objects. DCDuring (talk) 03:59, 5 September 2017 (UTC)

(literary or dialectal)

故此 is tagged as (literary or dialectal). I'd like to know whether, as the order seems to imply, 'literary' refers to Mandarin only, or also to the unspecified dialects implied by the tag. Is this way of inferring to be systematically understood for any such tags appearing in other entries for any other language? --Backinstadiums (talk) 21:06, 4 September 2017 (UTC)

As it is the entry is not comprehensible. @Wyang, what dialects are included in "dialectal"? DTLHS (talk) 05:17, 5 September 2017 (UTC)
Wiktionary:About Chinese#Key points: "Terms are defined in relation to Modern Standard Written Chinese. ... Senses limited to the literary language, certain dialects or regions should be marked accordingly." I changed the tag to "formal or Min Dong". I think the headers in our entries should link to the "About" pages somewhere, so that people are directed to a page which explains how a language is documented in entries on Wiktionary, also a page where people can leave their questions or feedback, or voice their interest in joining the editing team. Wyang (talk) 12:13, 5 September 2017 (UTC)
@Wyang: Isn't it uncommon for a term to be used in either formal registers or dialectal ones? --Backinstadiums (talk) 06:41, 7 September 2017 (UTC)
Not necessarily, especially if the reason for the disuse in the modern standard register is an innovation, which happens often in Chinese. Wyang (talk) 06:44, 7 September 2017 (UTC)
@Wyang: Could you please add some examples of such innovations for words used frequently in the language? thanks in advance. --Backinstadiums (talk) 14:58, 8 September 2017 (UTC)
Much of the variation in basic vocabulary (Appendix:Sino-Tibetan Swadesh lists) is due to innovations in the northern varieties of Chinese. Some examples include: (“mouth”) (later displaced by ), (“to eat”) (by ), (“to drink”) (by ), (“dog”) (by ), (“to stand”) (by ), (“he/she”) (by ). Apart from this kind of simple monosyllabic supplantation, another reason for the innovations is the process of polysyllabification, which occurred especially in the northern varieties out of the need for disambiguation, as a compensatory mechanism after the loss of many phonetic contrasts (e.g. tone) through sound changes. Examples include 石頭, (“seed”) → 種子. Wyang (talk) 04:30, 9 September 2017 (UTC)
@Wyang: Hi again, thank you for your examples but none is tagged as either "literary", "dialectal" or "literary or dialectal". Could we isolate the entries which have the tag "literary or dialectal", and then check which ones developed as innovation? --Backinstadiums (talk) 08:30, 9 September 2017 (UTC)
They don't have to be tagged. It's implied in the {{zh-dial}} boxes on those entries. Wyang (talk) 10:42, 9 September 2017 (UTC)
@Wyang: Do you mean having {{lb|zh|Min Dong}} link to the about page? — Eru·tuon 07:52, 7 September 2017 (UTC)
Not really, having <h2>Chinese</h2> link to the about page rather, in the style of fr:chinoise or something similar. Wyang (talk) 08:02, 7 September 2017 (UTC)
I suppose that could be done with JavaScript. Or with templates, if we decided to allow templates in headers like the French Wiktionary (less likely). — Eru·tuon 08:49, 7 September 2017 (UTC)

(talk)

Please someone block this. Wyang (talk) 04:40, 8 September 2017 (UTC)

@Wyang: Done, simply because I trust you. I want a reason, though — I looked over a few edits and they seemed fine, though I don't know any Uyghur. (Also, for future reference, this sort of thing can go at WT:VIP.) —Μετάknowledgediscuss/deeds 05:39, 8 September 2017 (UTC)
Hmm, is it just because it's an Australian who already knows how all our templates work? —Μετάknowledgediscuss/deeds 05:46, 8 September 2017 (UTC)
Works for me. It's not necessarily the quality of the initial batch of edits, but the camel's nose that will lead to high volumes of hard-to-check edits later on. Notice, for instance, their edits on {{Template:kk-decl-noun}} which "coincidentally" continue edits by another IP back in July.
(Before E/C) @Wyang Hi. What edits are wrong? I have only checked some, haven't seen anything bad.

--Anatoli T. (обсудить/вклад) 05:40, 8 September 2017 (UTC)

@Atitarev you don't find it odd that an IP pops up out of nowhere and starts out by rewriting all the inflection templates- in both Kazakh and Uyghur? Chuck Entz (talk) 07:28, 8 September 2017 (UTC)
@Chuck Entz: Yes, it's suspicious, it could be a formally blocked editor, but I didn't know that this is a reason for blocking, and none was given. Who was it, anyway? --Anatoli T. (обсудить/вклад) 07:37, 8 September 2017 (UTC)
AwesomeMeeos, of course. I'm starting to go through Category:Noun inflection-table templates by language. In the Adyghe subcategory, for instance, you'll find edits by another IP. Chuck Entz (talk) 07:57, 8 September 2017 (UTC)
He's quite inventive, LOL. --Anatoli T. (обсудить/вклад) 08:11, 8 September 2017 (UTC)

Adding accents to Italian headwords

I'm currently learning Italian and started to work on our Italian entries. I noticed that we don't display accents for irregularly stressed words. cavolo and diavolo, for example, are listed in other Italian dictionaries as càvolo and diàvolo (e.g. Treccani), because they don't follow the common stress pattern (next-to-last syllable). My suggestion is to add a headword parameter to those entries, {{it-noun|m|head=diàvolo}}. Or would that be confusing? Explicit parameter? Better alternatives? – Jberkel (talk) 09:04, 10 September 2017 (UTC)

The problem is that sometimes the accent is actually written, and there's no way for someone to tell the difference. —Rua (mew) 11:28, 10 September 2017 (UTC)
I have included the accent in the hyphenation as a possible solution: diavolo. --Vriullop (talk) 12:12, 10 September 2017 (UTC)
Hm, maybe that's a solution then, I think the information should go somewhere, and the pronunciation section is a good place. It's interesting that they don't bother using accents, even in ambiguous cases (e.g. pesca). – Jberkel (talk) 15:12, 10 September 2017 (UTC)
Theoretically, the "correct" solution would be to include an IPA pronunciation with a stress mark. But I like the accent in the hyphenation idea too. --WikiTiki89 18:06, 11 September 2017 (UTC)
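
As a rough illustration of the stress rule discussed in this thread, the following Python sketch checks whether a dictionary-style respelling marks stress somewhere other than the next-to-last syllable. Everything in it is an assumption made for illustration: the function names are hypothetical, the input is assumed to be an already syllabified list such as ["dià", "vo", "lo"], and only grave and acute accents are treated as stress marks. It is not how any Wiktionary template or module works.

```python
import unicodedata

# Combining grave and acute accents; assumed here to be the only stress marks.
ACCENT_MARKS = {"\u0300", "\u0301"}


def stressed_syllable(syllables):
    """Return the index of the first syllable carrying an accent mark, or None."""
    for index, syllable in enumerate(syllables):
        decomposed = unicodedata.normalize("NFD", syllable)
        if any(ch in ACCENT_MARKS for ch in decomposed):
            return index
    return None


def is_irregularly_stressed(syllables):
    """True if the marked stress is not on the next-to-last syllable."""
    index = stressed_syllable(syllables)
    if index is None:
        # No accent given: assume the default (penultimate) stress.
        return False
    penultimate = max(len(syllables) - 2, 0)
    return index != penultimate


def strip_stress_marks(text):
    """Remove grave/acute accents, e.g. to get a plain spelling from a respelling."""
    decomposed = unicodedata.normalize("NFD", text)
    kept = "".join(ch for ch in decomposed if ch not in ACCENT_MARKS)
    return unicodedata.normalize("NFC", kept)
```

Note that stripping accents this way would also remove accents that are really written in the standard spelling (as in città), which is the ambiguity raised above: the code cannot tell a pedagogical accent from an orthographic one.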


This person keeps reverting my OED-sourced proper pronunciation of angstrom. Apparently he or she thinks that proper pronunciations must meet his personal litmus test of notability or whatever. Wiktionary exists for many reasons, one being a place where readers from Wikipedia can come to get the specifics of a word, an important part of that being proper pronunciations. I added this information for the specific reason of avoiding the continuation of ever-recurring arguments about the pronunciation of this word over at Wikipedia. The proper and the common pronunciations of words do not always overlap. Their argument seems to be that [œ] does not exist in English, but a quick look over at Open-mid front rounded vowel can tell you that's not the case. In any case it is a Swedish loanword, and you can observe this vowel in the pronunciation of the Swedish version. I made sure to display the pronunciation I added [phonetically] rather than /phonemically/, and I did not mess with or remove the existing common phonemic English pronunciations, so I really don't see what the big deal is. I don't just pull this stuff out of my ass; I am well-versed in the relevant term and how IPA works. This information, while a bit obscure, could still potentially help someone. Don't get me wrong; any other day I'm all for excising superfluous crap from reference sources, but this is not one of those cases. It looks to me to be yet another age-old case of what we called “barracks lawyers” in the Army. Pariah24 (talk) 12:48, 10 September 2017 (UTC)

I would expect that pronunciation by English speakers would rarely be the same as "correct" or common pronunciation by Swedish speakers, especially for a word fully absorbed into English. At [[ångström]] we have the Swedish pronunciation. DCDuring (talk) 14:51, 10 September 2017 (UTC)
@Pariah24 The problem I have is that you don't state which accent says [ˈɔːŋstɹœm]. The LPD and CEPD, the most respected pronunciation dictionaries of English, list only the pronunciations with /ə/ and /ʌ/. We're both aware that there's no */œ/ phoneme in English, at least in most accents. Can you prove that accents that use [œː] for the NURSE vowel (or GOAT vowel, in the case of South African English) would use that vowel in 'angstrom'? I find that highly unlikely and so that argument just doesn't hold up without additional sources that would prove that. Mr KEBAB (talk) 15:10, 10 September 2017 (UTC)
@DCDuring We do, I added it there a few days ago per the LPD, which provides the Swedish IPA alongside RP and GA transcriptions. Mr KEBAB (talk) 15:10, 10 September 2017 (UTC)
The established practice on Wiktionary is to only put English pronunciations in an English entry. If no English speaker actually says the word angstrom with the pronunciation [ˈɔːŋstrœm], then that pronunciation should not be listed in the English entry. So the way to resolve this dispute is to find evidence that an English speaker says [ˈɔːŋstrœm], and to put that pronunciation in the proper context (is it rare, is it used by English speakers who also speak Swedish?). — Eru·tuon 17:48, 10 September 2017 (UTC)
I didn't notice that the pronunciation came from the OED. It is given as phonemic there: /ˈɔːŋstrœm/. On the one hand, I respect the OED; on the other, Wiktionary encourages verification of information taken from other sources, so it would be good to find out what they based this transcription on and whether it would meet our standards even if the OED didn't say it, and put it in proper perspective (that is, as above, who actually uses this pronunciation?). — Eru·tuon 18:02, 10 September 2017 (UTC)
screenshot for the plebs :) Pariah24 (talk) 03:29, 11 September 2017 (UTC)
Given its relative obscurity I think most would agree that finding a sample in the wild of a pronunciation of this term is pretty unfeasible. OED has never steered me wrong. If this were Wikipedia I wouldn't have even bothered with this nitpicky silliness, but it's a dictionary for pete's sake, and I always lean on the side of too much information is better than not enough, provided the source isn't questionable. I'm not over here deleting anything, I'm just adding. Pariah24 (talk)
This doesn't look like a query for an obscure word to me. —suzukaze (tc) 03:48, 11 September 2017 (UTC)
After all, there are many sources that say "X is a word that means Y" out there in the world, in particular other dictionaries. But they don't actually prove that people use a word, they just say they do, which really isn't sufficient. It wouldn't be the first time that dictionaries make up words that nobody has ever used!
suzukaze (tc) 03:40, 11 September 2017 (UTC)
Blockquoting, really? Pariah24 (talk) 03:43, 11 September 2017 (UTC)
If you want to play this game, here's a link for you. Pariah24 (talk) 03:47, 11 September 2017 (UTC)
Alright, you can have another too. —suzukaze (tc) 03:50, 11 September 2017 (UTC)
  • I, too, suspect this is a dictionary invention and I think it should not be in the entry without proof of actual use. Incidentally, I have a background in science (in the US), where normal use was /ˈæŋstɹəm/ or /ˈæŋstɹɑm/ (the latter of which I note is not in the entry). —Μετάknowledgediscuss/deeds 03:52, 11 September 2017 (UTC)
Not trying to offend but your anecdotal experience with a physics term can not possibly be comprehensive enough to be used as a valid argument for inclusion into a worldwide dictionary project. The only plausible way to attack my view that I see is to question the validity of OED and claim they would just put made-up bullshit into their dictionary, which is what you appear to be doing. Everything I know about OED from years of using it goes against this view. This is all getting really pedantic if you ask me. Sometimes it seems like all the worst parts of Wikipedia are magnified here. Pariah24 (talk) 04:24, 11 September 2017 (UTC)
If I understand the screenshot above correctly, the OED's entry for angstrom hasn't been updated since 1933. Nobody is saying that they insert made up bullshit into their entries. But their information may be outdated. DTLHS (talk) 04:28, 11 September 2017 (UTC)
Sadly, it may be bullshit. The OED makes things up a lot more than any of us would like, I'm afraid; you'll see that they're a frequent offender over at Appendix:English dictionary-only terms (a list of terms found in dictionaries that were never actually used enough to enter Wiktionary). —Μετάknowledgediscuss/deeds 04:31, 11 September 2017 (UTC)
If it needs to be removed based on this rationale I don't have a problem with it. I only have a problem with "hey guy I'm here to police you because I don't think you know what you're talking about." I'm rarely on the other side of these situations, because I almost never remove the content of others unless it obviously needs to go. Pariah24 (talk) 04:35, 11 September 2017 (UTC)
@Pariah24 Here we go again with misrepresenting what I say. I'm not interested in policing you (though you seem to strongly believe that, which isn't correct) but in the quality of Wiktionary. I removed that information and added sourced pronunciations (or sourced already existing ones, whatever it was) because I could see that the IPA was incomplete: it didn't say which accent says [ɔːŋstrœm], and I knew for a fact that neither RP nor GA speakers use [œ] in loanwords, not to mention native words. Do I really have to repeat myself over and over again? It seems like I do. I'm tired of you twisting my words and lying about my actions. Mr KEBAB (talk) 18:11, 11 September 2017 (UTC)
It's especially aggravating when you're someone who has already gone through pains to use good sources. Pariah24 (talk) 04:39, 11 September 2017 (UTC)
@DTLHS: The last quotation in the entry is from 1957, and the small text under the definition mentions a redefinition of the meter in 1960, so they must have updated at least those parts of the entry since 1933. — Eru·tuon 05:03, 11 September 2017 (UTC)
Nobody likes to say this, but we rely heavily on original research (for English at least) and generally don't give a shit what secondary sources say. Which is why you will encounter so much hostility if you try to support something based on what a dictionary says. DTLHS (talk) 04:40, 11 September 2017 (UTC)
I honestly never noticed the little update warnings off to the right, so thanks for that. Clearly my attention to detail needs some work. Pariah24 (talk) 04:47, 11 September 2017 (UTC)
However on a second look it does say Previous version OED2 (1989). I think it just means it was first published in 1933. Pariah24 (talk) 04:53, 11 September 2017 (UTC)
@DTLHS That's clearly not the case here. I sourced the IPA on angstrom, it's not OR. Mr KEBAB (talk) 18:11, 11 September 2017 (UTC)
FWIW, there have been other cases where dictionaries prescribe a pronunciation that we can't find in use; bon appétit is one. If more dictionaries than just the OED prescribed the pronunciation mentioned here (and if they were consistent, e.g. in saying it was an RP pronunciation), then it might be appropriate to add a note like the one in bon appétit. - -sche (discuss) 04:20, 19 September 2017 (UTC)
Isn't it ALWAYS true that a term originating in a FL will sometimes be pronounced by a limited number of English speakers as it is in the FL. Those who pronounce it as in the FL would be limited to those who knew the FL or were repeatedly "corrected". If the pronunciation is not fairly common, why should it be included? DCDuring (talk) 05:03, 19 September 2017 (UTC)

WM language guides[edit]

Following is text extracted from a message posted on many WM mailing lists:

Many wikis in the Wikimedia world give editors suggestions about the correct usage of each respective language: orthography, register, punctuation, and so on.

I started a page to list such language guides: https://meta.wikimedia.org/wiki/Language_guides

I added a bunch of links to Hebrew there because that's my home wiki. I also added a few pages that I could find for Catalan, Indonesian, Russian, and Bosnian.

Please add your languages there! Surely there are dozens and dozens of missing links there.

Before you ask: The linked page explains why Wikidata is not very convenient for maintaining such a list, but if you think that you can put this nicely in Wikidata, be bold.

Thank you!

-- Amir Elisha Aharoni

Note that this is not the same as article style guides. I would think folks here would be interested. Potentially English usage guides would be a valuable resource and link target for us. Perhaps some of our usage notes and other material would be useful material for such usage guides in many languages. DCDuring (talk) 14:43, 10 September 2017 (UTC)

Erm, it appears that these are, in fact, just style guides... —Μετάknowledgediscuss/deeds 18:32, 10 September 2017 (UTC)
I think the glass is about 1/4 full. At least, Croatian, Hebrew, Indonesian, and Polish include grammar and/or common spelling errors. Afrikaans has something on translation errors. DCDuring (talk) 20:03, 10 September 2017 (UTC)

Draft strategy direction. Version #2[edit]

In 2017, we initiated a broad discussion to form a strategic direction that will unite and inspire Wikimedians. This direction will be the foundation on which we will build clear plans and set priorities. More than 80 communities and groups discussed and gave feedback[strategy 1][strategy 2][strategy 3]. We researched readers and consulted more than 150 experts[strategy 4]. We looked at future trends that will affect our mission, and gathered feedback from partners and donors.

A group of community volunteers and representatives from the strategy team synthesized this feedback into an early version of the strategic direction that the broader movement can review and discuss.

The second version of the direction is ready. Again, please read, share, and discuss on the talk page on Meta. Based on your feedback, the drafting group will refine and finalize the direction.

SGrabarczuk (WMF) (talk) 10:12, 11 September 2017 (UTC)

Merge Proto-Nuclear Polynesian and Proto-Eastern Polynesian into Proto-Polynesian[edit]

The Austronesian languages suffer from what you might call matryoshka grouping: each group has a branch which then branches further, which then branches further, and so on. You end up with a lot of branches which don't have very significant differences, and a lot of different proto-language entries with very similar content. It's made more complicated by the fact that some of the branches are less well established than others. To reduce this somewhat, I propose merging Proto-Nuclear Polynesian poz-pnp-pro and Proto-Eastern Polynesian poz-pep-pro into Proto-Polynesian poz-pol-pro. The differences between them are very small: each group is separated from its parent by only one or two sound changes, and the sound changes of individual languages are often more substantial than those separating the proto-languages. See *matuqa for example. Rapa Nui and Hawaiian are both in the Eastern Polynesian group, yet the former preserves the Proto-Polynesian form unchanged while the latter significantly changes it. Having separate entries for Proto-Polynesian *aka, Proto-Nuclear Polynesian *aka and Proto-Eastern Polynesian *aka is quite pointless. —Rua (mew) 22:47, 11 September 2017 (UTC)

So for some more background, what are the sound changes that supposedly differentiate PNP from PP, and PEP from PNP? --WikiTiki89 22:54, 11 September 2017 (UTC)
See w:Proto-Polynesian language. Nuclear has loss of *h and merging of *l and *r. Eastern also has *s > *h and partial loss of *q. —Rua (mew) 23:31, 11 September 2017 (UTC)
  • Oppose. We have terms that can be reconstructed to PNP but not to PPn. Why on earth would you make it impossible to enter reconstructible terms? —Μετάknowledgediscuss/deeds 23:53, 11 September 2017 (UTC)
    • Same reason we did it for Proto-Germanic or Proto-Uralic. —Rua (mew) 00:00, 12 September 2017 (UTC)
      So what do you do with words reconstructible to Proto-West-Germanic but not to Proto-Germanic? —Μετάknowledgediscuss/deeds 04:29, 12 September 2017 (UTC)
      • We reconstruct them for Proto-Germanic. Many linguists don't even believe in Proto-West-Germanic. —Aɴɢʀ (talk) 09:47, 12 September 2017 (UTC)
        • To clarify, we put the reconstruction in a Proto-Germanic entry and give it a context label of "West Germanic". --WikiTiki89 16:47, 12 September 2017 (UTC)
  • Oppose. There are two factors that differentiate Polynesian languages from Indo-European and other Eurasian language families.
    1. Polynesian phonotactics are extremely adverse to consonant change: in any Polynesian language I'm familiar with, there's no such thing as a consonant cluster- every syllable begins with either a vowel or a single consonant followed by a vowel, and there are simply no final consonants. That means that any consonant change that does happen is really significant.
    2. Millions of square miles of open ocean. It is physically possible to walk from the Scandinavian Peninsula all the way to India, but not from Samoa to Hawaii. Proto-Germanic spread over a wide area, but contact between dialect areas prevented it from splitting up into separate branches, for the most part. There are Polynesian island groups such as the Hawaiian Islands and Rapa Nui where there has been a colonization event or two, but no other contact with the outside world- ever. There are parts of Polynesia where island groups are close enough to allow periodic contact, but there are also plenty of island groups where other peoples were a subject of oral history, but never actually encountered until the Europeans showed up. There again, patterns of sound changes are probably reflective of actual population movements, not of areal influence or borrowing- you can't have areal influences if people within an area have never had any contact with each other. Chuck Entz (talk) 01:36, 13 September 2017 (UTC)
      I don't get what your point is. You seem to be saying that we should group languages based on how different they theoretically could be rather than how different they actually are. To me it seems that what we need is to determine whether there actually are fundamental enough differences between PP, PNP, and PEP that it would be infeasible to treat them under a single language. I don't personally have an answer to this, nor do I have any evidence to share, but let's not base this on theory. --WikiTiki89 02:40, 13 September 2017 (UTC)
  • Support at minimum for inherited terms. Multiplication of entries like Proto-Polynesian *wai, Proto-Nuclear Polynesian *wai and Proto-Eastern Polynesian *wai is useless. These are nothing more than a waste of effort that makes things harder to maintain. Trying to document every possible reconstructed proto-language as if they were attested natural languages is not a part of Wiktionary's mission; it is w:scope creep and should be avoided.
    — I could agree either way on items that actually are reconstructible only for a smaller group of languages, though, such as Proto-Nuclear Polynesian *nui. Having these under their proper proto-language categories etc. is more exact; but on the other hand, keeping around two "tiers" of proto-languages seems more complicated than is really necessary, since the context label (dialect label would be more exact) approach works for almost all needs. --Tropylium (talk) 20:48, 18 September 2017 (UTC)

Category:Quotation templates to be cleaned[edit]

There are over eight thousand entries in this category. Does anyone know what we are supposed to do with them? SemperBlotto (talk) 19:27, 14 September 2017 (UTC)

It is a project of TheDaveRoss. Maybe he can tell you. DTLHS (talk) 19:28, 14 September 2017 (UTC)
The cleaning means changing {{quote-text}} to one of the specific quote templates, it is not at all urgent. - TheDaveRoss 19:57, 14 September 2017 (UTC)

"Morphologically from the root" IP editor[edit]

There seem to be a variety of IP users editing Arabic entries and, among other things, adding the text "morphologically from the root x". See for instance this and this and this and this. As you can see, there are several different IP addresses, but to me their editing style looks the same, particularly the tagline above in the etymology sections, so they might be a single very high-tech person who knows how to mess with IP addresses, or a conspiracy of people. Sometimes the edits are okay, aside from formatting (no newlines between sections, and second-level reference sections, for instance). Often they're replacing specific definitions or etymologies with generic morphological ones (with the above tagline), or replacing Arabic templates with generic ones. With the last example, أَعْلَى (ʾaʿlā), they've radically reformatted the entry in an unconventional fashion, with pronunciation sections above etymology sections that share that pronunciation. It makes sense, but it's something that needs discussion (and there's the deletion of valuable etymological material). On the whole, their edits are full of various things that need to be corrected.

Anyway, I don't know what to do about this. At least the tagline gives some way to find their edits. I'll leave it at that. — Eru·tuon 10:06, 15 September 2017 (UTC)

Proposed first use of Wikidata: categorising planets[edit]

A while ago, {{senseid}} was added to entries with Wikidata ids, but with no actual Wikidata access, it was mostly a formality. Now that Wikidata access has been enabled, I've done some experimenting in Module:senseid, detailed at Wiktionary talk:Wikidata#A first experiment. The experiment was meant to find out how to use Wikidata information to categorise entries. Categorising entries this way offers the advantage that we don't have to think about which categories something belongs in. As long as the data is present on Wikidata, and the appropriate IDs are added to entries on Wiktionary, the categories can be added automatically. Think of it as an {{auto cat}} for individual senses: just plop in the Wikidata ID and the template will figure out what needs to be done. This method can only work for "set" categories, which contain things belonging to a particular set, usually indicated in Wikidata with the "instance of" property. It doesn't work for categories that contain terms related to a particular thing/topic. Semantic relatedness is lexical data, which is not currently present in Wikidata. Thus, we'll still have to manually add "topic" categories like Category:Astronomy.

Because of the rule that uses of Wikidata should be approved first, I did this experiment by using tracking categories to stand in for actual topical categories. My finding was that in general it works quite well, but Wikidata handles certain things idiosyncratically which our modules need to take into account when "translating" the information. For example, many things that are combined into one category on Wiktionary have several different Wikidata entities, such as our Category:Planets of the Solar System, which has three corresponding Wikidata entities, outer planet of the Solar system (Q30014), inner planet of the Solar System (Q3504248) and planet of the Solar System (Q17362350). Wikidata makes frequent use of subclassing; writing code on our end to resolve sub/parent classes may help in these cases. Taxonomic data is also handled differently, with special taxonomic properties rather than the generic "subclass of" and "instance of" properties.
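The sub/parent class resolution mentioned above could be sketched roughly as follows. This is only an illustration in Python, not the Lua that Module:senseid would actually use. The three entity IDs are the ones quoted in this post; the "subclass of" edges between them are an assumption made for the example and are not read from Wikidata, and the function name is hypothetical.

```python
# Toy "subclass of" table. The IDs are those quoted in the post; the edges
# between them are ASSUMED for illustration, not fetched from Wikidata.
TOY_SUBCLASS_OF = {
    "Q30014": ["Q17362350"],    # outer planet of the Solar System
    "Q3504248": ["Q17362350"],  # inner planet of the Solar System
    "Q17362350": [],            # planet of the Solar System
}


def is_subclass_of(entity_id, target_id, subclass_of=TOY_SUBCLASS_OF):
    """Return True if target_id is entity_id itself or one of its ancestors.

    Hypothetical helper: walks the "subclass of" edges depth-first and
    remembers visited IDs so that a cycle in the data cannot loop forever.
    """
    seen = set()
    stack = [entity_id]
    while stack:
        current = stack.pop()
        if current == target_id:
            return True
        if current in seen:
            continue
        seen.add(current)
        stack.extend(subclass_of.get(current, []))
    return False
```

With a helper like this, the three entities could be treated as one class by testing each against the most general ID, instead of listing every subclass by hand.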

I would like to take a first step towards making it actually do something with the data. This needs to be approved in a discussion, so I hereby propose modifying Module:senseid/{{senseid}} to automatically place an entry into a language-specific Category:Planets of the Solar System, if it is given a Wikidata ID (Q followed by numbers) and if Wikidata indicates that the entity for this ID is a planet of the solar system. I'm choosing planets specifically because it's a very small set with exactly 8 known members, and the Wikidata data is known to be complete. This makes it easy to spot any problems if they arise. Extending the system for more categories is very easy; if you approve of doing it for more categories than just planets right from the start, please state so. —Rua (mew) 18:54, 15 September 2017 (UTC)
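For readers who want to picture the proposed behaviour, here is a minimal sketch in Python (the real change would be in Lua, in Module:senseid). It assumes a lookup table standing in for Wikidata's "instance of" data; the table contents and the function name are hypothetical, while the three class IDs are the ones quoted earlier in this proposal and the category name follows the existing language-prefixed pattern.

```python
# Wikidata classes that the proposal maps onto one Wiktionary category.
SOLAR_SYSTEM_PLANET_CLASSES = {"Q30014", "Q3504248", "Q17362350"}

# ASSUMED stand-in for a Wikidata query: entity ID -> "instance of" IDs.
# The entries are illustrative only.
TOY_INSTANCE_OF = {
    "Q313": ["Q3504248"],  # Venus, taken here to be an inner planet
    "Q8": [],              # happiness: not a planet
}


def planet_category(lang_code, sense_id, instance_of=TOY_INSTANCE_OF):
    """Return the category name for a planet sense, or None.

    Hypothetical helper. Only IDs of the form "Q" + digits are treated as
    Wikidata IDs; any other senseid string is left uncategorised.
    """
    if not (sense_id.startswith("Q") and sense_id[1:].isdigit()):
        return None
    classes = instance_of.get(sense_id, [])
    if any(cls in SOLAR_SYSTEM_PLANET_CLASSES for cls in classes):
        return f"Category:{lang_code}:Planets of the Solar System"
    return None
```

Under these assumptions, planet_category("zh", "Q313") yields "Category:zh:Planets of the Solar System", while a non-Wikidata senseid or a non-planet entity yields no category.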

I entirely approve and I especially like this example because it is something small and restrained (unlike e.g. adding it to every entry on a species) and it does include some possibly contentious data--i.e. the status of Pluto (which is not contentious to astronomers but is the sort of thing that could have actual individuals editing back-and-forth about it). —Justin (koavf)TCM 00:47, 16 September 2017 (UTC)
What benefit does this add? The members of the category are unlikely to change much over time, and if we did this it would require that every page which is not a planet checks to see if it is a planet, which seems like a lot of overhead. - TheDaveRoss 12:54, 18 September 2017 (UTC)
@TheDaveRoss: Your question answers itself: since this is a very small and stable use case, it will allow us to see in the wild how Wikidata integration would work. That is the value. —Justin (koavf)TCM 15:53, 18 September 2017 (UTC)
@Koavf: I don't disagree that this would be a good test, if it were the type of thing that I thought Wikidata was well suited for. I do not think that this is an example of a good use of Wikidata, however. If you have a small class of objects, you label the objects rather than querying all objects to see if they belong to that class. If you have a very large class, or one which changes often, then you query. - TheDaveRoss 11:39, 19 September 2017 (UTC)
It's not expensive at all to check this. All you need to do is retrieve the "instance of" property of the entity, and then check the IDs that you get for matches. IDs are just strings, so it's basic string matching, which is very fast. —Rua (mew) 15:48, 20 September 2017 (UTC)
It is expensive due to the volume, not the task. - TheDaveRoss 17:37, 20 September 2017 (UTC)
Yes, but structured data will vastly decrease overhead in the long term. —Justin (koavf)TCM 17:41, 20 September 2017 (UTC)
Only if we were currently using any overhead at all for this sort of thing, which we are not. I agree that judicious use of Wikidata will decrease overhead. - TheDaveRoss 17:46, 20 September 2017 (UTC)
Categorizing things is overhead. Someone has to actually do it. —Justin (koavf)TCM 18:23, 20 September 2017 (UTC)
  • I too question the utility of this. The inline markup is unaesthetic, cryptic and really seems out of place, and it offers practically no additional benefit to the current categorisation system. Furthermore, it rests on the assumption that words in various languages are direct equivalents of one another; they are not, for most words in a language. Sure, Venus may be translated as 太白星 and listing it in the translations table at Venus is acceptable, but the words are far from being the same, really. There are several names for Venus in Chinese, each with a nuance in meaning, and the same situation exists for nearly every other planet and star. Wyang (talk) 13:38, 18 September 2017 (UTC)
If 太白星 doesn't mean Venus, then why does the definition say Venus? Subtle nuances in meanings should be included in entries. But in this case it's simple: either it refers to that same ball of rock floating around the sun, or it doesn't. —Rua (mew) 14:17, 18 September 2017 (UTC)
It means Venus, but only in a Chinese astronomy, astrology, or Taoist context. It is already indicated with the label in the entry. It is the Grand White Star in traditional Chinese astronomy/folk religion, governed by the Grand White Star Lord. It is conveniently translated as Venus, but its definition really should be elaborated on in the future. Venus in modern Western astronomy and in the context of Taoist Wu Xing (five elements) is called 金星, Venus in Chinese astronomy and astrology is called 太白(星), Venus in the context of Taoist mythology is called 太白金星, Venus seen in the morning is called 啟明(星), Venus seen in the evening is called 長庚(星), etc. The nuances in most of the vocabulary in a language are simply too significant to allow a bijective map to another language, even for a simple term like Venus, especially if the two languages developed from cultures which historically had very few contacts. The principle of trying to map all words of a language to specific pre-defined semantic concepts, for the purpose of classification, is methodologically problematic. Wyang (talk) 23:09, 18 September 2017 (UTC)
@Wyang: Of course. But Chinese Wikipedia has an article named something on the planet second-closest to the Sun. What is that named? —Justin (koavf)TCM 23:29, 18 September 2017 (UTC)
@Koavf: You can't just say "of course" and wave that off. It's similar in Hindi: अरुण (aruṇ, Uranus in astrology), युरेनस (yurenas, Uranus the planet). Some Hindi purists also use अरुण for the planet. btw the closest planet to the sun is actually Mercury. That's not the main problem with this though, the real problem is the markup will become even more opaque and hidden from the casual editor. People wonder why we don't get many new editors, it's because it takes months to learn how to use all of our templates. —Aryaman (मुझसे बात करो) 23:57, 18 September 2017 (UTC)
@Aryamanarora: I'm not suggesting that the problem is simple or trivial: I'm suggesting that it's very complicated but that this is a good first step. Do you have a better solution? —Justin (koavf)TCM 00:06, 19 September 2017 (UTC)
If I did I would have to know enough Lua to implement it. —Aryaman (मुझसे बात करो) 00:08, 19 September 2017 (UTC)
@Aryamanarora: I'm not asking if you know the technical means (God knows I don't!), just what in principle would work better. Do you have any thoughts? I'd be happy to know what would work better even in a hypothetical sense. —Justin (koavf)TCM 04:26, 19 September 2017 (UTC)
I don't understand what you mean. Wyang (talk) 23:34, 18 September 2017 (UTC)
@Wyang: zh:w:金星 corresponds to en:w:Venus, so wouldn't that be the best word to use? Also, it's not a problem to use this senseid on more than one entry. —Justin (koavf)TCM 00:06, 19 September 2017 (UTC)
Well, it is a problem to try to assign foreign words to specific semantic labels in English: e.g. Venus on Wikidata, and try to systematically generate categories based on these crude equivalents. Like I said above, the best word to use depends on context. Very rarely do words in Chinese match with senses of a word in English exactly. Wyang (talk) 00:14, 19 September 2017 (UTC)
@Wyang: Then have both words in the category. That solves the problem. If one English word is equivalent to two words in another language, that's okay. (Or three or vice versa, etc.) —Justin (koavf)TCM 03:13, 19 September 2017 (UTC)
They can be both put in the same category, or any category, with the current categorisation system, without having to resort to such rigid equivalence sets. The current method is also superior in usability and aesthetics. P.S. See definitions of cognate. Wyang (talk) 04:05, 19 September 2017 (UTC)
@Wyang: It is not superior in usability because it can be exported across languages. With over 100 Wiktionaries and no fewer than 6,500 languages, using structured data to do any part of the work is far more usable. If the method of categorization that MediaWiki uses is superior, why did we ever launch Wikidata in the first place? —Justin (koavf)TCM 04:23, 19 September 2017 (UTC)
Pronunciation is templatisable, inflection is templatisable, entry layout is templatisable, but semantics is not templatisable or structurisable. Every sense of every word in a language corresponds to a semantic field or domain, and in the hypothetical 3D representation of human perception and cognition, it is a sphere in space, which spatially centres on the core, fundamental meaning of the term. What we are trying to do when we translate foreign words into English on Wiktionary is to find existing English terms with spatially close semantic areas to the source words; that's why definitions for Chinese words on Wiktionary are usually given with two or three English equivalents. Giving these multiple equivalents allows the reader to imagine the semantic area for the foreign term by superimposing the various English terms. As such, semantics across languages is not structured, and attempts to structurise it will only result in confusion and chaos. If languages were strict bijective mappings of words and grammar from one to another, machine translation would be a lot easier. Sure, it may work for water (in most languages), since the semantic fields for words for water are mostly spatially close, but it will fail for river, fluid, syrup. Wyang (talk) 04:46, 19 September 2017 (UTC)
@Wyang: So are you opposed to the notion of categorizing these terms? —Justin (koavf)TCM 16:20, 19 September 2017 (UTC)
I'm opposed to the notion of mapping senses of foreign words onto specific, pre-defined English semantic labels, and blindly achieving categorisation via those labels, as if the labels themselves were equivalent to the senses of the foreign words. Wyang (talk) 23:26, 19 September 2017 (UTC)
The Wikidata items aren't meant to encompass entire senses. They encompass referents of senses. Chinese may have different words for Venus, all with various nuances, but they all refer to the same ball of rock in space. The context in which they are used isn't relevant, that's a matter for context labels and usage notes. All that matters is that they fundamentally are different terms for that same ball of rock. So can you give concrete examples? Which terms refer to the planet and which don't? —Rua (mew) 23:45, 19 September 2017 (UTC)
They differ on a lexical level and these nuances will be reflected in their categories. The category of Category:zh:Planets of the Solar System is perfect as it is now. There is no need to dump all tens of synonyms of Venus, plus the names of all other planets in traditional Chinese astronomy into this category; these words, which are largely limited to traditional Chinese astronomy, should go into Category:zh:Planets of the Solar System in Chinese astronomy, or at least Category:zh:Stars and planets in Chinese astronomy (the reason the entry has the {{lb|zh|Chinese star}} label). One can easily adjust the categorisation in whatever way is most appropriate now. Putting an unattractive senseid next to the sense simply takes away this freedom and flexibility. Another example is the senseid at happiness, linked to Q8 on Wikidata which has 幸福 (xingfu) listed as the Chinese equivalent. This is unfortunate as xingfu is probably one of the hardest Chinese words to translate into English. Although it is typically glossed as happy; happiness, its connotations are hard to describe and not insignificant. English "I am very happy" and Chinese "我很幸福" have vastly different meanings. It would be quite silly to let the meaning conveyed by happiness blindly dictate the categories of the foreign words. Wyang (talk) 02:12, 20 September 2017 (UTC)
I oppose this, in particular the {{senseid|zh|Q313}} noise added to 太白星. Wiki markup should be free from identifier noise; it should be pleasant to edit directly. --Dan Polansky (talk) 13:53, 18 September 2017 (UTC)
It's not possible to use Wikidata without identifiers. —Rua (mew) 14:11, 18 September 2017 (UTC)
I support using Wikidata to categorize planets. One alternative idea might be using something like {{senseid|zh|Venus}} instead of {{senseid|zh|Q313}} with a data module that recognizes that "Venus" means "Q313". --Daniel Carrero (talk) 14:35, 18 September 2017 (UTC)
I strongly oppose creating a module which maps strings to Wikidata identifiers. The Lua errors are rampant enough without going down that path. - TheDaveRoss 15:02, 18 September 2017 (UTC)
I'm not a fan of it either. In any case, the senseids themselves aren't a part of this proposal. This proposal is only about modifying {{senseid}} to use them for categorising. Having Wikidata IDs on entries is beneficial even if others decide they don't want {{senseid}} to categorise. —Rua (mew) 15:08, 18 September 2017 (UTC)
Sure, I take back the idea of mapping things like "Venus" = "Q313". I prefer using "Q313" anyway, that was just an alternative idea. --Daniel Carrero (talk) 21:26, 18 September 2017 (UTC)
I strongly agree with Wyang, especially his point regarding the fact that Chinese has multiple names for Venus, each with its own connotations. --WikiTiki89 21:19, 18 September 2017 (UTC)
Is there any Chinese name for Venus that shouldn't get categorized in Category:zh:Planets of the Solar System? This is a categorization proposal, so I'd like to know how the nuances of each name affect categorization. --Daniel Carrero (talk) 21:26, 18 September 2017 (UTC)
Oppose per Dan Polansky and Wyang. —Aryaman (मुझसे बात करो) 21:22, 18 September 2017 (UTC)
d:User:Amgine for a very old commentary on wikidata, which aligns with Wyang's opposition. Feel free to expand if you can. - Amgine/ t·e 01:54, 21 September 2017 (UTC)

Split RfD by English/non-English as we have with RfV[edit]

I propose that we split Wiktionary:Requests for deletion into Wiktionary:Requests for deletion/English and Wiktionary:Requests for deletion/Non-English, just as we have done with Wiktionary:Requests for verification. RfD is presently over 425K, and although I can't say offhand what proportion is non-English, I would estimate it at somewhere over one third. As with RfV discussions, examination of English and non-English entries, of course, requires different skill sets, and a different set of editors are typically attracted to each kind of discussion. bd2412 T 02:04, 17 September 2017 (UTC)

When I proposed the split of RFV, I considered this as well but ultimately rejected it. The fact is that if you don't know at least a little Japanese, you just can't be of any use in gathering Japanese quotations or assessing whether they're uses. However, anyone who understands how the SOP concept works can look at a Japanese word broken into its component parts and see, once shown that 茶色の葉 is 茶色 (ちゃいろ) (chairo, brown colour) + の (no, possessive connector) + 葉 (は) (ha, leaf) and means "brown leaf", that it would be inappropriate to have a Wiktionary entry for it. That's why everyone can contribute at RFD, and why we should focus on clearing up the backlog by making judgement calls on whether a consensus has been reached rather than splitting the page. —Μετάknowledgediscuss/deeds 03:58, 17 September 2017 (UTC)
Just as an academic question, doesn't RfD address issues other than SOP-ness? bd2412 T 01:28, 24 September 2017 (UTC)
Yes, but much less commonly. —Μετάknowledgediscuss/deeds 03:53, 24 September 2017 (UTC)
Support, WT:RFD is too large already. --Daniel Carrero (talk) 21:37, 18 September 2017 (UTC)
Support --Backinstadiums (talk) 07:04, 21 September 2017 (UTC)
Support and I also believe scriptio continua languages and some other language groups require CFI different from English. BTW, @Metaknowledge: Japanese idiomatic terms may get a possessive particle の, e.g. 木の葉 (このは) (konoha) or 木の葉 (きのは) (kinoha). The 2nd one looks especially like SoP ("leaf of the tree") but both terms are considered idiomatic. Languages such as Korean or Arabic, etc. (both use spaces between words) may have non-words written together, with no spaces between them, such as clitic prepositions, pronouns, etc. (Arabic) - فَقَالَ (faqāla, and (he) said) = فَ + قَالَ, غُرْفَتِي (ḡurfatī, my room) = غرفة + ي, particles and copulas (Korean) - 한국어로 (han-gugeoro, “in Korean”) = 한국어 + 로, 학생입니다 (haksaeng-imnida, “(someone) is a student”) = 학생 + 입니다. I do agree, however, that one can take part in discussions without a thorough knowledge of a given language but one has to learn fast and listen to arguments of native speakers or advanced learners. --Anatoli T. (обсудить/вклад) 07:44, 21 September 2017 (UTC)
Support. — Ungoliant (falai) 12:52, 21 September 2017 (UTC)
Support. --Canonicalization (talk) 13:19, 21 September 2017 (UTC)
Support. --Robbie SWE (talk) 18:18, 21 September 2017 (UTC)

For reference:

I don't see any particular need. RFD is nowhere near as huge as RFV was when we split. --WikiTiki89 18:19, 25 September 2017 (UTC)
I think the relative sizes of both pages have fluctuated and crossed over one another from time to time. bd2412 T 19:59, 27 September 2017 (UTC)
As I see it, there are two negatives to consider: first, of course, is the burden of working with a large page, but the other one hasn't been mentioned yet: splitting by language means requiring a language code in the {{rfd}} template and cleaning things up when people post to the wrong page. This represents yet another way that those who don't know the finer points of our templates can get tripped up while doing basic tasks. Chuck Entz (talk) 21:24, 27 September 2017 (UTC)

Upcoming Wiki Science Competition[edit]

Did you hear about the Wiki Science competition, starting in November?

The competition will focus on images, but it might evolve in the near future, so users of other content platforms should take a look at it.

I've informed the village pump on Commons, since there will be an intense flow of technical uploads by newbies, which will require some better categorization and translation of descriptions here and there. More importantly, images can be used for the articles on specific platforms. I am thinking of some of your users who have created and taken care of many technical and scientific entries and are still currently active, such as User:SemperBlotto.

I give you some details.

In 2015, limited to Europe, we got thousands of entries; we can expect two or three times more this year. In the case of Italy, for example, we will send emails to many professional mailing lists, and other national Wikimedia chapters will use their social media too to inform the public.

Ivo Kruusamägi of WM Estonia and I have finished preparing some of the juries. I did my best to gather, besides people with a strong scientific background, also some expert Wikipedians (because I asked first on Wikipedia) here and there to take a look at the files on Commons and not just the quality of the images. I have also informed users on English Wikipedia and English Wikiversity, and will do the same on some other Wikimedia platforms in the following weeks.

The final international jury is made up of expert researchers, usually with an interest in photography, but no strong knowledge of the details of any Wikimedia platform. The main goal was to enlarge the network of "friends" of Wikimedia platforms. Some national juries should probably have enough expert Wikimedians and Wikipedians, I guess because of the presence of an active national chapter in their setup, so someone might take care of some of the uploads, at least improving some descriptions and/or using them directly. Sometimes, suggesting technical entries to be created too.

More generally, gathering users besides Wikipedians will probably help us to include more platforms in the competitions.

Now that I am sure that we have enough "scientists" here and there and from different fields, maybe we can see if we can also gather specifically expert Wikimedia users, whatever their background. For example, teachers rather than researchers, who can evaluate the quality of the images for more specific uses.

For the countries without juries, there is the possibility of creating a second-level jury to select images from the rest of the world for the experts of the final jury. For such a second-level jury I have found some names, but the number of entries could be really high, so maybe that's where we can look for more standard Wikimedia users.

If you are a citizen of a country with a national jury you could also join them directly (rumor has it, more will appear). I don't know the details in many cases, whether they need more jurors or they are fine.

Anyone interested?--Alexmar983 (talk) 05:59, 18 September 2017 (UTC)

I am not interested in being part of a jury but I thought it is very interesting that you knock here. Pictures made with Wikipedia uses in mind are quite different from pictures usable in Wiktionary. Here we also need to illustrate verbs and actions for tools (not only the tool itself) and more. I'll be very enthusiastic to integrate pictures from this competition in Wiktionaries if they fit our needs! Noé 07:36, 19 September 2017 (UTC)
With thousands of uploads, statistically someone could fit some needs also here... Noé I am happy if more people take a look, this should give better feedback for the future when the competition will be bigger and we can make it more specific to the needs of some wikiplatform. For example edit-a-thons. In the meantime, I have found another juror on frwikipedia, I am close enough to finalize the second-level jury. I am "sad" no one replied from Wikiversity yet.--Alexmar983 (talk) 12:56, 19 September 2017 (UTC)
Cool. Maybe you can try to ping the French Wikiversity, if some French Wikipedians can assist you. Noé 13:24, 19 September 2017 (UTC)

Wiktionary User Group


The Tremendous Wiktionary User Group is a coalition of users of Wiktionaries aimed at creating a common platform to share ideas and documents. It is also a way to lobby the Wikimedia Foundation to make it acknowledge the needs of our projects in terms of technical improvements.

This User Group is completing a revolution, a first year of existence! We are writing our first Annual Report (due September 26th). It's time to look at what was made during the year and to frame the future axes of action. There are 42 affiliates now but the group can include many more people. I invite you to read our works and to see if you want to participate in our actions. The most visible one is LexiSession but there is much more to do, including promotional material (leaflets, banners, stickers, etc.), inter-wiktionarian collaborations (on templates, Wikidata, policies and guidelines) and meet-ups! There are no fees or admission processes, it's open to everyone who likes Wiktionary and wants to do more about this project. Your ideas and initiatives are welcome!

Thank you for your attention, I hope to see you soon. Noé 08:14, 19 September 2017 (UTC)

Help review PulauKakatua19 (talkcontribs)'s entries

This user is editing in way too many languages for them to possibly understand all of them. I have checked and fixed all of the recent edits in Hindi, Bengali, and Sanskrit, but someone acquainted with Indonesian, Malay, and now Korean should check the rest. Atitarev (talkcontribs) warned them on their talk page about Russian a while ago too. —Aryaman (मुझसे बात करो) 01:00, 21 September 2017 (UTC)

I will check their Chinese, Korean and Malay ones. Wyang (talk) 01:11, 21 September 2017 (UTC)

Gfarnab (talkcontribs) back at it again

e.g. A recent error at . Someone please block them. Wyang (talk) 01:13, 21 September 2017 (UTC)

Wiktionary:Votes/cu-2017-09/User:SemperBlotto for checkuser

Could someone add this to Wiktionary:Votes/Active? Thanks. --2A02:2788:A4:F44:AC35:948A:635A:9569 18:16, 21 September 2017 (UTC)

Before any such thing (I don't even think anons can create votes to start with), have you at least asked @SemperBlotto if he's interested? --Robbie SWE (talk) 18:21, 21 September 2017 (UTC)
No, he isn't. SemperBlotto (talk) 20:17, 21 September 2017 (UTC)
That's what I suspected. The vote is therefore useless and will be deleted. --Robbie SWE (talk) 20:19, 21 September 2017 (UTC)

Parentheses in IPA

I really wish people would stop inserting these back into pronunciations. The only acceptable IPA use of parentheses is in w:ExtIPA, and those are subscript parentheses to represent partial devoicing. In fact, I've searched through various linguistics databases and can't find much evidence even of non-IPA uses, except their occasional use to denote silent articulation. Obviously this doesn't apply to the case in which Wiktionary editors are most commonly trying to use parentheses (optional articulation of ⟨ɹ⟩). It may be helpful, but it's wrong. The entry should show either both possible pronunciations separately or a more specific phonetic pronunciation. If people are going to keep using them, then Wiktionary as a whole needs to stop claiming they are using IPA's system, and admit they have their own in-house system. It's rather disrespectful to the creators of a standard to cherrypick what you wish to use. If the IPA thought parentheses were really that important, don't you think they would have standardized them by now? Pariah24 (talk) 20:31, 21 September 2017 (UTC)

Of course, IPA is disrespectful to the creators of the Latin alphabet, by the way they cherrypick that alphabet. Phonetic alphabets are used in great variation throughout the world, including many, many minor variants on IPA. And our use of parentheses has precedent; we have for crater /ˈkɹeɪ.tə(ɹ)/, and Kenyon and Knott's A Pronouncing Dictionary of American English (1953) has (among others) ˈkɹetə(r). --Prosfilaes (talk) 21:52, 21 September 2017 (UTC)
@Pariah24 It's not wrong. Peter Ladefoged transcribes the unstressed form of the as [(ð)ə] in broad phonetic transcription in the Handbook of the IPA (chapter 'American English'). Just because something is not officially endorsed (and I'm not so sure of that, have you tried asking the IPA itself?) it doesn't mean that it's wrong or that it shouldn't be used. Unless I'm missing something?
You're also a bit inconsistent in your edits. In martyr, you transcribed the AuE pronunciation /ˈmɑːtəɹ/, /ˈmɑːtə/, [ˈmäːtə], [ˈmäːɾə]. The order was wrong, as the pronunciation with the final /ɹ/ is marked and used only immediately before vowels, not the other way around. Also, the way the final [ɹ] is omitted in phonetic transcriptions suggests that it's there phonemically but not phonetically, which is of course completely wrong (if anything, again, it's the other way around). I've fixed that for you. Mr KEBAB (talk) 07:39, 27 September 2017 (UTC)

Braille entries

Should we reformat Braille entries like this?--2001:DA8:201:3512:BC46:AD88:D9A7:3939 16:39, 22 September 2017 (UTC)

(@Daniel Carrero --suzukaze (tc) 00:47, 23 September 2017 (UTC))
Mostly support. My opinion is this:
  1. I would suggest, in normal letter entries like a and also Braille letter entries like ⠁ (which is Braille for "a"), deleting all Latin script sections like Spanish, Portuguese, Italian, etc. because they clutter the entry and are basically infinite. The Translingual section can explain the Latin script letters.
  2. But, in Braille entries like the aforementioned ⠁, I would support keeping separate sections for Japanese, Arabic, Hebrew, etc. and other non-Latin script entries as opposed to keeping them all in the Translingual section.
  3. I would also support using proper categorization like Category:Arabic letters in Braille script (current redlink), with the written language and script. I would suggest using Category:Arabic letters in Arabic script (self-explanatory) for the normal alphabet.
--Daniel Carrero (talk) 03:41, 23 September 2017 (UTC)
@Daniel Carrero: I agree. --Backinstadiums (talk) 08:04, 23 September 2017 (UTC)

Wiktionary:Votes/sy-2017-09/User:Aryamanarora for admin

User:Aryamanarora has been nominated for adminship. Please voice your opinion on the page. Thanks! Wyang (talk) 13:56, 23 September 2017 (UTC)

Denoting long aspiration

In Northern Sami, there's a set of preaspirated consonants, but these consonants can be lengthened as well. When they are long, it is the preaspiration that lengthens rather than the occlusion itself. Usually, I've seen the preaspiration transcribed with just the letter h, e.g. hp, so that long preaspiration then becomes a matter of writing hːp. The few Northern Sami transcriptions that we have, and those on Wikipedia, use the superscript ʰ instead. I prefer the superscript, but writing ʰːp is probably less than ideal. I've written ʰpː instead on these occasions, but it doesn't really reflect the phonetic reality that it's the aspiration that lengthens. Any ideas? —Rua (mew) 18:29, 23 September 2017 (UTC)

How about hhp? — Eru·tuon 19:09, 23 September 2017 (UTC)
That's more or less equivalent to hːp. I would prefer to avoid h because there's also an actual phoneme /h/, and it's not part of these preaspirated consonants. /hːp/ is one phoneme, so I'd like it if the transcription reflected that. —Rua (mew) 19:14, 23 September 2017 (UTC)
You call it "less than ideal", but from your description I don't see much choice beside ʰːp, unless it's ʰʰp. —Aɴɢʀ (talk) 14:37, 24 September 2017 (UTC)
Yeah. I just hoped someone would think of something I hadn't thought of yet. @Tropylium any ideas? —Rua (mew) 15:14, 24 September 2017 (UTC)
In phonetic transcription there should be no problem with [hːp]. Phonetically there is no difference between [ʰ] and [h]. Even phonemically /hːp/ might be feasible. It is not universally agreed that these are unitary consonants; some analyses do consider them clusters /h/+/p/, in part precisely because it's the aspiration that lengthens and not the closure (similar to how the long counterpart of clusters such as /sk/ is /sːk/). In any case there is no contrast between /hp/ versus /ʰp/. --Tropylium (talk) 19:24, 24 September 2017 (UTC)
Ok, I'll just go with /hːp/ then. Thank you. —Rua (mew) 19:50, 24 September 2017 (UTC)
I would have picked /ʰʰp/, but it doesn't matter too much. --WikiTiki89 18:30, 25 September 2017 (UTC)


Do we have a language code for this under a different name? Used on জাৰ, নিগনি, translation at winter @Sagir Ahmed Msa, Aryamanarora. DTLHS (talk) 19:11, 23 September 2017 (UTC)

Also "Mymensinghiya", used on light. DTLHS (talk) 19:16, 23 September 2017 (UTC)
AFAIK both of these are Bengali dialects... Sagir probably knows better than me. —Aryaman (मुझसे बात करो) 20:11, 23 September 2017 (UTC)
@Aryamanarora: Nope, there's no code for these languages. Yes, these are considered Bengali dialects, just like Sylheti, Chittagonian, Rajbongsi etc. (Wiktionary has codes for these). Chakma and Rohingya (both are very closely related to Chittagonian) are not considered Bengali dialects, probably because their native speakers are not considered Bengali people. But these languages are not actual dialects. Some of these are more closely related to other languages than to standard Bengali. The same goes for Assamese dialects (I mentioned the Kamrupi language in মেকুৰী, which is considered an Assamese dialect). I think that, just like Sylheti, Chittagonian etc., these languages should also have codes. They have different phonology, grammar and even origins. The Dinajpuria and Mymensinghia words are not present in Rarhi-Nadia (standard Bengali); these are also closer to Rajbongsi and Sylheti respectively. Please check samples.

User:Sagir Ahmed Msa

@Sagir Ahmed Msa: Is grammar significantly different in these lects from Rarhi-Nadia? I'll admit I know little about Eastern Indo-Aryan, it's just ISO is usually generous with codes (e.g. a bunch of Hindi lects are given codes when they are often considered to be dialects). If we did add a code, bn-dnj etc. would be fine right? —Aryaman (मुझसे बात करो) 17:14, 25 September 2017 (UTC)
@Aryamanarora: yes you can make codes with "bn-" since they are generally considered as Bengali dialects.

-- Sagir

@Aryamanarora The code would be inc-dnj (see Wiktionary:Languages#Language_codes). DTLHS (talk) 19:10, 25 September 2017 (UTC)
@DTLHS: Whoops, typo on my part. But why not bn- since they are often considered Bengali dialects? That's probably not supposed to be argued about here though. —Aryaman (मुझसे बात करो) 19:12, 25 September 2017 (UTC)
Bengali is not a language family. DTLHS (talk) 19:15, 25 September 2017 (UTC)
Yes, I understand that now. So could we add inc-dnj (Dinajpuria) and inc-mym (Mymensinghiya), both with script Beng and ancestor inc-mgd? —Aryaman (मुझसे बात करो) 19:18, 25 September 2017 (UTC)
I am concerned that there is no information about these languages / dialects online- not even mentions of the language names. Since they would be WT:LDLs, what references would be used to support entries? DTLHS (talk) 19:21, 25 September 2017 (UTC)
@DTLHS: [1] seems promising, and attests to the lack of mutual intelligibility between these dialects... But I am not sure whether they deserve codes. Would {{lb|bn|...}} not suffice @Sagir Ahmed Msa? —Aryaman (मुझसे बात करो) 19:27, 25 September 2017 (UTC)
@Aryamanarora, Aryaman:

Here are some examples. Unfortunately I couldn't find Mymensinghiya tenses, so I'm comparing with Dhakaiya; they are closely related to each other, and Mymensinghiya is more distinct from Standard Bengali than Dhakaiya.

  • English :
  1. I do.
  2. I am doing.
  3. I did.
  4. I was doing.
  5. I will do.
  6. I will be doing.
  • Dhakaiya :
  1. Ami kôri.
  2. Ami kôrtasi.
  3. Ami kôrsi/kôrsilam.
  4. Ami kôrtasilam.
  5. Ami kôrmu.
  6. Ami kôrtê thakum.
  • Bengali:
  1. Ami kôri.
  2. Ami kôrchi.
  3. Ami kôrêchi.
  4. Ami kôrchilam.
  5. Ami kôrbo.
  6. Ami kôrtê thakbo.
  • Assamese:
  1. Môi kôrû.
  2. Môi kôri asû. (kôri = kôrat)
  3. Môi kôrisû/kôrisilû.
  4. Môi kôri asilû.
  5. Môi kôrim.
  6. Môi kôri thakim. (kôri = kôrat)
  • Rangpuri/Rajbongsi/Kamata:
  1. Muĩ kôrû.
  2. Muĩ kôrûsû.
  3. Muĩ kôrsinû. (And kôrsû?)
  4. Muĩ kôrûsinû.
  5. Muĩ kôrim.
  6. Muĩ kôrtê thakim.

-- Sagir

Wiktionary:Votes/sy-2017-09/User:Justinrleung for admin

Another veteran editor of Wiktionary, User:Justinrleung, has been nominated for adminship. Please voice your opinion on that page. Thanks! (Vote closes on 8 Oct.) Wyang (talk) 00:39, 24 September 2017 (UTC)

Should sense ids be distinct across pages?

Israel and State of Israel have the same sense id. I can’t imagine that this will cause any problem, since a sense id will presumably always accompany a pagename, or do we want to ensure universal uniqueness? — Ungoliant (falai) 13:43, 25 September 2017 (UTC)

In principle, it might take pagename + etymology + PoS + senseid to guarantee uniqueness in English, at least if the senseid is poorly chosen (eg, noun and verb spelled the same each used by itself in two different senseids for different PoSes). In the absence of etymology, pronunciation might be required. In some FLs gender might be needed. I wonder what requirements exist in other languages. This seems messy.
Do we have anyone running comprehensive checks for this kind of thing against the XML dumps? For example, I use {{sense|genus}} under synonyms, hypernyms, and hyponyms header in taxonomic entries, but I sometimes need to differentiate by taxonomic family, order, etc. to ensure uniqueness. Have I always done so? I haven't been checking for that. DCDuring (talk) 17:00, 25 September 2017 (UTC)
Doesn't the same kind of problem exist to a vastly greater extent in FL sections where definitions consist of a single polysemic English word, with no disambiguating gloss? DCDuring (talk) 17:06, 25 September 2017 (UTC)
Yes, it does. That’s a major problem with our FL content. Our definitions in certain languages (Italian and Spanish come to mind) are still too poor to be used as my primary source of information. — Ungoliant (falai) 17:19, 25 September 2017 (UTC)
I try to fix these for Dutch whenever I spot them, but it's an uphill battle. Finding them is difficult enough. —Rua (mew) 18:05, 25 September 2017 (UTC)
(edit conflict) The only thing that has to be unique is the combination of page name, language, and sense id. Sense ids appear in a link to an entry that contains the entry name, the language name, and the bit of sense id text; they are not used without a language name or as a substitute for a page name. So they do not need to be unique across pages; if they were, they would probably be too long or unintuitive. I think they should be as short as possible, because they have to be plugged into the |id= parameter of link templates. (However, I just searched and discovered a very long sense id for radical in English: linguistics: portion of a character that provides an indication of its meaning. Oh well.) — Eru·tuon 18:12, 25 September 2017 (UTC)
Can we agree that senseids must be unique within a language section? — Ungoliant (falai) 16:38, 27 September 2017 (UTC)
They wouldn't work if they weren't. --WikiTiki89 16:46, 27 September 2017 (UTC)
Sounds like a good rule. DCDuring (talk) 18:06, 27 September 2017 (UTC)
Yes, that is a restatement of what I meant by "the combination of page name, language, and sense id must be unique". It may not have been very clear. — Eru·tuon 18:47, 27 September 2017 (UTC)
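
The rule agreed on above (the combination of page name, language, and sense id must be unique) could be checked mechanically. The following is only a minimal sketch: it assumes the sense ids have already been extracted from a dump into (page, language code, sense id) tuples, and the names SenseRef and duplicate_sense_ids, as well as the sample ids, are hypothetical and not part of any existing Wiktionary tool.

```python
from collections import Counter
from typing import Iterable, List, NamedTuple


class SenseRef(NamedTuple):
    """One sense id occurrence (hypothetical record, not a Wiktionary API)."""
    page: str      # entry name, e.g. "Israel"
    lang: str      # language code of the section, e.g. "en"
    sense_id: str  # the id text used in the |id= parameter of link templates


def duplicate_sense_ids(refs: Iterable[SenseRef]) -> List[SenseRef]:
    """Return every (page, lang, sense_id) combination that occurs more than once."""
    counts = Counter(refs)
    return sorted(ref for ref, n in counts.items() if n > 1)


# Made-up sample data: the same id on two different pages is fine,
# but the same id twice within one language section of one page is not.
refs = [
    SenseRef("Israel", "en", "example-id"),
    SenseRef("State of Israel", "en", "example-id"),
    SenseRef("radical", "en", "linguistics"),
    SenseRef("radical", "en", "linguistics"),
]

for dup in duplicate_sense_ids(refs):
    print(f"duplicate sense id {dup.sense_id!r} in [{dup.lang}] {dup.page}")
```

With this key, Israel and State of Israel sharing an id is not reported, which matches the conclusion of the thread; only a repeat within one language section of one page would be flagged.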

Modern Greek terms spelt with Latin characters

@Xoristzatziki has just speedied the Greek entry at marketing (marketing). I assume the reason is that the word is not written in Greek letters. Since the entry in question was reviewed by experienced editor Saltmarsh, and since marketing is trivially attestable in running text, I think its inclusibility should be at least discussed. — Ungoliant (falai) 16:26, 27 September 2017 (UTC)

Yes - and I thought hard about it at the time. I have frequently considered raising the subject here (TLDR generally stops me). As an ageing Englishman I can feel annoyed at myironic language being mangled by others (I heard an Englishman say "crawfish" on the radio this morning - I'm sorry, we say "crayfish"); I can also understand @Xoristzatziki's anger when the same thing is happening to his language. Greek web pages show it too: the first supermarket site I look at has "FRANCHISE", "CLUB CARD" and "SUPER MARKET". When I go to Greece I feel sad that packaging and billboards are similarly invaded. A quick look at my w:Babiniotis Dictionary shows the Latin script "status quo" (we have it as an English term as well as Latin) and other Latin terms, but only a few English terms (I only find NATO in the time available). To pick an easy example - "weekend" is common in Greek text (the Académie française fought against it for years); we have entries for 5 other languages, and it even declines in Polish! But perhaps marketing is better than μάρκετινγκ? — Saltmarsh. 06:15, 28 September 2017 (UTC)

There is no such term as Modern Greek terms spelt with Latin characters. Please do not try to alter a language out of nothing. If you think Greeklish should be a new language in Wiktionary, make a proposal. --Xoristzatziki (talk) 16:33, 27 September 2017 (UTC)

As a descriptive dictionary, if a word is used in texts by Greek speakers it can be included. @Saltmarsh DTLHS (talk) 16:42, 27 September 2017 (UTC)
In that sense all English words should contain a Greek section... And sections for all other languages also... Or only Greeks and Cypriots use in texts signs and words from English? Chinese do not do it? --Xoristzatziki (talk) 16:54, 27 September 2017 (UTC)
Category:Chinese terms written in multiple scripts, Category:Chinese terms written in foreign scripts. DTLHS (talk) 16:59, 27 September 2017 (UTC)
This is something else. Please do not confuse us. We are talking about the usage of real English words. Not for terms that cannot be otherwise identified (σ鍵). There is not a single English word in the above mentioned category although ex. fast-food is written, as stand alone word, in more Chinese restaurants around the world than marketing is written in Greek "googloid" texts. --Xoristzatziki (talk) 17:08, 27 September 2017 (UTC)
Look at the second category (Category:Chinese terms written in foreign scripts), especially band, size, and friend. --WikiTiki89 17:26, 27 September 2017 (UTC)
@Ungoliant MMDCCLXIV Could you link some examples of "marketing" being used in Greek texts? DTLHS (talk) 17:11, 27 September 2017 (UTC)
google books:"το marketing". Compare with google books:"το μάρκετινγκ". --WikiTiki89 17:26, 27 September 2017 (UTC)
[2], [3], [4], [5], [6], [7], [8]. Some of them also use μάρκετινγκ elsewhere in the text. — Ungoliant (falai) 17:30, 27 September 2017 (UTC)

Apart from all that, I could agree with such a "descriptive"(!?) approach if all languages were treated the same way. marketing should include every language for which Google returns that word when a specific language is asked for. And, anyway, I will not revert such Greeklish entries if the dominant position of volunteers in Wiktionary is to create such "Modern Greek terms spelt with Latin characters".--Xoristzatziki (talk) 17:18, 27 September 2017 (UTC)

@Wikitiki89, @Ungoliant MMDCCLXIV one thing is sure. You do not know how Google works (and especially their sales department together with Google Books). Otherwise you would come with true results, mentioning counts relative to the time they were written, to whom they are addressed, how many are duplicating or copying or attesting other books etc. etc. --Xoristzatziki (talk) 17:39, 27 September 2017 (UTC)

It's true that relative numbers of Google Books hits aren't that useful. One thing I noticed is that one hit displayed on the results page for the "το μάρκετινγκ" search has "το Marketing" highlighted as the search term- you have to wonder if there's some bleed-through between languages in their search algorithms. That said, such things are beside the point when it comes to CFI: there are enough viewable hits to satisfy CFI- if they really are using the term to convey meaning in Greek. The latter is the tricky part. Chuck Entz (talk) 22:03, 27 September 2017 (UTC)

Based on User:DTLHS's idea of a "descriptive dictionary" and the whole conversation above, assuming I can provide enough sources written in English as the main language (electronic or printed) which have or inside the text, is it safe to assume that this is an indication to add an English section to these terms? (I have in my hands at least two such books) --Xoristzatziki (talk) 05:34, 28 September 2017 (UTC)

Do the Chinese characters in your texts convey meaning (Wiktionary:Criteria_for_inclusion#Conveying_meaning)? Are they being used as English words and not just mentioned as Chinese characters? It can be hard to answer these questions for a non-native speaker, which is why I don't know if the Greek quotes linked above would qualify. DTLHS (talk) 05:46, 28 September 2017 (UTC)
The "Conveying_meaning" does not mention at all the script. Only mentions words in the same script (which should be considered as "only words in the same script" and not the opposite). The fact that many people who speak a second language prefer to pronounce some words in the way they are pronounced in that second language does not make the pronunciation of these words part of the pronunciation of the first language. Just as the "USA" pronunciation of words by enough people living in London does not make that pronunciation British. A book targeted to a specific group might contain anything that the target group can identify. A book containing emoticons might have emoticons inside sentences used not as an example but as a full sentence. That does not mean the emoticons are part of a specific language. (Or they are now? Can you spell 😂 in English or in any other spoken language? Or are we not interested in pronunciation from that point forward? Just "a printed icon" of "example" is enough? Should we start converting any word to a picture and stop writing it here but include it as a picture?) --Xoristzatziki (talk) 09:25, 28 September 2017 (UTC)
Yes, if enough people in London pronounce something some way, that pronunciation is British. That's what "descriptive" means. Cross-lingual pronunciations are complex, but again, descriptive means that the pronunciation of a word is many times going to be foreignized. I wouldn't say it was safe to assume that 羊 is part of English, but there would certainly be an argument if it was used in running text, particularly if it was treated as an English word. There's also complexities here; English absorbs all sorts of random accents in rare words, like ʔAllāt, or Greek letters, in cases like γ-globulin, and odd characters like ℝ-order tree. But scripts outside Greek rarely get used; ℵ₀ is an odd example, and Cyrillic occasionally leaks through, like СССР, but I'd be very surprised by Chinese characters. I'd expect Greek to be a similar spot; Latin getting mixed in sometimes, with other scripts being rare.--Prosfilaes (talk) 10:02, 28 September 2017 (UTC)

As a native reader of Chinese characters, I agree with User:Xoristzatziki 100% here. If native speakers do not treat these words as their language, do not include these so-called "attestable" words in the comprehensive monolingual or bilingual reference dictionaries they produce, there is really no point in including them. The native speakers (not language regulators) have the best Sprachgefühl regarding what is their language, what is sum of parts in their language, what part of speech a word is (for analytic languages), and often the script is a formidable barrier to something being considered 'their language'; it is a very bad idea to argue against the perception of native speakers, and say this is your language when they are native in it. I'm sure native English speakers would be similarly concerned if a user starts to mass-create "English" entries of a similar nature, even if it is just Latin-script perro ([9], [10], [11], [12]). It is just the case that English is the overwhelming exporter of these uses in other languages, but all languages have principles as to what can be considered part of their language and what can not; not all words a Chinese person says or writes when they speak Chinese is Chinese ― they can mix a lot of English, Malay, Japanese, etc. words in, depending on where they are and how much Chinese/other languages they know. Likewise, Latin-script marketing in running Greek text is just not Greek. Wyang (talk) 11:11, 28 September 2017 (UTC)

I am totally with Xoristzatziki and Wyang on this one. Imagine a published dictionary where marketing is marked as a Greek, Russian, Armenian, etc. word. We have, unfortunately, allowed some Latin-script words to enter CAT:Chinese terms written in foreign scripts; they are mostly slang, and very few are standard Chinese (Mandarin), and, unlike Greek, Russian and other alphabet-based or phonetic languages, there is no Chinese script to render those words phonetically. Most of these terms wouldn't pass if they were in a respected published dictionary. I'd like to mention again that a language such as Chinese needs a separate CFI for various reasons. --Anatoli T. (обсудить/вклад) 13:26, 28 September 2017 (UTC)
You may want to RFD the contents of Category:Greek terms written in Latin script. — Ungoliant (falai) 13:35, 28 September 2017 (UTC)
Gone. There wasn't even an attempt to provide citations for those. As far as I am concerned, they are all against our policies and the common sense. --Anatoli T. (обсудить/вклад) 13:47, 28 September 2017 (UTC)
There goes ain't and fuck, which weren't English words in the comprehensive monolingual or bilingual reference dictionaries of English for a long time. It also strands "English" words that aren't English anymore, that are being used in ways that no native speaker of English would use the word. I don't know about marketing, but this is not a simple case.
Also, we're a descriptive dictionary. The "correct" writing style frequently differs from the writing style in actual use. If digging around in the newsgroups, we find a few million words of Latin-script Greek, then of course we should record that.--Prosfilaes (talk) 01:20, 29 September 2017 (UTC)
These are different cases: one (ain't, fuck) where the words are deemed nonstandard or vulgar by dictionary makers, and the other where the words are rejected outright by native speakers as simply being foreign words mixed into speech or writing, much like the example of this perro above. We certainly should not include a few million words of Latin-script Greek; that will only lead us to become a laughing stock, and lead to complete dismissal by Greek speakers. Wyang (talk) 05:46, 29 September 2017 (UTC)
It's not different cases; they're both cases where dictionary writers consider a word not a word, because it's not proper. I wasn't talking about native speakers; I was responding to where you were talking about words considered inappropriate to include by dictionary makers.
What other corpuses should we ignore? All that Hebrew-script German Jargon? Scots (the very existence of the Scots Wikipedia seems to get a lot of mockery)? Should we delete Category:Macedonian language because that might cause complete dismissal by Greek speakers? If we have a corpus of several million words, we should record it.--Prosfilaes (talk) 07:14, 29 September 2017 (UTC)
I'm speechless... You are insisting ain't/fuck and marketing in Greek are of the same nature, so ― I was able to find ain't in many English dictionaries: Marriam-Webster, Oxford Dictionary, Cambridge Dictionary, Collins Dictionary, MacMillan Dictionary, American Heritage Dictionary, Longman Dictionary of Contemporary English; can you find a Greek dictionary that includes the Latin-script word marketing as a Greek word? We've already got a native Greek speaker complaining that we are butchering their language, why? Because non-native speakers and non-speakers are dictating their language, often in a self-assumed manner, as if we know what is best for their language. We are sometimes trapped in the mindset of our own rules, so trapped that we have lost touch with reality, with common sense. Show a native Greek speaker the texts containing marketing and ask them what this is, and they would unanimously tell you this is an English word mixed into a Greek text, and the author is trying to show off that they are professional, up-to-date with the lingo and superior with their knowledge. Ask them what μάρκετινγκ is, and they will tell you it is a Greek word borrowed from English. Yet, we decide for the Greeks, ruling that marketing is their language, as well as several million more Latin-script 'Greek' words. Of course this is going to lead to displeasure, dismissal, and ridicule amongst the Greeks, the native help from whom we desperately and paradoxically need here. Wyang (talk) 09:10, 29 September 2017 (UTC)
The issue boils down to having criteria that allow us to draw a line between “Y-language word used in language X” and “loanword from language Y that has been borrowed into X”. This is not as self-evident as some here seem to think, and consulting n people will yield n different opinions as to which words are the former and which are the latter.
Since no one is arguing for the deletion of μάρκετινγκ, it seems that you want to use script as a criterion, which is not at all unreasonable, but do discuss it rather than removing the entries without explanation. — Ungoliant (falai) 11:54, 29 September 2017 (UTC)
There's nothing to discuss here, really. No need to create votes and policies for the obvious, natural and universally accepted rules - languages are written in native scripts, romanisation and words in other scripts are not words in those languages. People who imagine that Greek or any other language written in a non-Roman script can be written in scripts other than the native one should be the ones seeking approvals, not the ones who protect the sanity and quality of this dictionary. --Anatoli T. (обсудить/вклад) 12:31, 29 September 2017 (UTC)
Apparently you have to teach that to Nikos, who created the Greek entry at marketing (marketing) (and to the Greeks who keep using marketing in their books). — Ungoliant (falai) 12:38, 29 September 2017 (UTC)
Nikosks (talk • contribs) was the user who created the section in 2016. They only had two edits, spaced less than twenty minutes apart, one on management, and one on marketing, both edits involving the addition of a Greek section. Looking at their edits at the time (edit to management and edit to marketing), this may be the case of an innocent newbie mistake after all: they thought the Greek sections on these entries can be used to hold translations. Wyang (talk) 12:54, 29 September 2017 (UTC)
I doubt the user even knew Greek. Both terms were created as masculines but they are neuters - both μάνατζμεντ (mánatzment) and μάρκετινγκ (márketingk). We often have this type of entry made by clueless users. A while ago [[ghar]] was created with a definition something like "This is a Hindi word for 'house'". (The history is now overwritten, as the entry was deleted.) The correct entry is, of course, at घर (ghar). --Anatoli T. (обсудить/вклад) 04:57, 30 September 2017 (UTC)

It's pretty obvious that these are English words being used in Greek text (probably for convenience), not integral Greek words. —Aryaman (मुझसे बात करो) 15:44, 28 September 2017 (UTC)

Another possibility is code switching. People who are bilingual sometimes switch between languages because different languages have different associations: using one's native language for personal, emotional topics, using another language to evoke a certain style, or yet another to show one is up on the latest in a field dominated by speakers of that language. This would probably be the latter: if the field of internet marketing is dominated by English-speakers, one might throw in an occasional bit of English internet marketing terminology to give the appearance of being well-versed in that type of thing. We don't see as much of that in English nowadays because English speakers are less bilingual and don't care as much about other languages, but there are specialized areas such as religions like Islam or Catholicism based in other languages or academic fields where you can see it, and it was once common for educated people to throw in the occasional Greek, Latin or French term in ordinary conversation. Chuck Entz (talk) 16:22, 28 September 2017 (UTC)

FWIW, I found back in 2011 that both ἄρχων and Москва were citable in running English text. The latter was deleted per RFD, but if we decide to keep marketing (marketing) et al, it would be easy to cite a bunch more like ἄρχων. IMO, it's better to analyse it as code-switching. We consider the presence/absence of italic script when trying to determine if a Latin-script foreign-language phrase or term has been borrowed into English or only mentioned, and it seems appropriate to me to consider the presence/absence of native script similarly. - -sche (discuss) 19:15, 6 October 2017 (UTC)

Project Grant proposal for Lingua Libre

Lingua Libre's logo


Lingua Libre is an open-source platform created to ease mass recording of word pronunciations into clean, well-cut and well-normalized audio files. Given a clean word list, recording productivity can reach up to 1000 audio recordings per hour, i.e. ten times faster than the best method described on Help:Audio pronunciations and without requiring any technical skills.

It's currently supported by a team of volunteers (mostly French, including French Wiktionary administrators). Although the core recording tool is fully functional and very efficient, it currently suffers from very poor integration with the Wikimedia projects. To accelerate the development of this tool and overcome these problems, we have submitted a Project Grant proposal. If you're interested in this project, take a look at the proposal on meta: meta:Grants:Project/0x010C/LinguaLibre. Don't hesitate to ask questions there if you feel there are ambiguous points, or to endorse the project if you wish to see it come to fruition!

Furthermore, if you want the English Wiktionary to benefit from these audio recordings (through a bot, or some other way), please get in touch with me! 0x010C ~talk~ 17:21, 27 September 2017 (UTC)

Little gnomes at work?

Who was the little gnome that removed the arrow symbol from references? I kind of miss it, and there is a gap where it should be. The way watched pages (such as pages one has created) are presented has also changed. DonnanZ (talk) 12:09, 29 September 2017 (UTC)

I don't recall seeing such an arrow, but it sounds like it may have been added by a gadget, which may have been broken by the recent updates to the site software, or the fact that the "References" header has been changed to something else in many entries, or the recent cleanup of old gadgets. Sorry I can't be of any more help than that. - -sche (discuss) 19:24, 6 October 2017 (UTC)


The name of this category is a bit strange. Aren't we supposed to use only nouns? --Barytonesis (talk) 19:39, 29 September 2017 (UTC)

There is also Category:en:Nautical using an adjective. But I prefer "Automotive" to Category:en:Auto parts. DonnanZ (talk) 08:54, 30 September 2017 (UTC)

Wikimedia Movement Strategy phase 2, and a goodbye


As phase one of the Wikimedia movement strategy process nears its close with the strategic direction being finalized, my contractor role as a coordinator is ending too. I am returning to my normal role as a volunteer (Tar Lócesilion) and wanted to thank you all for your participation in the process.

The strategic direction should be finalized on Meta late this weekend. The planning and designing of phase 2 of the strategy process will start in November. The next phase will again offer many opportunities to participate and discuss the future of our movement, and will focus on roles, resources, and responsibilities.

Thank you, SGrabarczuk (WMF) (talk) 21:55, 30 September 2017 (UTC)

Language userboxes: by country/region?

I don't need it, but I just wondered: can our language userboxes support specific country/region, e.g. British English, or Swiss German? (And if not, should they?) Equinox 21:55, 30 September 2017 (UTC)

Do you mean the Babel boxes? Mine says "This user is a native speaker of British English". DonnanZ (talk) 11:59, 1 October 2017 (UTC)

Desinence as a POS

I suggest adding desinence (inflectional ending) as a POS header/category; I think it would be good to differentiate them from suffixes in general. Crom daba (talk) 23:37, 30 September 2017 (UTC)

I think "desinence" is a very obscure term that most people wouldn't know. What about just "inflectional suffix"? DTLHS (talk) 23:48, 30 September 2017 (UTC)
I thought it was that lovey-dovey feeling. 20 seconds of brain-searching later, I realise I was thinking of limerence. Equinox 00:07, 1 October 2017 (UTC)
(edit conflict) In principle, that sounds nice, but desinence is a lousy name (how many people know what it is without looking it up?), and the Indo-European languages we're used to are deceptively simple when it comes to the types and hierarchy of affixes. For instance, Bantu languages show number with prefixes in many cases, and, as you know, agglutinative languages throw in all kinds of things represented by separate words of various parts of speech, with the lines separating inflection, derivation and syntax getting thoroughly tangled. Chuck Entz (talk) 00:02, 1 October 2017 (UTC)
We call them suffixes, but for many languages we do distinguish between inflectional and derivational suffixes (e.g. CAT:Irish inflectional suffixes and CAT:Irish derivational suffixes). Note that not all inflectional affixes are suffixes, e.g. Maltese ni-, ti-, ji- (and their equivalents in other Semitic languages) are prefixes, i.e. endings that are actually "beginnings". —Aɴɢʀ (talk) 15:07, 1 October 2017 (UTC)

October 2017

October LexiSession: punishment

The Punishment of Loki.

The monthly suggested collective theme is punishment. Not so funny, but the 10th of October is the World Day Against the Death Penalty so we may look at the alternatives and do better descriptions around this theme.

LexiSession is a collaborative experiment without any guide or direction. You're free to participate however you like and to suggest next month's topic. If you participate, please let us know here or on Meta, to keep track of the evolution of LexiSession. In one year, 35+ people have participated! I hope there will be some people interested this month, and if you can spread it to another Wiktionary, you are welcome to do so. Ideally, LexiSession should be a booster for every project at the same time, to give us more insight into the ways our colleagues work in the other projects.

See you soon, Noé 09:28, 1 October 2017 (UTC)

slow slicing and poena cullei are my requests for this month (entries for these, not as punishment...although, they could be useful deterrents for trolls...) --P5Nd2 (talk) 09:39, 8 October 2017 (UTC)


Adding translations to too many unrelated languages. No idea where they get transliterations for Chinese dialects, such as Jin, Gan, Xiang, etc. --Anatoli T. (обсудить/вклад) 10:51, 1 October 2017 (UTC)

How the heck did they find Prakrit translations? They must be going through the entries we have already. —Aryaman (मुझसे बात करो) 20:44, 1 October 2017 (UTC)

French Wiktionary September news



Hey! The September issue of Wiktionary Actualités just came out in English!

In this issue: Comments about press articles, our information desk is not like yours, a description of a dictionary of short-text signs, a comment on the expression of gender in an Andean language, some cool videos about words (in French and English!), announcements for the Wikiconférence francophone in October and plenty of statistics with fancy fleurons surrounding it all!

As usual, it is translated into English by non-native speakers, in less than a day, and it is not perfect, but it can be improved by readers (wiki-spirit). We did not receive any money for this publication and we are not supported by any user group or chapter. It is only written by the community, for the large community of lexicolovers! I hope you did not feel harassed by this notice. Noé 21:41, 1 October 2017 (UTC)

PulauKakatua19 (talk • contribs) again

This user is adding spurious Hittite entries, with some really bad/outdated etymologies. No references either. They have been warned many times for Russian, Hindi, and Rohingya edits. I suggest a week-long block. —Aryaman (मुझसे बात करो) 14:51, 3 October 2017 (UTC)

Etymological information for strong verb non-lemma forms

There are many forms, e.g. in English, where arbitrary, irregular, strong forms of verbs deserve their own etymology. Many of these individual forms have received particular attention from linguists over the decades, e.g. did (past tense of do, from a unique, non-past reduplicated root form of the ancestor of do dating back to Pre-Germanic, for unknown reasons), sang (past tense of sing derived directly from a Proto-Indo-European form of the ancestor of sing). These non-lemma forms have their own independent etymological lineage that can be traced back thousands of years.

A certain administrator (Rua) has informed me that it is policy on Wiktionary to minimize etymological information on non-lemma forms, and instead place such information in the lemma form's etymology section. This can be understood for weak forms like walked, but those forms need little explanation because they are formed regularly, and for the forms that do require extra explanation, it makes for unsightly etymology pages on the lemma form's etymology sections (see Proto-Germanic *dōną's etymology section for the current policy specification; it doesn't even specify the past form *dedǭ, referring to it only as "the past form").

I understand the concern to avoid etymology fragmentation, but in this case, the etymology itself is fragmented and the two forms are remembered as separate, arbitrary, irregular forms. Perhaps there is a solution to maintain the same etymology information in multiple pages, but I think the most simple solution would be to provide etymological information for such forms on their own pages. There is really no reason to avoid this practice and it only makes things more confusing. I am surprised that this is against current policy. Do you agree with this assessment? 16:17, 3 October 2017 (UTC)

Strongly oppose putting etymologies on every inflected form, irregular or not. —Rua (mew) 16:35, 3 October 2017 (UTC)
  • Out of curiosity -- where should such etymological information go? Some simple-present verb forms include etymological information for irregular conjugated forms, such as at [[go#Etymology 1]]. Others do not, such as at [[do]], which includes no explanation for the formation of [[did]]. ‑‑ Eiríkr Útlendi │Tala við mig 17:29, 3 October 2017 (UTC)
    • At the lemma entry, where we currently already place them. The IP is arguing that we should put etymologies on nonlemma entries too, which is going to lead to a huge duplicative mess. —Rua (mew) 17:47, 3 October 2017 (UTC)
I am proposing a move of the notable etymologies from the lemma to the non-lemma forms, if they are notable as in strong verbs. There is no duplication going on, only a move, as I indicated in the OP. 17:58, 3 October 2017 (UTC)
Then I still oppose it, the etymological information should be centralised on the lemma form. That's how all etymological dictionaries work, that's how we've worked so far too. Our users are accustomed to follow the link to the lemma for information, which is the purpose of non-lemma entries in the first place. They're there to help get users to the right place, nothing more. We should not scatter our information across various non-lemmas. —Rua (mew) 18:34, 3 October 2017 (UTC)
Traditional etymological dictionaries are constrained by the space in a book and give priority to lemma forms because they are the most popular. There is no real reason to ignore non-lemma forms or centralize their etymologies because Wiktionary doesn't have a size constraint, especially for adding reasonable information. I disagree that the only purpose of non-lemma forms is to provide a link to a lemma form; many non-lemma forms have lineages in their own right and there is no reason to marginalize them. Furthermore, users are not accustomed to follow the link to the lemma forms as you suggest; precedents for separate etymologies for non-lemma forms like done, is, are, am, etc. already exist and have existed for a long time. 18:37, 3 October 2017 (UTC)
Size constraint isn't the issue. It's keeping our information organised so that information can be found easily. And what I said is the agreed-upon purpose of non-lemma forms. It's why we don't include things such as derived terms, descendants or inflection tables on non-lemmas. Wiktionary is fundamentally lemma-oriented (or lexeme-oriented) rather than word-oriented. If we were word-oriented, we'd also include full definitions on non-lemmas, but thankfully we've been wise enough to not follow that idea. —Rua (mew) 18:42, 3 October 2017 (UTC)
Semantically and synchronically, what you're saying is correct; non-lemma forms don't require separate definitions. The placeholders used now are adequate. Etymologically and diachronically, it's incorrect. Irregular non-lemma forms are entirely independent of their lemma forms. Wiktionary is a semantically lemma-based dictionary, but that's completely unrelated to etymology. There is good reason for irregular non-lemmas to provide etymologies, and the semantic value of the terms has no bearing on it. 18:46, 3 October 2017 (UTC)
Ok, but as you must understand by now, that's not how Wiktionary works. The etymologies for the individual parts are noted on the lemma. You'll simply have to adapt to this practice. We're not going to change it just because some random user doesn't like it. —Rua (mew) 18:48, 3 October 2017 (UTC)
You are not arguing against my point, you are arguing your point because "that's the way it's always been done" and based on ad hominem because I'm "some random user". 18:54, 3 October 2017 (UTC)
  • I am not proposing putting etymologies on "every inflected form", only on the arbitrary forms with their own separate, traceable etymologies, if only to indicate their significance. The regular forms don't require etymologies because they are predictable. E.g. the etymology for the strong non-lemma form of sing, which is sang:
From Old English sang, from Proto-Germanic *sang, from Proto-Indo-European *songʷh-, o-grade past tense of *sengʷh- (sing, make an incantation).
Right now, the article sang doesn't indicate any of this lineage at all. As a strong and unpredictable form, lexically, sang is just as prominent as its form which is arbitrarily deemed the lemma form, sing, which independently derives from a different PIE form. There is no reason to treat it as a secondary form etymologically, at least in this case. 17:36, 3 October 2017 (UTC)
That's not any better. Consider how many times we'd have to duplicate the etymology for all 12 of the past tense forms of vera or syngja. The lemma is a natural place for etymologies, since it's a single central entry that covers all inflected forms. —Rua (mew) 17:47, 3 October 2017 (UTC)
That's not duplication, that's providing the very separate etymologies for very separate forms. If the forms merge at a certain point, then a link can be provided to the form from which they split off to avoid etymology duplication, as is done with borrowed terms. The words is, are, and were, for example, are all forms of be, but does that mean these forms should not provide their own etymologies? Are the etymologies of these forms of less interest and notability than any other term? They are not. 17:56, 3 October 2017 (UTC)
  • I could be wrong, but I don't think Rua is arguing that the etymologies of conjugated forms are not worthy of inclusion. I believe that she is instead arguing that the etymologies of conjugated forms should go within the etymology of the lemma form, and that the conjugated-form entries should be minimal.
The issue at hand is not whether to include or exclude certain information -- rather, it is about where to include that information. ‑‑ Eiríkr Útlendi │Tala við mig 18:09, 3 October 2017 (UTC)
Right. I am proposing that it should go in the non-lemma form. In fact, what I'm proposing is already standard practice for many notable forms, e.g. done. I want this to be consistent. For verbs like be, it would clutter its etymology section to list all of the etymologies for all of its many suppletive forms is, are, were, was, am, etc. One interested in these etymologies can follow the links to these forms' pages (which are provided in the term head) and view the etymology. More importantly, someone who specifically searches for irregular forms should have immediate access to their etymologies on the same page.
When I go to the page for am (which actually already follows the format that I'm proposing; I don't think anyone would want to move its etymology to the be page), I want to know the etymology of the term. When dealing with etymology, I don't really care if it's a form of any other (in this case, a completely unrelated lemma form), I want immediate access to its own unrelated and notable etymology. I believe this seems fairly reasonable and already has precedent. 18:20, 3 October 2017 (UTC)
The lemma entry is a central place for the term and all of its inflections. Information about am concerns the lemma be, so it should go there. The individual parts of verb paradigms may have separate origins but they don't have separate etymologies because they are inherited as a whole. The verb be in modern English is the same paradigm as the verb been in Middle English. —Rua (mew) 18:38, 3 October 2017 (UTC)
That's incorrect. Separate forms of a verb are not "inherited as a whole". That doesn't even make any sense. Irregular forms all require individual memorization and passing down. The lineage of was, for example, is entirely separate from be, as are both from am. If what you were saying was true, all Indo-European languages would still preserve the verb paradigms of Proto-Indo-European. They do not. They mix, they match, they innovate, they supply. 18:41, 3 October 2017 (UTC)
But they still form a single verbal paradigm. A question like "what is the past tense of be?" has an answer precisely because paradigms exist. We have chosen to use a single form to stand in for the entire paradigm, the lemma form, for convenience. That's where etymologies also go. —Rua (mew) 18:46, 3 October 2017 (UTC)
Etymologically, verbal paradigms don't matter for irregular forms. We have chosen the lemma form to stand in for the non-lemma forms semantically, but we have not done so etymologically, because that makes no sense. 18:48, 3 October 2017 (UTC)
We've chosen to do both. I'm sorry if that makes no sense to you, but it is what it is. —Rua (mew) 18:49, 3 October 2017 (UTC)
Please point me to the specific place in Wiktionary policy that says this. I will propose the change through the proper channels. 18:53, 3 October 2017 (UTC)
I found it myself, and lo and behold, you seem to be the one who added this into the "common guidelines" page in the first place. While I agree with most of your additions, exclusivity of etymology to the lemma page is one that does not make any sense. 19:06, 3 October 2017 (UTC)
I am in favor of continuing to not split etymologies, on the grounds of workability: an editor who is interested in adding this type of information should be able to see at a glance if it has been done already, without checking each relevant non-lemma entry separately. On the other hand, I don't see a problem in directing users from non-lemma forms to the lemma, in cases where they need a separate discussion.
Actual suppletion seems like a different case, though. is, are and be have completely unrelated etymologies, and continuing to maintain separate etymology sections for them seems like a good idea (but I'd again be in favor of pointing users from the lemma form to the other entries for further reading). --Tropylium (talk) 20:49, 3 October 2017 (UTC)
I don't want to split etymologies, except like you said, for terms with suppletive forms and terms with strong forms. For example, the etymology of "did" takes a separate lineage all the way back to Proto-Indo-European that's completely independent from "do"; despite not being a suppletive form, it's a strong form. I don't want to split etymologies for verbs like "walked", only for verbs like "did" and "is/am/are" and "brought". One page should not contain etymologies for different terms if the etymologies are not currently regularly formed. So this would only be an exception that would affect a relative minority of pages. Wouldn't you agree with this? 21:47, 3 October 2017 (UTC)
English has relatively few inflected forms, but it can get pretty complicated when you have forms inflected for gender, number, case, etc. Even in English, am, is and are all go back to inflected forms of the same Proto-Indo-European root. As for strong verbs, I don't think differences in ablaut grade are enough to justify maintaining separate etymologies. We have a recognized system of lemmas and non-lemmas, but I'm not sure how you could decide which form to make the "etymology lemma" for forms sharing an etymology. Chuck Entz (talk) 02:15, 4 October 2017 (UTC)
I have trouble with the vagueness of "strong forms". This is well-defined only for Germanic languages, not a generally applicable concept. Likewise, "having a separate lineage" holds for a lot of things, for starters all irregular forms in general. We have a separate etymology for mice; should we also have separate etymologies for taught or bent?
I think the default assumption should be that, if not otherwise specified, it is not merely the lemma but all applicable inflected forms that descend from a given ancestor. If we give mūs as the ancestor of mouse, then this should already imply that the former's plural mȳs is the ancestor of the latter's plural mice. This gets rid of having to treat any irregularities that represent fossilized original regular alternations, no matter how far back they go. We are working on etymology sections here after all, not on historical morphology or historical phonology.
To be fair, without morphological and phonological supplementary information, etymology often becomes fairly opaque just-take-my-word-for-it business, and I do think Wiktionary could benefit from detailing these somewhere; I just do not think etymology sections are the place for this. --Tropylium (talk) 10:26, 4 October 2017 (UTC)
Having mūs as the ancestor of mouse does not immediately imply that mice derives from mȳs, or make it clear to the viewer. There is no duplication of information going on when etymology is given for mice, only clarification and necessary etymology. Apparently, someone rightly found that etymology should be specified for this non-lemma form, since an etymology section for mice already exists. Anyhow, I think this is being blown out of proportion. I would only ask for the option of specifying non-lemma etymologies where they are notable, as has already long been done with the article of am. Rua would delete all these etymology sections (despite am being an oft-cited non-lemma form for the purposes of reconstruction). When I make an etymology section on brought and did to explain their opaque etymologies, I don't want my edits nonsensically moved and crowded under the etymology pages of bring and do (or more often than not, simply deleted). These sorts of power trips by administrators not following the spirit of the guidelines (that they themselves wrote!) just make me incredibly discouraged from adding information to this website. 18:04, 4 October 2017 (UTC)
@Tropylium How would you handle the suppletion of the potential of olla, the perfect of sum, or in być? Putting etymologies on each of the forms is not going to be feasible. —Rua (mew) 23:36, 3 October 2017 (UTC)
There's only a limited amount of suppletion for any given case; we could assign an "etymological lemma" for each nonsuppletive group (e.g. lienee for the Finnish potential stem). --Tropylium (talk)
Ew. —Rua (mew) 11:04, 4 October 2017 (UTC)
Seconded Rua. Anti-Gamz Dust (There's Hillcrest!) 00:34, 16 October 2017 (UTC)


Hullo. I'd like to make a request for the rollbacking or the patrolling tool. Where is it at? --Barytonesis (talk) 08:09, 5 October 2017 (UTC)

@Barytonesis: An admin has to nominate you at WT:Whitelist I think (or is that only for auto patrol)? —Aryaman (मुझसे बात करो) 17:01, 5 October 2017 (UTC)
I think that rollback/patrol most often is applied to people who, for one reason or another, do not want to be administrators. Just apply to be an admin if you want some subset of the tools. - TheDaveRoss 17:04, 5 October 2017 (UTC)
@TheDaveRoss: I'd like to, but I don't think I've gathered enough trust yet. Would you endorse me? --Barytonesis (talk) 16:42, 14 October 2017 (UTC)

A more personal form of Google Translate just for Faroese

https://www.faroeislandstranslate.com/#!/ Justin (koavf) TCM 08:01, 6 October 2017 (UTC)

Entries with deprecated labels

The label (ordinal) used for ordinal numbers is listed in Category:Entries with deprecated labels with no suggested replacement. Should it even be listed there? DonnanZ (talk) 13:21, 6 October 2017 (UTC)

There is no replacement. There should not be a label there at all; add the category with {{head}} or {{cln}} instead. —Rua (mew) 13:27, 6 October 2017 (UTC)
The label automatically generates the category though, as well as saying what it is, so I don't see any reason to change it, e.g. nittende. Besides that, there is no suggestion to use {{head}} or {{cln}} in the above-mentioned category. DonnanZ (talk) 13:45, 6 October 2017 (UTC)
It's a misuse of labels, that's why it's deprecated. "Ordinal" doesn't specify a context in which a term is used. —Rua (mew) 13:57, 6 October 2017 (UTC)
Whoever set up the label didn't take that into account. It surely would be a simple matter to change the label to "ordinal number", although loads of entries would have to be revised. "cln|nb|ordinal numbers" works for generating the category, but a qualifier would then have to be added, which is twice as much writing, and a step backwards. DonnanZ (talk) 14:11, 6 October 2017 (UTC)
The other label (cardinal) when moused over shows "cardinal number", but this doesn't happen with (ordinal). It is not deprecated. DonnanZ (talk) 14:47, 6 October 2017 (UTC)
"ordinal number" is also not a valid context. Context labels should not be used to give definitions or disambiguate them. They are meant to describe how something is used, not what it means. —Rua (mew) 15:20, 6 October 2017 (UTC)
Have you checked ordinal number? Also see here. Nineteenth is an ordinal number. DonnanZ (talk) 15:35, 6 October 2017 (UTC)
Where are you getting the idea that I'm denying that these are ordinal numbers? I only said that a context label is not how this fact should be indicated. The entry should be categorised with {{cln}} or the cat2= parameter on {{head}}, but there shouldn't be a context label saying that it's an ordinal number. —Rua (mew) 15:47, 6 October 2017 (UTC)
I agree with RuaCat. Ordinal numbers should be categorized as such using |cat2= or {{cln}} but not using {{lb}}. —Aɴɢʀ (talk) 16:21, 6 October 2017 (UTC)
I still disagree, but as you are so keen on everything else but, perhaps you would like to come up with some usage examples. DonnanZ (talk) 22:31, 6 October 2017 (UTC)
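As a rough sketch of what such usage could look like (not an official recommendation): the first form below is the context-label approach that is listed as deprecated, and the two after it are the alternatives mentioned above, namely the cat2= parameter on {{head}} and a separate {{cln}}. The entry nittende comes from this thread; the part of speech "numeral" and the gloss are assumptions chosen for illustration, so check each template's documentation before copying.

```wikitext
<!-- Deprecated: a context label used only to generate the category -->
# {{lb|nb|ordinal}} [[nineteenth]]

<!-- Alternative 1: categorise through the headword line with cat2= -->
{{head|nb|numeral|cat2=ordinal numbers}}

# [[nineteenth]]

<!-- Alternative 2: categorise explicitly at the end of the language section -->
{{cln|nb|ordinal numbers}}
```

Either alternative should put the entry in the same category as "cln|nb|ordinal numbers" quoted earlier in this thread; neither displays any text on the definition line, which is the trade-off being argued over.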

Please, please reveal the cause of the revert in the edit summary

The default text "If you think this rollback is in error, please leave a message on my talk page" conveys no information. In as many words, you could give some specifics about the actual problem.

Instead of this empty formula, it would be more helpful for all of us if you would just write the reason in the edit summary (this way we won't have to bother you on your talk page).

A revert reduces someone's work to nothing. Please, please either correct the error or at least give a hint about the problem to avoid.

(Sorry for my poor English.)

Karmela (talk) 07:09, 8 October 2017 (UTC)

There are relatively few admins who have to go through a flood of edits by new contributors and see whether they belong in the dictionary or not. Given that, we simply do not have the time to give explanations tailored for every rollback that we make (if it wasn't clear, the default text is added automatically). I created the vote that added that default text because previously, it said nothing at all — obviously, this is much better, because you followed the instructions and left a message on Wikitiki89's page, where you can further discuss the edit. —Μετάknowledgediscuss/deeds 07:22, 8 October 2017 (UTC)
Thank you. For a (non-vandal) contributor the cause of the rollback is _never_ clear; s/he made the contribution supposing it was OK.
The list of typical errors need not be very long. Would it be possible to choose from a premade list of explanations when reverting?
Karmela (talk) 16:37, 8 October 2017 (UTC)
We have such a list for deletions of entire entries. It would be a good start for what you recommend. I do not know whether it is readily done technically. DCDuring (talk) 18:59, 8 October 2017 (UTC)
  • @DCDuring, Metaknowledge In en.wikipedia.org you can add two dropdown boxes below the edit summary box with some useful default summaries:
  1. Common edit summaries -- click to use
  2. Common minor edit summaries -- click to use
One can enable this gadget at https://en.wikipedia.org/wiki/Special:Preferences#mw-prefsection-gadgets
An analogous dropdown box, Common revert summaries -- click to use, should be technically similar.
Karmela (talk) 07:47, 14 October 2017 (UTC)
So, apparently technically possible. How do we get it? DCDuring (talk) 14:01, 14 October 2017 (UTC)
This is how: mw.loader.load('//en.wikipedia.org/w/index.php?title=MediaWiki:Gadget-defaultsummaries.js&action=raw&ctype=text/javascript'); Dixtosa (talk) 14:20, 14 October 2017 (UTC)
All this presupposes that the community wants it. Is this the correct place and form to ask the Wiktionary community?
Karmela (talk) 08:52, 15 October 2017 (UTC)

Requests for deletion - restoring the list of nominations

In June 2017, WT:RFD was changed to no longer list items nominated at the right top of the page. I propose to restore the previous state. The current state is that categories are listed but not the items nominated themselves. That is not very useful, IMHO.

Therefore, I propose:

  • List nominated items again, as a list of items for all languages.
  • To support that, list all nominated items in Category:Requests for deletion instead of listing them only in per-language categories. This, again, is a restoration proposal.

--Dan Polansky (talk) 09:34, 8 October 2017 (UTC)

@Dan Polansky: If you want to see, say, the 5 French requests, click on the "▶" symbol next to the "Requests for deletion in French entries‎ (0 c, 5 e)". In my opinion, this is more useful than before, because now you can choose the language you want to see, as opposed to seeing a mess of entries in all languages. If we want to see a mess of entries in all languages, we may look at the normal TOC (the "Contents" list). I believe we also have the option of making all languages un-collapsed by default, though personally I'd prefer them collapsed as they currently are. --Daniel Carrero (talk) 09:58, 8 October 2017 (UTC)
I want to see the complete list, not by language. I only want to check whether all the items listed there were put to RFD page itself; if I did not want to do that, I would not want to see that right-floating portion of the page at all. --Dan Polansky (talk) 13:32, 8 October 2017 (UTC)
  • I support Dan's proposals. The language-specific RFD categories seem to be useless. —Μετάknowledgediscuss/deeds 16:05, 8 October 2017 (UTC)
    How do we know that no one uses the by-language listings? (BTW, I don't use them)
    BTW, I have noticed that we have a fair number of headings on request pages that do not have tags. Do we need yet another run against the XML to identify:
    1. Tagged L2s that are not on current request pages.
      1. Tagged L2s that are for archived or otherwise closed requests.
    2. Untagged L2s that are on the request pages.
    We'd also need to treat items that have been stricken or closed, but not yet archived.
    At the moment I don't see how this can systematically be accomplished with search. Though I doubt we would need such a run every two weeks, it might be useful every quarter or, at least, every year. DCDuring (talk) 18:47, 8 October 2017 (UTC)
@DCDuring For part 1, User:DTLHS/cleanup/request consistency. I don't think 2 is that important since entries on request pages get archived eventually. It's possible that there are false positives if pages are linked unusually on the request pages. DTLHS (talk) 19:27, 8 October 2017 (UTC)
@DTLHS: For 2 I was thinking about those requests that are entered without use of any request template. Today I noticed it when [[academic institution]] was added to RFDE. (The contributor has now added {{rfd}} at my request.) Perhaps what is needed is to discourage addition of new headers on request pages except through the relevant templates. DCDuring (talk) 20:35, 8 October 2017 (UTC)
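
For illustration only, here is a minimal Python sketch of the comparison described in this thread. It assumes the titles of tagged entries and the headings on a request page have already been extracted from a dump. The function name, parameter names and result keys are hypothetical and do not describe how User:DTLHS/cleanup/request consistency is actually produced.

```python
def request_consistency(tagged, listed, closed=frozenset()):
    """Compare entries carrying a request tag with headings on a request page.

    tagged: titles of entries that carry a request template such as {{rfd}}
    listed: titles that appear as headings on the request page
    closed: listed titles whose discussion is struck or closed but not archived
    """
    tagged, listed, closed = set(tagged), set(listed), set(closed)
    open_listed = listed - closed
    return {
        # Item 1: tagged, but no heading on the current request page.
        "tagged_not_listed": sorted(tagged - listed),
        # Item 2: open heading on the request page, but the entry has no tag.
        "listed_not_tagged": sorted(open_listed - tagged),
        # Closed or struck discussions whose entries still carry the tag.
        "closed_still_tagged": sorted(closed & tagged),
    }
```

Headings that link to an entry in an unusual way would still need manual review, as noted above.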

Classification of forms with -n't

Hello. Rua, Equinox, Erutuon and I have been talking about the classification of don't, can't and other forms with -n't in User talk:TAKASUGI Shinji/2017#Contractions. I think they are verb forms just like did and could, according to Arnold M. Zwicky and Geoffrey K. Pullum (Cliticization vs. Inflection: English n’t, Language 59(3), 1983, pp. 502-513), but not everyone agrees with their analysis. In my opinion, we shouldn't use “Contraction” as a header because it is not a part of speech, and we should replace it with a part of speech we can reasonably assign. What do you guys think? — TAKASUGI Shinji (talk) 10:09, 8 October 2017 (UTC)

Our level 3 headers are for more than just part of speech. Suffix isn't a part of speech either. We have to use "contraction" because for most cases there is no other way to do it. Look at Category:Middle Dutch contractions for example. So that argument is not very compelling.
As for these contractions specifically, I don't see how they can be considered anything else. They aren't considered verb forms in any standard grammar of English. One paper is interesting, but we should follow linguistic consensus on the matter and not the opinion of a single paper. —Rua (mew) 11:44, 8 October 2017 (UTC)
An analysis of well-known linguists and lack of analysis don't have the same value. I find their analysis convincing. You can only say don't you? and not *do not you?, from which we must conclude that don't is not a contraction of do not. — TAKASUGI Shinji (talk) 11:41, 9 October 2017 (UTC)
What do you mean by "standard grammar"? The Cambridge Grammar of the English Language (naturally, because it was co-written by Pullum) uses the inflectional-suffix analysis of -n't, and its auxiliary verb paradigms show negative forms corresponding to each of the finite forms. Certainly the more traditional version of English grammar that I learned as a kid didn't recognize negative inflected forms, but it wasn't particularly linguistically rigorous and shouldn't be the basis for our decisions on Wiktionary. — Eru·tuon 21:52, 9 October 2017 (UTC)
I'm in favor of the analysis in which -n't is an inflectional suffix and forms like don't are verb forms (and I could go on about that), but the essential thing is to at least be consistent. I don't think it's consistent to label -n't as a suffix (as it's been labeled since 2008) and then call forms like won't contractions. A contraction is basically the combination of a full word plus one or more clitics that are derived from orthographic words, but are not spelled as words in this case. So for won't to be a contraction, -n't has to be a clitic (a variant form of not). The other option is for -n't to be a suffix and won't a verb form. We need to pick an analysis and stick to it. It would be fine to include usage notes explaining the alternative analysis, or alternative inflection tables, or categories, but the headers and headword templates should stick to a single analysis. — Eru·tuon 19:20, 8 October 2017 (UTC)

A week has passed, and there has been one negative vote. I assume the classification of -n't according to their paper is acceptable. — TAKASUGI Shinji (talk) 12:54, 18 October 2017 (UTC)

Any idea for a new "Thesaurus:" shortcut?

WS:good → Thesaurus:good still works, as it should.

But "WS:" does not make a lot of sense anymore, because now "Wikisaurus" is called "Thesaurus".

Then again, "TS:good" and "TH:good" are unavailable, because they are language codes. Is there a good shortcut available? If not, I guess we'll have to keep using only "WS". --Daniel Carrero (talk) 18:37, 9 October 2017 (UTC)

THES seems the obvious choice. Equinox 18:44, 9 October 2017 (UTC)
Alright, I guess. I'm not entirely happy with a mere reduction from 9 to 4 letters, but maybe that's the best option we have.
Maybe THE would be better ("THE:good" → Thesaurus:good), but "the" is the ISO code for Chitwania Tharu (w:Tharu languages). Can't we use it anyway? --Daniel Carrero (talk) 14:31, 11 October 2017 (UTC)
But if we can't use ISO codes then we can't use any three-letter code, even if ISO hasn't used it yet. It should be considered reserved for future ISO use. Equinox 15:07, 11 October 2017 (UTC)
That may be true, but we have violated that rule before. We have "cat" and "mod" as working aliases. See CAT:English nouns and MOD:sandbox. "cat" means Catalan, which seems unlikely to be used by Wikimedia because they have settled for https://ca.wiktionary.org/ and https://ca.wikipedia.org/ (using "ca", not "cat"). "mod" is Mobilian Jargon language. --Daniel Carrero (talk) 15:22, 11 October 2017 (UTC)
To be clear, I would support using "the". ("THE:good" → Thesaurus:good) --Daniel Carrero (talk) 15:24, 11 October 2017 (UTC)
I would prefer "THS" which is the language code for the w:Thakali language, a Nepali Sino-Tibetan language with 5,900 native speakers. Chuck Entz (talk) 02:04, 12 October 2017 (UTC)
SYN. —suzukaze (tc) 02:28, 12 October 2017 (UTC)
NYM: is what I like, to stand for -nyms. --Dan Polansky (talk) 07:49, 21 October 2017 (UTC)

Linking active policy proposals

WT:EL should probably link to WT:FORMS in some fashion. I imagine there are also other cases like these, where EL is a dead-end and the actual documentation is hidden away in some obscure undocumented location.

Some might protest that the former is policy while the latter are often drafts, but as long as this is indicated, I do not see any problem in linking. Should we maybe settle on some specific more mildly worded section hatnote, such as "Read more:" (instead of "Main article:" or the like)?

Interestingly, WT:Policies and guidelines, despite being prominently linked from the policy headers ({{policy}}, {{policy-TT}}, {{policy-DP}}), is currently categorized as "inactive". There's Category:Wiktionary think tank policies, but it's not especially user-friendly. --Tropylium (talk) 14:54, 11 October 2017 (UTC)

New section "Synchronic analysis" in WT:EL

w:en:Synchrony and diachrony

It isn't useful to have only historical analysis (the current "Etymology" section at en.wiktionary) or only modern analysis.

Example: атония d1g (talk) 08:51, 12 October 2017 (UTC)

We include this in etymology, but the usual wording is "equivalent to". —Rua (mew) 13:36, 12 October 2017 (UTC)

Linking to Wikimedia Commons categories

Hello, I would like to know why the wiktionary entries are not linked to the Wikimedia Commons categories (by using statements at Wikidata). For example the entry Varvel can be connected to commons:Category:Vervels. It can only help the readers to (visually) learn more about that particular word. Fructibus (talk) 09:45, 12 October 2017 (UTC)

@Fructibus: Have you seen Wiktionary:Wikidata? We do in fact have some links to that sister project via local templates. E.g. tea. —Justin (koavf)TCM 09:50, 12 October 2017 (UTC)
@Koavf: Thanks a lot! By the way, was there any discussion about including the wiktionary pages into Wikidata, connecting with the Wikipedia/Commons pages? Then the Commons link would show automatically for the Wiktionary pages, in all languages. At this moment, if you want to link to Commons in all language articles, that means you have to edit 67 Wiktionary pages. Fructibus (talk) 19:05, 12 October 2017 (UTC)
@fructibus: "Was there any discussion about including the wiktionary pages into Wikidata" Oh yes, quite a bit. And there are currently options to include Wiktionary entries in Wikidata but I don't feel like I can do a good job of summarizing all of that. You may wish to see the equivalent page here: d:Wikidata:Wiktionary. I 100% agree that we should use Wikidata to make sister links--you may wish to talk with User:Rua, formerly User:CodeCat (I had forgotten he [she?] was renamed for some reason) about that. —Justin (koavf)TCM 19:16, 12 October 2017 (UTC)
People have expressed dislike for Wikidata IDs, so we probably won't be using Wikidata for anything after all. I tried. —Rua (mew) 19:45, 12 October 2017 (UTC)
It will happen, it's just that at the moment the advantages aren't completely obvious. – Jberkel (talk) 20:56, 12 October 2017 (UTC)
@Jberkel: Isn't this one of them? —Justin (koavf)TCM 22:26, 12 October 2017 (UTC)
The page tea has already {{wikidata|Q6097}}. Changing to a template like {{sister links|Q6097}} could fetch all sister project links with automatic update of new links, deleted ones or renamed ones. The problem is that a word may have multiple senses that can be connected to multiple equivalent pages on Wikidata. --Vriullop (talk) 08:22, 13 October 2017 (UTC)
@Koavf: I'm all for Wikidata, it's just that to some editors the advantages are less clear at the moment. @Vriullop: yes that would be great, via Wikidata one should be able to fetch all the other relationships. Couldn't {{senseid}} (or something similar) be used for fine-grained associations? – Jberkel (talk) 14:11, 13 October 2017 (UTC)

@Jberkel - @Koavf - @Vriullop - @Rua - Sorry, I am new to Wiktionary but I really don't see the reason for not linking the Wiktionary definitions in Wikidata. For example the Wikipedia article Water has a link to the Wiktionary definition, at the bottom of the article. Why not show it in the middle-left side of the page, near the other sister project links (Commons, Wikibooks, Wikiquote)? This way all the 220 Wikipedia articles can show the link to the Wiktionary definition in their respective language (if it exists), without the need to actually edit the 220 Wikipedia articles. Fructibus (talk) 18:49, 13 October 2017 (UTC)

@Fructibus: I agree as well but there were concerns that it's too difficult, impossible, or possible-but-difficult and not actually helpful. I disagree with the latter two but it's definitely an undertaking to be sure. Then again, so is everything. —Justin (koavf)TCM 19:01, 13 October 2017 (UTC)
@Koavf: Very nice answer, gives a feeling of touching a perfection in language, thanks :) - Fructibus (talk) 23:39, 13 October 2017 (UTC)

Ōbaku tō-on/sō-on readings

Found this video: Heart Sutra chanted by Ōbaku monks; is the ruby a Chinese pronunciation or as Wikipedia states: tō-on/sō-on readings? Here's a supporting resource. Domo, --POKéTalker (talk) 04:49, 13 October 2017 (UTC)

Personally it sounds suspiciously(?) too much like accented Mandarin ( () (ji)?  () (e)?), possibly dated ( (けん) (ken)), but I also don't know what I'm talking about. Maybe tō-on is Mandarin. —suzukaze (tc) 05:11, 13 October 2017 (UTC)

For reference, comparison of Japanese, Ōbaku reading (sō-on?) and standard Chinese:

観自在菩薩行深般若波羅蜜多時 (かん じ ざい ぼ さつ ぎょう じん はん にゃ は ら みっ た じ)
Kanjizai Bosatsu gyō jin hannya haramitta ji
Avalokitesvara Bodhisattva was practicing deep prajnaparamita when...
観自在菩薩行深般若波羅蜜多時 (クヮン ツ サイ プ サ ヘン シン ポ ゼ ポ ロ ミ ト ス)
K(w)antsusai Pusa hen shin poze poromito su
Avalokitesvara Bodhisattva was practicing deep prajnaparamita when...
觀自在菩薩行深般若波羅蜜多時 [MSC, trad.]
观自在菩萨行深般若波罗蜜多时 [MSC, simp.]
Guānzìzài Púsà xíng shēn bānruò bōluómìduō shí [Pinyin]
Avalokitesvara Bodhisattva was practicing deep prajnaparamita when...

Though there is probably no clear romanization to the monks chanting, should kanji with these Ōbaku on-readings be provided as sō-on? Just wondering. --POKéTalker (talk) 02:21, 14 October 2017 (UTC)

(The automatic pinyin generated by zh-usex is not correct because it uses the most common readings. [13] has pinyin transcription that seems to be OK —suzukaze (tc) 02:36, 14 October 2017 (UTC))
  • @POKéTalker, to confirm / clarify -- it sounds like you're asking if there is value in adding sōon readings to the individual kanji entries. If that's your proposal, I have no particular opposition, so long as the readings are clearly labeled as sōon (provided that's the correct reading category). ‑‑ Eiríkr Útlendi │Tala við mig 04:36, 14 October 2017 (UTC)
I think POKéTalker wants to make sure that they are indeed tou'on first. —suzukaze (tc) 20:48, 14 October 2017 (UTC)

TabbedLanguages default and English links in definitions

Yesterday, following Wiktionary:Beer parlour/2017/July#TabbedLanguages edit: default to English for unmarked links, I made a change to MediaWiki:Gadget-TabbedLanguages.js so that the default language would always be English, if no language is specified. This means that it's no longer necessary to use {{l|en|...}} in definitions. I'd like to ask the people who do this to use regular links from now on. —Rua (mew) 15:40, 14 October 2017 (UTC)

But not everyone uses tabbed languages. DTLHS (talk) 15:45, 14 October 2017 (UTC)
I agree that this would be a sensible default for those without it, too. But then we'd need a separate gadget. —Rua (mew) 15:46, 14 October 2017 (UTC)
If we made a separate gadget it could be smarter, such as linking derived terms to the correct language, while linking terms in definitions to English. DTLHS (talk) 15:53, 14 October 2017 (UTC)
Also, what happened to the plan to make TL the default? We had a vote and everything. —Rua (mew) 15:51, 14 October 2017 (UTC)
Any way to undo this behavior for searches and search results? Having them always go to English is pretty annoying when you’re working on some other language. — Vorziblix (talk · contribs) 01:48, 18 October 2017 (UTC)
Where else should they go? —Rua (mew) 10:49, 18 October 2017 (UTC)
Ideally to the last language visited, as they did before. — Vorziblix (talk · contribs) 23:15, 18 October 2017 (UTC)

Singapore terms

Just a heads up: a while ago, a Singapore schoolteacher encouraged his students to add Singaporean English terms to Wiktionary (which is, on the whole, a good thing). We seem to have a new batch of these happening at the moment, e.g. bus captain, taxi uncle. So be ready for some cleanup. Equinox 08:23, 16 October 2017 (UTC)


They've made some drastic changes to pronunciation which might not be correct. Anyone who knows Old English, do you mind taking a look? --Robbie SWE (talk) 18:12, 16 October 2017 (UTC)

@Robbie SWE: It looks like, as far as Old English pronunciation goes, they're changing sequences of /h/ and a sonorant to sonorant and voicelessness diacritic (for instance, /hr/ to /r̥/). That might be correct in a pseudo-phonetic transcription, but I don't know if it is an accepted phonological analysis. — Eru·tuon 21:46, 16 October 2017 (UTC)

Translating both ways


When I started working on a project in which I would like to use translations from Wiktionary, I noticed that Wiktionary translations are created separately for each language. That means that even if the English Wiktionary contains the translation of a word into another language, e.g. Mandarin, the Wiktionary in that language may not contain a translation of that word into English.

One example:


- the list of translations contains the translation 图书馆


- the Chinese wiktionary page does not have a translation for that word into English (the site contains: 英语(English):[[]])

Since these translations are symmetric, it should be possible to add a large number of translations to these Wiktionaries with much less effort. However, there will surely be a few issues that have to be resolved first.

TheDaveRoss already replied to me by email, noting some issues:

"1. There are numerous Wiktionaries, each one maintained by a distinct community of volunteers. Each has its own policies regarding what may or may not be included, how translations are to be added, etc. It is very important that you coordinate with the local community wherever you add content to ensure that the content meets their criteria.

2. Translations are very nuanced (as you are probably aware). Automated addition of translation has happened at small scales in the past, however close oversight by a person familiar with both languages is required. Even translations which appear to be symmetrical may require special annotation in the target language which is not included in the original language.

3. The source material may not be correct, and automation can propagate errors. The English Wiktionary, and a few other large Wiktionaries, have enough contributors that many errors are caught quickly. That is not the case for the majority of other languages, so it is important to ensure any additions to other languages are correct"

4. Attribution to the original contributor will be important

For example, adding the new words as proposed translations first and then checking them for correctness would reduce the risk of wrong translations while still adding some value right away.

What do you think about writing a script to do this, what other problems are there with this? Do you know about previous attempts to do this? I hope this could be very useful!

Noahho (talk) 01:21, 17 October 2017 (UTC)

@Noahho: Something similar to two-way translation could work if we can agree on how we will use Wikidata and how it will be connected across Wiktionaries. Unfortunately, how that would work is very difficult to determine. —Justin (koavf)TCM 02:13, 17 October 2017 (UTC)
@Noahho Hello Noah. Can I please know what project you are working on? If the aim of the project is to extract translations of foreign-language terms into English from Wiktionary, it would be much easier to extract from the pages on English Wiktionary; e.g. for simplified 图书馆 it would be at 圖書館, which says "library". Wyang (talk) 02:17, 17 October 2017 (UTC)
This is an age-old problem. I believe that, somewhere, there is a unified Wiktionary that is not dependent on a "home" language. But I forget what it is called, or what state it is in. SemperBlotto (talk) 05:37, 18 October 2017 (UTC)
@SemperBlotto: omegawiki:? —Justin (koavf)TCM 09:03, 18 October 2017 (UTC)
I will also mention one previous attempt to do this locally was User:Tbot. The person who created that script has passed away, however there is some amount of documentation of his efforts in that user space. - TheDaveRoss 13:57, 18 October 2017 (UTC)
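
As a rough illustration of the proposal, and not a description of how User:Tbot worked, the following Python sketch inverts English-to-target translations into a list of candidate target-to-English translations. The data layout, the function name and the sample entries are assumptions made up for this example. The output is only a list of candidates for human review, in line with points 1 to 3 above.

```python
from collections import defaultdict


def reverse_candidates(translations, target_lang):
    """Invert English -> target translations into target -> English candidates.

    translations: {english_word: {lang_code: [translated terms]}}
    Returns {translated term: sorted list of English words}. Every pair still
    needs checking by someone who knows both languages before it is added.
    """
    candidates = defaultdict(set)
    for english, by_lang in translations.items():
        for term in by_lang.get(target_lang, []):
            candidates[term].add(english)
    return {term: sorted(words) for term, words in candidates.items()}


# Made-up sample data in the assumed layout.
sample = {
    "library": {"cmn": ["图书馆"], "fr": ["bibliothèque"]},
    "bookshop": {"fr": ["librairie"]},
}
print(reverse_candidates(sample, "cmn"))
```

A real script would also have to record which revision each translation came from, so that attribution (point 4) is preserved.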

Turkish vs Ottoman Turkish

The Balkan language loanwords from Turkish should technically be Ottoman Turkish, since that's the era they entered those languages, right? Is the only main difference the script being Arabic vs. Latin? I realized I need to go back and change a bunch of Romanian and some other entries. Word dewd544 (talk) 16:12, 17 October 2017 (UTC)

Yes, they should generally be Ottoman Turkish. The script is one significant difference, but if I’m not mistaken there’s also a huge difference in lexicon, where a large portion of the Ottoman Turkish lexis consists of loanwords from Persian and Arabic that were later stamped out of usage and replaced with neologisms by Atatürk. — Vorziblix (talk · contribs) 21:40, 17 October 2017 (UTC)
There are also grammatical differences. That said, I would personally prefer to treat them as a single language, and I don't think we lose much by claiming Balkan loanwords are from Turkish rather than Ottoman Turkish when the word in question is itself the same. —Μετάknowledgediscuss/deeds 21:44, 17 October 2017 (UTC)
I’m indifferent to merging them, not being knowledgeable enough on the subject, but the split does seem to be mostly a relic of sticking to ISO codes; input from editors experienced with Turkish could be helpful. — Vorziblix (talk · contribs) 06:40, 19 October 2017 (UTC)
Isn’t it more true to the soothfast happenings in the Ottoman Era to describe the Ottoman Turkish as an acrolect of Turkish which the elite prioritized while basically we have had Turkish all the time? It would be awkward to say that we had Turkish once and then, by some peculiar developments in constitutional history, Ottoman Turkish, and then because Atatürk said so Turkish has smitten Ottoman Turkish. Rather there has been one basic Turkish from which the Balkan languages also borrowed rather than from the language we see as Ottoman literary inheritance today, though of course there can be learned borrowings from the literary language as well, though in the case this is largely unlikely because of mostly late literary culture in the Balkan countries and literary culture in the Slavia as well as in Greece (I don’t know about Romania and Albania) also prohibits itself to borrow, as compared to other literary cultures. So in the context of Balkan languages, the Turkish they have been in contact with was coexistent Turkish rather than typical Ottoman Turkish. We are just inveigled to assume that one literary language has borrowed from the other literary language even when the spoken language has borrowed, because for older times we know about the spoken language from its appearance in writing. It is an image of things we have long surpassed in Romance studies, like acknowledging that Spanish has borrowed from colloquial Arabic rather than from literary Arabic. Palaestrator verborum (loquier) 20:51, 18 October 2017 (UTC)
Unfortunately it’s not really clear what Wiktionary means by ‘Ottoman Turkish’ — just the literary acrolect or the language in general during a given time period. Previous discussions don’t seem to have reached a conclusion. Some of the comments made there could be relevant to the issue of how to treat Ottoman Turkish, though. — Vorziblix (talk · contribs) 06:40, 19 October 2017 (UTC)

Listing Translations by Language

It seems to me that quality control of translations is harder than it should be: those of us who patrol new edits can't be knowledgeable in anywhere near all of the languages, and those with expertise in the languages in question are less likely to be spending their time browsing through English entries. {{t-check}} is helpful, but not used in most languages.

Does everybody think it would be a good idea to create a listing of translations in each language, along with the entry they're in? I would envision it as a listing in the language's alphabetical order, with the {{t}} template converted to an {{l}} template and followed by the name of the entry:

This would make it easy for an expert in a given language to scan through all the translations in that language without browsing a bunch of English entries. It also might make the redlink categories and the overhead that goes into creating them unnecessary.

I'm bringing it up here because it would be a major undertaking involving massive processing of the dumps, so I want to make sure it's a good idea before asking anyone to do it. Perhaps it could be started with some of the smaller LDL languages as a test. Chuck Entz (talk) 14:19, 18 October 2017 (UTC)

Matthias Buchmeier maintains lists similar to what you describe. — Ungoliant (falai) 14:59, 18 October 2017 (UTC)
I can share the assumption that it would make it more possible to make Wiktionary serve as bilingual dictionary in relation to single languages, as for now one cannot directly work in Wiktionary to make it intentionally a bilingual dictionary for any language because one does not see what is already there, i.e. one can only add to quantity, but only by serendipity to quality. But the requirement would be that such lists are live dumps which get refreshed as soon as edited, instantly by Javascript or at least after refetching the page. Because it is a core part of motivation for editing to see the results published instantly, that’s why the web is there. Palaestrator verborum (talk) 16:41, 18 October 2017 (UTC)
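
To make the idea concrete, here is a minimal Python sketch of such a listing, assuming the wikitext of each English entry is already available from a dump. It also assumes that translations use the parameter layout {{t|language code|term|...}}. The regular expression and the function name are hypothetical, and this is not how Matthias Buchmeier's lists are generated.

```python
import re
from collections import defaultdict

# Matches {{t|xx|term}}, {{t+|xx|term}} and {{t-check|xx|term}}; any further
# parameters (gender, transliteration, ...) are skipped.
T_TEMPLATE = re.compile(r"\{\{t(?:\+|-check)?\|([^|{}]+)\|([^|{}]+)[^{}]*\}\}")


def translations_by_language(pages):
    """pages: {entry title: wikitext}. Returns {lang: sorted [(term, entry)]}."""
    index = defaultdict(list)
    for title, wikitext in pages.items():
        for lang, term in T_TEMPLATE.findall(wikitext):
            index[lang].append((term.strip(), title))
    return {lang: sorted(rows) for lang, rows in index.items()}
```

The sort here is by code point only. A listing in each language's own alphabetical order would need language-specific collation.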

Word Frequencies in Wiktionary

We have just finished removing the {{rank}} information based on some old, problematic parsing of the Project Gutenberg corpus. We still have a few appendices which record that information. My main objection to the inclusion of that data was that it was flawed, and outdated. But I don't roundly object to having word frequency information which is accurate. To that end I have a few questions.

  1. Should we include any frequency data in any manner?
  2. If so, should that data be represented within word entries in some way?
  3. Which corpora should be used, or which frequency lists?
  4. Should original research (of the type used for the old data) be allowed?

One starting place for English (and a few other languages) can be found at the BYU corpus page. It is probably best to avoid getting too deeply into the weeds here, but rather if it seems like there is a general consensus around what should be included we can spin off a project page and figure out all of the details. - TheDaveRoss 17:53, 18 October 2017 (UTC)

We should certainly not add frequency data to word entries, because the data is doubtless interesting in a list but too unsure, because nobody wants to know which words have the nearest frequency if he looks up a word – which is a totally random result of sundry capacities of a language and without fruit for erudition – and because it would suck off endeavors to more instructive content creation, being not worthwhile enough to maintain in the main namespace. And as there is more instructive content to be created by the same endeavors, I also opine that for the collection of frequency data it should be waited until copyright law has been abolished by revolutions in the world and thus representative and illuminatingly separable corpora can be collected. At that time we would only struggle with the technical recognition of what a word be, not of what sources we digest, which together are multiplying error factors. Palaestrator verborum (loquier) 18:25, 18 October 2017 (UTC)
Wut? Sorry, what I meant there was: one, I have no idea what the abolition of copyright law has to do with the inclusion of word frequency on Wiktionary and, two, I disagree with the notion that this would prevent some other work from being done. - TheDaveRoss 18:33, 18 October 2017 (UTC)
The notion is that there are opportunity costs in collecting telling corpora. I don’t think that one could be content with subtitle databases, as these are slanted to Hollywood and mass productions and their fantasy worlds instead of the whole language that we mean when we talk about the language; and actually those current subtitle collections and most other corpora are non-free either. The web-based corpora which are represented on the BYU corpus site have of course their own problems, with the deep web and the dark web and resources being varyingly crawlable. If we want recent and representative data, we can only go illegal by grabbing Library Genesis, accessing journal and newspaper databases via black channels like Sci-Hub does and things like that, perhaps mixed with subtitles, i.e. things that we cannot perform by the means of the Wikimedia Foundation without endangering it. If there would be no copyright law, there would be a large database of works of all kinds which would constitute good (i.e. the technologically and humanly, not legally best possible) and fast data. This is of course a high standard from which I esteem corpus data valuable, and possibly the view of a philosopher against the practical mind of a programmer. But one can set the doubts even higher by asking oneself how to offset different data sources, like how commensurable web data and journal data and parliamentary debates are, even if one has access to all humanly possible corpora, and if one needs to be the man to have correct information about word frequencies. 
Others could be pleased to see lesser corpus data, but I think that the assumption cannot be rejected out of hand that this is not worth it if there are so many doubts about whether this data represent the actual distribution in language (by my common sense, I often wonder about words not being found at all in large frequency dictionaries) – aside from such data not being valuable maintained in individual word articles, which question is subject to entirely different evaluation criteria, because the intention of a reader opening a word’s entry is different from the intention of a reader opening a word list. Palaestrator verborum (loquier) 19:25, 18 October 2017 (UTC)
It's about time we asked the question, how can words be real if our eyes aren't real? DTLHS (talk) 19:34, 18 October 2017 (UTC)
@DTLHS: I'm ded. But it should be "How Can Words Be Real If Our Eyes Aren't Real". —Aryaman (मुझसे बात करो) 20:40, 18 October 2017 (UTC)
I have not asked this question; words are valuable abstractions from language – we can explain and describe them –, but you cannot just cast “the language” into a measuring beaker to know an objective distribution of its constituent parts (which is also a mereological problem), as the language you know about is always constructed to some degree as necessitated by material constraints. What we want, in laying out a frequency list which could pride itself on using the methods fit for the object, is to be as exact as possible about it, but that is far from legal. Palaestrator verborum (loquier) 19:45, 18 October 2017 (UTC)
Palaestrator, your writing style is unnatural and unnecessarily loquacious. I don't know why you are doing it, but I want you to know that your arguments will be taken more seriously if you try to express them clearly and succinctly, rather than in a way that just makes us all think that you're trying to show off. (And as a side note, your understanding of how corpora work in relation to copyright law seems to be flawed, so you might want to try reading up on that first.) —Μετάknowledgediscuss/deeds 19:53, 18 October 2017 (UTC)
It is easy with the law in this case: If the corpus collection is legal (for example, the Wikipedia corpus), accessing it is legal; if the data collection is illegal, or the manner of accessing it is (for example, a university’s access being used beyond its license, as Sci-Hub does), that is already legally contentious (it is disputed in many jurisdictions, such as the United States of America and the Federal Republic of Germany, whether just streaming content published in breach of copyright law violates it, also depending on dolus), and one should not put one’s hand in the fire for the collected fruits of such automated accessing, especially if it is commercially exploited as allowed by the licenses used on Wiktionary.
I can’t show off with my writing style, being unnatural in normal people’s view is my expressions’ very nature, or sounding like a 19th century novel (wherefore being natural though if the matter dealt with is a complicated matter of human culture, as language is? I don’t know why people recall nature when we can surpass it.). And it is not loquacious, I already write off parts of it. Besides, the point of reading it could lie in it saving minds from futile pursuits. Like others, I don’t talk if I prognosticate that my verbosity does not pay proportionately. What is the prospect though of how much work hours the creation and maintenance of those lists take? Palaestrator verborum (loquier) 20:32, 18 October 2017 (UTC)
  • It would be nice to have word frequency information available, but there is the serious PoS/Etymology problem (eg, dyke or dike). I am skeptical of both the heavily annotated corpora (which differentiate [or try to] by PoS, but are generally small) and the large corpora (which do not usually make accurate PoS determinations). That said, Google N-Grams and the BYU corpora would be fairly useful, though I have not investigated the terms of use for their frequency data. Frequency data doesn't seem particularly useful in an entry. It would be very useful to have some kind of quick indication as to what frequency class a given word used in a definiens was in (eg, top 10K, next 40K, next 200K, perhaps next 750K). As an appendix such lists might make it easier for a contributor to check the understandability of a definition. DCDuring (talk) 21:02, 18 October 2017 (UTC)
    Google N-Grams makes possible Reference links like this frequency comparison of canvas and canvass as verb and noun. To me that seems useful to contributors and to passive users. DCDuring (talk) 21:08, 18 October 2017 (UTC)
    I completely agree that frequency at the POS or even sense level is much more useful, but how that might happen eludes me. The concept of not necessarily providing a specific rank, but instead indicating a frequency class of some kind could be interesting, but the underlying data would suffer in the same way. - TheDaveRoss 11:56, 19 October 2017 (UTC)
    The annotated corpora, like Google N-grams, BYU COCA, as well as smaller ones, support PoS at least. Usually, one etymology accounts for the overwhelming majority of the usage in a given PoS, which we could note in such cases with little OR. DCDuring (talk) 13:11, 19 October 2017 (UTC)
  • Thanks for getting rid of it! It's been one of my bugbears for years! --P5Nd2 (talk) 10:06, 20 October 2017 (UTC)
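
The frequency-class idea raised above (top 10K, next 40K, next 200K, perhaps next 750K) could be sketched roughly as follows. This is only an illustration: it assumes a 1-based frequency rank is already available from some corpus, and the function and constant names are hypothetical, not anything that exists on Wiktionary.

```python
from bisect import bisect_left

# Band sizes as named in the discussion: top 10K, next 40K, next 200K, next 750K.
BAND_SIZES = [10_000, 40_000, 200_000, 750_000]
BAND_LABELS = ["top 10K", "next 40K", "next 200K", "next 750K"]


def cumulative_bounds(sizes):
    """Turn band sizes into cumulative upper rank bounds."""
    total = 0
    bounds = []
    for size in sizes:
        total += size
        bounds.append(total)
    return bounds


# Upper rank bound of each band: [10000, 50000, 250000, 1000000]
BOUNDS = cumulative_bounds(BAND_SIZES)


def frequency_class(rank):
    """Return the band label for a 1-based frequency rank (hypothetical helper)."""
    if rank < 1:
        raise ValueError("rank must be 1 or greater")
    index = bisect_left(BOUNDS, rank)
    if index < len(BAND_LABELS):
        return BAND_LABELS[index]
    return "beyond listed bands"
```

Note that the sketch inherits the problem discussed above: a rank per spelling says nothing about which PoS or etymology the occurrences belong to.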

Catholicism vs. Roman Catholicism vs. Eastern Catholicism

There seems to me to be an inconsistency in how these terms are used in Wiktionary. I just want to clarify what these terms mean so that we can streamline their usage in Wiktionary. "Catholicism" or "Catholic" (as capitalized) in common parlance would be the faith or word connoting the Catholic Church (those that are in communion with the Pope in Rome). "Roman Catholicism", on the other hand, means something specific, since it would refer to Catholicism that is using the Roman Rite within the Catholic Church (as opposed to those using other rites such as Byzantine Catholicism, Coptic Catholicism, Syriac Catholicism, etc.). However, historically, the term "Roman Catholicism" was used as a pejorative slur in English-speaking circles for the Catholic Church. Actually, if referring to all western rites (together with the Roman rite, the Ambrosian rite, etc.), it would collectively be called "Latin Catholicism", or "Western Catholicism". "Eastern Catholicism", on the other hand, also means something specific, since it would refer to Catholicism (in communion with the Pope in Rome) that uses any Eastern rite, which would be by Byzantine Catholics, Syriac Catholics, Chaldean Catholics, etc.

The problem now arises. Some call anyone in communion with the Pope as a "Roman Catholic", but almost all Eastern Catholics don't like it, because they say they don't use the Roman rite. Therefore, they don't associate with the term "Roman Catholic", but with whatever their sui iuris church or rite is, like a "Ukrainian Catholic", or a "Greek Catholic". Therefore, some terminology associated with the entire Catholic Church, let's say the Council of Trent, would be associated with the entire Catholic Church, but it is labelled as "Roman Catholic", and Eastern Catholics, since they are in communion with the Pope, would also hold the Council of Trent as true. Therefore, we get Wiktionary entries like patron saint that have both, which is pretty redundant, because one could just label this as simply "Catholicism", and it would be simpler and no one would misunderstand it. Everyone would understand that it means something associated with the Catholic Church.

Therefore, for simplicity, clarity, and completeness of information, I move that all labels under "Roman Catholicism" be changed to "Catholicism" unless the entry is really just concerned with the Roman rite of Catholicism, terms like "Agnus Dei" or a "humeral veil", although I find it redundant too that we need to provide the specific rite within the Catholic Church to which the entry is used. --Mar vin kaiser (talk) 13:16, 19 October 2017 (UTC)

I thought that in English at least, the term Roman Catholic meant a Catholic who is in communion with the Bishop of Rome, i.e. the Pope, and thus includes Eastern Catholics. The "Roman" is necessary in order to exclude Anglicans and the Eastern Orthodox (who are also considered part of the Catholic Church as that term is used in the Creed). —Aɴɢʀ (talk) 13:54, 19 October 2017 (UTC)
@Angr: Actually, in those cases, the word "catholic" is written in lower case, which is a common practice in reciting the creed by Protestants, such as the "catholic church". When it is capitalized, as "Catholic", it would refer to the Catholic Church in communion with the Pope. This is exemplified by the fact that when one asks what religion you are, one says "Catholic", "Orthodox" (Eastern or Oriental), or "Anglican", and there is no ambiguity with regard to the term "Catholic": it automatically refers to the Catholic Church in communion with the Pope. As I said, the reason why "Catholic" should be used instead of "Roman Catholic" is because Eastern Catholics simply do not subscribe to the idea that they are Roman Catholics, because they do not follow the Roman rite, nor any Roman tradition, as started in the church in Rome. They have their own liturgy and practices, distinct from the Roman rite, thus they refuse to be called "Roman Catholic". Since the entries in Wiktionary labelled as "Roman Catholic" also apply to "Eastern Catholics", it's better to label as "Catholic". In religious discussion, people actually differentiate "catholic" and "Catholic", wherein the capitalized word refers to the Catholic Church in communion with the Pope. --Mar vin kaiser (talk) 14:31, 19 October 2017 (UTC)
That's not how I've ever understood "small-c catholic"; I've always taken it to refer to the nonreligious sense of catholic: "universal; all-encompassing; pertaining to all kinds of people and their range of tastes, proclivities etc.; liberal", while "big-C Catholic" has the religious senses. I suppose that, just as with the word American, there are different meanings to both Catholic and Roman Catholic and different people prefer different meanings and get into arguments with other people as to the "proper" meaning. The trouble is, there is no term that is both unambiguous and commonly used that refers to all churches in communion with the Pope. Both "Catholic Church" and "Roman Catholic Church" are ambiguous as they mean different things to different people, and "Church in communion with the Pope" is unwieldy and not exactly a common term (quite apart from the ambiguity of pope, which can refer to other people than the Bishop of Rome). —Aɴɢʀ (talk) 14:44, 19 October 2017 (UTC)
@Angr: I see what you mean, and I understand the trouble of ambiguity. How about we follow the precedent of Wikipedia? The Wikipedia entry w:Catholic Church pertains to all churches in communion with the Bishop of Rome. How about using the "Catholic Church" as a label instead? --Mar vin kaiser (talk) 17:10, 19 October 2017 (UTC)
It still seems so weird and funny to me that the entry "particular Church" is labelled as "Roman Catholic" when almost all of the particular Churches except 1 refuse to be called "Roman Catholic". --Mar vin kaiser (talk) 17:12, 19 October 2017 (UTC)
@Mar vin kaiser: The Anglo-Catholic in me rebels at seeing "Catholic Church" used to mean only the parts of it in communion with the Pope (and I was opposed to Wikipedia's moving "Roman Catholic Church" to "Catholic Church" several years ago), but the pragmatist in me says I suppose it's the least bad solution. What do others think? I feel like this isn't a decision that should be made by Mar vin kaiser and me alone. —Aɴɢʀ (talk) 13:47, 20 October 2017 (UTC)
@Angr, Mar vin kaiser: I don't see what the difference between "Catholicism" and "Catholic Church" is. And what about just "Catholic"? Would that be problematic? — justin(r)leung (t...) | c=› } 15:14, 20 October 2017 (UTC)
Come to think of it, I just noticed that if you type "Catholic" or "Catholicism" into Wikipedia, it redirects you to the article of the "Catholic Church". --Mar vin kaiser (talk) 15:27, 20 October 2017 (UTC)
I don't think there is a final solution, since there is ambiguity no matter which term you use. I think "Catholicism" or "Catholic Church" are probably the best ways to label something pertaining to the Church in communion with the Pope, since it is almost always what people are referring to when they use that term, and using more specific labels like "Old Catholicism" or "Old Catholic Church" for other brands of Catholicism. There's no perfect solution, but I think that's the way to go. We almost need another term, like "Papal Catholicism" (although that still leaves an ambiguity between Roman Catholics and sedevacantists...). :P Andrew Sheedy (talk) 03:55, 22 October 2017 (UTC)

Please revert vandalism at WT:LOP

Can someone please undo the vandalism of 2017 October 9 at WT:LOP ?

Almost all the page content was deleted by vandal user 2602:306:3B60:F5F0:9DB8:9A6C:2612:A93 (talk)

For some unknown reason pressing revert or trying to go back to the last good version resulted in an edit filter saying it was impossible to save.

-- 06:24, 20 October 2017 (UTC)

Thanks, done. (Repeated vandalism at Appendix:List of protologisms/Q–Z by similar IPs; may require protection or range block.) Wyang (talk) 06:27, 20 October 2017 (UTC)

Why isn’t it even possible for a non-sysop user to revert multiple commits, at least by a single IP? This could of course be used by vandals, but just deleting content has the same result, and without it the vandals have an advantage, because they just need to save multiple times to make their changes unrevertable. Or do I just fail to see how this can be done by me as a plain user? There have been a number of cases, however, where I would have been faster than an admin in reverting vandalism. Palaestrator verborum (loquier) 09:10, 20 October 2017 (UTC)
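
To make the multi-edit revert concrete: the operation asked about amounts to walking the page history from newest to oldest and restoring the first revision not saved by the offending account. A minimal sketch, assuming the history is already available as a newest-first list of (revision id, author) pairs; the function name and the sample data are hypothetical and this does not call any MediaWiki interface.

```python
def rollback_target(history, offender):
    """Return the id of the newest revision not saved by `offender`.

    `history` is a list of (revision_id, author) tuples, newest first.
    Returns None if every revision in the list is by the offender.
    """
    for revision_id, author in history:
        if author != offender:
            return revision_id
    return None


# Hypothetical history: three consecutive saves by the same IP on top of a good version.
sample_history = [
    (104, "vandal-ip"),
    (103, "vandal-ip"),
    (102, "vandal-ip"),
    (101, "regular-editor"),
    (100, "vandal-ip"),
]

target = rollback_target(sample_history, "vandal-ip")
```

Here `target` is 101: saving several times in a row does not hide the last good version, it only means that a single-step undo is not enough.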

Removing images of coats of arms

Relatively recently, coats of arms have been added to entries as images.

I propose to remove all images of coats of arms.

Images should help find what the thing referred to looks like or help get a clearer idea of the referent in another way. Coats of arms do not serve the purpose at all. For countries, states and cities, geographic maps seem okay as images.

--Dan Polansky (talk) 07:44, 21 October 2017 (UTC)

Support removing coats of arms. Support having maps in the entries mentioned. --Daniel Carrero (talk) 10:59, 21 October 2017 (UTC)

WT:WDP and senseid

Not going to hide, I'm eager to have this information directly at Wiktionary.

Probably Template:senseid is not the best template and we would have better solutions.

So I suggest two votes:

  1. if wikidata ids could be used parallel to synonyms (to connect same senses with same wikidata ids)
  2. if Template:senseid is the best option for this

I suggest to start voting on 28-10-2017. d1g (talk) 15:50, 21 October 2017 (UTC)


Discuss any questions before voting here. d1g (talk) 15:50, 21 October 2017 (UTC)

Wikidata ids in order to capture same senses




Template:senseid as current solution to implement topic above




from vs <

I stumbled upon this old line in Wiktionary:Etymology:

Some editors use the word “from” to separate ancestors, while others use the algebraic “<”. The symbol “<” implies an arrow that points in the direction of language change. There is currently no consensus on a preferred form, but a majority of editors prefer "from" over "<".

There was no clear consensus in 2011, but the "from" side has clearly won out by 2017. Can we update WT:ETY or does it need some kind of vote? Pengo (talk) 16:25, 21 October 2017 (UTC)

I've deleted that paragraph. --Barytonesis (talk) 16:53, 21 October 2017 (UTC)
I added the paragraph back: it is true and links to evidence of consensus or its lack. If you can provide evidence of consensus, we can update the paragraph to state what the consensus is. --Dan Polansky (talk) 17:27, 21 October 2017 (UTC)
Newer user here. I registered two and a half weeks ago and have worked on and read the English Wiktionary heavily since, with almost 10,000 en.wiktionary.org pages in my browser history since then and several tens of thousands more from earlier this year, but I have not seen the use of such a sign in that time, or if I have seen it, it was no more than five times, and I cannot imagine that a reader can have an honest will to see it.
Instead, it appears to me that if one is an editor that has not forgone caring for the visual appearance of a linguistic entry, one is inclined by one’s refinement to substitute such figures en passant; I don’t think that it is good typography and I must own that it would be an understatement to claim that I would need to tax my brain to apprehend what such a mark is intended to signify. It is hard for someone who is lacking of any background of having it seen used other than in unavoidable mathematical education and on the other hand it is hard for a mathematician likewise because he who knows mathematics is irritated by any use of mathematical characters for which he has trained much to have specific concepts. Also I am concerned that it is a bad habit to use that sign anywhere at all on the web with the intention of its glyph being displayed, as its meaning is restricted to being part of the XML and HTML markup and one can make a mess with it easily or have it filtered (on other sites if not here).
Uses in print are based on space rationales while the sign does not belong to the inventory of any writing system, but else you can with as much success use ⊷ or ⊱ or √ if you use < – what you want to express with it is no less obscure, as using “<” is a far-fetched trope no matter of how much frequency it has; “smaller than” in mathematics does not overtly map to “borrowed from” or “inherited from” (the keeping apart of which is also double-crossed by it), and the other uses of the sign on the web further distort the meaning up to conviction that this is not a sign that belongs into etymology sections of community-edited web dictionaries. Of course the same applies to many other parts of the web and goes with the characters “>”, “=”, “"” and “~” in so far as these characters are not used in any language but in the scribal traditions of special disciplines. The fact that your computer has a character on your keyboard does not at all recommend it being used outside of a computer-context, and particularly it can not ever be a standard as it does not need to be included at all in a keyboard layout, or in the keyboard if the key that would carry it is missing as is sometimes the case with the ⟨LSGT⟩ key that carries both ASCII angle brackets in half of Europe (vulgo: the one right of the left shift key).
Is there anything about the “<” sign that outweighs the detriments of its usage? Meseems I have falsified everything that one could possibly say in favor of it, and that this my posting could as well have a place in Wiktionary:Etymology. Come hither, defenders of U+003C in web dictionary etymologies! Can you hold anyone at your side? Or will you shrewdly ignore the issue because of convenience? A community decision would be relieving for the purpose of providing a reference that using “<” for etymologies is uncouth, in the ambit of the rationale. I quit writing for it now, as it seems that I have written the top beer parlor post of this year by length without even being inebriated, but I hope that this has an effect, for the quality of the English Wiktionary and maybe other projects, as it has taken me three hours of thinking and explaining and I have in this time done my best to garner your deltas and I want to spare people from repeating the formulation of the thoughts I have laid down. Palaestrator verborum (loquier) 23:17, 21 October 2017 (UTC)
I have now read the prior utterances in the 2011 thread, and I highlight that the 2011 vote was about introducing a certain format with “from” as a practice. However, now the thread is about whether U+003C should be replaced, not about what can be used in general. From what I have fished out there, there were not so relevant reasons why people actually opposed. One (Mglovesfun) said: “< is easier on the eye than 'from', not least because in reading I 'internally' pronounce the 'from' but for < I pronounce nothing” – which can only very conditionally be true, as I have shewn that its nature is obscure and the “internal pronunciation”, if it exists, which is a dubious phonocentristic claim, cannot have weight as one asks oneself what to pronounce for “<” ⇨ it is illogical. Other people opposed because they had to read “a page full of verbal diarrhea” or because they had been asked to vote (SemperBlotto literally), and mostly because they did not want it to be included in the Wiktionary:Etymology page, thus mostly only for formal reasons. One utterance by some Stevey7788 in opposition is exemplary of the votes in opposition missing the point:
“The "<" sign is easier to read, as wordiness and lack of conciseness tend to cause confusion. This symbol is also widely used in academic publications on historical linguistics. Readers would all quickly learn what "<" means, since it should be very intuitive.” It is the job of the writer not to cause confusion when using words, but “<” does that as a rule; one cannot learn what it means because it means multiple things and one has to interpret it in context, while words can be artful. Also the usage of however-academic publications does not count as we work on a different publication type with unlike constraints and allowances. I want to point out one point that I have not really touched: “less than is not readable for users of screen readers, braille displays, or other assistive technologies.” (Neskaya)
Another guy (Bogorm) wrote that it “is useful in longer derivation chains” – there are ways more useful for the reader to show derivation chains; you write “<” because you are too inert to think of another thing and to accomplish it. Of course it is not that much a sin that it justifies punishment, but there should be a canon for confirmation of the replacement of such inertia. What is Unicode for if people opt for ASCII? What is the increasing store and display space in computers for if they use such mediocre shorthands?
We can do better in reaching a consensus by starting with one summary, as I have started. Palaestrator verborum (loquier) 00:04, 22 October 2017 (UTC)
The uses of "<" largely disappeared, sure, but whether that was by "consensus" is not entirely clear. I don't believe there are any conclusive arguments unequivocally in favor of "<" or "from"; how to weigh various pros and cons is a matter of preference. Wiktionary:Votes/pl-2011-02/Deprecating less-than symbol in etymologies did not show consensus. The current text in Wiktionary:Etymology does not mislead a new user, I believe. A new user can read the vote, see that nearly a supermajority (2/3) supported "from", look around a bit, and see that "from" has won in the mainspace. That said, another vote may be in order if we want Wiktionary:Etymology to indicate "from" as the recommended practice. --Dan Polansky (talk) 06:04, 22 October 2017 (UTC)
I disagree that there is "increasing ... display space in computers". Most people seem to use phones or tablets now. Equinox 16:06, 22 October 2017 (UTC)
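
On the markup point made above (that "<" is reserved in XML and HTML and can be mangled or filtered when it is meant to be displayed): a literal less-than sign has to be escaped before it is placed in HTML. A minimal sketch using Python's standard `html` module; the etymology string is an invented example, not taken from any entry.

```python
import html

# Invented etymology chain using "<" as the derivation sign.
etymology = "pater < *ph2ter"

# html.escape replaces &, < and > (and, by default, quotes) with character references.
escaped = html.escape(etymology)

# html.unescape reverses the replacement.
restored = html.unescape(escaped)
```

After this runs, `escaped` is `pater &lt; *ph2ter` and `restored` equals the original string. The word "from" needs no such treatment, which is one practical difference between the two notations.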

Move {{was wotd}} notices to talk pages

Would anyone support this? I'm not sure of the utility of this template to a reader beyond archiving old words of the day, and the category Category:Word of the day archive can function as an archive with talk pages. DTLHS (talk) 05:33, 22 October 2017 (UTC)

Why though, is it too ugly? I am afraid it has exactly zero use, only being a source of work that could be used for removing or fixing other template usages. Also it would effectuate additional clicks ad infinitum because people would have to go to the talk pages to check if a word already has been word of the day. Palaestrator verborum (loquier) 05:44, 22 October 2017 (UTC)
  • Support moving "was wotd" to talk pages. I don't see why a reader should care to see immediately whether a word was once the word of the day. --Dan Polansky (talk) 06:07, 22 October 2017 (UTC)
Having it on the talk page is pointless. I personally find it an interesting bit of information. An alternative would be a category (although not very visible, still better than dumping it on talk). – Jberkel (talk) 16:01, 22 October 2017 (UTC)

Category:Buyeo language

I just RFV-failed our only two entries: 乙那 and . Is it clear that there was a Buyeo language, and if so, are there any texts that are indisputably in Buyeo?__Gamren (talk) 16:07, 22 October 2017 (UTC)

Oh, and @suzukaze-c, -sche, Pedrianaplant.__Gamren (talk) 16:09, 22 October 2017 (UTC)

Poll: deploy timeless skin

It'd be great to have the Timeless skin deployed here (opt-in, of course). Wiktionnaire, the French Wikipedia and a bunch of other sites have it already (see T154371 for a list). It has a sticky header and a responsive design that handles different screen sizes much better than vector (demo). We need to have community consensus before it gets deployed, hence this poll. – Jberkel (talk) 16:14, 22 October 2017 (UTC)