Wiktionary:Beer parlour/2017/September

From Wiktionary, the free dictionary

September LexiSession: peace

An origami for peace!

The monthly suggested collective task is to make peace. September 21 is the International Day of Peace and October 2 the International Day of Non-Violence so it may be good to reinforce our content related to this topic.

By the way, LexiSession is a collaborative experiment without any guide or direction. You're free to participate however you like and to suggest next month's topic. If you participate, please let us know here or on Meta, so we can keep track of the evolution of LexiSession, since we started a year ago. I hope some people will be interested in contributing! My plan is to try to draft a thesaurus on this topic, but picking good illustrations can be a nice challenge too. Noé 10:37, 1 September 2017 (UTC)

Hey, the International Day of Peace is today. I'm quite happy to publicize the new French thesaurus about peace! Noé 13:41, 21 September 2017 (UTC)

I propose to add a 5th case in which pre-approval isn't necessary: in existing and in-use templates/modules, where the use of Wikidata does not have any effect on the output. This would allow us to do things like tracking and stuff, for testing and to explore potential new uses and the effects. —Rua (mew) 14:41, 1 September 2017 (UTC)

To repeat what I said in Wiktionary talk:Wikidata#A first experiment, I support adding this 5th exception. I think it's consistent with the spirit of the other exceptions, to allow the possibility to access Wikidata without affecting the actual content presented to readers. --Daniel Carrero (talk) 21:49, 9 September 2017 (UTC)
I've been bold and added it to the Wikidata policy page, since nobody else has shown an interest in this discussion. —Rua (mew) 19:03, 15 September 2017 (UTC)

Old Kurdish?

Please forgive my ignorance, but what is the generally accepted name for the parent form of all the Kurdish sublects? Old Kurdish? Proto-Kurdish? Is "Old Kurdish" attested and/or reconstructable?

Also, I noticed we have quite a few entries with the language code ku. Do these need to be sorted out and moved to their respective lect codes, or are these entries with identical orthography across all three lects? --Victar (talk) 20:55, 1 September 2017 (UTC)

We have discussed this before, although I'm not sure where. @-sche might have a link handy. Our entries in ku are nearly all Kurmanji in Latin script, although we also have a specific code for Kurmanji. I think we would be best off committing to a single approach, and use a modified version of {{fa-regional}} to link between dialects. —Μετάknowledgediscuss/deeds 21:15, 1 September 2017 (UTC)
Pinging @Calak, who seems to be knowledgeable in all of the Kurdish dialects. —Aryaman (मुझसे बात करो) 02:42, 2 September 2017 (UTC)
Proto-Kurdish is attested.
I always use the ku code when a word is common to all of the Kurdish dialects. For example the common word for "goat" in Kurdish is "bizin"; why should we separate dialects and write "bizin" four times?! The ku code means the Kurdish language with all its dialects.--Calak (talk) 09:03, 2 September 2017 (UTC)
Because it can get confusing where there's Southern Kurdish one place and Kurdish another, and it's doubly confusing once someone has set computers at the job and there's just raw lists of which entries have Southern Kurdish translations and which don't, without any note of Kurdish in the vicinity.--Prosfilaes (talk) 22:52, 4 September 2017 (UTC)

Proposal: install mw:Extension:PageNotice

This extension makes it possible to add headers to pages. It would mean we no longer need to add {{reconstructed}} to every reconstruction page. —Rua (mew) 12:19, 2 September 2017 (UTC)

French Wiktionary August news


Hey! The August issue of Wiktionary Actualités just came out in English!

What's up in French Wiktionary? And in the other Wiktionaries? What is a Magic Link? Are there statistics somewhere? German words that may exist? Videos? Details on tantum categories? Nice paintings from French artists? Clowns? Yes, all of this can be found in August Actualités!

As usual, it is translated into English by non-native speakers, in less than a day, and it is not perfect, but it can be improved by readers (wiki-spirit). We are very happy to celebrate a year of English translations! Twelve issues! That's not bad considering that we do not receive any money for this publication and we are not supported by any user group or chapter. It is written only by the community, and there were eleven participants for this issue! We all remain eager to receive your opinion on our publication! Noé 21:24, 2 September 2017 (UTC)

Egyptian hieroglyphics

Why are you using the html tag for Egyptian hieroglyphics instead of the Unicode characters? While looking at Help:WikiHiero syntax I read that the Unicode characters are only partially supported, so I guess that's why (I found the page while writing). What is missing for them to be fully supported, and when might those missing pieces be added or fixed? This turned out to be more of an "I don't know anything" question than I meant... sorry 'bout that. Jonteemil (talk) 23:34, 2 September 2017 (UTC)

You answered your own question: Unicode can't support all of what we want to show. WikiHiero not only displays the characters correctly, it also does so regardless of whether the reader's computer can support special fonts (hint: most can't) and allows for flexibility in stacking, which lets us show how the language's native speakers chose to organise hieroglyphs spatially. Moreover, Egyptian dictionaries are conventionally organised by romanisation. There is no expectation that Unicode will ever fix this, which is unsurprising given that it is a long-extinct language with no use community, so our current solution is the best way to handle Egyptian going forward. —Μετάknowledgediscuss/deeds 23:45, 2 September 2017 (UTC)
Actually, positioning hieroglyphs properly might be in the works. —suzukaze (tc) 07:16, 3 September 2017 (UTC)
Fingers crossed! — Ungoliant (falai) 17:54, 4 September 2017 (UTC)
Oh, I hadn't seen this version. That's interesting, although I'm not sure it outweighs the other concerns. (E.g. that Egyptian spelling is so erratic that if people search by hieroglyph, we'd have to create entries for all the plentiful alternative spellings (and arrangements that aren't even truly alternative spellings, but have different Unicode control characters) because they'd never be able to guess what spellings we'd lemmatised.) —Μετάknowledgediscuss/deeds 18:02, 4 September 2017 (UTC)
Discussion moved from Wiktionary:Tea room/2017/September#User:TNMPChannel.

I just blocked them for three days for creating an entry in Vietnamese- a language they don't claim to know- by plagiarizing the definitions (without attribution) from a Chinese entry that shares the same character. This is not just dishonest, it's a copyright violation and a violation of our Creative Commons license.

It's also part of a pattern of poor judgement that I've been concerned about for a while: indiscriminate mass creation of articles from a single source without checking for attestability. Creating entries, then immediately rfding them (within minutes). Submitting one of their new entries to rfc because no one else had intervened to fix it yet. Moving a category without understanding enough about our categories to have a clue whether it was a good idea (it definitely wasn't). In general, doing stuff without knowing what they were doing, then expecting others to fix it.

I may be wrong, but to me this all looks like a child who's too young to understand the implications of what they're doing, and is used to grownups stepping in and fixing things. If not, then something is really, really wrong.

At any rate, we need to decide what to do about this- I've only blocked them for three days, and they will have read this by then. Wikis aren't all that good at dealing with contributors who sincerely believe they're helping, but don't know what they're doing. What do you think we should do? Chuck Entz (talk) 05:55, 3 September 2017 (UTC)

I think he/she should be unblocked for now. The first reminder or warning to the user about making errors in unfamiliar languages was by Justin on their talk page at 03:54, 3 September 2017 (UTC), and they haven't made similar edits after the message. From what I observed in their Chinese edits, he/she seems to be quite unfamiliar with our formatting system, though I can see they are trying to improve, and I have also received 'thanks' for my subsequent edits to their created entries. The entries have been quite useful too. It's not very often that we get new users who are native in E/SE Asian languages, so I'm more inclined to fix their new edits than discourage them. Of course, if it persists despite explanation and warning, then blocking would be indicated. Wyang (talk) 06:01, 3 September 2017 (UTC)
The plagiarism part merits at least a day. Chuck Entz (talk) 07:16, 3 September 2017 (UTC)
Sure. However, I suspect (or hope) that it may just be part of their cluelessness, rather than malicious disruption or infringement. I do hope that they could return, to help with Chinese idioms and Malay entries. Wyang (talk) 08:12, 3 September 2017 (UTC)
I find their constant page moves very worrying. They seem to misunderstand how everything works here. Maybe time should be taken to explain what's wrong on their talk page. —suzukaze (tc) 06:02, 3 September 2017 (UTC)
About 2.5 hours ago this user placed {{unblock|I promise I won't do it again.}} at User_talk:TNMPChannel#Unblock. I think we need some specific, extensive acknowledgement of what won't be done again. (Also something in the documentation that requests a complete confession or allocution before the user is entitled to the request being considered.) DCDuring (talk) 12:25, 3 September 2017 (UTC)
Also, please check if it's not an Awesomemeeos sock. --Anatoli T. (обсудить/вклад) 12:53, 3 September 2017 (UTC)
Completely different. Awesomemeeos gets all the technicalities perfect but has trouble with basic common sense. This person can't get either right. Awesomemeeos simply doesn't have the self-awareness and self-restraint to pull off an impersonation like this- for one thing, they're compulsive about upgrading templates. Chuck Entz (talk) 14:32, 3 September 2017 (UTC)

Translingual terms listed under descendants

Prompted by this seemingly innocent kerfuffle, it makes me wonder if it is I who am in the wrong, or if I'm onto something. Should translingual terms be listed as descendants? I would say no, because they're mainly taxonomic terms made up of Latin terms, not natural descendants. --Robbie SWE (talk) 17:49, 4 September 2017 (UTC)

Would you support the taxonomic terms being listed as ==Latin== instead of ==Translingual==? DTLHS (talk) 17:59, 4 September 2017 (UTC)
Hmm, I'm kind of slow today. Mind giving me an example? --Robbie SWE (talk) 18:02, 4 September 2017 (UTC)
Since you say that they are "taxonomic terms made up of Latin terms" and favor derived terms instead of descendants, it would be consistent to just call them Latin instead of "Translingual". DTLHS (talk) 18:04, 4 September 2017 (UTC)
But we already have translingual taxonomic terms. The examples I was given were bombyx#Descendants, accipiter#Descendants, aequoreus#Descendants and alauda#Descendants. I don't believe they should be listed as descendants. --Robbie SWE (talk) 18:19, 4 September 2017 (UTC)
You do not understand. If you want them to be derived terms why do you still want to call them "Translingual"? DTLHS (talk) 18:23, 4 September 2017 (UTC)

I don't want them there at all - for instance, Bombyx shouldn't be under descendants nor should it be under derived terms at bombyx. --Robbie SWE (talk) 18:47, 4 September 2017 (UTC)

Why shouldn't there be some link in the Latin section to the taxonomic term that is derived or descended from it? Why would we omit the connection? DCDuring (talk) 22:17, 4 September 2017 (UTC)
@DCDuring, I understand what you mean. The reason why I would opt for not listing them under descendants at all is that translingual isn't a language per se. According to our guidelines – [l]ist terms in other languages that have borrowed or inherited the word. The etymology of these terms should then link back to the page. – I don't think that translingual terms fall under this category. I looked through this category and a vast majority of them are not listed under descendants in their original Latin entries. --Robbie SWE (talk) 08:30, 5 September 2017 (UTC)
@Robbie SWE: In the case of CJKV characters clearly we are dealing with a script, not a language. In the case of other Translingual items we are dealing with items that are used in multiple languages. If we don't have Translingual descent shown for taxonomic names, then in principle we should show the taxonomic name as a descendant in each language in which the taxonomic name is used. This seems silly at best. DCDuring (talk) 14:34, 5 September 2017 (UTC)
"Should translingual terms be listed as descendants?"
By common practice it's done that way, and so you were "in the wrong". Whether it "should" be done that way is another question and topic. So what possibilities are there?
  • Don't list translingual descendants at all.
    --- This probably is not a good choice.
  • List them as derived terms.
    --- By WT:ELE#Derived terms that would only be possible if translingual terms would be mislabelled Latin or if Latin would be mislabeled translingual or if both would be merged into a single pseudo-language 'Translingualolatin' (or whatever the name would be). This probably is not a good choice either.
  • List translingual terms as descendants.
    --- Why not? I can't think of any contra reasons. Pro reasons: In a translingual entry it would also be for example "From Latin TERM", and descendant is descriptive.
  • List translingual terms at see also.
    --- This does also depend on the question of what can be listed at see also, and apparently there are different views about it. If different-language terms can be listed at see also: Why not? The only reasons I could think of would be that descendants (maybe cp. descendant#Noun) sounds fitting and is more descriptive and informative. On the other hand, as some translingual terms come with {{taxlink}} and link to the English wikispecies project and not to wiktionary, this would also be an acceptable choice.
- 02:52, 5 September 2017 (UTC)
{{taxlink}} "temporarily" links to Wikispecies (in all uses), with the hope that there will be a Translingual Wiktionary entry (unless it is decided that taxonomic names are Latin). The "See also" heading is for items that have no more specific heading, but in the past has been used for (true) external links and WM project links as well as for alternative forms, "coordinate terms", hyponyms, hypernyms, meronyms, etc. Placing the items now under "See also" under proper headings would be an excellent cleanup project (Augean stables?), but most of us are in pursuit of bright shiny objects. DCDuring (talk) 03:59, 5 September 2017 (UTC)

(literary or dialectal)

故此 is tagged as (literary or dialectal). I'd like to know whether, as the order of the tag seems to suggest, 'literary' refers to Mandarin only, or also to the unspecified dialects implied by the tag. Is this way of inferring to be systematically understood for any such tags appearing in other entries for any other language? --Backinstadiums (talk) 21:06, 4 September 2017 (UTC)

As it is the entry is not comprehensible. @Wyang, what dialects are included in "dialectal"? DTLHS (talk) 05:17, 5 September 2017 (UTC)
Wiktionary:About Chinese#Key points: "Terms are defined in relation to Modern Standard Written Chinese. ... Senses limited to the literary language, certain dialects or regions should be marked accordingly." I changed the tag to "formal or Min Dong". I think the headers in our entries should link to the "About" pages somewhere, so that people are directed to a page which explains how a language is documented in entries on Wiktionary, also a page where people can leave their questions or feedback, or voice their interest in joining the editing team. Wyang (talk) 12:13, 5 September 2017 (UTC)
@Wyang: Isn't it uncommon for a term to be used in either formal registers or dialectal ones? --Backinstadiums (talk) 06:41, 7 September 2017 (UTC)
Not necessarily, especially if the reason for the disuse in the modern standard register is an innovation, which happens often in Chinese. Wyang (talk) 06:44, 7 September 2017 (UTC)
@Wyang: Could you please add some examples of such innovations for words used frequently in the language? Thanks in advance. --Backinstadiums (talk) 14:58, 8 September 2017 (UTC)
Much of the variation in basic vocabulary (Appendix:Sino-Tibetan Swadesh lists) is due to innovations in the northern varieties of Chinese. Some examples include: (“mouth”) (later displaced by ), (“to eat”) (by ), (“to drink”) (by ), (“dog”) (by ), (“to stand”) (by ), (“he/she”) (by ). Apart from this kind of simple monosyllabic supplantation, another reason for the innovations is the process of polysyllabification, which occurred especially in the northern varieties out of the need for disambiguation, as a compensatory mechanism after the loss of many phonetic contrasts (e.g. tone) through sound changes. Examples include 石頭, (“seed”) → 種子. Wyang (talk) 04:30, 9 September 2017 (UTC)Reply[reply]
@Wyang: Hi again, thank you for your examples, but none is tagged as either "literary", "dialectal" or "literary or dialectal". Could we isolate the entries which have the tag "literary or dialectal", and then check which ones developed as innovations? --Backinstadiums (talk) 08:30, 9 September 2017 (UTC)
They don't have to be tagged. It's implied in the {{zh-dial}} boxes on those entries. Wyang (talk) 10:42, 9 September 2017 (UTC)
@Wyang: Do you mean having {{lb|zh|Min Dong}} link to the about page? — Eru·tuon 07:52, 7 September 2017 (UTC)
Not really, having <h2>Chinese</h2> link to the about page rather, in the style of fr:chinoise or something similar. Wyang (talk) 08:02, 7 September 2017 (UTC)
I suppose that could be done with JavaScript. Or with templates, if we decided to allow templates in headers like the French Wiktionary (less likely). — Eru·tuon 08:49, 7 September 2017 (UTC)

Please someone block this. Wyang (talk) 04:40, 8 September 2017 (UTC)

@Wyang: Done, simply because I trust you. I want a reason, though — I looked over a few edits and they seemed fine, though I don't know any Uyghur. (Also, for future reference, this sort of thing can go at WT:VIP.) —Μετάknowledgediscuss/deeds 05:39, 8 September 2017 (UTC)
Hmm, is it just because it's an Australian who already knows how all our templates work? —Μετάknowledgediscuss/deeds 05:46, 8 September 2017 (UTC)
Works for me. It's not necessarily the quality of the initial batch of edits, but the camel's nose that will lead to high volumes of hard-to-check edits later on. Notice, for instance, their edits on {{Template:kk-decl-noun}} which "coincidentally" continues edits by another IP back in July.
(Before E/C) @Wyang Hi. What edits are wrong? I have only checked some, haven't seen anything bad.

--Anatoli T. (обсудить/вклад) 05:40, 8 September 2017 (UTC)

@Atitarev you don't find it odd that an IP pops up out of nowhere and starts out by rewriting all the inflection templates- in both Kazakh and Uyghur? Chuck Entz (talk) 07:28, 8 September 2017 (UTC)
@Chuck Entz: Yes, it's suspicious; it could be a formerly blocked editor, but I didn't know that this is a reason for blocking, and none was given. Who was it, anyway? --Anatoli T. (обсудить/вклад) 07:37, 8 September 2017 (UTC)
AwesomeMeeos, of course. I'm starting to go through Category:Noun inflection-table templates by language. In the Adyghe subcategory, for instance, you'll find edits by another IP. Chuck Entz (talk) 07:57, 8 September 2017 (UTC)
He's quite inventive, LOL. --Anatoli T. (обсудить/вклад) 08:11, 8 September 2017 (UTC)

Adding accents to Italian headwords

I'm currently learning Italian and started to work on our Italian entries. I noticed that we don't display accents for irregularly stressed words. cavolo and diavolo, for example, are listed in other Italian dictionaries as càvolo and diàvolo (e.g. Treccani), because they don't follow the common stress pattern (next-to-last syllable). My suggestion is to add a headword parameter to those entries, {{it-noun|m|head=diàvolo}}. Or would that be confusing? Explicit parameter? Better alternatives? – Jberkel (talk) 09:04, 10 September 2017 (UTC)

The problem is that sometimes the accent is actually written, and there's no way for someone to tell the difference. —Rua (mew) 11:28, 10 September 2017 (UTC)
I have included the accent in the hyphenation as a possible solution: diavolo. --Vriullop (talk) 12:12, 10 September 2017 (UTC)
Hm, maybe that's a solution then, I think the information should go somewhere, and the pronunciation section is a good place. It's interesting that they don't bother using accents, even in ambiguous cases (e.g. pesca). – Jberkel (talk) 15:12, 10 September 2017 (UTC)
Theoretically, the "correct" solution would be to include an IPA pronunciation with a stress mark. But I like the accent in the hyphenation idea too. --WikiTiki89 18:06, 11 September 2017 (UTC)
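For anyone who wants to experiment with the head= idea, here is a minimal Python sketch of how an accented headword such as diàvolo could be mapped back to its page title. The function name strip_stress_marks is hypothetical, and the rule it applies (drop accents everywhere except on the final letter, where Italian normally writes them) is a simplifying assumption, not existing template or module behaviour. It also does not solve the ambiguity raised above for the rare words whose non-final accent is actually written.

```python
import unicodedata


def strip_stress_marks(headword: str) -> str:
    """Drop accent marks from every letter except the last one.

    Assumption for illustration: a written Italian accent normally sits
    on the final vowel (e.g. "città"), so accents elsewhere are treated
    as dictionary-only stress marks.
    """
    # Work on the composed form so the final accented vowel is one character.
    headword = unicodedata.normalize("NFC", headword)
    if not headword:
        return headword
    body, last = headword[:-1], headword[-1]
    # Decompose the body and discard combining marks (grave, acute, ...).
    decomposed = unicodedata.normalize("NFD", body)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return unicodedata.normalize("NFC", stripped) + last
```

With this rule, the dictionary form diàvolo would map to the page title diavolo, while città would keep its final, orthographic accent.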

This person keeps reverting my OED-sourced proper pronunciation of angstrom. Apparently he or she thinks that proper pronunciations must meet his personal litmus test of notability or whatever. Wiktionary exists for many reasons, one being a place where readers from Wikipedia can come to get the specifics of a word, an important part of that being proper pronunciations. I added this information for the specific reason of avoiding the continuation of ever-recurring arguments about the pronunciation of this word over at Wikipedia. The proper pronunciations of words and the common ones do not always overlap. Their argument seems to be that [œ] does not exist in English, but a quick look over at Open-mid front rounded vowel can tell you that's not the case. In any case it is a Swedish loanword, and you can observe this vowel in the pronunciation of the Swedish version. I made sure to display the pronunciation I added [phonetically] rather than /phonemically/, and I did not mess with or remove the existing common phonemic English pronunciations, so I really don't see what the big deal is. I don't just pull this stuff out of my ass; I am well-versed in the relevant term and how IPA works. This information, while a bit obscure, could still potentially help someone. Don't get me wrong; any other day I'm all for excising superfluous crap from reference sources, but this is not one of those cases. It looks to me to be yet another age-old case of what we called “barracks lawyers” in the Army. Pariah24 (talk) 12:48, 10 September 2017 (UTC)

I would expect that pronunciation by English speakers would rarely be the same as "correct" or common pronunciation by Swedish speakers, especially for a word fully absorbed into English. At [[ångström]] we have the Swedish pronunciation. DCDuring (talk) 14:51, 10 September 2017 (UTC)
@Pariah24 The problem I have is that you don't state which accent says [ˈɔːŋstɹœm]. The LPD and CEPD, the most respected pronunciation dictionaries of English, list only the pronunciations with /ə/ and /ʌ/. We're both aware that there's no */œ/ phoneme in English, at least in most accents. Can you prove that accents that use [œː] for the NURSE vowel (or GOAT vowel, in the case of South African English) would use that vowel in 'angstrom'? I find that highly unlikely and so that argument just doesn't hold up without additional sources that would prove that. Mr KEBAB (talk) 15:10, 10 September 2017 (UTC)
@DCDuring We do, I added it there a few days ago per the LPD, which provides the Swedish IPA alongside RP and GA transcriptions. Mr KEBAB (talk) 15:10, 10 September 2017 (UTC)
The established practice on Wiktionary is to only put English pronunciations in an English entry. If no English speaker actually says the word angstrom with the pronunciation [ˈɔːŋstrœm], then that pronunciation should not be listed in the English entry. So the way to resolve this dispute is to find evidence that an English speaker says [ˈɔːŋstrœm], and to put that pronunciation in the proper context (is it rare, is it used by English speakers who also speak Swedish?). — Eru·tuon 17:48, 10 September 2017 (UTC)
I didn't notice that the pronunciation came from the OED. It is given as phonemic there: /ˈɔːŋstrœm/. On the one hand, I respect the OED; on the other, Wiktionary encourages verification of information taken from other sources, so it would be good to find out what they based this transcription on and whether it would meet our standards even if the OED didn't say it, and put it in proper perspective (that is, as above, who actually uses this pronunciation?). — Eru·tuon 18:02, 10 September 2017 (UTC)
screenshot for the plebs :) Pariah24 (talk) 03:29, 11 September 2017 (UTC)
Given its relative obscurity I think most would agree that finding a sample in the wild of a pronunciation of this term is pretty unfeasible. OED has never steered me wrong. If this were Wikipedia I wouldn't have even bothered with this nitpicky silliness, but it's a dictionary for pete's sake, and I always lean on the side of too much information is better than not enough, provided the source isn't questionable. I'm not over here deleting anything, I'm just adding. Pariah24 (talk)
This doesn't look like a query for an obscure word to me. —suzukaze (tc) 03:48, 11 September 2017 (UTC)

After all, there are many sources that say "X is a word that means Y" out there in the world, in particular other dictionaries. But they don't actually prove that people use a word, they just say they do, which really isn't sufficient. It wouldn't be the first time that dictionaries make up words that nobody has ever used!

suzukaze (tc) 03:40, 11 September 2017 (UTC)
Blockquoting, really? Pariah24 (talk) 03:43, 11 September 2017 (UTC)
If you want to play this game, here's a link for you. Pariah24 (talk) 03:47, 11 September 2017 (UTC)
Alright, you can have another too. —suzukaze (tc) 03:50, 11 September 2017 (UTC)
  • I, too, suspect this is a dictionary invention and I think it should not be in the entry without proof of actual use. Incidentally, I have a background in science (in the US), where normal use was /ˈæŋstɹəm/ or /ˈæŋstɹɑm/ (the latter of which I note is not in the entry). —Μετάknowledgediscuss/deeds 03:52, 11 September 2017 (UTC)
Not trying to offend but your anecdotal experience with a physics term can not possibly be comprehensive enough to be used as a valid argument for inclusion into a worldwide dictionary project. The only plausible way to attack my view that I see is to question the validity of OED and claim they would just put made-up bullshit into their dictionary, which is what you appear to be doing. Everything I know about OED from years of using it goes against this view. This is all getting really pedantic if you ask me. Sometimes it seems like all the worst parts of Wikipedia are magnified here. Pariah24 (talk) 04:24, 11 September 2017 (UTC)
If I understand the screenshot above correctly, the OED's entry for angstrom hasn't been updated since 1933. Nobody is saying that they insert made up bullshit into their entries. But their information may be outdated. DTLHS (talk) 04:28, 11 September 2017 (UTC)
Sadly, it may be bullshit. The OED makes things up a lot more than any of us would like, I'm afraid; you'll see that they're a frequent offender over at Appendix:English dictionary-only terms (a list of terms found in dictionaries that were never actually used enough to enter Wiktionary). —Μετάknowledgediscuss/deeds 04:31, 11 September 2017 (UTC)
If it needs to be removed based on this rationale I don't have a problem with it. I only have a problem with "hey guy I'm here to police you because I don't think you know what you're talking about." I'm rarely on the other side of these situations, because I almost never remove the content of others unless it obviously needs to go. Pariah24 (talk) 04:35, 11 September 2017 (UTC)
@Pariah24 Here we go again with misrepresenting what I say. I'm not interested in policing you (though you seem to strongly believe that, which isn't correct) but in the quality of Wiktionary. I removed that information and added sourced pronunciations (or sourced already existing ones, whatever it was) because I could see that the IPA was incomplete, it didn't say which accent says [ɔːŋstrœm] and I knew for a fact that neither RP nor GA speakers use [œ] in loanwords, not to mention native words. Do I really have to repeat myself over and over again? It seems like I do. I'm tired of you twisting my words and lying about my actions. Mr KEBAB (talk) 18:11, 11 September 2017 (UTC)
It's especially aggravating when you're someone who has already gone through pains to use good sources. Pariah24 (talk) 04:39, 11 September 2017 (UTC)
@DTLHS: The last quotation in the entry is from 1957, and the small text under the definition mentions a redefinition of the meter in 1960, so they must have updated at least those parts of the entry since 1933. — Eru·tuon 05:03, 11 September 2017 (UTC)
Nobody likes to say this, but we rely heavily on original research (for English at least) and generally don't give a shit what secondary sources say. Which is why you will encounter so much hostility if you try to support something based on what a dictionary says. DTLHS (talk) 04:40, 11 September 2017 (UTC)
I honestly never noticed the little update warnings off to the right, so thanks for that. Clearly my attention to detail needs some work. Pariah24 (talk) 04:47, 11 September 2017 (UTC)
However on a second look it does say Previous version OED2 (1989). I think it just means it was first published in 1933. Pariah24 (talk) 04:53, 11 September 2017 (UTC)
@DTLHS That's clearly not the case here. I sourced the IPA on angstrom, it's not OR. Mr KEBAB (talk) 18:11, 11 September 2017 (UTC)
FWIW, there have been other cases where dictionaries prescribe a pronunciation that we can't find in use; bon appétit is one. If more dictionaries than just the OED prescribed the pronunciation mentioned here (and if they were consistent, e.g. in saying it was an RP pronunciation), then it might be appropriate to add a note like the one in bon appétit. - -sche (discuss) 04:20, 19 September 2017 (UTC)
Isn't it ALWAYS true that a term originating in a FL will sometimes be pronounced by a limited number of English speakers as it is in the FL? Those who pronounce it as in the FL would be limited to those who knew the FL or were repeatedly "corrected". If the pronunciation is not fairly common, why should it be included? DCDuring (talk) 05:03, 19 September 2017 (UTC)

WM language guides

Following is text extracted from a message posted on many WM mailing lists:

Many wikis in the Wikimedia world give editors suggestions about the correct usage of each respective language: orthography, register, punctuation, and so on.

I started a page to list such language guides: https://meta.wikimedia.org/wiki/Language_guides

I added a bunch of links to Hebrew there because that's my home wiki. I also added a few pages that I could find for Catalan, Indonesian, Russian, and Bosnian.

Please add your languages there! Surely there are dozens and dozens of missing links there.

Before you ask: The linked page explains why Wikidata is not very convenient for maintaining such a list, but if you think that you can put this nicely in Wikidata, be bold.

Thank you!

-- Amir Elisha Aharoni

Note that this is not the same as article style guides. I would think folks here would be interested. Potentially English usage guides would be a valuable resource and link target for us. Perhaps some of our usage notes and other material would be useful material for such usage guides in many languages. DCDuring (talk) 14:43, 10 September 2017 (UTC)

Erm, it appears that these are, in fact, just style guides... —Μετάknowledgediscuss/deeds 18:32, 10 September 2017 (UTC)
I think the glass is about 1/4 full. At least, Croatian, Hebrew, Indonesian, and Polish include grammar and/or common spelling errors. Afrikaans has something on translation errors. DCDuring (talk) 20:03, 10 September 2017 (UTC)

Draft strategy direction. Version #2

In 2017, we initiated a broad discussion to form a strategic direction that will unite and inspire Wikimedians. This direction will be the foundation on which we will build clear plans and set priorities. More than 80 communities and groups discussed and gave feedback[strategy 1][strategy 2][strategy 3]. We researched readers and consulted more than 150 experts[strategy 4]. We looked at future trends that will affect our mission, and gathered feedback from partners and donors.

A group of community volunteers and representatives from the strategy team synthesized this feedback into an early version of the strategic direction that the broader movement can review and discuss.

The second version of the direction is ready. Again, please read, share, and discuss on the talk page on Meta. Based on your feedback, the drafting group will refine and finalize the direction.

SGrabarczuk (WMF) (talk) 10:12, 11 September 2017 (UTC)

Merge Proto-Nuclear Polynesian and Proto-Eastern Polynesian into Proto-Polynesian

The Austronesian languages suffer from what you might call matryoshka grouping: each group has a branch which then branches further, which then branches further, and so on. You end up with a lot of branches which don't have very significant differences, and a lot of different proto-language entries with very similar content. It's made more complicated by the fact that some of the branches are less well established than others. To reduce this somewhat, I propose merging Proto-Nuclear Polynesian poz-pnp-pro and Proto-Eastern Polynesian poz-pep-pro into Proto-Polynesian poz-pol-pro. The differences between them are very small; each group is separated from its parent by only one or two sound changes, and the sound changes of individual languages are often more substantial than those separating the proto-languages. See *matuqa for example. Rapa Nui and Hawaiian are both in the Eastern Polynesian group, yet the former preserves the Proto-Polynesian form unchanged while the latter significantly changes it. Having separate entries for Proto-Polynesian *aka, Proto-Nuclear Polynesian *aka and Proto-Eastern Polynesian *aka is quite pointless. —Rua (mew) 22:47, 11 September 2017 (UTC)

So for some more background, what are the sound changes that supposedly differentiate PNP from PP, and PEP from PNP? --WikiTiki89 22:54, 11 September 2017 (UTC)
See w:Proto-Polynesian language. Nuclear has loss of *h and merging of *l and *r. Eastern also has *s > *h and partial loss of *q. —Rua (mew) 23:31, 11 September 2017 (UTC)
  • Oppose. We have terms that can be reconstructed to PNP but not to PPn. Why on earth would you make it impossible to enter reconstructible terms? —Μετάknowledgediscuss/deeds 23:53, 11 September 2017 (UTC)
  • Oppose. There are two factors that differentiate Polynesian languages from Indo-European and other Eurasian language families.
    1. Polynesian phonotactics are extremely averse to consonant change: in any Polynesian language I'm familiar with, there's no such thing as a consonant cluster; every syllable begins with either a vowel or a single consonant followed by a vowel, and there are simply no final consonants. That means that any consonant change that does happen is really significant.
    2. Millions of square miles of open ocean. It is physically possible to walk from the Scandinavian Peninsula all the way to India, but not from Samoa to Hawaii. Proto-Germanic spread over a wide area, but contact between dialect areas prevented it from splitting up into separate branches, for the most part. There are Polynesian island groups such as the Hawaiian Islands and Rapa Nui where there has been a colonization event or two, but no other contact with the outside world, ever. There are parts of Polynesia where island groups are close enough to allow periodic contact, but there are also plenty of island groups where other peoples were a subject of oral history, but never actually encountered until the Europeans showed up. There again, patterns of sound changes are probably reflective of actual population movements, not of areal influence or borrowing: you can't have areal influences if people within an area have never had any contact with each other. Chuck Entz (talk) 01:36, 13 September 2017 (UTC)
      I don't get what your point is. You seem to be saying that we should group languages based on how different they theoretically could be rather than how different they actually are. To me it seems that what we need is to determine whether there actually are fundamental enough differences between PP, PNP, and PEP that it would be infeasible to treat them under a single language. I don't personally have an answer to this, nor do I have any evidence to share, but let's not base this on theory. --WikiTiki89 02:40, 13 September 2017 (UTC)
  • Support at minimum for inherited terms. Multiplication of entries like Proto-Polynesian *wai, Proto-Nuclear Polynesian *wai and Proto-Eastern Polynesian *wai is useless. These are nothing more than a waste of effort that makes things harder to maintain. Trying to document every possible reconstructed proto-language as if they were attested natural languages is not a part of Wiktionary's mission; it is w:scope creep and should be avoided.
    — I could agree either way on items that actually are reconstructible only for a smaller group of languages, though, such as Proto-Nuclear Polynesian *nui. Having these under their proper proto-language categories etc. is more exact; but on the other hand, keeping around two "tiers" of proto-languages seems more complicated than is really necessary, since the context label (dialect label would be more exact) approach works for almost all needs. --Tropylium (talk) 20:48, 18 September 2017 (UTC)

There are over eight thousand entries in this category. Does anyone know what we are supposed to do with them? SemperBlotto (talk) 19:27, 14 September 2017 (UTC)

It is a project of TheDaveRoss. Maybe he can tell you. DTLHS (talk) 19:28, 14 September 2017 (UTC)
The cleaning means changing {{quote-text}} to one of the specific quote templates; it is not at all urgent. - TheDaveRoss 19:57, 14 September 2017 (UTC)

"Morphologically from the root" IP editor[edit]

There seem to be a variety of IP users editing Arabic entries and, among other things, adding the text "morphologically from the root x". See for instance this and this and this and this. As you can see, there are several different IP addresses, but to me their editing style looks the same, particularly the tagline above in the etymology sections, so they might be a single very high-tech person who knows how to mess with IP addresses, or a conspiracy of people. Sometimes the edits are okay, aside from formatting (no newlines between sections, and second-level reference sections, for instance). Often they're replacing specific definitions or etymologies with generic morphological ones (with the above tagline), or replacing Arabic templates with generic ones. With the last example, ⁧أَعْلَى(ʔaʕlā), they've radically reformatted the entry in an unconventional fashion, with pronunciation sections above etymology sections that share that pronunciation. It makes sense, but it's something that needs discussion (and there's the deletion of valuable etymological material). On the whole, their edits are full of various things that need to be corrected.

Anyway, I don't know what to do about this. At least the tagline gives some way to find their edits. I'll leave it at that. — Eru·tuon 10:06, 15 September 2017 (UTC)

Proposed first use of Wikidata: categorising planets

A while ago, {{senseid}} was added to entries with Wikidata ids, but with no actual Wikidata access, it was mostly a formality. Now that Wikidata access has been enabled, I've done some experimenting in Module:senseid, detailed at Wiktionary talk:Wikidata#A first experiment. The experiment was meant to find out how to use Wikidata information to categorise entries. Categorising entries this way offers the advantage that we don't have to think about which categories something belongs in. As long as the data is present on Wikidata, and the appropriate IDs are added to entries on Wiktionary, the categories can be added automatically. Think of it as an {{auto cat}} for individual senses: just plop in the Wikidata ID and the template will figure out what needs to be done. This method can only work for "set" categories, which contain things belonging to a particular set, usually indicated in Wikidata with the "instance of" property. It doesn't work for categories that contain terms related to a particular thing/topic. Semantic relatedness is lexical data, which is not currently present in Wikidata. Thus, we'll still have to manually add "topic" categories like Category:Astronomy.

Because of the rule that uses of Wikidata should be approved first, I did this experiment by using tracking categories to stand in for actual topical categories. My finding was that in general it works quite well, but Wikidata handles certain things idiosyncratically which our modules need to take into account when "translating" the information. For example, many things that are combined into one category on Wiktionary have several different Wikidata entities, such as our Category:Planets of the Solar System, which has three corresponding Wikidata entities, outer planet (Q30014), inner planet of the Solar System (Q3504248) and inner and outer planets (Q17362350). Wikidata makes frequent use of subclassing; writing code on our end to resolve sub/parent classes may help in these cases. Taxonomic data is also handled differently, with special taxonomic properties rather than the generic "subclass of" and "instance of" properties.

I would like to take a first step towards making it actually do something with the data. This needs to be approved in a discussion, so I hereby propose modifying Module:senseid/{{senseid}} to automatically place an entry into a language-specific Category:Planets of the Solar System, if it is given a Wikidata ID (Q followed by numbers) and if Wikidata indicates that the entity for this ID is a planet of the solar system. I'm choosing planets specifically because it's a very small set with exactly 8 known members, and the Wikidata data is known to be complete. This makes it easy to spot any problems if they arise. Extending the system for more categories is very easy; if you approve of doing it for more categories than just planets right from the start, please state so. —Rua (mew) 18:54, 15 September 2017 (UTC)
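
As a rough illustration of the check proposed above, here is a minimal sketch in Python (Module:senseid itself is written in Lua). The three class IDs are the ones listed earlier in this section. The `INSTANCE_OF` table, the function name `planet_category`, and the "instance of" value shown for Q313 are assumptions made for this example, standing in for a real Wikidata lookup.

```python
import re

# Wikidata classes named in the discussion as corresponding to
# Category:Planets of the Solar System.
PLANET_CLASSES = {"Q30014", "Q3504248", "Q17362350"}

# Hypothetical stand-in for a Wikidata lookup: entity ID -> "instance of" IDs.
# These values are illustrative only and were not fetched from Wikidata.
INSTANCE_OF = {
    "Q313": ["Q3504248"],      # Venus (ID from the discussion); class assumed
    "Q1000001": ["Q1000002"],  # made-up entity that is not a planet
}


def planet_category(lang_code, sense_id):
    """Return the planet category for a sense, or None if it does not apply."""
    # Only IDs of the form "Q" followed by digits are treated as Wikidata IDs.
    if not re.fullmatch(r"Q\d+", sense_id):
        return None
    classes = INSTANCE_OF.get(sense_id, [])
    # Plain set membership on ID strings decides whether to categorise.
    if PLANET_CLASSES.intersection(classes):
        return f"Category:{lang_code}:Planets of the Solar System"
    return None
```

Under these assumptions, `planet_category("zh", "Q313")` yields the Chinese planets category, while an ordinary sense ID such as `"Venus"` or an entity outside the three classes yields no category.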

I entirely approve and I especially like this example because it is something small and restrained (unlike e.g. adding it to every entry on a species) and it does include some possibly contentious data--i.e. the status of Pluto (which is not contentious to astronomers but is the sort of thing that could have actual individuals editing back-and-forth about it). —Justin (koavf)TCM 00:47, 16 September 2017 (UTC)
What benefit does this add? The members of the category are unlikely to change much over time, and if we did this it would require that every page which is not a planet checks to see if it is a planet, which seems like a lot of overhead. - TheDaveRoss 12:54, 18 September 2017 (UTC)
@TheDaveRoss: Your question answers itself: since this is a very small and stable use case, it will allow us to see in the wild how Wikidata integration would work. That is the value. —Justin (koavf)TCM 15:53, 18 September 2017 (UTC)
@Koavf: I don't disagree that this would be a good test, if it were the type of thing that I thought Wikidata was well suited for. I do not think that this is an example of a good use of Wikidata, however. If you have a small class of objects, you label the objects rather than querying all objects to see if they belong to that class. If you have a very large class, or one which changes often, then you query. - TheDaveRoss 11:39, 19 September 2017 (UTC)
It's not expensive at all to check this. All you need to do is retrieve the "instance of" property of the entity, and then check the IDs that you get for matches. IDs are just strings, so it's basic string matching, which is very fast. —Rua (mew) 15:48, 20 September 2017 (UTC)
It is expensive due to the volume, not the task. - TheDaveRoss 17:37, 20 September 2017 (UTC)
Yes, but structured data will vastly decrease overhead in the long term. —Justin (koavf)TCM 17:41, 20 September 2017 (UTC)
Only if we were currently using any overhead at all for this sort of thing, which we are not. I agree that judicious use of Wikidata will decrease overhead. - TheDaveRoss 17:46, 20 September 2017 (UTC)
Categorizing things is overhead. Someone has to actually do it. —Justin (koavf)TCM 18:23, 20 September 2017 (UTC)
  • I too question the utility of this. The inline markup is unaesthetic, cryptic and really seems out of place, and it offers practically no additional benefit to the current categorisation system. Furthermore, it rests on the assumption that words in various languages are direct equivalents of one another; they are not, for most words in a language. Sure, Venus may be translated as 太白星 and listing it in the translations table at Venus is acceptable, but the words are far from being the same, really. There are several names for Venus in Chinese, each with a nuance in meaning, and the same situation exists for nearly every other planet and star. Wyang (talk) 13:38, 18 September 2017 (UTC)
If 太白星 (Tàibáixīng) doesn't mean Venus, then why does the definition say Venus? Subtle nuances in meanings should be included in entries. But in this case it's simple: either it refers to that same ball of rock floating around the sun, or it doesn't. —Rua (mew) 14:17, 18 September 2017 (UTC)
It means Venus, but only in a Chinese astronomy, astrology, or Taoist context. It is already indicated with the label in the entry. It is the Grand White Star in traditional Chinese astronomy/folk religion, governed by the Grand White Star Lord. It is conveniently translated as Venus, but its definition really should be elaborated on in the future. Venus in modern Western astronomy and in the context of Taoist Wu Xing (five elements) is called 金星, Venus in Chinese astronomy and astrology is called 太白(星), Venus in the context of Taoist mythology is called 太白金星, Venus seen in the morning is called 啟明(星), Venus seen in the evening is called 長庚(星), etc. The nuances in most of the vocabulary in a language are simply too significant to allow a bijective map to another language, even for a simple term like Venus, especially if the two languages developed from cultures which historically had very few contacts. The principle of trying to map all words of a language to specific pre-defined semantic concepts, for the purpose of classification, is methodologically problematic. Wyang (talk) 23:09, 18 September 2017 (UTC)
@Wyang: Of course. But has an article named something on the planet *second-*closest to the Sun. What is that named? —Justin (koavf)TCM 23:29, 18 September 2017 (UTC)
@Koavf: You can't just say "of course" and wave that off. It's similar in Hindi: अरुण (aruṇ, Uranus in astrology), युरेनस (yurenas, Uranus the planet). Some Hindi purists also use अरुण for the planet. btw the closest planet to the sun is actually Mercury. That's not the main problem with this though, the real problem is the markup will become even more opaque and hidden from the casual editor. People wonder why we don't get many new editors, it's because it takes months to learn how to use all of our templates. —Aryaman (मुझसे बात करो) 23:57, 18 September 2017 (UTC)
@Aryamanarora: I'm not suggesting that the problem is simple or trivial: I'm suggesting that it's very complicated but that this is a good first step. Do you have a better solution? —Justin (koavf)TCM 00:06, 19 September 2017 (UTC)
If I did I would have to know enough Lua to implement it. —Aryaman (मुझसे बात करो) 00:08, 19 September 2017 (UTC)
@Aryamanarora: I'm not asking if you know the technical means (God knows I don't!), just what in principle would work better. Do you have any thoughts? I'd be happy to know what would work better even in a hypothetical sense. —Justin (koavf)TCM 04:26, 19 September 2017 (UTC)
I don't understand what you mean. Wyang (talk) 23:34, 18 September 2017 (UTC)
@Wyang: zh:w:金星 corresponds to en:w:Venus, so wouldn't that be the best word to use? Also, it's not a problem to use this senseid on more than one entry. —Justin (koavf)TCM 00:06, 19 September 2017 (UTC)
Well, it is a problem to try to assign foreign words to specific semantic labels in English: e.g. Venus on Wikidata, and try to systematically generate categories based on these crude equivalents. Like I said above, the best word to use depends on context. Very rarely do words in Chinese match with senses of a word in English exactly. Wyang (talk) 00:14, 19 September 2017 (UTC)
@Wyang: Then have both words in the category. That solves the problem. If one English word is a cognate*equivalent* to two words in another language, that's okay. (Or three or vice versa, etc.) —Justin (koavf)TCM 03:13, 19 September 2017 (UTC)
They can be both put in the same category, or any category, with the current categorisation system, without having to resort to such rigid equivalence sets. The current method is also superior in usability and aesthetics. P.S. See definitions of cognate. Wyang (talk) 04:05, 19 September 2017 (UTC)
@Wyang: It is not superior in usability because it can be exported across languages. With over 100 Wiktionaries and no fewer than 6,500 languages, using structured data to do any part of the work is far more usable. If the method of categorization that MediaWiki uses is superior, why did we ever launch Wikidata in the first place? —Justin (koavf)TCM 04:23, 19 September 2017 (UTC)
Pronunciation is templatisable, inflection is templatisable, entry layout is templatisable, but semantics is not templatisable or structurisable. Every sense of every word in a language corresponds to a semantic field or domain, and in the hypothetical 3D representation of human perception and cognition, it is a sphere in space, which spatially centres on the core, fundamental meaning of the term. What we are trying to do when we translate foreign words into English on Wiktionary is to find existing English terms with spatially close semantic areas to the source words; that's why definitions for Chinese words on Wiktionary are usually given with two or three English equivalents. Giving these multiple equivalents allows the reader to imagine the semantic area for the foreign term by superimposing the various English terms. As such, semantics across languages is not structured, and attempts to structurise it will only result in confusion and chaos. If languages were strict bijective mappings of words and grammar from one to another, machine translation would be a lot easier. Sure, it may work for water (in most languages), since the semantic fields for words for water are mostly spatially close, but it will fail for river, fluid, syrup. Wyang (talk) 04:46, 19 September 2017 (UTC)
@Wyang: So are you opposed to the notion of categorizing these terms? —Justin (koavf)TCM 16:20, 19 September 2017 (UTC)
I'm opposed to the notion of mapping senses of foreign words onto specific, pre-defined English semantic labels, and blindly achieving categorisation via those labels, as if the labels themselves were equivalent to the senses of the foreign words. Wyang (talk) 23:26, 19 September 2017 (UTC)
The Wikidata items aren't meant to encompass entire senses. They encompass referents of senses. Chinese may have different words for Venus, all with various nuances, but they all refer to the same ball of rock in space. The context in which they are used isn't relevant, that's a matter for context labels and usage notes. All that matters is that they fundamentally are different terms for that same ball of rock. So can you give concrete examples? Which terms refer to the planet and which don't? —Rua (mew) 23:45, 19 September 2017 (UTC)
They differ on a lexical level and these nuances will be reflected in their categories. The category of Category:zh:Planets of the Solar System is perfect as it is now. There is no need to dump all tens of synonyms of Venus, plus the names of all other planets in traditional Chinese astronomy into this category; these words, which are largely limited to traditional Chinese astronomy, should go into Category:zh:Planets of the Solar System in Chinese astronomy, or at least Category:zh:Stars and planets in Chinese astronomy (the reason the entry has the {{lb|zh|Chinese star}} label). One can easily adjust the categorisation in whatever way is most appropriate now. Putting an unattractive senseid next to the sense simply takes away this freedom and flexibility. Another example is the senseid at happiness, linked to Q8 on Wikidata which has 幸福 (xingfu) listed as the Chinese equivalent. This is unfortunate as xingfu is probably one of the hardest Chinese words to translate into English. Although it is typically glossed as happy; happiness, its connotations are hard to describe and not insignificant. English "I am very happy" and Chinese "我很幸福" have vastly different meanings. It would be quite silly to let the meaning conveyed by happiness blindly dictate the categories of the foreign words. Wyang (talk) 02:12, 20 September 2017 (UTC)
I oppose this, in particular the {{senseid|zh|Q313}} noise added to 太白星. Wiki markup should be free from identifier noise; it should be pleasant to edit directly. --Dan Polansky (talk) 13:53, 18 September 2017 (UTC)
It's not possible to use Wikidata without identifiers. —Rua (mew) 14:11, 18 September 2017 (UTC)
I support using Wikidata to categorize planets. One alternative idea might be using something like {{senseid|zh|Venus}} instead of {{senseid|zh|Q313}} with a data module that recognizes that "Venus" means "Q313". --Daniel Carrero (talk) 14:35, 18 September 2017 (UTC)
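
For illustration only, the alias idea floated here could look roughly like the following Python sketch (a real data module would be Lua). The `ALIASES` table and the function name `resolve_sense_id` are hypothetical; only the Venus/Q313 pair comes from the discussion. Unknown names return `None` rather than raising an error, which is one possible way to limit the error noise a string-to-ID mapping could cause.

```python
import re

# Hypothetical alias table; only the "Venus" -> "Q313" pair is from the discussion.
ALIASES = {"Venus": "Q313"}


def resolve_sense_id(value):
    """Return a Wikidata ID for a known alias or a literal Q-ID, else None."""
    # A literal Wikidata ID passes through unchanged.
    if re.fullmatch(r"Q\d+", value):
        return value
    # Otherwise look the name up; unknown names give None instead of an error.
    return ALIASES.get(value)
```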
I strongly oppose creating a module which maps strings to Wikidata identifiers. The Lua errors are rampant enough without going down that path. - TheDaveRoss 15:02, 18 September 2017 (UTC)
I'm not a fan of it either. In any case, the senseids themselves aren't a part of this proposal. This proposal is only about modifying {{senseid}} to use them for categorising. Having Wikidata IDs on entries is beneficial even if others decide they don't want {{senseid}} to categorise. —Rua (mew) 15:08, 18 September 2017 (UTC)
Sure, I take back the idea of mapping things like "Venus" = "Q313". I prefer using "Q313" anyway, that was just an alternative idea. --Daniel Carrero (talk) 21:26, 18 September 2017 (UTC)
I strongly agree with Wyang, especially his point regarding the fact that Chinese has multiple names for Venus, each with its own connotations. --WikiTiki89 21:19, 18 September 2017 (UTC)
Is there any Chinese name for Venus that shouldn't get categorized in Category:zh:Planets of the Solar System? This is a categorization proposal, so I'd like to know how the nuances of each name affect categorization. --Daniel Carrero (talk) 21:26, 18 September 2017 (UTC)
Oppose per Dan Polansky and Wyang. —Aryaman (मुझसे बात करो) 21:22, 18 September 2017 (UTC)
See d:User:Amgine for a very old commentary on Wikidata, which aligns with Wyang's opposition. Feel free to expand if you can. - Amgine/ t·e 01:54, 21 September 2017 (UTC)

Split RfD by English/non-English as we have with RfV

I propose that we split Wiktionary:Requests for deletion into Wiktionary:Requests for deletion/English and Wiktionary:Requests for deletion/Non-English, just as we have done with Wiktionary:Requests for verification. RfD is presently over 425K, and although I can't say offhand what proportion is non-English, I would estimate it at somewhere over one third. As with RfV discussions, examination of English and non-English entries, of course, requires different skill sets, and a different set of editors are typically attracted to each kind of discussion. bd2412 T 02:04, 17 September 2017 (UTC)

When I proposed the split of RFV, I considered this as well but ultimately rejected it. The fact is that if you don't know at least a little Japanese, you just can't be of any use in gathering Japanese quotations or assessing whether they're uses. However, anyone who understands how the SOP concept works can look at a Japanese word broken into its component parts and, once shown that 茶色の葉 is 茶色(ちゃいろ) (chairo, brown colour) + の (no, possessive connector) + 葉(は) (ha, leaf) and means "brown leaf", see that it would be inappropriate to have a Wiktionary entry for it. That's why everyone can contribute at RFD, and why we should focus on clearing up the backlog by making judgement calls on whether a consensus has been reached rather than splitting the page. —Μετάknowledgediscuss/deeds 03:58, 17 September 2017 (UTC)
Just as an academic question, doesn't RfD address issues other than SOP-ness? bd2412 T 01:28, 24 September 2017 (UTC)
Yes, but much less commonly. —Μετάknowledgediscuss/deeds 03:53, 24 September 2017 (UTC)
Support, WT:RFD is too large already. --Daniel Carrero (talk) 21:37, 18 September 2017 (UTC)
Support --Backinstadiums (talk) 07:04, 21 September 2017 (UTC)
Support and I also believe scriptio continua languages and some other language groups require CFI different from English. BTW, @Metaknowledge: Japanese idiomatic terms may get a possessive particle の, e.g. 木(こ)の葉(は) (konoha) or 木(き)の葉(は) (kinoha). The 2nd one looks especially like SoP ("leaf of the tree") but both terms are considered idiomatic. Languages such as Korean or Arabic, etc. (both use spaces between words) may have non-words written together, with no spaces between them, such as clitic prepositions, pronouns, etc. (Arabic) - فَقَالَ (faqāla, and (he) said) = فَ + قَالَ, غُرْفَتِي (ḡurfatī, my room) = غرفة + ي, particles and copulas (Korean) - 한국어로 (han'gugeoro, “in Korean”) = 한국어 + 로, 학생입니다 (haksaeng'imnida, “(someone) is a student”) = 학생 + 입니다. I do agree, however, that one can take part in discussions without a thorough knowledge of a given language but one has to learn fast and listen to arguments of native speakers or advanced learners. --Anatoli T. (обсудить/вклад) 07:44, 21 September 2017 (UTC)
Support. — Ungoliant (falai) 12:52, 21 September 2017 (UTC)
Support. --Canonicalization (talk) 13:19, 21 September 2017 (UTC)
Support. --Robbie SWE (talk) 18:18, 21 September 2017 (UTC)

For reference:

I don't see any particular need. RFD is nowhere near as huge as RFV was when we split. --WikiTiki89 18:19, 25 September 2017 (UTC)
I think the relative sizes of both pages have fluctuated and crossed over one another from time to time. bd2412 T 19:59, 27 September 2017 (UTC)
As I see it, there are two negatives to consider: first, of course, is the burden of working with a large page, but the other one hasn't been mentioned yet: splitting by language means requiring a language code in the {{rfd}} template and cleaning things up when people post to the wrong page. This represents yet another way that those who don't know the finer points of our templates can get tripped up while doing basic tasks. Chuck Entz (talk) 21:24, 27 September 2017 (UTC)
Since the nominations are not transclusions, would we really need a language parameter? I would think that we could just have two different templates, one for English, one for everything else. bd2412 T 15:16, 29 October 2017 (UTC)

Done. Discussion results:

  • 7 supports (counting the OP) + 0 opposes, or maybe 3 opposes (nobody actually voted oppose but there are three oppose-ish comments)
  • support count is from 70% to 100% (the latter is if you count the 3 non-opposes as abstentions or non-votes) as per above, either way it passes
  • we had 2 months, 1 week and 1 day to discuss this (technically the discussion is never over, this is not a formal vote)
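
The 70% to 100% range quoted above follows from the two ways of treating the three oppose-leaning comments. A short Python check of that arithmetic (the variable names are made up for this example):

```python
supports = 7
oppose_ish = 3  # comments that lean oppose without a formal vote

# Counting the oppose-leaning comments as opposes.
low = supports / (supports + oppose_ish)
# Treating them as abstentions or non-votes.
high = supports / supports

print(f"{low:.0%} to {high:.0%}")  # prints "70% to 100%"
```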


  1. I split the page as follows: Wiktionary:Requests for deletion (disambiguation page), Wiktionary:Requests for deletion/English and Wiktionary:Requests for deletion/Non-English.
  2. I created the shortcuts WT:RFDE, WT:RFD:en and WT:RFDN (based on the RFV ones: WT:RFVE, WT:RFV:en and WT:RFVN).
  3. I edited Template:rfd to automatically link entries to the right page ("English" or "Non-English") based on the language code, just like Template:rfv.
  4. I edited Template:rfd to make the language code mandatory or else you get a module error, just like Template:rfv. (see note below)

Note about point 4: Both Template:rfv and Template:rfd currently need the langcode or else you get a module error, but this does NOT have to be the case. I personally prefer the langcode mandatory, yes, so I'm happy with this situation, but if people disagree with this, we can simply make the langcode optional again. The only problem is that without the langcode the templates can't automatically redirect to the right page, but they still can point to the disambiguation pages by default: WT:RFV and WT:RFD.

Both Wiktionary:Beer_parlour/2017/April#Splitting_WT:RFV and Wiktionary:Beer_parlour/2017/May#WT:RFV_is_now_split say nothing (unless I'm mistaken) about the mandatory langcode. The mandatory langcode in both rfv and rfd was never the result of a consensus in a discussion or vote, and presumably can easily be confirmed or overturned in future discussions or votes. --Daniel Carrero (talk) 13:43, 25 November 2017 (UTC)
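
The optional-langcode alternative described in the note on point 4 could be sketched as follows. This is a Python illustration under assumptions, not the actual Template:rfd logic (which lives in wikitext/Lua and currently raises a module error when the code is missing). The function name `rfd_page` is hypothetical; the page names are the ones given in the list above.

```python
def rfd_page(lang_code=None):
    """Pick the RFD discussion page for a nomination (illustrative only)."""
    base = "Wiktionary:Requests for deletion"
    if not lang_code:
        # No language code: fall back to the disambiguation page.
        return base
    if lang_code == "en":
        return base + "/English"
    # Every other language code goes to the shared non-English page.
    return base + "/Non-English"
```

With this fallback, a nomination without a language code would still link somewhere useful, at the cost of an extra click from the disambiguation page.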

The main problem at the moment is that having the main rfd page on one's watchlist doesn't automatically put the new pages on one's watchlist. That means anyone who's offwiki until the change to the main rfd page goes off the edge will be unaware of anything going on with the new pages. You should check how @Rua's been doing it with the monthly subpages. I believe it involves copying the main page to the new page, then recreating the main page from scratch. I believe you could delete the new pages, copy the main page over it, then do a history merge by restoring the revisions you want to keep to the history of the new page, then undo to the revision just before the deletion. I haven't done this kind of thing enough to be sure it will work as advertised, so it might be good to have backups of the wikitext, just in case.
As for the langcode, my pet peeve with the rfv template is that it defaults to non-English with no langcode, but it makes more sense to default to English. I don't remember the last time I cleaned up an rfv module error in a non-English section, but I've done several English ones. Chuck Entz (talk) 20:59, 25 November 2017 (UTC)
I should probably post the script I use to make new monthly pages at the end of each year, so that others can do it if I ever don't. What would be a good place to put it? —Rua (mew) 21:15, 25 November 2017 (UTC)Reply[reply]
I sure don't like it. I must have seen this discussion, though, and I don't remember why I did not post anything; I guess I feel tired of opposing all the changes that I do not like. --Dan Polansky (talk) 11:05, 26 November 2017 (UTC)Reply[reply]

Upcoming Wiki Science Competition

Did you hear about the Wiki Science competition, starting in November?

The competition will focus on images, but it might evolve in the near future, so users of other content platforms should take a look at it.

I've informed the village pump on Commons, since there will be an intense flow of technical images uploaded by newbies, which will require better categorization and translation of descriptions here and there. More importantly, the images can be used in articles on specific platforms. I am thinking of those of your users who have created and maintain many technical and scientific entries and are still active, such as User:SemperBlotto.

Here are some details.

In 2015, limited to Europe, we got thousands of entries; we can expect two or three times more this year. In the case of Italy, for example, we will send emails to many professional mailing lists, and other national Wikimedia chapters will use their social media to inform the public as well.

We have finished preparing some of the juries with Ivo Kruusamägi of WM Estonia. I did my best to gather, besides people with a strong scientific background, some expert Wikipedians here and there (because I asked first on Wikipedia) to look at the files on Commons and not just at the quality of the images. I have also informed users on English Wikipedia and English Wikiversity, and will do the same on some other Wikimedia platforms in the following weeks.

The final international jury is made up of expert researchers, usually with an interest in photography but no strong knowledge of the details of any Wikimedia platform. The main goal was to enlarge the network of "friends" of Wikimedia platforms. Some national juries should probably have enough expert Wikimedians and Wikipedians, I guess because an active national chapter was involved in setting them up, so someone might take care of some of the uploads, at least by improving some descriptions and/or using them directly, and sometimes by suggesting technical entries to be created too.

More generally, gathering users besides Wikipedians will probably help us include more platforms in the competitions.

Now that I am sure that we have enough "scientists" here and there and from different fields, maybe we can see whether we can also gather expert Wikimedia users specifically, whatever their background. For example, ordinary teachers rather than researchers, who can evaluate the quality of the images for more specific uses.

For the countries without juries, there is the possibility of creating a second-level jury to select images from the rest of the world for the experts of the final jury. For such a second-level jury I have found some names, but the number of entries could be really high, so maybe that's where we can look for more standard Wikimedia users.

If you are a citizen of a country with a national jury, you could also join it directly (rumor has it that more will appear). In many cases I don't know the details, such as whether they need more jurors or are fine as they are.

Anyone interested?--Alexmar983 (talk) 05:59, 18 September 2017 (UTC)Reply[reply]

I am not interested in being part of a jury, but I thought it very interesting that you knocked here. Pictures made with Wikipedia use in mind are quite different from pictures usable in Wiktionary. Here we also need to illustrate verbs and the actions performed with tools (not only the tool itself), and more. I'll be very enthusiastic about integrating pictures from this competition into Wiktionaries if they fit our needs! Noé 07:36, 19 September 2017 (UTC)
With thousands of uploads, statistically some could fit some needs here too... Noé, I am happy if more people take a look; this should give better feedback for the future, when the competition will be bigger and we can make it more specific to the needs of some wiki platforms, for example with edit-a-thons. In the meantime, I have found another juror on frwikipedia, and I am close to finalizing the second-level jury. I am "sad" that no one from Wikiversity has replied yet.--Alexmar983 (talk) 12:56, 19 September 2017 (UTC)
Cool. Maybe you can try to ping the French Wikiversity, if some French Wikipedians can assist you Noé 13:24, 19 September 2017 (UTC)Reply[reply]

Wiktionary User Group


The Tremendous Wiktionary User Group is a coalition of users of Wiktionaries that aims to create a common platform for sharing ideas and documents. It is also a way to lobby the Wikimedia Foundation to acknowledge the needs of our projects in terms of technical improvements.

This User Group is completing a revolution, its first year of existence! We are writing our first Annual Report (due September 26th). It's time to look at what was done during the year and to frame future lines of action. There are 42 affiliates now, but the group can include many more people. I invite you to read our work and to see if you want to participate in our actions. The most visible one is LexiSession, but there is much more to do, including promotional material (leaflets, banners, stickers, etc.), inter-Wiktionary collaborations (on templates, Wikidata, policies and guidelines) and meet-ups! There are no fees and no admission process; it's open to everyone who likes Wiktionary and wants to do more for this project. Your ideas and initiatives are welcome!

Thank you for your attention, I hope to see you soon Noé 08:14, 19 September 2017 (UTC)Reply[reply]

Help review PulauKakatua19 (talkcontribs)'s entries

This user is editing in way too many languages for them to possibly understand all of them. I have checked and fixed all of the recent edits in Hindi, Bengali, and Sanskrit, but someone acquainted with Indonesian, Malay, and now Korean should check the rest. Atitarev (talkcontribs) warned them on their talk page about Russian a while ago too. —Aryaman (मुझसे बात करो) 01:00, 21 September 2017 (UTC)Reply[reply]

I will check their Chinese, Korean and Malay ones. Wyang (talk) 01:11, 21 September 2017 (UTC)Reply[reply]

Gfarnab (talkcontribs) back at it again

e.g. A recent error at . Someone please block them. Wyang (talk) 01:13, 21 September 2017 (UTC)Reply[reply]

Could someone add this to Wiktionary:Votes/Active? Thanks. --2A02:2788:A4:F44:AC35:948A:635A:9569 18:16, 21 September 2017 (UTC)Reply[reply]

Before any such thing (I don't even think anons can create votes to start with), have you at least asked @SemperBlotto if he's interested? --Robbie SWE (talk) 18:21, 21 September 2017 (UTC)Reply[reply]
No, he isn't. SemperBlotto (talk) 20:17, 21 September 2017 (UTC)Reply[reply]
That's what I suspected. The vote is therefore useless and will be deleted. --Robbie SWE (talk) 20:19, 21 September 2017 (UTC)Reply[reply]

Parentheses in IPA

I really wish people would stop inserting these back into pronunciations. The only acceptable IPA use of parentheses is in w:ExtIPA, and those are subscript parentheses to represent partial devoicing. In fact, I've searched through various linguistics databases and can't find much evidence even of non-IPA uses, except their occasional use to denote silent articulation. Obviously this doesn't apply to the case in which Wiktionary editors most commonly try to use parentheses (optional articulation of ⟨ɹ⟩). It may be helpful, but it's wrong. The entry should show either both possible pronunciations separately or a more specific phonetic pronunciation. If people are going to keep using them, then Wiktionary as a whole needs to stop claiming they are using IPA's system, and admit they have their own in-house system. It's rather disrespectful to the creators of a standard to cherrypick what you wish to use. If the IPA thought parentheses were really that important, don't you think they would have standardized them by now? Pariah24 (talk) 20:31, 21 September 2017 (UTC)

Of course, IPA is disrespectful to the creators of the Latin alphabet, by the way they cherrypick that alphabet. Phonetic alphabets are used in great variation throughout the world, including many, many minor variants on IPA. And our use of parentheses has precedent; we have for crater /ˈkɹeɪ.tə(ɹ)/, and Kenyon and Knott's A Pronouncing Dictionary of American English (1953) has (among others) ˈkɹetə(r --Prosfilaes (talk) 21:52, 21 September 2017 (UTC)
@Pariah24 It's not wrong. Peter Ladefoged transcribes the unstressed form of the as [(ð)ə] in broad phonetic transcription in the Handbook of the IPA (chapter 'American English'). Just because something is not officially endorsed (and I'm not so sure of that, have you tried asking the IPA itself?) it doesn't mean that it's wrong or that it shouldn't be used. Unless I'm missing something?
You're also a bit inconsistent in your edits. In martyr, you transcribed the AuE pronunciation /ˈmɑːtəɹ/, /ˈmɑːtə/, [ˈmäːtə], [ˈmäːɾə]. The order was wrong, as the pronunciation with the final /ɹ/ is marked and used only immediately before vowels, not the other way around. Also, the way the final [ɹ] is omitted in phonetic transcriptions suggests that it's there phonemically but not phonetically, which is of course completely wrong (if anything, again, it's the other way around). I've fixed that for you. Mr KEBAB (talk) 07:39, 27 September 2017 (UTC)Reply[reply]

Braille entries

Should we reformat Braille entries like this?--2001:DA8:201:3512:BC46:AD88:D9A7:3939 16:39, 22 September 2017 (UTC)Reply[reply]

(@Daniel Carrero suzukaze (tc) 00:47, 23 September 2017 (UTC))
Mostly support. My opinion is this:
  1. I would suggest, in normal letter entries like a and also Braille letter entries like (which is Braille for "a") deleting all Latin script sections like Spanish, Portuguese, Italian, etc. because they clutter the entry and are basically infinite. The Translingual section can explain the Latin script letters.
  2. But, in Braille entries like the aforementioned , I would support keeping separate sections for Japanese, Arabic, Hebrew, etc. and other non-Latin script entries as opposed to keeping them all in the Translingual section.
  3. I would also support using proper categorization like Category:Arabic letters in Braille script (current redlink), with the written language and script. I would suggest using Category:Arabic letters in Arabic script (self-explanatory) for the normal alphabet.
--Daniel Carrero (talk) 03:41, 23 September 2017 (UTC)Reply[reply]
@Daniel Carrero: I agree. --Backinstadiums (talk) 08:04, 23 September 2017 (UTC)Reply[reply]

User:Aryamanarora has been nominated for adminship. Please voice your opinion on the page. Thanks! Wyang (talk) 13:56, 23 September 2017 (UTC)Reply[reply]

Denoting long aspiration

In Northern Sami, there's a set of preaspirated consonants, but these consonants can be lengthened as well. When they are long, it is the preaspiration that lengthens rather than the occlusion itself. Usually, I've seen the preaspiration transcribed with just the letter h, e.g. hp, so that long preaspiration then becomes a matter of writing hːp. The few Northern Sami transcriptions that we have, and those on Wikipedia, use the superscript ʰ instead. I prefer the superscript, but writing ʰːp is probably less than ideal. I've written ʰpː instead in these occasions, but it doesn't really reflect the phonetic reality that it's the aspiration that lengthens. Any ideas? —Rua (mew) 18:29, 23 September 2017 (UTC)Reply[reply]

How about hhp? — Eru·tuon 19:09, 23 September 2017 (UTC)Reply[reply]
That's more or less equivalent to hːp. I would prefer to avoid h because there's also an actual phoneme /h/, and it's not part of these preaspirated consonants. /hːp/ is one phoneme, so I'd like it if the transcription reflected that. —Rua (mew) 19:14, 23 September 2017 (UTC)Reply[reply]
You call it "less than ideal", but from your description I don't see much choice beside ʰːp, unless it's ʰʰp. —Aɴɢʀ (talk) 14:37, 24 September 2017 (UTC)Reply[reply]
Yeah. I just hoped someone would think of something I hadn't thought of yet. @Tropylium any ideas? —Rua (mew) 15:14, 24 September 2017 (UTC)Reply[reply]
In phonetic transcription there should be no problem with [hːp]. Phonetically there is no difference between [ʰ] and [h]. Even phonemically /hːp/ might be feasible. It is not universally agreed that these are unitary consonants; some analyses do consider them clusters /h/+/p/, in part precisely because it's the aspiration that lengthens and not the closure (similar to how the long counterpart of clusters such as /sk/ is /sːk/). In any case there is no contrast between /hp/ versus /ʰp/. --Tropylium (talk) 19:24, 24 September 2017 (UTC)Reply[reply]
Ok, I'll just go with /hːp/ then. Thank you. —Rua (mew) 19:50, 24 September 2017 (UTC)Reply[reply]
I would have picked /ʰʰp/, but it doesn't matter too much. --WikiTiki89 18:30, 25 September 2017 (UTC)Reply[reply]


Do we have a language code for this under a different name? Used on জাৰ, নিগনি, translation at winter @Sagir Ahmed Msa, Aryamanarora. DTLHS (talk) 19:11, 23 September 2017 (UTC)Reply[reply]

Also "Mymensinghiya", used on light. DTLHS (talk) 19:16, 23 September 2017 (UTC)Reply[reply]
AFAIK both of these are Bengali dialects... Sagir probably knows better than me. —Aryaman (मुझसे बात करो) 20:11, 23 September 2017 (UTC)Reply[reply]
@Aryamanarora: nope, there's no code for these languages. Yes these are considered as Bengali dialects just like Sylheti, Chittagonian, Rajbongsi etc (Wiktionary has code for these). Chakma and Rohingya (both are very closely related to Chittagonian) are not considered as Bengali dialects probably because their native speakers are not considered as Bengali people. But these languages are not actual dialects. Some of these are more closely related to other languages than standard Bengali. Similar for Assamese dialects (I mentioned Kamrupi language in মেকুৰী, which is considered as an Assamese dialect). I think just like Sylheti, Chittagonian etc, these languages should also have codes. They have different phonology, grammar even origins. The Dinajpuria and Mymensinghia words are not present in Rarhi-Nadia (standard Bengali), these are also closer to Rajbongsi and Sylheti respectively. Please check

User:Sagir Ahmed Msa

@Sagir Ahmed Msa: Is grammar significantly different in these lects from Rarhi-Nadia? I'll admit I know little about Eastern Indo-Aryan, it's just ISO is usually generous with codes (e.g. a bunch of Hindi lects are given codes when they are often considered to be dialects). If we did add a code, bn-dnj etc. would be fine right? —Aryaman (मुझसे बात करो) 17:14, 25 September 2017 (UTC)Reply[reply]
@Aryamanarora: yes you can make codes with "bn-" since they are generally considered as Bengali dialects.

-- Sagir

@Aryamanarora The code would be inc-dnj (see Wiktionary:Languages#Language_codes). DTLHS (talk) 19:10, 25 September 2017 (UTC)Reply[reply]
@DTLHS: Whoops, typo on my part. But why not bn- since they are often considered Bengali dialects? That's probably not supposed to be argued about here though. —Aryaman (मुझसे बात करो) 19:12, 25 September 2017 (UTC)Reply[reply]
Bengali is not a language family. DTLHS (talk) 19:15, 25 September 2017 (UTC)Reply[reply]
Yes, I understand that now. So could we add inc-dnj (Dinajpuria) and inc-mym (Mymensinghiya), both with script Beng and ancestor inc-mgd? —Aryaman (मुझसे बात करो) 19:18, 25 September 2017 (UTC)
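The point made in this exchange, that a new code needs a family prefix such as inc- rather than a language prefix such as bn-, can be illustrated with a small check. This is only a sketch under assumptions: the two code sets are tiny hand-picked samples rather than the real on-wiki data, the accepted shape (a short prefix, a hyphen, three letters) is a simplification of Wiktionary:Languages#Language_codes, and `check_exceptional_code` is a hypothetical name.

```python
import re

# Tiny illustrative samples, NOT the real on-wiki language data.
FAMILY_CODES = {"inc", "gem", "roa"}
LANGUAGE_CODES = {"bn", "en", "hi"}


def check_exceptional_code(code):
    """Classify a proposed code of the (assumed) form prefix-xxx."""
    match = re.fullmatch(r"([a-z]{2,3})-([a-z]{3})", code)
    if not match:
        return "malformed"
    prefix = match.group(1)
    if prefix in FAMILY_CODES:
        return "ok"
    if prefix in LANGUAGE_CODES:
        return "prefix is a language, not a family"
    return "unknown prefix"
```

Under these assumptions, `check_exceptional_code("inc-dnj")` is accepted, while `check_exceptional_code("bn-dnj")` is rejected because bn names a language.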
I am concerned that there is no information about these languages / dialects online- not even mentions of the language names. Since they would be WT:LDLs, what references would be used to support entries? DTLHS (talk) 19:21, 25 September 2017 (UTC)Reply[reply]
@DTLHS: [1] seems promising, and attests to the lack of mutual intelligibility between these dialects... But I am not sure whether they deserve codes. Would {{lb|bn|...}} not suffice @Sagir Ahmed Msa? —Aryaman (मुझसे बात करो) 19:27, 25 September 2017 (UTC)Reply[reply]
@Aryamanarora, Aryaman:

Here are some examples. Unfortunately I couldn't find Mymensinghiya tenses, so I'm comparing with Dhakaiya; they are closely related to each other, and Mymensinghiya is more distinct from Standard Bengali than Dhakaiya.

  • English :
  1. I do.
  2. I am doing.
  3. I did.
  4. I was doing.
  5. I will do.
  6. I will be doing.
  • Dhakaiya :
  1. Ami kôri.
  2. Ami kôrtasi.
  3. Ami kôrsi/kôrsilam.
  4. Ami kôrtasilam.
  5. Ami kôrmu.
  6. Ami kôrtê thakum.
  • Bengali:
  1. Ami kôri.
  2. Ami kôrchi.
  3. Ami kôrêchi.
  4. Ami kôrchilam.
  5. Ami kôrbo.
  6. Ami kôrtê thakbo.
  • Assamese:
  1. Môi kôrû.
  2. Môi kôri asû. (kôri = kôrat)
  3. Môi kôrisû/kôrisilû.
  4. Môi kôri asilû.
  5. Môi kôrim.
  6. Môi kôri thakim. (kôri = kôrat)
  • Rangpuri/Rajbongsi/Kamata:
  1. Muĩ kôrû.
  2. Muĩ kôrûsû.
  3. Muĩ kôrsinû. (And kôrsû?)
  4. Muĩ kôrûsinû.
  5. Muĩ kôrim.
  6. Muĩ kôrtê thakim.

-- Sagir

Another veteran editor of Wiktionary, User:Justinrleung, has been nominated for adminship. Please voice your opinion on that page. Thanks! (Vote closes on 8 Oct.) Wyang (talk) 00:39, 24 September 2017 (UTC)Reply[reply]

Should sense ids be distinct across pages?

Israel and State of Israel have the same sense id. I can’t imagine that this will cause any problem, since a sense id will presumably always accompany a pagename, or do we want to ensure universal uniqueness? — Ungoliant (falai) 13:43, 25 September 2017 (UTC)Reply[reply]

In principle, it might take pagename + etymology + PoS + senseid to guarantee uniqueness in English, at least if the senseid is poorly chosen (eg, noun and verb spelled the same each used by itself in two different senseids for different PoSes). In the absence of etymology, pronunciation might be required. In some FLs gender might be needed. I wonder what requirements exist in other languages. This seems messy.
Do we have anyone running comprehensive checks for this kind of thing against the XML dumps? For example, I use {{sense|genus}} under synonyms, hypernyms, and hyponyms header in taxonomic entries, but I sometimes need to differentiate by taxonomic family, order, etc. to ensure uniqueness. Have I always done so? I haven't been checking for that. DCDuring (talk) 17:00, 25 September 2017 (UTC)Reply[reply]
Doesn't the same kind of problem exist to a vastly greater extent in FL sections where definitions consist of a single polysemic English word, with no disambiguating gloss? DCDuring (talk) 17:06, 25 September 2017 (UTC)Reply[reply]
Yes, it does. That’s a major problem with our FL content. Our definitions in certain languages (Italian and Spanish come to mind) are still too poor to be used as my primary source of information. — Ungoliant (falai) 17:19, 25 September 2017 (UTC)Reply[reply]
I try to fix these for Dutch whenever I spot them, but it's an uphill battle. Finding them is difficult enough. —Rua (mew) 18:05, 25 September 2017 (UTC)Reply[reply]
(edit conflict) The only thing that has to be unique is the combination of page name, language, and sense id. Sense ids appear in a link to an entry that contains the entry name, the language name, and the bit of sense id text; they are not used without a language name or as a substitute for a page name. So they do not need to be unique across pages; if they were, they would probably be too long or unintuitive. I think they should be as short as possible, because they have to be plugged into the |id= parameter of link templates. (However, I just searched and discovered a very long sense id for radical in English: linguistics: portion of a character that provides an indication of its meaning. Oh well.) — Eru·tuon 18:12, 25 September 2017 (UTC)Reply[reply]
Can we agree that senseids must be unique within a language section? — Ungoliant (falai) 16:38, 27 September 2017 (UTC)Reply[reply]
They wouldn't work if they weren't. --WikiTiki89 16:46, 27 September 2017 (UTC)Reply[reply]
Sounds like a good rule. DCDuring (talk) 18:06, 27 September 2017 (UTC)Reply[reply]
Yes, that is a restatement of what I meant by "the combination of page name, language, and sense id must be unique". It may not have been very clear. — Eru·tuon 18:47, 27 September 2017 (UTC)Reply[reply]
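The rule agreed on in this thread, that the combination of page name, language, and sense id must be unique, could be checked against page text roughly as follows. This is a minimal sketch, not an existing tool: it assumes sense ids are written as `{{senseid|lang|id}}` with exactly two positional parameters, it takes a plain dictionary of page title to wikitext instead of parsing an XML dump, and `duplicate_senseids` is a hypothetical name.

```python
import re
from collections import Counter

# Simplifying assumption: {{senseid|<lang>|<id>}} with two positional parameters.
SENSEID = re.compile(r"\{\{senseid\|([^|{}]+)\|([^|{}]+)\}\}")


def duplicate_senseids(pages):
    """Return (title, lang, id) keys that occur more than once.

    pages maps a page title to its wikitext.
    """
    counts = Counter()
    for title, text in pages.items():
        for lang, sense in SENSEID.findall(text):
            counts[(title, lang.strip(), sense.strip())] += 1
    return sorted(key for key, n in counts.items() if n > 1)


# Made-up wikitext for illustration; the ids are not the real ones.
sample = {
    "Israel": "{{senseid|en|state}}",
    "State of Israel": "{{senseid|en|state}}",
    "bank": "{{senseid|en|money}} {{senseid|en|money}} {{senseid|nl|money}}",
}
print(duplicate_senseids(sample))
```

Because the page title is part of the key, the same id on Israel and State of Israel is not reported; only the repeated English id on the made-up "bank" page is.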

Modern Greek terms spelt with Latin characters

@Xoristzatziki has just speedied the Greek entry at marketing. I assume the reason is that the word is not written in Greek letters. Since the entry in question was reviewed by experienced editor Saltmarsh, and since marketing is trivially attestable in running text, I think its inclusibility should be at least discussed. — Ungoliant (falai) 16:26, 27 September 2017 (UTC)Reply[reply]

Yes - and I thought hard about it at the time. I have frequently considered raising the subject here (TLDR generally stops me). As an ageing Englishman I can feel annoyed at my ironic language being mangled by others (I heard an Englishman say "crawfish" on the radio this morning - I'm sorry, we say "crayfish"); I can also understand @Xoristzatziki's anger when the same thing is happening to his language. On Greek web pages it is the same: the first supermarket site I look at has "FRANCHISE" and "CLUB CARD" (and "SUPER MARKET"). When I go to Greece I feel sad that packaging and billboards are similarly invaded. A quick look at my w:Babiniotis Dictionary shows the Latin script "status quo" (we have it as an English term as well as Latin) and other Latin terms, only a few English terms (I only find NATO in the time available). To pick an easy example: "weekend" is common in Greek text (the Académie française fought against it for years); we have entries for 5 other languages, and it even declines in Polish! But perhaps marketing is better than μάρκετινγκ ? — Saltmarsh. 06:15, 28 September 2017 (UTC)

There is no such term Modern Greek terms spelt with Latin characters. Please do not try to alter a language out of nothing. If you think Greeklish should be a new language in wiktionary, make a proposal. --Xoristzatziki (talk) 16:33, 27 September 2017 (UTC)

As a descriptive dictionary, if a word is used in texts by Greek speakers it can be included. @Saltmarsh DTLHS (talk) 16:42, 27 September 2017 (UTC)Reply[reply]
In that sense all English words should contain a Greek section... And sections for all other languages also... Or only Greeks and Cypriots use in texts signs and words from English? Chinese do not do it? --Xoristzatziki (talk) 16:54, 27 September 2017 (UTC)Reply[reply]
Category:Chinese terms written in multiple scripts, Category:Chinese terms written in foreign scripts. DTLHS (talk) 16:59, 27 September 2017 (UTC)Reply[reply]
This is something else. Please do not confuse us. We are talking about the usage of real English words. Not for terms that cannot be otherwise identified (σ鍵). There is not a single English word in the above mentioned category although ex. fast-food is written, as stand alone word, in more Chinese restaurants around the world than marketing is written in Greek "googloid" texts. --Xoristzatziki (talk) 17:08, 27 September 2017 (UTC)Reply[reply]
Look at the second category (Category:Chinese terms written in foreign scripts), especially band, size, and friend. --WikiTiki89 17:26, 27 September 2017 (UTC)Reply[reply]
@Ungoliant MMDCCLXIV Could you link some examples of "marketing" being used in Greek texts? DTLHS (talk) 17:11, 27 September 2017 (UTC)Reply[reply]
google books:"το marketing". Compare with google books:"το μάρκετινγκ". --WikiTiki89 17:26, 27 September 2017 (UTC)Reply[reply]
[2], [3], [4], [5], [6], [7], [8]. Some of them also use μάρκετινγκ elsewhere in the text. — Ungoliant (falai) 17:30, 27 September 2017 (UTC)Reply[reply]

Apart from all that, I could agree with such a "descriptive"(!?) approach if all languages were treated the same way: marketing should include every language for which google returns that word when a specific language is requested. And, anyway, I will not revert such Greeklish entries if the dominant position among volunteers in Wiktionary is to create such "Modern Greek terms spelt with Latin characters".--Xoristzatziki (talk) 17:18, 27 September 2017 (UTC)

@Wikitiki89, @Ungoliant MMDCCLXIV one thing is sure. You do not know how google works (and especially their department of sales together with google books). Otherwise you would come with true results, mentioning counts relative to the time they were written, to whom they are addressed, how many are duplicating or copying or attesting other books etc. etc. --Xoristzatziki (talk) 17:39, 27 September 2017 (UTC)

It's true that relative numbers of Google Books hits aren't that useful. One thing I noticed is that one hit displayed on the results page for the "το μάρκετινγκ" search has "το Marketing" highlighted as the search term; you have to wonder if there's some bleed-through between languages in their search algorithms. That said, such things are beside the point when it comes to CFI: there are enough viewable hits to satisfy CFI, if they really are using the term to convey meaning in Greek. The latter is the tricky part. Chuck Entz (talk) 22:03, 27 September 2017 (UTC)

Based on User:DTLHS's idea of "descriptive dictionary" and the whole above conversation, assuming I can provide enough sources written in English as main language (electronic or printed) which have or inside the text, is it safe to assume that this is an indication to add an English section to these terms? (I have in my hands at least two such books) --Xoristzatziki (talk) 05:34, 28 September 2017 (UTC)

Do the Chinese characters in your texts convey meaning (Wiktionary:Criteria_for_inclusion#Conveying_meaning)? Are they being used as English words and not just mentioned as Chinese characters? It can be hard to answer these questions for a non-native speaker, which is why I don't know if the Greek quotes linked above would qualify. DTLHS (talk) 05:46, 28 September 2017 (UTC)Reply[reply]
The "Conveying_meaning" does not mention at all the script. Only mentions words in the same script (which should be considered as "only words in the same script" and not the opposite). The fact that many people who speak a second language prefer to pronounce some words in the way they are pronounced in that second language does not make the pronunciation of these words part of the pronunciation of first language. Such as the "USA" pronunciation of words from enough people living in London does not make that pronunciation British. A book targeted to specific group might contain anything that the target group can identify. A book containing emoticons might have emoticons inside sentences used not as example but as a full sentence. That does not mean the emoticons are part of a specific language. (Or they are now? Can you spell File:Fxemoji u1F602.svg in English or in any other spoken language? Or we are not interested in pronunciation from that point forward? Just "a printed icon" of "example" is enough? Should we start converting any word to picture and stop writing it here but include it as picture?) --Xoristzatziki (talk) 09:25, 28 September 2017 (UTC)
Yes, if enough people in London pronounce something some way, that pronunciation is British. That's what "descriptive" means. Cross-lingual pronunciations are complex, but again, descriptive means that the pronunciation of a word is many times going to be foreignized. I wouldn't say it was safe to assume that 羊 is part of English, but there would certainly be an argument if it was used in running text, particularly if it was treated as an English word. There's also complexities here; English absorbs all sorts of random accents in rare words, like ʔAllāt, or Greek letters, in cases like γ-globulin, and odd characters like ℝ-order tree. But scripts outside Greek rarely get used; ℵ₀ is an odd example, and Cyrillic occasionally leaks through, like СССР, but I'd be very surprised by Chinese characters. I'd expect Greek to be a similar spot; Latin getting mixed in sometimes, with other scripts being rare.--Prosfilaes (talk) 10:02, 28 September 2017 (UTC)Reply[reply]

As a native reader of Chinese characters, I agree with User:Xoristzatziki 100% here. If native speakers do not treat these words as their language, do not include these so-called "attestable" words in the comprehensive monolingual or bilingual reference dictionaries they produce, there is really no point in including them. The native speakers (not language regulators) have the best Sprachgefühl regarding what is their language, what is sum of parts in their language, what part of speech a word is (for analytic languages), and often the script is a formidable barrier to something being considered 'their language'; it is a very bad idea to argue against the perception of native speakers, and say this is your language when they are native in it. I'm sure native English speakers would be similarly concerned if a user starts to mass-create "English" entries of a similar nature, even if it is just Latin-script perro ([9], [10], [11], [12]). It is just the case that English is the overwhelming exporter of these uses in other languages, but all languages have principles as to what can be considered part of their language and what can not; not all words a Chinese person says or writes when they speak Chinese is Chinese ― they can mix a lot of English, Malay, Japanese, etc. words in, depending on where they are and how much Chinese/other languages they know. Likewise, Latin-script marketing in running Greek text is just not Greek. Wyang (talk) 11:11, 28 September 2017 (UTC)Reply[reply]

I am totally with Xoristzatziki and Wyang on this one. Imagine a published dictionary where marketing is marked as a Greek, Russian, Armenian, etc. word. We have, unfortunately, allowed some Latin script words to enter CAT:Chinese terms written in foreign scripts; they are mostly slang, very few are standard Chinese (Mandarin), and, unlike Greek, Russian and other alphabet-based or phonetic languages, there is no Chinese script to render those words phonetically. Most of these terms wouldn't pass if they were in a respected published dictionary. I'd like to mention again that a language such as Chinese needs a separate CFI for various reasons. --Anatoli T. (обсудить/вклад) 13:26, 28 September 2017 (UTC)
You may want to RFD the contents of Category:Greek terms written in Latin script. — Ungoliant (falai) 13:35, 28 September 2017 (UTC)Reply[reply]
Gone. There wasn't even an attempt to provide citations for those. As far as I am concerned, they are all against our policies and common sense. --Anatoli T. (обсудить/вклад) 13:47, 28 September 2017 (UTC)
There goes ain't and fuck, which weren't English words in the comprehensive monolingual or bilingual reference dictionaries of English for a long time. It also strands "English" words that aren't English anymore, that are being used in ways that no native speaker of English would use the word. I don't know about marketing, but this is not a simple case.
Also, we're a descriptive dictionary. The "correct" writing style frequently differs from the writing style in actual use. If digging around in the newsgroups, we find a few million words of Latin-script Greek, then of course we should record that.--Prosfilaes (talk) 01:20, 29 September 2017 (UTC)
These are different cases: one (ain't, fuck) where the words are deemed nonstandard or vulgar by dictionary makers, and the other where the words are rejected outright by native speakers as simply being foreign words mixed into speech or writing, much like the example of this perro above. We certainly should not include a few million words of Latin-script Greek; that will only lead us to become a laughing stock, and lead to complete dismissal by Greek speakers. Wyang (talk) 05:46, 29 September 2017 (UTC)
It's not different cases; they're both cases where dictionary writers consider a word not a word, because it's not proper. I wasn't talking about native speakers; I was responding to where you were talking about words considered inappropriate to include by dictionary makers.
What other corpuses should we ignore? All that Hebrew-script German Jargon? Scots (the very existence of the Scots Wikipedia seems to get a lot of mockery)? Should we delete Category:Macedonian language because that might cause complete dismissal by Greek speakers? If we have a corpus of several million words, we should record it.--Prosfilaes (talk) 07:14, 29 September 2017 (UTC)
I'm speechless... You are insisting ain't/fuck and marketing in Greek are of the same nature, so ― I was able to find ain't in many English dictionaries: Merriam-Webster, Oxford Dictionary, Cambridge Dictionary, Collins Dictionary, Macmillan Dictionary, American Heritage Dictionary, Longman Dictionary of Contemporary English; can you find a Greek dictionary that includes the Latin-script word marketing as a Greek word? We've already got a native Greek speaker complaining that we are butchering their language, why? Because non-native speakers and non-speakers are dictating their language, often in a self-assumed manner, as if we know what is best for their language. We are sometimes trapped in the mindset of our own rules, so trapped that we have lost touch with reality, with common sense. Show a native Greek speaker the texts containing marketing and ask them what this is, and they would unanimously tell you this is an English word mixed into a Greek text, and the author is trying to show off that they are professional, up-to-date with the lingo and superior with their knowledge. Ask them what μάρκετινγκ is, and they will tell you it is a Greek word borrowed from English. Yet, we decide for the Greeks, ruling that marketing is their language, as well as several million more Latin-script 'Greek' words. Of course this is going to lead to displeasure, dismissal, and ridicule amongst the Greeks, the native help from whom we desperately and paradoxically need here. Wyang (talk) 09:10, 29 September 2017 (UTC)
The issue boils down to having criteria that allow us to draw a line between “Y-language word used in language X” and “loanword from language Y that has been borrowed into X”. This is not as self-evident as some here seem to think, and consulting n people will yield n different opinions as to which words are the former and which are the latter.
Since no one is arguing for the deletion of μάρκετινγκ, it seems that you want to use script as a criterion, which is not at all unreasonable, but do discuss it rather than removing the entries without explanation. — Ungoliant (falai) 11:54, 29 September 2017 (UTC)
There's nothing to discuss here, really. No need to create votes and policies for the obvious, natural and universally accepted rules - languages are written in native scripts, romanisation and words in other scripts are not words in those languages. People who imagine that Greek or any other language written in a non-Roman script can be written in scripts other than the native one should be the ones seeking approvals, not the ones who protect the sanity and quality of this dictionary. --Anatoli T. (обсудить/вклад) 12:31, 29 September 2017 (UTC)
Apparently you have to teach that to Nikos, who created the Greek entry at marketing (and to the Greeks who keep using marketing in their books). — Ungoliant (falai) 12:38, 29 September 2017 (UTC)
Nikosks (talk • contribs) was the user who created the section in 2016. They only had two edits, spaced less than twenty minutes apart, one on management, and one on marketing, both edits involving the addition of a Greek section. Looking at their edits at the time (edit to management and edit to marketing), this may be a case of an innocent newbie mistake after all: they thought the Greek sections on these entries could be used to hold translations. Wyang (talk) 12:54, 29 September 2017 (UTC)
I doubt the user even knew Greek. Both terms were created as masculines but they are neuters - both μάνατζμεντ (mánatzment) and μάρκετινγκ (márketingk). We often have this type of entry made by clueless users. A while ago [[ghar]] was created with a definition something like "This is a Hindi word for 'house'". (The history is now overwritten, as the entry was deleted.) The correct entry is, of course, at घर (ghar). --Anatoli T. (обсудить/вклад) 04:57, 30 September 2017 (UTC)

It's pretty obvious that these are English words being used in Greek text (probably for convenience), not integral Greek words. —Aryaman (मुझसे बात करो) 15:44, 28 September 2017 (UTC)

Another possibility is code switching. People who are bilingual sometimes switch between languages because different languages have different associations: using one's native language for personal, emotional topics, using another language to evoke a certain style, or yet another to show one is up on the latest in a field dominated by speakers of that language. This would probably be the last of these: if the field of internet marketing is dominated by English-speakers, one might throw in an occasional bit of English internet marketing terminology to give the appearance of being well-versed in that type of thing. We don't see as much of that in English nowadays because English speakers are less bilingual and don't care as much about other languages, but there are specialized areas such as religions like Islam or Catholicism based in other languages or academic fields where you can see it, and it was once common for educated people to throw in the occasional Greek, Latin or French term in ordinary conversation. Chuck Entz (talk) 16:22, 28 September 2017 (UTC)

FWIW, I found back in 2011 that both ἄρχων and Москва were citable in running English text. The latter was deleted per RFD, but if we decide to keep marketing et al, it would be easy to cite a bunch more like ἄρχων. IMO, it's better to analyse it as code-switching. We consider the presence/absence of italic script when trying to determine if a Latin-script foreign-language phrase or term has been borrowed into English or only mentioned, and it seems appropriate to me to consider the presence/absence of native script similarly. - -sche (discuss) 19:15, 6 October 2017 (UTC)

Project Grant proposal for Lingua Libre

Lingua Libre is an open-source platform created to ease mass recording of word pronunciations into clean, well-cut and well-normalized audio files. Given a clean word list, recording productivity can reach up to 1000 audio recordings per hour, i.e. ten times faster than the best method described on Help:Audio pronunciations, and without requiring any technical skills.

It's currently supported by a team of volunteers (mostly French, including French Wiktionary administrators). Although the core recording tool is fully functional and very efficient, it currently suffers from very poor integration with the Wikimedia projects. To accelerate the development of this tool and overcome these problems, we have submitted a Project Grant proposal. If you're interested in this project, take a look at the proposal on Meta: meta:Grants:Project/0x010C/LinguaLibre. Don't hesitate to ask questions there if you feel there are ambiguous points, or to endorse the project if you wish to see it come true!

Furthermore, if you want the English Wiktionary to benefit from these audio recordings (through a bot, or some other way), please get in touch with me! 0x010C ~talk~ 17:21, 27 September 2017 (UTC)

Little gnomes at work?

Who was the little gnome that removed the arrow symbol from references? I kind of miss it, and there is a gap where it should be. The way watched pages (such as pages one has created) are presented has also changed. DonnanZ (talk) 12:09, 29 September 2017 (UTC)

I don't recall seeing such an arrow, but it sounds like it may have been added by a gadget, which may have been broken by the recent updates to the site software, or the fact that the "References" header has been changed to something else in many entries, or the recent cleanup of old gadgets. Sorry I can't be of any more help than that. - -sche (discuss) 19:24, 6 October 2017 (UTC)
It's back again, and has been for a while. All is well again. DonnanZ (talk) 21:08, 25 November 2017 (UTC)

The name of this category is a bit strange. Aren't we supposed to use only nouns? --Barytonesis (talk) 19:39, 29 September 2017 (UTC)

There is also Category:en:Nautical using an adjective. But I prefer "Automotive" to Category:en:Auto parts. DonnanZ (talk) 08:54, 30 September 2017 (UTC)

Wikimedia Movement Strategy phase 2, and a goodbye


As phase one of the Wikimedia movement strategy process nears its close with the strategic direction being finalized, my contractor role as a coordinator is ending too. I am returning to my normal role as a volunteer (Tar Lócesilion) and wanted to thank you all for your participation in the process.

The strategic direction should be finalized on Meta late this weekend. The planning and designing of phase 2 of the strategy process will start in November. The next phase will again offer many opportunities to participate and discuss the future of our movement, and will focus on roles, resources, and responsibilities.

Thank you, SGrabarczuk (WMF) (talk) 21:55, 30 September 2017 (UTC)

Language userboxes: by country/region?

I don't need it, but I just wondered: can our language userboxes support a specific country/region, e.g. British English, or Swiss German? (And if not, should they?) Equinox 21:55, 30 September 2017 (UTC)

Do you mean the Babel boxes? Mine says "This user is a native speaker of British English". DonnanZ (talk) 11:59, 1 October 2017 (UTC)

Desinence as a POS

I suggest adding desinence (inflectional ending) as a POS header/category; I think it would be good to differentiate them from suffixes in general. Crom daba (talk) 23:37, 30 September 2017 (UTC)

I think "desinence" is a very obscure term that most people wouldn't know. What about just "inflectional suffix"? DTLHS (talk) 23:48, 30 September 2017 (UTC)
I thought it was that lovey-dovey feeling. 20 seconds of brain-searching later, I realise I was thinking of limerence. Equinox 00:07, 1 October 2017 (UTC)
(edit conflict) In principle, that sounds nice, but desinence is a lousy name (how many people know what it is without looking it up?), and the Indo-European languages we're used to are deceptively simple when it comes to the types and hierarchy of affixes. For instance, Bantu languages show number with prefixes in many cases, and, as you know, agglutinative languages throw in all kinds of things represented by separate words of various parts of speech, with the lines separating inflection, derivation and syntax getting thoroughly tangled. Chuck Entz (talk) 00:02, 1 October 2017 (UTC)
We call them suffixes, but for many languages we do distinguish between inflectional and derivational suffixes (e.g. CAT:Irish inflectional suffixes and CAT:Irish derivational suffixes). Note that not all inflectional affixes are suffixes, e.g. Maltese ni-, ti-, ji- (and their equivalents in other Semitic languages) are prefixes, i.e. endings that are actually "beginnings". —Aɴɢʀ (talk) 15:07, 1 October 2017 (UTC)