Wiktionary:Beer parlour

Welcome, all, to the Beer Parlour! This is the place where many a historic decision has been made and where important discussions are being held daily. If you have a question about fundamental Wiktionary aspects—that is, about policies, proposals and other community-wide features—please place it at the bottom of the list (click on Start a new discussion), and it will be considered. Please keep in mind the rules of discussion: remain civil, don't make personal attacks, don't change other people's posts, and sign your comments with four tildes (~~~~), which produces your name with timestamp. Also keep in mind the purpose of this page. There are various other discussion rooms which may serve the idea behind your questions better. Please take a look to see which is most appropriate.

Sometimes discussion identifies an issue as an idea for policy development or rewriting. Such discussions may be taken out of the Beer parlour to a relevant page, or a brand new page may be created. Usually, the active policy pages will be listed in one of the sections below. See also the policy development page and the votes page.

Questions and answers will not remain on this page indefinitely, as it would very soon become too long to be editable. After a period of time with no further activity (usually a couple of weeks), information will be moved to the archives. We make a point to preserve all discussions that were started here in the archives. However, talk that is clearly not intended for this page may be moved and will not end up in the archives. Enjoy the Beer parlour!

April 2018

Image captions and descriptions of the representing objects such as paintings

An editor (@Sgconlaw) likes to place into image captions descriptions of representing objects such as paintings, engravings and the like. Like, he said "An engraving from a 16th-century treatise by Levinus Hulsius" in an image at triangulation before I removed that. In that entry, we now have a to-and-fro.

I oppose this practice. I oppose having a description of the representing object (painting, engraving) even in the note created via <ref>. Such a description can be found on Commons. Wiktionary is a dictionary and its images and captions should help readers learn about the referents, or in some cases about character strokes. Wiktionary should not contain marginally relevant tidbits only because they could be interesting. After all, there is no tight relation between the referent and the representing object; anything can be on a painting, and the painting can be in any gallery in the world. It is random noise.

Your thoughts?

--Dan Polansky (talk) 08:51, 1 April 2018 (UTC)

This is not an April joke on my part. --Dan Polansky (talk) 08:51, 1 April 2018 (UTC)
I oppose Dan's unnecessarily rigid approach. The information is useful to the reader, and placing it in a footnote is an acceptable compromise between having it in the caption itself and not having it at all. — SGconlaw (talk) 09:28, 1 April 2018 (UTC)
"Useful to the reader" is not enough; "pertains to a dictionary" or "befits a dictionary" is required. Like, the possible cures of pneumonia could be useful to the reader, but have no place in a dictionary. And strictly speaking, these tidbits are not useful in the sense in which a knife or a fridge is useful (or definitions, if you are a translator); they just fuel idle curiosity. --Dan Polansky (talk) 09:44, 1 April 2018 (UTC)
Wiktionary is not a print dictionary; space is not an issue. We are also not talking about reams of information that something like "cures of pneumonia" might entail. I maintain that providing some source information that places an image in context is useful to the reader. And what is wrong with "idle curiosity"? One of the beauties of the Wikipedia project is the serendipity of discovering something else interesting while you are looking for one thing. In any case, I remain guided by any clearly established consensus on the issue. — SGconlaw (talk) 10:06, 1 April 2018 (UTC)

FWIW, I once again agree with Dan Polansky. --Per utramque cavernam (talk) 11:56, 1 April 2018 (UTC)

I don't agree: "engraving from a 16th-century treatise" is valuable contextual information. This information is conveniently in one place, I don't have to click through to Commons, wait for the page to load and then scroll through the page to find that information. Also think of a situation where the page is read in an offline reader (e.g. Kiwix), without access to Commons. – Jberkel 12:02, 1 April 2018 (UTC)
To me, the main point is that knowing it's an "engraving from a 16th-century treatise" is lexicographically irrelevant. I just want our entries to "get to the point", which is offering lexicographical information. I want as much of that as possible, and in that sense do I agree that "space is not an issue"; but I don't want anything else.
And frankly, I think our attention is already solicited enough by ten thousand different things that we don't need more of that; I find it actually refreshing to have a single-minded environment, focused on one thing only: words. I don't want more occasions for idle curiosity and serendipity. --Per utramque cavernam (talk) 12:22, 1 April 2018 (UTC)
As for serendipity, all these definitions alone provide for it: you may have wanted one sense of triangulation, but get a multitude of definitions instead; and then you have derived terms and related terms to explore further if you are in an explorative mood. Or click on a category to get more items. All lexicographical. --Dan Polansky (talk) 13:03, 1 April 2018 (UTC)
I agree with Dan. I think the information is best included under "References," not in the caption, where the sole purpose of the picture is to illustrate a definition. I don't care who took the photo, or painted the picture, or carved the statue, I just care about the lexicographical information. Andrew Sheedy (talk) 19:32, 1 April 2018 (UTC)
Me too. The caption should be as simple as possible and lexicographically focused. The caption currently on that entry ("People determining the width of a river by triangulation (sense 1)") is ideal, although I would like to get rid of the non-lexicographic information altogether, not even placing it in a footnote. People can click through to the file description page if they want to know more about the picture itself. This, that and the other (talk) 03:32, 2 April 2018 (UTC)
I agree with SGconlaw and Jberkel. I don't see why images should be 100% about lexicographical information without adding any other information. The added information provides useful and interesting context and takes up so little space that it hardly distracts or obscures anything lexicographical. — Mnemosientje (t · c) 03:40, 2 April 2018 (UTC)
Images should be 100% lexicographical because Wiktionary should be 100% lexicographical. From Wiktionary:Criteria_for_inclusion#Wiktionary_is_not_an_encyclopedia: "Care should be taken so that entries do not become encyclopedic in nature; if this happens, such content should be moved to Wikipedia, but the dictionary entry itself should be kept. ¶ Wiktionary articles are about words, not about people or places. Articles about the specific places and people belong in Wikipedia."
In this revision, in my browser, the caption is almost two times as tall as the image itself, and it forces definitions to wrap to a next line before the end of the page. --Dan Polansky (talk) 10:13, 2 April 2018 (UTC)
I don't think the WT:CFI is very apt in this context. It is clearly talking about Wiktionary entries that essentially become Wikipedia articles. The issue at hand is whether it is appropriate to provide some sourcing information for images used in entries in a "References" section. Plus, browser settings vary from user to user; on my browser the same caption is nowhere near even half the height of the image. I also do not see why definitions wrapping to another line is an issue. The text remains entirely readable, and we have other forms of content such as example boxes that also cause line wrapping. — SGconlaw (talk) 10:56, 2 April 2018 (UTC)
Definition line wrapping is acceptable when caused by lexicographical content; it is annoying when caused by non-lexicographical random tidbits added to make the entry more artificially "interesting", to people who do not find lexicographical information interesting enough. The CFI passage may not have been intended for image captions but rather for definitions, but that does not change the impact and significance of its wording, which is: let Wiktionary show lexicographical content, and none other. --Dan Polansky (talk) 11:25, 2 April 2018 (UTC)
  • I don't see why longer image captions should be considered a problem, I have done this myself on occasion. In fact captions can be used to supplement the definition, and I don't see why DP is picking on Sgconlaw in particular. It seems to be DP's pet hate. DonnanZ (talk) 12:01, 2 April 2018 (UTC)
    I've called Sgconlaw into the discussion via a ping since almost all instances of the problem that I have seen were from him, and I found it only fair for him to join the discussion. Captions should not supplement definitions; if definitions are incomplete, they should be expanded. Moving encyclopedic content from definitions to captions is still in violation of WT:CFI as formulated. Moreover, encyclopedic content about the referent is still more worthwhile than telling us the author of a painting. --Dan Polansky (talk) 12:07, 2 April 2018 (UTC)
I don't care for the extras, which remind me of what you find on a museum exhibit. I would rather not have them, but if we are going to include them, make them into alt text that shows only when you hover over the image. Does the standard image markup have a parameter for this, or do we need to have a template that provides the option? I know it can be done in html, but that would clutter up the wikitext and make it less accessible to those who don't know html. Chuck Entz (talk) 14:18, 2 April 2018 (UTC)
Wouldn’t putting such information in a footnote that appears in the “References” section be a reasonable compromise? — SGconlaw (talk) 14:34, 2 April 2018 (UTC)
Re: "space is not an issue". Screen space often is an issue. (Download time can be an issue, though long captions don't have a material effect.) If the problem is screen space we could resort to show/hide bars to have it both ways: Lexicographical content in the bar, non-lexicographical content hidden by default.
Definitions are not the sole kind of lexicographical content which an image (or sound, etc) can support: I use images that provide some support for the semantic etymology of a term. The first image at Godiva will serve to illustrate why Godiva quadricolor has the specific epithet it does and, less clearly, why the genus name. DCDuring (talk) 15:07, 2 April 2018 (UTC)
@Sgconlaw, I would be happy with that compromise. Andrew Sheedy (talk) 15:59, 2 April 2018 (UTC)
I was thinking the other day the ability to show or hide captions would be a good idea, with "hide" as the default setting. Would that please everybody, DP even? DonnanZ (talk) 17:32, 2 April 2018 (UTC)
It might just be a little confusing where there are several pics in a row for different senses of a word. Equinox 17:37, 2 April 2018 (UTC)
@Equinox: Would it be more confusing than having the encyclopedic caption? or just more confusing than a caption with only lexicographic information? DCDuring (talk) 18:59, 2 April 2018 (UTC)
Here I'm talking about any caption versus none at all, not about the specific content. It's a matter of distinguishing senses. Equinox 22:23, 2 April 2018 (UTC)
They are not references, and do not belong in the "References" section.
You could make them part of a "Notes" section, using <ref group="note">BLAH</ref> / <references group="note"/>. —Suzukaze-c 03:11, 26 April 2018 (UTC)
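A minimal sketch of the "Notes" approach suggested above, assuming a hypothetical file name (Example.jpg) and reusing the caption and source text quoted earlier in this thread; the header level chosen for "Notes" is also an assumption:

```wikitext
[[File:Example.jpg|thumb|People determining the width of a river by triangulation (sense 1).<ref group="note">An engraving from a 16th-century treatise by Levinus Hulsius.</ref>]]

====Notes====
<references group="note"/>
```

The group name keeps these footnotes apart from any ordinary <ref> citations, which would still be listed by a plain <references/> under "References".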
A major benefit of hypertext over earlier forms of text is the ability to follow hyperlinks to learn more about something. These are optional side routes. I might see a pic and think "I wonder who took that photograph for Mediawiki", or "I wonder which year that oil painting was done in", but those are diversions; they are what clicking and linking are for. We should not put that lexicographically irrelevant info directly into the entry. Equinox 22:24, 2 April 2018 (UTC)
The primary purpose of images in dictionaries is to illustrate senses. It might be interesting to know more about the image besides what is visible, but I believe such information is noncrucial, belongs elsewhere, and only ends up becoming extra clutter on our entries. I agree with Equinox. —Suzukaze-c 03:18, 26 April 2018 (UTC)

Why are header levels the way they are?

Why are we not using L1 headers at all and why are etymology/pronunciation/POS all L3 instead of POS and one of the others being nested within the third? So why isn't L1 language, L2 pronunciation and etymology/POS L3 or even POS nested as L4 beneath etymology? Korn [kʰũːɘ̃n] (talk) 11:32, 2 April 2018 (UTC)
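For reference, the layout being asked about can be sketched roughly as follows. This is only an illustration of the header levels named in the question, with placeholder content, not a complete entry:

```wikitext
==English==

===Etymology===
Placeholder etymology.

===Pronunciation===
Placeholder pronunciation.

===Noun===
Placeholder definition.
```

Here the language is L2, while Etymology, Pronunciation and the part of speech all sit side by side at L3; no L1 header (a single pair of equals signs) appears in the wikitext.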

If I understand this correctly an L1 header like =Norwegian Nynorsk= would be far too big, I tried it. ==Norwegian Nynorsk== is more acceptable. DonnanZ (talk) 11:46, 2 April 2018 (UTC)
A more general question about headers: Why are they always sized by level? I can understand why the sizing makes some sense from the top of an L2 section. But, IMO, headings appearing after the main lexical content like "References", "Further reading", "Anagrams", etc. don't merit L3 heading size. In addition, "Alternative forms", "Pronunciation", and, to a lesser extent, "Etymology" don't merit the font size we use. Couldn't the structuring function sometimes served by "Etymology" (and less often "Pronunciation") be supported by some means other than heading size? DCDuring (talk) 14:50, 2 April 2018 (UTC)
Here as at Wikipedia we don't use L1 headers because the page name itself is already the L1 header. —Mahāgaja (formerly Angr) · talk 17:07, 2 April 2018 (UTC)
@Mahagaja: Doesn't look like you're right about that. --WikiTiki89 17:59, 2 April 2018 (UTC)
Page titles are already L1 headers. Many pages don't even have etymologies or pronunciations which would mean we are putting everything under a blank "Etymology" header- consider the millions of inflected entries. Furthermore you should think about how multiple etymologies will interact with multiple pronunciations. DTLHS (talk) 18:20, 2 April 2018 (UTC)
Ah, page titles. I would assume multiple etymologies will interact with multiple pronunciations the same way as now: One of them comes on top, then the rest gets sorted in below it, then the next one comes on top... Korn [kʰũːɘ̃n] (talk) 09:31, 3 April 2018 (UTC)
The interaction of Etymology and Pronunciation headers was a long-standing issue between User:EncycloPetey ("EP") and the late Robert Ullmann. EP's concern was, for the Latin entries he was interested in, sometimes it made more sense (for him, at least) for PoSes to be organized first by pronunciation, then by etymology. As a result Latin entries were excluded from the operation of one of Ullmann's bots that attempted to ensure that ELE header rules were followed. EP never came up with a counterproposal to the ELE approach that gives Etymology priority. DCDuring (talk) 12:41, 3 April 2018 (UTC)
Because we could never come up with an alternative to Etymology-first structure, see an entry like palma#Latin for a simply-complicated page where the interaction of Etymology and Pronunciation creates issues. Two etymologies, each of which have two pronunciations, where the pronunciations are tied to specific inflected forms. This same situation of two different pronunciations of the same spelling, tied to the same etymology, but applied only to specific inflected forms occurs in nearly every regular Latin verb, as well as the ablative endings of nouns (and adjectives) as evinced by palma. So, because Etymology has priority, we have to use two different Pronunciation sections under each Etymology section. --EncycloPetey (talk) 14:38, 3 April 2018 (UTC)
In cases like that I prefer to list all of the pronunciations under a single Pronunciation header, labeled appropriately, e.g. at briseadh#Irish. —Mahāgaja (formerly Angr) · talk 15:23, 3 April 2018 (UTC)
Etymology doesn't get priority when a certain bot places Alternative forms above it, which is why I am beginning to treat Alternative forms as L4. That way the bot leaves them alone. DonnanZ (talk) 23:29, 3 April 2018 (UTC)
I always held the view that pronunciation should by default be the first header by which things are sorted after spelling unless there is strong reason to do otherwise, which can be argued for briseadh, effectively making it a case-by-case issue. Of course this is a problem for consistency, and while as an editor I understand why it is done the way it is, as a user I think that the pronunciation section of briseadh is a monstrosity. Why are the verbal pronunciations put above the noun and not above the verb? Korn [kʰũːɘ̃n] (talk) 00:10, 4 April 2018 (UTC)
@Korn: The way I see it, the Pronunciation section of briseadh applies to the entire Irish entry and not just to the POS following it. Would you prefer it if it looked like this? I can see a case could be made for it, but it also seems a bit like overkill. —Mahāgaja (formerly Angr) · talk 12:08, 4 April 2018 (UTC)
The screen-filling pronunciation section at [[briseadh]] would seem a perfect use of a show-hide bar for the entire section. DCDuring (talk) 12:15, 4 April 2018 (UTC)
  • Wow, I thought I'd consider this overkill too, but now that I see it, yes, yes I would prefer if it looked like this. A lot less scrolling around in the page at the cost of some redundancy. Definitely the user experience I prefer. Korn [kʰũːɘ̃n] (talk) 12:16, 4 April 2018 (UTC)

Brooke's Point Palawano

This L2 header appears in ~174 entries, apparently primarily added by User:Mar vin kaiser. Perhaps the language should also have an entry? - Amgine/ t·e 04:39, 3 April 2018 (UTC)

  • Yes, it should. It has no entry in the English Wikipedia, but it does have one in Ethnologue. I'll have a go. SemperBlotto (talk) 05:37, 3 April 2018 (UTC)

Requesting rollback

Hi. I am trusted here and I frequently look at Special:RecentChanges and undo vandalism. Therefore, I would like to request the rollback right. Inner Focus (talk) 08:58, 3 April 2018 (UTC)

You have like 190 edits here and you're blocked as a sock on enwiki. —AryamanA (मुझसे बात करेंयोगदान) 10:39, 3 April 2018 (UTC)

News from French Wiktionary

March issue of Wiktionary Actualités just came out in English!

An incredible issue of Actualités has just landed on Wiktionary, with two articles about words, some words about Wiki Indaba, a tremendous dictionary that will change the world, and stats and news as usual.

This issue was written by six people and was translated for you by Pamputt. This translation may be improved by readers (wiki-spirit). We still receive zero money for this publication and your comments are welcome. You can also register to be notified on your talk page. Noé 17:04, 3 April 2018 (UTC)

Should anons' editing rights be restricted?

There are at least two topics in Wiktionary that are especially vulnerable to speculation: 1) reconstructed entries and 2) etymologies. I have noted that there are anon users who show perhaps too much interest in these, adding fantastic theories about whatever, often based only on phonetic similitude between two words. This has led me to think that reconstructed entries and, if technically feasible, also the etymology sections of mainspace articles should be the domain of registered users only. It takes the time of several wise men to check the work of one fool, who can use a changing array of IP addresses. I'd like to invite discussion about this topic. --Hekaheka (talk) 12:24, 4 April 2018 (UTC)

There are ways to protect reconstructed entries, but there is no way to protect only sections of pages (without a significant overhaul in how we structure our pages anyhow). As to whether we should, I don't think so. Those who patrol pages should perhaps flag changes to etymologies in some way for further review. - TheDaveRoss 12:57, 4 April 2018 (UTC)
As one of those people who patrol, etymology is a bright line I avoid touching due to the likelihood of on-wiki drama. I agree it should not be separated to a sub-page or namespace (or further abuse filter abuse,) but I also disagree it should be made off-limits. Poor anon contributions are often a symptom of future good contributions, and other negative personal outcomes. - Amgine/ t·e 19:40, 4 April 2018 (UTC)
IMO it's important that people can do stuff without signing up. I remember the 1990s Internet where you could roam free and comment here, chat there, and never give anyone a name, or have to invent YET ANOTHER stupid password. Okay, now we have to deal with a massive influx of millions of stupid children, but requiring an account is close to having a paywall; and wikis are supposed to be open. Unless we're seeing 98% bad edits from IPs I think it's a very bad step to punish them proleptically. Equinox 02:30, 5 April 2018 (UTC)
To address Heka's comment more directly: we do have the "unpatrolled" flag on entries until someone looks at them. If anything, the problem might be that we don't have enough patrollers (whether actual admins don't bother, or we don't speak the right languages, or we don't have enough admin users). Equinox 02:31, 5 April 2018 (UTC)
We have plenty of admins (most of them seem to find actually using the admin tools distasteful). DTLHS (talk) 02:36, 5 April 2018 (UTC)
Perhaps this is a Finnish-only problem, then. We have a very limited supply of admins who are capable of patrolling etymologies (I'm not admin, nor knowledgeable enough on history of words), not to even mention the reconstruction pages. To put it straight, I'm afraid that a considerable portion of our pro-fin reconstructions may be bullshit. --Hekaheka (talk) 12:43, 5 April 2018 (UTC)
I share some of Heka's concerns, especially taking into account various sockpuppets of formerly blocked users enjoying the hide-and-seek game and challenging our rules hospitable to anons. When they get blocked for bypassing the blocks, they cry about censorship. I don't know what tools admins may have when they are outnumbered by careless or even hostile editors but I would suggest we need something to bulk-undo edits of editors identified as unreliable, especially if they have been warned several times or are known to be sockpuppets of formerly blocked users. We already have Special:Nuke; we need something to mass-undo edits by user name/IP. --Anatoli T. (обсудить/вклад) 13:31, 5 April 2018 (UTC)
But since I come with a different IP each time and most often you don’t detect my edits (and when you do I afterwards re-edit them to your despair): tough luck. —This unsigned comment was added by (talk).
[The comment above was left by one of User:Gfarnab's numerous sock-puppets, a serial multiple-account abuser under the impression he is winning] --Anatoli T. (обсудить/вклад) 12:54, 13 April 2018 (UTC)
While I am surprised that Heka is not an admin, they are a patroller and rollbacker, so they have the relevant tools to help. It might be the case that abuse filters could be made to flag entries specific to Finnish etymologies for further review. - TheDaveRoss 14:54, 5 April 2018 (UTC)
There has definitely been a recent influx of anons adding speculative or just plain wrong etymological info to mostly Finnish and Proto-Finnic entries, building up faster than I at least can possibly keep up with, most of which also smells like the work of one dedicated person (recent examples include,,,). I can think of a few remedies:
  • indeed ban anon editing of reconstructions (but this seems unlikely to have a major effect, since mainspace etymologies would remain for editing);
  • make sure our mainspace etymologies are sourced, and keep a close eye on (auto-flag?) any edits that remove sources or add unsourced information;
  • add something like Appendix:False cognates where anonymous observations on etymology are welcome (the four anons above have been adding "X is false cognate with Y" rather liberally around, even though this is usually irrelevant for the actual etymology)
But I fear that eventually we may have to abandon maintaining reliable etymologies on Wiktionary altogether, for smaller languages with fewer dedicated editors at least. Etymology is a much more academic discipline than general lexicography, that requires more background knowledge and caution. As our coverage grows, more and more knowledgeable editors are needed to keep track of all the etymological information we have already, and to prevent decay over time. Unlike casual drive-by vandals, amateur etymologists are often also quite dedicated to pushing their views.
— FWIW I am currently working on a project to establish an online repository (a closed wiki, in fact) of proper academic research on Finnish and general Uralic etymology, so I expect that the amount of time and energy that I can spend on patrolling Wiktionary's etymology coverage is not going to be increasing in the future. --Tropylium (talk) 15:11, 5 April 2018 (UTC)
Personally, I don't like the idea of banning anons from editing etymologies, since I probably edited hundreds of etymologies in the months before I finally decided to make an account. It might be useful to have a way of keeping track of anon edits specifically to etymology and reconstructed templates, like on a page where editors can check them off once they've reviewed them. I have no idea how common these edits are, though, so something like that may be an unreasonably labor-intensive task. —Globins 03:29, 14 April 2018 (UTC)

April LexiSession: mining

This month is mine! Not mine as if one possessed it, but mine as in mining, exploitation of minerals. The reason behind this theme is that in the French revolutionary calendar, April was renamed Germinal, and it is also the title of a book by Emile Zola about miners. So, mines!

By the way, LexiSession in short: a collaborative transwiktionary experiment. You're invited to participate however you like and to suggest next month's topic. The idea is to look at other communities' improvements on the same topic to improve our own pages and learn foreign ways of contributing. If you participate, please let us know here or on Meta, to keep track of the evolution of LexiSession (like Lingo Bingo Dingo did last month, thanks to him!). If you can spread the word to other Wiktionaries, you are welcome to do so. Ideally, LexiSession should be a booster for every Wiktionary on the same page, but it depends on the people, and I am still a volunteer in this project, so I have limited time to disseminate the message. Noé 10:14, 5 April 2018 (UTC)

Play a game again

I shall become Gamesmaster again. We can have another round of a version of a multilingual word game played on a 15 by 15 board. We're gonna start Wiktionary:Random Competition 2018 next week. --Cien pies 6 (talk) 11:30, 11 April 2018 (UTC)

This coming Monday will be when the first entries are placed on the board. --Cien pies 6 (talk) 11:18, 13 April 2018 (UTC)

Documentation page about interwikilinks


For your information, I updated the documentation page about interwikilinks for Wiktionary. Feel free to reuse its content for your own documentation, and let me know if I forgot something important.

Cheers, Lea Lacroix (WMDE) (talk) 13:54, 12 April 2018 (UTC)

A livestream about contributing to French Wiktionary in a few hours: https://www.youtube.com/c/LyokoïKun/live

Hello all! I am trying a new format on my YouTube channel: a livestream where I contribute to the French Wiktionary. The stream is in French, but I can respond in English. Maybe you'd like to do the same for the English Wiktionary! :D --Lyokoï (talk) 15:57, 12 April 2018 (UTC)

That looks fun. I could certainly go for that one day - I'd be wearing a mask and red underpants and with a voice distorter of course to hide my identity. --Cien pies 6 (talk) 11:14, 13 April 2018 (UTC)

Removing text on learned borrowing and doublet

I think we should remove the text on Template:learned borrowing and Template:doublet so that they're consistent with Template:borrowing. I end up writing notext=1 to get rid of the text almost every time I use these, and it would be more convenient to just have no text as the default and be able to add the text when it's necessary. —Globins 03:15, 14 April 2018 (UTC)
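To illustrate the difference being discussed, here is a rough sketch of the two invocations. The language codes and the term are placeholders chosen for illustration, and the parameter order (entry language, source language, term) is an assumption about the template as it stood at the time:

```wikitext
{{learned borrowing|en|la|exemplum}}
{{learned borrowing|en|la|exemplum|notext=1}}
```

The first form would be expected to prefix the language and term with the lead text ("Learned borrowing from ..."), while the second suppresses it and leaves only the language name and the linked term.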

Support removing the lead text of {{learned borrowing}} (and that of {{calque}} and {{semantic loan}}), abstain for {{doublet}}. --Per utramque cavernam (talk) 11:58, 15 April 2018 (UTC)
I disagree / tend to oppose this. How are you using {{doublet}} that you're having to suppress the text? Are you spelling out the text "doublet of" (in which case, let the template do it), or adding doublets to lists of cognates (which seems undesirable)? Likewise, with {{calque}} and {{semantic loan}}, it seems desirable to spell out that it's a calque (etc), whereas with mere "borrowing" vs "inheritance" one doesn't need the text IMO because the result is the same (word foo was adopted, and potentially adapted as e.g. fu) and it's almost always obvious whether what happened was borrowing or descent based on whether the receiving language is descended from the giving language. In practice, "learned borrowing" seems to be poorly distinguished (especially in English entries) from mere borrowing (think of Latinate words variously labelled "borrowings" or "learned borrowings"); I'm on the fence about whether its text should be removed. - -sche (discuss) 16:31, 15 April 2018 (UTC)
@-sche, I agree that adding doublets to lists of cognates seems undesirable, but based on the pages I've edited it seems to be a common practice. Often, doublets are somewhere within a list of cognates and use {{cog}} instead of {{doublet}}. Actually, now that I'm writing this out, I realize that it probably makes more sense to use another sentence for doublets and keep the text, and that I got the idea that the text should be suppressed because I made some edit when I was new to Wiktionary where I added a new sentence for a doublet and it was reverted. I oppose what I said earlier about doublets, as well as removing the text from {{calque}} and {{semantic loan}}. These are uncommon enough that I think it's preferable to specifically mention them with text when they occur. For learned borrowings, I think the lack of distinction between them and normal borrowings gives more validity to the argument that the text should be removed, since the text could potentially mislabel a term when we could just avoid saying something wrong by writing "from." —Globins 21:30, 15 April 2018 (UTC)
To be clear, I agree that "calque" and "semantic loan" should always be spelled out explicitly. I just don't think we need this pesky lead text to do it. --Per utramque cavernam (talk) 13:24, 16 April 2018 (UTC)
I doubt that editors will reliably spell the text out themselves, though, if the template doesn't provide it. I also don't see why we should make them, when the template can do it. - -sche (discuss) 20:01, 19 April 2018 (UTC)

Use of Template:cognate

Is this template only intended for cognates, or should it be used when you just want to link to the Wikipedia page for a language? —Globins 04:11, 16 April 2018 (UTC)

Technically there is no difference in output (IIRC), but if you aren't comfortable with using a template called {{cog}}, you can use {{noncog}}. —Suzukaze-c 04:12, 16 April 2018 (UTC)
I always use {{cog}}. --WikiTiki89 13:18, 16 April 2018 (UTC)
As do I. Introducing a difference between the two would probably require manual cleanup, then.__Gamren (talk) 17:30, 19 April 2018 (UTC)
At the moment, it is only a difference in template name. At Module:etymology/templates, noncognate is a redirect to cognate. —Suzukaze-c 18:53, 19 April 2018 (UTC)
Guys, don't forget that we also have {{m+}}. --Per utramque cavernam (talk) 18:54, 19 April 2018 (UTC)
Any template will be used by editors anywhere, regardless of its intended purpose, if it produces their desired formatting result. This is the case no matter how good the documentation is or how explicit it is about where the template should be used. I regularly clean up uses of {{t}} outside of translation sections. The only solution is either some software change (to display a prominent error message if the template is used in the wrong context, not currently possible), or vigilance to manually clean up other editors' carelessness. DTLHS (talk) 19:03, 19 April 2018 (UTC)
However, in relation to {{cog}}, it was originally meant to be a general template that combined {{etyl|lang|-}} with {{m|lang|word}}. It's only named "cognate" because that's its most common use. The only reason {{noncog}} exists is because some stubborn editors couldn't bring themselves to use a template named "cognate" for something that wasn't a cognate. --WikiTiki89 19:54, 19 April 2018 (UTC)

Typo Team project 'Moss' has been updated

w:Wikipedia:Typo Team/moss, a project first introduced to us in 2015 which finds words used on Wikipedia that do not have Wiktionary entries, has been updated. The words in this list are generally either valid words that we're missing and could add, or (more often) typos in the Wikipedia articles which could be fixed. - -sche (discuss) 17:11, 16 April 2018 (UTC)

Disallowing Appendix-only constructed languages

As a follow-up to this. Per that vote, all Lojban entries were moved to Appendix on the grounds that most of them would likely not satisfy our criteria for inclusion and thus be deleted. The situations of the other languages in the category are presumably similar. But Appendix is still a "dictionary namespace", i.e. it is a part of the site that is presented to our readers. As such, we are obliged to ensure the accuracy of the information contained therein, which can only meaningfully be done by the same means that we prove accuracy in the mainspace: attestation. Really, delegating this content to appendix only makes it somewhat more troublesome to access and less likely for someone to stumble upon it; it doesn't solve the problem that we are being used as a platform for distributing unverifiable information. I hope I will be excused for speculating that that course of action was favoured over entirely expunging the content out of a misguided desire to appease.

I am not opposed to allowing terms that satisfy the WDL criteria to be included in mainspace, even if the number of such terms would be small; after all, such is also the situation of some extinct natural languages. It would probably be useless for users, but academically legitimate. I do, however, oppose granting constructed languages LDL status.__Gamren (talk) 17:27, 19 April 2018 (UTC)

You claim that "we are obliged to ensure the accuracy of the information contained therein [in the Appendix], which can only meaningfully be done by the same means that we prove accuracy in the mainspace: attestation". This is false. See WT:LOP for a time-honoured appendix page explicitly designed to hold terms that cannot be attested. —Μετάknowledgediscuss/deeds 18:40, 19 April 2018 (UTC)
Those pages inform the reader that the content is protologisms, which makes it more acceptable. However, the text also seems to invite editors to add their own invented words? That doesn't seem appropriate.__Gamren (talk) 18:51, 19 April 2018 (UTC)
That is what we have done for many years. You are welcome to suggest changing our practices, but you should understand them first. —Μετάknowledgediscuss/deeds 19:46, 19 April 2018 (UTC)
I think that for appendix-only constructed languages the burden is (in practice) "a use or mention"; sometimes we've seemed to take (or accept confirmation of) a few vocab words from even mere websites of official government bodies. If there were concern that a Lojban word was attested in only a single non-authoritative book and might be a nonce, it could be labelled as such; if a word were only "attested" in e.g. an official dictionary, the only concern I would have about including it would pertain to copyright; I don't think there would be a danger that "the authorities made up the word and it isn't in use", since that describes almost the entire language and is the reason it exists in appendix-space and not the mainspace. If the word is not attested anywhere at all, then obviously we shouldn't have an entry for it. - -sche (discuss) 19:59, 19 April 2018 (UTC)
Right, and we simply shouldn't have made-up words that haven't been used, regardless of whether anyone thinks the person who made it up is an "authority".__Gamren (talk) 09:19, 20 April 2018 (UTC)
  • The appendix space has never been reserved for strictly verifiable information, as Metaknowledge points out above.
  • Even ignoring this, practically all information about most of the languages in question is verifiable. The attestation criteria applied to entries in the mainspace, which are a set of workable but somewhat arbitrary conventions, are not the only means by which information or usage can be meaningfully verified. Note that a word can easily be attested and yet not meet WDL CFI. Setting aside constructed languages, a number of other appendices exist in order to contain information that’s verifiable but doesn’t meet the standard WT:CFI criteria — compare Appendix:English dictionary-only terms — which suggests the established usage of the appendix namespace is not in line with the proposed restrictions on that namespace.
  • Some (CFI-attestable) words in natural languages are etymologically derived from words in appendix-only constructed languages (e.g. silflay). Deleting the etyma here would be strange, as the word didn’t pop into existence in English, but was derived from a word in another language.
  • As far as I’ve seen, the arguments given so far against treating constructed languages with communities of speakers as LDLs have solely been on the grounds that they are invented. I frankly don’t see why that fact should enter into the question at all (assuming they are genuinely used as a means of communication). They seem like a perfect use case for the LDL label: lexicographical information that is straightforwardly verifiable by its attestation in texts, which, for reasons of poor documentation in the types of sources we label ‘durably archived’, doesn’t meet WDL attestation criteria. Words can be coined just as readily in little-documented natural languages as in artificial ones, so this wouldn’t ‘open the floodgates’ for protologisms any more than our current policies do.
  • I do think coverage of constructed languages should be limited to those actually used by multiple people as a means of communication; certainly, we shouldn’t cover personal artlangs that only one person will ever use.
  • Considerations of what is useful to users of the site are important. This suggestion would make the site less useful to people looking for information on the languages in question. The benefits gained in doing so seem minimal. — Vorziblix (talk · contribs) 12:27, 21 April 2018 (UTC)
But if we don't care about accuracy at all (and I flat out refuse to accept "it was made up by this or that person" as proof of accuracy; usage matters, origin does not), why not just keep it in mainspace (this is aimed at those who voted support)? Surely that's more "useful".__Gamren (talk) 10:49, 22 April 2018 (UTC)
The vast majority of the words in question are attested in usage in addition to being mentioned in the source material. For Lojban in particular, which now makes up most of the material being questioned, there’s a corpus of some 7 million tokens in actual usage here. — Vorziblix (talk · contribs) 17:07, 22 April 2018 (UTC)
Seven million tokens. So, seven to ten times the size of the Bible, 14 times War and Peace, or 140 times a NaNoWriMo novel. And that includes everything, even IRC. Assuming the average WP page is at least 500 words, we could get that for 150 languages just by dumping WP (all namespaces).--Prosfilaes (talk) 00:14, 23 April 2018 (UTC)
The point isn’t that the corpus is extremely large, but that it’s large enough to verify whether words are in use and how they are used. — Vorziblix (talk · contribs) 01:47, 23 April 2018 (UTC)
Constructed languages are just plain less interesting than natural languages. Knowing what a word is in Nahuatl can tell you something about ancient history, about the evolution of languages. Knowing what a word is in Lojban tells you nothing about history. Yes, we could copy a dictionary in, but that doesn't add value besides just keeping a copy of the dictionary. CFI means that for major modern languages, every word we have an entry on, in theory someone in the future could find it in a text and come looking for a definition. With the exception of a few conlangs, all of which are in mainspace, they don't have that; nothing has been written in the language besides a few didactic texts. To get the equivalent of a tiny library, your Lojban corpus tossed in the ephemeral IRC logs.--Prosfilaes (talk) 00:32, 23 April 2018 (UTC)
Indeed, they are less linguistically interesting than natural languages, which doesn’t mean information about them is useless. Nor are the existing Lojban texts all ‘didactic’; they include original compositions of prose and poetry as well, and anyone finding words in any texts and wanting a definition would be well served by a dictionary. (The same is true for at least some of the other conlangs in appendix space, though probably not for all of them.) Maintaining such a dictionary can go well beyond just ‘copying a dictionary in’, as we also have attestations to work from. Records of non-literary communication in a language are also valuable and provide evidence of how that language is used by its community of speakers; corpuses of spoken language are not uncommon, for instance. — Vorziblix (talk · contribs) 01:47, 23 April 2018 (UTC)
These kinds of languages do need to meet some standard of verification, even if a weakened standard such as "at least one mention", in my view. As for protologisms, Wiktionary:Votes/pl-2013-09/Deleting list of protologisms had no consensus, but perhaps times have changed. --Dan Polansky (talk) 11:16, 22 April 2018 (UTC)
I fully agree. I would support the introduction of such a standard. — Vorziblix (talk · contribs) 17:07, 22 April 2018 (UTC)
For LDLs, to quote CFI, "the community of editors for that language should maintain a list of materials deemed appropriate as the only sources for entries based on a single mention". I would assume (and desire, and advocate if there is no consensus about this) that a source should only be allowed to be on such a list if we trust it to be strictly descriptive like ourselves, which is why we wouldn't accept Urban Dictionary as a reference. Plena Ilustrita Vortaro might be considered "authoritative", but it also demonstrably contains lots of made-up words. For Lojban and the other languages, is there really a dictionary that we can trust not to try to "patch holes" in an incompletely developed vocabulary? If so, I guess my arguments for having special rules for conlangs kind of crumble, but I seriously doubt it is the case.__Gamren (talk) 08:09, 30 April 2018 (UTC)
No, there isn’t a Lojban dictionary that we can trust to be descriptive, but, as noted above, there are corpora showing words in actual use, based on which a descriptive dictionary can straightforwardly be compiled. At least some of the other languages (Quenya, Sindarin, Klingon, ...) also have published texts in which terms are used in context, from which a descriptive dictionary (albeit a much sparser one) can also be compiled. — Vorziblix (talk · contribs) 15:58, 30 April 2018 (UTC)
@Vorziblix Then I guess I don't really mind making Lojban an LDL after all. Do you have a rough sense of how many of our current entries would be citable using this korpora zei sisku?__Gamren (talk) 06:37, 3 May 2018 (UTC)
@Gamren: To test this, I randomly sampled 50 entries out of Category:Lojban lemmas and manually checked for their attestation in the corpora. (I excluded the Tatoeba corpus from consideration because it consists of isolated example sentences.) Here’s the list of all sampled entries:
And here are the results:
So, as a rough estimate, about 94% of the words would be citable. — Vorziblix (talk · contribs) 19:02, 4 May 2018 (UTC)
And I still don't get the point of having entries in appendices! We can only either include something or not, and if we are including we might as well do so in the mainspace.__Gamren (talk) 08:27, 30 April 2018 (UTC)
Unless this discussion suddenly jumpstarts, I'll probably start a vote later today.__Gamren (talk) 08:56, 30 April 2018 (UTC)
See Wiktionary:Votes/pl-2018-04/Disallowing_appendix-only_languages.__Gamren (talk) 10:56, 30 April 2018 (UTC)

Swaziland -> eSwatini

In the news - Swaziland has been renamed to eSwatini. --Anatoli T. (обсудить/вклад) 05:53, 20 April 2018 (UTC)

That is definitely a hot word. And there's no guarantee that English speakers will start to use it over Swaziland. DTLHS (talk) 06:06, 20 April 2018 (UTC)
I don't deny it, but we should reflect official announcements of country names. Besides, the etymology is old, and it's allegedly an older name for the country. --Anatoli T. (обсудить/вклад) 06:10, 20 April 2018 (UTC)
When did people start using the Beer Parlour to talk about anything at all? We have the Tea Room to discuss words. --WikiTiki89 14:55, 20 April 2018 (UTC)
Maybe the lang name should be changed from "Swazi" (as e.g. in siSwati) to "siSwati", as the country's name did or might change. That would be a BP matter. If OP's post was a question like "should cats like Category:en:Swaziland be moved to the new name?" it would also be a BP matter. A question like "should the translations be moved from one entry to the other?" could be a BP matter (if asked rather generally) or a TR matter (if asked for just these two country names). If it was a question like "does the new name deserve an entry, because it's the official name, even if not attested?" it would also be a BP matter as it would require a change of WT:CFI -- but the answer should be "no": if not attested, then no entry. A usage note in Swaziland however could have been ok. - 13:58, 30 April 2018 (UTC)
It's absolutely too soon to be changing anything. All we did is create an entry for the words; we're continuing to call the country and the language and everything else by the old names until it becomes clear that they are no longer the most common terms. That would take years, if it ever happens at all. --WikiTiki89 14:09, 30 April 2018 (UTC)
I know. I just wrote how this could be a BP matter. I didn't propose to change any Swazi term (as I don't really care about it). - 14:22, 30 April 2018 (UTC)

Vote: CFI and images

FYI, I created Wiktionary:Votes/pl-2018-04/CFI and images.

Let us postpone the vote as much as discussion requires, if at all. --Dan Polansky (talk) 09:07, 21 April 2018 (UTC)


The Beer Parlour is a Discussion page and is categorised as such. A beer parlor is a kind of bar, but wasn't categorised as such. So, I added Category:en:Bars to a few pages, but I suck at categories so didn't create it, and probably people would prefer it named "Drinking establishments" or something like that, to avoid anyone putting [[crowbar]] therein. We can probably think of at least 100 kinds of bars, anyway. --Cien pies 6 (talk) 19:02, 21 April 2018 (UTC)

Tell us what you think about the automatic links for Wiktionary

Hello all,

One year ago, the Wikidata team started deploying new automatic interwiki links for Wiktionaries. Today, the links for the main namespace are automatically displayed by a Mediawiki extension, and the links for other namespaces are stored in Wikidata. You can find the documentation here.

We would like to know whether you have encountered problems with the system, or whether you have suggestions for further improvements. This could be for example:

  • Some automatic links don’t work as expected
  • Some problems you encountered with entering links (for non-main namespace) in Wikidata
  • Some new features you’d like to have, related to links

To give feedback, you have two options:

  • Leave a message on this talk page
  • Leave a message here. If you do so, please mention me with the {{ping}} template, so I can get a notification.

I’m looking forward to your feedback! Lea Lacroix (WMDE) (talk) 10:16, 24 April 2018 (UTC)

@Lea Lacroix (WMDE) Feature request: Provide a LUA or Parser function to query Cognate database (T163734). – Jberkel 10:38, 24 April 2018 (UTC)
Indeed useful. I also support phab:T190210. When I search a page and it does not exist, I click to "create" and "preview" the blank page just to check the interwiki links. If they are available without saving the new page, it would be more useful to show the interwikis directly in the search none-found page and the no article text message. --Vriullop (talk) 18:19, 24 April 2018 (UTC)
@Lea Lacroix (WMDE) Overall, excellent function. Thank you to all who made this happen. --Dan Polansky (talk) 18:09, 25 April 2018 (UTC)
@Lea Lacroix (WMDE) I think it's good. We don't need a bot any more, and don't have to deal with human users adding and removing them erroneously. Equinox 18:12, 25 April 2018 (UTC)
Not a complaint, but I note that there are some situations where this will not work - in particular, where projects have parallel appendices with their native-language names. bd2412 T 20:01, 25 April 2018 (UTC)
@BD2412: Don't worry, it's only meant to be enabled in the main namespace. --WikiTiki89 23:38, 25 April 2018 (UTC)
It would be useful to have Wikidata items in certain other spaces, though. bd2412 T 00:35, 26 April 2018 (UTC)
@BD2412 this is already possible. See Q35973249 or Q4663356. What do you need more? Pamputt (talk) 02:17, 26 April 2018 (UTC)

Long ſ in quotes

When quoting a book that uses long ſ, should it be reproduced in the quote? @Aabull2016 and I have been discussing this (brief background: I added it to some quotes and they reverted me) and we've been unable to come to an agreement. WT:" says using it is "optional," but I feel like we need a more concrete policy on this. Nloveladyallen (talk) 20:47, 26 April 2018 (UTC)

I have no problem with the choices other contributors make on this issue. However, unless a clear blanket policy is established, I'd prefer not to have my contributions edited and links changed merely to display certain 17th or 18th century typographical features, as I do give careful consideration to the ways in which I present and reference quotations, and make every effort to make them of maximum use to the largest number of potential users. Aabull2016 (talk) 17:51, 27 April 2018 (UTC)
I believe quotations should be reproduced as faithfully as possible. --WikiTiki89 18:19, 27 April 2018 (UTC)
Then whenever possible in images. Transcribing to text is only faithful in those texts born and reproduced in English digitally.
If we don't use the long s (or ſ, not the long ſ) in our spellings, it doesn't seem something that we need to preserve in quotations. It's a typographical feature more than a spelling feature.--Prosfilaes (talk) 06:40, 28 April 2018 (UTC)
I definitely think we should reproduce the original text as faithfully as possible, regardless of whether anyone finds it "off-putting". I see no reason to change what we find into something else, when we have the opportunity not to.__Gamren (talk) 05:17, 30 April 2018 (UTC)
Again, then we should use images. We abuse so much text--Ovid never wrote anything like the modern casing and punctuation that reneri assigns to him--why should we throw users under the bus to correctly represent the long s versus the s? It wasn't a spelling issue at the time; G.W., who wrote Magazine or Animadversions on the English Spelling in 1701, didn't think about ſ versus s at all. Why should we pervade our quotes with this distinction?
Moreover, if we want to use this distinction, let's fix Shakespeare first. Go through every cite, pin it to an early edition, and reproduce the spelling and long s exactly. I think it an active detriment if we let Shakespeare be whatever we find in whatever edition, updated as it might be, but make unknowns shoulder an alienating use of the original typography, as if Shakespeare were somehow more modern than his contemporaries.--Prosfilaes (talk) 20:54, 30 April 2018 (UTC)
I tried to do exactly that at 'rosemary', 'poison', 'dagger' and 'soundpost'. Kaixinguo~enwiktionary (talk) 21:09, 30 April 2018 (UTC)
This has been discussed more than once over the years. AFAIK there's no policy mandating or prohibiting the long s, and this thread suggests we aren't likely to agree on one now, so editors can bother to reproduce it, or not, as they choose. In a few entries where the long-ness of the s has influenced the uses/spellings of the word, like windsucker~windfucker, reproducing it seems helpful and so desirable. Otherwise, it has no real benefit. Changing quotations that someone else has added seems like a poor (perhaps rude) use of one's time. - -sche (discuss) 21:09, 30 April 2018 (UTC)
I find the long s obnoxious, similar to (whoever it was -- ReidAA?) trying to reproduce cited text precisely, including line breaks for word wrap, and long chains of &-nbsp; code to force alignment with spaces. As long as we don't actually re-spell things (like Webster 1913 naughtily citing colour and spelling it color, or such) I don't think it matters. As someone said above, it's typography. Equinox 21:16, 30 April 2018 (UTC)
Good to know what I put is obnoxious to you. Kaixinguo~enwiktionary (talk)
Yes, it's good to have feedback from the community about what they think. Equinox 21:39, 30 April 2018 (UTC)

Vote: Unifying on Inflection heading

FYI, I created Wiktionary:Votes/pl-2018-04/Unifying on Inflection heading.

Let us postpone the vote as much as discussion requires, if at all. --Dan Polansky (talk) 05:49, 28 April 2018 (UTC)

New Latin

I've added a section about New Latin to the About Latin page to start a discussion. Many New Latin terms are compiled in dictionaries prescriptively, which leads to a problem when those dictionaries are published by the arbiters of New Latin (e.g. the Latinitas Opus Fundatum in Civitate Vaticana) or arguably experienced Latinists (e.g. professors of Latin). So the question is, are these words to be ignored as unattested, as our own guidelines for inclusion insist, or should this rule be relaxed, allowing for inclusion of terms from a carefully curated list of acceptable dictionaries? --Robert.Baruch (talk) 17:07, 28 April 2018 (UTC)

  • Wikipedia isn't necessarily correct. It even contradicts itself (and thus also [1]) with en:w:Template:Latin periods: "1500–present New Latin". Also it contradicts the entry New Latin.
    As for "On Wiktionary, New Latin is considered to be the same as Contemporary Latin": I don't think so, cp. New Latin. And it's not necessarily correct or stated anywhere as accepted consensus. While Contemporary Latin terms are labelled "New Latin" (e.g. hamaxostichus), New Latin terms aren't necessarily considered to be from Contemporary Latin, i.e. Contemporary Latin might very well be a sub-form of New Latin or be merged into the more general New Latin.
  • WT:CFI#Number of citations (mentions aren't accepted except for some sources) + Wiktionary:About Latin#Attestation (some sources accepted for mentions) makes it clear: as of now, mentions in modern dictionaries aren't sufficient for attesting Latin terms, which makes it a non-open question. However, some terms that don't meet WT:CFI (such as usages in web pages which aren't durably archived, and singly mentioned terms not from Contemporary Latin dictionaries) could be added to Appendix:List of protologisms/non-English#Latin or to an appendix of their own.
  • Contemporary dictionaries (such as the "Lexicon Recentis Latinitatis") are presumably under copyright. While it's OK for an individual to quote a few terms (at least in some countries), it doesn't seem right to use one for Wiktionary entries, as in the end WT could have copied all of its terms.
- 13:16, 30 April 2018 (UTC)

Italiot Greek

We use the term Italiot Greek as a synonym of Griko, i.e. the variety of Greek spoken at the tip of the "heel" of Italy. However, at Wikipedia, Italiot Greek is a cover term for both Griko and Calabrian Greek, spoken at the tip of the "toe" of Italy. Intuitively, I feel like the term ought to refer to all Greek lects spoken in Italy, in which case we should probably rename grk-ita to "Griko language" or something, but does anyone know what the term actually most commonly refers to in the linguistic literature? (It was almost five years ago that we discussed creating new codes for these two lects at all, but I'm not finding a discussion of what precise names we should use.) —Mahāgaja (formerly Angr) · talk 20:59, 28 April 2018 (UTC)

Also (pinging @Aearthrise as our primary Italiot editor), if Wikipedia is to be believed, Calabrian is written in the Latin alphabet while Griko is written in the Greek alphabet, but all of our Italiot Greek lemmas are written in the Latin alphabet, making me wonder if they are actually in fact Calabrian rather than Griko. —Mahāgaja (formerly Angr) · talk 21:08, 28 April 2018 (UTC)
@Mahagaja, Grekànika(Grecanic) and Katoitaliòtika(Κατωιταλιώτικα) are the names given to all the dialects of Greek in Italy: Apulian(from Salento) is called "Griko/Grico", and Calabrian(from Bova) is called "Greko/Greco". Comparatively, Apulian Griko(Γκρίκο) has a wide array of literature, administrative use, and substantial number of speakers, while Calabrian Greko(Γκραίκο) is a rural tongue with a severe lack of literature and administrative use(it is close to becoming extinct).
All of the lemmata I have added for Italiot Greek are from the Apulian dialect; words in Calabrian are very similar to Apulian: Cal.Avri Apul.Avvri(tomorrow), Cal.Jineca Apul.Jineka(woman), Cal.Discolo Apul.Diskolo(difficult)- most of the vocabulary is the same with minor differences in spelling. Their grammar(from what I know) is mostly(or exactly) the same.
As for the correct alphabet of the Grecanic dialects, both Greek and Latin scripts are valid. I initially created Italiot Greek as a regional dialect of Greek, but because of the use of the Latin script(the Greek lemma page would begin with Latin words), we decided to use the grk-ita code. I have ported all of the Latin script words to grk-ita; I still need to complete the transfer of the Greek script words.
I don't believe that we should use separate codes for Calabrian and Apulian Greek on Wiktionary; Glottolog agrees with my opinion: it lumps both together as Apulia-Calabrian Greek.
(Ⲁⲉⲁⲣⲑⲣⲓⲥⲉ) 04:05, 1 May 2018 (UTC)
@Aearthrise: Thanks for that comprehensive explanation. So does anyone object to retiring the code grk-cal, which doesn't even have any lemmas, and clarifying that grk-ita is a cover term for both Apulian and Calabrian Greek? We can make two regional dialects for it if we want: CAT:Apulian Greek already exists as a regional variety of CAT:Italiot Greek language, so we would just need to create CAT:Calabrian Greek as the other one. (I find these names much more helpful than, say, Griko and Greko or Grico and Greco!) Also, Aearthrise, would you be willing to start WT:About Italiot Greek just to clarify how we're using the term and everything else you mentioned above? —Mahāgaja (formerly Angr) · talk 06:05, 1 May 2018 (UTC)
If there were some way to have the Latin letters sort after the Greek letters, then I think it would be better not to have a separate language code for Italiot Greek. --WikiTiki89 15:39, 1 May 2018 (UTC)
I don't know about that. Glancing through a few Italiot lemmas, I find several that are quite different from their Greece-Greek synonyms: ammài is not just a Latin-alphabet spelling of μάτι (máti), while ajarài and πεταλούδα (petaloúda) are completely different words. I don't know how paparasciànni would even be spelled in the Greek alphabet since standard Greece-Greek doesn't have a /ʃ/ sound. I definitely agree with keeping Italiot a distinct language. Incidentally, we should get rid of CAT:Italiot Greek, which is categorized as a regional dialect of Greek, and keep only CAT:Italiot Greek language. —Mahāgaja (formerly Angr) · talk 17:04, 1 May 2018 (UTC)
So what? Not every Italiot entry needs to have a Greek-alphabet equivalent for them to be treated as the same language. --WikiTiki89 17:15, 1 May 2018 (UTC)
Keeping Italiot Greek separate would be consistent with what we've done in similar cases, e.g. the recently-discussed Mariupol Greek, and/so it seems reasonable to me, unless speakers themselves were to insist their lect was just a dialect. - -sche (discuss) 17:12, 1 May 2018 (UTC)
All I was saying is that if the sorting order is the only reason, it would be preferable to fix the sorting order rather than split it off. --WikiTiki89 17:15, 1 May 2018 (UTC)
@Wikitiki89 the sorting order was the reason for creating grk-ita (grk-cal did exist beforehand); at this point, most of the lemmata are under the Italiot Greek tag; it would be easy to return them to the Greek page.
@Mahagaja I will be honored to write the WT page!
(Ⲁⲉⲁⲣⲑⲣⲓⲥⲉ) 19:58, 1 May 2018 (UTC)

Compounds lists: 南河

On 2017/2/24 07:44, User:Wyang made the page for 南河. But it was only today, more than a year later, that 南河 was added to the list of compounds including 南 on the page. I manually added 南河 to the 南 compound list. I have no knowledge of computer programming, but I think that there should be some way to automatically 'grab' all the pages with a given character and add them to the compounds lists. --Geographyinitiative (talk) 22:56, 28 April 2018 (UTC)

Sounds like you're looking for the Grease Pit, not the Beer Parlour. Korn [kʰũːɘ̃n] (talk) 21:56, 30 April 2018 (UTC)
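
The kind of automatic 'grab' asked about above can be sketched roughly as follows. This is only an illustration in Python, not a description of how Wiktionary's modules or bots actually build compound lists: the function names are hypothetical, and it assumes the list of page titles has already been obtained somehow (for example from a database dump).

```python
def compounds_containing(character, titles):
    """Return the multi-character titles that contain `character`, sorted.

    `titles` is assumed to be an iterable of page titles that was
    collected beforehand; this sketch does not fetch anything itself.
    """
    return sorted(
        title
        for title in set(titles)
        if character in title and len(title) > 1
    )


def missing_from_list(character, titles, listed):
    """Return compounds of `character` that are not yet in the existing list."""
    already_listed = set(listed)
    return [
        title
        for title in compounds_containing(character, titles)
        if title not in already_listed
    ]


# Hypothetical sample data: four page titles, one compound already listed.
sample_titles = ["南河", "南", "河南", "北京"]
to_add = missing_from_list("南", sample_titles, ["河南"])
```

With the sample data, `to_add` holds the one compound that still has to be added to the list. Whether such a list should then be inserted into the entry by a bot or generated by a module is a separate design question, which is more of a Grease Pit matter.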


We need to decide if Gheg is treated as a language or not. I don't know enough about Albanian to have an opinion on how we treat it, but having it be both a language and a dialect is confusing. I saw that there was a discussion about this in 2011, and it was ruled inconclusive. Any thoughts? – Gormflaith (talk) 23:39, 30 April 2018 (UTC)

That's a fun discussion to read, lots of classic Dick Laurent. I agree that the combined macrolanguage-dialect method is problematic and should be discussed. The Compendium of the World's Languages says: "Tosk and Gheg are mutually intelligible, and differ indeed only in certain points - most importantly in the rhotacism of Tosk: Gheg -VnV- = Tosk -VrV-; e.g. Gheg zani 'voice', Tosk zeri; and in the formation of the future tense (see Verb, below)." As a result, I would support a merger into sq with dialectal forms found only in Tosk or Gheg but not in Standard Albanian to be labelled and categorised as such. —Μετάknowledgediscuss/deeds 00:03, 1 May 2018 (UTC)
Pinging @Etimo as our resident Albanian-speaker: should Gheg and Tosk be considered one language? Are they mutually intelligible? Do native speakers/scholars/references think of them as one language? - -sche (discuss) 17:16, 1 May 2018 (UTC)
@Torvalu4 too, who I think has some knowledge of Albanian. --Per utramque cavernam 08:00, 2 May 2018 (UTC)
My position on the matter is basically that of Μετάknowledge. I've never read any expert treat the 2 as separate lang.s. They're really dialect groups and there's a subtle progression from one extreme to the other. The differences btw. the 2 extremes aren't minor though; it would be like the difference btw. Southern American and Cockney. That contrast would stretch intelligibility to the max, but they're still English. I think the best way to deal with them is to enter them all under "Albanian" (no qualification) and then add a label (Gheg, etc.) in the definition block. Incidentally, Arvanitic (aat) and Arbëreshë (aae) are treated differently by Wikt as well, but they're Tosk dialects. Torvalu4 (talk) 02:06, 3 May 2018 (UTC)

Technically speaking, Gheg and Tosk can be considered dialects of the same language, not two separate languages, as they differ only in some phonetic features. They are mutually intelligible, although intelligibility in some cases could be affected by the speaker's cultural level. A similar debate arose in Albanian academic circles some time ago, although it was mainly of a political nature, considering that the Gheg dialect has the oldest surviving literature and the total number of Gheg speakers slightly exceeds that of Tosk speakers (so that, on this argument, Gheg rather than Tosk should be considered standard Albanian). Tosk was made the standard Albanian language at the party-sponsored linguistic congress of 1972, as Albanian communists came almost exclusively from the South, thus making it a political decision which was not without repercussions. However, some efforts have been made to introduce Gheg features in standard Albanian in order to better harmonize and to somehow reconcile the two languages. Etimo (talk) 07:26, 2 May 2018 (UTC)

OK, it sounds like they should be considered one language. I will start reheadering our few Gheg entries. - -sche (discuss) 01:28, 7 May 2018 (UTC)
I've merged aae (Arbëreshë Albanian), aat (Arvanitika Albanian), aln (Gheg Albanian) into sq. - -sche (discuss) 23:10, 11 May 2018 (UTC)
  • I would like to ask you guys one question: do you speak Albanian? If yes... gegë or toskë - or both dialects?
  • I speak the gegë dialect. However, I had to LEARN the toskë dialect. And no, there are not just "few" differences between toskë and gegë. I SPEAK gegë but had to LEARN toskë at school. Just like Swiss ppl SPEAK "schwizerdütsch" but LEARN High German at school. If Swiss German is considered a "language" - then I guess gegë and toskë should be treated the SAME way. It would be nothing but ... fair.
  • Albanian is its own branch. Swiss German isn't: it belongs to the Germanic branch, where many other Germanic languages are also found.
  • Not to mention that Old Albanian is in the gegë dialect.
  • I suggest: ghg for Gegë, "Gheg" Albanian.
  • Thanks.

May 2018

Renaming Taino to Taíno[edit]

@-sche, Metaknowledge, I'd like to broach renaming Taino to Taíno again.

  1. Taíno is the spelling commonly used in academic papers (the main source of reconstructions), Ethnologue, and Wikipedia.
  2. It is illustrative of its proper pronunciation, /taɪˈinoʊ/, and not /ˈtaɪnoʊ/.

--Victar (talk) 16:52, 2 May 2018 (UTC)

No opposition from me.
Minor quibble: wouldn't the correct pronunciation be /ta.ˈinoʊ/ instead? ‑‑ Eiríkr Útlendi │Tala við mig 17:06, 2 May 2018 (UTC)
@Eirikr: I normally pronounce it more like /təˈinoʊ/, but I've often heard /taɪˈinoʊ/, as in naïve. --Victar (talk) 17:13, 2 May 2018 (UTC)
Thank you. Part of what prompted my question was the stress, which you'd fixed in the meantime.  :) ‑‑ Eiríkr Útlendi │Tala við mig 17:19, 2 May 2018 (UTC)
Cheers. --Victar (talk) 17:21, 2 May 2018 (UTC)
The same argument about why we have Maori rather than Māori seems to apply here: we should avoid diacritics, which are difficult for most users to type, wherever they are not universally used. In this case, I don't care too much either way, since the language is such a niche interest and I don't think anyone will be caused problems by the change. But I do note that when I search for a phrase like google books:"spoke Taíno", the majority of the results actually have "Taino". —Μετάknowledgediscuss/deeds 17:22, 2 May 2018 (UTC)
@Metaknowledge, Thanks for the reply. I addressed both those concerns in the last discussion. 1. Taino is a purely academic reconstructed language with a very limited lexicon, so ease of typing is something of a moot point. 2. Doing a search comparison for Taíno is innately flawed because a) OCR software often does not recognize i-acute, and b) diacritical marks are commonly dropped from non-academic sources. --Victar (talk) 17:38, 2 May 2018 (UTC)
And correct me if I'm wrong, but Maori and Māori would both be pronounced the same in English, unlike Taino and Taíno, the í representing a breakup of the diphthong. --Victar (talk) 17:47, 2 May 2018 (UTC)
Responding to your points... I agree with #1, which is why I'm abstaining on this. Your 2a is irrelevant, because I went and looked at the actual previews of the books where possible; you can come up with your own search string and try the same. 2b is not a flaw, but actually an important fact: if a lot of people writing about this language are not doing so in academic contexts, maybe we shouldn't use an academic spelling. —Μετάknowledgediscuss/deeds 17:57, 2 May 2018 (UTC)
@Metaknowledge, to give an example of failed OCR, see results #11 and #13 of google books:"Taíno language", Caciques and Cemi Idols and An Account of the Antiquities of the Indians.
Regarding 2b, I think the context is very relevant. When the book is actually about the Taíno language and people or even the Caribbean peoples at large, I've found Taíno very commonly used, yet when Taíno is just a passing mention, we often find Taino. Also, to restress, this is an academic language, so I believe academic publications should hold more weight than, say, a biography about Christopher Columbus. --Victar (talk) 18:19, 2 May 2018 (UTC)
But the passing references still count towards establishing what the common name of the language is, especially if they greatly outnumber the specialist works that use the diacritics, as seems to be the case here. I'd still prefer the diacritic-less name since it's (more?) common and easier to type, but I don't feel strongly about it. - -sche (discuss) 18:32, 5 May 2018 (UTC)
@-sche:, well, if there are no strong objections, which seems to be the case, and since I appear to be the only one interested enough in adding entries for the language, would you mind renaming it to Taíno? --Victar (talk) 04:48, 7 May 2018 (UTC)
Also, FWIW, at least some of the hits showing as non-accented Taino in the excerpt on the hits page actually show up as accented Taíno when viewing the preview. I suspect the opposite might also happen, due to the vagaries of Google's OCR processing. ‑‑ Eiríkr Útlendi │Tala við mig 17:39, 2 May 2018 (UTC)

Indonesian vs. Indonesian Malay[edit]

I thought I had already asked this a few months ago, but I can't find the discussion now, so maybe I only imagined asking it. What is the difference between Indonesian language and Indonesian Malay? Do we need both categories? —Mahāgaja (formerly Angr) · talk 18:16, 2 May 2018 (UTC)

No, I definitely remember you asking that. Lemme check. --Per utramque cavernam 18:20, 2 May 2018 (UTC)
There was a plan to merge Indonesian into Malay and use {{lb}} appropriately, but the vote never started. —Suzukaze-c 18:23, 2 May 2018 (UTC)
Yes, but I'm pretty sure the discussion Mahagaja is referring to is more recent than that. I even remember someone posting a link to a Wikipedia article, possibly this one, and Mahagaja was satisfied after that. --Per utramque cavernam 18:25, 2 May 2018 (UTC)
Can't find it. --Per utramque cavernam 18:49, 2 May 2018 (UTC)
I definitely don't remember being satisfied, though I may have given up in frustration. At any rate, if the previous discussion can't be found, the question stands: What, if anything, is the difference, and do we need both categories? —Mahāgaja (formerly Angr) · talk 19:10, 2 May 2018 (UTC)
I can't find the discussion either. Maybe it somehow got deleted. — Eru·tuon 19:12, 2 May 2018 (UTC)
Do you also remember this discussion taking place? (I'm slowly starting to feel like I imagined it too, but it couldn't have happened to three of us). --Per utramque cavernam 19:23, 2 May 2018 (UTC)
It's the top discussion here. Wyang (talk) 22:09, 2 May 2018 (UTC)
Hah! This is so satisfying, thanks a lot. --Per utramque cavernam 22:16, 2 May 2018 (UTC)
Thanks, Wyang! WT:Feedback doesn't get archived, just erased, which explains why we were unable to find it. —Mahāgaja (formerly Angr) · talk 22:17, 2 May 2018 (UTC)
Wow, you found it! It's unfortunate that the useful discussions on that page simply vanish into thin air, so to speak. — Eru·tuon 01:01, 3 May 2018 (UTC)
If I happen to see a particularly useful discussion, I aWa-archive it to the most relevant talk page I can find so that it's still findable. A lot of the threads are tosh, but we could either get more users in the habit of archiving the useful ones (especially if a discussion leads to improvement of an entry, it can be useful to have the discussion on the entry's talk page), or just save them all as is done with translation requests. - -sche (discuss) 04:26, 3 May 2018 (UTC)
I would be inclined to support a merger. There seems to be a massive amount of duplication. Perhaps we should (re-)start the vote? - -sche (discuss) 18:53, 10 May 2018 (UTC)
The Malay language in modern context is the one used in Brunei, Malaysia and Singapore which is different from the one used in Indonesia which is called Bahasa Indonesia. So I think there should be a distinction between these two languages. --Tofeiku (talk) 07:18, 12 May 2018 (UTC)

Being nice[edit]

What is the community feeling about this kind of thing? [2] (StackOverflow is one of the main question-and-answer sites for computer programmers.) Are we bad unkind people? Are we causing problems for women and non-white people? Equinox 00:39, 3 May 2018 (UTC)

Do you actually care? DTLHS (talk) 00:39, 3 May 2018 (UTC)
Yes. Equinox 00:41, 3 May 2018 (UTC)
en.wikt is not as welcoming as some other communities, even other WMF ones. The fact that there is a "women problem" is empirical (i.e. the vast, vast majority of contributors are men). As for solutions, ¯\_(ツ)_/¯. —Justin (koavf)TCM 01:04, 3 May 2018 (UTC)
We are a feisty bunch. Some of it is somewhat viciously circular: we don't have many editors relative to WP, so those we do have may be relatively overworked, so partially-OK but e.g. malformatted edits (which are perhaps more possible here than on WP because our formatting is more rigid) are often rolled back rather than cleaned up or even undone with a specific explanation. Sometimes, editors who make such edits are harangued on their talk pages or forums like RFC/RFV/RFD. This contributes somewhat to our non-retention of new editors, including some prolific ones, which means we don't have many editors, which brings us back to the first point in the circle. Also, established users sometimes reflexively defend each other even when [newbies'] complaints (e.g., that someone should have fixed a partially-OK edit rather than rolling it back...) are probably reasonable. (I'm sure I've been the defender, the defendant and the complainant at different times in such situations before.) Of course, we also attract quite a few editors who persist in making low quality or incorrect edits, whether out of ignorance or POV or malice, who we benefit from shutting down without all the quasi-"due process" of Wikipedia. - -sche (discuss) 19:07, 6 May 2018 (UTC)
OT DCDuring (talk) 20:54, 6 May 2018 (UTC)
LOL. - -sche (discuss) 21:10, 6 May 2018 (UTC)
The other side of that is tolerating "things" (racism, sexism) from contributors because they have some rare resource or know an obscure language. DTLHS (talk) 21:37, 6 May 2018 (UTC)

Category:Old French nouns in Hebrew script[edit]

Is this valid? DTLHS (talk) 01:16, 3 May 2018 (UTC)

Why would that be surprising? I'm guessing French Jews wrote the local tongue in Hebrew script, like Jews in many places.--Prosfilaes (talk) 04:13, 3 May 2018 (UTC)
The entries themselves are valid results of (reasonably, IMO) considering Judeo-French zrp to be the same language as Old French fro. I would have thought the category was valid, as we at one time also had Category:Afrikaans nouns in Arabic script, but that category is now empty and the words it contained are simply categorized with other Afrikaans nouns, so maybe we have gotten away from such categorization. - -sche (discuss) 04:20, 3 May 2018 (UTC)
I don't think we need to categorize by both part of speech and script. It would suffice to put these in both Category:Old French nouns and Category:Old French entries in Hebrew script. --WikiTiki89 14:30, 3 May 2018 (UTC)
That's reasonable. I suppose "entries in Hebrew script" might not even need their own category of any sort, since they are usually findably grouped together in the lemmas and non-lemmas categories (and other categories). So, I suppose this category should be removed from the entries that call for it. - -sche (discuss) 18:55, 6 May 2018 (UTC)
I've removed this category from the entries, and deleted the empty Afrikaans categories, but we have many such categories; see e.g. this search for "nouns in" "script". - -sche (discuss) 18:52, 10 May 2018 (UTC)

Tropical cyclone names[edit]

I'm planning to create entries for the names of tropical cyclones. They are easily attestable and sometimes have interesting etymologies. (Note that tropical cyclone names are recycled and thus they are not names of specific entries.) See Haikui for an example. However, I don't know whether this is within the scope of Wiktionary, and I want to gather some comments about how the entries should be formatted (including: whether to use an English or Translingual header, how the definitions should read, whether we should link to Wikipedia articles about specific cyclones with such names, etc.). --Zcreator alt (talk) 14:36, 3 May 2018 (UTC)

  • That seems to be OK, but I think they need a better definition (and a link to -pedia). SemperBlotto (talk) 14:40, 3 May 2018 (UTC)
  • You should create a template to use for the definitions. That way all the definitions will be the same and we can easily edit it in one place. I suggest a formatting similar to {{tropical cyclone|en|typhoon}}. --WikiTiki89 14:43, 3 May 2018 (UTC)

Format for given names in non-Latin scripts[edit]

It seems as though given names are being treated in different ways by non-Latin script languages. In Persian, so far most given names have had a red link for an English entry. On the other hand, some other languages are not linking and just using {{given name}} to display 'A female/male given name'. For example, Bengali, Georgian and Hindi appear to not link usually (e.g. 'श्वेता', exception: 'शबनम'), but Armenian and Persian do give links. Arabic and Hebrew appear to be mixed and Russian links to equivalent names.

If there is no English entry/Romanisation, it might be more difficult to find information. For example, it is unlikely anyone would ever type in 'ğonče' to find غنچه (ğonče), but they will find it in its current format by typing 'Ghoncheh'. This could create a large number of potential entries in English, though.

Should there be a policy across all non-Latin script languages to either link to a future English entry or to not link? Kaixinguo~enwiktionary (talk) 09:20, 5 May 2018 (UTC)

Questions in this topic area (of how to handle names, especially English/Latin-script forms of foreign names) have come up several times over the years and have no easy unproblematic answer. The entries should use the "given name" template, and then if the name has a common English counterpart / translation, provide it after a colon: "A male given name: Ghoncheh." Whether or not there should be a link and an ==English== entry for "counterpart" probably needs to be decided on a case-by-case basis: if there are native English-speakers with the name, e.g. children of Iranian immigrants, we should have an entry that defines it as an English given name of Persian origin. But if the name only occurs when foreign bearers' names are rendered into English / Latin script, there has historically been less agreement about how to handle it. (I may dig up links to previous discussions later.) - -sche (discuss) 19:18, 6 May 2018 (UTC)
Thanks. Amongst the languages other than Persian that I looked at, several do not follow the given name template with the transliteration/Romanisation/whatever it is.
I posted a question at the Information Desk asking about the punctuation. At the moment the template uses a comma rather than a colon, so I was wondering if it should be modified.
You are right about the distinction in how the names are being treated 'in English'. There are two categories, Category:English_female_given_names_from_Persian and Category:en:Persian_female_given_names. I'm not thinking about that so much as the format of the Persian- or foreign-language entries in non-Latin scripts and whether they should take a common approach. Kaixinguo~enwiktionary (talk) 21:22, 6 May 2018 (UTC)
Oh, I have no strong feelings about whether the punctuation should be a comma or a colon. Also, I could understand omitting the second part (the English 'version' or customary transliteration) if there isn't one, as could be the case with names from more obscure languages. But I think the first part (the {{given name}} template) should always be used (right?), except maybe on alt forms of names (which could just use {{altform}}), although even there it could be used (as on Sara). - -sche (discuss) 06:24, 10 May 2018 (UTC)


Birgit Müller (WMDE) 14:45, 7 May 2018 (UTC)

Upcoming reference template votes[edit]

After some renewed sparring over the layout of reference and citation templates (in which both sides have become frustrated and acted poorly), I will soon be setting up a vote to standardize the reference template layouts. The proposed text of the vote is as follows:

All citation templates and citations should use the {{cite-*}} or {{R:Reference-meta}} templates underlyingly. All removal of citational information found as a parameter to the {{cite-*}} or {{R:Reference-meta}} templates shall only be motivated via agreement among editors.

As part of this, {{R:Reference-meta}} will need to be brought into alignment with {{cite-*}} where it differs. Please provide comments. —*i̯óh₁n̥C[5] 19:25, 8 May 2018 (UTC)

@Dan Polansky, Sgconlaw, Metaknowledge, Ungoliant MMDCCLXIV, Victar —*i̯óh₁n̥C[5] 19:27, 8 May 2018 (UTC)
The proposal looks a bit abstract to me. What does this entail exactly? What difference in display will it cause? --Per utramque cavernam 19:31, 8 May 2018 (UTC)
@Per utramque cavernam: The context is an ongoing disagreement with Dan about what information should be in reference templates and how they should be formatted. Dan prefers that they be simple and not "ornamented" with information that Sgconlaw and I consider to be valid academic citation material. The normal line of argument is that Dan feels there is a "status quo ante" of simple references and that we should not diverge from that. I am proposing this vote to prevent this line of argumentation and the deletion of valid citational information. There is a separate complaint from Dan that our citation method is "not usual academic practice", and to be sure, en.Wikt has always been a bit idiosyncratic in our layout and in the information we allow (specifically regarding ISBN's), but I do not think we stray particularly far from recognizable academic standards. The advantage of this vote, however, is that if we decide to change our citation standards at any point, we only need to edit a handful of templates to change the whole project. To summarize, I would like to prevent the removal of (what I consider) valid academic information and would like to centralize everything so all changes may be applied uniformly. —*i̯óh₁n̥C[5] 19:46, 8 May 2018 (UTC)
@JohnC5: Ah yes, I've seen that debate happen in a few places.
So, does the proposal mean that any citation template should make use of all the parameters (or all the relevant parameters) of the aforementioned underlying templates; i.e., that they should always be as complete as possible?
While I tend to prefer a more succinct format, I appreciate thoroughness and accuracy too. Could we imagine having the "short form" by default, and an "expand/see more" button for the full view?
I think @Widsith could be interested in this, as I remember him challenging the length of some quotation templates. --Per utramque cavernam 20:10, 8 May 2018 (UTC)
I can understand making the body of an article more succinct and I often have to restrain myself from adding 20 alternative forms, but I don't see any valid reason to hold that same standard to content in the footer. --Victar (talk) 20:27, 8 May 2018 (UTC)
But it's not always used in the footer. For example, {{RQ:Browne Pseudodoxia Epidemica}} is used in the body of the entry; see latitancy. --Per utramque cavernam 20:37, 8 May 2018 (UTC)
Is this vote meant to apply to RQ templates? DTLHS (talk) 20:40, 8 May 2018 (UTC)
@DTLHS: Under this formulation, no. —*i̯óh₁n̥C[5] 20:51, 8 May 2018 (UTC)
@Per utramque cavernam: In your example, it's collapsed content, which I'm also not concerned by the length of. If there are examples of this that are not collapsible, I recommend that they be made so. --Victar (talk) 21:08, 8 May 2018 (UTC)
@Per utramque cavernam: So your question about what parameters will be used is crucial. When we cite printed material, even digitized print material, we are almost always citing a particular version from a particular year. This information should be represented in the citation. The user should be able, with the appropriate library, to find the appropriate version from which a claim is made. Some of these books differ wildly in the content chronologically, and finding the correct one is important. Thus providing the standard bibliographic information (author, title, section/chapter/article title, year, [editors], location, publisher, [pages]) is crucial. As to the ISBN's specifically, I could take them or leave them. The issue is how we decide the threshold of what is too little to be accurate and what is excessive. This information should be used (in part) to disambiguate where there are multiple versions. In the case of entirely online dictionaries like dictionary.com, less information would be stable or relevant so less would be included (there might not be author(s), publisher(s), location(s), edition(s), etc.).
As for short vs. long form, if we can do this via css, I'm all for it. I'm less excited about appendices or alt text. —*i̯óh₁n̥C[5] 20:49, 8 May 2018 (UTC)
I'd vote yes to that. If someone believes a parameter should be removed or the formatting of templates altered, that should be discussed on {{cite-*}} and {{R:Reference-meta}}, respectively. --Victar (talk) 19:40, 8 May 2018 (UTC)
Seems reasonable to me. Not sure what the problem is to be honest. —AryamanA (मुझसे बात करेंयोगदान) 22:36, 8 May 2018 (UTC)

──────────────────────────────────────────────────────────────────────────────────────────────────── I was going to start a discussion on this; thanks for beating me to it. The background to the issue is that there is a difference of opinion over what format the so-called "dictionary-related" reference templates (i.e., templates like {{R:Online Etymology Dictionary}} and {{R:Webster 1913}}) should have. It seems to me that the following options are available:

  • Option 1: All reference templates should follow the {{cite}} family or {{R:Reference-meta}} formats, which means that full citation information (including imprint information – e.g., edition number, place of publication, name of publisher, year of publication, and ISBN or OCLC number) can or should be provided.
  • Option 2: Dictionary-related reference templates should have a simpler format (entry not enclosed in quotation marks, little or no imprint information provided), while other reference templates should follow the {{cite}} family or {{R:Reference-meta}} formats.
  • Option 3: All reference templates should have a simpler format.

My preference is for option 1. It seems to me that we need to try to achieve consensus on the following issues:

  • Should we try to maintain consistency in the formats of all reference templates and quotation templates? (Quotations are generally formatted using the {{quote}} family of templates, and the {{cite}} family of templates is aligned with the former.)
  • If yes, should we standardize all reference templates with the {{cite}} family of templates, {{R:Reference-meta}} (which I created to try and bridge the differences between options 1 and 2 above), or some other format?
  • If no, what are the reasons for treating the dictionary-related reference templates differently? How do we consistently identify these templates?

SGconlaw (talk) 22:37, 8 May 2018 (UTC)

  • The further issue is, if we choose Option 2, how will we differentiate when there are multiple, conflicting versions of a dictionary and, if we choose Option 3, how will we differentiate more fiddly scholarly sources?
  • I would be in favor of standardizing to the {{cite}} family of references, as they more closely mirror the {{quote}} family in certain ways.
  • We should decide on the quotation-marks-around-cited-dictionary-entries issue. If we decide to have quotation marks (which I favor), we should build this functionality into all the templates.
*i̯óh₁n̥C[5] 23:19, 8 May 2018 (UTC)
I'm sympathetic to the idea that references don't need so much clutter, though since they're mostly limited to the end of entries (and could be collapsed, I guess, if present halfway through a multi-etymology entry) and can still be input by short names like {{R:MWO}}, I suppose it doesn't matter much how cluttered they are. However, a caution: online references will need to have fewer details in order to be accurate. For a printed dictionary, you could reasonably expect people to cite which edition / publication year they found a term in, but online dictionaries often change versions without anyone noticing ... for example, for more than a decade, Template:R:Dictionary.com referenced "v1.0.1" and the "2006" version of the site, even though the template continued to be applied while the site was updated, including probably to at least some words that weren't present in the 2006 version of the site. So, it was good that the version and date were stripped out of the template.
(And adding e.g. an access date would probably be unhelpful, since if a later editor can't find the term in the dictionary anymore, they can hardly confirm that the earlier editor was telling the truth that they found it in the site, and maybe the entry was removed for inaccuracy. Only in limited circumstances like writing a usage note about how dictionaries formerly included a term does it seem useful to cite e.g. an archived version of a no-longer-present entry in an otherwise-still-online dictionary.)
- -sche (discuss) 15:39, 10 May 2018 (UTC)
  • I have strong feelings about the unwieldy size of RQ-format citation templates, but I'm not clear whether this actually applies to them. Does it? And if not, perhaps we should consider all these similar templates at the same time. Ƿidsiþ 12:48, 17 May 2018 (UTC)
    • I'd say that's an issue which can be discussed separately, otherwise the current discussion will become unwieldy. — SGconlaw (talk) 16:09, 17 May 2018 (UTC)
      • Like the templates! Ƿidsiþ 16:38, 17 May 2018 (UTC)

WT:CFI and "clearly widespread use"[edit]

Does "clearly widespread use" have any currency? It seems like we always slap {{rfv}} on entries and ask for 3+ durable cites (and they HAVE to be durable, AND they must be uses), regardless of widespread use, which usually doesn't go over too well for internet slang, and then we have minor debates every so often about whether the Internet Archive is durable, or how we could amend WT:CFI to suit the digital age. —Suzukaze-c 03:07, 10 May 2018 (UTC)

What is an example of a page that failed RFV despite "widespread use"? DTLHS (talk) 03:12, 10 May 2018 (UTC)
Anyway, I view that line as a way to pass RFV without doing the work of actually putting the quotations in the entry, not that they don't have to exist. DTLHS (talk) 03:15, 10 May 2018 (UTC)
I was thinking mostly about internet slang, but I indiscriminately excised part of my text and accidentally removed that part. I edited my original post to fix it. Regional words might fit in this category too, and social media might be used as proof of "clearly widespread use".
IIRC there was an RFV for some gaming-related term some time ago, and someone mentioned that it seemed to be used a lot but not in published literature, and the entry didn't pass. I can't find it though. —Suzukaze-c 03:25, 10 May 2018 (UTC)
I think we should remove the "widespread use" rule as it is not well-defined. Instead we should reconsider WebCite and Internet Archive - Usenet is no longer in fashion and is English-biased.--Zcreator alt (talk) 04:39, 10 May 2018 (UTC)
I only remember using or seeing used "widespread use" to avoid the effort of citing something that was hard to cite but within the experience of almost everyone or almost every native speaker. I don't think it has been abused.
It's harder to apply for regional use because we may not have multiple editors familiar with such use. DCDuring (talk) 09:31, 10 May 2018 (UTC)
I generally agree with DTLHS and DCDuring. When I tended RFV a lot (a tiring task Kiwima is now the most prominent doer of), I didn't consider it useful to sticklerishly make anyone (including me!) spend time typing up three citations, in those pre-aWa days, as long as they could link to them or provide search terms that would find them or something. But if there simply aren't three citations to be found, a term can't be that widespread, now can it? "Widespread use" is a useful shortcut (including against trolling nominations of words like "shirt"), IMO, but valid citations still need to exist. - -sche (discuss) 15:21, 10 May 2018 (UTC)
Trolling nominations can be handled by w:WP:SNOW without adding a vague rule.--Zcreator alt (talk) 19:57, 10 May 2018 (UTC)
How is that not about as vague as "widespread use"? Widespread use is at least understandable at first glance, whatever the theoretically possible problems with its application. BTW, does anyone have specific instances of the principle having been misapplied? DCDuring (talk) 20:25, 10 May 2018 (UTC)
Not a positive example: I think OP's position is that it isn't being applied enough: words which don't have three citations but which are used a lot online or regionally aren't being passed as "in widespread use" (because, I might argue, they aren't in widespread use). - -sche (discuss) 20:42, 10 May 2018 (UTC)
I wouldn't mind seeing examples of that. It is only after something is RfVed that "widespread use" is invoked. If the objection is to excessive use of RfV, people RfV things for lots of reasons, including being tired and cranky at the time. Also please consider the vast number of definitions that have not been challenged.
RfV are a means of documenting that a term means what our definition says it means. Existence for many words is a non-issue. RfV is especially appropriate for words not found in other dictionaries.
I would not object to easing citation requirements for terms that could be found in sources of a quality similar to DARE, though copyvio is an issue. DCDuring (talk) 21:21, 10 May 2018 (UTC)
I would object to that. The OED has a ton of garbage in it that we shouldn't be reproducing. DTLHS (talk) 21:25, 10 May 2018 (UTC)
@DTLHS, I beg your pardon, is that ironically pertinent to the topic? I could see a few parallels. Rhyminreason (talk) 11:09, 14 May 2018 (UTC)
I don't know what you mean by ironically pertinent. The OED has many words where the only citation is from some dictionary from the 18th century. They also include words that we would classify as Old English / Middle English / Scots. The point is that just copying from other reference works is going to lead to trouble since they have different inclusion criteria. DTLHS (talk) 16:08, 14 May 2018 (UTC)
I'll admit that I've invoked the clause for things that are common but not commonly committed to durable media. One example is Unsupported_titles/Hyphen_vertical_line_vertical_line_hyphen (BTW: why isn't that rendering properly?), which I chose to enter because I was and am confident it would be recognized by most or all native speakers, should any be invited (e.g. from da.wiki, since there are few here). I agree that this sort of reasoning can easily become problematic ("but I've seen it many times!"), and that we should clarify what exactly the clause means.__Gamren (talk) 17:13, 15 May 2018 (UTC)
  • I've always understood this to be shorthand for "Shall we all agree that finding three quotes is unnecessary?" But if someone disagrees, three quotes should always be demonstrated IMO. Ƿidsiþ 12:45, 17 May 2018 (UTC)
If that's all it means, I feel we should get rid of it, since in those situations supplying three quotes is really not burdensome, and in any case entries are always improved by having quotes. In my experience, RFVs of easily verifiable words are not frequent enough to be disruptive.__Gamren (talk) 18:42, 19 May 2018 (UTC)

Our representation of Polish ⟨y⟩ (my primary concern) and handling of consensus (a side note)

I recently came across the fact that we erroneously transcribe Polish ⟨y⟩ with ⟨ɨ⟩ in IPA, for no good reason. So I changed the IPA module, but I've been told to seek consensus beforehand - which I already had done to no avail. I then was told that lack of response does not equal consensus. While I find that an understandable notion, I also think that it is a dangerous one as it prevents users from fixing errors simply by the majority, frankly my dear, not giving a damn, completely going against the be bold principle that the Wiki projects were based on and ossifying those parts of Wiktionary less frequented.
Be that as it may, from my writing you see that I consider ⟨ɨ⟩ to be an error. As far as I'm aware, ⟨y⟩ is never pronounced with any sound even close to a high central vowel and authors using this character are not contesting this. The actual sound is close-mid, slightly fronted, and potentially raised to near-close. I think this is because until recent times it was [ɪ], as conservative dialects use this sound, but historical Polish is nothing I'm acquainted with. I had changed the module to use [ɪ̽] for ⟨y⟩ and think this is what we should do. (Although en.Wikipedia uses ⟨ɘ̘⟩, I believe.) For disclosure: The letter is transcribed thus in broad structural descriptions of Polish phonology, although I do not understand why. I think the reasoning behind it was that Rocławski chose to consider [ɪ̞̈] to be basically [ɪ̈], which he then represented with ⟨[ɨ̞]⟩ and then simplified to ⟨/ɨ/⟩. I find that approach too imprecise, especially considering that the actual range of the vowel never even reaches near-close according to Rocławski as presented on Wikipedia (I don't have access to the actual book), and thus not exemplary. We're a dictionary, not a work for peer review in structural linguistics; we owe it to our users that our pronunciation section actually gives them a rough idea how to pronounce the word instead of teaching them to speak a language with a Russian accent. Korn [kʰũːɘ̃n] (talk) 09:34, 11 May 2018 (UTC)

@Wikitiki89 As the one reverting me, are you actually for using ⟨ɨ⟩? Is there actually consensus for ⟨ɨ⟩ or is it just happening to be the one implemented first by some ancient user who did a lot of work? And if you want to avoid diacritics in broad transcriptions, why use a character with one (centralising bar) instead of ⟨ɘ⟩, which does not have a diacritic and is actually close to the actual value? Korn [kʰũːɘ̃n] (talk) 11:13, 14 May 2018 (UTC)
I recently spoke with my Polish friend (he has very little exposure to Russian) about phonetics and we agreed, among other things, that the Russian "ы" and the Polish "y" sound identical = /ɨ/. If something sounds similar in similar languages, why should this be avoided? --Anatoli T. (обсудить/вклад) 11:22, 14 May 2018 (UTC)
The question is not 'why avoid ⟨ɨ⟩', that's basically an inversion of the burden of proof. The question is 'why consciously choose ⟨ɨ⟩ when we know that it is not the IPA character representing the sound most closely'? Korn [kʰũːɘ̃n] (talk) 11:38, 14 May 2018 (UTC)
@Korn: Please have a listen to Polish ryba (fish), Russian ры́ба (rýba, fish), Ukrainian ри́ба (rýba, fish) and Belarusian ры́ба (rýba, fish). Does the first vowel sound different to you? (Czech, Slovak or South Slavic have a different sound here). --Anatoli T. (обсудить/вклад) 11:41, 14 May 2018 (UTC)
They sound nothing alike to me. White Russian and Ukrainian I can differentiate roughly when hearing them back to back, but those I would count as "roughly the same vowel", but Polish and Russian sound worlds apart to my ear and both at least recognisably different from be/uk. Korn [kʰũːɘ̃n] (talk) 11:48, 14 May 2018 (UTC)
  • The Ukrainian and Belarusian files sound the same to me, as approximately [ˈrɪba ~ ˈri̠ba]. The Polish and Czech also sound the same to me, but somewhat further back than uk and be, approximately [ˈrɨ̞ba]. The Russian sounds somewhat diphthongal, approximately [ˈrɯi̯ba]. —Mahāgaja (formerly Angr) · talk 12:00, 14 May 2018 (UTC)
Well, you're both working hard on finding differences but they are just /ˈrɨ-/ to me. The differences are so minor, that I, a native Russian speaker, could think these people speak the same language. I'm sure the German pronunciation module could be another Wiktionary success story if we were trying to find an acceptable middle ground, rather than trying to nitpick. --Anatoli T. (обсудить/вклад) 12:16, 14 May 2018 (UTC)
I'm not working hard at all, they sound nothing alike to me. Korn [kʰũːɘ̃n] (talk) 12:36, 14 May 2018 (UTC)
"Sounds like" is cheap, spectrograms or bust. Comparing Polish and Russian ryba in Praat, it's obvious that they are different: Polish has an F2 in the 1800Hz ballpark and an F1 around 500Hz, which makes it much lower than high vowels such as /u/ and /i/, but also lower than English /ɪ/ and also not front. Basically an /ɘ/ as Korn said.
In contrast, the Russian vowel is consistently high with an unstable F2 (the diphthongal quality described by Mahagaja). An interesting thing is how low the spectral tilt (the relative strength of higher frequencies compared to lower) is in the Russian sample compared to Polish, indicating a much slacker voice; however, I'm not sure if this has any phonological relevance (velarization?) Crom daba (talk) 14:43, 14 May 2018 (UTC)

@Hergilei, Tweenk? --Per utramque cavernam 11:23, 14 May 2018 (UTC)

And @Guldrelokk too --Per utramque cavernam 11:50, 14 May 2018 (UTC)
To me Polish y does sound notably different from Russian /ы/, like something between it and the backed Russian /э/ after hard consonants. However, I would not rely on any ear except an actual phonetician’s one for purposes of sound description. Then, what symbols to use in phonemic transcription is a non-issue: hardly any vowel in the usual Danish transcription matches its ‘reference’ phonetic realisation, and for good reason. On the other hand, it is important not to deviate from the major sources on Polish phonology, and there is clearly a preference and a tradition to write /ɨ/. I would stick to it. Guldrelokk (talk) 15:48, 14 May 2018 (UTC)
I wouldn't call the crossbar on ⟨ɨ⟩ a diacritic for IPA purposes since it can't be added to any vowel letter (the centralizing diacritic is the dieresis). And it's true we should avoid diacritics in phonemic representation as much as possible (it isn't always possible, e.g. for phonemically nasalized vowels). So the three main contenders for the phoneme of the sound in question are /ɪ/, /ɨ/, and /ɘ/. Which of these symbols is most commonly used in the literature? I haven't read a lot about Polish phonology, but my impression is that /ɨ/ is the most common (Wikipedia uses it in phonemic representation, reserving [ɘ̟] for the discussion of its precise realization). If it's true that the literature uses /ɨ/ more than /ɪ/ and /ɘ/—especially if it uses it much more—then I think that's what we should stick with. It's not our job to invent new phonological transcriptions for languages that already have well-established traditions. We can explain at Appendix:Polish pronunciation that /ɨ/ is realized in the range of [ɘ̟ ~ ɨ̞] or whatever it is. Whether the Polish and Russian vowels are identical or different is irrelevant: the vowels of German bieten and English beaten aren't identical either, but we transcribe both as /ˈbiːtn̩/ —Mahāgaja (formerly Angr) · talk 11:42, 14 May 2018 (UTC)
You do not expect that hiding the actual realisation in a hard-to-access page will cause our average user to simply leave Wiktionary with a wrong idea of how the word he looked up is pronounced? Also, please explain how commonality in linguistic works has relevance to our dictionary entries. That something is the most common transcription is a non sequitur description of facts, not an argument. Korn [kʰũːɘ̃n] (talk) 11:50, 14 May 2018 (UTC)
I reject your premise that the Appendix:Polish pronunciation is a hard-to-access page as well as your implication that the average user will be better served by the use of obscure symbols. I know professors of phonology who are unfamiliar with the symbol ⟨ɘ⟩ because it's so rarely used in the literature (cross-linguistically, I mean, not specifically for Polish). I suspect that more than 75% of the uses of ⟨ɘ⟩ in the wild are simply errors for ⟨ə⟩ rather than intentional uses of ⟨ɘ⟩. Commonality of usage in other works is relevant because we are not (and should not be) anyone's sole resource for any language. If a Polish learner encounters ryba transcribed /ˈrɨba/ in all his other resources but /ˈrɘba/ here, he'll be confused. He probably won't assume we're just using a different symbol for the same sound, more likely he'll assume we're saying it's pronounced differently than the other materials say (which we aren't), or he'll assume it's an error on our part and will attempt to fix it. —Mahāgaja (formerly Angr) · talk 12:00, 14 May 2018 (UTC)
Well, you looked up mysz. How do you enter Appendix:Polish pronunciation from there? What tells you this page exists in the first place? My argument is simple: Our pronunciation section is meant to tell users how to pronounce a word. When we say the pronunciation section uses IPA, then our users will expect that the glyphs used represent those values assigned to them by the IPA. Therefore, if we use the glyphs assigned to the vowels in question by the IPA, our pronunciation section will tell users how to pronounce a word. I've yet to hear an actual argument as to how using the wrong glyph will benefit anyone or fulfill the purpose of the pronunciation section. [Edit conflict: Mahagaja has since brought forth confusion with other learning materials.] As for your disliking of ⟨ɘ⟩, which can be looked up literally within a second in the IPA vowel chart I expect anyone to know IPA in the first place to be familiar with, I would like to remind you that my proposal was a centralised ⟨ɪ⟩. Korn [kʰũːɘ̃n] (talk) 12:03, 14 May 2018 (UTC)
I could live with ⟨ɪ⟩ if it's more common in Polish learning materials and phonological literature than ⟨ɨ⟩, but I'm not sure that's the case. At mysz, you click on "key", which is a pretty intuitive thing to do when confronted with a set of symbols you might not be familiar with. —Mahāgaja (formerly Angr) · talk 12:15, 14 May 2018 (UTC)
Here's the crux: As someone who cannot read IPA, you might do that. I wouldn't. I would do the alternative method: Look up IPA on a search engine or Wikipedia. So there's a 50:50 chance (based on the number of options, lacking actual user data) that people will find our key. As someone who knows IPA, or has just looked it up as I described, you expect to be familiar with the symbol and have no reason to expect that we're, basically randomly because other people did that too, reassigning them to new values they're not meant for, unless you already have consumed the works of these people we're copying, in which case you don't actually need the pronunciation section because you are already familiar with Polish phonology. If you however encounter an IPA symbol you are not familiar with, you will look it up if you want to know what it means. Thus using the appropriate symbol creates fewer pitfalls for the actual target audience of the Pronunciation section. Korn [kʰũːɘ̃n] (talk) 12:23, 14 May 2018 (UTC)
There is a secondary influence on the sound of the vowel that most of you are missing. Both Polish and Russian have hard and soft consonants, but the Russian hard consonant is much, much harder than a Polish hard consonant. For most consonants (such as B, V, P, M), the Polish hard consonant is approximately the same as an English consonant (the Polish L/Ł, English L in various environments, and Russian hard and soft Л being excepted), and the vowel /Y/ after it seems reasonably pure. The Russian hard consonants, including Б, В, П, and М, are much harder, and this hardness strongly influences the sound of a following /Ы/. You have to be familiar with Russian phonology in order to judge Polish /RY/ versus Russian /РЫ/. If you're not familiar with the hard Russian consonants and their effect on /Ы/, then you will think you're hearing an ordinary consonant followed by a weird Russian vowel. Most of the "weirdness" belongs to the consonant.
I think the hard Polish /R/ versus the Russian /Р/ complicates the comparison. It is easier with Polish /M/ versus Russian /М/. So try listening to Polish my (we) and Russian мы (my, we). You still need to be familiar with Russian hard consonants, but with this example, I think it's easier to note and subtract the influence of the hard /М/. After subtracting the influence of the Russian hardness, the Polish /Y/ and Russian /Ы/ sound very much the same to me. Another way to look at it: if you put the Russian /Ы/ after a Polish /R/ or /M/, you get /RЫ/ or /MЫ/, which is identical to /RY/ or /MY/. —Stephen (Talk) 15:23, 14 May 2018 (UTC)
Aside from Crom daba citing what I take to be his own Praat usage, here is the allophonic range according to Rocławski, and here is the chart per Jassem. It really has no relevance how close some Russian allophone gets to a Polish allophone, sections about language X must be based on language X, not be based on language Y. Korn [kʰũːɘ̃n] (talk) 19:35, 14 May 2018 (UTC)
Good point on the /r/ @Stephen G. Brown, it also occurred to me that it might cause some lowering, but a thorough exploration of Polish phonetics is beyond the scope of this project and somebody probably already did it much better than I ever could.
Comparing my, same trends are again present, although Polish is not as low (F1 mostly under 450Hz which is somewhere between /ɪ/ and /ʊ/) and Russian is somewhat lower for an interval before going completely closed (here's the contour, the vowel starts a bit after the sudden F2 rise).
I think it's silly to talk about the hardness affecting /Ы/, it is the hardness. Some analyses consider it an allophone of /i/ after hard consonants, we could write /mˠi/ but I don't think that would be helpful. Crom daba (talk) 20:52, 14 May 2018 (UTC)
I'm usually sympathetic to the idea that we should use the nearest simple symbol. I do note, however, that even in narrow transcription pl.Wikt uses [ɨ] (in ryba and my), and perusing works on Polish phonology, I see Edmund Gussmann's Phonology of Polish also uses [ɨ] (in rybak and my), and Jerzy Rubach's Cyclic and lexical phonology: the structure of Polish analyzes the (dia)phoneme as //ɨ// (in e.g. absolutyzm). Perhaps /ɨ/ can be used in broad transcription, and our narrow transcription can give the actual vowel? - -sche (discuss) 15:55, 14 May 2018 (UTC)
There are a lot of things to consider, such as the variation between different speakers, different regions, different social classes, etc. That sort of stuff is better left to the experts. As a dictionary, we should just stick to the common conventions. --WikiTiki89 22:05, 14 May 2018 (UTC)
If Wikipedia is right, /ɨ/ is realised as /ɪ/ in some dialects and it's the older pronunciation (?) of "y", that's why it's rhyming with "i" in poetry. Well, in East Slavic languages "и" rhymes with "ы" (ru) (equivalents in Ukrainian and Belarusian і (i)/и (y) (uk), і (i)/ы (y) (be)). Czech and Slovak definitely lack /ɨ/, so Czech is /ˈrɪba/, not /ˈrɨba/. --Anatoli T. (обсудить/вклад) 02:05, 15 May 2018 (UTC)
Pity User:Kephir has left us. I'm sure he could clarify his decisions regarding /ɨ/ in Module:pl-IPA. --Anatoli T. (обсудить/вклад) 02:09, 15 May 2018 (UTC)
He'll have used it because everyone uses it, can't blame him. @Wikitiki89 To our best knowledge, which completely agrees with my personal experience for what such anecdote is worth, the allophonic range is this with conservative speakers using [ɪ], which agrees with the neighbouring Slavic languages of Bohemian (which merged /ɪ/ and /i/), Ukrainian (which uses /ɪ/) and Upper Sorbian (which uses /ɪ/) realising the phoneme in a similar manner, as well as, medium fetched, with the fact that local/neighbouring Germanic lects realise /ɪ/ as [ɪ̈~ɘ̘] (a situation continuing well into Dutch territory). So there is no evidence that this vowel developed from [ɨ] nor that it's currently moving upwards to it within living memory or recorded history. Nobody has claimed anything different, neither here nor in the sources brought forth. The common convention is misuse of the IPA, you could just as well represent the vowel as [ə]. Korn [kʰũːɘ̃n] (talk) 09:30, 15 May 2018 (UTC)
@Korn: Most common or standard pronunciation of "y" is not /ɪ/ but /ɨ/. Please listen to this video made by a native speaker aimed at teaching the sound, which does cause some issues to foreigners, unlike /ɪ/ in the neighbouring Czech Republic and Slovakia. Especially words był or dysk. In YouTube, please enter "osiEeCjQAIM" in the search. If you can't pronounce /ɨ/, /ɪ/ will work for you. --Anatoli T. (обсудить/вклад) 09:44, 15 May 2018 (UTC)
@Korn: Another good example is video "aJI6JDAxUd4" ("Polish Pronunciation Guide") by a Polish girl. Listen to cytryny and other examples. A good clear recording. Someone already did the analysis of the Polish phonology, why are we reviewing it? --Anatoli T. (обсудить/вклад) 09:54, 15 May 2018 (UTC)
Anatoli, someone indeed already did the analysis of Polish phonology, which is why we have hard technical proof and at least 40 recorded years of scientific consensus about the specific make-up of the vowel and its allophonic range, which is [ɪ~ɪ̞̈~ɘ̘]. I'm not sure why you keep providing recordings of Poles saying things. As an aside, if this really is the range of Russian /ɨ/, then it's basically 'any central vowel which is not /ɐ/' and not a suitable reference for anything, not that it would matter either way. Korn [kʰũːɘ̃n] (talk) 10:53, 15 May 2018 (UTC)
I think that range includes the unstressed ы, which has a wider range. Also, the range you gave for Polish includes ɨ. --WikiTiki89 11:58, 15 May 2018 (UTC)
'ɨ' is a central close vowel. So the range I gave does not include that, even if you construe /ɨ/ to also cover a near-close central vowel, as Polish Y is moving along an axis of either near-close and front or close-mid and central, with standard language apparently being centered around the latter. We're beginning to turn in circles. Is this too specific a topic to have a vote over? Korn [kʰũːɘ̃n] (talk) 13:36, 15 May 2018 (UTC)
Yes, ɨ covers the near-close central unrounded vowel as well in less precise transcriptions, like ours. That's why it's labeled as such in the diagram you linked to. --WikiTiki89 15:02, 15 May 2018 (UTC)
@Korn: Do you think you could provide a list of online academic references that support your proposal? Or at least non-academic sources that support the proposal? This can be put to a vote, but one that is likely to fail given the opposition above; you can take a chance, though. --Dan Polansky (talk) 13:53, 15 May 2018 (UTC)
@Dan Polansky My proposal is to use the appropriate glyph (AFAIK nobody is arguing that the vowel was not mid-close in standard language) instead of the overwhelmingly common one. What would support that? Korn [kʰũːɘ̃n] (talk) 14:08, 15 May 2018 (UTC)
@Korn I don't know. The problem is, with no academic references to back it up, and with people who know something about the matter opposing, the proposal may be difficult to sell. --Dan Polansky (talk) 14:17, 15 May 2018 (UTC)
Opposition will also come from those of us who think being consistent with what other sources say is simply more important than being "right". We do this for our English transcriptions all the time: almost no variety of English uses cardinal IPA [ʌ] as the strut vowel, but virtually everyone (including us) uses /ʌ/ to transcribe it, and we do it because it's less confusing for our users to encounter the same symbol they're used to from elsewhere than to use the symbol that most precisely matches the vowel sound. —Mahāgaja (formerly Angr) · talk 14:29, 15 May 2018 (UTC)
@Mahagaja To quote the ever so venerated Captain Picard: There are four lights. - For one, [ʌ] at least is a symbol for a sound that is and was ever used by native speakers of English. For the other, I have to say that I literally do not understand those who hold the opposing view. It's inconceivable to me, so it would help me if you explain. I can name the tangible harm I see coming from the pronunciation section as is: People pronounce it wrong. That means the pronunciation section fails to fulfill its very purpose for existing. Can you please name the tangible harm you expect to come from using the actual symbol? Can you please also explain 1. why you expect confusion to arise in the first place (including why you expect people able to read IPA to not be able to read most IPA or look up unknown IPA) and 2. why you value avoiding this confusion more than providing people with the correct pronunciation? Korn [kʰũːɘ̃n] (talk) 19:55, 15 May 2018 (UTC)
We have to leave people who don't know IPA out of consideration, because they will not understand any of the symbols we use, neither the one(s) you prefer nor the one I prefer. People who do know IPA will not actually be at much risk of mispronouncing the words if we use /ɨ/, because as a phonemic representation, /ɨ/ doesn't mean "a close unrounded central vowel", it means, in the context of Polish, "the vowel of mysz and of the stressed syllable of ryba". It's kind of like how in this image, c doesn't refer to a specific length, but just to , whatever they may be. People will learn the exact realization of the vowel from listening to native speakers. —Mahāgaja (formerly Angr) · talk 20:28, 15 May 2018 (UTC)
The above note about the triangle makes no sense: there, the three letters refer to three specific lengths, and these happen to be bound by the formula. Again, 'c doesn't refer to a specific length' makes no sense to me, and I see no analogy to the discussed subject at all. The dependence in the image is mutual; c can be determined from a and b, but also a can be determined from c and b. --Dan Polansky (talk) 05:39, 16 May 2018 (UTC)
Note that this exact same discussion breaks out periodically at w:Polish phonology, where the /ɨ/ won out based on convention in other sources. While I agree that /ɘ/ is a better representation, I would be reluctant to defect from the way Wikipedia did things since this is the source most of our users probably rely on. Perhaps this discussion should be taken there (for the n-th time), if it wins there I'm sure no one will object to its introduction here. Crom daba (talk) 11:31, 16 May 2018 (UTC)
@Dan Polansky: What I mean by the triangle comparison is that c in the triangle image doesn't stand for a specific measurement like "25 mm"; it's a variable whose value depends on the context it occurs in. The same is true of IPA symbols. We can't measure the formants of a vowel in isolation and say "this is /ɨ̞/, not /ɘ̝/" (or vice versa), because the symbols are only interpretable in the context of other vowels in the same system. The closest, frontest, most unrounded vowel that exists in a language is that language's /i/; the closest, backest, most rounded vowel is that language's /u/; the openest, backest, most unrounded vowel is that language's /ɑ/. If there's a back rounded vowel whose height is between /u/ and /ɑ/, then that's that language's /o/. If there are two vowels in that area, then the closer one is /o/ and the opener one is /ɔ/. That's why two very different-sounding vowels in two different languages can correctly be transcribed with the same symbol. I'm sure if Crom daba ran File:en-us-moot.ogg and File:de-Mut2.oga through Praat (s)he would find wildly different formants between them, but that doesn't make it wrong to transcribe them both /muːt/, because both vowels are /u/ within the context of their languages. —Mahāgaja (formerly Angr) · talk 23:12, 16 May 2018 (UTC)
I don't want to confuse people by using a different symbol than all the other general references for the same sound- especially for a distinction that only dogs, bats, dolphins and trained phoneticians will even notice. Chuck Entz (talk) 04:30, 17 May 2018 (UTC)
 :p --Per utramque cavernam 09:48, 17 May 2018 (UTC)
The fact that we use /æ/ rather than /a/, not only for the American vowel which is closer to /æ/ but also for the British vowel that arguably is /a/ (as Widsith mentions in a thread below), seems to rebut the notion that "the closest, frontest, most unrounded vowel that exists in a language is that language's /i/; the closest, backest, most rounded vowel is that language's /u/; [etc]"; clearly consideration is sometimes given to what IPA symbol the vowel is actually nearest to, and also sometimes the symbols at the extremities are avoided even when they're accurate. Anyway, why not just continue to analyse this Polish vowel in the normal/traditional way in broad phonemic transcriptions and give the actual vowel in narrow transcriptions? - -sche (discuss) 14:47, 17 May 2018 (UTC)
An example of that can be viewed at pięść. While not my preferred way, at least that puts the actual pronunciation on the page, so if push comes to shove, I'll support it. Korn [kʰũːɘ̃n] (talk) 20:58, 19 May 2018 (UTC)
──────────────────────────────────────────────────────────────────────────────────────────────────── I would suggest having the narrow transcription be generated by the same module that's generating the broad transcriptions. Unfortunately, that takes technical skill. - -sche (discuss) 21:10, 19 May 2018 (UTC)
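As a rough illustration of the suggestion above, here is a minimal sketch in Python (Wiktionary modules are actually written in Lua, and this is not the code of Module:pl-IPA). It derives both transcriptions from one phonemic string. The table name NARROW and the function name are hypothetical, and the narrow value [ɘ̟] is simply the realization attributed to Wikipedia earlier in this thread, taken as an assumption.

```python
# Hypothetical table: phonemic symbol -> assumed narrow realization.
# The value for /ɨ/ is the one attributed to Wikipedia in this thread;
# a real module would need a sourced and far more complete table.
NARROW = {"ɨ": "ɘ̟"}


def transcriptions(phonemic):
    """Return (broad, narrow) transcriptions built from one phonemic string."""
    # Replace each phoneme that has a listed realization; keep the rest as-is.
    narrow = "".join(NARROW.get(symbol, symbol) for symbol in phonemic)
    return "/" + phonemic + "/", "[" + narrow + "]"


broad, narrow = transcriptions("ˈrɨba")
print(broad, narrow)
```

Because both forms come from the same input, the broad and narrow transcriptions cannot drift out of sync; only the table needs maintaining.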

Negative polarity items

I've just created Category:Negative polarity items by language. Do you think this is viable? @-sche, DCDuring --Per utramque cavernam 19:20, 11 May 2018 (UTC)

You could add a lot of know phrases, like know someone from Adam and its alphabetical neighbours. Equinox 19:24, 11 May 2018 (UTC)
I don't see why not. It seems orthogonal to the longstanding issue that by lemmatizing the positive forms and creating redirects to them from the negative forms, we may confuse readers who look up not know someone from Adam, don't notice they've been redirected, and find it defined as "know or recognise someone", the opposite of what the term they looked up means. At least the category usefully groups such entries. - -sche (discuss) 19:31, 11 May 2018 (UTC)
I added a few items. (It was fun.) The category seems useful. Wouldn't one for positive items be similarly useful?
Many MWEs that include any are NPIs, but some are SoP. Should polarity influence our inclusion/exclusion decision? What connection should there be between [[any]] and the category? How about things like budge and red cent? Are we capable of formulating or finding explicit criteria for inclusion of lexical items in the category?
We have lots of cases where the positive form of a term often used in an NPI exists. I suppose that if the positive form exists only in questions with a negative expected answer, we can still say it is an NPI. But usually there are some other, often not very common, even contrived, positive form uses. DCDuring (talk) 23:31, 11 May 2018 (UTC)
  • I found an interesting and useful 1-page pdf filled with examples of NPIs. DCDuring (talk) 15:23, 12 May 2018 (UTC)
    As negative polarity is often a feature of a definition rather than a PoS (let alone a word), we should be populating this from {{lb}}, unless it isn't up to the job. DCDuring (talk) 17:00, 12 May 2018 (UTC)
    I have created User:DCDuring/Negative_polarity_items based, so far, only on CGEL (2002). Whether the redlinked MWEs which are SoP should be included merely because they are negative polarity items in some usage is a policy decision or a policy-application decision. DCDuring (talk) 17:37, 12 May 2018 (UTC)

English pronunciation template and module

Is it possible to create a template that would automatically generate phonetic English pronunciations in different dialects/accents? I was looking up the entry now today hoping to find some phonetic pronunciations in various English dialects, but all I found was the rather unhelpful /naʊ/. This across-dialect automation seems to be lacking, although it looks like a largely automatable job, using diaphonemes/phonemic forms, with parameters for each dialect (accepting phonemic IPA) that would allow overriding.

Wikipedia has International Phonetic Alphabet chart for English dialects, which has a rather impressive comparison of the English varieties. It doesn't necessarily need to be as elaborate as that; even something as crude as the following would be immensely helpful and educational:

A similar model is {{vi-IPA}} used for Vietnamese. Thanks! Wyang (talk) 11:17, 14 May 2018 (UTC)

I'd support this. This is information that we're definitely lacking, especially for underrepresented but important dialects other than RP and GA, like Australian, NZ and Canadian English. — justin(r)leung { (t...) | c=› } 18:54, 14 May 2018 (UTC)
I think that is a good idea. We tend to be somewhat "what our most active user in that field considers to be proper language"-centric, and making more of these modules and thus establishing a normalcy to variety will encourage the same for other languages. Korn [kʰũːɘ̃n] (talk) 09:39, 15 May 2018 (UTC)
@Wyang: If you find some data available, go for it, Frank!
I would also support the use of some form of phonetic respelling for English, even if it's not exposed to end users. It proved very useful for a variety of languages for automating pronunciation. --Anatoli T. (обсудить/вклад) 02:53, 16 May 2018 (UTC)
I'm nowhere near as familiar with English phonology as a lot of the other editors here... hopefully this can initiate some discussion and be the catalyst for a more standardised and systematic coverage of regional English pronunciations. Wyang (talk) 04:36, 16 May 2018 (UTC)
@Erutuon? --Per utramque cavernam 07:34, 16 May 2018 (UTC)
I too support this idea. I also think that the template should generate (diaphonemic?) enPR, so we can phase out having multiple phonemic transcriptions that can fall out of sync. —Μετάknowledgediscuss/deeds 04:16, 16 May 2018 (UTC)
Just wondering: is English pronunciation uniform enough for this to work? Also, am I correct in assuming that what is being proposed is a template that would generate pronunciations in various varieties of English based on what is input for RP and/or GA? — SGconlaw (talk) 08:16, 16 May 2018 (UTC)
To be precise, a template merely invokes a module. The question isn't proposing a concrete implementation, it's merely specifying what the implementation should achieve. Text-to-Speech and vice versa exist, so the question is not whether it's possible in principle -- yes, it is, to varying degrees of effort. But how and who, that's the question.
Of course I cannot but remark that English orthography makes this really hard for school children and not any easier for programmers.
I mean it's not a simple task. The assumption, e.g. by Anatoli, seems to be that manual labour is required to normalize the spelling first. Instead, my first thought was to use established machine learning algorithms that compare recordings and transcripts to generate probabilistic models with over 99% accuracy. Some don't even use transcripts. That's just my two cents.
Normalized "authorgrefy" piques a tangential interest, to be sure. Rhyminreason (talk) 17:17, 16 May 2018 (UTC)
It could probably be made to work for most words, enough that it'd be useful, if we allow manual overrides and additions for when a dialect has an unexpected way of pronouncing something either instead of or in addition to the way one would 'expect' based on other dialects. For example, apparently RP and GenAm both pronounce margarine with /dʒ/, but RP exceptionally also pronounces it with /ɡ/. Other complicated words include aunt, eschew, pecan, quahog, and pwn. The "input" would probably have to be a fictional dialect that lacks any of the losses/mergers of real dialects, since the module might not be able to reliably re-insert /ɹ/ into GenAm from RP input (unless we start always using /(ɹ)/?), couldn't reliably un-merge Mary-marry-merry from GenAm input, etc. - -sche (discuss) 17:24, 16 May 2018 (UTC)
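A minimal sketch of the idea in the comment above, not a proposed implementation: the input is a made-up "diaphonemic" token list that keeps every etymological /ɹ/, each dialect derives its own transcription from it, and manual overrides can replace or add to the result. The token names, the tiny vowel inventory, and the function names are all assumptions for illustration; stress is ignored entirely.

```python
from typing import Dict, List, Optional, Sequence

# Assumption: uppercase tokens name vowel classes; "R" marks an etymological /ɹ/.
VOWELS: Dict[str, Dict[str, str]] = {
    "START": {"RP": "ɑː", "GA": "ɑ"},
    "DRESS": {"RP": "ɛ", "GA": "ɛ"},
    "HAPPY": {"RP": "i", "GA": "i"},
}
NON_RHOTIC = {"RP"}


def realise(tokens: Sequence[str], dialect: str) -> str:
    """Derive one dialect's transcription from the diaphonemic tokens."""
    out = []
    for i, tok in enumerate(tokens):
        if tok in VOWELS:
            out.append(VOWELS[tok][dialect])
        elif tok == "R":
            nxt = tokens[i + 1] if i + 1 < len(tokens) else None
            # Non-rhotic dialects keep /ɹ/ only before a vowel.
            if dialect in NON_RHOTIC and nxt not in VOWELS:
                continue
            out.append("ɹ")
        else:
            out.append(tok)  # consonants pass through unchanged
    return "".join(out)


def pronunciations(
    tokens: Sequence[str],
    dialect: str,
    replace: Optional[Dict[str, List[str]]] = None,
    extra: Optional[Dict[str, List[str]]] = None,
) -> List[str]:
    """Apply manual overrides: 'replace' wins outright, 'extra' adds variants."""
    if replace and dialect in replace:
        base = list(replace[dialect])
    else:
        base = [realise(tokens, dialect)]
    return base + list((extra or {}).get(dialect, []))


car = ["k", "START", "R"]
print(realise(car, "RP"), realise(car, "GA"))
```

Going from the "lossless" input to each dialect only ever deletes or merges information, which is why the direction matters: the non-rhotic rule above is easy to write, while its inverse would need to know where the /ɹ/ used to be.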
Quite apart from the problems of lexical incidence and recoverability, we'd need separate symbols for each of Wells's lexical sets, which gets complicated for the bath and cloth sets since they merge with different sets in RP and GA. The bath words are even more complicated in Australia, where some of them go with palm/start and others with trap. We'd have to split nurse up for Scottish accents that still distinguish /ɪr/, /ɛr/, and /ʌr/; we'd have to split goose up for Welsh accents that distinguish choose and chews; we'd have to split face and goat up for those accents (East Anglian? I'm not sure) that distinguish made from maid and don't rhyme toe and snow; and somehow we'd have to accommodate the old-fashioned New England accents that distinguish road from rode. That's just for starters. —Mahāgaja (formerly Angr) · talk 20:11, 16 May 2018 (UTC)
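To illustrate the bath/cloth point above, here is a rough sketch using six of Wells's lexical sets with their usual RP and GA values. The table is deliberately tiny and the function name is hypothetical; it only checks which phonemes of one accent correspond to more than one phoneme in the other, which is what makes a single-accent input unrecoverable.

```python
from collections import defaultdict
from typing import Dict, List

# Usual textbook values for a handful of lexical sets (not a full inventory).
SETS: Dict[str, Dict[str, str]] = {
    "TRAP":    {"RP": "æ",  "GA": "æ"},
    "BATH":    {"RP": "ɑː", "GA": "æ"},
    "PALM":    {"RP": "ɑː", "GA": "ɑ"},
    "LOT":     {"RP": "ɒ",  "GA": "ɑ"},
    "CLOTH":   {"RP": "ɒ",  "GA": "ɔ"},
    "THOUGHT": {"RP": "ɔː", "GA": "ɔ"},
}


def ambiguous(source: str, target: str) -> Dict[str, List[str]]:
    """Source phonemes that map to more than one target phoneme."""
    by_phoneme = defaultdict(set)
    for realisation in SETS.values():
        by_phoneme[realisation[source]].add(realisation[target])
    return {p: sorted(t) for p, t in by_phoneme.items() if len(t) > 1}


print(ambiguous("RP", "GA"))
print(ambiguous("GA", "RP"))
```

With this table, neither accent can serve as the input for the other, so the input would need one symbol per set (and more, once the splits mentioned above for Scottish, Welsh, and other accents are taken into account).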
I don't think anyone expected this to be useful for all English dialects, especially more poorly documented ones. Just producing GA and RP would be great and reduce potential for error in many entries. —Μετάknowledgediscuss/deeds 21:38, 16 May 2018 (UTC)
Oh, so the template would generate pronunciations from ordinary spelling? Can it deal with cases like "bough, enough, thorough, though", to pick one common example? — SGconlaw (talk) 22:02, 16 May 2018 (UTC)
No. —Μετάknowledgediscuss/deeds 22:18, 16 May 2018 (UTC)
No indeed. If this happens at all, it's clear we'll need some kind of respelling to use in a parameter, just as {{fr-IPA}} already has for cases like ville /vil/ vs. fille /fij/. —Mahāgaja (formerly Angr) · talk 22:52, 16 May 2018 (UTC)
I see ... so what exactly is the template being proposed here? — SGconlaw (talk) 19:33, 17 May 2018 (UTC)
What's the distinction of rode/road? Does one of them belong to the hoarse set? Korn [kʰũːɘ̃n] (talk) 22:22, 16 May 2018 (UTC)
No; there seems to be (or to have been, since the accent in question is very old-fashioned and possibly extinct by now) a separate phoneme /ɵ/ in parts of New England (especially New Hampshire and Maine), that was used in some goat words like coat, road, smoke, stone, home, whole, but not in others like shoat, rode, own, knoll. I have no idea where it came from, since it forms minimal and near-minimal pairs with /oʊ/, and it doesn't come from the Middle English /oː ~ ɔu/ contrast (the toe/snow contrast mentioned above). Indeed, the ancestors of road and rode were homophones in Middle and Old English, so it's hard to imagine why they split in this small section of the English-speaking world. /ɵ/ occurs only in closed syllables and thus patterns like a lax or "checked" vowel, but is still distinct from both /ʊ/ (by being more open) and /ʌ/ (by being rounded). —Mahāgaja (formerly Angr) · talk 22:46, 16 May 2018 (UTC)
  • This could be a good way of settling a long-running dispute on whether to use the Oxford or Cambridge phonemes for British English (i.e. /a/ versus /æ/, /ʌɪ/ versus /aɪ/ etc.) by implementing both of them. However I would note that a lot of words are pronounced VERY differently in different regions and it would take more than automated transcription to deal with them – stress, for instance, often comes in different places in US/UK English (clitoris comes to mind, for some reason) and a few words are just wildly divergent (e.g. croissant – kre-SAHNT versus KWA-son). (We used to have a category for such words; has it disappeared?) Ƿidsiþ 12:42, 17 May 2018 (UTC)
What was the category called? I don't remember that. DTLHS (talk) 18:18, 17 May 2018 (UTC)
You're probably thinking of the awkwardly-named Category talk:Pronunciations wildly different across the pond, the contents of which are now hosted at the still-awkwardly-named Appendix:English words with pronunciations wildly different across the pond. (See also Category talk:English words with different meanings in different locations.) - -sche (discuss) 19:26, 17 May 2018 (UTC)
Yeah, that was it! Ƿidsiþ 04:35, 18 May 2018 (UTC)

'No equivalent expression' template or parameter

Following a discussion at the Tea room about translations of 'the curse' (a very old name for periods), I have updated the translation table so that translations of 'menstruation' are found at that word's page and translations reflecting the nature of the expression in English can be found at 'curse'.

Edit: let's talk about moobs instead (still not ideal, but the best I can think of).

I wonder if there should be a 'no equivalent expression' template for people to use, though, which could express this whilst still allowing for people to give the best translation available. It could reveal some interesting information about some languages. Kaixinguo~enwiktionary (talk) 07:45, 15 May 2018 (UTC)

I thought we had a template for "no translation exists", at least for "translations" of e.g. she into languages that don't have gendered pronouns (or don't have pronouns), but apparently I was just thinking of the {{qualifier}}s she uses for Dyirbal etc. Situations like that, and languages having only verbal and not adjectival constructions for cold, and languages having no jocular/pejorative/etc but only literal translations for things, are common enough that templates for at least the first and third cases seem like they'd be useful. Maybe the first could just say something like "no term for this exists" and the other could say something like "no comparable term exists, use maldita menstruação, literally 'cursed menstruation'". - -sche (discuss) 15:06, 15 May 2018 (UTC)
I'd be concerned about the overuse of any generic "no translation exists" template. I like what was done at curse. Glossing an FL term for which "no English translation exists" may be difficult, but must be attempted. Perhaps only a non-gloss definition is possible or a usage note is required. DCDuring (talk) 15:33, 15 May 2018 (UTC)
@DCDuring: I'm suggesting this for translation tables from English.
@Kaixinguo~enwiktionary: How would that integrate with our translation-facilitation system? If incorporated, wouldn't it generate a lot of easy-way-out non-translations or failures to search for nearly-right translations? If it were something only serious contributors had effective access to (i.e., knowledge of), the potential for overuse would be much less. Even then, imitation would still become likely as use of the template spread. DCDuring (talk) 11:07, 16 May 2018 (UTC)
If we do this via a template, especially if we use separate templates for cases like she where no translation at all exists because a language doesn't have gender or pronouns, and cases where a term has to be rendered by an unidiomatic description, we can track and periodically re-examine which entries use the templates, especially the first one. - -sche (discuss) 17:28, 16 May 2018 (UTC)
@DCDuring: I'm not sure where the concern about overuse or generating non-translations comes from. One would have to be at least near-native or even a native speaker to say definitively whether an equivalent expression exists in the foreign language, so in that respect the bar is set higher for such translations, not lower. The question of searching for translations or not isn't that relevant; either one has the level in a language to say whether an equivalent exists, or one does not. You could argue that it could be misused, but that's the case for all translations as it stands. There is very little, if anything, to stop bad translations being added anyway: the potential for bad edits or misuse is the same. Kaixinguo~enwiktionary (talk) 10:36, 17 May 2018 (UTC)
My concern is not specific to translations. I think wherever we have options in our templates that provide an easy way out of doing some work, those options are misused. For a long time {{en-noun}} had only two options. As a result many nouns whose only offense was that the plural was rare or, in any event, not known to the person adding the template were marked as uncountable. Now {{en-noun}} has more options that cover some of the possible cases. But sadly, no one (I include myself) finds it much fun to review all of the instances of {{en-noun}} to check whether they are correct. Maintenance only works if the task is relatively modest in scope. Initially instances in which no-translation-exists is indicated will be few and easily reviewed. When the category has numerous members it will become difficult to find the recent additions, at least when there are more new additions between reviews than fit in a reasonably sized list of recent additions to the category (assuming such a system of categories is created). DCDuring (talk) 14:30, 17 May 2018 (UTC)
Just to clarify, I'm not suggesting 'no-translation-exists', what I suggested above is 'no equivalent term or expression', which is different. Kaixinguo~enwiktionary (talk) 18:20, 17 May 2018 (UTC)
@DCDuring: What if this could only be used if near or literal translations were added at the same time, which would address your thoughts about it being an 'easy way out'? For example at 'moobs', *Language name: {{t|no equivalent|use|[example]|language code}} Kaixinguo~enwiktionary (talk) 19:58, 17 May 2018 (UTC)
@-sche: I know I used that as an example, as it's what prompted the suggestion, but let's move away from the 'the curse' example, as I suspect most people here are not comfortable discussing even the term. Someone please suggest another example, if possible. But on your point, I think 'maldita menstruação' shouldn't be put unless it is idiomatic in Portuguese; that is the point. Kaixinguo~enwiktionary (talk) 05:22, 16 May 2018 (UTC)
I think maldita menstruação or some similarly SOP expression of the concept should be included as 'how this would be translated into Portuguese if it had to be translated into Portuguese'; the first part of the text takes care of clarifying that no direct translation exists, but it's not as if the idea couldn't be rendered into other languages. We already do this without any explanatory preface in a lot of cases where languages can only express some English word/concept via a SOP phrase. - -sche (discuss) 05:30, 16 May 2018 (UTC)
To replace 'the curse', let's talk about 'moobs' instead: it's a slang word for male breast tissue that resembles a woman's, and it could be hard to translate or have no equivalent in a foreign language. Kaixinguo~enwiktionary (talk) 11:13, 17 May 2018 (UTC)
Aha, I knew we had a template for (the first part of) this: {{not used}}. The wording perhaps could be improved. - -sche (discuss) 21:57, 16 May 2018 (UTC)
I've just seen this relevant previous discussion linked from that template: Wiktionary:Grease_pit/2017/December#How_to_note_the_absence_of_a_translation_in_translation_sections. Kaixinguo~enwiktionary (talk) 19:16, 17 May 2018 (UTC)

This suggestion is now withdrawn per my talk page. Kaixinguo~enwiktionary (talk) 13:31, 18 May 2018 (UTC)

Given that we already have one such template, I'm not sure this can be withdrawn per se. I think it's useful, anyway, because we already have translations tables that make this claim, but the ones that don't use {{not used}} are hard to track; with templates, they can be easily tracked and checked for correctness. I've switched the tables at [[he]] and [[she]] and [[they]] and [[I'm rubber, you're glue]] to use {{not used}} (which is also used on [[the]], [[be]], [[it]], [[an]], [[that]], [[really]], [[do]], and [[-er]], and on [[corrective rape]], where however I may be able to replace it with a translation), and I've switched the ad-hoc wording at [[own]] and [[deadpan]] to use a second template. - -sche (discuss) 14:37, 18 May 2018 (UTC)
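As a rough illustration of the tracking argument above, a template call can be found mechanically in wikitext, whereas ad-hoc wording cannot. The sketch below runs on an in-memory dictionary of made-up page texts; it does not query the wiki, the function name is hypothetical, and it makes no assumption about which parameters {{not used}} takes.

```python
import re
from typing import Dict, List


def pages_using(template: str, pages: Dict[str, str]) -> List[str]:
    """Titles whose wikitext calls the given template, with or without parameters."""
    pattern = re.compile(r"\{\{\s*" + re.escape(template) + r"\s*(?:\||\}\})")
    return sorted(title for title, text in pages.items() if pattern.search(text))


# Made-up wikitext snippets, for illustration only.
sample = {
    "page A": "* Language X: {{not used|xx}}",
    "page B": "* Language X: ''no equivalent term; a description is used''",
    "page C": "* Language Y: {{not used}}",
}
print(pages_using("not used", sample))
```

Here pages A and C are found, while the free-text note on page B is invisible to the search, which is the practical case for putting such claims in a template.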
{{not used}} isn't what I was suggesting, and I think it is flawed. I think we have been talking about two different concepts anyway. What I was suggesting was a template or parameter to indicate whether an idiomatic term or expression has an equivalent in a language, and to give an idiomatic approximation that can be used, an explanation, and possibly an optional literal translation.
Also, if you look on my talk page you will see that User:DCDuring is unhappy about it.
{{no direct idiomatic translation}} is also flawed. Quite often an idiomatic translation is indeed possible; my intention was to call attention to whether an equivalent idiomatic term or expression exists in the foreign language, that is not the same. I wouldn't have proposed the template name '{{no direct idiomatic translation}}'; I strongly disagree with the idea of suggesting no idiomatic translation is possible. Kaixinguo~enwiktionary (talk) 15:31, 18 May 2018 (UTC)
Something seems warranted for at least some cases, even if just for labeling/categorizing entries with definitions that are problematic for translators. There seem to be at least a few classes of these. Normal users might find the distinctions we find useful hard to understand. DCDuring (talk) 18:14, 18 May 2018 (UTC)
Perhaps something more like "no idiomatic equivalent". Anything can be referred to or described in any living human language, but no language has a specific, dedicated term for everything. Also, the semantic range covered by a term in one language may only partly overlap with the semantic range of similar terms in other languages, so a term in English may be the intersection of concepts A and B, but the closest term in Language X might be the intersection of A and D and in Language Y the intersection of B and E. Can either of those other terms be a true translation of the English term if they have no concepts in common? Chuck Entz (talk) 21:39, 18 May 2018 (UTC)

Vote: Proficiency as a prerequisite for contribution

FYI, I created Wiktionary:Votes/pl-2018-05/Proficiency as a prerequisite for contribution. Let us postpone the vote as much as discussion requires, if at all. --Dan Polansky (talk) 06:54, 17 May 2018 (UTC)

The vote page was deleted by Wyang. I request that the vote page is restored, in accordance with the amicable civilized practices of the English Wiktionary. --Dan Polansky (talk) 11:32, 17 May 2018 (UTC)
@Wyang I restored the page, it isn't obvious vandalism and will cause no harm if it exists while the discussion is taking place. On a personal note, I cannot see any reason for the vote to actually take place, but that is what the discussion page is for. - TheDaveRoss 11:37, 17 May 2018 (UTC)
Can the editors here do some actual work? Why the hell are we feeding ridiculous proposals like that? Wyang (talk) 11:43, 17 May 2018 (UTC)
The reason for the existence of the vote is that the vote proposal seems to represent the positions of certain people, and absent the vote, those people are likely to make representations that are hard to verify, disprove, or hold in check. --Dan Polansky (talk) 11:44, 17 May 2018 (UTC)
I might be wrong, but I suspect a quite short BP discussion would demonstrate that this proposal would not succeed in a vote, at least not as written. This is why it is a good practice to have a discussion prior to creating a vote. - TheDaveRoss 11:46, 17 May 2018 (UTC)
I believe vote talk pages are excellent places to discuss proposals, including their ability to have headings and the ease with which they can be separately monitored via the watch feature. In order for the vote talk page to exist, the vote page has to exist. And since the vote page always contains a certain specific textual proposal, the discussion can turn around that proposal, and amendments to the text can be proposed and made. This has worked real well in the past in the English Wiktionary. --Dan Polansky (talk) 11:55, 17 May 2018 (UTC)
I can almost hear someone saying: Yes, I support the proposal, and no, this proposal should not be brought to vote to fail. Votes are evil. The powerful, strong and competent must rule, not the mediocre majority. --Dan Polansky (talk) 12:04, 17 May 2018 (UTC)
If that is your belief, you should consider changing our guidelines because I recall them explicitly saying that a vote should be the result of discussion, not its start. Korn [kʰũːɘ̃n] (talk) 14:44, 17 May 2018 (UTC)
I don't know which guideline you mean, but this vote is created based on an established practice with countless instances. --Dan Polansky (talk) 14:53, 17 May 2018 (UTC)
This is obviously a POINTy response to the discussion happening on Dan's (and Meta's) talk page right now. - -sche (discuss) 14:33, 17 May 2018 (UTC)
The linked page says "Wikipedia:Do not disrupt Wikipedia to illustrate a point". This is not a disruption, and therefore, is not POINTy. Rather, it is an attempt to find out how many people support a certain proposal and similar proposals. This has worked well in the past. The discussion on my talk page broke my patience; true enough. Nonetheless, I had seen similar representations before, but did not have the energy to deal with them. --Dan Polansky (talk) 14:53, 17 May 2018 (UTC)
My talk page contains three potential supporters of something like the proposed policy: Metaknowledge, Mellohi!, and AryamanA. --Dan Polansky (talk) 16:10, 17 May 2018 (UTC)
You are intentionally misrepresenting my position. —Μετάknowledgediscuss/deeds 17:11, 19 May 2018 (UTC)
I never said that about proficiency in Ancient Greek. What I said on your talk page was that it's not good form that you wrote a definition for an adjective consisting of solely a single noun. mellohi! (僕の乖離) 18:26, 19 May 2018 (UTC)
I said proficiency in a language might allow a contributor to get away with providing fewer sources in entries, not that only proficient editors should edit. I would never support this vote. See also my response below. —AryamanA (मुझसे बात करेंयोगदान) 01:39, 20 May 2018 (UTC)
  • If voted on and implemented, this proposal would go a long way toward confirming what many newbies already seem to feel: that this place is ruled by insiders who do not welcome outsiders.
I also wonder how to operationalize the 'proficiency' prerequisite: a number of (unreverted)/(content) edits, a score on a quiz, subjective opinion, a vote, a vote of those with proficiency in the admission candidate's language, an absence of blackballing (aka blocking)? DCDuring (talk) 17:50, 17 May 2018 (UTC)
I think this vote violates a lot of principles the Wiki sphere stands for and actually having it is thus probably precluded by the terms of service by which the Wikimedia Foundation hosts us. Korn [kʰũːɘ̃n] (talk) 18:41, 17 May 2018 (UTC)
Citation needed. Lots of other Wikimedia wikis have restricted contributors in various ways. DTLHS (talk) 18:43, 17 May 2018 (UTC)

Dan Polansky, since you have no interest whatsoever in actually introducing this topic to BP to solicit opinions, and would rather see this poorly formulated and completely unimplementable proposal being voted on (09:16, 19 May 2018: “Add "Oppose having this vote" section seen in some votes: there is some opposition to having this vote”), you are wasting everyone's time when this time should have been spent on actual dictionary building. Wyang (talk) 09:28, 19 May 2018 (UTC)

Normal parliamentary processes at least require someone to second a proposal. Is there anyone seconding it? DCDuring (talk) 18:35, 19 May 2018 (UTC)
I agree to the deletion of the vote. This kind of restriction goes against the principles of freely and openly editable wikis (as Korn said). Not to mention: how would it even be enforced!? And what exactly constitutes "proficiency"? Votes on such broad topics shouldn't be considered to begin with; only narrowly defined policy items can be voted on with a simple support/oppose/abstain. —AryamanA (मुझसे बात करेंयोगदान) 18:38, 19 May 2018 (UTC)

How can we make it easier for Wikimedia contributors to understand Wikidata?

Dear all

Over the past year or so I've been working quite a lot on Wikidata documentation and have been thinking more about the needs of different kinds of user. I feel that currently Wikidata can be difficult to understand (what it does, how to contribute, what issues there are and what is being done to address them, etc.) even for experienced Wikimedia project contributors. To help address this I've started an RFC to try to collate this information. It would be really helpful if you could share your thoughts, especially if you find Wikidata hard to understand or confusing. You can just share your thoughts on the talk page and we will synthesize them into the main document.

Wikidata:Requests for comment/Improving Wikidata documentation for different types of user

Thanks very much

John Cummings (talk) 12:54, 18 May 2018 (UTC)

  • I don't use it because I find it really boring. --Genecioso (talk) 13:03, 18 May 2018 (UTC)
    That is why Wikidata still has a Main Page. - TheDaveRoss 13:25, 18 May 2018 (UTC)