Wiktionary:Beer parlour


Welcome, all, to the Beer Parlour! This is the place where many a historic decision has been made and where important discussions are being held daily. If you have a question about fundamental Wiktionary aspects—that is, about policies, proposals and other community-wide features—please place it at the bottom of the list (click on Start a new discussion), and it will be considered. Please keep in mind the rules of discussion: remain civil, don't make personal attacks, don't change other people's posts, and sign your comments with four tildes (~~~~), which produces your name with timestamp. Also keep in mind the purpose of this page. There are various other discussion rooms which may serve the idea behind your questions better. Please take a look to see which is most appropriate.

Sometimes discussion identifies an issue as an idea for policy development or rewriting. Such discussions may be taken out of the Beer parlour to a relevant page, or a brand new page may be created. Usually, the active policy pages will be listed in one of the sections below. See also the policy development page and the votes page.

Questions and answers will not remain on this page indefinitely, as it would very soon become too long to be editable. After a period of time with no further activity (usually a couple of weeks), information will be moved to the archives. We make a point to preserve all discussions that were started here in the archives. However, talk that is clearly not intended for this page may be moved and will not end up in the archives. Enjoy the Beer parlour!


March 2016

Wiktionary:Wanted entries - name and shortcuts

I'd like to create shortcuts for all the subpages of Wiktionary:Wanted entries:

But one thing bothers me: basically, the only practical difference between Wiktionary:Wanted entries/la and Wiktionary:Requested entries (Latin) (WT:RE:la) is that the former is of requested redlinks. (then again, most "requested entries" are redlinks too) Do we want to keep the name Wiktionary:Wanted entries? I don't even have a better idea for a name, that's why I didn't create an RFM. Maybe I'd better just create the shortcuts, but first I'd like to know if it's okay, since they further cement "Wiktionary:Wanted entries" as the name we should use.

For reference, the page was created in 2005 as Wiktionary:project-wanted articles, then moved in 2008 to Wiktionary:project-wanted entries and finally moved again in 2010 to Wiktionary:Wanted entries. --Daniel Carrero (talk) 04:56, 1 March 2016 (UTC)

Bashkir Transliteration policy

Hi all, I noticed someone has recently introduced changes into the active Bashkir transliteration template(s). I could not track down who that was, or where these are actually stored. Folks, can you please help me figure out who that was, explain to me how these policies are accepted, and why there has been no public discussion on this before the changes were made? Borovi4ok (talk) 09:08, 2 March 2016 (UTC)

It appears to be this edit by User:Amateur55. The edit summary given was "Fixed transliterations in accordance with WT:BA TR". Wyang (talk) 09:13, 2 March 2016 (UTC)
Thanks a lot for helping out the noob ) Borovi4ok (talk) 09:40, 2 March 2016 (UTC)

Inspire Campaign: Making our content more meaningful

The second Inspire Campaign has launched to encourage and support new ideas focusing on content review and curation in Wikimedia projects. Wikimedia volunteers collaboratively manage vast repositories of knowledge in our projects. What ideas do you have to manage that knowledge to make it more meaningful and accessible? We invite all Wikimedians to participate and submit ideas, so please get involved today! The campaign runs until March 28th.

All proposals are welcome - research projects, technical solutions, community organizing and outreach initiatives, or something completely new! Funding is available from the Wikimedia Foundation for projects that need financial support. Constructive, positive feedback on ideas is appreciated, and collaboration is encouraged - your skills and experience may help bring someone else’s project to life. Join us at the Inspire Campaign and help your project better represent the world’s knowledge! I JethroBT (WMF) 19:54, 2 March 2016 (UTC)

Entries with no definitions and no citations

User:Equinox has expressed the view that entries with no definitions and no citations should be speedied (see Talk:desklapizar and diff), and I think I agree. Does anyone disagree? If not, I will go ahead and delete every entry in Category:Ido entries needing definition that has no definitions and no citations. —Mr. Granger (talkcontribs) 21:32, 2 March 2016 (UTC)

I am OK with that, but it would be great if they ended up on the requested list for the applicable language. If you don't feel like doing that, though, I don't blame you. (edit @Embryomystic, who created a number of these.) - TheDaveRoss 21:44, 2 March 2016 (UTC)
@Embryomystic, TheDaveRoss: The ping does not get sent if you just edit your message, you must add a new message. --Daniel Carrero (talk) 21:50, 2 March 2016 (UTC)
Thanks, and weird. - TheDaveRoss 21:52, 2 March 2016 (UTC)
Okay, I'll look at the pages. It's possible that I overreached a bit on some of them, and I see they've been added to lists of Ido entries that are lacking. I'll add definitions to the ones that I can do that to without overreaching, and I guess the rest will just end up getting speedied. I agree that they should be put on the requested list at very least, though embryomystic (talk) 02:16, 4 March 2016 (UTC)
The usual way we use lists of words without definitions is to put them in a user subpage. DCDuring TALK 02:41, 4 March 2016 (UTC)
I agree that the content of Category:Ido entries needing definition (currently 85 entries) should be speedied. The reason for speedy delete is the conjunction of (a) lacking definition and (b) appearing unattested by a quick web search. Here's e.g. google:"predikegar". Since they appear unattested, collecting them in a request page seems of questionable utility. Beware that also the inflected forms need to be deleted. The deletion summary could be "Speedy delete a definitionless term that, on the face of it, appears unattested. Do not readd without attesting quotations meeting WT:ATTEST", or the like. --Dan Polansky (talk) 21:24, 4 March 2016 (UTC)
Category:Ido entries needing definition now only has 7 entries. The rest has been filled with definitions by the creator of these entries but it does not mean they are attested. I don't know how to handle the volume of suspect entries left in en wikt database by the user. I could start systematically sending them one by one to RFV, for each one that looks suspect based on a quick web search, but that would flood RFV quite badly. --Dan Polansky (talk) 14:14, 5 March 2016 (UTC)
It's a problem that seems to happen with constructed languages in general—people invent words that are morphologically plausible (or sometimes morphologically implausible) and create entries for them even if they aren't attested. Our Esperanto entries seem to be under control, but both Ido and Volapük have a large number of apparently unattestable entries. Our coverage of Novial, Interlingua, and Interlingue is much less extensive, and Lojban morphology is very isolating, so I imagine there's less of a problem in those languages. —Mr. Granger (talkcontribs) 15:15, 5 March 2016 (UTC)
Maybe we could develop a quick superficial test for speedy deletion of suspect entries. Like, if the lemma has zero hits at Google Books, and has no attesting quotation in Wiktionary, it is speedied, and can be created again when attesting quotations are supplied. It would be a temporary measure applied only to languages suspected to have a fairly large number of unattested entries in en wikt. I don't know whether Google Books is not too stringent for Ido, though. --Dan Polansky (talk) 18:29, 5 March 2016 (UTC)
Ido isn't really well represented on Google Books (for good economic reasons). There are many books without previews, of which some can be found on Ido-Vivo. And a lot of the texts that are on Google Books are textbooks and/or propaganda, like the notorious Lehrbuch der Weltsprache IDO fur Arbeiter. Lingo Bingo Dingo (talk) 14:02, 17 March 2016 (UTC)
  • If they actually have no definitions, I guess speedying is OK. If they just have definitions that need a little TLC, however... Purplebackpack89 14:41, 5 March 2016 (UTC)
  • Seeing no objections, I have deleted the remaining Ido entries with no definitions and no citations. As requested, I've added them to Wiktionary:Requested entries (Ido), in case they do turn out to be attestable with some meaning. —Mr. Granger (talkcontribs) 23:08, 5 March 2016 (UTC)

You guys should be specific that you're talking about Ido entries. There are hundreds of fresh Russian and old Chinese entries needing definitions, and they are needed. People do fill in definitions. --Anatoli T. (обсудить/вклад) 23:50, 5 March 2016 (UTC)

Oppose if applied to all languages. Same reasoning as in the discussion about allowing definitionless entries before. Wyang (talk) 10:56, 6 March 2016 (UTC)
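
The quick test proposed earlier in this thread (zero hits at Google Books, no attesting quotation in the entry, and only for languages suspected of having many unattested entries) could be sketched roughly as below. This is only an illustration: the `Entry` record, the `SUSPECT_LANGUAGES` set and the `speedy_candidate` function are hypothetical names, and the hit count is assumed to be supplied by whoever runs the check rather than fetched by the code.

```python
from dataclasses import dataclass

# Assumption: the temporary measure applies only to these languages.
SUSPECT_LANGUAGES = {"Ido", "Volapük"}


@dataclass
class Entry:
    title: str
    language: str
    quotation_count: int    # attesting quotations already present in the entry
    google_books_hits: int  # supplied by the person or tool running the check


def speedy_candidate(entry):
    """Return True if the entry fails the quick superficial test."""
    if entry.language not in SUSPECT_LANGUAGES:
        return False  # other languages are left alone
    if entry.quotation_count > 0:
        return False  # an attesting quotation is enough to keep the entry
    return entry.google_books_hits == 0
```

An entry flagged this way could still be recreated later once attesting quotations are supplied, as suggested above.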

Pageviews graph

Just so you know, I have copied wp's {{PageViews graph}} here. This template generates a <graph> with the current page views statistics. There's also a user script that adds a "View statistics" item to the "More" menu, showing view statistics in a popup dialog. You can use it by adding the following line to your /common.js

mw.loader.load('//en.wikipedia.org/w/index.php?title=User:קיפודנחש/viewstats.js&action=raw&ctype=text/javascript');

--Dixtosa (talk) 09:30, 3 March 2016 (UTC)

Suggestion: give etymologies to irregularities

We’re not supposed to give etymologies to noncanonical forms because such etymologies tend to be obvious and uninteresting, but I think that we should make an exception for irregularities since their origins are more difficult to determine. Here’s an example:

Any objections? --Romanophile (contributions) 18:31, 3 March 2016 (UTC)

They should be put at the lemma form. The etymology of était is the same as the other past forms. —CodeCat 18:38, 3 March 2016 (UTC)
A fair point I suppose, but wouldn’t it bother you if one etymology section was inflated to accommodate all of the irregularities? --Romanophile (contributions) 19:38, 3 March 2016 (UTC)
No that wouldn't bother me. Don't forget that the lemma is just the form chosen to represent all the other forms. Thus, any etymological information regarding any of the forms should be in the etymology section of the lemma. --WikiTiki89 19:49, 3 March 2016 (UTC)
We already have entries that use show-hide bars to conceal portions of overly long etymologies (eg, long cognate lists) so the length of an etymology section need not be a consideration. DCDuring TALK 20:04, 3 March 2016 (UTC)
@CodeCat, DCDuring, Wikitiki89: yeah, I guess that this wasn’t such a good idea. I’m glad that I asked before doing a shitload of them, though! --Romanophile (contributions) 20:29, 3 March 2016 (UTC)

Okay here’s my attempt. Any comments? --Romanophile (contributions) 23:17, 3 March 2016 (UTC)

Why not just list which parts of the paradigm come from which Latin verb/PIE root? You don't need to go into all that detail, and keeping it short gets the point across better. Compare zijn. —CodeCat 23:34, 3 March 2016 (UTC)
All right, I simplified it. Do you like it? --Romanophile (contributions) 00:29, 4 March 2016 (UTC)
The information is useful (and interesting!), but I think some sort of tabular presentation would be vastly better. Writing this sort of stuff out as prose just makes it slightly harder to grasp. Imaginatorium (talk) 06:45, 4 March 2016 (UTC)
I would go for some combination of the two. Personally, I find the longer version very interesting, and would love to see it included, as it shows the development of each form over time. I created my own version (the collapsible box is probably using the wrong template, but I'm not sure what I should be using). Andrew Sheedy (talk) 10:38, 4 March 2016 (UTC)
@Romanophile What do you think? Andrew Sheedy (talk) 04:57, 5 March 2016 (UTC)
I personally like it, since it gives a summary and also an elaborate version for those interested. --Romanophile (contributions) 05:38, 5 March 2016 (UTC)

I am not arguing for less information at all. I was going to suggest that the conjugation tables could have colour-coded backgrounds, then I discovered there aren't any conjugation tables, just lists of words. I would think that for all IE languages (even English), a standard 6 column, or 3+3 column, or similar layout would be much more effective. (I looked at Andrew Sheedy's version, but could not immediately see the meaning of the red items.) Imaginatorium (talk) 10:17, 5 March 2016 (UTC)

No, there's no rule against etymologies for conjugated forms. In French the conjugated forms of aller and être are both pretty interesting because some of them look very different to the infinitive: va, irai, suis, soit and so on. Renard Migrant (talk) 12:44, 6 March 2016 (UTC)
I mean inflections of all kinds; there's no rule against it and it's already in use so this discussion is a bit moot. Renard Migrant (talk) 12:45, 6 March 2016 (UTC)
But I don't think that is a very good practice. People might not think to look at specific inflected forms' entries when looking for this kind of information. --WikiTiki89 18:49, 7 March 2016 (UTC)

@Wikitiki89, what about pronouns? I noticed that some nominative pronouns are treated as inflected forms of their masculine equivalents, though hypothetically they could be lemmatized. --Romanophile (contributions) 02:28, 11 March 2016 (UTC)

Whether they could be lemmatized is a separate issue (and should be decided on a case-by-case basis), the point is that if they are lemmatized, they would have their own etymology section, and if they are not, then they are forms of the lemma and the etymology should be at the lemma. --WikiTiki89 16:05, 11 March 2016 (UTC)
Having a separate etymology could be an argument in favour of lemmatising, though. Not the only argument nor probably the most convincing one, but an argument nonetheless. —CodeCat 17:02, 11 March 2016 (UTC)

Reconstruction move count

In case anyone is interested:

  • Pages in the Reconstruction namespace: 2,597
  • Appendices to be moved (nonredirects starting with "Proto"): 5,963

Also, can the rest of the appendices be moved by bot? --Daniel Carrero (talk) 18:30, 4 March 2016 (UTC)

One important thing to note is that not all reconstructions start with Proto: unattested Latin terms, for example. --WikiTiki89 18:42, 4 March 2016 (UTC)
I can bot move every Appendix:Proto... page and corresponding talk page to the Reconstruction namespace if nobody else feels like doing so, but if there are more nuances to it than that I will leave it to someone who knows what is going on. - TheDaveRoss 18:45, 4 March 2016 (UTC)
The nuances are that if a page is a redirect, it should be fixed to redirect to the Reconstruction namespace as well. For non-redirects, it would be good to make sure that the page has a {{reconstructed}} (or {{reconstruction}}) template at the top, otherwise it might not actually be a reconstruction page (and it would be nice to have a list of such pages if there are any). --WikiTiki89 18:51, 4 March 2016 (UTC)
Those are all pretty straightforward, I can probably do it this afternoon if nobody beats me to it. Also see above for the list of pages. - TheDaveRoss 18:56, 4 March 2016 (UTC)
Swadesh lists aren't entries, they're lists. I don't think they should be moved. Chuck Entz (talk) 20:02, 4 March 2016 (UTC)
Thanks. I moved the ones that were actually reconstructions and left the ones that weren't. Another thing: If there are redirects from pages without a slash to pages with a slash (e.g. from "Appendix:Proto-Something foo" to "Appendix:Proto-Something/foo"), they can be deleted. --WikiTiki89 20:04, 4 March 2016 (UTC)
@TheDaveRoss: Please confirm that you read my previous post about redirects. Thanking this edit would be enough. --WikiTiki89 22:09, 4 March 2016 (UTC)
@Wikitiki89 Currently I am ignoring redirects altogether (only moving pages which are not redirects), once those are done I can circle back and fix redirects so that they all point to the right place. - TheDaveRoss 22:13, 4 March 2016 (UTC)
Except for a few Old Prussian terms I'm not sure about, all of the non-protolanguage reconstruction pages should be done. KarikaSlayer (talk) 19:29, 4 March 2016 (UTC)
Another nuance is that a lot of pages in the Index: namespace have simple wiki links to the Appendix: namespace rather than links provided by {{l}} or {{m}}. For example, Index:Proto-Indo-European/d has [[Appendix:Proto-Germanic/talō#Proto-Germanic|1]] and [[Appendix:Proto-Germanic/talō#Proto-Germanic|*''talō'']] instead of {{l|gem-pro|*talō|1}} and {{m|gem-pro|*talō}}. That means all those links will be broken whenever a reconstruction is moved to Reconstruction: space without leaving a redirect. I'm gradually going through these appendices and fixing the links to reconstructions, but there's a hell of a lot of them, and fixing them is time-consuming because it can't be done by simple search and replace. —Aɴɢʀ (talk) 19:11, 4 March 2016 (UTC)
Didn't we vote on getting rid of the Index namespace? --WikiTiki89 19:17, 4 March 2016 (UTC)
We didn't. I said I would create that vote later, but I didn't create it yet. --Daniel Carrero (talk) 19:20, 4 March 2016 (UTC)
It is simple search and replace. You search "Appendix:Proto" and replace with "Reconstruction:Proto". No?--Dixtosa (talk) 19:22, 4 March 2016 (UTC)
Well, that works if you're content with pages that say [[Reconstruction:Proto-Germanic/talō#Proto-Germanic|*''talō'']], but I'm not. If I'm going to go to the trouble of tidying up these pages, I'm going to do it right and use {{l|gem-pro|*talō}}. —Aɴɢʀ (talk) 19:50, 4 March 2016 (UTC)
I am not either. I am just saying that that is a different, unrelated, low-priority job (not intending to diminish your work) that does not really interfere with this high-priority job. This message assumes that the bot is going to do the mentioned simple search and replace too. --Dixtosa (talk) 20:18, 4 March 2016 (UTC)
It's relatively easy to do with any regex-based search and replace. --WikiTiki89 20:20, 4 March 2016 (UTC)
Not sure how prevalent it is, but if there are lots it sounds like a good candidate for that bot to-do list page. - TheDaveRoss 21:12, 4 March 2016 (UTC)
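
As a rough illustration of the regex-based replacement mentioned above, the sketch below rewrites hard-coded Appendix links into {{l}} and {{m}} calls, following the two examples given for Index:Proto-Indo-European/d. The `LANG_CODES` table, `LINK_RE` and `convert_links` are hypothetical names, the table is deliberately incomplete, and the rule "italic starred label becomes {{m}}, any other label becomes {{l}} with that label" is an assumption read off those two examples.

```python
import re

# Hypothetical, incomplete map from the language name in the page title
# to its language code; a real run would need the full list.
LANG_CODES = {
    "Proto-Germanic": "gem-pro",
    "Proto-Indo-European": "ine-pro",
}

# Matches [[Appendix:Proto-X/term#Section|label]]; section and label are optional.
LINK_RE = re.compile(
    r"\[\[Appendix:(?P<lang>Proto-[^/\[\]|#]+)/(?P<term>[^#|\[\]]+)"
    r"(?:#[^|\[\]]*)?"
    r"(?:\|(?P<label>[^\[\]]*))?\]\]"
)


def _convert(match):
    code = LANG_CODES.get(match.group("lang"))
    if code is None:
        return match.group(0)  # unknown language: leave the link untouched
    term = match.group("term")
    label = match.group("label")
    if label is None:
        return "{{l|" + code + "|*" + term + "}}"
    if label == "*''" + term + "''":
        # The label is just the italicised, starred term: use a mention.
        return "{{m|" + code + "|*" + term + "}}"
    return "{{l|" + code + "|*" + term + "|" + label + "}}"


def convert_links(wikitext):
    """Replace hard-coded Appendix:Proto-... links with {{l}}/{{m}} calls."""
    return LINK_RE.sub(_convert, wikitext)
```

Links in a language missing from the table are left exactly as they were, so the output can be checked by hand before any further pass.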

I've done a sweep to clean up all English Wikipedia links to our protolang entries (wasn't many of them really). Are there other Wikimedia projects where we can suspect links to these to be lurking about? --Tropylium (talk) 01:08, 6 March 2016 (UTC)

Done, apparently. Thank you!

  • All appendices of reconstructed terms starting with "Proto" are redirects, so I take it all of them were successfully moved to the Reconstruction: namespace.
  • {{reconstructed}} (redirect: {{reconstruction}}) is not linked by any pages in the Appendix: namespace.

Are there any unattested Latin terms, or some other reconstructed terms in appendices yet? Can we remove {{reconstruction}} from all appendices now and remove the patch that makes {{l}} and {{m}} link to the appendix namespace? --Daniel Carrero (talk) 02:21, 6 March 2016 (UTC)

  • There are no more Latin reconstructions in Appendix namespace. —Aɴɢʀ (talk) 07:34, 6 March 2016 (UTC)
    The only single-term entries that I've been able to find in the Appendix namespace are conlangs. For the purposes of moving and of the special code that CodeCat added to {{reconstructed}} and to Module:links, we're done. I'm sure there are still lots of hard-coded links to reconstructions in Appendix space, and there are no doubt obscure bits of code in modules and templates that will have to be tracked down and fixed (I believe the templates that link to the next/previous in a series, such as {{cardinalbox}}, need to be checked, for instance). Chuck Entz (talk) 07:54, 6 March 2016 (UTC)
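
For reference, the move rules discussed in this thread can be summarised in a small sketch: non-redirects carrying {{reconstructed}} or {{reconstruction}} are moved, non-redirects without it are listed for review, redirects are retargeted to the Reconstruction namespace, and redirects from a title without a slash to the same title with a slash can be deleted. The function names and the action labels are hypothetical, and this is not the code the bot actually ran.

```python
import re

# {{reconstructed}} or its redirect {{reconstruction}}, with optional parameters.
RECON_TEMPLATE_RE = re.compile(
    r"\{\{\s*(?:reconstructed|reconstruction)\s*(?:\|[^{}]*)?\}\}"
)
# A MediaWiki redirect page: "#REDIRECT [[Target]]".
REDIRECT_RE = re.compile(r"^\s*#REDIRECT\s*\[\[([^\]|#]+)", re.IGNORECASE)


def to_reconstruction(title):
    """Map 'Appendix:Foo' to 'Reconstruction:Foo'; other titles are unchanged."""
    prefix = "Appendix:"
    if title.startswith(prefix):
        return "Reconstruction:" + title[len(prefix):]
    return title


def plan_action(title, wikitext):
    """Return an (action, new_title) pair for one Appendix:Proto... page."""
    redirect = REDIRECT_RE.match(wikitext)
    if redirect:
        target = redirect.group(1).strip()
        # "Appendix:Proto-Something foo" -> "Appendix:Proto-Something/foo"
        if "/" not in title and target.replace("/", " ", 1) == title:
            return ("delete", None)
        return ("retarget", to_reconstruction(target))
    if RECON_TEMPLATE_RE.search(wikitext):
        return ("move", to_reconstruction(title))
    # No reconstruction template: possibly a list or other appendix, so flag it.
    return ("review", None)
```

Pages that come back as "review" would make up the list of appendices that start with "Proto" but may not be reconstruction entries at all.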

Template:auto cat

I created this template and module just now. It is used as a category boilerplate template. It tries to automatically detect the type of template that the category requires, and the parameters that go with it. It works with a good number of category types already, even though the module itself is quite small and simple. I don't know if it works with subst: right now, it would be nice for sure if it did. Ideally, substing it would give you the underlying template invocation. —CodeCat 20:08, 4 March 2016 (UTC)

Bot spam

Now that it's abundantly clear why we have bot flags, can we please block User:TheDaveBot until it gets one? My watchlist is completely unusable. —CodeCat 15:48, 5 March 2016 (UTC)

[1] DTLHS (talk) 15:55, 5 March 2016 (UTC)
I could find no way to flag a move as a bot. For future reference, you can use the "inverted" namespace selection to look at everything which isn't a move from one namespace to another. - TheDaveRoss 16:12, 5 March 2016 (UTC)
Bot flags are given to the user as a whole. You have to apply for one and get it voted on. I don't think the vote will fail, but you are technically violating WT:BOT by running a bot without a vote... I think people turn a blind eye because the work your bot is doing is good and needed. —CodeCat 16:34, 5 March 2016 (UTC)
The account has a bot flag, and a flood flag, and the bot and minor flags were set on the API calls. Moves cannot be flagged. - TheDaveRoss 19:52, 5 March 2016 (UTC)
TheDaveRoss, thank you for your volume renames, performed in alignment with an objectively verifiable consensus as evidenced by a vote. You don't need to take these attacks from the greatest WT:BOT violator the English Wiktionary ever had very seriously. Thanks again. --Dan Polansky (talk) 20:16, 5 March 2016 (UTC)
For the record, the enbotting vote is at User_talk:TheDaveBot#Discussion and Vote on TheDaveBot, from September 2006; the bot flag was granted by Dvortygirl as per that page. This was around the time Wiktionary:Votes was started on 22 September 2006 by the very TheDaveRoss. --Dan Polansky (talk) 08:09, 6 March 2016 (UTC)

Collapsible derived terms

One user is currently in the process of systematically placing derived terms in collapsible boxes, for English and non-English entries alike, in significant volume. I am not happy about it; if you are unhappy too, we can try to do something about it. --Dan Polansky (talk) 15:09, 6 March 2016 (UTC)

Why are you unhappy about it? For any long (>6 items) list it makes a larger portion of the overall entry visible at once. I thought there was a gadget or preference which allowed them to be open or closed by default for registered users. I can only find that (browser) preference implemented for translation boxes. Would such a gadget satisfy you? DCDuring TALK 17:24, 6 March 2016 (UTC)
It's in the sidebar, in the "Visibility" section. --Yair rand (talk) 18:49, 6 March 2016 (UTC)
I thought they didn't persist, but I see that they do. Thanks. DCDuring TALK 19:47, 6 March 2016 (UTC)

Wikipedia article titles as a durably archived source

We don't allow Wikipedia articles as sources because, I presume, they can be modified and anyone can write stuff on them in theory. However, the names of articles themselves are often much more stable. Furthermore, Wikipedia itself is durably archived (through countless mirrors). So I wonder if the titles of articles could be allowed as attestations for terms? We could set some restrictions, such as that the article has to have existed with that name for X years. —CodeCat 19:23, 6 March 2016 (UTC)

Strongly oppose. We shouldn't allow Wikipedia as a source because editors of small language Wikipedias (and even some bigger ones, like Portuguese) make up protologisms all the time. They have the requirement of their information being true, but the name they use for the title doesn't need to be a name that anyone actually calls it by. We have to have those names be true as well, though. Because limited documentation languages only require a single use or mention, this would allow thousands of protologisms to enter Wiktionary. —Μετάknowledgediscuss/deeds 19:34, 6 March 2016 (UTC)
Better yet let's pretend we stop caring about being durably archived. If being durably archived actually mattered, we'd've accepted Wikipedia 10 years ago. What we actually allow is published books and Usenet. It might not sound great in a policy document, but I personally think we should stop lying. Renard Migrant (talk) 19:45, 6 March 2016 (UTC)
We allow durably archived media, not just books and Usenet. We cite books, journals, magazines, newspapers, archived films and television shows and songs (in the US and UK, modern commercially-released films and shows and songs are archived: libraries keep copies just like they keep copies of books; but e.g. a Somali TV show probably isn't durably archived and couldn't be cited), as well as monuments and runestones, and that list probably isn't exhaustive. - -sche (discuss) 04:23, 7 March 2016 (UTC)
Even titles can change (and be deleted such that the history can't be cited), which speaks against the idea that they are durably archived, and they are also often made-up. Another reason for not accepting Wikipedia is that for a WMF project to cite a WMF project is circular; perhaps it would work for terms that we were claiming as WMF jargon, but if we're asserting a term is generally used, we need to show it's generally used. - -sche (discuss) 04:27, 7 March 2016 (UTC)
Exactly, de facto durably archived stuff that is unreliable or allows self-referencing we don't allow. Which I wholly support. But we don't document this practice anywhere, in fact we tend to deny its existence until someone tries such a source, like citing a Wikipedia article as a source for a word. Renard Migrant (talk) 19:15, 7 March 2016 (UTC)
Another question people ask is (merging this into one) why don't we allow stuff accessible via archive sites like the wayback machine and why don't we durably archive stuff ourselves by taking screenshots. Renard Migrant (talk) 19:16, 7 March 2016 (UTC)
Websites can be removed from the wayback machine upon requests (via robots.txt). Screenshots are worth little more than our direct quotations. The point is that someone who doesn't believe our direct quotation can go look it up themselves, and screenshots do not fix that problem. --WikiTiki89 19:20, 7 March 2016 (UTC)
Anything used as a Wikipedia article title, if not nonsense, would already have to come from some other archived source. bd2412 T 22:29, 7 March 2016 (UTC)
Ideally, yes, but in practice, especially with small minority languages, that is not the case. --WikiTiki89 22:42, 7 March 2016 (UTC)
Again, even with bigger ones like Portuguese Wikipedia (Ungoliant can attest to this), they make stuff up. Remember, the ideas must be cited, not the words used to describe or name the ideas. —Μετάknowledgediscuss/deeds 01:16, 8 March 2016 (UTC)

Inter-project links to missing pages

I am not sure if there is already a policy in place about links to other projects, but I have run across a number of such links which do not lead anywhere. I propose that we modify the templates ({{wikipedia}} most notably) to accept a parameter which hides the template from the page and categorizes the page to be reviewed. I think that many of these cases just require choosing an appropriate target page on the sister project, perhaps some cases should be removed altogether. A bot could easily maintain such a system, periodically checking all usages (either through dumps or whatlinkshere). I assume that nobody thinks links to non-existent pages on sister projects are useful, but maybe I am missing something. - TheDaveRoss 23:35, 6 March 2016 (UTC)

What IMO would be more useful would be to run the list of target pages from {{wikipedia}} and its relatives against the list of mainspace headwords at WP. The ones that were to non-existent pages would be problems to be cleaned up (and perhaps those to redirect pages to be bot modified). Something similar could be done for links to other sister projects. Perhaps focusing on links to a smallish sister project, but one which had at least a hundred links from here would be a good way to test the concept. DCDuring TALK 00:35, 7 March 2016 (UTC)
I think we might be saying the same thing, but differently. I am suggesting that we generate lists of pages which have {{wikipedia}} links (and similar) where the target of the link does not exist on that project. Is that what you are saying too? - TheDaveRoss 00:48, 7 March 2016 (UTC)
I'm glad. I was afraid you were suggesting that when a user discovered a dead link s/he simply manually insert a flag that would place the page in a category. I'd been thinking about such a thing for quite a while. It might also be handy for the {{w}} and in-text hard links to other projects, each meriting inclusion in a different category or dump-derived cleanup page. The situation isn't quite the same with {{taxlink}} and {{vern}} but they might benefit from something similar. The complications don't make them a good place to start. DCDuring TALK 01:53, 7 March 2016 (UTC)
There are a few ways to accomplish this, my preference is to include a parameter on the templates (so they can be hidden/delinked), but we could also just add a hidden category to the pages which contain such links, or remove the links, etc. Any preferences there? - TheDaveRoss 03:09, 7 March 2016 (UTC)
Why hide the template on such pages and not just remove it or let it remain on whatever cleanup list (of entries linking to nonexistent WP pages) gets generated? Btw, in the other direction, Wikipedia had a project a while ago to find words it used that weren't present on Wiktionary, many of which were misspellings on WP's part but others of which were omissions on our part; I can't offhand find it. - -sche (discuss) 04:32, 7 March 2016 (UTC)
The main reason to hide rather than delete is that someone, at some time, thought that a link to Wikipedia was beneficial. It could be removed and a category could be added, but that makes more work for whoever actually fixes it down the road (they would have to both add the template and remove the category rather than simply changing {{wikipedia|missing=true}} to {{wikipedia|New target}}). As I said above, there are a number of ways to accomplish the same thing, I am not opposed to any of them really. - TheDaveRoss 12:06, 7 March 2016 (UTC)
See predator bug & Category:Wikipedia link with missing target page for an example. - TheDaveRoss 03:33, 7 March 2016 (UTC)
I added correct links under "External links" for predator bug.
The first thing to try for links to species names and the remedy in the case of [[predator bug]] is to link to the genus instead of the species. For Translingual entries in general a link one rank/level higher in the hierarchy often finds a valid link. Sometimes the taxon that is parameter 3 in {{taxon}} solves the problem adequately. IOW, many Translingual entries could have the links corrected by bot. As to the choice of first tool for correcting the problem I suppose it is a question of whether there are thousands of bad links in Translingual entries or a much smaller number. DCDuring TALK 11:41, 7 March 2016 (UTC)
If the algorithm is fairly rigid (if the link is missing then try to link one level higher, stopping at [family? class?]) then that is certainly something which could just be fixed along the way rather than categorizing and leaving for someone else. - TheDaveRoss 12:06, 7 March 2016 (UTC)
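
A minimal sketch of the check discussed in this thread, under stated assumptions: the entry wikitext and the set of existing Wikipedia titles are taken to come from dumps, a {{wikipedia}} call with no positional parameter is assumed to point at the entry's own title, and only the first letter of a title is treated as case-insensitive. All function names are hypothetical, and the taxon names in the usage note are invented placeholders.

```python
import re

# {{wikipedia}} with optional parameters (nested templates are not handled).
WIKIPEDIA_TEMPLATE_RE = re.compile(r"\{\{wikipedia(?:\|([^{}]*))?\}\}")


def normalize(title):
    """Uppercase the first letter, as Wikipedia page titles do."""
    return title[:1].upper() + title[1:]


def wikipedia_targets(page_title, wikitext):
    """List the Wikipedia titles that the {{wikipedia}} calls on a page point to."""
    targets = []
    for match in WIKIPEDIA_TEMPLATE_RE.finditer(wikitext):
        params = (match.group(1) or "").split("|")
        positional = [p.strip() for p in params if p.strip() and "=" not in p]
        # Assumption: with no positional parameter the link goes to the page name.
        targets.append(normalize(positional[0] if positional else page_title))
    return targets


def missing_targets(pages, existing_titles):
    """Map each entry title to its {{wikipedia}} targets that do not exist."""
    report = {}
    for title, text in pages.items():
        missing = [t for t in wikipedia_targets(title, text)
                   if t not in existing_titles]
        if missing:
            report[title] = missing
    return report


def fallback_target(lineage, existing_titles):
    """Try each taxon from most specific upwards; return the first that exists.

    `lineage` should already stop at whatever rank is chosen as the limit.
    """
    for taxon in lineage:
        if normalize(taxon) in existing_titles:
            return normalize(taxon)
    return None
```

For example, with existing titles {"Water", "Examplus"}, an entry whose text is {{wikipedia|Examplus minor}} would be reported as having a missing target, and fallback_target(["Examplus minor", "Examplus", "Examplidae"], ...) would suggest "Examplus" as the replacement one level up.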

Links to discussions often broken

Links to supposedly ongoing discussions that appear in banners such as "A user suggests that this entry be cleaned up ..." are frequently broken, normally because the discussion has been archived. Is it not possible to make durable links? 86.152.161.107 01:40, 11 March 2016 (UTC)

It's not much of an excuse but many of the discussions never took place or have little helpful content. Sometimes there was a cleanup but the tag was not removed.
We have many problems not tagged with RfCs, particularly in the area of definition quality, such as having definitions copied unchanged from Webster 1913 or Century 1911 using words in ways that are obscure or misleading. DCDuring TALK 23:51, 11 March 2016 (UTC)
Yeah the tag is supposed to be removed but people often forget. Renard Migrant (talk) 11:03, 12 March 2016 (UTC)

Diacritical marks

I think we should not give our readers the false impression that the orthographically correct spelling of some words (especially Latin and Ancient Greek words) includes diacritical marks such as the macron. This should be stated in the lemma in a special section or in a special "way". In particular, we should not show these marks in the etymology section, in the section for descendants, or in the declension or inflection tables. The declined or inflected forms found in those tables have their own articles, where the term with diacritics (which is not orthographically correct) can also be shown. Also, in the "header", a diacritical mark that is not commonly written, but is added only when a teacher or lecturer asks for it, should be marked as such: not as part of the correct orthography but as a special mark. (By "header" I mean the "word" that templates such as {{grc-noun}} or {{la-verb}} produce and show on the page.)

An ordinary reader who looks at procrastinate sees "prōcrastinātum" and says: Oh! Look how the Romans typed their letters! It is fascinating! My school book has typos! I must inform the editors of my book about those typos! --Xoristzatziki (talk) 08:09, 14 March 2016 (UTC)

Wiktionary:About_Latin#Macrons_should_be_used_only_within_pages. --Anatoli T. (обсудить/вклад) 08:53, 14 March 2016 (UTC)
@Xoristzatziki Why not complain about all the languages that don't normally use diacritics in running text, though we show them? :) Shall I start listing? --Anatoli T. (обсудить/вклад) 08:56, 14 March 2016 (UTC)
First off: We as a dictionary are not responsible for daft kids with too much time on their hands and too little scrutiny in their reading. Secondly, we already clearly express when diacritics are part of the actual orthography or a scholastic addition by choice of page title. Which brings me back to point number one. Korn [kʰũːɘ̃n] (talk) 09:31, 14 March 2016 (UTC)
I agree (with Korn). - -sche (discuss) 05:54, 16 March 2016 (UTC)
@Atitarev My complaint is about all languages that don't normally use(d) diacritics in running text. grc and la (and the macron) are only examples.
@Korn Most users who come here expect to find the definition (primarily) and the declension or inflection (possibly secondarily) of a word. Sure, there are philologists and the like who also want to know whether the vowel was pronounced long, or maybe would-be poets looking for rhymes. But no one will find text written that way in a normal Latin, Ancient Greek, etc. book.
The "so-called" clarity of the page title is not so clear to actual readers (as opposed to the ones involved in editing), just as redirect pages are sometimes confusing. As mentioned above, if a diacritic is a scholastic addition then (IMHO) it should be placed in a place for scholastic additions and nowhere else, and personally I do not think of the header in the definitions as such a place. To me it "looks" as if we were placing the pinyin in the header of Chinese characters instead of the actual pictograms, or placing the transliteration instead of the actual word (for other languages). --Xoristzatziki (talk) 11:57, 14 March 2016 (UTC)
But anyone can find the text in a normal Latin, Ancient Greek dictionary. Likewise, no one will ever find gender, principal parts, etc. in running text. Chuck Entz (talk) 12:24, 14 March 2016 (UTC)
@Xoristzatziki Users need to know how to use dictionaries. We include diacritics for Arabic, Hebrew, Russian, Serbo-Croatian, etc. These languages normally don't use diacritics in a running text. We have policy pages, which describe the purposes of the symbols. It's the way it is. You can suggest a vote on removing diacritics but I doubt it will pass. --Anatoli T. (обсудить/вклад) 12:40, 14 March 2016 (UTC)
OK (regarding "but I doubt it will pass"). I am just pointing out (and registering...) my strong opposition to (such special) departures from ordinary (and especially electronic) dictionaries. Users come here to find a definition and everything they are accustomed to finding in an everyday dictionary (so this should be a simple page, as simple as we can make it). --Xoristzatziki (talk) 05:03, 16 March 2016 (UTC)
Ordinary dictionaries that are any good certainly show these diacritics. This is simply part of how Wiktionary strives (and succeeds in some languages, I might add) to be a better resource, in terms of balancing completeness and user-friendliness, than any others freely available online. —Μετάknowledgediscuss/deeds 05:31, 16 March 2016 (UTC)

If only we had discussed this at length recently. —JohnC5 15:01, 16 March 2016 (UTC)

If I'm a casual user and I come across procrastino with the page name different from the headword, how do I know where to look for an explanation of this phenomenon? Unless I'm missing something, the answer is that I can't possibly know where to look, as there's no explanation whatsoever in the entries. Renard Migrant (talk) 22:01, 16 March 2016 (UTC)

Use of "inh" vs "etyl"

Apologies if this has been covered elsewhere already, but it's more of a question of how to handle etymologies from now on. I've noticed more entries are getting the {{inh}} tag if they are inherited from another language, as opposed to the simple {{etyl}} one that has been widely used until recently. Is this how all entries should be treated from now on if it is known they are inherited (the same going for borrowed ones, respectively), under Wiktionary policy now?

I ask because for most of the Romance languages, the vast majority are still under the plain {{etyl}} kind; for example https://en.wiktionary.org/wiki/Category:Italian_terms_derived_from_Latin, as opposed to https://en.wiktionary.org/wiki/Category:Italian_terms_inherited_from_Latin, which does not also put those words in the aforementioned category. What's going to happen to the original category (the one called simply Italian terms derived from Latin)? They're not mutually exclusive, in my opinion, but now we're sort of in between... If someone is interested in finding a list of words in Italian that are derived from Latin and they click on that category, they won't find some important basic words because they'll be in the "inherited" category. It's going to take a really long time to manually sort through all the words and replace the inherited ones with the correct tags... I just got through most of the Romanian inherited lexicon using the {{etyl}} ones.

Is the "derived from" category just now going to be for words in general derived from that language, especially if it isn't known they are inherited or borrowed? Is it going to be needed anymore, in the long run, if we're distinguishing inherited vs not? Or should inherited terms be put into both categories, such as by manually adding a category tag at the bottom? How should this be treated? Word dewd544 (talk) 19:29, 16 March 2016 (UTC)

If I'm not mistaken (and please forgive me if I am, I'm not up on this either...), the Category:Italian_terms_derived_from_Latin should remain for those Italian words that were borrowed from Latin, as opposed to inherited (i.e. remained in VLat throughout its history), correct ? Leasnam (talk) 19:35, 16 March 2016 (UTC)
The traditional categories continue to be used by {{bor}}. They are also used by {{der}}, which is used in terms borrowed indirectly. Don’t add the categories manually; let the templates worry about categorisation.
Indeed, it will be a lot of hard work to update all the etymologies, but I personally think it’s a step in the right direction. As you know, borrowing vs. inheritance is an extremely important distinction in etymology. — Ungoliant (falai) 19:39, 16 March 2016 (UTC)
Yeah, I guess we just have to accept that some of these categories are going to be a bit disjointed for the time being, until everything is properly sorted into the right one; I wish I would've seen the policy discussions on these matters earlier so I could've avoided using {{etyl}} lately. The only problem is, what if there are words where it is uncertain if they were borrowed or inherited? Do we just use {{der}} in those cases, and say "possibly" or "probably" borrowed/inherited before it, until we can find a source that removes the ambiguity? Unfortunately, there aren't always academic sources to cite for every word.
Additionally, if you really delve into it and do more in-depth research, you'll actually find the majority of words in every Romance language, on a purely numerical basis, were borrowed (Sardinian might have among the highest proportions inherited)... Though of course, when it comes to the core lexicon/vocabulary, inherited terms are almost completely dominant for all these languages, despite only consisting of a few thousand words among a much larger overall lexicon. Not a lot of people outside of linguists, philologists, or etymologists specializing in Romance languages and their evolution know this, however, so it may seem strange for some to see that a fairly common word in Italian or Spanish or French was actually a borrowing.
Also, another technical question: can words be "borrowed" from Vulgar Latin, especially by a Romance language? There are a few cases where some words may have been (depending on what is defined as a Vulgar Latin word, of course- maybe Late Latin or Medieval Latin are actually more accurate in these cases), but it seems by its very nature, Vulgar Latin would be the source of primarily inherited words since it, for the most part, was a natural language and more spoken than written. Word dewd544 (talk) 23:07, 18 March 2016 (UTC)
A side-question: Should {{bor}} put entries in its own "borrowed from" category like {{inh}} already does? —CodeCat 20:27, 16 March 2016 (UTC)
Yes. --WikiTiki89 20:30, 16 March 2016 (UTC)
Can {{bor}} be used at any step of an etymology, not just the most recent ? (i.e. if a term was inherited in English from Middle English, but borrowed by Middle English from Old French --a non-Modern English borrowing...) Leasnam (talk) 21:34, 16 March 2016 (UTC)
The borrowing has to have occurred within the "current" language. If a term was borrowed into Middle English, it is not a borrowing in modern English but would be considered inherited from Middle English. —CodeCat 21:37, 16 March 2016 (UTC)
Thanks :) Leasnam (talk) 21:38, 16 March 2016 (UTC)
That's not how I've been using it. If a modern English word has been inherited from Middle English and the Middle English word was borrowed from Old French, then the modern English word belongs in both Category:English terms inherited from Middle English and Category:English borrowed terms. There's no reason to treat the two as mutually exclusive. —Aɴɢʀ (talk) 21:52, 16 March 2016 (UTC)
Would this be done only for "recent" borrowings (2-3 steps back ?). Otherwise, if we follow this logic, the further back we go, the more confusing it will become. For instance, if a word was borrowed into Old English from Latin, then inherited by Middle English and English, I can see this working. But if the same Latin word was itself borrowed by Latin from Greek, then it starts to get hairy. And likewise, if the Greek word was borrowed from Phoenician or Persian, then is it still a Persian borrowing in English ??? Leasnam (talk) 22:13, 16 March 2016 (UTC)
It's only confusing if we start categorizing terms by the language they were borrowed from. {{bor|en|fro}} puts things into Category:English borrowed terms and Category:English terms derived from Old French, but we don't allow categories like Category:English terms borrowed from Old French, so it doesn't matter whether it's borrowed directly or indirectly. (I see that Category:English terms borrowed from Old French actually suddenly exists now, even though there's been no consensus to create categories like that, so that's a shame. At any rate, my argument still holds according to the state of affairs up to 48 hours ago.) —Aɴɢʀ (talk) 15:25, 17 March 2016 (UTC)
That's not what the documentation says. —CodeCat 21:54, 16 March 2016 (UTC)
There is notext=1, which is useful for a term that was, say, borrowed into Latin, borrowed into Old French, then borrowed into Middle English, to avoid having 'borrowing' three times in a paragraph and making it quite a lot longer. Renard Migrant (talk) 21:58, 16 March 2016 (UTC)
That's still not how the documentation says the template should be used, though. It should specifically be used only for borrowing into the current language, anything else should not use this template. —CodeCat 22:00, 16 March 2016 (UTC)
So change the documentation. Renard Migrant (talk) 23:07, 16 March 2016 (UTC)
Is there a consensus for changing the practice established by the documentation? —CodeCat 23:22, 16 March 2016 (UTC)
I think that {{bor}} should only be used for borrowings directly into that stage of the language. Thus, words borrowed into Middle English should not use {{bor}} in Modern English. --WikiTiki89 14:43, 17 March 2016 (UTC)
Is there a consensus for using the practice established by the documentation? —Aɴɢʀ (talk) 15:25, 17 March 2016 (UTC)
Belatedly, but: as long as we treat languages like English, Middle English and Old English as distinct, I for one support the idea that {{bor}} should only be used for borrowings later than Middle English, not for things inherited from ME or OE. The idea after all is that each term is exactly one of inherited or borrowed (or coined etc.), and thus should not simultaneously be in both e.g. "English terms inherited from Old English" and "English borrowed terms".
On the other hand, if a term is marked as borrowed from Latin, and inherited from Middle English at the same time, you can infer that the borrowing happened in Middle English. —CodeCat 19:44, 20 April 2016 (UTC)

Comment: I don't like that {{etyl}} and {{der}} have their parameters inverted. Either of these does the same thing:

Result:

--Daniel Carrero (talk) 21:39, 16 March 2016 (UTC)

It can be confusing, but {{etyl}} is the odd one out here. It's the only template on Wiktionary (that I know of) that has the current language specified with the second parameter. All the others use either the first or lang=. See Wiktionary:Templates with current language parameter. —CodeCat 21:44, 16 March 2016 (UTC)
I'd support deleting {{etyl}} in favor of keeping only {{der}}. --Daniel Carrero (talk) 21:48, 16 March 2016 (UTC)
Purely for historical accuracy, {{etyl}}'s the odd one out only because {{der}}, {{bor}} and {{inh}} were designed to be the opposite way round to {{etyl}}. Renard Migrant (talk) 21:58, 16 March 2016 (UTC)
It is true, {{etyl}} was created in 2008 and the others in 2015. But I understood CodeCat's remark like this: we have {{head|en|...}}, {{label|en|...}}, {{m|en|...}}, {{l|en|...}}, we might as well have {{der|en|...}}, {{bor|en|...}} and {{inh|en|...}}. In all cases, "en" is the language of the current section (English). --Daniel Carrero (talk) 22:01, 16 March 2016 (UTC)
{{der}}, {{bor}} and {{inh}} would not make sense if their language parameters were inverted. Yes, this does make it confusing when compared to the older {{etyl}}, but I think this is a good change. {{etyl}} should not be changed, however. --WikiTiki89 14:43, 17 March 2016 (UTC)
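The difference in parameter order discussed above can be sketched in a few lines of Python. This is an illustration only and not Wiktionary's actual template code: the helper name `current_and_source` is hypothetical, and it relies solely on what the thread states, namely that {{etyl}} takes the current language as its second parameter while {{der}}, {{bor}} and {{inh}} take it as their first.

```python
from typing import Sequence, Tuple


def current_and_source(template: str, params: Sequence[str]) -> Tuple[str, str]:
    """Return (current language, source language) for an etymology template call."""
    if template == "etyl":
        # Older order: source language first, current language second.
        source, current = params[0], params[1]
    elif template in ("der", "bor", "inh"):
        # Newer order: current language first, as with head, label, m and l.
        current, source = params[0], params[1]
    else:
        raise ValueError(f"not an etymology template: {template}")
    return current, source


# The two calls describe the same derivation, English from Latin.
print(current_and_source("etyl", ["la", "en"]))
print(current_and_source("der", ["en", "la"]))
```

Anyone converting entries by hand or by bot has to swap the first two parameters in the same way.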
I think {{bor}} and {{inh}} should categorize into distinct categories and also into the generic "X terms derived from Y". This would be good both in the short term (it would address the problem that until the many entries which use {{etyl}} are updated, derivations of the same type are split into multiple categories) and in the long term: I think that it's useful to let people find all derivations from language Y in one place, no matter whether the derivation is by inheritance, by borrowing, or by an unclear route (the last of which probably ensures that some category besides "inherited" and "borrowed" will always have to exist). - -sche (discuss) 02:52, 17 March 2016 (UTC)
Yes. --WikiTiki89 14:43, 17 March 2016 (UTC)
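The categorisation proposed just above, where {{bor}} and {{inh}} each get a specific category and also feed the generic "derived from" one, might look like this in outline. It is a sketch of the proposal only, written in Python rather than in the module language the templates really use; the function name is hypothetical and language names are passed in directly to avoid guessing at a code-to-name table.

```python
from typing import List


def categories_for(kind: str, current_name: str, source_name: str) -> List[str]:
    """Categories one etymology template call would add under the proposal."""
    # Every kind feeds the generic category, so all derivations stay findable
    # in one place even while many entries still use the older template.
    cats = [f"{current_name} terms derived from {source_name}"]
    if kind == "inh":
        cats.append(f"{current_name} terms inherited from {source_name}")
    elif kind == "bor":
        cats.append(f"{current_name} terms borrowed from {source_name}")
    elif kind != "der":
        raise ValueError(f"unknown template kind: {kind}")
    return cats


print(categories_for("inh", "Italian", "Latin"))
print(categories_for("der", "English", "Old French"))
```

With this arrangement a plain {{der}}, used when the route is unclear, still lands in the generic category and nowhere else.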

Open call for Individual Engagement Grants

Hey folks! The Individual Engagement Grants (IEG) program is accepting proposals from March 14th to April 12th to fund new tools, research, outreach efforts, and other experiments that enhance the work of Wikimedia volunteers. Whether you need a small or large amount of funds (up to $30,000 USD), IEGs can support you and your team’s project development time in addition to project expenses such as materials, travel, and rental space.

Also accepting candidates to join the IEG Committee through March 25th.

With thanks, I JethroBT (WMF) 23:01, 16 March 2016 (UTC)

  • Anyone interested in working on getting corpora and a KWIC generator so we don't have to risk violating copyright when we try to produce modern definitions? DCDuring TALK 03:16, 17 March 2016 (UTC)

Template:borrowing and borrowings into ancestral stages

I have been thinking a bit about the situation above, regarding when to use {{bor}}. I think the use case can be expanded so that it can be used for borrowings into the current language or any of its ancestors. This means that cellar would become allowable in Category:English terms borrowed from Latin, despite having been borrowed in Middle English. Is everyone ok with changing practice and documentation to allow for this?

I am still opposed to placing terms there that have been borrowed from (for example) Latin through another intermediate language. Consider, for example, emperor. It is borrowed from Old French, but Old French inherited it from Latin. We don't say that emperor is inherited from Latin, that would be silly, and saying it was borrowed from Latin would be equally silly. However, consider the cognate imperial, which Old French borrowed from Latin (on account of its i). It wouldn't make sense to consider this borrowed from Latin into English any more than emperor was borrowed or inherited from Latin. English, or any of its ancestors, had no "linguistic contact" with Latin when these terms ended up in English. The only direct contact was with Old French. —CodeCat 23:22, 16 March 2016 (UTC)

What about Latin terms borrowed into Proto-Germanic? Or terms borrowed into Proto-Indo-European? It seems silly that hemp and canvas should both be in Cat:English borrowed terms, when one is inherited in unbroken succession from at least a pre-Grimm's-Law stage of Proto-Germanic, and the other is borrowed into Greek and then borrowed into Latin, and finally borrowed into Middle English from Anglo-Norman/Old French. There are attested Sumerian terms that can be found as inherited forms in modern Semitic languages, with the borrowing likely having happened literally thousands of years ago. This looks like a good way to overload the borrowed term categories and obliterate from the categories very significant distinctions about recentness/ancientness of borrowings. Chuck Entz (talk) 01:42, 17 March 2016 (UTC)
Yes, as Leasnam noted in the section above, labelling things as "borrowings" when they were borrowed into some previous stage gets hairy fast. I agree that "emperor" was not inherited from Latin and that, accordingly, it makes no sense to say "imperial" was borrowed from Latin. It seems like "derived" is the best label for such things. - -sche (discuss) 03:00, 17 March 2016 (UTC)
This issue is due to leaving out the intermediate stages. The etymology for emperor should say: Inherited ({{inh}}) from Middle English, borrowed ({{der}}) from Old French, inherited ({{der}}) from Latin. --WikiTiki89 14:48, 17 March 2016 (UTC)
But if we do that, emperor is in Category:English terms inherited from Middle English, but it isn't in Category:English borrowed terms even though it is a borrowed term. —Aɴɢʀ (talk) 15:26, 17 March 2016 (UTC)
That's the point. It was a borrowed term in Middle English, but no longer so in Modern English. --WikiTiki89 15:28, 17 March 2016 (UTC)
A word never stops being a borrowed term. It hasn't suddenly become a native term, so it remains a borrowed term for the rest of eternity. —Aɴɢʀ (talk) 15:32, 17 March 2016 (UTC)
Then statistically we should be able to say that almost every word in every language is likely a borrowed term, because when you go as far back as PIE and earlier, who the hell knows? --WikiTiki89 15:34, 17 March 2016 (UTC)
In theory yes, but in practice we'd never put {{bor}} on a word without knowing for sure that it had been borrowed from some specific named language or proto-language. —Aɴɢʀ (talk) 15:40, 17 March 2016 (UTC)
Still, it leaves many words as borrowings that were borrowed so long ago that it's irrelevant. Such as dish and kitchen. --WikiTiki89 15:50, 17 March 2016 (UTC)
I guess it all comes down to how borrowing is viewed in each respective language. A language like English is keen to observe terms borrowed from other languages--its speakers even pride themselves on such. However, there are some languages (not to mention any) which might not fancy this, but see ancient borrowings as native or at least naturalised. Either way is fine, but I think we should be consistent. Leasnam (talk) 16:03, 17 March 2016 (UTC)
To follow up a bit, a word borrowed into Proto-Germanic shouldn't be listed as a term borrowed into English, because English didn't exist yet. Same with French. A word borrowed into Latin from Greek shouldn't be a borrowed term in French because French didn't exist then (because we see Latin as distinct from its Romance descendants: i.e. Latin ceased and "new" languages emerged. French is not seen as "Modern Latin" in quite the same way that Modern English is seen as "Modern Old English"). I suppose it's natural to place a cutoff point when one language evolves into a new language (e.g. 'Proto-Germanic' into 'Old/Middle/New English' and 'Latin' into 'Old/Middle/Modern French'), but a word borrowed into Old French would be easy to consider as a borrowing into French, since language stages are really quite distinct when comparing them to separate languages (--a position that I have always held to)...definitely something to think about. Leasnam (talk) 16:17, 17 March 2016 (UTC)
Languages are always evolving into new languages. A term borrowed into English on 31 December 1499 doesn't suddenly become a native term on 1 January 1500. Nor is there a point where Proto-Germanic suddenly becomes English. Don't forget that Old English developed into Scots too, so if we call Old English "English", then is Scots also English? Old English is as much "English" as it is "Scots" after all. Just like Proto-Germanic is as much "Old English" as it is "Old High German", "Gothic" and "Old Norse". People don't suddenly stop speaking one language and start speaking others; English is just one particular dialect of Proto-Germanic, and it's so distinct from the others that we've given it a name and called it a language. But it's arbitrary. —CodeCat 16:41, 17 March 2016 (UTC)
I see your point, and come to think of it I think you're right. When I think about a scenario such as Proto-Indo-European borrowing a word from Ancient Chinese (just hypothetically in this example), and this same word being inherited through PGmc into OE, ME, and finally English...would I see it as a Chinese borrowing in English? The answer strangely is yes, I would. Leasnam (talk) 16:58, 17 March 2016 (UTC)
Yes, it's arbitrary, but we still need to have cutoffs. --WikiTiki89 17:18, 17 March 2016 (UTC)
Why? And more pressingly, what? People speak all the time of French loanwords in English, but many times those were borrowed in Middle English. How do we define the point where we no longer consider English "English" in terms of borrowing? I think this is a very muddy picture that we probably don't want to solve. I for one would like to see borrowed terms listed regardless of time depth. I want to see cheese listed in Category:English terms borrowed from Latin for sure. It fits the description perfectly: it is an English term, and it was borrowed into the language that was to become modern English from Latin. —CodeCat 18:07, 17 March 2016 (UTC)
But wouldn't Category:English terms derived from Latin be a little more fitting for a word like cheese ? [EDIT CONFLICT] For "borrow" I would expect the form of the word to be like that in Latin (in my opinion), like ergo, that to me is a word borrowed from Latin. Cheese has been altered so much within English that it doesn't look like a Latin word anymore. If I "borrow" something of yours, it's still yours. Cheese means nothing in Latin. Café is a borrowed word from French. It still means something in French. Leasnam (talk) 18:25, 17 March 2016 (UTC)
Why? —CodeCat 18:26, 17 March 2016 (UTC)
Please see above ^ Leasnam (talk) 18:28, 17 March 2016 (UTC)
Because we treat Middle English and Modern English as separate languages already and we already have a cutoff. This doesn't mean that Middle English became Modern English overnight on whatever day New Year's was in the year 1500, but just that we had to pick a cutoff and that's what we picked. We should stick with this cutoff for all intents and purposes, including borrowings. --WikiTiki89 18:35, 17 March 2016 (UTC)
CodeCat et al. I see your point about emperor (I didn't think of that). I think counting ancestor languages as separate languages for this purpose is a bad idea. Basically what we're talking about is cellar was borrowed into what we now call Middle English and has continued to exist ever since. Renard Migrant (talk) 20:40, 17 March 2016 (UTC)
I think it would be unhelpful to categorize "hemp" as having been borrowed by English from Scythian: it would dilute the category, putting things in it that I imagine many readers, like me (and apparently Leasnam), would not expect; it would also be inaccurate: English didn't exist at the time of the borrowing. I think "derived from" is the best label. What WikiTiki suggests in his comment of 14:48, 17 March 2016 (UTC) is good, especially if {{bor}}/{{inh}}/{{der}} can be made to only generate a language name (like {{etyl}}, and like someone has proposed elsewhere) and not the additional text they currently generate. I don't see how one could coherently categorize "hemp" in "English terms borrowed from Scythian" and not categorize e.g. "squaw" into "English terms inherited from Proto-Algonquian", and that would be absurd: we just had a discussion about what "inherited" and "ancestor" mean (with regard to Yiddish), and English is definitely not inheriting things from Proto-Algonquian. Perhaps the templates could detect whenever an English term was listed as deriving from anything that was not an ancestor of English, and then put the word into "English borrowed terms", but even that is questionable IMO, since "hemp" is a very different kind of "borrowed term" from "de jure". - -sche (discuss) 21:24, 17 March 2016 (UTC)
I agree with your point about "hemp", that was what I was trying to say in the second paragraph of my opening post. But I disagree that English did not exist. It did, it just wasn't called English yet. —CodeCat 21:33, 17 March 2016 (UTC)
Huh, I thought that's what you were saying (that you wouldn't want "emperor"/"imperial" or "hemp" to be listed in "English terms borrowed from X"), but then you also say you do want to see "cheese" listed in "English terms borrowed from Latin"; what's the difference? - -sche (discuss) 21:47, 17 March 2016 (UTC)
What I mean is that these terms should only be listed in the categories from which English itself (any of its ancestors) borrowed the term directly. So hemp would appear in Category:English terms borrowed from Scythian, but cannabis would appear in Category:English terms borrowed from Latin. Essentially, the rule is that a language can only borrow a term from at most one language, and any inheritances must follow the borrowing in time. —CodeCat 00:44, 18 March 2016 (UTC)
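The rule stated in the comment above (at most one borrowing within a language's own lineage, with every inheritance coming after it) can be written as a small check. This is a hedged sketch in Python with a hypothetical function name and data layout; it takes the steps within the lineage of the current language in chronological order and says nothing about what happened to the word before the donor language had it.

```python
from typing import List, Optional, Tuple

# A step is (kind, language): ("bor", donor) or ("inh", ancestor stage).
Step = Tuple[str, str]


def borrowing_source(steps: List[Step]) -> Optional[str]:
    """Return the single donor language, or None if the term was only inherited.

    Raises ValueError when the chain breaks the rule described above.
    """
    for kind, _ in steps:
        if kind not in ("bor", "inh"):
            raise ValueError(f"unknown step kind: {kind}")
    donors = [language for kind, language in steps if kind == "bor"]
    if len(donors) > 1:
        raise ValueError("a term can be borrowed from at most one language")
    if donors and steps[0][0] != "bor":
        raise ValueError("inheritances must follow the borrowing in time")
    return donors[0] if donors else None


# cellar, as described earlier: borrowed from Latin in Middle English,
# then inherited by modern English from Middle English.
print(borrowing_source([("bor", "Latin"), ("inh", "Middle English")]))
```

Whether the value returned should feed a "borrowed from" or only a "derived from" category is exactly the point under dispute in this thread, so the sketch stops short of naming one.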
Ah; then we disagree (and I agree with what I think Wikitiki and Leasnam are saying). "Hemp" should go in "English terms derived from Scythian", not "English terms borrowed from Scythian". - -sche (discuss) 00:51, 18 March 2016 (UTC)
But it was borrowed from Scythian. —CodeCat 01:21, 18 March 2016 (UTC)
It was borrowed, yes, by English in an ancestral form, that is true. It all comes down to what we want "English terms borrowed from" to mean: 1). "English words that were borrowed" OR 2). "words that [Modern] English borrowed". As above, {{der}} does a very nice job at handling the former. Leasnam (talk) 01:37, 18 March 2016 (UTC)
Yeah I also see what you mean about 'emperor'. Also, what are the exact criteria for calling a word a borrowing on here? Is it simply just a word taken from another language which is not actually a parent language to it? I'm curious because there are several hundred words in Albanian, for example, that were technically "borrowed" from Latin back in antiquity, but they're unique since they were absorbed in a more natural way from the Vulgar Latin variety spoken there as opposed to intentionally borrowed in a scholarly fashion, which is, in contrast, what happened with some later words in more recent times (especially pertaining to scientific and technical fields). That very ancient layer of loanwords from Latin, which pertains to many aspects of general life (like mjek, mik, fytyrë, fëmijë, gjymtyrë, arësye, for example), has undergone a distinct set of sound shifts within Albanian that almost mirror the developments of some actual Romance languages in some aspects, and can often be clearly distinguished from later scholarly borrowings (e.g. bibliotekë, absorboj, etc.). On the other hand, to call them "inherited" is also not right, of course, since Latin is certainly not a parent language to Albanian, despite having imparted strong influence on it. How can this be handled? This also perhaps applies in a way to a few English words that entered Old English from Latin.
I guess, in a broader sense it can apply to any language taking a term from another that it is not a descendant of. So if there's a Romanian term derived from Proto-Slavic or some unspecified Slavic language, does it have to use {{bor}}, since its core vocabulary is not from that language (depends how concerned we are about how a term was borrowed, whether intentionally as part of a language reform or if they just entered naturally over time through interaction of different languages/populations)? How are pidgin or creole languages handled, also? Word dewd544 (talk) 22:36, 18 March 2016 (UTC)

colloquial and informal labels

These two labels are very confusingly defined as the same thing in our Appendix:Glossary. There are old discussions about this problem there and here, but nobody has fixed the problem and the discussions just fizzled out.

If even people writing this dictionary have difficulties defining the difference between them, we can be sure that their simultaneous use is of absolutely no benefit to users and instead confuses and discourages them and new editors and wastes their time.

In fact, simultaneous use of these very similar terms would still remain detrimental even if we should eventually succeed in defining a clear difference. The best attempt at that I've seen is to define "colloquial" as "limited to spoken use (including written dialogue)", but even if we could agree to that, it would still be senseless to hide that distinction behind a word (colloquial) that even our glossary admits is commonly misunderstood to mean "regional" or "location"!

If we feel it is important to make or label the difference between informal everyday speech and informal everyday writing (which will soon seem silly and anachronistic to younger users and editors in this age of email, texting, and social media), we should stop using a term that most users misunderstand or don't know and instead use plain English labels such as "informal, esp. spoken" and "informal, esp. written". --Espoo (talk) 08:01, 19 March 2016 (UTC)

I think we should standardize on informal and abandon colloquial. Some background is at User talk:Dan Polansky#Colloquial vs. informal. In sum, modern dictionaries overwhelmingly seem to have switched to "informal" label. I don't believe the wiki lexicographers will be able to maintain a useful distinction between informal and colloquial and convey the distinction to the readers. --Dan Polansky (talk) 09:12, 19 March 2016 (UTC)
The distinction may not be clear for English, but in other languages it is. In Welsh, "colloquial" means "not literary". There are any number of Welsh verb forms, for example, that are colloquial in the sense of not literary but are not necessarily informal. —Aɴɢʀ (talk) 11:15, 19 March 2016 (UTC)
Exceptions could be made without problem, but for English, at least, and probably most languages, we should probably stick with "informal." Andrew Sheedy (talk) 05:57, 20 March 2016 (UTC)
If we decided to use only "informal", we would need to have a bot periodically change uses of "colloquial" in specified languages (which is undesirable, because it will stop as soon as the owner dies or leaves, as happened with Autoformat once or twice), or else just make "colloquial" an alias of "informal" in the module that handles the labels. The latter would prevent "colloquial" being used for Welsh, but then, if our readers don't grasp the distinction, it's probably unwise to attach significance to it: a clearer label could be used instead of "colloquial" ("familiar"?). - -sche (discuss) 06:56, 20 March 2016 (UTC)
I think the above stated bot problem is largely non-existent: once you switch all colloquials to informals for English, the rate of addition of new colloquial tags will diminish to almost nothing since people very often proceed by following the existing examples, wisely so. Even if it does not diminish, we will have much better state of affairs than we have now. --Dan Polansky (talk) 08:20, 20 March 2016 (UTC)
Good point; it also occurs to me that because "colloquial" would continue to categorize, we would have a way of finding new additions (other than checking a database dump). The number of languages we would have to watch for new additions in would be large and smaller languages might go unchecked for long periods, but I guess it's also not urgent to fix new uses. - -sche (discuss) 15:23, 20 March 2016 (UTC)
@Aɴɢʀ: By "not literary", do you mean "not appearing in writing but only in speech"? --Dan Polansky (talk) 08:22, 20 March 2016 (UTC)
@Dan Polansky Not necessarily. Magazines, newspapers, and popular fiction are often written in colloquial Welsh; unpublished writing like letters virtually always are. Modern-day introductory textbooks usually only teach the colloquial register, leaving the literary register for more advanced learners. —Aɴɢʀ (talk) 15:15, 20 March 2016 (UTC)
  • Why wouldn't we follow the original suggestion of Espoo and display "informal, esp. spoken" in every instance where it is used within {{lb}} or {{q}}. Do we use the term in some other way as a label?
Wouldn't this suggest ways to make our treatment of internet and texting slang more relatable to the rest of the language? Much of such usage seems to spread fairly quickly to other media. What doesn't spread are things reflective of the special constraints or capabilities of the medium, eg, leet, number-keypad-based puns and codes, ASCII-character emoticons.
Also is contemporary fictional dialog "writing" or "speech". To me it seems to be speech, which is how we treat it for attestation purposes.
Is a term like "Yo!" informal, colloquial, or without the need for a label or category? DCDuring TALK 11:45, 20 March 2016 (UTC)
I don't see them being one and the same, exactly...informal to me just means "not formal" (i.e. you wouldn't use it in an interview or on a job application). Colloquial is like slang, but more general...you could say it in a formal situation and everybody knows what you mean, but it's not acceptable or "proper" English. I see informal slightly above colloquial. Just my cent. Leasnam (talk) 16:24, 20 March 2016 (UTC)
My point is that whatever distinction can be made between informal and colloquial labels, it is hard to maintain in Wiktionary entries. I have seen no English monolingual dictionary that uses both labels. I believe the variation between informal and colloquial label in en wikt does not carry signal but largely noise, an accident of who was entering the label and based on what external source. --Dan Polansky (talk) 18:57, 20 March 2016 (UTC)
I'm for keeping both labels as is. For one: Is there really a problem that demands one label to be abandoned? The reason the discussion has trickled out and been ignored until now is probably that it's a non-issue. For the other: From what I know of the language, Japanese for example draws a razor sharp line between informal and colloquial, and I'm against splintering up practices language by language without pressing need, as that makes Wiktionary more difficult to get into for beginners. And that is bad. Korn [kʰũːɘ̃n] (talk) 01:20, 21 March 2016 (UTC)
The main problem is having categories for both and having entries arbitrarily split between the two- so neither is complete. Chuck Entz (talk) 03:41, 21 March 2016 (UTC)
Right. We've seen that editors use the labels interchangeably, and we can expect them to continue to: few editors, especially newer or infrequent editors, will notice whatever arbitrary distinction someone might invent in the glossary, so the distinction will not be made in practice. Readers who notice one label on one entry and the other on a synonym may think there is some difference, especially if they consult the glossary and find that it asserts a distinction, but the absence of an actual distinction means the readers will leave misled or confused. That is bad. Splintering a label into two synonyms, which new users might expect (but be unable to find) a distinction between, also makes Wiktionary more difficult to get into for beginning editors. And the division of the entries into two categories which are synonymous in actual practice is also bad for usability. - -sche (discuss) 04:09, 21 March 2016 (UTC)
That's my thinking. --Dan Polansky (talk) 13:44, 27 March 2016 (UTC)
I'm also for keeping both--regardless of how we've arrived at two--now that we have two, let's make the most of it. We can place any (reasonable) distinction upon them as we see fit, and provide the distinction in the glossary Leasnam (talk) 01:34, 21 March 2016 (UTC)
As for scattering words across two categories: If you think both are applicable, nothing stops you from applying both tags. But with the distinction made in other languages, it's not inconceivable that it could occur in English too and if it doesn't, I don't see the harm of always using both. Korn [kʰũːɘ̃n] (talk) 11:08, 21 March 2016 (UTC)
If it's too much detail, then delete one or merge them. Can't they both point to the same Category? Leasnam (talk) 21:14, 21 March 2016 (UTC)
Re: "I don't see the harm of always using both": Do you mean like {{lb|en|informal|colloquial}}? --Dan Polansky (talk) 13:43, 27 March 2016 (UTC)
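For illustration only: the "alias in the module" idea raised above would not have to block "colloquial" for Welsh if the alias were applied per language. The following is a minimal Python sketch of that idea, not the actual label module (Wiktionary modules are written in Lua); the table `ALIASES` and the function `resolve_label` are hypothetical names invented for this example.

```python
# Hypothetical per-language alias table (an assumption for this sketch,
# not the data used by Wiktionary's real label module).
ALIASES = {
    "en": {"colloquial": "informal"},  # English: treat "colloquial" as "informal"
    # "cy" (Welsh) is deliberately absent, so "colloquial" stays distinct there.
}


def resolve_label(lang: str, label: str) -> str:
    """Return the label that would be displayed and categorized for `lang`."""
    return ALIASES.get(lang, {}).get(label, label)
```

With this shape, {{lb|en|colloquial}} would display and categorize as "informal", while a Welsh entry would keep "colloquial" unchanged.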

"Please provide the title of the work" in cite web

Hi! I used Reflinks to fill in some references, for example: {{cite web|url=http://www.cpfmarketplace.com/mp/showthread.php?225659-What-does-NIB-mean-to-you |title=What does NIB mean to you? |publisher=Cpfmarketplace.com |date= |accessdate=2016-03-20}} For some reason, it displays on the page as "^ “What does NIB mean to you?”, in (Please provide the title of the work)[2], Cpfmarketplace.com, accessed 2016-03-20". I don't see any other place to put the title of the work than the title parameter in the template, which is already filled. Would someone mind taking a look at this? I'm not sure whether it's a bug or something I'm doing wrong…. Thanks! :) Goldenshimmer (talk) 06:41, 20 March 2016 (UTC)

Oh, this was on NIB. Forgot to mention that :3 Goldenshimmer (talk) 06:42, 20 March 2016 (UTC)
Looks like @Smuconlaw fixed it, and improved the page in a number of other ways too. Thank you! :) Goldenshimmer (talk) 02:01, 21 March 2016 (UTC)
No worries. For consistency, please use {{cite-web}} instead of {{cite web}}. — SMUconlaw (talk) 06:53, 21 March 2016 (UTC)
By the way, @Goldenshimmer, what is "Reflinks"? — SMUconlaw (talk) 15:30, 22 March 2016 (UTC)
Hi @Smuconlaw, since I don't know how to use all those templates, I just use <ref>http://example.org</ref> and then use Reflinks, which is an app that turns it into the correct templates… theoretically. Apparently it doesn't use the right one for Wiktionary? It can be found at 📌[2]. 😸 Goldenshimmer (talk) 16:53, 22 March 2016 (UTC)
Wiktionary doesn't have the same templates as Wikipedia, so any app that is designed for Wikipedia is not going to work. --WikiTiki89 17:24, 22 March 2016 (UTC)
@Wikitiki89, I think it's supposed to work for Wiktionary, though, since it changes the title of the article when pasted to "wikt:foo". I think this may be a bug in Reflinks…. Goldenshimmer (talk) 19:31, 24 March 2016 (UTC)
It's probably not due to a bug, but to the fact that I recently modified {{cite-web}} as part of a move to streamline all the citation and quotation templates. If you know who the developer of the app is, contact him or her and request for Reflinks to be updated. — SMUconlaw (talk) 21:52, 24 March 2016 (UTC)

Category:French terms spelled with , etc

Should we create these categories, or is whatever module is adding them still in flux? DTLHS (talk) 02:49, 22 March 2016 (UTC)

@kc kennylau, CodeCat What's the update on this project? —JohnC5 03:10, 22 March 2016 (UTC)
Ugh no, commas aren't part of French (or English) spelling, they're punctuation. Imagine Category:English terms with commas in their punctuation, what would be the sodding point of that? Renard Migrant (talk) 12:23, 22 March 2016 (UTC)
I have added some more characters to the standardChars field. DTLHS (talk) 21:31, 23 March 2016 (UTC)

Terms derived from PIE words

I created the template {{PIE word}} as an alternative to {{PIE root}} when a word derives from a PIE word that is not currently described as being derived from a root: for instance, ὄνομα ‎(ónoma) and ὀνομάζω ‎(onomázō) from *h₁nómn̥‎. {{PIE word cat}}, like {{PIE root cat}}, serves as the category boilerplate template.

I just created the new templates by modifying {{PIE root}} and {{PIE root cat}}, and might have made some mistakes, so I'd appreciate it if someone looked over my work and checked if there are errors. Also not sure where the category Category:Terms derived from the PIE word *h₁nómn̥ should go; it's currently placed in Category:Terms derived from Proto-Indo-European roots.

Maybe {{PIE word}} and {{PIE root}} could be merged, but not quite sure how. — Eru·tuon 03:12, 22 March 2016 (UTC)

Hm, probably have to merge. Otherwise the two templates have to be stacked in ἔθω ‎(éthō), from *swé +‎ *dʰeh₁-. Not sure how to do it, except in an inelegant way (multiple if-statements). — Eru·tuon 03:22, 22 March 2016 (UTC)

I think this sets a dangerous precedent. We probably don't want to end up tracking every derivation in every language from every word ever in existence. That would get hairy really fast. We tried to do this with English once and it became a disaster; some of the categories created back then are still around. Proto-Indo-European is perhaps still reasonable because there aren't tons of fully reconstructed words, but it wouldn't surprise me if someone then decided it would be great to do the same thing with Proto-Germanic too, and we have over 3000 Proto-Germanic lemmas... —CodeCat 22:31, 22 March 2016 (UTC)
Yes, it could be taken to absurd lengths, but that can easily be prevented by restricting the categories to the lowest level of root or word. So a category should only be created for words like *swé, not for derivatives of such words, like *sewos. And if a verbal root for *h₁nómn̥‎ is decided on, then the categories relating to h₁nómn̥‎ should be deleted and replaced with a category relating to the verbal root. The problem right now is that there's no way to categorize the words that derive from basic PIE words without a root like *swé and *h₁nómn̥‎, and that's the reason why I created the template. — Eru·tuon 23:13, 22 March 2016 (UTC)
I guess that is understandable. You should probably write this in the documentation of the template though, so that people don't use it in unintended ways. —CodeCat 23:24, 22 March 2016 (UTC)
Done. — Eru·tuon 23:42, 22 March 2016 (UTC)

function words

([3])

How exactly should we define function words? Should we be verbose, or brief? I’d think that the most effective way to define them is to be elaborate, but it also seems like that’s not a common practice in lexicography. Still, I feel uncomfortable with the possibility that readers might be misled. One compromise would be to link to https://en.wikibooks.org/, but it seems like nobody does that. --Romanophile (contributions) 04:12, 22 March 2016 (UTC)

Common practice in lexicography is to be terse, because dictionaries were traditionally made of paper and would be too expensive if they went into detail. Since we're not paper, we do have the luxury of being verbose. —Aɴɢʀ (talk) 10:08, 22 March 2016 (UTC)
We can be terse about those function words that have as synonyms more basic function words or about the senses of function words that are more semantic and less grammatical. The more advanced learner's dictionaries are not usually terse about basic function words. Collins COBUILD and Longmans DCE are exemplary, but the modern ones all seem to follow the same pattern. OED is quite verbose to explain the evolution of the meaning and grammatical function of such words. Similarly TLF in French. I think advanced learners may be our most important audience. I find some coverage of the historical aspect of such words, which IMO require verbosity, is enormously helpful in getting outside of my own idiolect as I don't have the benefit of good understanding of other languages. DCDuring TALK 12:18, 22 March 2016 (UTC)
The dictionnaire du moyen français has a lovely full entry on quoi [4]. We should just cover as much usage as we can and make really sure that most common usage is covered in the first senses because we have to assume that most readers won't read 20 definitions when there are 20 definitions. Frankly, I don't unless I really have to. Renard Migrant (talk) 12:25, 22 March 2016 (UTC)
As we don't and perhaps can't do a very good job at, for example, all the important entries for verbs and nouns explaining how they are used with prepositions and functional adverbs, we need to have some better coverage at the preposition and functional adverb entries. Core determiners require verbose coverage as well. For all of these and for other function words like pronouns and conjunctions we can't be too terse and we need lots of usage examples that span the whole range of reasonably common or of deceptive uses. DCDuring TALK 12:44, 22 March 2016 (UTC)
We should also make sure to divide the content up properly into the definition and usage notes. Each definition line should be relatively brief, while the usage notes can explain the finer details in as many words as it takes. --WikiTiki89 15:01, 22 March 2016 (UTC)
To me, the most important part of a function word’s definition is not the definition itself, but the usage examples. For example, the definitions at voor, especially the first one, are not very useful. — Ungoliant (falai) 15:02, 22 March 2016 (UTC)

"Lochia" as a Word of the Day?

Hi. @AK and PK recently nominated lochia as a potential Word of the Day, but I'd like to get a sense of whether anyone feels this might be too graphic for the Main Page. (The definition is: "Normal post-partum vaginal discharge; blood, mucus, and placental tissue that are discharged from a female's vagina (similar to menstruation) for several weeks after she has given birth.") — SMUconlaw (talk) 14:33, 22 March 2016 (UTC)

It wouldn't be my choice for WotD. I'd prefer we find something else. How about something topical like triacetone triperoxide? DCDuring TALK 14:40, 22 March 2016 (UTC)
How is that topical? (Also, feel free to nominate more words!) — SMUconlaw (talk) 15:28, 22 March 2016 (UTC)
It's the explosive of choice for evading the most common explosive-sniffing devices. DCDuring TALK 17:26, 22 March 2016 (UTC)
I also am not a fan, I like WotD which are non-technical as well if possible. But since I am not putting any effort into the thing my vote should not count for much. - TheDaveRoss 14:44, 22 March 2016 (UTC)
OK, removing it from the nominations list, then. Thanks. — SMUconlaw (talk) 13:50, 25 March 2016 (UTC)
  • I'd be willing to put some effort into selecting and editing into shape one "technical" word a week. The preference would be find one that was topical for some reason. The word could be itself technical or one that had a topical technical definition. Examples from the recent past are Zika virus, Aedes aegypti, olinguito. The idea would be to provide a gateway to related terms and to WP and other sources, so that Wiktionary could be seen as a resource that provided some depth. DCDuring TALK 14:12, 25 March 2016 (UTC)
    We could even have a technical WOTD in addition to the others. DCDuring TALK 14:14, 25 March 2016 (UTC)
    Sure, why not? — SMUconlaw (talk) 19:07, 30 March 2016 (UTC)

Etymology section for non-lemmas

We don't include etymologies for non-lemmas (I think this should be an absolute rule, it's close to universal in practice anyway), but it does happen that non-lemmas occur side-by-side with lemmas on the same page, or with non-lemmas from another lemma. The practice on how to handle these seems to differ a lot. In some pages, non-lemma entries are split by the etymology of their lemma. On other pages, I've seen lemmas and non-lemmas mixed under the same etymology header, which is obviously not very sensible because different lemmas will have different exact etymologies and can only have the same etymology approximately. Entries consisting of only nonlemmas with different etymologies might split them by etymology or they might not. Things get even trickier when considering that two lemmas with the same spelling but different etymologies can share the same inflected forms. For some examples, consider leek, leken and lijken. Should leek and leken's verb forms be split by etymology to reflect the two etymologies of lijken? Should leken be split by etymology? Should this split include three different sections, all saying "plural of leek", to reflect the three different noun lemmas spelled leek that all have the same plural form? It can get very messy if we start taking lots of different lemmas and etymologies into account.

I would therefore like to use just one etymology section for all non-lemmas. It would be a special section named something like "Inflected forms" and appear always at level 3, at the same level as the other etymology sections. It would always appear as the final etymology section for a language, and only if the language section is split by etymology at all; if the section only has non-lemmas, then there's no need for a split. There's still the question of what to do with the three identical plurals for the three leek nouns: should each one have a "Noun" section with a "plural of" definition, or would a single Noun section do even though there are three nouns? —CodeCat 22:07, 22 March 2016 (UTC)

That actually sounds like a good idea. I'm not sure whether "Inflected forms" is the best name for the section, however; it kind of sounds like it means that these are the inflected forms of the lemmas on this page. --WikiTiki89 22:17, 22 March 2016 (UTC)
I'm reluctant to make absolute rules about this sort of thing. I feel like such decisions should be made on a case-by-case basis, as the details of each non-lemma in each language will be different. And we do include etymologies for non-lemmas in the case of suppletive forms and other irregular forms, e.g. do·fúaid and estir. —Aɴɢʀ (talk) 22:20, 22 March 2016 (UTC)
I see this more as a standardized option to use when it makes sense to do so, not as a requirement. --WikiTiki89 22:21, 22 March 2016 (UTC)
@Angr I think that the etymologies at do·fúaid and estir should be moved to the lemmas. Otherwise, how will someone looking at those lemmas know where all the pieces come from? Nobody wants to click through an entire inflection-table worth of links to figure out the etymology of all the forms. —CodeCat 22:24, 22 March 2016 (UTC)
@CodeCat, if I do that, then do·fúaid isn't in Category:Old Irish words prefixed with dí- and Category:Old Irish words prefixed with fo- anymore. I suppose I could put the lemma ithid in those categories, but people would probably be confused by that. Likewise, if we were to create a Category:Old Irish terms derived from the PIE root *h₁ed-, it would be misleading to put the lemma ithid in it, though do·fúaid and estir would belong in it. —Aɴɢʀ (talk) 12:55, 23 March 2016 (UTC)
Compare zijn, go, sum, ferō. They're categorised by the roots found throughout the paradigm, not just the ones from which the lemma form derives. —CodeCat 15:52, 23 March 2016 (UTC)
(EC) I agree, but I can't really think of anything better. A name that contains "etymology" or some related form would be preferable, so that it's clear that this is a section that stands at the same level as the other etymology sections. But, by definition, such a section is not an etymological grouping, so it may be a misnomer. We're really splitting language sections into two pieces: lemmas, then non-lemmas. Lemmas would be all the level 3 etymology sections, non-lemmas are the final level 3 section. We could also opt to rename all our etymology sections to "Lemma etymology 1" and so on, and then have a final "Non-lemma etymologies" (in plural). But that may be too big a change. —CodeCat 22:23, 22 March 2016 (UTC)
What about the name "Non-lemmas"? --WikiTiki89 22:25, 22 March 2016 (UTC)
That would work too, but there's less disagreement on what an inflected form is than what a non-lemma is. Some people consider alternative forms to be non-lemmas, but I consider them lemmas because they appear in the lemma form and you'd expect them to appear in a paper dictionary (with a "see (other entry)" definition of some sort). —CodeCat 22:27, 22 March 2016 (UTC)
There doesn't have to be agreement on the meaning of the term as long as there is agreement on the criteria for how the section can be used. I think the criteria is pretty clear that if it is a form-of entry whose etymology is given (or would be given) on the page it links to, then it can go in the "Non-lemmas" section. --WikiTiki89 22:36, 22 March 2016 (UTC)
Alternative forms can and sometimes do have their own etymologies (for example, color was derived from colour by Noah Webster), so I guess by that reasoning they wouldn't go in the section. But they're a bit in between... we kind of assume their etymology is the same as that of the main lemma, unless specified otherwise. —CodeCat 22:41, 22 March 2016 (UTC)
My point is that if it's an alternative form, it could go either way depending on the situation. If it has its own etymology section, it would not go in the "Non-lemmas" section, if we want to imply that the etymology is the same as the main lemma, then it could go in the "Non-lemmas" section, but not necessarily. It's a case-by-case thing. --WikiTiki89 22:46, 22 March 2016 (UTC)
I'd rather just make a set rule that they shouldn't go in the non-lemmas section. Consider that alternative forms still use the lemma headword-line templates like {{en-noun}}, which categorise in the lemma category. —CodeCat 23:03, 22 March 2016 (UTC)
But also consider that an orthographic alternative form could correspond to multiple etymologies, and is thus an ideal candidate for this new section. --WikiTiki89 23:08, 22 March 2016 (UTC)
The practice I think I've used when creating Russian non-lemma forms is to create a separate etymology section for each corresponding lemma. So if фад has an inflection фа́де and фадь also has an inflection фа́де (or if it has фаде́, doesn't matter) then they go in separate etym sections, but if фа́да and фада́ are both inflections of фад (as is often the case) then they go into the same etym section, with separate subsections (and if e.g. фа́да can be three different inflected forms of the same lemma then there's only one subsection with three definitions listed, one per inflection). The underlying principle is as if the etymology read "Inflected form of FOO". I've tried to keep certain forms together, though, e.g. if lemma фабала can be stressed as either фаба́ла or фабала́ and they have corresponding respective inflections фаба́лы and фабалы́, then I try to keep them together, although this requires additional coding work. Benwing2 (talk) 23:46, 22 March 2016 (UTC)
Yeah that seems to be the more-or-less standard practice right now. But it creates too many etymology sections that make the page a bit messy. This is exactly what CodeCat is proposing to change. --WikiTiki89 14:14, 23 March 2016 (UTC)
I don't think we need (and to be explicit, I oppose) a new "Non-lemmas" header. I've seen some entries already grouping inflected forms under one "Etymology" section that just says ===Etymology=== \n Inflected forms. or the like; I think the "Etymology" is sufficient. Also, it is sometimes desirable for a non-lemma to have its own etymology separate from that of other non-lemmas: sawed as a dialectal past tense of see (compare seent) merits an explanation separate from sawed as the past tense of saw, IMO. - -sche (discuss) 04:04, 25 March 2016 (UTC)
A big advantage of a separate header is that it's immediately obvious for bots to find, and add new entries to, or alternatively to create it and add it to the end of the entry. With a numbered etymology header it's not so clear: how does a bot determine whether it contains a lemma or nonlemma? —CodeCat 21:44, 28 March 2016 (UTC)
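To make the bot argument concrete, here is a minimal Python sketch of what a dedicated header would let a bot do. It assumes the proposal as described above: the section is named "Inflected forms" (a placeholder name from this thread), sits at level 3, and is always the last section of the language. The function name `add_nonlemma_entry` and its inputs are hypothetical and not part of any existing bot.

```python
import re

# Placeholder header name taken from the proposal; the final name is undecided.
HEADER = "Inflected forms"

# Matches exactly a level-3 header line such as "===Inflected forms===".
HEADER_RE = re.compile(
    r"^===\s*" + re.escape(HEADER) + r"\s*===\s*$", re.MULTILINE
)


def add_nonlemma_entry(language_section: str, entry: str) -> str:
    """Append `entry` to the non-lemma section, creating the section if absent.

    Because the proposed section is always the last one in the language
    section, appending at the end is correct in both cases.
    """
    body = language_section.rstrip("\n")
    if HEADER_RE.search(body):
        return body + "\n\n" + entry + "\n"
    return body + "\n\n===" + HEADER + "===\n\n" + entry + "\n"
```

With numbered etymology headers, by contrast, the bot would first have to inspect the contents of each section to guess whether it holds a lemma or a non-lemma.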

Change to terminology of {{sense}}?

It now reads of the sense “shake” or whatever instead of just putting the word in italics. When was this discussed? I'm not sure I like it, too wordy. Benwing2 (talk) 03:36, 25 March 2016 (UTC)

I see this was CodeCat. Please revert, this needs to be discussed first. Thanks. Benwing2 (talk) 03:38, 25 March 2016 (UTC)
I support the change, though perhaps there should have been discussion. It removes the confusion of wondering whether the thing in parentheses is the meaning of the antonym or the meaning of the thing the antonym is an antonym of. — Eru·tuon 03:47, 25 March 2016 (UTC)
I think that wording that clarifies what's inside the parentheses is useful, but the current wording is a bit verbose (and gets repetitive quickly). —suzukaze (tc) 08:50, 25 March 2016 (UTC)
The tradeoff between transparency to a user who has never been to Wiktionary before and waste-of-screen-space/more-to-filter-out for almost all repeat users and many of the new users with half a brain seems clear to me in this case.
I don't think that using the word sense is helpful for the kind of new user that might not guess what we mean by our short gloss. Interestingly, MWOnline has the following as its first definition of sense: "a meaning conveyed or intended: import, signification; especially: one of a set of meanings a word or phrase may bear especially as segregated in a dictionary entry". Other dictionaries put this definition much lower. I take MW's practice as an estimation that users of a dictionary are most likely to be looking up the word sense as it is used in dictionaries and language studies. That we might need to link to the particular definition at [[sense]] to explain the label is a certain indication that the "solution" being offered is not much of a solution.
Would our users be better off with no synonyms (or other semantic relations, related terms, derived term, etc) on the definition page for L2 sections that don't fit in a typical window on a typical desktop?
Revert. DCDuring TALK 12:47, 25 March 2016 (UTC)
Evidently this edit summary from {{sense}} is intended as justification for the previously rejected change: "(Getting tired of people thinking this reflects the sense of the antonym rather than the sense that it's an antonym of. Reinstating my previous edit since nobody came up with a better solution in the past 2 years.)" DCDuring TALK 13:09, 25 March 2016 (UTC)
I note that the arguments in favor are limited to the use under Antonyms. The Antonyms header is used about 29K times, the other semantic relations about 68K (Synonyms 50K, Hypernyms 9K, Hyponyms 8K, all others 1K). More than 2K of the Antonyms headers appear on pages that also have Synonyms headers, which would help users grasp the use of the label.
An obvious solution would be to have a switch in {{sense}} that would be applied in Antonyms sections only. The wording could be specifically tuned for that use. This would seem vastly superior to the personal-annoyance-motivated recently installed change. DCDuring TALK 13:26, 25 March 2016 (UTC)
It also doesn't work for all those cases where the parameter used to distinguish the definition is a topic or register label associated with the definition. I'm just going to revert it for now. And it doesn't work for cases where the parameter used is a defining hypernym in the definition. It can be changed back once we've sorted out the issues which should have been sorted out before the change. DCDuring TALK 15:09, 25 March 2016 (UTC)
If someone makes a major change without discussing it, it's always CodeCat, no need to check the edit history. Renard Migrant (talk) 16:11, 25 March 2016 (UTC)
I know. That's why I wanted to make clear the petulant motivation. DCDuring TALK 17:48, 25 March 2016 (UTC)
Thank you for the revert. --Dan Polansky (talk) 14:13, 27 March 2016 (UTC)
OK, I'm thinking something needs to be done here. I was just creating an entry for я́вственный ‎(jávstvennyj, clear, distinct) and my first instinct was to put "unclear" and "indistinct" in the {{sense}} field for antonyms, rather than "clear" and "distinct". One possibility is to create a template {{antsense}} (or similar), which displays something more like what CodeCat wants. It shouldn't be too hard to use a bot to automatically convert uses of {{sense}} to the new template. Benwing2 (talk) 20:51, 1 April 2016 (UTC)
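As a rough illustration of the bot conversion mentioned above, the following Python sketch rewrites {{sense|...}} to {{antsense|...}} only on lines that fall under an Antonyms header. {{antsense}} is the hypothetical template name floated in this thread, and `convert_antonym_senses` is an invented function name; a real bot would also need to handle redirects and alternative spellings of the template call.

```python
import re

# Matches a wikitext header line and captures its title, e.g. "====Antonyms====".
HEADER_RE = re.compile(r"^(=+)\s*(.+?)\s*\1\s*$")


def convert_antonym_senses(wikitext: str) -> str:
    """Replace {{sense|...}} with {{antsense|...}} inside Antonyms sections only."""
    out = []
    in_antonyms = False
    for line in wikitext.split("\n"):
        header = HEADER_RE.match(line)
        if header:
            # Any header starts a new section; track whether it is Antonyms.
            in_antonyms = header.group(2) == "Antonyms"
        elif in_antonyms:
            line = line.replace("{{sense|", "{{antsense|")
        out.append(line)
    return "\n".join(out)
```

Uses of {{sense}} under Synonyms, Derived terms and other headers are left untouched, which matches the point that the ambiguity is specific to Antonyms.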

Old Gujarati

Would the community support the creation of a code for the Old Gujarati language? Old Gujarati was spoken from the twelfth to sixteenth centuries, and is significantly different from Modern Gujarati, with an entirely different case system. It was written in Devanagari, and is the ancestor of Middle and Modern Gujarati. I think gu-old would be a sufficient code. DerekWinters (talk) 00:26, 26 March 2016 (UTC)

@DerekWinters: The normal naming convention would be something like inc-ogj. gu-old doesn't follow the triliteral code convention. I'd also be fine with this addition. —JohnC5 01:02, 26 March 2016 (UTC)
inc-ogj is perfect! DerekWinters (talk) 01:07, 26 March 2016 (UTC)
inc-ogu would be preferable, to match the code of the modern language. —CodeCat 01:56, 26 March 2016 (UTC)
Ahh, that does make more sense. Then does anyone oppose inc-ogu? DerekWinters (talk) 02:01, 26 March 2016 (UTC)
@CodeCat could you possibly implement inc-ogu? DerekWinters (talk) 18:59, 26 March 2016 (UTC)
It's done. —CodeCat 19:06, 26 March 2016 (UTC)
Thank you very much. DerekWinters (talk) 19:16, 26 March 2016 (UTC)
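The code shape discussed in this thread can be illustrated with a small check. This is only a sketch under a stated assumption: that an exceptional code consists of a three-letter family prefix, a hyphen, and a three-letter language part, as in inc-ogu. The pattern and the function name `follows_triliteral_shape` are hypothetical and are not taken from any Wiktionary module; real codes may take other shapes.

```python
import re

# Assumed shape only: three lowercase letters, a hyphen, three lowercase
# letters (as in "inc-ogu"). This is a simplification for illustration,
# not Wiktionary's actual validation.
CODE_SHAPE = re.compile(r"[a-z]{3}-[a-z]{3}")


def follows_triliteral_shape(code):
    """Return True if `code` matches the assumed family-language shape."""
    return CODE_SHAPE.fullmatch(code) is not None


# The three codes proposed in the thread above.
for candidate in ("gu-old", "inc-ogj", "inc-ogu"):
    print(candidate, follows_triliteral_shape(candidate))
```

Under this assumption, gu-old is rejected because its first part has only two letters, while inc-ogj and inc-ogu both pass; the choice between the latter two is a matter of matching the modern language's code, which a shape check cannot decide.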

where does the translation table go?

I always felt it looks better at the bottom of the page. What is the protocol on this? ---> Tooironic (talk) 09:57, 26 March 2016 (UTC)

WT:ELE provides that it goes after Usage notes, Quotations, all the semantic relations and derived and related terms. It also provides for it to follow Descendants and See also, on which there has been strong dissent, especially for Descendants. It precedes External links, References, Anagrams, Statistics. That's not quite the bottom of the displayed page, but it's close. DCDuring TALK 11:47, 26 March 2016 (UTC)

More Words of the Day needed!

We are running short of Words of the Day so do nominate some, either in general or for particular days of the year! — SMUconlaw (talk) 12:18, 27 March 2016 (UTC)

  • I assume you mean in English. There's a backlog in FWOTD. Donnanz (talk) 12:26, 27 March 2016 (UTC)
Yes, I meant English. And thanks to everyone who has been nominating words! — SMUconlaw (talk) 18:42, 1 April 2016 (UTC)
We do need more nominations for FWOTD as well. Or rather, more nominations in languages other than Latin, Classical Nahuatl, Portuguese, Spanish, French, German, Ancient Greek & others that have had more than their fair share of FWOTDs. — Ungoliant (falai) 18:57, 1 April 2016 (UTC)
There may be a case for splitting FWOTD between Latin and non-Latin languages, does anyone have a view on this? Donnanz (talk) 19:02, 1 April 2016 (UTC)
I don't think that's a good idea. --WikiTiki89 19:16, 1 April 2016 (UTC)
That's the kind of reaction I expected. But when interesting words like German See (nominated one year ago) aren't used it makes one wonder. The requirements for quotations and pronunciation are also bugbears which don't seem to apply to English WOTD, and possibly prevent interesting terms from being nominated. Donnanz (talk) 08:40, 2 April 2016 (UTC)
The requirement for a quotation is to ensure that we don't feature a fictitious word. Also, we seem to have been ignoring it lately. Can you link to where See was nominated? --WikiTiki89 13:26, 2 April 2016 (UTC)
See was nominated by an IP on 31 March 2015 [5]. Quotations aren't so much of a problem, but pronunciation can be if it's not included in a foreign-language Wiktionary. Donnanz (talk) 13:55, 2 April 2016 (UTC)
Oh. Not sure why I didn't see it before. --WikiTiki89 14:33, 4 April 2016 (UTC)
What's so interesting about See? Just that it's a false friend? —Aɴɢʀ (talk) 15:27, 2 April 2016 (UTC)
It has different genders for different senses for a start. Donnanz (talk) 15:32, 2 April 2016 (UTC)
That's true. When I first started learning German, my mnemonic for the genders was that See is like a spider: the female is larger than the male. —Aɴɢʀ (talk) 15:52, 2 April 2016 (UTC)
You know, it would help if people who nominate words (or come across already nominated words) would give a reason to why it's interesting, if it's not obvious. Otherwise someone unfamiliar with the word would not see from a quick glimpse at the entry why See is interesting. --WikiTiki89 14:33, 4 April 2016 (UTC)
It's a point worth bearing in mind, but probably not the sort of thing an IP would think of. It shouldn't be a requirement, like pronunciation (which should be scrapped). Donnanz (talk) 08:09, 8 April 2016 (UTC)
Definitely not a requirement. Maybe we can't expect an IP to be so thoughtful, but you did happen to notice it and why it might be interesting, and so you can add a comment to it. --WikiTiki89 14:34, 8 April 2016 (UTC)
Done yesterday, but it probably won't make any difference with the present incumbent. Donnanz (talk) 11:52, 12 April 2016 (UTC)

Hittite lemmas

To put it simply, what should be the page on which lemmatic information is presented?

For substantives, the citation form in both Kloekhorst and the CHD is the root, which is spelled in Latin. This is basically the same as Sanskrit: the citation form is lūli-, the nom. sg. is lūliš, acc. sg. lūlin, etc. For verbs, CHD gives the stem: mark-, markiya-, markišta(i)-, marzai-. Kloekhorst basically does this too, except that he shows multiple forms of the stem, and also adds as a superscript the third singular ending: mārk-i / mark-, markiye/a-zi, markištai-zi, marzai-zi.

Note, however, that none of these are transliterations—they are approximations of the root. The reason for this, of course, is that Hittite writing varies wildly. The genitive singular of lūli- could be spelled 𒇻𒇷𒄿𒀀𒀸 (lu-li-ya-aš), 𒇻𒌑𒇷𒀸 (lu-ú-li-aš), 𒇻𒌑𒇷𒄿𒀀𒀸 (lu-ú-li-ya-aš), etc. You could point to a stem 𒇻(𒌑)𒇷- (lu-(ú)-li-) in that case, but you can do no such thing with mark-, where the various inflected forms mārkḫi, markanzi, markēr are spelled 𒈠𒀀𒅈𒅗𒀪𒄭 (ma-a-ar-ka-aḫ-ḫi), 𒈥𒃷𒍣 (mar-kán-zi), 𒈥𒆠𒅕 (mar-ke-er) respectively—i.e. there is no spelling of mark- that does not include part of the inflectional ending.

What this means is that we can't cite the stem as a lemma form if we want the lemma form to be spelled in its native script. This leaves three options for the lemma form:

  • The third person singular (for verbs), and the nominative singular (for nouns), spelled in cuneiform. This raises the problem that not all verbs/nouns have an attested 3sg./nom. sg. respectively, and even when they do the spelling thereof may vary (𒈥𒆠𒅖𒋫𒄑𒍣 (mar-ki-iš-ta-iz-zi) or 𒈠𒅈𒆠𒅖𒁕𒀀𒄑𒍣 (ma-ar-ki-iš-da-a-iz-zi)? Of course, one can resolve this by taking the stem spelling more consistent with other forms of the word [in this case, the former.])
  • The third person singular / nominative singular, spelled in Latin. This is contrary to the ideal that a word should be spelled in its native script (of course, we would have pages in cuneiform for individual forms, but the lemma would have to be in Latin.) It also ostensibly shares the above problem of attestation, except that the spelling is effectively standardized, and so the entry would only be markištaizi. The problem of lack of attestation is less, too: if 3pl. pret. act. 𒈥𒊺𒂊𒅕 (mar-še-e-er) is attested, one may infer a 3sg. pres. act. maršēzi. It has the slight advantage of being consistent with verbs in Kloekhorst (but not with nouns, and with neither POS in the CHD.)
  • The 'stem', spelled in Latin. This negates the problem of attestation entirely, is consistent with Kloekhorst and the CHD, and is also neutral with respect to forms. Of course, it's also Latin and not Cuneiform, and the actual stem may vary (is it mark- or mārk-? Different forms have different stems, in accordance with IE mobile full grade.)

I know that we have few if any Hittite contributors, but I would appreciate any opinions that anyone is able to offer. —ObsequiousNewt (εἴρηκα|πεποίηκα) 20:20, 27 March 2016 (UTC)

@ObsequiousNewt: I have often wondered the same thing. My instinct is to have Latin root lemmata which have a list of all attested forms in cuneiform. That said, it seems a bit silly for complicated adjectives or nouns for which we only have a single attested form to have full Latin entries that just link to a single cuneiform page. I've also wondered what we do about Sumerian, Akkadian, and Hittite determinatives. Are they part of the lemma or not? —JohnC5 20:47, 27 March 2016 (UTC)
(EC) How widely is Hittite attested? If it's at least somewhere on the level of Gothic, then we probably have no problems inferring our chosen lemma form based on whatever forms are attested. As for spelling, we can standardise, like we do for many other languages already. —CodeCat 20:49, 27 March 2016 (UTC)
You could use each spelling found, and link the others under alternative forms, and have each be a full lemma page, each with its own declension/conjugation/etc. based on its individual spelling. This way, each spelling would be given full representation. There need not be one standard, when there never was a standard. And this should of course be all done in the Hittite script, with romanizations given separately. DerekWinters (talk) 22:20, 27 March 2016 (UTC)
@CodeCat: I don't know how well Gothic is attested, but I can safely say that Hittite spelling will be considerably more difficult, if not altogether impossible, to standardize in its native script. Cuneiform is syllabic, and the syllables in writing do not typically actually coincide with the syllables in pronunciation. It is sometimes possible to infer a lemma form (e.g. nom. 𒆜𒀸 (KAŠKAL-aš) can be inferred from acc. 𒆜𒀭 (KAŠKAL-an)), but it is more difficult to infer, as above, the form of maršēzi: even if it were attested it might vary between any number of forms (mar-še-e-zi? ma-ar-še-e-zi? mar-še-ez-zi? mar-še-e-ez-zi?)
In terms of what the page should look like, I feel like 𒆜𒀸 is a good example: show all attested spellings of each Latin form, and asterize the Latin form if no such spelling is attested. The question is what the page should actually be: 𒆜𒀸, or palšaš (or palsas?), or palša-. (𒆜- would work in this case, but can hardly be extended to other words.) DerekWinters' suggestion is possible, but I would imagine it to be more useful to put lemma information on one page per stem. —ObsequiousNewt (εἴρηκα|πεποίηκα) 16:18, 30 March 2016 (UTC)
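The page layout proposed in the comment above (list every attested spelling under its Latin form, and asterisk a Latin form with no attested spelling) can be sketched as a simple data model. This is a minimal illustration only: the helper `render_forms` and the dictionary layout are hypothetical, not an existing Wiktionary template or module. The spellings are the sign-by-sign transliterations quoted earlier in this thread, and the empty entry stands for a form that is inferred rather than attested.

```python
def render_forms(forms):
    """Return one display line per Latin-script form.

    `forms` maps a Latin-script form to the list of its attested
    sign-by-sign spellings; an empty list means "no attested spelling".
    """
    lines = []
    for latin, spellings in forms.items():
        if spellings:
            lines.append(f"{latin}: {', '.join(spellings)}")
        else:
            # No attested spelling: mark the Latin form with an asterisk.
            lines.append(f"*{latin}")
    return lines


# Illustrative data only, taken from forms quoted in the discussion.
example = {
    "markanzi": ["mar-kán-zi"],
    "markēr": ["mar-ke-er"],
    "maršēzi": [],  # treated here as inferred, not attested
}

for line in render_forms(example):
    print(line)
```

Keying the data by Latin form rather than by cuneiform spelling corresponds to the "one page per stem" preference; keying by spelling instead would correspond to the alternative of giving every attested spelling its own full entry.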
@ObsequiousNewt I don't think we have the right to standardize a language that has never been standardized. Take for example, 𤠅. If you check the entries under the alternative forms, you will find them to be the same. This is what my earlier proposal had been. If each attested form of maršēzi for example were to have different conjugations or declensions, then those would belong under the appropriate form, but the other information should all be the same. In effect, we're saying that each form is correct, which is true (unless one is actually wrong), because the language was never standardized. DerekWinters (talk) 21:43, 30 March 2016 (UTC)
@DerekWinters: I think this makes sense, except for the problem of inflections. 𤠅 doesn't inflect, and colours can (and does) just say "3sg of colour", because inflections are consistent across English dialects—but in Hittite you can have the only attested forms be acc. sg. 𒇷𒆷𒀭 (le-la-an) and gen. sg. 𒇷𒂊𒆷𒀸 (le-e-la-aš), and there may not even be a nominative. It's as if the only attested forms of the verb meaning "to give sth. a distinct hue" were colors and colouring. It'd be possible to give each of these forms a common declension table as well as the same etymology/derived terms/whatever, but it also raises the problem of what form—if there is no lemma—to cite in those sections. I've made a page for one of the four attested forms of the stem lila-, and, as usual, the Luwian citation as well as the derived causative verb have no attested 3sg present forms.
(There's also the pervasive question of what tr= should be used for: is the tr= of 𒆜𒀸 KASKAL-aš or palšaš?) —ObsequiousNewt (εἴρηκα|πεποίηκα) 16:37, 1 April 2016 (UTC)
@ObsequiousNewt: Hmmm, that is problematic. I know with Phrygian lemmas on here, there are terms like αββερετ which are in the third person singular present active. That could be a potential solution. Otherwise, another one could be creating something like Reconstruction:Taíno/bohi for the nominative of the attested forms, giving in the declension of the reconstruction, the attested forms. Also, the attested, non-nominative forms could be given first as the main lemmas with a declension table full of reconstructed terms, similarly to the lemma on your page. Also, for the tr=, I would say that the pronunciation (palšaš) be given for languages like Hittite, Akkadian, etc. There should also be Sumerogram, Akkadogram, etc.-noting functions like smg=, akg=, etc. that would show Sumerogram = KASKAL-aš. DerekWinters (talk) 01:57, 2 April 2016 (UTC)
Using the third-person singular present active is certainly a good lemma, and in fact is the one Kloekhorst partially uses, but, as I have said, not all verbs have an attested 3sg. pres. act., and it is not always possible to infer it in the native script (although it is usually possible to infer it in the Latin script.) The idea of using Latin roots (or inferred nom. sg./3sg. pres. act.) (whether in the Reconstruction namespace or otherwise) is the other solution I have proposed, and may honestly be the only option, given the obvious problems with having a cuneiform lemma.
With respect to your comments on transliteration—how would these parameters be displayed? —ObsequiousNewt (εἴρηκα|πεποίηκα) 00:49, 3 April 2016 (UTC)
@ObsequiousNewt: It may be inferable from the Latin script, but we could do the inferring ourselves, yet still make the lemmas in the cuneiform. And it is always possible (and sometimes practical) to create a romanization entry of the lemma. And for terms that are found attested only in their allative, for example, then a lemma entry could be created of the attestation, and we could give it a declension/conjugation table with all other forms filled in with the corresponding reconstructions. For my transliteration suggestion, it may look something like this (but feel free to change it as much as you'd like): 𒆜𒀸 (palšaš) (Sumerogram KASKAL-aš) DerekWinters (talk) 01:34, 3 April 2016 (UTC)
If it's possible to infer a Latin-script lemma form of a word, it should be possible to convert that form to cuneiform (using tables like w:Cuneiform script#Syllabary). The result might be a 'normalized'/'idealized' cuneiform spelling, but we already normalize some old languages like Old Norse (hljóð vs hliod), and I don't see how using a reconstructed cuneiform-script-form would be any less desirable than using a reconstructed foreign-script form. That doesn't address the question of which form to make the lemma, but it means there's no reason for the lemma form to be in Latin script rather than cuneiform. - -sche (discuss) 02:20, 3 April 2016 (UTC)
@-sche, DerekWinters: It may be possible to standardize Hittite spelling, as CVC syllables are typically written CV-VC (as well as CVC, if such a sign exists), and long vowels can be notated with plene spelling. However, there are a few problems, namely labiovelars (which can be spelled ku/ḫu or uḫ/uk) and clusters of three or more consonants (e.g. parḫzi = pár-aḫ-zi or pár-ḫa-zi). Additionally, the actual spelling of words would vary wildly from such a standard, and it would not be in accordance with any academic standard. —ObsequiousNewt (εἴρηκα|πεποίηκα) 16:08, 5 April 2016 (UTC)
I think if we're going to use romanizations that are not direct representations of the individual signs, then there should be some kind of reasoning for why the signs are interpreted the way they are. For example, why is pár-aḫ-zi taken to stand for parḫzi and not, say, paraḫzi? —CodeCat 17:57, 5 April 2016 (UTC)
I think paraḫzi would have to be spelled pa-ra-aḫ-zi. AFAIK an apparent C.V syllable break shows that the vowel is purely orthographic. —Aɴɢʀ (talk) 18:19, 5 April 2016 (UTC)
Could these rules perhaps be detailed at WT:About Hittite? —CodeCat 18:22, 5 April 2016 (UTC)
@ObsequiousNewt DerekWinters (talk) 19:31, 6 April 2016 (UTC)
@CodeCat, Angr: The word parḫzi is spelled either pár-aḫ-zi or pár-ḫa-zi, because the /ḫ/ is part of a consonant cluster and cannot be represented using a neighboring vowel. It is never represented as pa-ra-aḫ-zi, because the word is not paraḫzi. There is no vowel between /r/ and /ḫ/; the form pár-ḫa-zi proves this; and there is no vowel between /ḫ/ and /z/; the form pár-aḫ-zi proves this. —ObsequiousNewt (εἴρηκα|πεποίηκα) 22:04, 6 April 2016 (UTC)

User:Liliana-60's de.wikipedia troubles.

Someone stopped by IRC this morning to try and get the word out that Liliana-60 had been de-sysoped and blocked on de.wikipedia. There is some discussion here in German and here in English. I don't read German, but the English one has a lot of accusation and not a lot of actual evidence (the offending edit has apparently been hidden by WMF). I don't have a suggestion, I am just passing along someone's concerns. - TheDaveRoss 11:18, 28 March 2016 (UTC)

It's just a little dispute with someone who loves to terrorize me by reporting me to law enforcement for crimes that never even happened and trying to get me involuntarily committed. And of course he acts all innocent afterwards. Yeah. -- Liliana 11:21, 28 March 2016 (UTC)
Be sure that the evidence is horrible enough to have it suppressed and the global sysop rights immediately removed (no dewiki rights were changed). Best, DerHexer (talk) 11:30, 28 March 2016 (UTC)
You're biased and you know that the GS removal was abusive. -- Liliana 11:31, 28 March 2016 (UTC)
I am not a regular here so I will add only a single and short comment regarding Liliana-60: The user has been indefinitely blocked on both German Wikipedia and Wikimedia Commons for making a death threat. The threat has been removed and oversighted by the WMF. A part of the userpage on dewiki has been revision-deleted because it contained very strange stuff. She is currently temporarily blocked at metawiki for personal attacks; taking a look at the German Wikipedia block log is worthwhile as well. I think the whole situation speaks for itself, and Liliana's behavior speaks for itself as well. Of course, it is up to the local community here to judge if Liliana can keep the sysop tools. I hope this short summary of the case helps. Best --Steinsplitter (talk) 11:45, 28 March 2016 (UTC)
Not you again. I've told you a hundred times that 1. there was never a death threat, 2. there is no consensus for either of the blocks on Commons and Meta (the latter of which was obviously just done to shut me up) and 3. these matters are totally irrelevant for Wiktionary. -- Liliana 11:47, 28 March 2016 (UTC)
The first question for me is whether DerHexer is able and willing to provide any evidence. The second question is which are the edits that were such that they had to be hidden, that is, what is the diff number and on what wiki, and what user did the hiding. --Dan Polansky (talk) 11:58, 28 March 2016 (UTC)
He hasn't even provided any evidence on Meta and after I asked for evidence I was promptly blocked. lol. -- Liliana 12:02, 28 March 2016 (UTC)
Do you know which edits (diffs) of yours he has hidden as problematic? --Dan Polansky (talk) 12:05, 28 March 2016 (UTC)
It's here, regrettably you can't link directly to oversighted revisions. -- Liliana 12:07, 28 March 2016 (UTC)
Thanks. The last edit at Commons:User talk:Nightflyer is 27 March 2016‎ Kalliope (WMF): removed threat of harm. And your hidden edit of Commons:User talk:Nightflyer is 27 March 2016‎ Liliana-60: Ich werde mich von dir nicht einschüchtern lassen.: new section. User:Nightflyer user page contains German text. --Dan Polansky (talk) 12:11, 28 March 2016 (UTC)
Note that a threat of harm is not a death threat, but most users so far seem to ignore the distinction. Hmm. -- Liliana 12:12, 28 March 2016 (UTC)
Your user Liliana-60 was indefed on Commons on 27 March 2016 by Túrelio, a native German speaker, with the block summary "death-threat (already oversighted) against Nightflyer , user has been indef-blocked on :de and been globally emergency-de-admined by DerHexer. (for details: https://de.wikipedia.org/wiki/Wikipedia:Sperrpr%C3%BCfung#Benutzer:Liliana-60_.28erl..29.".
Did you, by your assessment, make a threat of harm? --Dan Polansky (talk) 12:18, 28 March 2016 (UTC)
I just told him that if he doesn't stop bothering me his family will fall into great misfortune. This is wording that you'll get from any fortune teller if you let your future be predicted, but for some reason (I wonder why?) it's been massively misinterpreted. -- Liliana 12:23, 28 March 2016 (UTC)
(ec) Yes, she did, a very strong threat of harm which can be interpreted as serious damage to people mentioned (“wouldn't it be too bad if anything would happen to these; a tragic misfortune~accident/dangerous situation can happen”) when she ambushes his family at a real WikiConference trying to find out where they live in order to complete her rage. We have a zero tolerance policy for real life threats and cannot entrust anyone with powerful rights who just went nuts like that (and not for the very first time fyi). DerHexer (talk) 12:32, 28 March 2016 (UTC)
omg banned for witchcraft! Anyway, dunno why we need their drama here since you haven't said anything controversial on en.wikt. Equinox 12:26, 28 March 2016 (UTC)
Actually, Liliana has said such things here in the past (diff). --WikiTiki89 15:08, 28 March 2016 (UTC)
The past is the past for a reason. And while it might not be okay to reveal CodeCat's place of residence (as I have to admit) there was nothing seriously threatening there. -- Liliana 15:11, 28 March 2016 (UTC)
You don't think a "surprise visit" is threatening? --WikiTiki89 15:41, 28 March 2016 (UTC)
Not at all, I read it in a more jocular way, you know? -- Liliana 15:46, 28 March 2016 (UTC)
Well you have to realize that the things that you write will not always be interpreted in the way you intend them. --WikiTiki89 15:57, 28 March 2016 (UTC)
It really feels like a witch hunt by now. <_< -- Liliana 12:35, 28 March 2016 (UTC)
(ec) Please don't make fun of this very serious real-life threat which included stalking as well as violence. If she either does not want to remember or cannot remember what horrible thing she wrote, it's her very problem. DerHexer (talk) 12:36, 28 March 2016 (UTC)
You're far too short on evidence. Talk is cheap. --Dan Polansky (talk) 12:48, 28 March 2016 (UTC)
Okay, you are kidding me (or else did not read my statement above). Please contact WMF, which suppressed the revisions with very good reason, if you don't believe my description (but I am unsure whether you will trust them either). More I cannot tell you without breaching security and I'm a bit afraid that some things I revealed here will also be suppressed due to the seriousness mentioned. Best, DerHexer (talk) 12:52, 28 March 2016 (UTC)
Your opinions and reports cannot replace evidence. If the evidence is hidden, then it is hidden and the case is closed. The fact that you want your reports to replace evidence is troubling. --Dan Polansky (talk) 13:02, 28 March 2016 (UTC)
My last comment in this regard: the evidence is her admission that she made this very comment (although she plays it down very much, imnsho), the removal and suppression by the Wikimedia Foundation (see link to Commons above), and the report of its correctness and seriousness by plenty of stewards who can still look into this comment and are very indignant about what happened (see link to my talk page on Meta). Of course, you can on the other hand also trust her claim, which is backed by nothing but her very own word. Everybody has to find his or her very own truth. Best, DerHexer (talk) 13:12, 28 March 2016 (UTC)
It's all serious business people! Haven't you understood? (And the claim that every steward can read the comment is pointless because how many of our stewards speak German, exactly?) -- Liliana 13:15, 28 March 2016 (UTC)
Did you or did you not write on Commons that you intend to ambush him at a wiki conference, trying to find out where he lives with his family, foreseeing a tragic misfortune, and regretting that it would be too bad if anything would happen to these (a sentence which is used in German to foreshadow a serious harm right up to death, FYI)? DerHexer (talk) 13:28, 28 March 2016 (UTC)
The third statement I confirmed above. The second is also true but it's kinda part of a conference that people exchange private information including address data, if you don't want that you wouldn't attend, right? The first statement is false. The fourth statement is a misinterpretation as I stated. -- Liliana 13:34, 28 March 2016 (UTC)
Right, you said you just wanted to ambush him, following him from the conference to his home to find out where he and his family live (who have to expect tragic misfortune or anything unregrettable when he doesn't stop contacting you). How does that make anything better? Is that what you wrote or not? DerHexer (talk) 13:43, 28 March 2016 (UTC)
You're implying that there's something inherently wrong with visiting people. I find that amusing. -- Liliana 13:46, 28 March 2016 (UTC)
You did write you want to follow him home (not for a visit), to his family, for which you see a tragic misfortune and anything unregrettably when he does not stop contacting you. That must be read as stalking as well as doing harm to people even if you try to be as ambiguous as possible so as not to get immediately locked. Get your wording right and neither stalk people nor threaten them. That is not accepted on these wikis according to our terms of use. And stop playing that down. DerHexer (talk) 13:57, 28 March 2016 (UTC)
So you're implying that trying to get a fellow Wikipedian involuntarily committed with malicious intent is allowed? Because that's what he did to me. -- Liliana 13:59, 28 March 2016 (UTC)
@DerHexer This still boils down to you wanting Liliana punished here for something that happened elsewhere. That does not seem proper IMO, and frankly seems a textbook example of disruption: you've come here, to a project where you were once blocked for four years, brought a problem from another project to here, and wasted a lot of everybody's time dealing with it. Purplebackpack89 15:17, 28 March 2016 (UTC)
Please don't mix things up. It was not me who brought that up here, I just explained myself when I was asked to do so (and my actions did not affect at all enwiktionary nor did I request any actions to be taken here, see my talk page for further information). The block derives from a vandal account, an imposter, that took my name back in 2007, and was later renamed so that I could usurp the name, see the full log. Cheers, DerHexer (talk) 16:14, 28 March 2016 (UTC)

────────────────────────────────────────────────────────────────────────────────────────────────────

  • The behavior that Liliana has admitted to is undesirable. I hope that it won't occur here. DCDuring TALK 14:13, 28 March 2016 (UTC)
  • Honestly, who bloody cares? German Wiktionary isn't English Wiktionary. Unless commensurate behavior occurs here, I see no reason to lift an eyebrow. Purplebackpack89 14:43, 28 March 2016 (UTC)
    • Exactly. No need for torches and pitchforks. --87.63.114.210 14:56, 28 March 2016 (UTC)
  • The same person will be a very different user on different wikis. No need to worry too much about what an English Wiktionary user did in a German community dispute. Nemo 15:38, 28 March 2016 (UTC)
  • I have to agree, I'm not seeing any relevance and my experience tells me that blocks and de-sysopsing aren't always for the reasons they claim to be. Renard Migrant (talk) 15:47, 28 March 2016 (UTC)
  • In the time it took me to read through this, people said what I think: What Liliana does elsewhere is irrelevant to her value as a member of Wiktionary (en), which is what her status here should be judged by. As for what happens elsewhere, we know little of the context, and we are not harmed by it. I would much prefer that the Germans not bring their quarrel here. Korn [kʰũːɘ̃n] (talk) 16:02, 28 March 2016 (UTC)
    I agree with that statement, but as I mentioned above, it is important to remember that Liliana has done similar things on the English Wiktionary as well (and served blocked time for it). --WikiTiki89 16:43, 28 March 2016 (UTC)
    I agree with Wikitiki. Liliana doesn't exactly have a clean record here either, but at the moment she's done nothing here at en-wikt (that I know of) to warrant a block or ban. Sanctions imposed at other projects are none of our concern, unless it's decided that her actions warrant a global block, in which case it's out of our hands. —Aɴɢʀ (talk) 16:55, 28 March 2016 (UTC)
    I agree. My impression of the German Wiktionary [and Wikipedia] situation is that there's been a long, tangled history of disputes that would have to be understood as context for what people have been saying there, and without that context we shouldn't be relating it to her actions here. In the past, Liliana said some things she shouldn't have here, and we dealt with it. That should be the end of it here. Chuck Entz (talk) 20:47, 28 March 2016 (UTC)
  • I had a look at the block log at German Wikipedia and one of the block summaries contains a reference to a horrible diff from 2015 which is not hidden. I wanted to post the diff here but then I hesitated. I must say that after reading the diff, I find the actions of German admins very understandable. --Dan Polansky (talk) 18:05, 28 March 2016 (UTC)
    @Dan Polansky: Can you link the diff? --WikiTiki89 18:20, 28 March 2016 (UTC)
    I really do not know whether I should. It's a bit too strong material to my taste. But if multiple people think it is a good idea anyway, I probably will. I think, let people sooner think that I am misreporting than think that Liliana has actually posted the material. --Dan Polansky (talk) 18:30, 28 March 2016 (UTC)
I don't understand what you're talking about either, I must admit. There was one dispute that was revision-deleted and nothing else comes to my mind. -- Liliana 18:43, 28 March 2016 (UTC)
I found the diff too, but I'm not going to link to it because I don't see the relevance to this project. As long as she doesn't say anything like that here, there's no reason to ban her from here. —Aɴɢʀ (talk) 19:43, 28 March 2016 (UTC)

Hi, maybe some of you can take a look at this discussion? There are some people from dewiki who think that your community is unable to decide which sysops you want to have. I mean, I'm from dewiki too, but I don't like people who say that they are able to decide for your community but are not members of it. Greetings, Luke081515 (talk) 14:34, 29 March 2016 (UTC)

  • I commented there. I agree with the general consensus clearly expressed above that what happens on de.wiktionary is distinct from what happens on en.wiktionary for purposes of managing user rights. bd2412 T 16:13, 29 March 2016 (UTC)
    General note, it's de.wikipedia, not de.wiktionary. Greetings, Luke081515 (talk) 17:00, 29 March 2016 (UTC)
    The same would apply for any de. Wikimedia project. bd2412 T 17:11, 29 March 2016 (UTC)
    I think in general it is bad if members from one community say "we know what's good for you" to another community. If there is a community, they can decide it, that's my opinion. No matter if it's de or other... Greetings, Luke081515 (talk) 17:48, 29 March 2016 (UTC)
    Amen to that, @Luke081515 Purplebackpack89 18:31, 29 March 2016 (UTC)
    I agree that we can decide things for ourselves, but it is always better to be informed. So I don't think it was a bad idea to start this discussion. --WikiTiki89 18:38, 29 March 2016 (UTC)
  • As an outsider, I find some of the opinions expressed here a bit shocking. As an administrator, Liliana-60 has access to deleted content on this wiki, which could contain sensitive information about other editors in real life. (Yes, that is usually oversighted, but especially since this wiki has no local oversighters, it is very likely that stuff could be missed). Do you want someone who has made threats of harm in real life on another wiki to have access to deleted content on English Wiktionary? It is possible that WMF may also choose to remove rights in such a case as this too, see m:Requests for comment/Privacy violation by TBloemink and JurgenNL. --Rschen7754 01:56, 30 March 2016 (UTC)
    • Is there any evidence whatsoever that she has ever actually used her admin bit to do such a thing? bd2412 T 03:34, 30 March 2016 (UTC)
      • That's not what I am asking. What I am asking is, do you trust her to not do such a thing, because with privacy-related things once the damage is done, it cannot be undone, unlike say, a bad block. --Rschen7754 03:37, 30 March 2016 (UTC)
        • Shall I draft a vote defining our interwiki extradition policy and should it supplant the extraordinary rendition program de.wiki seems to think we have? —JohnC5 03:56, 30 March 2016 (UTC)
          • This here is not a vote of trust for Liliana. Several people have noted there was trouble in the past. This is simply a statement by en.Wiktionary that we prefer not to handle our users based on outside events. It is good that we were informed, it was necessary for us to consider this issue, but we had a look at what we were given and came to the conclusion that we see no urgent need to immediately strip Liliana. If you have seen more than we and think there is such a great importance to having her be an admin nowhere, kill her rights globally. But Angr noted: If you do that, it's out of our hands anyway and then there is no reason to try pushing us towards an opinion. I must say that this continuing uncouth coercion is not giving me the best impression of de.Wiki. Korn [kʰũːɘ̃n] (talk) 10:20, 30 March 2016 (UTC)
            • Another thing which gets to me is that we are meant to trust the judgment of people we don't know over those we do. If you want us to consider what was said somewhere else then don't hide what was said. I am not sure why threatening material must be hidden, anyway, unless it contains personally identifiable information. - TheDaveRoss 13:24, 30 March 2016 (UTC)
              • It appears that WMF were the ones who hid the comment, not the community. And for what it's worth, I'm not active on de.wikipedia; I don't even speak German. I comment here as a concerned Wikimedian. And of course you are welcome to leave her sysop rights (as long as WMF doesn't do anything to the contrary), but in my opinion it doesn't reflect well on the English Wiktionary or on Wikimedia in general to do so. (Also, if for some reason you don't trust the current stewards, you are more than welcome to participate in the steward elections that happen every year). --Rschen7754 18:21, 30 March 2016 (UTC)
                • It is not trust in the stewards, it is a blind assumption that their judgment is the same as ours would be. I trust that the stewards acted in what they thought was the correct manner, I don't have to take the same action as they did without any evidence but their word. I think you are vastly overestimating the possible effect this could have on the WMF's (or Wiktionary's) credibility or image. - TheDaveRoss 18:43, 30 March 2016 (UTC)
    Ugh. Reminds me of an obnoxious occurrence on one of the coding sites (was it GitHub?) where a contributor was banned because of opinions he had expressed on a personal Twitter page. Equinox 13:12, 30 March 2016 (UTC)

Update: Liliana-60 was apparently globally banned by WMF. [6] --Rschen7754 00:13, 23 April 2016 (UTC)

TTS dictionaries from Wiktionary IPA representations

It was suggested that the following discussion

Wiktionary:Grease_pit/2016/March#TTS_dictionaries_from_Wiktionary_IPA_representations

might get more interest if also posted here. ShakespeareFan00 (talk) 19:12, 28 March 2016 (UTC)

R:Derksen 2008 vs. R:sla:Derksen 2008

I prefer Template:R:Derksen 2008 over Template:R:sla:Derksen 2008. The added prefix does not add any value, IMHO, and is non-minimalistic. What do you think? --Dan Polansky (talk) 19:16, 29 March 2016 (UTC)

I agree. Naming conflicts should be minimal, and also the same reference is often used for multiple languages. --WikiTiki89 19:19, 29 March 2016 (UTC)
Indeed. Hard categorization should handle the language-related scope delineation. DCDuring TALK 20:19, 29 March 2016 (UTC)
I can only assume the language code is there for disambiguation; what else does it add that isn't already contained in the information provided by the template? How many of these templates even need disambiguation because two names are the same? Only use it when needed (which I suspect is zero times). Renard Migrant (talk) 11:47, 30 March 2016 (UTC)
Here's a list of all the reference templates with language prefixes. --WikiTiki89 12:00, 30 March 2016 (UTC)
There is no doubt that many reference templates use a language affix. However, many reference templates do not use an affix. Hence, a discussion is needed. I do not want e.g. template:R:PSJC to be moved to template:R:cs:PSJC; this could easily happen in yet another undiscussed volume change. R:Derksen 2008 was created without the affix and it should remain without affix unless consensus swings in the other direction, IMHO. --Dan Polansky (talk) 12:14, 30 March 2016 (UTC)
The reason I linked to it was not to show the number of them, but to give a sense of how much name collision there would be if we were to remove the prefixes. There wouldn't be very much, but there would be some. --WikiTiki89 12:21, 30 March 2016 (UTC)
I see, I did not realize that. What are some of the name collisions? Can they be fixed by adding the year to the name? --Dan Polansky (talk) 12:27, 30 March 2016 (UTC)
Adding the date conveys useful information and follows the common convention in scholarly works. Do we discriminate by the language in which the reference is written? What about multilingual dictionaries? DCDuring TALK 12:49, 30 March 2016 (UTC)
One collision is {{R:hy:GB}} and {{R:xcl:GB}} (which actually are, ironically, different sources). --WikiTiki89 13:43, 30 March 2016 (UTC)
Thanks. This could be easily resolved in multiple ways. "R:xcl:GB" could be renamed to "R:GB 2000" and "R:hy:GB" to "R:GB 1910". But since I can't figure out what GB in {{R:hy:GB}} stands for, that could instead be renamed to R:NBHL to match the work title. Note that I am not proposing to perform the renaming and annoy the template creator, in both cases Vahagn Petrosyan; I am merely showing how straightforward it is to remove naming conflicts. --Dan Polansky (talk) 13:56, 30 March 2016 (UTC)
Yeah of course. I was just responding to Renard Migrant's claim that there would be zero conflicts. --WikiTiki89 14:27, 30 March 2016 (UTC)
Alrighty. Thanks for the example; it gave me the opportunity to expound ;). --Dan Polansky (talk) 14:35, 30 March 2016 (UTC)
GB is a traditional abbreviation for referring to both. I do not want to change the names. --Vahag (talk) 18:25, 1 April 2016 (UTC)
If it comes to that, {{R:my:MED}} was created with the prefix, and I don't want it moved to {{R:MED}}, which could also "happen in yet another undiscussed volume change". —Aɴɢʀ (talk) 19:10, 30 March 2016 (UTC)
Perhaps we should create {{R:MED}} for Middle English and keep {{R:my:MED}} where it is. I don't understand how these URLs work or I'd do it myself; laurer for example although I've been unable to work out the URL for the home page or how to get from one entry to another without just Googling it and getting an entirely new URL. Renard Migrant (talk) 19:33, 30 March 2016 (UTC)
  • I prefer names with the langcode of the primary language the reference is used for. Why? Well, because of (1) a lower probability of name collisions and (2) autocomplete. You are more likely to know the langcode of the language you are working in than the name given to the reference by the template author (which often gets really cryptic, like this: {{R:nan:thcwd}}). --Dixtosa (talk) 15:32, 1 April 2016 (UTC)
  • I too prefer including the language codes. When I forget the name of the template I start typing Template:R:xx: in the Search box and rely on autocomplete. --Vahag (talk) 18:25, 1 April 2016 (UTC)
    If these were redirects then the autocomplete would still work, right? Renard Migrant (talk) 19:00, 1 April 2016 (UTC)
  • I would prefer including a language code for a reference, whenever the language that the source treats can be unambiguously specified; but in the case of comparative resources like OP's example of {{R:Derksen 2008}}, a family code seems superfluous (except possibly for disambiguation). --Tropylium (talk) 15:52, 22 April 2016 (UTC)
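The collision question raised above can be checked mechanically. The following is a minimal sketch, not an existing tool: the sample list contains only template names mentioned in this thread, and the regular expression describing what counts as a language prefix is an assumption made for illustration.

```python
import re
from collections import defaultdict

# Sample of reference template names taken from this discussion only;
# a real check would need the full list of R: templates.
TEMPLATES = [
    "R:hy:GB",
    "R:xcl:GB",
    "R:my:MED",
    "R:nan:thcwd",
    "R:Derksen 2008",
    "R:PSJC",
]

# Assumed shape of a prefixed name: "R:", a lowercase language or family
# code (optionally with hyphenated parts), ":", then the short name.
PREFIXED = re.compile(r"^R:([a-z]{2,3}(?:-[a-z]{2,4})*):(.+)$")


def strip_prefix(name):
    """Return the template name without its language code, if it has one."""
    match = PREFIXED.match(name)
    return "R:" + match.group(2) if match else name


def find_collisions(names):
    """Group names by their unprefixed form and keep groups with 2+ members."""
    groups = defaultdict(list)
    for name in names:
        groups[strip_prefix(name)].append(name)
    return {short: sorted(full) for short, full in groups.items() if len(full) > 1}


if __name__ == "__main__":
    for short, full in find_collisions(TEMPLATES).items():
        print(short, "<-", ", ".join(full))
```

On this sample the only group reported is the GB pair already identified in the thread; names without a prefix pass through unchanged.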

Removing word frequency statistics from Project Gutenberg from the mainspace

Someone proposed to remove statistics from Project Gutenberg from the mainspace. They did so via Wiktionary:Requests_for_deletion/Others#Template:en-rank. I think a Beer parlour discussion is in order; this is not so much about the removal of a template as about the removal of a class of information. Template {{en-rank}} was originally at {{rank}}.

An example of how the statistics currently look in a word entry:

Statistics

Most common English words before 1923: does · Gutenberg · best · #245: word · light · felt · since

--Dan Polansky (talk) 09:50, 30 March 2016 (UTC)

It would be nice if we had such token-frequency statistics for current usage in books, news, scholarly articles, and the web, all separately, really nice if we had something that was for lemmas. It would be even nicer if we had an approximate indication of frequency of definitions in current use. Others do. We could. The PG statistics on word tokens seem lame and useless. DCDuring TALK 11:01, 30 March 2016 (UTC)
I agree with removing PG stats and considering more meaningful frequency stats with which to replace them. I wonder if the best measure is not a rank but rather a classification, so a word might be extremely rare in the 1860s and become common by the 1960s, only to fade again to relatively rare in the 2000s. How to measure and define those is still an open question. - TheDaveRoss 16:21, 30 March 2016 (UTC)
I agree with their lameness and would like to see them deleted. We tried to have these deleted a few years ago, I think it was when I was an admin. It wasn't successful, anyway, and I stopped caring soon after. --AK and PK (talk) 17:02, 30 March 2016 (UTC)
Hmm, it may have been Keene who actually wanted to get rid of it. Blimey, that was 9 years ago, perhaps 250 accounts ago! --AK and PK (talk) 17:04, 30 March 2016 (UTC)
How about replacing these with statistics with some of our own compilation based on use in Wiktionary definitions, definitions+citations, definitions+citations+discussion, WP articles, WP article+discussions, or all combined. This would probably give reasonable results for many classes of words. In addition we could even go so far as to use our native search to determine which etymologies and/or PoSes (for homonyms) and which definitions (for polysemic words) were the most common, based on something more than idiolect-based opinion. This would get us a reasonable sample of current usage, more or less formal if we rely on articles, entries, and citations, less formal if we rely on discussions. DCDuring TALK 12:43, 1 April 2016 (UTC)
Regardless of where the statistics come from, I think it's silly to link to the "nearby" words in the list. Better to just give the position in the list and link to the list. --WikiTiki89 12:45, 1 April 2016 (UTC)
I think the "nearby" words are useful for giving context to the ranking. —suzukaze (tc) 12:53, 1 April 2016 (UTC)
A visible, but inconspicuous indication of relative frequency, such as the OED's dots, is useful, especially for individual definitions. Whether we require users to click through to another page for comparative frequency or have some limited on-page comparative frequency is secondary, IMO. To me the question is whether statistics on token, lemma, etc, and definition frequency based on Mediawiki project data is good enough and whether any other source is likely to be available whose use would not be subject to copyright. DCDuring TALK 14:23, 1 April 2016 (UTC)
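To make the display format under discussion concrete, here is a minimal sketch of how a token-frequency rank with "nearby" words could be derived from any corpus. It counts raw word tokens, not lemmas or senses, so it has exactly the limitation raised above; the tokenizer and the function names are assumptions for illustration and do not describe how the Project Gutenberg lists were built.

```python
import re
from collections import Counter


def rank_tokens(text):
    """Return word tokens ordered from most to least frequent.

    Ties are broken alphabetically so the ranking is deterministic.
    """
    counts = Counter(re.findall(r"[a-z']+", text.lower()))
    ordered = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [word for word, _ in ordered]


def rank_with_neighbours(ranking, word, window=3):
    """Return (words ranked just above, 1-based rank, words ranked just below)."""
    index = ranking.index(word)
    before = ranking[max(0, index - window):index]
    after = ranking[index + 1:index + 1 + window]
    return before, index + 1, after


def format_rank(ranking, word, window=3):
    """Format a rank line in the style of the statistics box quoted above."""
    before, rank, after = rank_with_neighbours(ranking, word, window)
    return " · ".join(before + [f"#{rank}: {word}"] + after)


if __name__ == "__main__":
    sample = "a a a a b b b c c d"
    print(format_rank(rank_tokens(sample), "c", window=1))
```

Dropping the neighbours and keeping only the rank plus a link to the list, as suggested above, would amount to using just the middle element of the returned tuple.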

Eye dialect

The explanatory note at "Category:English eye dialect" says: "English nonstandard spellings, which however do not change pronunciation, deliberately used by an author to indicate that the speaker uses a nonstandard or dialectal speech." Is the italicized clause correct? Surely aboot and about are not pronounced the same way. — SMUconlaw (talk) 15:19, 30 March 2016 (UTC)

Aboot is the way a Scot (for example?) pronounces about anyway, so it doesn't change the pronunciation for the person whose speech is being imitated with the new spelling. Equinox 15:26, 30 March 2016 (UTC)
It's a well-known fact that we've been using the term "eye dialect" the wrong way in our entries. Some quintessential examples of real eye-dialect are sed for said and enuff for enough, which don't change the pronunciation either for the author or for the expected readers (aboot, however, is expected to be read differently by the expected readership). Eye-dialect spellings imply that the speaker is speaking with a dialect despite the fact that the spelling produces the same pronunciation. Compare the term eye rhyme. The correct thing to say is that aboot is a "spelling that is imitative of a dialect"; or if the spelling is actually used in the dialect itself, then you can just call it a "dialectal spelling". --WikiTiki89 15:35, 30 March 2016 (UTC)
Ah, I see. So, do we let sleeping dogs lie? I had placed sais (a non-standard form of says) in this category, and then saw the explanatory note and became puzzled. — SMUconlaw (talk) 15:49, 30 March 2016 (UTC)
I've been ignoring them (and sometimes even contributing them, where is my integrity?), but ultimately I think this needs to be fixed. As for sais, I'm not sure whether it is supposed to represent /seɪz/ or /sɛz/. In the latter case, it would certainly be eye-dialect. --WikiTiki89 15:58, 30 March 2016 (UTC)
Ha, ha! OK, thanks. It's impossible to know for sure how sais was intended to be pronounced, of course. In the 2000 quote, given the unusual spelling of many other words, I'm guessing it wasn't supposed to be pronounced as /sɛz/. — SMUconlaw (talk) 16:12, 30 March 2016 (UTC)
I always correct misuses of the term "eye dialect" where I encounter them in entries, but I don't go hunting for them. —Aɴɢʀ (talk) 16:30, 30 March 2016 (UTC)
Feel free to correct sais! — SMUconlaw (talk) 16:46, 30 March 2016 (UTC)
I'm not convinced sais isn't eye dialect. The entry claims that sais is pronounced to rhyme with gaze, but is it really? If I was reading one of the quoted passages out loud, I would pronounce it in the same way as says, namely to rhyme with fez. —Aɴɢʀ (talk) 17:55, 30 March 2016 (UTC)
Template:pronunciation spelling of/Template:pronunciation respelling of is the template for those who make a distinction between this and eye dialect. - -sche (discuss) 17:32, 30 March 2016 (UTC)
But that doesn't retain the connotation of dialect. --WikiTiki89 17:40, 30 March 2016 (UTC)
I usually just use {{nonstandard spelling of}}, but more nuanced options are also available. —Aɴɢʀ (talk) 17:53, 30 March 2016 (UTC)
That's what the "from=" field (which displays "representing _") is for. If a context label is put in, it categorizes; ad-hoc input is also accepted. - -sche (discuss) 02:27, 3 April 2016 (UTC)
@-sche: The "from=" field doesn't seem to be doing anything here. --WikiTiki89 14:49, 4 April 2016 (UTC)
@Wikitiki89 You've used T:nonstandard spelling of on that page, rather than T:pronunciation spelling of. We could add "from=" support to T:nonstandard spelling of (yielding "nonstandard spelling of X, representing Y")... I guess there's no reason not to... but under what circumstances would someone use that rather than either "alternative form of X, representing Y" or "pronunciation spelling of X, representing Y"? - -sche (discuss) 01:35, 5 April 2016 (UTC)

Shades of meaning - a thesaurus of synonyms with a bit of a difference?

What could be a helpful addition to Wiktionary, and something I don't think has exactly been done before, is a series of lists of words and phrases that are close to being synonyms - a little like Wiktionary:Wikisaurus and the classic Roget's thesaurus - but with an important improvement... put them in a table where even subtle differences between the words can be seen. So it would highlight "shades of meaning" and clues as to when you may or may not want to use a particular word.

Many lists of synonyms in books and online simply heap them together; Roget has a bit of order in their listing within sections, but not really enough to help writers (and especially not enough to help speakers of English as a second language, and there are more and more people in that category).

What I am imagining (and I guess I could present a few examples if requested) is a table where entries are sortable by any of several columns - the first being the word/phrase; another two that would almost always exist would be the degree of formality (perhaps: 9=probably only used in legal documents, 0=slang, -1=impolite, -5=very rude) and how widely known the word is when used in this sense (9=very frequently used word across all cultures, 7=anyone in an English-speaking country with a reading age over 12 should know it, 5=might be a bit misunderstood or country-specific but many people kind of know what it means, 2=rare/archaic, 0=not sure it even is a word). And then there would be columns that would vary according to the situation, e.g. a table for "not pay" might include words/phrases like "protest a bill", "welsh", "bilk", "balk", "renege", "dishonour/dishonor", "hold back payment", "block", "default", "become insolvent" and so on, and columns for the degree to which the row has an element of stealing, inability to pay, dispute, lateness, charity... plus space to add explanations.

Perhaps what I am suggesting is a change to the format of Wikisaurus pages, or an optional feature you can click on to see the table, but I think there has to be an element of including words that are not exact synonyms.

Thoughts? Maitchy (talk) 20:41, 31 March 2016 (UTC)

I suspect we don't have enough users (or users who are interested in synonyms) to manage this. One good thing that we don't do, which might be more achievable, is to include usage notes on synonyms pages: e.g. Chambers Thesaurus on hate says: "Dislike is a fairly mild term for something simply being displeasing, whilst despise is far stronger and implies an element of contempt. Both detest and loathe would similarly refer to something deeply felt..." (and so on). Equinox 20:45, 31 March 2016 (UTC)
One might expect that diction would be what one would get from a dictionary. One can, but not very conveniently unless one's dictionary or thesaurus goes to unusual lengths. It would be interesting to have a model page for single synonym (sensu lato) group that could form an object for discussion. I had long hoped that Wikisaurus could be what is being suggested. It can still be a resource for it. DCDuring TALK 22:27, 31 March 2016 (UTC)
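As a possible object for discussion, the proposed table can be sketched as a small data structure. This is only an illustration: the field names are hypothetical, and every score below is an invented placeholder, not a lexicographic judgment about the words. Only the two fixed scales (formality, how widely known) and the idea of table-specific columns come from the proposal above.

```python
from dataclasses import dataclass, field


@dataclass
class SynonymRow:
    term: str
    formality: int    # proposal's scale: 9 = legal documents ... -5 = very rude
    familiarity: int  # proposal's scale: 9 = very widely known ... 0 = maybe not a word
    facets: dict = field(default_factory=dict)  # columns specific to one table
    note: str = ""    # free-text explanation


# Placeholder rows for a "not pay" table; all numbers are invented examples.
ROWS = [
    SynonymRow("bilk", 3, 4, {"stealing": 3}),
    SynonymRow("default", 7, 7, {"inability to pay": 3}),
    SynonymRow("renege", 5, 5, {"dispute": 2}),
]

FIXED_COLUMNS = ("term", "formality", "familiarity")


def sort_rows(rows, column, descending=True):
    """Sort by a fixed column or by a facet; a missing facet counts as 0."""
    def key(row):
        if column in FIXED_COLUMNS:
            return getattr(row, column)
        return row.facets.get(column, 0)
    return sorted(rows, key=key, reverse=descending)


if __name__ == "__main__":
    for row in sort_rows(ROWS, "formality"):
        print(row.term, row.formality, row.familiarity, row.facets)
```

Keeping the situation-specific columns in a separate mapping lets each table define its own facets without changing the two columns that every table is expected to share.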

April 2016

Category:Classical Latin

Aren't Latin entries presumed to be classical unless otherwise noted? Should we have this category? DTLHS (talk) 03:18, 2 April 2016 (UTC)

Indeed, that's the standard we've used. The use of this as a context label is fine when indicating a spelling, but can we prevent it from categorising? —Μετάknowledgediscuss/deeds 03:27, 2 April 2016 (UTC)
For just Latin or everywhere? (AFAIK no other languages use the Classical label) KarikaSlayer (talk) 18:30, 5 April 2016 (UTC)
@KarikaSlayer Classical vs. Vedic Sanskrit comes to mind. —JohnC5 18:38, 5 April 2016 (UTC)
@JohnC5 I was thinking more about the code under "classical" in Module:labels/data/regional. To my knowledge Latin is the only language that uses that specific label (Classical Hebrew has its own tag, and it doesn't look like there's a Category:Classical Sanskrit). KarikaSlayer (talk) 18:48, 5 April 2016 (UTC)
Note that I just changed "Classical Hebrew" to be called "Biblical Hebrew", which is the much more common name. --WikiTiki89 18:53, 5 April 2016 (UTC)
Classical Arabic, Classical Chinese, etc. See w:Classical language for a long list. Although for Arabic we say "Koranic" or "Modern Standard". Not sure about the others. Benwing2 (talk) 23:32, 5 April 2016 (UTC)
There is a lot of Classical Arabic that is not Koranic. In fact, I would say Koranic Arabic is pre-Classical. --WikiTiki89 23:49, 5 April 2016 (UTC)

Catalan/Old Provencal and Walloon/Old French

I'm not sure if this has been discussed already, but there are a few issues with the way we handle Old Provencal in relation to Catalan. For one, it isn't universally accepted that the former was truly or technically an ancestor/parent language of the latter, even though I personally basically think so. Or at least I can agree that an early form of what we may call Old Provencal, or its immediate ancestor, was parent to both Catalan and Occitan. The problem is that many of the Old Provencal entries we have happen to be from later stages of the language, after certain sound shifts had already distanced and differentiated it (and the later Occitan) from what would become Catalan (or Old Catalan). So it may not be accurate to list a certain Catalan term as a descendant on the Old Provencal page of a term that was attested after Old Catalan split from it, as Old Provencal had by then undergone several sound changes that made it look more like Occitan and less like Catalan.

For example, Catalan dolç vs. Old Provencal dous and Occitan doç, or another example: Catalan ocell vs. the 'au' diphthong arising in Old Provencal auzel and Occitan aucèl (a later unique development not to be confused with a direct inheritance from the original Latin 'au' in some cases). Or occir and aucir. In some cases, Catalan may have also received influence from neighboring Castilian Spanish, further differentiating it. I do agree that inherited Catalan terms can be listed as deriving from (some form of) Old Provencal, but not necessarily all the forms we have listed (the same even goes for some Occitan words, unless we specify which variant they came from, and not just the main lemma term; for example, how do we relate Cat. caure and Oc. caire, also càder and càser to Old Prov. chazer? Or the descendants on eu?). For the majority of terms, it isn't a problem, as they are very similar or identical, but there are some notable exceptions.

I've been making some of these descendant entries on Old Provencal or etymology entries on Catalan terms, based on what others have already done, but now I'm not so sure about the way we deal with it.

It does seem at least partly a matter of semantics, and what we call or how we define Old Provencal and Old Catalan and Old Occitan.

Also, is it really accurate to call Old French a parent language of Walloon? We're defining Old French that broadly, so as to basically include any Oïl language? Word dewd544 (talk) 20:12, 2 April 2016 (UTC)

  • First point, what else could be the parent language of Walloon? I can't think of anything. I think the answer to your question is yes: Old French by our definition includes all the Oïl dialects, including those of what are now the British Isles (including Ireland), France (specifically the northern half) and Belgium. I don't see how we can say it came directly from Vulgar Latin, as that leaves a gap of several hundred years. And there's no language or hint of a language called 'Old Walloon'. Renard Migrant (talk) 11:17, 20 April 2016 (UTC)
  • Not sure who the other user is creating Old Provençal entries. There's an argument for renaming the whole language Old Occitan because we merged Provençal and Occitan some years ago when ISO 639 retired prv. As for chazer, it's the spelling used by Bernard de Ventadour, as available on Wikisource in s:fr:Catégorie:Œuvres de troubadors. FEW lists it as cazer, caire, caer (top of the second column, code is apr. for German Altprovenzalisch). Essentially the existence of chazer does not imply that cazer does not exist. I wouldn't worry too much about having the exact form (or the most likely of the forms) that the word descended from. Note that Old French soloil originally listed French soleil as a descendant, but this got moved to Old French soleil. These things aren't set in stone.
  • Date-wise I have noted in Wiktionary:About Old Provençal that while we have a code for Old Catalan, roa-oca, we don't have a cutoff, either geographical or in terms of dates, for Old Catalan. Of course if Catalan caure comes from Old Catalan, that also does not imply that it doesn't come from Old Provençal before that. Renard Migrant (talk) 11:24, 20 April 2016 (UTC)
A related issue, see the etymology of vriþa. The Old Norse form is Old Icelandic, which has a change of word-initial vr- > r-. But Swedish doesn't have this change, so it makes it look like the v disappeared and then reappeared. —CodeCat 17:53, 23 April 2016 (UTC)

PIE verb lemmas

Please see Wiktionary talk:About Proto-Indo-European#Lemmatising PIE verbs. —CodeCat 15:06, 3 April 2016 (UTC)

Singapore English entries

It looks like someone (maybe a teacher) in Singapore has noticed Wiktionary and has persuaded a group of people to start adding Singlish terms. Most of them are OK, but etymology sections can be over-complicated and formatting can be a little strange. I think it's better to clean the bad ones up rather than deleting them. On your guard! SemperBlotto (talk) 15:50, 3 April 2016 (UTC)

Maybe we should build a list so that they can be re-examined after the wave dies down. —suzukaze (tc) 06:29, 4 April 2016 (UTC)
Yeah, it seems they've been tasked with creating entries in a specific format (with "usage notes" that are not lexical and thus not appropriate here) and keep reverting if we change them. Not a great idea on a public, shared wiki project really. Equinox 16:12, 4 April 2016 (UTC)
The reversion isn't good, but the entries are interesting. DCDuring TALK 17:49, 4 April 2016 (UTC)
Can someone provide some links for those of us who have no idea where to look? --WikiTiki89 17:52, 4 April 2016 (UTC)
See contributions from Coffeeandbiscuits, Potatogarcia, Whatyoumean2016, Razif07, Kellytongjy1990, Syy03, Chingwennnn, Afbak, Nahte79, Maeaeae, Heroaldchern, Wani.lee, E-van-316 (and others?) (not in any particular order). SemperBlotto (talk) 20:22, 4 April 2016 (UTC)
And Category:Singapore English. DCDuring TALK 21:03, 4 April 2016 (UTC)
For Singapore English cat members with usage notes: [7] DCDuring TALK 21:07, 4 April 2016 (UTC)
Well, that exercise seems to have ended, and teacher is now marking their work. I don't suppose we will ever hear of the results. SemperBlotto (talk) 18:08, 6 April 2016 (UTC)

Filter watchlists and recent changes by language

I added an option to WT:PREFS: "Filter watchlist and recent changes to only show changes for certain languages." (Third from the bottom of the "Experiments" section.) Suggestions, bug reports, feature requests, etc would be most welcome. --Yair rand (talk) 18:49, 3 April 2016 (UTC)

It seems it only filters out mainspace entries? --Giorgi Eufshi (talk) 06:27, 4 April 2016 (UTC)
Yes. Should it filter out other namespaces? --Yair rand (talk) 14:32, 4 April 2016 (UTC)
I would think that when you use it, it should show only changes in the mainspace and reconstruction namespace. Another flaw is that this is all done client-side, which means that the number of results you see is very small and inconsistent. If only this could be done server-side... But still, a useful gadget. Thanks, Yair! --WikiTiki89 14:58, 4 April 2016 (UTC)
Since we separate reconstructions by language anyway, it can be assumed that watching such a page means you're interested in that language. Or am I understanding this wrong? —CodeCat 15:03, 4 April 2016 (UTC)
But you might not be watching every page in the language you're interested in and want to see all recent changes related to that language. I guess that makes more sense in recent changes than it does in the watchlist. --WikiTiki89 15:10, 4 April 2016 (UTC)
That's true. —CodeCat 15:22, 4 April 2016 (UTC)

Russian Church Slavonic

I asked over at the Grease Pit about creating an etymology-only language for Russian Church Slavonic. I wrote:

I'm pretty sure it should be treated as a dialect of Old Church Slavonic (code cu). Not sure what code to use, maybe cu-ru? (Although most such codes seem to have 3 letters after the hyphen.) While we're at it, we might want similar entries for other Church Slavonic dialects; Wikipedia mentions three: Old Moscow, Croatian and Czech, but all of them in rather limited usage. Benwing2 (talk) 03:53, 3 April 2016 (UTC)

Wikitiki responded:

This is more of a Beer Parlour issue, since the real question is whether we want to have these. User:Ivan Štambuk probably has some opinions about it. --WikiTiki89 04:02, 3 April 2016 (UTC)

Any comments? I'm not sure why it wouldn't be a good idea to have these. They are clearly different dialects from Old Church Slavonic, with words spelled differently, etc. Benwing2 (talk) 04:07, 4 April 2016 (UTC)

(You mentioned there a comparison with Medieval Latin, but we do not have etymology-only languages for "French Medieval Latin" or "English Medieval Latin".) But anyway, if you give me an example of where you intended to use this code, then I can more easily consider whether or not it's a good idea. --WikiTiki89 14:46, 4 April 2016 (UTC)
@Benwing2: In case you missed my previous post, I would like an example of where you intended to use an etymology-only code for Russian Church Slavonic. --WikiTiki89 14:26, 5 April 2016 (UTC)
@Wikitiki89 Thanks, I did miss your post. The comparison with Medieval Latin was meant to be Classical vs. Medieval, not French Medieval vs. English Medieval; but if the issue of Russian vs. Croatian vs. Czech bothers you, then I'd be OK with just Russian. An example of where it would be used is осени́ть ‎(osenítʹ). Vasmer specifically says it's borrowed from Russian Church Slavonic rather than from Old Church Slavonic. Benwing2 (talk) 21:42, 5 April 2016 (UTC)
@Benwing2: I can't find an entry for that in Vasmer at all. Can you link me to where you found it? --WikiTiki89 23:56, 5 April 2016 (UTC)
Hmmm. I actually got it from ru:осенить, which says it comes from Vasmer and another source, but you're right that it's not in the online version of Vasmer. Perhaps it's the other source, or the online version of Vasmer doesn't include everything that was published? Benwing2 (talk) 00:09, 6 April 2016 (UTC)
What you may have thought was another source is actually just a link to a "list of literature". Anyway, assuming the information is correct, if a Russian word is derived from Russian Church Slavonic, that can mean one of two things: either the word was inherited from Old Church Slavonic, but not actually attested in the OCS period, or the word was borrowed into Russian Church Slavonic from Russian itself (with a possible spelling change). In the former case, we can just call it OCS; in the latter case, we can just say that its spelling was influenced by OCS. I don't think late varieties of Church Slavonic were ever used as a written lingua franca, nor do I think much innovation occurred within them, which is why they are much less useful as etymology languages than Medieval Latin. But I would like User:Ivan Štambuk to confirm this, since he knows a lot more than I do about it. --WikiTiki89 00:36, 6 April 2016 (UTC)

Old Hindi

I've found a grammatical analysis of Old Hindi, which shows considerable differences from modern Hindi (more cases, less schwa dropping, etc.). Could we make hi-old into a full language, not just etymology-only? The code inc-ohi also works. —Aryamanarora (मुझसे बात करो) 20:26, 4 April 2016 (UTC)

Is it different from Sauraseni Prakrit psu? —Aɴɢʀ (talk) 19:22, 5 April 2016 (UTC)
Sauraseni Prakrit is an earlier stage, I think maybe 1000 years before Old Hindi (although the time period of Prakrit is quite long). Between the two was Sauraseni Apabhramsa. If I'm not mistaken, there was a time around maybe 1100 AD when some people still wrote in Apabhramsa and others in Old Hindi, with significant differences in the case system. Apabhramsa still has the old inherited case system to a large extent while Old Hindi is closer to the modern agglutinative system. For this reason, Old Hindi is put in the "modern" stage while Apabhramsa is in the "middle" stage. (This would suggest that the present system of having Old Hindi be an etymological variant of modern Hindi is consistent with the linguistic consensus.) However, take everything I just said with a grain of salt as I'm going by memory and might be wrong in some particulars. Benwing2 (talk) 21:52, 5 April 2016 (UTC)
You got the gist - Old Hindi is a much more simplified Sauraseni Prakrit. It is different enough to be a different language, with (I think) only five cases (modern Hindi has three). Literature is rare, but the book I've found shows some very old poetry in the language, from the region of Rajasthan. Overall, at least a few words can be added in Old Hindi with adequate citations. —Aryamanarora (मुझसे बात करो) 23:11, 5 April 2016 (UTC)
(Hang on, did you just say Hindi is agglutinative? While it does have plenty of postpositions, it is not agglutinative, at least not like Sanskrit) —Aryamanarora (मुझसे बात करो) 23:13, 5 April 2016 (UTC)
I guess I'm thinking more of other Modern Indo-Aryan languages which have more clearly agglutinative-like case systems, i.e. the plural forms are essentially the same as the singular ones. These are in origin postpositions, and in Hindi you can still analyze them this way and say there are only 3 cases. IMO it's pretty clear that these case systems are heavily influenced by the Dravidian ones. I wouldn't say Sanskrit was exactly agglutinative, though; rather, it let you form long compounds and had extensive derivational morphology of the inflected type. Benwing2 (talk) 23:25, 5 April 2016 (UTC)
BTW having more cases doesn't necessarily mean it has to be a separate language -- Early Middle English, for example, had 4 cases and 3 genders, and Late Middle English had no cases and no genders, but they're clearly dialectal variants of the same language. Benwing2 (talk) 23:27, 5 April 2016 (UTC)
@Aryamanarora You say Rajasthan right? Are you sure those aren't in fact Old Marwadi (if such a term exists), or Old Western Rajasthani (Old Gujarati)? DerekWinters (talk) 05:02, 6 April 2016 (UTC)
@DerekWinters No, it's definitely old Hindi. The verb "to be" has the 3p.s.ind. form hai, 1p.s.ind. hū̃, unlike Gujarati che. —Aryamanarora (मुझसे बात करो) 15:13, 10 April 2016 (UTC)
We shouldn't make up our own codes; if a new code for old Hindi is needed, it should be requested from SIL/ISO 639-3 (if a full language code is believed justified) or from IETF-languages (if a subcode will do.)--Prosfilaes (talk) 00:30, 15 April 2016 (UTC)
Good luck with that. —CodeCat 00:40, 15 April 2016 (UTC)
I know for a fact that you've never tried to get a code from IETF-languages, since I've been on that list for a long time. It's not that hard, provided someone is willing to come up with cites to published descriptions and justify why this is a distinct lect.--Prosfilaes (talk) 07:19, 16 April 2016 (UTC)
That would take at least a few months – I want to add entries now. —Aryamanarora (मुझसे बात करो) 11:41, 15 April 2016 (UTC)
So you want to do a lot of work, but don't care if it's useful or will be preserved? It could be done under a private use tag and switched over once we have a real one. If you're going to do something, do it right, instead of doing it right now.--Prosfilaes (talk) 07:19, 16 April 2016 (UTC)
Why would the work be less useful or not be preserved if he uses the inc-ohi code now instead of waiting for a new code? If and when the new code is created, the forms can be moved. We had a lot of entries in Norman varieties using our own local codes before the code nrf was created, and once it was, we moved them to the new code. Nothing was lost. —Aɴɢʀ (talk) 07:44, 16 April 2016 (UTC)
Or we could use a correct code, like inc-x-ohi, that never will get confused with anything else. New codes don't pop into existence; they get created when someone proposes them. I can deal with the bureaucracy of IETF-languages, but I know nothing of Old Hindi, and couldn't possibly come up with a list of published descriptions, and am not the best person to make the argument for it. If we just go on and create Old Hindi entries, the odds this will ever get any sort of standard code will be minimal.--Prosfilaes (talk) 23:35, 19 April 2016 (UTC)
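
For readers unfamiliar with the tag syntax under discussion: in BCP 47 language tags, the singleton subtag x introduces private-use subtags, which is the basis for the point about inc-x-ohi above. The following is a minimal Python sketch of that distinction. The function names are hypothetical, and the split is deliberately simplified; it is not a full BCP 47 validator and says nothing about how Wiktionary's own modules handle codes.

```python
def split_private_use(tag):
    """Split a language tag at its first 'x' singleton.

    Returns (prefix, private_subtags): the subtags before the 'x'
    singleton joined by '-', and the list of subtags after it.
    A tag without an 'x' singleton yields an empty private list.
    """
    subtags = tag.lower().split("-")
    if "x" in subtags:
        i = subtags.index("x")
        return "-".join(subtags[:i]), subtags[i + 1:]
    return "-".join(subtags), []


def has_private_use(tag):
    """True if the tag carries at least one private-use subtag."""
    _, private = split_private_use(tag)
    return len(private) > 0


if __name__ == "__main__":
    for code in ("inc-ohi", "inc-x-ohi", "hi"):
        print(code, split_private_use(code), has_private_use(code))
```

Under this reading, inc-x-ohi is visibly marked as a local convention, while inc-ohi looks like an ordinary tag whose second subtag would need to be officially registered.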

Entry layout for reconstructed terms

Please see some thoughts at Wiktionary talk:Reconstructed terms#Layout proposal. I am considering creating a new draft version of the policy to include considerations such as these at some point. --Tropylium (talk) 09:00, 6 April 2016 (UTC) 

Old East Slavic adjectives

Right now we have some of these lemmatized at their long forms (e.g. бѣсовьскꙑи ‎(běsovĭskyi) rather than бѣсовьскъ ‎(běsovĭskŭ)), and some of them lemmatized at their short forms (e.g. четвьртъ ‎(četvĭrtŭ) rather than четвьртꙑи ‎(četvĭrtyi)). Is this intentional? I’d think they should be consistently at one or the other, but am I missing something? Vorziblix (talk) 16:04, 7 April 2016 (UTC)

I think they should all be lemmatized at their short forms, unless the short forms are unattested. --WikiTiki89 17:09, 7 April 2016 (UTC)
Why does attestation matter? We lemmatise words in other languages consistently even if the lemma form happens to be not attested. And the lemma form of Slavic adjectives is entirely predictable from any other form. —CodeCat 17:33, 7 April 2016 (UTC)
Because the short forms may have not existed for some adjectives. Compare how the short forms of adjectives with *-ьskъ do not exist in any modern Slavic language. I wouldn't want to lemmatize a short form when it did not exist. --WikiTiki89 17:48, 7 April 2016 (UTC)
We can easily determine from OES grammars which classes of adjectives had only the longer forms. Also consider that OCS did still have short inflections for almost all adjectives. —CodeCat 18:54, 7 April 2016 (UTC)
If that can be determined, then sure. OES ≠ OCS. --WikiTiki89 19:01, 7 April 2016 (UTC)
I would assume so, unless OES attestation is so sparse that nobody has been able to determine it yet. —CodeCat 19:11, 7 April 2016 (UTC)
Short forms of *-ьskъ adjectives do seem to be rare, but even a cursory search turns up at least one instance: »Азъ… хотѣвъ вкоуситі оучительскаго съказаниꙗ, готова прѣложити въ словѣньскь ꙗзꙑкъ.« In any case, I’ll move добрꙑи, тѧжькꙑи, and бородатꙑи, which are each attested in their short forms on the very pages they link to. Vorziblix (talk) 22:03, 7 April 2016 (UTC)

Template:wikipedia

When and why did this stop capitalizing the first letter of the word? Was this discussed? I feel like the link should reflect the spelling in the Wikipedia entry, and thus automatically capitalize the first letter. --WikiTiki89 18:53, 7 April 2016 (UTC)

It was done because not all Wikipedias capitalise all entries. I think it was the Lojban Wikipedia that was having trouble, all the links to it were broken. —CodeCat 18:54, 7 April 2016 (UTC)
So why can't we do that only for the Lojban Wikipedia and whatever few other ones there may be? --WikiTiki89 19:00, 7 April 2016 (UTC)
Ask the person who made the edit and the people who participated in the discussion. —CodeCat 19:12, 7 April 2016 (UTC)
What a review of the Wiktionary:Grease_pit/2015/June "discussion" shows is that, of the three technical adepts pinged on 5 August 2015, based on their having contributed to the modules and templates, none responded. DCDuring TALK 20:59, 7 April 2016 (UTC)
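
The compromise suggested above (capitalize the first letter by default, but leave it alone for the few Wikipedias whose titles are case-sensitive) can be sketched as follows. This is only an illustration in Python, not the actual template or module code; the function name and the exception set are hypothetical, and the set contains only Lojban (jbo) because that is the one wiki mentioned in this thread, and even that was given from memory.

```python
# Hypothetical exception list: Wikipedias whose page titles do not
# automatically capitalize the first letter. Only "jbo" (Lojban) is
# taken from the discussion; any other entries would need checking.
CASE_SENSITIVE_WIKIS = {"jbo"}


def wikipedia_title(term, lang="en"):
    """Return the link target for a term on the given Wikipedia.

    The first letter is capitalized unless the wiki is listed as
    case-sensitive; the rest of the term is left untouched.
    """
    if not term or lang in CASE_SENSITIVE_WIKIS:
        return term
    return term[0].upper() + term[1:]


if __name__ == "__main__":
    print(wikipedia_title("gold"))         # default: first letter capitalized
    print(wikipedia_title("gold", "jbo"))  # exception: left as typed
```

Keeping the exceptions in one small set means the default behaviour matches most Wikipedias, and a newly discovered case-sensitive wiki only needs one entry added.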

Italian pronunciation

In this page you read:

'See Italian phonology at Wikipedia for a thorough look at the sounds of Italian. In addition, the Wikipedia help page for IPA for Italian has some useful English approximations.'

Neither in Italian phonology nor in Help:IPA for Italian can you find any reference to asterisks (*) used in phonetic transcriptions of the Italian language to indicate so-called "syntactic gemination", but Appendix:Italian pronunciation specifies that the asterisk is the symbol used for that. The reason you can find it in this dictionary and not in the main encyclopedia is that it was inserted without any consensus by an Italian user and nobody has noticed it in the months since. This user, IvanScrooge98, did the same on en.wikipedia, and there he was challenged over it: user Macrakis opened a discussion at Help talk:IPA for Italian#Syntactic gemination asking to remove the asterisks, and users Peter238 and Aeusoes1 joined the discussion and agreed with him. These users are experts in phonetics and IPA, and they decided to delete this arbitrarily introduced symbol. You can read the full discussion at the link, and I am quoting the main passages here:

  • We should not show the syntactic gemination (SG) symbol (*) in our transcriptions of Italian, except in articles that are specifically about Italian phonology, for three reasons: 1) cases are predictable except for a small closed set of function words; 2) the actual set of words varies by variety; 3) it is not a universal feature of standard spoken Italian.
  • Syntactic gemination is fully predictable — they occur after stressed final syllables, including stressed monosyllables (phonological SG) and after a small, closed set of unstressed monosyllables and some penultimate-stressed polysyllables. This closed set includes only function words and not nouns, which are the typical words for which we give pronunciations.
  • The actual set of exceptional words varies by variety of Italian, even among those varieties which show SG. But in any case, it does not include nouns, if I'm not mistaken.
  • Many varieties of standard Italian spoken outside central Italy do not show SG at all.
  • Also, the only time the "*" annotation is useful is if the word is being composed with another word, e.g., "il Po superiore", which is presumably [il'possuperi'ore] rather than [il'po.superi'ore]. But if you know enough Italian to compose words like this, you don't need the "*".
  • I agree that there's no need to transcribe SG when the last syllable is stressed, but I'm not sure about the rest.
  • Why couldn't we just show the actual gemination if it occurs in our transcription? It would be like somehow marking French transcriptions with the final consonant that is normally elided except in cases of liaison.
  • In looking back over the discussion, I see that I wasn't specific enough about the case in question. I didn't intend to discuss the transcription of syntactic gemination in the interior of phrases. That is, I see no problem with [kafˈfɛ lˈlatte]. I was concerned with cases like the Po (river) article, where the name of the river is transcribed as [pɔ*]. This is parallel to transcribing Alaska (and every other noun ending in 'a') as [əˈlæskə(r)] because in sentences like "Alaska is big", RP speakers say [əˈlæskər ɪz 'bɪg], which I hope no one is proposing.

Please consider the last 2 points above all. It is completely useless, not to say counterproductive, to add an asterisk to show something that, where the asterisk is used, does not happen at all. When it does happen, in a sequence of words, a consonant is doubled or the ː symbol is used, as for vowels. Moreover, the International Phonetic Alphabet (which is supposed to be the conventional alphabet used in this dictionary to transcribe the pronunciation of words and names in any language) has never, ever, employed asterisks for syntactic gemination, in any language. They are sometimes used for consonants pronounced fortis in Korean and maybe in other cases, so using them for Italian words (whether stressed on the final vowel, monosyllabic, or one of half a dozen polysyllabic words) would just confuse readers, as happened to Macrakis on en.wikipedia, and even to other users here. Why should we keep a totally subjective convention introduced without asking anyone and just at one user's will? I frankly find it absurd to have to open a new discussion to ask for permission to remove something introduced without consulting anyone and already removed from en.wikipedia, but I am following the rules, and the 3 users who have already discussed it on the talk page I linked above may join in too. I trust your common sense in your final decision. 94.174.22.105 21:34, 7 April 2016 (UTC)

i consider this argumentation reasonable Phfnhyn (talk) 23:55, 8 April 2016 (UTC)

The IP editor above has correctly quoted my arguments against including the * symbol for "syntactic gemination". This is certainly an interesting phonological phenomenon, but is not tied to individual words (except for a small number of function words). I see no point in including * in the phonetic transcription of Italian words on Wikipedia or in Wiktionary. Even for the small closed class of function words, it is not universal in Italian, and really belongs in a grammar or phonological description, not in a dictionary. --Macrakis (talk) 18:04, 9 April 2016 (UTC)

I don't know much about it, but I agree using the asterisk to mark places where syntactic gemination may occur is kind of silly. If its triggers are not predictable in all cases, then maybe it would be better to mark certain words as triggering it, the way certain Irish words (e.g. go, i, bhur, etc.) are marked as triggering certain initial mutations. —Aɴɢʀ (talk) 19:56, 9 April 2016 (UTC)
Silly is the word! As quoted above, it's the same as marking French words ending with "d p s t x z", because when followed by a word starting with a vowel such consonants are pronounced, unlike when the words are pronounced alone or followed by a consonant: the logical solution is just writing the consonant when it's actually pronounced in the 1st case. E.g.: "ils" (they) is pronounced [il], while "ils ont" (they have) is pronounced [ilz ɔ̃] and "ils ont un" (they have a) [ilz ɔ̃t œ̃]... 94.174.22.105 20:39, 9 April 2016 (UTC)
Actually, writing French liaison consonants has a lot more merit, because it is unpredictable and has to be memorised for each word. —CodeCat 20:45, 9 April 2016 (UTC)
Yes, maybe I shouldn't have switched language... Italian words with syntactic gemination are fully predictable, and in the page we're talking about there are links to specific articles about Italian phonology where this phenomenon is explained. It'd make a lot more sense to mark French words subject to liaison than Italian words subject to syntactic gemination, and here French words are marked with nothing when they're subject to liaison! 94.174.22.105 20:59, 9 April 2016 (UTC)

I didn’t notice this practice until quite recently. I’ve never seen any dictionary or encyclopaedia inserting asterisks into their pronunciative instructions, for Italian or otherwise. I’d be quite fine with them being subtracted. --Romanophile (contributions) 08:28, 11 April 2016 (UTC)

@Angr I would like this discussion to have aroused more interest, but after 4 days only a few people have replied, none of them liking the use of the asterisk notation in question, though; I cannot say whether this is just not an important matter for the community or the community prefers truly to take the asterisks away, but in both cases, and since the only ones who wrote down their opinion expressed themselves against this convention, I wonder if this is not enough to start considering their removal. 94.174.22.105 17:30, 11 April 2016 (UTC)

I'd give it a week, but I agree it looks unlikely that the asterisks will be kept. —Aɴɢʀ (talk) 17:44, 11 April 2016 (UTC)
Disclaimer: I am not for removing information just because it is predictable. Many languages allow pronunciation to be predicted from spelling or inflection from verb class, yet get both a pronunciation section and a declension table. But since this case is apparently an idiosyncratic practice marking a non-general prosodic side effect, I'm for removing it. Korn [kʰũːɘ̃n] (talk) 08:05, 12 April 2016 (UTC)

@Angr I see that another user has commented negatively on the asterisk notation. Tomorrow it will be a week, and we shall probably be able to make a decision, if you agree. 94.174.22.105 18:48, 13 April 2016 (UTC)

Now that a discussion has been held, I have no objection. I was never in favor of the asterisk, I was only opposed to a large-scale change in our representation of Italian pronunciation without discussion first. —Aɴɢʀ (talk) 09:25, 14 April 2016 (UTC)
O.K. then, thank you! Later I'll fix the asterisks. 94.174.22.105 09:33, 14 April 2016 (UTC)

New policy: Reconstructions must have descendants or derived terms

Reconstructions are always based on descendant forms which point to their existence. I have already been following this rule as an unofficial policy myself, making sure I can find descendants before creating reconstructed entries. I wonder if we can make this an official policy, to be added to WT:RECONS? —CodeCat 19:12, 8 April 2016 (UTC)

Is there a problem that reconstruction entries are actually being created without descendants, or is this purely a hypothetical issue? --WikiTiki89 19:34, 8 April 2016 (UTC)
There have certainly been a fair few, yes. I just deleted them all in the past, but we had no policy stating that it was grounds for deletion. I'd feel better if I could point to a policy when deleting. —CodeCat 19:47, 8 April 2016 (UTC)
I see. The bug is in the {{policy}} template, which seems to imply that a "guideline or common practices page" "must not be modified without a VOTE". We're allowed to enforce guidelines and common practice, so if we were able to modify the page without a BP discussion rather than a vote, that would be the ideal solution. --WikiTiki89 20:09, 8 April 2016 (UTC)
I'm actually asking if people agree though, I don't know if this is a common practice. It's just a practice I've applied, myself. —CodeCat 20:11, 8 April 2016 (UTC)
Ok. I agree. I also think this has been discussed before. --WikiTiki89 20:20, 8 April 2016 (UTC)
The loophole is that you can add {{policy}} without a vote but then any subsequent changes, such as reverting that edit, do need a vote. That's why I think sometimes {{policy}} shouldn't be taken as gospel. Just make the edit. Renard Migrant (talk) 22:47, 8 April 2016 (UTC)
  • I strongly agree. I think it's worth adding to the page. —Μετάknowledgediscuss/deeds 20:16, 8 April 2016 (UTC)
  • If it isn't a problem, it doesn't need a solution. —Aɴɢʀ (talk) 22:45, 8 April 2016 (UTC)
    • Do you think reconstructions without descendants or derived terms are fine, then? —CodeCat 22:48, 8 April 2016 (UTC)
      • No, I think common sense doesn't need to be regulated. —Aɴɢʀ (talk) 09:44, 9 April 2016 (UTC)
  • Not sure what I think. I feel like I need some context. What are some examples of entries that someone created but you deleted? And in what situation and for what purpose would a scholar (assuming the person used scholarly sources and didn't just make things up) reconstruct forms that have no descendants? — Eru·tuon 00:07, 9 April 2016 (UTC)
  • Definitely. —JohnC5 05:24, 9 April 2016 (UTC)
Whether we officialise it with a policy or not I am in support of the proposal. I don't think we have an issue now with it, but creating a policy now will certainly make it easier to deal with should a problem ever arise in future. Leasnam (talk) 21:02, 9 April 2016 (UTC)
  • Hmm. Corner case: suppose we know that in subfamily X of family Y, a base vocabulary item — let's say 'water' — has been replaced by loanwords. However, the loaning has taken place later than proto-X (which might be inferrable e.g. due to external history, due to the loanwords being from numerous distinct sources, or from internal chronology of sound changes). Would it be then legitimate to reconstruct the inherited Proto-X form for 'water', despite its later extinction, e.g. for the purposes of rounding out a Swadesh list for Proto-X? --Tropylium (talk) 16:00, 22 April 2016 (UTC)
    Another corner case that comes to mind: it's occasionally possible to find a reconstruction referenced in a particular source, but without any actual descendants listed (e.g. due to the reconstruction being compared to data from a related branch). I suppose that in such cases, it should still be OK to create the reconstruction on the basis of the source, and worry about looking up the actual descendants later. --Tropylium (talk) 16:06, 22 April 2016 (UTC)
    In such cases, I refrain from creating the entry. —CodeCat 16:12, 22 April 2016 (UTC)
    For your first corner case, no. You don't know that the language inherited the native word. It could have fallen out of use before the borrowing. Perhaps there was another intermediate borrowing, or perhaps people stopped talking about water, who knows. For your second corner case, I would say that you can probably create the entry with the intent to add descendants later, but once it is clear that no such descendants can be found, then you would have to delete it. --WikiTiki89 16:15, 22 April 2016 (UTC)

Black's Law Dictionary 10th Edition (2014)

If anyone has access to this book, could they check that the edits of 75.69.172.246 (talk) are not copyvios, please? They look like they might be word-for-word copies. SemperBlotto (talk) 11:45, 9 April 2016 (UTC)

@BD2412, do you have a copy? - -sche (discuss) 22:02, 9 April 2016 (UTC)
Not of the 10th edition, I'm afraid. I'm stuck with the one I picked up in law school - the eighth. By the way, if anyone is interested, Wikisource has the second edition in progress. bd2412 T 18:39, 11 April 2016 (UTC)
The user is not doing a great job either, e.g. default en-noun adding an English -s to Latin terms, even ones that end with -s (and the others rarely take an English plural anyhow). Equinox 12:50, 12 April 2016 (UTC)

Russian -стрелить, -скочить, -прячь, etc. -- suffixes, verbs, roots, or what?

I've been creating entries for verbal roots like -стрелить, -скочить, -прячь, for convenience in creating etymologies and listing derived verbs. These are cases where there are prefixed verbs, e.g. застрелить, напрячь, and logically the base verb should be attested but it isn't. What part of speech should they be labeled as? I've been using "suffix" but don't feel totally comfortable with this, maybe it should be "verb" instead? Benwing2 (talk) 23:08, 9 April 2016 (UTC)

I've listed morphemes like -fico, -φρων ‎(-phrōn), -γραφία ‎(-graphía) as combining forms. That might be appropriate. I kind of wish categories and {{affix}} were able to recognize the category, though. It currently treats them as suffixes, so σώφρων ‎(sṓphrōn) is placed in Category:Ancient Greek words suffixed with -φρων. — Eru·tuon 23:36, 9 April 2016 (UTC)

Appearance of templates in temp

Right now, templates referenced using {{temp}}, like {{en-noun}}, have a border around the reference and gray background. This was not so recently, AFAIR. I think it's pretty ugly. Is this a change made via MediaWiki software update or did someone edit some Wiktionary global css? I see that {{temp}} uses the <code> element, so I can achieve the same appearance by using code directly, like this. By contrast, tt element does not do this, like this, and has the reasonably plain appearance that I prefer. --Dan Polansky (talk) 07:39, 10 April 2016 (UTC)

<tt> got 'deprecated', whatever that means. I mean if it's deprecated why does it still work? Anyway have you considered just getting used to it? Humans are pretty adaptable after all. Renard Migrant (talk) 15:59, 11 April 2016 (UTC)
Deprecated means that even though it may still work, it may be removed in the future. Now I have no idea why it's deprecated or who deprecated it. As for {{temp}}, I have come to like the surrounding box thing. --WikiTiki89 16:47, 11 April 2016 (UTC)
It's deprecated by the HTML standard. —CodeCat 16:57, 11 April 2016 (UTC)
I am indifferent regarding the box, but we can easily create a css class which emulates the tt tag so the deprecation is moot (other than that we should not use it). - TheDaveRoss 17:55, 11 April 2016 (UTC)
  • I like the appearance of {{temp}}. --Dixtosa (talk) 17:10, 11 April 2016 (UTC)
  • Like Wikitiki, I've come to like the surrounding box. - -sche (discuss) 05:45, 22 April 2016 (UTC)

self‐proposal for adminship

I think that we need more administrators here, so I’d like to suggest adminship for myself. This has been a long time coming, but I’m seeing a lot of crap popping up and I just don’t have the utilities to deal with it by myself. Considering my trustworthiness, I will confess that I have inserted jocular content into the mainspace a few times, but I did have the decency to alert other users about it later (or just fix it myself), so I don’t think that any jokes that I made are still rotting in a lexical entry somewhere. I am a bureaucrat over at Wikcionario, and you simply won’t find any silly jokes there, no matter how arduously you try. The users there seem quite satisfied with my position, and my (few) errors there aren’t particularly outrageous. @Peter Bowman could give his opinion if he’d like. --Romanophile (contributions) 06:16, 11 April 2016 (UTC)

Coming from a returning user who’s infamous for goofing around, that doesn’t surprise me. --Romanophile (contributions) 08:19, 11 April 2016 (UTC)
I wouldn't support you probably because you swear too much, and seem to get stroppy when things don't go your way. --F909fef0j (talk) 08:36, 11 April 2016 (UTC)
In recent times, I don’t think that I really have a tendency towards obstinance; I hate being called stubborn. Also Dick Laurent (talkcontribs) is probably a worse coprolaliac than I am. --Romanophile (contributions) 09:39, 11 April 2016 (UTC)
The real reason you won't be getting support from him is because he is ineligible to vote. And he's already gotten himself blocked, but chances are he'll be back. --WikiTiki89 18:59, 12 April 2016 (UTC)
  1. Support I think that we need more administrators here, too. JackPotte (talk) 13:52, 11 April 2016 (UTC)
  2. Support I’m surprised this isn’t already the case. Vorziblix (talk) 18:44, 12 April 2016 (UTC)
  3. Support I actually disagree; we have too many admins. However, Romanophile would be a good admin, so I can't vote against him. --WikiTiki89 18:59, 12 April 2016 (UTC)
  4. Support I was also surprised that you weren't already. --Robbie SWE (talk) 19:05, 12 April 2016 (UTC)
  5. Abstain You'll be relieved to know that I won't be seeking adminship. Donnanz (talk) 19:22, 12 April 2016 (UTC)
  6. (comment) That's correct, Romanophile is a serious and solid colleague on eswiktionary. Regards, Peter Bowman (talk) 19:56, 12 April 2016 (UTC)
  7. Support. I too thought you were already an Admin. Leasnam (talk) 20:35, 12 April 2016 (UTC)
  8. (comment) I still think you are an admin. --Dixtosa (talk) 21:01, 12 April 2016 (UTC)
All right, the chances look reasonable. Now I just have to figure out how to set up an official election, or wait for somebody else to do it for me. --Romanophile (contributions) 00:52, 13 April 2016 (UTC)
Here: Wiktionary:Votes/sy-2016-04/User:Romanophile for admin. --Romanophile (contributions) 03:36, 13 April 2016 (UTC)

Plateaus

Should we put our plateau entries under the category of "Mountains" or "Plateaus"? E.g. Mexican Plateau, Armenian Highland, Tibetan Plateau, Loess Plateau, Mongolian Plateau, etc. ---> Tooironic (talk) 10:25, 13 April 2016 (UTC)

@Tooironic: Plateaux. — I.S.M.E.T.A. 15:28, 14 April 2016 (UTC)
No, the more common English plural is preferable; there's no need for us to be pretentioux. - -sche (discuss) 05:42, 22 April 2016 (UTC)

Proto language translations

It was my understanding that proto languages were not to be included in translations. See my edit on gold that was reverted by @I'm so meta even this acronym. Is this an official policy or not? DTLHS (talk) 15:06, 13 April 2016 (UTC)

FYI, here are: my request, DTLHS’s reversion, and my reversion. — I.S.M.E.T.A. 15:13, 13 April 2016 (UTC)

Yes. The official policy is that words in proto-languages are not to be included in the main namespace other than in etymologies. --WikiTiki89 15:10, 13 April 2016 (UTC)
@DTLHS, Wikitiki89: It seems legitimate that proto-languages' translations be included in those tables. It's information people might well want, and how else would they find it? In this case, I was trying to find out what the indigenous Celtic word for "gold" was, since all the Celtic languages' terms I could find on here derive from the Latin aurum. — I.S.M.E.T.A. 15:16, 13 April 2016 (UTC)
No, we have consistently kept the standard that only languages which are in mainspace can be added as translations, so no Proto-Celtic or Klingon or what have you. There's a lot of information some people might hypothetically want, but that doesn't mean it's appropriate for us to include. —Μετάknowledgediscuss/deeds 15:22, 13 April 2016 (UTC)
(e/c) We don't consider reconstructed words to be real words. That's one reason. As for your specific example, if no Celtic language has an indigenous word for gold, then there would be no basis on which to reconstruct one. --WikiTiki89 15:25, 13 April 2016 (UTC)
What Μετάknowledge said. - -sche (discuss) 15:56, 13 April 2016 (UTC)

*Humph* Fine. Does anyone know of any word in any of the Celtic languages that means "gold" and which doesn't derive from the Latin aurum? There's a fair bit of gold in Wales, so I'd assumed that there would exist such a word. — I.S.M.E.T.A. 15:30, 14 April 2016 (UTC)

I'm unaware of any Celtic word for "gold" besides loanwords from aurum. Maybe in Gaulish or Celtiberian, but I don't think there's anything in Insular Celtic. The gap isn't especially surprising, though; there are all sorts of semantic gaps in proto-languages where we might not expect them. For example, there's no reconstructable PIE word for "rain", despite the fact that PIE speakers must have been aware of and had a word for it. —Aɴɢʀ (talk) 21:30, 19 April 2016 (UTC)
Hmm. There seems to have been a PIE verb, at least. Gamkrelidze and Ivanov's Indo-European and the Indo-Europeans says that although taboo replacement led to a "low number of attested Indo-European languages preserving *seu-/*su- in the sense 'rain' [...] the ancient root is preserved in its original sense" of "to rain" in Greek, Albanian, Old Prussian (suge in the Elbing vocabulary), and Tocharian (su- per Adams' Dictionary of Tocharian B), and Reconstruction:Proto-Germanic/sūpaną mentions it.
What would be the expected Celtic reflex of PIE aus- / h₂é-h₂us-o-?
- -sche (discuss) 05:33, 22 April 2016 (UTC)
@-sche Probably Old Irish áu (attested in the meaning 'ear' from *h₂ṓws) and Middle Welsh eu/Modern Welsh au. —Aɴɢʀ (talk) 18:33, 22 April 2016 (UTC)

This is a bit insensitive to visitors - if we have so many reconstructed terms, they ought to be linked in translation tables. —Aryamanarora (मुझसे बात करो) 20:46, 19 April 2016 (UTC)

No, they oughtn't. They ought to be linked in etymology sections of real words. —Aɴɢʀ (talk) 21:16, 19 April 2016 (UTC)

How to render "someone" in lemma phrases in Russian?

@Atitarev, Cinemantique, Wikitiki89, Wanjuscha, KoreanQuoter I wanted to create an entry for the expression ободра́ть кого́-нибудь как ли́пку meaning "to rob someone blind" (literally "to strip someone like a young linden tree"). We have the corresponding English expression under "rob someone blind" with the word "someone" embedded into it. The only issue is that there are three more-or-less synonymous words for "someone" in Russian: кто-нибудь, кто-либо and кто-то. I've been using кто-нибудь in usage examples; Wanjuscha prefers кто-либо. My primary dictionary uses кто-нибудь but Zaliznyak and some others use кто-либо (often abbreviated кто-л or кто-л.). I just put the expression under ободрать как липку (to "rob blind", leaving out the "someone"), but the English analogy suggests that we should include "someone" (there's no rob blind). What should be done? Benwing2 (talk) 18:48, 13 April 2016 (UTC)

One thing to be considered is that in English we add these words because word order is important and there must be an object in between "rob" and "blind", otherwise it doesn't make sense. In Russian, since word order is flexible, we can just create the entry at ободра́ть как ли́пку ‎(obodrátʹ kak lípku). The reason many dictionaries include the word for "someone" is in order to indicate which case it should be in, but we can do that just as easily with tags and/or usage examples in the entry. --WikiTiki89 18:54, 13 April 2016 (UTC)
I think "rob blind" makes perfect sense as a lemma. It can even be used in real sentences that way. —CodeCat 18:57, 13 April 2016 (UTC)
Actually, in this particular case, I would also prefer "rob blind", although this is a dictionary-only form, since in the real world there is always an object in between. But there are many cases where the "someone" is required in the lemma. --WikiTiki89 19:02, 13 April 2016 (UTC)
Not always. "They robbed blind the very people who were trying to help them." —CodeCat 19:04, 13 April 2016 (UTC)
You're right. I definitely support moving rob someone blind to rob blind. --WikiTiki89 19:07, 13 April 2016 (UTC)
To expand on what Wikitiki said, you don't need to represent or add the word "someone" in the Russian entry titles. It's fine for a Russian entry with no word that corresponds to "someone" to link to an English entry containing "someone"; you can pipe the link if you think it would be misleading otherwise: [[rob someone blind|rob blind]]. - -sche (discuss) 01:57, 21 April 2016 (UTC)

ist der Ruf erst ruiniert, lebt es sich ganz ungeniert

What is our policy on (excessive?) referencing, as in this entry and in some other edits by user:Caligari, who is an admin on the German Wiktionary? I don't want to denounce anyone; I'm just asking because the entry doesn't look the way our entries usually look. And I'm relatively sure that we don't want our lemmas to look like the German entry for Haus, which lists 57 sources and an additional 15 references. Kolmiel (talk) 23:53, 14 April 2016 (UTC)

@Kolmiel: The entry for ist der Ruf erst ruiniert, lebt es sich ganz ungeniert looks fine to me. — I.S.M.E.T.A. 12:10, 15 April 2016 (UTC)
I also think that the Wikiwörterbuch's entry looks fine, and that the amount of sourcing that appears therein is proportionate to the amount of substantive content in the entry. — I.S.M.E.T.A. 12:13, 15 April 2016 (UTC)
I agree, but I also agree that the reference material for de:Haus is excessive and I don't want our entries to look like that either. —Aɴɢʀ (talk) 12:19, 15 April 2016 (UTC)
@Angr: I'm not a fan of the Wikiwörterbuch entry layout. I've yet to see how all that reference material would look if converted to our format. I expect we'd probably stick it in a collapsible table if there was that much of it. — I.S.M.E.T.A. 12:36, 15 April 2016 (UTC)
If we had many entries with that amount of reference material, someone would doubtless propose a new References tab next to the Citations tab. —Aɴɢʀ (talk) 12:42, 15 April 2016 (UTC)
@Angr: Since Wikipedia hasn't done that already (and I've seen some articles with hundreds of citations), I think it's very unlikely that we'd do so. One of the main justifications for the Citations: namespace was that it would give a place to house citations for non–CFI-satisfying terms; I see no analogous use for a References: tab. — I.S.M.E.T.A. 15:40, 15 April 2016 (UTC)
I didn't say it was likely we would do it, merely that it was likely someone would propose it. It would probably also be really difficult to get <ref> tags to generate text in a different namespace on a different tab. —Aɴɢʀ (talk) 15:43, 15 April 2016 (UTC)
@Angr: I'm afraid I wouldn't know about that. — I.S.M.E.T.A. 15:47, 15 April 2016 (UTC)

This entry seems fine to me. My main issue―and I hate to be a schoolmarm about this―is that Caligari's user page seems to be in violation of WT:USER#User pages. Furthermore, I can barely read the talk page. This may just be silly of me and irrelevant, but I thought I'd mention it. —JohnC5 14:47, 15 April 2016 (UTC)

@JohnC5: The user page isn't too offensive, even if it does violate WT:USER#User pages, but the talk page's background colour is nauseating. — I.S.M.E.T.A. 15:40, 15 April 2016 (UTC)

[edit]

I have created a vote with the intent that we can finally decide on a new logo for the English Wiktionary. I welcome all discussion on how to make this vote as likely as possible to be successful at representing the consensus of Wiktionary editors, so please feel free to edit the vote if you can improve it. —Μετάknowledgediscuss/deeds 03:25, 16 April 2016 (UTC)

Thank you! I've always been bothered by the current logo. Benwing2 (talk) 05:21, 16 April 2016 (UTC)
I've always been bothered by the RP pronunciation. It doesn't seem right to endorse British, American, or any other English. We could change the pronunciation to enPR or just omit it altogether in favo(u)r of something more universally accepted and comprehensible like hyphenation (Wik‧tion‧ar‧y); though on second thought, there might be some debate about that too. It's also strange that the headword format in the logo is not one we use (viz. we don't use PoS abbreviations). —JohnC5 05:43, 16 April 2016 (UTC)
We already did this. Why are we doing it again? --Yair rand (talk) 19:31, 21 April 2016 (UTC)
We haven't done "this", whatever you mean by that, for quite a while, and there is great dissatisfaction with the current logo (just look at the comments above yours). —Μετάknowledgediscuss/deeds 14:18, 26 April 2016 (UTC)
  • The vote has begun! Please vote — I would really like to get maximal turnout so we can find a logo that's broadly acceptable to the community. —Μετάknowledgediscuss/deeds 14:18, 26 April 2016 (UTC)
    Why not give the link in the News for editors top part of the site? Or, *gasp*, at the same place as the News for editors link? — Dakdada 10:46, 27 April 2016 (UTC)

Server switch 2016

The Wikimedia Foundation will be testing its newest data center in Dallas. This will make sure Wikipedia and the other Wikimedia wikis can stay online even after a disaster. To make sure everything is working, the Wikimedia Technology department needs to conduct a planned test. This test will show whether they can reliably switch from one data center to the other. It requires many teams to prepare for the test and to be available to fix any unexpected problems.

They will switch all traffic to the new data center on Tuesday, 19 April.
On Thursday, 21 April, they will switch back to the primary data center.

Unfortunately, because of some limitations in MediaWiki, all editing must stop during those two switches. We apologize for this disruption, and we are working to minimize it in the future.

You will be able to read, but not edit, all wikis for a short period of time.

  • You will not be able to edit for approximately 15 to 30 minutes on Tuesday, 19 April and Thursday, 21 April, starting at 14:00 UTC (15:00 BST, 16:00 CEST, 10:00 EDT, 07:00 PDT).

If you try to edit or save during these times, you will see an error message. We hope that no edits will be lost during these minutes, but we can't guarantee it. If you see the error message, then please wait until everything is back to normal. Then you should be able to save your edit. But, we recommend that you make a copy of your changes first, just in case.

Other effects:

  • Background jobs will be slower and some may be dropped.

Red links might not be updated as quickly as normal. If you create an article that is already linked somewhere else, the link will stay red longer than usual. Some long-running scripts will have to be stopped.

  • There will be a code freeze for the week of 18 April.

No non-essential code deployments will take place.

This test was originally planned to take place on March 22. April 19th and 21st are the new dates. You can read the schedule at wikitech.wikimedia.org. They will post any changes on that schedule. There will be more notifications about this. Please share this information with your community. /User:Whatamidoing (WMF) (talk) 21:07, 17 April 2016 (UTC)

Proposal to globally ban WayneRay from Wikimedia

Per Wikimedia's Global bans policy, I'm alerting all communities in which WayneRay participated that there's a proposal to globally ban his account from all of Wikimedia. Members of the Wiktionary community are welcome to participate in the discussion. --Michaeldsuarez (talk) 14:53, 18 April 2016 (UTC)

That user has only two edits to English Wiktionary, both of which have been deleted. —Aɴɢʀ (talk) 14:55, 18 April 2016 (UTC)
And one of those was to a transwiki that was later deleted, while it was still at Wikipedia. Their only real edit here was creation of a user page that was deleted an hour and a half later as "promotional material". Definitely not a participant here. Chuck Entz (talk) 01:27, 19 April 2016 (UTC)

Cascading protection of the main page

The main page has cascading protection. This prevents non-admins from editing Words of the Day on the day the word is featured, while allowing them to usually edit and set Words of the Day, which is good and is AFAICT the intention behind the cascading protection. However, it has some negative effects, too: it blocks people from editing certain modules, such as Module:labels/data, when labels appear on the main page. (See discussion.) Should we remove cascading protection from the main page? Alternatively, I think labels used to be "converted" by hand to (''text'') before being plugged into the WOTD templates, whereas that doesn't happen anymore... we could make an effort to convert the labels on all the WsOTD that have been set. - -sche (discuss) 00:48, 20 April 2016 (UTC)

I think labels should be converted to {{qualifier}} for WOTDs. --WikiTiki89 16:04, 20 April 2016 (UTC)
Or even a qualifier template used only on the Main Page! Renard Migrant (talk) 21:58, 20 April 2016 (UTC)
Better yet, they should be substed. Chuck Entz (talk) 01:32, 21 April 2016 (UTC)
I'm not sure that would work. Substing {{lb|en|chemistry}}, for example, yields {{#invoke:labels/templates|show}}. —Aɴɢʀ (talk) 13:51, 21 April 2016 (UTC)
The template code can be tweaked to make substing possible, but I don't think that's a good solution, since it would add too much unnecessary mark-up. --WikiTiki89 15:01, 21 April 2016 (UTC)
Modules can be substituted, and there is also the Lua function mw.isSubsting() which a module can use to return different results depending on whether it's substed or not. —CodeCat 15:45, 21 April 2016 (UTC)
Yes they can be, but they won't be unless the template's code is changed to something like {{<includeonly>safesubst:</includeonly>#invoke:labels/templates|show}}. --WikiTiki89 15:57, 21 April 2016 (UTC)
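
As an aside, the hand conversion described above (from a label template to plain (''text'') wikitext) could be sketched roughly as follows. This is only an illustration under stated assumptions: the function name `flatten_labels` and the regular expression are hypothetical, the sketch handles only simple `{{lb|lang|label|...}}` calls with no nested templates, and it does not reproduce what Module:labels actually outputs (links, categories, label aliases).

```python
import re

# Assumption: only simple {{lb|<lang>|<label>|...}} calls, no nested templates.
LABEL_RE = re.compile(r"\{\{lb\|([^|{}]+)\|([^{}]*)\}\}")


def flatten_labels(wikitext: str) -> str:
    """Replace simple {{lb|...}} calls with plain (''label, label'') text."""

    def repl(match: re.Match) -> str:
        # group(1) is the language code, which is dropped; group(2) holds the labels.
        labels = [part.strip() for part in match.group(2).split("|") if part.strip()]
        if not labels:
            # Nothing usable to show; leave the template call untouched.
            return match.group(0)
        return "(''" + ", ".join(labels) + "'')"

    return LABEL_RE.sub(repl, wikitext)


if __name__ == "__main__":
    print(flatten_labels("{{lb|en|chemistry}} A substance."))
```

Text flattened this way no longer invokes Module:labels/data, so a Word of the Day set like this would not pull that module under the main page's cascading protection; the cost is that the label loses whatever formatting the module would have added.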

Let's kill nds-de/nds-nl.

I've always been hesitant to bring it up because it's not really a high priority issue and since I don't know how to bot, I would be either heaping the work on somebody else or have to work through 500 pages manually. Also I was partially at fault for the adoption of these tags in my early days on Wiktionary and learning about Low German. But being more acquainted with both, I don't see any justification for this split, especially considering that "Low German" covers the last 400 years and not only post WWII. The only actual distinction that runs sharply on the border is orthographic tradition, though you can find some Dutch traditions in Western Germany as well, e.g. "zy" for [zɛɪ] in Eastern Frisia around 1890, which is when modern Low German had its greatest international public attention.
For reference: The distinction between nds-nl and nds-de arose on nds.Wikipedia because they couldn't settle on a way to spell things, but that is an issue we as a descriptive dictionary don't face. We currently have 99 nds-NL lemmas, 405 nds-DE lemmas and 548 nds-only lemmas. Opinions, commentary? Korn [kʰũːɘ̃n] (talk) 10:23, 21 April 2016 (UTC)

I'm fine with merging the codes as long as we have regional labels for them allowing the categories Category:Dutch Low German and Category:German Low German to be retained as regional varieties of Low German. We could even have separate categories for all of the subdialects of Dutch Low German that have their own ISO codes. —Aɴɢʀ (talk) 13:49, 21 April 2016 (UTC)
I know nothing or almost nothing about the language; however, based on what I've seen on Wiktionary this seems like a good idea. The Norman Wikipedia I know has similar issues with spelling because Jersey and Guernsey spellings are so different. Renard Migrant (talk) 20:08, 21 April 2016 (UTC)
I've had problems deciding which template to use, and now I just use the nds template. With Norwegian words it's hard to tell whether they're of Low German or Middle Low German origin, so I refer to Den Danske Ordbog as well, where they usually differentiate between nedertysk and middelnedertysk. What language came before Middle Low German, by the way? Donnanz (talk) 20:22, 21 April 2016 (UTC)
Old Saxon (osx). —Aɴɢʀ (talk) 21:36, 21 April 2016 (UTC)
I think we should merge these languages only if Norwegian is also merged, it's more or less the same issue. —CodeCat 21:41, 21 April 2016 (UTC)
Ooh, don't start that one up again. What about merging Scots with English?
@Angr: I thought so, cheers. Donnanz (talk) 21:53, 21 April 2016 (UTC)
I'm definitely opposed to making this merger dependent on some unrelated merger. —Aɴɢʀ (talk) 22:08, 21 April 2016 (UTC)
If there are only around 1000 nds entries in all, we have only scratched the surface. I have come across many more without entries when entering etymology, and the same goes for gml. Donnanz (talk) 22:31, 21 April 2016 (UTC)
While the situation within Low German might be similar to that of Norwegian, this is not bound to national borders, which is what we wrongly imply. Westphalian dialects exist in both Germany and the Netherlands, they are very similar to each other and markedly different to the other dialects within the same nation. East Frisian and Twents on the other hand are two dialects which have the political border running right through them without too great an impact. Sometimes not even on orthography, as the zy-example shows. The dialects within Germany might agree roughly on which glyphs not to use. But since they have vastly different phonetic inventories, and are often influenced by different regional traditions, they do not employ identical spellings either, even if they agree on the pronunciation. Further they do not agree on declension patterns, number and form of articles and pronouns, and number of grammatical cases. So while the situation within Low German might be like that of Nynorsk/Bokmål/Rigsdansk, that is simply not the issue nds-de and nds-nl are dealing with. What they are dealing with is spelling. They would separate entries with the same pronunciation, meaning and grammar into separate L2 headers purely based on spelling. And this is what I ask to do away with. Korn [kʰũːɘ̃n] (talk) 10:48, 22 April 2016 (UTC)
I know that we are descriptive, but we must always be forward facing. What does the future hold for these two varieties? Will they trend towards greater divergence in future or no? I know, we can cross that bridge when we get to it, but why wait? Planning today for the future is never foolish. Leasnam (talk) 19:11, 22 April 2016 (UTC)
I don't know about Dutch Low German, but I suspect German Low German will become extinct before it has a chance to either converge with or diverge from Dutch Low German. Trying to find a fluent native speaker of German Low German below the age of 50 is like trying to find a hay-colored needle in a hay stack. —Aɴɢʀ (talk) 19:19, 22 April 2016 (UTC)
Hmm, good point. Then what's left will be just the one variety. But (hopefully) like Bavarian and Swiss German, there is always hope :) Leasnam (talk) 19:25, 22 April 2016 (UTC)
I don't see why being forward facing would include making a random guess at the outcome of the next few decades of development. Whatever their result, they won't undo the last centuries either. And if they change that much, they probably deserve their own L2 as a new phase of the language. My point was that it seems only sensible to me that either we separate the variants based on how different they are, or group them all as one. If we split them up, German Low German has to be divided into the actual dialects, which number somewhere between 4 and 40, depending on what you split by. If we're not going to split up Low German into its actual dialects, I don't see why we would create the singular case that groups of spellings would get a separate language tag here. (My understanding is that Nynorsk also differs in grammar from Bokmål, but, again, "both" Low Germans already differ in grammar and spelling from themselves plenty.) Korn [kʰũːɘ̃n] (talk) 20:07, 22 April 2016 (UTC)
No one in their right mind would suggest a random guess. But we can gauge based on the past 100 years or so whether they are becoming more alike, remaining static, or diverging. Leasnam (talk) 21:16, 22 April 2016 (UTC)
Maybe there's a parallel with Flemish and Swiss German which aren't regarded as separate languages here, despite the fact they are spoken in countries other than the home country of the original language. Donnanz (talk) 13:45, 25 April 2016 (UTC)
This comparison seems correct to me. I do not see how a divergence of dialects in the future justifies the split as we have it (for less than half of the entries), though. The future is neither the present nor the past, which is what we record. And I don't see why one difference should get special treatment over all other differences which are not tied to a political border. Be that as it may, since there doesn't seem to be any hard opposition, I'll just use the plain NDS-tag in etymology and descendants sections, especially since it's the major one anyway. Korn [kʰũːɘ̃n] (talk) 15:06, 25 April 2016 (UTC)
We are a dictionary, not a crystal ball. Predicting the future is not our business and should not drive any of our decisions. --WikiTiki89 15:10, 25 April 2016 (UTC)
When Wiktionary merged the various dialect codes which the Ethnologue / ISO encoded alongside nds (Sallands, Westphalian, etc) into just two codes mirroring the two Wikipedia codes, I hoped it would make the situation less messy, avoiding some of the debates over spelling and capitalization because some trends are much clearer (even if not entirely uniform) inside nds-de and nds-nl than in nds as a whole, partly as a result of the separate pull that Dutch has exerted on nds-nl and that German has exerted on nds-de.
nds-de and nds-nl are far more dissimilar than e.g. sr, hr and bs (which are functionally identical, all based on one specific subdialect). There are dialects on the edges of each which bleed into the other, as is also true of e.g. Scots vs English (where sentences can sometimes be hard to classify as Scots vs Scottish English). There is also fine internal variation, especially historically: for example, some trends and specific words distinguished similar varieties of Low Prussian from each other, not just from Western Pomeranian vs Mecklenburgish, etc — this is true of most Germanic lects, e.g. Central Franconian, Swedish, even English (contrast da yooge boid ate da olykoek /də judʒ bɜjd eɪt də ˈ(oʊ~oə).lɪ.kʊk/ and the huge bird ate the doughnut /ðə hjudʒ bɝd eɪt ðə ˈdoʊ.nʌt/).
One user previously proposed merging not only nds-de and nds-nl but also pdt into nds as nds.Wiktionary does; I oppose merging pdt into nds. I neither support nor oppose merging nds-de and nds-nl.
- -sche (discuss) 16:31, 25 April 2016 (UTC)
What's your reasoning for Plautdietsch? And what is your definition of it? Many nds-de entries have a tag including Low Prussian, which is what I understand as Plautdietsch. Korn [kʰũːɘ̃n] (talk) 22:40, 26 April 2016 (UTC)
Plautdietsch is a Low-Prussian-derived lect that was historically (and still is) spoken outside Prussia, in America and Ukraine and Russia and elsewhere. It has been kept separate on account of its separate geographic development, just like Luxembourgish and Transylvanian Saxon have been kept separate from each other and from Central Franconian, and Pennsylvania German and Volga German have been kept separate from Rhine Franconian, etc. Low Prussian is the lect that was historically spoken inside Prussia. We have entries in it because it is well- and accessibly-documented.
I have periodically questioned if it was sensible to do things that way — merge all the "inland" varieties of Rhine Franconian under one code, but keep "outland" varieties like Pennsylvania German and Volga German all separate, and likewise Hunsrik, etc, etc. It is convenient in some ways, and the weight of precedent is firmly behind it when it comes to German lects. But I periodically muse that we seem to be doing just the opposite of Ethnologue: they split everything except languages they speak, e.g. they split Fula varieties but not New York vs London English; whereas, we merge Fula but split up (some of) the languages we speak.
- -sche (discuss) 01:53, 27 April 2016 (UTC)
I'll state for the record that I think our language codes for languages which are not officially codified by some nation should be divided solely by difference in grammar and phonology. Minor variations thereof, as well as political and lexical differences, should be the criteria for tags and sub-categories. Exempli gratia I would class Berlinerisch as de. Korn [kʰũːɘ̃n] (talk) 21:15, 27 April 2016 (UTC)
Why even make an exception for languages codified by a nation? --WikiTiki89 21:20, 27 April 2016 (UTC)
Because if we were to start defining and splitting Scandinavian lects based on actual differences, we would become unusable to an average person who wants to look up a word in Swedish and not Svea-Geatlandish or whatever we'd arrive at. My hunch is that anything with an official set of rules and spellings should have its own header, since it's what people would most likely look up. Don't pin me down on it, though, this statement feels like thin ice. Korn [kʰũːɘ̃n] (talk) 11:07, 28 April 2016 (UTC)
There's no need to use funny names; we'd just call it Southern Swedish or something. Wouldn't people wanting to look up a Norwegian word already be confused that there is no such thing as Norwegian, but only "Norwegian Bokmål" and "Norwegian Nynorsk"? And that Serbian, Croatian, and Bosnian don't exist, but instead Serbo-Croatian does? We shouldn't worry too much about whether people will be confused by which languages we choose to merge and which to split. --WikiTiki89 15:13, 28 April 2016 (UTC)
The southern Swedish dialect is known as Scanian. Donnanz (talk) 15:34, 28 April 2016 (UTC)
We've got one of the problems right there: Is Scanian really southern Swedish? I've seen it classified as Eastern Danish. And there are things like Tronderish/Jamtlandic, whose area is split roughly 50/50 between Norway and Sweden. Is that Central Norwegian or Central Swedish? Nor do I think that people would understand "South Swedish" to be a supra-term to what they understand to be just "Swedish", rather than a subcategory of it. You've got a point about Bokmål and Nynorsk nomenclature. I guess it comes down to whether one can expect users to intuitively understand that the term covers what is looked for; I don't know if that is or is not the case for Norwegian and any other language in that vein. Korn [kʰũːɘ̃n] (talk) 16:42, 28 April 2016 (UTC)

Palochka

Discussion moved from User talk:Stephen G. Brown#Palochka.

I see that you were discussing the use of uppercase and lowercase Palochka back in 2012 and 2013. Currently, ru.wiktionary (XML dump) contains some 5000 uppercase and 1000 lowercase Palochka (mixed in with a few hundred uppercase Latin I and Cyrillic dotted I), while en.wiktionary and fr.wiktionary today almost entirely use the lowercase version and tr.wiktionary and ce.wikipedia (Chechen Wikipedia) only use the uppercase version. Would it be possible to find a consensus for all WMF projects on which version of Palochka to use? Should ru.wiktionary, which currently has a mix, change all to uppercase or to lowercase? --LA2 (talk) 00:56, 17 April 2016 (UTC)

I would much prefer that all uses of palochka be of one case. Originally there was only one palochka, Ӏ (U+04C0), and I think it should have stayed that way. Today, the original palochka has become the uppercase palochka, so I support the use of the uppercase for all cases. Since uppercase and lowercase palochkas look identical, having it in two cases would lead to a lot of misspellings and trouble.
It would be great if a consensus among all WMF projects could be reached, but I do not know how to go about doing it. —Stephen (Talk) 07:28, 17 April 2016 (UTC)
I agree that lower-case palochka should never be used. --WikiTiki89 14:26, 18 April 2016 (UTC)
I'd prefer lower-case palochka to be used where appropriate. Many wiki-projects go through normalisation. Wikipedias in languages that use palochka will catch up eventually. It's only correct to spell "Qatar" as Къатӏар ‎(Q̇aṭar) (lower case) in Chechen but the spelling "КъатӀар" (upper case palochka) is common. --Anatoli T. (обсудить/вклад) 09:02, 19 April 2016 (UTC)
I noticed that on my iPhone, the lower-case palochka is actually shorter than the upper-case palochka. In other fonts (on my computer), the lower-case one looks like a lower-case "L/l", and the upper-case one looks like an upper-case "I/i"; and so they range from distinguishable by serifs only, to nearly indistinguishable. @Atitarev: Can you link to professionally type-set printed text in a language that uses palochka? --WikiTiki89 19:37, 19 April 2016 (UTC)
No, I don't think I will be able to link to properly type-set texts in North Caucasian or other Russian minority languages, even for common words like лугӏат ‎(luġat). Cf. Chuvash сăмахсар (Roman ă) vs сӑмахсар ‎(sămahsar), Ossetian цæхх (roman æ) vs цӕхх ‎(cæxx). The Internet penetration and digitization of these minority languages is very poor and hard to verify. You'll find that misspellings are often more common than standard or standardized forms. Vahagn might know more about North Caucasian spellings, fonts and the mix-up with palochka and other special symbols, which are missing in other Cyrillic-based languages. We should try and normalize these spellings. I think we can compare the use of Hebrew "׳" vs a simple apostrophe ' or Arabic hamza, which is used inconsistently out there but we use it here. As a dictionary, we can afford using the latest standard forms. --Anatoli T. (обсудить/вклад) 01:42, 20 April 2016 (UTC)
The professionally typeset books of Soviet times do not distinguish the case of palochka. I am in favour of distinguishing lowercase palochka from the uppercase one, because proper nouns can start with a palochka. For example, Chechen Ӏаьрбийн Цхьанатоьхна Эмираташ ‎(ʿärbīn Cḥanatöχna Emirataš, United Arab Emirates). --Vahag (talk) 06:00, 20 April 2016 (UTC)
@Vahagn Petrosyan, Wikitiki89 Our transliteration modules can't handle all possible misspellings either and the translation adding tool User:Conrad.Irwin/editor.js needs some attention for North Caucasian languages (see lines starting after "var diacriticStrippers"). Compare Avar Къатӏар ‎(Q̇̄aṭar) (lower case palochka, correct) and КъатӀар ‎(Q̇̄atӀar) (upper case palochka). The latter is transliterated incorrectly "Q̇̄atӀar". If we use I, l, 1, | instead of proper palochkas, the results are even worse. --Anatoli T. (обсудить/вклад) 08:00, 20 April 2016 (UTC)
@Vahagn Petrosyan: What do current professionally type-set texts do? How would professionally type-set text have rendered United Arab Emirates then and now? --WikiTiki89 15:45, 21 April 2016 (UTC)
According to the few modern sources I have, they too do not distinguish the case of palochka. Compare this extract from {{R:ce:Aliroev}}. --Vahag (talk) 16:06, 21 April 2016 (UTC)
Do your sources say anything about whether the next letter is capitalized (i.e. something like ӀАьрбийн Цхьанатоьхна Эмираташ ‎(ʿÄrbīn Cḥanatöχna Emirataš))? --WikiTiki89 17:26, 21 April 2016 (UTC)
They don't capitalize the next letter. --Vahag (talk) 06:28, 22 April 2016 (UTC)

Yes, we should aim for normalization. But is it settled that the lowercase Palochka is the future? Maybe adding it to Unicode 5.0 was a mistake that we should better ignore? (In the 1970s, the Swedish Academy tried to change the spelling of zebra to sebra and juice to jos, but these reforms didn't catch on, nobody used them, and they are now a laughing matter of a bygone era when overly optimistic planners thought they had an influence that indeed they lacked.) If there is a scientific/societal consensus to use the lowercase Palochka, then we should do so. But is there? It seems to me that the Chechen Wikipedia is already fully normalized and standardized on using the uppercase Palochka. What exactly is wrong with that? --LA2 (talk) 22:09, 20 April 2016 (UTC)

I really favor using only the uppercase as the Chechens do. The palochka is nothing more than a version of the apostrophe, and we would not want to have upper- and lowercase apostrophes, would we? And there is no reason why the unicase palochka couldn’t be used as the initial letter of a proper noun.
It should not be a difficult choice. At the moment, it seems that the speakers and writers of the affected languages themselves prefer to use the unicameral palochka, and we should follow suit. If at a future time the writers of these languages decide that they want two cases for the palochka, it will be a simple matter for us here to make this change on Wiktionary. Just as the readers and writers of a language should have the right to decide the spellings of their words, and how to punctuate their language, those same people should have the say in whether the palochka is unicase or dual case, and we should bow to their usage. It can always be changed later on if the users of the languages want to do it. —Stephen (Talk) 01:02, 21 April 2016 (UTC)
I think this topic should be moved to Beer parlour, it concerns language policies, transliterations of a few languages. Whatever decision is made, could be accompanied by some bot work, changes to translit modules and policy pages. --Anatoli T. (обсудить/вклад) 21:53, 21 April 2016 (UTC)
Topic moved from User talk:Stephen G. Brown#Palochka. —Stephen (Talk) 08:41, 22 April 2016 (UTC)
I think the original reason for adding a lowercase character is that Unicode rules were changed such that case-pairings are fixed and can never be changed (after the glottal stop debacle). As such, a lowercase-only character can get an uppercase counterpart encoded later, but the opposite isn't possible anymore. As such, lowercase counterparts were added for all uppercase-only characters in Unicode (there were only about 5 or 7 or so) so they wouldn't run into problems in the case that a lowercase counterpart would later be needed. -- Liliana 08:55, 22 April 2016 (UTC)
It may be the background story, but we need to decide whether we're going to use only the upper case palochka or both upper and lower case palochkas in the concerned languages. I no longer insist on using both, like with other, usual upper/lower case letters. It seems there is more evidence that we need to stick to upper case palochka. --Anatoli T. (обсудить/вклад) 09:23, 22 April 2016 (UTC)
I too don't care that much about which option we choose, as long as we follow it consistently. --Vahag (talk) 09:47, 22 April 2016 (UTC)
In previous discussions, it seemed intuitive to me that we should make the case distinction, for the reasons Anatoli gives above. However, there is a big difference between using what amounts to CamelCase, and mixing scripts as in the цæхх vs цӕхх example. If the speakers themselves apparently always or mostly use uppercase, then I guess we should go with that. There is at least one other language with a standard orthography that uses only one member of a pair of cased letters, although it's a constructed language (Klingon uses only I and no i). - -sche (discuss) 15:18, 22 April 2016 (UTC)
The other difference is that with the Latin æ vs. Cyrillic ӕ, we can say that even though we are going against widespread internet practice, we are still staying true to the printed orthography, since these codepoints are not typographically different; however, the lower-case palochka goes against the printed orthography as well, and therefore I oppose using it. --WikiTiki89 15:34, 22 April 2016 (UTC)
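
If the uppercase-only convention is adopted, the bot work mentioned above might start from something like the following sketch. It is an assumption-laden illustration, not an agreed procedure: the function names `count_palochkas` and `normalize_palochka` are hypothetical, the only code points involved are U+04C0 (CYRILLIC LETTER PALOCHKA) and U+04CF (CYRILLIC SMALL LETTER PALOCHKA), and look-alikes such as Latin I, l, the digit 1 or | are deliberately left alone because they cannot be told apart from legitimate uses without language-specific checks.

```python
UPPER_PALOCHKA = "\u04c0"  # CYRILLIC LETTER PALOCHKA
LOWER_PALOCHKA = "\u04cf"  # CYRILLIC SMALL LETTER PALOCHKA


def count_palochkas(text: str) -> dict:
    """Count upper- and lower-case palochkas, e.g. to survey a dump."""
    return {
        "upper": text.count(UPPER_PALOCHKA),
        "lower": text.count(LOWER_PALOCHKA),
    }


def normalize_palochka(text: str) -> str:
    """Replace every lower-case palochka with the upper-case one.

    Look-alike characters (Latin I, l, 1, |) are not touched.
    """
    return text.replace(LOWER_PALOCHKA, UPPER_PALOCHKA)


if __name__ == "__main__":
    word = "\u041a\u044a\u0430\u0442\u04cf\u0430\u0440"  # "Qatar" spelled with a lower-case palochka
    print(count_palochkas(word))
    print(normalize_palochka(word))
```

The replacement is written out explicitly rather than through general case conversion, so that no other letter in the word changes case. The transliteration modules would still need to treat U+04C0 as a modifier after a consonant for forms like the second Avar spelling above to transliterate correctly.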

Should definitions and glosses reflect the lemma form of a lemma?

For Latin verbs, most of our entries show the definitions in the first person singular. For example abdico. This is presumably because the lemma form is the first person singular present. I'm not sure if this practice makes sense, though, because the lemma represents all forms and not just the first-person singular present. The Latin lemma should be translated with an English lemma, and in English, it is usual to lemmatise verbs in the infinitive, so we should use it in definitions also. Compare verb lemmas for Bulgarian, Irish, Macedonian, Welsh and probably many other languages that have no infinitive, where we already do this. —CodeCat 18:51, 22 April 2016 (UTC)

I agree that lemmas should be translated and glossed with lemmas. For example, dīcō should be defined as "to say". Just like Arabic قَالَ ‎(qāla), which is the 3rd person singular past tense, is defined as "to say", and Macedonian каже ‎(kaže), which is the 3rd person singular present tense, is defined as "to say". --WikiTiki89 19:01, 22 April 2016 (UTC)
I agree in principle, but it will be difficult to get people to comply, perhaps especially for Latin and Ancient Greek. This may be due in part to how these languages are taught. When I was in school and competing in certamen at Junior Classical League conventions, your answer would be considered wrong if you said something like "dīcō – to say". We were constantly reminded to answer along the lines of either "dīcō – I say" or "dīcere – to say". —Aɴɢʀ (talk) 19:16, 22 April 2016 (UTC)
I disagree, as we are defining the word and dico, for example, does not mean "to say". Lemma-to-lemma makes sense on some levels, but I think it's better to provide an accurate translation of that form of the word, especially because beginners in the language might not be aware that the lemma of a Latin verb is not the same form of the verb as the English lemma, and this could be a potential source of confusion. Andrew Sheedy (talk) 11:51, 26 April 2016 (UTC)
Would you then agree with defining Arabic verbs with things like "said", "opened", "defined", etc? And Welsh verbs with "saying", "opening", "defining"? —CodeCat 13:20, 26 April 2016 (UTC)
We aren't defining the word. We are defining the paradigm. In some languages, the lemma form does not even cleanly correspond to any English tense. Furthermore, many languages lack an infinitive, so what would we put in translation tables? --WikiTiki89 15:10, 26 April 2016 (UTC)
This is another good point. Since we translate from English lemmas to whatever lemma the other language uses, we should translate back into English from lemma to lemma too. It doesn't make much sense for a translation of a Latin verb to appear on the English infinitive page, only for that verb to define itself as a 1st person form. —CodeCat 15:49, 26 April 2016 (UTC)

Categorize ditransitive verbs by template label

Please do. Thanks.--Dixtosa (talk) 12:01, 23 April 2016 (UTC)

Done. - -sche (discuss) 02:27, 27 April 2016 (UTC)

Global ban of Liliana-60

FYI, I sent a message today to the e-mail in the global account log[8] of Liliana-60 to ask exactly why she was globally blocked. I'm waiting for their response. --Daniel Carrero (talk) 19:06, 23 April 2016 (UTC)

Death threats. — Ungoliant (falai) 19:26, 23 April 2016 (UTC)
  • The transparency has been quite underwhelming. I'm curious to see how they respond to your query. —Μετάknowledgediscuss/deeds 19:33, 23 April 2016 (UTC)
    Contrast #Proposal to globally ban WayneRay from Wikimedia above; it links to guidelines that seem to have been ignored in this case. —Μετάknowledgediscuss/deeds 19:36, 23 April 2016 (UTC)
    I'm guessing you'll get no reply at all. Renard Migrant (talk) 20:41, 23 April 2016 (UTC)
  • Before you complain about transparency, note that the m:WMF Global Ban Policy clearly states "Also, to protect the privacy of all involved, the Wikimedia Foundation generally will not publicly comment on the reason for any specific banning action." and "Please note that questions about specific WMF global bans will not be addressed, to protect the privacy of all involved." I think this is perfectly reasonable. If the ban was wrongful, Liliana should be the one to sort it out. Also, I'm not sure whether this is significant or not, but three other accounts were globally banned on the same day. --WikiTiki89 15:20, 26 April 2016 (UTC)
    I don't know whether it's significant or not either, but it certainly wouldn't put anybody's privacy at risk to tell us. And yet they won't. (Also, as noted before, they already broke their own rules on global bans in this case.) —Μετάknowledgediscuss/deeds 15:25, 26 April 2016 (UTC)
    It seems there is a distinction between community-initiated global bans and WMF-initiated global bans. The rules for the former do not apply to the latter, so no rule was broken. --WikiTiki89 15:40, 26 April 2016 (UTC)
I am unhappy with the process the WMF is using; it seems to me that anyone can be blocked without any evidence being given to justify that block. The whole point of the WMF is that the community is self-governing. - TheDaveRoss 15:38, 26 April 2016 (UTC)
Just because evidence is not released to the public doesn't mean none was presented. And you can't judge the effects of releasing that information if you don't know what that information is. I'm sure the WMF doesn't take any of this lightly and certainly would not ban just "anyone". --WikiTiki89 15:49, 26 April 2016 (UTC)
They do not need to reveal the contents of the edit to be at least partially transparent. They could, for instance, post something here (where Liliana was very active) saying that they were taking an action, and justifying that action. Unilateral and silent action is against the spirit of the project, and while I agree that they would probably not wantonly ban anyone, Liliana would likely disagree. She is unable to disagree here. - TheDaveRoss 16:59, 26 April 2016 (UTC)
Liliana is able to disagree in private conversation with the WMF. --WikiTiki89 17:32, 26 April 2016 (UTC)
There is a major flaw with that argument: they hold all of the cards and she holds none. There is a reason why trials in civilized society always include impartial third parties, either a jury or the state or both. When community members make blocks there are other community members who can review the block and judge whether or not it was justified. It happens all of the time on this project. When the WMF blocks someone there is no similar check, so the possibility for abuse is much higher. - TheDaveRoss 17:47, 26 April 2016 (UTC)
The WMF is the impartial third party. Equivalent somewhat to "the state" (but I would caution not to take that analogy too far, there are many differences between an organization and a state). --WikiTiki89 18:02, 26 April 2016 (UTC)
They are not the third party, they are not impartial. The WMF has interests of its own, and has recently shown to go to some lengths to hide those interests from constituents. That whole mess with board members etc. being removed and resigning recently demonstrates that the small group of people who represent the organization officially are not infallible and should work harder to operate in the open. The community writ large is the third party in this situation, the WMF is judge, jury and executioner. - TheDaveRoss 18:28, 26 April 2016 (UTC)
They are a third party. Whether they are impartial depends on what the topic is. As far as Liliana's block is concerned, they certainly are impartial. Liliana didn't do anything personally to them (at least as far as we know). --WikiTiki89 18:32, 26 April 2016 (UTC)
Prove it. You want to argue semantics, I would rather address the issue, which is that the WMF took unilateral action without providing justification or an open mechanism for appeal or community input. I am not the kind of person who distrusts authority on spec, but this process is hypocritical from an organization founded on the concept of openness and community control. I agree that the WMF needs to have the ability to take decisive action to protect the project and the community, but those actions should be very rare and be well justified. - TheDaveRoss 18:46, 26 April 2016 (UTC)
I'm not arguing semantics; if you can't see past the semantics to what I'm actually trying to say, then I'll try to explain it better. I can see that your issue is with the lack of openness, which has nothing to do with whether the WMF is an impartial third party (and I have explained why I think so). We are in agreement that there is a lack of openness, but the question remains whether the lack of openness is justified. You say "those actions should be very rare and be well justified". There have been 15 of these bans since 2012. I would say that makes it fairly rare, considering the size of the entire community, but you may disagree with that. Whether they are justified, and whether the lack of openness is justified, is not something you can judge without all of the information, which none of us have. But as long as they remain rare, I am willing to assume they are justified. And I'll reiterate that even if it's not justified, it's not as though Liliana has no recourse. --WikiTiki89 19:12, 26 April 2016 (UTC)
The only way I can judge whether the lack of openness is justified is if the WMF justify their actions, and they have not. Fifteen bans is relatively few; the four this week are another story, but it does seem that the actions are infrequent. Even the infrequent actions, however, should be justified. By this I do not mean that they should have justification in the mind of the person taking action; I mean that the person taking action should explain the action to the community. You hit the nail on the head in stating that this "is not something you can judge without all of the information," and I completely agree, which is why more information should be made available. They do not need to reveal the specific offending text, but they should be able to cite the policy under which they acted, and describe the reason why they felt the action was necessary. Concerning Liliana's recourse, the entity to whom she can appeal is the same entity who she believes has acted in bad faith. If I block someone unjustly, the person I block can appeal to you; Liliana is not afforded the same recourse. - TheDaveRoss 19:45, 26 April 2016 (UTC)
My point was that if you don't know what they're not saying, you can't judge whether or not they should be saying it. Now did Liliana say that the WMF acted in bad faith, or is it just you who's saying that? And don't forget that the WMF isn't just one person either. --WikiTiki89 19:53, 26 April 2016 (UTC)
Liliana said that the initial set of blocks, and the removal of her permissions elsewhere were unjustified, so I assume she also considers the further action similarly. I have not heard from her since this block came along. You are right that I can't know what they are not saying. I just can't imagine any scenario in which silence is necessary. If Liliana posted nuclear launch codes then they should say that she was blocked for violating whatever policy is in place for such an action. They don't have to give out the codes to explain their actions. Also, though the WMF is more than one person, this action was taken using the anonymous account, so I don't know who actually made the block. - TheDaveRoss 20:09, 26 April 2016 (UTC)
There's a difference between unjustified and done in bad faith. But anyway, I do think it would be reasonable and desirable for WMF to release some information. I just wanted to point out that their current policy itself states that they will not say anything, so we should complain about the policy rather than about the silence itself. I don't know how I even got sucked into this argument. Also, another point is that if WMF states their reason and then is later proven wrong about it, it could be seen as slanderous. --WikiTiki89 20:48, 26 April 2016 (UTC)
In the end it probably comes down to the fact that "the community" can't be sued or tried for a crime whereas WMF can. So WMF needs to have the ability to act quickly and decisively if they feel WMF is at risk. To disclose to the community the details of the allegations and evidence may run the risk of increasing WMF's legal risk.
I am reminded of a Saturday Night Live skit involving a press conference during the Gulf war. DCDuring TALK 16:14, 26 April 2016 (UTC)
As a stakeholder in the WMF, a suit brought against them is a suit against my interests as well. I understand that organizations need to protect themselves (I am on the board of a non-profit myself, so I have some [limited] experience with such considerations). As I said above, it is not that I oppose the action, it is the process by which this action was taken that I am in opposition to. I think that the community should be a safe place for all, and if Liliana's words truly violated that safety then perhaps a ban was called for. Without any form of evidence or justification presented I cannot be comfortable with the action. - TheDaveRoss 16:59, 26 April 2016 (UTC)
@TheDaveRoss Therein is why non-admins don't like admins. To a non-admin, the complaint of blocks being capricious seems hardly surprising. Purplebackpack89 16:21, 26 April 2016 (UTC)
I reject your premise. - TheDaveRoss 16:59, 26 April 2016 (UTC)
  • This block process, especially the global block, has struck me as DerHexer and others talking to the right people privately, rather than anything that approaches a community consensus. Also, aren't you supposed to be disruptive on all of your main projects (of which this is one for Liliana) before being globally banned? Purplebackpack89 17:25, 26 April 2016 (UTC)

Multiple times on Wikis including this, she gave info on where users she had disputes with lived and threatened to visit. On German Wikis, she said she would resort to vigilante justice and that only people who don't shy away from murder get ahead. Multiple people issued blocks over threats and noted the danger of her having privileges that require trust, give access to deleted personally identifying information and allow blocking. As she continued to make threats, people predicted she would be banned. It's sure inexplicable she has been banned. Thrwwy (talk) 20:49, 26 April 2016 (UTC)

Linking to the constituent words in the headword templates

I find links to

  1. "ex" and "libris" in ex_libris
    and
  2. "Marie Antoinette" in Marie_Antoinette_syndrome

totally useless and potentially misleading.

In the first example libris is an orange link (useless). ex is blue but it has nothing to do with Latin ex (misleading).

In the second case Marie_Antoinette makes one link and if you follow the link you will see a definition that has nothing to do with the syndrome (useless).

So, the questions are

  1. do you agree that both (actually three) of the links should be delinked?
  2. is there any reasonable rule to follow to identify the parts that should be linked?
    and finally super high IQ editors can contemplate on the following
  3. What's the purpose of linking to the constituent words in headword templates at all?

--Dixtosa (talk) 19:54, 25 April 2016 (UTC)

I have always advocated for the auto-linking to be opt-in. Having lost that battle, I tried advocating for a reasonably easy way to opt out other than head=the entire long phrase. I lost that battle too. --WikiTiki89 20:00, 25 April 2016 (UTC)
@Dixtosa How could links from Marie Antoinette syndrome to Marie Antoinette and syndrome possibly be misleading? (I agree that links to Marie and Antoinette are not instructive.) If w:Marie Antoinette is not helpful then perhaps the entry needs an image of a painting of Marie Antoinette when she was relatively young, but had whitened hair. I think it is our job to figure out the best way of answering questions "How did those words come to mean that?"
The default linkage to individual terms is a labor-saver because AFAICT no one here has the AI skill to make a module that could do links better than a skilled editor and links to the individual components are probably the single most usually instructive links. The meanings of many headwords of MWEs do have a comprehensible connection to the meanings of some combination of the constituent terms, sometimes of the individual components. DCDuring TALK 21:30, 25 April 2016 (UTC)
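To make the behaviour under discussion concrete, here is a rough sketch of default auto-linking with an opt-out. It is an illustration only: the function autolink_headword and its head argument are hypothetical stand-ins modelled on the head= parameter mentioned above, and the real headword module may split and link terms quite differently.

```python
def autolink_headword(title, head=None):
    """Sketch of default headword linking for a multiword entry title.

    If head is given, it is used as-is (the editor's manual override).
    Otherwise each space-separated word is linked individually.
    """
    if head is not None:
        return head
    words = title.split(" ")
    if len(words) < 2:
        # Single-word titles are not linked to themselves.
        return title
    return " ".join("[[" + word + "]]" for word in words)


if __name__ == "__main__":
    # Default: every constituent word is linked separately.
    print(autolink_headword("ex libris"))
    print(autolink_headword("Marie Antoinette syndrome"))
    # Manual override: link the name as one unit.
    print(autolink_headword(
        "Marie Antoinette syndrome",
        head="[[Marie Antoinette]] [[syndrome]]",
    ))
```

With this default, the only way to get a single link to Marie Antoinette, or to drop the links in ex libris, is to spell out the whole phrase in the override, which is the inconvenience described above.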