Wiktionary:Beer parlour/2016/March

Wiktionary:Wanted entries - name and shortcuts

I'd like to create shortcuts for all the subpages of Wiktionary:Wanted entries:

But one thing bothers me: basically, the only practical difference between Wiktionary:Wanted entries/la and Wiktionary:Requested entries (Latin) (WT:RE:la) is that the former is of requested redlinks. (then again, most "requested entries" are redlinks too) Do we want to keep the name Wiktionary:Wanted entries? I don't even have a better idea for a name, that's why I didn't create an RFM. Maybe I'd better just create the shortcuts, but first I'd like to know if it's okay, since they further cement "Wiktionary:Wanted entries" as the name we should use.

For reference, the page was created in 2005 as Wiktionary:project-wanted articles, then moved in 2008 to Wiktionary:project-wanted entries and finally moved again in 2010 to Wiktionary:Wanted entries. --Daniel Carrero (talk) 04:56, 1 March 2016 (UTC)[reply]

Bashkir Transliteration policy

Hi all, I noticed someone has recently introduced changes into the active Bashkir transliteration template(s). I could not track down who that was, or where these are actually stored. Folks, can you please help me figure out who that was, explain to me how these policies are accepted, and why there has been no public discussion on this before the changes were made? Borovi4ok (talk) 09:08, 2 March 2016 (UTC)

It appears to be this edit by User:Amateur55. The edit summary given was "Fixed transliterations in accordance with WT:BA TR". Wyang (talk) 09:13, 2 March 2016 (UTC)[reply]

Thanks a lot for helping out the noob ) Borovi4ok (talk) 09:40, 2 March 2016 (UTC)[reply]

Inspire Campaign: Making our content more meaningful

The second Inspire Campaign has launched to encourage and support new ideas focusing on content review and curation in Wikimedia projects. Wikimedia volunteers collaboratively manage vast repositories of knowledge in our projects. What ideas do you have to manage that knowledge to make it more meaningful and accessible? We invite all Wikimedians to participate and submit ideas, so please get involved today! The campaign runs until March 28th.

All proposals are welcome - research projects, technical solutions, community organizing and outreach initiatives, or something completely new! Funding is available from the Wikimedia Foundation for projects that need financial support. Constructive, positive feedback on ideas is appreciated, and collaboration is encouraged - your skills and experience may help bring someone else’s project to life. Join us at the Inspire Campaign and help your project better represent the world’s knowledge! I JethroBT (WMF) 19:54, 2 March 2016 (UTC)[reply]

Entries with no definitions and no citations

User:Equinox has expressed the view that entries with no definitions and no citations should be speedied (see Talk:desklapizar and diff), and I think I agree. Does anyone disagree? If not, I will go ahead and delete every entry in Category:Ido entries needing definition that has no definitions and no citations. —Mr. Granger (talk • contribs) 21:32, 2 March 2016 (UTC)[reply]

I am OK with that, but it would be great if they ended up on the requested list for the applicable language. If you don't feel like doing that, though, I don't blame you. (edit @Embryomystic, who created a number of these.) - TheDaveRoss 21:44, 2 March 2016 (UTC)[reply]

@Embryomystic, TheDaveRoss: The ping does not get sent if you just edit your message, you must add a new message. --Daniel Carrero (talk) 21:50, 2 March 2016 (UTC)[reply]

Thanks, and weird. - TheDaveRoss 21:52, 2 March 2016 (UTC)[reply]

Okay, I'll look at the pages. It's possible that I overreached a bit on some of them, and I see they've been added to lists of Ido entries that are lacking. I'll add definitions to the ones that I can do that to without overreaching, and I guess the rest will just end up getting speedied. I agree that they should be put on the requested list at the very least, though. embryomystic (talk) 02:16, 4 March 2016 (UTC)

The usual way we use lists of words without definitions is to put them in a user subpage. DCDuring TALK 02:41, 4 March 2016 (UTC)[reply]

I agree that the content of Category:Ido entries needing definition (currently 85 entries) should be speedied. The reason for speedy delete is the conjunction of (a) lacking definition and (b) appearing unattested by a quick web search. Here's e.g. google:"predikegar". Since they appear unattested, collecting them in a request page seems of questionable utility. Beware that also the inflected forms need to be deleted. The deletion summary could be "Speedy delete a definitionless term that, on the face of it, appears unattested. Do not readd without attesting quotations meeting WT:ATTEST", or the like. --Dan Polansky (talk) 21:24, 4 March 2016 (UTC)[reply]

Category:Ido entries needing definition now only has 7 entries. The rest has been filled with definitions by the creator of these entries but it does not mean they are attested. I don't know how to handle the volume of suspect entries left in en wikt database by the user. I could start systematically sending them one by one to RFV, for each one that looks suspect based on a quick web search, but that would flood RFV quite badly. --Dan Polansky (talk) 14:14, 5 March 2016 (UTC)[reply]

It's a problem that seems to happen with constructed languages in general—people invent words that are morphologically plausible (or sometimes morphologically implausible) and create entries for them even if they aren't attested. Our Esperanto entries seem to be under control, but both Ido and Volapük have a large number of apparently unattestable entries. Our coverage of Novial, Interlingua, and Interlingue is much less extensive, and Lojban morphology is very isolating, so I imagine there's less of a problem in those languages. —Mr. Granger (talk • contribs) 15:15, 5 March 2016 (UTC)[reply]

Maybe we could develop a quick superficial test for speedy deletion of suspect entries. Like, if the lemma has zero hits at Google Books, and has no attesting quotation in Wiktionary, it is speedied, and can be created again when attesting quotations are supplied. It would be a temporary measure applied only to languages suspected to have a fairly large number of unattested entries in en wikt. I don't know whether Google Books is not too stringent for Ido, though. --Dan Polansky (talk) 18:29, 5 March 2016 (UTC)[reply]

Ido isn't really well represented on Google Books (for good economical reasons). There are many books without previews, of which some can be found on Ido-Vivo. And a lot of the texts that are on Google Books are textbooks and/or propaganda, like the notorious Lehrbuch der Weltsprache IDO fur Arbeiter. Lingo Bingo Dingo (talk) 14:02, 17 March 2016 (UTC)[reply]

If they actually have no definitions, I guess speedying is OK. If they just have definitions that need a little TLC, however... Purplebackpack89 14:41, 5 March 2016 (UTC)

Seeing no objections, I have deleted the remaining Ido entries with no definitions and no citations. As requested, I've added them to Wiktionary:Requested entries (Ido), in case they do turn out to be attestable with some meaning. —Mr. Granger (talk • contribs) 23:08, 5 March 2016 (UTC)[reply]

You guys should be specific that you're talking about Ido entries. There are hundreds of fresh Russian and old Chinese entries needing definitions, and they are needed. People do fill in definitions. --Anatoli T. (обсудить/вклад) 23:50, 5 March 2016 (UTC)

Oppose if applied to all languages. Same reasoning as in the discussion about allowing definitionless entries before. Wyang (talk) 10:56, 6 March 2016 (UTC)[reply]

Pageviews graph

Just so you know, I have copied wp's {{PageViews graph}} here. This template generates a <graph> with the current page views statistics. There's also a user script that adds a "View statistics" item to the "More" menu, showing view statistics in a popup dialog. You can use it by adding the following line to your /common.js

mw.loader.load('//en.wikipedia.org/w/index.php?title=User:קיפודנחש/viewstats.js&action=raw&ctype=text/javascript');

--Dixtosa (talk) 09:30, 3 March 2016 (UTC)[reply]

Suggestion: give etymologies to irregularities

We’re not supposed to give etymologies to noncanonical forms because such etymologies tend to be obvious and uninteresting, but I think that we should make an exception for irregularities since their origins are more difficult to determine. Here’s an example:

était

Any objections? --Romanophile ♞ (contributions) 18:31, 3 March 2016 (UTC)[reply]

They should be put at the lemma form. The etymology of était is the same as the other past forms. —CodeCat 18:38, 3 March 2016 (UTC)

A fair point I suppose, but wouldn’t it bother you if one etymology section was inflated to accommodate all of the irregularities? --Romanophile ♞ (contributions) 19:38, 3 March 2016 (UTC)[reply]

No that wouldn't bother me. Don't forget that the lemma is just the form chosen to represent all the other forms. Thus, any etymological information regarding any of the forms should be in the etymology section of the lemma. --Wiki Tiki 89 19:49, 3 March 2016 (UTC)[reply]

We already have entries that use show-hide bars to conceal portions of overly long etymologies (eg, long cognate lists) so the length of an etymology section need not be a consideration. DCDuring TALK 20:04, 3 March 2016 (UTC)[reply]

@CodeCat, DCDuring, Wikitiki89: yeah, I guess that this wasn’t such a good idea. I’m glad that I asked before doing a shitload of them, though! --Romanophile ♞ (contributions) 20:29, 3 March 2016 (UTC)[reply]

Okay here’s my attempt. Any comments? --Romanophile ♞ (contributions) 23:17, 3 March 2016 (UTC)[reply]

Why not just list which parts of the paradigm come from which Latin verb/PIE root? You don't need to go into all that detail, and keeping it short gets the point across better. Compare zijn. —CodeCat 23:34, 3 March 2016 (UTC)

All right, I simplified it. Do you like it? --Romanophile ♞ (contributions) 00:29, 4 March 2016 (UTC)[reply]

The information is useful (and interesting!), but I think some sort of tabular presentation would be vastly better. Writing this sort of stuff out as prose just makes it slightly harder to grasp. Imaginatorium (talk) 06:45, 4 March 2016 (UTC)[reply]

I would go for some combination of the two. Personally, I find the longer version very interesting, and would love to see it included, as it shows the development of each form over time. I created my own version (the collapsible box is probably using the wrong template, but I'm not sure what I should be using). Andrew Sheedy (talk) 10:38, 4 March 2016 (UTC)

@Romanophile What do you think? Andrew Sheedy (talk) 04:57, 5 March 2016 (UTC)

I personally like it, since it gives a summary and also an elaborate version for those interested. --Romanophile ♞ (contributions) 05:38, 5 March 2016 (UTC)[reply]

I am not arguing for less information at all. I was going to suggest that the conjugation tables could have colour-coded backgrounds, then I discovered there aren't any conjugation tables, just lists of words. I would think that for all IE languages (even English), a standard 6 column, or 3+3 column, or similar layout would be much more effective. (I looked at Andrew Sheedy's version, but could not immediately see the meaning of the red items.) Imaginatorium (talk) 10:17, 5 March 2016 (UTC)[reply]

No, there's no rule against etymologies for conjugated forms. In French the conjugated forms of aller and être are both pretty interesting because some of them look very different to the infinitive: va, irai, suis, soit and so on. Renard Migrant (talk) 12:44, 6 March 2016 (UTC)

I mean inflections of all kinds; there's no rule against it and it's already in use so this discussion is a bit moot. Renard Migrant (talk) 12:45, 6 March 2016 (UTC)[reply]

But I don't think that is a very good practice. People might not think to look at specific inflected forms' entries when looking for this kind of information. --Wiki Tiki 89 18:49, 7 March 2016 (UTC)[reply]

@Wikitiki89, what about pronouns? I noticed that some nominative pronouns are treated as inflected forms of their masculine equivalents, though hypothetically they could be lemmatized. --Romanophile ♞ (contributions) 02:28, 11 March 2016 (UTC)[reply]

Whether they could be lemmatized is a separate issue (and should be decided on a case-by-case basis); the point is that if they are lemmatized, they would have their own etymology section, and if they are not, then they are forms of the lemma and the etymology should be at the lemma. --Wiki Tiki 89 16:05, 11 March 2016 (UTC)

Having a separate etymology could be an argument in favour of lemmatising, though. Not the only argument nor probably the most convincing one, but an argument nonetheless. —CodeCat 17:02, 11 March 2016 (UTC)

Reconstruction move count

In case anyone is interested:

Pages in the Reconstruction namespace: 2,597
Appendices to be moved (nonredirects starting with "Proto"): 5,963

Also, can the rest of the appendices be moved by bot? --Daniel Carrero (talk) 18:30, 4 March 2016 (UTC)[reply]

One important thing to note is that not all reconstructions start with Proto: unattested Latin terms, for example. --Wiki Tiki 89 18:42, 4 March 2016 (UTC)[reply]

I can bot move every Appendix:Proto... page and corresponding talk page to the Reconstruction namespace if nobody else feels like doing so, but if there are more nuances to it than that I will leave it to someone who knows what is going on. - TheDaveRoss 18:45, 4 March 2016 (UTC)[reply]

The nuances are that if a page is a redirect, it should be fixed to redirect to the Reconstruction namespace as well. For non-redirects, it would be good to make sure that the page has a {{reconstructed}} (or {{reconstruction}}) template at the top, otherwise it might not actually be a reconstruction page (and it would be nice to have a list of such pages if there are any). --Wiki Tiki 89 18:51, 4 March 2016 (UTC)[reply]

Pages missing templates

Those are all pretty straightforward, I can probably do it this afternoon if nobody beats me to it. Also see above for the list of pages. - TheDaveRoss 18:56, 4 March 2016 (UTC)[reply]

Swadesh lists aren't entries, they're lists. I don't think they should be moved. Chuck Entz (talk) 20:02, 4 March 2016 (UTC)[reply]

Thanks. I moved the ones that were actually reconstructions and left the ones that weren't. Another thing: If there are redirects from pages without a slash to pages with a slash (e.g. from "Appendix:Proto-Something foo" to "Appendix:Proto-Something/foo"), they can be deleted. --Wiki Tiki 89 20:04, 4 March 2016 (UTC)[reply]

@TheDaveRoss: Please confirm that you read my previous post about redirects. Thanking this edit would be enough. --Wiki Tiki 89 22:09, 4 March 2016 (UTC)[reply]

@Wikitiki89 Currently I am ignoring redirects altogether (only moving pages which are not redirects), once those are done I can circle back and fix redirects so that they all point to the right place. - TheDaveRoss 22:13, 4 March 2016 (UTC)[reply]
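The handling described in this sub-thread (move non-redirects that carry {{reconstructed}} or {{reconstruction}}, retarget redirects to the new namespace, delete slash-less redirects, and list everything else for review) could be sketched roughly as below. This is only an illustration in Python, not the code TheDaveBot ran; the function names `new_title` and `plan` and the action labels are assumptions made for the example.

```python
import re

# MediaWiki redirect pages start with "#REDIRECT [[Target]]" (case-insensitive).
REDIRECT_RE = re.compile(r"^\s*#REDIRECT\s*\[\[([^\]|#]+)", re.IGNORECASE)
# Either name of the reconstruction boilerplate template.
TEMPLATE_RE = re.compile(r"\{\{\s*(reconstructed|reconstruction)\s*[|}]")
PREFIX = "Appendix:Proto"


def new_title(title):
    """Map 'Appendix:Proto-X/foo' to 'Reconstruction:Proto-X/foo'."""
    return "Reconstruction:" + title[len("Appendix:"):]


def plan(title, wikitext):
    """Return a hypothetical (action, new_title_or_None) pair for one page."""
    if not title.startswith(PREFIX):
        return ("skip", None)
    match = REDIRECT_RE.match(wikitext)
    if match:
        target = match.group(1).strip()
        # Slash-less redirect to a slashed page: can simply be deleted.
        if "/" not in title and "/" in target:
            return ("delete", None)
        if target.startswith(PREFIX):
            return ("retarget", new_title(target))
        return ("skip", None)
    if TEMPLATE_RE.search(wikitext):
        return ("move", new_title(title))
    # No template: may not be a reconstruction (e.g. a Swadesh list).
    return ("review", None)
```

Pages that come back as "review" would make up the list of pages missing the template that was asked for above.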

Except for a few Old Prussian terms I'm not sure about, all of the non-protolanguage reconstruction pages should be done. KarikaSlayer (talk) 19:29, 4 March 2016 (UTC)[reply]

Another nuance is that a lot of pages in the Index: namespace have simple wiki links to the Appendix: namespace rather than links provided by {{l}} or {{m}}. For example, Index:Proto-Indo-European/d has [[Appendix:Proto-Germanic/talō#Proto-Germanic|1]] and [[Appendix:Proto-Germanic/talō#Proto-Germanic|*''talō'']] instead of {{l|gem-pro|*talō|1}} and {{m|gem-pro|*talō}}. That means all those links will be broken whenever a reconstruction is moved to Reconstruction: space without leaving a redirect. I'm gradually going through these appendices and fixing the links to reconstructions, but there's a hell of a lot of them, and fixing them is time-consuming because it can't be done by simple search and replace. —Aɴɢʀ (talk) 19:11, 4 March 2016 (UTC)[reply]

Didn't we vote on getting rid of the Index namespace? --Wiki Tiki 89 19:17, 4 March 2016 (UTC)[reply]

We didn't. I said I would create that vote later, but I didn't create it yet. --Daniel Carrero (talk) 19:20, 4 March 2016 (UTC)[reply]

It is simple search and replace. You search "Appendix:Proto" and replace with "Reconstruction:Proto". No?--Dixtosa (talk) 19:22, 4 March 2016 (UTC)[reply]

Well, that works if you're content with pages that say [[Reconstruction:Proto-Germanic/talō#Proto-Germanic|*''talō'']], but I'm not. If I'm going to go to the trouble of tidying up these pages, I'm going to do it right and use {{l|gem-pro|*talō}}. —Aɴɢʀ (talk) 19:50, 4 March 2016 (UTC)[reply]

I am not either. I am just saying that that is a different, unrelated, low-priority job (not intending to diminish your work) that does not really interfere with this high-priority job. This message assumes that the bot is going to do the mentioned simple search and replace too. --Dixtosa (talk) 20:18, 4 March 2016 (UTC)

It's relatively easy to do with any regex-based search and replace. --Wiki Tiki 89 20:20, 4 March 2016 (UTC)[reply]
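As a rough illustration of the regex-based replacement mentioned here, the following Python sketch turns the hard-coded links quoted above into {{l}} and {{m}} calls. It is a minimal example, not a script anyone in this thread actually used: the `LANG_CODES` table and the function name `convert_links` are assumptions, and only the link shape shown in the Index: example is handled.

```python
import re

# Hypothetical table mapping language names in page titles to language codes.
LANG_CODES = {"Proto-Germanic": "gem-pro", "Proto-Indo-European": "ine-pro"}

# Matches [[Appendix:<language>/<term>#<section>|<label>]]; the section is optional.
LINK_RE = re.compile(
    r"\[\[Appendix:(?P<lang>Proto-[^/\]|#]+)/(?P<term>[^#|\]]+)"
    r"(?:#[^|\]]*)?\|(?P<label>[^\]]*)\]\]"
)


def convert_links(wikitext):
    """Rewrite hard-coded Appendix links as {{l}} or {{m}} template calls."""

    def repl(match):
        code = LANG_CODES.get(match.group("lang"))
        if code is None:
            return match.group(0)  # unknown language: leave the link untouched
        term = "*" + match.group("term")
        label = match.group("label").replace("''", "")  # drop italic markup
        if label == term:
            # The label was just the italicised term: use a mention.
            return "{{m|%s|%s}}" % (code, term)
        return "{{l|%s|%s|%s}}" % (code, term, label)

    return LINK_RE.sub(repl, wikitext)
```

Links whose language is not in the table are left as they are, so a run of this kind would still need a manual pass for anything it skips.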

Not sure how prevalent it is, but if there are lots it sounds like a good candidate for that bot to-do list page. - TheDaveRoss 21:12, 4 March 2016 (UTC)[reply]

I've done a sweep to clean up all English Wikipedia links to our protolang entries (wasn't many of them really). Are there other Wikimedia projects where we can suspect links to these to be lurking about? --Tropylium (talk) 01:08, 6 March 2016 (UTC)[reply]

Done, apparently. Thank you!

All appendices of reconstructed terms starting with "Proto" are redirects, so I take it all of them were successfully moved to the Reconstruction: namespace.
{{reconstructed}} (redirect: {{reconstruction}}) is not linked by any pages in the Appendix: namespace.

Are there any unattested Latin terms, or some other reconstructed terms in appendices yet? Can we remove {{reconstruction}} from all appendices now and remove the patch that makes {{l}} and {{m}} link to the appendix namespace? --Daniel Carrero (talk) 02:21, 6 March 2016 (UTC)[reply]

There are no more Latin reconstructions in Appendix namespace. —Aɴɢʀ (talk) 07:34, 6 March 2016 (UTC)[reply]
The only single-term entries that I've been able to find in the Appendix namespace are conlangs. For the purposes of moving and of the special code that CodeCat added to {{reconstructed}} and to Module:links, we're done. I'm sure there are still lots of hard-coded links to reconstructions in Appendix space, and there are no doubt obscure bits of code in modules and templates that will have to be tracked down and fixed (I believe the templates that link to the next/previous in a series, such as {{cardinalbox}}, need to be checked, for instance). Chuck Entz (talk) 07:54, 6 March 2016 (UTC)[reply]

Template:auto cat

I created this template and module just now. It is used as a category boilerplate template. It tries to automatically detect the type of template that the category requires, and the parameters that go with it. It works with a good number of category types already, even though the module itself is quite small and simple. I don't know if it works with subst: right now, it would be nice for sure if it did. Ideally, substing it would give you the underlying template invocation. —CodeCat 20:08, 4 March 2016 (UTC)

Bot spam

Now that it's abundantly clear why we have bot flags, can we please block User:TheDaveBot until it gets one? My watchlist is completely unusable. —CodeCat 15:48, 5 March 2016 (UTC)

[1] DTLHS (talk) 15:55, 5 March 2016 (UTC)[reply]

I could find no way to flag a move as a bot. For future reference, you can use the "inverted" namespace selection to look at everything which isn't a move from one namespace to another. - TheDaveRoss 16:12, 5 March 2016 (UTC)[reply]

Bot flags are given to the user as a whole. You have to apply for one and get it voted on. I don't think the vote will fail, but you are technically violating WT:BOT by running a bot without a vote... I think people turn a blind eye because the work your bot is doing is good and needed. —CodeCat 16:34, 5 March 2016 (UTC)

The account has a bot flag, and a flood flag, and the bot and minor flags were set on the API calls. Moves cannot be flagged. - TheDaveRoss 19:52, 5 March 2016 (UTC)[reply]

TheDaveRoss, thank you for your volume renames, performed in line with an objectively verifiable consensus as evidenced by a vote. You don't need to take these attacks from the greatest WT:BOT violator the English Wiktionary ever had very seriously. Thanks again. --Dan Polansky (talk) 20:16, 5 March 2016 (UTC)

For the record, the enbotting vote is at User_talk:TheDaveBot#Discussion and Vote on TheDaveBot, from September 2006; the bot flag was granted by Dvortygirl as per that page. This was around the time Wiktionary:Votes was started on 22 September 2006 by the very TheDaveRoss. --Dan Polansky (talk) 08:09, 6 March 2016 (UTC)[reply]

Collapsible derived terms

One user is currently in the process of systematically placing derived terms in collapsible boxes, for English and non-English entries alike, in significant volume. I am not happy about it; if you are unhappy too, we can try to do something about it. --Dan Polansky (talk) 15:09, 6 March 2016 (UTC)

Why are you unhappy about it? For any long (>6 items) list it makes a larger portion of the overall entry visible at once. I thought there was a gadget or preference which allowed them to be open or closed by default for registered users. I can only find that (browser) preference implemented for translation boxes. Would such a gadget satisfy you? DCDuring TALK 17:24, 6 March 2016 (UTC)

It's in the sidebar, in the "Visibility" section. --Yair rand (talk) 18:49, 6 March 2016 (UTC)[reply]

I thought they didn't persist, but I see that they do. Thanks. DCDuring TALK 19:47, 6 March 2016 (UTC)[reply]

Wikipedia article titles as a durably archived source

We don't allow Wikipedia articles as sources because, I presume, they can be modified and anyone can write stuff on them in theory. However, the names of articles themselves are often much more stable. Furthermore, Wikipedia itself is durably archived (through countless mirrors). So I wonder if the titles of articles could be allowed as attestations for terms? We could set some restrictions, such as that the article has to have existed with that name for X years. —CodeCat 19:23, 6 March 2016 (UTC)

Strongly oppose. We shouldn't allow Wikipedia as a source because editors of small language Wikipedias (and even some bigger ones, like Portuguese) make up protologisms all the time. They have the requirement of their information being true, but the name they use for the title doesn't need to be a name that anyone actually calls it by. We have to have those names be true as well, though. Because limited documentation languages only require a single use or mention, this would allow thousands of protologisms to enter Wiktionary. —Μετάknowledge (discuss/deeds) 19:34, 6 March 2016 (UTC)

Better yet let's pretend we stop caring about being durably archived. If being durably archived actually mattered, we'd've accepted Wikipedia 10 years ago. What we actually allow is published books and Usenet. It might not sound great in a policy document, but I personally think we should stop lying. Renard Migrant (talk) 19:45, 6 March 2016 (UTC)[reply]

We allow durably archived media, not just books and Usenet. We cite books, journals, magazines, newspapers, archived films and television shows and songs (in the US and UK, modern commercially-released films and shows and songs are archived: libraries keep copies just like they keep copies of books; but e.g. a Somali TV show probably isn't durably archived and couldn't be cited), as well as monuments and runestones, and that list probably isn't exhaustive. - -sche (discuss) 04:23, 7 March 2016 (UTC)[reply]

Even titles can change (and be deleted such that the history can't be cited), which speaks against the idea that they are durably archived, and they are also often made-up. Another reason for not accepting Wikipedia is that for a WMF project to cite a WMF project is circular; perhaps it would work for terms that we were claiming as WMF jargon, but if we're asserting a term is generally used, we need to show it's generally used. - -sche (discuss) 04:27, 7 March 2016 (UTC)[reply]

Exactly, de facto durably archived stuff that is unreliable or allows self-referencing we don't allow. Which I wholly support. But we don't document this practice anywhere, in fact we tend to deny its existence until someone tries such a source, like citing a Wikipedia article as a source for a word. Renard Migrant (talk) 19:15, 7 March 2016 (UTC)[reply]

Another question people ask is (merging this into one) why don't we allow stuff accessible via archive sites like the wayback machine and why don't we durably archive stuff ourselves by taking screenshots. Renard Migrant (talk) 19:16, 7 March 2016 (UTC)[reply]

Websites can be removed from the wayback machine upon requests (via robots.txt). Screenshots are worth little more than our direct quotations. The point is that someone who doesn't believe our direct quotation can go look it up themselves, and screenshots do not fix that problem. --Wiki Tiki 89 19:20, 7 March 2016 (UTC)[reply]

Anything used as a Wikipedia article title, if not nonsense, would already have to come from some other archived source. bd2412 T 22:29, 7 March 2016 (UTC)[reply]

Ideally, yes, but in practice, especially with small minority languages, that is not the case. --Wiki Tiki 89 22:42, 7 March 2016 (UTC)[reply]

Again, even with bigger ones like Portuguese Wikipedia (Ungoliant can attest to this), they make stuff up. Remember, the ideas must be cited, not the words used to describe or name the ideas. —Μετάknowledge (discuss/deeds) 01:16, 8 March 2016 (UTC)

Inter-project links to missing pages

I am not sure if there is already a policy in place about links to other projects, but I have run across a number of such links which do not lead anywhere. I propose that we modify the templates ({{wikipedia}} most notably) to accept a parameter which hides the template from the page and categorizes the page to be reviewed. I think that many of these cases just require choosing an appropriate target page on the sister project; perhaps some cases should be removed altogether. A bot could easily maintain such a system, periodically checking all usages (either through dumps or whatlinkshere). I assume that nobody thinks links to non-existent pages on sister projects are useful, but maybe I am missing something. - TheDaveRoss 23:35, 6 March 2016 (UTC)

What IMO would be more useful would be to run the list of target pages from {{wikipedia}} and its relatives against the list of mainspace headwords at WP. The ones that were to non-existent pages would be problems to be cleaned up (and perhaps those to redirect pages to be bot modified). Something similar could be done for links to other sister projects. Perhaps focusing on links to a smallish sister project, but one which had at least a hundred links from here would be a good way to test the concept. DCDuring TALK 00:35, 7 March 2016 (UTC)[reply]

I think we might be saying the same thing, but differently. I am suggesting that we generate lists of pages which have {{wikipedia}} links (and similar) where the target of the link does not exist on that project. Is that what you are saying too? - TheDaveRoss 00:48, 7 March 2016 (UTC)[reply]

I'm glad. I was afraid you were suggesting that when a user discovered a dead link s/he simply manually insert a flag that would place the page in a category. I'd been thinking about such a thing for quite a while. It might also be handy for the {{w}} and in-text hard links to other projects, each meriting inclusion in a different category or dump-derived cleanup page. The situation isn't quite the same with {{taxlink}} and {{vern}} but they might benefit from something similar. The complications don't make them a good place to start. DCDuring TALK 01:53, 7 March 2016 (UTC)[reply]
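The cross-check discussed above (compare the targets of {{wikipedia}} links with the list of page titles that exist on the sister project, for instance from a dump) might look roughly like the Python sketch below. The function names `normalise` and `missing_targets` and the shape of the inputs are assumptions made for illustration; the normalisation only covers underscores and the capitalised first letter of Wikipedia titles.

```python
def normalise(title):
    """Normalise a link target roughly the way Wikipedia titles are stored."""
    title = title.replace("_", " ").strip()
    return title[:1].upper() + title[1:]


def missing_targets(links, existing_titles):
    """Group entries by link target for targets absent from the sister project.

    links: iterable of (entry, target) pairs taken from template usages.
    existing_titles: set of normalised titles that exist on the sister project.
    """
    report = {}
    for entry, target in links:
        wanted = normalise(target)
        if wanted not in existing_titles:
            report.setdefault(wanted, []).append(entry)
    return report
```

The resulting mapping could feed either a cleanup category or a dump-derived cleanup page, whichever of the options in this thread is preferred.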

There are a few ways to accomplish this, my preference is to include a parameter on the templates (so they can be hidden/delinked), but we could also just add a hidden category to the pages which contain such links, or remove the links, etc. Any preferences there? - TheDaveRoss 03:09, 7 March 2016 (UTC)[reply]

Why hide the template on such pages and not just remove it or let it remain on whatever cleanup list (of entries linking to nonexistent WP pages) gets generated? Btw, in the other direction, Wikipedia had a project a while ago to find words it used that weren't present on Wiktionary, many of which were misspellings on WP's part but others of which were omissions on our part; I can't offhand find it. - -sche (discuss) 04:32, 7 March 2016 (UTC)[reply]

The main reason to hide rather than delete is that someone, at some time, thought that a link to Wikipedia was beneficial. It could be removed and a category could be added, but that makes more work for whoever actually fixes it down the road (they would have to both add the template and remove the category rather than simply changing {{wikipedia|missing=true}} to {{wikipedia|New target}}). As I said above, there are a number of ways to accomplish the same thing, I am not opposed to any of them really. - TheDaveRoss 12:06, 7 March 2016 (UTC)

See predator bug & Category:Wikipedia link with missing target page for an example. - TheDaveRoss 03:33, 7 March 2016 (UTC)[reply]

I added correct links under "External links" for predator bug.

The first thing to try for links to species names and the remedy in the case of [[predator bug]] is to link to the genus instead of the species. For Translingual entries in general a link one rank/level higher in the hierarchy often finds a valid link. Sometimes the taxon that is parameter 3 in {{taxon}} solves the problem adequately. IOW, many Translingual entries could have the links corrected by bot. As to the choice of first tool for correcting the problem I suppose it is a question of whether there are thousands of bad links in Translingual entries or a much smaller number. DCDuring TALK 11:41, 7 March 2016 (UTC)[reply]

If the algorithm is fairly rigid (if the link is missing then try to link one level higher, stopping at [family? class?]) then that is certainly something which could just be fixed along the way rather than categorizing and leaving for someone else. - TheDaveRoss 12:06, 7 March 2016 (UTC)[reply]
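The rigid fallback described here (if the link target is missing, try one rank higher, stopping at a fixed rank) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `first_existing`, the input shape, and the default stop rank of "family" are hypothetical, and the taxon names in the usage note are placeholders rather than real data.

```python
def first_existing(lineage, existing_titles, stop_rank="family"):
    """Return the most specific taxon name that has a page, or None.

    lineage: list of (rank, name) pairs ordered from most specific to most
    general, e.g. species, genus, family.
    existing_titles: set of page titles known to exist on the sister project.
    stop_rank: last rank that is still tried before giving up.
    """
    for rank, name in lineage:
        if name in existing_titles:
            return name
        if rank == stop_rank:
            break  # do not climb past the agreed limit
    return None
```

For a placeholder lineage such as `[("species", "Aus bus"), ("genus", "Aus"), ("family", "Aidae")]`, a call with only "Aus" in `existing_titles` would pick the genus page; entries for which nothing is found up to the stop rank would still need categorizing for manual review.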

Links to discussions often broken

Links to supposedly ongoing discussions that appear in banners such as "A user suggests that this entry be cleaned up ..." are frequently broken, normally because the discussion has been archived. Is it not possible to make durable links? 86.152.161.107 01:40, 11 March 2016 (UTC)[reply]

It's not much of an excuse but many of the discussions never took place or have little helpful content. Sometimes there was a cleanup but the tag was not removed.

We have many problems not tagged with RfCs, particularly in the area of definition quality, such as having definitions copied unchanged from Webster 1913 or Century 1911 using words in ways that are obscure or misleading. DCDuring TALK 23:51, 11 March 2016 (UTC)[reply]

Yeah the tag is supposed to be removed but people often forget. Renard Migrant (talk) 11:03, 12 March 2016 (UTC)[reply]

Diacritical marks

I think we should not give to all our readers the false intention that the orthographically correct writing of some words (especially Latin and ancient Greek words) is with diacritical marks such as macron. This should be declared in the lemma in a special section or in a special "way". Especially we should not have this in the etymology section or in section for descendants or in the declension or inflection tables. The declined or inflected forms, found in the inflection and declension tables, have their own articles where the (non correct orthographically) term with diacritics can be also shown. Also, in the "header", the "diacritical mark", the mark that is not commonly written but only in case the teacher or the lecturer asks for, should be marked as such; as a non orthographically correct but a special mark (with "header" I mean the "word" that the templates such as {{grc-noun}} or {{la-verb}} produce and show in the page).

An ordinary reader who looks at procrastinate sees "prōcrastinātum" and says: Oh! Look how the Romans typed their letters! It is fascinating! My school book has typos! I must inform the editors of my book about those typos! --Xoristzatziki (talk) 08:09, 14 March 2016 (UTC)[reply]

Wiktionary:About_Latin#Macrons_should_be_used_only_within_pages. --Anatoli T. (обсудить/вклад) 08:53, 14 March 2016 (UTC)[reply]

@Xoristzatziki Why not complain about all languages, which don't normally use diacritics in a running text, but we do? :) Shall I start listing? --Anatoli T. (обсудить/вклад) 08:56, 14 March 2016 (UTC)[reply]

First off: We as a dictionary are not responsible for daft kids with too much time on their hands and too little scrutiny in their reading. Secondly, we already clearly express when diacritics are part of the actual orthography or a scholastic addition by choice of page title. Which brings me back to point number one. Korn [kʰũːɘ̃n] (talk) 09:31, 14 March 2016 (UTC)[reply]

I agree (with Korn). - -sche (discuss) 05:54, 16 March 2016 (UTC)[reply]

@Atitarev My complaint is about all languages, which don't normally use(d) diacritics in a running text. grc and la (and macron) are only for reference.

@Korn Most users who come here expect to find the definition (primarily) and the declension or inflection (possibly secondarily) of a word. Sure, there are philologists and the like who also want to know whether a vowel was pronounced long, or perhaps would-be poets looking for rhymes. But no one will find text written this way in a normal Latin, Ancient Greek, etc. book.

The so-called clarity of the page title is not so clear to actual readers (as opposed to those involved in editing), just as redirect pages are sometimes confusing. As mentioned above, if a diacritic is a scholastic addition then (IMHO) it should be placed in a place for scholastic additions and nowhere else, and personally I do not think of the header in the definitions as such a place. To me it looks as if we were placing the pinyin in the header of Chinese characters instead of the actual pictograms, or placing the transliteration instead of the actual word (for other languages). --Xoristzatziki (talk) 11:57, 14 March 2016 (UTC)[reply]

But anyone can find the text in a normal Latin, Ancient Greek dictionary. Likewise, no one will ever find gender, principal parts, etc. in running text. Chuck Entz (talk) 12:24, 14 March 2016 (UTC)[reply]

@Xoristzatziki Users need to know how to use dictionaries. We include diacritics for Arabic, Hebrew, Russian, Serbo-Croatian, etc. These languages normally don't use diacritics in a running text. We have policy pages, which describe the purposes of the symbols. It's the way it is. You can suggest a vote on removing diacritics but I doubt it will pass. --Anatoli T. (обсудить/вклад) 12:40, 14 March 2016 (UTC)[reply]

OK (regarding "but I doubt it will pass"). I am just pointing out (and registering...) my strong opposition to (such special) departures from the practice of ordinary (and especially electronic) dictionaries. Users come here to find a definition and everything they are accustomed to finding in an everyday dictionary (so this should be a simple page, as simple as we can make it). --Xoristzatziki (talk) 05:03, 16 March 2016 (UTC)[reply]

Ordinary dictionaries that are any good certainly show these diacritics. This is simply part of how Wiktionary strives (and succeeds in some languages, I might add) to be a superior resource in terms of balancing completeness and user-friendliness than any others freely available online. —Μετάknowledge discuss/deeds 05:31, 16 March 2016 (UTC)[reply]

If only we had discussed this at length recently. —John C5 15:01, 16 March 2016 (UTC)[reply]

If I'm a casual user and I come across procrastino with the page name different from the head word, how do I know where to look to explain this phenomenon? Unless I'm missing something, the answer is that I can't possibly know where to look, as there's no explanation whatsoever in the entries. Renard Migrant (talk) 22:01, 16 March 2016 (UTC)[reply]

Use of "inh" vs "etyl"

Apologies if this has been covered elsewhere already, but it's more of a question of how to handle etymologies from now on. I've noticed more entries are getting the {{inh}} tag if they are inherited from another language, as opposed to the simple {{etyl}} one that has been widely used until recently. Is this how all entries should be treated from now on if it is known they are inherited (the same going for borrowed ones, respectively), under Wiktionary policy now?

I ask because for most of the Romance languages, the vast majority are still under the plain {{etyl}} kind; for example https://en.wiktionary.org/wiki/Category:Italian_terms_derived_from_Latin, as opposed to https://en.wiktionary.org/wiki/Category:Italian_terms_inherited_from_Latin, which does not also put those words in the aforementioned category. What's going to happen to the original category (the one called simply Italian terms derived from Latin)? They're not mutually exclusive, in my opinion, but now we're sort of in between... If someone is interested in finding a list of words in Italian that are derived from Latin and they click on that Category, they won't find some important basic words because they'll be in the "inherited" category. It's going to take a really long time to manually sort through all the words and replace the inherited ones with the correct tags... I just got through most of the Romanian inherited lexicon using the {{etyl}} ones.

Is the "derived from" category just now going to be for words in general derived from that language, especially if it isn't known they are inherited or borrowed? Is it going to be needed anymore, in the long run, if we're distinguishing inherited vs not? Or should inherited terms be put into both categories, such as by manually adding a category tag at the bottom? How should this be treated? Word dewd544 (talk) 19:29, 16 March 2016 (UTC)[reply]

If I'm not mistaken (and please forgive me if I am, I'm not up on this either...), the Category:Italian_terms_derived_from_Latin should remain for those Italian words that were borrowed from Latin, as opposed to inherited (i.e. remained in VLat throughout its history), correct ? Leasnam (talk) 19:35, 16 March 2016 (UTC)[reply]

The traditional categories continue to be used by {{bor}}. They are also used by {{der}}, which is used in terms borrowed indirectly. Don’t add the categories manually; let the templates worry about categorisation.

Indeed, it will be a lot of hard work to update all the etymologies, but I personally think it’s a step in the right direction. As you know, borrowing vs. inheritance is an extremely important distinction in etymology. — Ungoliant (falai) 19:39, 16 March 2016 (UTC)[reply]

Yeah, I guess we just have to accept that some of these categories are going to be a bit disjointed for the time being, until everything is properly sorted into the right one; I wish I would've seen the policy discussions on these matters earlier so I could've avoided using {{etyl}} lately. The only problem is, what if there are words where it is uncertain if they were borrowed or inherited? Do we just use {{der}} in those cases, and say "possibly" or "probably" borrowed/inherited before it, until we can find a source that removes the ambiguity? Unfortunately, there aren't always academic sources to cite for every word.

Additionally, if you really delve into it and do more in-depth research, you'll actually find the majority of words in every Romance language, on a purely numerical basis, were borrowed (Sardinian might have among the highest proportion inherited)... Though of course, when it comes to the core lexicon/vocabulary, inherited terms are almost completely dominant for all these languages, despite only consisting of a few thousand words among a much larger overall lexicon. Not a lot of people outside of linguists, philologists, or etymologists specializing in Romance languages and their evolution know this, however, so it may seem strange for some to see a fairly common word in Italian or Spanish or French was actually a borrowing.

Also, another technical question: can words be "borrowed" from Vulgar Latin, especially by a Romance language? There are a few cases where some words may have been (depending on what is defined as a Vulgar Latin word, of course- maybe Late Latin or Medieval Latin are actually more accurate in these cases), but it seems by its very nature, Vulgar Latin would be the source of primarily inherited words since it, for the most part, was a natural language and more spoken than written. Word dewd544 (talk) 23:07, 18 March 2016 (UTC)[reply]

A side-question: Should {{bor}} put entries in its own "borrowed from" category like {{inh}} already does? —CodeCat 20:27, 16 March 2016 (UTC)[reply]

Yes. --Wiki Tiki 89 20:30, 16 March 2016 (UTC)[reply]

Can {{bor}} be used at any step of an etymology, not just the most recent ? (i.e. if a term was inherited in English from Middle English, but borrowed by Middle English from Old French --a non-Modern English borrowing...) Leasnam (talk) 21:34, 16 March 2016 (UTC)[reply]

The borrowing has to have occurred within the "current" language. If a term was borrowed into Middle English, it is not a borrowing in modern English but would be considered inherited from Middle English. —CodeCat 21:37, 16 March 2016 (UTC)[reply]

Thanks :) Leasnam (talk) 21:38, 16 March 2016 (UTC)[reply]

That's not how I've been using it. If a modern English word has been inherited from Middle English and the Middle English word was borrowed from Old French, then the modern English word belongs in both Category:English terms inherited from Middle English and Category:English borrowed terms. There's no reason to treat the two as mutually exclusive. —Aɴɢʀ (talk) 21:52, 16 March 2016 (UTC)[reply]

Would this be done only for "recent" borrowings (2-3 steps back ?). Otherwise, if we follow this logic, the further back we go, the more confusing it will become. For instance, if a word was borrowed into Old English from Latin, then inherited by Middle English and English, I can see this working. But if the same Latin word was itself borrowed by Latin from Greek, then it starts to get hairy. And likewise, if the Greek word was borrowed from Phoenician or Persian, then is it still a Persian borrowing in English ??? Leasnam (talk) 22:13, 16 March 2016 (UTC)[reply]

It's only confusing if we start categorizing terms by the language they were borrowed from. {{bor|en|fro}} puts things into Category:English borrowed terms and Category:English terms derived from Old French, but we don't allow categories like Category:English terms borrowed from Old French, so it doesn't matter whether it's borrowed directly or indirectly. (I see that Category:English terms borrowed from Old French actually suddenly exists now, even though there's been no consensus to create categories like that, so that's a shame. At any rate, my argument still holds according to the state of affairs up to 48 hours ago.) —Aɴɢʀ (talk) 15:25, 17 March 2016 (UTC)[reply]

That's not what the documentation says. —CodeCat 21:54, 16 March 2016 (UTC)[reply]

There is notext=1 which is useful for a term that was say, borrowed into Latin, borrowed into Old French then borrowed into Middle English, to avoid having 'borrowing' three times in a paragraph making it quite a lot longer. Renard Migrant (talk) 21:58, 16 March 2016 (UTC)[reply]

That's still not how the documentation says the template should be used, though. It should specifically be used only for borrowing into the current language, anything else should not use this template. —CodeCat 22:00, 16 March 2016 (UTC)[reply]

So change the documentation. Renard Migrant (talk) 23:07, 16 March 2016 (UTC)[reply]

Is there a consensus for changing the practice established by the documentation? —CodeCat 23:22, 16 March 2016 (UTC)[reply]

I think that {{bor}} should only be used for borrowings directly into that stage of the language. Thus, words borrowed into Middle English should not use {{bor}} in Modern English. --Wiki Tiki 89 14:43, 17 March 2016 (UTC)[reply]

Is there a consensus for using the practice established by the documentation? —Aɴɢʀ (talk) 15:25, 17 March 2016 (UTC)[reply]

Belatedly, but: as long as we treat languages like English, Middle English and Old English distinct, I for one support the idea that {{bor}} should only be used for borrowings later than Middle English, not for things inherited from ME or OE. The idea after all is that each term is exactly one of inherited or borrowed (or coined etc.), and thus should not simultaneously be in both e.g. "English terms inherited from Old English" and "English borrowed terms".

On the other hand, if a term is marked as borrowed from Latin, and inherited from Middle English at the same time, you can infer that the borrowing happened in Middle English. —CodeCat 19:44, 20 April 2016 (UTC)[reply]

etyl vs. der → inverted parameters

Comment: I don't like that {{etyl}} and {{der}} have their parameters inverted. Either of these does the same thing:

{{etyl|la|en}}
{{der|en|la}}

Result:

Latin
Latin [Term?]

--Daniel Carrero (talk) 21:39, 16 March 2016 (UTC)[reply]
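The inverted parameter order can be made concrete with a small conversion sketch in Python. It is only an illustration, not an existing bot: the function name `etyl_to_der` is hypothetical, and it assumes (based on the "[Term?]" output above) that {{der}} takes the term as its third parameter, and that an {{etyl}} call is typically followed by a separate {{m}} link in the source language.

```python
import re

# {{etyl|SOURCE|CURRENT}} immediately followed by {{m|SOURCE|TERM}}
ETYL_WITH_TERM = re.compile(
    r"\{\{etyl\|([^|{}]+)\|([^|{}]+)\}\} \{\{m\|\1\|([^|{}]+)\}\}"
)
# A bare {{etyl|SOURCE|CURRENT}} with no term link after it
ETYL_BARE = re.compile(r"\{\{etyl\|([^|{}]+)\|([^|{}]+)\}\}")


def etyl_to_der(wikitext: str) -> str:
    """Rewrite {{etyl|source|current}} as {{der|current|source}}.

    The language codes swap places; a following {{m}} link in the
    source language is folded in as the (assumed) third parameter.
    """
    wikitext = ETYL_WITH_TERM.sub(r"{{der|\2|\1|\3}}", wikitext)
    wikitext = ETYL_BARE.sub(r"{{der|\2|\1}}", wikitext)
    return wikitext


print(etyl_to_der("From {{etyl|la|en}} {{m|la|cellarium}}."))
print(etyl_to_der("{{etyl|la|en}}"))
```

The sketch deliberately ignores named parameters and nested templates, so anything it does not recognise is left untouched.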

It can be confusing, but {{etyl}} is the odd one out here. It's the only template on Wiktionary (that I know of) that has the current language specified with the second parameter. All the others use either the first or lang=. See Wiktionary:Templates with current language parameter. —CodeCat 21:44, 16 March 2016 (UTC)[reply]

I'd support deleting {{etyl}} in favor of keeping only {{der}}. --Daniel Carrero (talk) 21:48, 16 March 2016 (UTC)[reply]

Purely for historical accuracy, {{etyl}}'s the odd one out only because {{der}}, {{bor}} and {{inh}} were designed to be the opposite way round to {{etyl}}. Renard Migrant (talk) 21:58, 16 March 2016 (UTC)[reply]

It is true, {{etyl}} was created in 2008 and the others in 2015. But I understood CodeCat's remark like this: we have {{head|en|...}}, {{label|en|...}}, {{m|en|...}}, {{l|en|...}}, we might as well have {{der|en|...}}, {{bor|en|...}} and {{inh|en|...}}. In all cases, "en" is the current section. (English) --Daniel Carrero (talk) 22:01, 16 March 2016 (UTC)[reply]

{{der}}, {{bor}} and {{inh}} would not make sense if their language parameters were inverted. Yes, this does make it confusing when compared to the older {{etyl}}, but I think this is a good change. {{etyl}} should not be changed, however. --Wiki Tiki 89 14:43, 17 March 2016 (UTC)[reply]

I think {{bor}} and {{inh}} should categorize into distinct categories and also into the generic "X terms derived from Y". This would be good both in the short term (it would address the problem that until the many entries which use {{etyl}} are updated, derivations of the same type are split into multiple categories) and in the long term: I think that it's useful to let people find all derivations from language Y in one place, no matter whether the derivation is by inheritance, by borrowing, or by an unclear route (the last of which probably ensures that some category besides "inherited" and "borrowed" will always have to exist). - -sche (discuss) 02:52, 17 March 2016 (UTC)[reply]
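The categorisation proposed here can be sketched in a few lines of Python. This models the proposal only, not the templates' actual code; the function name `categories` and the small `LANGUAGE_NAMES` table are assumptions made for the example.

```python
# Illustrative subset of language codes; the real list is far larger.
LANGUAGE_NAMES = {
    "en": "English",
    "enm": "Middle English",
    "fro": "Old French",
    "it": "Italian",
    "la": "Latin",
}


def categories(template: str, current: str, source: str) -> list[str]:
    """Return the categories an etymology template would add under the proposal.

    Every template adds the generic "derived from" category;
    {{inh}} and {{bor}} add a more specific category on top of it.
    """
    cur, src = LANGUAGE_NAMES[current], LANGUAGE_NAMES[source]
    cats = [f"{cur} terms derived from {src}"]
    if template == "inh":
        cats.append(f"{cur} terms inherited from {src}")
    elif template == "bor":
        cats.append(f"{cur} terms borrowed from {src}")
    elif template != "der":
        raise ValueError(f"unknown etymology template: {template}")
    return cats


print(categories("inh", "it", "la"))
print(categories("der", "en", "fro"))
```

Under this scheme a reader browsing "Italian terms derived from Latin" would see inherited, borrowed and unclear derivations together, while the narrower categories stay available.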

Yes. --Wiki Tiki 89 14:43, 17 March 2016 (UTC)[reply]

Open call for Individual Engagement Grants

Hey folks! The Individual Engagement Grants (IEG) program is accepting proposals from March 14th to April 12th to fund new tools, research, outreach efforts, and other experiments that enhance the work of Wikimedia volunteers. Whether you need a small or large amount of funds (up to $30,000 USD), IEGs can support you and your team’s project development time in addition to project expenses such as materials, travel, and rental space.

Submit a grant request or draft your proposal in IdeaLab
Get help with your proposal in an upcoming Hangout session
Learn from examples of completed Individual Engagement Grants

Also accepting candidates to join the IEG Committee through March 25th.

With thanks, I JethroBT (WMF) 23:01, 16 March 2016 (UTC)[reply]

Anyone interested in working on getting corpora and a KWIC generator so we don't have to risk violating copyright when we try to produce modern definitions? DCDuring TALK 03:16, 17 March 2016 (UTC)[reply]

Template:borrowing and borrowings into ancestral stages

I have been thinking a bit about the situation above, regarding when to use {{bor}}. I think the use case can be expanded so that it can be used for borrowings into the current language or any of its ancestors. This means that cellar would become allowable in Category:English terms borrowed from Latin, despite having been borrowed in Middle English. Is everyone ok with changing practice and documentation to allow for this?

I am still opposed to placing terms there that have been borrowed from (for example) Latin through another intermediate language. Consider, for example, emperor. It is borrowed from Old French, but Old French inherited it from Latin. We don't say that emperor is inherited from Latin, that would be silly, and saying it was borrowed from Latin would be equally silly. However, consider the cognate imperial, which Old French borrowed from Latin (on account of its i). It wouldn't make sense to consider this borrowed from Latin into English anymore than emperor was borrowed or inherited from Latin. English, or any of its ancestors, had no "linguistic contact" with Latin when these terms ended up in English. The only direct contact was with Old French. —CodeCat 23:22, 16 March 2016 (UTC)[reply]

What about Latin terms borrowed into Proto-Germanic? Or terms borrowed into Proto-Indo-European? It seems silly that hemp and canvas should both be in Cat:English borrowed terms, when one is inherited in unbroken succession from at least a pre-Grimm's-Law stage of Proto-Germanic, and the other is borrowed into Greek and then borrowed into Latin, and finally borrowed into Middle English from Anglo-Norman/Old French. There are attested Sumerian terms that can be found as inherited forms in modern Semitic languages, with the borrowing likely having happened literally thousands of years ago. This looks like a good way to overload the borrowed term categories and obliterate from the categories very significant distinctions about recentness/ancientness of borrowings. Chuck Entz (talk) 01:42, 17 March 2016 (UTC)[reply]

Yes, as Leasnam noted in the section above, labelling things as "borrowings" when they were borrowed into some previous stage gets hairy fast. I agree that "emperor" was not inherited from Latin and that, accordingly, it makes no sense to say "imperial" was borrowed from Latin. It seems like "derived" is the best label for such things. - -sche (discuss) 03:00, 17 March 2016 (UTC)[reply]

This issue is due to leaving out the intermediate stages. The etymology for emperor should say: Inherited ({{inh}}) from Middle English, borrowed ({{der}}) from Old French, inherited ({{der}}) from Latin. --Wiki Tiki 89 14:48, 17 March 2016 (UTC)[reply]

But if we do that, emperor is in Category:English terms inherited from Middle English, but it isn't in Category:English borrowed terms even though it is a borrowed term. —Aɴɢʀ (talk) 15:26, 17 March 2016 (UTC)[reply]

That's the point. It was a borrowed term in Middle English, but no longer so in Modern English. --Wiki Tiki 89 15:28, 17 March 2016 (UTC)[reply]

A word never stops being a borrowed term. It hasn't suddenly become a native term, so it remains a borrowed term for the rest of eternity. —Aɴɢʀ (talk) 15:32, 17 March 2016 (UTC)[reply]

Then statistically we should be able to say that almost every word in every language is likely a borrowed term, because when you go as far back as PIE and earlier, who the hell knows? --Wiki Tiki 89 15:34, 17 March 2016 (UTC)[reply]

In theory yes, but in practice we'd never put {{bor}} on a word without knowing for sure that it had been borrowed from some specific named language or proto-language. —Aɴɢʀ (talk) 15:40, 17 March 2016 (UTC)[reply]

Still, it leaves many words as borrowings that were borrowed so long ago that it's irrelevant. Such as dish and kitchen. --Wiki Tiki 89 15:50, 17 March 2016 (UTC)[reply]

I guess it all comes down to how borrowing is viewed in each respective language. A language like English is keen to observe terms borrowed from other languages--its speakers even pride themselves on such. However, there are some languages (not to mention any) which might not fancy this, but see ancient borrowings as native or at least naturalised. Either way is fine, but I think we should be consistent. Leasnam (talk) 16:03, 17 March 2016 (UTC)[reply]

To follow up a bit, a word borrowed into Proto-Germanic shouldn't be listed as a term borrowed into English, because English didn't exist yet. Same with French. A word borrowed into Latin from Greek shouldn't be a borrowed term in French because French didn't exist then (because we see Latin as distinct from its Romance descendants: i.e. Latin ceased and "new" languages emerged. French is not seen as "Modern Latin" in quite the same way that Modern English is seen as "Modern Old English"). I suppose it's natural to place a cutoff point when one language evolves into a new language (e.g. 'Proto-Germanic' into 'Old/Middle/New English' and 'Latin' into 'Old/Middle/Modern French'), but a word borrowed into Old French would be easy to consider as a borrowing into French, since language stages are really quite distinct when comparing them to separate languages (--a position that I have always held to)...definitely something to think about. Leasnam (talk) 16:17, 17 March 2016 (UTC)[reply]

Languages are always evolving into new languages. A term borrowed into English on 31 December 1499 doesn't suddenly become a native term on 1 January 1500. Nor is there a point where Proto-Germanic suddenly becomes English. Don't forget that Old English developed into Scots too, so if we call Old English "English", then is Scots also English? Old English is as much "English" as it is "Scots" after all. Just like Proto-Germanic is as much "Old English" as it is "Old High German", "Gothic" and "Old Norse". People don't suddenly stop speaking one language and start speaking others, English is just one particular dialect of Proto-Germanic, and it's so distinct from the others that we've given it a name and called it a language. But it's arbitrary. —CodeCat 16:41, 17 March 2016 (UTC)[reply]

I see your point, and come to think of it I think you're right. When I think about a scenario such as Proto-Indo-European borrowing a word from Ancient Chinese (just hypothetically in this example), and this same word being inherited through PGmc into OE, ME, and finally English...would I see it as a Chinese borrowing in English? The answer strangely is yes, I would. Leasnam (talk) 16:58, 17 March 2016 (UTC)[reply]

Yes, it's arbitrary, but we still need to have cutoffs. --Wiki Tiki 89 17:18, 17 March 2016 (UTC)[reply]

Why? And more pressingly, what? People speak all the time of French loanwords in English, but many times those were borrowed in Middle English. How do we define the point where we no longer consider English "English" in terms of borrowing? I think this is a very muddy picture that we probably don't want to solve. I for one would like to see borrowed terms listed regardless of time depth. I want to see cheese listed in Category:English terms borrowed from Latin for sure. It fits the description perfectly: it is an English term, and it was borrowed into the language that was to become modern English from Latin. —CodeCat 18:07, 17 March 2016 (UTC)[reply]

But wouldn't Category:English terms derived from Latin be a little more fitting for a word like cheese ? [EDIT CONFLICT] For "borrow" I would expect the form of the word to be like that in Latin (in my opinion), like ergo, that to me is a word borrowed from Latin. Cheese has been altered so much within English that it doesn't look like a Latin word anymore. If I "borrow" something of yours, it's still yours. Cheese means nothing in Latin. Café is a borrowed word from French. It still means something in French. Leasnam (talk) 18:25, 17 March 2016 (UTC)[reply]

Why? —CodeCat 18:26, 17 March 2016 (UTC)[reply]

Please see above ^ Leasnam (talk) 18:28, 17 March 2016 (UTC)[reply]

Because we treat Middle English and Modern English as separate languages already and we already have a cutoff. This doesn't mean that Middle English became Modern English overnight on whatever day New Year's was in the year 1500, but just that we had to pick a cutoff and that's what we picked. We should stick with this cutoff for all intents and purposes, including borrowings. --Wiki Tiki 89 18:35, 17 March 2016 (UTC)[reply]

CodeCat et al. I see your point about emperor (I didn't think of that). I think counting ancestor languages as separate languages for this purpose is a bad idea. Basically what we're talking about is cellar was borrowed into what we now call Middle English and has continued to exist ever since. Renard Migrant (talk) 20:40, 17 March 2016 (UTC)[reply]

I think it would be unhelpful to categorize "hemp" as having been borrowed by English from Scythian: it would dilute the category, putting things in it that I imagine many readers, like me (and apparently Leasnam), would not expect; it would also be inaccurate: English didn't exist at the time of the borrowing. I think "derived from" is the best label. What WikiTiki suggests in his comment of 14:48, 17 March 2016 (UTC) is good, especially if {{bor}}/{{inh}}/{{der}} can be made to only generate a language name (like {{etyl}}, and like someone has proposed elsewhere) and not the additional text they currently generate. I don't see how one could coherently categorize "hemp" in "English terms borrowed from Scythian" and not categorize e.g. "squaw" into "English terms inherited from Proto-Algonquian", and that would be absurd: we just had a discussion about what "inherited" and "ancestor" mean (with regard to Yiddish), and English is definitely not inheriting things from Proto-Algonquian. Perhaps the templates could detect whenever an English term was listed as deriving from anything that was not an ancestor of English, and then put the word into "English borrowed terms", but even that is questionable IMO, since "hemp" is a very different kind of "borrowed term" from "de jure". - -sche (discuss) 21:24, 17 March 2016 (UTC)[reply]

I agree with your point about "hemp", that was what I was trying to say in the second paragraph of my opening post. But I disagree that English did not exist. It did, it just wasn't called English yet. —CodeCat 21:33, 17 March 2016 (UTC)[reply]

Huh, I thought that's what you were saying (that you wouldn't want "emperor"/"imperial" or "hemp" to be listed in "English terms borrowed from X"), but then you also say you do want to see "cheese" listed in "English terms borrowed from Latin"; what's the difference? - -sche (discuss) 21:47, 17 March 2016 (UTC)[reply]

What I mean is that these terms should only be listed in the categories from which English itself (any of its ancestors) borrowed the term directly. So hemp would appear in Category:English terms borrowed from Scythian, but cannabis would appear in Category:English terms borrowed from Latin. Essentially, the rule is that a language can only borrow a term from at most one language, and any inheritances must follow the borrowing in time. —CodeCat 00:44, 18 March 2016 (UTC)[reply]
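The rule stated here (at most one borrowing source, followed only by inheritance) can be expressed as a short Python sketch. The data model is an assumption made for illustration: an etymology is a chronological list of `(kind, source, target)` steps, and the chains for hemp and emperor are simplified from what is said in this discussion.

```python
def borrowed_from(steps):
    """Return the single language a term counts as borrowed from, or None.

    `steps` is a chronological list of (kind, source, target) tuples,
    where kind is "bor" (borrowed) or "inh" (inherited). Walking back
    from the modern language, inheritance steps are skipped; the first
    borrowing met is the only one that counts under the rule.
    """
    for kind, source, _target in reversed(steps):
        if kind == "bor":
            return source
        if kind != "inh":
            raise ValueError(f"unknown step kind: {kind}")
    return None  # inherited all the way back; no borrowing recorded


hemp = [
    ("bor", "Scythian", "Proto-Germanic"),
    ("inh", "Proto-Germanic", "Old English"),
    ("inh", "Old English", "Middle English"),
    ("inh", "Middle English", "English"),
]
emperor = [
    ("inh", "Latin", "Old French"),
    ("bor", "Old French", "Middle English"),
    ("inh", "Middle English", "English"),
]

print(borrowed_from(hemp))
print(borrowed_from(emperor))
```

On this model emperor counts as borrowed from Old French and never from Latin, which matches the objection raised in the opening post of this section.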

Ah; then we disagree (and I agree with what I think Wikitiki and Leasnam are saying). "Hemp" should go in "English terms derived from Scythian", not "English terms borrowed from Scythian". - -sche (discuss) 00:51, 18 March 2016 (UTC)[reply]

But it was borrowed from Scythian. —CodeCat 01:21, 18 March 2016 (UTC)[reply]

It was borrowed, yes, by English in an ancestral form, that is true. It all comes down to what we want "English terms borrowed from" to mean: 1). "English words that were borrowed" OR 2). "words that [Modern] English borrowed". As above, {{der}} does a very nice job at handling the former. Leasnam (talk) 01:37, 18 March 2016 (UTC)[reply]

Yeah I also see what you mean about 'emperor'. Also, what are the exact criteria to call a word a borrowing on here? Is it simply just a word taken from another language which is not actually a parent language to it? I'm curious because there are several hundred words in Albanian, for example, that were technically "borrowed" from Latin back in antiquity, but they're unique since they were absorbed in a more natural way from the Vulgar Latin variety spoken there as opposed to intentionally borrowed in a scholarly fashion, which is, in contrast, what happened with some later words in more recent times (especially pertaining to scientific and technical fields). That very ancient layer of loanwords from Latin, which pertains to many aspects of general life (like mjek, mik, fytyrë, fëmijë, gjymtyrë, arësye, for example), has undergone a distinct set of sound shifts within Albanian that almost mirror the developments of some actual Romance languages in some aspects, and can often be clearly distinguished from later scholarly borrowings (e.g. bibliotekë, absorboj, etc.). On the other hand, to call them "inherited" is also not right, of course, since Latin is certainly not a parent language to Albanian, despite having imparted strong influence on it. How can this be handled? This also perhaps applies in a way to a few English words that entered Old English from Latin.

I guess, in a broader sense it can apply to any language taking a term from another that it is not a descendant of. So if there's a Romanian term derived from Proto-Slavic or some unspecified Slavic language, does it have to use {{bor}}, since its core vocabulary is not from that language (depends how concerned we are about how a term was borrowed, whether intentionally as part of a language reform or if they just entered naturally over time through interaction of different languages/populations)? How are pidgin or creole languages handled, also? Word dewd544 (talk) 22:36, 18 March 2016 (UTC)[reply]

colloquial and informal labels

These two labels are very confusingly defined as the same thing in our Appendix:Glossary. There are old discussions about this problem there and here, but nobody has fixed the problem and the discussions just fizzled out.

If even people writing this dictionary have difficulties defining the difference between them, we can be sure that their simultaneous use is of absolutely no benefit to users and instead confuses and discourages them and new editors and wastes their time.

In fact, simultaneous use of these very similar terms would still remain detrimental even if we should eventually succeed in defining a clear difference. The best attempt at that I've seen is to define "colloquial" as "limited to spoken use (including written dialogue)", but even if we could agree to that, it would still be senseless to hide that distinction behind a word (colloquial) that even our glossary admits is commonly misunderstood to mean "regional" or "location"!

If we feel it is important to make or label the difference between informal everyday speech and informal everyday writing (which will soon seem silly and anachronistic to younger users and editors in this age of email, texting, and social media), we should stop using a term that most users misunderstand or don't know and instead use plain English labels such as "informal, esp. spoken" and "informal, esp. written". --Espoo (talk) 08:01, 19 March 2016 (UTC)[reply]

I think we should standardize on informal and abandon colloquial. Some background is at User talk:Dan Polansky#Colloquial vs. informal. In sum, modern dictionaries overwhelmingly seem to have switched to the "informal" label. I don't believe the wiki lexicographers will be able to maintain a useful distinction between informal and colloquial and convey the distinction to the readers. --Dan Polansky (talk) 09:12, 19 March 2016 (UTC)[reply]

The distinction may not be clear for English, but in other languages it is. In Welsh, "colloquial" means "not literary". There are any number of Welsh verb forms, for example, that are colloquial in the sense of not literary but are not necessarily informal. —Aɴɢʀ (talk) 11:15, 19 March 2016 (UTC)[reply]

Exceptions could be made without problem, but for English, at least, and probably most languages, we should probably stick with "informal." Andrew Sheedy (talk) 05:57, 20 March 2016 (UTC)[reply]

If we decided to use only "informal", we would need to have a bot periodically change uses of "colloquial" in specified languages (which is undesirable, because it will stop as soon as the owner dies or leaves, as happened with Autoformat once or twice), or else just make "colloquial" an alias of "informal" in the module that handles the labels. The latter would prevent "colloquial" being used for Welsh, but then, if our readers don't grasp the distinction, it's probably unwise to attach significance to it: a clearer label could be used instead of "colloquial" ("familiar"?). - -sche (discuss) 06:56, 20 March 2016 (UTC)[reply]

I think the above stated bot problem is largely non-existent: once you switch all colloquials to informals for English, the rate of addition of new colloquial tags will diminish to almost nothing since people very often proceed by following the existing examples, wisely so. Even if it does not diminish, we will have much better state of affairs than we have now. --Dan Polansky (talk) 08:20, 20 March 2016 (UTC)[reply]

Good point; it also occurs to me that because "colloquial" would continue to categorize, we would have a way of finding new additions (other than checking a database dump). The number of languages we would have to watch for new additions in would be large and smaller languages might go unchecked for long periods, but I guess it's also not urgent to fix new uses. - -sche (discuss) 15:23, 20 March 2016 (UTC)[reply]
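To make the two options above concrete, here is a minimal sketch in Python of what aliasing "colloquial" to "informal" could look like, with an optional per-language exemption. It is only an illustration: the names `LABEL_ALIASES`, `ALIAS_EXEMPT_LANGS`, `displayed_label` and `uses_needing_review` are hypothetical and do not come from the module that actually handles the labels, and exempting Welsh (`cy`) is an assumption based on the Welsh example given earlier in this thread.

```python
# Hypothetical sketch; none of these names come from the real label module.
LABEL_ALIASES = {"colloquial": "informal"}

# Assumption: languages where "colloquial" is kept as a distinct label.
ALIAS_EXEMPT_LANGS = {"cy"}


def displayed_label(lang_code, label):
    """Return the label that would be shown for this language."""
    if lang_code in ALIAS_EXEMPT_LANGS:
        return label
    return LABEL_ALIASES.get(label, label)


def uses_needing_review(entries, lang_codes):
    """List titles that still carry a raw "colloquial" label.

    entries is an iterable of (title, lang_code, labels) tuples, for
    example built from a category listing or a database dump.
    """
    return [
        title
        for title, lang_code, labels in entries
        if lang_code in lang_codes and "colloquial" in labels
    ]
```

With a table like this, existing uses would not need a periodic bot run, and the second function shows how new additions could still be found for the languages being watched.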

@Aɴɢʀ: By "not literary", do you mean "not appearing in writing but only in speech"? --Dan Polansky (talk) 08:22, 20 March 2016 (UTC)[reply]

@Dan Polansky Not necessarily. Magazines, newspapers, and popular fiction are often written in colloquial Welsh; unpublished writing like letters virtually always are. Modern-day introductory textbooks usually only teach the colloquial register, leaving the literary register for more advanced learners. —Aɴɢʀ (talk) 15:15, 20 March 2016 (UTC)[reply]

Why wouldn't we follow the original suggestion of Espoo and display "informal, esp. spoken" in every instance where it is used within {{lb}} or {{q}}. Do we use the term in some other way as a label?

Wouldn't this suggest ways to make our treatment of internet and texting slang more relatable to the rest of the language? Much of such usage seems to spread fairly quickly to other media. What doesn't spread are things reflective of the special constraints or capabilities of the medium, e.g., leet, number-keypad-based puns and codes, ASCII-character emoticons.

Also, is contemporary fictional dialog "writing" or "speech"? To me it seems to be speech, which is how we treat it for attestation purposes.

Is a term like "Yo!" informal, colloquial, or without the need for a label or category? DCDuring TALK 11:45, 20 March 2016 (UTC)[reply]

I don't see them being one and the same, exactly...informal to me just means "not formal" (i.e. you wouldn't use it in an interview or on a job application). Colloquial is like slang, but more general...you could say it in a formal situation and everybody knows what you mean, but it's not acceptable or "proper" English. I see informal slightly above colloquial. Just my cent. Leasnam (talk) 16:24, 20 March 2016 (UTC)[reply]

My point is that whatever distinction can be made between informal and colloquial labels, it is hard to maintain in Wiktionary entries. I have seen no English monolingual dictionary that uses both labels. I believe the variation between informal and colloquial label in en wikt does not carry signal but largely noise, an accident of who was entering the label and based on what external source. --Dan Polansky (talk) 18:57, 20 March 2016 (UTC)[reply]

I'm for keeping both labels as is. For one: Is there really a problem that demands one label to be abandoned? The reason the discussion has trickled out and been ignored until now is probably that it's a non-issue. For the other: From what I know of the language, Japanese for example draws a razor sharp line between informal and colloquial, and I'm against splintering up practices language by language without pressing need, as that makes Wiktionary more difficult to get into for beginners. And that is bad. Korn [kʰũːɘ̃n] (talk) 01:20, 21 March 2016 (UTC)[reply]

The main problem is having categories for both and having entries arbitrarily split between the two- so neither is complete. Chuck Entz (talk) 03:41, 21 March 2016 (UTC)[reply]

Right. We've seen that editors use the labels interchangeably, and we can expect them to continue to: few editors, especially newer or infrequent editors, will notice whatever arbitrary distinction someone might invent in the glossary, so the distinction will not be made in practice. Readers who notice one label on one entry and the other on a synonym may think there is some difference, especially if they consult the glossary and find that it asserts a distinction, but the absence of an actual distinction means the readers will leave misled or confused. That is bad. Splintering a label into two synonyms, which new users might expect (but be unable to find) a distinction between, also makes Wiktionary more difficult to get into for beginning editors. And the division of the entries into two categories which are synonymous in actual practice is also bad for usability. - -sche (discuss) 04:09, 21 March 2016 (UTC)[reply]

That's my thinking. --Dan Polansky (talk) 13:44, 27 March 2016 (UTC)[reply]

I'm also for keeping both--regardless of how we've arrived at two--now that we have two, let's make the most of it. We can place any (reasonable) distinction upon them as we see fit, and provide the distinction in the glossary. Leasnam (talk) 01:34, 21 March 2016 (UTC)[reply]

As for scattering words across two categories: If you think both are applicable, nothing stops you from applying both tags. But with the distinction made in other languages, it's not inconceivable that it could occur in English too and if it doesn't, I don't see the harm of always using both. Korn [kʰũːɘ̃n] (talk) 11:08, 21 March 2016 (UTC)[reply]

If it's too much detail, then delete one or merge them. Can't they both point to the same Category? Leasnam (talk) 21:14, 21 March 2016 (UTC)[reply]

Re: "I don't see the harm of always using both": Do you mean like {{lb|en|informal|colloquial}}? --Dan Polansky (talk) 13:43, 27 March 2016 (UTC)[reply]

"Please provide the title of the work" in cite web

Hi! I used Reflinks to fill in some references, for example: {{cite web|url=http://www.cpfmarketplace.com/mp/showthread.php?225659-What-does-NIB-mean-to-you |title=What does NIB mean to you? |publisher=Cpfmarketplace.com |date= |accessdate=2016-03-20}} For some reason, it displays on the page as "^ “What does NIB mean to you?”, in (Please provide the title of the work)[2], Cpfmarketplace.com, accessed 2016-03-20". I don't see any other place to put the title of the work than the title parameter in the template, which is already filled. Would someone mind taking a look at this? I'm not sure whether it's a bug or something I'm doing wrong…. Thanks! :) Goldenshimmer (talk) 06:41, 20 March 2016 (UTC)[reply]

Oh, this was on NIB. Forgot to mention that :3 Goldenshimmer (talk) 06:42, 20 March 2016 (UTC)[reply]

Looks like @Smuconlaw fixed it, and improved the page in a number of other ways too. Thank you! :) Goldenshimmer (talk) 02:01, 21 March 2016 (UTC)[reply]

No worries. For consistency, please use {{cite-web}} instead of {{cite web}}. — SMUconlaw (talk) 06:53, 21 March 2016 (UTC)[reply]

By the way, @Goldenshimmer, what is "Reflinks"? — SMUconlaw (talk) 15:30, 22 March 2016 (UTC)[reply]

Hi @Smuconlaw, since I don't know how to use all those templates, I just use <ref>http://example.org</ref> and then use Reflinks, which is an app that turns it into the correct templates… theoretically. Apparently it doesn't use the right one for Wiktionary? It can be found at 📌[2]. 😸 Goldenshimmer (talk) 16:53, 22 March 2016 (UTC)[reply]

Wiktionary doesn't have the same templates as Wikipedia, so any app that is designed for Wikipedia is not going to work. --Wiki Tiki 89 17:24, 22 March 2016 (UTC)[reply]

@Wikitiki89, I think it's supposed to work for Wiktionary, though, since it changes the title of the article when pasted to "wikt:foo". I think this may be a bug in Reflinks…. Goldenshimmer (talk) 19:31, 24 March 2016 (UTC)[reply]

It's probably not due to a bug, but to the fact that I recently modified {{cite-web}} as part of a move to streamline all the citation and quotation templates. If you know who the developer of the app is, contact him or her and request for Reflinks to be updated. — SMUconlaw (talk) 21:52, 24 March 2016 (UTC)[reply]

Category:French terms spelled with , etc

Should we create these categories, or is whatever module is adding them still in flux? DTLHS (talk) 02:49, 22 March 2016 (UTC)[reply]

@kc kennylau, CodeCat What's the update on this project? —John C5 03:10, 22 March 2016 (UTC)[reply]

Ugh no, commas aren't part of French (or English) spelling, they're punctuation. Imagine Category:English terms with commas in their punctuation, what would be the sodding point of that? Renard Migrant (talk) 12:23, 22 March 2016 (UTC)[reply]

I have added some more characters to the standardChars field. DTLHS (talk) 21:31, 23 March 2016 (UTC)[reply]

Terms derived from PIE words

I created the template {{PIE word}} as an alternative to {{PIE root}} when a word derives from a PIE word that is not currently described as being derived from a root: for instance, ὄνομα (ónoma) and ὀνομάζω (onomázō) from *h₁nómn̥. {{PIE word cat}}, like {{PIE root cat}}, serves as the category boilerplate template.

I just created the new templates by modifying {{PIE root}} and {{PIE root cat}}, and might have made some mistakes, so I'd appreciate it if someone looked over my work and checked if there are errors. Also not sure where the category Category:Terms derived from the PIE word *h₁nómn̥ should go; it's currently placed in Category:Terms derived from Proto-Indo-European roots.

Maybe {{PIE word}} and {{PIE root}} could be merged, but not quite sure how. — Eru·tuon 03:12, 22 March 2016 (UTC)[reply]

Hm, probably have to merge. Otherwise the two templates have to be stacked in ἔθω (éthō), from *swé +‎ *dʰeh₁-. Not sure how to do it, except in an inelegant way (multiple if-statements). — Eru·tuon 03:22, 22 March 2016 (UTC)[reply]
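As an illustration of how a merged template could avoid a chain of if-statements, here is a rough sketch in Python (not template or module code). It takes a list of sources, each marked as a root or a word, and builds one category name per source. The function name `pie_categories` is hypothetical, and the category naming pattern is an assumption modelled on the category names mentioned on this page, not the actual output of {{PIE root}} or {{PIE word}}.

```python
def pie_categories(lang_name, sources):
    """Build category names for a term derived from several PIE sources.

    sources is a list of (kind, form) pairs, where kind is "root" or
    "word". An empty lang_name gives the language-independent category.
    The naming pattern is an assumption, not the real template output.
    """
    prefix = f"{lang_name} terms" if lang_name else "Terms"
    categories = []
    for kind, form in sources:
        if kind not in ("root", "word"):
            raise ValueError(f"unknown source kind: {kind!r}")
        categories.append(f"Category:{prefix} derived from the PIE {kind} {form}")
    return categories


# Example for a term derived from both a PIE word and a PIE root:
cats = pie_categories("Ancient Greek", [("word", "*swé"), ("root", "*dʰeh₁-")])
```

Because the loop handles any number of sources, a single call covers the case where the two templates would otherwise have to be stacked.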

I think this sets a dangerous precedent. We probably don't want to end up tracking every derivation in every language from every word ever in existence. That would get hairy really fast. We tried to do this with English once and it became a disaster; some of the categories created back then are still around. Proto-Indo-European is perhaps still reasonable because there aren't tons of fully reconstructed words, but it wouldn't surprise me if someone then decided it would be great to do the same thing with Proto-Germanic too, and we have over 3000 Proto-Germanic lemmas... —CodeCa t 22:31, 22 March 2016 (UTC)[reply]

Yes, it could be taken to absurd lengths, but that can easily be prevented by restricting the categories to the lowest level of root or word. So a category should only be created for words like *swé, not for derivatives of such words, like *sewos. And if a verbal root for *h₁nómn̥ is decided on, then the categories relating to *h₁nómn̥ should be deleted and replaced with a category relating to the verbal root. The problem right now is that there's no way to categorize the words that derive from basic PIE words without a root like *swé and *h₁nómn̥, and that's the reason why I created the template. — Eru·tuon 23:13, 22 March 2016 (UTC)[reply]

I guess that is understandable. You should probably write this in the documentation of the template though, so that people don't use it in unintended ways. —CodeCa t 23:24, 22 March 2016 (UTC)[reply]

Done. — Eru·tuon 23:42, 22 March 2016 (UTC)[reply]

function words

([3])

How exactly should we define function words? Should we be verbose, or brief? I’d think that the most effective way to define them is to be elaborate, but it also seems like that’s not a common practice in lexicography. Still, I feel uncomfortable with the possibility that readers might be misled. One compromise would be to link to https://en.wikibooks.org/, but it seems like nobody does that. --Romanophile ♞ (contributions) 04:12, 22 March 2016 (UTC)[reply]

Common practice in lexicography is to be terse, because dictionaries were traditionally made of paper and would be too expensive if they went into detail. Since we're not paper, we do have the luxury of being verbose. —Aɴɢʀ (talk) 10:08, 22 March 2016 (UTC)[reply]

We can be terse about those function words that have as synonyms more basic function words or about the senses of function words that are more semantic and less grammatical. The more advanced learner's dictionaries are not usually terse about basic function words. Collins COBUILD and Longmans DCE are exemplary, but the modern ones all seem to follow the same pattern. OED is quite verbose to explain the evolution of the meaning and grammatical function of such words. Similarly TLF in French. I think advanced learners may be our most important audience. I find some coverage of the historical aspect of such words, which IMO require verbosity, is enormously helpful in getting outside of my own idiolect as I don't have the benefit of good understanding of other languages. DCDuring TALK 12:18, 22 March 2016 (UTC)[reply]

The dictionnaire du moyen français has a lovely full entry on quoi [4]. We should just cover as much usage as we can and make really sure that most common usage is covered in the first senses because we have to assume that most readers won't read 20 definitions when there are 20 definitions. Frankly, I don't unless I really have to. Renard Migrant (talk) 12:25, 22 March 2016 (UTC)[reply]

As we don't and perhaps can't do a very good job at, for example, all the important entries for verbs and nouns explaining how they are used with prepositions and functional adverbs, we need to have some better coverage at the preposition and functional adverb entries. Core determiners require verbose coverage as well. For all of these and for other function words like pronouns and conjunctions we can't be too terse and we need lots of usage examples that span the whole range of reasonably common or of deceptive uses. DCDuring TALK 12:44, 22 March 2016 (UTC)[reply]

We should also make sure to divide the content up properly into the definition and usage notes. Each definition line should be relatively brief, while the usage notes can explain the finer details in as many words as it takes. --Wiki Tiki 89 15:01, 22 March 2016 (UTC)[reply]

To me, the most important part of a function word’s definition is not the definition itself, but the usage examples. For example, the definitions at voor, especially the first one, are not very useful. — Ungoliant ^(falai) 15:02, 22 March 2016 (UTC)[reply]

"Lochia" as a Word of the Day?

Hi. @AK and PK recently nominated lochia as a potential Word of the Day, but I'd like to get a sense of whether anyone feels this might be too graphic for the Main Page. (The definition is: "Normal post-partum vaginal discharge; blood, mucus, and placental tissue that are discharged from a female's vagina (similar to menstruation) for several weeks after she has given birth.") — SMUconlaw (talk) 14:33, 22 March 2016 (UTC)[reply]

It wouldn't be my choice for WotD. I'd prefer we find something else. How about something topical like triacetone triperoxide? DCDuring TALK 14:40, 22 March 2016 (UTC)[reply]

How is that topical? (Also, feel free to nominate more words!) — SMUconlaw (talk) 15:28, 22 March 2016 (UTC)[reply]

It's the explosive of choice for evading the most common explosive-sniffing devices. DCDuring TALK 17:26, 22 March 2016 (UTC)[reply]

I also am not a fan, I like WotD which are non-technical as well if possible. But since I am not putting any effort into the thing my vote should not count for much. - TheDaveRoss 14:44, 22 March 2016 (UTC)[reply]

It's a terrible nomination. Not as inspiring as Boaty McBoatface --AK and PK (talk) 16:26, 22 March 2016 (UTC)[reply]

OK, removing it from the nominations list, then. Thanks. — SMUconlaw (talk) 13:50, 25 March 2016 (UTC)[reply]

I'd be willing to put some effort into selecting and editing into shape one "technical" word a week. The preference would be to find one that was topical for some reason. The word could be itself technical or one that had a topical technical definition. Examples from the recent past are Zika virus, Aedes aegypti, olinguito. The idea would be to provide a gateway to related terms and to WP and other sources, so that Wiktionary could be seen as a resource that provided some depth. DCDuring TALK 14:12, 25 March 2016 (UTC)[reply]
We could even have a technical WOTD in addition to the others. DCDuring TALK 14:14, 25 March 2016 (UTC)[reply]
Sure, why not? — SMUconlaw (talk) 19:07, 30 March 2016 (UTC)[reply]

Etymology section for non-lemmas

We don't include etymologies for non-lemmas (I think this should be an absolute rule, it's close to universal in practice anyway), but it does happen that non-lemmas occur side-by-side with lemmas on the same page, or with non-lemmas from another lemma. The practice on how to handle these seems to differ a lot. In some pages, non-lemma entries are split by the etymology of their lemma. On other pages, I've seen lemmas and non-lemmas mixed under the same etymology header, which is obviously not very sensible because different lemmas will have different exact etymologies and can only have the same etymology approximately. Entries consisting of only nonlemmas with different etymologies might split them by etymology or they might not. Things get even trickier when considering that two lemmas with the same spelling but different etymologies can share the same inflected forms. For some examples, consider leek, leken and lijken. Should leek and leken's verb forms be split by etymology to reflect the two etymologies of lijken? Should leken be split by etymology? Should this split include three different sections, all saying "plural of leek", to reflect the three different noun lemmas spelled leek that all have the same plural form? It can get very messy if we start taking lots of different lemmas and etymologies into account.

I would therefore like to use just one etymology section for all non-lemmas. It would be a special section named something like "Inflected forms" and appear always at level 3, at the same level as the other etymology sections. It would always appear as the final etymology section for a language, and only if the language section is split by etymology at all; if the section only has non-lemmas, then there's no need for a split. There's still the question of what to do with the three identical plurals for the three leek nouns: should each one have a "Noun" section with a "plural of" definition, or would a single Noun section do even though there are three nouns? —CodeCa t 22:07, 22 March 2016 (UTC)[reply]

That actually sounds like a good idea. I'm not sure whether "Inflected forms" is the best name for the section, however; it kind of sounds like it means that these are the inflected forms of the lemmas on this page. --Wiki Tiki 89 22:17, 22 March 2016 (UTC)[reply]

I'm reluctant to make absolute rules about this sort of thing. I feel like such decisions should be made on a case-by-case basis, as the details of each non-lemma in each language will be different. And we do include etymologies for non-lemmas in the case of suppletive forms and other irregular forms, e.g. do·fúaid and estir. —Aɴɢʀ (talk) 22:20, 22 March 2016 (UTC)[reply]

I see this more as a standardized option to use when it makes sense to do so, not as a requirement. --Wiki Tiki 89 22:21, 22 March 2016 (UTC)[reply]

@Angr I think that the etymologies at do·fúaid and estir should be moved to the lemmas. Otherwise, how will someone looking at those lemmas know where all the pieces come from? Nobody wants to click through an entire inflection-table worth of links to figure out the etymology of all the forms. —CodeCa t 22:24, 22 March 2016 (UTC)[reply]

@CodeCat, if I do that, then do·fúaid isn't in Category:Old Irish words prefixed with dí- and Category:Old Irish words prefixed with fo- anymore. I suppose I could put the lemma ithid in those categories, but people would probably be confused by that. Likewise, if we were to create a Category:Old Irish terms derived from the PIE root *h₁ed-, it would be misleading to put the lemma ithid in it, though do·fúaid and estir would belong in it. —Aɴɢʀ (talk) 12:55, 23 March 2016 (UTC)[reply]

Compare zijn, go, sum, ferō. They're categorised by the roots found throughout the paradigm, not just the ones from which the lemma form derives. —CodeCa t 15:52, 23 March 2016 (UTC)[reply]

(EC) I agree, but I can't really think of anything better. A name that contains "etymology" or some related form would be preferable, so that it's clear that this is a section that stands at the same level as the other etymology sections. But, by definition, such a section is not an etymological grouping, so it may be a misnomer. We're really splitting language sections into two pieces: lemmas, then non-lemmas. Lemmas would be all the level 3 etymology sections, non-lemmas are the final level 3 section. We could also opt to rename all our etymology sections to "Lemma etymology 1" and so on, and then have a final "Non-lemma etymologies" (in plural). But that may be too big a change. —CodeCa t 22:23, 22 March 2016 (UTC)[reply]

What about the name "Non-lemmas"? --Wiki Tiki 89 22:25, 22 March 2016 (UTC)[reply]

That would work too, but there's less disagreement on what an inflected form is than what a non-lemma is. Some people consider alternative forms to be non-lemmas, but I consider them lemmas because they appear in the lemma form and you'd expect them to appear in a paper dictionary (with a "see (other entry)" definition of some sort). —CodeCa t 22:27, 22 March 2016 (UTC)[reply]

There doesn't have to be agreement on the meaning of the term as long as there is agreement on the criteria for how the section can be used. I think the criterion is pretty clear: if it is a form-of entry whose etymology is given (or would be given) on the page it links to, then it can go in the "Non-lemmas" section. --Wiki Tiki 89 22:36, 22 March 2016 (UTC)[reply]

Alternative forms can and sometimes do have their own etymologies (for example, color was derived from colour by Noah Webster), so I guess by that reasoning they wouldn't go in the section. But they're a bit in between... we kind of assume their etymology is the same as that of the main lemma, unless specified otherwise. —CodeCa t 22:41, 22 March 2016 (UTC)[reply]

My point is that if it's an alternative form, it could go either way depending on the situation. If it has its own etymology section, it would not go in the "Non-lemmas" section, if we want to imply that the etymology is the same as the main lemma, then it could go in the "Non-lemmas" section, but not necessarily. It's a case-by-case thing. --Wiki Tiki 89 22:46, 22 March 2016 (UTC)[reply]

I'd rather just make a set rule that they shouldn't go in the non-lemmas section. Consider that alternative forms still use the lemma headword-line templates like {{en-noun}}, which categorise in the lemma category. —CodeCa t 23:03, 22 March 2016 (UTC)[reply]

But also consider that an orthographic alternative form could correspond to multiple etymologies, and is thus an ideal candidate for this new section. --Wiki Tiki 89 23:08, 22 March 2016 (UTC)[reply]

The practice I think I've used when creating Russian non-lemma forms is to create a separate etymology section for each corresponding lemma. So if фад has an inflection фа́де and фадь also has an inflection фа́де (or if it has фаде́, doesn't matter) then they go in separate etym sections, but if фа́да and фада́ are both inflections of фад (as is often the case) then they go into the same etym section, with separate subsections (and if e.g. фа́да can be three different inflected forms of the same lemma then there's only one subsection with three definitions listed, one per inflection). The underlying principle is as if the etymology read "Inflected form of FOO". I've tried to keep certain forms together, though, e.g. if lemma фабала can be stressed as either фаба́ла or фабала́ and they have corresponding respective inflections фаба́лы and фабалы́, then I try to keep them together, although this requires additional coding work. Benwing2 (talk) 23:46, 22 March 2016 (UTC)[reply]

Yeah that seems to be the more-or-less standard practice right now. But it creates too many etymology sections that make the page a bit messy. This is exactly what CodeCat is proposing to change. --Wiki Tiki 89 14:14, 23 March 2016 (UTC)[reply]

I don't think we need (and to be explicit, I oppose) a new "Non-lemmas" header. I've seen some entries already grouping inflected forms under one "Etymology" section that just says ===Etymology=== \n Inflected forms. or the like; I think the "Etymology" is sufficient. Also, it is sometimes desirable for a non-lemma to have its own etymology separate from that of other non-lemmas: sawed as a dialectal past tense of see (compare seent) merits an explanation separate from sawed as the past tense of saw, IMO. - -sche (discuss) 04:04, 25 March 2016 (UTC)[reply]

A big advantage of a separate header is that it's immediately obvious for bots to find, and add new entries to, or alternatively to create it and add it to the end of the entry. With a numbered etymology header it's not so clear: how does a bot determine whether it contains a lemma or nonlemma? —CodeCa t 21:44, 28 March 2016 (UTC)[reply]
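The bot argument can be illustrated with a short Python sketch. Given the wikitext of one language section, it lists the level 3 headers and appends a dedicated section at the end if none exists yet. The header name "Non-lemmas" is only the name floated in this discussion, and the function names are hypothetical; a real bot would also have to deal with interwiki links, categories and other trailing content, which this sketch ignores.

```python
import re

# Matches a wikitext header line such as "===Etymology 1===".
HEADER_RE = re.compile(r"^(=+)\s*(.*?)\s*(=+)\s*$")


def level3_headers(language_section):
    """Return the titles of all level 3 headers in a language section."""
    titles = []
    for line in language_section.splitlines():
        match = HEADER_RE.match(line)
        if match and len(match.group(1)) == 3 and len(match.group(3)) == 3:
            titles.append(match.group(2))
    return titles


def ensure_nonlemma_section(language_section, header="Non-lemmas"):
    """Append an empty dedicated section unless one is already present."""
    if header in level3_headers(language_section):
        return language_section
    return language_section.rstrip("\n") + f"\n\n==={header}===\n"
```

With a fixed header name the check is a simple string comparison. With numbered etymology headers the bot would instead have to inspect the contents of each section to decide whether it holds lemmas or non-lemmas.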

Change to terminology of {{sense}}?

It now reads of the sense “shake” or whatever instead of just putting the word in italics. When was this discussed? I'm not sure I like it, too wordy. Benwing2 (talk) 03:36, 25 March 2016 (UTC)[reply]

I see this was CodeCat. Please revert, this needs to be discussed first. Thanks. Benwing2 (talk) 03:38, 25 March 2016 (UTC)[reply]

I support the change, though perhaps there should have been discussion. It removes the confusion of wondering whether the thing in parentheses is the meaning of the antonym or the meaning of the thing the antonym is an antonym of. — Eru·tuon 03:47, 25 March 2016 (UTC)[reply]

I think that wording that clarifies what's inside the parentheses is useful, but the current wording is a bit verbose (and gets repetitive quickly). —suzukaze (t・c) 08:50, 25 March 2016 (UTC)[reply]

The tradeoff between transparency to a user who has never been to Wiktionary before and waste-of-screen-space/more-to-filter-out for almost all repeat users and many of the new users with half a brain seems clear to me in this case.

I don't think that using the word sense is helpful for the kind of new user that might not guess what we mean by our short gloss. Interestingly, MWOnline has the following as its first definition of sense: "a meaning conveyed or intended: import, signification; especially: one of a set of meanings a word or phrase may bear especially as segregated in a dictionary entry". Other dictionaries put this definition much lower. I take MW's practice as an estimation that users of a dictionary are most likely to be looking up the word sense as it is used in dictionaries and language studies. That we might need to link to the particular definition at [[sense]] to explain the label is a certain indication that the "solution" being offered is not much of a solution.

Would our users be better off with no synonyms (or other semantic relations, related terms, derived term, etc) on the definition page for L2 sections that don't fit in a typical window on a typical desktop?

Revert. DCDuring TALK 12:47, 25 March 2016 (UTC)[reply]

Evidently this edit summary from {{sense}} is intended as justification for the previously rejected change: "(Getting tired of people thinking this reflects the sense of the antonym rather than the sense that it's an antonym of. Reinstating my previous edit since nobody came up with a better solution in the past 2 years.)" DCDuring TALK 13:09, 25 March 2016 (UTC)[reply]

I note that the arguments in favor are limited to the use under Antonyms. The Antonyms header is used about 29K times, the other semantic relations about 68K (Synonyms 50K, Hypernyms 9K, Hyponyms 8K, all others 1K). More than 2K of the Antonyms headers appear on pages that also have Synonyms headers, which would help users grasp the use of the label.

An obvious solution would be to have a switch in {{sense}} that would be applied in Antonyms sections only. The wording could be specifically tuned for that use. This would seem vastly superior to the personal-annoyance-motivated change recently installed. DCDuring TALK 13:26, 25 March 2016 (UTC)[reply]
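A minimal sketch of such a switch, written in Python rather than template code, might look like the following. The parameter name `in_antonyms` and the wording "antonym of" are assumptions made for illustration; the real template has no such parameter, and the wording would need to be agreed on first.

```python
def render_sense(gloss, in_antonyms=False):
    """Render a sense label, with special wording in Antonyms sections.

    Both the in_antonyms switch and the "antonym of" wording are
    hypothetical and only illustrate the idea of a section-specific mode.
    """
    if in_antonyms:
        return f"(antonym of “{gloss}”):"
    return f"({gloss}):"
```

The default output stays short for Synonyms and the other semantic relations, and only the Antonyms use gets the longer wording.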

It also doesn't work for all those cases where the parameter used to distinguish the definition is a topic or register label associated with the definition. I'm just going to revert it for now. And it doesn't work for cases where the parameter used is a defining hypernym in the definition. It can be changed back once we've sorted out the issues, which should have been sorted out before the change. DCDuring TALK 15:09, 25 March 2016 (UTC)[reply]

If someone makes a major change without discussing it, it's always CodeCat, no need to check the edit history. Renard Migrant (talk) 16:11, 25 March 2016 (UTC)[reply]

I know. That's why I wanted to make clear the petulant motivation. DCDuring TALK 17:48, 25 March 2016 (UTC)[reply]

Thank you for the revert. --Dan Polansky (talk) 14:13, 27 March 2016 (UTC)[reply]

OK, I'm thinking something needs to be done here. I was just creating an entry for я́вственный (jávstvennyj, “clear, distinct”) and my first instinct was to put "unclear" and "indistinct" in the {{sense}} field for antonyms, rather than "clear" and "distinct". One possibility is to create a template {{antsense}} (or similar), which displays something more like what CodeCat wants. It shouldn't be too hard to use a bot to automatically convert uses of {{sense}} to the new template. Benwing2 (talk) 20:51, 1 April 2016 (UTC)[reply]
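For what it's worth, the bot conversion could be sketched roughly as below in Python. It renames {{sense}} only on lines that fall under an Antonyms header and leaves every other section alone. The template name `antsense` is the hypothetical one suggested above, the function name is made up, and the sketch assumes the header is spelled exactly "Antonyms" and that {{sense| is not written in some other form (extra spaces, redirects).

```python
import re

# Matches any wikitext header line and captures its title.
HEADER_RE = re.compile(r"^=+\s*(.*?)\s*=+\s*$")
SENSE_RE = re.compile(r"\{\{sense\|")


def convert_antonym_senses(wikitext, new_name="antsense"):
    """Rename {{sense|...}} to {{new_name|...}} inside Antonyms sections only."""
    converted = []
    in_antonyms = False
    for line in wikitext.splitlines():
        match = HEADER_RE.match(line)
        if match:
            # Any new header starts a new section, so update the flag.
            in_antonyms = match.group(1) == "Antonyms"
        elif in_antonyms:
            line = SENSE_RE.sub("{{" + new_name + "|", line)
        converted.append(line)
    return "\n".join(converted)
```

Note that the sketch drops a trailing newline, if any, and does not check whether the gloss inside the template describes the headword or the antonym; that part would still need human review.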

I suppose "antonym of", or "opposite of", might be worthwhile to have here, since as noted their absence does confuse many users. I don't like the "of the sense" that was implemented here. Equinox ◑ 13:24, 13 December 2016 (UTC)[reply]

Old Gujarati

Would the community support the creation of a code for the Old Gujarati language? Old Gujarati was spoken from the twelfth to sixteenth centuries, and is significantly different from Modern Gujarati, with an entirely different case system. It was written in Devanagari, and is the ancestor of Middle and Modern Gujarati. I think gu-old would be a sufficient code. DerekWinters (talk) 00:26, 26 March 2016 (UTC)[reply]

@DerekWinters: The normal naming convention would be something like inc-ogj. gu-old doesn't follow the triliteral code convention. I'd also be fine with this addition. —John C5 01:02, 26 March 2016 (UTC)[reply]

inc-ogj is perfect! DerekWinters (talk) 01:07, 26 March 2016 (UTC)[reply]

inc-ogu would be preferable, to match the code of the modern language. —CodeCa t 01:56, 26 March 2016 (UTC)[reply]

Ahh, that does make more sense. Then does anyone oppose inc-ogu? DerekWinters (talk) 02:01, 26 March 2016 (UTC)[reply]

@CodeCat could you possibly implement inc-ogu? DerekWinters (talk) 18:59, 26 March 2016 (UTC)[reply]

It's done. —CodeCat 19:06, 26 March 2016 (UTC)[reply]

Thank you very much. DerekWinters (talk) 19:16, 26 March 2016 (UTC)[reply]

where does the translation table go?

I always felt it looks better at the bottom of the page. What is the protocol on this? ---> Tooironic (talk) 09:57, 26 March 2016 (UTC)[reply]

WT:ELE provides that it goes after Usage notes, Quotations, all the semantic relations and derived and related terms. It also provides for it to follow Descendants and See also, on which there has been strong dissent, especially for Descendants. It precedes External links, References, Anagrams, Statistics. That's not quite the bottom of the displayed page, but it's close. DCDuring TALK 11:47, 26 March 2016 (UTC)[reply]

More Words of the Day needed!

We are running short of Words of the Day so do nominate some, either in general or for particular days of the year! — SMUconlaw (talk) 12:18, 27 March 2016 (UTC)[reply]

I assume you mean in English. There's a backlog in FWOTD. Donnanz (talk) 12:26, 27 March 2016 (UTC)[reply]

Yes, I meant English. And thanks to everyone who has been nominating words! — SMUconlaw (talk) 18:42, 1 April 2016 (UTC)[reply]

We do need more nominations for FWOTD as well. Or rather, more nominations in languages other than Latin, Classical Nahuatl, Portuguese, Spanish, French, German, Ancient Greek & others that have had more than their fair share of FWOTDs. — Ungoliant ^(falai) 18:57, 1 April 2016 (UTC)[reply]

There may be a case for splitting FWOTD between Latin and non-Latin languages, does anyone have a view on this? Donnanz (talk) 19:02, 1 April 2016 (UTC)[reply]

I don't think that's a good idea. --Wiki Tiki 89 19:16, 1 April 2016 (UTC)[reply]

That's the kind of reaction I expected. But when interesting words like German See (nominated one year ago) aren't used it makes one wonder. The requirements for quotations and pronunciation are also bugbears which don't seem to apply to English WOTD, and possibly prevent interesting terms from being nominated. Donnanz (talk) 08:40, 2 April 2016 (UTC)[reply]

The requirement for a quotation is to ensure that we don't feature a fictitious word. Also, we seem to have been ignoring it lately. Can you link to where See was nominated? --Wiki Tiki 89 13:26, 2 April 2016 (UTC)[reply]

See was nominated by an IP on 31 March 2015 [5]. Quotations aren't so much of a problem, but pronunciation can be if it's not included in a foreign-language Wiktionary. Donnanz (talk) 13:55, 2 April 2016 (UTC)[reply]

Oh. Not sure why I didn't see it before. --Wiki Tiki 89 14:33, 4 April 2016 (UTC)[reply]

What's so interesting about See? Just that it's a false friend? —Aɴɢʀ (talk) 15:27, 2 April 2016 (UTC)[reply]

It has different genders for different senses for a start. Donnanz (talk) 15:32, 2 April 2016 (UTC)[reply]

That's true. When I first started learning German, my mnemonic for the genders was that See is like a spider: the female is larger than the male. —Aɴɢʀ (talk) 15:52, 2 April 2016 (UTC)[reply]

You know, it would help if people who nominate words (or come across already nominated words) would give a reason to why it's interesting, if it's not obvious. Otherwise someone unfamiliar with the word would not see from a quick glimpse at the entry why See is interesting. --Wiki Tiki 89 14:33, 4 April 2016 (UTC)[reply]

It's a point worth bearing in mind, but probably not the sort of thing an IP would think of. It shouldn't be a requirement, like pronunciation (which should be scrapped). Donnanz (talk) 08:09, 8 April 2016 (UTC)[reply]

Definitely not a requirement. Maybe we can't expect an IP to be so thoughtful, but you did happen to notice it and why it might be interesting, and so you can add a comment to it. --Wiki Tiki 89 14:34, 8 April 2016 (UTC)[reply]

Done yesterday, but it probably won't make any difference with the present incumbent. Donnanz (talk) 11:52, 12 April 2016 (UTC)[reply]

Hittite lemmas

To put it simply, what should be the page on which lemmatic information is presented?

For substantives, the citation form in both Kloekhorst and the CHD is the root, which is spelled in Latin. This is basically the same as Sanskrit: the citation form is lūli-, the nom. sg. is lūliš, acc. sg. lūlin, etc. For verbs, CHD gives the stem: mark-, markiya-, markišta(i)-, marzai-. Kloekhorst basically does this too, except that he shows multiple forms of the stem, and also adds as a superscript the third singular ending: mārk-ⁱ / mark-, markiye/a-^zi, markištai-^zi, marzai-^zi.

Note, however, that none of these are transliterations—they are approximations of the root. The reason for this, of course, is that Hittite writing varies wildly. The genitive singular of lūli- could be spelled 𒇻𒇷𒄿𒀀𒀸 (lu-li-ya-aš), 𒇻𒌑𒇷𒀸 (lu-ú-li-aš), 𒇻𒌑𒇷𒄿𒀀𒀸 (lu-ú-li-ya-aš), etc. You could point to a stem 𒇻(𒌑)𒇷- (lu-(ú)-li-) in that case, but you can do no such thing with mark-, where the various inflected forms mārkḫi, markanzi, markēr are spelled 𒈠𒀀𒅈𒅗𒀪𒄭 (ma-a-ar-ka-aḫ-ḫi), 𒈥𒃷𒍣 (mar-kán-zi), 𒈥𒆠𒅕 (mar-ke-er) respectively—i.e. there is no spelling of mark- that does not include part of the inflectional ending.

What this means is that we can't cite the stem as a lemma form if we want the lemma form to be spelled in its native script. This leaves three options for the lemma form:

The third person singular (for verbs), and the nominative singular (for nouns), spelled in cuneiform. This raises the problem that not all verbs/nouns have an attested 3sg./nom. sg. respectively, and even when they do the spelling thereof may vary (𒈥𒆠𒅖𒋫𒄑𒍣 (mar-ki-iš-ta-iz-zi) or 𒈠𒅈𒆠𒅖𒁕𒀀𒄑𒍣 (ma-ar-ki-iš-da-a-iz-zi)? Of course, one can resolve this by taking the stem spelling more consistent with other forms of the word [in this case, the former.])
The third person singular / nominative singular, spelled in Latin. This is contrary to the ideal that a word should be spelled in its native script (of course, we would have pages in cuneiform for individual forms, but the lemma would have to be in Latin.) It also ostensibly shares the above problem of attestation, except that the spelling is effectively standardized, and so the entry would only be markištaizi. The problem of lack of attestation is less, too—if 3pl. pret. act. 𒈥𒊺𒂊𒅕 (mar-še-e-er) is attested, one may infer a 3sg. pres. act. maršēzi. It has the slight advantage of being consistent with verbs in Kloekhorst (but not with nouns, and with neither POS in the CHD.)
The 'stem', spelled in Latin. This negates the problem of attestation entirely, is consistent with Kloekhorst and the CHD, and is also neutral with respect to forms. Of course, it's also Latin and not Cuneiform, and the actual stem may vary (is it mark- or mārk-? Different forms have different stems, in accordance with IE mobile full grade.)

I know that we have few if any Hittite contributors, but I would appreciate any opinions that anyone is able to offer. —ObsequiousNewt (εἴρηκα|πεποίηκα) 20:20, 27 March 2016 (UTC)[reply]

@ObsequiousNewt: I have often wondered the same thing. My instinct is to have Latin root lemmata which have a list of all attested forms in cuneiform. That said, it seems a bit silly for complicated adjectives or nouns for which we only have a single attested form to have full Latin entries that just link to a single cuneiform page. I've also wondered what we do about Sumerian, Akkadian, and Hittite determinatives. Are they part of the lemma or not? —John C5 20:47, 27 March 2016 (UTC)[reply]

(EC) How widely is Hittite attested? If it's at least somewhere on the level of Gothic, then we probably have no problems inferring our chosen lemma form based on whatever forms are attested. As for spelling, we can standardise, like we do for many other languages already. —CodeCat 20:49, 27 March 2016 (UTC)[reply]

You could use each spelling found, and link the others under alternative forms, and have each be a full lemma page, each with its own declension/conjugation/etc. based on its individual spelling. This way, each spelling would be given full representation. There need not be one standard, when there never was a standard. And this should of course be all done in the Hittite script, with romanizations given separately. DerekWinters (talk) 22:20, 27 March 2016 (UTC)[reply]

@CodeCat: I don't know how well Gothic is attested, but I can safely say that Hittite spelling will be considerably more difficult—if not altogether impossible—to standardize in its native script. Cuneiform is syllabic, and the syllables in writing do not typically actually coincide with the syllables in pronunciation. It is sometimes possible to infer a lemma form (e.g. nom. 𒆜𒀸 (KAŠKAL-aš) can be inferred from acc. 𒆜𒀭 (KAŠKAL-an)), but it is more difficult to infer, as above, the form of maršēzi—even if it were attested it might vary between any number of forms (mar-še-e-zi? ma-ar-še-e-zi? mar-še-ez-zi? mar-še-e-ez-zi?)

In terms of what the page should look like, I feel like 𒆜𒀸 is a good example: show all attested spellings of each Latin form, and mark the Latin form with an asterisk if no such spelling is attested. The question is what the page should actually be: 𒆜𒀸, or palšaš (or palsas?), or palša-. (𒆜- would work in this case, but can hardly be extended to other words.) DerekWinters' suggestion is possible, but I would imagine it to be more useful to put lemma information on one page per stem. —ObsequiousNewt (εἴρηκα|πεποίηκα) 16:18, 30 March 2016 (UTC)[reply]

@ObsequiousNewt I don't think we have the right to standardize a language that has never been standardized. Take for example, 𤠅. If you check the entries under the alternative forms, you will find them to be the same. This is what my earlier proposal had been. If each attested form of maršēzi for example were to have different conjugations or declensions, then those would belong under the appropriate form, but the other information should all be the same. In effect, we're saying that each form is correct, which is true (unless one is actually wrong), because the language was never standardized. DerekWinters (talk) 21:43, 30 March 2016 (UTC)[reply]

@DerekWinters: I think this makes sense, except for the problem of inflections. 𤠅 doesn't inflect, and colours can (and does) just say "3sg of colour", because inflections are consistent across English dialects—but in Hittite you can have the only attested forms be acc. sg. 𒇷𒆷𒀭 (le-la-an) and gen. sg. 𒇷𒂊𒆷𒀸 (le-e-la-aš), and there may not even be a nominative. It's as if the only attested forms of the verb meaning "to give sth. a distinct hue" were colors and colouring. It'd be possible to give each of these forms a common declension table as well as the same etymology/derived terms/whatever, but it also raises the problem of what form—if there is no lemma—to cite in those sections. I've made a page for one of the four attested forms of the stem lila-, and, as usual, the Luwian citation as well as the derived causative verb have no attested 3sg present forms.

(There's also the pervasive question of what tr= should be used for: is the tr= of 𒆜𒀸 KASKAL-aš or palšaš?) —ObsequiousNewt (εἴρηκα|πεποίηκα) 16:37, 1 April 2016 (UTC)[reply]

@ObsequiousNewt: Hmmm, that is problematic. I know with Phrygian lemmas on here, there are terms like αββερετ which are in the third person singular present active. That could be a potential solution. Otherwise, another one could be creating something like Reconstruction:Taíno/bohi for the nominative of the attested forms, giving the attested forms in the declension of the reconstruction. Also, the attested, non-nominative forms could be given first as the main lemmas with a declension table full of reconstructed terms, similarly to the lemma on your page. Also, for the tr=, I would say that the pronunciation (palšaš) be given for languages like Hittite, Akkadian, etc. There should also be Sumerogram, Akkadogram, etc.-noting functions like smg=, akg=, etc. that would show Sumerogram = KASKAL-aš. DerekWinters (talk) 01:57, 2 April 2016 (UTC)[reply]

Using the third-person singular present active is certainly a good lemma, and in fact is the one Kloekhorst partially uses, but, as I have said, not all verbs have an attested 3sg. pres. act., and it is not always possible to infer it in the native script (although it is usually possible to infer it in the Latin script.) The idea of using Latin roots (or inferred nom. sg./3sg. pres. act.) (whether in the Reconstruction namespace or otherwise) is the other solution I have proposed, and may honestly be the only option, given the obvious problems with having a cuneiform lemma.

With respect to your comments on transliteration—how would these parameters be displayed? —ObsequiousNewt (εἴρηκα|πεποίηκα) 00:49, 3 April 2016 (UTC)[reply]

@ObsequiousNewt: It may be inferable from the Latin script, but we could do the inferring ourselves, yet still make the lemmas in the cuneiform. And it is always possible (and sometimes practical) to create a romanization entry of the lemma. And for terms that are found attested only in their allative, for example, then a lemma entry could be created of the attestation, and we could give it a declension/conjugation table with all other forms filled in with the corresponding reconstructions. For my transliteration suggestion, it may look something like this (but feel free to change it as much as you'd like): 𒆜𒀸 (palšaš) (Sumerogram KASKAL-aš) DerekWinters (talk) 01:34, 3 April 2016 (UTC)[reply]

If it's possible to infer a Latin-script lemma form of a word, it should be possible to convert that form to cuneiform (using tables like w:Cuneiform script#Syllabary). The result might be a 'normalized'/'idealized' cuneiform spelling, but we already normalize some old languages like Old Norse (hljóð vs hliod), and I don't see how using a reconstructed cuneiform-script-form would be any less desirable than using a reconstructed foreign-script form. That doesn't address the question of which form to make the lemma, but it means there's no reason for the lemma form to be in Latin script rather than cuneiform. - -sche (discuss) 02:20, 3 April 2016 (UTC)[reply]

@-sche, DerekWinters: It may be possible to standardize Hittite spelling, as CVC syllables are typically written CV-VC (as well as CVC, if such a sign exists), and long vowels can be notated with plene spelling. However, there are a few problems, namely labiovelars (which can be spelled ku/ḫu or uḫ/uk) and clusters of three or more consonants (e.g. parḫzi = pár-aḫ-zi or pár-ḫa-zi). Additionally, the actual spelling of words would vary wildly from such a standard, and it would not be in accordance with any academic standard. —ObsequiousNewt (εἴρηκα|πεποίηκα) 16:08, 5 April 2016 (UTC)[reply]

I think if we're going to use romanizations that are not direct representations of the individual signs, then there should be some kind of reasoning for why the signs are interpreted the way they are. For example, why is pár-aḫ-zi taken to stand for parḫzi and not, say, paraḫzi? —CodeCat 17:57, 5 April 2016 (UTC)[reply]

I think paraḫzi would have to be spelled pa-ra-aḫ-zi. AFAIK an apparent C.V syllable break shows that the vowel is purely orthographic. —Aɴɢʀ (talk) 18:19, 5 April 2016 (UTC)[reply]

Could these rules perhaps be detailed at WT:About Hittite? —CodeCat 18:22, 5 April 2016 (UTC)[reply]

@ObsequiousNewt DerekWinters (talk) 19:31, 6 April 2016 (UTC)[reply]

@CodeCat, Angr: The word parḫzi is spelled either pár-aḫ-zi or pár-ḫa-zi, because the /ḫ/ is part of a consonant cluster and cannot be represented using a neighboring vowel. It is never represented as pa-ra-aḫ-zi, because the word is not paraḫzi. There is no vowel between /r/ and /ḫ/; the form pár-ḫa-zi proves this; and there is no vowel between /ḫ/ and /z/; the form pár-aḫ-zi proves this. —ObsequiousNewt (εἴρηκα|πεποίηκα) 22:04, 6 April 2016 (UTC)[reply]

User:Liliana-60's de.wikipedia troubles.

Someone stopped by IRC this morning to try and get the word out that Liliana-60 had been de-sysoped and blocked on de.wikipedia. There is some discussion here in German and here in English. I don't read German, but the English one has a lot of accusation and not a lot of actual evidence (the offending edit has apparently been hidden by WMF). I don't have a suggestion, I am just passing along someone's concerns. - TheDaveRoss 11:18, 28 March 2016 (UTC)[reply]

It's just a little dispute with someone who loves to terrorize me by reporting me to law enforcement for crimes that never even happened and trying to get me involuntarily committed. And of course he acts all innocent afterwards. Yeah. -- Liliana • 11:21, 28 March 2016 (UTC)[reply]

Be sure that the evidence is horrible enough to have it suppressed and the global sysop rights immediately removed (no dewiki rights were changed). Best, DerHexer (talk) 11:30, 28 March 2016 (UTC)[reply]

You're biased and you know that the GS removal was abusive. -- Liliana • 11:31, 28 March 2016 (UTC)[reply]

I am not a regular here so I will add only a single and short comment regarding Liliana-60: The user has been indefinitely blocked on both German Wikipedia and Wikimedia Commons for making a death-threat. The threat has been removed and oversighted by the WMF. A part of the userpage on dewiki has been revision deleted because it contained very strange stuff. She is currently temporarily blocked at metawiki for personal attacks; taking a look at the German Wikipedia block log is worthwhile as well. I think the whole situation speaks for itself, and Liliana's behavior speaks for itself as well. Of course, it is up to the local community here to judge if Liliana can keep the sysop tools. I hope this short summary of the case helps. Best --Steinsplitter (talk) 11:45, 28 March 2016 (UTC)[reply]

Not you again. I've told you a hundred times that 1. there was never a death threat, 2. there is no consensus for either of the blocks on Commons and Meta (the latter of which was obviously just done to shut me up) and 3. these matters are totally irrelevant for Wiktionary. -- Liliana • 11:47, 28 March 2016 (UTC)[reply]

The first question for me is whether DerHexer is able and willing to provide any evidence. The second question is which are the edits that were such that they had to be hidden, that is, what is the diff number and on what wiki, and what user did the hiding. --Dan Polansky (talk) 11:58, 28 March 2016 (UTC)[reply]

He hasn't even provided any evidence on Meta and after I asked for evidence I was promptly blocked. lol. -- Liliana • 12:02, 28 March 2016 (UTC)[reply]

Do you know which edits (diffs) of yours he has hidden as problematic? --Dan Polansky (talk) 12:05, 28 March 2016 (UTC)[reply]

It's here, regrettably you can't link directly to oversighted revisions. -- Liliana • 12:07, 28 March 2016 (UTC)[reply]

Thanks. The last edit at Commons:User talk:Nightflyer is 27 March 2016 Kalliope (WMF): removed threat of harm. And your hidden edit of Commons:User talk:Nightflyer is 27 March 2016 Liliana-60: Ich werde mich von dir nicht einschüchtern lassen.: new section. User:Nightflyer user page contains German text. --Dan Polansky (talk) 12:11, 28 March 2016 (UTC)[reply]

Note that a threat of harm is not a death threat, but most users so far seem to ignore the distinction. Hmm. -- Liliana • 12:12, 28 March 2016 (UTC)[reply]

Your user Liliana-60 was indefed on Commons on 27 March 2016 by Túrelio, a native German speaker, with the block summary "death-threat (already oversighted) against Nightflyer , user has been indef-blocked on :de and been globally emergency-de-admined by DerHexer. (for details: https://de.wikipedia.org/wiki/Wikipedia:Sperrpr%C3%BCfung#Benutzer:Liliana-60_.28erl..29.".

Did you, by your assessment, make a threat of harm? --Dan Polansky (talk) 12:18, 28 March 2016 (UTC)[reply]

I just told him that if he doesn't stop bothering me his family will fall into great misfortune. This is wording that you'll get from any fortune teller if you let your future be predicted, but for some reason (I wonder why?) it's been massively misinterpreted. -- Liliana • 12:23, 28 March 2016 (UTC)[reply]

(ec) Yes, she did, a very strong threat of harm which can be interpreted as serious damage to people mentioned (“wouldn't it be too bad if anything would happen to these; a tragic misfortune~accident/dangerous situation can happen”) when she ambushes his family at a real WikiConference trying to find out where they live in order to complete her rage. We have a zero tolerance policy for real life threats and cannot entrust anyone with powerful rights who just went nuts like that (and not for the very first time fyi). DerHexer (talk) 12:32, 28 March 2016 (UTC)[reply]

omg banned for witchcraft! Anyway, dunno why we need their drama here since you haven't said anything controversial on en.wikt. Equinox ◑ 12:26, 28 March 2016 (UTC)[reply]

Actually, Liliana has said such things here in the past (diff). --Wiki Tiki 89 15:08, 28 March 2016 (UTC)[reply]

The past is the past for a reason. And while it might not be okay to reveal CodeCat's place of residence (as I have to admit) there was nothing seriously threatening there. -- Liliana • 15:11, 28 March 2016 (UTC)[reply]

You don't think a "surprise visit" is threatening? --Wiki Tiki 89 15:41, 28 March 2016 (UTC)[reply]

Not at all, I read it in a more jocular way, you know? -- Liliana • 15:46, 28 March 2016 (UTC)[reply]

Well you have to realize that the things that you write will not always be interpreted in the way you intend them. --Wiki Tiki 89 15:57, 28 March 2016 (UTC)[reply]

It really feels like a witch hunt by now. <_< -- Liliana • 12:35, 28 March 2016 (UTC)[reply]

(ec) Please don't make fun of this very serious real-life threat which included stalking as well as violence. If she either does not want to remember or cannot remember what horrible thing she wrote, it's her very problem. DerHexer (talk) 12:36, 28 March 2016 (UTC)[reply]

You're far too short on evidence. Talk is cheap. --Dan Polansky (talk) 12:48, 28 March 2016 (UTC)[reply]

Okay, you are kidding me (or else did not read my statement above). Please contact WMF which suppressed the revisions with very good reason if you don't believe my description (but I am unsure whether you will trust them either). More I cannot tell you without breaching security and I'm a bit afraid that some things I revealed here will also be suppressed due to the seriousness mentioned. Best, DerHexer (talk) 12:52, 28 March 2016 (UTC)[reply]

Your opinions and reports cannot replace evidence. If the evidence is hidden, then it is hidden and the case is closed. The fact that you want your reports to replace evidence is troubling. --Dan Polansky (talk) 13:02, 28 March 2016 (UTC)[reply]

My last comment in this regard: The evidence is her admission that she made this very comment (although she plays it down very much, imnsho), the removal and suppression by the Wikimedia Foundation (see link to Commons above), and the report of its correctness and seriousness by plenty of stewards who still can look into this comment and are very indignant about what happened (see link to my talk page on Meta). Of course, you can on the other hand also trust her claim, which is backed by nothing but her very own word. Everybody has to find his or her very own truth. Best, DerHexer (talk) 13:12, 28 March 2016 (UTC)[reply]

It's all serious business people! Haven't you understood? (And the claim that every steward can read the comment is pointless because how many of our stewards speak German, exactly?) -- Liliana • 13:15, 28 March 2016 (UTC)[reply]

Did you or did you not write on Commons that you intend to ambush him at a wiki conference, trying to find out where he lives with his family, foreseeing a tragic misfortune, and regretting that it would be too bad if anything would happen to these (a sentence which is used in German to foreshadow a serious harm right up to death, FYI)? DerHexer (talk) 13:28, 28 March 2016 (UTC)[reply]

The third statement I confirmed above. The second is also true but it's kinda part of a conference that people exchange private information including address data, if you don't want that you wouldn't attend, right? The first statement is false. The fourth statement is a misinterpretation as I stated. -- Liliana • 13:34, 28 March 2016 (UTC)[reply]

Right, you said, you just wanted to ambush him, following him from the conference to his home for getting to know where he and his family lives (who have to expect tragic misfortune or anything unregrettable when he doesn't stop contacting you). How does that make anything better? Is that what you wrote or not? DerHexer (talk) 13:43, 28 March 2016 (UTC)[reply]

You're implying that there's something inherently wrong with visiting people. I find that amusing. -- Liliana • 13:46, 28 March 2016 (UTC)[reply]

You did write you want to follow him home (not for a visit), to his family, for which you see a tragic misfortune and anything unregrettably when he does not stop contacting you. That must be read as stalking as well as doing harm to people even if you try to be as ambiguous as possible for not getting immediately locked. Get your wordings right and neither stalk people nor threaten them. That is not accepted on these wikis according to our terms of use. And stop playing that down. DerHexer (talk) 13:57, 28 March 2016 (UTC)[reply]

So you're implying trying to get a fellow Wikipedian involuntarily committed in malicious intent is allowed? Because that's what he did to me. -- Liliana • 13:59, 28 March 2016 (UTC)[reply]

@DerHexer This still boils down to you wanting Liliana punished here for something that happened elsewhere. That does not seem proper IMO, and frankly seems a textbook example of disruption: you've come here, to a project where you were once blocked for four years, brought a problem from another project to here, and wasted a lot of everybody's time dealing with it. Purple backpack89 15:17, 28 March 2016 (UTC)[reply]

Please don't mix things up. It was not me who brought that up here, I just explained myself when I was asked to do so (and my actions did not affect at all enwiktionary nor did I request any actions to be taken here, see my talk page for further information). The block derives from a vandal account, an imposter, that took my name back in 2007, and was later renamed so that I could usurp the name, see the full log. Cheers, DerHexer (talk) 16:14, 28 March 2016 (UTC)[reply]

┌────────────────────────────────────────────────────────────────────────────────────────────────────┘

The behavior that Liliana has admitted to is undesirable. I hope that it won't occur here. DCDuring TALK 14:13, 28 March 2016 (UTC)[reply]
Honestly, who bloody cares? German Wiktionary isn't English Wiktionary. Unless commensurate behavior occurs here, I see no reason to lift an eyebrow. Purple backpack89 14:43, 28 March 2016 (UTC)[reply]
- Exactly. No need for torches and pitchforks. --87.63.114.210 14:56, 28 March 2016 (UTC)[reply]
The same person will be a very different user on different wikis. No need to worry too much about what an English Wiktionary user did in a German community dispute. Nemo 15:38, 28 March 2016 (UTC)[reply]
I have to agree, I'm not seeing any relevance and my experience tells me that blocks and de-sysopsing aren't always for the reasons they claim to be. Renard Migrant (talk) 15:47, 28 March 2016 (UTC)[reply]
In the time it took me to read through this, people said what I think: what Liliana does elsewhere is irrelevant to her value as a member of Wiktionary (en), which is what her status here should be judged by. As for what happens elsewhere, we know little of the context, and we are not harmed by it. I would much prefer that the Germans not bring their quarrel here. Korn [kʰũːɘ̃n] (talk) 16:02, 28 March 2016 (UTC)[reply]
I agree with that statement, but as I mentioned above, it is important to remember that Liliana has done similar things on the English Wiktionary as well (and served blocked time for it). --Wiki Tiki 89 16:43, 28 March 2016 (UTC)[reply]
I agree with Wikitiki. Liliana doesn't exactly have a clean record here either, but at the moment she's done nothing here at en-wikt (that I know of) to warrant a block or ban. Sanctions imposed at other projects are none of our concern, unless it's decided that her actions warrant a global block, in which case it's out of our hands. —Aɴɢʀ (talk) 16:55, 28 March 2016 (UTC)[reply]
I agree. My impression of the German Wiktionary [and Wikipedia] situation is that there's been a long, tangled history of disputes that would have to be understood as context for what people have been saying there, and without that context we shouldn't be relating it to her actions here. In the past, Liliana said some things she shouldn't have here, and we dealt with it. That should be the end of it here. Chuck Entz (talk) 20:47, 28 March 2016 (UTC)[reply]
I had a look at the block log at German Wikipedia and one of the block summaries contains a reference to a horrible diff from 2015 which is not hidden. I wanted to post the diff here but then I hesitated. I must say that after reading the diff, I find the actions of German admins very understandable. --Dan Polansky (talk) 18:05, 28 March 2016 (UTC)[reply]
@Dan Polansky: Can you link the diff? --Wiki Tiki 89 18:20, 28 March 2016 (UTC)[reply]
I really do not know whether I should. It's a bit too strong material to my taste. But if multiple people think it is a good idea anyway, I probably will. I think, let people sooner think that I am misreporting than think that Liliana has actually posted the material. --Dan Polansky (talk) 18:30, 28 March 2016 (UTC)[reply]

I don't understand what you're talking about either, I must admit. There was one dispute that was revision-deleted and nothing else comes to my mind. -- Liliana • 18:43, 28 March 2016 (UTC)[reply]

I found the diff too, but I'm not going to link to it because I don't see the relevance to this project. As long as she doesn't say anything like that here, there's no reason to ban her from here. —Aɴɢʀ (talk) 19:43, 28 March 2016 (UTC)[reply]

Hi, maybe some of you can take a look at this discussion? There are some people from dewiki who think that your community is unable to decide which sysops you want to have. I mean, I'm from dewiki too, but I don't like people who say that they are able to decide for your community but are not members of it. Greetings, Luke081515 (talk) 14:34, 29 March 2016 (UTC)[reply]

I commented there. I agree with the general consensus clearly expressed above that what happens on de.wiktionary is distinct from what happens on en.wiktionary for purposes of managing user rights. bd2412 T 16:13, 29 March 2016 (UTC)[reply]
General note, it's de.wikipedia, not de.wiktionary. Greetings, Luke081515 (talk) 17:00, 29 March 2016 (UTC)[reply]
The same would apply for any de. Wikimedia project. bd2412 T 17:11, 29 March 2016 (UTC)[reply]
I think in general it is bad if members of one community say "we know what's good for you" to another community. If there is a community, they can decide it; that's my opinion. No matter if it's de or other... Greetings, Luke081515 (talk) 17:48, 29 March 2016 (UTC)[reply]
Amen to that, @Luke081515 Purple backpack89 18:31, 29 March 2016 (UTC)[reply]
I agree that we can decide things for ourselves, but it is always better to be informed. So I don't think it was a bad idea to start this discussion. --Wiki Tiki 89 18:38, 29 March 2016 (UTC)[reply]

As an outsider, I find some of the opinions expressed here a bit shocking. As an administrator, Liliana-60 has access to deleted content on this wiki, which could contain sensitive information about other editors in real life. (Yes, that is usually oversighted, but especially since this wiki has no local oversighters, it is very likely that stuff could be missed). Do you want someone who has made threats of harm in real life on another wiki to have access to deleted content on English Wiktionary? It is possible that WMF may also choose to remove rights in such a case as this too, see m:Requests for comment/Privacy violation by TBloemink and JurgenNL. --Rs chen 7754 01:56, 30 March 2016 (UTC)[reply]
- Is there any evidence whatsoever that she has ever actually used her admin bit to do such a thing? bd2412 T 03:34, 30 March 2016 (UTC)[reply]
  - That's not what I am asking. What I am asking is, do you trust her to not do such a thing, because with privacy-related things once the damage is done, it cannot be undone, unlike say, a bad block. --Rs chen 7754 03:37, 30 March 2016 (UTC)[reply]
    - Shall I draft a vote defining our interwiki extradition policy and should it supplant the extraordinary rendition program de.wiki seems to think we have? —John C5 03:56, 30 March 2016 (UTC)[reply]
      - This here is not a vote of trust for Liliana. Several people have noted there was trouble in the past. This is simply a statement by en.Wiktionary that we prefer not to handle our users based on outside events. It is good that we were informed, it was necessary for us to consider this issue, but we had a look at what we were given and came to the conclusion that we see no urgent need to immediately strip Liliana. If you have seen more than we and think there is such a great importance to having her be an admin nowhere, kill her rights globally. But Angr noted: If you do that, it's out of our hands anyway and then there is no reason to try pushing us towards an opinion. I must say that this continuing uncouth coercion is not giving me the best impression of de.Wiki. Korn [kʰũːɘ̃n] (talk) 10:20, 30 March 2016 (UTC)[reply]
        Another thing which gets to me is that we are meant to trust the judgment of people we don't know over those we do. If you want us to consider what was said somewhere else then don't hide what was said. I am not sure why threatening material must be hidden, anyway, unless it contains personally identifiable information. - TheDaveRoss 13:24, 30 March 2016 (UTC)[reply]
        It appears that WMF were the ones who hid the comment, not the community. And for what it's worth, I'm not active on de.wikipedia; I don't even speak German. I comment here as a concerned Wikimedian. And of course you are welcome to leave her sysop rights (as long as WMF doesn't do anything to the contrary), but in my opinion it doesn't reflect well on the English Wiktionary or on Wikimedia in general to do so. (Also, if for some reason you don't trust the current stewards, you are more than welcome to participate in the steward elections that happen every year). --Rs chen 7754 18:21, 30 March 2016 (UTC)[reply]
        It is not trust in the stewards, it is a blind assumption that their judgment is the same as ours would be. I trust that the stewards acted in what they thought was the correct manner, I don't have to take the same action as they did without any evidence but their word. I think you are vastly overestimating the possible effect this could have on the WMF's (or Wiktionary's) credibility or image. - TheDaveRoss 18:43, 30 March 2016 (UTC)[reply]
Ugh. Reminds me of an obnoxious occurrence on one of the coding sites (was it GitHub?) where a contributor was banned because of opinions he had expressed on a personal Twitter page. Equinox ◑ 13:12, 30 March 2016 (UTC)[reply]

Update: Liliana-60 was apparently globally banned by WMF. [6] --Rs chen 7754 00:13, 23 April 2016 (UTC)[reply]

TTS dictionaries from Wiktionary IPA representations

It was suggested that the following

Wiktionary:Grease_pit/2016/March#TTS_dictionaries_from_Wiktionary_IPA_representations

might get more interest if also posted here. ShakespeareFan00 (talk) 19:12, 28 March 2016 (UTC)[reply]

R:Derksen 2008 vs. R:sla:Derksen 2008

I prefer Template:R:Derksen 2008 over Template:R:sla:Derksen 2008. The added prefix does not add any value, IMHO, and is non-minimalistic. What do you think? --Dan Polansky (talk) 19:16, 29 March 2016 (UTC)[reply]

I agree. Naming conflicts should be minimal, and also the same reference is often used for multiple languages. --Wiki Tiki 89 19:19, 29 March 2016 (UTC)[reply]

Indeed. Hard categorization should handle the language-related scope delineation. DCDuring TALK 20:19, 29 March 2016 (UTC)[reply]

I can only assume the language code is there for disambiguation, as what else does it add that's not contained in the information provided by the template? How many of these templates even need disambiguation because two names are the same? Only use when needed (which I suspect is zero times). Renard Migrant (talk) 11:47, 30 March 2016 (UTC)[reply]

Here's a list of all the reference templates with language prefixes. --Wiki Tiki 89 12:00, 30 March 2016 (UTC)[reply]

There is no doubt that many reference templates use a language affix. However, many reference templates do not use an affix. Hence, a discussion is needed. I do not want e.g. template:R:PSJC to be moved to template:R:cs:PSJC; this could easily happen in yet another undiscussed volume change. R:Derksen 2008 was created without the affix and it should remain without affix unless consensus swings in the other direction, IMHO. --Dan Polansky (talk) 12:14, 30 March 2016 (UTC)[reply]

The reason I linked to it was not to show the number of them, but to give a sense of how much name collision there would be if we were to remove the prefixes. There wouldn't be very much, but there would be some. --Wiki Tiki 89 12:21, 30 March 2016 (UTC)[reply]

I see, I did not realize that. What are some of the name collisions? Can they be fixed by adding the year to the name? --Dan Polansky (talk) 12:27, 30 March 2016 (UTC)[reply]

Adding the date conveys useful information and follows the common convention in scholarly works. Do we discriminate by the language in which the reference is written? What about multilingual dictionaries? DCDuring TALK 12:49, 30 March 2016 (UTC)[reply]

One collision is {{R:hy:GB}} and {{R:xcl:GB}} (which actually are, ironically, different sources). --Wiki Tiki 89 13:43, 30 March 2016 (UTC)[reply]
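For anyone who wants to check how many collisions dropping the prefixes would cause, here is a minimal sketch. It assumes template names have the shape `R:<langcode>:<title>` or `R:<title>` and that titles contain no colon of their own; the function name `find_collisions` is made up for illustration, and the sample list uses only template names mentioned in this thread.

```python
from collections import defaultdict

def find_collisions(template_names):
    """Group reference-template names by what is left once a language-code prefix is dropped."""
    groups = defaultdict(list)
    for name in template_names:
        parts = name.split(":")
        # "R:hy:GB" -> ["R", "hy", "GB"]; "R:PSJC" -> ["R", "PSJC"]
        if len(parts) == 3 and parts[0] == "R":
            bare = f"R:{parts[2]}"  # assumed shape: R:<langcode>:<title>
        else:
            bare = name  # already unprefixed (or an unexpected shape): keep as-is
        groups[bare].append(name)
    # Only bare names claimed by more than one template are collisions.
    return {bare: names for bare, names in groups.items() if len(names) > 1}

names = ["R:hy:GB", "R:xcl:GB", "R:sla:Derksen 2008", "R:my:MED", "R:PSJC"]
collisions = find_collisions(names)
print(collisions)
```

Run against the full list of prefixed reference templates, the same grouping would show exactly which names need a year or other disambiguator.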

Thanks. This could be easily resolved in multiple ways. "R:xcl:GB" could be renamed to "R:GB 2000" while "R:hy:GB" to "R:GB 1910". But since I can't figure out what GB in {{R:hy:GB}} stands for, that could instead be renamed to R:NBHL to match the work title. Note that I am not proposing to perform the renaming and annoy the template creator, in both cases Vahagn Petrosyan; I am merely showing how straightforward it is to remove naming conflicts. --Dan Polansky (talk) 13:56, 30 March 2016 (UTC)[reply]

Yeah of course. I was just responding to Renard Migrant's claim that there would be zero conflicts. --Wiki Tiki 89 14:27, 30 March 2016 (UTC)[reply]

Alrighty. Thanks for the example; it gave me the opportunity to expound ;). --Dan Polansky (talk) 14:35, 30 March 2016 (UTC)[reply]

GB is a traditional abbreviation for referring to both. I do not want to change the names. --Vahag (talk) 18:25, 1 April 2016 (UTC)[reply]

If it comes to that, {{R:my:MED}} was created with the prefix, and I don't want it moved to {{R:MED}}, which could also "happen in yet another undiscussed volume change". —Aɴɢʀ (talk) 19:10, 30 March 2016 (UTC)[reply]

Perhaps we should create {{R:MED}} for Middle English and keep {{R:my:MED}} where it is. I don't understand how these URLs work or I'd do it myself; laurer for example although I've been unable to work out the URL for the home page or how to get from one entry to another without just Googling it and getting an entirely new URL. Renard Migrant (talk) 19:33, 30 March 2016 (UTC)[reply]

I prefer names with the langcode of the primary language it is used for. Why? Well, because of (1) less probability of name collisions and (2) autocomplete. You are more likely to know the langcode of the language you are working in than the name of the reference given by the template author (which oftentimes gets really cryptic, like {{R:nan:thcwd}}). --Dixtosa (talk) 15:32, 1 April 2016 (UTC)[reply]

I too prefer including the language codes. When I forget the name of the template I start typing Template:R:xx: in the Search box and rely on autocomplete. --Vahag (talk) 18:25, 1 April 2016 (UTC)[reply]
If these were redirects then the autocomplete would still work, right? Renard Migrant (talk) 19:00, 1 April 2016 (UTC)[reply]

I would prefer including a language code for a reference, whenever the language that the source treats can be unambiguously specified; but in the case of comparative resources like OP's example of {{R:Derksen 2008}}, a family code seems superfluous (except possibly for disambiguation). --Tropylium (talk) 15:52, 22 April 2016 (UTC)[reply]

Removing word frequency statistics from the Project Gutenberg from the mainspace

Someone proposed to remove statistics from the Project Gutenberg from the mainspace. They did so via Wiktionary:Requests_for_deletion/Others#Template:en-rank. I think a Beer parlour discussion is in order; this is not so much about removal of a template but rather about a removal of a class of information. Template {{en-rank}} was originally at {{rank}}.

An example of how the statistics currently look in a word entry:

Statistics

Most common English words before 1923: does · Gutenberg · best · #245: word · light · felt · since

--Dan Polansky (talk) 09:50, 30 March 2016 (UTC)[reply]

It would be nice if we had such token-frequency statistics for current usage in books, news, scholarly articles, and the web, all separately, really nice if we had something that was for lemmas. It would be even nicer if we had an approximate indication of frequency of definitions in current use. Others do. We could. The PG statistics on word tokens seem lame and useless. DCDuring TALK 11:01, 30 March 2016 (UTC)[reply]

I agree with removing PG stats and considering more meaningful frequency stats with which to replace them. I wonder if the best measure is not a rank but rather a classification, so a word might be extremely rare in the 1860s and become common by the 1960s, only to fade again to relatively rare in the 2000s. How to measure and define those is still an open question. - TheDaveRoss 16:21, 30 March 2016 (UTC)[reply]

I agree with their lameness and would like to see them deleted. We tried to have these deleted a few years ago, I think it was when I was an admin. It wasn't successful, anyway, and I stopped caring soon after. --AK and PK (talk) 17:02, 30 March 2016 (UTC)[reply]

Hmm, it may have been Keene who actually wanted to get rid of it. Blimey, that was 9 years ago, perhaps 250 accounts ago! --AK and PK (talk) 17:04, 30 March 2016 (UTC)[reply]

How about replacing these with statistics with some of our own compilation based on use in Wiktionary definitions, definitions+citations, definitions+citations+discussion, WP articles, WP article+discussions, or all combined. This would probably give reasonable results for many classes of words. In addition we could even go so far as to use our native search to determine which etymologies and/or PoSes (for homonyms) and which definitions (for polysemic words) were the most common, based on something more than idiolect-based opinion. This would get us a reasonable sample of current usage, more or less formal if we rely on articles, entries, and citations, less formal if we rely on discussions. DCDuring TALK 12:43, 1 April 2016 (UTC)[reply]

Regardless of where the statistics come from, I think it's silly to link to the "nearby" words in the list. Better to just give the position in the list and link to the list. --Wiki Tiki 89 12:45, 1 April 2016 (UTC)[reply]

I think the "nearby" words are useful for giving context to the ranking. —suzukaze (t・c) 12:53, 1 April 2016 (UTC)[reply]

A visible, but inconspicuous indication of relative frequency, such as the OED's dots, is useful, especially for individual definitions. Whether we require users to click through to another page for comparative frequency or have some limited on-page comparative frequency is secondary, IMO. To me the question is whether statistics on token, lemma, etc, and definition frequency based on Mediawiki project data is good enough and whether any other source is likely to be available whose use would not be subject to copyright. DCDuring TALK 14:23, 1 April 2016 (UTC)[reply]
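To make the "rank plus nearby words" display discussed above concrete, here is a rough sketch of how such a line could be computed from any token count. It is not how {{en-rank}} was actually generated; the function names, the tokenizer regex, and the toy corpus are all assumptions made for illustration.

```python
import re
from collections import Counter

def rank_table(text):
    """Return word tokens ordered from most to least frequent."""
    tokens = re.findall(r"[a-z']+", text.lower())  # crude tokenizer, an assumption
    counts = Counter(tokens)
    # Sort by descending count, then alphabetically so ties are deterministic.
    return sorted(counts, key=lambda w: (-counts[w], w))

def rank_with_neighbours(ranked, word, span=3):
    """Return (words just above, 1-based rank, words just below) for `word`."""
    i = ranked.index(word)
    before = ranked[max(0, i - span):i]
    after = ranked[i + 1:i + 1 + span]
    return before, i + 1, after

# Toy corpus, only to show the mechanics.
corpus = "the cat saw the dog and the dog saw a bird"
ranked = rank_table(corpus)
before, rank, after = rank_with_neighbours(ranked, "saw", span=1)
print(" · ".join(before), f"· #{rank}: saw ·", " · ".join(after))
```

The same two functions would work on lemma counts or per-definition counts instead of raw tokens, provided some source supplies those counts.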

Eye dialect

The explanatory note at "Category:English eye dialect" says: "English nonstandard spellings, which however do not change pronunciation, deliberately used by an author to indicate that the speaker uses a nonstandard or dialectal speech." Is the italicized clause correct? Surely aboot and about are not pronounced the same way. — SMUconlaw (talk) 15:19, 30 March 2016 (UTC)[reply]

Aboot is the way a Scot (for example?) pronounces about anyway, so it doesn't change the pronunciation for the person whose speech is being imitated with the new spelling. Equinox ◑ 15:26, 30 March 2016 (UTC)[reply]

It's a well-known fact that we've been using the term "eye dialect" the wrong way in our entries. Some quintessential examples of real eye-dialect are sed for said and enuff for enough, which change the pronunciation neither for the author nor for the expected readers (aboot, however, is expected to be read differently by the expected readership). Eye-dialect spellings imply that the speaker is speaking with a dialect despite the fact that the spelling produces the same pronunciation. Compare the term eye rhyme. The correct thing to say is that aboot is a "spelling that is imitative of a dialect"; or if the spelling is actually used in the dialect itself, then you can just call it a "dialectal spelling". --Wiki Tiki 89 15:35, 30 March 2016 (UTC)[reply]

Ah, I see. So, do we let sleeping dogs lie? I had placed sais (a non-standard form of says) in this category, and then saw the explanatory note and became puzzled. — SMUconlaw (talk) 15:49, 30 March 2016 (UTC)[reply]

I've been ignoring them (and sometimes even contributing them, where is my integrity?), but ultimately I think this needs to be fixed. As for sais, I'm not sure whether it is supposed to represent /seɪz/ or /sɛz/. In the latter case, it would certainly be eye-dialect. --Wiki Tiki 89 15:58, 30 March 2016 (UTC)[reply]

Ha, ha! OK, thanks. It's impossible to know for sure how sais was intended to be pronounced, of course. In the 2000 quote, given the unusual spelling of many other words, I'm guessing it wasn't supposed to be pronounced as /sɛz/. — SMUconlaw (talk) 16:12, 30 March 2016 (UTC)[reply]

I always correct misuses of the term "eye dialect" where I encounter them in entries, but I don't go hunting for them. —Aɴɢʀ (talk) 16:30, 30 March 2016 (UTC)[reply]

Feel free to correct sais! — SMUconlaw (talk) 16:46, 30 March 2016 (UTC)[reply]

I'm not convinced sais isn't eye dialect. The entry claims that sais is pronounced to rhyme with gaze, but is it really? If I was reading one of the quoted passages out loud, I would pronounce it in the same way as says, namely to rhyme with fez. —Aɴɢʀ (talk) 17:55, 30 March 2016 (UTC)[reply]

Template:pronunciation spelling of/Template:pronunciation respelling of is the template for those who make a distinction between this and eye dialect. - -sche (discuss) 17:32, 30 March 2016 (UTC)[reply]

But that doesn't retain the connotation of dialect. --Wiki Tiki 89 17:40, 30 March 2016 (UTC)[reply]

I usually just use {{nonstandard spelling of}}, but more nuanced options are also available. —Aɴɢʀ (talk) 17:53, 30 March 2016 (UTC)[reply]

That's what the "from=" field (which displays "representing _") is for. If a context label is put in, it categorizes; ad-hoc input is also accepted. - -sche (discuss) 02:27, 3 April 2016 (UTC)[reply]

@-sche: The "from=" field doesn't seem to be doing anything here. --Wiki Tiki 89 14:49, 4 April 2016 (UTC)[reply]

@Wikitiki89 You've used T:nonstandard spelling of on that page, rather than T:pronunciation spelling of. We could add "from=" support to T:nonstandard spelling of (yielding "nonstandard spelling of X, representing Y")... I guess there's no reason not to... but under what circumstances would someone use that rather than either "alternative form of X, representing Y" or "pronunciation spelling of X, representing Y"? - -sche (discuss) 01:35, 5 April 2016 (UTC)[reply]

Shades of meaning - a thesaurus of synonyms with a bit of a difference?

What could be a helpful addition to Wiktionary, and something I don't think has exactly been done before, is a series of lists of words and phrases that are close to being synonyms - a little like Wiktionary:Wikisaurus and the classic Roget's thesaurus - but with an important improvement... put them in a table where even subtle differences between the words can be seen. So it would highlight "shades of meaning" and clues as to when you may or may not want to use a particular word.

Many lists of synonyms in books and online simply heap them together; Roget has a bit of order in their listing within sections, but not really enough to help writers (and especially not enough to help speakers of English as a second language, and there are more and more people in that category).

What I am imagining (and I guess I could present a few examples if requested) is a table where entries are sortable by any of several columns - the first being the word/phrase, another two that would almost always exist would be the degree of formality (perhaps: 9=probably only used in legal documents, 0=slang, -1=impolite, -5=very rude) and how widely known the word is when used in this sense (9=very frequently used word across all cultures, 7=anyone in an English-speaking country with a reading age over 12 should know it, 5=might be a bit misunderstood or country-specific but many people kind of know what it means, 2=rare/archaic, 0=not sure it even is a word). And then there would be columns that would vary according to the situations, e.g. a table for "not pay" might include words/phrases like "protest a bill", "welsh", "bilk", "balk", "renege", "dishonour/dishonor", "hold back payment", "block", "default", "become insolvent" and so on, and columns for degree to which the row has an element of stealing, inability to pay, dispute, lateness, charity... plus space to add explanations.

Perhaps what I am suggesting is a change to the format of Wikisaurus pages, or an optional feature you can click on to see the table, but I think there has to be an element of including words that are not exact synonyms.

Thoughts? Maitchy (talk) 20:41, 31 March 2016 (UTC)[reply]
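As a very rough sketch of the sortable table described above: the column names follow the proposal, but the `Entry` class, the `sort_by` helper, and above all the scores are placeholders invented purely to show the sorting, not real ratings of these words.

```python
from dataclasses import dataclass

@dataclass
class Entry:
    term: str
    formality: int    # proposed scale: 9 = legal documents ... 0 = slang, negative = impolite
    familiarity: int  # proposed scale: 9 = very widely known ... 0 = maybe not a word

# Placeholder scores, invented only to demonstrate sorting; not real ratings.
table = [
    Entry("default", 7, 7),
    Entry("renege", 6, 5),
    Entry("bilk", 3, 3),
    Entry("welsh", -1, 4),
]

def sort_by(entries, column, descending=True):
    """Return the entries ordered by the named column."""
    return sorted(entries, key=lambda e: getattr(e, column), reverse=descending)

by_familiarity = [e.term for e in sort_by(table, "familiarity")]
least_formal_first = [e.term for e in sort_by(table, "formality", descending=False)]
print(by_familiarity)
print(least_formal_first)
```

Situation-specific columns (element of stealing, inability to pay, and so on) would just be further fields on the same record.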

I suspect we don't have enough users (or users who are interested in synonyms) to manage this. One good thing that we don't do, which might be more achievable, is to include usage notes on synonyms pages: e.g. Chambers Thesaurus on hate says: "Dislike is a fairly mild term for something simply being displeasing, whilst despise is far stronger and implies an element of contempt. Both detest and loathe would similarly refer to something deeply felt..." (and so on). Equinox ◑ 20:45, 31 March 2016 (UTC)[reply]

One might expect that diction would be what one would get from a dictionary. One can, but not very conveniently unless one's dictionary or thesaurus goes to unusual lengths. It would be interesting to have a model page for single synonym (sensu lato) group that could form an object for discussion. I had long hoped that Wikisaurus could be what is being suggested. It can still be a resource for it. DCDuring TALK 22:27, 31 March 2016 (UTC)[reply]
