Wiktionary:Beer parlour

Welcome, all, to the Beer Parlour! This is the place where many a historic decision has been made and where important discussions are being held daily. If you have a question about fundamental Wiktionary aspects—that is, about policies, proposals and other community-wide features—please place it at the bottom of the list (click on Start a new discussion), and it will be considered. Please keep in mind the rules of discussion: remain civil, don't make personal attacks, don't change other people's posts, and sign your comments with four tildes (~~~~), which produces your name with timestamp. Also keep in mind the purpose of this page. There are various other discussion rooms which may serve the idea behind your questions better. Please take a look to see which is most appropriate.

Sometimes discussion identifies an issue as an idea for policy development or rewriting. Such discussions may be taken out of the Beer parlour to a relevant page, or a brand new page may be created. Usually, the active policy pages will be listed in one of the sections below. See also the policy development page and the votes page.

Questions and answers will not remain on this page indefinitely, as it would very soon become too long to be editable. After a period of time with no further activity (usually a couple of weeks), information will be moved to the archives. We make a point to preserve all discussions that were started here in the archives. However, talk that is clearly not intended for this page may be moved and will not end up in the archives. Enjoy the Beer parlour!

January 2016

Happy New Year

Happy wiki-ing! I hope that 2016 is a banner year for etymologies, appendices of personal names, and pushing forward creating the most comprehensive and accurate dictionary yet. —Justin (koavf)TCM 15:35, 31 December 2015 (UTC)

Vote counter

Planned and running votes
Ends | Title | Status/Votes
Feb 4 | User:RileyBot | decision?
Feb 8 | Uncle G for de-sysop | decision?
Feb 14 | NORM: 10 proposals | 95 (20 people)
Feb 23 | EL introduction | support 5, oppose 0, abstain 0
Feb 27 | Definitions | support 2, oppose 1, abstain 0
Feb 28 | References | support 5, oppose 0, abstain 0
Mar 5 | Short blocking policy | support 4, oppose 2, abstain 0
Mar 6 | Literal translations in translation tables | support 4, oppose 3, abstain 0
Mar 7 | Translations of taxonomic names | support 5, oppose 0, abstain 3
Mar 8 | Entry name: sign languages | 6 (3 people)
Mar 9 | Pronunciation | support 1, oppose 3, abstain 0
Mar 10 | Notes about pronunciations | support 1, oppose 0, abstain 0
Mar 11 | Placement of "Usage notes" | 8 (5 people)
Mar 12 | Language 2 | support 2, oppose 2, abstain 0
Mar 13 | Automated transliterations | support 1, oppose 0, abstain 0
Mar 14 | Entry name 3 | starts: Feb 14
Mar 15 | Multiple pronunciation sections | starts: Feb 15
Mar 16 | Placement of "Alternative forms" | starts: Feb 16
Mar 17 | Removing "Flexibility" | starts: Feb 17
Mar 18 | Removing "Quotations" | starts: Feb 18
Mar 19 | Trivia | starts: Feb 19
Mar 20 | Attestation vs. the slippery slope 2 | starts: Feb 20
Mar 21 | Interwiki links | starts: Feb 21
Apr 22 | Remove "The essentials" | starts: Mar 22
Apr 23 | Etymology | starts: Mar 23

Happy new year! I added a vote counter in Template:votes. Let me know if you would change anything or if there's any bug. --Daniel Carrero (talk) 09:15, 1 January 2016 (UTC)

(I'll update the documentation later, I need some sleep.) --Daniel Carrero (talk) 09:25, 1 January 2016 (UTC)
That's cool, but for past votes I think the result is more important. --Dixtosa (talk) 10:06, 1 January 2016 (UTC)
There's no way to get the decision automatically as a simple text (Passes, Fails, No consensus) just by parsing the page. For example, technically I could make the module look for uses of "Passes", "Fails", "No consensus" in the decision text but it would fail and get wrong results sometimes, especially in cases of votes with multiple options and complex results like "passes except for English".
Template:votes can be changed to let people add the vote result manually, but I oppose that. Showing the past votes with their results is already the job of WT:V#Recently ended votes, which shows more votes. If someone has the time to edit Template:votes to add a "Fails", they could as well be moving the vote to the actual list of recently ended votes. --Daniel Carrero (talk) 19:27, 1 January 2016 (UTC)
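
To illustrate the parsing problem described above, here is a minimal Python sketch of the naive keyword approach. It is not the actual module behind Template:votes; the keyword list, the function name `guess_decision`, and the sample decision texts are all assumptions made up for the example.

```python
from typing import Optional

# Hypothetical keyword list; real decision texts are free-form prose.
KEYWORDS = {
    "passes": "Passes",
    "fails": "Fails",
    "no consensus": "No consensus",
}


def guess_decision(decision_text: str) -> Optional[str]:
    """Return the label of the earliest outcome keyword found, or None."""
    lowered = decision_text.lower()
    # Record (position, label) for every keyword that occurs in the text.
    hits = [(lowered.find(key), label)
            for key, label in KEYWORDS.items() if key in lowered]
    if not hits:
        return None
    # The earliest keyword wins, which is exactly where this goes wrong.
    return min(hits)[1]


if __name__ == "__main__":
    for text in ("Passes 12-1-0.",
                 "Passes except for English.",
                 "Option 1 fails; option 2 passes."):
        print(repr(text), "->", guess_decision(text))
```

The second and third sample texts are reduced to a bare "Passes" and "Fails" respectively, which drops the qualification or the second option. That is the kind of wrong result the comment warns about for votes with multiple options or partial outcomes.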

Note: Hover the mouse over the number of votes to see who voted. --Daniel Carrero (talk) 21:48, 1 January 2016 (UTC)

For me, "{{#ifeq|normal|past|}}" and whatnot is showing up dozens of times over at the top of the table. Andrew Sheedy (talk) 22:09, 1 January 2016 (UTC)
@Andrew Sheedy: Sorry, that appeared for a moment because I made a mistake. I already fixed that. --Daniel Carrero (talk) 22:11, 1 January 2016 (UTC)
Alright, thanks. I feel like it would be better if the table showed fewer votes; say, 10-15, rather than 22 like it does now. Also, why does each vote title begin with something like "pl-2015-12/"? I don't recall it doing so when the table was first implemented, and it looks kind of messy. That being said, I find it helpful to keep me up-to-date on recent votes, as I never remember to check them out otherwise. Andrew Sheedy (talk) 22:32, 1 January 2016 (UTC)
@Andrew Sheedy: The long list of votes is my fault; I created 18 of those (82%). I feel it's better if the table shows all current votes rather than hiding some. If the table showed only 10-15 votes right now, some unstarted votes would be hidden from view but some active votes would be hidden too.
I removed the "pl-2015-12/" part now. Thanks for the last comment! --Daniel Carrero (talk) 22:49, 1 January 2016 (UTC)
Thanks, that makes it easier to read. Andrew Sheedy (talk) 23:02, 1 January 2016 (UTC)

I would like to install Extension:GetUserName to show if the current user already voted.

  • ✓ (You already voted!)
  • ✗ (You did not vote yet!)

--Daniel Carrero (talk) 22:24, 1 January 2016 (UTC)

Daniel, is it possible to allow people to manually insert the result of a vote? There's already an "edit this list" link. There are few enough votes that it wouldn't be too hard to insert the results, or at least a "passed/failed/other" or "passed/failed/partly passed" or whatever. Benwing2 (talk) 03:03, 2 January 2016 (UTC)
I oppose anything that requires further manual upkeep. —Μετάknowledgediscuss/deeds 03:21, 2 January 2016 (UTC)
It's not required. Benwing2 (talk) 03:30, 2 January 2016 (UTC)
@Benwing2, Dixtosa, Metaknowledge: I can change the template to add a "decision" parameter to all votes but it would require manual upkeep forever: Every time a vote ends, someone would have to put the "Passes", "Fails", whatever in the vote box. I mean, if nobody types a decision, then it would just show nothing.
If enough people want that and are willing to update the box whenever needed... Why not, I suppose. --Daniel Carrero (talk) 03:36, 2 January 2016 (UTC)
I still oppose that. —Μετάknowledgediscuss/deeds 03:45, 2 January 2016 (UTC)
@Benwing2, Dixtosa, Metaknowledge: I added the decision1=, decision2= parameters per Benwing2 and Dixtosa, despite Metaknowledge's opposition. Does it look good? Myself, I said above that I opposed it, but now I abstain as to whether that parameter should be kept.
Currently, the vote box shows 2 "passed" results. --Daniel Carrero (talk) 03:52, 2 January 2016 (UTC)
Thanks, looks good. Benwing2 (talk) 03:58, 2 January 2016 (UTC)
Good. FYI: after the end date of a vote, if nobody has entered a decision yet, the shown decision is "decision?". --Daniel Carrero (talk) 04:05, 2 January 2016 (UTC)
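
The display rule described in this thread can be sketched roughly as follows. This is an illustrative Python sketch, not the template's real code: the function name `status_cell`, its parameters, and the order in which the cases are checked are assumptions; only the "decision?" placeholder and the "starts: ..." wording come from the discussion and the table above.

```python
from datetime import date
from typing import Optional


def status_cell(today: date, start: date, end: date,
                decision: Optional[str] = None, tally: str = "") -> str:
    """Pick the text for the Status/Votes column of one vote row."""
    if decision:
        # A manually entered decision (e.g. decision1=) takes priority.
        return decision
    if today > end:
        # Vote has ended but nobody has typed a decision yet.
        return "decision?"
    if today < start:
        # Vote has not started; show its start date instead of a tally.
        return f"starts: {start:%b} {start.day}"
    # Vote is running: show the current tally string.
    return tally


if __name__ == "__main__":
    start, end = date(2016, 1, 5), date(2016, 2, 4)
    print(status_cell(date(2016, 2, 10), start, end))
    print(status_cell(date(2016, 2, 10), start, end, decision="passed"))
    print(status_cell(date(2016, 1, 20), start, end, tally="5-0-0"))
```

Because the decision is optional, the box keeps working when nobody maintains it; the cost of skipping the manual step is only the "decision?" placeholder.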

Format of entries in the Reconstruction namespace

Now that we have the Reconstruction namespace, it would be really good to implement it. However, this is held up by a basic formatting decision that needs to be made: do we want "Reconstruction:Proto-Indo-European/albʰós" or "Reconstruction:albʰós#Proto-Indo-European"? I'd like it if we could make this decision in a quick poll rather than have a protracted formal vote. —Μετάknowledgediscuss/deeds 06:48, 2 January 2016 (UTC)
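
As a rough illustration of what the two formats mean in practice, the Python sketch below builds page titles both ways and groups entries by the page they would land on. The helper names are hypothetical, titles are shown as written in this discussion rather than as MediaWiki would encode them, and the sample entries are taken from forms mentioned in this thread.

```python
from collections import defaultdict


def subpage_title(language: str, form: str) -> str:
    """Format 1: one page per language and form."""
    return f"Reconstruction:{language}/{form}"


def anchored_title(language: str, form: str) -> str:
    """Format 2: one page per form, with a language section on it."""
    return f"Reconstruction:{form}#{language}"


def pages(entries, make_title):
    """Group languages by the page (title without '#section') they share."""
    grouped = defaultdict(list)
    for language, form in entries:
        page = make_title(language, form).split("#", 1)[0]
        grouped[page].append(language)
    return dict(grouped)


entries = [
    ("Proto-Indo-European", "albʰós"),
    ("Proto-Uralic", "kala"),
    ("Proto-Finnic", "kala"),
]

if __name__ == "__main__":
    print(pages(entries, subpage_title))
    print(pages(entries, anchored_title))
```

Under the first format every entry gets its own page; under the second, the two kala entries end up as sections of one page, which is where the merging and redirect questions raised in the poll come from.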

Support "Reconstruction:Proto-Indo-European/albʰós"

  1. Support because this most closely matches our current format and avoids having reconstructed forms in multiple languages on a single page. —Μετάknowledgediscuss/deeds 06:48, 2 January 2016 (UTC)
  2. Support for the same reasons —suzukaze (tc) 06:59, 2 January 2016 (UTC)
  3. Support since reconstructions do not have a well-defined spelling — they vary by author and source, and ours may also be subject to revision. Two reconstructions in unrelated proto-languages having "the same spelling" is usually not a meaningful relationship. --Tropylium (talk) 20:41, 2 January 2016 (UTC)
  4. Support per above and because the other alternative would require restrictions in use of redirects. Currently we can have hard redirects for exact equivalents in other notational systems, such as *-an to *-ą for Proto-Germanic, or PIE *a and schwa to various clusters including laryngeals in our notation. Most of these no-brainer redirects don't apply to any other proto-language, and a lot would have to be made into soft redirects. Chuck Entz (talk) 23:02, 2 January 2016 (UTC)
  5. Support because, as noted below, merging creates new problems. I also think this is a good first step towards a one-language-per-page format. —CodeCat 00:35, 3 January 2016 (UTC)
    What problems? Renard Migrant (talk) 12:31, 3 January 2016 (UTC)
    @Renard Migrant: Problems with existing and future redirects — see Angr's comment below. —Μετάknowledgediscuss/deeds 16:03, 3 January 2016 (UTC)
  6. Support --Vahag (talk) 08:47, 3 January 2016 (UTC)
  7. Support Wyang (talk) 01:24, 4 January 2016 (UTC)
  8. Support Hillcrest98 (talk) 01:29, 4 January 2016 (UTC)
  9. Support this, and oppose the alternative, per Trop and Chuck. - -sche (discuss) 05:04, 4 January 2016 (UTC)
  10. Support for many reasons I have given before. --WikiTiki89 15:59, 4 January 2016 (UTC)
  11. Support per CodeCat, we should be looking at subpages in a big way. - TheDaveRoss 22:43, 5 January 2016 (UTC)
  12. Support to make consensus clearer, but this does not mean I oppose the other option. —Aɴɢʀ (talk) 16:52, 8 January 2016 (UTC)
  13. Support per CodeCat. —Pengo (talk) 22:27, 9 January 2016 (UTC) [I'd like to see us try this approach, but the other way is fine too. Pengo (talk) 00:30, 2 February 2016 (UTC)]
  14. Support I changed my mind. —JohnC5 18:02, 31 January 2016 (UTC)

Support "Reconstruction:albʰós#Proto-Indo-European"

  1. Support because it matches the entry format. --Daniel Carrero (talk) 06:52, 2 January 2016 (UTC)
  2. Support —JohnC5 09:48, 2 January 2016 (UTC)
  3. Support -- don't have a super-strong opinion here but this feels more natural, and I remember early on finding it difficult to figure out where proto-language entries were in Wiktionary with the old (i.e. current) format, which is similar to the Reconstruction:Proto-Indo-European/albʰós format. — Benwing2 (talk) 17:38, 2 January 2016 (UTC)
  4. Support. I like this one. Would anything actually need to be merged? Renard Migrant (talk) 18:12, 2 January 2016 (UTC)
    Yes, for example *-kʷe and *-kʷe, or *pénkʷe and *pénkʷe. Even more when redirects are taken into consideration (*kapros and *kapros*kápros), and even more when plausible future redirects are taken into consideration (*oynos and plausible *oynos*óynos). I would be very surprised if no proto-language other than PIE had a form spelled *ne. —Aɴɢʀ (talk) 22:19, 2 January 2016 (UTC)
    Proto-Algonquian has ne- (Proto-Algic has n-, contrast PIE n̥-). - -sche (discuss) 05:04, 4 January 2016 (UTC)
    It's most likely to come up in parent/daughter cases, as in Proto-Uralic *kala > Proto-Finnic *kala. Though it's debatable how much benefit there is to treating these kind of cases as different entries in the first place. (My stance remains that if Protolang A looks usually identical to its parent Protolang B, it should be treated as a dialect of the latter, rather than as a distinct reconstructed language entirely.)
  5. Support per Daniel Carrero and Benwing2. Easier to find IMO. —Aryamanarora (मुझसे बात करो) 01:17, 4 January 2016 (UTC)
  6. Support --profesjonalizmreply 13:17, 4 January 2016 (UTC)
  7. Support Feels more natural to me. This matches our current mainspace format. A similar alternative would be to have the reconstructions directly in the mainspace and start with *, for which AFAIK the objections were rather weak. We absolutely should not be moving the normal mainspace to one entry per language, which I see CodeCat above say. Also, this format makes entries easier to find: there is going to be a shortcut for the namespace so the reader only types rec:albʰós, and there comes the entry; in fact, the reader only types rec:alb, and there appears a list of items that are completions of that; try typing ws:pers into the search box to see how this works for Wikisaurus. I acknowledge that there will probably be fewer redirects than for the alternative but I do not see anyone showing us how large the problem is; it is possibly relatively small. On a procedural note, if a plain majority prefers the Reconstruction:Proto-Indo-European/albʰós format, let's use it; it is the status quo anyway. --Dan Polansky (talk) 09:34, 9 January 2016 (UTC)

Support a different option (specify)

Comments etc.

  • Either one is fine with me. —Aɴɢʀ (talk) 09:34, 2 January 2016 (UTC)
    Statement of support added above. —Aɴɢʀ (talk) 16:52, 8 January 2016 (UTC)
  • It looks like we're going to have a lot of supporters of each form. Perhaps each supporter should note whether the other form is also acceptable to him/her or whether, on the contrary, he/she supports only the form indicated. That might help decide what the general consensus is.​—msh210 (talk) 22:40, 5 January 2016 (UTC)
  • As I mentioned above, I prefer the combined form but I'd be OK with the separated form. BTW I don't see the merging issue as a big problem; I've done lots of more complicated transformations using bots. The only thing that might be tricky is converting hard redirects into soft redirects; however, I imagine most of this can be automated as well. Benwing2 (talk) 22:17, 8 January 2016 (UTC)
    @Benwing2, CodeCat: I think we have a pretty solid consensus that's emerged, and we really ought to make the transition now that we have the namespace. Would you mind doing the honours? The trick is to ensure that relevant templates and modules are updated as soon as the moves are done, so we don't leave anything broken. —Μετάknowledgediscuss/deeds 04:59, 9 January 2016 (UTC)
    I'm not completely convinced there is a consensus given the relative number of users going the other way, but if everyone else thinks there's a consensus I'm fine with it. I can help move pages although I may be a bit busy until around the 13th or 14th of this month. If you need it before then, maybe User:CodeCat can help? BTW, CodeCat or anyone, how can I get a list of all appendix-only languages that should be moved to the Reconstruction space? Benwing2 (talk) 07:30, 9 January 2016 (UTC)
    On a procedural note, to want to close such a major decision after a mere 7 days of voting seems improper to me. I would see 14 days as the minimum, or even the 4 weeks typical of votes. Also, the discussion is very weak; I see very little in the way of argument or links to specific places where arguments and reasoning can be found. --Dan Polansky (talk) 09:39, 9 January 2016 (UTC)
  • This wasn't a !vote for a reason — I just wanted to get it over with so that we could deal with the move, but the people who are capable of doing that (and, as far as I know, most interested in doing that) haven't, so I'm not sure whether there was a point to that. @CodeCat, Benwing2 —Μετάknowledgediscuss/deeds 03:39, 26 January 2016 (UTC)
    I guess I've been treating it like a vote and waiting for it to be formally closed -- it seems important enough and contested enough to merit this. In this respect I agree with Dan. I actually think it might not be a bad idea to treat it like a vote, and create a formal vote with a retroactive start date of say Jan 2 and an end date of say Jan 31; or at least set a fixed end point a few days out to make sure anyone else who wants to say something can do so -- people who have been quiet tend to perk up when deadlines approach. Benwing2 (talk) 03:51, 26 January 2016 (UTC)
    So, what's the plan currently? —JohnC5 18:02, 31 January 2016 (UTC)

Color-coded EL

FYI: I was curious to see how much of WT:EL was voted and how much was unvoted, so I created User:Daniel Carrero/Color-coded EL.

Turns out it's about 75% unvoted. --Daniel Carrero (talk) 08:36, 2 January 2016 (UTC)

Interesting. Could you select less saturated colors to enhance readability? I get a headache just thinking about the page as it is. DCDuring TALK 13:16, 2 January 2016 (UTC)
Absolutely. I won't do that right now because I'm on my cell phone, but if anyone wants to do a search/replace on that page, go ahead. Just look for background-color: red and background-color: green. --Daniel Carrero (talk) 14:02, 2 January 2016 (UTC)
Done, but imperfectly. My color selections were not quite pale enough, IMO. DCDuring TALK 18:12, 2 January 2016 (UTC)
Perhaps we should also distinguish between parts that have a vote created for them, from parts that don't, instead of marking them all red. --WikiTiki89 16:02, 4 January 2016 (UTC)
This is nice, thanks Daniel. I updated one section that I recalled a vote for, I assume that is Ok? - TheDaveRoss 17:11, 4 January 2016 (UTC)
@DCDuring: That's great, thanks.
@TheDaveRoss: That's great, too, thanks.
@Wikitiki89: OK, I did as you suggested. --Daniel Carrero (talk) 00:20, 5 January 2016 (UTC)
If people prefer just the green/red format, the page can be converted back. But I wonder if we would be able to create votes for all the unvoted sections later, thus defeating the point of having the additional yellow (I mean, "khaki") color for unvoted sections that have votes created for them. Better yet if they all pass, then EL would finally be 100% voted and thus User:Daniel Carrero/Color-coded EL would be deleted, lest it become a completely green, thus useless, version of EL. --Daniel Carrero (talk) 06:18, 5 January 2016 (UTC)
Actually, all of ELE was voted in. (I'm only half kidding.)​—msh210 (talk) 16:43, 5 January 2016 (UTC)
LOL, true. That vote even has "Replacing the contents of Wiktionary:Entry layout explained by the contents of (another revision of the same policy).", sounds pretty serious. (I'm half kidding, too.) --Daniel Carrero (talk) 19:16, 5 January 2016 (UTC)
Or perhaps the replacement of the voted on contents with identical new contents negates the voting done for the original content? Maybe we don't actually have a CFI or ELE. (I never kid.) - TheDaveRoss 19:19, 5 January 2016 (UTC)
We don't? Good. I always wanted to make an entry for a8idsah09d8has9dh, but CFI was in the way! --Daniel Carrero (talk) 19:59, 5 January 2016 (UTC)

English possessives

My earlier posting on this: Wiktionary:Tea room/2011/April#English possessives.

A number of words indicate the possessive in English. These include have, of, -'s, their (my, et al.), and theirs (mine, et al.). For some of those words, we have a simple "marks the possessive" or similar sense, which is not very informative. For others of those words, we have specific senses — but those senses are not the same for two possessive words, though they may overlap.

What we need IMO is one central location for definitions of the possessive, and for all the possessive words to link thereto. I think this should be an appendix, say appendix:English possessives, and that all the possessive senses of these words (which is not always all the senses) should be referred to the appendix in lieu of the definition lines. A very rough draft toward such an appendix is at [[User:Msh210/English possessives]]; please feel free to edit it.

I would love to see what others think of this.​—msh210 (talk) 08:59, 5 January 2016 (UTC)

Seems like a good idea. Where would the links be? Under "See also" or something more likely to be clicked? DCDuring TALK 11:30, 5 January 2016 (UTC)
I'm thinking it would be a sense. For example, for my, we now have four senses ("Belonging to me", "Associated with me", "Related to me", "In my possession": those are not summaries but the entire definition lines). They'd be replaced by a single "{{n-g|Possessive of me, used before a noun phrase: see Appendix: English possessives}}" or some such. Likewise, for your we have four senses, of which the last two ("A determiner that conveys familiarity…" and "That; the specified") would remain in place but the first two ("Belonging to you; of you; related to you") would be replaced by something like above. Likewise, several of the subsenses of of could be replaced by something like "{{n-g|marking the possessive, see appendix…}}" and usexes.​—msh210 (talk) 15:13, 5 January 2016 (UTC)
Hmm. If I had to coin a slogan, it would be "case envy". Think about the last item in your list: "concatenation of noun phrases" (to which you've put ????). This is simply using a noun phrase to qualify another noun phrase, and there is essentially no limit to the nature of the relationship to which this refers, and any attempt at listing the "meanings" is doomed. (I really suggest you should remove this from the list.) The other terms (constructions) you have listed are very close to what might be "Genitive case", if we had a proper case system (or a system of unvarying particles, like Japanese の (no)), and trying to list the "meanings" of the genitive is obviously hopeless. After all, in many cases, it is a purely grammatical construct: if I say "I approach this question with an open mind", then want to nominalise this in a later reference, I say "My approach..." There is no more a "meaning" of "my" here than there is a "meaning" of "I" in the first sentence.
All that said, I think an appendix on possessives would be a Very Sensible Idea. It should show the pronoun table, rules for apostrophe-ess, including noting that this is not normal morphology, because you can say things like "The King of Persia's right elbow". It should also point out the way in which these can be seen as nominalisations of the verb "to have". (I learned about four foreign languages before I met two which don't have one, and it is a very enlightening experience.) I hope this makes up for the negative tone of my first paragraph. Imaginatorium (talk) 15:57, 5 January 2016 (UTC)
Note that although concatenation of two nouns has several meanings, one of them is the possessive (at least according to w:English possessive); the same is true for of and have and even -'s, all of which have meanings other than the possessive.​—msh210 (talk) 16:35, 5 January 2016 (UTC)
What does "the possessive meaning" mean? (Serious question: I think it is semantically empty.) I can't actually see where it makes the claim about concatenated nouns "meaning" "possessive", but I'm sure one can find an example, for some values of "possessive meaning" at least. Imaginatorium (talk) 17:35, 5 January 2016 (UTC)
@Imaginatorium Re "possessive meaning": Good question. I guess when I think of possessive meaning, I think of meanings typically held by both -'s and his. But doubtless linguists have done some work in this area already and we don't need to reinvent the wheel. There's some stuff to read in the "Semantics" section of the WP article.   Re WP: It's in the "As determiners" section: "the system failure, using system as a noun adjunct rather than a possessive" (where "possessive" means with -'s morphology, not with possessive meaning).​—msh210 (talk) 18:39, 5 January 2016 (UTC)

Definitions at town and city names

See Woodstock. In many town-name entries there are lists similar to this; I am wondering if it would be better to replace all of the specific towns with more generic definitions a la given and family names. I am thinking of something along the lines of "A town or city name in the US, Canada and England." Are the specific definition lines of value? It seems to be treading the line of encyclopedic, and Wikipedia's disambiguation pages do a better job I think. - TheDaveRoss 14:48, 8 January 2016 (UTC)

Plenty of dictionaries include these kinds of lists, and it appears Wiktionary is following tradition in this case. {{place}} could be useful in these cases though. Personally, I'm fine with generic definitions or lists. —Aryamanarora (मुझसे बात करो) 15:50, 8 January 2016 (UTC)
I agree that this is duplication of WP content, and not helpful in a dictionary sort of way. Traditional dictionaries do often include geographical names, either in the body, or in an appendix, but then, they didn't have the option of a WP link. In at least some cases there could be an etymological overview -- how this name spread (e.g. from England to the former colonies, sort of thing). Imaginatorium (talk) 04:49, 9 January 2016 (UTC)
See Talk:Paris. The definitions of Woodstock could be replaced by "Any of a number of cities and towns in the US, Canada and England", or subsenses could be listed under that definition, in a fold-up list preferably. @Lo Ximiendo is currently adding definitions of specific places, many of them useful, but a comprehensive list is not realistic for the most common place names. --Makaokalani (talk) 12:16, 9 January 2016 (UTC)
To treat all of the Parises as if they were lexically equivalent is silly, as is making Woodstock, Troy, Amsterdam, Geneva, Birmingham, Rome, Utica, Syracuse, all in New York State; Moscow, ID; and London, ON equivalent to all the others. Often one, two, or a few of the places are distinguished in general or regional discourse and might merit inclusion on some lexical grounds. Our attributive-use standard for including proper nouns made lexical sense to me, but apparently not to others. We are stuck with voting on these things, one at a time, since the language of that standard was explicitly voted down without any comparably effective replacement. DCDuring TALK 13:59, 9 January 2016 (UTC)

Category:en:Rivers in Canada

@CodeCat I made Category:en:Rivers in Canada. Is there a way to enable this empty category into something else? Thank you in advance. --KoreanQuoter (talk) 04:23, 9 January 2016 (UTC)

-èd words

For most languages, we don't include the forms with macrons and stress markers where these are usually ignored in writing. So... where does this leave entries like wingèd, learnèd and cursèd? (If nothing else, we need to add pronunciations to these entries.) Smurrayinchester (talk) 17:33, 9 January 2016 (UTC)

I don't think these are 'usually ignored', just rare, obsolete forms that were used for a relatively short period. Renard Migrant (talk) 18:43, 9 January 2016 (UTC)
Well, some scholarly and pseudo-scholarly material still uses them, e.g. in some critical versions of Shakespeare's texts. I say we keep 'em. —Μετάknowledgediscuss/deeds 18:46, 9 January 2016 (UTC)
But even then they're independent stress markers of poetical criticism and not part of the actual spelling of the word, are they not? Sounds comparable to the accents used in Russian textbooks to me. Korn [kʰʊ̃ːæ̯̃n] (talk) 20:26, 9 January 2016 (UTC)
True, but is it not worth keeping them in the dictionary in case someone wants to look one up to see what the stress marker means? (We should have pronunciation for these so that people can see the difference between them and the unaccented word.) Andrew Sheedy (talk) 21:49, 9 January 2016 (UTC)
The entry for "winged" doesn't have a Pronunciation section, but if it did, it would surely just list the normal one. Then an entry (or at least a note) for the accented version should mention that this is not just a poetic spelling (it isn't, really!) but a spelling indicating a poetic pronunciation. Imaginatorium (talk) 06:52, 10 January 2016 (UTC)

Russian cognate

@ cognate, under the linguistics examples, it has:

"English mother is cognate with Greek μητέρα ‎(mētéra), German Mutter, Russian маmь ‎(matʹ) and Persian مادر ‎(madar)."

My focus is on the Russian term: it appears as маmь, although it is linked to мать. It doesn't have an alias in the arguments. Why is this, does anyone else see what my browser is rendering, or is it just me ? Leasnam (talk) 01:46, 10 January 2016 (UTC)

Please rephrase your question; it doesn't make sense. мать (matʹ) is linked as it should be. What alias do you expect? Inflected forms? It's a lemma. Stresses? Not required for monosyllabic Russian words. --Anatoli T. (обсудить/вклад) 04:30, 10 January 2016 (UTC)
I think he's just saying that it looks strange on his browser. I don't see that, but m is the Russian representation of т in an italic font, so possibly there's some weird font issue on your system? Benwing2 (talk) 05:14, 10 January 2016 (UTC)
Hi - Thank you, all! Yes, it appears as an "m", but mousing over clearly shows the "t". OK, it's just me (and perhaps some others) then, not an issue of concern :) Leasnam (talk) 19:50, 10 January 2016 (UTC)
The question makes perfect sense: you are seeing a script lower-case Cyrillic 'T', which looks like 'm', instead of a "regular" lower-case Cyrillic 'T', which looks like a small-cap 'T'. I see the text in a sloped sans-serif font (the browser says the font is "DejaVu Sans Oblique"), so I see a sloped version of the normal мать. It would help if we knew what browser, OS, font selection etc etc you are using.
OS is old, Windows 7. Browser is Chrome. Leasnam (talk) 19:53, 10 January 2016 (UTC)
I think this sort of problem is worth considering very carefully. The problem is that (of course) stuff like HTML and CSS were conceived by clueless monolinguals (I mean clueless about anything not monolingual), so it is not easy to avoid unpredictable odd side-effects. Do we have any HTML/CSS gurus who could understand why I see a sloped font and @Leasnam sees an "italic" font, which looks like script? (I can't see anything in the CSS rules which the Firefox Inspector shows which would make it "sloped" or "italic".) Imaginatorium (talk) 06:49, 10 January 2016 (UTC)
Cyrillic т ‎(t)'s cursive (and thus italic, too) form resembles an m. Observe:
  • т
  • т
Copy and paste "Russian маmь ‎(matʹ)""Russian мать ‎(matʹ)" into Notepad; there is no "alias".
suzukaze (tc) 06:57, 10 January 2016 (UTC)
That exactly demonstrates the problem: you cannot discuss this stuff by pasting in the problem text and saying "Look!" -- I see something quite different from what you see. In fact the Wikipedia double-quote trick does not generate italics (at least for me), but generates a sloped sans-serif font. The text you have presumably copied above, i.e. "маmь" is not Russian at all: it's a sequence of Cyrillic ма followed by Roman m followed by a Cyrillic ь. (I don't have "Notepad", and I don't know what you mean by "alias".) Imaginatorium (talk) 07:53, 10 January 2016 (UTC)
I wasn't replying to you, I'm sorry if it seemed that way D:
But clearing things up: 1. Leasnam mentioned an "alias". 2. my "Russian маmь ‎(matʹ)" was copy-and-pasted from Leasnam's message, which contains mixed Cyrillic/Latin rather than pure Cyrillic (maybe that was a bad idea) —suzukaze (tc) 07:56, 10 January 2016 (UTC)
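The mixed-script point in this thread can be checked mechanically rather than by eye. The following is a minimal Python sketch, using only the standard library's unicodedata module, that lists the Unicode name of every character in the two strings quoted above. The helper names (describe, scripts) are made up for this illustration, and taking the first word of a character's Unicode name as its "script" is a rough heuristic, not a real script lookup.

```python
import unicodedata

def describe(text):
    """Return (character, Unicode name) pairs for each character in text."""
    return [(ch, unicodedata.name(ch)) for ch in text]

def scripts(text):
    """Return the set of first words of the Unicode names in text.

    Heuristic only: for these letters the first word is the script
    (CYRILLIC or LATIN), but this is not a general script detector.
    """
    return {unicodedata.name(ch).split()[0] for ch in text}

# Cyrillic "ma" + Latin "m" + Cyrillic soft sign (the pasted, mixed string)
mixed = "\u043c\u0430" + "m" + "\u044c"
# Cyrillic "ma" + Cyrillic "te" + Cyrillic soft sign (the real Russian word)
pure = "\u043c\u0430\u0442\u044c"

for label, s in (("mixed", mixed), ("pure", pure)):
    print(label, sorted(scripts(s)))
    for ch, name in describe(s):
        print("   ", ch, name)
```

Run on the two strings, the first reports both CYRILLIC and LATIN characters while the second reports only CYRILLIC ones, regardless of which font or italic style the browser picks for display.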
I've reformatted the example sentences because if you have running text in italics and then there's something that would normally be italicized, the thing to do is put it back in roman. —Aɴɢʀ (talk) 15:22, 10 January 2016 (UTC)
By "alias" I am referring to the display text (arg 3?), e.g. {{m|en|hello|HELLO}}, where "HELLO" acts as an alias (maybe that's not the correct term, sorry) Leasnam (talk) 19:58, 10 January 2016 (UTC)
I see it as мать now Leasnam (talk) 21:22, 10 January 2016 (UTC)
Just remember that Russian uppercase М and lowercase м look alike except for height (no rounded humps). Lowercase м does not look like т, and т cannot be a case of М; т is lowercase italic Т. —Stephen (Talk) 22:07, 10 January 2016 (UTC)

crimson in Template:table:colors

An anon (Special:Contributions/46.30.181.2) added crimson to Template:table:colors and language subtemplates.

As far as I'm concerned, I think I'd allow it, assuming it was all done correctly. (I'd just move crimson to someplace else in the order of colors) I have an inclusionist point of view concerning that template. I probably wouldn't mind having that table with 50, or maybe 300 colors, it could be made collapsible to save space.

Just my two cents. I'd understand if other people want to delete "crimson", since it's close enough to red. --Daniel Carrero (talk) 07:17, 10 January 2016 (UTC)

The problem with having 300 colours is that the swatches become quite meaningless. Everybody (almost!) can agree on a red, a yellow, and a green, but once you get into the silly territory and have to pick separate swatches for grey, "timberwolf", "silver birch", and "shiny new saucepan", they aren't helpful or meaningful any more. Equinox 14:00, 11 January 2016 (UTC)
I vote on removing the following: strongly: crimson, cream, azure; weakly: teal, indigo. — Ungoliant (falai) 14:24, 11 January 2016 (UTC)
  • There shouldn't be more than 18 colors in that thing: the 16 HTML colors (red, maroon, yellow, olive, lime, green, cyan, teal, blue, navy, magenta, purple, white, silver, gray and black), then maybe indigo and orange, but anything beyond that's too much. @Daniel Carrero, if people want the full 300, there are appendices that list all of them, so there's no point in Template:table:colors duplicating that. Purplebackpack89 15:53, 11 January 2016 (UTC)
  • Don't see why their being HTML colours matters. Surely brown is a much more basic and useful colour than "teal" or "maroon", which are basically blue and red. Equinox 19:04, 11 January 2016 (UTC)
    Setting aside the issue that teal and maroon are actually a long ways away from blue and red (at least HTML-wise), I have an alternate proposal: black, blue, brown, cyan, green, gray, magenta, orange, pink, purple, red, white and yellow. Purplebackpack89 19:53, 11 January 2016 (UTC)
    If we are going to use only basic colors, why magenta? The others seem fine. --Daniel Carrero (talk) 19:55, 11 January 2016 (UTC)
    Magenta (and cyan) give us the three secondary colors of light, while the rest are the colors you're most likely to find in a relatively small box of crayons, markers or colored pencils. Purplebackpack89 20:11, 11 January 2016 (UTC)

Adding a parameter or two to {{m}} and {{l}}

What do other people think of the idea of adding a parameter to {{m}} and {{l}} that would display the language name? At the moment, in etymology sections for example, if we want to write "compare German Mutter" we write compare {{etyl|de|-}} {{m|de|Mutter}}. Wouldn't it be nice to write compare {{m|de|Mutter|name=1}} instead? Not only is it fewer keystrokes, it ensures that the language named and the language tagged are the same, so there won't be any mistakes like compare {{etyl|fr|-}} {{m|de|Mutter}}, which do happen from time to time. For {{l}}, since it's more often used in lists, the language name could be preceded by a colon, as is the case in Descendants lists. For example muoter#Descendants could just list * {{l|de|Mutter|name=1}} instead of * German: {{l|de|Mutter}}. This would have the same two advantages: fewer keystrokes, elimination of the chance of a mismatch. Any thoughts/ideas for improvement? —Aɴɢʀ (talk) 15:49, 10 January 2016 (UTC)
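To make the proposed behaviour concrete, here is a rough Python sketch of what a name=1 parameter would do. It is only an illustration of the idea described above, not the actual template or module code: the function render_link, the LANGUAGE_NAMES table, and the boolean name argument are all hypothetical stand-ins.

```python
# Tiny illustrative subset of language codes; the real data is far larger.
LANGUAGE_NAMES = {"de": "German", "fr": "French"}

def render_link(template, lang_code, term, name=False):
    """Render a term the way {{m}} or {{l}} might with a name=1 option.

    template: "m" (mention, running text) or "l" (link, used in lists).
    name:     when true, prefix the language name derived from lang_code,
              so the displayed name can never disagree with the code.
    """
    if template not in ("m", "l"):
        raise ValueError("template must be 'm' or 'l'")
    if not name:
        return term
    language = LANGUAGE_NAMES[lang_code]
    # Lists (as in Descendants sections) put a colon after the language name.
    separator = ": " if template == "l" else " "
    return f"{language}{separator}{term}"

print("compare " + render_link("m", "de", "Mutter", name=True))
print("* " + render_link("l", "de", "Mutter", name=True))
```

Because the language name is looked up from the same code that tags the term, a mismatch such as a French label on a German-tagged word cannot be written in the first place.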

  • Support. Great idea; I wish I'd thought of this myself. —Μετάknowledgediscuss/deeds 15:54, 10 January 2016 (UTC)
  • Support. This would relieve some of the pressure for misuse of {{inh}} and {{der}} in etymology sections, not to mention simplifying that {{etyl|xyz|-}} {{m|xyz|blah blah blah}} typing exercise. Chuck Entz (talk) 16:09, 10 January 2016 (UTC)
  • Oppose, redundant to {{cog}}. —CodeCat 16:19, 10 January 2016 (UTC)
    • I was unaware of {{cog}}; however, {{cog}} still isn't right for things like Descendants lists. —Aɴɢʀ (talk) 16:48, 10 January 2016 (UTC)
      • I can only support it for descendant lists if we also change the format of translation tables accordingly. —CodeCat 16:51, 10 January 2016 (UTC)
        • As in * {{t|de|Mutter|f|name=1}} instead of * German: {{t|de|Mutter|f}}? I'm down with that. —Aɴɢʀ (talk) 19:28, 10 January 2016 (UTC)
          • It's clearly a good idea as it prevents mistakes where you don't know which is the mistake, the language name or the language code. The template's already expanding the language code into a language name, so doing it twice shouldn't pose a problem. Renard Migrant (talk) 16:27, 11 January 2016 (UTC)
  • Support. Andrew Sheedy (talk) 23:12, 11 January 2016 (UTC)
  • Oppose only because I think it would look nicer to have a separate template for that, something like {{name-l|de|Mutter}} and {{name-t|de|Mutter|f}} (maybe that's not the best naming scheme, but I definitely think that whatever it is should be added to the left of the l or t). --WikiTiki89 23:21, 11 January 2016 (UTC)
    • I would support this over the other option. Andrew Sheedy (talk) 02:32, 12 January 2016 (UTC)

Away until Thursday.

I will be away for a few days. Normally, I would ask that the dictionary be finished by the time I get back, but this time I am going to be far more modest, and will merely ask that you get all of the WT:RFD and WT:RFV discussions finished and cleaned up. Of course, this includes WT:RFDO. Cheers! bd2412 T 18:47, 10 January 2016 (UTC)

This joke never stops getting old. —Aɴɢʀ (talk) 21:20, 10 January 2016 (UTC)
Perhaps it never stopped to begin to stop being a joke? --WikiTiki89 02:39, 12 January 2016 (UTC)

Let's make a WT:FUN competition in which the last person to edit Wiktionary and make it complete wins. --Daniel Carrero (talk) 23:11, 14 January 2016 (UTC)

I have an idea for a project: "A visual representation of the etymology of words using trees"

Hi everyone, I am new here. I am developing a project - "etytree" (click here) - to visualize the etymological tree of a word using an interactive tool. Basically I have built a graphical interactive web page using d3.js - a JavaScript library for manipulating documents based on data - where you search a word and then you visualize the etymological tree of the word (ancestors, cognates, on the same tree). Right now I have only developed a demo for 10 words and my next step will be to build an extractor of etymological relationships from Wiktionary. It won't be easy but definitely interesting.

A screenshot of the etymological tree of the English word 'butter' as produced by 'etytree'

I have some ideas on how to do it but I would like to get some feedback from people that are experts in the field / interested in the topic + I'm writing a grant proposal (click here) because I need funds.

Do you think this is the right place to describe my project and ask for feedback? Thanks! --Epantaleo

This is definitely a fascinating idea. Thanks for describing it. Not sure if this is the right place to ask for feedback, but if not, I'm not sure where is better. If you have specific questions about extracting the etymological links (which you are right won't be easy), you might ask at the Grease Pit. Benwing2 (talk) 09:33, 12 January 2016 (UTC)
This is really cool! UI could use some polishing (the ISO codes are a little small) but a great concept. —Aryamanarora (मुझसे बात करो) 23:01, 12 January 2016 (UTC)

Wiktionary:Word of the day/January 11

Could an admin please change the word of the day slightly, from "A snake" in sense 1 to "Any snake". The current formulation is confusing - there is still a snake called the "adder"; what has changed is calling other snakes "adders". Smurrayinchester (talk) 11:19, 11 January 2016 (UTC)

  • Done. I also replaced the red links with blue, one to a synonym that we have and another to a WP link (via {{vern}}). DCDuring TALK 13:49, 11 January 2016 (UTC)

Filling the CFI donut hole

At present, it's possible for an abbreviation, a derivative or a slang term of/for certain terms to be included as an entry, but not for the term that is being abbreviated, derived or corrupted to itself have an entry. This "donut hole" seems nonsensical to me, and we should expand CFI to fix it. Something along the lines of "Any word or phrase that has an abbreviation or derivative term derived from it that passes CFI will also pass CFI." IMO, no harm will be done to the project in allowing these additional entries. Will start a vote on it if others think filling this hole is a good idea. Purplebackpack89 15:33, 11 January 2016 (UTC)

I disagree; when the derived term is a word, include it, sure, but things that aren't words shouldn't get a free pass based on having a word derived from them. It's backwards thinking (as in literally, in the reverse order). Renard Migrant (talk) 16:25, 11 January 2016 (UTC)
At first blush I agree with Renard, but I might not be thinking of the same sorts of terms as you are so I would like some examples of types of entries. I am thinking of things like FBI, which is best handled with a link to Wikipedia I think. - TheDaveRoss 16:38, 11 January 2016 (UTC)
I think we're talking about things like AFAICT, the existence of which should (in Purplebackpack's view) or should not (in Renard Migrant's view) automatically permit the existence of as far as I can tell. —Aɴɢʀ (talk) 18:57, 11 January 2016 (UTC)
I would not favor the automatic inclusion extending to proper nouns, eg, inclusion of FTC should not lead to automatic inclusion of Federal Trade Commission. I haven't really looked for more subtle faults in the proposal, though I expect there to be quite a few. DCDuring TALK 23:04, 11 January 2016 (UTC)
FBI (Federal Bureau of Investigation) is as good in terms of an example as AFAICT. Purplebackpack's of course free to clarify but I do think he means literally everything. Also I don't think the distinction is nonsensical, it has clear boundaries and a clear rationale. Renard Migrant (talk) 23:15, 11 January 2016 (UTC)
I do mean everything, in case of acronyms or other derived terms. When I created this, I was thinking of acronyms of things that have never been created, were deleted, or (in the case of field goal percentage) were about one vote away from deletion. At present time, CFI as worded is too restrictive towards words like these. Yes, the result may create some words of dubious value to the project, but better to have a CFI that's overly broad than one that's overly restrictive. There may be quite a few, but I don't think it's as many as DCDuring thinks...and remember, whenever one of them goes to RfV, you'd be RfVing for two, because if the derived word fails RfD or RfV (and thereby fails CFI), the root does too. Purplebackpack89 00:39, 12 January 2016 (UTC)
Strongly oppose. This would force us to include many unnecessary phrases, like as far as I can tell, rolling on the floor laughing, and fucked up beyond all recognition. Oh and rolling on the floor laughing my fucking ass off (too bad I couldn't find cites for ROFFLMFAO, or that would have been one word longer). Not to mention, this would get very messy if the etymology is unclear or disputed: Should we include all of laughing out loud, laughing online, and lots of laughs? Should we go as far as to include fornication under consent of the king and for unlawful carnal knowledge? --WikiTiki89 01:29, 12 January 2016 (UTC)
I dispute your claim that those are "unnecessary". As for "should we...", I personally think we should. Also, why do you guys insist on drawing from the longer and more absurd side of the spectrum, rather than accepting that those are the price to pay for a great many shorter, more important and less controversial words and phrases? Purplebackpack89 02:30, 12 January 2016 (UTC)
Care to give an example of a "shorter, more important and less controversial" word or phrase that would otherwise fail CFI? --WikiTiki89 02:32, 12 January 2016 (UTC)
What's so bad about having laugh out loud, or true shooting percentage? Or any number of other things that are slipping my mind at the moment. As I've said, you need to look at the big picture, not just random seven word entries that happen to pop into your head. Your argument boils down to "I don't think we should have those entries", even though there are no technical limitations preventing us from having them. And don't say, "more entries means more vandalism", because it doesn't...more editors means more vandalism. Purplebackpack89 05:38, 12 January 2016 (UTC)
Also, let me correct a misconception Wikitiki, Equinox and DCDuring routinely mention whenever expanding CFI comes up. Expanding CFI doesn't force any thing. It doesn't mean that people have to go out and create some random seven word entry. It doesn't even mean people will go out and create that entry. Purplebackpack89 05:42, 12 January 2016 (UTC)
If laugh out loud should be included (and perhaps it should, as some kind of interjection), it should be included on its own merit, and not simply due to the existence of LOL. My point is exactly that your suggestion would include everything that would not merit inclusion on its own. Anything useful that would be covered by your suggestion would be covered by other CFI criteria. And perhaps we should add more criteria to CFI, but these criteria should judge terms on their own and not based on other terms. --WikiTiki89 15:45, 12 January 2016 (UTC)
@Wikitiki89 You're ignoring the perceived problem this thread seeks to address: that some definitions we have (such as LOL) are dependent on other definitions (such as laugh out loud). Can I at least get you to concede that people who don't know what laugh out loud means won't be able to discern what LOL means either? Once you've conceded me that, any chance I can get you to concede that this problem is compounded by the fact that many acronyms, abbreviations and corruptions aren't defined except by the thing they are abbreviating? And maybe you'll acknowledge that it isn't easy to figure out what laugh out loud means using only Wiktionary? (I doubt the majority of editors would think to look for laugh + out loud, laugh + out + loud doesn't really give you the proper definition and many wouldn't have the patience to look it up anyway). Purplebackpack89 17:53, 12 January 2016 (UTC)
I'm not going to concede anything. If LOL is inadequately defined, that's a problem only with the entry LOL. If laugh out loud is idiomatic and needs explaining, then the existence of LOL is still irrelevant. --WikiTiki89 18:20, 12 January 2016 (UTC)
So you believe it's somehow possible that you can know what LOL means even if you don't know what laugh out loud means? That doesn't seem to make any sense at all! If LOL stands for laugh out loud, and you don't know what laugh out loud means, you can't use the expression LOL properly. Also, if we had a definition for laugh out loud, we could just link to it and that would solve the problem of LOL's definition being inadequate. We have the functionality of linking entries to each other, might as well use it. Purplebackpack89 18:34, 12 January 2016 (UTC)
Let me rephrase. There are two possibilities:
  1. laugh out loud is SOP, which means that defining it doesn't help anyone understand anything that they couldn't have understood from looking up its parts.
  2. laugh out loud is not SOP, which means that your proposal is not necessary for it to have an entry.
In either case, I think defining LOL as "laugh out loud" is inadequate. --WikiTiki89 18:52, 12 January 2016 (UTC)
Setting SOP (which is bullshit, I might add; people have never proven that readers CAN actually make those connections) aside, if defining LOL as "laugh out loud" is inadequate, there are two ways to fix it:
  1. Add more to the definition of LOL
  2. Create the definition of laugh out loud and link to it.
I advocate the latter. Purplebackpack89 19:30, 12 January 2016 (UTC)
What I'm saying is that even if we had a research-paper length definition of laugh out loud, defining LOL as "laugh out loud" would still be inadequate. They are not the same thing. So yes, what I'm saying is we should add more to the definition of LOL. If SOP is "bullshit", then campaign against that. You still won't get much support, but at least you'd be being honest about what you want. Stop beating around the bush with these ridiculous proposals. --WikiTiki89 19:41, 12 January 2016 (UTC)
If I believe a proposal is a good idea, I'll make it. Equinox suggested I make this proposal after an RfD (or was it an RfV?) vote I made. I believe I've tried one-shot SOP dismantling already; that didn't work, so I'm settling for dismantling it piecemeal, starting with the most egregious examples. Purplebackpack89 05:14, 13 January 2016 (UTC)
The proof that human beings can make SoP "connections" is in the fact that people who learn a language can produce original sentences as well as individual words. Jesus H. Christ. Equinox 21:14, 12 January 2016 (UTC)
Equinox, that a) assumes that everybody who uses this dictionary has enough comprehension of English to get to the construction of proper sentences, b) two-word phrases are all constructed the exact same way sentences are, and c) a person would never voluntarily look up a two-, three- or four-word phrase unless it passed our CFI. I can't in good faith make any of those leaps, sorry. Purplebackpack89 05:14, 13 January 2016 (UTC)
Chances are that if someone is using an English dictionary, they know enough English to construct proper sentences, or at least understand what is proper and what isn't. I'm not perfectly fluent in French, but whenever I stumble across a French phrase with which I am not familiar, I am nearly always able to intuitively determine whether I should look up an individual word or the entire phrase. I think you underestimate people's intelligence, and I don't think you understand that understanding English is a prerequisite for using a dictionary written entirely in that language. Andrew Sheedy (talk) 05:47, 13 January 2016 (UTC)
"laugh out loud" being SOP, Purplebackpack89's reasoning could be used to argue the inclusion of any SOP construction, couldn't it? Suppose there's a person who does not understand properly the sentence "I see dead people" (which is SOP). That being SOP, he/she should look for I + see + dead + people. Suppose we are discussing whether we should have an entry I see dead people. The argument goes this way: if there's an abbreviation ISDP for it, keep that entry, otherwise delete that entry. --Daniel Carrero (talk) 08:16, 13 January 2016 (UTC)
  • Oppose per Wikitiki. Our dictionary would be rendered a laughingstock were we to include these. This is not the right way to go about making CFI more inclusive. —Μετάknowledgediscuss/deeds 02:34, 12 January 2016 (UTC)
    We're a laughingstock as it is because people can't find the definitions of words we need, @Metaknowledge. As usual, people seem to ignore the fact that if a person can't find the definition that they are looking for, they leave Wiktionary, find it somewhere else, and probably continue using that somewhere else instead of Wiktionary. The whole "if we do this, we'll be a laughingstock" line of "argument" is completely fallacious. Purplebackpack89 05:35, 12 January 2016 (UTC)
    My experience is that people generally have an idea of what words can be expected to be contained in a dictionary, and they tend to find them in languages where we have good coverage. But why don't you amend your suggestion instead of baselessly calling criticisms of it fallacious? —Μετάknowledgediscuss/deeds 05:38, 12 January 2016 (UTC)
    @Metaknowledge The reason I'm critical of your "laughingstock" claim is you haven't said why we'd be a laughingstock. You seem to be implying that anything above the "general idea" (which, I might add, doesn't exist, at least not a single one that's anywhere near the same for everybody) is a waste of space, and if people discover we have entries above and beyond the "general idea", they will think it absurd for one reason or another. That idea has no basis in either provable fact or in common sense; the people most likely to look for/find the definitions I'm proposing to include are the people the least likely to find it absurd that we have them. Furthermore, there is no technical need to artificially constrain ourselves to the words we're expected to have. Maybe you meant something else when you said what you did, but as you've said nothing else, it's hard to believe otherwise.
    As for amending my proposal, a) I truly believe the project would be better if the proposal were adopted verbatim, and b) I don't really know in what direction I'd go to amend it to placate you, Wikitiki and others. Purplebackpack89 05:57, 12 January 2016 (UTC)
@Purplebackpack89 If the spectrum that would be included by your proposal includes terms that look like ones we wouldn't want, you need to come up with some wording that doesn't include them. A "proposal" that consists of a grand statement, some hand-waving, and hope for good outcomes obviously isn't going to satisfy those who are skeptical of the desirability of quantity increases at Wiktionary. A policy proposal needs a little bit more thought. DCDuring TALK 03:30, 12 January 2016 (UTC)
Well, we can spend the rest of this thread thinking and discussing. It doesn't have to be perfect on the first go. Purplebackpack89 05:32, 12 January 2016 (UTC)
@Purplebackpack89 Impulse control is one of the wonderful consequences of having a deliberative body to mull over questions of policy. DCDuring TALK 11:09, 12 January 2016 (UTC)
Oppose Your initial premise has a tantalising air of plausibility, but does not stand up to careful thought. Some abbreviations which require explanation refer to perfectly ordinary language which does not. Some words which require explanation have (typically variable) abbreviations used in context, which do not require explanation. Therefore the two inclusion decisions are somewhat independent. Of course, when considering any particular case, the existence of an abbreviation entry can be taken into account. And in the end I do not think you have given a single convincing example. Imaginatorium (talk) 08:55, 12 January 2016 (UTC)
I think on a usability level, this is not a user-friendly proposition. Creating an entry for laugh out loud in order to change [[laugh]] [[out loud]] to [[laugh out loud]] is not user friendly, because a user would be better off understanding what laugh and out loud mean. This won't help anyone understand more words and abbreviations. I should note that that's not the intention of this proposal either, so I wouldn't expect it to. But I see no value in having entries allowed per WT:CFI that won't help anyone understand any words or phrases. Renard Migrant (talk) 18:32, 12 January 2016 (UTC)
I've never bought into your line of reasoning that we're somehow more user-friendly with fewer entries. The only thing that is user-friendly is having all the entries people would look for. If a person looks for laugh out loud, they should be able to find it; it's doubtful whether even looking for out loud would even cross their mind. And the entry for "laugh out loud" would help that person understand the phrase "laugh out loud". I'm sorry, but since you insisted on bringing user-friendliness into this, my belief is that perfect user-friendliness would call for a complete abolition of SOP and anything else restricting verifiable entries. That's not what I'm advocating in this proposal, but that's what would generate optimum user-friendliness. Furthermore, "helping people understand more words" isn't the same thing as user-friendliness. Purplebackpack89 05:07, 13 January 2016 (UTC)
Because users need to be able to speak English as opposed to learning phrases verbatim. If you learn "I have a cat" verbatim you won't know what "I have a dog" means because you don't know what any of the individual words mean. But if you learn the words I, have, a, cat and dog you'll know what "I have a cat" and "I have a dog" mean. Are you genuinely saying it's just a numbers game? We should be based purely on the number of entries we have, not what they are and what they contain? I mean, we could have a picture of a tall man with a dog on his lap just purely because it's one more entry than we have now. Like I said, you're not claiming this is a user-friendly feature and I think you're right not to, as the aim of this proposal is not to make Wiktionary more user-friendly; it's to satisfy you personally. Renard Migrant (talk) 14:34, 13 January 2016 (UTC)
In many ways, being more user-friendly and satisfying me personally are one and the same, because making the project more user-friendly is one of my goals for the project. If you're thinking about user-friendliness, you need to think less about what people are learning and more about what people are looking for. I have a cat is not something that would be allowed by this proposal, but something like laugh out loud or anything else that's commonly acronymed is. A person who is searching for the definition of "laugh out loud" might not want to or think to break it into its component parts, and even if he/she does want to or think to, giving them only one avenue isn't user-friendly. In essence, Renard, your line of reasoning forces people to look for certain things. To do so isn't user-friendly. Am I saying that user-friendliness is a numbers game? Yeah, pretty much. And I again say that what you're advocating isn't user-friendliness per se. Purplebackpack89 14:48, 13 January 2016 (UTC)
Don't be so sure about 'I have a cat'. Also, you don't have a monopoly on what constitutes user friendliness, on who users are, or on what their needs might be. The argument that something should be included merely because it might be of use to someone at some point is irrelevant, we also have a scope. We don't try to be Wikipedia, we don't try to be a stock price index, we don't try to be IMDB. We are trying to be a dictionary. The argument is that all of the phrases which you propose to include are not material for a dictionary. - TheDaveRoss 15:05, 13 January 2016 (UTC)
But, @DaveRoss, I am entitled to my opinion of user-friendliness and inclusiveness, and I am entitled to advocate that policy reflects my point of view. Purplebackpack89 15:11, 13 January 2016 (UTC)
Oppose because of...everything written here. SOP and WT:CFI exist specifically so we don't have entries like laugh out loud. —Aryamanarora (मुझसे बात करो) 23:11, 12 January 2016 (UTC)
To continue with the metaphor, what's wrong with a donut with a hole in the middle? Renard Migrant (talk) 14:34, 13 January 2016 (UTC)
If there wasn't a hole in the donut, there'd be more donut. Purplebackpack89 14:48, 13 January 2016 (UTC)
Without a hole it wouldn't be a donut, and there would be fewer of the doughballs than there would have been donuts given the same amount of dough. DCDuring TALK 15:03, 13 January 2016 (UTC)
I believe the two are made separately. Purplebackpack89 15:11, 13 January 2016 (UTC)

Internationalisms in etymologies

In the modern world, many languages that came later than others into a particular academic field or the like borrowed many words from an international pool of terminology, rather than from any particular language. These are then often naturalized in a systematic way, which helps hide the direct source of the borrowing, if there even was one. The most prominent example of this are words with the suffix derived from Latin -tiō. Good examples are virtually all the words in the translation tables at radio, civilization, and physics. My question is, how should we handle the etymology sections of these terms? One thing we sometimes do is say "Ultimately from Latin/Greek X", but I find that insufficient. --WikiTiki89 20:03, 12 January 2016 (UTC)

I quite like "coined based on". Renard Migrant (talk) 22:37, 12 January 2016 (UTC)
Coined based on what? It's not the wording that's the problem, it's what do we link to? --WikiTiki89 23:19, 12 January 2016 (UTC)
You're going to have to rephrase it then, I don't understand. Renard Migrant (talk) 14:22, 13 January 2016 (UTC)
Ok, so give me a full example of the etymology section of, let's say, Turkish radyo, and I'll explain what I mean based on that. --WikiTiki89 17:53, 13 January 2016 (UTC)
In some cases, a little research will reveal which language the scientific word was first coined in. For example, homosexual was first coined in German, though most languages' words look as if they come from a New Latin homosexuālis. —Aɴɢʀ (talk) 16:02, 13 January 2016 (UTC)
Yes, but homosexual did not go directly from German to every other language. --WikiTiki89 17:53, 13 January 2016 (UTC)
I see what you're getting at. Many languages created their cognate terms for homosexual at a time when cognates had already established themselves in many languages (German, English, New Latin, French, Russian, etc), and so the creation probably proceeded along the lines of "well, everyone else calls it this" rather than "German calls it this". It's probably still possible to decide which specific language it was borrowed from / coined based on in a lot of cases, but in those where it isn't, I'd say something like "Coined based on English homosexual, German homosexuell, French homosexuel, etc, as if from New Latin homosexualis". - -sche (discuss) 01:37, 15 January 2016 (UTC)
Yes, that's exactly what I'm talking about. The problem is, that's a lot to add. This is a very common situation in many languages and I was hoping we could get some kind of standard format for it. Also, the problem remains of which language to categorize it under. Perhaps we shouldn't categorize it under any language and create a new category for "Internationalisms". Also, should we link to the "New Latin" term? --WikiTiki89 03:08, 15 January 2016 (UTC)
Yeah, the drawback to what I suggested is that it's a lot to add and a lot that will get duplicated (potentially) across many entries. Perhaps we could say "From English foo and cognates thereof in other languages", potentially using a template to keep the wording the same across many entries, and then let foo#English list the other cognates. Or for words derived as if from something Latin, link to the Latin entry rather than the English entry. - -sche (discuss) 02:59, 16 January 2016 (UTC)
I often encounter this problem when adding Esperanto etymologies—Zamenhof and other important Esperantists seem to have coined a lot of words based on whatever word French, Italian, Spanish, English, German, and Russian (or some combination of those) have in common. When it's a case of simply taking a root shared by most major Romance languages, I use the phrase "common Romance", as in rompi and dento, but I think this is an Esperanto-specific solution, and it doesn't work for all cases. In other cases, I list a few of the languages, as in adjektivo (something like what -sche suggests above). It would be good to have a standard way to deal with this across languages. —Mr. Granger (talkcontribs) 23:46, 15 January 2016 (UTC)
Lojban uses {{jbo-etym}} and faces a similar situation. —Aryamanarora (मुझसे बात करो) 03:09, 16 January 2016 (UTC)

long enough to qualify

I think it’s time that DerekWinters be made an admin. It was suggested almost a year ago by WF, but some were against it at the time because he had not been around long enough, or because WF had proposed it. DerekWinters has made 6600 edits on Wiktionary and has been active here since 14 October 2012 (over three years). —Stephen (Talk) 20:15, 12 January 2016 (UTC)

I'm sure he's trustworthy and experienced enough, but the real questions are: Does he want to be an admin? And what will he contribute as an admin? --WikiTiki89 20:24, 12 January 2016 (UTC)
If he never uses any admin powers, he would still be an outstanding representative of our slogan, based solely on his Babel. The practical value of having admins who can communicate in such a range of languages and scripts seems important to me. DCDuring TALK 22:35, 12 January 2016 (UTC)
I didn't ask what could he contribute, but what will he contribute. Only he himself can answer that. @DerekWinters: I'll ask you directly: Do you want to be an admin? And what would you contribute as an admin, if you were made one? --WikiTiki89 22:58, 12 January 2016 (UTC)
I wouldn't mind being an admin, especially because I'll be able to speedy delete some of the mistakes I make. But I honestly don't know what I'd be able to contribute should I become one. I definitely can speak a few languages rather well, and it is quite the hobby of mine to master other writing systems, but several of them are often unused in day to day matters. I however do believe I have been useful in making transliteration modules and some declension and headline templates and I have been increasing the lemma-count of several underrepresented languages, but I'm not entirely sure that being an admin would allow me to do this significantly better. DerekWinters (talk) 03:40, 13 January 2016 (UTC)
One thing we definitely need is better patrolling of non-European-language edits in Recent changes. There are lots of cases where I may check for defacing of entries or insertion of out-of-place text, but I have no clue whether the non-English content is correct or is deliberately-planted offensive nonsense. There are a few admins who know the languages, but they don't always have the time. Even if you patrolled only a fraction of the edits, it would be an improvement. Chuck Entz (talk) 04:05, 13 January 2016 (UTC)
I would be able to help with that. DerekWinters (talk) 04:14, 13 January 2016 (UTC)
Should this be formally voted on, I'd support. Mainly because of this cool list that he gave me. —Aryamanarora (मुझसे बात करो) 23:03, 12 January 2016 (UTC)
  • Stephen is being somewhat dishonest about why DerekWinters was not made into an admin when WF suggested it. The reason can be seen at Wiktionary:Votes/sy-2015-06/User:DerekWinters for admin, where I opposed because DerekWinters created entries that did not meet CFI more than once over a long period of time, and when he was told about this or pinged in RFVs of his protologisms, he did not once respond to the best of my knowledge. He continues to create entries in languages he doesn't speak, and he has not demonstrated that he even recognises the problem. You can see that he often ignores messages left to him about problems with his editing at User talk:DerekWinters. It doesn't matter how long someone has been active on Wiktionary: I still cannot support a candidate for sysophood whose edits cannot be trusted, and who admits himself he has little or no use for the tools. —Μετάknowledgediscuss/deeds 03:49, 13 January 2016 (UTC)
I admit that I had added some terms that did not meet the CFI simply because they were words I found to be beautiful. However, since then I have been ascertaining that any term I add to the project most definitely meets the CFI requirements. I do apologize for not having responded then for the RFVs. However I also do believe from what I've seen that many editors add terms in languages they do not speak. DerekWinters (talk) 04:14, 13 January 2016 (UTC)

The Quality of the Macedonian Entries on Wiktionary

I would like to point out to anyone who may find it of interest that many (I use the term "many" hyperbolically) users who are not fluent speakers of Macedonian are freely contributing to the Macedonian corpus on the English Wiktionary with little to no concern for making errors. So, they're basically polluting the body of Macedonian entries, not with minimal oversights such as failing to mark a literary word as such (of such oversights I am at times guilty myself), but with grave inaccuracies such as allowing blatantly wrong suffixes to be generated in the inflection tables, be it willingly or inadvertently. This saddens me greatly because I've invested so much effort into creating Macedonian entries, only for some B1-level speaker to taint them all by adding a -о vocative form to a feminine noun which actually has an -е vocative form. Indeed, it's not as though the errors of other users don't affect me whatsoever - most people consulting online dictionaries view those dictionaries as integral units, so if they detect an error in the Macedonian Wiktionary for which some non-fluent speaker of Macedonian is liable, they will deem the entire body of Macedonian entries unreliable, such that the reputation of all of my own entries will be marred - they will be reduced to collateral damage.

My entries aside, the mistakes made by unskilled and/or reckless users trying to enhance the set of Macedonian entries are naturally to the detriment of anyone trying to learn Macedonian from this project, yet if no one notices them by chance, there is no systematic way in which they can be detected and subsequently resolved. I personally try to hunt down faulty Macedonian entries and correct anything that needs correction, but I am not always able to do this. First of all, I don't devote attention to Wiktionary regularly; second of all, I don't get a notification every time a Macedonian entry is created or modified. All I can do (as far as I am aware) is check the contribution history of users whom I have already observed making Macedonian contributions. Either way, it's not as though I'm a moderator of the Macedonian part of Wiktionary - I haven't assumed any official obligations. I'm just trying to direct the attention of other concerned parties to the fact that in the absence of a moderator and a strict system of regulations, the set of Macedonian entries is left at the mercy of whoever feels the whimsical desire to tamper with it. This is not so with the corpora of many other languages, e.g. French or Japanese - there are so many active users that speak those languages here that mistakes can't simply weave themselves into the project with absolutely no one taking heed of that. On the whole, I feel that something should be done about this issue, although I don't have any concrete proposals - the fact that there aren't enough users fluent in Macedonian here makes everything so infeasible. Martin123xyz (talk) 18:23, 13 January 2016 (UTC)

To begin with we could publish regular reports documenting all changes to Macedonian entries, including the person making the contribution. This will only be of use if there are people capable of reviewing those reports, but it might help you find things to look at when you do have time. - TheDaveRoss 18:28, 13 January 2016 (UTC)
I find this suggestion agreeable - indeed, I wouldn't be able to review those reports regularly (so I don't think it's necessary for you to produce them too often, e.g. weekly - every two months would be better), but even if I manage to devote attention to them a year or two after their creation, they will not have been in vain. I will correct whatever needs correction belatedly, and that is certainly better than nothing. Either way, I hope that it will be possible for me to filter my own contributions out of those reports, so that I can focus on the ones by other users (though there will obviously be no contributions from me during the breaks I take, e.g. the one I've just started). Martin123xyz (talk) 10:32, 14 January 2016 (UTC)
@Martin123xyz I can try and work on this this weekend, if you could provide me a list of editors who you would like to whitelist that would be great. Also, would you like to "trust" any entry which is most recently edited by a trusted editor, or only edits which were made by those contributors? - TheDaveRoss 20:42, 21 January 2016 (UTC)
@Martin123xyz Check out User:TheDaveRoss/Macedonian/р and let me know what you think. It is a relatively slow process, since I don't want to download and extract the full revisions dump, but I think I could get all of the Macedonian page audits in this format in an hour or two. - TheDaveRoss 22:31, 25 January 2016 (UTC)
@TheDaveRoss I don't think I can really provide you with a proper whitelist of editors, because I'm hardly familiar with the Wiktionary users who have created or modified Macedonian entries so far; moreover I cannot predict the ones that will do so in the future. Indeed, I have identified three users to be blacklisted so far, but I haven't identified any trustworthy ones. I suppose that Bjankuloski06~enwiktionary could be assigned to that category - after all, he's a native speaker. Meanwhile, I don't think that he's been active on Wiktionary recently, but naturally, that doesn't necessarily mean anything. Anyhow, I don't understand your question about what edits I would like to "trust"; could you please rephrase it (I don't understand what "trusted editor" as opposed to "those contributors" implies)? As for the link you've provided, I've looked at the table you've generated and I think I like it, but I don't understand why some sections have comments whereas some don't. Furthermore, I think it's impractical to have a separate row for each edit on every entry (that makes reading the table cumbersome, i.e. long and messy). I would only be interested in being notified that an entry has been edited to begin with; then I could go take a look at it to see what exactly has been changed. Also, I'm worried about chronological order in the table - edits from many years ago are shown together with more recent edits for all of the words. Could you program it to show only recent edits and to sort words according to the date they were edited, rather than sorting them alphabetically, and then sorting the edits of each one independently after that? If not, I would have to go through all 11,000 + Macedonian pages to check if something has been edited, rather than just looking at the top. Well, at least that's the impression I'm getting.
I'm sorry if I'm making inapposite requests or arriving at absurd conclusions - I have a very poor understanding of how coding works, so I can't imagine what your table can and cannot do. Either way, I really appreciate your interest in cooperating with me. Martin123xyz (talk) 16:51, 26 January 2016 (UTC)
@Martin123xyz The rows without comments are edits which did not have an edit summary. The current contents are just edit histories of the Macedonian sections, excluding certain bots. My thinking was that you could scan down the page and, if you saw an edit which was suspicious, click on the link to see the diff. I am not totally sure how you intend to use the results.
If you would prefer to only see the most recent edit, I can do that. If you would prefer to see only entries which have been edited since some particular date, I can do that too. Just let me know what you would like to see and I can try and accommodate. I just assumed the alphabetical was the most convenient ordering, if you would like chronological by most recent edit that is also possible. - TheDaveRoss 17:04, 26 January 2016 (UTC)
@TheDaveRoss Thank you for the prompt reply. I would only like to see the most recent edit for all Macedonian entries, and if it hasn't been made by me, I'll check it. I would also prefer to see entries from the 12th of January (which is when I terminated my last contribution spree) until whenever I start contributing again (at which time the dates will be reset, presumably). Finally, it would be nice if the table were ordered chronologically based on the recentness of the edits. Martin123xyz (talk) 19:27, 26 January 2016 (UTC)
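The report requested in this exchange (only the latest edit per entry, only from a cutoff date onward, skipping one's own edits, newest first) could be sketched roughly as below. This is only an illustration in Python, not the script actually used to build User:TheDaveRoss/Macedonian; the record fields (`entry`, `user`, `timestamp`), the sample data, and the function name `latest_edits_since` are all assumptions made up for the example.

```python
from datetime import datetime, timezone


def day(year, month, dayofmonth):
    """Helper: a UTC timestamp at midnight (hypothetical convenience function)."""
    return datetime(year, month, dayofmonth, tzinfo=timezone.utc)


# Invented sample edit records; real data would come from page histories.
SAMPLE_EDITS = [
    {"entry": "entry_a", "user": "Martin123xyz", "timestamp": day(2016, 1, 10)},
    {"entry": "entry_a", "user": "OtherUser", "timestamp": day(2016, 1, 20)},
    {"entry": "entry_b", "user": "OtherUser", "timestamp": day(2016, 1, 5)},
    {"entry": "entry_c", "user": "OtherUser", "timestamp": day(2016, 1, 15)},
    {"entry": "entry_c", "user": "Martin123xyz", "timestamp": day(2016, 1, 25)},
    {"entry": "entry_d", "user": "ThirdUser", "timestamp": day(2016, 1, 22)},
]


def latest_edits_since(edits, cutoff, exclude_user):
    """Return the most recent edit of each entry, newest first.

    An entry is listed only if its latest edit is on or after `cutoff`
    and was not made by `exclude_user`.
    """
    latest = {}
    for edit in edits:
        current = latest.get(edit["entry"])
        if current is None or edit["timestamp"] > current["timestamp"]:
            latest[edit["entry"]] = edit

    report = [
        edit
        for edit in latest.values()
        if edit["timestamp"] >= cutoff and edit["user"] != exclude_user
    ]
    report.sort(key=lambda edit: edit["timestamp"], reverse=True)
    return report


if __name__ == "__main__":
    for edit in latest_edits_since(SAMPLE_EDITS, day(2016, 1, 12), "Martin123xyz"):
        print(edit["timestamp"].date(), edit["entry"], edit["user"])
```

With the invented sample data, entry_b drops out because its last edit predates the cutoff, and entry_c drops out because its latest edit is by the excluded user. Resetting the cutoff after each contribution spree would just mean passing a later date.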
@Martin123xyz can you point to some entries that had incorrect content added by a non-speaker? — Ungoliant (falai) 18:56, 13 January 2016 (UTC)
I will present, albeit with reserve (since I do not particularly wish to defame anyone or cause them any other form of inconvenience), three different entries (there are many more I have in mind, but three are enough to serve as illustrative examples) created or modified by three different non-speakers (I judge that they are non-speakers by the information on their profile pages) - народ (narod) (which was marked as feminine, whereas it is masculine; even if this was a coding error, it was nonetheless alarming), дојде (dojde) (which was given a more regular but either way invented past tense ("дојдол" instead of "дошол")), and ниво (nivo) (whose correct plural form, "нивоа", entered by myself, was changed to "нива", as though it were a regular neuter noun in -o, rather than a French loanword with a final stress). I have now taken care of all these entries, such that no errors are observable in them, but one can review their histories to see what their earlier condition was like. Either way, I must mention that I don't require whatsoever that all users contributing to Macedonian without speaking the language fluently be prohibited from doing so - their work can indeed prove useful at times. Indeed, there are many entries created by non-speakers of Macedonian which are of decent quality. Furthermore, a non-speaker once corrected a mistake I had made myself out of inattention, namely allowing plural forms to be generated for чаре (čare), which is actually a singulare tantum. It's just that non-speakers appear to be unable to contribute in a favourable manner consistently. Martin123xyz (talk) 10:16, 14 January 2016 (UTC)
@Martin123xyz I just want to say, thanks for your work! I've noticed many of your contributions appearing on various pages (in particular, those with a Russian term that's spelled the same), and I definitely appreciate the effort. I know it can be difficult or lonely working on a language without many Wiktionary contributors; I ran into this issue when I was working on Arabic entries. Benwing2 (talk) 18:52, 14 January 2016 (UTC)

Thank you very much, Martin123xyz, for making a great contribution for the Macedonian entries. --KoreanQuoter (talk) 05:54, 15 January 2016 (UTC)

Thank you for the compliments (they're not exactly relevant to the topic I'd introduced, but it's nice to receive them :) ) - I'm glad that people consider my entries useful. I greatly enjoyed creating them (my frustration with Wiki code aside). Martin123xyz (talk) 08:58, 15 January 2016 (UTC)
What do you mean by "Macedonian Corpus"? I think you mean "the entries in Macedonian", whereas "Corpus" normally means something quite different from dictionary entries (i.e. "a corpus of text(s)"). I strongly suggest renaming this as "The Quality of the Macedonian entries", not because your title is "wrong", but because it could lead to confusion... Imaginatorium (talk) 08:05, 17 January 2016 (UTC)
Thank you for the correction - I have made appropriate modifications (as I saw fit). Martin123xyz (talk) 19:37, 18 January 2016 (UTC)

A conference about French Wiktionary at Wikimania?

Hello, English-speaking wiktionarians!

We are three French-speaking wiktionarians who very much want to attend the annual Wikimania conference. We want to share our experiences with others on various topics. To keep it short, you can go directly to our draft via this link. We have until Sunday to send it, so only a few days, but any help is welcome, especially regarding the language. As you can probably guess from reading my prose, English is not my mother tongue. Plus, we want to know what you would like to hear from us, and to involve the community as much as possible. You can react here or there, as you prefer. Thanks a lot in advance! Noé (talk) 21:34, 13 January 2016 (UTC)

I have edited the text a little to make it sound more "nativelike". I hope I preserved all of the meaning. Good luck with your project. Currently, the various Wiktionaries have different markup schemes and different policies, so the scope for collaboration seems a bit limited... Equinox 23:31, 13 January 2016 (UTC)
Thanks Equinox and Koavf for proofreading! I think we are going in the same direction without having a proper understanding of each other's paths. I plan to translate our 2015 report into English to gather your comments on it. I think we need to start talking about other Wiktionaries' policies to see if it may be a good idea to adopt them. I don't want any kind of supervision, but rather to publish thoughts about our own projects and to discuss others' votes and decisions. It takes a lot of energy and we need bilingual people to help, but I think it has to be one of our goals in the future. Noé (talk) 13:34, 14 January 2016 (UTC)

SI prefixes

About the SI prefixes:

Y Z E P T G M k h da
y z a f p n μ m c d

Shouldn't they be named like normal prefix entries, that is, with a hyphen at the end?

Y- Z- E- P- T- G- M- k- h- da-
y- z- a- f- p- n- μ- m- c- d-

Some of these entries already exist, defined as prefixes in various languages. I find it amusing that μ- is defined as "Abbreviation of micro-." in English. Plus, a- has a Translingual section, but it is not defined as an SI prefix. --Daniel Carrero (talk) 23:35, 14 January 2016 (UTC)

I don't see how they are grammatically prefixes. Sticking abbreviations together is not morphology. Equinox 23:49, 14 January 2016 (UTC)
Daniel, it’s an interesting point, but we categorize them as symbols, not as prefixes. — TAKASUGI Shinji (talk) 01:06, 15 January 2016 (UTC)

EL: Language vote

Of all the five votes that are going to end in the next few days, please direct your attention specifically to Wiktionary:Votes/pl-2015-12/Language.

Reason: It has few votes: 1-0-2. Please vote on it, abstention is fine too, IMO. End date: January 20. Thanks. --Daniel Carrero (talk) 10:25, 15 January 2016 (UTC)

Entries for suffix-like words

Per a suggestion at Requests for Deletion under the current discussion of -mongering, it might be a good idea to keep entries for words that are likely to be searched for as suffixes (with a leading hyphen) as redirects to the unhyphenated entries. Several editors participating in the discussion either assumed that -monger and -mongering were suffixes, or felt that they could be considered suffixes when attached (suffixed) to the end of other words. Since the words are rarely encountered except in compounds, one might expect a large percentage of people looking for definitions, etymologies, or other words formed with them to search for them with a leading hyphen. The same must be true of many other words that may not technically be considered suffixes, but which are frequently placed at the end of compound words. However, these searches usually turn up no results, frustrating the user, and in at least some cases probably leading to the creation of entries that are subsequently nominated for deletion.

Therefore, my suggestion is that we convert entries such as these into redirects to the entries that cover the intended meaning. -house would redirect to house; -wall to wall, -monger and -mongering to monger and mongering (or both to monger), etc. That would solve the problem of people looking for them as suffixes and not getting any results at all. It wouldn't involve a great deal of work; the redirects could be created as needed or converted as they appear; and if any legitimate suffixes happen to exist with the same spelling, then a sense could be added with wording such as, "house used in a compound word" (just using "house" as an example; I know it won't have a corresponding suffix entry). P Aculeius (talk) 15:34, 15 January 2016 (UTC)

I would support soft redirects, but not hard redirects. --WikiTiki89 15:37, 15 January 2016 (UTC)
I'll also add that these should only be words that people would tend to look up as suffixes. --WikiTiki89 16:25, 15 January 2016 (UTC)
I support the idea, but I think it should be limited to words that have a relatively high percentage of usage as a compound element. In other words, I’d include -monger and -mongering but not -house and -wall.
My preference is for hard redirects, but if soft redirects are used they should use the correct POS instead of suffix. — Ungoliant (falai) 16:18, 15 January 2016 (UTC)
I support hard redirects. Definitions in the target entry should have an appropriate label if use in combination is not rare, ie, (usually/often/also in combination). DCDuring TALK 16:36, 15 January 2016 (UTC)

[ä]

What should I use to write [ä] (Open central unrounded vowel) in IPA for entries? For Hindi, I see many entries with [ɑ] and rarely [a], even though Hindi should be using [ä]. —Aryamanarora (मुझसे बात करो) 21:48, 15 January 2016 (UTC)

Personally, I'm for writing ⟨ä⟩ when applicable. Korn [kʰʊ̃ːæ̯̃n] (talk) 22:32, 15 January 2016 (UTC)
Same here, just wanted to know the conventions here. [ä] isn't in the official IPA guide and is commonly replaced with the other open vowels in transcription. —Aryamanarora (मुझसे बात करो) 22:59, 15 January 2016 (UTC)
It's part of IPA nonetheless. I think in the last discussion I had about that here, a few people were of the opinion that one should use ⟨a⟩ instead, because our poor users might otherwise be scared and confused by something as uncommon as ⟨ä⟩. But for me /ä/ is simply a cardinal vowel like all others. I think another practice is to use /ɑ/ when [ä] phonemically behaves like a back vowel in twofold systems like vowel harmonies or consonant palatalisations. Korn [kʰʊ̃ːæ̯̃n] (talk) 12:56, 17 January 2016 (UTC)
I consider [a] sufficient for cases where a language does not have both [a] and [ä] as distinguishable allophones, but I do not oppose the more exact practice of using [ä] either.
On the other hand, sometimes I've seen people using [ɐ] for this purpose, which I find a poor idea: the symbol indicates specifically a near-open vowel, not a fully open one (and is usually only used in the transcription of languages that have both /a/ and /ɐ/).
In phonemic transcription it's recommendable practice to keep it simple, and thus e.g. use /a/ even for [ɑ] if there are no other open vowels, or /u/ even for [ɯ] or [ʊ] if there are no other close back vowels. But that might not be much of an issue around here. --Tropylium (talk) 11:48, 19 January 2016 (UTC)
For dictionary-writing purposes it's almost never necessary to use [ä]. The IPA vowel diacritics are great when you're discussing the fine details of phonetic realization, such as in a discussion of allophones in various contexts or when comparing the vowel systems of two distinct languages or dialects. But in a dictionary, what's important is the phonemes and maybe their most common, widespread allophones, and for that the IPA recommends using the typographically simplest symbol in the neighborhood. Although the cardinal vowel ɑ is defined as maximally back and maximally low, that doesn't mean that only a maximally back and maximally low vowel is correctly transcribed with ɑ. If a language has only one unrounded back low vowel, then ɑ is the correct symbol for it, even if (to judge from the vowel chart at Hindustani phonology) that vowel is closer to being central than being maximally back. Using ä for a language's only vowel in the low back unrounded range, or worse yet, for a language's only low vowel, is an example of false precision that we should avoid. —Aɴɢʀ (talk) 13:04, 19 January 2016 (UTC)
Yep, that chart is accurate. Thanks for the information everyone! I'm going to use /ɑ/ for Hindi since it's the only low vowel. —Aryamanarora (मुझसे बात करो) 18:58, 19 January 2016 (UTC)

Module errors on the vote box

Discussion moved to Wiktionary:Grease pit/2016/January#Module errors on the vote box.

Gadgets

Is twinkle not available on this wiki? Ipadguy (talk) 23:48, 16 January 2016 (UTC)

Categories like Category:French verbs with conjugation er

@Kc kennylau I think these should be named with a hyphen, e.g. Category:French verbs with conjugation -er. Also, when you create them you should probably create a catboiler to generate the content. Benwing2 (talk) 00:53, 17 January 2016 (UTC)

@Benwing2: First part done; second part no idea what to add yet. --kc_kennylau (talk) 09:18, 17 January 2016 (UTC)
@Kc kennylau Check out {{fr-verbconjcat}}. Benwing2 (talk) 12:16, 17 January 2016 (UTC)
@Benwing2: Thank you. --kc_kennylau (talk) 13:38, 17 January 2016 (UTC)
We already have Category:French first group verbs. Renard Migrant (talk) 15:19, 17 January 2016 (UTC)
These aren't quite the same thing, though. There are categories like Category:French verbs with conjugation -cer that don't have an equivalent. Benwing2 (talk) 20:51, 17 January 2016 (UTC)

Module:ugly hacks

(Firstly, I apologize for violating the intention of this module by mentioning it here, since this would be equivalent to advertising for that module, which is not the writer User:Kephir's intention.) Shortly after Kephir decided to discourage the use of that module, he used that module himself on Template:en-verb ([1]). My question is, what is the current (unofficial) policy towards the use (or the discouragement thereof) of this module? Should I refactor Template:en-verb (as well as all the other templates that use this module) so that it no longer uses this module? --kc_kennylau (talk) 09:17, 17 January 2016 (UTC)

Next votes

I would like these to be the next WT:EL votes to start. Please review them and see if they are OK.

Plus I created a poll. It was scheduled to start in 1 month and last for 3 months.

But I'm probably going to oppose the current proposal of this poll, even if I'm the creator! It's just that the issue has been brought up repeatedly before and IMO it's better worded as a new proposal but I'd prefer the status quo. Please edit/change it too if you'd like.

Cheers! --Daniel Carrero (talk) 10:45, 17 January 2016 (UTC)

Complete entry template

I'm playing with the idea to make a template which triggers a row of subtemplates which create an entire entry from scratch for Middle Low German. So the final entry would look like any other to the user, but for editors it would be thus:
----
{{gml-entry|1|2|3|4|5|6|7|8}}
----
Atm it's just a random fleeting idea, but I figured before I even playfully muse about it, I'd ask whether there would be any problems with such a template/form of entry. Korn [kʰʊ̃ːæ̯̃n] (talk) 21:35, 17 January 2016 (UTC)

It would probably be confusing to newbies. I'd suggest to use {{subst:}} when using it, like the templates {{ja-new}}, {{zh-new}}, {{ne-new}}, and some others do. —Aryamanarora (मुझसे बात करो) 02:44, 18 January 2016 (UTC)

2016 WMF Strategy consultation

Hello, all.

The Wikimedia Foundation (WMF) has launched a consultation to help create and prioritize WMF strategy beginning July 2016 and for the 12 to 24 months thereafter. This consultation will be open, on Meta, from 18 January to 26 February, after which the Foundation will also use these ideas to help inform its Annual Plan. (More on our timeline can be found on that Meta page.)

Your input is welcome (and greatly desired) at the Meta discussion, 2016 Strategy/Community consultation.

Apologies for English, where this is posted on a non-English project. We thought it was more important to get the consultation translated as much as possible, and good headway has been made there in some languages. There is still much to do, however! We created m:2016 Strategy/Translations to try to help coordinate what needs translation and what progress is being made. :)

If you have questions, please reach out to me on my talk page or on the strategy consultation's talk page or by email to mdennis@wikimedia.org.

I hope you'll join us! Maggie Dennis via MediaWiki message delivery (talk) 19:06, 18 January 2016 (UTC)

Poll: Restore deleted high-use templates

Usually, when a high-use template is nominated for deletion ({{term}}, {{l/en}}), @Dan Polansky argues that they should be kept to preserve page histories. (Wiktionary:Beer parlour/2015/November#About deleting l/en, l/la, l/de and others, Wiktionary:Requests for deletion/Others#Template:l/de, Wiktionary:Votes/2015-11/term → m; context → label; usex → ux#usex → ux, etc.)

I would like to know what people generally think about this.

This is a poll with no policy value. The full proposal of this poll:

  • There should be some effort to restore templates that were once highly-used in the main namespace and were orphaned and/or deleted, to keep them usable for past revisions of the main namespace to be readable. This arguably includes: {{proto}}, some context templates ({{obsolete}}, {{colloquial}}, {{UK}}, {{transitive}}) {{Wikisaurus-link}}, {{SAMPA}}, {{l/en}} and others.

--Daniel Carrero (talk) 18:11, 19 January 2016 (UTC)

Support

  1. Support I support this both for enhanced usability of entry histories and to make Special:WantedTemplates and Special:WantedPages more useful by eliminating some of the detritus there, though there are other, greater contributors to the problem. DCDuring TALK 18:33, 19 January 2016 (UTC)
  2. Support Make revision histories legible. As for making sure deprecated templates are no longer used: we could use the AbuseFilter or some such tool to enforce deprecation on the technical level: it would be impossible to save a page that uses a deprecated template. In the oppose section, I see no reasoning that explains why this or a similar technical solution is not a good idea; all I see there is very vague and non-specific. --Dan Polansky (talk) 21:09, 22 January 2016 (UTC)
    If you want to keep old revisions legible, why not just lock all templates right now and prevent changes for all eternity? The preservation of templates alone cannot preserve legibility, as parameters or internal code may be changed, as Benwing described below. —suzukaze (tc) 07:25, 23 January 2016 (UTC)
    @suzukaze: Your question does not seem to be serious. I want to keep revisions legible at an acceptable cost. Using a deprecation mechanism instead of deleting templates is not only acceptable but also reasonable. By contrast, locking all templates for eternity is not acceptable. In some cases described by Benwing below, keeping templates deleted may be in order. Alternatively, the body of a deprecated template could be updated to be less dependent on other templates. Either way, an argument of the form "we cannot make page histories perfectly legible => let's give up on the legibility altogether" is a crass fallacy. In its form, it is identical to "we cannot prevent all environmental pollution => let's give up on limiting environmental pollution". --Dan Polansky (talk) 07:57, 23 January 2016 (UTC)
    Let me be specific: if we delete template:l/en, we will make all the links that used it illegible. If, by contrast, we enter a plain wikilink to the template, which does not depend on any other templates, we will preserve the legibility of all the uses of the template. To prevent further use of the template, we may create a filter using AbuseFilter. Now, what are the specific disadvantages of keeping the template and deprecating it using AbuseFilter? I see none. The Benwing objections do not apply to this template and the deprecation change to it just presented. --Dan Polansky (talk) 08:03, 23 January 2016 (UTC)
  3. Support High-use templates, at a minimum, should be either redirected or left as historical. Not doing so confuses the bajeebers out of editors. Purplebackpack89 21:40, 22 January 2016 (UTC)
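
The technical deprecation idea raised in the support votes above (refuse to save a page whose wikitext still calls a deprecated template) can be illustrated with a rough sketch. This is plain Python showing only the logic of such a check; it is not AbuseFilter rule syntax, and the deprecated list, the regular expression, and the function names are assumptions chosen for the example. It also ignores details such as case-insensitive first letters and `{{{parameter}}}` syntax.

```python
import re

# Illustrative list only; which templates count as deprecated is a community decision.
DEPRECATED_TEMPLATES = {"l/en", "term"}

# Matches the name part of a template call such as {{l/en|water}} or {{term}}.
TEMPLATE_NAME = re.compile(r"\{\{\s*([^|{}]+?)\s*(?:\||\}\})")


def deprecated_templates_used(wikitext, deprecated=DEPRECATED_TEMPLATES):
    """Return the sorted names of deprecated templates called in `wikitext`."""
    names = {match.group(1).strip() for match in TEMPLATE_NAME.finditer(wikitext)}
    return sorted(names & set(deprecated))


def may_save(wikitext):
    """Hypothetical save check: allow the edit only if no deprecated template is used."""
    return not deprecated_templates_used(wikitext)


if __name__ == "__main__":
    draft = "See {{l/en|water}} and {{m|en|aqua}}."
    print(deprecated_templates_used(draft))
    print(may_save(draft))
```

A check of this kind only affects new saves, so the deprecated templates themselves could stay in place and old revisions that call them would keep rendering.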

Oppose[edit]

  1. Symbol oppose vote.svg Oppose I'm not sure that restoring these templates is worth the trouble, especially all the (hundreds of?) context templates. Some people might potentially use the old templates if they are available, and it would require the additional work of converting them to the new templates. Example: Even after {{l/en}} was orphaned, it was not deleted, and it was added to tapaculo and pupunha. --Daniel Carrero (talk) 18:11, 19 January 2016 (UTC)
    @Daniel Carrero Is there a way to allow the templates to be used everywhere but current principal namespace, Appendix space, etc, or only in histories? DCDuring TALK 00:56, 20 January 2016 (UTC)
    As far as I can tell, no, templates have no way to check if they are being used in the current or a previous revision. I also did not find any extensions on mw: to that effect. --Daniel Carrero (talk) 01:51, 20 January 2016 (UTC)
  2. Symbol oppose vote.svg Oppose. It's something Dan Polansky talks about, having older revisions readable because you end up with things like Template:infl. The cost for keeping them is however too high in my opinion; they need to be usable in the main namespace for it to work, and then they get used. And how many people actually look at old revisions? I look at the odd one usually in RFV debates trying to find who added a disputed definition, but I can live with things like Template:infl and Template:idiom being red links because I know what they refer to and they're not what I'm looking for. And I can bypass those problems by clicking on edit. In short, the disadvantages far outweigh the advantages, not least because very few people read old revisions of pages, and experienced editors won't be bothered by old red links. Renard Migrant (talk) 18:27, 19 January 2016 (UTC)
    You and other veterans may know, but this constitutes yet another barrier to strong participation by newcomers. DCDuring TALK 00:56, 20 January 2016 (UTC)
    Maybe I'm just smarter than most editors, but it didn't take me very long to figure out what was up with those red links, once I'd got to the stage where I was looking at enough previous revisions to notice them. I've only been editing half a year, and I think that's the least of our worries. Ensuring that the help pages are up to date, as Daniel Carrero has been doing, is a far more useful task, as that is where I got confused when I first started. Andrew Sheedy (talk) 04:47, 23 January 2016 (UTC)
    @Renard: "the disadvantages": That's a plural. You've stated only one disadvantage: when the templates are usable, they get inadvertently used. We have AbuseFilter that we could "abuse" to block saving pages that contain a deprecated template. --Dan Polansky (talk) 21:14, 22 January 2016 (UTC)
  3. Symbol oppose vote.svg Oppose, for the reasons listed above. I've never been bothered by red links in previous revisions of pages. Andrew Sheedy (talk) 18:43, 19 January 2016 (UTC)
  4. Symbol oppose vote.svg Oppose, per above. —Aryamanarora (मुझसे बात करो) 18:47, 19 January 2016 (UTC)
  5. Symbol oppose vote.svg Oppose. I wouldn’t mind having a grace period for an RFD-failed template before it’s deleted, so contributors can get used to the new template (especially occasional or antisocial contributors who don’t follow discussions). But I think that keeping them forever would do more harm than good. — Ungoliant (falai) 00:46, 20 January 2016 (UTC)
  6. Symbol oppose vote.svg Oppose Progress is progress. Does anyone still make computer towers that accept 5¼ floppy disks? —suzukaze (tc) 00:57, 20 January 2016 (UTC)
  7. Symbol oppose vote.svg Oppose per Daniel, who has very gracefully pointed out that idiots like me adding {{l/en}} long after deprecation when I'm multitasking on Wiktionary is something that is bound to occur. —Μετάknowledgediscuss/deeds 00:58, 20 January 2016 (UTC)
  8. Symbol oppose vote.svg Oppose. Way too complicated to do something like this. But I wonder if anyone has proposed a MediaWiki feature where page revisions use the revision of any templates at the time the edit was made? DTLHS (talk) 01:30, 20 January 2016 (UTC)
  9. Symbol oppose vote.svg Oppose per Daniel. An additional issue is that technically it can be complicated and messy to require such compatibility. This is especially the case if an existing template is changed to eliminate a particular parameter or change the parameters, and all uses fixed accordingly. For example, in {{ru-noun-table}}, which declines Russian nouns, it used to have a 4th parameter that specified a special "bare-stem" form. I fixed up the Lua code so this wasn't required, and corrected all the template uses to eliminate their use of this parameter, and then deleted the code that supported this old parameter and eventually reused the 4th parameter for a different use. If maintaining compatibility were mandated, I couldn't do this, and instead would have to keep the old useless code around forever to support the old use of the 4th parameter, and would have to specify the new use as a 5th parameter with an always-empty 4th parameter before it. Benwing2 (talk) 05:58, 20 January 2016 (UTC)
  10. Symbol oppose vote.svg What Ungoliant said.​—msh210 (talk) 18:06, 21 January 2016 (UTC)
  11. Symbol oppose vote.svg Oppose because we want to think about the people who look things up in a dictionary more than those who make it (more and more these days, I'm using it as a look-up dictionary rather than a make-it-up dictionary...that is progress BTW). Those who make the dictionary know their way around. If they are able to click on [History] and can see a red link there, they're probably at a pretty good stage in the development of Wiktionarying already and will probably be able to find the new-and-improved versions of these templates. Ce mot-ci (talk) 03:52, 23 January 2016 (UTC)
  12. Symbol oppose vote.svg Oppose: too much work and complications (and basically bad resource allocation) for something that isn't very important. If anything, it should be done like DTLHS describes. Enosh (talk) 07:09, 23 January 2016 (UTC)
    @User:Enoshd: Please clarify how is creating an abuse filter to ensure deprecation "too much work and complication". I have no idea what you are talking about. --Dan Polansky (talk) 08:10, 23 January 2016 (UTC)
    Similar to what Benwing says above, you have to continue updating them when other things change while keeping them backwards compatible. This mostly applies for templates transcluding other templates, otherwise not so much. Enosh (talk) 09:03, 23 January 2016 (UTC)
    @User:Enoshd: What prevents us from changing deprecated templates in such a way that they largely do not depend on other templates and yet render something legible? For {{l/en}}, we can replace the template body with a plain wikilink to preserve legibility and be done. --Dan Polansky (talk) 09:23, 23 January 2016 (UTC)
    I agree with you that {{l/en}} is a simple one, but it's also the last on the list and the only one not deleted. The first {{proto}} would have needed updating multiple times, most recently when we changed namespace. (I list the changes since deletion because I cannot predict the future ones. We won't make the same changes again but there'll probably be changes.) The context labels would have needed updating (excluding what a bot will do) or at least consideration in aliases and such. {{Wikisaurus-link}} and {{SAMPA}} I'm not sure, perhaps not problematic. Enosh (talk) 06:31, 30 January 2016 (UTC)

Abstain[edit]

  1. Symbol abstain vote.svg Abstain kc_kennylau (talk) 13:14, 28 January 2016 (UTC)

Oodles of numbers[edit]

Anon user 126.31.253.142 (talk) has been adding lots of entries for numbers, like 1338, 912, and 43. Is this appropriate for Wiktionary? ‑‑ Eiríkr Útlendi │Tala við mig 18:12, 19 January 2016 (UTC)

No, these aren't words or idioms in any language. Nor are they symbols (nothing wrong with 0, 1, 2, etc.) Renard Migrant (talk) 18:22, 19 January 2016 (UTC)
  • I've just nuked this anon's contribs in their entirety -- with the exception of XCII and XCIII, all of their contributions are of Arabic-numeral numbers as linked above, and even for those two Roman-numeral entries, this anon is the only user to touch them.
(If this was in error, please let me know and feel free to undo.)
What about all the other number entries not created by this specific anon, like 313 or 51 or 32? ‑‑ Eiríkr Útlendi │Tala við mig 00:17, 20 January 2016 (UTC)
I would probably delete all of them above "9". Their categorization is an inconsistent muddle and makes little sense to me. An alternative would be to agree on a consistent format and create them all (up to 2016?) by a bot. SemperBlotto (talk) 08:57, 20 January 2016 (UTC)
I don't mind the ones that have other meanings, like 69, but everything else should go. If you look at the original revision of 313, it should have been shot on sight, and 32 is just Wonderfool pissing around. Renard Migrant (talk) 18:21, 21 January 2016 (UTC)
  • For entries like 33 that have a valid idiomatic sense, should we rip out the otherwise-useless and not-dictionary-material ==Translingual== sections? ‑‑ Eiríkr Útlendi │Tala við mig 19:58, 21 January 2016 (UTC)
And what about confusingly formatted entries like the ==Translingual== section at xxx? My instinct is to rip that out too. ‑‑ Eiríkr Útlendi │Tala við mig 20:00, 21 January 2016 (UTC)
I suppose I'd keep them a bit like we keep unidiomatic sense for idioms when the idioms meet CFI (see {{&lit}}). Renard Migrant (talk) 19:27, 22 January 2016 (UTC)
For the curious, here are the entries which had only digits and decimal points as of the last dump
.500 0 0. 000 007 0157 06 1 1. 1.0 10 10. 100 1000 101 102 103 104 1040 1080 109 1099 11 11. 112 12 12. 121 125 13 13. 1337 14 147 1471 15 16 17 18 180 187 19 1984 1992 2 2. 2.0 20 20. 200 21 22 224 23 233 24 25 26 27 28 29 3 3. 30 300 303 31 313 32 33 360 39 4 4. 40 400 404 411 419 42 420 45 4649 470 5 5. 50 500 51 520 527 540 555 5555 6 6. 60 600 606 666 69 7 7. 70 71 720 73 737 747 757 777 78 79 8 8. 8.3 80 81 82 83 84 85 86 87 88 89 9 9. 90 900 9000 91 911 92 93 94 95 96 97 98 99 999 - TheDaveRoss 14:42, 23 January 2016 (UTC)
Also these:
-one -ten billion decillion duodecillion eight eight hundred eight thousand eighty eighty six eighty-eight eighty-five eighty-four eighty-nine eighty-one eighty-seven eighty-six eighty-three eighty-two fifty fifty six fifty thousand fifty-eight fifty-fifty fifty-five fifty-four fifty-nine fifty-one fifty-seven fifty-six fifty-three fifty-two five five hundred five thousand five-nine forty forty two forty-eight forty-five forty-four forty-nine forty-one forty-seven forty-six forty-three forty-two four four hundred four one one four thousand hundred hundred thousand million nine nine hundred nine one one nine thousand nine-one-one ninety ninety-eight ninety-five ninety-four ninety-nine ninety-one ninety-seven ninety-six ninety-three ninety-two nonillion octillion one one billion one hundred one hundred million one hundred one one hundred six one hundred thousand one million one thousand one-hundred one-two one-two-three quadrillion quintillion septillion seven seven hundred seven hundred fifty seven thousand seventy seventy-eight seventy-five seventy-four seventy-nine seventy-one seventy-seven seventy-six seventy-three seventy-two sextillion six six hundred six thousand sixty sixty nine sixty-eight sixty-five sixty-four sixty-nine sixty-one sixty-seven sixty-six sixty-three sixty-two ten ten million ten thousand ten-four tenone tenten thirty thirty one thirty-eight thirty-five thirty-four thirty-nine thirty-one thirty-seven thirty-six thirty-three thirty-two thousand thousand one three three hundred three thousand trillion twenty twenty four twenty four seven twenty hundred twenty one twenty two twenty-eight twenty-five twenty-five-eight twenty-four twenty-four seven twenty-nine twenty-one twenty-one hundred twenty-seven twenty-six twenty-three twenty-three hundred twenty-twenty twenty-two twenty-two hundred twentyone two two hundred two thousand two- two-four twoten undecillion - 15:28, 23 January 2016 (UTC)
  • I would definitely want to keep the spelled-out English words in the paragraph immediately above. —Aɴɢʀ (talk) 17:07, 23 January 2016 (UTC)

Proposal for Sorting Definitions[edit]

As anyone who regularly frequents Wiktionary knows, one of its most widespread problems is that of inconsistency in the ordering of definitions (and etymologies, pronunciations, parts of speech, etc.). I propose implementing something that could somewhat improve the situation and, at the same time, satisfy people with conflicting opinions.

This would be a simple template that allowed the ranking of definitions in two different ways. It would look something like {{2|3}}, with the first parameter ranking it according to how common it is, and the second parameter ranking it according to when it entered the language. It would have no effect on the appearance of the page unless a user selected a setting to have definitions ranked either by age or by frequency. With this option available, I would suggest making it policy to order definitions according to their relationship with each other. (By default, entries would be displayed in the order displayed in the wikicode.)

Obviously, some definitions would be almost equally frequent (or infrequent), or might have roughly the same, not necessarily known, time of origin. The template would thus have to allow for multiple definitions to have the same ranking. The other parameter could be used as a secondary ranker, and perhaps the unmodified order of definitions could be used as a tertiary ranker (i.e. equally frequent definitions would be ranked by their age, and if some of them had an equal age, then they would be ranked according to how they were entered in wikicode).

The same thing could perhaps be applied to etymologies, pronunciations, etc. which also suffer from the same issues. I think it would be less important for these than for definitions, however.

This would introduce some problems, especially ones that label definitions as "by extension" from preceding definitions. I don't see this as a huge issue, as the templates would be entered manually, so editors would hopefully ensure that there were no such problems, and other editors would hopefully catch them over time, as they do any other mistakes and inconsistencies. I also don't think that label is particularly useful anyway.

This would be an ambitious change to make, and its implementation would no doubt be tedious, but I feel that it would pay off in a few years, if it is even possible to do. What think you all? Is this possible (and would anyone besides myself care enough to help implement it)? Hopefully the wall of text didn't scare anyone away. Andrew Sheedy (talk) 05:00, 20 January 2016 (UTC)
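The tie-breaking scheme described in the proposal above (primary ranker, then the other parameter, then wikicode order) can be sketched roughly as follows. This is only an illustration, not an existing Wiktionary script: the `Sense` class, the field names `freq_rank` and `age_rank`, and the choice to put unranked senses last are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Sense:
    """One definition line; both ranks are hypothetical template parameters."""
    text: str
    freq_rank: Optional[int] = None  # 1 = most common (assumed meaning)
    age_rank: Optional[int] = None   # 1 = oldest (assumed meaning)


def _rank(value: Optional[int]) -> float:
    # Assumption: senses without a rank sort after all ranked senses.
    return float("inf") if value is None else value


def sort_senses(senses: List[Sense], mode: str = "default") -> List[Sense]:
    """Return senses ordered by the reader's chosen mode."""
    if mode == "default":
        return list(senses)  # keep the order found in the wikicode
    if mode == "frequency":
        key = lambda s: (_rank(s.freq_rank), _rank(s.age_rank))
    elif mode == "age":
        key = lambda s: (_rank(s.age_rank), _rank(s.freq_rank))
    else:
        raise ValueError(f"unknown mode: {mode}")
    # sorted() is stable, so remaining ties keep their wikicode order,
    # which gives the tertiary ranker for free.
    return sorted(senses, key=key)
```

Because several senses may share a rank, the sort key is a pair rather than a single number, and no extra bookkeeping is needed for the third level since a stable sort preserves the original order of equal elements. Subsenses could be handled by applying the same function to each parent sense's own list of children.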

I like the idea. I wonder if the template thing is needed at all; maybe we can get consensus for commonness-based order rather than etymological order. — Ungoliant (falai) 13:51, 20 January 2016 (UTC)
Consensus has been sought many times, and there are people who are very dedicated to one approach or the other. As for implementation, this would require code, presumably in javascript, to rearrange the page. Page loads take far too long as it is, and this could make pages with lots of definitions really, really slow. I know that it would only be a problem for those that opt in to rearranging, but I'm skeptical that there would be enough use to justify the monumental investment of time and resources to make it happen. It could easily end up as a failed experiment like the alphagrams. Chuck Entz (talk) 14:33, 20 January 2016 (UTC)
How would this be made to work with subsenses (and subsubsenses)? DCDuring TALK 15:09, 20 January 2016 (UTC)
Good question. Maybe a modified template could be used? I'm afraid I'm not very skilled in the programming department, so I'm not too sure how one would make it work in those cases. Andrew Sheedy (talk) 18:39, 20 January 2016 (UTC)
As far as subsenses, this is not a technical issue but a logical one. If you can figure out how we want them to be sorted, dealing with them on the technical side shouldn't be too hard. As far as pageload times, most of the time is spent loading the JavaScript, not running it. It runs pretty fast. --WikiTiki89 19:21, 20 January 2016 (UTC)
In that case, I would think subsenses would be sorted in the same way as other definitions. If a user wanted to see definitions in order of age, the subsenses would still be displayed as such, and would be ordered related to other subsenses under the same main sense. They wouldn't be ordered in relation to other senses, at any rate. Andrew Sheedy (talk) 18:11, 21 January 2016 (UTC)
I must have faster Internet than you, Chuck Entz, because I find that even the longest pages load fairly quickly on Wiktionary. In fact, one of the main reasons I prefer Wiktionary to any other online dictionary is the speed at which pages load (Larousse and Dictionary.com have way too many ads), so if rearranging the definitions would significantly slow down page loads, then maybe it wouldn't be such a good idea, particularly since my proposal is aimed especially at definition-heavy pages. Andrew Sheedy (talk) 18:39, 20 January 2016 (UTC)
I think this is a terrific idea, and restating what I think Dixtosa was saying below: we ought to hold off on this and actually take the plunge re structuring data in such a way that we could migrate it into a relational database. The current suggestions involving templates and reordering pages via js and lua may well work, but only insofar as data is presented locally. The real conversation is about WikiData or some other model for backend structure. - TheDaveRoss 19:25, 20 January 2016 (UTC)
As far as I know, there's never been any consensus over what order to have definitions in. Most entries put common definitions first but a few put things in date of first appearance order, meaning that very rare meanings can come before very common ones, which I oppose. Renard Migrant (talk) 16:48, 21 January 2016 (UTC)
I think that the vast majority of entries actually have the definition lines in a more-or-less random order. The upshot of the suggestion here, and those like it, is that you as a reader could choose how you like your definitions provided and then all entries which had been tagged could be displayed to you in that manner. - TheDaveRoss 16:52, 21 January 2016 (UTC)
What I am proposing would allow users to view definitions sorted in three different ways (the default, according to age, and according to commonness), according to preference. Andrew Sheedy (talk) 18:11, 21 January 2016 (UTC)
We have fewer than 2,600 pages that use {{defdate}}, not all of which have any English content, some of which use defdate not for definitions but for alternative forms etc. A large number of the pages with {{defdate}} have only one definition. Almost all of the PoS sections that have {{defdate}} do not have it for every definition.
I don't think we have any other usable data for time ordering. The basic source for such date is the OED. I don't know whether large-scale use is a copyright problem.
We have no source whatsoever to determine which definition represents the most frequent semantic content of current usage of the word. That is, we would have to do our own interpretation of corpus data to actually offer something that could claim to be reliable.
AFAICT, we have never mastered the relatively simple problem of technically determining whether a given spelling of a word had as lemma one or both of the capitalized or uncapitalized forms. Nor have we mastered folding all inflected forms into the count for the lemma form or separated participles from homonymic adjectives. And, of course we have never mustered the effort to do these things manually, except in principal namespace where it took much of the lifetime of the project to reach our current state of incompletion.
If we cannot get reliable information on frequency of usage of word-definitions, how could we execute the project proposed? What would motivate the manual effort required? How could the problem be solved technically, even over the course of the next decade? DCDuring TALK 19:23, 21 January 2016 (UTC)
Fair points. It's a pity we don't have the manpower to do the job. I think it could be done if people were motivated, but adding new definitions and words is probably higher priority. It would be nice if we could achieve consensus on the definition order, though, even if it required a compromise. We have some entries that are overwhelmingly confusing, and I think some consistency could help solve that.
I might just use {{defdate}} for French entries now that I know about it. It's not essential information, but the more it's used, the easier it'll be to rearrange definitions later. Andrew Sheedy (talk) 23:21, 22 January 2016 (UTC)
It is worthwhile to include the {{defdate}} information where available. I wish that it weren't so hard to get it. Especially difficult, even conceptually, are the more gradual and subtle transitions from creative innovation, through usage limited to special context, to full membership in the lexicon. It seems a bit like speciation in organisms, not the most encouraging of possible analogs. DCDuring TALK 00:21, 23 January 2016 (UTC)
Data about the relative frequency of different definitions of a word is even harder. The conceptual problems are severe: the relative frequency depends on the set of definitions a given dictionary uses and the wording of the definitions. More practical would be recording the frequency of collocations of a word, though that too has problems. At least it would make it more possible to better separate the frequency by PoS. DCDuring TALK 00:34, 23 January 2016 (UTC)
I think any ranking by frequency would have to be largely subjective, and based on personal experience. For more technical terms, this would be far more difficult to determine. Andrew Sheedy (talk) 03:20, 23 January 2016 (UTC)
Try comparing subjective experience with corpus evidence for a few polysemic words. Then assume that crowd-sourcing subjective experience improves things. Do you think the result is satisfactory? DCDuring TALK 13:13, 23 January 2016 (UTC)
Is that sort of comparison even possible without spending hours on a given word? I do think that multiple editors contributing their subjective observations would help balance things out, though. Andrew Sheedy (talk) 06:17, 24 January 2016 (UTC)


  • Tagging this idea as part of {{hashtag|MediawikiHoldsWiktionaryBack}}. --Dixtosa (talk) 19:11, 20 January 2016 (UTC)
    Detagged. This places the whole Beer parlour discussion page into a pointless category. Furthermore, if you don't like Mediawiki and the semi-formal approach, you can go to OmegaWiki and contribute there; good luck with that.--Dan Polansky (talk) 13:21, 23 January 2016 (UTC)
    That's like saying if you don't like me you can fuck off. Thank you for letting me know that I can leave. Very informative. Try directing your energy coming up with a solution next time.
    How does a category that contains specific, clearly defined talks sound pointless to you? I am trying to aggregate all the reasons why MW isn't perfect for WT in one place.
    Also, the whole Beer parlour discussion page is also in the category "term cleanup/wiktionary namespace". Does this not bother you? --Dixtosa (talk) 14:14, 23 January 2016 (UTC)
    What I was trying to say is that it is pointless to try to convert the English Wiktionary to an analogue of OmegaWiki since we already have OmegaWiki. Those who want to contribute lexicographical content to a relational database with stringent entity-relationship models already have the option. --Dan Polansky (talk) 14:28, 23 January 2016 (UTC)
  • I am not sure what the intention was, but I took it to mean that Mediawiki as it is is not perfectly suited for dictionary content. That is a sentiment that I am sure everyone who has done any work here can agree with to some degree. It would be nice, for instance, if there was some method of tying a sense to a translation which was tighter than the current use of glosses. It would be nice if there were less hacky methods of only displaying the content that a user wanted to see, etc. - TheDaveRoss 14:28, 23 January 2016 (UTC)

My two cents For what it's worth, I definitely think that we need some standard for sorting these definitions: cf. truck. By far the more common usage of the English word is the automobile but the first definition is the verb. —Justin (koavf)TCM 05:04, 24 January 2016 (UTC)

And worse, it's a different etymology, with only dialectal senses! We definitely need some better defined standards. Andrew Sheedy (talk) 06:17, 24 January 2016 (UTC)
I find putting obsolete senses first and obsolete words (etymologies, parts of speech) first massively inferior from a usability standpoint. Unfortunately, multiple editors don't think so and we have no consensus as per Wiktionary:Beer_parlour/2012/December#Positions of obsolete senses. The link contains a poll in which the editors are divided on the issue approximately 50:50. --Dan Polansky (talk) 07:28, 24 January 2016 (UTC)
@Dan Polansky: Agreed. But I also see some value in the chronological approach. The nice thing about this, if we can template-ize it or make it into a table of some sort, is that it would then be sortable--users can choose the order via scripts, user settings, etc. —Justin (koavf)TCM 07:35, 24 January 2016 (UTC)

{{inh}} vs. {{der}} again[edit]

The criteria for when to use one vs. the other aren't always clear. For example, I just fixed up Latin pluit to say it was "inherited" from PIE *plew-. This is true, except that pluit is thematic and the verb corresponding to *plew- might have been athematic *plewti instead of thematic *pleweti; if so, then technically it was "morphologically reformed" at a later point so it should maybe say it was "derived". But this seems a needless distinction to make in cases like this, and in many cases it isn't even known for sure if a particular verb was thematic or athematic in PIE since the athematic->thematic change was so common and repeated so often in so many languages. Or is the statement that it's inherited from a root rather than a particular verbal form enough to work around this issue? Benwing2 (talk) 05:45, 20 January 2016 (UTC)

*plew- is a root, and roots have no descendants, only derived terms. So it can't be inherited from it. —CodeCat 15:29, 20 January 2016 (UTC)
It is, however, inherited from the root present *pléw-, which is derived from *plew-. —JohnC5 15:38, 20 January 2016 (UTC)
We had separate entries for verb stems for a while, but I removed them again because they caused some problems and a few people had requested this long ago. One issue is that etymologies (ours or others') generally don't distinguish between the root and its stems, so knowing what goes where is hard. Another difficulty is that the PIE verb system was structured fundamentally differently from what the descendants have. PIE had different aspect stems, and these could be considered separate verbs in their own right. In the descendants, however, these were generally unified into one paradigm, and missing members were created anew while duplicates were trimmed. For example, every Germanic strong verb has a form that is descended from the PIE stative, but that doesn't mean that a stative form of that verb existed in PIE. Not all PIE verbs might have had an imperfective aspect stem either, but pretty much all descendants formed one at some point. Then there is the question of the athematic-thematic distinction, which was pretty much completely eliminated in many of the descendants (Slavic, Italic, Germanic). All in all, it's very difficult to say "PIE verb X has descendant verb Y as a descendant" when Y can actually be an amalgamation of several PIE verbs, including any that were only formed post-PIE. —CodeCat 16:02, 20 January 2016 (UTC)
I vaguely remember a discussion from which it followed that upholding this distinction was not even supported by consensus. But I am not sure. From what I can see, the distinction between {{inh}} and {{der}} was installed without discussion and consensus. I fear it is going to create a lot of pain down the road. --Dan Polansky (talk) 13:17, 23 January 2016 (UTC)

Fixing cite- and quote- templates[edit]

Recently, @Smuconlaw has changed the behavior of {{cite-book}} and {{cite-web}} (which previously behaved like/redirected to {{quote-book}} and {{quote-web}}) to actually be proper citation templates. While this is a good change, it gives us a big problem - the cite- templates were widely used to include quotations in entries, and suddenly the formatting here has become completely messed up (look at Citations:rest on one's laurels, Citations:macaroni and gravy or Citations:twelve-ounce curls). Would it be possible for someone with a bot or AWB to go through entries and convert "{{cite" to "{{quote" in the following specific circumstances:

  1. Any usage in the Citations: space.
  2. Any usage in the mainspace immediately preceded by "#*" (or "#* " - in fact, any usage of a cite-template preceded by an asterisk is probably an error)
  3. Any usage under a ====Quotations==== header

That should fix the formatting issues messing up the references (although if anyone can see a potential for false positives, please point it out). Smurrayinchester (talk) 08:53, 20 January 2016 (UTC)
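For anyone with a bot, the three conversion rules listed above could be sketched along these lines. This is a rough illustration under stated assumptions, not a tested bot task: the function name and regular expressions are made up for the example, only {{cite-book}} and {{cite-web}} are handled, calls routed through {{cite}} are ignored, and "under a Quotations header" is approximated as "until the next header of any level".

```python
import re

# Any cite-book / cite-web call (rules 1 and 3).
ANY_CITE = re.compile(r"\{\{cite-(book|web)\b")
# A cite call directly after a quotation marker such as "#*" or "#* " (rule 2).
AFTER_MARKER = re.compile(r"^(#+\*\s*)\{\{cite-(book|web)\b")
# A wikitext header line, e.g. "====Quotations====".
HEADER = re.compile(r"^(=+)\s*(.*?)\s*\1\s*$")


def convert_cite_to_quote(title: str, wikitext: str) -> str:
    """Rewrite cite- templates to quote- templates where a quotation is meant."""
    in_citations_ns = title.startswith("Citations:")
    under_quotations = False
    out = []
    for line in wikitext.split("\n"):
        header = HEADER.match(line)
        if header:
            # Assumption: any new header ends the Quotations section.
            under_quotations = header.group(2) == "Quotations"
        if in_citations_ns or under_quotations:
            line = ANY_CITE.sub(r"{{quote-\1", line)
        else:
            line = AFTER_MARKER.sub(r"\1{{quote-\2", line)
        out.append(line)
    return "\n".join(out)
```

Uses inside ref tags on ordinary definition lines are left alone by this sketch, since those are presumably genuine citations. It does not check for a missing date= or year= field, so pages of that kind would still need manual review.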

(And vice versa, any quote- templates that appear within <ref></ref> tags should be cite-, presumably) Smurrayinchester (talk) 08:53, 20 January 2016 (UTC)
I've spotted two problems with this idea. a) Some people call "cite-book" through the {{cite}} template. This wouldn't necessarily be unfixable, but we'd need to create an equivalent {{quote}} (currently a redirect to {{blockquote}}). b) There are a few citations pages where citation templates are used without a date= or year= field (eg Citations:Brown bounce), which causes problems if you try to convert citation quotes directly into quotation ones. Smurrayinchester (talk) 09:25, 20 January 2016 (UTC)
Yes, I realized there were going to be some transitional issues but figured the short-term pain was worth the long-term gain. If these could be fixed by bot that would be great. By the way, {{cite}} works (I updated it) but links to the citation, not the quotation, templates. Smuconlaw (talk) 10:15, 20 January 2016 (UTC)

How to mark long vowels in Germanic variants[edit]

I think this would apply to Middle Low German/Dutch/Frisian, and some forms of German. I'm making Middle Low German templates and Low German research traditionally has two different systems of marking long vowels. I'd like to have consistency amongst the languages, so I thought I'd informally ask which one to use.

  • System one: Lengthened short vowels are marked by a macron: hö̂gede (heighth) vs. hȫgede (joy)
  • System two: Lengthened short vowels are unmarked: hö̂gede (heighth) vs. högede (joy)

Lengthened vowels can only ever occur in open syllables, so system 1 is superfluous but explicit. Asking @CodeCat as DUM person especially. Korn [kʰʊ̃ːæ̯̃n] (talk) 13:50, 20 January 2016 (UTC)

Vowel length is not marked for Middle Dutch, as it can be (generally) deduced from the spelling. However, we do use macrons and circumflexes to distinguish between different long vowels that were spelled identically. See WT:ADUM for more. —CodeCat 15:31, 20 January 2016 (UTC)
FWIW, I find it hard to distinguish ö̂ from ȫ at the font size most text on this site is displayed at. - -sche (discuss) 19:06, 24 January 2016 (UTC)

Translations vote[edit]

Please vote on Wiktionary:Votes/pl-2015-12/Translations.

Reason: Only 5 people voted so far, and it's going to end in a few days.

  • Current results: 3-2-0
  • End date: January 27

The 2 opposers raised a good point concerning translation tables in Translingual sections, so I proposed to amend the vote based on that point. I believe this vote could pass, with that proposed modification. Just see the support vote by @Andrew Sheedy. Thanks. --Daniel Carrero (talk) 18:11, 22 January 2016 (UTC)

Wikidata & GLAM 'down under'

In February, I'm undertaking a three-week tour of Australia, giving talks about Wikidata, and Wikimedia's GLAM collaborations. Do join us if you can, and please invite your Wikimedia, OpenData, GLAM or OpenStreetMap contacts in Australia to come along. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 20:42, 22 January 2016 (UTC)

Deprecating term template

A vote to let bots replace all uses of {{term}} with {{m}} has passed.

I propose to deprecate {{term}} by adding an abuse filter to the AbuseFilter extension that prevents saving of pages that contain the template. Thus, we can standardize on {{m}} while keeping the huge volume of page revisions that use {{term}} legible.

--Dan Polansky (talk) 08:24, 23 January 2016 (UTC)

I'm not opposed to this, but concerned this may establish a precedent preventing deletion of obsolete templates in the future. Let's emphasize that doing this doesn't establish such a precedent. Benwing2 (talk) 12:08, 23 January 2016 (UTC)
If we can agree to give it a try with {{term}} and {{m}}, voting on deletion of templates in RFDO should still proceed as before, and those who want to delete other templates should feel free to vote in RFDO according to their best conscience, which may be "Delete". If the template code of {{term}} proves to block improvement of templates or modules on which {{term}} depends, then {{term}} code should be edited to no longer use other templates or modules and provide some minimal legibility, without necessarily having the full original function. --Dan Polansky (talk) 21:26, 23 January 2016 (UTC)
Would this be a problem for more inexperienced users who edit pages with the template and then can't figure out why the page won't save? That template is used on a lot of pages. Once the bots get most of them, it should be fine, though. Andrew Sheedy (talk) 03:07, 24 January 2016 (UTC)
Shouldn't this be at WT:RFDO? There's been a vote to replace {{term}}, but not to delete it. I don't mind having two templates do the same thing; with Lua it should be possible to have them both work in exactly the same way apart from the language statement. They could just both call the same module so any edit to one would be an edit to both. Renard Migrant (talk) 18:05, 25 January 2016 (UTC)
In fact they already use the same Lua code internally, except for a "compat" flag that specifies how the arguments work. Benwing2 (talk) 18:29, 25 January 2016 (UTC)

Definitions vote -- Rationale and changes

Vote:

Rationale and changes:

  • Removing "The definitions are the most fundamental piece of dictionary", it's a comment rather than a rule.
  • Removing "[definitions] do not have their own header", no need to say what they don't have. Arguably, the POS header is their header.
  • Expanding upon the idea that "Each definition may be treated as a sentence: beginning with a capital letter and ending with a full stop.", mentioning other types of definitions: "In language sections other than English, the definition generally consists of a simple translation into English, rather than a full definition."
  • Mentioning: "Sometimes, they are grouped into subsenses."
  • Writing out the actual formatting rules of Wiktionary:Votes/2006-12/form-of style and Wiktionary:Votes/2010-08/Italicizing use-with-mention, rather than just linking to them.
  • Removing "The “definitions” of entries that are abbreviations should be the expanded forms of the abbreviations." Sometimes, the expanded abbreviation is in the etymology section, not in the definition.
  • Removing "Where there is more than one expansion of the abbreviation, ideally these should be listed alphabetically to prevent the expanded forms being duplicated.", does not seem common practice.
  • Compressing the explanation of where to link the expanded forms in a single paragraph; arguably, that information does not need its own subsection.
  • In particular, replacing "Expanded forms that are encyclopedic entries should also be wikified and linked to the appropriate Wikipedia entry." by "Otherwise, if appropriate, link it to the appropriate Wikipedia article, if it exists." Arguably, existence in Wikipedia is a more objective criterion than whether an entry is "encyclopedic".
  • Mentioning three abbreviation examples (PC, USA, SNAFU), together in the same line. The original text had two examples (PC and SNAFU) in separate lines.
  • Removing bold formatting from "a definition which only applies in a restricted context"; arguably, it's unnecessary.
  • Compressing the explanation of context labels in a single paragraph; arguably, that information does not need its own subsection. In particular, "Details in Wiktionary:Context labels." does not need to be on a separate line.
  • Adding an example of non-lemma definition, properly formatted: "plural of word".
  • Removing three separate references to the same vote (Wiktionary:Votes/pl-2009-03/Context labels in ELE v2) in consecutive paragraphs.
  • Reordering some of the ideas. Original order: introduction, form-of definitions, abbreviations, context labels. Proposed order: introduction, form-of definitions, context labels and abbreviations.
  • Using {{lb}} rather than {{context}}, as approved at Wiktionary:Votes/2015-11/term → m; context → label; usex → ux.
  • Replacing "wikify" by "link"; "wikified" by "linked".
  • Mentioning the fact that some entries are romanizations linking back to the main entries. The requirement that each romanization entry have at least one definition line was voted at Wiktionary:Votes/pl-2013-03/Romanization and definition line.
  • Making sure another WT:EL section is voted, a step in the direction of having WT:EL completely voted.

--Daniel Carrero (talk) 11:00, 23 January 2016 (UTC)

References vote -- Rationale and changes

Vote:

Rationale and changes:

  • Adding the rule: "References are listed using bullet points".
  • Adding 1 more usage example + the result of the usage examples.
  • Formatting the usage examples with bullet points, showing actual usage.
  • Removing "There is a need to balance respect for copyrights with definitions so inventive as to be inaccurate." For semantics, we go by attestation.
  • Removing "The validity of the dictionary has a profound effect on its usefulness." It's a comment rather than a rule.
  • Minor change of punctuation and word order.
  • Making sure another WT:EL section is voted, a step in the direction of having WT:EL completely voted.
  • Disclaimer: The References section probably could be expanded with more information. This is proposed as an improvement to the current text, not as the "final" version of it.

--Daniel Carrero (talk) 11:06, 23 January 2016 (UTC)

EL introduction vote -- Rationale and changes

Vote:

Proposed introduction:
"This is a list of aspects that govern how an entry should be formatted. This includes what are allowed sections and what are the contents expected to be found in them. These rules reflect what editors think as best concerning the standard format of an entry."

Rationale and changes:

  • Quickly states what WT:EL is and what it does, for those unacquainted with the policy.
  • The first sentence was based on WT:NORM's "This is a list of aspects that govern how the wiki code behind an entry should be formatted."
  • The second sentence is a generic, all-encompassing statement but it also suggests that we have some standards concerning specific allowable headers and contents.
  • The third sentence was based on WT:NORM's "[...] they do make the pages conform more to a standard format reflecting what we think of as best for the wiki code."

--Daniel Carrero (talk) 11:21, 23 January 2016 (UTC)

German imperatives

I brought this up in About German, but didn't get much of a reply. German has developed a First Person Plural and a Third Person Plural imperative, both of which function identically to the old Second Person Singular and Second Person Plural imperatives but must be used with the personal pronouns. I would like to incorporate these into the templates. Korn [kʰʊ̃ːæ̯̃n] (talk) 10:20, 24 January 2016 (UTC)
ps.: The Third Person Plural imperative comes from the usage of 3rd rather than the 2nd Person as the polite form in German. Korn [kʰʊ̃ːæ̯̃n] (talk) 10:22, 24 January 2016 (UTC)

@Korn: Care to provide any examples? --kc_kennylau (talk) 17:02, 24 January 2016 (UTC)

All four examples mean "stop it!" Wir and Sie are the personal pronouns.

  • Lass das! – 2nd sg.
  • Lassen wir das! – 1st pl.
  • Lasst das! – 2nd pl.
  • Lassen Sie das! – 3rd pl. Korn [kʰʊ̃ːæ̯̃n] (talk) 17:45, 24 January 2016 (UTC)
Isn't Lassen Sie das second person plural? It's definitely used to make requests, but I don't know whether it rises to the level of an imperative. Smurrayinchester (talk) 19:22, 24 January 2016 (UTC)
It's second person (singular and plural) imperative in function, but third person plural present subjunctive in form. —Aɴɢʀ (talk) 21:28, 24 January 2016 (UTC)
What I would ask is whether we should describe forms by function or by form. In most cases, we describe things by form, and don't list the nuances in function that a form may have. This is something for a grammar, not a dictionary. For example, in Finnish verbs, the passive/impersonal form is used colloquially as a first-person plural. In French, the impersonal 3rd person singular is used in a similar way. But these aspects are not found in our inflection tables. —CodeCat 19:31, 24 January 2016 (UTC)
In Spanish we do. Should the Spanish inflection templates be simplified? DTLHS (talk) 21:30, 24 January 2016 (UTC)
I can't really see what you mean, CodeCat. All plural imperatives - including the one inherited from Proto-Germanic - are identical with their indicative present forms. That doesn't mean they're not imperatives. They're clearly not indicative statements and they're not an optative subjunctive present, as subjunctive present is a concept which doesn't exist in most registers of Modern German, other than maybe some consciously archaic speech or legal texts. And given the very comprehensive declension tables we provide for words, I don't see how "that's grammar" would be a reason to only list half of a set of forms. Especially since we already usually provide as much of grammatical information pertaining to a word as we can in both inflection tables and usage notes, and I very much want Wiktionary to be as much grammar as can be. I was about to say that this should be read with the restriction that the grammar has to pertain exclusively to the given entry and not be general rules, but every inflection table for a verb of a regular verb class is already 'general grammar' and nothing that needs to be listed for that word. Yet we do it, because information is good. After all, the Wiki credo is "text is cheap" and I don't see a reason to not give any information available on a word in an online wordbook. Korn [kʰʊ̃ːæ̯̃n] (talk) 11:45, 25 January 2016 (UTC)

Are the names of languages proper nouns?

previous discussion: Wiktionary:Beer_parlour/2015/February#Languages_-_are_they_proper_nouns_or_not.3F

We seem to define the English names for languages as proper nouns, but I'm pretty sure they are all just uncountable common nouns. They certainly are in Italian and French. How do other dictionaries define them? SemperBlotto (talk) 07:11, 25 January 2016 (UTC)

Names of languages and ethnicities are common nouns and written in lower case in the Russian Wiktionary. I would check others but I doubt there will be any change in policies here. --Anatoli T. (обсудить/вклад) 07:25, 25 January 2016 (UTC)
I imagine this may have to vary from language to language in enwikt. In some languages (e.g. Turkish, and Arabic also to a fair extent), there are grammatical or phonological tests that help determine whether something is a proper noun or not. In other languages (like English), it's more conventional, and tends to be based on whether a noun is capitalized. E.g. English December is capitalized and considered a proper noun, Russian февраль is lowercase and not considered a proper noun, Arabic شُبَاط has no case but is considered a proper noun because it satisfies certain morphological and syntactic tests that distinguish it as proper. Many (most?) other dictionaries don't distinguish proper nouns from common nouns but just classify both as nouns. Many editors want to eliminate proper nouns as a separate class but there's no consensus on this, I don't think. Benwing2 (talk) 18:20, 25 January 2016 (UTC)
This has been discussed a couple of times, once briefly at Wiktionary:Information desk/2015/August#Modern_Greek_.26_PoS and before that at greater length at Wiktionary:Beer parlour/2015/February#Languages_-_are_they_proper_nouns_or_not.3F. As I noted in the latter discussion, few (or no?) other major dictionaries distinguish proper from common nouns, and few authorities give useful guidance on the matter: most authorities (both old and new) just say, sometimes in these exact words, "Capitalize proper nouns and words derived from them; do not capitalize common nouns", which is obviously inaccurate — tell it to the Marines, the Americans, the Englishmen and other capitalized common nouns (and to bell hooks and other uncapitalized proper nouns). (I am sympathetic to the idea that we should reduce the prominence of the distinction between proper and common nouns, e.g. by using a label on the headword line rather than different POS headers. However, distinguishing proper from common nouns does seem to be a useful distinction, and we potentially gain readers by being among the few dictionaries to make it.) I think languages are proper nouns in English, but may be common nouns in other languages. - -sche (discuss) 20:14, 25 January 2016 (UTC)
Well, there's only one Italian language, one French language (and so on) unlike grain or rye which are uncountable (yes they're both countable in some senses, but you get the point). Renard Migrant (talk) 21:46, 25 January 2016 (UTC)
  • Languages are common nouns, even if they're uncountable and spelt with a capital letter in English. In Danish, Norwegian and Swedish no capital letter is used for languages. I think Wiktionary may be out on a limb with its treatment. Donnanz (talk) 21:56, 25 January 2016 (UTC)
As with many proper nouns, they can be forced to be plural (eg, "Indian English is just one of the Englishes that need to be included."). They can be used uncountably more readily than most proper names and accept a wider range of adjective modifiers. But almost all proper nouns can be forced to be used uncountably ("There was too much Boston in his speech"), usually metonymically (eg, "Boston accent") and accept some adjectives ("historic old Boston").
Orthography (initial capital), semantics (the individuality of the named referent), and syntax (no indefinite article, limited acceptance of modifying adjectives) together seem to provide sufficient evidence that, in English, the names of languages are proper names, even when MWEs (eg, Old English). DCDuring TALK 01:17, 26 January 2016 (UTC)
In Hindi, language names take on gender and can (at least theoretically) be inflected in the plural. However, they still use the proper noun header on Wiktionary... I can't say much about English - my understanding of the nuances of English linguistics is lacking. —Aryamanarora (मुझसे बात करो) 02:12, 26 January 2016 (UTC)
In English, the names of languages are always proper nouns when referring to particular languages, and adjectives derived from those names are proper adjectives. Every grammar and dictionary I can find is in agreement that a noun designating a particular person, place, or thing, usually without requiring an article or other limiting modifier, is a proper noun. This may not be the rule in other languages, but it certainly is in English. Note, however, that certain words and phrases derived from proper nouns are treated as common nouns, and not usually capitalized. So, English, French, and Arabic are proper nouns when referring to languages; but one may have french fries, gum arabic, anglicized words, japanned leather. Capitalization varies with common nouns derived from proper nouns; one occasionally sees French fries or Brussels sprouts, but usually English muffins and Belgian waffles; this is a matter of style. P Aculeius (talk) 03:53, 26 January 2016 (UTC)
  • The question keeps popping up, so maybe it should be put to the vote to decide once and for all. Donnanz (talk) 16:32, 26 January 2016 (UTC)
What are you proposing to vote on? Whether the names of languages are always proper nouns, wherever they occur, in all languages? Or just whether they're proper nouns in English, which can be either capitalized or not, at the writer's discretion, when used to form a common noun, such as English muffin, french fries, or a danish? Can you point to some widely-used English language dictionaries, grammars, or style books that define "proper noun" in such a way as to exclude the names of languages? Every source I can find says that a noun referring to a particular person, place, or thing (usually without an article or limiting modifier), is a proper noun. So Polyhymnia, Ithaca, and Greek are all proper nouns, while muse, island, and tongue are common nouns. When referring to a common noun formed from a proper noun, capitalization is up to the writer; so Greek or greek salad; and of course one may use a common noun as a name: O Muse...; where is the unguent, Mother? in which case it becomes a proper noun in that instance. Before we put something that seems settled up for a vote, it needs to be clear just what's being voted on, and there ought to be at least some authority for disputing the issue. P Aculeius (talk) 17:31, 26 January 2016 (UTC)
I was thinking primarily of voting on our treatment in English, but other languages should be taken into consideration. In some languages gender is used, so they can automatically be regarded as common nouns, in many others no capital letter is used, so they can also be regarded as common nouns. Actually with the present treatment as both a proper noun and common noun the entry (for English) is, put bluntly, a mess and quite confusing. And English, French and German can also be surnames (which are undeniably proper nouns, recognised in the entry for English at least). Having said that, I disagree with much of your philosophy, and am siding with SemperBlotto. Donnanz (talk) 19:29, 26 January 2016 (UTC)
  • The fact that most languages have an adjective with the same spelling, in English at least, appears to have not been mentioned (unless I missed it), and of course adjectives are never classed as "proper", e.g. Filipino cuisine. Of course, there are capitalised adjectives which aren't a language also. So this makes the treatment of languages as proper nouns even more illogical.
Going off-topic slightly I notice that all months have been treated as proper nouns, while days are not. I'm not sure what the reasoning behind that is. Incidentally, days have also been classed as adverbs, which is an American trait, not normally done in British English. Donnanz (talk) 22:02, 8 February 2016 (UTC)

Making Byzantine Greek an etymology-only language

I propose we make Byzantine Greek an etymology-only variant of Ancient Greek (much as Medieval Latin is an etymology-only variant of Latin) for the following reasons:

  1. In effect, it already is one (Category:Byzantine Greek lemmas and Category:Byzantine Greek non-lemma forms are both empty, but Category:Terms derived from Byzantine Greek has 53 subcategories containing over 100 entries)
  2. It has no separate ISO 639-3 code (our code gkm was proposed in 2006 but has never been accepted)
  3. We can still have it as a lect of Ancient Greek, i.e. we can tag any specifically Byzantine words or senses with {{lb|grc|Byzantine}} and put them in Category:Byzantine Greek (or would it automatically be called Category:Byzantine Ancient Greek, and if so, can we override that?).

What do others think? —Aɴɢʀ (talk) 11:56, 25 January 2016 (UTC)

I agree. --Vahag (talk) 13:54, 25 January 2016 (UTC)
I note that the automatic Ancient Greek IPAs have included the Byzantine pronunciation for as long as I remember. — Ungoliant (falai) 14:36, 25 January 2016 (UTC)
If I remember correctly, we used to just treat it as grc, then we split off gkm. This would be going back to the previous status quo, but with an etymology-only code. Chuck Entz (talk) 14:49, 25 January 2016 (UTC)
Yeah. As far as I can tell, it's always been treated as an etymology-only language, but because it had a (quasi-)ISO code, it was grouped in with "full" languages early on in Wiktionary's history (as was Cajun French, frc). I proposed to update things to reflect its etymology-only-ness in 2013, but the discussion didn't result in any change. I have no strong feeling about whether Byzantine merits separate L2s etc, but since it's always been de facto subsumed under Ancient Greek here, it makes sense to make it so de jure. - -sche (discuss) 19:51, 25 January 2016 (UTC)
  • Support. It definitely belongs under the grc L2. —Μετάknowledgediscuss/deeds 01:43, 26 January 2016 (UTC)
  • There is already Medieval Greek as an etymology-only language. There's no difference between Medieval Greek and Byzantine Greek, is there? Shouldn't the two be merged into a single term? And Module:languages/data3/g even calls Medieval Greek another name for Byzantine Greek. —Aɴɢʀ (talk) 14:55, 27 January 2016 (UTC)
  • Done. Last time I refreshed CAT:Pages with module errors there weren't any more caused by this change, but maybe some more will pop up over the next few days. —Aɴɢʀ (talk) 19:53, 31 January 2016 (UTC)
    • Found a couple more after hard purging. —JohnC5 20:12, 31 January 2016 (UTC)
    • @Angr Actually, there are a bunch more now. —JohnC5 20:38, 31 January 2016 (UTC)
    • Why wasn't the code turned into an etymology-only code instead? —CodeCat 00:52, 1 February 2016 (UTC)
      • It was, but it still creates a module error when an entry says "{{etyl|gkm|en}} {{m|gkm|φόοβαρ}}" instead of "{{der|en|gkm|φόοβαρ}}". —Aɴɢʀ (talk) 06:53, 1 February 2016 (UTC)
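
The rewrite described in the last reply, from an {{etyl}} + {{m}} pair to a single {{der}} call, could be sketched roughly as follows. This is only an illustration, not the code any bot on the site actually runs: the function name `etyl_to_der` and the regular expression are assumptions, and nested templates or extra parameters on {{etyl}} are deliberately not handled.

```python
import re

# Hypothetical pattern: "{{etyl|gkm|DEST}}" followed by "{{m|gkm|TERM...}}",
# separated by optional whitespace. Nested templates are not supported.
ETYL_M_PAIR = re.compile(
    r"\{\{etyl\|gkm\|([^|{}]+)\}\}\s*\{\{m\|gkm\|([^{}]*)\}\}"
)


def etyl_to_der(wikitext: str) -> str:
    """Merge each etyl+m pair for gkm into one der call.

    The destination language from etyl becomes the first parameter of der,
    and everything after the language code in m is carried over unchanged.
    """
    return ETYL_M_PAIR.sub(r"{{der|\1|gkm|\2}}", wikitext)


if __name__ == "__main__":
    print(etyl_to_der("From {{etyl|gkm|en}} {{m|gkm|φόοβαρ}}."))
```

Text that does not contain such a pair is returned unchanged, so the function could be run over whole pages and the resulting diffs reviewed by hand.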

Rename instances of Template:term lacking a language to Template:termwithoutlang

Is it ok for me to run a bot to rename all instances where {{term}} lacks a language to {{termwithoutlang}}? This template would exist only temporarily, and is of course not supposed to be added to entries by editors. But it would help with the current effort to orphan {{term}}, because it separates instances that a bot can fix from those that can't be fixed with a bot. —CodeCat 15:26, 25 January 2016 (UTC)

How about we create a fake language code "?" which corresponds to the language "Unknown", so that in all templates which take a language parameter and categorize there would be a category for "blah in Unknown" and we could easily find and work on such problems. - TheDaveRoss 15:31, 25 January 2016 (UTC)
This wouldn't achieve the effect I'm hoping for. —CodeCat 15:47, 25 January 2016 (UTC)
Well, assuming that {{m}} could accept the language code, you will still be able to orphan {{term}}, what other things are you trying to achieve? - TheDaveRoss 17:41, 25 January 2016 (UTC)
{{termwithoutlang}} seems ugly to me, maybe it should be {{term-nolang}}? Benwing2 (talk) 01:40, 26 January 2016 (UTC)
Making it one word would make it easy for editors to doubleclick it and delete the whole thing in one go. Adding a hyphen makes it much more troublesome. —CodeCat 01:50, 26 January 2016 (UTC)
Well then {{termnolang}}. Still looks ugly though. What is the issue with Dave's suggestion? It would be even easier to manually correct that by just changing the ? to the right code. Benwing2 (talk) 01:55, 26 January 2016 (UTC)
If MewBot succeeds in converting all instances of {{term}} with lang into {{m}}, then logically all remaining instances of {{term}} would be without lang and thus would be equivalent to {{termwithoutlang}}. Re: "because it separates instances that a bot can fix from those that can't be fixed with a bot", all instances of {{term}} would be the latter. (except if/when people keep adding {{term}} to entries, of course) --Daniel Carrero (talk) 02:06, 26 January 2016 (UTC)
It's not that easy. Some instances of {{term}} occur inside other templates, and may not appear on the template page itself (so that they don't show up as transclusions). My reason for this request was to make it easier to separate pages that need a language from pages that have a language but are still transcluding {{term}} for some reason. However, I've now made a change to Module:term cleanup instead that adds a tracking template if there is a language. So the proposal isn't needed in the short term. It may still be valuable, though, to make it visually obvious to editors viewing the wikitext that the template needs to be replaced, or at least not added to new entries. Perhaps something like "termwithoutlangpleasereplace" would definitely stand out, and draw the attention of editors who happen to edit the page for other reasons. —CodeCat 03:16, 26 January 2016 (UTC)
I would suggest something like "termtemp". We don't really need to explain why it's being used (except in its documentation), but we do want to make clear that it's not permanent so that people don't copy its use in other entries. Chuck Entz (talk) 03:04, 26 January 2016 (UTC)
@CodeCat, what is the effect you're looking for? Renard Migrant (talk) 11:44, 26 January 2016 (UTC)
  • {{term/t}} isn't doing anything these days; why not temporarily resurrect it for current purposes? —Aɴɢʀ (talk) 14:57, 27 January 2016 (UTC)
  • Oppose. The purpose of this would be to delete {{term}}, which I oppose. If the objective is to minimize the number of instances of term in the mainspace, renaming {{term}} to {{m}} while passing {{m}} some artificial lang code standing for "language missing" would be an okay solution, IMHO. --Dan Polansky (talk) 15:12, 31 January 2016 (UTC)
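
For readers trying to picture the renaming proposed at the top of this thread, a minimal sketch is given here. It assumes that {{term}} receives its language through a named `lang=` parameter; that assumption, the function name `rename_langless_term`, and the regular expressions are illustrative only and do not describe MewBot's actual code. Calls containing nested templates are skipped rather than parsed.

```python
import re

# Hypothetical patterns. TERM_CALL matches a {{term|...}} call that contains
# no nested braces; LANG_PARAM matches a non-empty "lang=" parameter
# (assumed here to be how {{term}} is given its language).
TERM_CALL = re.compile(r"\{\{term\|([^{}]*)\}\}")
LANG_PARAM = re.compile(r"\s*lang\s*=\s*\S")


def rename_langless_term(wikitext: str) -> str:
    """Rename {{term}} calls without a language to {{termwithoutlang}}."""

    def fix(match: re.Match) -> str:
        params = match.group(1).split("|")
        if any(LANG_PARAM.match(p) for p in params):
            # A language is present: leave the call for the normal
            # term -> m conversion.
            return match.group(0)
        return "{{termwithoutlang|" + match.group(1) + "}}"

    return TERM_CALL.sub(fix, wikitext)


if __name__ == "__main__":
    print(rename_langless_term("{{term|foo}} and {{term|bar|lang=de}}"))
```

As noted in the thread, instances of {{term}} that occur inside other templates would not be reached by a text-level pass like this one.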

Amateur Altaicists

I have seen a slew of people add several ridiculous etymological theories. One person was trying to link Etruscans to Turkic, another one was linking random stuff like cannabis and other IE terms to more Turkic stuff.

How do we deal with them? Keep shooting ridiculous claims on sight? Hillcrest98 (talk) 23:45, 25 January 2016 (UTC)

As long as we're sure they're ridiculous. It's fairly widely assumed that cannabis is a loanword in Proto-Indo-European, but there's no consensus on where it came from. The main problem I had with Horsesongrassland's cannabis-related edits was that the etymology already said that it probably came from somewhere else and the references duplicated that, but were incompatible with the rest of the etymology and cited really poor, strongly POV sources. Some of the details were clearly nonsense, but the general idea of some kind of Altaic origin can't be absolutely proven wrong, because there doesn't seem to be enough evidence to prove anything with regard to Altaic- including whether it exists or not. Chuck Entz (talk) 03:27, 26 January 2016 (UTC)

Officializing automated romanizations

Wiktionary:Votes/pl-2015-12/Translations is probably going to fail. I have the intention of creating a new vote with the same proposal, but improved/fixed based on the multiple points raised by the opposers.

One of the points is: "switching Russian: {{t|ru|апельсин|m|tr=apelʹsín}} to Russian: {{t|ru|апельси́н|m}} is a topic for a separate vote, deserving its own discussion". WT:EL#Translations currently uses some Russian examples with manual romanizations (tr= parameter). But Russian can do it automatically, so as requested in Wiktionary talk:Votes/pl-2015-12/Translations#Transliteration, in my proposed change, I want to use Russian examples without the tr= parameter, which implies that automatic romanizations are official policy. Is there any problem or controversy here?

I have no problem creating a separate vote officially "Allowing automatic romanizations", if that's what people want. --Daniel Carrero (talk) 12:25, 26 January 2016 (UTC)

By the current convention, not ALL Russian words are automatically transliterated but over 95%. Only languages under "override_translit" in Module:links have automatic transliteration overriding manual. So, Russian is a bad example. --Anatoli T. (обсудить/вклад) 12:31, 26 January 2016 (UTC)
Current rule in WT:EL#Translations:
  • Do add a transliteration or romanization of a translation into a language that does not use the Roman alphabet. Note however that only widespread romanization systems may be used. See Wiktionary:Transliteration.
My proposed change, as per Wiktionary:Votes/pl-2015-12/Translations:
If there's some controversy, I could further edit the sentence this way:
  • You can add a transliteration or romanization of a translation into a language that does not use the Latin script. In some languages, the romanization can be supplied automatically by the software, but there's no consensus as of yet concerning the acceptability of automatic romanizations and exactly what languages should use them. See Wiktionary:Transliteration and romanization.
Looks good? Of course, if there's consensus in favour of generally having automatic romanizations (I'd vote support on that), then that last change would be unnecessary. --Daniel Carrero (talk) 16:19, 26 January 2016 (UTC)
I'm not clear as to what you're actually proposing. Or are you not proposing anything yet? Renard Migrant (talk) 18:37, 26 January 2016 (UTC)
Proposal 1: For some languages, allowing automatic romanizations.
Proposal 2: In some WT:EL examples of wiki markup of Russian translations in the translation tables, using automatic romanizations.
Reason: I assumed that was a given (i.e., that people generally are supportive of automatic romanizations and that it would be okay mentioning one or two examples in WT:ELE using them), but in Wiktionary:Votes/pl-2015-12/Translations#Oppose, @Dan Polansky complained about it. --Daniel Carrero (talk) 19:22, 26 January 2016 (UTC)
OK, I'm a bit confused about what is specified as current policy and what is being proposed, but I actually wrote and ran a bot to automatically convert {{t|ru|апельсин|m|tr=apelʹsín}} to {{t|ru|апельси́н|m}}, and no one has complained about it; in fact, the main Russian editors here were happy with the results. Actual current policy doesn't agree much at all with the rule that Daniel quoted above. In particular:
  1. "do add a transliteration or romanization" isn't really right. It should ideally only be added when automatic transliteration either doesn't exist for a language or would be wrong. In particular, writing апельсин without a stress mark and then including manual translit apelʹsín with a stress mark is wrong; instead, апельси́н should be written and the auto-translit allowed to work. As Anatoli mentioned, most of the time (I would say over 99%) the automatic transliteration for Russian is correct (provided of course that stress marks are added to the Russian). Pretty much the only time when manual translit is needed for Russian is in cases like тест (tɛst), where the auto-translit would be test. For other languages, it may be needed more often; e.g. for Arabic, it's often needed to specify how a tāʾ marbūṭa should be transliterated (as t or as nothing).
  2. "only widespread romanization systems may be used" gives far too much latitude. This kind of attitude created a huge mess in the Arabic transliterations in translation entries, which took a lot of work on my part to fix (and may have gotten messed up again in more recent entries). Properly, the transliterations must follow the particular translit system used by Wiktionary for that language.
I would propose something like:
  • Add a transliteration or romanization of a translation into a language that does not use the Latin script, except for those languages where the romanization is supplied automatically by the software (but do add a transliteration/romanization if the automatically-provided one is wrong). The transliteration should follow the appropriate Wiktionary-established conventions for the language in question (see Category:Transliteration appendices); do not use any other romanization system.
Benwing2 (talk) 05:37, 27 January 2016 (UTC)
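For illustration only, here is a rough sketch of the kind of conversion described above: moving the stress mark from a manual tr= value onto the Cyrillic spelling so that automatic transliteration can take over. This is not the bot that was actually run. The function names and the vowel-counting heuristic are assumptions made for this example, and it simply gives up (returns None) whenever the vowels do not line up one-to-one, as with тест / tɛst.

```python
import unicodedata

ACUTE = "\u0301"  # combining acute accent used as the stress mark
CYRILLIC_VOWELS = set("аеёиоуыэюяАЕЁИОУЫЭЮЯ")
LATIN_VOWELS = set("aeiouyAEIOUY")  # assumption: one Latin vowel letter per Cyrillic vowel


def stressed_vowel_index(translit):
    """Return the 0-based index of the stressed vowel in the translit, or None."""
    index = -1
    prev_is_vowel = False
    for ch in unicodedata.normalize("NFD", translit):
        if ch in LATIN_VOWELS:
            index += 1
            prev_is_vowel = True
        elif ch == ACUTE and prev_is_vowel:
            return index
        else:
            prev_is_vowel = False
    return None


def move_stress(cyrillic, translit):
    """Copy the stress from a manual translit onto the Cyrillic word.

    Returns the accented Cyrillic, or None if the word should be left
    for a human (no stress found, already accented, or vowel mismatch).
    """
    if ACUTE in cyrillic:
        return None
    idx = stressed_vowel_index(translit)
    latin_count = sum(
        ch in LATIN_VOWELS for ch in unicodedata.normalize("NFD", translit)
    )
    positions = [i for i, ch in enumerate(cyrillic) if ch in CYRILLIC_VOWELS]
    if idx is None or latin_count != len(positions):
        return None
    p = positions[idx]
    return cyrillic[: p + 1] + ACUTE + cyrillic[p + 1 :]
```

With these assumptions, move_stress("апельсин", "apelʹsín") yields the Cyrillic with a combining acute after the last vowel, while move_stress("тест", "tɛst") returns None, so the manual transliteration would be kept.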
@Benwing2: That sounds great to me. I think these are actually our current "unspoken rules", which you could articulate well. Since @Dan Polansky asked for that specific issue to be voted separately, I don't mind creating a separate vote for it. --Daniel Carrero (talk) 13:33, 27 January 2016 (UTC)
Oh you're trying to get common practice codified somewhere. Excellent. Go for it. Renard Migrant (talk) 15:43, 27 January 2016 (UTC)

I created Wiktionary:Votes/pl-2016-01/Automated transliterations. --Daniel Carrero (talk) 03:58, 28 January 2016 (UTC)

Proposed wording:
  • Translations not written in the Latin script should have romanizations. In some cases, the romanization is supplied automatically by the software. Supply the romanization manually if it is not supplied by the software or if the romanization supplied by the software is wrong. The transliteration should follow the appropriate Wiktionary-established conventions for the language in question (see Category:Transliteration policies); do not use any other romanization system.
--Daniel Carrero (talk) 00:45, 30 January 2016 (UTC)
Perhaps the use of manual transliteration for automatically transliterated languages should be mentioned (exceptions as in Korean, Russian, etc.), as should the need to provide word stresses for Cyrillic-based Slavic languages (Serbo-Croatian accents?) and diacritics for Arabic (Hebrew?).
Some languages are in transition and a unified transliteration hasn't been established yet, due to complexities - such as Khmer and Thai. Some transliteration modules are in the process of development or fixing - Lao, maybe Burmese. The Thai module may never work the way other transliteration modules do; it will need phonemic spelling, split by syllables, just like Japanese requires kana readings and PoS info (plus morpheme boundaries in some cases) to determine the correct transliteration. Just commenting. --Anatoli T. (обсудить/вклад) 01:18, 30 January 2016 (UTC)
@Atitarev: IMO, ideally I would want a comprehensive list of transliteration circumstances as you described, but for the moment I'll probably just try to update WT:EL to officially allow transliterations in the first place. --Daniel Carrero (talk) 10:06, 30 January 2016 (UTC)
@Daniel Carrero Sorry, I have been pre-occupied with testing new Thai transliterations and fixes with Russian. Quite busy at work too. There ARE changes currently happening with Thai transliteration methods and headwords and situations with transliterations and requirements with languages are indeed different. Make a list of questions, if you need for transliteration policies/issues and I'll try to answer them. --Anatoli T. (обсудить/вклад) 00:56, 1 February 2016 (UTC)
@Atitarev Thank you very much. :) There's absolutely no need to apologize, language-specific transliteration issues and changes are valuable information to be documented, it's just that WT:EL technically does not even allow automated transliterations. It says: "Add a transliteration or romanization of a translation into a language that does not use the Roman alphabet." If we were to obey that pre-Lua rule, we would have to throw away all transliteration modules, so I'll try to update that rule first, before working on the language-specific issues. (at least that's my plan at the moment) --Daniel Carrero (talk) 01:07, 1 February 2016 (UTC)
I'll just describe briefly how I understand the situation with automated transliterations, not manual transliterations.
Slavic, Cyrillic-based languages should normally use accent marks, especially, Russian, Ukrainian and Belarusian.
Russian: User:Benwing2 kindly converted all Russian translations to have accents when they were present in the manual transliteration. For cases when manual transliterations are required, word stresses are still required. Exceptions requiring manual translit are described, many are now partially automated.
Ukrainian and Belarusian: These don't require manual transliterations, if fully accented Cyrillic forms are provided. There is an unresolved issue with monosyllabic Belarusian words with "ё", fixed in Russian. Manual transliterations shouldn't simply be removed before Cyrillic words get accents.
The above three - currently deciding if we need to use grave accents for the secondary stress. Its usage is inconsistent.
Bulgarian: No manual transliteration is required, if accents are provided. The use of the grave accent is not very clear but normally used with accented vowel "ъ". More info is required on the rules.
Macedonian: No manual transliteration is required. The stress position is predictable but accents should be given for words when they differ from expected.
Serbo-Croatian: (Cyrillic) No transliteration is required. Another nested Roman form should be given to match the Cyrillic. The headwords use accents. (I personally find them problematic but they can be copied from entries if they exist).
Arabic: Automatic transliteration works only with fully vocalised Arabic forms. Loanwords that are pronounced irregularly still need to have accents; manual transliteration is required (or can be provided) for some loanwords, words with "ة" between words, and some words with silent letters.
Korean: Automated but words of certain etymologies need manual transliterations.
Manual translit overrides automatic for all the above.
Greek, Armenian, Georgian, Kazakh, Kyrgyz, Tajik, etc. - fully automated. The list can be found in Module:links
Lao, Burmese: - fully automated. The transliteration is complex, sometimes doesn't work.
Khmer: - needs more work, can't officialise yet.
Hindi, Sanskrit, Nepali - almost there, the modules look good but occasional manual translit is required.
Japanese and Thai are special cases.
Japanese: transliteration works with headwords in kana; there are some exceptions, and additional parameters are sometimes needed to get a correct transliteration. Can't be used to automatically transliterate Japanese words.
Thai: (new) only works in pronunciation sections. It needs phonetic respellings by syllables. Can't be used to automatically transliterate Thai words.
Feel free to add on transliteration policies. --Anatoli T. (обсудить/вклад) 02:02, 1 February 2016 (UTC)
Special thanks from me to User:Wyang and User:Benwing2 (aka Benwing) for making some complex transliterations happen! --Anatoli T. (обсудить/вклад) 02:14, 1 February 2016 (UTC)
That is an amazing list! Thank you! :) --Daniel Carrero (talk) 02:20, 1 February 2016 (UTC)
OK. With Hindi, there seems to be an agreement to provide nuqta and chandra when they affect pronunciations, even if Hindi speakers normally omit them in writing. (Very similar to Russian writing "е" instead of "ё" but dictionaries use "ё", so does Wiktionary). Some Sanskrit lovers prefer to provide word stresses, even if there is no native method to do that (also Hebrew) and add hyphens to show morpheme boundaries. (I personally oppose that but I need to mention it).
Mongolian (Cyrillic): Fully automated, overrides manual translit but it's known that Mongolian Cyrillic is not fully phonetic. Some textbooks and phrasebooks provide a more phonetic transliteration but we don't - no data or editors.--Anatoli T. (обсудить/вклад) 02:30, 1 February 2016 (UTC)
Overall, I would find it hard to set the rules to vote for and I am not sure how I am going to vote. I'd like to officialise the use of "^" to capitalise Korean romanisations (romaja officially capitalises proper nouns). I haven't described all situations, of course. e.g. Tamil, Malayalam, Telugu and Sinhalese don't require manual overrides but Amharic and Tigrinya do (rules for schwa-dropping are not defined and consonant geminations may need to be provided manually). Yiddish words of Hebrew origin are often transliterated and pronounced irregularly. --Anatoli T. (обсудить/вклад) 03:07, 1 February 2016 (UTC)
We don't need to set the rules for each language, we just need to mention that transliterations should appear for non-Latin script languages (with the exception of languages such as Serbo-Croatian, which are exempt as long as an equivalent Latin-script form is supplied), regardless of whether they are manually entered or automatically generated. Then each language can worry about how this happens on its own without policy getting in the way. And even this much should not be part of the Translation Table policy, but general transliteration in links policy. --WikiTiki89 19:14, 3 February 2016 (UTC)

Literal translations from FL to English in the translation table

WT:EL currently says:

  • Do not give translations back into English of idiomatic translations [in the translation table of the English term]. For example, when translating “bell bottoms” into French as “pattes d’éléphant”, do not follow this with the literal translation back into English of “elephant’s feet”. While this sort of information is undoubtedly interesting, it belongs in the entry for the translation itself.

But I propose changing that: some entries give the literal translation in the translation tables. In kill two birds with one stone, we see that the translations of the idiom into other languages are accompanied by their literal translations: "to hit two flies with one slap", "to cook two roasts on one fire", "with one shot, two pigeons", etc. I like that, for the purpose of language comparison, and it also makes it clearer that the translated idiom has a different literal meaning than the English idiom.

I propose officially allowing the literal translations and adding a parameter such as lit= to {{t}} and {{t+}} if there's no parameter like that available yet. The full syntax might be:

  • Portuguese: {{t|pt|matar dois coelhos com uma cajadada só|lit=to kill two rabbits with only one hit of a staff}}

Some other entries currently giving one or more literal translations back to English:

Thoughts? --Daniel Carrero (talk) 16:50, 26 January 2016 (UTC)

Oppose. The translation table is there just to point readers to an entry. The entry should contain all the information, including the literal translation. --WikiTiki89 16:54, 26 January 2016 (UTC)
But we include genders in translation tables for some reason. Why this exception? —CodeCat 17:02, 26 January 2016 (UTC)
That is a consequence of the history of this project. I don't really think we should include genders, but I'm not going to actively propose such a radical change. --WikiTiki89 18:27, 26 January 2016 (UTC)
Genders are such a small amount of information, like literally one character, I don't mind it. But in general yeah we include too much in translation tables. I've seen people cite bilingual dictionaries using <ref></ref> and it's just so much information for a translation table. Stuff like that goes under ===References=== in the entry itself. Renard Migrant (talk) 18:35, 26 January 2016 (UTC)
I tend to oppose this as well, as it is reduplication of effort and is likely to get out of sync. It is also not the most attractive, and might be confusing for some users. - TheDaveRoss 17:07, 26 January 2016 (UTC)
Is there some danger that users might think that all translations given in a translation table are literal translations? Would a user see that all the tea in China in French is tout l'or du monde and wonder which French word means "tea" and which one means "China"? I don't have actual stats to back it up, but if the answer to these questions is "yes", I'd consider it a point in favor of adding FL-to-English literal translations there. If the answer is "no, the users won't ever make that confusion even without the literal translations", then I suppose it would be fine removing the literal translations from the entries that have them. --Daniel Carrero (talk) 17:29, 26 January 2016 (UTC)
I think adding literal translations is a good idea, especially for multiword idiomatic entries. In fact most sayings already have literal translations. There's no need for an extra template or {{t}}-template parameter, the {{gloss}}-template can readily be used for that purpose. Matthias Buchmeier (talk) 18:08, 26 January 2016 (UTC)
Maybe we could use {{gloss}}, but the literal translations are not standardized in the examples above, so IMO it would be better using lit= to standardize them. --Daniel Carrero (talk) 19:23, 26 January 2016 (UTC)
Support, at least to the extent that the literal meaning of the translation is significantly different from that of the English idiom. It's not just interesting (or hilarious), but it can directly influence the choice of idiom and affect its appropriateness. Not sure how essential the template is, or whether users unfamiliar with it would stumble over it, but as a policy allowing this makes good sense. After all, people should know if "come alive" translates as "rise from the grave" in parts of Asia, to cite a famous, if apocryphal, example! P Aculeius (talk) 22:11, 26 January 2016 (UTC)
@P Aculeius: But they can click on the translation and find out in the entry itself. Why does this information have to be duplicated in the translation table? --WikiTiki89 22:44, 26 January 2016 (UTC)
It's natural to suppose that the translation of a word, phrase, or idiom will have the same meaning; and often the meaning is literally represented, but sometimes it's not. Someone unfamiliar with another language may assume that the translation is not merely the closest equivalent, but the literal one. Experience suggests that users won't always look up the translation to see if it means the same thing as the phrase being translated, simply because they assume it means the same thing. If it means something significantly different, it would be a good idea to explain that at the point the translation is given, rather than on a different page. The current policy discourages useful information like this. Why? Does it make Wiktionary operate more smoothly? Does it make it easier to find out what translations of idioms really mean? No. The pages for foreign idioms can contain all kinds of information about them that wouldn't make sense in notes such as those proposed here. But relevant and important information about translations ought to be provided where it would do the most good. P Aculeius (talk) 01:18, 27 January 2016 (UTC)
@P Aculeius: But that applies to pretty much all translations of all terms, even if they are not "idioms". Should we just start including the entire definition section of every term in the translation tables? --WikiTiki89 15:57, 27 January 2016 (UTC)
@Wikitiki89: I disagree that it applies to pretty much all translations of all terms, even if they are not "idioms". olho (Portuguese) means eye and is not an idiom, it does not have a "literal" translation back to English; custar o olho da cara in Portuguese means "to cost very much, to cost an arm and a leg" and literally means "to cost the eye from the face", for this one I would argue that a "lit=" translation would be an improvement. --Daniel Carrero (talk) 19:53, 27 January 2016 (UTC)
olho is just one counterexample and the reason that I said "pretty much all" rather than "all", although in retrospect, I probably should have said "many" rather than "pretty much all". For example, look at the translations in the first translation table for the verb watch. A naive reader looking at that might think that in Portuguese ver and assistir mean the same thing, or that Russian смотреть, наблюдать, and глядеть mean the same thing. But that doesn't mean we should take up space in the translation table to clarify the differences. That is what the entries are for. The entries themselves should be able to link to synonyms or similar words and phrases and explain the differences between them, but that is not what translation tables are for. --WikiTiki89 21:30, 27 January 2016 (UTC)
If you are talking about the first verb sense ("To look at, see, or view for a period of time."), then, yes, ver and assistir can be used interchangeably. (In my experience, as a speaker from São Paulo/Brazil). So, in my view, neither of those need a lit= parameter or an explanation. --Daniel Carrero (talk) 23:17, 27 January 2016 (UTC)
Sorry for my limited knowledge of Portuguese, the Russian example still stands. --WikiTiki89 23:23, 27 January 2016 (UTC)
Idioms are different because their meanings aren't literal or intuitive in either language. Because they're basically oddities of speech, they often have very inexact equivalents from one language to another. Because the meaning in each language tends to be complicated and both are often very different from the literal meaning of the words, these phrases pose especial dangers in translation, which is why it makes sense to provide information about very different meanings. People should know if the Vespugian equivalent of "passing the buck" literally means "your pig can't fish by itself." P Aculeius (talk) 00:02, 28 January 2016 (UTC)
About "it can directly influence the choice of idiom and affect its appropriateness": that's one reason why there should be literal FL-to-English translations in the translation table, IMO. If there aren't any, and there are multiple idioms to choose, then the user would have to open all of them in separate windows to make their choice. It might be particularly annoying if said user is using a cellphone, for example.
Re "at least to the extent that the literal meaning of the translation is significantly different from that of the English idiom": if our coverage of FL-to-English translations in the translation table is good, then I think it would be reasonable to assume that all translations without them have the same meaning as the English term. I don't know exactly about borderline cases: o silêncio é ouro = silence is golden (literal ≈ idiom, the Portuguese one means "silence is gold", so maybe it would be OK not having the literal translation); 言わぬが花 = silence is golden (the Japanese one means "not saying is a flower", so I think it requires the literal translation, which the entry does give) --Daniel Carrero (talk) 23:10, 26 January 2016 (UTC)
  • Support. This is de facto allowed, and should be in EL as well. If it is too contested here, I think it ought to go to a vote. —Μετάknowledgediscuss/deeds 21:34, 27 January 2016 (UTC)
  • Support. Based on my experience, I doubt most people will go to the entry to check that they understand the translation. We should provide enough information in the translations tables that someone can correctly use the word without going to the entry (thus, gender and clarification of distinctions made in other languages that don't exist in the English sense should be included as well). The entry itself serves the purpose of including the finer shades of meaning, the pronunciation of the word, and all its other definitions. Andrew Sheedy (talk) 23:09, 27 January 2016 (UTC)
  • Support. I think this is a good idea and I agree with Andrew that people aren't going to go to the entry to check for a literal translation, but will expect it to be alongside the translation itself. Benwing2 (talk) 03:35, 28 January 2016 (UTC)
  • Support. As Meta notes, this has long been common practice for idioms. (I don't think literal translations are appropriate on acorn, especially since "oak-berry" is etymologically where acorn itself comes from. I do think they're appropriate in idioms' entries.) The argument that readers should get the literal translation from the entry has, in addition to the various problems discussed above, the problem that the translation is not required to be a bluelink. If it is a redlink, then a reader cannot get any information from the (non-existent) entry. - -sche (discuss) 04:32, 29 January 2016 (UTC)

I'd like to create a vote. Proposals:

Policy stuff:

  1. Editing WT:EL to officially allow FL-to-English literal translations in translation tables.

Technical stuff:

  1. Adding a lit= parameter to {{t}} and {{t+}}.
  2. Adding lit= support to the translation table gadget.

Note: Some templates already have the lit= parameter, {{t}} and {{t+}} don't:

  • {{m|pt|foo|lit=bar}} returns foo ‎(literally bar)
  • {{l|pt|foo|lit=bar}} returns foo ‎(literally bar)
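As a minimal sketch of what the proposed parameter would involve, the following Python mimics building a {{t}} call with lit= and rendering it roughly the way {{m}} and {{l}} are shown to render above. The helper names are hypothetical, lit= on {{t}} and {{t+}} is only a proposal at this point, and the real templates emit more markup than this.

```python
def build_t(lang, term, lit=None):
    """Build the wikitext for a translation template call.

    `lit` is the proposed (not yet existing) parameter for {{t}}.
    """
    parts = ["t", lang, term]
    if lit:
        parts.append("lit=" + lit)
    return "{{" + "|".join(parts) + "}}"


def render(term, lit=None):
    """Approximate the displayed text: 'term (literally gloss)'."""
    if lit:
        return f"{term} (literally {lit})"
    return term
```

For example, build_t("pt", "foo", lit="bar") gives {{t|pt|foo|lit=bar}}, and render("foo", "bar") gives foo (literally bar); without lit= the output is just the term, so existing translation lines would be unaffected.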

Possible policy text:

  • If the translated word is an idiom, you can give the literal translation back to English using the parameter |lit=. For example, the idiom “none of your beeswax” cannot be translated into German literally as “nicht dein Bienenwachs”, as this does not have the same meaning in German; an idiomatic translation is “nicht dein Bier” (which means, literally, “not your beer” in English).

--Daniel Carrero (talk) 18:57, 29 January 2016 (UTC)

I created the vote: Wiktionary:Votes/pl-2016-01/Literal translations in translation tables. --Daniel Carrero (talk) 10:36, 30 January 2016 (UTC)

Arabic loanwords and vocalisations

User:Mahmudmasri has been removing vocalisations (diacritics) from Arabic terms borrowed from other languages, especially words with irregular pronunciations (where automatic transliteration differs from manual). I disagree. We need an agreement on this. @Benwing2, Wikitiki89.

Example: كُوت دِيفْوَار ‎(kōt divwār) should still have diacritics - "كُوت دِيفْوَار", not "كوت ديفوار" even if the transliteration is not the expected "kūt dīfwār". --Anatoli T. (обсудить/вклад) 22:19, 26 January 2016 (UTC)

I think the vocalization should be present. Benwing2 (talk) 22:23, 26 January 2016 (UTC)
I have mixed feelings, but I'm leaning toward keeping them. --WikiTiki89 22:46, 26 January 2016 (UTC)
Transliterations aren't for pronunciation respellings. Otherwise, we might as well start transliterating Latin script words to match their pronunciation too. Anyone interested in Dutch chauffeur ‎(sjofeur)? —CodeCat 23:01, 26 January 2016 (UTC)
@CodeCat Please note that many (not all) Arabic transliterations are considered standard and are attested. Hans Wehr dictionary is one of the sources or perhaps the most reliable (but not comprehensive) for transliterations. As in previous arguments, this applies to irregular Thai, Korean and other transliterations. E.g. كُورِيَا ‎(kōriyā, Korea) is transliterated "kōriyā" in the dictionary, not (as automatic) "kūriyā". Similarly Thai ราช ‎(râat) can be either "râat" or "raa-chaa" for etymological reasons.
As for your Dutch example, nobody seems to care about transliterations of Roman-based languages but "chauffeur" (sjofeur) would be transliterated as "шофёр" into Russian Cyrillic. --Anatoli T. (обсудить/вклад) 23:13, 26 January 2016 (UTC)
Transliterations aren't for pronunciation respellings but often do reflect pronunciation. In Arabic this is inevitable; a true transliteration would include only the consonants (Buckwalter transliteration is an example of this), but this would be far from useful for most users, since the vowels are critical for learners. Hans Wehr's transliterations, which we follow subject to a few modifications, are really transcriptions of the pronunciation; this includes cases (mostly loanwords) where written long vowels are pronounced as short vowels, and where written i and u are pronounced as e and o. Similarly, the Russian transliteration system we use partly reflects pronunciation, and transliteration of East Asian languages is definitely pronunciation-based. Benwing2 (talk) 05:00, 27 January 2016 (UTC)
@Benwing2 Did you mean South East Asian, East Asian or both? All those are surprisingly phonetic. Japanese kana really has a couple of true exceptions (well, it is designed for pronunciations but something has changed). There are also semantic and morphemic differences where, e.g., one needs to know if it's ou or ō. Thai has a number of awkward loanwords from Sanskrit and Pali and lots of traditional spellings, like English, but it is a phonetic script, more meaningful than English, actually. Korean is slightly less phonetic than Japanese; it's often to do with Sino-Korean words with South Korean spellings. Chinese is not phonetic but can be considered much more consistent in pronunciations of characters than Japanese, but there are multiple readings as well. --Anatoli T. (обсудить/вклад) 05:18, 27 January 2016 (UTC)
@Atitarev I was thinking of languages like Chinese and Japanese, where our "transliterations" are necessarily pronunciation-based, but it applies e.g. to Thai as well. Keep in mind the proper distinction between a true "transliteration", which is a direct mapping of the written form, and a true "transcription", which is a direct mapping of the spoken form. A true transliteration would follow Thai spelling exactly, and have distinct Latin representations for every distinct Thai letter, so that the reverse conversion from translit back to Thai would be possible. In reality a transcription is used, which doesn't distinguish letters pronounced the same, and romanizes Pali and other loanwords (the "awkward loanwords" and "traditional spellings" you mention) according to pronunciation rather than spelling. All of this is generally the right thing to do, IMO. Benwing2 (talk) 05:34, 27 January 2016 (UTC)
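To make the distinction concrete, here is a toy sketch in Python using the Russian тест example from earlier in this discussion. The three-letter mapping table and the lookup table of transcriptions are made up for illustration and are not Wiktionary's actual modules: a transliteration is a letter-by-letter mapping that can be reversed, whereas a transcription has to come from knowledge of the pronunciation.

```python
# Toy letter-by-letter mapping (assumption: only the letters needed here).
TRANSLIT = {"т": "t", "е": "e", "с": "s"}
REVERSE = {latin: cyr for cyr, latin in TRANSLIT.items()}

# Pronunciation-based forms cannot be derived from spelling alone,
# so this sketch simply looks them up.
TRANSCRIPTIONS = {"тест": "tɛst"}


def transliterate(word):
    """Map each written letter to a fixed Latin letter."""
    return "".join(TRANSLIT[ch] for ch in word)


def detransliterate(romanized):
    """Recover the original spelling; possible because the map is one-to-one."""
    return "".join(REVERSE[ch] for ch in romanized)


def transcribe(word):
    """Return the pronunciation-based romanization for a known word."""
    return TRANSCRIPTIONS[word]
```

Here transliterate("тест") gives test and can be converted back to тест, while transcribe("тест") gives tɛst, which reflects the pronunciation but no longer determines the spelling.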

This discussion is longer than necessary.

  • Keep in mind that the vocalization should be familiar to Arabic speakers, not fake, as it is now.
  • Damma+و are /uː/. The use for /oː/ is not expected. You are wrongly indicating /uː/ rather than /oː/ which has no way other than writing plain و without preceding diacritics.
  • Kasra+ي are /iː/. The use for /eː/ is not expected. You are wrongly indicating /iː/ rather than /eː/ which has no way other than writing plain ي without preceding diacritics.
  • The example of كوريا is wrong, since it is generally Arabized as /ku(ː)rja/; كورى /kuːri/.
  • Diacritics are made for Arabic words, not loanwords. If there is a need to use them in loanwords, they must be used minimally. Some letters don't need vocalization to indicate the expected pronunciation, like the ending of يا, it can't be anything but /ja/.

--Mahmudmasri (talk) 20:35, 6 February 2016 (UTC)

    • @Mahmudmasri The evidence shows that vocalisation is also used for loanwords. I deliberately used كُورِيَا ‎(kōriyā, Korea) because it's referenced (it's Hans Wehr's transliteration). However, your argument (that it should be "kūrya") just shows that there is more than one way to pronounce loanwords, depending on the speaker, region, education and preferences. The only way to indicate the pronunciation for native speakers is vocalisations for all combinations, including vowels o, ō, u, ū / e, ē, i, ī or consonants missing in the Classical Arabic - g, p, v, etc. I can't be that wrong. --Anatoli T. (обсудить/вклад) 00:48, 7 February 2016 (UTC)
I basically agree with Anatoli here. Benwing2 (talk) 02:56, 7 February 2016 (UTC)
I'll just say that we focus too much on Hans-Wehr. Hans-Wehr can be used as a guide, but not as a definitive reference. We need primary sources (i.e. quotations from the "wild") as definitive references for both pronunciation and vocalization of loanwords. What I mean by this is, we can initially give whatever Hans-Wehr gives, but when an investigation is being made in a particular case into the accuracy of a pronunciation or vocalization, Hans-Wehr can no longer be used as a source. Also, by pronunciation, I really mean the one represented by the transliteration. --WikiTiki89 03:02, 7 February 2016 (UTC)
There seem to be some videos on Youtube which use the word; someone who speaks Arabic could take a listen and see how it was pronounced. - -sche (discuss) 03:51, 7 February 2016 (UTC)
There is a fundamental difference between regular Arabic words and Arabic transliterations of loanwords. Regular Arabic words are written in the Arabic abjad, where short vowels usually are not indicated in writing, and long vowels are indicated with letters of prolongation (since there are no letters for vowels other than the diacritics fatha, kasra, damma). Loanwords are written in the Arabic alphabet (a true alphabet), and most vowels are indicated with ا and ي ‎(y) and و ‎(w) (as true vowels), and these vowels do not indicate vowel length. Some very short vowels (the schwa) are not indicated in alphabetic Arabic, because they feel that the lack of a vowel best represents the shortness of the schwa. It means that the automatic Lua transliteration does not work correctly for alphabetic Arabic. —Stephen (Talk) 07:02, 7 February 2016 (UTC)
Indeed; this is why we require manual translit for these words. Benwing2 (talk) 07:12, 7 February 2016 (UTC)
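A toy sketch in Python of why this happens, covering only the handful of letters in the كُوت دِيفْوَار example from the start of this section. It is not the actual Lua module; the rule set is an assumption reduced to "short vowel mark plus matching letter of prolongation gives a long vowel", which is exactly the rule that loanwords break.

```python
FATHA, DAMMA, KASRA, SUKUN = "\u064E", "\u064F", "\u0650", "\u0652"
ALEF, WAW, YEH = "\u0627", "\u0648", "\u064A"

# Only the consonants needed for the example (kaf, teh, dal, feh, reh, waw, yeh).
CONSONANTS = {
    "\u0643": "k", "\u062A": "t", "\u062F": "d",
    "\u0641": "f", "\u0631": "r", WAW: "w", YEH: "y",
}
SHORT = {FATHA: "a", DAMMA: "u", KASRA: "i"}
LONG = {(FATHA, ALEF): "ā", (DAMMA, WAW): "ū", (KASRA, YEH): "ī"}
DIACRITICS = {FATHA, DAMMA, KASRA, SUKUN}


def toy_translit(text):
    """Romanize fully vocalised text using native-word rules only."""
    out = []
    i = 0
    while i < len(text):
        ch = text[i]
        nxt = text[i + 1] if i + 1 < len(text) else ""
        after = text[i + 2] if i + 2 < len(text) else ""
        if ch in SHORT:
            # Vowel mark + letter of prolongation (not itself vocalised) = long vowel.
            if (ch, nxt) in LONG and after not in DIACRITICS:
                out.append(LONG[(ch, nxt)])
                i += 2
                continue
            out.append(SHORT[ch])
        elif ch == SUKUN:
            pass  # marks absence of a vowel; nothing to output
        elif ch in CONSONANTS:
            out.append(CONSONANTS[ch])
        else:
            out.append(ch)  # spaces and anything not covered by the toy tables
        i += 1
    return "".join(out)


def needs_manual(text, attested):
    """True when the rule-based output differs from the attested romanization."""
    return toy_translit(text) != attested
```

Applied to the vocalised spelling of كُوت دِيفْوَار, these rules produce kūt dīfwār, so needs_manual(...) is true when compared against kōt divwār and the manual value has to be supplied.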
We've got several aspects here, some are related, some are not.
Do Arabic words borrowed from other languages get diacritics? Yes, they do. It's our current policy to provide vocalisations for Arabic words. (Objections should have been made earlier, IMO.) The differences from native words:
  1. Loanwords don't get ʾiʿrāb endings like native words do. It's not an issue here, as we normally don't include ʾiʿrāb in headwords or translations, only in inflection tables (or some known exceptions where definite and indefinite forms differ greatly).
  2. It is perceived that only religious texts, such as Qur'an use diacritics. Qur'an doesn't have foreign words. However, foreign words appear in books for children and foreigners, in textbooks and dictionaries and they get vocalisations.
  3. Pronunciations of loanwords may differ substantially from native Arabic words, and long vowels may be used to render short vowels or vowels not present in standard Arabic - ي (y) to render e, ē, short i, و (w) to render o, ō, ū and short u, ا can be used for a short a. Consonants g, p, v, č, etc. (absent in the Arabic alphabet) can be rendered with other letters. They still get vocalisations. Where the automatic transliteration comes out wrong, the correct transliteration should be entered manually (we DO have words with incorrect transliterations, no-one has denied it). ا and ي (y) and و (w) may represent either long or short vowels in loanwords but they still have to use fatḥa, kasra and ḍamma.
Of course, Hans Wehr is not the only source of information, no-one said it was. Besides, Hans Wehr is not a source for vocalisation but it definitely is a good source for transliterations. If there is evidence that a specific word is pronounced differently, we can change it.
Checking for vocalisations (diacritics) in Google Books would be useless because Arabic books don't use vocalisations, but selected dictionaries do, and there's also evidence that vocalisation is used with irregularly pronounced words - loanwords and dialectal words. We have adopted the use of diacritics and should stick to it. --Anatoli T. (обсудить/вклад) 12:02, 7 February 2016 (UTC)

Random search on the internet: كوريا /ku(ː)rja/ [2]; كورى /kuːri/ [3]. --Mahmudmasri (talk) 14:43, 7 February 2016 (UTC)

@Mahmudmasri If you're still going on about كوريا, I have added "kūryā" as an alternative transliteration. BTW, I think we agreed to transliterate the final alif with "ā", even if it's shortened in pronunciation. Anyway, I think we need to keep "kōriyā" as well, which is referenced, unless it's proven that it is incorrect and only "kūryā" (kūrya) is correct. --Anatoli T. (обсудить/вклад) 00:49, 8 February 2016 (UTC)
That's exactly my point. It's not referenced because other dictionaries are not valid references for Wiktionary. --WikiTiki89 16:10, 8 February 2016 (UTC)
What is not referenced? --Anatoli T. (обсудить/вклад) 21:51, 8 February 2016 (UTC)
You said "Anyway, I think we need to keep 'kōriyā' as well, which is referenced, unless it's proven that it is incorrect". I disagree that Hans-Wehr counts as a reference for us to keep it. --WikiTiki89 22:09, 8 February 2016 (UTC)
If we don't use Hans Wehr's transliteration, what do we use? There are no books written in Arabic transliterations, they are only used in dictionaries and textbooks. Vocalised Arabic, if available, will result in many mistransliterations for loanwords. As I said, transliteration of loanwords may need some tweaking but Hans Wehr is one of the few limited resources on comprehensive transliterations available. I trust native speakers' judgement but I don't think we should discard Hans Wehr as a resource. Hans Wehr dictionary IS the reference in many cases. You were there when we decided on largely adopting Hans Wehr's methods. Why this negative attitude now? Did you find any mistakes in the dictionary? --Anatoli T. (обсудить/вклад) 11:18, 9 February 2016 (UTC)
I think the basic issue is whether the vocalizations are usage-based, like the definitions and the basic part of the spelling, or reference-based, like the etymologies. Arabic may be a well-documented language, but the vocalizations just aren't used that much to convey meaning, so vocalized Arabic is more like a less-documented language. How much of the vocalization on words not used in children's books can actually be verified without resorting to mentions in dictionaries? Chuck Entz (talk) 14:39, 9 February 2016 (UTC)
I didn't say we need to attest the transliterations in the same way we would attest words and spellings, but that we need to verify them somehow in the wild, such as with YouTube videos. If you didn't understand what I originally said, I said that we can use a Hans-Wehr transliteration initially as long as that particular transliteration is not contested, but as soon as that transliteration is contested we have to find another source. --WikiTiki89 16:28, 9 February 2016 (UTC)
The original term in the discussion wasn't from HW, it's not contested. As for كوريا, I'm not sure Mahmud insists that kōryā is absolutely wrong and I don't think it is. It's like saying موسكو (Moscow) should be mūskū, not mosko. There's more than one reading. Recently I found the vocalised form of جرسون (garsōn, "waiter", colloquial), a word disputed on my talk page by another native speaker, in an Oxford dictionary, which gives no transliteration. We need to vocalise all words, even if not all words can be found in this form in the literature for obvious reasons, just a policy thing. As for contested transliterations, I have already said that we do have incorrect cases and we need to fix them but we won't find transliterations for each term in books. —This unsigned comment was added by Atitarev (talk • contribs).
I'm sorry if I misunderstood, but I assumed that you were saying that you got "kōriyā" from Hans-Wehr, and that Mahmudmasri was saying that it was wrong. Anyway, I'll repeat that I never said that we have to find transliterations in books in order to use them. --WikiTiki89 20:44, 9 February 2016 (UTC)
Well, if Mahmud proves that "kōriyā" is incorrect, we can take it out, even if it's in Hans Wehr. Do we need to search for "kōriyā" pronunciations, or should we trust the HW dictionary?
Let's take جَرْسُون ‎(garsōn) as a good example for this discussion. I trust the native judgement that it should be "garsōn" (from French garçon), not "jarsūn" as the spelling would suggest but Oxford English-Arabic dictionary only has the spelling "جَرْسُون". In this case we have an attested vocalised spelling of a loanword but unattested transliteration but I am sure we can find pronunciation examples. BTW, I won't be surprised if "jarsōn" is also used outside Egypt.
Unlike with Russian, which is much more homogeneous, the opinion of one Arabic speaker may not even be enough to contest a referenced spelling or pronunciation. --Anatoli T. (обсудить/вклад) 22:25, 9 February 2016 (UTC)
He doesn't have to prove that it is incorrect. Once he is contesting it, we have to prove that it is correct. Again, it's not the spelling we have to verify, but the pronunciation. For example, if you found a YouTube video in which the pronunciation "kōriyā" is used, that would be enough evidence, but Hans-Wehr is not evidence. It's like RFV but more informal. --WikiTiki89 22:29, 9 February 2016 (UTC)
Of course Hans Wehr is evidence. His transcriptions may reflect a particularly formal style of speaking but they're not put there just for someone's random amusement. We can argue whether it's sufficient by itself but it's certainly evidence. Benwing2 (talk) 22:49, 9 February 2016 (UTC)
Sorry, I meant "valid evidence for our purposes". Of course it is evidence, everything is evidence. Hans-Wehr could have assumed or even invented a transliteration when its own editors lacked evidence. --WikiTiki89 23:09, 9 February 2016 (UTC)
Why isn't it valid? Benwing2 (talk) 23:11, 9 February 2016 (UTC)
Other than what I literally just said, Wiktionary on principle does not accept other dictionaries as evidence. Because of Arabic's unusual situation, I'm assuming a relaxed version of CFI and the RFV procedure for the purposes of verifying transliterations and vocalizations. Essentially, once someone comes along and disputes a transliteration (i.e. the pronunciation), we have to go and make sure that our source itself was not wrong, and our source in this case is Hans-Wehr. Do you disagree? --WikiTiki89 23:30, 9 February 2016 (UTC)
I don't know the rules for accepting evidence but it surprises me that we can't use other dictionaries. I've seen plenty of cases where other dictionaries appear in references for definitions. I don't disagree that we should verify pronunciations if they're disputed but that doesn't mean they should be removed pending resolution. Also, I agree with Anatoli that a native Arabic speaker's own intuition isn't sufficient evidence given that there's so much variety and that MSA isn't even anyone's native language (and the line between MSA and dialect isn't very well defined). Benwing2 (talk) 00:32, 10 February 2016 (UTC)
Yes. A reference from a universally accepted dictionary, such as Hans Wehr should be taken seriously and editors do use dictionaries as references. Even more often than references from books.
I have just checked with a native Arabic speaker, a lady from Iraq. She says both pronunciations are acceptable. There you go. --Anatoli T. (обсудить/вклад) 05:46, 10 February 2016 (UTC)
I don't mean that you can't list in the reference section. It's just that the dictionary does not count as verifying the pronunciation. And I'm also not saying that we should immediately delete a pronunciation that is disputed. We should do the research first, but if we then fail to find any evidence "in the wild" of a given pronunciation, its existence in the dictionary should not save it from being deleted at that point. @Atitarev: I was commenting on the procedure, not on the specific word كوريا. But it's great that you verified it. --WikiTiki89 06:20, 10 February 2016 (UTC)

My concern is the procedure as well. I had no doubts that a pronunciation from Hans Wehr is verifiable and would be confirmed sooner or later. If we do research for every word that is also present in a notable (trustworthy) dictionary, very few entries will get created. --Anatoli T. (обсудить/вклад) 22:58, 10 February 2016 (UTC)

That's why I'm saying that we don't need to research every word. Only the ones that someone explicitly contests. --WikiTiki89 23:37, 10 February 2016 (UTC)
google:"كوريا" "koriya" (not many sites seem to use macrons) gets some hits that seem to include transliterated Arabic (as well as hits that seem to transliterate other languages). Mahmud seems to cite some YouTube videos for other pronunciations above. google:"Kūriyā" also gets some hits and seems to be worth investigating. Make of that what you will. - -sche (discuss) 22:50, 9 February 2016 (UTC)

Pali in non-Latin scripts

IMO, Pali in any non-Latin script should redirect to the Latin form, otherwise we end up with five entries with the same definitions, and mistakes can propagate (like rāja, which was a misspelling of rājā, and all the non-Latin stuff had to be removed). —Aryamanarora (मुझसे बात करो) 23:30, 26 January 2016 (UTC)

Support. Wyang (talk) 00:39, 27 January 2016 (UTC)
Soft redirects (with no definitions, just links, please), not hard redirects. Formats of soft redirects to be discussed. --Anatoli T. (обсудить/вклад) 00:52, 27 January 2016 (UTC)
Of course, soft redirects only - something like {{zh-see}}. —Aryamanarora (मुझसे बात करो) 01:07, 27 January 2016 (UTC)

Made {{pi-sc}} with documentation. Aryamanarora (मुझसे बात करो) 21:14, 27 January 2016 (UTC)

I put it into use at ဗျဂ္ဃ and ब्यग्घ for illustration purposes. Do we like this and want to do things this way? (I do!) —Aɴɢʀ (talk) 21:29, 27 January 2016 (UTC)

Future IdeaLab Campaigns results

Last December, I invited you to help determine future IdeaLab campaigns by submitting and voting on different possible topics. I'm happy to announce the results of your participation, and encourage you to review them and our next steps for implementing those campaigns this year. Thank you to everyone who volunteered time to participate and submit ideas.

With great thanks,

I JethroBT (WMF), Community Resources, Wikimedia Foundation. 23:49, 26 January 2016 (UTC)

Translations of taxonomic names

WT:EL#Translations currently says:

  • Translations are to be given for English words only. [...]

But, some Translingual entries for taxonomic names have translation tables. WT:EL#Translations does not allow Translingual translation tables. Non-exhaustive list:

I created Wiktionary:Votes/pl-2015-12/Translations last month, which proposed rewriting WT:EL#Translations completely. It is most likely going to fail today, 23:59. One of the points raised by the opposers is that my rewrite (also) does not allow Translingual translation tables.

So, I have a proposal:

  • Officially allowing translation tables for taxonomic names, and rewriting the part(s) of WT:EL required to that effect.

But I have some questions: Wouldn't the (Translingual) translation table of Canidae be a duplication of the (English) translation table of canid? If so, shouldn't Canidae use {{trans-see|canid}}? Anyway, there are probably some taxonomic names without easily found English counterparts; I think having a translation table would be most useful for them.

What about actual language-specific taxonomic names like ホモサピエンス, 호모 사피엔스, होमो सैपियन्स and гомо сапиенс, all of which apparently read close enough to "Homo sapiens" rather than "human"? One might argue that they should be in the translation table of Homo sapiens, but AFAIK, Translingual translation tables most often have the colloquial name ("dog", rather than "Canis familiaris") in different languages. For the moment, I placed the "Homo sapiens" variants in the See also section in Homo sapiens, rather than in a translation table.

Some older discussions:

--Daniel Carrero (talk) 14:10, 27 January 2016 (UTC)

Pinging everyone who voted in Wiktionary:Votes/pl-2015-12/Translations: @Metaknowledge, Andrew Sheedy, I'm so meta even this acronym, DCDuring, Dan Polansky, Xbony2. --Daniel Carrero (talk) 14:25, 27 January 2016 (UTC)
The use of {{trans-see}} would be highly desirable. It is an open question whether some uncommon English vernacular names should be the location of the translation table, rather than the corresponding taxonomic name.
Some English vernacular names (eg, hippoboscid) are almost certainly uncommonly used outside of a scientific context. I suspect that some corresponding FL names are similar.
Many English vernacular names are unclear as to their scope. In use, especially outside a strictly scientific context, does canid include genus Canis, tribe Canini (includes genera that are somewhat fox-like IMO), subfamily Caninae (includes genera that are much more fox-like than dog-like in appearance IMO, as well as some more like other carnivores), family Canidae (including foxes), suborder Caniformia (including weasels and seals)? Is dog meant to include only domesticated varieties of Canis lupus familiaris or does it include some other species of wild Canis, including extinct ones? Doesn't it have definitions without any specific relationship to any taxon, eg, "an animal resembling a typical dog in size and in facial features"?
As taxonomy increasingly departs from exclusive reliance on gross and other morphological features of organisms, the relationship between taxonomic names and common vernacular names is likely to become ever more tenuous. DCDuring TALK 14:58, 27 January 2016 (UTC)
Yeah codify it, I'm all for it. Renard Migrant (talk) 15:45, 27 January 2016 (UTC)
I would tend to keep the translations in the Translingual entry, with an additional table in the equivalent English entry. The reason for having both tables is this: the Translingual entry has a greater degree of precision, while the more colloquial/language-specific terms can vary in meaning, and don't always correspond exactly to the taxonomic name. Also, certain terms could be used to translate Translingual ones, but might not be appropriate translations of a word that was chosen to host the translations instead of on the Translingual page.
To further clarify (or maybe obfuscate), the entry at Hominidae could host great ape, pongid, or hominid as translations (note that neither of the first two terms typically include humans) for English, and hominidé and grand singe for French. Only one of those could host the translations, namely hominid in this case, but that would (a) eliminate the other English translations, and (b) split up hominidé and grand singe between hominid and great ape.
I realize it may sound like I am arguing for keeping translation tables at Translingual entries to the exclusion of the English ones, but there are of course reasons to have them in the English entry as well.
Hopefully I got my meaning across, as I'm somewhat tired and tend to produce messes of confusion when that is the case. Andrew Sheedy (talk) 23:45, 27 January 2016 (UTC)
For one, I understand what you mean, @Andrew Sheedy. Well, I'll create a vote later for this. The proposal is as I said in the first message: "Officially allowing translation tables for taxonomic names, and rewriting the part(s) of WT:EL required to that effect." The choice of using an actual translation table or {{trans-see}} for an entry in particular would be done on a case-by-case basis, I think. Maybe we should have a paragraph about {{trans-see}} saying that it can be used "cross-language" between English and Translingual, without introducing any hard "requirements" like "always keep translation tables in X language to the exclusion of Y". --Daniel Carrero (talk) 20:00, 29 January 2016 (UTC)
It might be helpful to have a version of the translation table template like this: {{trans-top|Translingual gloss|see also|English entry}}, keeping a translation table in both entries, but letting users know that they might find more translations in the English entry. The same thing could be in the English entry, linking to the Translingual one. Andrew Sheedy (talk) 20:58, 29 January 2016 (UTC)
The only time {{trans-see}} should be used is if the target is very nearly an exact synonym for the source. Sometimes that occurs between vernacular and taxonomic names, especially where there is an official or semi-official body that prescribes the names. Mammal Species of the World and especially Birds of the World: Recommended English Names include such names. For vernacular names not so prescribed, the correspondence is often imperfect and sometimes nearly hopeless, eg, fish, which does not correspond to any taxon, and the more recent DNA-based clade names, which defy economical definition, let alone brief English or other language vernacular names. DCDuring TALK 21:26, 29 January 2016 (UTC)
I agree with Daniel Carrero; it's too early for general principles on where and when to use full translation tables, {{trans-see}}, or whatever. Let's just officially allow their use, and then wait till we have the precedents on which to base our casuistry. — I.S.M.E.T.A. 02:31, 30 January 2016 (UTC)

I created Wiktionary:Votes/pl-2016-01/Translations of taxonomic names.

Current text in WT:EL#Translations:

Translations are to be given for English words only. In entries for foreign words, only the English translation is given, instead of a definition. Any translation between two foreign languages is best handled on the Wiktionaries in those languages.

Proposed text:

Translations should be given in English entries, and also in Translingual entries for taxonomic names. Entries for languages other than English and Translingual should not have Translations sections; usually, the English translation is given, instead of a definition. Any translation between two foreign languages is best handled on the Wiktionaries in those languages.

Looks good? --Daniel Carrero (talk) 23:42, 30 January 2016 (UTC)

OK with me. It does the important things. The rest of what I mentioned was just intended to present some aspects of implementation. It would be, at least, premature and, possibly, completely unnecessary to go further than the text above goes. DCDuring TALK 01:22, 31 January 2016 (UTC)
Looks good to me. Andrew Sheedy (talk) 05:11, 31 January 2016 (UTC)

Transliteration policies

I organized the transliteration pages in Category:Transliteration policies.

Before, they were a mess randomly distributed in Category:Wiktionary:Transliteration and Category:Transliteration appendices. A few (like Wiktionary:Classical Syriac transliteration) were not in a language category (like Category:Classical Syriac language). I made all the pages use the same naming system and categories, through the use of {{transliteration policy}}.

Compare Category:Script appendices, which is a category explaining scripts/characters but for readers, not editors. There was some overlap between this and that category, but I believe I was able to separate them by renaming a few pages and choosing the right category. I don't think the system is perfect but it's more consistent than before. --Daniel Carrero (talk) 17:34, 27 January 2016 (UTC)

Thank you! Benwing2 (talk) 18:34, 27 January 2016 (UTC)

Placement of "Usage notes"

WT:EL#Order of headings currently has a note between parentheses concerning the "Usage notes" section:

  • Usage notes (can be placed anywhere appropriate)

This was actually voted on in 2007, see Wiktionary:Votes/pl-2007-06/ELE level 4 header sequence. But I don't believe we can actually place it anywhere nowadays, can we? This has been discussed recently at Wiktionary talk:Votes/pl-2015-12/Usage notes#Location of the usage notes header, Wiktionary:Votes/pl-2015-12/Usage notes/old#Oppose and Wiktionary:Votes/pl-2015-12/Usage notes#Support.

Note: until today, WT:EL had the following sentence, repeating that rule: "[Usage notes], whether identified by a heading or indent level may come anywhere." But I removed it, per the vote Wiktionary:Votes/pl-2015-12/Usage notes.

I have a proposal:

  • Officially deciding what is the proper place for the "Usage notes" header and disallowing the rule that "Usage notes can be placed anywhere".

According to @Wikitiki89 in this link (which is the first discussion I linked above), the placement would be: "immediately after the definitions if there is no inflected forms section or after the inflected forms section (===Conjugation===, ===Declension===, ===Inflection===, etc.), which is itself immediately after the definitions". Sounds good? --Daniel Carrero (talk) 02:52, 28 January 2016 (UTC)

Sounds good to me. Sometimes I've seen it before the inflections section but I think it should go after. Benwing2 (talk) 03:26, 28 January 2016 (UTC)
That has my support. I think there's only one time I've seen it elsewhere, anyway. Andrew Sheedy (talk) 03:28, 28 January 2016 (UTC)
I hope that would de-legitimize the Usage notes header (not content) appearing in Pronunciation sections (where I have seen it) or other pre-definition locations (where I have not). DCDuring TALK 12:38, 28 January 2016 (UTC)
Just after the definitions is the most logical place, and is what I do. Donnanz (talk) 12:44, 28 January 2016 (UTC)
  • At [[Ogham]] the Usage notes header has content about pronunciation. Is that where we want the content to be or should it appear in the pronunciation section without a header? DCDuring TALK 13:26, 28 January 2016 (UTC)
    Support this: "appear in the pronunciation section without a header".
    These aren't actual "Usage notes". At best, they are "Pronunciation notes", which don't require a separate header. --Daniel Carrero (talk) 13:34, 28 January 2016 (UTC)
  • Support placing it after definitions, before inflection. —CodeCat 16:33, 28 January 2016 (UTC)
    How does it make sense to place it before the inflection? After all, the inflection section is just an extension of the headword line. Not only that, but usage notes very frequently reference the inflections. I think it makes more sense as thinking of the headword line, definitions, and inflection section as one logical unit that cannot be separated. --WikiTiki89 16:51, 28 January 2016 (UTC)
    I've always looked at it as Wikitiki does. That order does put additional pressure to make sure that the inflection is vertically compact. DCDuring TALK 20:35, 28 January 2016 (UTC)
My understanding was that usage notes belonged after the definitions; I am surprised that WT:EL says otherwise; this is a reminder that we must check periodically that our policies and our practices match. Like CodeCat, I think the usage notes should go immediately after the definitions, before any inflection section, because the usage notes often contain vital information about the definitions, and pushing them below inflection information increases the chance that readers who only came to look at the definitions will not notice that there are usage notes which provide additional information on the definitions. Even when the usage notes are about inflection info, I think it's appropriate to put them right after the definitions (as on regnen). As a compromise, perhaps we could allow the usage notes to go either before or after inflection info, depending on whether or not the usage notes were about the definitions or about the inflection (this would mean we'd flip the order of the usage notes and inflection info in regnen and neger). - -sche (discuss) 21:05, 28 January 2016 (UTC)
As it doesn't affect either English or Translingual L2 sections, I would abstain in a vote on the narrow question of whether Usage notes were immediately before or after inflection tables. DCDuring TALK 22:22, 28 January 2016 (UTC)
I'd usually expect notes about the inflection to be in the inflection header itself. Random example: French pâtir uses a certain conjugation template that returns the text "This is a regular verb of the second conjugation, like finir, choisir, and most other verbs with infinitives ending in -ir. One salient feature of this conjugation is the repeated appearance of the infix -iss-." before the conjugation table. --Daniel Carrero (talk) 22:45, 28 January 2016 (UTC)
Not those kinds of notes. Those are really just curiosities and not necessary to include anyway. I'm talking more about things like an explanation of when different variants are used. For example, the note about puis in the entry for pouvoir (which probably needs a more detailed explanation with examples). Or the note about человек vs. людей as the genitive plural of человек. --WikiTiki89 02:09, 29 January 2016 (UTC)
Data: I used AWB to page through the first 100 entries starting with 'f' (a letter I picked at random) which had usage notes sections as of the 2015-07-02 database dump (yes, I should download a newer dump, but it doesn't make a difference here unless you think the overall proportion of definition-centric vs inflection-centric usage notes has changed in the last 6 months). Of these hopefully representative entries' usage notes,
  1. 32% (examples: facient, fag, fait accompli, falseness) expand upon or contain information about specific definitions and/or registers or contexts (offensiveness, transitivity, etc) of specific definitions or of the term as a whole. Whether you buy the claim or not, this information has a stronger claim than other kinds of information to belonging directly after the definitions, before inflection information.
  2. 17% (examples: f., f's, faca, facet, faire, faire caoud, falten) contain information about inflected forms / the inflection of the word, or about lenition, which we seem to treat like inflection. Whether you (or I) buy the claim or not, this information has a stronger claim than other kinds of information to belonging after inflection information.
  3. 22% (examples: fa chomhair, fa-near, fail, fallacious, faoin, fara) contain information about which words or case-forms are used with the word in question (e.g. saying a verb takes the dative case).
  4. 29% had some other function, e.g. functioning as glorified synonyms sections.
- -sche (discuss) 22:54, 28 January 2016 (UTC)
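For anyone who wants to rerun the tally on a newer dump, the four figures above can be checked with a few lines of Python. This is only a sketch: the category keys are labels made up here for illustration (they are not Wiktionary terminology), and the counts are simply the percentages from the 100-entry sample, where each percentage equals a raw count.

```python
# Category counts from the 100-entry sample described above.
# The keys are hypothetical labels chosen for this sketch.
SAMPLE_SIZE = 100

usage_note_counts = {
    "definition_or_register": 32,   # category 1: about definitions, registers, contexts
    "inflection_or_lenition": 17,   # category 2: about inflected forms or lenition
    "governed_words_or_cases": 22,  # category 3: which words or case-forms are used
    "other": 29,                    # category 4: everything else
}


def share(category: str) -> float:
    """Fraction of the sample that falls into the given category."""
    return usage_note_counts[category] / SAMPLE_SIZE


def ratio(first: str, second: str) -> float:
    """How many notes of the first kind there are per note of the second kind."""
    return usage_note_counts[first] / usage_note_counts[second]


# The four categories should account for the whole sample.
total = sum(usage_note_counts.values())

# Definition-centric notes compared with inflection-centric notes.
definition_vs_inflection = ratio("definition_or_register", "inflection_or_lenition")
```

On these numbers, definition-centric notes outnumber inflection-centric ones by a little under two to one, which is the comparison the ordering question turns on.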
For comparison, fr.Wikt places notes directly after definitions, while full inflection information (of the sort we're discussing interpolating between definitions and usage notes) is on a separate page; see fr:baiser (the note section is called "Note"). de.Wikt places notes before definitions, and full inflection information on a separate page; see de:rheinisch and de:US-amerikanisch (the notes sections are called "Anmerkung" with or without additional words). This speaks in favour of the idea that usage notes are more closely bound to definitions than inflection information is. Even when the usage notes are about inflection, they're as near to the inflection if they're above it as they are if they're below it. For this reason and the reasons I gave above, I favour the order definitions, then usage notes, then inflection. - -sche (discuss) 23:31, 28 January 2016 (UTC)
@-sche These are good points. I'm starting to think it would be a good idea supporting the order: Definitions, Usage notes, Inflection (rather than Definitions, Inflection, Usage notes). I admit I'll have to think about it with more clarity after sleeping. If most people support that order, then I could create a vote about that order specifically. But if people remain divided about the exact order, maybe I should create a vote with both options. --Daniel Carrero (talk) 02:52, 29 January 2016 (UTC)
Regardless of how much support you think there is, the vote should have both options. I think it should be a two-section vote. Section one would be whether we want to fix the position of the Usage notes section in either of these locations. Section two would be whether it should be before the inflection line, after the inflection section, leave up to personal preference, or (perhaps) have it depend on whether the particular usage note is talking about the definitions or the inflections. --WikiTiki89 03:00, 29 January 2016 (UTC)
@Wikitiki89 I feel that section 1 would be redundant. If a person supports any option in section 2, it would be equivalent to support the section 1, wouldn't it? I propose setting up the vote this way:
Voting on: What is the placement of the "Usage notes" section in all languages. (Currently it is the only section where WT:EL states: "can be placed anywhere appropriate".)
Note: In the proposals below, "Inflection" can be replaced by Declension, Conjugation, etc.
Proposal 1: The sections should be ordered this way in all entries:
  • Part of speech, Inflection (if available), Usage notes.
Proposal 2: The sections should be ordered this way in all entries:
  • Part of speech, Usage notes, Inflection (if available).
Proposal 3: The sections should be ordered in either of these ways, up to personal preference, in all entries:
  • Part of speech, Inflection (if available), Usage notes.
  • Part of speech, Usage notes, Inflection (if available).
Proposal 4: The sections should be ordered in either of these ways, depending specifically on the contents of the usage notes, in all entries:
  • Part of speech, Usage notes, Inflection (if available). (if the usage notes are about the definition)
  • Part of speech, Inflection (if available), Usage notes. (if the usage notes are about the inflection)
Proposal 5: Allow "Usage notes" to be used freely anywhere in all entries.
--Daniel Carrero (talk) 14:59, 31 January 2016 (UTC)
@Daniel Carrero: The problem with your suggestion is that it is unclear how to close it if the results are close. The point of my proposal was that section 1 would determine whether the vote passes, and section 2 would determine how it passes. If section 1 passes, but section 2 is inconclusive, the default is to leave it unspecified whether the Usage notes should come before or after the inflection section. While with your proposal, in the equivalent situation, the whole vote fails. --WikiTiki89 19:23, 3 February 2016 (UTC)
@Wikitiki89 I edited Wiktionary:Votes/2016-02/Placement of "Usage notes" per your proposal. --Daniel Carrero (talk) 09:16, 4 February 2016 (UTC)
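The two-section closing rule WikiTiki89 describes above can be written out as a small Python sketch. The function name, its parameters and the outcome strings are all hypothetical and only illustrate the logic; nothing here is an actual vote-closing tool or an agreed procedure.

```python
from typing import Optional


def close_vote(section1_passes: bool, section2_winner: Optional[str]) -> str:
    """Combine the two sections of the vote into one outcome.

    section1_passes: whether voters agreed to fix the position of
        "Usage notes" at all (hypothetical parameter).
    section2_winner: the winning placement from section 2, or None
        if section 2 was inconclusive (hypothetical parameter).
    """
    if not section1_passes:
        # Section 1 decides whether the vote passes at all.
        return "fail: placement of Usage notes stays as it is"
    if section2_winner is None:
        # Section 1 passed, section 2 inconclusive: the default applies.
        return "pass: order relative to the inflection section left unspecified"
    # Both sections gave a clear result.
    return f"pass: Usage notes go {section2_winner}"
```

The design point is that an inconclusive section 2 no longer sinks the whole vote, which is the difference from the single-list version with five competing proposals.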
I agree. I think the option to place the usage notes either before or after the inflection should definitely be in the vote. Benwing2 (talk) 03:13, 29 January 2016 (UTC)
Sure, I'll make it available. --Daniel Carrero (talk) 01:45, 31 January 2016 (UTC)
Excellent data collection and analysis. However, the sample should have discarded data from all English (and taxonomic) sections for reasons analogous to why I would abstain from voting on this narrow matter. I don't know whether data from inflected forms entries should also be discarded because no inflection table can appear. Only entries which at least could have inflection tables should be sampled or retained after sampling. DCDuring TALK 14:23, 29 January 2016 (UTC)

I have a separate proposal, to tackle a problem noted above. Adding this rule somewhere on WT:EL (WT:EL#Usage notes and/or WT:EL#Pronunciation):

  • Notes about pronunciations should be placed in the "Pronunciation" section. Entries should not have a "Usage notes" section whose only purpose is to give pronunciation notes.

This could be done in a separate vote, maybe. --Daniel Carrero (talk) 01:45, 31 January 2016 (UTC)

I created Wiktionary:Votes/pl-2016-02/Notes about pronunciations, to address the "pronunciation"/"usage notes" issue:
  • Entries should not have a "Usage notes" section whose only purpose is having notes about the pronunciation. Pronunciation notes can be added directly in the "Pronunciation" section.
--Daniel Carrero (talk) 16:29, 1 February 2016 (UTC)

Bold "Do" and "Do not" in WT:EL

Currently, WT:EL#Translation dos and don’ts has 8 rules starting with bold Do or Do not:

This was a change introduced by Paul G in diff (23 July 2007). I find that distracting. Is that really necessary? No other part of WT:EL has the same formatting.

I have a proposal, which I would like to do without a vote (an unsubstantial change):

  • Remove "Do" from all positive statements. ("Do provide" = "Provide"; "Do ensure" = "Ensure").
  • Keep "Do not" in all cases, but without the bold formatting ("Do not add the pronunciation" = "Do not add the pronunciation").

Can I do that? --Daniel Carrero (talk) 04:20, 28 January 2016 (UTC)

Disclaimer: I'm not saying that I agree (or disagree) with any of these rules. I just don't like the current presentation. --Daniel Carrero (talk) 04:41, 28 January 2016 (UTC)
I agree, they come across as intimidating with the bold Do/Do not. Redoing them as you specify is fine with me. Benwing2 (talk) 05:12, 28 January 2016 (UTC)
I thought the same thing. I say go for it. Andrew Sheedy (talk) 05:15, 28 January 2016 (UTC)
Support. —Μετάknowledgediscuss/deeds 06:23, 28 January 2016 (UTC)
Support. - -sche (discuss) 04:06, 29 January 2016 (UTC)
Done. --Daniel Carrero (talk) 00:36, 30 January 2016 (UTC)

Luacize Template:grc-ipa-rows[edit]

@Gilgamesh~enwiktionary, Angr: I can't believe such a template is not luacized. I will luacize it if nobody disagrees. --kc_kennylau (talk) 15:12, 28 January 2016 (UTC)

If you feel like it, even better IMO would be to make a new template that takes actual Greek as input and generates the appropriate pronunciations, rather than requiring the awkward code arguments. Benwing2 (talk) 15:24, 28 January 2016 (UTC)
@Benwing2: That is also in my mind. --kc_kennylau (talk) 16:30, 28 January 2016 (UTC)
The luacized template is {{grc-IPA}}. As far as I know, {{grc-ipa-rows}} is almost deprecated (though there seem to be some cases where {{grc-IPA}} doesn't work, so {{grc-ipa-rows}} is required instead). I'd rather have a bot go through and replace all instances of {{grc-ipa-rows}} with {{grc-IPA}} (changing the parameters as required) instead of having two competing luacized templates. —Aɴɢʀ (talk) 16:35, 28 January 2016 (UTC)
Angr is quite correct that we need to convert over to {{grc-IPA}}. We're still trying to fix a PHP error (according to Newt) that occurs for words like ᾍδης ‎(Hāídēs). Otherwise, to my knowledge, we are effectively done. —JohnC5 17:05, 28 January 2016 (UTC)
 
It should have been marked as deprecated. In fact, it was, but User:LlywelynII removed the tag, because there had been no formal discussion and he didn't like the 'terseness' of the template. Which is fair to some degree, but it really would have been better to at least make a note on the talk page instead of just silently removing the tag. Which is not to say that he's not right, though: upon looking at the template, I don't like the way it's presented; just having /x/ > /y/ > /z/ is ambiguous because it doesn't actually specify how long of a period this change took place over, or which one is even Correct, and, at that point, why bother collapsing it?
In my opinion, we should remove the "Constantinopolitan" row (which, at 1500, is Modern Greek anyway), and then consider cutting another row or two. The Latin template has two rows, "Classical" and "Ecclesiastical", which is really all anyone should need. On this model, I'd cut us down to "Classical" and "Byzantine", and maybe add "Koinê" (and of course eliminate the show/hide function.) Any more than that just seems unnecessary. —ObsequiousNewt (εἴρηκα|πεποίηκα) 19:01, 28 January 2016 (UTC)
Didn't realize this work had already been done. John, what exactly is the supposed PHP error? Something where regexes don't behave the way they should? So far I haven't seen any such errors with Russian or Arabic. Benwing2 (talk) 18:18, 28 January 2016 (UTC)
I had pinged Newt because I am unfamiliar with the actual error. It is mentioned here, and Newt made a bug report here. —JohnC5 18:47, 28 January 2016 (UTC)
Thanks! @ObsequiousNewt Can you work around the problem? I had to work around an issue with the ordering of the shadda diacritic in Arabic with respect to other diacritics; this is basically a bug in Unicode itself, and causes problems because MediaWiki normalizes the ordering of diacritics according to Unicode. In this case, I had to manually reorder the diacritics in various places. If the problem is with capitalization/lowercasing, can you write a function that wraps the relevant MediaWiki calls and manually converts the chars in question to lowercase? Benwing2 (talk) 19:46, 28 January 2016 (UTC)
The Mediawiki lc: magic word unfortunately doesn't recognize capital letters with iota subscript as having lower-case equivalents, so it doesn't do anything to them. So any workaround will have to employ some other method. —Aɴɢʀ (talk) 20:27, 28 January 2016 (UTC)
Where is lc: used exactly? If it's in template code, presumably it can be replaced with a #invoke to a function that wraps lang:lc() or mw.ustring.lower(). Benwing2 (talk) 22:02, 28 January 2016 (UTC)
I don't know if it is used in this context at all yet; I'm just saying if someone were to try using it to force the template to reinterpret capital letters as lowercase ones, it wouldn't work on iota-subscripted capital letters. Of course, there probably aren't very many words beginning with an iota-subscripted capital letter; we can always write {{grc-IPA|w=ᾅδης}} manually. —Aɴɢʀ (talk) 16:49, 29 January 2016 (UTC)
Yeah, I can try and work something up. Unfortunately, nobody seems to have looked at/confirmed the bug on PHP yet (perhaps I should look into fixing it myself...) —ObsequiousNewt (εἴρηκα|πεποίηκα) 14:18, 29 January 2016 (UTC)
(straying off the original topic) Re "I don't like the way it's presented; just having /x/ > /y/ > /z/ is ambiguous because it doesn't actually specify how long of a period this change took place over, or which one is even Correct, and, at that point, why bother collapsing it?": I agree. - -sche (discuss) 04:05, 29 January 2016 (UTC)
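
The workaround discussed above (wrapping the case conversion and manually converting the iota-subscripted capitals) might be sketched roughly as follows. This is only an illustration in Python, not the actual Lua module: the function name `greek_lower` and the lookup table are hypothetical, and the real fix would presumably wrap `mw.ustring.lower()` in a similar way. The table relies on the Unicode Greek Extended layout, where each capital with prosgegrammeni in U+1F88..U+1FAF sits 8 code points after its lowercase counterpart.

```python
import unicodedata


def _build_iota_subscript_table():
    """Map capitals with iota subscript to their lowercase equivalents."""
    table = {}
    # U+1F88-1F8F (alpha), U+1F98-1F9F (eta), U+1FA8-1FAF (omega):
    # the lowercase form is exactly 8 code points earlier.
    for start in (0x1F88, 0x1F98, 0x1FA8):
        for code_point in range(start, start + 8):
            table[code_point] = code_point - 8
    # The three capitals carrying only the iota subscript.
    table[0x1FBC] = 0x1FB3  # alpha
    table[0x1FCC] = 0x1FC3  # eta
    table[0x1FFC] = 0x1FF3  # omega
    return table


IOTA_SUBSCRIPT_LOWER = _build_iota_subscript_table()


def greek_lower(text):
    """Lowercase text, converting iota-subscripted capitals by hand first."""
    # Normalize to NFC so the precomposed characters are what we look up.
    composed = unicodedata.normalize("NFC", text)
    # Apply the manual table, then let the standard conversion do the rest.
    return composed.translate(IOTA_SUBSCRIPT_LOWER).lower()
```

With a wrapper of this kind, an entry title beginning with an iota-subscripted capital would not need a manual `w=` parameter, assuming the rest of the template already works on lowercase input.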

EL: Language section (revision 2)[edit]

Wiktionary:Votes/pl-2015-12/Language passed 10 days ago. Some people complained that the text needs to be written more clearly.

I created Help:Language sections with a longer, more detailed explanation directed at newbies.

I have a proposal:

  • Removing all the how-to parts and explanations from WT:EL#Language and leaving only the rules plus a link to the help page.

Current text:

Language

Each entry has one or more L2 (level-two) language sections. For example, the entry sea has different meanings in English and Spanish, both on the same page. Priority is given to Translingual: this heading includes terms that remain the same in all languages. This includes taxonomic names, symbols for the chemical elements, and abbreviations for international units of measurement; for example Homo sapiens, He ‎(helium), and km ‎(kilometre). English comes next, because this is the English Wiktionary. After that come other languages in alphabetical order. Language sections should be separated from each other by a horizontal line, generated with four dashes (----).[1]

For languages that have multiple names, a single name is chosen that should be used throughout Wiktionary. Typically, this is an English name for the language. See Wiktionary:Languages for more information.

References
  1. ^ Wiktionary:Votes/pl-2015-12/Language

Proposed text:

Language
  • Every entry should have one or more language sections.
  • All language sections should be level-two.
  • The order of language sections is: Translingual, English, then other languages in alphabetical order
  • Language sections should be separated from each other by a horizontal line.
  • For languages that have multiple names, a single name is chosen that should be used throughout Wiktionary. See Wiktionary:Languages for more information.

See Help:Language sections for more information.

Looks good? --Daniel Carrero (talk) 02:22, 30 January 2016 (UTC)

that looks really nice and clear, i wish more guidelines were written in this way when i came here at first profesjonalizmreply 06:38, 30 January 2016 (UTC)
 :) --Daniel Carrero (talk) 23:31, 30 January 2016 (UTC)

I created Wiktionary:Votes/pl-2016-02/Language 2. --Daniel Carrero (talk) 05:57, 4 February 2016 (UTC)
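
For anyone checking entries by script, the ordering and separator rules in the proposed text could be sketched like this. It is a minimal Python illustration, not an existing tool: the function names are hypothetical, and "alphabetical order" is simplified here to a case-insensitive comparison of the language names, which may not match how names with diacritics are actually sorted.

```python
def language_sort_key(name):
    """Translingual first, English second, everything else alphabetically."""
    priority = {"Translingual": 0, "English": 1}
    # Languages without a fixed priority share rank 2 and sort by name.
    return (priority.get(name, 2), name.casefold())


def order_language_sections(names):
    """Return the language names in the order the proposed rule describes."""
    return sorted(names, key=language_sort_key)


def join_language_sections(sections):
    """Join section texts with the four-dash horizontal line between them."""
    return "\n\n----\n\n".join(sections)
```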

Proposal: Transclude only 2 months in WT:BP and other discussion rooms[edit]

In Wiktionary talk:Beer parlour#Transclude last three months makes the page very slow to load, @Automatik complained that it is extremely slow to load WT:BP on their computer and suggested transcluding only the last 2 months in WT:BP instead of the current 3 months.

I support that. As I pointed out in that discussion, a BP discussion older than 2 months is most often an inactive discussion. This change would require editing {{discussion recent months}} and I think it would make sense changing the behavior of all pages that use that template, not just WT:BP but WT:GP, WT:ID, WT:TR and WT:ES as well.

Can I make these pages transclude only the last 2 months? --Daniel Carrero (talk) 00:13, 31 January 2016 (UTC)

I don't have a problem with it, although I don't feel strongly; I usually access the individual month-specific pages through the watchlist and haven't noticed the slowdown that much (although it did take maybe 7-8 secs to load the front page when I just checked). Benwing2 (talk) 03:51, 31 January 2016 (UTC)
Done --Daniel Carrero (talk) 16:59, 2 February 2016 (UTC)
Does this mean that we sometimes only have one month and one day of BP discussion loading? Or is it as little as two months and one day? DCDuring TALK 20:43, 2 February 2016 (UTC)
Now, sometimes we only have one month and one day of BP discussion loading. I'm happy with the current state, but that could probably be changed with more complex rules if people want, like "keep showing 3 months for the first two weeks of the month, then change to showing two months".
--Daniel Carrero (talk) 21:00, 2 February 2016 (UTC) (edited: --Daniel Carrero (talk) 13:33, 3 February 2016 (UTC))
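
The two behaviours described above (always two months, or three months during the first two weeks and two afterwards) can be compared with a small sketch. This is Python for illustration only, not the wikitext of {{discussion recent months}}; the function names are hypothetical, and the subpage title pattern "room/year/month name" is an assumption about how the monthly pages are named.

```python
from datetime import date

MONTH_NAMES = [
    "January", "February", "March", "April", "May", "June",
    "July", "August", "September", "October", "November", "December",
]


def months_to_transclude(today, count=2):
    """Return the last `count` (year, month) pairs, oldest first."""
    months = []
    year, month = today.year, today.month
    for _ in range(count):
        months.append((year, month))
        month -= 1
        if month == 0:  # step back across the year boundary
            year, month = year - 1, 12
    return list(reversed(months))


def months_with_grace_period(today, grace_days=14):
    """Show three months early in the month, two months afterwards."""
    count = 3 if today.day <= grace_days else 2
    return months_to_transclude(today, count)


def subpage_titles(room, months):
    """Build the assumed monthly subpage titles for a discussion room."""
    return [f"{room}/{year}/{MONTH_NAMES[month - 1]}" for year, month in months]
```

Under the plain two-month rule, the page on the first day of a month shows one full month plus one day; the grace-period variant avoids that at the cost of a longer page for part of each month.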

CAT:, T:, MOD:, AP:, RC: are working[edit]

Testing all shortcuts, they are all working:

--Daniel Carrero (talk) 02:02, 31 January 2016 (UTC)

When did we have a new namespace that I am not aware of? --kc_kennylau (talk) 02:07, 31 January 2016 (UTC)
@Kc kennylau: These are all preëxisting namespaces, which we now have shortcuts for as a result of a vote. —Μετάknowledgediscuss/deeds 03:43, 31 January 2016 (UTC)
@Metaknowledge: I am talking about the RC namespace. --kc_kennylau (talk) 03:44, 31 January 2016 (UTC)
@Kc kennylau: Also the result of a vote. —Μετάknowledgediscuss/deeds 03:48, 31 January 2016 (UTC)
What about U: for User:? --kc_kennylau (talk) 05:31, 31 January 2016 (UTC)
It was not part of the vote. It could be voted on in the future, but personally I don't care much for U: for User: because User: is short enough. H: for Help: (equally short) was voted on but failed. --Daniel Carrero (talk) 15:02, 31 January 2016 (UTC)

Namespace "Reconstruction"[edit]

Is Reconstruction: better than Reconstructed:? --kc_kennylau (talk) 04:12, 31 January 2016 (UTC)

The discussion which resulted in this name being chosen is Wiktionary_talk:Votes/2015-09/Creating_a_namespace_for_reconstructed_terms#What_should_we_name_the_namespace.3F. - -sche (discuss) 04:41, 31 January 2016 (UTC)
@-sche: Alright. Then I can start bot-moving everything. --kc_kennylau (talk) 05:18, 31 January 2016 (UTC)
Oops. Looks like it is not possible. --kc_kennylau (talk) 05:28, 31 January 2016 (UTC)
Should we continue to create new entries under Appendix: until currently existing pages are mostly moved and templates like {{inh}} or {{m}} point to the correct namespace? --Tropylium (talk) 21:08, 31 January 2016 (UTC)

About: Pronunciation 1, Pronunciation 2, Pronunciation 3[edit]

Some stats:

  • There are 31,175 entries with "Etymology 1" written somewhere. (search link: here)
  • There are 5,985 entries with "Pronunciation 1" written somewhere. (search link: here)

WT:EL does not currently allow entries with numbered pronunciation sections such as "Pronunciation 1", "Pronunciation 2" and "Pronunciation 3". Are these sections something we want? If the answer is yes, I have a proposal:

  • Officially allowing numbered pronunciation sections in entries by editing WT:EL. (most likely WT:EL#Pronunciation, and also somewhere in WT:EL explaining what are the allowed sections and section order)

See also: Category:Entries with Pronunciation n headers for a bot-populated category with the entries affected by this proposal.

Older discussions:

--Daniel Carrero (talk) 15:14, 31 January 2016 (UTC)

Whenever I find these (in Russian or Arabic), I rewrite them to have "Etymology N" in them. Benwing2 (talk) 01:30, 1 February 2016 (UTC)
Other such sections that I've seen are "Noun 1", "Noun 2", etc. These I also rewrite, either to "Etymology 1", "Etymology 2" or just "Noun", "Noun". Benwing2 (talk) 01:32, 1 February 2016 (UTC)
My objective is trying again to make an official list of allowed headings as comprehensive as possible after Wiktionary:Votes/pl-2015-12/Headings failed.
"Pronunciation 1", "Pronunciation 2", etc. are part of that project, because I'd like to say either that they're allowed, or disallowed, or that there's no consensus for them yet. (whatever may the case be)
If people don't want to use "Pronunciation 1", I'd be glad to mention that fact in a new proposed Headings list.
Just to be sure, I would probably create a separate vote first, specifically about numbered pronunciation sections, with both options: 1) Allow numbered pronunciation sections, 2) Disallow numbered pronunciation sections.
(Noun 1, Noun 2 are already officially disallowed per WT:EL#Part of speech after Wiktionary:Votes/pl-2015-12/Part of speech passed recently.) --Daniel Carrero (talk) 02:10, 1 February 2016 (UTC)
One of the things I don't like about "Pronunciation 1", "Pronunciation 2", etc. is that there's no clear way they interact with "Etymology 1", "Etymology 2", etc. If we nest Pronunciation under Etymology, we get level-6 inflection tables and such, which seems awkward. For Russian, we have a single Pronunciation subsection per etymology section, and if there are multiple pronunciations in that section, they're all listed under the Pronunciation section with an annotation indicating which headword they go with. See сопли for an example. (They are grouped in the same etymology section because сопли as a lemma is a plurale tantum formed etymologically as the plural of сопля, and the other entries are for non-lemma forms of the same сопля. Forms for different lemmas should generally go in different etymology sections.) Benwing2 (talk) 04:29, 1 February 2016 (UTC)
Most of the entries with Pronunciation n headers are Latin inflected forms with pronunciations that cut across PoS headers and Etymologies. Anyone recommending elimination of Pronunciation n headers should have a proposed format for, say, auraria that at least the regular workers on Latin entries find acceptable. A userpage or sandbox mockup would be nice. I have worked to eliminate such headers in English entries, but could not see how to do it for Latin entries without making the entries almost useless. Maybe fresh eyes can do better. DCDuring TALK 06:03, 1 February 2016 (UTC)
See переда. This is how we tackle this issue in Russian. Benwing2 (talk) 06:26, 1 February 2016 (UTC)
Note: The use of annotations like in переда is necessary in other situations, too, that's why we adapted it for this situation. See Немуро for such an example. (BTW when I say "annotations" I mean the boldfaced words appearing to the left of the IPA, rather than the phonetic respelling appearing to the right in Немуро, which is used only because the pronunciation is irregular.) Benwing2 (talk) 06:29, 1 February 2016 (UTC)
@DCDuring, Benwing2: I don't speak Latin or Russian, but I created User:Daniel Carrero/auraria which should be auraria using the format of переда (without annotations like the ones seen in Немуро). See if I made any mistakes, feel free to edit that page, add annotations if necessary, change whatever you want. --Daniel Carrero (talk) 06:41, 1 February 2016 (UTC)
I'm not the one to be editing the proposed entry. Perhaps some of the following contributors to WT:ALA would be interested: @Wikitiki89, I'm so meta even this acronym, SebastianHelm, Pengo, Angr, Metaknowledge, CodeCat, Robert.Baruch, Jerome Charles Potts (apologies to any I've missed). DCDuring TALK 10:30, 1 February 2016 (UTC)
@DCDuring: Thanks for the ping. I opine below. — I.S.M.E.T.A. 03:21, 3 February 2016 (UTC)
I like the format of User:Daniel Carrero/auraria, though it should be flexible enough to allow cases where the headword forms are identical as well, e.g. the present tense vs. past tense of read. —Aɴɢʀ (talk) 11:19, 1 February 2016 (UTC)
Agreed. I imagine that could be done with present and past annotations or something of that sort. Benwing2 (talk) 12:00, 1 February 2016 (UTC)
I do not see any problem with having Pronunciation 1 and 2 parallelly to Etymology 1 and 2. That's how I handled it for wa and having looked at the alternative offers, I still prefer it over summarising multiple semantically relevant pronunciations under a single header. Korn [kʰʊ̃ːæ̯̃n] (talk) 12:41, 1 February 2016 (UTC)
I'd be OK with having it either way, but I prefer the format used for User:Daniel Carrero/auraria. Andrew Sheedy (talk) 00:08, 2 February 2016 (UTC)
I am opposed both to blank etymology sections and to multiple instances of the same POS header occurring in the same nest (which occurs, for example, when a level-3 Noun header is immediately followed by another level-3 Noun header). Accordingly, I am particularly opposed to the presentations in переда ‎(pereda) and User:Daniel Carrero/auraria. (As an aside, I think it's bizarre that both those entries have two pronunciation sections with identical contents — why not simply move the pronunciation information to the top of the entry so it applies to both etymology sections, and save the space and redundancy?!) Because I oppose multiple instances of the same POS header occurring in the same nest, I consider numbered pronunciation sections to be indispensable for Latin (chiefly for the ablative singular feminine form vs. the other graphically isomorphic forms of first–second-declension adjectives, but also for some other cases). However, even in the case of variously pronounced homonyms, I sometimes see value in using numbered pronunciation sections rather than numbered etymology sections. Take, for example, the Latin entry I recently created for Dion. Now, I could have written it like this, but what would've been the point? Adding etymology headers like that adds no useful information and is just a waste of space; numbered pronunciation sections are better for that entry.
I don't really understand this drive to get rid of numbered pronunciation sections. They're useful and intuitive and nothing is gained (as far as I can see) by banning them. Let's allow them, so I can stop adding {{rfc-pron-n|Pronunciation 1|lang=la}} to Latin entries I create with numbered pronunciation sections once and for all. — I.S.M.E.T.A. 03:21, 3 February 2016 (UTC)
@I'm so meta even this acronym You're right, normally in Russian when two etymology sections share the same pronunciation we put it once at the top; I messed up переда in this respect. I still find it awkward, though, to nest pronunciation sections under etymology sections. Benwing2 (talk) 04:14, 3 February 2016 (UTC)
@Benwing2: Why? — I.S.M.E.T.A. 22:13, 8 February 2016 (UTC)
Because that's the standard practice. --WikiTiki89 22:19, 8 February 2016 (UTC)
I, on the other hand, am opposed to nesting POS sections under anything at all. It's inconsistent that sometimes they are L3 below an etymology or pronunciation section, sometimes L4 nested within either of these sections. The term should be primary, and any information that is associated with the term should be nested under it. —CodeCat 20:34, 3 February 2016 (UTC)
I would even go as far as to say that it makes no sense that the part of speech of a word is part of the hierarchy of our entry structure. The part of speech logically should be attached to individual definitions, or at least small groups of related definitions. But there's something about maintaining the status quo in order to focus our efforts on the quality of entries rather than on reformatting the whole dictionary. --WikiTiki89 20:45, 3 February 2016 (UTC)
The status quo already attaches the part of speech to individual definitions or small groups of related definitions. The POS header is what does that. —CodeCat 20:50, 3 February 2016 (UTC)
No, the POS header attaches definitions to a POS. That's different. Also, our groups of related definitions are often not as small as what I meant in my previous post. --WikiTiki89 20:56, 3 February 2016 (UTC)
  • Some data points from Japanese.
Japanese lemmata here at EN WIKT are typically the kanji spellings. For these entries, like , pronunciation is subordinate to etymology.
We also have Japanese entries in kana, generally hiragana. These spellings are phonetic, except that kana spellings do not denote pitch accent (vaguely similar to stress in other languages). Sometimes a single kana spelling may have multiple possible pitch accents, and the etymology (or more often, which lemma) depends on the pitch accent. For these entries, like にじ ‎(niji) or まく ‎(maku) or しゃべる ‎(shaberu), etymology is subordinate to pronunciation.
Allowing Pronunciation N headers is really the only way of cleanly organizing Japanese kana entries where there are multiple pitch accents. Japanese kana entries would be negatively affected if the Wiktionary community were to ban these headers. ‑‑ Eiríkr Útlendi │Tala við mig 01:49, 2 February 2016 (UTC)
These are good points. I created User:Daniel Carrero/にじ to try and compare with the entry にじ you linked, using the same format as User:Daniel Carrero/auraria.
What I like about the page I created is that User:Daniel Carrero/にじ takes a bit less space than にじ.
  • Counting from the "Japanese" L2 header, 二時 has 16 lines and User:Daniel Carrero/にじ has 12 lines.
  • I removed the repeated "Noun" and the headword line "にじ ‎(romaji niji)".
  • I also removed the repeated IPA transcription ([nid͡ʑi])
  • The shortness is helped by the fact that にじ has a TOC and User:Daniel Carrero/にじ doesn't, because it has only 3 sections.
In the creation of User:Daniel Carrero/にじ, I used {{head}} because {{ja-noun}} was giving unexpected results in a user page, which I couldn't fix. Also, I edited the pronunciation markup manually to use a different style than {{ja-pron}} uses, which I can put in a different template for use in other entries if people want.
Overall, I find it a little weird that, when the lemma of a Japanese word is actually the entry with kanji, kana entries have pronunciation sections and romaji entries do not (romanization entries are just the modicum of information to find the right entries). But I won't propose the pronunciations to be removed from the kana entries; despite being somewhat odd IMHO, they are certainly helpful in comparing quickly where to locate the accents in different words spelled with the same kana. --Daniel Carrero (talk) 03:31, 2 February 2016 (UTC)
@Eirikr IMO, all three of にじ ‎(niji), まく ‎(maku), しゃべる ‎(shaberu) should be using "Etymology N" headers, not "Pronunciation N" headers. The different pronunciations are clearly etymologically unrelated; that's exactly what separate etymology sections are for. E.g. しゃべる ‎(shaberu) meaning "shovel" looks like it's borrowed from English, whereas the meaning "to chat" isn't. Benwing2 (talk) 04:27, 2 February 2016 (UTC)
As a user, I find the current Japanese sorting system easier to grasp, i.e. to have a better overview, than the alternative where a single-line list of words is put before the pronunciation. I also find it considerably more sightly than having a block of text in front of the pronunciation. I'd like to put out the idea of making Pronunciation the default (but not mandatory) topmost header, since it is basically the spoken æquivalent of spelling, which is the first thing we sort entries by now. For entries with one etymology but multiple semantically distinguishing pronunciations this would create different entries according to meaning and differentiability in spoken language and for entries with one pronunciation but multiple etymologies, this would cut a bit of superfluous text compared to now. Korn [kʰʊ̃ːæ̯̃n] (talk) 09:36, 2 February 2016 (UTC)
  • (after edit conflict) ... @Benwing2, one key factor with the kana entries is that they are phonetically oriented: if one were to reorganize by etymology instead, まく ‎(maku) would need possibly ten different ===Etymology N=== headers, certainly no fewer than eight. Given that etymological information belongs on the lemma entries anyway, such a reorganization doesn't seem optimal. Some of these phonetic kana entries map more easily between pitch accents and etymologies, like しゃべる ‎(shaberu), but again, these kana entries are phonetically oriented, and are intended as soft redirects to the kanji-spelled lemmata, with etymologies provided as part of the lemma entries. ‑‑ Eiríkr Útlendi │Tala við mig 09:44, 2 February 2016 (UTC)
  • Daniel, the kana entries are organized by the reading, rather than the spelling (i.e. kanji). Any user who understands at least romaji and kana can look up something they've heard and find the kana entry. For a given reading, however, there are often different pitch accents, with some words always pronounced with one pitch-accent pattern and never another. If we don't provide any pronunciation information on kana entries, a user who has heard [máꜜkù] cannot identify which set of lemma entries might be relevant, unless they click through to each of the linked lemma entries to check the pronunciation there. This is an onerous burden and is very poor usability. Including pronunciation information on the kana entry page enhances the page's intended function as a kind of disambiguation page, leading the user to the entry they are looking for.
I appreciate you taking the time to create your mock-up. However, I confess I find your proposed entry structure hard to read. It may be more compact, but compactness is not necessarily a virtue: your structure makes it less clear to me how the pitch accents and entries correlate. I also suspect it doesn't scale adequately: applying a similar redesign to more complicated pages like まく ‎(maku) would produce a result that would be even harder to visually parse. ‑‑ Eiríkr Útlendi │Tala við mig 09:44, 2 February 2016 (UTC)
  • @Korn, I'm certainly open to that approach. I've struggled with the distance between the start of a Japanese entry section and the pronunciation / reading information. This is perhaps more awkward for Japanese, as kanji spellings often have only a tenuous connection to the reading. Organizing Japanese entries by pronunciation first is an attractive idea. ‑‑ Eiríkr Útlendi │Tala við mig 09:44, 2 February 2016 (UTC)
Should we disallow "Pronunciation 1" in most languages but keep them for Japanese as an exception because of the pitch accent in the kana entries?
I wanted to create a layout without "Pronunciation N" that works but it's okay if I was unable to. In any event, I changed a little the design of User:Daniel Carrero/にじ by getting rid of the table style and adding a couple of newlines in the pronunciation section.
I added the etymology of しゃべる and used Etymology N headers rather than Pronunciation N headers. --Daniel Carrero (talk) 17:12, 2 February 2016 (UTC)
I'm with Daniel here that it might make sense to allow "Pronunciation N" for Japanese kana entries and maybe other exceptions to be determined, but not generally. Benwing2 (talk) 01:06, 3 February 2016 (UTC)
Have we actually yet heard a single reason for not having numbered pronunciations? I'm strongly opposed to disallowing them, for if we admit that they're preferable for one language, we already know that they might be in other languages. And we should leave it to the users to decide when and where this need arises, without bureaucracy. Korn [kʰʊ̃ːæ̯̃n] (talk) 19:26, 3 February 2016 (UTC)

Guys, my original proposal in the first message was "Officially allowing numbered pronunciation sections". If people want them, fine. If people don't want them, fine too. I just want to try and make sure what is the consensus to update WT:EL accurately.

Should I create a vote with the proposal "Officially allowing numbered pronunciation sections"?

I don't want to create a vote with the opposite proposal "Officially disallowing numbered pronunciation sections" because @Eirikr made a good point that they are useful in Japanese. That's why I tried the "compromise" of disallowing numbered pronunciations sections everywhere but in Japanese. Re "for if we admit that they're preferable for one language, we already know that they might be in other languages.": Japanese seems to be a very special case, because of the kana stuff. (many entries with the same kana readings but different pitch accents)

Correct me if I'm wrong, but I believe "Pronunciation N" sections are unwanted for English. A search for "pronunciation 1" latin (link) returns 1,107 results, so the number of Latin entries with multiple pronunciations seems to be at most 1,107, which seems like a low number to me. I don't suppose it would be completely unreasonable to remove "Pronunciation N" from all Latin entries, would it? --Daniel Carrero (talk) 04:43, 4 February 2016 (UTC)

Since nobody seems to have brought an argument against multiple pronunciation sections, I would propose you start a vote on just generally allowing or disallowing them. If that fails, a new and more restricted proposal can still be generated from the discussion of the first one. Korn [kʰʊ̃ːæ̯̃n] (talk) 13:43, 4 February 2016 (UTC)
re: ""Pronunciation N" sections are unwanted for English."
Let's not get ahead of ourselves. The facts are that I didn't like them, didn't think they were necessary, and worked to eliminate them, there not being very many. AFAICR no one objected. I don't know whether any have been added since my efforts to eliminate them, nor whether there would be objections to having them or not having them. I don't even know that I would agree now with the way I eliminated them then. DCDuring TALK 13:57, 4 February 2016 (UTC)

@Korn, Daniel Carrero: Let me break down this issue so that we can create a better vote and also to explain some of the arguments:

A. Should we have multiple pronunciation sections or should we put multiple pronunciations in one pronunciation section?
B. If multiple pronunciation sections are used:
1. Should they be numbered or simply be repeated unnumbered?
Note: We used to have numbered POS sections, until we decided that we shouldn't have to keep the numbers updated and stopped numbering them. Now we simply have two ===Noun=== sections or two ===Verb=== sections following each other with no problem. It was agreed that only Etymology sections should remain numbered, although the reason they need to remain numbered is not clear to me. Perhaps it is because we used to link to them as [[foo#Etymology 2]], but we have stopped doing that in favor of sense-ids.
2. Should the content be nested under them, or simply come after them?
Note: I guess the argument here is that it does not make sense to nest things under the pronunciation header. After all, part of speech headers do not logically belong "under" a pronunciation. Most of our nesting is pretty straightforward, with only etymology sections allowing optional nesting based on whether there is one of them or more than one. To me that only makes a little bit more sense than nesting under pronunciations.
3. What happens if there are multiple pronunciation sections and multiple etymology sections?

--WikiTiki89 14:42, 4 February 2016 (UTC)

Do Header-levels make a technical difference for anything or are they merely a visual sorting tool? Korn [kʰʊ̃ːæ̯̃n] (talk) 10:25, 5 February 2016 (UTC)
They would make it harder for an amateur like me to process the dump definitively, if I were ever working with a group of languages that used this header. It requires a bit more care in constructing regexes. DCDuring TALK 13:06, 5 February 2016 (UTC)
Header-levels make it clear which POS sections the Pronunciation section applies to. If it is simply at the top, it is pretty clear that it applies to all of the POS sections, but if it occurs in the middle, it's not clear whether it should apply to just the next POS section or all the rest of the POS sections. Nesting solves this problem by putting all POS sections that the Pronunciation section applies to "under" the Pronunciation section. That way the Pronunciation section clearly does not apply to anything that follows but is not "under" it. The structure is often clearer in the TOC than in the actual entry. But still, all the things I just mentioned are only visual and make no technical difference, and same with all of our entry layout decisions really. --WikiTiki89 14:37, 5 February 2016 (UTC)
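
To make the two points above concrete (numbered headers need a little extra care in regexes, and nesting decides which POS sections a Pronunciation section applies to), here is a minimal Python sketch. It is not an existing dump-processing tool: the function names are hypothetical, and it assumes headers appear on their own line in the usual `===Title===` wikitext form.

```python
import re

# A header line: 2 to 6 equals signs, a title, then the same run of equals signs.
HEADER_RE = re.compile(r"^(={2,6})[ \t]*(.+?)[ \t]*\1[ \t]*$", re.MULTILINE)

# Numbered headers such as "Pronunciation 2" or "Etymology 1".
NUMBERED_RE = re.compile(r"^(Pronunciation|Etymology) (\d+)$")


def parse_headers(wikitext):
    """Return (level, title) pairs for every header in the text, in order."""
    return [(len(m.group(1)), m.group(2)) for m in HEADER_RE.finditer(wikitext)]


def numbered_headers(headers):
    """Pick out the titles that are numbered Pronunciation/Etymology headers."""
    return [title for _, title in headers if NUMBERED_RE.match(title)]


def sections_under(headers, index):
    """Titles nested under headers[index]: everything deeper, up to the
    next header at the same or a shallower level."""
    level = headers[index][0]
    nested = []
    for next_level, title in headers[index + 1:]:
        if next_level <= level:
            break
        nested.append(title)
    return nested
```

For an entry with two level-3 "Pronunciation N" headers, `sections_under` returns only the level-4 POS headers that follow each one, which is the scoping that nesting is meant to express; with unnested headers the function returns nothing, mirroring the ambiguity described above.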
As I said before, my objective is to make a list of allowed headings, and "Pronunciation N" is in the way, so I'd like to know if it's allowed or disallowed (or no consensus). I'm thinking of creating a vote based on @Wikitiki89's proposal above, but I'm not sure what the options would be for part 3, "What happens if there are multiple pronunciation sections and multiple etymology sections?" Probably the options would be:
  • Allow/disallow nested numbered pronunciation with numbered etymology sections.
--Daniel Carrero (talk) 03:42, 6 February 2016 (UTC)
Should these questions be asked as a vote or as a poll? I'm thinking a poll is better because this issue has not been discussed much yet; we have yet to show a clear consensus about it.
The poll could also ask: Allow/disallow for Latin, Allow/disallow for Japanese, Allow/disallow for English, etc. (just a few select languages that have been brought up for discussion here) --Daniel Carrero (talk) 03:53, 6 February 2016 (UTC)
I created Wiktionary:Votes/2016-02/Multiple pronunciation sections, mostly based on @Wikitiki89's idea above, but I had to fill in the blanks in the question about "multiple pronunciation sections and multiple etymology sections". Feel free to edit that vote. --Daniel Carrero (talk) 02:29, 7 February 2016 (UTC)

I'll start this vote: "Entry name: sign languages"

FYI: I created Wiktionary:Votes/pl-2015-12/Entry name: sign languages 1½ months ago (on December 16). I was delaying the start, but I think it is good to go now. (Basically, I was waiting for Wiktionary:Votes/pl-2015-12/Entry name section 2 to end first.)

I edited the vote a bit more and added diffs. Feel free to change it further if you want; I'll update the diffs before starting the vote if anyone changes it. Related discussions: Wiktionary talk:Votes/pl-2015-12/Entry name: sign languages.

--Daniel Carrero (talk) 16:30, 31 January 2016 (UTC)

Wiktionary:Votes/pl-2016-01/Pronunciation

I created Wiktionary:Votes/pl-2016-01/Pronunciation.

Rationale and changes:

  • More compact version of the same policy. No rules were intended to be changed, they are just described in a way that takes up less space.
  • Using a bulleted list to organize the ideas. The order of ideas changes in a few places.
  • The subsections (Homophones and Rhymes) were removed. The same information was kept, albeit in the bulleted list.
  • The current text uses 4 entries as examples of multiple types of pronunciation information: portmanteau, beta, right and hat. The proposed text uses only 1 entry as an example of all the types of pronunciation information previously mentioned: right. Bonus: The example shows the transcription, the audio, the homophones and rhymes in order in the right example.
  • Another step in the direction of having WT:EL completely voted.

--Daniel Carrero (talk) 21:21, 31 January 2016 (UTC)

February 2016

About entry names (removing from CFI, adding to EL)

I believe the whole section WT:CFI#Idiomatic phrases is the province of WT:EL#Entry name, not of WT:CFI. I propose removing the whole section from CFI.

Many phrases take several forms. It is not necessary to include every conceivable variant. When present, minor variants should simply redirect to the main entry. For the main entry, prefer the most generic form, based on the following principles:

Pronouns

Prefer the generic personal pronoun, one or one’s. Thus, feel one’s oats is preferable to feel his oats. Use of other personal pronouns, especially in the singular, should be avoided except where they are essential to the meaning.

Articles

Omit an initial article unless it makes a difference in the meaning. E.g., cat’s pajamas instead of the cat’s pajamas.

Verbs

Use the infinitive form of the verb (but without “to”) for the principal verb of a verbal phrase. Thus for the saying It’s raining cats and dogs, or It was raining cats and dogs, or I think it’s going to rain cats and dogs any minute now, or It’s rained cats and dogs for the last week solid the entry should be (and is) under rain cats and dogs. The other variants are derived by the usual rules of grammar (including the use of it with weather terms and other impersonal verbs).

Proverbs

A proverb entry's title begins with a lowercase letter, whether it is a full sentence or not. The first word may still be capitalized on its own:

I propose using this (updated) text for WT:EL#Entry name, which is the same thing from CFI but without some repetitions and explanations (updates are underlined):

The name of the entry is the term, phrase, symbol, morpheme or other lexical unit being defined.[1]

For languages with two cases of script, the entry name usually begins with a lowercase letter.[2] For example, use work for the English noun and verb, not "Work". Words and phrases which begin with a capital letter in running text are exceptions. Typical examples include proper nouns (Paris, Neptune), German nouns (Brot, Straße), and many abbreviations (PC, DIY). Compare: you can't judge a book by its cover and Rome wasn't built in a day. If someone tries accessing the entry with incorrect capitalization, the software will try to redirect to the correct page automatically.

Omit an initial article unless it makes a difference in the meaning. E.g., cat’s pajamas instead of the cat’s pajamas. For multi-word entries, prefer the generic personal pronoun for the main entry. (one, oneself, someone, one’s, etc.) Common forms with other pronouns are usually redirected to the main form. For example, fall over oneself would be the main entry and these are some entries that would redirect to it: fall over himself, fell over himself, falling over himself, fall over herself, fall over themselves, etc. Use the infinitive form of the verb (but without to) for the principal verb of a verbal phrase, for example: rain cats and dogs. In English, sometimes the lemma should include a specific personal pronoun. For example, the proverb you can't fight City Hall would be the lemma, not one can't fight City Hall. For prefixes, suffixes and other morphemes in most languages, place the character "-" where it links with other words: pre-, -ation, -a-, etc.

When multiple capitalizations, punctuation, diacritics, ligatures, scripts and combinations with numbers and other symbols exist, such as pan (as in "frying pan"), Pan (the Greek god), pan- (meaning "all-") and パン ‎(pan) (Japanese for "bread"), use the template {{also}} at the top of the page to cross-link between them. When there are too many variations, place them in a separate appendix page, in this case Appendix:Variations of "pan".

Use the "Alternative forms" header for variations of the same word kept in multiple pages, including:

Some page titles can't be created because of restrictions in the software, usually because they contain certain symbols such as # or |, or are too long. The full list of those entries is at Appendix:Unsupported titles. They are named using the descriptive format "Unsupported titles/Number sign", while using JavaScript to show the correct title like a normal entry.

Matched-pairs, such as brackets and quotation marks, can be defined together as single entries. The entries are named with a space between the left and right characters. Examples: ( ), [ ], “ ”, ‘ ’, " ", „ ”, « », ⌊ ⌋, ¡ ! and ¿ ?.[4][5]

References
  1. ^ Wiktionary:Votes/pl-2015-10/Entry name section
  2. ^ Wiktionary:Votes/pl-2005-03/First letter capitalization
  3. ^ Wiktionary:Votes/pl-2015-12/Entry name section 2
  4. ^ Wiktionary:Votes/2015-08/Allowing matched-pair entries
  5. ^ Wiktionary:Votes/2015-10/Matched-pair naming format: left, space, right

--Daniel Carrero (talk) 04:49, 1 February 2016 (UTC)

Edited: --Daniel Carrero (talk) 11:40, 1 February 2016 (UTC)

Hmmm.
  1. This doesn't seem to differentiate between uses of {{also}} (for orthographic differences that map to the same character in the absence of diacriticals and case differences) and the Alternative forms header (for forms of the same Etymology or PoS, including those that otherwise meet criteria for inclusion in top-of-the-page "also").
  2. It neglects to suggest the possibility, let alone desirability, of redirects from common forms of MWEs that use pronouns other than one/one's, someone/someone's, or something/something's to the lemma, which uses those person- and number-free pronouns. Furthermore, in English, sometimes the lemma should include a specific personal pronoun. For example, a folksy proverb like you can't fight city hall would be the lemma, not one can't fight city hall. DCDuring TALK 11:02, 1 February 2016 (UTC)
What about now? I edited the text. The text you saw and replied to is in this revision. --Daniel Carrero (talk) 11:40, 1 February 2016 (UTC)
I created Wiktionary:Votes/pl-2016-02/Entry name 3. --Daniel Carrero (talk) 15:34, 7 February 2016 (UTC)

Proposal: "Category:English trisyllabic words" and similar categories

I created/populated Category:English trisyllabic words; just one category, for the purpose of discussion. If people like it, I'd like to create the full category tree, which would be:

Should there be any hard limit for the number of syllables? And, of course, I'd like to create the FL counterparts:

I created {{hyphenation categorization}} for that purpose, which categorizes entries that use {{hyphenation}}.

I was kinda inspired by ptwikt, which has categories like pt:Categoria:Trissílabo (Inglês) and pt:Categoria:Trissílabo (Português). They are populated by a template that I created years ago. --Daniel Carrero (talk) 07:16, 1 February 2016 (UTC)

I'm not sure I see the need for these categories. Benwing2 (talk) 07:23, 1 February 2016 (UTC)
English is my second language. Hyphenation is one aspect of English I have yet to grasp decently. That category seems helpful to me, as I admit I wouldn't recognize more than half of those words as trisyllables: the fact that "eukaryote" and "acquainted" have 3 syllables seems a little bizarre to me. In Portuguese, we separate syllables basically by counting vowels and vowel combinations. (There are also a few more rules, like the thing with "rr" and "ss", but that's it.) --Daniel Carrero (talk) 07:32, 1 February 2016 (UTC)
Totally useless. Please don't do it. SemperBlotto (talk) 07:38, 1 February 2016 (UTC)
BTW, eukaryote has 4 syllables, something like 'eu-kar-y-ote' in my dialect if you hyphenate by pronunciation; I think Brits would hyphenate 'eu-ka-ry-ote' if following their pronunciation. Evidently Wiktionary's English hyphenation isn't a reliable guide to how many syllables there are in a word. BTW hyphenation in English is a total mess; no one really understands it. It seems to involve a great deal of conventional rather than linguistic rules, and undoubtedly differs from dictionary to dictionary. The traditional purpose of English hyphenation is to indicate where words can be broken in printed text, and it doesn't have much of anything to do with pronunciation. Benwing2 (talk) 07:53, 1 February 2016 (UTC)
In fact there are many 4-syllable words in your list of English trisyllabic words, e.g. just among the a's there are abaxial, achievable, analysis, anthology, antipathy, and authority, plus 5-syllable atrabilious. Benwing2 (talk) 07:57, 1 February 2016 (UTC)
In Portuguese, we have very fixed hyphenation rules that we learn at school as children. (in my experience in Brazil) Like: característica = ca-rac-te-rís-ti-ca. Also, perhaps I should update polissílabo, that word in Portuguese seems to be always about a word with specifically 4 or more syllables, not just "more than one syllable", but it's conceivable that the latter could be citable as a separate sense.
If "hyphenation in English is a total mess" then it makes sense that we should not have the proposed category tree for readers. Still, I fixed "eukaryote" based on what you said, and you pointed out other mistakes in hyphenation. Do you think categories like Category:English trisyllabic words could serve the purpose of helping editors see lists of words that use {{hyphenation}} and check whether it is being used correctly? If so, maybe they could stay, but as hidden categories?
I'm curious if there is any foreign language for which "Category:(language) n-syllabic words" would be a deeply helpful thing for readers. --Daniel Carrero (talk) 08:13, 1 February 2016 (UTC)
The problem is, hyphenation is used for two different purposes: indicating syllables, and indicating where words can be broken with a hyphen in printed text. In English they aren't the same, so e.g. the hyphenation 'atra-bili-ous' may be correct for printing purposes. It's not obvious to me that you should correct this to follow syllable divisions, similarly with 'eu-kary-ote', which is probably correct as-is for printing purposes. Benwing2 (talk) 08:39, 1 February 2016 (UTC)
Ah, never mind then. I reverted my edit on eukaryote. I'll delete {{hyphenation categorization}} and Category:English trisyllabic words later if no one wants it. --Daniel Carrero (talk) 10:20, 1 February 2016 (UTC)
  • Not sure what chronique scandaleuse is doing in this category....--Ce mot-ci (talk) 21:26, 1 February 2016 (UTC)
    • Hyphenation is not a good way to determine the number of syllables. Just today I added the {{hyphenation}} template to two disyllabic words that are hyphenated as if they were monosyllables: abreast and ahead, which are never broken across a line break, because one of the rules of hyphenation in English is to never leave a single letter by itself. Also, dictionaries differ as to where acceptable hyphenation points in English are: for eukaryote, Merriam-Webster's and Oxford agree with us that -kary- should not be broken up even though it's two syllables, but American Heritage does break it up as eu·kar·y·ote. —Aɴɢʀ (talk) 22:07, 1 February 2016 (UTC)
      • As another example: different dictionaries break lever as le-ver (one US dictionary), lev-er (another US dictionary), or lever (a UK dictionary). - -sche (discuss) 05:57, 4 February 2016 (UTC)
I don't think such categories are a good idea, and (as noted above) they certainly can't be populated by the hyphenation template. - -sche (discuss) 23:21, 1 February 2016 (UTC)
Not wishing to dogpile but I also think this is a waste of time and resources even if it were done accurately. Perhaps at some point we can reliably generate such info from the IPA transcriptions or something. By that time it is not clear that we will still be using the existing category system at all. Also, has anyone ever asked for this as a feature? Equinox 23:24, 1 February 2016 (UTC)
Nobody asked for it, I was just copying ptwikt. They also have categories like pt:Categoria:Oxítona (Português), pt:Categoria:Paroxítona (Português) and pt:Categoria:Proparoxítona (Português) for oxytone, paroxytone and proparoxytone words. AFAICT, it may be tied to our culture of learning to count syllables in school, it also affects the placement of accents. But I'm not sad to see the category Category:English trisyllabic words go, I see that using {{hyphenation}} for that purpose was a bad idea. In the case of Portuguese, counting syllables is often as easy as counting letters IMO, (if you ignore some stuff like the pt:Categoria:Proparoxítona aparente (Português) thing that affects words like advérbio) so I wouldn't argue we need a special category for it, maybe only if it served the purpose of checking how many of the entries are correctly tagged with hyphenation. (hyphenation = syllables in Portuguese) --Daniel Carrero (talk) 23:45, 1 February 2016 (UTC)
The only way to do this automatically in English is to use the IPA pronunciation, assuming that it's correct (e.g. eukaryote has a strange second pronunciation listed that would suggest it has 5 syllables). BTW, I saw an error in pt:Categoria:Trissílabo (Inglês) -- eclipse is only two syllables. Benwing2 (talk) 01:18, 2 February 2016 (UTC)
Maybe this can be enabled only for specific languages like Portuguese, with regular hyphenation rules. DTLHS (talk) 01:24, 2 February 2016 (UTC)
I confess I'd like that if other people agree with it. Does Spanish have regular hyphenation rules, too? --Daniel Carrero (talk) 12:33, 2 February 2016 (UTC)
Most languages have regular hyphenation rules, I think. Spanish spelling is regular enough that you can count syllables just based on the spelling, exactly like in Portuguese. Benwing2 (talk) 15:22, 2 February 2016 (UTC)
I can support this if the categories are named in regular English (one-syllable, two-syllable, or perhaps 1-syllable, 2-syllable) rather than the obscure Greek. —CodeCat 12:57, 2 February 2016 (UTC)
Support. I prefer Category:Portuguese 1-syllable words, rather than Category:Portuguese one-syllable words. --Daniel Carrero (talk) 14:33, 2 February 2016 (UTC)
Rationale: categories named with "1" are easier to sort; and also easier to parse visually. --Daniel Carrero (talk) 17:01, 2 February 2016 (UTC)
I can't really see any harm in this. A lot of our categories are in my opinion not very useful for human users (as opposed to finding stuff for bot edits) and this is no worse. Renard Migrant (talk) 14:21, 2 February 2016 (UTC)
Do we have any evidence that any of our categories are ever used by our users? I just assumed they were for our own benefit. SemperBlotto (talk) 15:28, 2 February 2016 (UTC)
FWIW, someone wrote about Wiktionary categories in a book: link. Does that count as evidence that Wiktionary categories helped their research? "Categories such as etymology, POS categorizations, personal names, numbers, and plural nouns are obviously important clues for POS tagging." --Daniel Carrero (talk) 15:40, 2 February 2016 (UTC)
I've looked at stuff like CAT:es:Baseball before. If you're talking about non-editors I guess you'd have to look at WT:FEED. Renard Migrant (talk) 16:16, 2 February 2016 (UTC)
I'm thinking of creating only the Portuguese categories for a start, others can be created later. I'll edit the template so only the allowed languages have syllable categories. Let me know if I should enable the categories for more languages right now. (French, Spanish, etc.?)
...
--Daniel Carrero (talk) 04:27, 3 February 2016 (UTC)
The problem is that in many stress-timed languages, including English, syllabicity is not always clear. Are file, peel, hire mono- or disyllabic? Are filing, peeling, hiring, peddling, genial di- or trisyllabic? Are honorable, fashionable tri- or tetrasyllabic? Etc. Should we just allow most of these to be in both categories? --WikiTiki89 20:21, 3 February 2016 (UTC)
Ok, I got it that English's syllabicity is weird. My first guess is that maybe there could be a point in letting the words you mentioned in both categories. But, anyway, I deleted Category:English trisyllabic words and edited {{hyphenation categorization}} in a lazy non-Lua way to categorize entries in Portuguese as discussed above. Now categories like Category:Portuguese 1-syllable words are being populated. I should be able to Luacize that template later to allow for more languages. (Spanish, Italian, maybe, most/all Romance languages would be okay?) Also, update {{poscatboiler}}.
What should the top category be called? Category:Portuguese words by syllabicity? --Daniel Carrero (talk) 04:55, 6 February 2016 (UTC)
"...by number of syllables" would be plainer English. - -sche (discuss) 07:24, 6 February 2016 (UTC)
Done. I created the categories for Portuguese (1-13 syllables) and Spanish (1-9 syllables). I was too lazy to Luacize {{hyphenation categorization}}, though. But it should work with any language now, provided it's listed in the template. Other languages won't generate categories, to avoid cases like English, where you can't get the syllables from the hyphenation. --Daniel Carrero (talk) 04:22, 7 February 2016 (UTC)
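For illustration only, the behaviour described above (count the hyphenation parts, but categorize only languages on an allow-list) can be sketched as follows. This is not the actual code of {{hyphenation categorization}}; the function name syllable_category, the SYLLABLE_LANGS table and its language codes are assumptions made for the example.

```python
# Hypothetical allow-list: languages where hyphenation points are assumed
# to coincide with syllable boundaries. English is deliberately absent.
SYLLABLE_LANGS = {"pt": "Portuguese", "es": "Spanish"}


def syllable_category(lang_code, hyphenation_parts):
    """Return a category name, or None if the language is not allowed."""
    language = SYLLABLE_LANGS.get(lang_code)
    if language is None or not hyphenation_parts:
        return None
    count = len(hyphenation_parts)
    return f"Category:{language} {count}-syllable words"


# Example from the discussion: característica = ca-rac-te-rís-ti-ca
print(syllable_category("pt", ["ca", "rac", "te", "rís", "ti", "ca"]))
# English input yields no category, since hyphenation is not syllabification there.
print(syllable_category("en", ["eu", "kary", "ote"]))
```

Keeping the allow-list explicit means a new language only gets categories once someone has confirmed that its hyphenation follows syllable boundaries.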

It's weird that Category:Spanish 6-syllable words contains: batará lineado, chova piquigualda, manzana de caramelo. Should we allow entries comprised of multiple words to have a {{hyphenation}}? If the answer is yes, then Category:Portuguese 6-syllable words should be renamed to Category:Portuguese 6-syllable terms. Example: a curiosidade matou o gato is a 12-syllable proverb. But the hyphenation should be found in each component word anyway: a, curiosidade, matou, etc. --Daniel Carrero (talk) 17:18, 7 February 2016 (UTC)

Alternative forms after definitions

Previous discussions:

I would like to see alternative forms down in the entry. But where could we place it exactly?

Here's one proposed order, I'm open to different ideas:

  • L3: POS section
  • L4: (Inflections and/or Usage notes, the order of these two is the subject of a separate discussion)
  • L4: (Quotations -- but see a critique of this header here)
  • L4: Alternative forms
  • L4: Synonyms

--Daniel Carrero (talk) 17:23, 1 February 2016 (UTC)

Rationale:
  • Having fewer things above the definitions is a way to promote the definitions.
  • Alternative forms and Synonyms are similar concepts.
--Daniel Carrero (talk) 17:25, 1 February 2016 (UTC)
I agree, they belong close to synonyms. The way I see it, the headers that are "close" to the term in question come first, and then the relationship to the current term becomes less as you go down. —CodeCat 17:40, 1 February 2016 (UTC)
Also agreed. I recently ran into a situation where I couldn't decide whether to make ноль ‎(nolʹ) and нуль ‎(nulʹ) be alternative forms or synonyms of each other. It's easy to miss alternative forms when they're at the top. (And it's even easier to miss the "see also" notes at the very top.) Benwing2 (talk) 17:53, 1 February 2016 (UTC)
If they are to be moved below the definition, then a position above synonyms does seem like a good position for them. "Quotations" is unnecessary in most cases — when I see it, I can almost always move the quotations to be under one of the senses. - -sche (discuss) 23:14, 1 February 2016 (UTC)
A slight aside: even though we won't be a (relational?) database any time soon, if we are trying to enforce orderings of headings, perhaps we could make the editor pop up a warning where a proposed edit would cause sections to be out of the agreed order? Equinox 23:17, 1 February 2016 (UTC)
Would we make that warning appear before or after all entries are fixed to conform to the same heading order? If we do it before the entries are fixed, then when a user edits an entry with the wrong section order and leaves the section order unchanged, they will see the warning even though it is not their "fault". --Daniel Carrero (talk) 23:35, 1 February 2016 (UTC)
I'm just suggesting a general practice, not implementation details. If it can't be done it can't be done. Equinox 23:38, 1 February 2016 (UTC)
Ideally, if we can fix all entries, then make the warning appear in new edits, I'd support it.
If the warning thing can't be done (I didn't check), maybe we could use tags like the no-documentation, new-L2, etc. (wrong-section-order) --Daniel Carrero (talk) 23:53, 1 February 2016 (UTC)
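As a rough sketch of what such a check could look like, the function below scans an entry's wikitext for level-4 headings and reports any that appear after a heading they should precede. The AGREED_ORDER list simply reuses the order proposed at the top of this thread and is hypothetical; nothing here reflects an existing edit filter or editor feature.

```python
import re

# Hypothetical order, taken from the proposal in this thread for illustration.
AGREED_ORDER = ["Usage notes", "Quotations", "Alternative forms", "Synonyms"]

# Matches exactly level-4 headings such as "====Synonyms====".
L4_HEADING_RE = re.compile(r"^====[ \t]*([^=]+?)[ \t]*====[ \t]*$", re.MULTILINE)


def out_of_order_headings(wikitext):
    """Return the L4 headings that come later than the agreed order allows."""
    problems = []
    highest_rank = -1
    for name in L4_HEADING_RE.findall(wikitext):
        if name not in AGREED_ORDER:
            continue  # headings outside the list are ignored in this sketch
        rank = AGREED_ORDER.index(name)
        if rank < highest_rank:
            problems.append(name)
        else:
            highest_rank = rank
    return problems
```

A tag or warning could then be attached whenever the returned list is non-empty. This sketch ignores where one POS section ends and the next begins, which a real check would need to handle.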
  • A counterargument, in re: Alternative forms and Synonyms are similar concepts:
Alternative forms are terms where (generally) everything matches except the graphemic representation.
Synonyms are terms where nothing matches except the meaning (and often just a subset of meanings).
As such, it has always seemed sensible to me that alternative forms would be towards the top: everything beneath applies to the given alternative forms. It has also always seemed sensible to me that synonyms would go beneath the definitions: synonyms generally only match a subset of meanings, with {{sense}} required to stipulate which. ‑‑ Eiríkr Útlendi │Tala við mig 21:02, 2 February 2016 (UTC)
I agree, it generally makes sense to put the alternative forms at the top. Quite a few other dictionaries do it this way as well. Plus, I think that it divides information nicely between the top and bottom sections. Nibiko (talk) 22:13, 2 February 2016 (UTC)
Alternative forms can have different pronunciation, gender, inflection and even descendants. The line between alternative forms and synonyms is very thin. The only criterion seems to be similarities between the terms. So there is no assumption that "everything beneath applies to the given alternative forms", because it certainly doesn't in many cases. —CodeCat 22:26, 2 February 2016 (UTC)
ноль ‎(nolʹ) and нуль ‎(nulʹ) are good examples. They are both clearly derived from Latin nūllus, but possibly through different paths. The meanings are overlapping but not quite the same, and the declensions are related but different (ноль has two declensions, one of which borrows most forms from нуль). Other examples in Russian are воскресенье ‎(voskresenʹje) (meaning "Sunday" and also "resurrection") and воскресение ‎(voskresenije) (meaning "resurrection"), or e.g. мгновенье ‎(mgnovenʹje) and мгновение ‎(mgnovenije) (both meaning "moment, instant"). Benwing2 (talk) 23:37, 2 February 2016 (UTC)
  • Hmm, that's quite interesting, thank you both. It seems that again Japanese might be the oddball here. Alternative forms for Japanese are mostly cases of graphemic differences that are otherwise overlaps. C.f. 山樝子 / 山査子 ‎(sanzashi, hawthorne, Chinese hawthorne), or the more complicated example of 漏れる / 洩れる / 泄れる ‎(moreru, to leak (out from something); to be omitted, to be left off or out). The etymologies, pronunciations, and senses overlap, with only some usage notes to clarify the different spellings. ‑‑ Eiríkr Útlendi │Tala við mig 01:39, 3 February 2016 (UTC)
  • In the case of Finnish myydä, the alternative form myödä can be traced to a separate Proto-Finnic form. So the doublet can be traced back to an ancestral language. Proto-Slavic *pljuťe and *pluťe are derived from different ablaut grades of the root. Serbo-Croatian still has both varieties, differing by dialect. Compare also Finnish auttaa vs avittaa. —CodeCat 01:53, 3 February 2016 (UTC)
An English example of ambiguity between alt forms and synonyms is finocchio, which listed "feminine forms" like finocchia as alt forms: in this case, there's no difference in meaning (they both mean "fennel"), but in the case of a (presumably Italian-derived) word with X-o and X-a forms where one referred to a man who did X and the other referred to a woman who did X, I doubt we'd list the forms as alternative forms, since the meaning as well as the pronunciation and (slightly) etymology would be as different as they are for finocchio vs finocchia; conceivably the plurals could be different, too. Obviously, there are other cases where it's clear that two things are alternative spellings, like realize (Oxford-British and American spelling) vs realise (alternate British spelling). - -sche (discuss) 02:35, 3 February 2016 (UTC)
Are there any examples of English etymological doublets that are treated as synonyms or alternative forms in modern usage, perhaps differing by dialect? —CodeCat 03:33, 3 February 2016 (UTC)
Some of the alternative forms of hajduk go back through different languages, e.g. hajduk is directly from Hungarian, haiduc is via Romanian, haidouk is via French. There are probably many words like that. In contrast, English borrowed Narragansett mishcùp (plural mishcùppaûog) four times with sufficiently different forms that I labelled them synonyms rather than alt forms: mishcup, scup, paugie, and scuppaug. Quasi-synonyms include regal and royal. You may also find Wyang's comment here interesting, about how "Every time Vietnam was conquered by China, the Chinese officials brought with them the Chinese pronunciation of the character[s] at that time", leading to the same characters being borrowed repeatedly with subtly different pronunciation and sometimes meaning, e.g. ‎(to roll; a roll) became cuốn, cuộn, cuợn, quận, quấn, quyển, quyền, and quyến. - -sche (discuss) 17:26, 3 February 2016 (UTC)
To me it seems so far like a problem with classifying certain things as alternative forms rather than a problem with alternative forms themselves. Just because a word is a cognate, doesn't necessarily mean that it's an alternative form. In Japanese, there are plenty of cognates, many are considered alternative forms, many are not. The line is drawn on a practical level. Furthermore, synonyms is far from being the only alternative to classifying something as an alternative form, as there are other headers such as related terms, see also, usage notes, coordinate terms, and for ubiquitous concepts in a language there is usually some template parameter. Nibiko (talk) 03:47, 3 February 2016 (UTC)

I think that synonyms, as mentioned above, may (and for a vast majority of words, in all languages, this is the standard) refer to a specific sense-meaning (or some of them) of both words (both words can be used interchangeably but only for some specific meanings in each of them). As alternative forms we must present "real" alternative forms aka where everything matches (in meanings of both words, the lemma and the synonym) except the graphemic representation. That's why we use {{sense}} in the section of synonyms but not in the section of alternatives. Dialectal variables must not confuse us and must not be taken into account. --Xoristzatziki (talk) 05:05, 3 February 2016 (UTC)

It sounds like you are proposing to get rid of all alternative forms as we have them, and rename alternative spellings to alternative forms. —CodeCat 17:49, 3 February 2016 (UTC)
  • @CodeCat, are you talking about the label? I'm not aware of any ===Alternative spellings=== header. ‑‑ Eiríkr Útlendi │Tala við mig 19:29, 5 February 2016 (UTC)
    • Alternative spellings are, generally, the only pair of words where everything matches except the graphemic representation. Right now, we treat them as a subset of alternative forms, in which other things may differ as well. If we only allow pairs, where the graphemic representation is the only difference, to be labelled "alternative forms", then we are essentially relabelling what is currently "alternative spellings" into "alternative forms". The implication seems then, that whatever pairs remain that were formerly alternative forms but not alternative spellings, would become labelled as synonyms. —CodeCat 20:39, 5 February 2016 (UTC)

need bot permission

i am user from Bangladesh. i wanted to add new BN word . so i need bot flag . Rahul amin roktim (talk) 03:37, 2 February 2016 (UTC)

You don't need the bot flag to add new words, just add them directly. Benwing2 (talk) 04:56, 2 February 2016 (UTC)
A bot gives you the ability to do an enormous amount of damage very quickly, if you don't know what you're doing. Why should we risk allowing that when you haven't edited a single entry here at English Wiktionary? Not only that, but you've been blocked at bn Wiktionary for continuing to run a bot on your main account after being told not to, which doesn't inspire confidence in your willingness to follow the rules. You need to edit here for a while without a bot so we can see that you know how to do things right- then, maybe, we'll give you permission to run a bot. Chuck Entz (talk) 08:20, 2 February 2016 (UTC)
Oppose per Chuck Entz. Bot flags aren't for 'creating words' anyway, but for repetitive minor edits that would flood the recent changes. Renard Migrant (talk) 14:19, 2 February 2016 (UTC)
No need for a bot flag in the first place, and no reason for giving one to an unknown and untrusted user with few edits. Prior behaviour on other wikis only weighs further against. We should keep an eye on the user and block if they evade our bot policies. —CodeCat 22:28, 2 February 2016 (UTC)
In fact zero edits in the main namespace. First ever edit for that user was the one that started this thread! Renard Migrant (talk) 22:36, 2 February 2016 (UTC)
I have no idea why he/she would try to evade the bot policy yet again by private request. --kc_kennylau (talk) 13:25, 6 February 2016 (UTC)

Proposal: Removing Quotations header

Proposal: Removing the Quotations header from all entries.

  1. If there are quotations in the Quotations header, move them to the respective senses.
  2. If there is just {{seeCites}} (which links to the citations page), remove the Quotations header altogether, it's just a duplication of the citations link at the top of the entry.

For example, remove the Quotations header from abyss.

Quotations

Previous discussions:

Can we do that? --Daniel Carrero (talk) 17:31, 2 February 2016 (UTC)

{{seeCites}} can also be moved under whatever senses it applies to (being converted to {{seemoreCites}} if there are existing citations). All of this can be done in most cases. But in a few cases it's unclear which sense a quotation applies to; housing these is the stated reason for the existence of the (oft-misused) Quotations header. What should be done with such citations if the Quotations header is removed? I suppose they could just be moved to the citations page. - -sche (discuss) 22:23, 2 February 2016 (UTC)
We add quotations and other usage examples to illustrate usage. If we don't even know what they illustrate, then they don't really belong in the entry. —CodeCat 22:29, 2 February 2016 (UTC)
Massively support. But 'moving them to the respective senses' has to be done by human editors, so it's a big job. Like CodeCat says, just move them to the citations page and put them back into the entry when a human user gets round to it. Renard Migrant (talk) 22:39, 2 February 2016 (UTC)
Good points from both of you. Yes, it seems like a bot could just move 'em all to the Citations pages. (A bot could even skip cases where the citations page already existed, if that would make things easier; that would probably cut the number of remaining quotations sections down to something humans could manage.) For reference, as of the last database dump 4223 entries had quotations sections. - -sche (discuss) 22:55, 2 February 2016 (UTC)
I dislike the idea of relegating them to the citations page, but I do support getting rid of the quotations header. The only time I've seen one where the quote couldn't be moved to a specific definition is at PC, which has a quotation containing three different senses of the word. Andrew Sheedy (talk) 00:15, 3 February 2016 (UTC)
According to WT:EL, if it's possible to sort quotations under a sense, it's acceptable to do so. In my experience, only a tiny fraction of the 4223 entries which use quotations headers use them for the WT:EL-approved purpose of housing unclear quotations; most entries can be cleaned up on sight with no change in policy (and for years I have been cleaning them up when I rarely encounter them). - -sche (discuss) 23:01, 2 February 2016 (UTC)
I don't feel super-comfortable with this; I tend to miss the Citations page and I imagine I'm not alone here. Benwing2 (talk) 23:40, 2 February 2016 (UTC)
Might as well. Be warned that there might be a handful of pages where this header has been used to add a citation of uncertain or missing meaning. Equinox 00:32, 3 February 2016 (UTC)
On reflection, ====Quotations==== should be kept but should only ever contain {{seeCites}} and not the actual citations. I share Benwing's concern also, but I think the idea is that we log all the bot-created citations pages (Special:Contributions would do it automatically, for example) and move them back into the entry as fast as humanly possible. Perhaps indeed all pages with a citations page should contain {{seeCites}} under the quotations header, which I think could be done by bot, provided the bot can read the language statement in {{citations}} and put the quotations header in the right spot. Renard Migrant (talk) 13:36, 3 February 2016 (UTC)
What's the benefit, when a blue-linked Citations tab at the top already indicates the presence of such a page? Equinox 03:55, 4 February 2016 (UTC)
I'm with Equinox there. --Daniel Carrero (talk) 04:04, 4 February 2016 (UTC)
The benefit is that those who are focusing on the entry itself rather than on the user interface might see it where they wouldn't otherwise. Also, not everyone recognizes "citations" as meaning quotations. Chuck Entz (talk) 04:08, 4 February 2016 (UTC)
Your last sentence could be used as a good argument to change the Citations: namespace into Quotations:. (Citations:abyss -> Quotations:abyss) --Daniel Carrero (talk) 04:17, 4 February 2016 (UTC)

I created Wiktionary:Votes/2016-02/Removing "Quotations". --Daniel Carrero (talk) 04:55, 10 February 2016 (UTC)
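
For anyone thinking about the bot side of step 2 of the proposal, here is a rough Python sketch of how a Quotations section containing nothing but {{seeCites}} might be detected and dropped from an entry's wikitext. It only illustrates the idea and is not an actual bot: the function name strip_seecites_only_sections is made up, it assumes the header is written as ====Quotations==== (or one level up or down), and it assumes {{seeCites}} sits alone on its line, optionally after a bullet. Sections holding real quotations are left untouched and only flagged, since sorting those under senses needs a human.

```python
import re
from typing import Tuple

# Assumption: the header looks like "====Quotations====" (level 3 to 5).
QUOTATIONS_HEADER = re.compile(r"^={3,5}\s*Quotations\s*={3,5}\s*$")
# Any wikitext section header, e.g. "===Noun===" or "====Translations====".
ANY_HEADER = re.compile(r"^=+[^=].*=+\s*$")
# A line holding only {{seeCites}} (optionally bulleted, optionally with parameters).
SEECITES_ONLY = re.compile(r"^\*?\s*\{\{seeCites(\|[^{}]*)?\}\}\s*$")


def strip_seecites_only_sections(wikitext: str) -> Tuple[str, bool]:
    """Hypothetical helper: drop Quotations sections that contain only {{seeCites}}.

    Returns the new wikitext and a flag saying whether any Quotations
    section was left in place for a human editor to sort out.
    """
    lines = wikitext.split("\n")
    kept = []
    needs_human = False
    i = 0
    while i < len(lines):
        if QUOTATIONS_HEADER.match(lines[i]):
            # Find where this section ends: next header or language separator.
            j = i + 1
            while (j < len(lines)
                   and not ANY_HEADER.match(lines[j])
                   and lines[j].strip() != "----"):
                j += 1
            body = [line for line in lines[i + 1:j] if line.strip()]
            if body and all(SEECITES_ONLY.match(line) for line in body):
                i = j  # skip the header and its body entirely
                continue
            needs_human = True  # real quotations (or an empty section): leave it
        kept.append(lines[i])
        i += 1
    return "\n".join(kept), needs_human
```

A real run would still need the usual safeguards (a test batch, a log of touched pages, and community approval of the bot task) before anything like this went near the 4223 entries mentioned above.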

Emilian-Romagnol vs. Emilian?

User:Gloria sah asked me:

I noticed that in the page "", since I wanted to write about Emilian-Romagnolo language, I had to write the abbreviation "egl" (that conversely usually refers only to Emilian language) instead of "eml"=Emilian-Romagnol language. Do you think in the future I'll have to write "egl" again, or there is something to fix upstream? Thank you in advance, --Gloria sah (talk) 16:39, 2 February 2016 (UTC)

I don't know enough about these varieties to answer the question. Anyone? Benwing2 (talk) 17:32, 2 February 2016 (UTC)

@Gloria sah: ISO 639-3 has deprecated eml and now treats Emilian egl and Romagnol rgn as two separate languages. Therefore, please always use egl for Emilian words and rgn for Romagnol words, and use the headers ==Emilian== and ==Romagnol==. If a word happens to be spelled the same in both languages, you can create two entries. —Aɴɢʀ (talk) 18:56, 2 February 2016 (UTC)
Although (Gloria), if you think Emilian and Romagnol should be treated as a single language, we could have a discussion of the appropriateness/inappropriateness and benefits/drawbacks of that. - -sche (discuss) 19:08, 2 February 2016 (UTC)
Thank you for your precious answers. I'll ask my colleagues too. --Gloria sah (talk) 20:01, 2 February 2016 (UTC)
Who are your colleagues? Renard Migrant (talk) 21:56, 2 February 2016 (UTC)
My eml.colleagues, I meant. Now, it's ok as per Angr told me here above. Thank you and good luck, --Gloria sah (talk) 07:18, 3 February 2016 (UTC)
Oh right, there's a unified Emilian-Romagnol Wikipedia rather than two separate ones. —Aɴɢʀ (talk) 10:09, 3 February 2016 (UTC)

Proposal: Removing "Flexibility" from WT:EL

Flexibility
While the information below may represent some kind of “standard” form, it is not a set of rigid rules. You may experiment with deviations, but other editors may find those deviations unacceptable, and revert those changes. They have just as much right to do that as you have to make them. Be ready to discuss those changes. If you want your way accepted, you have to make the case for that. Unless there is a good reason for deviating, the standard should be presumed correct. Refusing to discuss, or engaging in edit wars may also affect your credibility in other unrelated areas.

I'm not a big fan of WT:EL#Flexibility. That section was added in this diff from 2006 by Eclecticology and has been kept essentially unchanged for 10 years (adding a comma and minor formatting don't count). See also this diff of just the section, compared between 2006 and now.

Here's what I think about that text:

  1. "While the information below may represent some kind of “standard” form, it is not a set of rigid rules."
    • No, I believe they are. Granted, some information in WT:EL may be outdated or contradictory so it tends to get ignored, but the voted rules and standard formatting tend to be followed more seriously. In 2006, our grasp of what exactly should be the entry layout was in an earlier stage of development, so I'm willing to bet that making up rules as you go along was something more sensible then than it is now.
  2. "You may experiment with deviations, but other editors may find those deviations unacceptable, and revert those changes. They have just as much right to do that as you have to make them."
    • Ok, that does not seem to be something we want to encourage. Deviations from the standard format tend to get reverted on sight. WT:EL is directed at newbies to some extent, so I wouldn't want them to think they are in complete control of the presentation of the entry and have ideas like "I'll color my entry completely pink, it's going to be more beautiful" or something. To be fair, seasoned users would probably think of new ways to do stuff that actually make sense, but they would normally be expected to discuss these things anyway, I think.
  3. "Be ready to discuss those changes. If you want your way accepted, you have to make the case for that."
    • However sensible that may be, it's not a layout rule.
  4. "Unless there is a good reason for deviating, the standard should be presumed correct."
    • This sentence could be added to any policy without changing the content in any way. WT:CFI could say: "Unless there is a good reason for deviating, the standard should be presumed correct." WT:BLOCK, WT:BOT, WT:AJA, etc., all of them. So the sentence can be removed safely.
  5. "Refusing to discuss, or engaging in edit wars may also affect your credibility in other unrelated areas."
    • That's advice concerning human interaction, that's not a layout rule.

Can we remove WT:EL#Flexibility? --Daniel Carrero (talk) 14:16, 3 February 2016 (UTC)

No I think flexibility is a great idea as it reflects actual practice (which is the major problem with WT:CFI as I see it; it doesn't represent what we actually do). The rules should follow standard practice, not the other way around. Renard Migrant (talk) 14:43, 3 February 2016 (UTC)
I don't like most of it either. Talking about what "right(s)" editors have seems irrelevant to the page, and the implied threat about "refusing to discuss" somehow reminds me of that awful TODO Group Code of Conduct. Equinox 15:23, 3 February 2016 (UTC)
We need some freedom in at least one of two places: in having only rules that have very broad consensus OR in loose enforcement of rules. The second option is much like real life, in which many laws are not enforced with any regularity.
In our little world, IMO, we need more help in improving content (In English, that means quality more than coverage.) than we need prohibitions of non-standard formatting or template use. I would hope that we would get more of such help. Instead we get more standardization. DCDuring TALK 16:00, 3 February 2016 (UTC)
If anything in CFI we should have a flexibility clause, that consensus overrides WT:CFI as that's what happens in vivo. Renard Migrant (talk) 16:28, 3 February 2016 (UTC)
I get it that you are comparing the need of standardization vs. the need for improving content. But I don't suppose you are arguing that standardization is a bad thing? Random example: If we didn't have a standard format for translation tables, then we wouldn't have an "add translation" gadget that works properly, it would be more tedious to add translations and we would probably have a smaller number of translations than what we have now. --Daniel Carrero (talk) 16:33, 3 February 2016 (UTC)
A great deal of our standardization IS a bad thing. For templates it often is done for reasons of tidiness or to give our technical adepts some more or less challenging problems to work on, unaccompanied by any improvement in ease of use for contributors or users. Some standardization of format, headings and templates may have bad effects on languages for which we have no current contributors to express the problem.
What has been done to make it easier to improve definitions, especially in entries for polysemic terms? To eliminate the use of rare, obsolete, and archaic English terms in definiens? To compile a defining vocabulary? DCDuring TALK 20:23, 3 February 2016 (UTC)
@DCDuring Your criticism is not very specific, do you want to discuss some template or style rule or change in particular? In any event, I don't think you can lecture anyone for not helping in a particular way, since we're a volunteer project, people edit where they choose to. Do you think the presence of WT:EL#Flexibility as it is currently written helps Wiktionary? If I created the vote to delete the section, do you think you would support, oppose, abstain? --Daniel Carrero (talk) 03:37, 4 February 2016 (UTC)
<rant>It's a matter of the apparent interests of many of our contributors. Many seem to be interested in all the fascinating tricks that software can be made to do. Although normally our better technical contributors don't break the infrastructure, many reforms disrupt habits, time-saving user snippets, and user consensus (or unearth buried hatchets). The uniformitarian impulse seems to stem from the desire for universally applicable software. All irregularities need to be eliminated to make the wiki safe for relatively simple and dramatic "innovations". (I can't say solutions, because they often aren't.)</rant> I am actually afraid to ask for any technical help other than bug fixes because I am afraid of the unwanted consequences. I may take a chance next year and ask for help when the MW folks ask for ideas. DCDuring TALK 03:52, 4 February 2016 (UTC)
@DCDuring: Re "I am actually afraid to ask for any technical help other than bug fixes because I am afraid of the unwanted consequences.": Ok, I don't think there's anything wrong with doing completely new stuff, it's just that I'd expect these things to be discussed and approved by the community first, that's all. If a new idea is bad, hopefully problems can be spotted in the discussion of the proposal, right? Let's not change the whole system without warning. (I'm being hypocritical if I did major changes in the past without discussing; still, the right thing is to discuss first.) OTOH, if it's a good idea that people want, then I'd argue it should be implemented anyway. -- Did you like the system of Template:votes where you can see how many people voted? :) It was properly discussed. I think that's a complex "innovation" that I could do right. Obviously we can talk if you disagree. -- Again, there could be some specific templates/changes/etc. that you'd like to discuss.
On that note, I'm still going to create a vote on deleting Flexibility if no one minds or has a different idea. I thought about maybe rewriting the section but there's little related to entry layout written there. If there's any piece of text people would like to keep ("Be ready to discuss those changes. If you want your way accepted, you have to make the case for that." is sensible, as I pointed out.), that could be done in some page other than WT:EL. --Daniel Carrero (talk) 06:41, 4 February 2016 (UTC)

I created Wiktionary:Votes/pl-2016-02/Removing "Flexibility". --Daniel Carrero (talk) 04:35, 9 February 2016 (UTC)

Should all adjectives have adverbial sections?

There are some languages which can reuse adjectives as adverbs without modifying them, for example German and Romanian. We could add an adverbial section to an adjective like laut, simply defining it as loudly. This would be technically correct, would make the project more complete, and I suppose that it wouldn’t harm anybody, but this could be wasteful or unprofitable for almost everybody except for absolute beginners of the language (who would probably expect a closely‐related language to have extremely similar laws and functions like theirs). For natives, adverbial uses of adjectives are seen as both given and obvious. I’d like to read your thoughts on the matter. I’m remaining neutral for now. --Romanophile (contributions) 17:56, 3 February 2016 (UTC)

For Dutch, adverbs with the same meaning as the adjective are not given separate entries, but are instead listed as "adverbial" in the inflection table of the entry. See snel, WT:ANL. —CodeCat 18:35, 3 February 2016 (UTC)
But there must be at least some adjs that can't be advs, right? Equinox 19:26, 3 February 2016 (UTC)
Spanish is similar. But different too. --Ce mot-ci (talk) 15:33, 4 February 2016 (UTC)

Switching from є to е in Old Church Slavonic

Previous discussion: Wiktionary:Beer parlour/2012/December#Є/є in Old Church Slavonic

In the previous discussion linked to above, it seems we were in agreement to switch to е. What happened to that? I still strongly dislike the use of є. --WikiTiki89 23:43, 3 February 2016 (UTC)

“Trivia” or some other suitably explanatory heading

The section WT:EL#Anagrams and other trivia is an explanation of anagrams, followed by the paragraph below. Can we remove the part "or some other suitably explanatory heading"?

  • Other sections with other trivia and observations may be added, either under the heading “Trivia” or some other suitably explanatory heading. Because of the unlimited range of possibilities, no formatting details can be provided.

Rationale:

  • If there are any new trivia ideas, just "Trivia" should be fine in most cases; other names for trivia sections can be discussed later if needed. I don't recall the use of trivia sections other than "Trivia", "Anagrams" and "Statistics". If anything, the Statistics section should either be mentioned explicitly in WT:EL or deleted. The Statistics section is used in entries like this: man#Statistics. It serves the purpose of keeping {{en-rank}}, a template that is currently in RFDO: WT:RFDO#Template:en-rank.

--Daniel Carrero (talk) 08:31, 4 February 2016 (UTC)

Four policy proposals in five days feels a bit much, doesn't it? A user who's away for a single week might miss a lot. Equinox 13:06, 4 February 2016 (UTC)
He harbours (not-so-)secret plans to rise to the position of WT overlord. All hail! --Ce mot-ci (talk) 15:32, 4 February 2016 (UTC)
[4] --Daniel Carrero (talk) 15:56, 4 February 2016 (UTC)
@Equinox: I'll reply in the next section. I'd like to keep this section only for the proposal of editing WT:EL#Anagrams and other trivia, in case I create a vote later and link back to this discussion. --Daniel Carrero (talk) 15:56, 4 February 2016 (UTC)

About my proposals

About @Equinox's question in the discussion above:

I have a file on my PC listing my multiple pending Wiktionary projects. I've been trying to reduce it in size by creating the proposals you are seeing in some of the discussions above, over the last few days. :p For example, this link is more or less how I envisioned WT:EL largely revised by me in October 2015. But some things changed since then, so that's not the exact policy I would propose.

So, as Equinox pointed out, I created 4 policy proposals in the last 5 days. (There are also a bunch more BP discussions I created on January 31.) One may argue that I'm rushing things, but IMHO it's the opposite, I'm taking my time. (If a certain topic is arguably very minor and/or uncontroversial, like the Trivia section thing, does it even count?) I'm trying to get all the discussions I want out of the way; there are others I didn't find the time to create. Not to mention that I'm participating in all the discussions and creating a bunch of related votes. (Off the top of my head, some unresolved issues to be discussed in the short term are: appendices for letters, deleting some index pages, I'm creating a periodic table for the entries, I'd like to see checkmarks in the vote box when I voted and X marks when I didn't vote, it does not make sense that WT:EL#The entry core is nested under WT:CFI#Additional headings, and Beer Parlour Archive Secret Project.)

Re: "A user who's away for a single week might miss a lot." But policy changes can only be implemented with a vote, giving the user more time to review the changes. Also, when people complain of something wrong during the course of a vote, I usually create separate votes to address these issues, so that's more votes I either have already created or plan to create in the future. In the end, it takes some time to repeatedly revise the text of some of our voted policies. :p Last year, I created 3 NORM votes (Wiktionary:Votes/pl-2015-05/Normalization of entries, Wiktionary:Votes/pl-2015-07/Normalization of entries 2 and Wiktionary:Votes/pl-2015-11/NORM: 10 proposals), 2 headword line votes (Wiktionary:Votes/pl-2015-10/Headword line and Wiktionary:Votes/pl-2015-12/Headword line 2), etc. to address points raised in the previous votes.

Bottom line is: Please let me continue creating a lot of proposals. (if possible) :) In most cases that I have planned, they should be just codification of actual practices (either obvious ones or ones that may require discussion). There's some new stuff, but IMHO those should be helpful if other people agree (appendices for letters = remove clutter from letter entries), I'll give more details when I actually create the discussions. (Also I wouldn't mind dumping my whole Wiktionary.txt file on my Sandbox, but I can't promise it's completely intelligible.) --Daniel Carrero (talk) 15:56, 4 February 2016 (UTC)

@Daniel Carrero: I'd suggest that you create a page like User:Daniel Carrero/Suggestions and get some feedback now before you propose them for real. —Justin (koavf)TCM 14:45, 5 February 2016 (UTC)
Ok, sounds good. --Daniel Carrero (talk) 15:26, 5 February 2016 (UTC)
Then again, the only difference between making a list of 4 proposals as separate BP discussions and making a list of 4 proposals in my userspace is the location, unless I use my userspace to make a compact/dumbed down list of multiple items. If I'm able to type full proposals and I'd like some feedback, maybe BP is the best place after all. --Daniel Carrero (talk) 15:29, 5 February 2016 (UTC)
Yeah, I don't see the point in moving these discussions to userspace. (I think the original complaint was that there are too many proposals being made in a short time, not where they were being made.) - -sche (discuss) 15:42, 5 February 2016 (UTC)
Dumbed down list:
Part 1 - new things
  • appendices for letters that should replace most letter entries
  • deleting most index pages
  • creating a periodic table for the entries
  • seeing in Module:votes whether I voted or not in a vote
  • it does not make sense that WT:EL#The entry core is nested under WT:CFI#Additional headings
  • Beer Parlour Archive Secret Project
  • move compass points template to "Template:table:compass points/en"
  • character map
  • I don't like that "# {{given name|female|lang=es}}." and other templates require a manual dot but maybe there's nothing to be done
  • update WT:NS
  • Appendix:Superscript and subscript
  • redirect ² to 2, ⁿ to n (?)
  • pics of myself for Appendix:Gestures
  • gloss database in a module
  • Appendix:Date and time formats
  • script code + categories for 1, 2, 3...
  • Portuguese conjugation page
  • rewrite blocking policy
  • actual voting policy page
  • long names for topical categories
  • rename request categories
  • check if a table template is being used in all pages
Part 2 - codify or standardize rules
  • you can't format abbreviations as: laugh out loud
  • {{pedia}} and other external links templates should use a bullet point
  • place images in the language sections, not together with {{also}}
  • language-specific templates are preferable to {{head}} when they exist (?)
  • don't add ---- except between languages
  • quotations in chronological order
  • list of languages with stripped accents
  • translations of: kg, etc
  • delete WT:Anagrams if WT:EL#Anagrams is enough
  • explain better the difference between External links and References, if any
  • mention {{ttbc}} or {{trreq}} or whatever we are using now
  • mention {{trans-see}} in a policy
  • make Help:Translations out of Wiktionary:Translations
  • edit WT:NORM's rule about "#:*" together
  • no links in the gloss between parentheses (?)
  • standard style for Citations: (gloss: italic or between quotation marks; also require ----)
  • use Unicode "micro sign" in English instead of Greek letter, etc.
This is mostly stuff I haven't yet said I wanted to do. If I already said it somewhere, it's still on my list, but I didn't want to repeat it here. --Daniel Carrero (talk) 03:27, 6 February 2016 (UTC)

Declension tables after every new etymology

I discussed with a fellow Wiktionarian whether it's necessary to have declension tables after every new etymology when the word and all its forms remain the same, but we couldn't recall if there's a rule about this. See pală to see what we're talking about. I personally think that it looks a bit cluttered. Do we have a policy to guide us in this matter? --Robbie SWE (talk) 11:49, 5 February 2016 (UTC)

I'm fairly sure there's no policy on this. WT:EL would be the place to look. Common sense, in my opinion, says to just include the declension once with a level 4 header (this signals that it's not under any specific etymology, as that would be level 5). Renard Migrant (talk) 13:13, 5 February 2016 (UTC)
I agree. I'll take a look and see what I can do. Should we have a policy on this though? --Robbie SWE (talk) 15:27, 5 February 2016 (UTC)
One thing that could alleviate the problem is to make the declension tables less conspicuous. Starting with not taking up the entire width of the screen. --WikiTiki89 15:30, 5 February 2016 (UTC)
@Wikitiki89: I see what you mean. I'll do the necessary changes. Thanks for the advice! --Robbie SWE (talk) 15:39, 5 February 2016 (UTC)
Just out of curiosity, aren't there some cases where two homographs have different declensions? I am looking at you Russian. I like the cleanliness of a single declension/conjugation table, but there may need to be flexibility in any policy around it when required. - TheDaveRoss 15:39, 5 February 2016 (UTC)
Plenty of such cases. That's why we tend to include inflection tables in every POS section. --WikiTiki89 15:46, 5 February 2016 (UTC)
Yeah, I think it's good to include an inflection table in every POS section to clarify when the information is the same vs when it's different. (Renard's suggestion is also acceptable, though, and might work better for some languages where inflection depends solely on a lemma's spelling, or is uniform throughout the language as in constructed languages.) - -sche (discuss) 15:53, 5 February 2016 (UTC)
@TheDaveRoss: Sure there are – not only in Russian, but in Romanian too. I usually include multiple declension tables only if the word has more than one declension (see copil in the Romanian Wiktionary). The issue here is that the same declension table appears 5 times and it just doesn't look appealing, if you ask me. I think that Renard's suggestion is worth a shot, but I would feel more comfortable knowing if the greater community approves of such a "rule". --Robbie SWE (talk) 16:52, 5 February 2016 (UTC)
I think the inflection should always be in the same place, nested under the POS header, below the definitions and usage notes. I oppose any deviation from this format. If there are multiple POS headers with the same inflection, then they should have separate inflection tables nonetheless, because I think keeping to a standard format and ordering is more important for users than our desire to fiddle around with the nesting. —CodeCat 16:56, 5 February 2016 (UTC)
@CodeCat Hmmm, I kind of jumped the gun and went ahead and erased the superfluous declension tables. Should I revert my own changes? --Robbie SWE (talk) 17:02, 5 February 2016 (UTC)
In the page you edited, there should be at least something that says what the declensions of the other nouns are. Right now, I'd be inclined to think they are missing, and add them back in if I knew what they were. A message saying "see above" may work, but you might as well put in a declension table instead. —CodeCat 17:05, 5 February 2016 (UTC)
Ok, I'll revert my edits for now, but I strongly believe that we should discuss this further and agree on a policy. --Robbie SWE (talk) 17:09, 5 February 2016 (UTC)
I agree. My hope is that one day, each POS section can stand on its own as a distinct entry, just like in paper dictionaries, and that nesting is no longer used for shared properties. I still support POS sections at level 3 only. —CodeCat 17:15, 5 February 2016 (UTC)

──────────────────────────────────────────────────────────────────────────────────────────────────── My policy for Russian is to include a declension table under every new etymology or POS header, even when all etymologies happen to share the same declension. This is because logically the declension is nested under the POS header and a part of it, and most of the time different etymologies don't share the same declension, and it seems too confusing to do it the other way. I agree with CodeCat in this respect (except for the position of usage notes, which is a fairly minor issue). Benwing2 (talk) 18:18, 5 February 2016 (UTC)

how to represent "Y; in [some dialect], also pronounced X"

Say a word can be pronounced a certain way in most varieties of English, but in one of them, it can also be pronounced a second way. I've seen this indicated in a variety of ways, none of which seem really satisfactory to me, including (rather than try to find the old examples I just mocked them up on meringue, but you can find some examples by typing e.g. insource:"a US also" into the search bar) [5], [6], [7], [8], and [9]. What do you think, how should this kind of thing be formatted? PS the specific "also" pronunciation of meringue may actually be spurious, but let's discuss the general case. - -sche (discuss) 15:36, 5 February 2016 (UTC)

Since only in #6 I understood what you wanted to say, I vote for that one. The first ones to me are not clear enough about whether the USA has both pronunciations or only the "also"-variant. Korn [kʰʊ̃ːæ̯̃n] (talk) 16:01, 5 February 2016 (UTC)
@Korn: Don't use the numbers, they can change. --WikiTiki89 16:05, 5 February 2016 (UTC)
Yeah, the last one (currently number 6, but as noted that could change if someone adds a link to a section above this one) does seem clearest to me. What Wikitiki favors below also works, and probably better fits how we label other things (we usually say "US" not "in the US")... it just seems less clear. - -sche (discuss) 16:20, 5 February 2016 (UTC)
I think it's fine to use {{a|US|also}}: (US, also). --WikiTiki89 16:05, 5 February 2016 (UTC)
  • I know this isn't the point of your post, but that second pronunciation is actually how I natively pronounce it. But in any case, I think Wikitiki's solution is good. —Μετάknowledgediscuss/deeds 16:39, 5 February 2016 (UTC)
  • I think adopting the practice of "one pronunciation per line" would be good. So I would prefer putting the second pronunciation on its own line, rather than at the end of the first. —CodeCat 16:46, 5 February 2016 (UTC)
    • I would be against a "one pronunciation per line" rule. But I would definitely support a "one tag per line" rule. --WikiTiki89 16:51, 5 February 2016 (UTC)
      • What are your objections? —CodeCat 16:52, 5 February 2016 (UTC)
        • Because pronunciations sections that have five pronunciations each for UK and US English, would have to take up 10 lines instead of two. --WikiTiki89 17:00, 5 February 2016 (UTC)
          • Do those five pronunciations have notices saying when and how they are used, which are more frequent than others, etc? Also, how do you deal with pairing enPR and other language-specific transcription systems with the IPA? I think putting IPA on one line and other schemes on another line is bad. It establishes no correlation between the two. That's why I proposed the rule: to make it clear which "things" are different representations of the same pronunciation. —CodeCat 17:07, 5 February 2016 (UTC)
            • Sometimes that kind of data is unavailable, especially when the variants are essentially interchangeable and not associated with any dialect, sociolect, or anything like that. And the enPR, if it really needs to be paired, can be paired by order alone. --WikiTiki89 17:14, 5 February 2016 (UTC)
Perhaps some sort of all-encompassing tag might be better, to be clear that the first pronunciation is more or less universal. It might be silly to have some sort of {{a|everywhere}} tag, though. I suggest using {{a|in the US, also:}} (with the comma and colon), as that's probably the least ambiguous option. Andrew Sheedy (talk) 18:32, 5 February 2016 (UTC)

Eggcorns

We have Template:pronunciation respelling of and Template:eye dialect of. Should we have a template for eggcorns? Entries that could perhaps use it include for all intensive purposes and greatfruit. - -sche (discuss) 20:43, 5 February 2016 (UTC)

I have created T:eggcorn of. - -sche (discuss) 22:48, 8 February 2016 (UTC)
Is eggcorn itself citable as an eggcorn of acorn? Keith the Koala (talk) 18:55, 9 February 2016 (UTC)

Poll: pronunciation example

FYI: There's a poll going on:

The purpose of the poll is choosing an example word for the updated WT:EL#Pronunciation, if the vote Wiktionary:Votes/pl-2016-01/Pronunciation passes. Thanks! --Daniel Carrero (talk) 00:33, 6 February 2016 (UTC)

Placement of an English-specific pronunciation rule

About this rule:

  • "The r phoneme used in English in words like red, green and orange is to be represented with /ɹ/ instead of /r/, except in accents where it is actually a trill."

It was voted in 2008-01/IPA for English r. It currently is located in WT:EL#Pronunciation.

The vote Wiktionary:Votes/pl-2016-01/Pronunciation should start in a few days. It proposes to rewrite most of WT:EL#Pronunciation but it does not touch that rule. Are we OK with keeping that rule there or should we move it to WT:AEN? My opinion is: WT:AEN is not a "true" policy, it is a think tank (at least in the opinion of the user who placed the TT template in the page, see diff) and I'd like to rewrite much of it. (It would be off-topic to say exactly which parts of it and how.) Since it's a hard and fast voted-on rule, let's keep it in WT:EL#Pronunciation and move it to WT:AEN later when we have a proper About English page. Does that sound good? --Daniel Carrero (talk) 01:44, 6 February 2016 (UTC)

We should move it to WT:AEN. Even if WT:AEN is not a policy page, this rule is a policy because it was voted on. Not all vote results need to be on an official "policy page". --WikiTiki89 01:48, 6 February 2016 (UTC)
If we are going to move it between policies, I'd rather do it without a formal vote. Changing the location of a voted policy text is an "unsubstantial change", IMO. That is, it does not change actual practice. --Daniel Carrero (talk) 04:17, 6 February 2016 (UTC)
Done. See: diff 1, diff 2. --Daniel Carrero (talk) 15:16, 7 February 2016 (UTC)

Esperanto: h-system and x-system

Esperanto contains diacritics (e.g. ĉirkaŭ). Some entries, such as cxirkaux, serve as the x-system spelling of those words. Are they necessary? And should they be here at all? --kc_kennylau (talk) 04:39, 6 February 2016 (UTC)

@Kc kennylau: This isn't paper, so it doesn't hurt to have standardized forms like this. —Justin (koavf)TCM 15:10, 6 February 2016 (UTC)
They don't get a free pass from WT:CFI so they still need to be attested. Renard Migrant (talk) 17:06, 8 February 2016 (UTC)
Should they have a free pass from CFI? In 2011 a vote passed to allow pinyin transcriptions of attested Chinese words even when the transcriptions themselves are unattested—are h-system and x-system spellings a similar case? It's not obvious to me one way or another. We would need a vote to allow unattested h/x-system spellings, but I agree with Koavf that adding them when attested doesn't hurt. Many, including cxirkaux, are citeable from Usenet. See also Wiktionary talk:About Esperanto#X-system and H-system for past discussion including several suggestions on how to handle these spellings.
As a side note, current practice is to allow x-system and h-system spellings as citations for Esperanto words with diacritics. Regardless of whether we include entries for x/h-system spellings, I think we should continue allowing them as citations for the entries with diacritics. —Mr. Granger (talkcontribs) 00:07, 9 February 2016 (UTC)
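For readers unfamiliar with the x-system mentioned above, here is a minimal sketch of how such spellings relate to the forms with diacritics. It assumes the usual six digraphs (cx, gx, hx, jx, sx, ux); the function names are hypothetical and are not part of any Wiktionary module or template.

```python
# Esperanto letters with diacritics and their x-system digraphs.
X_SYSTEM = {
    "ĉ": "cx", "ĝ": "gx", "ĥ": "hx",
    "ĵ": "jx", "ŝ": "sx", "ŭ": "ux",
}


def to_x_system(word: str) -> str:
    """Replace each diacritic letter with its x-system digraph."""
    out = []
    for ch in word:
        digraph = X_SYSTEM.get(ch.lower())
        if digraph is None:
            out.append(ch)
        else:
            # Keep the capitalisation of the base letter (Ĉ -> Cx).
            out.append(digraph.capitalize() if ch.isupper() else digraph)
    return "".join(out)


def from_x_system(word: str) -> str:
    """Replace each x-system digraph with the matching diacritic letter."""
    for letter, digraph in X_SYSTEM.items():
        word = word.replace(digraph, letter)
        word = word.replace(digraph.capitalize(), letter.upper())
    return word


print(to_x_system("ĉirkaŭ"))      # cxirkaux
print(from_x_system("cxirkaux"))  # ĉirkaŭ
```

The sketch does not handle all-caps digraphs such as "CX", and it would also rewrite foreign names that happen to contain one of the digraphs. The h-system is left out because its "u" for "ŭ" cannot be reversed mechanically.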

Blocking policy - full revision

Now that Wiktionary:Votes/pl-2015-11/Short blocking policy failed, I propose revising WT:BLOCK completely. I tried and created a new version of the policy below. What do you think? Could this replace the current policy?

Notes: The current WT:BLOCK is part-policy, part-"non-binding explanation". The proposed rewrite is supposed to use all of that content and convert it to be 100% policy. Some commentary such as "For anonymous IP addresses, the 99% case is non-recurring stupidity." was removed. Words such as "usually" and "recommended" were added at some points.

See it here: User:Daniel Carrero/BLOCK. Feel free to edit the page.

--Daniel Carrero (talk) 06:21, 6 February 2016 (UTC)

Hapax legomena

Do we have a template and/or a category for hapax legomena such as nudiustertian? Smuconlaw (talk) 19:04, 6 February 2016 (UTC)

IMO, nudiustertian should be a row in a table in an Appendix:English hapax legomena. I don't even see how a real definition based on conventional usage can be legitimate for such a term, unless it is morphologically obvious. DCDuring TALK 20:32, 6 February 2016 (UTC)
It does appear in the OED, though (labelled as a hapax). I did a search using Quiet Quentin, but all citations are just repetitions of the one already mentioned in the entry. — SMUconlaw (talk) 20:34, 6 February 2016 (UTC)
Therefore it would not meet WT:ATTEST. We formerly allowed terms in English and other well-attested languages that had but a single attestation in a "well-known work", but that was voted out. I'd start the Appendix, move the citation to Citations:nudiustertian, insert sortable table with a column for etymology and a column for the citations link, and call it a day. DCDuring TALK 21:00, 6 February 2016 (UTC)
We have Appendix:English nonces. - -sche (discuss) 21:02, 6 February 2016 (UTC)
Even better. DCDuring TALK 21:04, 6 February 2016 (UTC)
What I've done in the past, to preserve the edit history (for legal / attribution reasons), is move the entire entry to the citations namespace, then strip out everything but the citation (leaving the previous content in the edit history), and then add the entry to Appendix:English nonces with a link back to the edit history. - -sche (discuss) 21:05, 6 February 2016 (UTC)
Regarding nudiustertian, note the discussion on the talk page. @Mr. Granger has found some occurrences of the term on Usenet. — SMUconlaw (talk) 21:24, 6 February 2016 (UTC)
That, of course, applies only to well-documented languages. Ancient Greek, for instance, has quite a few hapax legomena that could use such a category. Chuck Entz (talk) 21:36, 6 February 2016 (UTC)
...which could be modelled after Category:Latin hapax legomena. - -sche (discuss) 21:48, 6 February 2016 (UTC)

Interwiki links - actual regulations

We have the section WT:EL#Interwiki links, which is a long explanation about the subject matter that IMO would be better suited to a help page like Help:Interwiki links. (currently a redlink)

I believe the most obvious actual regulations for interwiki links are, in this order:

  • Interwiki links must point to the entry spelled exactly in the same way in the foreign language Wiktionaries.
  • Interwiki links should be sorted in the alphabetical order by the name of the language as written in the language itself. (as in: German in German is Deutsch, so it is sorted under D)

But I seem to recall there were some exceptions concerning Hebrew, is that right? Are there any actual regulations for interwiki links that should be mentioned, besides the rules above? Are these rules 100% accurate? --Daniel Carrero (talk) 05:41, 7 February 2016 (UTC)

What if the language doesn't use the Latin script? —Aryamanarora (मुझसे बात करो) 00:10, 9 February 2016 (UTC)
There is no exception for Hebrew entries (and yes, this makes things inconvenient because our Hebrew entry name conventions differ from he.wikt's entry name conventions). As for the alphabetization rules, I think they are actually more complicated than that. User:Ruakh should know all about it. --WikiTiki89 01:19, 9 February 2016 (UTC)
See the interwikis in the entry dog, for example. Japanese in Japanese is 日本語 (nihongo), and the jawikt link is sorted like "Nihongo" in the alphabetical order. This leads me to believe the actual rule in these cases is "sort using the transliterated language name", but I didn't check all languages. --Daniel Carrero (talk) 01:29, 9 February 2016 (UTC)
Like I said it's more complicated than that. I found a discussion explaining it: User talk:Ruakh/2014#Unsorted interwikis. --WikiTiki89 01:38, 9 February 2016 (UTC)
Thanks for the link, according to Ruakh in that conversation, the order is the one used in m:MediaWiki:Interwiki config-sorting order-native-languagename. This information could go to WT:EL, IMO. ("sort interwiki links in this exact order" qualifies as a layout rule, I suppose) --Daniel Carrero (talk) 03:55, 9 February 2016 (UTC)

I created Wiktionary:Votes/pl-2016-02/Interwiki links. --Daniel Carrero (talk) 13:10, 12 February 2016 (UTC)
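As a rough illustration of the "sort interwiki links in this exact order" rule discussed above, the sketch below sorts language codes against a fixed list. The five-code list is a hypothetical excerpt chosen to match the examples in this thread (Deutsch under D, 日本語 as "Nihongo"); the real order is whatever the Meta page cited above contains, and the function name is made up for this example.

```python
# Hypothetical excerpt, ordered by native language name:
# Deutsch, Español, Français, Nihongo, Polski.
SORT_ORDER = ["de", "es", "fr", "ja", "pl"]


def sort_interwikis(codes):
    """Sort codes by their position in SORT_ORDER.

    Codes missing from the list go last, alphabetically by code
    (an assumption; the thread does not say how they are handled).
    """
    rank = {code: index for index, code in enumerate(SORT_ORDER)}
    return sorted(codes, key=lambda code: (rank.get(code, len(SORT_ORDER)), code))


print(sort_interwikis(["ja", "fr", "de"]))  # ['de', 'fr', 'ja']
```

Using an explicit list rather than sorting by transliterated name means that exceptions in the configured order need no special code.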

AddAudio.js

Presenting AddAudio.js, a script that allows editors to easily add audio pronunciation to entries. The script adds a small "Add audio pronunciation" recording button to each language section, which allows a user to record the audio and then automatically edits the page to add the audio template (adding a pronunciation section if necessary) and uploads the file to Commons.

A few remaining limitations:

  • The script currently only works in Firefox. (I don't expect this to be the case for that much longer, as partial support for the relevant API already exists on Chrome. Microsoft Edge currently has it "Under consideration".)
  • The script has not been tested thoroughly, and is probably somewhat buggy. Sorry about that.
  • Specifying accents is not yet possible, because I don't really know how accents are supposed to work. (How does one figure out what accent code to place in the file name? What text goes in the audio box? Is there a list of each language's accents available somewhere?) If someone could help me with that, that would be very helpful.

Feedback, suggestions, bug reports, etc. are most welcome. --Yair rand (talk) 05:49, 8 February 2016 (UTC)

Nice! But I could not make it work. I have the latest Firefox (44.0). When I click the console prints "navigator.mozGetUserMedia has been replaced by navigator.mediaDevices.getUserMedia". --Giorgi Eufshi (talk) 06:33, 8 February 2016 (UTC)
@Giorgi Eufshi: Okay, I just changed the script to use navigator.mediaDevices.getUserMedia when it's available. Maybe it will work now. If it still doesn't work, can you tell me what entry you're testing it with? --Yair rand (talk) 06:56, 8 February 2016 (UTC)
It seems I had problems with my mic. It works and it is great.--Dixtosa (talk) 17:28, 8 February 2016 (UTC)
I don't want to install Firefox just for this, but it sounds like a great start. Look forward to Chrome compatibility. Equinox 06:36, 8 February 2016 (UTC)
This sounds very helpful. Regarding accent codes: the codes for national accents are the country codes (from ISO 3166 but lowercased? or the ccTLD codes minus the dot? I can't tell which, if there is even a distinction). For example, the US pronunciation of "foo" is "En-us-foo", the UK is "En-uk-foo", New Zealand is "En-nz-foo" and Australia is "En-au-foo"; Germany's German and Austria's German are "De-de-" and "De-at-"; Portugal's Portuguese and Brazil's Portuguese are "Pt-pt-" and "Pt-br-". You could limit people to the nations where that language is natively spoken, if you like, although occasionally files get added from non-native speakers, e.g. an Italian pronunciation of Christmas tree is File:En-it-Christmas tree.oga. Subnational accents don't seem to be standardized; perhaps you could just omit support for them at first. There's File:En-us-ncalif-generictopleveldomain.ogg and File:En-us-ne-berry.ogg. Perhaps we should decide on standardized subnational codes... - -sche (discuss) 21:48, 8 February 2016 (UTC)
I must be missing something, but where should I be seeing the "Add audio pronunciation" button? I don't see it anywhere. Andrew Sheedy (talk) 04:16, 9 February 2016 (UTC)
You don't seem to have the script enabled yet. You can enable it by adding importScript( 'User:Yair rand/AddAudio.js' ); to your common.js. --Yair rand (talk) 05:55, 9 February 2016 (UTC)
Gotcha, thanks. The button doesn't seem to do anything at the moment, though I don't have a mic plugged in. Does it not do anything if it can't find a mic hooked up to my computer? (To be clear, I'm not expecting to be able to record without a mic... :P) Andrew Sheedy (talk) 07:14, 9 February 2016 (UTC)
I've just changed it to give an error message if there's no mic, instead of just silently failing to do anything. --Yair rand (talk) 08:02, 11 February 2016 (UTC)
  • Basic accent support is now working. --Yair rand (talk) 08:02, 11 February 2016 (UTC)
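The national-accent naming pattern described in this thread ("En-us-foo", "De-at-", "Pt-br-") can be sketched as below. This is only an illustration of the pattern, not how AddAudio.js builds its file names; the function name and the default "ogg" extension are assumptions.

```python
def audio_file_name(lang: str, accent: str, term: str, ext: str = "ogg") -> str:
    """Build a file name of the form 'En-us-foo.ogg'.

    lang   -- language code, e.g. 'en' (written with an initial capital)
    accent -- country code for the national accent, e.g. 'us' (lowercased)
    term   -- the entry title, kept exactly as given
    """
    return f"{lang.capitalize()}-{accent.lower()}-{term}.{ext}"


print(audio_file_name("en", "us", "foo"))                   # En-us-foo.ogg
print(audio_file_name("en", "it", "Christmas tree", "oga")) # En-it-Christmas tree.oga
```

Subnational accents are left out, since the thread notes that their codes are not standardized.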

WT:About German - "Obsolete spellings"

Is the current statement about "Obsolete Spellings" at WT:About German correct? The spelling "Alfabet" for example was deprecated in 1902 or fell out of use before that time, and was not re-introduced by the reform of 1996 etc. (duden.de for example doesn't know "Alfabet" and just has "Alphabet"). But, most likely because of reform spellings like "Delfin" and "-graf", the spelling "Alfabet" does re-appear nowadays (e.g. in books.google.de/books?id=fBET_JCIIz0C&pg=PA125 from 2009 and books.google.de/books?id=U8TSDt0h7rQC&pg=PA40 from 2012). The spelling might be rarer or non-standard, but it's not obsolete anymore. So the correct statement should be:

"Spellings which were deprecated by or before the Second Orthographic Conference of 1901, or which fell out of use before then, and which have not been reintroduced by a more recent reform or have not become un-obsolete otherwise, are to be labelled obsolete."

But that's pretty much the same as:

"Obsolete spellings are to be marked as obsolete."

Put in other words: The current statement has a 1996-etc.-reform POV. -84.161.1.135 08:01, 8 February 2016 (UTC)

Obsolescence is measured by commonness of usage. There seems to be consensus that spellings like Alfabet are so rare that they have to be considered idiosyncrasies. We are not a collector of every random spelling ever written down. (Though some users have put forth a different view in an RFD I recently brought up, cough, cough, cough.) Korn [kʰʊ̃ːæ̯̃n] (talk) 09:07, 8 February 2016 (UTC)
If Alfabet is more than marginal in contemporary German, we can call it a {{misspelling of}}. I too am opposed to collecting every random spelling ever written down, when it comes to living languages with enormous corpora, like modern German. —Aɴɢʀ (talk) 12:03, 8 February 2016 (UTC)
My intention in writing that section was to indicate when to use "superseded spelling of" and when to just say "obsolete". If something was deprecated in 1901/1902, or earlier in some states, someone might be tempted to label it "superseded", but it should just be labelled "obsolete". {{de-superseded spelling of}} handles this if someone inputs used=pre-1901 or an earlier date (though if someone inputs an unrecognised date, it displays "former"), but people could also use {{obsolete spelling of}} on such words instead. This could probably be made clearer. (This is, IMO, separate from the question of whether Alfabet is obsolete, a misspelling, etc.) - -sche (discuss) 21:05, 8 February 2016 (UTC)
Superseded spellings can still be used. Obsolete spellings aren't used. By definition, if they're being used they're not obsolete. Renard Migrant (talk) 21:11, 9 February 2016 (UTC)

Categorize by translation

I don't know if this was brought up before, but would it be a good idea to categorize the English pages by the translations that are there? For example, if the entry dictionary has French and Norse translations, then dictionary would be in Category:Words with French translation and Category:Words with Norse translation. Any thoughts? --kc_kennylau (talk) 12:58, 9 February 2016 (UTC)

I don't know. The idea has merit, but that's a lot of categories. I wonder what the limits are on the number of categories in an entry. If we implement this, water will have hundreds of categories- definitely more than have ever been included in an entry before. Chuck Entz (talk) 14:57, 9 February 2016 (UTC)
@Chuck Entz: I just took this idea from the French wiktionary. Also, fr:eau (French for water) does in fact have hundreds of categories. --kc_kennylau (talk) 17:17, 9 February 2016 (UTC)
@Kc kennylau: It only has 28. —Justin (koavf)TCM 23:02, 10 February 2016 (UTC)
@Koavf: The categories are hidden. --kc_kennylau (talk) 02:32, 11 February 2016 (UTC)
I am inclined to oppose this, because it would totally swamp the list of categories at the bottom of the entry, making it hard to find any other category I might be looking for, but I suppose there may not be many people who navigate to categories from entries rather than from categories to entries. (As DCDuring would say, do we have any data?) What would be the benefits? - -sche (discuss) 20:13, 9 February 2016 (UTC)
I was going to oppose it precisely because the French Wiktionary uses that system and it's pointless and godawful. Renard Migrant (talk) 21:09, 9 February 2016 (UTC)
@Renard Migrant: Thank you for your response. --kc_kennylau (talk) 02:32, 11 February 2016 (UTC)
@-sche: Thanks for your feedback. This problem can be solved by making the categories hidden. --kc_kennylau (talk) 02:32, 11 February 2016 (UTC)
But then they'd still swamp the page for someone like me who has hidden categories enabled precisely so that I can find or notice various maintenance categories. - -sche (discuss) 05:07, 11 February 2016 (UTC)
Exactly.--Giorgi Eufshi (talk) 05:58, 11 February 2016 (UTC)
Just a thought – maybe these categories could be added to talk pages, thus leaving the main pages without too many categories. — SMUconlaw (talk) 07:32, 11 February 2016 (UTC)
@Smuconlaw: Nice suggestion; however we would have to add a template to every talk page. --kc_kennylau (talk) 08:16, 11 February 2016 (UTC)
Yes, that would have to be done. It would solve the problem of too many categories on main pages, though. :) — SMUconlaw (talk) 08:25, 11 February 2016 (UTC)
What could one get out of these categories? —suzukaze (tc) 05:17, 11 February 2016 (UTC)
You could track additions to a category. Of course not all translations will be tracked but still has merits. --Giorgi Eufshi (talk) 05:58, 11 February 2016 (UTC)

Presenting the absolutely useless ancestor chain

Module:User:kc_kennylau/ancestor chain is basically an analysis of Module:languages/alldata, forming a chain with all the ancestors. A known problem is that languages with two ancestors are displayed twice. This module serves no other purpose than testing my programming skills. --kc_kennylau (talk) 17:15, 9 February 2016 (UTC)
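To make the idea concrete, here is a minimal sketch of building such a chain from ancestor data. It is not the module's code (the module is written in Lua); the toy data, the "ancestors" field name, and the function name are all assumptions made for this example.

```python
from collections import defaultdict

# Toy stand-in for the language data: code -> list of direct ancestors.
LANGUAGES = {
    "och": {"ancestors": []},
    "ltc": {"ancestors": ["och"]},
    "zh": {"ancestors": ["ltc"]},
    "syd-pro": {"ancestors": []},
    "yrk": {"ancestors": ["syd-pro"]},
}


def ancestor_tree(languages):
    """Return the chain as indented lines, one language code per line."""
    children = defaultdict(list)
    roots = []
    for code, info in languages.items():
        parents = info.get("ancestors", [])
        if not parents:
            roots.append(code)
        for parent in parents:
            children[parent].append(code)

    # An ancestor that has no data entry of its own still gets a branch.
    roots.extend(parent for parent in children if parent not in languages)

    lines = []

    def walk(code, depth):
        lines.append("  " * depth + code)
        for child in sorted(children.get(code, [])):
            walk(child, depth + 1)

    for root in sorted(roots):
        walk(root, 0)
    return lines


print("\n".join(ancestor_tree(LANGUAGES)))
```

As in the module, a language with two ancestors appears once under each of them. The sketch does not guard against cycles in the data.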

If anything, it makes the visualisation of which language codes are missing ancestor data easy. — Ungoliant (falai) 17:19, 9 February 2016 (UTC)
@Ungoliant MMDCCLXIV: Well, thanks. --kc_kennylau (talk) 18:22, 9 February 2016 (UTC)
Why is Yiddish listed as a descendant of Hebrew? —CodeCat 17:42, 9 February 2016 (UTC)
@CodeCat: Presumably because it's called Judaeo-German? --kc_kennylau (talk) 18:22, 9 February 2016 (UTC)
But it doesn't actually descend from Hebrew, it's Germanic at its core. —CodeCat 18:24, 9 February 2016 (UTC)
Because Yiddish inherited quite a bit of its phonology, vocabulary, and grammar from Hebrew. But if you want to dispute that, let's not do that in the Beer parlour. --WikiTiki89 18:27, 9 February 2016 (UTC)
Seconding CodeCat: that is not what "inherited" means in linguistics. If you end up having a discussion elsewhere, I'll be happy to expand on my reasoning. --Tropylium (talk)
I agree also. Yiddish is inherited from German, but has borrowed a lot from Hebrew. Benwing2 (talk) 19:38, 9 February 2016 (UTC)
I agree too. Yiddish didn't inherit anything from Hebrew, but it borrowed a lot from it. I've removed Hebrew as ancestor of Yiddish from the module. —Aɴɢʀ (talk) 20:25, 9 February 2016 (UTC)
Anyone interested in discussing Yiddish, I have sort-of started a discussion at Tropylium's talk page. But I think we should keep this tangent from expanding here. --WikiTiki89 21:13, 9 February 2016 (UTC)
Listing several languages that need ancestors would be easy. A small sample:
  1. Samoyedic (ancestor: syd-pro)
    • Kamassian xas
    • Mator mtm
    • Nganasan nio
    • Selkup sel
    • Tundra Nenets yrk
  2. Iranian (ancestor: ira-pro)
    • Bactrian xbc
    • Chorasmian xco
    • Khotanese kho
    • Median xme
    • Ormuri oru
    • Sogdian sgd
    • Wakhi wbl
    • Yaghnobi yai
(This does not even exhaust Iranian languages that still need ancestors, but I believe some Western ones will possibly need to have their descendants set to Old Persian, and at minimum in the case of Tajik (tg), to Middle Persian (pal); some other groups might need low-scale protolanguages eventually.) --Tropylium (talk) 19:06, 9 February 2016 (UTC)
@Tropylium: I'm not sure I know what you mean. Do you mean that I should write a module to show languages that need an ancestor? --kc_kennylau (talk) 19:44, 9 February 2016 (UTC)
I think he's saying that this module already does show languages that need an ancestor. --WikiTiki89 19:46, 9 February 2016 (UTC)
I'm currently sorting through the Iranian languages and I was wondering whether we should give Tati a code like ira-tat. Also, I would love if this ancestor tree became permanent. —JohnC5 20:34, 9 February 2016 (UTC)
Iranian languages really need to be organized more. Here is a detailed tree: File:Iranian Family Tree v2.0.png. --WikiTiki89 20:48, 9 February 2016 (UTC)
Also, Luri (ira-lur)? —JohnC5 20:51, 9 February 2016 (UTC)
@Wikitiki89, Tropylium: I'd really like a full proposal for how the languages should be sorted before I create any new language codes. For now, I'll just stick anything that is ambiguous under ira-pro for later sorting. —JohnC5 20:55, 9 February 2016 (UTC)
Theoretically, there should be a proto-language (or better yet, an attested ancestor language) at each branch point in the tree. --WikiTiki89 21:02, 9 February 2016 (UTC)
Real or etymology only languages? —JohnC5 21:17, 9 February 2016 (UTC)
If we can use etymology-only languages as ancestors, that would probably be the ideal solution. --WikiTiki89 21:19, 9 February 2016 (UTC)
I think we can. Wanna try? —JohnC5 21:20, 9 February 2016 (UTC)
I do want to try, but as I'm finding out, the Northeastern/Southeastern/Northwestern/Southwestern groupings are actually areal and not genetic. Therefore, no proto-language can be theorized for these groups. We need a more linguistically-accurate tree to work from. --WikiTiki89 21:55, 9 February 2016 (UTC)
My original point is that it's already evident from experience that numerous languages do not yet have ancestors added.
As for linguistic subgrouping, it is often a very contentious task, in particular in dialect continuum situations. I don't think any well-accepted tree of Iranian exists that would substantially group the Iranian languages together. (It's even been proposed that "Iranian" itself would be just an areal group of Indo-Iranian varieties.) --Tropylium (talk) 10:35, 11 February 2016 (UTC)
@JohnC5: We can try etymology-only languages with Proto-Canaanite. Its ancestor for now would be Proto-Semitic (sem-pro), and its descendants would be: Ammonite (sem-amm), Edomite (xdm), Hebrew (he), Moabite (obm), Phoenician (phn). --WikiTiki89 19:22, 11 February 2016 (UTC)
@Wikitiki89, Kc kennylau, CodeCat, Vahagn Petrosyan What do we think? Do we just start creating every intermediate language (Proto-Canaanite, Proto-South-Slavic, Proto-Osco-Umbrian) as full fledged reconstructed languages, as etymology-only languages (but allow them to act as ancestors), or add a new "reconstructed-stage" category of languages which may only act as categorizing ancestors? —JohnC5 19:30, 11 February 2016 (UTC)
I think they should be etymology-only languages, because after all, any such stage can also be used in etymologies. --WikiTiki89 19:34, 11 February 2016 (UTC)
It doesn't look like it currently works to me, but I think we could alter that. —JohnC5 19:35, 11 February 2016 (UTC)
Yeah, I've been thinking of proposing merging the etymology-only language data into the regular language data modules, but giving them a parameter such as "etymonly = true". That way we would need much less special handling. @CodeCat: What do you think of that? --WikiTiki89 19:40, 11 February 2016 (UTC)
Not a fan. That would allow every template to accept etymology languages too, even when we don't want to. An opt-in for each template is better than having to explicitly code an opt-out for many templates. Also I'm opposed to Proto-South-Slavic unless there is a consensus that it existed. And I don't like the idea of creating lots of proto-languages for branch points. It wouldn't be practical. We got rid of Proto-Finno-Ugric for just that reason. —CodeCat 20:03, 11 February 2016 (UTC)
I think we should allow existing etymology-only languages to act as ancestors (e.g. Byzantine Greek for Cappadocian Greek and Pontic Greek) but we should not create new ones when there is no consensus that it existed. --Vahag (talk) 20:13, 11 February 2016 (UTC)

──────────────────────────────────────────────────────────────────────────────────────────────────── Proto-South-Slavic was a bad example, but I do agree that having etymology only languages act as ancestors would be nice for categorization. Saying all the Romance languages are effectively the same seems a little silly to me. —JohnC5 20:30, 11 February 2016 (UTC)

We categorise languages by family, not by ancestor. —CodeCat 20:37, 11 February 2016 (UTC)
@CodeCat: Our decision to get rid of Proto-Finno-Ugric is essentially equivalent to a decision not to recognize it as a branch point. Every real genetic branch point theoretically has a proto-language. Regarding merging etymology language data with regular language data: Templates do not have direct access to the language data anyway. All access is mediated through Module:languages, which would be the only module that needs to know how to handle the etymology-only languages. --WikiTiki89 20:39, 11 February 2016 (UTC)
In other news, I created Proto-Na-Dene (xnd-pro) which I feel is fairly uncontroversial. —JohnC5 21:24, 11 February 2016 (UTC)
Creating proto-languages for generally accepted bottom-level families (so, this clause including neither the likes of Proto-Nilo-Saharan nor the likes of Proto-East Sudanic) should be uncontroversial, but beyond that it gets tricky. Consider Proto-Northwest Germanic, which is generally accepted but looks something like 90% identical to Proto-Germanic. Even if we didn't duplicate the PGmc appendices altogether, having to turn every "from Proto-Germanic" to "from Proto-Northwest Germanic, from Proto-Germanic" would be just pointless repetition in etymologies. Someone who knows about the difference will also already know that Gothic comes straight from PG, all other Germanic languages thru PNWG.
The only potential benefit I can immediately see to adding even an etymology-only PNWG would be pretty similar to what we're doing with Proto-Finno-Ugric: alerting readers that the word lacks both a Gothic and wider IE cognates, and hence cannot be necessarily assumed to have existed yet in Proto-Germanic proper. Even then, this could be better indicated in prose (cf. Proto-Uralic/-Finno-Ugric *käte). --Tropylium (talk) 18:25, 12 February 2016 (UTC)
I wasn't advocating changing instances of "from Proto-Germanic X" to "from Proto-Northwest Germanic X, from Proto-Germanic X". That's generally not what etymology languages are for anyway, they are never required to be used. They only exist for the occasional circumstances in which it might be beneficial to distinguish them from their "parent". In this case, I see it as just serving the purpose of organizing the tree (and only maybe the occasional use in an etymology section). Alternatively, maybe we could just integrate families into the tree so that they can be used for organization instead of extra proto-languages. --WikiTiki89 20:37, 12 February 2016 (UTC)
I've done all the Iranian languages I could find and the above-listed Samoyedic languages. —JohnC5 22:08, 9 February 2016 (UTC)
I've also done Italic. We really should make a policy about the creation of intermediate language stages for categorization, but not for actual lemmata, reconstructed or otherwise. —JohnC5 23:57, 9 February 2016 (UTC)
I think I found a bug. For some reason, "zh" is not showing up as the descendant of "ltc" as a descendant of "och", instead "ltc" appears twice, the second time in its own mini branch with just "zh". Also, it seems the Turkic languages need work. —CodeCat 00:09, 10 February 2016 (UTC)
I was thinking of looking at the Turkic languages, but was wondering whether to wait for a more permanent ancestry solution than just pointing them all at Proto-Turkic. Also, I would hope that if we could make this into a fully fledged template, it would contain the actual language names. —JohnC5 01:26, 10 February 2016 (UTC)
I've done what I can for Turkish. —JohnC5 03:55, 10 February 2016 (UTC)
@CodeCat, JohnC5, Wikitiki89: I think I found a bug. Mingo iro-min is listed as the descendant of iro-pro, which (iro-pro) is not in the database... --kc_kennylau (talk) 02:48, 11 February 2016 (UTC)
I've added Proto-Iroquoian since we'd need it eventually anyway. There still seems to be an error in the module that causes reduplication. —JohnC5 03:30, 11 February 2016 (UTC)
@JohnC5: For example? --kc_kennylau (talk) 07:40, 11 February 2016 (UTC)
Well, I guess I meant that Middle High German and Old Chinese just stop and then start again later as their own branches. Is that behavior intentional? —JohnC5 08:01, 11 February 2016 (UTC)
@JohnC5: Sorry, fixed. --kc_kennylau (talk) 08:26, 11 February 2016 (UTC)
@Kc kennylau I love the new layout. Is it possible to disconnect the top-level languages so it doesn't seem like they are related? Also, and I'm not sure this is possible given the recursive nature of the calls, but could you sort alphabetically by level? —JohnC5 17:03, 11 February 2016 (UTC)
@JohnC5: Done. --kc_kennylau (talk) 17:20, 11 February 2016 (UTC)
@Kc kennylau You're the best. Editing this is my favorite Wiktionary activity I've done in quite some time. —JohnC5 17:23, 11 February 2016 (UTC)
1. Why is Proto-Canaanite all on its own?
2. Can you add collapsible toggles to the branches?
3. Can we add this to {{langcatboiler}} (collapsed by default)? DTLHS (talk) 02:22, 12 February 2016 (UTC)
  1. We were hoping they would alter the code to allow etym-only languages (such as Proto-Canaanite) to act as ancestors. No movement yet on that front.
  2. I would love that also.
  3. Technically, this feature already exists in the category system, but I would also enjoy having this mapping. —JohnC5 02:32, 12 February 2016 (UTC)
@Kc kennylau Could you make it so the etym-only languages either don't link or link without the trailing "language"? —JohnC5 05:48, 12 February 2016 (UTC)
Also, can we change etym-only languages so that aliases don't show up multiple times? —JohnC5 05:51, 12 February 2016 (UTC)
@JohnC5: Thank you very much for your continuous feedback. However, I would appreciate it if you could provide some examples next time. It took me some time to figure out what you were talking about. --kc_kennylau (talk) 10:43, 12 February 2016 (UTC)
@DTLHS: Could you make an example on WT:SB so that I may know how to make collapsible branches? --kc_kennylau (talk) 11:04, 12 February 2016 (UTC)
@Kc kennylau, thanks so much for all you've done so far. I apologize for continuously failing to provide examples. It seems like your edit to fix the aliasing problem for etym-only languages has broken the end-of-branch drawing behavior. The example here is Vulgar Latin (V.L.) has a branch continuation symbol as opposed to branch ending symbol. —JohnC5 16:22, 12 February 2016 (UTC)
@JohnC5: Thanks, fixed. --kc_kennylau (talk) 16:35, 12 February 2016 (UTC)
What does it mean to add this to {{langcatboiler}}? --kc_kennylau (talk) 15:48, 12 February 2016 (UTC)

Placement of "Descendants"

Some questions:

  1. Should "Descendants" be the last section within the language entry? Should "Descendants" always be L3?
  2. Alternatively, should "Descendants" be the last subsection within the POS section? Should "Descendants" always be L4? (entries with multiple etymologies/pronunciations notwithstanding)
  3. Maybe there are other alternatives?

Note:

  • Arguably, "Descendants" is the information that is most "distant" from the definitions, since it is not about the current entry in the current language, so it could well be placed at the end of the entry, after all sections, including "Translations".

Note:

--Daniel Carrero (talk) 22:07, 9 February 2016 (UTC)

  • If an entry had only a single etymology, it seems obvious to me that it would appear at L3 at the end of the L2 section, above some trivial content (e.g., Statistics, Anagrams) and possibly above References and even External links. The situation with multiple etymologies would seem to be about the same, with it appearing at L3 at the bottom of the applicable etymology section. There could be issues in cases where semantically related terms of different PoS are presented with distinct etymologies for the PoSes. But I would think we could refer the reader to one Descendants section, rather than duplicate it or somehow alter the entry layout to accommodate this uncommon, minor lack of structural clarity. DCDuring TALK 22:44, 9 February 2016 (UTC)
    The vast majority of entries that have a descendants section put it at level 4. I see no reason to change that. —CodeCat 23:58, 9 February 2016 (UTC)
I think it makes sense for Descendants to be L4 like (and probably near) Derived terms. Descendants are usually POS-specific, e.g. Tok Pisin rais is from the English noun rice and not the verb rice; I expect there are cases where a language like German or French borrowed both a noun and a verb from English in different forms (applying a verb suffix to the verb). - -sche (discuss) 00:42, 10 February 2016 (UTC)

Proposal: Expanding WT:CFI - list of terms

Proposal: Expanding WT:CFI#Terms with more terms. Added items are underlined. Did I forget any? (Bonus: I'm formatting all examples using {{m}}).

Note: This expansion was taken from Wiktionary:Criteria for inclusion/Editable, which was proposed to be deleted.

Terms

A term need not be limited to a single word in the usual sense. Any of these are also acceptable:

--Daniel Carrero (talk) 03:50, 11 February 2016 (UTC)

Chemical formulae are pretty controversial. As far as I know, we haven't really decided how to deal with them. H₂O and CO₂ are clear keeps, some are more controversial (see Talk:LiBr) and some not even I would support keeping (C6H12O6 is probably citable). Smurrayinchester (talk) 10:52, 11 February 2016 (UTC)
I have two concerns: first, a sign in a sign language is a word in the usual sense—why does it get a separate bullet point?
Second, my understanding is that the phrase "ideographic writing" is inaccurate as a description of Chinese (see Unicode's FAQ on the subject and Wikipedia's article), so maybe that should be rephrased. We could combine that bullet point with the "Letters, numerals, and symbols" bullet point lower down. —Mr. Granger (talkcontribs) 13:30, 11 February 2016 (UTC)
Don't we exclude language codes like fro? DCDuring TALK 17:11, 11 February 2016 (UTC)
If there are any other suggestions, I'll make a new version of the text using them, later.
@Mr. Granger, about your first concern, I suppose you're right. I have a proposal: Maybe we should remove the whole introduction "A term need not be limited to a single word in the usual sense." and add a first bullet point concerning what items are acceptable:
  • Words in the usual sense, including signs in a sign language.
About your second concern, thanks for the link. I had already read that specific FAQ in the past but I failed to remember that "ideographic writing" is not 100% accurate concerning Chinese characters. But what would be accurate about them? Both "字" and "ʃ" are used as examples of "ideographic" and "phonographic" writings, so it seems to serve as a catch-all way to say "we include most symbols here". (Can we say that 1 is ideographic writing and A is phonetic writing?) Would it be a good idea to use this:
  • Chinese characters such as .
The problem I see with the idea of mentioning "Chinese characters" directly would be the slippery slope: why not say "Chinese, Korean, Armenian, Arabic [...] characters"?
@DCDuring, I suppose language codes are OK for the minority that are attestable? I was able to attest 2 script codes in the past and created entries for them, after this discussion: Citations:Latn and Citations:Cyrl. --Daniel Carrero (talk) 19:38, 11 February 2016 (UTC)
@Smurrayinchester, we could say "some chemical formulae". --Daniel Carrero (talk) 19:42, 11 February 2016 (UTC)
Same applies to "compounds and multiple-word terms": not all are acceptable. Equinox 20:56, 11 February 2016 (UTC)
Regarding "ideographic": we could refer to as a Han character or maybe just a character, if we want to include it. But if we want an example to demonstrate the range of symbols that we include, I think Chinese characters are not a good choice—they're not so different from English letters, just in a different type of writing system. Wiktionary does have entries for things that I think would be accurately described as ideographic, such as ಠ_ಠ, , ;), and <3. Maybe one of those could be an example. —Mr. Granger (talkcontribs) 21:30, 11 February 2016 (UTC)
  • It's a start. Needs to go farther, but it's a start. Purplebackpack89 01:41, 12 February 2016 (UTC)

CFI terms - revision 2

@Smurrayinchester, Mr. Granger, DCDuring, Equinox, Purplebackpack89:

I have revised the text based on all the ideas you gave. If I forgot to add something mentioned above, or if you think of something new, let me know. (I removed "fro" because it currently does not exist as a Translingual word.)

Also, I expanded the line about prefixes/suffixes.

Terms

These items are acceptable to be included as dictionary entries:

--Daniel Carrero (talk) 12:40, 12 February 2016 (UTC)

Finnish inflected nouns labelled as noun (forms) and adverbs - consistency? policy?

Many of these are labelled both as noun forms with the title Noun and as Adverbs. You can see them by intersecting Category:Finnish_adverbs and Category:Finnish_noun_forms or by searching for ssa or lla (for example) with your browser on the adverbs page. A specific example is here: https://en.wiktionary.org/wiki/humalassa

Both might be considered correct. Certainly the template used, "noun form", is correct; however, on the page it says "Noun", not noun form. This is kind of fair enough, though, since in the body of the definitions they all say they're inflected forms. Adverb makes sense since they are equivalent to an adpositional phrase in English, and this is how Kotus classifies them also. The general trend seems to be that the "Noun" heading contains, e.g., Inessive plural form of Foobar, and the Adverb heading definition contains an actual definition and possibly a usage example.

My main problem with this setup is that it seems inconsistent and it implies they have a different etymology or are a different part of speech. For most of the entries I have seen, it's quite hard to argue that the "Adverb" usage is different from the "Noun Form" usage (I will admit it is possible though - inflection can behave more like derivation). My preference would be that one is decided on and the entries within the page are merged (I'm currently leaning towards Noun form, simply because there are more). A more radical approach would be to remove the pages altogether (or replace them with redirects?) and merge the usage examples and extra definitions into the lexeme. My main reasoning that this would be better is that people are more likely to look at the lexeme's page, and if the usage examples were merged in, there would be more information where people are actually looking, rather than having it fragmented across different inflections.

Or am I barking up the wrong tree? Is there already some sort of a policy for this stuff that I'm ignorant of?

--Megajuice (talk) 20:19, 11 February 2016 (UTC)

Minor edit: nonexhaustive gender list in EL

As proposed here by @Droigheann and later here by @I'm so meta even this acronym I would like to edit one item of WT:EL#Translations, to note that the list of genders given is nonexhaustive. I'm also pretty sure we have POSes with genders in translation tables, other than nouns.

(Let's make this change without a formal vote if possible; it is not a substantive change.)

Current text:

  • Provide the grammatical gender of the translations of nouns, if appropriate, giving the parameters m, f, n and c for “masculine”, “feminine”, “neuter” and “common” respectively to {{t}}.

Possible text (I'm open to other suggestions):

  • Provide the grammatical gender of the translations, if appropriate, giving parameters such as m, f, n and c for “masculine”, “feminine”, “neuter” and “common” respectively to {{t}}.

--Daniel Carrero (talk) 12:16, 12 February 2016 (UTC)

Support — I.S.M.E.T.A. 15:25, 12 February 2016 (UTC)
Support Equinox 15:29, 12 February 2016 (UTC)
Abstain This change only means accepting gender for other POS than nouns, it says nothing about the aspect of verbs or, more importantly, plurals. I remain convinced that if the translation is a plurale tantum, this should be marked in the translation table, especially if the English word isn't, such as "a watch" vs hodinky f pl. --Droigheann (talk) 17:14, 12 February 2016 (UTC)
Oppose. The addition of "such as" is good, but not the removal of nouns. Adjectives shouldn't have genders, since they have no inherent gender. —CodeCat 17:18, 12 February 2016 (UTC)
The gender with an adjective informs the reader that the translation only applies to a specific (lemma) gender and, unlike in English, other forms for the other genders are to be found at the FL entry.
Incidentally, what other genders than m, f, n & c are there? Wikipedia suggests animate/inanimate but I've never seen that distinction made in a translation table. --Droigheann (talk) 17:38, 12 February 2016 (UTC)
But the same can be applied to things like cases too. Do we need to indicate that a specific form only applies for subjects (nominative case)? I hope not. We shouldn't have to educate our users on the grammar of languages in every translation table. If they know how adjectives work in a language (which they should, if they are going to use the word with any accuracy) then they know that translation tables only give the lemma form, and they know that it may need to change in gender, case or whatever to match usage. —CodeCat 17:41, 12 February 2016 (UTC)
I have to concede that's a good point (and probably applies not only to adjectives, but to pronouns and numerals as well). I know some nouns which don't have an inherent gender, like dítě ‎(child), which is neuter, but whose plural děti ‎(children) is feminine, but I wouldn't put that into the translation table either. --Droigheann (talk) 06:18, 13 February 2016 (UTC)
Oppose - I concur exactly with CodeCat as above. SemperBlotto (talk) 17:22, 12 February 2016 (UTC)

"NORM: 10 proposals" ends in 2 days

Please see Wiktionary:Votes/pl-2015-11/NORM: 10 proposals for the last time, and consider using these final moments to cast your votes if you haven't done so yet. The vote ends in 2 days. Thanks.

The vote contains lots of ideas together, so the page is larger than usual. The vote has run for 3 months (it started on November 15). --Daniel Carrero (talk) 14:47, 12 February 2016 (UTC)

Tying ancestors to families by default

Right now there's a major undertaking to add ancestor information to lots of languages, which is good. But I've noticed that in many cases, the ancestor duplicates the language family. The ancestor of family X is proto-X, so specifying both may be a bit redundant. Therefore, I'd like to propose the following change to how ancestors are worked out in Module:languages:

  • If the ancestors = value is present, use that, as is done now.
  • If it's not present, then look at the family of the language, and see if a language exists with -pro added to the family code. If so, then use that.
  • If that doesn't work either, then progressively go up the family tree, doing the above check until a language is found.

So, for Old English, it would first look at the ancestors = value. Let's say it isn't present. So then it looks at the family code specified for Old English, which is gmw (West Germanic). It then checks if a language with code gmw-pro exists. It does not. So it then looks at the parent family of gmw, which is gem (Germanic). A language gem-pro does exist, so that is returned as the ancestor of Old English.
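The lookup described above can be sketched roughly as follows. This is only a minimal Python illustration, not the actual Lua in Module:languages: the LANGUAGES and FAMILIES tables, the find_ancestors name, and the guard that stops a proto-language from being returned as its own ancestor are all assumptions made for the example.

```python
from typing import Dict, List, Optional

# Hypothetical, heavily trimmed data tables (assumed layout, not the real module data).
LANGUAGES: Dict[str, dict] = {
    "ang": {"family": "gmw"},                         # Old English, no explicit ancestors
    "enm": {"family": "gmw", "ancestors": ["ang"]},   # Middle English, explicit ancestor
    "gem-pro": {"family": "gem"},                     # Proto-Germanic
    "ine-pro": {"family": "ine"},                     # Proto-Indo-European
}
FAMILIES: Dict[str, dict] = {
    "gmw": {"parent": "gem"},   # West Germanic (no gmw-pro in LANGUAGES)
    "gem": {"parent": "ine"},   # Germanic
    "ine": {"parent": None},    # Indo-European
}


def find_ancestors(code: str) -> List[str]:
    """Return the ancestors of a language, falling back to the family's proto-language."""
    lang = LANGUAGES[code]

    # Step 1: an explicit ancestors value always wins.
    explicit = lang.get("ancestors")
    if explicit:
        return list(explicit)

    # Steps 2 and 3: walk up the family tree looking for "<family>-pro".
    family: Optional[str] = lang.get("family")
    seen = set()  # protects against accidental cycles in the family data
    while family is not None and family not in seen:
        seen.add(family)
        candidate = family + "-pro"
        # Assumption: a proto-language must not be reported as its own ancestor.
        if candidate in LANGUAGES and candidate != code:
            return [candidate]
        family = FAMILIES.get(family, {}).get("parent")

    # Nothing found anywhere up the tree.
    return []
```

With this sample data, find_ancestors("ang") skips the missing gmw-pro and returns gem-pro, matching the Old English walk-through, while find_ancestors("enm") returns the explicitly listed ang.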

By using this method, we can get away with not specifying the ancestor of many languages, if that ancestor is simply the proto-language of the family. For Middle English and modern English we'd still need to specify it. But it would lessen the workload tremendously. It would probably be necessary to add a way to override the proto-language of a family. For the Romance languages, we'd want to use Latin as the ancestor, rather than the default roa-pro which does not exist. So in the family data for Romance, we might specify protolang = "la" to override the default. —CodeCat 17:01, 12 February 2016 (UTC)

I've implemented proto-languages for families, since I didn't think that part would be controversial. The proto-language of a family is now displayed in the info table on the family's category page, see Category:Germanic languages or Category:Romance languages for example. Note that it is not necessary for every family to have a proto-language. For example, Category:West Germanic languages doesn't have one. —CodeCat 19:26, 12 February 2016 (UTC)
Support. - -sche (discuss) 20:46, 12 February 2016 (UTC)
Support. —JohnC5 22:01, 12 February 2016 (UTC)