Wiktionary:Beer parlour

Welcome, all, to the Beer Parlour! This is the place where many a historic decision has been made and where important discussions are being held daily. If you have a question about fundamental Wiktionary aspects—that is, about policies, proposals and other community-wide features—please place it at the bottom of the list (click on Start a new discussion), and it will be considered. Please keep in mind the rules of discussion: remain civil, don't make personal attacks, don't change other people's posts, and sign your comments with four tildes (~~~~), which produces your name with timestamp. Also keep in mind the purpose of this page. There are various other discussion rooms which may serve the idea behind your questions better. Please take a look to see which is most appropriate.

Sometimes discussion identifies an issue as an idea for policy development or rewriting. Such discussions may be taken out of the Beer parlour to a relevant page, or a brand new page may be created. Usually, the active policy pages will be listed in one of the sections below. See also the policy development page and the votes page.

Questions and answers will not remain on this page indefinitely, as it would very soon become too long to be editable. After a period of time with no further activity (usually a couple of weeks), information will be moved to the archives. We make a point to preserve all discussions that were started here in the archives. However, talk that is clearly not intended for this page may be moved and will not end up in the archives. Enjoy the Beer parlour!

October 2017

October LexiSession: punishment

The Punishment of Loki.

The monthly suggested collective theme is punishment. Not so funny, but the 10th of October is the World Day Against the Death Penalty so we may look at the alternatives and do better descriptions around this theme.

LexiSession is a collaborative experiment without any guide or direction. You're free to participate however you like and to suggest next month's topic. If you participate, please let us know here or on Meta, so that we can keep track of the evolution of LexiSession. In one year, 35+ people have participated! I hope some people will be interested this month, and if you can spread the word to another Wiktionary, you are welcome to do so. Ideally, LexiSession should be a booster for every project at the same time, to give us more insight into the ways our colleagues work in the other projects.

See you soon! Noé 09:28, 1 October 2017 (UTC)

slow slicing and poena cullei are my requests for this month (entries for these, not as punishment...although, they could be useful detractors for trolls...) --P5Nd2 (talk) 09:39, 8 October 2017 (UTC)
Thank you for your participation! Noé 08:58, 2 November 2017 (UTC)


Adding translations to too many unrelated languages. No idea where they get transliterations for Chinese dialects, such as Jin, Gan, Xiang, etc. --Anatoli T. (обсудить/вклад) 10:51, 1 October 2017 (UTC)

How the heck did they find Prakrit translations? They must be going through the entries we have already. —Aryaman (मुझसे बात करो) 20:44, 1 October 2017 (UTC)

French Wiktionary September news

Hey! The September issue of Wiktionary Actualités has just come out in English!

In this issue: Comments about press articles, our information desk is not like yours, a description of a dictionary of short-text signs, a comment on the expression of gender in an Andean language, some cool videos about words (in French and English!), announcements for the Wikiconférence francophone in October and plenty of statistics with fancy fleurons surrounding it all!

As usual, it was translated into English by non-native speakers in less than a day. It is not perfect, but readers can improve it (wiki-spirit). We did not receive any money for this publication and we are not supported by any user group or chapter. It is written only by the community, for the large community of lexicolovers! I hope you did not feel harassed by this notice. Noé 21:41, 1 October 2017 (UTC)

PulauKakatua19 (talk • contribs) again

This user is adding spurious Hittite entries, with some really bad/outdated etymologies. No references either. They have been warned many times for Russian, Hindi, and Rohingya edits. I suggest a week-long block. —Aryaman (मुझसे बात करो) 14:51, 3 October 2017 (UTC)

Etymological information for strong verb non-lemma forms

There are many forms, e.g. in English, where arbitrary, irregular, strong forms of verbs deserve their own etymology. Many of these individual forms have received particular attention from linguists over the decades, e.g. did (past tense of do, from a unique, non-past reduplicated root form of the ancestor of do dating back to Pre-Germanic, for unknown reasons), sang (past tense of sing derived directly from a Proto-Indo-European form of the ancestor of sing). These non-lemma forms have their own independent etymological lineage that can be traced back thousands of years.

A certain administrator (Rua) has informed me that it is policy on Wiktionary to minimize etymological information on non-lemma forms, and instead place such information in the lemma form's etymology section. This can be understood for weak forms like walked, but those forms need little explanation because they are formed regularly, and for the forms that do require extra explanation, it makes for unsightly etymology pages on the lemma form's etymology sections (see Proto-Germanic *dōną's etymology section for the current policy specification; it doesn't even specify the past form *dedǭ, referring to it only as "the past form").

I understand the concern to avoid etymology fragmentation, but in this case, the etymology itself is fragmented and the two forms are remembered as separate, arbitrary, irregular forms. Perhaps there is a solution to maintain the same etymology information in multiple pages, but I think the simplest solution would be to provide etymological information for such forms on their own pages. There is really no reason to avoid this practice and it only makes things more confusing. I am surprised that this is against current policy. Do you agree with this assessment? 16:17, 3 October 2017 (UTC)

Strongly oppose putting etymologies on every inflected form, irregular or not. —Rua (mew) 16:35, 3 October 2017 (UTC)
  • Out of curiosity -- where should such etymological information go? Some simple-present verb forms include etymological information for irregular conjugated forms, such as at [[go#Etymology 1]]. Others do not, such as at [[do]], which includes no explanation for the formation of [[did]]. ‑‑ Eiríkr Útlendi │Tala við mig 17:29, 3 October 2017 (UTC)
    • At the lemma entry, where we currently already place them. The IP is arguing that we should put etymologies on nonlemma entries too, which is going to lead to a huge duplicative mess. —Rua (mew) 17:47, 3 October 2017 (UTC)
I am proposing a move of the notable etymologies from the lemma to the non-lemma forms, if they are notable as in strong verbs. There is no duplication going on, only a move, as I indicated in the OP. 17:58, 3 October 2017 (UTC)
Then I still oppose it; the etymological information should be centralised on the lemma form. That's how all etymological dictionaries work, that's how we've worked so far too. Our users are accustomed to following the link to the lemma for information, which is the purpose of non-lemma entries in the first place. They're there to help get users to the right place, nothing more. We should not scatter our information across various non-lemmas. —Rua (mew) 18:34, 3 October 2017 (UTC)
Traditional etymological dictionaries are constrained by the space in a book and give priority to lemma forms because they are the most popular. There is no real reason to ignore non-lemma forms or centralize their etymologies because Wiktionary doesn't have a size constraint, especially for adding reasonable information. I disagree that the only purpose of non-lemma forms is to provide a link to a lemma form; many non-lemma forms have lineages in their own right and there is no reason to marginalize them. Furthermore, users are not accustomed to follow the link to the lemma forms as you suggest; precedents for separate etymologies for non-lemma forms like done, is, are, am, etc. already exist and have existed for a long time. 18:37, 3 October 2017 (UTC)
Size constraint isn't the issue. It's keeping our information organised so that information can be found easily. And what I said is the agreed-upon purpose of non-lemma forms. It's why we don't include things such as derived terms, descendants or inflection tables on non-lemmas. Wiktionary is fundamentally lemma-oriented (or lexeme-oriented) rather than word-oriented. If we were word-oriented, we'd also include full definitions on non-lemmas, but thankfully we've been wise enough to not follow that idea. —Rua (mew) 18:42, 3 October 2017 (UTC)
Semantically and synchronically, what you're saying is correct; non-lemma forms don't require separate definitions. The placeholders used now are adequate. Etymologically and diachronically, it's incorrect. Irregular non-lemma forms are entirely independent of their lemma forms. Wiktionary is a semantically lemma-based dictionary, but that's completely unrelated to etymology. There is good reason for irregular non-lemmas to provide etymologies, and the semantic value of the terms has no bearing on it. 18:46, 3 October 2017 (UTC)
Ok, but as you must understand by now, that's not how Wiktionary works. The etymologies for the individual parts are noted on the lemma. You'll simply have to adapt to this practice. We're not going to change it just because some random user doesn't like it. —Rua (mew) 18:48, 3 October 2017 (UTC)
You are not arguing against my point, you are arguing your point because "that's the way it's always been done" and based on ad hominem because I'm "some random user". 18:54, 3 October 2017 (UTC)
  • I am not proposing putting etymologies on "every inflected form", only on the arbitrary forms with their own separate, traceable etymologies, if only to indicate their significance. The regular forms don't require etymologies because they are predictable. E.g. the etymology for the strong non-lemma form of sing, which is sang:
From Old English sang, from Proto-Germanic *sang, from Proto-Indo-European *songʷh-, o-grade past tense of *sengʷh- (sing, make an incantation).
Right now, the article sang doesn't indicate any of this lineage at all. As a strong and unpredictable form, lexically, sang is just as prominent as its form which is arbitrarily deemed the lemma form, sing, which independently derives from a different PIE form. There is no reason to treat it as a secondary form etymologically, at least in this case. 17:36, 3 October 2017 (UTC)
That's not any better. Consider how many times we'd have to duplicate the etymology for all 12 of the past tense forms of vera or syngja. The lemma is a natural place for etymologies, since it's a single central entry that covers all inflected forms. —Rua (mew) 17:47, 3 October 2017 (UTC)
That's not duplication, that's providing the very separate etymologies for very separate forms. If the forms merge at a certain point, then a link can be provided to the form from which they split off to avoid etymology duplication, as is done with borrowed terms. The words is, are, and were, for example, are all forms of be, but does that mean these forms should not provide their own etymologies? Are the etymologies of these forms of less interest and notability than any other term? They are not. 17:56, 3 October 2017 (UTC)
  • I could be wrong, but I don't think Rua is arguing that the etymologies of conjugated forms are not worthy of inclusion. I believe that she is instead arguing that the etymologies of conjugated forms should go within the etymology of the lemma form, and that the conjugated-form entries should be minimal.
The issue at hand is not whether to include or exclude certain information -- rather, it is about where to include that information. ‑‑ Eiríkr Útlendi │Tala við mig 18:09, 3 October 2017 (UTC)
Right. I am proposing that it should go in the non-lemma form. In fact, what I'm proposing is already standard practice for many notable forms, e.g. done. I want this to be consistent. For verbs like be, it would clutter its etymology section to list all of the etymologies for all of its many suppletive forms is, are, were, was, am, etc. One interested in these etymologies can follow the links to these forms' pages (which are provided in the term head) and view the etymology. More importantly, someone who specifically searches for irregular forms should have immediate access to their etymologies on the same page.
When I go to the page for am (which actually already follows the format that I'm proposing; I don't think anyone would want to move its etymology to the be page), I want to know the etymology of the term. When dealing with etymology, I don't really care if it's a form of any other (in this case, a completely unrelated lemma form), I want immediate access to its own unrelated and notable etymology. I believe this seems fairly reasonable and already has precedent. 18:20, 3 October 2017 (UTC)
The lemma entry is a central place for the term and all of its inflections. Information about am concerns the lemma be, so it should go there. The individual parts of verb paradigms may have separate origins but they don't have separate etymologies because they are inherited as a whole. The verb be in modern English is the same paradigm as the verb been in Middle English. —Rua (mew) 18:38, 3 October 2017 (UTC)
That's incorrect. Separate forms of a verb are not "inherited as a whole". That doesn't even make any sense. Irregular forms all require individual memorization and passing down. The lineage of was, for example, is entirely separate from be, as are both from am. If what you were saying was true, all Indo-European languages would still preserve the verb paradigms of Proto-Indo-European. They do not. They mix, they match, they innovate, they supply. 18:41, 3 October 2017 (UTC)
But they still form a single verbal paradigm. A question like "what is the past tense of be?" has an answer precisely because paradigms exist. We have chosen to use a single form to stand in for the entire paradigm, the lemma form, for convenience. That's where etymologies also go. —Rua (mew) 18:46, 3 October 2017 (UTC)
Etymologically, verbal paradigms don't matter for irregular forms. We have chosen the lemma form to stand in for the non-lemma forms semantically, but we have not done so etymologically, because that makes no sense. 18:48, 3 October 2017 (UTC)
We've chosen to do both. I'm sorry if that makes no sense to you, but it is what it is. —Rua (mew) 18:49, 3 October 2017 (UTC)
Please point me to the specific passage in Wiktionary policy that says this. I will propose the change through the proper channels. 18:53, 3 October 2017 (UTC)
I found it myself, and lo and behold, you seem to be the one who added this into the "common guidelines" page in the first place. While I agree with most of your additions, exclusivity of etymology to the lemma page is one that does not make any sense. 19:06, 3 October 2017 (UTC)
I am in favor of continuing to not split etymologies, on the grounds of workability: an editor who is interested in adding this type of information should be able to see at a glance if it has been done already, without checking each relevant non-lemma entry separately. On the other hand, I don't see a problem in directing users from non-lemma forms to the lemma, in cases where they need a separate discussion.
Actual suppletion seems like a different case, though. is, are and be have completely unrelated etymologies, and continuing to maintain separate etymology sections for them seems like a good idea (but I'd again be in favor of pointing users from the lemma form to the other entries for further reading). --Tropylium (talk) 20:49, 3 October 2017 (UTC)
I don't want to split etymologies, except like you said, for terms with suppletive forms and terms with strong forms. For example, the etymology of "did" takes a separate lineage all the way back to Proto-Indo-European that's completely independent from "do"; despite not being a suppletive form, it's a strong form. I don't want to split etymologies for verbs like "walked", only for verbs like "did" and "is/am/are" and "brought". One page should not contain etymologies for different terms if the etymologies are not currently regularly formed. So this would only be an exception that would affect a relative minority of pages. Wouldn't you agree with this? 21:47, 3 October 2017 (UTC)
English has relatively few inflected forms, but it can get pretty complicated when you have forms inflected for gender, number, case, etc. Even in English, am, is and are all go back to inflected forms of the same Proto-Indo-European root. As for strong verbs, I don't think differences in ablaut grade are enough to justify maintaining separate etymologies. We have a recognized system of lemmas and non-lemmas, but I'm not sure how you could decide which form to make the "etymology lemma" for forms sharing an etymology. Chuck Entz (talk) 02:15, 4 October 2017 (UTC)
I have trouble with the vagueness of "strong forms". This is well-defined only for Germanic languages, not a generally applicable concept. Likewise, "having a separate lineage" holds for a lot of things, for starters all irregular forms in general. We have a separate etymology for mice; should we also have separate etymologies for taught or bent?
I think the default assumption should be that, if not otherwise specified, it is not merely the lemma but all applicable inflected forms that descend from a given ancestor. If we give mūs as the ancestor of mouse, then this should already imply that the former's plural mȳs is the ancestor of the latter's plural mice. This gets rid of having to treat any irregularities that represent fossilized original regular alternations, no matter how far back they go. We are working on etymology sections here after all, not on historical morphology or historical phonology.
To be fair, without morphological and phonological supplementary information, etymology often becomes fairly opaque just-take-my-word-for-it business, and I do think Wiktionary could benefit from detailing these somewhere; I just do not think etymology sections are the place for this. --Tropylium (talk) 10:26, 4 October 2017 (UTC)
Having mūs as the ancestor of mouse does not immediately imply that mice derives from mȳs, or make it clear to the viewer. There is no duplication of information going on when etymology is given for mice, only clarification and necessary etymology. Apparently, someone rightly found that etymology should be specified for this non-lemma form, since an etymology section for mice already exists. Anyhow, I think this is being blown out of proportion. I would only ask for the option of specifying non-lemma etymologies where they are notable, as has already long been done with the article of am. Rua would delete all these etymology sections (despite am being an oft-cited non-lemma form for the purposes of reconstruction). When I make an etymology section on brought and did to explain their opaque etymologies, I don't want my edits nonsensically moved and crowded under the etymology pages of bring and do (or more often than not, simply deleted). These sorts of power trips by administrators not following the spirit of the guidelines (that they themselves wrote!) just make me incredibly discouraged from adding information to this website. 18:04, 4 October 2017 (UTC)
@Tropylium How would you handle the suppletion of the potential of olla, the perfect of sum, or in być? Putting etymologies on each of the forms is not going to be feasible. —Rua (mew) 23:36, 3 October 2017 (UTC)
There's only a limited amount of suppletion for any given case; we could assign an "etymological lemma" for each nonsuppletive group (e.g. lienee for the Finnish potential stem). --Tropylium (talk)
Ew. —Rua (mew) 11:04, 4 October 2017 (UTC)
Seconded Rua. Anti-Gamz Dust (There's Hillcrest!) 00:34, 16 October 2017 (UTC)


Hullo. I'd like to make a request for the rollbacking or the patrolling tool. Where is it at? --Barytonesis (talk) 08:09, 5 October 2017 (UTC)

@Barytonesis: An admin has to nominate you at WT:Whitelist I think (or is that only for auto patrol)? —Aryaman (मुझसे बात करो) 17:01, 5 October 2017 (UTC)
I think that rollback/patrol most often is applied to people who, for one reason or another, do not want to be administrators. Just apply to be an admin if you want some subset of the tools. - TheDaveRoss 17:04, 5 October 2017 (UTC)
@TheDaveRoss: I'd like to, but I don't think I've gathered enough trust yet. Would you endorse me? --Barytonesis (talk) 16:42, 14 October 2017 (UTC)

A more personal form of Google Translate just for Faroese

https://www.faroeislandstranslate.com/#!/ --Justin (koavf) TCM 08:01, 6 October 2017 (UTC)

Entries with deprecated labels

The label (ordinal) used for ordinal numbers is listed in Category:Entries with deprecated labels with no suggested replacement. Should it even be listed there? DonnanZ (talk) 13:21, 6 October 2017 (UTC)

There is no replacement. There should not be a label there at all, add the category with {{head}} or {{cln}} instead. —Rua (mew) 13:27, 6 October 2017 (UTC)
The label automatically generates the category though, as well as saying what it is, so I don't see any reason to change it, e.g. nittende. Besides that, there is no suggestion to use {{head}} or {{cln}} in the above-mentioned category. DonnanZ (talk) 13:45, 6 October 2017 (UTC)
It's a misuse of labels, that's why it's deprecated. "Ordinal" doesn't specify a context in which a term is used. —Rua (mew) 13:57, 6 October 2017 (UTC)
Whoever set up the label didn't take that into account. It surely would be a simple matter to change the label to "ordinal number", although loads of entries would have to be revised. "cln|nb|ordinal numbers" works for generating the category, but a qualifier would then have to be added, which is twice as much writing, and a step backwards. DonnanZ (talk) 14:11, 6 October 2017 (UTC)
The other label (cardinal) when moused over shows "cardinal number", but this doesn't happen with (ordinal). It is not deprecated. DonnanZ (talk) 14:47, 6 October 2017 (UTC)
"ordinal number" is also not a valid context. Context labels should not be used to give definitions or disambiguate them. They are meant to describe how something is used, not what it means. —Rua (mew) 15:20, 6 October 2017 (UTC)
Have you checked ordinal number? Also see here. Nineteenth is an ordinal number. DonnanZ (talk) 15:35, 6 October 2017 (UTC)
Where are you getting the idea that I'm denying that these are ordinal numbers? I only said that a context label is not how this fact should be indicated. The entry should be categorised with {{cln}} or the cat2= parameter on {{head}}, but there shouldn't be a context label saying that it's an ordinal number. —Rua (mew) 15:47, 6 October 2017 (UTC)
I agree with RuaCat. Ordinal numbers should be categorized as such using |cat2= or {{cln}} but not using {{lb}}. —Aɴɢʀ (talk) 16:21, 6 October 2017 (UTC)
I still disagree, but as you are so keen on everything else but, perhaps you would like to come up with some usage examples. DonnanZ (talk) 22:31, 6 October 2017 (UTC)

Please, please reveal the cause of the revert in the edit summary

The default text "If you think this rollback is in error, please leave a message on my talk page" carries no information. In as many words, you could say something specific about the actual problem.

Instead of this empty formula, it would be more helpful for all of us if you just wrote the reason in the edit summary (that way we won't have to bother you on your talk page).

A revert reduces someone's work to nothing. Please, please either correct the error or at least give a hint about the problem to avoid.

(Sorry for my poor English.)

Karmela (talk) 07:09, 8 October 2017 (UTC)

There are relatively few admins who have to go through a flood of edits by new contributors and see whether they belong in the dictionary or not. Given that, we simply do not have the time to give explanations tailored for every rollback that we make (if it wasn't clear, the default text is added automatically). I created the vote that added that default text because previously, it said nothing at all — obviously, this is much better, because you followed the instructions and left a message on Wikitiki89's page, where you can further discuss the edit. —Μετάknowledgediscuss/deeds 07:22, 8 October 2017 (UTC)
Thank you. For a (non-vandal) contributor, the cause of the rollback is _never_ clear: s/he made the contribution supposing it was OK.
The list of typical errors cannot be very long; would it be possible to choose from a premade list of explanations when reverting?
Karmela (talk) 16:37, 8 October 2017 (UTC)
We have such a list for deletions of entire entries. It would be a good start for what you recommend. I do not know whether it is readily done technically. DCDuring (talk) 18:59, 8 October 2017 (UTC)
  • @DCDuring, Metaknowledge In en.wikipedia.org you can add two dropdown boxes below the edit summary box with some useful default summaries:
  1. Common edit summaries -- click to use
  2. Common minor edit summaries -- click to use
One can enable this gadget at https://en.wikipedia.org/wiki/Special:Preferences#mw-prefsection-gadgets
An analogous dropdown box, Common revert summaries -- click to use, should be technically similar.
Karmela (talk) 07:47, 14 October 2017 (UTC)
So, apparently technically possible. How do we get it? DCDuring (talk) 14:01, 14 October 2017 (UTC)
This is how: mw.loader.load('//en.wikipedia.org/w/index.php?title=MediaWiki:Gadget-defaultsummaries.js&action=raw&ctype=text/javascript'); Dixtosa (talk) 14:20, 14 October 2017 (UTC)
All this presupposes that the community wants it. Is this the correct place and form in which to ask the Wiktionary community?
Karmela (talk) 08:52, 15 October 2017 (UTC)
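
To make the "premade explanation list" idea above concrete, here is a minimal sketch in JavaScript of how a canned reason could be turned into an edit summary. Everything in it is an assumption for illustration: the reason texts, the name REVERT_REASONS and the function revertSummary are hypothetical, and it is not the code of the Wikipedia gadget mentioned above.

```javascript
// Hypothetical canned reasons; the real wording would be up to the community.
const REVERT_REASONS = [
  "no source or citation given",
  "wrong language section or header",
  "formatting does not follow the entry layout",
];

// Build an edit summary from a chosen reason plus an optional free-text note.
// Falls back to a generic text when no valid reason index is given.
function revertSummary(reasonIndex, note = "") {
  const reason = REVERT_REASONS[reasonIndex];
  if (reason === undefined) {
    return "Reverted; please ask on my talk page for details";
  }
  const trimmed = note.trim();
  const suffix = trimmed ? ` (${trimmed})` : "";
  return `Reverted: ${reason}${suffix}`;
}

console.log(revertSummary(0));
console.log(revertSummary(1, "see the talk page"));
```

A dropdown would only need to pass the selected index to such a function, so a patroller's extra effort is a single click.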

Requests for deletion - restoring the list of nominations

In June 2017, WT:RFD was changed to no longer list items nominated at the right top of the page. I propose to restore the previous state. The current state is that categories are listed but not the items nominated themselves. That is not very useful, IMHO.

Therefore, I propose:

  • List nominated items again, as a list of items for all languages.
  • To support that, list all nominated items in Category:Requests for deletion instead of listing them only in per-language categories. This, again, is a restoration proposal.

--Dan Polansky (talk) 09:34, 8 October 2017 (UTC)

@Dan Polansky: If you want to see, say, the 5 French requests, click on the "▶" symbol next to the "Requests for deletion in French entries‎ (0 c, 5 e)". In my opinion, this is more useful than before, because now you can choose the language you want to see, as opposed to seeing a mess of entries in all languages. If we want to see a mess of entries in all languages, we may look at the normal TOC (the "Contents" list). I believe we also have the option of making all languages un-collapsed by default, though personally I'd prefer them collapsed as they currently are. --Daniel Carrero (talk) 09:58, 8 October 2017 (UTC)
I want to see the complete list, not by language. I only want to check whether all the items listed there were put to RFD page itself; if I did not want to do that, I would not want to see that right-floating portion of the page at all. --Dan Polansky (talk) 13:32, 8 October 2017 (UTC)
  • I support Dan's proposals. The language-specific RFD categories seem to be useless. —Μετάknowledgediscuss/deeds 16:05, 8 October 2017 (UTC)
    How do we know that no one uses the by-language listings? (BTW, I don't use them)
    BTW, I have noticed that we have a fair number of headings on request pages that do not have tags. Do we need yet another run against the XML to identify:
    1. Tagged L2s that are not on current request pages.
      1. Tagged L2s that are for archived or otherwise closed requests.
    2. Untagged L2s that are on the request pages.
    We'd also need to treat items that have been stricken or closed, but not yet archived.
    At the moment I don't see how this can systematically be accomplished with search. Though I doubt we would need such a run every two weeks, it might be useful every quarter or, at least, every year. DCDuring (talk) 18:47, 8 October 2017 (UTC)
@DCDuring For part 1, User:DTLHS/cleanup/request consistency. I don't think 2 is that important since entries on request pages get archived eventually. It's possible that there are false positives if pages are linked unusually on the request pages. DTLHS (talk) 19:27, 8 October 2017 (UTC)
@DTLHS: For 2 I was thinking about those requests that are entered without use of any request template. Today I noticed it when [[academic institution]] was added to RFDE. (The contributor has now added {{rfd}} at my request.) Perhaps what is needed is to discourage addition of new headers on request pages except through the relevant templates. DCDuring (talk) 20:35, 8 October 2017 (UTC)
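
The consistency check discussed above comes down to two set differences: tagged entries missing from the request page, and request-page headings whose entries carry no tag. The following JavaScript is a minimal sketch under stated assumptions. The function name compareRequests and the sample titles are hypothetical (only "academic institution" comes from the discussion), and extracting the two lists from an XML dump is left out entirely.

```javascript
// Hypothetical helper: compare entries that carry a request tag with the
// headings currently present on the request page.
function compareRequests(taggedEntries, requestHeadings) {
  const tagged = new Set(taggedEntries);
  const listed = new Set(requestHeadings);
  return {
    // Tagged entries with no heading on the request page (item 1 above).
    taggedButNotListed: [...tagged].filter((t) => !listed.has(t)).sort(),
    // Headings whose entries carry no request tag (item 2 above).
    listedButNotTagged: [...listed].filter((t) => !tagged.has(t)).sort(),
  };
}

// Sample data, invented for illustration.
const result = compareRequests(
  ["foo", "bar", "baz"],
  ["bar", "academic institution"]
);
console.log(result);
```

Stricken or closed but not yet archived requests would have to be filtered out of the heading list beforehand, and unusually linked headings could still produce false positives, as noted above.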

Classification of forms with -n't

Hello. Rua, Equinox, Erutuon and I have been talking about the classification of don't, can't and other forms with -n't in User talk:TAKASUGI Shinji/2017#Contractions. I think they are verb forms just like did and could, according to Arnold M. Zwicky and Geoffrey K. Pullum (Cliticization vs. Inflection: English n’t, Language 59(3), 1983, pp. 502-513), but not everyone agrees with their analysis. In my opinion, we shouldn't use “Contraction” as a header because it is not a part of speech, and we should replace it with a part of speech we can reasonably assign. What do you guys think? — TAKASUGI Shinji (talk) 10:09, 8 October 2017 (UTC)

Our level 3 headers are for more than just part of speech. Suffix isn't a part of speech either. We have to use "contraction" because for most cases there is no other way to do it. Look at Category:Middle Dutch contractions for example. So that argument is not very compelling.
As for these contractions specifically, I don't see how they can be considered anything else. They aren't considered verb forms in any standard grammar of English. One paper is interesting, but we should follow linguistic consensus on the matter and not the opinion of a single paper. —Rua (mew) 11:44, 8 October 2017 (UTC)
An analysis of well-known linguists and lack of analysis don't have the same value. I find their analysis convincing. You can only say don't you? and not *do not you?, from which we must conclude that don't is not a contraction of do not. — TAKASUGI Shinji (talk) 11:41, 9 October 2017 (UTC)
What do you mean by "standard grammar"? The Cambridge Grammar of the English Language (naturally, because it was co-written by Pullum) uses the inflectional-suffix analysis of -n't, and its auxiliary verb paradigms show negative forms corresponding to each of the finite forms. Certainly the more traditional version of English grammar that I learned as a kid didn't recognize negative inflected forms, but it wasn't particularly linguistically rigorous and shouldn't be the basis for our decisions on Wiktionary. — Eru·tuon 21:52, 9 October 2017 (UTC)
I'm in favor of the analysis in which -n't is an inflectional suffix and forms like don't are verb forms (and I could go on about that), but the essential thing is to at least be consistent. I don't think it's consistent to label -n't as a suffix (as it's been labeled since 2008) and then call forms like won't contractions. A contraction is basically the combination of a full word plus one or more clitics that are derived from orthographic words, but are not spelled as words in this case. So for won't to be a contraction, -n't has to be a clitic (a variant form of not). The other option is for -n't to be a suffix and won't a verb form. We need to pick an analysis and stick to it. It would be fine to include usage notes explaining the alternative analysis, or alternative inflection tables, or categories, but the headers and headword templates should stick to a single analysis. — Eru·tuon 19:20, 8 October 2017 (UTC)

A week has passed, and there has been one negative vote. I assume the classification of -n't according to their paper is acceptable. — TAKASUGI Shinji (talk) 12:54, 18 October 2017 (UTC)

Any idea for a new "Thesaurus:" shortcut?

WS:good → Thesaurus:good still works, as it should.

But "WS:" does not make a lot of sense anymore, because now "Wikisaurus" is called "Thesaurus".

Then again, "TS:good" and "TH:good" are unavailable, because they are language codes. Is there a good shortcut available? If not, I guess we'll have to keep using only "WS". --Daniel Carrero (talk) 18:37, 9 October 2017 (UTC)

THES seems the obvious choice. Equinox 18:44, 9 October 2017 (UTC)
Alright, I guess. I'm not entirely happy with a mere reduction from 9 to 4 letters, but maybe that's the best option we have.
Maybe THE would be better ("THE:good" → Thesaurus:good), but "the" is the ISO code for Chitwania Tharu (w:Tharu languages). Can't we use it anyway? --Daniel Carrero (talk) 14:31, 11 October 2017 (UTC)
But if we can't use ISO codes then we can't use any three-letter code, even if ISO hasn't used it yet. It should be considered reserved for future ISO use. Equinox 15:07, 11 October 2017 (UTC)
That may be true, but we have violated that rule before. We have "cat" and "mod" as working aliases. See CAT:English nouns and MOD:sandbox. "cat" means Catalan, which seems unlikely to be used by Wikimedia because they have settled for https://ca.wiktionary.org/ and https://ca.wikipedia.org/ (using "ca", not "cat"). "mod" is Mobilian Jargon language. --Daniel Carrero (talk) 15:22, 11 October 2017 (UTC)
To be clear, I would support using "the". ("THE:good" → Thesaurus:good) --Daniel Carrero (talk) 15:24, 11 October 2017 (UTC)
I would prefer "THS" which is the language code for the w:Thakali language, a Nepali Sino-Tibetan language with 5,900 native speakers. Chuck Entz (talk) 02:04, 12 October 2017 (UTC)
No offence to you or Daniel but I think it's pretty obnoxious to appropriate a language code because not many people speak it and "it might never happen". You can't be half ISO compliant. Equinox 20:09, 25 October 2017 (UTC)
SYN. —suzukaze (tc) 02:28, 12 October 2017 (UTC)
NYM: is what I like, to stand for -nyms. --Dan Polansky (talk) 07:49, 21 October 2017 (UTC)
I am actually OK with it not having a shortcut. - TheDaveRoss 20:42, 24 October 2017 (UTC)

Linking active policy proposals

WT:EL should probably link to WT:FORMS in some fashion. I imagine there are also other cases like these, where EL is a dead-end and the actual documentation is hidden away in some obscure undocumented location.

Some might protest that the former is policy while the latter are often drafts, but as long as this is indicated, I do not see any problem in linking. Should we maybe settle on some specific more mildly worded section hatnote, such as "Read more:" (instead of "Main article:" or the like)?

Interestingly, WT:Policies and guidelines, despite being prominently linked from the policy headers ({{policy}}, {{policy-TT}}, {{policy-DP}}), is currently categorized as "inactive". There's Category:Wiktionary think tank policies, but it's not especially user-friendly. --Tropylium (talk) 14:54, 11 October 2017 (UTC)

New section "Synchronic analysis" in WT:EL

w:en:Synchrony and diachrony

It isn't useful to have only historic (current "Etymology" section at en.wiktionary) or only modern analysis.

Example: атония d1g (talk) 08:51, 12 October 2017 (UTC)

We include this in etymology, but the usual wording is "equivalent to". —Rua (mew) 13:36, 12 October 2017 (UTC)

Linking to Wikimedia Commons categories

Hello, I would like to know why the wiktionary entries are not linked to the Wikimedia Commons categories (by using statements at Wikidata). For example the entry Varvel can be connected to commons:Category:Vervels. It can only help the readers to (visually) learn more about that particular word. Fructibus (talk) 09:45, 12 October 2017 (UTC)

@Fructibus: Have you seen Wiktionary:Wikidata? We do in fact have some links to that sister project via local templates. E.g. tea. —Justin (koavf)TCM 09:50, 12 October 2017 (UTC)
@Koavf: Thanks a lot! By the way, was there any discussion about including the wiktionary pages into Wikidata, connecting with the Wikipedia/Commons pages? Then the Commons link would show automatically for the Wiktionary pages, in all languages. At this moment, if you want to link to Commons in all language articles, that means you have to edit 67 Wiktionary pages. Fructibus (talk) 19:05, 12 October 2017 (UTC)
@fructibus: "Was there any discussion about including the wiktionary pages into Wikidata" Oh yes, quite a bit. And there are currently options to include Wiktionary entries in Wikidata but I don't feel like I can do a good job of summarizing all of that. You may wish to see the equivalent page here: d:Wikidata:Wiktionary. I 100% agree that we should use Wikidata to make sister links--you may wish to talk with User:CodeCat (now User:Rua; I had forgotten he [she?] was renamed for some reason) about that. —Justin (koavf)TCM 19:16, 12 October 2017 (UTC)
People have expressed dislike for Wikidata IDs, so we probably won't be using Wikidata for anything after all. I tried. —Rua (mew) 19:45, 12 October 2017 (UTC)
It will happen, it's just that at the moment the advantages aren't completely obvious. – Jberkel (talk) 20:56, 12 October 2017 (UTC)
@Jberkel: Isn't this one of them? —Justin (koavf)TCM 22:26, 12 October 2017 (UTC)
The page tea has already {{wikidata|Q6097}}. Changing to a template like {{sister links|Q6097}} could fetch all sister project links with automatic update of new links, deleted ones or renamed ones. The problem is that a word may have multiple senses that can be connected to multiple equivalent pages on Wikidata. --Vriullop (talk) 08:22, 13 October 2017 (UTC)
@Koavf: I'm all for Wikidata, it's just that to some editors the advantages are less clear at the moment. @Vriullop: yes that would be great, via Wikidata one should be able to fetch all the other relationships. Couldn't {{senseid}} (or something similar) be used for fine-grained associations? – Jberkel (talk) 14:11, 13 October 2017 (UTC)
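To make the {{sister links|Q6097}} idea above a bit more concrete, here is a minimal Python sketch of how such a template could turn the "sitelinks" part of a Wikidata item into sister-project links. It assumes the item has already been fetched and that its sitelinks look like the usual Wikidata JSON (site ids such as "enwiki" or "commonswiki", each with a "title"). The function name `sister_links`, the prefix table and the sample titles are hypothetical illustrations, not an existing Wiktionary module, and the sketch does not address the multiple-senses problem raised above.

```python
# Hypothetical sketch: turn Wikidata sitelinks into interwiki links.
# Assumption: `sitelinks` has the shape {"enwiki": {"title": "Tea"}, ...}.

# Site-id suffix -> interwiki prefix (assumed mapping for illustration).
SUFFIX_TO_PREFIX = {
    "wiki": "w",
    "wikiquote": "q",
    "wikibooks": "b",
    "wikisource": "s",
    "wikiversity": "v",
    "wikivoyage": "voy",
    "wikinews": "n",
}


def sister_links(sitelinks, lang="en"):
    """Return wikitext links for Commons and for projects in `lang`."""
    links = []
    for site_id, info in sorted(sitelinks.items()):
        title = info["title"]
        if site_id == "commonswiki":
            # Commons is language-independent, so it is always included.
            links.append(f"[[commons:{title}]]")
            continue
        if not site_id.startswith(lang):
            continue  # a project in some other language
        prefix = SUFFIX_TO_PREFIX.get(site_id[len(lang):])
        if prefix:
            links.append(f"[[{prefix}:{title}]]")
    return links


# Sample data (titles are made up for the example).
sample = {
    "enwiki": {"title": "Tea"},
    "frwiki": {"title": "Thé"},
    "enwikiquote": {"title": "Tea"},
    "commonswiki": {"title": "Category:Tea"},
}
print(sister_links(sample))
```

Because the links are derived from the item each time, renamed or newly added sister pages would show up without editing the entry, which is the advantage described above.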

@Jberkel - @Koavf - @Vriullop - @Rua - Sorry, I am new to Wiktionary but I really don't see the reason for not linking the Wiktionary definitions in Wikidata. For example the Wikipedia article Water has a link to the Wiktionary definition, at the bottom of the article. Why not show it in the middle-left side of the page, near the other sister project links (Commons, Wikibooks, Wikiquote)? This way all the 220 Wikipedia articles can show the link to the Wiktionary definition in their respective language (if it exists), without the need to actually edit the 220 Wikipedia articles. Fructibus (talk) 18:49, 13 October 2017 (UTC)

@Fructibus: I agree as well but there were concerns that it's too difficult, impossible, or possible-but-difficult and not actually helpful. I disagree with the latter two but it's definitely an undertaking to be sure. Then again, so is everything. —Justin (koavf)TCM 19:01, 13 October 2017 (UTC)
@Koavf: Very nice answer, gives a feeling of touching a perfection in language, thanks :) - Fructibus (talk) 23:39, 13 October 2017 (UTC)

Ōbaku tō-on/sō-on readings

Found this video: Heart Sutra chanted by Ōbaku monks; is the ruby a Chinese pronunciation or as Wikipedia states: tō-on/sō-on readings? Here's a supporting resource. Domo, --POKéTalker (talk) 04:49, 13 October 2017 (UTC)

Personally it sounds suspiciously(?) too much like accented Mandarin ( () (ji)?  () (e)?), possibly dated ( (けん) (ken)), but I also don't know what I'm talking about. Maybe tō-on is Mandarin. —suzukaze (tc) 05:11, 13 October 2017 (UTC)

For reference, comparison of Japanese, Ōbaku reading (sō-on?) and standard Chinese:

観(かん)自(じ)在(ざい)菩(ぼ)薩(さつ)行(ぎょう)深(じん)般(はん)若(にゃ)波(は)羅(ら)蜜(みっ)多(た)時(じ)
Kanjizai Bosatsu gyō jin hannya haramitta ji
Avalokitesvara Bodhisattva was practicing deep prajnaparamita when...
観(クヮン)自(ツ)在(サイ)菩(プ)薩(サ)行(ヘン)深(シン)般(ポ)若(ゼ)波(ポ)羅(ロ)蜜(ミ)多(ト)時(ス)
K(w)antsusai Pusa hen shin poze poromito su
Avalokitesvara Bodhisattva was practicing deep prajnaparamita when...
觀自在菩薩行深般若波羅蜜多時 [MSC, trad.]
观自在菩萨行深般若波罗蜜多时 [MSC, simp.]
Guānzìzài Púsà xíng shēn bānruò bōluómìduō shí [Pinyin]
Avalokitesvara Bodhisattva was practicing deep prajnaparamita when...

Though there is probably no clear romanization to the monks chanting, should kanji with these Ōbaku on-readings be provided as sō-on? Just wondering. --POKéTalker (talk) 02:21, 14 October 2017 (UTC)

(The automatic pinyin generated by zh-usex is not correct because it uses the most common readings. [1] has pinyin transcription that seems to be OK —suzukaze (tc) 02:36, 14 October 2017 (UTC))
  • @POKéTalker, to confirm / clarify -- it sounds like you're asking if there is value in adding sōon readings to the individual kanji entries. If that's your proposal, I have no particular opposition, so long as the readings are clearly labeled as sōon (provided that's the correct reading category). ‑‑ Eiríkr Útlendi │Tala við mig 04:36, 14 October 2017 (UTC)
I think POKéTalker wants to make sure that they are indeed tou'on first. —suzukaze (tc) 20:48, 14 October 2017 (UTC)

TabbedLanguages default and English links in definitions

Yesterday, following Wiktionary:Beer parlour/2017/July#TabbedLanguages edit: default to English for unmarked links, I made a change to MediaWiki:Gadget-TabbedLanguages.js so that the default language would always be English, if no language is specified. This means that it's no longer necessary to use {{l|en|...}} in definitions. I'd like to ask the people who do this to use regular links from now on. —Rua (mew) 15:40, 14 October 2017 (UTC)

But not everyone uses tabbed languages. DTLHS (talk) 15:45, 14 October 2017 (UTC)
I agree that this would be a sensible default for those without it, too. But then we'd need a separate gadget. —Rua (mew) 15:46, 14 October 2017 (UTC)
If we made a separate gadget it could be smarter, such as linking derived terms to the correct language, while linking terms in definitions to English. DTLHS (talk) 15:53, 14 October 2017 (UTC)
Also, what happened to the plan to make TL the default? We had a vote and everything. —Rua (mew) 15:51, 14 October 2017 (UTC)
@Rua: The vote was conditional on categories being properly sorted by bot in advance of the implementation. This never happened, afaict. --Yair rand (talk) 04:20, 2 November 2017 (UTC)
Any way to undo this behavior for searches and search results? Having them always go to English is pretty annoying when you’re working on some other language. — Vorziblix (talk · contribs) 01:48, 18 October 2017 (UTC)
Where else should they go? —Rua (mew) 10:49, 18 October 2017 (UTC)
Ideally to the last language visited, as they did before. — Vorziblix (talk · contribs) 23:15, 18 October 2017 (UTC)

Singapore terms

Just a heads up: a while ago, a Singapore schoolteacher encouraged his students to add Singaporean English terms to Wiktionary (which is, on the whole, a good thing). We seem to have a new batch of these happening at the moment, e.g. bus captain, taxi uncle. So be ready for some cleanup. Equinox 08:23, 16 October 2017 (UTC)


They've made some drastic changes to pronunciation which might not be correct. Anyone who knows Old English, do you mind taking a look? --Robbie SWE (talk) 18:12, 16 October 2017 (UTC)

@Robbie SWE: It looks like, as far as Old English pronunciation goes, they're changing sequences of /h/ and a sonorant to sonorant and voicelessness diacritic (for instance, /hr/ to /r̥/). That might be correct in a pseudo-phonetic transcription, but I don't know if it is an accepted phonological analysis. — Eru·tuon 21:46, 16 October 2017 (UTC)

Translating both ways


When I started working on a project in which I would like to use translations from Wiktionary, I noticed that Wiktionary translations are created separately for each language. That means that even if the English Wiktionary contains the translation of a word into another language, e.g. Mandarin, the Wiktionary in that language will not necessarily have a translation of that word into English.

One example:


- the list of translations contains the translation 图书馆


- the Chinese wiktionary page does not have a translation for that word into English (the site contains: 英语(English):[[]])

Since these translations are symmetric, it should be possible to add a large number of translations to these Wiktionaries with much less effort. However, there will surely be a few issues that have to be resolved first.

TheDaveRoss already replied to me by email, pointing out some issues:

"1. There are numerous Wiktionaries, each one maintained by a distinct community of volunteers. Each has its own policies regarding what may or may not be included, how translations are to be added, etc. It is very important that you coordinate with the local community wherever you add content to ensure that the content meets their criteria.

2. Translations are very nuanced (as you are probably aware). Automated addition of translation has happened at small scales in the past, however close oversight by a person familiar with both languages is required. Even translations which appear to be symmetrical may require special annotation in the target language which is not included in the original language.

3. The source material may not be correct, and automation can propagate errors. The English Wiktionary, and a few other large Wiktionaries, have enough contributors that many errors are caught quickly. That is not the case for the majority of other languages, so it is important to ensure any additions to other languages are correct"

4. Attribution to the original contributor will be important

For example, adding the new words as proposed translations first and checking them for correctness afterwards would decrease the risk of wrong translations while still adding some value right away.

What do you think about writing a script to do this? What other problems are there with this? Do you know about previous attempts to do this? I hope this could be very useful!

Noahho (talk) 01:21, 17 October 2017 (UTC)
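As a rough illustration of the script idea, here is a minimal Python sketch that inverts English-to-foreign translations into a list of proposed reverse translations for one language. It assumes the translation tables have already been extracted into a plain dictionary; the function name `propose_reverse_translations` and the input shape are hypothetical, and the output is only meant as a queue for human review, in line with the issues quoted above.

```python
from collections import defaultdict

# Hypothetical sketch: invert English -> foreign translations into
# foreign -> English proposals for a single language code.
# Assumption: `en_translations` maps an English headword to
# {language code: [translations]}.


def propose_reverse_translations(en_translations, lang):
    """Collect, per foreign word, the English entries that list it."""
    proposals = defaultdict(set)
    for english_word, by_lang in en_translations.items():
        for foreign_word in by_lang.get(lang, []):
            proposals[foreign_word].add(english_word)
    # Sorted lists make the output stable and easy to review by hand.
    return {word: sorted(english) for word, english in sorted(proposals.items())}


# Small sample input for illustration.
sample = {
    "library": {"cmn": ["图书馆"]},
    "book": {"cmn": ["书"]},
}
print(propose_reverse_translations(sample, "cmn"))
```

Nothing here writes to any wiki. Attribution, local policy and the check by someone who knows both languages would still have to happen before anything is added.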

@Noahho: Something similar to two-way translation could work if we can agree on how we will use Wikidata and how it will be connected across Wiktionaries. Unfortunately, how that would work is very difficult to determine. —Justin (koavf)TCM 02:13, 17 October 2017 (UTC)
@Noahho Hello Noah. Can I please know what project you are working on? If the aim of the project is to extract translations of foreign-language terms into English from Wiktionary, it would be much easier to extract from the pages on English Wiktionary; e.g. for simplified 图书馆 it would be at 圖書館, which says "library". Wyang (talk) 02:17, 17 October 2017 (UTC)
This is an age-old problem. I believe that, somewhere, there is a unified Wiktionary that is not dependent on a "home" language. But I forget what it is called, or what state it is in. SemperBlotto (talk) 05:37, 18 October 2017 (UTC)
@SemperBlotto: omegawiki:? —Justin (koavf)TCM 09:03, 18 October 2017 (UTC)
I will also mention one previous attempt to do this locally was User:Tbot. The person who created that script has passed away, however there is some amount of documentation of his efforts in that user space. - TheDaveRoss 13:57, 18 October 2017 (UTC)
No Tbot-like program should ever be run again. It was a well-intentioned mistake, the effects of which we are still engaged in cleaning up. —Μετάknowledgediscuss/deeds 07:10, 2 November 2017 (UTC)

Turkish vs Ottoman Turkish

The Balkan language loanwords from Turkish should technically be Ottoman Turkish, since that's the era they entered those languages, right? Is the only main difference the script being Arabic vs. Latin? I realized I need to go back and change a bunch of Romanian and some other entries. Word dewd544 (talk) 16:12, 17 October 2017 (UTC)

Yes, they should generally be Ottoman Turkish. The script is one significant difference, but if I’m not mistaken there’s also a huge difference in lexicon, where a large portion of the Ottoman Turkish lexis consists of loanwords from Persian and Arabic that were later stamped out of usage and replaced with neologisms by Atatürk. — Vorziblix (talk · contribs) 21:40, 17 October 2017 (UTC)
There are also grammatical differences. That said, I would personally prefer to treat them as a single language, and I don't think we lose much by claiming Balkan loanwords are from Turkish rather than Ottoman Turkish when the word in question is itself the same. —Μετάknowledgediscuss/deeds 21:44, 17 October 2017 (UTC)
I’m indifferent to merging them, not being knowledgeable enough on the subject, but the split does seem to be mostly a relic of sticking to ISO codes; input from editors experienced with Turkish could be helpful. — Vorziblix (talk · contribs) 06:40, 19 October 2017 (UTC)
Isn’t it more true to the soothfast happenings in the Ottoman Era to describe the Ottoman Turkish as an acrolect of Turkish which the elite prioritized while basically we have had Turkish all the time? It would be awkward to say that we had Turkish once and then, by some peculiar developments in constitutional history, Ottoman Turkish, and then because Atatürk said so Turkish has smitten Ottoman Turkish. Rather there has been one basic Turkish from which the Balkan languages also borrowed rather than from the language we see as Ottoman literary inheritance today, though of course there can be learned borrowings from the literary language as well, though in the case this is largely unlikely because of mostly late literary culture in the Balkan countries and literary culture in the Slavia as well as in Greece (I don’t know about Romania and Albania) also prohibits itself to borrow, as compared to other literary cultures. So in the context of Balkan languages, the Turkish they have been in contact with was coexistent Turkish rather than typical Ottoman Turkish. We are just inveigled to assume that one literary language has borrowed from the other literary language even when the spoken language has borrowed, because for older times we know about the spoken language from its appearance in writing. It is an image of things we have long surpassed in Romance studies, like acknowledging that Spanish has borrowed from colloquial Arabic rather than from literary Arabic. Palaestrator verborum (loquier) 20:51, 18 October 2017 (UTC)
Unfortunately it’s not really clear what Wiktionary means by ‘Ottoman Turkish’ — just the literary acrolect or the language in general during a given time period. Previous discussions don’t seem to have reached a conclusion. Some of the comments made there could be relevant to the issue of how to treat Ottoman Turkish, though. — Vorziblix (talk · contribs) 06:40, 19 October 2017 (UTC)

Listing Translations by Language

It seems to me that quality control of translations is harder than it should be: those of us who patrol new edits can't be knowledgeable in anywhere near all of the languages, and those with expertise in the languages in question are less likely to be spending their time browsing through English entries. {{t-check}} is helpful, but not used in most languages.

Does everybody think it would be a good idea to create a listing of translations in each language, along with the entry they're in? I would envision it as a listing in the language's alphabetical order, with the {{t}} template converted to an {{l}} template and followed by the name of the entry:

This would make it easy for an expert in a given language to scan through all the translations in that language without browsing a bunch of English entries. It also might make the redlink categories and the overhead that goes into creating them unnecessary.

I'm bringing it up here because it would be a major undertaking involving massive processing of the dumps, so I want to make sure it's a good idea before asking anyone to do it. Perhaps it could be started with some of the smaller LDL languages as a test. Chuck Entz (talk) 14:19, 18 October 2017 (UTC)
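To give a feel for the processing involved, here is a minimal Python sketch of the listing described above: it pulls {{t}} and {{t+}} templates out of entry wikitext, groups them by language code, and formats each one as an {{l}} link followed by the entry name. The function names and the output layout are hypothetical, the regular expression only covers the simplest template forms, and the plain `sorted` call is a stand-in for each language's real alphabetical order.

```python
import re
from collections import defaultdict

# Hypothetical sketch: list translations per language from entry wikitext.
# Matches {{t|lang|term...}} and {{t+|lang|term...}} only (an assumption;
# other translation templates are ignored here).
T_TEMPLATE = re.compile(r"\{\{t\+?\|([^|{}]+)\|([^|{}]+)")


def translations_by_language(entries):
    """Map language code -> sorted list of (term, entry name) pairs."""
    listing = defaultdict(list)
    for entry_name, wikitext in entries.items():
        for lang, term in T_TEMPLATE.findall(wikitext):
            listing[lang].append((term, entry_name))
    return {lang: sorted(pairs) for lang, pairs in listing.items()}


def format_listing(lang, pairs):
    """Render each pair as an {{l}} link followed by the entry name."""
    return [f"* {{{{l|{lang}|{term}}}}}: [[{entry}]]" for term, entry in pairs]


# Tiny sample standing in for entries read from a dump.
sample = {
    "library": "* French: {{t+|fr|bibliothèque|f}}\n* German: {{t+|de|Bibliothek|f}}",
    "book": "* French: {{t+|fr|livre|m}}",
}
by_lang = translations_by_language(sample)
print("\n".join(format_listing("fr", by_lang["fr"])))
```

On a full dump the same loop would run over every English entry, which is where the "massive processing" mentioned above comes in.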

Matthias Buchmeier maintains lists similar to what you describe. — Ungoliant (falai) 14:59, 18 October 2017 (UTC)
I can share the assumption that it would make it more possible to make Wiktionary serve as bilingual dictionary in relation to single languages, as for now one cannot directly work in Wiktionary to make it intentionally a bilingual dictionary for any language because one does not see what is already there, i.e. one can only add to quantity, but only by serendipity to quality. But the requirement would be that such lists are live dumps which get refreshed as soon as edited, instantly by Javascript or at least after refetching the page. Because it is a core part of motivation for editing to see the results published instantly, that’s why the web is there. Palaestrator verborum (talk) 16:41, 18 October 2017 (UTC)
@Chuck Entz: Re quality control via patrolling: There's an option for filtering RecentChanges to only show certain languages in WT:PREFS, near the bottom. Apparently it was broken for a while. Now fixed. (Also, it doesn't work for those who have the new version of recent changes enabled.) --Yair rand (talk) 04:26, 2 November 2017 (UTC)
Depending on the quality of translations themselves, the targeted bilingual dictionaries can be quite good. The lists mentioned above (Matthias Buchmeier's) are quite good (mostly English to foreign language) but require some programming work. Building the reverse, foreign language to English, is apparently much harder and depends largely on entry structure in those specific languages. --Anatoli T. (обсудить/вклад) 06:58, 2 November 2017 (UTC)

Word Frequencies in Wiktionary

We have just finished removing the {{rank}} information based on some old, problematic parsing of the Project Gutenberg corpus. We still have a few appendices which record that information. My main objection to the inclusion of that data was that it was flawed, and outdated. But I don't roundly object to having word frequency information which is accurate. To that end I have a few questions.

  1. Should we include any frequency data in any manner?
  2. If so, should that data be represented within word entries in some way?
  3. Which corpora should be used, or which frequency lists?
  4. Should original research (of the type used for the old data) be allowed?

One starting place for English (and a few other languages) can be found at the BYU corpus page. It is probably best to avoid getting too deeply into the weeds here, but rather if it seems like there is a general consensus around what should be included we can spin off a project page and figure out all of the details. - TheDaveRoss 17:53, 18 October 2017 (UTC)

We should certainly not add frequency data to word entries because the data is doubtless interesting in a list, but too unsure and thus and because nobody wants to know which words have the nearest frequency if he looks up a word – which is a totally random result of sundry capacities of a language and without fruit for erudition – and because it would suck off endeavors to more instructive content creation not worthwhile enough to maintain in the main namespace. And as there is more instructive content to be created by the same endeavors, I also opine that for the collection of frequency data it should be waited until copyright law has been abolished by revolutions in the world and thus representative and illuminatingly separable corpora can be collected. At that time we would only struggle with the technical recognition of what a word be, not of what sources we digest, which together are multiplying error factors. Palaestrator verborum (loquier) 18:25, 18 October 2017 (UTC)
Wut? Sorry, what I meant there was: one, I have no idea what the abolition of copyright law has to do with the inclusion of word frequency on Wiktionary and, two, I disagree with the notion that this would prevent some other work from being done. - TheDaveRoss 18:33, 18 October 2017 (UTC)
The notion is that there are opportunity costs in collecting telling corpora. I don’t think that one could be content with subtitle databases, as these are slanted to Hollywood and mass productions and their fantasy worlds instead of the whole language that we mean when we talk about the language; and actually those current subtitle collections and most other corpora are non-free either. The web-based corpora which are represented on the BYU corpus site have of course their own problems, with the deep web and the dark web and resources being varyingly crawlable. If we want recent and representative data, we can only go illegal by grabbing Library Genesis, accessing journal and newspaper databases via black channels like Sci-Hub does and things like that, perhaps mixed with subtitles, i.e. things that we cannot perform by the means of the Wikimedia Foundation without endangering it. If there would be no copyright law, there would be a large database of works of all kinds which would constitute good (i.e. the technologically and humanly, not legally best possible) and fast data. This is of course a high standard from which I esteem corpus data valuable, and possibly the view of a philosopher against the practical mind of a programmer. But one can set the doubts even higher by asking oneself how to offset different data sources, like how commensurable web data and journal data and parliamentary debates are, even if one has access to all humanly possible corpora, and if one needs to be the man to have correct information about word frequencies. 
Others could be pleased to see lesser corpus data, but I think that the assumption cannot be rejected out of hand that this is not worth it if there are so many doubts about whether this data represent the actual distribution in language (by my common sense, I often wonder about words not being found at all in large frequency dictionaries) – aside from such data not being valuable maintained in individual word articles, which question is subject to entirely different evaluation criteria, because the intention of a reader opening a word’s entry is different from the intention of a reader opening a word list. Palaestrator verborum (loquier) 19:25, 18 October 2017 (UTC)
It's about time we asked the question, how can words be real if our eyes aren't real? DTLHS (talk) 19:34, 18 October 2017 (UTC)
@DTLHS: I'm ded. But it should be "How Can Words Be Real If Our Eyes Aren't Real". —Aryaman (मुझसे बात करो) 20:40, 18 October 2017 (UTC)
I have not asked this question, they are valuable abstractions from language – we can explain and describe them –, but you cannot just cast “the language” into a measuring beaker to know an objective distribution of its constituent parts (which is also a mereological problem), as the language you know about is always constructed to some degree as necessitated by material constraints. What we want, in laying out a frequency list which could praise itself of utilizing the methods fit for the object, is to be at least as exact as possible about it, but that is by far not legal. Palaestrator verborum (loquier) 19:45, 18 October 2017 (UTC)
Palaestrator, your writing style is unnatural and unnecessarily loquacious. I don't know why you are doing it, but I want you to know that your arguments will be taken more seriously if you try to express them clearly and succinctly, rather than in a way that just makes us all think that you're trying to show off. (And as a side note, your understanding of how corpora work in relation to copyright law seems to be flawed, so you might want to try reading up on that first.) —Μετάknowledgediscuss/deeds 19:53, 18 October 2017 (UTC)
It is easy with the law in this case: If the corpus collection is legal (i.e for example the Wikipedia corpus, accessing it is legal), accessing it is legal; if the data collection is illegal, or the manner of accessing it is (i.e. for example a university’s access being used beyond its license, as Sci-Hub does), that is already legally contentious (it is disputed in many jurisdictions, as the United States of America and the Federal Republic of Germany, if just streaming content published in breach of copyright law is violating it, also dependent of dolus), and one should not lay the hands in fire for the collected fruits of such automated accessing, especially if it is commercially exploited as allowed by the licenses used on Wiktionary.
I can’t show off with my writing style, being unnatural in normal people’s view is my expressions’ very nature, or sounding like a 19th century novel (wherefore being natural though if the matter dealt with is a complicated matter of human culture, as language is? I don’t know why people recall nature when we can surpass it.). And it is not loquacious, I already write off parts of it. Besides, the point of reading it could lie in it saving minds from futile pursuits. Like others, I don’t talk if I prognosticate that my verbosity does not pay proportionately. What is the prospect though of how much work hours the creation and maintenance of those lists take? Palaestrator verborum (loquier) 20:32, 18 October 2017 (UTC)
  • It would be nice to have word frequency information available, but there is the serious PoS/Etymology problem (eg, dyke or dike). I am skeptical of both the heavily annotated corpora (which differentiate [or try to] by PoS, but are generally small) or the large corpora (which do not usually make accurate PoS determinations). That said Google N-Grams and the BYU corpora would be fairly useful, though I have not investigated the terms of use for their frequency data. It doesn't seem particularly useful in an entry. It would be very useful to have some kind of quick indication as to what frequency class a given word used in a definiens was in (eg, top 10K, next 40K, next 200K, perhaps next 750K). As an appendix such lists might make it easier for a contributor to check the understandability of a definition. DCDuring (talk) 21:02, 18 October 2017 (UTC)
    Google N-Grams makes possible Reference links like this frequency comparison of canvas and canvass as verb and noun. To me that seems useful to contributors and to passive users. DCDuring (talk) 21:08, 18 October 2017 (UTC)
    I completely agree that frequency at the POS or even sense level is much more useful, but how that might happen eludes me. The concept of not necessarily providing specific rank, but instead indicating a frequency class of some kind could be interesting, but the underlying data would suffer in the same way. - TheDaveRoss 11:56, 19 October 2017 (UTC)
    The annotated corpora, like Google N-grams, BYU COCA, as well as smaller ones, support PoS at least. Usually, one etymology accounts for the overwhelming majority of the usage in a given PoS, which we could note in such cases with little OR. DCDuring (talk) 13:11, 19 October 2017 (UTC)
  • Thanks for getting rid of it! It's been one of my bugbears for years! --P5Nd2 (talk) 10:06, 20 October 2017 (UTC)
It would be very interesting to have some corpus-related data available here. It's fairly easy to produce ranking lists from Google's N-gram data, I extracted some a while ago for French and German (only top 1K). – Jberkel (talk) 13:55, 6 November 2017 (UTC)
If you would like to share your methodology, or simply generate some lists, I am happy to work on the insertion into entry side of things. Obviously understanding the benefits and limitations of any given corpus is critical. Is it possible to do things like extract the top n words from books within a date range? - TheDaveRoss 14:08, 6 November 2017 (UTC)
The code is on gitlab. If you want I can help to generate some lists. To keep the PoS the script needs to be changed. The date range is flexible. Apparently there are some data quality issues (OCR) in Google's corpus before 1800 and after 2000. – Jberkel (talk) 17:47, 6 November 2017 (UTC)
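For anyone wondering what the "top n words within a date range" extraction could look like, here is a minimal Python sketch. It is not the gitlab script mentioned above. It assumes tab-separated rows of the form ngram, year, match_count, volume_count, which is how I understand the Google Books 1-gram files to be laid out. The function name `top_words` and the sample counts are made up for illustration.

```python
from collections import Counter

# Hypothetical sketch: rank words by total matches within a year range.
# Assumption: each row is "ngram<TAB>year<TAB>match_count<TAB>volume_count".


def top_words(lines, start_year, end_year, n):
    """Return the n most frequent ngrams between start_year and end_year."""
    totals = Counter()
    for line in lines:
        ngram, year, match_count, _volume_count = line.rstrip("\n").split("\t")
        if start_year <= int(year) <= end_year:
            totals[ngram] += int(match_count)
    return [word for word, _count in totals.most_common(n)]


# Made-up rows, only to show the input shape.
rows = [
    "the\t1899\t50\t5",
    "the\t1950\t100\t10",
    "of\t1950\t60\t8",
    "canvas\t1950\t3\t2",
    "of\t2005\t500\t9",
]
print(top_words(rows, 1900, 2000, 2))
```

Restricting the range, for example to 1800 through 2000, is one simple way to sidestep the OCR quality issues mentioned for the earliest and latest years. Keeping PoS would mean reading the tagged ngrams instead of the plain ones.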

Catholicism vs. Roman Catholicism vs. Eastern Catholicism[edit]

There seems to me to be an inconsistency with how these terms are used in Wiktionary. I just want to clarify what these terms mean so that we can streamline their usage in Wiktionary. "Catholicism" or "Catholic" (as capitalized) in common parlance would be the faith or word connoting the Catholic Church (those that are in communion with the Pope in Rome). "Roman Catholicism", on the other hand, means something specific, since it would refer to Catholicism that is using the Roman Rite within the Catholic Church (as opposed to those using other rites such as Byzantine Catholicism, Coptic Catholicism, Syriac Catholicism, etc.). However, historically, the term "Roman Catholicism" was used as a pejorative slur in English-speaking circles for the Catholic Church. Actually, if referring to all western rites (together with the Roman rite, the Ambrosian rite, etc.), it would collectively be called "Latin Catholicism", or "Western Catholicism". "Eastern Catholicism", on the other hand, also means something specific, since it would refer to Catholicism (in communion with the Pope in Rome) that uses any Eastern rite, which would include Byzantine Catholics, Syriac Catholics, Chaldean Catholics, etc.

The problem now arises. Some call anyone in communion with the Pope a "Roman Catholic", but almost all Eastern Catholics don't like it, because they say they don't use the Roman rite. Therefore, they don't associate with the term "Roman Catholic", but with whatever their sui iuris church or rite is, like a "Ukrainian Catholic", or a "Greek Catholic". Therefore, some terminology associated with the entire Catholic Church, let's say the Council of Trent, would be associated with the entire Catholic Church, but it is labelled as "Roman Catholic", and Eastern Catholics, since they are in communion with the Pope, would also hold the Council of Trent as true. Therefore, we get Wiktionary entries like patron saint that have both, which is pretty redundant, because one could just label this as simply "Catholicism", and it would be simpler and no one would misunderstand it. Everyone would understand that it means something associated with the Catholic Church.

Therefore, for simplicity, clarity, and completeness of information, I move that all labels under "Roman Catholicism" be changed to "Catholicism" unless the entry is really just concerned with the Roman rite of Catholicism, terms like "Agnus Dei" or a "humeral veil", although I find it redundant too that we need to provide the specific rite within the Catholic Church to which the entry is used. --Mar vin kaiser (talk) 13:16, 19 October 2017 (UTC)

I thought that in English at least, the term Roman Catholic meant a Catholic who is in communion with the Bishop of Rome, i.e. the Pope, and thus includes Eastern Catholics. The "Roman" is necessary in order to exclude Anglicans and the Eastern Orthodox (who are also considered part of the Catholic Church as that term is used in the Creed). —Aɴɢʀ (talk) 13:54, 19 October 2017 (UTC)
@Angr: Actually, in those cases, the word "catholic" is written in lower case, which is a common practice in reciting the creed by Protestants, such as the "catholic church". When it is capitalized, as "Catholic", it would refer to the Catholic Church in communion with the Pope. This is exemplified by the fact that when one asks what religion you are, one says "Catholic", "Orthodox" (Eastern or Oriental), or "Anglican", and there is no ambiguity with regards to the term "Catholic" that it automatically refers to the Catholic Church in communion with the Pope. As I said, the reason why "Catholic" should be used instead of "Roman Catholic" is because Eastern Catholics simply do not subscribe to the idea that they are Roman Catholics, because they do not follow the Roman rite, nor any Roman tradition, as started in the church in Rome. They have their own liturgy and practices, distinct from the Roman rite, thus they refuse to be called "Roman Catholic". Since the entries in Wiktionary labelled as "Roman Catholic" also apply to "Eastern Catholics", it's better to label as "Catholic". In religious discussion, people actually differentiate "catholic" and "Catholic", wherein the capitalized word refers to the Catholic Church in communion with the Pope. --Mar vin kaiser (talk) 14:31, 19 October 2017 (UTC)
That's not how I've ever understood "small-c catholic"; I've always taken it to refer to the nonreligious sense of catholic: "universal; all-encompassing; pertaining to all kinds of people and their range of tastes, proclivities etc.; liberal", while "big-C Catholic" has the religious senses. I suppose that, just as with the word American, there are different meanings to both Catholic and Roman Catholic and different people prefer different meanings and get into arguments with other people as to the "proper" meaning. The trouble is, there is no term that is both unambiguous and commonly used that refers to all churches in communion with the Pope. Both "Catholic Church" and "Roman Catholic Church" are ambiguous as they mean different things to different people, and "Church in communion with the Pope" is unwieldy and not exactly a common term (quite apart from the ambiguity of pope, which can refer to other people than the Bishop of Rome). —Aɴɢʀ (talk) 14:44, 19 October 2017 (UTC)
@Angr: I see what you mean, and I understand the trouble of ambiguity. How about we follow the precedent of Wikipedia? The Wikipedia entry w:Catholic Church pertains to all churches in communion with the Bishop of Rome. How about using the "Catholic Church" as a label instead? --Mar vin kaiser (talk) 17:10, 19 October 2017 (UTC)
It still seems so weird and funny to me that the entry "particular Church" is labelled as "Roman Catholic" when almost all of the particular Churches except 1 refuse to be called "Roman Catholic". --Mar vin kaiser (talk) 17:12, 19 October 2017 (UTC)
@Mar vin kaiser: The Anglo-Catholic in me rebels at seeing "Catholic Church" used to mean only the parts of it in communion with the Pope (and I was opposed to Wikipedia's moving "Roman Catholic Church" to "Catholic Church" several years ago), but the pragmatist in me says I suppose it's the least bad solution. What do others think? I feel like this isn't a decision that should be made by Mar vin kaiser and me alone. —Aɴɢʀ (talk) 13:47, 20 October 2017 (UTC)
@Angr, Mar vin kaiser: I don't see what the difference between "Catholicism" and "Catholic Church" is. And what about just "Catholic"? Would that be problematic? — justin(r)leung (t...) | c=› } 15:14, 20 October 2017 (UTC)
Come to think of it, I just noticed that if you type "Catholic" or "Catholicism" into Wikipedia, it redirects you to the article of the "Catholic Church". --Mar vin kaiser (talk) 15:27, 20 October 2017 (UTC)
I don't think there is a final solution, since there is ambiguity no matter which term you use. I think "Catholicism" or "Catholic Church" are probably the best ways to label something pertaining to the Church in communion with the Pope, since it is almost always what people are referring to when they use that term, and using more specific labels like "Old Catholicism" or "Old Catholic Church" for other brands of Catholicism. There's no perfect solution, but I think that's the way to go. We almost need another term, like "Papal Catholicism" (although that still leaves an ambiguity between Roman Catholics and sedevacantists...). :P Andrew Sheedy (talk) 03:55, 22 October 2017 (UTC)
I prefer to keep it at "Roman Catholicism". It is less ambiguous and not really unwieldy. Lingo Bingo Dingo (talk) 11:24, 25 October 2017 (UTC)
@Lingo Bingo Dingo: As I said, most entries that refer to Roman Catholicism also refer to other rites in the Catholic Church, such as all the Eastern rites. I think it makes more sense to follow the approach in Wikipedia, which is the label "Catholicism". --Mar vin kaiser (talk) 08:12, 19 November 2017 (UTC)
I agree that the label should simply be Catholicism, except in instances where there is actually a distinction between East and West. For instance, Eastern Catholics (at least in the Byzantine Rite) do not celebrate Mass, strictly speaking, and they prefer the term Divine Liturgy, so the label currently at the former entry should be left as is. Andrew Sheedy (talk) 20:37, 19 November 2017 (UTC)
Yes, I read the above discussion and understood that you meant that, but that doesn't address my concerns. "Roman Catholicism" immediately disambiguates from "Christianity based on the ecumenical councils", "Latin Christianity", "Old Catholicism", "Anglo-Catholicism/Anglican Catholicism", "Liberal Catholicism" and whatnot, and "Catholicism" doesn't. In normal use this is hardly ever ambiguous because of context, but there isn't a lot of context that we can cram into a label. Lingo Bingo Dingo (talk) 11:54, 21 November 2017 (UTC)

Please revert vandalism at WT:LOP[edit]

Can someone please undo the vandalism of 2017 October 9 at WT:LOP?

Almost all the page content was deleted by vandal user 2602:306:3B60:F5F0:9DB8:9A6C:2612:A93 (talk)

For some unknown reason pressing revert or trying to go back to the last good version resulted in an edit filter saying it was impossible to save.

-- 06:24, 20 October 2017 (UTC)

Thanks, done. (Repeated vandalism at Appendix:List of protologisms/Q–Z by similar IPs; may require protection or range block.) Wyang (talk) 06:27, 20 October 2017 (UTC)

Why isn’t it even possible for a non-sysop user to revert multiple commits, at least by a single IP? This could of course be used by vandals, but just deleting content has the same result, and without it the vandals have an advantage, because they just need to save multiple times to make their changes unrevertable. Or do I just fail to see how this can be done by me as a plain user? There have been a number of cases however where I would have been faster than an admin in reverting vandalism. Palaestrator verborum (loquier) 09:10, 20 October 2017 (UTC)

@Palaestrator verborum: You can revert several edits in a single stroke, although it's (slightly) convoluted. Go to the revision history of the entry; on the left corner, there's a button called "Compare selected revisions". By default, it will compare the current revision with the second-to-last. Change that to the last good version of the entry instead, then click the button "Compare selected revisions", then "Undo". --Barytonesis (talk) 10:21, 24 October 2017 (UTC)
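The same multi-edit undo can also be reached directly by URL, which may help when the history page is awkward to use. The sketch below only builds the link that the "Undo" step leads to, using the undoafter and undo query parameters of MediaWiki's edit action; the function name undo_url and the revision IDs in the usage note are made up for illustration, and the edit still has to be reviewed and saved by hand (an edit filter may still refuse it).

```python
from urllib.parse import urlencode


def undo_url(title, last_good_rev, latest_rev,
             base="https://en.wiktionary.org/w/index.php"):
    """Build an edit link that undoes every revision after last_good_rev,
    up to and including latest_rev, in a single edit."""
    query = urlencode({
        "title": title,
        "action": "edit",
        "undoafter": last_good_rev,  # last revision to keep
        "undo": latest_rev,          # newest revision to undo
    })
    return f"{base}?{query}"
```

For example, undo_url("glow-up", 100, 105) gives a link that opens the edit form with revisions 101 to 105 undone (hypothetical revision numbers).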

Removing images of coats of arms[edit]

Relatively recently, coats of arms have been added to entries as images.

I propose to remove all images of coats of arms.

Images should help find what the thing referred to looks like or help get a clearer idea of the referent in another way. Coats of arms do not serve the purpose at all. For countries, states and cities, geographic maps seem okay as images.

--Dan Polansky (talk) 07:44, 21 October 2017 (UTC)

Support removing coats of arms. Support having maps in the entries mentioned. --Daniel Carrero (talk) 10:59, 21 October 2017 (UTC)
Support and maybe include a map through OpenStreetMap with the extension Kartographer? Noé 12:03, 23 October 2017 (UTC)
Support (unless the entry is referring specifically to a coat of arms of course) Pengo (talk) 04:41, 25 October 2017 (UTC)
Support, except if the coat of arms is relevant to a sense or an etymology. Lingo Bingo Dingo (talk) 11:25, 25 October 2017 (UTC)
Oppose the proposal to "remove all images of coats of arms" -- as others note above, sometimes an entry's senses may refer explicitly to a coat of arms, as at 分銅 (fundō, a weight used in a balance scale) or 茗荷 (myōga, Japanese ginger) or 家紋 (kamon, family crest). Support removal of coats of arms from entries without such senses, such as entries for geographic entities. ‑‑ Eiríkr Útlendi │Tala við mig 16:38, 25 October 2017 (UTC)
As a few people above: I support removing them as clutter, unless there is some relevance beyond just "being the coat of arms of X" (e.g. if the entry is about coats of arms, or perhaps if it's the arms of Doggingham and there's a dog on it). Equinox 20:02, 25 October 2017 (UTC)
I'm with Equinox and others, see aapiskukko. --Hekaheka (talk) 22:19, 25 October 2017 (UTC)
The coat of arms in aapiskukko is fine, and the intent of my proposal is indeed narrower than the formulation: let's remove a coat of arms from city X if the coat of arms is there only for its being coat of arms of that city; ditto for countries. --Dan Polansky (talk) 12:49, 29 October 2017 (UTC)
I have removed ten coats of arms (more are pending): Panama, Mexico, Portugal, Latvia, Serbia, Malaysia, Romania, New Zealand, Slovakia, and Samoa. --Dan Polansky (talk) 22:09, 4 November 2017 (UTC)
I have removed as many coats of arms as I could quickly find. Two search terms: incategory:"English proper nouns" coat of arms; incategory:"Norwegian Bokmål proper nouns" coat of arms. --Dan Polansky (talk) 09:21, 12 November 2017 (UTC)
Oppose. They are images that relate to the entries in question. DonnanZ (talk) 09:03, 12 November 2017 (UTC)

WT:WDP and senseid[edit]

Not going to hide, I'm eager to have this information directly at Wiktionary.

Template:senseid is probably not the best template, and there may be better solutions.

So I suggest two votes:

  1. whether Wikidata IDs could be used in parallel to synonyms (to connect identical senses through shared Wikidata IDs)
  2. whether Template:senseid is the best option for this

I suggest starting the vote on 28-10-2017. d1g (talk) 15:50, 21 October 2017 (UTC)


Discuss any questions before voting here. d1g (talk) 15:50, 21 October 2017 (UTC)

Wikidata ids in order to capture same senses[edit]




Template:senseid as current solution to implement topic above[edit]




from vs <[edit]

I stumbled upon this old line in Wiktionary:Etymology:

Some editors use the word “from” to separate ancestors, while others use the algebraic “<”. The symbol “<” implies an arrow that points in the direction of language change. There is currently no consensus on a preferred form, but a majority of editors prefer "from" over "<".

There was no clear consensus in 2011, but the "from" side has clearly won out by 2017. Can we update WT:ETY or does it need some kind of vote? Pengo (talk) 16:25, 21 October 2017 (UTC)

I've deleted that paragraph. --Barytonesis (talk) 16:53, 21 October 2017 (UTC)
I added the paragraph back: it is true and links to evidence of consensus or its lack. If you can provide evidence of consensus, we can update the paragraph to state what the consensus is. --Dan Polansky (talk) 17:27, 21 October 2017 (UTC)
Newer user here. I registered two and a half weeks ago and worked on and read the English Wiktionary heavily since, having almost 10,000 pages since then in my browser history from en.wiktionary.org, counting several tens of thousands more for the earlier rest of this year, but I have not seen the use of such a sign since then, or if I have seen it, it has not exceeded five times, and I cannot imagine that a reader can have an honest will to see it.
Instead, it appears to me that if one is an editor that has not forgone caring for the visual appearance of a linguistic entry, one is inclined by one’s refinement to substitute such figures en passant; I don’t think that it is good typography and I must own that it would be an understatement to claim that I would need to tax my brain to apprehend what such a mark is intended to signify. It is hard for someone who lacks any background of having seen it used other than in unavoidable mathematical education and on the other hand it is hard for a mathematician likewise because he who knows mathematics is irritated by any use of mathematical characters for which he has trained much to have specific concepts. Also I am concerned that it is a bad habit to use that sign anywhere at all on the web with the intention of its glyph being displayed, as its meaning is restricted to being part of the XML and HTML markup and one can make a mess with it easily or have it filtered (on other sites if not here).
Uses in print are based on space rationales while the sign does not belong to the inventory of any writing system, but else you can with as much success use ⊷ or ⊱ or √ if you use < – what you want to express with it is no less obscure, as using “<” is a far-fetched trope no matter of how much frequency it has; “smaller than” in mathematics does not overtly map to “borrowed from” or “inherited from” (the keeping apart of which is also double-crossed by it), and the other uses of the sign on the web further distort the meaning up to conviction that this is not a sign that belongs into etymology sections of community-edited web dictionaries. Of course the same applies to many other parts of the web and goes with the characters “>”, “=”, “"” and “~” in so far as these characters are not used in any language but in the scribal traditions of special disciplines. The fact that your computer has a character on your keyboard does not at all recommend it being used outside of a computer-context, and particularly it can not ever be a standard as it does not need to be included at all in a keyboard layout, or in the keyboard if the key that would carry it is missing as is sometimes the case with the ⟨LSGT⟩ key that carries both ASCII angle brackets in half of Europe (vulgo: the one right of the left shift key).
Is there anything about the “<” sign that outweighs the detriments of its usage? Meseems I have falsified everything that one could possibly say in favor of it, and that this my posting could as well have a place in Wiktionary:Etymology. Come hither, defenders of U+003C in web dictionary etymologies! Can you hold anyone at your side? Or will you shrewdly ignore the issue because of convenience? A community decision would be relieving for the purpose of providing a reference that using “<” for etymologies is uncouth, in the ambit of the rationale. I quit writing for it now, as it seems that I have written the top beer parlor post of this year by length without even being inebriated, but I hope that this has an effect, for the quality of the English Wiktionary and maybe other projects, as it has taken me three hours of thinking and explaining and I have in this time done my best to garner your deltas and I want to spare people from repeating the formulation of the thoughts I have laid down. Palaestrator verborum (loquier) 23:17, 21 October 2017 (UTC)
I have now read the prior utterances in the 2011 thread, and I highlight that the 2011 vote had introducing a certain format with “from” as a practice. However, now the thread is about if U+003C should be replaced, not about what can be used in general. From what I have fished out there, there were not so relevant reasons why people actually opposed. One (Mglovesfun) said: “< is easier on the eye than 'from', not least because in reading I 'internally' pronounce the 'from' but for < I pronounce nothing” – which can only very conditionally be true, as I have shewn that its nature is obscure and the “internal pronunciation”, if it exists, which is a dubious phonocentristic claim, cannot have weight as one asks oneself what to pronounce for “<” ⇨ it is illogical. Other people opposed because they had to read “a page full of verbal diarrhea” or because they had been asked to vote (SemperBlotto literally), and mostly because they did not want it to be included in the Wiktionary:Etymology page, thus mostly for formal reasons only. One utterance by some Stevey7788 in opposition is exemplary of the votes in opposition missing the point:
“The "<" sign is easier to read, as wordiness and lack of conciseness tend to cause confusion. This symbol is also widely used in academic publications on historical linguistics. Readers would all quickly learn what "<" means, since it should be very intuitive.” It is the job of the writer not to cause confusion when using words, but “<” does that as a rule; one cannot learn what it means because it means multiple things and one has to interpret it in context, while words can be artful. Also the usage of however-academic publications does not count as we work on a different publication type with unlike constraints and allowances. I want to point out one point that I have not really touched: “less than is not readable for users of screen readers, braille displays, or other assistive technologies.” (Neskaya)
Another guy (Bogorm) wrote that it “is useful in longer derivation chains” – there are ways more useful for the reader to show derivation chains; you write “<” because you are too inert to think of another thing and to accomplish it. Of course it is not that much a sin that it justifies punishment, but there should be a canon for confirmation of the replacement of such inertia. What is Unicode for if people opt for ASCII? What is the increasing store and display space in computers for if they use such mediocre shorthands?
We can do better in reaching a consensus by starting with one summary, as I have started. Palaestrator verborum (loquier) 00:04, 22 October 2017 (UTC)
The uses of "<" largely disappeared, sure, but whether that was by "consensus" is not entirely clear. I don't believe there are any conclusive arguments unequivocally in favor of "<" or "from"; how to weigh various pros and cons is a matter of preference. Wiktionary:Votes/pl-2011-02/Deprecating less-than symbol in etymologies did not show consensus. The current text in Wiktionary:Etymology does not mislead a new user, I believe. A new user can read the vote, see that nearly a supermajority (2/3) supported "from", look around a bit, and see that "from" has won in the mainspace. That said, another vote may be in order if we want Wiktionary:Etymology to indicate "from" as the recommended practice. --Dan Polansky (talk) 06:04, 22 October 2017 (UTC)
I disagree that there is "increasing ... display space in computers". Most people seem to use phones or tablets now. Equinox 16:06, 22 October 2017 (UTC)
  • FWIW, from a usability and understandability perspective, I recommend "from" instead of symbolic notation. This is only three characters longer, and it is clearer. ‑‑ Eiríkr Útlendi │Tala við mig 16:17, 23 October 2017 (UTC)
A possible argument against plain "from" (I do not think it is decisive, but it deserves to be out there): "[A] < B < C" clearly indicates seriality, that is "A comes from B, which comes from C". Our current practice "[A] from B, from C" is in principle parseable also in parallel: "from B and from C", i.e. "A comes partly from B and partly from C".
This could be ameliorated with some extra prose: "from B, which comes from C", but this seems excessive. IMO a simple note somewhere relevant on how "from B, from C" is to be interpreted (maybe in the Glossary?) should be sufficient. --Tropylium (talk) 14:45, 28 October 2017 (UTC)
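To make the three notations in this thread concrete, here is a small sketch that renders the same serial chain each way. The function name render_chain and the style names are invented for illustration and do not correspond to any existing Wiktionary template or module.

```python
def render_chain(ancestors, style="from"):
    """Render a serial derivation chain (nearest ancestor first).

    style="from"     -> "from B, from C"   (current practice; reads as serial by convention)
    style="symbol"   -> "< B < C"          (the older symbolic notation)
    style="explicit" -> "from B, which comes from C"  (unambiguous but wordier)
    """
    if not ancestors:
        return ""
    if style == "from":
        return ", ".join(f"from {a}" for a in ancestors)
    if style == "symbol":
        return " ".join(f"< {a}" for a in ancestors)
    if style == "explicit":
        first, *rest = ancestors
        return f"from {first}" + "".join(f", which comes from {a}" for a in rest)
    raise ValueError(f"unknown style: {style}")
```

Only the "explicit" form rules out the parallel reading by itself; the other two rely on the reader knowing the convention, which is what a Glossary note would supply.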

Move {{was wotd}} notices to talk pages[edit]

Would anyone support this? I'm not sure of the utility of this template to a reader beyond archiving old words of the day, and the category Category:Word of the day archive can function as an archive with talk pages. DTLHS (talk) 05:33, 22 October 2017 (UTC)

Why though, is it too ugly? I am afraid it has exactly zero use, only being a source of work that could be used for removing or fixing other template usages. Also it would effectuate additional clicks ad infinitum because people would have to go to the talk pages to check if a word already has been word of the day. Palaestrator verborum (loquier) 05:44, 22 October 2017 (UTC)
  • Support moving "was wotd" to talk pages. I don't see why a reader should care to see immediately whether a word was once the word of the day. --Dan Polansky (talk) 06:07, 22 October 2017 (UTC)
Having it on the talk page is pointless. I personally find it an interesting bit of information. An alternative would be a category (although not very visible, still better than dumping it on talk). – Jberkel (talk) 16:01, 22 October 2017 (UTC)
  • @DTLHS: delete from the main space in any case, it's visual clutter. --Barytonesis (talk) 15:46, 31 October 2017 (UTC)
  • I disagree, I think it is useful to keep them where they are. It probably helps prevent the same word being nominated for WOTD again. DonnanZ (talk) 11:47, 12 November 2017 (UTC)
Nobody seems to have consulted @Sgconlaw. DonnanZ (talk) 11:54, 12 November 2017 (UTC)
  • Disagree: I think it would be more useful on the entry page. I have a feeling that if it is relegated to the talk page, new nominators will simply not bother to check whether it is there before nominating words. (Also, pinging @Metaknowledge as the result of this discussion would apply to {{was fwotd}} too.) — SGconlaw (talk) 15:03, 12 November 2017 (UTC)
  • Disagree. Thanks for the ping, Jack. If there really were a consensus that it is too much visual clutter (which I don't believe there is), the correct change would be to modify the template's behaviour so that it does not display but continues to categorise, not to move it to the talk page and create more work for everyone. —Μετάknowledgediscuss/deeds 19:25, 12 November 2017 (UTC)
    In that case the category would need to be made visible to all users, otherwise the template ceases to have the effect of alerting users to the fact that a word has already appeared as WOTD or FWOTD. — SGconlaw (talk) 03:41, 13 November 2017 (UTC)

Category:Buyeo language[edit]

I just RFV-failed our only two entries: 乙那 and . Is it clear that there was a Buyeo language, and if so, are there any texts that are indisputably in Buyeo?__Gamren (talk) 16:07, 22 October 2017 (UTC)

Oh, and @suzukaze-c, -sche, Pedrianaplant.__Gamren (talk) 16:09, 22 October 2017 (UTC)
@Eirikr I did read that, but it did not give me the impression of certainty. The article does list Buyeo as a Buyeo language (which no doubt leads to misunderstandings), but their article on the language redirects to the article of the ("hypothetical") language family. Did any of the things you read make distinctions between the languages, and did they identify any texts as belonging to the language Buyeo, or do they reconstruct words?__Gamren (talk) 15:38, 26 October 2017 (UTC)

Poll: deploy timeless skin[edit]

It'd be great to have the Timeless skin deployed here (opt-in, of course). Wiktionnaire, the French Wikipedia and a bunch of other sites have it already (see T154371 for a list). It has a sticky header and a responsive design that handles different screen sizes much better than vector (demo). We need to have community consensus before it gets deployed, hence this poll. – Jberkel (talk) 16:14, 22 October 2017 (UTC)

The tables of content in articles of Wiktionnaire look flattering, a bit like i3-gaps and particularly more readable. I have also the perception that I can have more on my screen in general with it even though it looks like i3-gaps. So if it gets deployed, we only get a plus, as one can use other skins? So there is no reason to haggle and I assent. Palaestrator verborum (loquier) 20:42, 22 October 2017 (UTC)
I see no reason not to. (as long as it's not default) —Aryaman (मुझसे बात करो) 22:57, 22 October 2017 (UTC)
If it's not going to immediately replace Vector, I see no harm in including it as an alternative users can choose. —suzukaze (tc) 19:48, 23 October 2017 (UTC)
Oppose making the skin default. Sticky headers are a horrible thing, and I am so happy this fashion of the day has not reached Wiktionary yet. I dread the day on which this kind of design fashion prevails on the English Wiktionary. I don't know what to think of an opt-in deployment; I guess if someone wants to use it, that's up to them, and therefore, I don't oppose deployment as opt-in. --Dan Polansky (talk) 06:57, 28 October 2017 (UTC)
I hate it. I'm opposed to any loss of space to the width of the article and the font size is way too large. Wiktionnaire can keep it. --Victar (talk) 22:56, 21 November 2017 (UTC)

FYI, it seems that the development team is talking about deploying the timeless skin on every wiki. Pamputt (talk) 07:56, 19 November 2017 (UTC)

We should probably add it. I could live with the "sticky header". Might stick with the theme I've got, though. Equinox 16:11, 21 November 2017 (UTC)

The user Equinox is abusing his administrative authority.[edit]

On the Discussion page of the entry glow-up, I provided sources to prove the entry's admissibility to Wiktionary according to all rules of this site. On the first page of Google, you can see the phrase is used by many notable YouTubers with millions of subscribers and well-known sites such as Buzzfeed, Reddit, Pinterest, and many blogs, over the course of many years. Still, the user Equinox let his feelings cloud his judgment and deleted the evidence and blocked me. I will do everything until the user Equinox gets sanctions for his administrative abuse of powers. Wiktionary and Wikipedia have rules and he violated many of them with this abuse of authority. He is an angry INTJ that should never be given any authority. He can't control his emotions. He is poison that hinders Wiktionary's growth. —This unsigned comment was added by (talk).

  • I beg to differ. You removed an RfV template before the result of the verification process. That is a long-standing cause for blocking. You need to understand our rules. SemperBlotto (talk) 12:25, 23 October 2017 (UTC)
The deleted talk page contains one link: Proof. Ditto. See also User_talk: with endless requests to be unblocked. The block is justified. The IP talk page should be blocked as well. --Anatoli T. (обсудить/вклад) 12:42, 23 October 2017 (UTC)
  • @ "He is poison that hinders Wiktionary's growth." lol, Equinox has the highest edit count on the whole project. I too suggest that you be blocked. —Aryaman (मुझसे बात करो) 22:51, 23 October 2017 (UTC)
  • "He is an angry INTJ". In case anyone is wondering, you can safely ignore the above user--his claims are baseless. —Justin (koavf)TCM 00:37, 24 October 2017 (UTC)
  • I laughed because I really am an angry INTJ. (I've taken that test twice, ten years apart.) Equinox 00:39, 24 October 2017 (UTC)

Section "Descendants"[edit]

Could someone edit Wiktionary:Entry layout and explain a bit more the scope of the section "Descendants"? It's not supposed to be used to list all the derived terms, but only the direct descendants. An example. --Barytonesis (talk) 09:52, 24 October 2017 (UTC)

Well I agree. Some editors take it a bit too far and add numerous English derivatives to a basic Latin root entry even if there are other Latin entries that are more immediate ancestors to those words; in that case, I take them out and keep them on the appropriate page. However, I sometimes add a derived term that may not be directly descended if there is no other lemma entry it could go under, but the word is still notable, and perhaps the only "descended" term from that word, even if not directly and through another intermediate (in this case Vulgar Latin) term. Like with poitrine; although in this case pis does exist... but poitrine is now the main word for chest. I mean, should we start adding links to the separate reconstruction pages for Vulgar Latin derivatives in the main Latin entries as opposed to just putting the derived Romance term itself there? Word dewd544 (talk) 20:53, 30 October 2017 (UTC)
@Word dewd544: You're right, and I was actually a bit hesitant to remove poitrine; I was targeting more specifically the compounds, and especially the neo-classical compounds, which really don't belong there.
On the one hand, I'd prefer we always put the descendants on the page of their very etymon; on the other, I don't want to see all the info disseminated on a myriad of pages, and I certainly don't want to create a Vulgar Latin entry every time.
So what do you think of this? The layout is not terribly pretty but... --Barytonesis (talk) 11:41, 6 November 2017 (UTC)
It's not bad. I personally like it and think it can work, but there are a few issues I thought about. And this kind of relates to a broader problem. I remember back when I first started, some users were saying that although the nested type of Descendants sections looked better and were more accurate, they hesitated making that a policy because new users unfamiliar with the precise etymological histories of certain words would necessarily not know where to add them, and preferred just having one big descendants section with each descendant word on its own line regardless of the path the word took to get to it. I personally don't much like putting the word "borrowed" or "borrowing" in parentheses after words in descendant sections because it makes it look sort of cluttered and messy, (and there's also words that we're not sure about either), but for lack of a better approach I've been doing that lately. I like the big nested tree of Romance languages that are made for some entries and linked to (and edited separately) but again that would depend on people who are well-versed in these things to sort out and probably preclude your more average user from contributing. The same applies with having a Vulgar Latin derivative section... also where would terms that we are unclear about fit, and what if there end up being like six different Vulgar Latin sub-sections because of how differently various languages evolved them? Word dewd544 (talk) 23:27, 6 November 2017 (UTC)
@Word dewd544: sorry for this late (and for now incomplete) answer. Mh, maybe we don't have to make it a policy. In my view, users should be allowed to add descendants on their own lines without having to worry where it should go precisely, but we would still make it clear that the end goal is to have them accurately sorted. We won't force anyone to do the sorting themselves, but if they can, let them do it. What do you think? --Barytonesis (talk) 13:22, 21 November 2017 (UTC)
Yeah even before reading this I was thinking along the same lines. There shouldn't be anything to stop new users from adding a word as a descendant of a lemma, but eventually the goal would be to have them sorted out properly. If they know how, then sure, but it may involve people like us or others "cleaning up" some of the pages. Right now there's not that many other people making major or drastic changes to the Romance and Latin pages so it's not like we have to worry about this on a large scale. I'm still steadily going through the inherited lexicon of the Romance languages and I'll do it the way we discussed here. Word dewd544 (talk) 22:41, 21 November 2017 (UTC)

Prompted by Barytonesis to participate following a discussion I had with Rua, here are my two cents. I've taken issue lately with an anon adding English descendants which aren't directly derived from the lemma entries. For instance, salient is indirectly related to salio, but isn't the leap a bit too big to motivate the inclusion of salient as a descendant of salio? As I pointed out to Rua, where do we draw the line? E.g. consequently I could add the following Romanian words to descendants of salto, because they're inevitably from the same source: salt, sălta, săltare, săltăreț, săltător, săltătură, saltație, etc. The notion of adding every word remotely descendent of a specific lemma would create utter chaos in our descendants sections, and that scares the crap out of me. --Robbie SWE (talk) 19:52, 18 November 2017 (UTC)

salient is from saliens, the present participle of saliō. Participles are non-lemmas, so we add the descendant to the lemma saliō. If we did not follow this principle, and instead demanded that descendants be listed at the exact form, then French sauter couldn't be listed at saltō either. —Rua (mew) 19:56, 18 November 2017 (UTC)
Sorry Rua, but salient is already listed as a descendant of saliens. It's not appropriate to add salient as a descendant of saliō. I'd appreciate it if you answer my question about the Romanian terms – do you think it's appropriate to add all those terms in the descendants section at saltō? --Robbie SWE (talk) 20:16, 18 November 2017 (UTC) PS: all of the descendants currently listed at saltō have inherited the Latin term through the present infinitive saltāre. --Robbie SWE (talk) 20:36, 18 November 2017 (UTC)
When did they make the decision to make participles non-lemmas? I just noticed that recently. Word dewd544 (talk) 00:24, 20 November 2017 (UTC)
@Word dewd544, I don't remember when it was, but I vividly remember a discussion about it. I also recall Rua participating in that discussion. --Robbie SWE (talk) 19:15, 20 November 2017 (UTC)
Participles have always been non-lemmas. They were most likely listed in the inflection tables of Latin verbs from the beginning. Same for English, but in the headword line. —Rua (mew) 19:34, 20 November 2017 (UTC)
@Rua, Robbie SWE: If we're not going to treat present participles as lemmas, I actually agree with Rua that we should move the descendants listed at saliens back to salio, but iff we create subsections in the descendants section ("From the present participle saliens", "From the infinitive saltare" (although that might be pushing it), etc.), as I suggested above (cf. pectus). Otherwise, no; I don't want to have all the derived terms lumped together. --Barytonesis (talk) 20:10, 20 November 2017 (UTC)
@Barytonesis, it could work. However, I still don't think the problem is solved. --Robbie SWE (talk) 20:15, 20 November 2017 (UTC)
But Latin participles used to have actual definitions; in the case of saliens, jumping, leaping, springing, etc. If you look in the page history, that was the case. Now they've been removed and just described as participles. Word dewd544 (talk) 13:05, 21 November 2017 (UTC)
@Word dewd544: I know, I removed them, rather hastily I should add. But either we treat present participles as full lemmas (in which case it has to be reflected in our categorisation scheme), or we don't, in which case we shouldn't have translations or descendants hosted there. --Barytonesis (talk) 13:10, 21 November 2017 (UTC)
I don't like that policy. It will clutter up the descendants sections of the main verb too much. Word dewd544 (talk) 13:16, 21 November 2017 (UTC)

Request for review

I've added etymology to πανούκλα (diff). Could a more experienced editor check it and correct my formatting where necessary, please? In particular I wasn't sure whether the etymology for a specific sense should be put in-line with that definition or at the top. Thanks very much! -Stelio (talk) 09:58, 24 October 2017 (UTC)

Hi. You can ping me on the talk pages of the entries if you want; I'm always happy to work on MGr. In any case, the Beer parlour isn't the place to ask; you should go to the Wiktionary:Etymology scriptorium or Wiktionary:Tea room instead. --Barytonesis (talk) 10:06, 24 October 2017 (UTC)
@Stelio: I've done this. The inline etymology wasn't too shocking in this case, but I think it's best to put it in the etymology section anyway. --Barytonesis (talk) 10:12, 24 October 2017 (UTC)
Perfect; thank you very much! (This seemed the best place to me, since the question was on how to format the etymology, rather than what the etymology is. But duly noted for the future.) -Stelio (talk) 10:24, 24 October 2017 (UTC)
Yes, actually it does belong here rather than there. --Barytonesis (talk) 10:30, 24 October 2017 (UTC)

ISO codes as Wiktionary entries

We've got entries like SH for "The ISO 3166-1 two-letter (alpha-2) code for Saint Helena." But we don't have entries for the ISO 639 language codes (where SH is the deprecated code for Serbo-Croatian). Arguably the language codes are of greater practical value to users of Wiktionary (for translating what the local codes mean) than the country codes. Options include:

  • Status quo: keep existing ISO entries; exclude language code ISO entries
  • Deletionist: remove existing ISO entries; exclude language code ISO entries
  • Inclusionist: keep existing ISO entries; include language code ISO entries
  • Alternative: remove existing ISO entries; include language code ISO entries

Is there a clear policy on this already, and if not is this worth putting to a vote? (I'm hesitant to jump straight into calling a vote, not having done that here before.) -Stelio (talk) 11:23, 24 October 2017 (UTC)

We had a vote on this 7 1/2 years ago, which more people supported than opposed, but not enough more for it to pass, so the motion failed for lack of consensus. Perhaps 7 1/2 years is long enough to warrant a new vote to see if there's a clearer consensus now than there was then. —Aɴɢʀ (talk) 11:34, 24 October 2017 (UTC)
I oppose the creation of entries for any kind of code, unless they are used in running text. —Rua (mew) 14:31, 24 October 2017 (UTC)
I am likewise skeptical about inclusion of codes, and in this case I point out that existing endeavors on Wikipedia to create ISO 639 code lists combined with codes being included on disambiguation pages of Wikipedia are preclusive for Wiktionary containing them, even if just for reason of parsimony – I really would like to see you’all doing other things than including that. I also want to call your attention to language codes being more contestable than country codes, as the existence of countries is mostly fixed, while if language codes are started, we will have quarrels about people including codes which are not ISO but common and things like that, cluttering things that are difficult to ascertain. Things that the people at Wikipedia are more sturdy to be tackled by. The country codes may stay – I am conservative and I don’t feel itched by them. Palaestrator verborum (loquier) 22:28, 24 October 2017 (UTC)
  • Most of the historical discussion is at Talk:jv and Talk:de. I agree with the broad consensus that emerged in those debates, namely that we should have entries for codes if and only if they pass CFI like any other term. —Μετάknowledgediscuss/deeds 20:01, 25 October 2017 (UTC)

Coptic standardisation

@Aearthrise, Algentem, DerekWinters, Vorziblix but also to any interested: I have started a section to standardise a few practices in editing Coptic. The intention is to write down the outcome of the discussion at the Coptic policy page. Lingo Bingo Dingo (talk) 12:08, 24 October 2017 (UTC)

Etymology of Copto-Greek verbs

There are three ways in which Coptic dialects have borrowed verbs from Greek. Sahidic borrows them as bare imperatives, Akhmimic and Lycopolitan combine the nominal state of ⲉⲓⲣⲉ with the Greek imperative, whereas Bohairic uses the nominal state of ⲓⲣⲓ/ⲓⲗⲓ (same verb) with the Greek infinitive. Fayyumic may use both the Bohairic and the Sahidic methods, this being somewhat variable depending on the dialect. Bare infinitives are not used in any dialect.[3]

The treatment of Greek etymologies has begun to diverge a little for Bohairic and Sahidic, compare for instance ⲉⲣⲫⲟⲣⲓⲛ (erphorin) and ⲫⲟⲣⲉⲓ (phorei). I think the best option is to treat the Greek stems like borrowings and to put the nominal state before it as needed, e.g. Bohairic ⲉⲣ- + φορέω, using {{bor}}. Unattested bare infinitives could then be avoided in Coptic text. Maybe it would also be nice to add (in the future) the specific Greek conjugated form in which a word was borrowed, but that may prove a hassle for verbs that do not yet have a Greek entry. Lingo Bingo Dingo (talk) 12:08, 24 October 2017 (UTC)

@Lingo Bingo Dingo: I agree. We should write the etymology something like: From {{prefix|cop|ⲉⲣ}}{{bor|cop|grc|φορέω|notext=1}}.Algentem (talk) 13:50, 7 November 2017 (UTC)

Nominal and pronominal states

The nominal state and the pronominal state are specific forms of words that occur before nouns/adjectives and pronouns respectively. These states can exist for nouns, verbs and prepositions. The scholarly convention is to mark nominal states with a hyphen at the end and pronominal states with a double oblique hyphen at the end, indicating the position of the argument. It seems wise to standardise this before construct states are added in bulk.

Some matters to be solved are:

  1. Should entries be created with the page name as the unhyphenated form (e.g. ⲛ, ⲙⲙⲟ), exclusively as the hyphenated form (e.g. ⲛ-, ⲙⲙⲟ-), or hyphenated for nominal states and double-hyphenated for pronominal states (e.g. ⲛ-, ⲙⲙⲟ⸗)?
  2. Similarly, how should construct states appear in head templates? So should the displayed pronominal state be ⲛⲁ, ⲛⲁ- or ⲛⲁ⸗?
  3. Should construct states have a L3/L4-header as prefixes or as specific parts of speech?
  4. Which form (absolute state, nominal state or pronominal state) should be made the lemma?

Different implementations of 2 and 3 have been tried out at ⲛ- (n-).

I strongly favour displaying nominal states with a hyphen and pronominal states with "doubliques" to follow the modern convention. (point 2)

For lemmatisation I prefer the absolute state, then the nominal state and finally the pronominal state if the other states are unattested. This order of preference is used in most dictionaries. (point 4)

My opinions on the other issues aren't very strong, though I somewhat prefer to have pronominal state entries on pages ending with a hyphen rather than a double hyphen. (point 1) For lemmas it should in my opinion always be clear to the reader what the part of speech is, but I don't favour a particular implementation. (point 3) Lingo Bingo Dingo (talk) 12:08, 24 October 2017 (UTC)

@Lingo Bingo Dingo:
  1. It would be fine to add hyphenated forms for nominal states and double-hyphenated forms for pronominal states.
  2. The pronominal state should be displayed ⲛⲁ, ⲛⲁ- or ⲛⲁ⸗.
  3. If I understood your question correctly, construct states should have a L3/L4-header as specific parts of speech because verbs; Bohairic can differ widely in construct states.
  4. The nominal state should be made the lemma.
    (Ⲁⲉⲁⲣⲑⲣⲓⲥⲉ) 00:11, 25 October 2017‎(UTC)
@Aearthrise: Generally I agree, but why the nominal state? Standard practice in Coptic reference works is to lemmatize at the absolute state if possible, and most of the existing entries (including most of them that you’ve added) are found at the absolute state, but I’m open to arguments if there are advantages to favoring the nominal state instead. — Vorziblix (talk · contribs) 06:42, 26 October 2017 (UTC)
Maybe he was mostly thinking of prepositions, which generally don't have an absolute state, when he wrote that? The question was perhaps a little too concise. Lingo Bingo Dingo (talk) 09:32, 26 October 2017 (UTC)
My confusion derived from the nomenclature of the states; I learned their names as long form, short form, abbreviated form, and past participle form. I agree that we use absolute state(long form).
(Ⲁⲉⲁⲣⲑⲣⲓⲥⲉ) 10:04, 26 October 2017‎(UTC)
@Lingo Bingo Dingo:
1 I am ok with the hyphenation and double-hyphenation as long as double-hyphenation is not too cumbersome.
4 I agree that the absolute state should be lemmatized. DerekWinters (talk) 20:53, 6 November 2017 (UTC)
@Lingo Bingo Dingo: I think that the absolute state should be the lemma. That is the common practice, and most nouns only have the absolute state anyway.
I think that the nominal and pronominal states should be displayed with a hyphen and a double hyphen (oblique or straight?) respectively, but be linked without it (but we do add hyphens to regular prefixes, so I'm not sure). When we create a page for a nominal or a pronominal state we can add the hyphen in the header if we don't have it in the entry.
I also think that nominal and pronominal states should be labeled as prefixes, as they can't stand on their own, but I might change my mind on this in the future.
I created a template that we can use called {{cop-noun}}. It's fully built on the header template, so it works the same. I would like input on it. The only thing I'm unsure about is the plural. I added it at the front, but most entries put it behind the nominal and pronominal states—is there a reason for this? Also, if you don't input a plural form, it assumes that the plural form is identical to the singular and adds that (is this superfluous?). I made it so that you can have up to three plural, nominal and pronominal forms, are more needed? — Algentem (talk) 13:50, 7 November 2017 (UTC)

Dialect tags for Copto-Greek words

Greek borrowings were previously not tagged with dialect labels, but they can differ in dialects and some words are completely absent from some dialects. Therefore telling in what dialects a word is attested is useful information for a reader. This is a proposal to make labelling dialects the norm in Coptic. Lingo Bingo Dingo (talk) 12:08, 24 October 2017 (UTC)

@Lingo Bingo Dingo: I agree that dialect tagging should be the norm. DerekWinters (talk) 20:53, 6 November 2017 (UTC)
@Lingo Bingo Dingo: I agree. Seems logical. — Algentem (talk) 13:50, 7 November 2017 (UTC)


This is an extremely minor matter, but a convention exists to mark statives with obelisks/daggers. These do not mark any arguments and function just as a shorthand. To me it seems rather useless to include, but I really don't care that much. Lingo Bingo Dingo (talk) 12:08, 24 October 2017 (UTC)

Previous discussion found here. On pretty much all of these points I’m in agreement with Lingo Bingo Dingo; I’d have nominal states with a hyphen and pronominal with ⸗ in both the headword line and the entry name, and I’d lemmatize at the absolute state. Regarding statives, if marking them is done it should probably be via a parameter in a headword template, but I agree that it’s probably unnecessary. — Vorziblix (talk · contribs) 00:14, 25 October 2017 (UTC)
@Lingo Bingo Dingo: I would also like to add two points to this discussion. I think that we should add guidelines as to how we add alternative forms in Coptic. Most words in Coptic are attested in several different spellings, and pages like ϫⲱⲙ have too many in my opinion. Either we could add a dropdown box, or we only display the "most common" forms, one for each dialect, or a mix between the two?
Secondly, I don't see a reason why texts in Coptic have to have a bigger font size. I have no difficulty reading Coptic, no more than for example Ancient Greek. For my conjugation template, I made the Coptic font size regular: {{cop-conj}}. — Algentem (talk) 13:50, 7 November 2017 (UTC)

Improving Wiktionary

 Ideas for improving Wiktionary

  • Collect ideas from the community for improving Wiktionary,
  • Readability - format the Table of Contents on the right to allow fluid reading from article top.
  • Etymologies - increase depth of word etymologies and use flowcharts
  • Reference similar words in a more integrated cross language form:
  • Rewrite whole entry into a information redundant but readable definition summary at the top.
  • Treat big words / core words as essay topics and as idea primes / word primes.
  • Integrate Proto-Indo-European roots into a main column for parsing through roots in a readable way.

-Aision (talk) 03:47, 25 October 2017 (UTC)

Ideas are easy. Do you have any of the technical skills necessary to implement them? DTLHS (talk) 03:51, 25 October 2017 (UTC)
These are really vague IMO. The second one is covered by the Tabbed Languages gadget. —Aryaman (मुझसे बात करो) 11:24, 25 October 2017 (UTC)
My own thoughts:
  1. Collect ideas -- Why view this as a separate endeavor? We're *already* actively engaged in improving Wiktionary. We're *already* actively engaged in discussing ideas for improvement (viz. this very page).
  2. Readability (TOC) -- As Aryaman noted, already addressed by some existing gadgetry.
  3. Etymologies -- I believe etymology depth is already being addressed by those interested in etymologies. And I struggle to imagine how flowcharts would be helpful.
  4. Referencing -- I don't understand what Aision means by this suggestion, especially the "cross language" part.
  5. Rewriting -- This also doesn't make sense to me. The sense lines already provide definitions in a readable and concise fashion.
  6. Easy topics / idea primes -- Another puzzler. I don't understand this either.
  7. PIE roots in a main column -- Easier to imagine possibilities for what it would look like, but hard to imagine how this would be implemented, especially in a way that is 1) cross-platform (what about mobile? etc.), and 2) user-friendly (what about those relying on screen readers? etc.).
Clarity on the above would help. ‑‑ Eiríkr Útlendi │Tala við mig 16:53, 25 October 2017 (UTC)
Looks like a mindmap from a teacher training. As dubious as the business of pedagogues is, you just don’t know if it’s dully obvious or obviously vain. Palaestrator verborum (loquier) 17:34, 25 October 2017 (UTC)
I don't really like tabbed languages but I would like a way to demote the ToC (and think we should do so by default). Equinox 19:57, 25 October 2017 (UTC)
  • Some thoughts:
    1. Re: "Collect ideas". If some user or users want to engage in a systematic effort to prepare for major changes in Wiktionary, they may do so, but we seem to have problems achieving consensus for relatively modest changes.
    2. Default design should be for unregistered users, about whom we should get some information, about their preferences and, better, behavior. Otherwise configuration gadgets should solve the most important problems for most users.
    3. Re: "Treat big words / core words as essay topics and as idea primes / word primes." This sounds like a job for Wikipedia. Keywords (1st edition 1976) by Raymond Williams provides examples of such essays for the semantics of about 120 words in about 300 pages, excluding introduction. Comprehensive grammars (eg, CGEL) cover grammatical matters of a few individual words with one or a few paragraphs, otherwise mentioning them in lists of words with similar characteristics.
DCDuring (talk) 04:05, 26 October 2017 (UTC)
On collecting ideas, one problem with our current set-up is that the Beer Parlour etc. archives are set up chronologically, not topically. A topical index of some kind to BP discussions (possibly with notation on the status of the issue: open vs. implemented vs. rejected) would be a big help to getting a better big picture on the state of Wiktionary as a project, I believe. --Tropylium (talk) 14:36, 28 October 2017 (UTC)
See #Topical organization of BP archives below. DCDuring (talk) 18:20, 29 October 2017 (UTC)
About "big words": I was thinking about concentrating our efforts on most viewed pages first (have a look at Wikiscan). OK, there are lots of XXX things in there, but we can ignore those. – Jberkel (talk) 17:34, 6 November 2017 (UTC)

Adding HSK grade to entries

@Atitarev, Bumm13, Dine2016, Dokurrat, Hongthay, Jamesjiao, Justinrleung, kc kennylau, Suzukaze-c The HSK level of the words' meanings could also be added, just as Japanese entries show. --Backinstadiums (talk) 20:53, 25 October 2017 (UTC)


It seems to me that there are a lot of word "applications" that are being overlooked relating to verbs that have present (ing) and past (ed) participles. While many are correctly noted as becoming nouns (present) or adjectives/adverbs (past), a far larger number have neither mentioned. I believe the vast majority of verbs can become adjectives, adverbs, or nouns through their past and present participles, and wonder who is deciding which shall be included in Wiktionary and which shall not? I am an anagram "nut" and include a vast number of such examples in my own anagram list, and wonder why they should not be included in Wiktionary's list. Has this issue come up before, and if so, how is it being handled? —This unsigned comment was added by Scottmacstra (talk • contribs).

Scott MacStravic?? --Backinstadiums (talk) 18:24, 26 October 2017 (UTC)
@Scottmacstra: Just a quick note, Scott--I replaced your email address with your username. If you want to publish your email address, that is totally up to you of course, just wanted to make certain. —Justin (koavf)TCM 19:10, 26 October 2017 (UTC)

User Benwing/Benwing2 has been inactive for some time

Ben was a great contributor in Arabic first, then Russian - even more (other languages, modules and many templates) but has been absent for a considerable amount of time. He hasn't handed over any codes, e.g. for generating inflected Russian forms and pronunciations, many other things. I have tried to contact him a few times but didn't get any response. It's sad but is there anything available that someone can take over or, at least, document? To be honest, I sort of relied on him being available on any fixes, enhancements in the infrastructure for Russian and didn't bother to ask him to train how to do things (modules are well documented, though). Also, I want to thank him for creating a great infrastructure, so that creating quality entries (with the knowledge of the complicated Arabic and Russian pronunciation and grammar) has become easy. --Anatoli T. (обсудить/вклад) 08:32, 27 October 2017 (UTC)

@Atitarev: I hope that it's only temporary, or if not, that he's alright anyway. I was really impressed with his work. --Barytonesis (talk) 19:45, 31 October 2017 (UTC)

How to indicate a long å?

In Northern Sami, various letters can indicate short or long vowels, and we indicate the long ones with a macron, as in many other languages as well as some linguistic descriptions of the language. In Lule Sami, the same would be desirable, but there is a problem with one letter that Northern Sami doesn't use: å. When you put a macron on it, it looks really weird and not really recognisable as a macron, å̄. Does anyone have ideas on how to make this look better? —Rua (mew) 10:53, 27 October 2017 (UTC)

Is a macron below any good? å̱ —Rua (mew) 10:58, 27 October 2017 (UTC)

What do published works use? In the font I use, the macron above looks just fine; another option would be to use the acute accent to mark length in this case since precomposed ǻ (U+01FB) is an existing Unicode point. —Aɴɢʀ (talk) 18:18, 27 October 2017 (UTC)
In the work that distinguishes them that I have, it's distinguished by bolding. Another one just writes in plain text that the å is long. So that's not particularly useful. —Rua (mew) 20:23, 27 October 2017 (UTC)
No, I guess not. Is this just for headword lines, not for page names? I'm still partial to ǻ. —Aɴɢʀ (talk) 23:08, 27 October 2017 (UTC)
Lule Sami and Northern Sami already use á as a real letter in orthography, so I'd rather not use the acute for a non-orthographic mark. It's why I prefer the macron, anyone who knows the spelling will know that it's not normally written that way. —Rua (mew) 13:12, 28 October 2017 (UTC)
What's wrong with following "ː"? Chuck Entz (talk) 21:56, 27 October 2017 (UTC)
In a headword line? —Rua (mew) 21:57, 27 October 2017 (UTC)
There is an IPA rule stating that diacritics that would clash with a descender or an ascender can be printed after the letter instead (so e.g. [g˖] for a fronted [g], or [l̩´] for syllabic [l] with high tone). This might be applicable here too: å¯? --Tropylium (talk) 14:30, 28 October 2017 (UTC)
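
For anyone who wants to compare the options raised in this thread at the code point level, here is a minimal Python sketch using only the standard `unicodedata` module. The helper name `describe` and the variable names are made up for illustration. It suggests that å with a macron above or below has no precomposed form, so it stays a base letter plus a combining mark after NFC normalisation (which may be part of why fonts stack it poorly), whereas a + ring + acute composes to the single code point U+01FB mentioned above.

```python
import unicodedata

def describe(text):
    """List the code points of `text` after NFC normalisation (illustrative helper)."""
    nfc = unicodedata.normalize("NFC", text)
    return [f"U+{ord(ch):04X} {unicodedata.name(ch)}" for ch in nfc]

# Candidates discussed in the thread, built from explicit code points.
macron_above = "\u00e5\u0304"   # å + COMBINING MACRON
macron_below = "\u00e5\u0331"   # å + COMBINING MACRON BELOW
ring_acute = "a\u030a\u0301"    # a + ring above + acute, fully decomposed

for label, candidate in [
    ("macron above", macron_above),
    ("macron below", macron_below),
    ("ring + acute", ring_acute),
]:
    print(label, describe(candidate))
```

How well the two-code-point sequences render still depends on the font, so this only shows what is being asked of the font, not how it will look.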

Please vote: Wiktionary:Votes/2017-07/Templatizing topical categories in the mainspace 2

Please vote here: Wiktionary:Votes/2017-07/Templatizing topical categories in the mainspace 2.

Support, oppose, abstain, anything is fine.

I extended this vote by 1 month per request by @Dan Polansky, if no one minds. He requested it at the "Decision" section at the end of the vote.

To be clear: personally, I too support extending the vote. Like Dan, I know there's been some opposition to vote extensions.

The vote already was way past the scheduled end date, but only 7 people voted and the vote count would result in "no consensus" as of now. --Daniel Carrero (talk) 04:46, 28 October 2017 (UTC)

To be honest, I do not even understand what the vote is about or to which end it is started. “Templatizing the markup”? What is that supposed to mean? Are the voters supposed to okay the use of bots to replace direct categorizations with templates? Or asked differently: What will change or is changed when the vote passes? Palaestrator verborum (loquier) 13:20, 28 October 2017 (UTC)
@Palaestrator verborum: The vote is about allowing templates {{cat}} and {{C}}. The thing is, these templates are already being used in an increasingly large number of entries. But as far as I know, there's no proper consensus for their use. (were they approved in any discussion and/or vote before?) If the vote passes, the templates will be officially accepted and may be added in all entries by bot where applicable. If the vote fails, basically nothing changes. (for example, even if the vote fails, we would not remove the templates from the entries that already use it)
But a failed vote would also likely discourage people from doing the specific action voted -- adding the templates in a lot of entries in a short period of time. Some editing conventions and policy pages apparently were never discussed or voted, but are arguably the "status quo" anyway and persist. If the vote never existed, someone might get the impression that since many entries use {{cat}} and {{C}}, then these templates must be acceptable by default and may be added to many other entries, quickly, without further discussion. --Daniel Carrero (talk) 20:46, 28 October 2017 (UTC)

Entries for common Finnic-Samic words?

@Tropylium The Finnic and Samic languages have a lot of shared vocabulary that isn't found in any other Uralic languages. A Proto-Finno-Samic language is posited by some, but not generally accepted (see w:Finno-Samic languages) and we don't have it as either a real language or an etymology-only language under Proto-Uralic. This is a shame, because it means we can't create entries that contain these words and their descendants. It would be useful to have an entry for, for example, *ültä, to house Proto-Finnic *ültä (Finnish yltä) and Proto-Samic *ëltē (Northern Sami alde). —Rua (mew) 13:09, 28 October 2017 (UTC)

Your last example is not a problem; it's also reflected in e.g. Erzya вельде (velʹde) and we can well reconstruct it as Finno-Permic or similar (an option would be to add "West Uralic", for the F-S-Mordv group, supported in many recent analyses). By now this is the case for maybe most cases that are not loanwords (and those can be grouped together wherever the loan is originally from, say *vasara etc. from *wáĵras). E.g. Terho Itkonen's 1997 paper discussing the concept of Finno-Samic lists 98 Finnic-Samic word groups, of which at least 40 have by now been explained otherwise.
For the remaining cases, it's not at all clear if they come from some common ancestor, from some unknown third source, or are old loans between the two. At least some cases are identifiable as substrate loans (e.g. 'island', *saloi ~ *suolōj, has parallels in Baltic and in the later substrate vocabulary in Sami) and therefore should not be reconstructed back to a common native proto-form.
This is really a general problem for etymology entries though, not something specific to the issue of Finno-Samic. If we have an alleged PIE root reflected only in, say, Germanic and Celtic, or Germanic and Balto-Slavic, or in Balto-Slavic and Indo-Iranian — should we reconstruct a PIE entry (possibly tagged as regional), or just refer each daughter entry to the other? Ditto we have an alleged Malayo-Polynesian root only in some western languages, an alleged Central Semitic root only in Arabic and Aramaic, etc.
I would suggest that we stick to a weak version of the "three-reflex rule": only create bottom-level (etymology-less) proto-entries if they have either
  1. reflexes in at least three separate descendants, or
  2. reflexes in two non-adjacent descendants.
--Tropylium (talk) 14:27, 28 October 2017 (UTC)
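
The weak "three-reflex rule" proposed above can be sketched as a small check. This is only an illustration under stated assumptions: the function name, the branch labels and the idea of passing in a set of "adjacent" branch pairs are all hypothetical, since the proposal does not say how adjacency would be decided.

```python
def qualifies_for_proto_entry(branches, adjacent_pairs):
    """Return True if a bottom-level proto-entry would be allowed.

    branches: names of the descendant branches that have a reflex.
    adjacent_pairs: set of frozensets, each holding two branch names
        that are treated as adjacent (an assumption of this sketch).
    """
    branches = set(branches)
    # Condition 1: reflexes in at least three separate descendants.
    if len(branches) >= 3:
        return True
    # Condition 2: reflexes in two non-adjacent descendants.
    if len(branches) == 2:
        return frozenset(branches) not in adjacent_pairs
    return False

# Hypothetical adjacency data, purely for illustration.
adjacent = {frozenset({"BranchA", "BranchB"})}

print(qualifies_for_proto_entry({"BranchA", "BranchB", "BranchC"}, adjacent))
print(qualifies_for_proto_entry({"BranchA", "BranchB"}, adjacent))
print(qualifies_for_proto_entry({"BranchA", "BranchC"}, adjacent))
```

The hard part in practice would be agreeing on the adjacency data, which the sketch deliberately leaves to the caller.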

Twi language code?

What code should we use for words in the Twi language? Both tw and twi give lua errors. (I came across "bonsam" in a pop song - I think it's Twi but have added it as Akan.) SemperBlotto (talk) 16:09, 28 October 2017 (UTC)

As it says in WT:LANGTREAT, we use "ak" and treat it as part of Akan. Chuck Entz (talk) 16:20, 28 October 2017 (UTC)

Topical organization of BP archives

Under the topic above #Improving_Wiktionary, User:Tropylium suggested that one problem is the absence of a topical organization for the BP archives. I agree that it is hard to find discussions that bear on a given point and that the effort to do so seems to be forthcoming only in support of a proposal, especially a vote.

What techniques would usefully assure us that relevant prior discussions would be readily available to help raise the level of our discussions?

The existing means seem tedious and are often not used. The principal means I have used for such purposes is keyword search, first for a page using Wiktionary's search restricted to Wiktionary space, then on the search results for that same or another keyword using the browser's search, and finally on the page for that same or yet another keyword. Then begins the process of reading the text to determine whether the discussion is relevant to the matter at hand. Obviously keywords, usually multiple keywords, have to be well selected, especially since we don't limit the vocabulary used in discussions. I sometimes don't find all the discussions that would be relevant to a matter of interest. Usually I only become aware of the missed discussions by accident or by someone else's search. There must be other times when a relevant discussion is not found at all.

Would an appendix of discussions referenced in votes and in other discussions help? That seems fairly simple to implement.

Should we salt archived discussions with keywords that would assure inclusion in relevant keyword searches even when the keyword was not present in the original discussion?

There are other, more radical possibilities such as putting each BP discussion on a separate page to eliminate the last two keyword searches most of the time.

Does anyone have any better ideas? DCDuring (talk) 18:19, 29 October 2017 (UTC)

I think something like this would be quite helpful on discussion pages in general:
Wyang (talk) 06:00, 30 October 2017 (UTC)
There's also the search box in Category:Wiktionary-namespace discussion pages, though that searches the Tea Room, Etymology Scriptorium, and Grease Pit as well as the Beer Parlour. — Eru·tuon 06:05, 30 October 2017 (UTC)
I sometimes wish we had more of a ticket system than a discussion system. Issues would be open and could be commented on until they were resolved, as well as tagged. This probably isn't possible onsite. DTLHS (talk) 06:04, 30 October 2017 (UTC)
I would support having a ticket system. --Daniel Carrero (talk) 21:02, 30 October 2017 (UTC)
  • Whatever the merits of a ticket system for solving more-or-less technical (ie, GP) problems, it does not seem particularly well suited for most of the matters that appear in BP, which tend to be less structured and structurable. In many cases they involve revisitation of matters discussed before. Resolution of a BP matter may require the creation of tickets. I'm sure that a ticket system could help us achieve ever greater ossification of our format, which seems to be where consensus, momentum/inertia, or technocracy lead us. DCDuring (talk) 22:51, 30 October 2017 (UTC)
Cheers. One way of going about this organization is to go through previous discussions and, without modifying the text of the archived discussion, amend the headers with a label, according to a scheme. Maybe just put the label under the header. Others can then copy from the chronological archives to a new topical archive; it's important not to destroy archives, and it's not a crime to make a redundant but differently organized copy. The first thing would be to set down the number of general topics these ideas come in.-Aision (talk) 02:23, 31 October 2017 (UTC)

This one will help https://meta.wikimedia.org/wiki/WMDE_Technical_Wishes/AdvancedSearch --Backinstadiums (talk) 19:19, 31 October 2017 (UTC)

What about this? Wiktionary:Beer parlour index. --Daniel Carrero (talk) 22:03, 9 November 2017 (UTC)

Colloquial vs. informal

What is the distinction here? How can we decide whether a non-formal term is colloquial and/or informal? Should some senses be glossed with both? Equinox 03:45, 30 October 2017 (UTC)

It's worth mentioning that many people use "slang" to refer to both of these; I've never heard it used as a synonym for "jargon". As for the other two, the glossary definition looks like "informal" is for alternative forms or related synonyms. Ultimateria (talk) 09:45, 30 October 2017 (UTC)
I'll just complain a bit without solving anything, but as long as we won't have some clear distinctive examples of what we mean exactly with all these labels (perhaps in the form of a table?), we're bound to have the same questions come up again and again. --Barytonesis (talk) 00:16, 31 October 2017 (UTC)
Wiktionary:Glossary would be the place. Equinox 19:11, 31 October 2017 (UTC)
Actually at Appendix:Glossary, there is "informal" and "colloquial". --Dan Polansky (talk) 19:14, 31 October 2017 (UTC)
Oh! That's how I missed it. Equinox 14:24, 1 November 2017 (UTC)
@Equinox: For new Czech entries, I am no longer using "colloquial" and only use "informal". In the archives of my talk page, there is a little write-up concerning the two, showing that most modern English dictionaries use "informal" and do not use "colloquial" (User talk:Dan Polansky/2016#Colloquial vs. informal). --Dan Polansky (talk) 18:59, 31 October 2017 (UTC)
I'm not sure that a distinction can be maintained or is worth maintaining, but there are expressions that seem to only be used in speech or reported speech. I'll find examples if I can. DCDuring (talk) 00:19, 1 November 2017 (UTC)
I think they involve anaphora + informality/slang: this here, that there, watcha, luv ya/love you, bye bye. DCDuring (talk) 00:31, 1 November 2017 (UTC)
For the record, of the ~100 English entries I looked at that had the colloquial label, the overwhelming majority (>90%) did not fit this description and thus would be better characterized as informal or slang. DCDuring (talk) 15:59, 1 November 2017 (UTC)
Some languages have a bigger difference between literary and colloquial registers than English does; Welsh, for example, is a pro-drop language and makes wide use of synthetic verb forms in its literary register, but is non-pro-drop and uses primarily analytic verb forms in its colloquial register. I always use {{lb|cy|colloquial}} for the Welsh colloquial register because "informal" doesn't quite feel right, though I'd be hard pressed to say exactly why. Maybe just because "colloquial" is and always has been the usual word for the register opposed to literary. Some other languages with very far-reaching linguistic differences between literary and colloquial registers are Burmese and Bengali. —Aɴɢʀ (talk) 15:52, 1 November 2017 (UTC)
(outdent, perhaps) Our Appendix:Glossary does not seem to clearly map the distinction that says, colloquial = primarily spoken & informal = not formal yet not necessarily primarily spoken:
  • colloquial: Used primarily in casual conversation or informal writing and not in more formal written works, speeches, and discourse. ...
  • informal: Denotes spoken or written words that are used primarily in a familiar, or casual, context, where a clear, formal equivalent often exists that is employed in its place in formal contexts. ...
I would submit that the above criteria are basically equivalent. A question is, for tagging English, is there at least one dictionary that uses both or that distinguishes merely informal from informal and primarily spoken? (I said for tagging English since the discussion of other languages can lead to nuance that does not apply to English, so it is better to constrain the question to English) --Dan Polansky (talk) 18:12, 3 November 2017 (UTC)
My 1995 print edition of Collins COBUILD has both "informal" (eg, decaf) and "spoken" (eg, whoops) as distinct labels. My older Longmans DCE does not. Are there others? DCDuring (talk) 00:57, 4 November 2017 (UTC)

Favicon and apple-touch-icon

Despite the fact that we switched to a variant of the "Scrabble/Mahjong logo" months ago, the favicon and Apple Touch icon still show the ['w]. KATMAKROFAN (talk) 04:12, 30 October 2017 (UTC)

I brought this up at Wiktionary talk:Votes/2016-05/New logo 2#Favicon, but never did anything about it, largely because I was simply happy for all those votes to be over. If somebody cares (@Dan Polansky?) and creates a new favicon consistent with the logo, I'm sure a vote for it would pass. —Μετάknowledgediscuss/deeds 00:32, 1 November 2017 (UTC)
.ico files can't be uploaded to Wikimedia Commons, it turns out. So here is an icon with a white background and here one with a transparent background (I think the white is more effective). These are in the same format as the existing icon file (three sizes: 16, 32, 48 pixels square). The simplest way to install is to copy over the existing file: https://en.wiktionary.org/static/favicon/wiktionary/en.ico -Stelio (talk) 20:43, 2 November 2017 (UTC)

Indices on main page?

Why do we link to the Index: namespace on the main page? It's maybe the third thing that people see, but it's embarrassingly outdated (April 2012). We should update them (or even do a complete overhaul, with {l} to mitigate the orange links) or think about replacing them. Perhaps the About X pages or Category:X language. Ultimateria (talk) 09:39, 30 October 2017 (UTC)


@-sche, Rua I'm a bit confused about how to add Westphalian entries to descendant lists. Am I supposed to use nds-de? Should I be specifying that it's Westphalian, ex. * {{desc|nds-de|Sorge}} {{q|Westphalian}} or * {{desc|nds|-}} *: Westphalian: {{l|nds|Sorge}}? Thanks. --Victar (talk) 22:59, 30 October 2017 (UTC)

Here is a more practical example using Old Saxon ertha. Should I be throwing Dutch Low Saxon, German Low German, and Westphalian under Low German ({{desc|nds|-}})? --Victar (talk) 23:46, 31 October 2017 (UTC)

de:Verzeichnis:Deutsch/Dialekte und Varietäten has made me of the mind that we shouldn't have gotten rid of the Westphalian language code. --Victar (talk) 06:24, 1 November 2017 (UTC)

CU Vote for Chuck Entz

Hey all, please note that the vote (Wiktionary:Votes/cu-2017-10/User:Chuck_Entz_for_checkuser) for User:Chuck Entz to become a checkuser is scheduled to end tomorrow. While Chuck currently has a significant level of support, the WMF requires 25-30 affirmative votes for this role. If we are unable to reach that by the currently scheduled end of the vote I will extend it a week, but I thought if I posted about the vote here a few people who might not have noticed would be alerted and we could end on time. - TheDaveRoss 13:02, 31 October 2017 (UTC)

Kazakh orthography

What will be the consequences of the Kazakh reform for our entries? --Barytonesis (talk) 14:09, 31 October 2017 (UTC)

Probably should change the Cyrillic entries to soft redirects and move the definitions, etc., to the Latin spellings. —Stephen (Talk) 14:21, 31 October 2017 (UTC)
Why should we follow every whim of a government? The Kazakh entries should stay Cyrillic up into eternity – the Cyrillic alphabet will stay in use outside of Wiktionary because it is fit for the language, so there is no need to bother oneself with the reform. Note that the French entries also do not apply the 1990 reform, because why write it easy if you can write it hard? It has also been a mistake, a shameless kowtow, to start the German entries in 1996/2004/2006 reformed spelling, and I opine that Wiktionary should switch the Russian entries back to the spelling that was the only one before the Bolshevik overthrow (code ru-petr1708) and is of course still the norm, because anyway some governments later decide that Russian should be written more phonetically like Serbian without double н and things like that. What will Wiktionary do if a cabal between the governments of the majority of English-speakers decides to adopt an “intuitive” spelling, reorganize half the Wiktionary (with category names and descriptions)? In my view it is also more agreeable to create Russian entries in pre-1918 spelling, and the reason why I have not created any Russian entries is that I am averse to communist spelling and tend to read no Russian literature because the publishing houses deface their editions by having them reformed. The governments can keep the language reforms for themselves; the states are not the language communities, but their exploiters that parasitize on them by administrating them and stealing the decisions that they should make themselves. The looser the connection with the government is, the better for the people who use the language, and thus it is desirable to use a different writing system than the government uses. To revolt is a natural tendency of life. Even a worm turns against the foot that crushes it. In general, the vitality and relative dignity of an animal can be measured by the intensity of its instinct to revolt.
Palaestrator verborum (loquier) 16:39, 31 October 2017 (UTC).
Err, communist spelling? —Aryaman (मुझसे बात करो) 01:57, 2 November 2017 (UTC)
@Barytonesis, Stephen G. Brown: We need to wait and see the extent to which the switchover is accepted by the linguistic community. If Kazakh books, magazines, newspapers, billboards, product packaging, etc., really start using the Latin alphabet, then we need to reflect that by moving the content to the Latin spelling and having the Cyrillic spellings be soft redirects (do we have a template {{Cyrillic form of}}?). If not—if only the government switches to the Latin spelling while the rest of the Kazakh-speaking world blithely goes on using the Cyrillic alphabet—then we should keep the content where it is and have the Latin spellings be soft redirects. —Aɴɢʀ (talk) 17:56, 31 October 2017 (UTC)
I fully agree with Angr. That is the most sensible approach for a descriptive dictionary. — Ungoliant (falai) 18:06, 31 October 2017 (UTC)
But how do you see what the world is? What is the world? Internet chats probably always have used that orthography, and the newspapers underlie Gleichschaltung, as they use to, and elsewhere people working with language have various constraints. And not few people are nomads who do not write anyway. If it really comes to a reform, it would be the fastest to just create alternative-form-of pages, so nobody has to survey constantly how the usage is. You can never know the whole language anyway, to say: Wow, now the usage is more than 50%, looks like we have to switch to Latin. And then back when it drops? Whole publishing houses in Germany have switched back to the 1901 orthography after having used the 1996 one. Descriptiveness does not preclude a-priori economic decisions. Usage is a bad guide for description, one always reasons from exterior sources whenever one describes as long and because one aspires to organize.
If you accept an inferior orthography, you will get a shedload of problems, like automatic IPA display not working because the government has decided to use digraphs instead of fitting characters, in which case therefore it would be more economical to use the Cyrillic entries as base and treat the official ones as alternatives; these could even be auto-created with zero work (with the only need of a bot). Palaestrator verborum (loquier) 18:31, 31 October 2017 (UTC)
One sensible option (IMHO anyway) is to treat both scripts equally based on actual attestation evidence. Therefore, Latin script entries should only be created when there are attesting quotations in that script meeting WT:ATTEST. When Latin entries are created, their Cyrillic counterparts should be left as is for considerable time. --Dan Polansky (talk) 18:32, 31 October 2017 (UTC)
I disagree. What is being attested is the language, not the script. Therefore a quote in Cyrillic attests the word in Roman too. This is the same way the language gets treated in large Serbo-Croatian linguistic works. One author writes something in Latin but it is still in Cyrillic in the Речник српскохрватскога књижевног језика (Vols. I-VI 1967–1990) – even though Croatians participated in the dictionary, who do not use Cyrillic at all. Palaestrator verborum (loquier) 18:41, 31 October 2017 (UTC)
In each particular quotation, use of a script is being attested or evidenced. Therefore, it is an observational factual statement to say that script X is usually used to write language Y, and it is equally observational fact that spelling A in script X is attested. This is related to particular spellings being attested, not only their pronunciations; in fact, pronunciations are not really attested. --Dan Polansky (talk) 18:46, 31 October 2017 (UTC)
Of course the script is being attested in this sense, but also the other script: As pronunciations for most languages do not need attestation, so lexemes do not need attestation in every script for each lexeme to be included in each available script, because we can derive the spelling in another alphabet by rules a priori as well as we can derive the pronunciation from rules a priori, as long as we know what the script represents (which can be hard for a bot with digraphs or in as much as the script is unfit otherwise). Or do we need to attest every inflection form of every Latin word? One is content by proofs from which the other forms can be inferred. Palaestrator verborum (loquier) 19:43, 31 October 2017 (UTC)
Each quotation attests only the script which it uses and none other. That is, it serves as direct evidence supporting only the thing that it shows and it does not support as direct evidence things that it does not show. --Dan Polansky (talk) 20:18, 31 October 2017 (UTC)
You talk like there were an essential difference between directness and indirectness; but the delimitation is fuzzy. Surely a quotation attests some things more directly than others. But the second script instance is directly enough known by the instance in the other script plus the conversion rules, and even in the extremely marginal case that we know that the word has no single occurrence whatsoever in the second script, one still wants to have it listed. It would be an overkill to list a word with an asterisk only because of not being attested in this script, as well as it would be awkward to go through lists of Greek and Latin lemmas and mark the forms which are not attested. What we describe are not all instances, but the general rules of a language, consisting of words, and to the extent as we can circumscribe the rule-dependent usage, for which the script hardly ever matters. Or have requests for verification for Serbo-Croatian required attestations in both scripts? No, or they needed not to, because one and the same Serb who writes a word in Cyrillic as well writes Roman on other occasions; what we prove by quoting him is that the word dealt with exists in his lexicon, not that this word exists in that script – by the way one work by the same author might be printed sometimes in Latin, sometimes in Cyrillic, which also suggests that the language analyst should not distinguish.
And yes, as the script is not the language, and the scope of inclusion is clear (unlike the case if arbitrary invented words were allowed on Wiktionary), I do not have anything against Russian words invented after 1918 listed in the Petrine spelling. Palaestrator verborum (loquier) 21:55, 31 October 2017 (UTC)
As always, we should wait to see what is used. I'm seeing some kickback on Facebook, as it is pretty ugly using apostrophes instead of diacritics, but that's not really something we can help.
There's some argument about citing cross-script and orthography. In Esperanto, I've always converted the h-system and x-system for writing Esperanto without diacritics into the standard diacritics. There's no point in separating Latin-script and Cyrillic-script Serbo-Croatian attestations. It doesn't work so well when the spelling changes or orthography changes aren't easily mapped, though.--Prosfilaes (talk) 12:08, 1 November 2017 (UTC)
I don't like the new Kazakh orthography, the current transliteration is far better. They don't want to use any diacritic symbols and use ' instead. It's not much different from the way Uzbek was romanised. They also replaced special letters with apostrophes, a standard symbol is "[ʻ]". If I'm not mistaken, the new Kazakh orthography will look like this (Cyrillic, current translit, new translit):
Қазақ әліпбиі — қазақ тілінің әріптерінің жүйелі тізбегі, қазақ халқының мәдени өмірінде басқа да түркі халықтарымен бірге пайдаланып келген әр түрлі әріп таңбаларынан тұратын дыбыстық жазу жүйесі.
Qazaq älipbïi — qazaq tiliniñ äripteriniñ jüyeli tizbegi, qazaq xalqınıñ mädenï ömirinde basqa da türki xalıqtarımen birge paydalanıp kelgen är türli ärip tañbalarınan turatın dıbıstıq jazw jüyesi.
Qazaq a'lipbi'i — qazaq tilinin' a'ripterinin' ju'yeli tizbegi, qazaq xalqının' ma'deni' o'mirinde basqa da tu'rki xalyqtarymen birge paydalanyp kelgen a'r tu'rli a'rip tan'balarynan turatyn dybystyq jazy' ju'yesi. --Anatoli T. (обсудить/вклад) 07:15, 2 November 2017 (UTC)
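
The idea raised earlier in this thread, that the new spellings could be derived from the Cyrillic ones by rule, can be tried on a fragment of the example above. The sketch below is an assumption-laden illustration: the table contains only lowercase letters read off that one example sentence, it is not the official conversion table, and the function name is made up.

```python
# Correspondences read off the example sentence above (lowercase only).
# NOT the official table; just enough letters for the demonstration.
CYRILLIC_TO_NEW_LATIN = {
    "қ": "q", "а": "a", "з": "z", "т": "t", "і": "i", "л": "l",
    "н": "n", "ң": "n'", "ә": "a'", "р": "r", "п": "p", "е": "e",
}


def to_new_latin(text):
    """Convert letter by letter; characters not in the table pass through unchanged."""
    return "".join(CYRILLIC_TO_NEW_LATIN.get(ch, ch) for ch in text)


print(to_new_latin("қазақ тілінің әріптерінің"))
# prints: qazaq tilinin' a'ripterinin'
```

A letter-by-letter table like this only works in the Cyrillic-to-Latin direction; going back is harder because the apostrophe combinations have to be recognised as units.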
This orthography is a disaster. I think they should get rid of some characters as well like q in favor of k', p in favor of b', t in favor of d' and x in favor of k''. There are some letters lacking, I am guessing sh is s', ch is s'', and w is b''. I mean they must be, just look at this harmony, it is perfect. --Anylai (talk) 20:30, 3 November 2017 (UTC)
If the orthography is using double apostrophes, that raises some technical issues: while it's entirely possible to create and link to pages containing double apostrophes in the page name, there may be problems with the system mistaking double apostrophes for wikitext outside of the links, and there may be similar problems with our templates and modules- nothing we can't deal with, but a bit of tinkering and extra typing will be required to make things work. Chuck Entz (talk) 21:13, 3 November 2017 (UTC)
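
To illustrate the wikitext issue: in running wikitext, two consecutive apostrophes start italics and three start bold, so a term containing such a run would need escaping outside of links. A minimal sketch, assuming the text is destined for running wikitext and that replacing the apostrophes with the HTML entity `&#39;` is acceptable (the function name is hypothetical):

```python
import re


def escape_apostrophe_runs(text):
    """Replace every apostrophe in a run of two or more with &#39;,
    so the run is not parsed as italic or bold markup."""
    return re.sub(r"'{2,}", lambda m: "&#39;" * len(m.group()), text)


print(escape_apostrophe_runs("k''"))        # prints: k&#39;&#39;
print(escape_apostrophe_runs("a'lipbi'i"))  # prints: a'lipbi'i
```

Single apostrophes are left alone, so spellings like those in the example above would not be affected.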
Apostrophes … another irksome point I have not thought about. It points of course to the rule “if it ain’t broke, don’t fix it”. It applies to this topic as: Keep Kazakh in Cyrillic, at least that will work well. From the example by Anatoli it seems to me that even reading the script does not work. Sickening. Palaestrator verborum (loquier) 22:04, 3 November 2017 (UTC)
I think Anylai was joking? Neither Anatoli T.'s example or the Facebook example I saw complaining about it used double apostrophes, and I'm pretty sure the latter would have complained about double apostrophes.--Prosfilaes (talk) 07:38, 4 November 2017 (UTC)
Nazarbayev wants the transition done by 2025 so there's no need to hurry, if we're lucky maybe he gets putsch-ed over this abominable alphabet. Crom daba (talk) 19:35, 4 November 2017 (UTC)
This Russian video introduces future Kazakh orthography in a funny way. There are many funny cases, especially names. Nazarbayev himself made a joke about сәбіз (säbiz, carrot), which will be spelled "saebiz", which looks like vulgar Russian заеби́сь (zajebísʹ). --Anatoli T. (обсудить/вклад) 21:23, 4 November 2017 (UTC)
@Atitarev: Huh. Kazakh alphabets#Correspondences on Wikipedia indicates that ә will be spelled as a'. — Eru·tuon 21:56, 4 November 2017 (UTC)
Apparently Nazarbayev doesn't know his new alphabet himself. The video used "sa'biz". --Anatoli T. (обсудить/вклад) 22:07, 4 November 2017 (UTC)

Template inflection of - making the arguments required

In Module:form of/templates, an editor insists that arguments of {{inflection of}} are now required. That is to say, it is now technically required that users of the template specify what sort of inflected form it is. The editor has protected the page.

Do we want to make the arguments required on the technical level? (I don't.)

--Dan Polansky (talk) 18:19, 31 October 2017 (UTC)

What is your specific objection to making the arguments required? DTLHS (talk) 18:23, 31 October 2017 (UTC)
I would like to be able to create pages that leave the form unspecified, and get no module errors. In such a situation, the reader should still be able to find which kinds of inflected forms are concerned in the lemma entry. --Dan Polansky (talk) 18:26, 31 October 2017 (UTC)
That turns non-lemma entries into little more than redirections, then. Having the complete morphological description of a form directly on the entry is more useful IMO, so much so that I think it's not a bad idea to enforce the practice by triggering module errors. --Barytonesis (talk) 18:37, 31 October 2017 (UTC)
The question is what is more valuable, a soft redirect or nothing, since you cannot expect the people who would be entering soft redirects (inflection of without the kind of form specified) to enter fully specified forms in the same volume, if at all. --Dan Polansky (talk) 18:49, 31 October 2017 (UTC)
Once you have non-lemmas as soft redirects, other editors can attach pronunciation and rhyming. That allows division of labor and specialization, which is good for productivity and accuracy. --Dan Polansky (talk) 18:54, 31 October 2017 (UTC)
@Dan Polansky: True. But quite a few contributors have said over the years that they prefer a red link where they can do everything themselves from the start, to a blue one leading to a half-baked entry; handling the first is more gratifying than the second, which feels like a tedious cleaning up job. That's probably less of a problem for non-lemma entries, however. --Barytonesis (talk) 18:47, 4 November 2017 (UTC)
I don't know what statements from what contributors you have in mind. Different people find different things gratifying. For Czech lemmas, I hardly ever add pronunciations and inflection tables, and I saw recently an anon add these. If the anon is not so strong in English, it would be difficult for them to do full entries, but adding inflections and pronunciation may be easy for them. For those who find it more gratifying to fill redlinks than to expand existing entries, there are plenty of redlinks to fill left, and will be for the foreseeable future. --Dan Polansky (talk) 20:04, 4 November 2017 (UTC)
Dutch has entries like Arabische that just say "Inflected form of ...". This seems like a justification for not making the parameters mandatory. DTLHS (talk) 18:29, 31 October 2017 (UTC)
If it is a good and wished thing to make all inflected form of entries specify this kind of information, then {{nl-adj form of}} used in Arabische could be changed or it could be replaced with {{inflection of}}. The question is, is it good and wished to make that mandatory for all contributors? --Dan Polansky (talk) 18:39, 31 October 2017 (UTC)
Note that we have both {{inflection of}}, which requires arguments, and {{inflected form of}}, which doesn't. If you don't want to specify exactly which forms the term corresponds to, just use {{inflected form of}}. Granted, the doc page for that template says to use {{inflection of}} instead, but that recommendation doesn't match the RFD "keep" outcome archived at Template talk:inflected form of. —Aɴɢʀ (talk) 22:01, 31 October 2017 (UTC)
But does it make sense? Was the intent of the disputed edit in Module:form of/templates to force me to use {{inflected form of}}? Would not a better course of action be to make sure {{inflection of}} does not produce any module errors (it can place items to a hidden category instead, or even visible category), and deprecate the other template? --Dan Polansky (talk) 17:53, 3 November 2017 (UTC)
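
The alternative floated above (no module error, a tracking category instead) can be sketched as follows. This is written in Python purely for illustration; the real module is Lua, and the function name, output wording, and category name here are all hypothetical.

```python
def inflection_of(lemma, tags=()):
    """Return (display_text, categories) for a form-of line.

    When no form tags are given, fall back to a generic line and add a
    tracking category instead of raising an error.
    """
    categories = []
    if tags:
        text = f"{' '.join(tags)} of {lemma}"
    else:
        text = f"inflection of {lemma}"
        # Hypothetical hidden category for later cleanup.
        categories.append("Inflection-of entries lacking form tags")
    return text, categories


print(inflection_of("Arabisch"))
print(inflection_of("Arabisch", ["masculine", "singular"]))
```

The design choice is that incomplete entries stay usable for readers while remaining easy to find for editors who want to fill in the details.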


When using {{quote}} the term gets categorized in Category:/Language/ terms with quotations, but when using the quotation templates in Category:Citation templates (such as {{quote-journal}}) there is no categorization. Is it supposed to be this way? I doubt it... Jonteemil (talk) 23:45, 31 October 2017 (UTC)

It's not, but merging the various citation / quotation templates into something unified is a lot of work that nobody really seems interested in right now. DTLHS (talk) 00:24, 1 November 2017 (UTC)
Additionally, {{quote-journal}} and the rest of the quotation templates do not require or even make use of a language code, so that would have to be added before they could categorize as "terms with quotations". DTLHS (talk) 00:32, 1 November 2017 (UTC)
The matter has itched me too recently. But isn’t it a simple task for a programmer? Solving it would require making an addition to the templates and letting a bot run across the articles, just looking under which language a quotation is placed. However yes, I do not see exactly why anyone should be interested in the work, because the categorization is of little use, methinks. Palaestrator verborum (loquier) 00:51, 1 November 2017 (UTC)
I see. It’s just that I found it strange that one quotation template categorizes and one doesn’t. But now I know. Thanks for answering! Jonteemil (talk) 01:52, 1 November 2017 (UTC)
Since quotations are a key part of Wiktionary they should get some more love. But yes, there are quite a few different templates in use, lots of quotes / examples without templates, it will be a bigger project. The first step would be to add language codes to all template invocations. Or we could just wait for T122934. 💤 – Jberkel (talk) 14:05, 6 November 2017 (UTC)
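
The bot pass described above, working out under which language section each quotation template sits, could look roughly like this. It is a sketch under simple assumptions: language sections are level-2 headings of the form `==Language==`, quotation templates start with `{{quote-`, and the function name is invented for the example. It does not edit pages or map language names to codes.

```python
import re

# Level-2 headings only: "==English==" matches, "===Noun===" does not.
LANG_HEADING = re.compile(r"^==\s*([^=].*?)\s*==\s*$")
QUOTE_TEMPLATE = re.compile(r"\{\{\s*(quote-[\w ]+)")


def quotes_by_language(wikitext):
    """Return (language, template_name) pairs in page order."""
    current = None
    found = []
    for line in wikitext.splitlines():
        heading = LANG_HEADING.match(line)
        if heading:
            current = heading.group(1)
            continue
        for m in QUOTE_TEMPLATE.finditer(line):
            found.append((current, m.group(1).strip()))
    return found


sample = (
    "==English==\n===Noun===\n#* {{quote-journal|year=2017|passage=x}}\n"
    "==French==\n===Noun===\n#* {{quote-book|passage=y}}\n"
)
print(quotes_by_language(sample))
```

A real bot would still need a table from language names to codes before it could add the missing parameter.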

November 2017

Category:Subtypes of compounds by language

I'd like to create categories for French compounds comprised of:

but I don't know how to name them. It would be useful to distinguish between words such as porte-clef (verb+noun) and porte-fenêtre (noun+noun). Any ideas? --Barytonesis (talk) 19:13, 1 November 2017 (UTC)

I'd just name them more or less as you said, Category:French verb-noun compounds. I notice that we have Category:French compound words with "words" at the end, but all the subcategories of Category:Subtypes of compounds by language, as well as this category itself, lack "words" in the name. Is this something that needs to be corrected? It might be better to do it now, before we start creating more of these categories. —Rua (mew) 20:14, 1 November 2017 (UTC)
Category:Subtypes of compound words by language then? Category:Compound words subcategories by language? Category:Types of compound words by language? (to follow up on Erutuon's suggestion at Wiktionary:Beer parlour/2017/July § Compound and fiction etymology categories)
See also Module talk:category tree/poscatboiler/data/terms by etymology.
Fine by me in any case. --Barytonesis (talk) 00:10, 2 November 2017 (UTC)
@Rua? --Barytonesis (talk) 15:00, 3 November 2017 (UTC)
I'm waiting for others to offer their views. —Rua (mew) 15:05, 3 November 2017 (UTC)
@DCDuring: you were mentioning here the idea of categorising compounds by their head. Maybe you'd like to weigh in? --Barytonesis (talk) 15:23, 13 November 2017 (UTC)
I like the idea of adding depth to our existing entries by almost any means, including categorization.
For compound words one could categorize by head PoS and subcategorize by the PoS of the other element of the two-part compound. The subcategorization might be essential because for most English compounds the PoS of the compound is the same as the PoS of its head. English being what it is (terms that are nouns and verbs as well as nouns being used as noun modifiers), not all such categorization would be indisputable. I think such categorization could help non-natives understand English compounding better and might be useful to linguists. I don't think we would have to have complete coverage to be useful.
It would be consistent with a general scheme of characterizing all multi-word terms grammatically. We already have such categories as Category:English coordinates, Category:English non-constituents, and Category:English predicates.
Also, for words that have suffixes or prefixes it might be useful to distinguish via categories those words of denominal, deverbal, and deadjectival derivation. DCDuring (talk) 20:54, 13 November 2017 (UTC)

Participate in Dispute Resolution Focus Group

The Harvard Negotiation & Mediation Clinical Program is working with the Wikimedia Foundation to help communities develop tools to resolve disputes. You are invited to participate in a focus group aimed at identifying needs and developing possible solutions through collaborative design thinking.

If you are interested in participating, please add your name to the signup list on the Meta-Wiki page.

Thank you for giving us the opportunity to learn from the Wikimedia community. We value all of your opinions and look forward to hearing from you. JosephNegotiation (talk) 22:50, 1 November 2017 (UTC)

"The scope of this project is limited to the resolution of disputes concerning improper or disruptive behaviors." (from m:Research:Developing Wikimedia's anti-harassment and behavioral dispute resolution systems) DCDuring (talk) 13:24, 2 November 2017 (UTC)
P.S. I am setting up a group for the promotion of improper disruptive behaviours, because they are the only way to stop some bastards. You know whom to call. Equinox 13:25, 2 November 2017 (UTC)

Project idea kind-of maybe

SECRET CLUB LINK: User:Equinox/EWDC. The rest is babble.

Who else is obsessed with adding missing English words, like I am? I know there are some of you, although User:Visviva seems to have found something better to do. I've got a ton of missing English words from all kinds of sources. The ones I have really tried but couldn't work out can be seen in the alphabetical list on my user page. But I've got a zillion more. Does anyone want to be in ENGLISH WORD DEFINING CLUB. I think it would be pretty amusing, and maybe useful, to break down some of these huge lists and split 'em across a group, and everyone would take five words per week (or whatever) and see what they can do. Probably not alphabetically or there would be a rash of suicides as we hit M and everybody got 30-letter words starting with methyl. -- If anyone is into this I will tell more secrets about my word lists. Equinox 05:48, 2 November 2017 (UTC)

I'm in. Let's get to 1 million lemmas. DTLHS (talk) 05:54, 2 November 2017 (UTC)
If you wanna do it then sign on this page, and make a little Scout salute. User:Equinox/EWDC. Equinox 06:00, 2 November 2017 (UTC)
Minerals, geological periods and formations. DCDuring (talk) 13:20, 2 November 2017 (UTC)
It's only fun when the words are random. We could split them by theme but see above re methyl. Equinox 13:22, 2 November 2017 (UTC)
If I had done Webster 1913 from A to Z rather than having a random number generator, I would probably have chugged bleach in 2015 rather than 2018. Equinox 13:27, 2 November 2017 (UTC)

November LexiSession: toilets

The monthly suggested collective theme is toilet. It's not a fancy topic but an interesting one, and the 19th of November is World Toilet Day (no kidding). So we may improve Thesaurus:toilet and all the slang words referring to this crucial place! And you don't have to contribute from that place.

LexiSession is a collaborative experiment without any guide or direction. You're free to participate however you like and to suggest next month's topic. If you participate, please let us know here or on Meta, so we can keep track of the evolution of LexiSession. I hope there will be some people interested this month, and if you can spread it to another Wiktionary, you are welcome to do so. Ideally, LexiSession should be a booster for every project at the same time, to give us more insight into the ways our colleagues work in the other projects.

Have a good time! Noé 09:03, 2 November 2017 (UTC)

French Wiktionary October news

Hey! October issue of Wiktionary Actualités just came out in English!

A long Actualités this month, with five articles: what the patrol is, a review of research that uses Wiktionary, a dictionary with Celtic words in French, feedback from the Wikiconvention francophone and a short piece on the phatic function of language. Plus, as usual: highlights from the press, statistics, videos, LexiSession and colorful pictures!

This issue was translated by Jberkel in less than a day, and can be improved by readers (wiki-spirit-love-love). We did not receive any money for this publication and we are not supported by any user group or chapter. It is only written by the community, for the large community of lexilovers! Feel free to send us comments on our writings, or on differences between our projects. Noé 08:53, 3 November 2017 (UTC)

Template:bor: Replace notext=1 with withtext=1

@Daniel Carrero, TheDaveRoss As part of the effort to remove the text from {{bor}}, we've been adding notext=1 to calls of this template. The idea is that any instances that are called the old way, without that parameter, are still in need of fixing. But this is really annoying and backwards so I propose to reverse the sense of the parameter:

  1. Module:etymology/templates is modified to accept both a notext=1 and withtext=1. withtext=1 is initially implemented as a no-op parameter.
  2. Any instances of {{bor}} that don't currently have notext=1 get withtext=1 added by a bot.
  3. The module is modified so that the text is not displayed by default, thus notext=1 becomes the no-op, and withtext=1 triggers the inclusion of the text.
  4. Any instances of {{bor}} that have notext=1 have it removed by a bot.
  5. The module is modified again to remove the notext= parameter.
  6. The template now works the way it should, and any further efforts at cleanup are now focused on Category:bor with withtext.

Rua (mew) 13:38, 3 November 2017 (UTC)
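
The two bot passes in the steps above (step 2 and step 4) could be sketched roughly as follows. This is only an illustration, not the bot that was actually run: the function names are hypothetical, and the regex assumes that a {{bor}} call contains no nested templates and that the parameter is written exactly as notext=1 or withtext=1 with no extra spaces.

```python
import re

# Assumption: a {{bor|...}} call with no nested templates or links using braces.
BOR_CALL = re.compile(r"\{\{bor\|([^{}]*)\}\}")


def add_withtext(wikitext: str) -> str:
    """Step 2 (sketch): tag old-style calls, which lack notext=1, with withtext=1."""
    def fix(match):
        params = match.group(1).split("|")
        if "notext=1" in params or "withtext=1" in params:
            return match.group(0)  # already migrated or already tagged
        return "{{bor|" + "|".join(params + ["withtext=1"]) + "}}"
    return BOR_CALL.sub(fix, wikitext)


def remove_notext(wikitext: str) -> str:
    """Step 4 (sketch): drop notext=1 once it has become a no-op."""
    def fix(match):
        params = [p for p in match.group(1).split("|") if p != "notext=1"]
        return "{{bor|" + "|".join(params) + "}}"
    return BOR_CALL.sub(fix, wikitext)
```

Running add_withtext before the default is flipped keeps every existing page rendering the same way, which is why the order of the steps matters.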

I concur. --Barytonesis (talk) 15:00, 3 November 2017 (UTC)
I see no way in which this helps at all (just like using a bot to change "{{bor|aa|bb}}" to "Borrowed from {{bor|aa|bb|notext=1}}"). It's just more work for the editor. —Aryaman (मुझसे बात करो) 17:51, 3 November 2017 (UTC)
Why? —Rua (mew) 17:56, 3 November 2017 (UTC)
The question to ask is whether this does any harm other than temporary confusion to editors while the switch is underway. If not, the benefits of speeding up the phaseout of the "Borrowed" text would be reason enough, IMO.
The removal of this text was agreed to a while back, partly to reduce the arbitrary differences needed to be learned between the behavior of similar templates in the {{bor}}/{{inh}}/{{der}}/{{cog}} group, and partly because the added parameters needed to customize the text like |nocaps= are more trouble for many editors than just typing the text by hand every time.
Although the removal of the text makes more work when the template is at the very beginning of the etymology, it makes things simpler for new users and less work in all other cases. Even for those who don't like what's being done, there's something to be said for getting the transition over with. Chuck Entz (talk) 20:06, 3 November 2017 (UTC)
I concur too. It is regularly nerve-racking having to add “notext=1”, and the reasons explained by Chuck Entz are true to the facts. Palaestrator verborum (loquier) 22:14, 3 November 2017 (UTC)
Yes, let {{bor}} default to showing no text. Ideally, in future, let "Borrowed" be no longer shown in etymologies at all. --Dan Polansky (talk) 22:24, 10 November 2017 (UTC)
I'm supportive of unifying template behavior to no longer add text. However, why remove "borrowed" if an editor has added it? This is useful information for the reader. ‑‑ Eiríkr Útlendi │Tala við mig 22:30, 10 November 2017 (UTC)
It is not part of Rua's proposal, but anyway: I don't believe "Borrowed" is useful. It looks like cruft to me. Merriam-Webster does not have it. Whether term T1 is borrowed from T2 or is inherited seems to nearly always follow from which languages T1 and T2 belong to, and therefore, it is a rather trivial piece of text that does not need to be on the radar screen of the reader. --Dan Polansky (talk) 22:34, 10 November 2017 (UTC)
I write "From" for inherited terms, and "Borrowed from" for borrowed terms. I do this also for non-initial parts of the etymology, e.g. Latin borrowed from Greek in an English etymology. —Rua (mew) 22:43, 10 November 2017 (UTC)
It's a distinction without a difference. Czech cannot inherit from French, so it has to borrow; when I see "From French ...", I know it's a borrowing. --Dan Polansky (talk) 22:48, 10 November 2017 (UTC)
You know that, but does everyone else using Wiktionary also know? And there are some languages that are both borrowed and inherited from. Latin is a good example. For those languages, we need to specify borrowings. And for consistency it then makes sense if we do it for all of them. I really don't understand what the objection to it is. —Rua (mew) 22:51, 10 November 2017 (UTC)
There is nothing to know since "borrow" is only a terminology that indicates a trivial distinction, one without a difference. It's the same kind of objection that I have when someone adds "noun" before the word in etymology chain, or instead of "from" writes "which comes from"; I like things to be very compact, and get rid of all that looks like cruft. Merriam-Webster online seems to see things the same way I do. --Dan Polansky (talk) 23:02, 10 November 2017 (UTC)
The distinction between borrowing and non-specific derivation might not be significant, particularly for Greek terms borrowed using Latin spelling, but the distinction between borrowing and inheritance is significant for some doublets like Spanish palabra and parábola. Both of them originate from the same Latin term, parabola, but the inherited one has undergone sound changes that the borrowed one has not. I don't know how this exactly relates to having "derived" as well as specifically "borrowed" and "inherited" categories, though. — Eru·tuon 23:45, 10 November 2017 (UTC)
These doublets are interesting, thank you, but I do not see that rather small group of items as a sufficient reason to flood our etymologies with "borrowed" that for all but a very small portion of all cases (>99%?) adds nothing beyond the obvious. --Dan Polansky (talk) 08:11, 11 November 2017 (UTC)
If you find it fluff, but others find it useful, can you live with the presence of Borrowed from in entries? ‑‑ Eiríkr Útlendi │Tala við mig 03:44, 12 November 2017 (UTC)
I want to see "Borrowed" gone even if some people find it useful. I reject the following principle: If several users find a certain more loquacious and space-consuming presentation useful, all users should be exposed to that presentation, even if those several users are a small minority. If I am the minority, I have to live with "Borrowed" anyway. --Dan Polansky (talk) 08:25, 12 November 2017 (UTC)

I've done step 1 and a bot is now doing step 2. —Rua (mew) 22:41, 10 November 2017 (UTC)

Step 2 is complete, but I noticed that Category:bor without notext is still being filled with new entries day by day. I've notified Equinox of the upcoming change, but there may be others who still rely on the old format that need to be notified. —Rua (mew) 17:34, 11 November 2017 (UTC)
Step 3 has been done, now the template works in the intended way. The text is no longer shown by default. —Rua (mew) 12:36, 12 November 2017 (UTC)
A bot is now working on step 4. There's 40 thousand entries, so it will take a while, possibly multiple days. —Rua (mew) 12:44, 12 November 2017 (UTC)
Everything is now done! —Rua (mew) 19:36, 12 November 2017 (UTC)
Nice! --Daniel Carrero (talk) 01:29, 13 November 2017 (UTC)

Translations with a different part of speech that express the same idea

@Adelpine I just noticed diff. French may not even have a noun to express this idea, so I don't think it's helpful to remove this translation. I recall that we had a discussion in the past where we agreed that it was desirable to have translations that express the same idea even if the part of speech is different. A common example is a verb that expresses what is an adjective in English or vice versa. Stative verbs are a thing in many languages. —Rua (mew) 13:46, 3 November 2017 (UTC)

@Rua OK. I will not remove this kind of translations from now on.--Adelpine (talk) 20:42, 3 November 2017 (UTC)

Add pronunciation of Chinese words in the table titled "Dialectal synonyms of", under the "Synonyms" header.

Just like in the box for "Derived terms from", under the "Compounds" header, there's plenty of space for it, and otherwise you have to click every word and then go back in your browser, expand the table again. --Backinstadiums (talk) 21:35, 3 November 2017 (UTC)

Since when do we have a separate header for Compounds? It should be Derived terms. —Rua (mew) 22:17, 3 November 2017 (UTC)
Oppose. This was discussed and it was decided to leave them out, for otherwise the table could get quite wide and also we don't hold pronunciations for most regional dialects. The discussion is somewhere. —suzukaze (tc) 22:30, 3 November 2017 (UTC)
@Rua: For example
That entry is malformed in all kinds of ways. There's no part-of-speech header, and there's a Compounds header at L3. I changed Compounds to L4 Derived terms, but the part of speech needs to be fixed still. —Rua (mew) 22:47, 3 November 2017 (UTC)
Compounds are not necessarily derived from the definitions; they are merely all multicharacter words containing the character. Characters can also be used phonetically. Wyang (talk) 22:48, 3 November 2017 (UTC)
Single-character Chinese entries don't get part-of-speech headers and it has been the language policy, de facto and by the About Chinese page, for quite some time. They get Definitions header. --Anatoli T. (обсудить/вклад) 00:37, 4 November 2017 (UTC)
It seems phonocentristic to me to assert that the contested formations are not derived terms. Of course they are derived terms, in so far as we take the graphical words, independent of any underlying phonetic value, and stick them together. It’s graphemological derivation in the sense as we also use the word “derivation” to subsume compounds. So I presume that Rua is right about that heading. Palaestrator verborum (loquier) 00:49, 4 November 2017 (UTC)
WT:ELE#Derived terms:
List terms in the same language that are morphological derivatives.
See Morphological derivation. Wyang (talk) 01:04, 4 November 2017 (UTC)
Seems like the definition there is a slip, as everywhere there are compounds listed under “derived terms”, notably German, while derivations are “contrasted with other types of word formation such as compounding”. Meseems WT:EL has a bug, because compounds and such as the Chinese formations need either to use the Derived terms heading against the WT:EL definition or have invented a new heading. Palaestrator verborum (loquier) 04:13, 4 November 2017 (UTC)
Morphological derivation and compounding are subsumed under "Derivation" in general Wiktionary terminology (see e.g. search:"insource:/derived from/"), whereas Chinese words using characters phonetically are not said to "be derived from" those individual characters. Consequently, CJKV character entries conventionally use "Compounds" instead of "Derived terms" to include these words, at L3, as they are not subordinate to the definitions. Wyang (talk) 05:21, 4 November 2017 (UTC)
That’s what I already have observed and is expressed above. I have explicitly stated that the practice of listing lexical compounding deviates from the description in WT:EL, or rather the description deviates from the current best practice, and Atitarev has elucidated that Chinese entries generally use the heading “Compounds”. However I contest the notion of “phonetical derivation”. Your outlinings exhibit some unjustly blurry and simplifying thinking. Most commonly “derivation” is understood as lexical, on the level of lexemes and lexes rather than phonemes and phons. But nothing gainstays speaking of derivation on a mere graphical level. But here I opine that “Derived terms” and “compounds” are alike misleading. “Compounds” are typically understood as lexical, i.e. inheriting the meanings of the uncompounded elements – though this is mitigated by the use of L3 instead of L4 –, and is here used quite untechnically, while “term” is a word that can hardly be used in English if no focus on meaning is intended, so the reader might wrongly assume or maybe needs wrongly assumes a little that it is spoken of “derived terms” because some meaning is inherited. Factually both headings are – on the L3 place – roughly correct as long as understood the intended way. However so far it seems to me that there is no way to solve the dilemma because the available English lexicon is too phonocentristic for a reliable generalized heading. Every handling is wrong and there is no solution as long as no cruel and unusual diction is adopted. Palaestrator verborum (loquier) 06:31, 4 November 2017 (UTC)
No, you still don't get it. "Derivation/derived/derives from" on Wiktionary generally means both morphological derivation and compounding, but the terminology is not used for phonetically used glyphs. You have to understand there isn't a perfect solution to many issues in language handling on Wiktionary; what is to be respected is conventional practices that the relevant editors have come up with and have deemed to be the most appropriate. The CJKV editors have long felt the use of "derived" in the description of phonetically chosen characters to be very misleading and have been avoiding such wording whenever possible. It provides way more harm than benefit to Wiktionary readers who are used to understanding "derived" as "etymologically derived".
From your initial, almost instinctive denunciation as an unfamiliar outsider, to the insistence that "derivation" on Wiktionary be interpreted in your manner after having been corrected, and the silly summarisation of the practice as "phonocentrism", your posts come across as quite misinformed and pretentious. You have not worked on CJKV languages, and are unfamiliar with them, so please appreciate that you are unversed in these languages' complexities and the rationales for the conventions. Wyang (talk) 07:20, 4 November 2017 (UTC)
I disagree that the editors and/or speakers of a language should always have the last say, as it can lead to politically/emotionally motivated decisions rather than linguistically based ones (Wiktionary:Votes/pl-2014-03/Unified Norwegian anyone? And the opposing votants barely contribute here, with the exception of Donnanz...). However, I have complete trust in the Chinese contributors, who are numerous, very active and very linguistically literate (as far as I can tell, which doesn't mean much in this case). So I think all the decisions about Chinese can be safely left to them. --Barytonesis (talk) 09:53, 4 November 2017 (UTC)
But you only repeat what I say, Wyang: “No, you still don't get it. ‘Derivation/derived/derives from’ on Wiktionary generally means both morphological derivation and compounding, but the terminology is not used for phonetically used glyphs.” I have said that derivation means 1. lexical/morphological derivation 2. in a broader sense, compounding 3. by virtue of the abstract meaning of the word, those Chinese “compounds” which are not formed because of meaning but because of matching sounds – I don’t think that it is false to speak of “derivations” in the third case, and I have not claimed that the third meaning is used with the “Derived terms” headings. I rather opine that it is wrong to speak of “derivations”, rather than false, because of the misleading consequences you already know (people thinking that the meaning is inherited).
“but the terminology is not used for phonetically used glyphs.” It’s what I have presumed; I have even given the reasons why it is misleading to speak of derived terms. I have said “Of course they are derived terms, in so far as we take the graphical words, independent of any underlying phonetic value, and stick them together,” but that is only the reason why one might believe that it is correct to speak about the “compounds” as “derived terms”, while on the contrary I have given a reason against using it and why it is wrong, but not false. “’term’ is a word that can hardly be used in English if no focus on meaning is intended, so the reader might wrongly assume or maybe needs wrongly assumes a little that it is spoken of “derived terms” because some meaning is inherited.”
“You have to understand there isn't a perfect solution to many issues in language handling on Wiktionary” – I said ”there is no way to solve the dilemma because the available English lexicon etc.”.
“the insistence that "derivation" on Wiktionary be interpreted in your manner” I have not insisted on anything, I have quite explicitly stated that I know no solution but rather deny its possibility – ”However so far it seems to me that there is no way to solve the dilemma”. My purpose has been to pronounce in which different meanings “derivation” can be understood by whomever, to elucidate the causes of the frictions about and that the definition in WT:EL needlessly contributes to a confusion by dint of displaying only half of the practice with the “Derived terms” heading. How difficult is it to get that I do not call for any action (short of possibly clarifications in WT:EL to regard compounds as in German and those here called “compounds” in the Chinese entries) but call for awareness about the terminology needs being frictious?
In other words:
  1. The ”compounds” listed at are all compounds.
  2. The ”compounds” listed at are not only compounds.
  3. The ”compounds” listed at are all derived terms.
  4. The ”compounds” listed at are not only derived terms.
I esteem all four propositions correct, I am with no side. No contradiction is intended. And if I have said “I presume that Rua is right about that heading” it’s not because I side but because it follows from her proposition that I have formulated in (3.) being correct and only that far. However I do not recommend that stance, as this choosing of words is more likely to be misunderstood than “compounds”; and I do not recommend any wording, as I have said, as all wordings I know are wrong. Palaestrator verborum (loquier) 17:59, 4 November 2017 (UTC)
TLDR. You lost me when I saw the number of bytes in your addition. Not everyone has time for this. Wyang (talk) 23:09, 4 November 2017 (UTC)
@Suzukaze-c: Neither of your reasons are very convincing; as I said, most tables have plenty of space, and otherwise they could be expanded lengthwise. As happens in the table for "Derived terms from", leaving out those pronunciations we are not aware of is not very disturbing, just as we do with unknown etymologies, translations, definitions themselves... and yet those words are added as "Derived terms from", appearing in red. --Backinstadiums (talk) 22:45, 3 November 2017 (UTC)
I don't think adding pronunciations to the module is sustainable, especially considering that there will be topolects not covered yet or words with multiple readings but I think adding simplified form is probably feasible and should be done. So far, the promise after the centralisation of the contents to always provide simplified characters in all entries was kept in all Chinese modules but not in this one. --Anatoli T. (обсудить/вклад) 03:27, 4 November 2017 (UTC)
What on earth happened to politeness in this functionality request (actually more like a demand)? Wyang (talk) 07:38, 4 November 2017 (UTC)
Hi guys, @Wyang: I am sorry if I seem too "demanding", or even cheeky, but I am just a language enthusiast, and so I try to improve this place with objective critical posts. --Backinstadiums (talk) 08:36, 4 November 2017 (UTC)
I didn't mean to sound demanding either, sorry if it sounded that way. It was a general comment - we have kept the promise but not always, which includes myself, since I have been involved in Chinese edits as well. The feature (of providing simplified variants) would be good to have. --Anatoli T. (обсудить/вклад) 09:59, 4 November 2017 (UTC)
@Wyang: No offense taken at all, do not worry :-) --Backinstadiums (talk) 10:36, 4 November 2017 (UTC)


Regarding Appendix:HSK_list_of_Mandarin_words, except for the first level, the rest presents first the simplified word, then traditional in parentheses, which as far as I can tell is the reverse approach taken here. Furthermore, the advanced level shows a formatting error, starting at letter "w", repeating "Template:l" in a column several times. --Backinstadiums (talk) 11:20, 4 November 2017 (UTC)

WT:ACCEL upgrade

Over at WT:GP there's a thread about improving the WT:ACCEL script. I've now completed this and I'd like to replace the original script with it. To spare you all the technical details of the discussion, here's what's changed in the new version:

  • Rather than generating the entire new entry for each accelerated link in a page, it only appends the acceleration data onto the URL.
  • If the OrangeLinks gadget is enabled, any orange links are modified to become edit links, and also have acceleration data added.
  • The data in the URL is picked up by the edit page, which is what actually generates the entry and places the content in the edit window. This relieves some of the workload from the main page.
  • If you are editing an existing page and acceleration data is given in the URL (as with modified orange links), the script will insert the new language section into the existing wikitext at the appropriate location.

This last point is especially a big improvement, as it means that you can now use accelerated links even if the entry already exists. The current creation rules work with the new script without changes, and no changes need to be made to accelerated links.

Is it ok to replace the old script with my new one? The code is at User:Rua/Gadget-AcceleratedFormCreation.js. —Rua (mew) 11:56, 4 November 2017 (UTC)

An additional feature has been added. Now, it is possible to specify multiple sets of grammar tags in the URL, using accel1=, accel2= and so on. The script will generate multiple entries in this case, but it will try to merge them so that information isn't duplicated. If two adjacent entries differ only in their definition, with everything else being equal, then the two entries are merged and the definitions are placed next to each other. Moreover, if both definitions use {{inflection of}}, then the inflection tags are also merged, so that e.g. definition 1 with {{inflection of|soddjil||com|s|lang=se}} and definition 2 with {{inflection of|soddjil||loc|p|lang=se}} are merged as {{inflection of|soddjil||com|s|;|loc|p|lang=se}}.
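
The tag merging described here can be sketched as below. This is an illustration in Python, not the gadget's actual JavaScript: the helper names are hypothetical, and the parsing assumes a flat template in which any parameter containing "=" is a named parameter.

```python
def parse_inflection_of(template: str):
    """Split an {{inflection of|...}} call into positional and named parameters."""
    parts = template.strip()[2:-2].split("|")
    if parts[0] != "inflection of":
        raise ValueError("not an {{inflection of}} call")
    positional = [p for p in parts[1:] if "=" not in p]
    named = [p for p in parts[1:] if "=" in p]
    return positional, named


def merge_inflections(first: str, second: str) -> str:
    """Merge two calls for the same lemma into one, separating tag sets with ';'."""
    pos1, named1 = parse_inflection_of(first)
    pos2, named2 = parse_inflection_of(second)
    # The first two positional parameters are the lemma and its (here empty) display form.
    if pos1[:2] != pos2[:2] or named1 != named2:
        raise ValueError("templates differ in lemma or named parameters")
    tags = pos1[2:] + [";"] + pos2[2:]
    return "{{" + "|".join(["inflection of"] + pos1[:2] + tags + named1) + "}}"
```

Called on the two definitions from the example, merge_inflections returns {{inflection of|soddjil||com|s|;|loc|p|lang=se}}.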

The part of the script that generates the links does not yet make use of this feature, so if two different forms have the same page name, each link will still only generate an entry for its own form. It would be desirable for the script to combine accelerated links if they have the same page name, but this is not trivial: the script would have to decide in which order to place the forms. Consider the entry soddjil: the comitative singular is identical to the locative plural. In the HTML, the locative plural appears first, since tables are organised by row and locative is in a higher row than comitative. It would be desirable, however, to generate an entry with the order com|s|;|loc|p as above, with singular taking precedence over plural. Either the script would have to be modified with more complicated logic to decide which goes first, or the ordering of the elements in the HTML would have to be changed so that all singular forms appear in the HTML before all plural forms. —Rua (mew) 14:53, 5 November 2017 (UTC)

After some more work, the script will now automatically combine identical forms into a single link. It does this on a per-language basis, meaning it gathers all the accelerated links for a particular language for a particular target page, and combines them. It uses the order that the links appear in the HTML to determine which order the entries should be generated in. But perhaps there are ways to change the HTML order? —Rua (mew) 16:21, 5 November 2017 (UTC)
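
As a rough sketch of that grouping step (again in Python rather than the gadget's JavaScript, with a hypothetical function name and placeholder page names), links can be collected per language and target page while keeping the order in which they appear in the HTML:

```python
def combine_links(links):
    """Group accelerated links by (language, target page).

    `links` is an iterable of (lang, target, tags) tuples in HTML order.
    Python dicts preserve insertion order, so both the groups and the tag
    sets inside each group stay in HTML order.
    """
    combined = {}
    for lang, target, tags in links:
        combined.setdefault((lang, target), []).append(tags)
    return combined
```

Each resulting group would then become a single link carrying all of its tag sets.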

It is a definite improvement to not generate a bunch of new entries at once, and to allow the gadget to add language sections to existing pages. However, I haven't used the gadget much (it needs support for Ancient Greek), so I'm not a good person to test it.
For Ancient Greek and Latin at least, I doubt it would be worth trying to mess with the HTML in tables to get the forms to appear in the correct order. The HTML order for Ancient Greek and Latin nouns is by case and then by number, as the numbers are in column headers and the cases in row headers. The order could be changed by putting each number in a subtable, but that would be messy. The order for Ancient Greek verbs is probably okay though.
Perhaps the order could be imposed using maps and the array sorting function. Thinking of Latin nouns, maps from label to information: type (case or number) and a ranking number (nominative: 1, genitive: 2, ...; singular: 1, plural: 2). And a map from number and case to their ranking numbers (number: 1, case: 2). — Eru·tuon 06:32, 12 November 2017 (UTC)
I've implemented another possibility instead, the one I described on WT:GP. Now, if you want forms in one column (e.g. singular) to always have their definition before forms in another column (e.g. plural), then you specify data-accel-col=number on the table cells of each column. Forms within the same column will then always come before forms in a column with a higher number. This can be seen in Module:se-adjectives (or for an example entry, soddjil). Although the comitative singular form appears in the HTML after the locative plural form, the column numbers make sure it comes first in the generated entry, so that you get {{inflection of|soddjil||com|s|;|loc|p|lang=se}} and not {{inflection of|soddjil||loc|p|;|com|s|lang=se}}. This should work for Latin and Greek as well. —Rua (mew) 16:47, 13 November 2017 (UTC)
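
The effect of the column numbers can be illustrated with a stable sort. This is a minimal sketch, assuming each form arrives as a (tags, column) pair where column is the value of data-accel-col; the function name is hypothetical and cells without a column number are not handled.

```python
def order_forms(forms):
    """Order tag sets by column number, keeping HTML order within a column.

    `forms` is a list of (tags, column) pairs in HTML order. sorted() is
    stable, so forms that share a column keep their relative order.
    """
    return [tags for tags, column in sorted(forms, key=lambda form: form[1])]


# Locative plural comes first in the HTML, but the singular column is numbered lower.
ordered = order_forms([("loc|p", 2), ("com|s", 1)])
merged_tags = "|;|".join(ordered)
```

With the soddjil example, merged_tags is "com|s|;|loc|p", matching the order wanted in the generated entry.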

If there are no further comments, I will put in the new script on the 19th, two weeks after the initial post. —Rua (mew) 13:11, 14 November 2017 (UTC)

The script has now been activated. Have fun! —Rua (mew) 11:47, 19 November 2017 (UTC)

Thanks, seems like an improvement. Equinox 16:14, 21 November 2017 (UTC)

Chinese anagrams

鞦韆 indicates it's an anagram, but without being indexed to a category of anagrams. How can I find a list of the ones shown in Wiktionary?

This should be a good approximation of what you wanted. Wyang (talk) 00:07, 5 November 2017 (UTC)
@Wyang: Could you tell me where to find info. about the advanced search operators? --Backinstadiums (talk) 08:12, 5 November 2017 (UTC)
You can find more documentation on mw:Help:CirrusSearch and w:Help:Advanced search. Wyang (talk) 08:16, 5 November 2017 (UTC)
@Wyang: Thanks again! BTW, I'd rather use that space for the HSK levels, creating a category for anagrams and indexing words to it. --Backinstadiums (talk) 08:52, 5 November 2017 (UTC)
Wiktionary currently does not have an anagrams category in any language. —suzukaze (tc) 08:54, 5 November 2017 (UTC)
@Suzukaze-c: Regarding the English language, I've read similar proposals and discussions about the topic.--Backinstadiums (talk) 09:00, 5 November 2017 (UTC)
The problem is most words don't have anagrams. So if we created a category (automatically) for each word like "English words with alphagram abc", we would have hundreds of thousands of categories with only a single member. Adding the alphagram categories explicitly only for words with anagrams wouldn't provide any benefit over just listing the anagrams on the page with a bot (like I do now). DTLHS (talk) 21:54, 5 November 2017 (UTC)
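
To make the alphagram point concrete, here is a small sketch (not the bot's actual code; the function names are hypothetical). An alphagram is the word's characters in sorted order, and only alphagrams shared by two or more words yield anagrams, so single-member groups are dropped.

```python
from collections import defaultdict


def alphagram(word: str) -> str:
    """Return the characters of `word` in sorted order."""
    return "".join(sorted(word))


def anagram_groups(words):
    """Map each alphagram to its words, keeping only groups with real anagrams."""
    groups = defaultdict(list)
    for word in words:
        groups[alphagram(word)].append(word)
    return {key: members for key, members in groups.items() if len(members) > 1}
```

For example, anagram_groups(["listen", "silent", "enlist", "cat"]) keeps one group under the alphagram "eilnst" and discards "cat", which is the single-member case that would otherwise produce a one-entry category.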

Desysopping CodeCat aka Rua

FYI, I created Wiktionary:Votes/sy-2017-11/Desysopping CodeCat aka Rua. --Dan Polansky (talk) 22:40, 4 November 2017 (UTC)

variant forms of 𠃊

The cross-references for the variant forms of 𠃊 seem to have failed. Aren't they generated automatically? --Backinstadiums (talk) 08:57, 5 November 2017 (UTC)

nope! all manual —suzukaze (tc) 08:58, 5 November 2017 (UTC)
O.K. Is it as such on purpose or for technical issues, or rather nobody noticed it before? --Backinstadiums (talk) 09:02, 5 November 2017 (UTC)

Grouping descendants (by borrowing) by language family

It has come to my attention that this habit that I feel improves readability may be more controversial than I presumed, because the reader might interpret it as the word being borrowed into a common ancestor of the family.

To give an example, here's one from *buka:

Here's the alphabetical ordering for comparison:

I think that parsing meaningful information from the list in this form is much harder, am I alone in this? Crom daba (talk) 22:35, 5 November 2017 (UTC)

I agree. I like the language family organization better, but it is potentially ambiguous as descendants lists in Reconstruction entries use family names as a proxy for proto-languages. — Eru·tuon 00:45, 6 November 2017 (UTC)
We could change that. —Rua (mew) 00:46, 6 November 2017 (UTC)
I agree. — Ungoliant (falai) 16:48, 13 November 2017 (UTC)

Chinese templates

Templates like {{Chinese-numbers}} could offer the pronunciation right under the characters (BTW, if sb. please would show me where one can add such info., I'll be so grateful). Their category page could also show the pronunciation since there're few items in them (I volunteer to do it manually, yet I'd need some guidelines on how to proceed).

Secondly, other templates such as {{list:days_of_the_week/zh}} are not displayed as an expandable box with translations, which worsens the user experience. --Backinstadiums (talk) 11:43, 6 November 2017 (UTC)

I believe the latter is by design. None of the list templates offer translations. —suzukaze (tc) 17:17, 6 November 2017 (UTC)
@Suzukaze-c: Is there any specific reason why the second design has been chosen? Can list templates start to show translations? What about pronunciations? --Backinstadiums (talk) 19:06, 6 November 2017 (UTC)

The Community Wishlist Survey 2017

Hey everyone,

The Community Wishlist Survey is the process by which the Wikimedia communities decide what the Wikimedia Foundation Community Tech team should work on over the next year.

The Community Tech team is focused on tools for experienced Wikimedia editors. You can post technical proposals from now until November 20. The communities will vote on the proposals between November 28 and December 12. You can read more on the 2017 wishlist survey page. /Johan (WMF) (talk) 20:17, 6 November 2017 (UTC)

Yeah! Go for thousands of proposals! I am convinced none of them will be picked by the Community Tech, but they may inspire a dev or a session at a future hackathon...or at least a dozen proposals may become a signal for the Wikimedia Foundation that Wiktionaries do exist and deserve some attention. So please, express your wishes! Noé 09:05, 8 November 2017 (UTC)
There's a section specifically for Wiktionary proposals at meta:2017 Community Wishlist Survey/Wiktionary. --Yair rand (talk) 18:51, 8 November 2017 (UTC)

Annotating the first English–Navajo dictionary

https://www.newyorker.com/culture/personal-history/annotating-the-first-page-of-the-first-navajo-english-dictionary Justin (koavf)TCM 21:13, 7 November 2017 (UTC)

paywall :( --2A02:2788:A4:F44:39D4:48CB:D128:F4DE 21:19, 7 November 2017 (UTC)
I don't see one. —Justin (koavf)TCM 21:51, 7 November 2017 (UTC)

Category:Metaphors by language

These categories are currently empty. Meanwhile, the label {{lb|metaphorically}} automatically redirects to {{lb|figuratively}}, but we have no CAT:Terms used figuratively by language or CAT:Terms with figurative senses by language. Should we create one of these and delete CAT:Metaphors, or start using it? --Barytonesis (talk) 23:39, 7 November 2017 (UTC)


I created Appendix:Fiction/Films with a few fictional terms used in films. --Daniel Carrero (talk) 21:52, 9 November 2017 (UTC)

Oh thanks!! Equinox 23:20, 9 November 2017 (UTC)
You're welcome. Although I seem to remember that you don't like when I create appendices of works of fiction, so maybe there's a chance you're being sarcastic. --Daniel Carrero (talk) 23:23, 9 November 2017 (UTC)
lol Equinox 23:26, 9 November 2017 (UTC)
rsrs --Daniel Carrero (talk) 23:27, 9 November 2017 (UTC)
lol Equinox 23:29, 9 November 2017 (UTC)

When a city and a province/state/subdivision share a name

In the Netherlands, two provinces, Groningen and Utrecht, have cities in them by the same name. The cities of course came first, and are the usual referent of these names; the provinces were named after the cities. But what order should the definitions be in? Since the city is the most common sense, it should come first. However, the city is defined as being in the province of the same name, which hasn't been defined yet. So the city definition depends on the province definition. How should this be handled? —Rua (mew) 20:30, 10 November 2017 (UTC)

A map would give an ostensive definition of both (or all three) that surpasses the kind of verbal definition in those entries. A good verbal definition of the city wouldn't reference the province. The etymology or {{defdate}} (together with basic common sense on the part of the user) could address which definition came first. DCDuring (talk) 21:14, 10 November 2017 (UTC)
I guess the province should be listed last: A province in the Netherlands named after the city / capital ? DonnanZ (talk) 00:09, 12 November 2017 (UTC)
  • Just in passing…"Since the city is the most common sense, it should come first." – that isn't policy AFAIK, and I for one am dead against it (in favour of historical ordering). Ƿidsiþ 08:11, 16 November 2017 (UTC)
    • The city is also the oldest sense, so the point is moot. But the oldest sense should definitely not come first by default. Do we really want entries to start with a bunch of obsolete senses before getting to the ones that really matter? I would rather have the most important sense first, the one that is most likely the one intended by the user. —Rua (mew) 15:13, 16 November 2017 (UTC)
      • "Do we really want entries to start with a bunch of obsolete senses before getting to the ones that really matter?" I do, since for me those are the ones that matter. It also makes the development of senses much more obvious; otherwise it's hard to understand the connection between wildly different senses of a word that has many definitions. It's also not clear to me how you will determine which sense is most "important" in a word like set. But this is a discussion that should be pursued elsewhere; the only point to make now is that we do not have an agreed policy on the matter. Ƿidsiþ 15:50, 16 November 2017 (UTC)

French IPA pronunciation - express markup vs. autotemplate

Is it preferable to replace express IPA markup with autotemplate, like in diff?

Thus, replace {{IPA|/e.zɔ.te.ʁik/|lang=fr}} with {{fr-IPA}}.

--Dan Polansky (talk) 22:02, 10 November 2017 (UTC)

I don't have a problem with it, if the template generates the right output. —Rua (mew) 22:51, 10 November 2017 (UTC)
If the backend module is stable and well tested I am fine with it. DTLHS (talk) 01:17, 11 November 2017 (UTC)
So am I. I've seen a bunch of these replacements over the last several weeks and they always seem to give the right results. It is possible to give a phonetic respelling in |1= for words whose pronunciation is not predictable from the spelling. —Aɴɢʀ (talk) 08:12, 11 November 2017 (UTC)
More importantly: with well over 5,000 edits over 10 days at rates up to 8 per minute, this looks like an unauthorized bot. I've blocked them accordingly. Chuck Entz (talk) 23:29, 11 November 2017 (UTC)
Well, if you're referring to User:, it's not a bot. Just a human who edits fast. Please don't block them. They are useful. --Spreaderofwords (talk) 15:49, 13 November 2017 (UTC)

Characters in the same phonetic series (Zhengzhang, 2003)

Could sb. please add how the info. "Characters in the same phonetic series (Zhengzhang, 2003)" can be used for language learning purposes? Thanks --Backinstadiums (talk) 20:52, 11 November 2017 (UTC)

It would help you understand how the characters were coined in the first place, I hope. Wyang (talk) 21:12, 11 November 2017 (UTC)
@Wyang: could the rest of reconstructions be added as well? --Backinstadiums (talk) 15:30, 12 November 2017 (UTC)
@Backinstadiums I don't have the data, unfortunately. Someone needs to key in the data manually or extract them from somewhere reliable. Wyang (talk) 07:38, 14 November 2017 (UTC)

Changes to the global ban policy

Hello. Some changes to the community global ban policy have been proposed. Your comments are welcome at m:Requests for comment/Improvement of global ban policy. Please translate this message to your language, if needed. Cordially. Matiia (Matiia) 00:34, 12 November 2017 (UTC)
On Meta, I posted my oppose since the change in wording makes it possible to globally ban someone based on only two bans on Wikimedia projects even if those bans are on projects ruled by small cliques without transparent processes. --Dan Polansky (talk) 11:28, 12 November 2017 (UTC)

Vote: Placing Wikidata ID in sense ID of proper nouns

FYI, I created Wiktionary:Votes/2017-11/Placing Wikidata ID in sense ID of proper nouns.

Let us postpone the vote as much as discussion requires, if at all. --Dan Polansky (talk) 09:30, 12 November 2017 (UTC)

Why is this just about proper nouns? Shouldn't we generalize this? - Jberkel (talk) 12:04, 13 November 2017 (UTC)
@Jberkel: You may already know this, but proper nouns seem to be a special case. Wikidata is for concepts, not words. This is OK for proper nouns. Wikidata is expected to have a specific data item for each place, like d:Q29 = Spain. But things get tricky with some other parts of speech. Like, d:Q3133 is green, but would we use that for the noun or adjective? d:Q31920 is the concept of swimming, so would we add that ID for multiple senses of "swim" and "swimming"? In any event, proper nouns may get the same ID in multiple entries, but the sense should be the same: d:Q30 means United States of America and is available for use in USA, US, United States of America, United States, etc. The same ID can be used for other languages, like in entry Estados Unidos, which means "United States" in Portuguese. Still, the sense is the same.
Let me know if there are any exceptions or problems I did not think of yet.
I would be fine with keeping the vote for proper nouns only. Or maybe creating 2 options: proper nouns, and everything else. But the "everything else" part was not properly discussed yet, so maybe I would oppose it. I support placing the Wikidata ID as the sense ID of proper nouns.
It is important to note that some Wikidata ID senses are already being added to proper noun entries. I assume this is why @Dan Polansky created this vote. This Wikidata ID project was discussed at least a couple of times (discussion links are in the vote). This project was never voted on. --Daniel Carrero (talk) 13:50, 14 November 2017 (UTC)
OK, but I can think of many cases where Wikidata ids are also applicable for “normal” nouns / senses, if they refer to specific concepts. As an example, class can refer to a class in Object Oriented Programming: this can be mapped to d:Q4479242. If class refers to a mathematical class, there's d:Q217594. etc.
Of course we shouldn't add Wikidata ids randomly to any sense, but really just to those where there is a good match and where it is necessary to distinguish multiple senses of a word. For proper nouns this is less controversial but also less useful (ok, the classic counter example here would probably be “Paris, Texas”). – Jberkel (talk) 17:02, 14 November 2017 (UTC)

Browser extension to label a selected Chinese word

Since I am starting to learn Chinese, instead of sitting down and manually editing entries systematically, I thought about doing it on the go, checking whenever they show up in an online text.

After selecting a Chinese word, I need an extension to show which of four groups of words, numbered 1 to 4, that word belongs to; nothing should pop up if the selected word does not belong to any of the four groups. The four groups of Chinese words are in this list, and the label to appear is the "level number" of each category.

Regarding the appearance of the number (typeface, font, place etc.) any suggestions are welcome. @Atitarev --Backinstadiums (talk) 09:41, 12 November 2017 (UTC)

If you are interested in a general cursor translation tool for Chinese, you can try Lingoes (for Windows) or the in-built Mac dictionary. Both provide decent results for simplified Chinese at least. Wyang (talk) 09:47, 12 November 2017 (UTC)
@Wyang: That's not what I meant. Please check this thread. --Backinstadiums (talk) 09:54, 12 November 2017 (UTC)
That discussion is a bit hard to follow, I still don't quite understand your request or question. Do you mean editing Wiktionary pages themselves to include the levels, or ...? Wyang (talk) 09:58, 12 November 2017 (UTC)
@Wyang: A browser extension so that any user can select the headword of a Chinese entry which hasn't been indexed to an HSK level category yet, and check which one it belongs to, or whether it doesn't belong to any, thereby collectively editing them. --Backinstadiums (talk) 10:09, 12 November 2017 (UTC)
(edit conflict) @Wyang: User:Backinstadiums must be talking about a pop-up dictionary similar to Perapera Chinese Firefox or Chrome, which I use (there are also Japanese and Korean versions). Absolutely unreasonable requests. We are not producing browser extensions! --Anatoli T. (обсудить/вклад) 10:12, 12 November 2017 (UTC)
If Anatoli's right, then my first reply in this thread would apply - there are already mature software and extensions that do that. If the aim is to add the appropriate category to all HSK words, manually adding the category to entries missing it would be easier to achieve and much more efficient. Wyang (talk) 10:18, 12 November 2017 (UTC)
@Wyang: "If the aim is to add the appropriate category to all HSK words": correct, that's the goal. "Manually adding the category to entries missing it would be easier to achieve and much more efficient": this is false, for the fact is that it's not done yet. I just want to engage with the thousands of users that use Wiktionary monthly. Creating a page of tasks or projects to carry out would really help, similar to https://commons.wikimedia.org/wiki/Commons_talk:Stroke_Order_Project. -- Backinstadiums (talk) 11:01, 12 November 2017 (UTC)
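For reference, the lookup being requested is small regardless of whether it lives in a browser extension or a script. The following is a minimal Python sketch of that logic only, assuming the four word lists have already been loaded into sets; the function name and the sample data are hypothetical placeholders, not the real HSK lists.

```python
from typing import Dict, Optional, Set


def hsk_level(word: str, levels: Dict[int, Set[str]]) -> Optional[int]:
    """Return the level number whose word set contains `word`, or None."""
    for level in sorted(levels):
        if word in levels[level]:
            return level
    # The word is in none of the groups, so nothing should be shown.
    return None


# Placeholder membership for illustration; not the actual level assignments.
sample_levels = {
    1: {"你好", "谢谢"},
    2: {"因为"},
    3: {"环境"},
    4: {"安排"},
}
```

A caller would display the returned number next to the selected word and show nothing when the result is None.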

Topics of discussion

I'm new to Wiktionary. Continuing from the BP discussion above: what topics are there besides (1) improving Wiktionary, (2) word etymologies, (3) policy topics, and (4) discussion about discussion? -Aision (talk) 03:13, 13 November 2017 (UTC)

Occasionally we slag off another that we find disagreeable. And every now and then we tell jokes or play a wiki-game. --Spreaderofwords (talk) 15:39, 13 November 2017 (UTC)

Defining affixes

I've seen that Equinox (talk, contribs) has been changing a lot of affix definitions to be less descriptive and more gloss-like; for instance, the definition of viscero- used to say "Forming compound words related to viscera", and it now reads just "viscera". Ditto for many dozens of similar affixes on my watchlist. Is this in accordance with some policy that I missed? It seems strange to me, because "viscero-" doesn't really mean "viscera", otherwise you'd be able to say "I removed the animal's viscero-". I understand that descriptive definitions are something of a grey area, but in the case of affixes they seem justifiable to me. Ƿidsiþ 09:57, 13 November 2017 (UTC)

Mh, yes, I preferred the old version. I've reverted back to it and added {{non-gloss}}. --Barytonesis (talk) 10:05, 13 November 2017 (UTC)
Well OK, but given how many pages this involves, I'd rather work out some consensus on the issue rather than changing one or two here and there. Ƿidsiþ 11:58, 13 November 2017 (UTC)
I concur as well. —Μετάknowledgediscuss/deeds 19:15, 13 November 2017 (UTC)
Hi! I've been doing this because I think that sense lines should tell us what is meant by the prefix (or suffix etc.). We don't define apple as "MEANING A SORT OF FRUIT", we just say what it means. So, to me, we should equally define a *fix like "-phobia" as meaning "fear" and not as "MEANING FEAR" because that isn't a definition per se. I seem to be in a minority here but let us talk a bit more. Why should pre/suffixes be defined in a weird way that we would revert in a second if we saw them for normal nouns or adjectives? Equinox 23:34, 13 November 2017 (UTC)
The quick-and-dirty fix is to put the old n-g template around some text that says "meaning XYZ" but I really think that's dodging the issue. Surely it's reasonable to say (for example) that the prefix "mega-" means "million", without spazzing too much. Yet would we write "MEANS A MILLION" at the capital symbol M, rather than merely writing "million"? Look at the SI units. Equinox 23:40, 13 November 2017 (UTC)
I sort of want to ask, if we need to write that a suffix -xx is used in "forming compound words", well what else would it do? Suffixes by their very nature form compound words. When I delete that stuff I feel like I'm deleting the bracketed part from "dog, n. NOUN [THIS IS A WORD AND A NOUN] an animal that barks". hissss. Equinox 23:45, 13 November 2017 (UTC)
I understand the concern but I think it makes things less clear. "An animal that barks" is already a (compound) noun, and "move quickly" for "run" is already a (compound) verb, but the same is not true of your affix definitions. It's a bit like interjections – we define words like sorry by saying that they "express regret" or similar, which is metadefinitional in the same way. I don't know exactly how it should be codified, but the point is that I think for most people, saying that viscero- means "viscera" is not sufficient, even given the stated part of speech. Ƿidsiþ 07:39, 14 November 2017 (UTC)
Perhaps we shouldn't give them a definition at all, but just redirect to the plain word somehow? --Barytonesis (talk) 10:46, 14 November 2017 (UTC)
The Equinox version has its charm, for the reason given by Equinox: "Forming compound words concerning viscera" does not tell me anything that I do not learn from "viscera". --Dan Polansky (talk) 18:05, 14 November 2017 (UTC)
https://www.merriam-webster.com/dictionary/cardio- does basically what Equinox does; so does AHD[4]; Macmillan has "relating to the heart"[5]. --Dan Polansky (talk) 18:08, 14 November 2017 (UTC)

Independent Synonyms section, or {{synonyms}} under each relevant sense?

I've noticed some use of {{synonyms}} just after relevant senses, replacing the independent ====Synonyms==== section, as in this edit to the 青空 (aozora) entry. I don't see anything relevant at Wiktionary:Entry_layout#Synonyms. Is this use of {{synonyms}} the expected practice going forward?

Curious, ‑‑ Eiríkr Útlendi │Tala við mig 17:29, 13 November 2017 (UTC)

I don't think it has widespread consensus yet, but I prefer it because it's much easier to associate synonyms with specific senses that way. When they're in a separate ====Synonyms==== section, there's a risk that a sense will get deleted, or a sense will be split into two, and then the synonyms listed in a separate section get stranded. When they're right there under the sense, people are more likely to remember to move them at the same time. They're also easier for users to find when they're right there. —Aɴɢʀ (talk) 17:52, 13 November 2017 (UTC)
One (current) problem with {{syn}} / {{synonyms}} is that it's not possible to put any qualifiers or notes along with the synonyms such as "rare". DTLHS (talk) 17:56, 13 November 2017 (UTC)
That problem could be (and should be) solved — each synonym can have an alternative display with alt1= etc, so they could have qual1= etc just as easily. —Μετάknowledgediscuss/deeds 19:14, 13 November 2017 (UTC)
I use {{syn|pt|[[word|words]] {{q|qualifier}}}}. — Ungoliant (falai) 19:23, 13 November 2017 (UTC)
Functional, but ick -- that's starting to look dangerously like Perl.  :-P ‑‑ Eiríkr Útlendi │Tala við mig 19:41, 13 November 2017 (UTC)
Yeah, it’s not pretty. Still, I think it’s better than using qual1=, qual2=, alt1=, etc., since that would generate incorrect content whenever someone moves, removes or adds synonyms without paying attention to the other parameters.
Another possibility is using regular parentheses and having the module format it like {{q}}, but that would remove the distinction between qualifiers and optional particles (e.g., deixar has a synonym largar (de), where deixar X is synonymous with largar X and largar de X). — Ungoliant (falai) 15:19, 14 November 2017 (UTC)
  • Ideally, synonyms, translations and other sense-specific information would all come in little drop-down boxes on the definition lines, much as citations do now. @Ruakh raised this issue years ago and suggested some technical ways to make it work, but it never got anywhere (mainly due to general apathy, I think – no one really objected to it). Ƿidsiþ 10:35, 14 November 2017 (UTC)
Yes, there should be drop-down boxes possible for semantic relations. For example I can’t make the entry graph very pleasing without it. Technically it would be fine if I could just take the hyponym table that is now in a separate section and put it (minus the terms belonging to other senses) with few changes under the sense it belongs to. Maybe what we need are wrappers {{hypowrap}} and so on in which we can insert the tables, but as I see for at least that table in graph that does not have multiple columns it would be fine to have parameters, like table=yes or hidden=yes, in the semantic relations templates because it is quite easy to convert that table for {{hypo}}, only that under a sense I require a hidden listing instead of a full listing without user interaction. Also, why don’t I see a template {{mero}}, {{holo}}, {{coordinate}} while there is {{syn}}, {{hypo}}, {{hyper}}, {{ant}}?
In fact, the same can be said about translations. We would then need a wrapper which the user could click for each sense to see semantic relations and translations. Or that table would have smart mathematics and hide translations always unless clicked (would be good for learners) and show semantic relations in so far as they do not have excessive length:
1. to turn off, switch off
Antonyms: anstellen, anmachen, anschalten, einschalten, (click to see more)
(click to see translations) (if it is an English sense) Palaestrator verborum (loquier) 18:33, 14 November 2017 (UTC)
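The "show as long as it is not excessively long" rule suggested above could be sketched roughly as follows. This is only an illustration in Python of one possible length cutoff; the function name, the character budget, and the placeholder fifth term are assumptions, and the real templates would be implemented in a wiki module rather than like this.

```python
def render_relation(label, terms, max_chars=60):
    """Render 'Label: a, b, c', cutting off once the term list exceeds max_chars."""
    shown = []
    used = 0
    for term in terms:
        # Each term after the first also costs two characters for ", ".
        cost = len(term) + (2 if shown else 0)
        # Always show at least one term, then stop when the budget is exceeded.
        if shown and used + cost > max_chars:
            break
        shown.append(term)
        used += cost
    line = f"{label}: " + ", ".join(shown)
    if len(shown) < len(terms):
        line += ", (click to see more)"
    return line


# The first four terms come from the example above; "extra-term" is a placeholder.
antonyms = ["anstellen", "anmachen", "anschalten", "einschalten", "extra-term"]
line = render_relation("Antonyms", antonyms, max_chars=45)
```

With a budget of 45 characters this reproduces the line in the example above; short lists are rendered in full with no "click" marker.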
One problem I have with putting more things directly under definitions is that it makes it harder to modify or combine English senses. This is especially relevant for translations since an English editor probably will not know all the languages under a particular sense. This problem does not occur with the current gloss system, although translation boxes may become out of sync with definitions. DTLHS (talk) 18:38, 14 November 2017 (UTC)
It does in fact occur currently, except that people ignore the translations when merging senses, so that they go out of sync. Putting them under senses would just make it harder to ignore this problem. —Rua (mew) 18:44, 14 November 2017 (UTC)
If an English sense is modified and the corresponding translation box is not, those translations are still valid since they have an independent gloss that they refer to. DTLHS (talk) 18:46, 14 November 2017 (UTC)
However one might still have the current translation headers. Translation sections sometimes need to be more differentiated than the English glosses. For example I added under interference a specific legalese sense “intrusion into the scope of protection of a guaranteed right” without finding it necessary to add any English gloss, as the sense is included by the most general definition while other languages have already a distinction there. For the inner-language semantic relations, it would help to keep the senses in sync with the synonyms and other semantic relations if these are directly under sense, as else editors are tempted to underspecify to which senses the synonyms belong if they do it in special sections, and, what Rua said, it is a gentle push to keep things in sync if the semantic relations and translations are under the senses. Palaestrator verborum (loquier) 18:52, 14 November 2017 (UTC)
Also one should be more conscious about when it can actually happen that senses get merged, modified, or added.
  1. If senses are added, there is no problem with the translations and semantic relations. There are none for the new sense and the older specified ones are not wrong.
  2. If you combine senses, it is, as I understand it, because it has been foolish in the first place to differentiate, so there can be no distinction discerned in two translation sections either. There are many such English definition pages, and I would like to ask how a translator shall distinguish all the allegedly different eleven concepts of understanding currently listed at enforce – it is hard to relate a table in this lemma to a gloss, and translators have already forgotten the sense which they actually want to translate when they have chosen a table, and one doubts the essentiality of the differences between the glosses, this is why there are almost no translations, unlike on enforcement. That article can hardly be made worse by adding the translations under glosses and also not further become worse by further merges, as long as and because the distinctions in the translations tables are incomprehensible. Actually having bilingual dictionaries at hand for some of the translation targets should tell you if it is really necessary to distinguish so many meanings. If in doubt, delete the translation tables because they are insufficient and unhealable to allow the births of new ones – it is better than letting confusing, poorly hedged glosses stay.
  3. If you modify senses, it is because you want to make the senses clearer: In those cases the translations will continue being correct and in the future it will be easier to translate because the sense to be translated is more clear for the translator. If you modify the sense because the former sense was just plain wrong and not extant for the lemma, which is highly unlikely, the translations can be deleted because they are misplaced. Palaestrator verborum (loquier) 19:25, 14 November 2017 (UTC)
Very true, but there's a fourth case: splitting senses. —Rua (mew) 19:37, 14 November 2017 (UTC)
Well, I presumed that the senses are somehow monadic, so splitting would be modifying (specifying) + adding. Because how wretched must one be to have actually two separate meanings as one gloss? At the latest the translator will split up such a gloss. It’s the whole problem of unreadable pages and translations conflicting with glosses if the editors do not understand that the fact that they can describe meanings in multiple glosses does not imply that they are dealing with that many separate meanings. By talking more one does not necessarily make more statements, sometimes one clarifies, sometimes one even obfuscates by hiding the simple. For the lovers of examples, a complete fail of the described kind has happened at نِسْبَة (nisba) until fixed today where there was a list of glosses with one word each gloss as if the English words had no manifold meaning themselves. The skilled editor does not just list every possible rewording but distinguishes the meanings in relation opposite to each other and specializes them as feasible – with means of vocabulary or layout – for the reader to understand given sentences that contain the word; sometimes the translation tables are desired to be even more special than the meaning has been presented without them, but listing the semantic relations and translations for each gloss helps to distinguish independently of this. There is in the case of enforce one abstract meaning that, as I admit, needs at least two main glosses to describe (mainly force for the increase of the effect of something being added and force being just applied on something or someone for a certain effect, as I see before now actually regrouping the enforce lemma content), but not eleven. Palaestrator verborum (loquier) 23:16, 14 November 2017 (UTC)
I like separate synonym sections as mandated by WT:EL, but there was a recent poll showing many people like the new format. --Dan Polansky (talk) 07:28, 17 November 2017 (UTC)
The poll is at Wiktionary:Beer_parlour/2017/May#Poll: putting "nyms" directly under definition lines. It shows 2/3-supermajority support for placing -nyms (including synonyms) directly under definitions. --Dan Polansky (talk) 07:44, 17 November 2017 (UTC)

Hyphen at the end of a proto-language word

What is the meaning of the hyphen at the end of a proto-language word? For example, if the source uses *pura without a hyphen, can it be changed to *pura-? This change was made at the Hungarian verb fúr and I wonder if it is valid. Thanks. --Panda10 (talk) 18:11, 13 November 2017 (UTC)

It's how the lemma form has been chosen for Proto-Uralic verbs. Verb lemmas are the stem, noun/adjective lemmas are the nominative singular. —Rua (mew) 18:59, 13 November 2017 (UTC)

Proposal: delete all Latin script letter senses from all languages except Translingual

I suggest deleting all Latin script letter senses from all languages except Translingual. The "letter name" senses like "bee" = "name of letter B" may be kept.

Example: diff. (I just edited the first few language sections)

If we keep the current format without changes and add a language section for each language that uses "a", it will become basically infinite, and useless. It gets increasingly hard to find the non-letter senses.

I know I have proposed different things in the past concerning letters. I've been trying to figure out what to do with them.

Pronunciation can be found in appendices like Appendix:English pronunciation. The appendices may even be expanded with detailed info if needed.

Feel free to propose different things or say if there's any problem with this idea.

This is a major proposal, so if people like this idea, I'm sure it would need to be voted on at some point. I'm not in a hurry. --Daniel Carrero (talk) 23:08, 13 November 2017 (UTC)

Why only Latin? The page А is also huge. DTLHS (talk) 23:12, 13 November 2017 (UTC)
I support this, but for other scripts as well. --Barytonesis (talk) 23:29, 13 November 2017 (UTC)
What about unusual or maybe even non-Translingual letters? For example, are there languages other than Georgian using Georgian letters like , or languages other than Cherokee using Cherokee letters like ? If there aren't, then properly speaking it's not translingual.
What about inflection? {{en-letter|upper=A|lower=a}} also produces "plural a's" which is an English plural. Well, it might belong into a noun and not a letter section, but there wouldn't be much gain in removing the letter section and adding new noun sections for inflection.
- 00:05, 14 November 2017 (UTC)
Support, though some solution for inflection does need to be found. I definitely agree that these pages are so huge, and have the potential to get so much huger, that they are useless. Moreover, letters themselves don't convey meaning, they just stand for themselves just like any mention of a word stands for the word itself, so they don't even meet CFI. Yes, letters can be used in an ordinal fashion, but our current entries don't make any mention of this so there is no loss in that regard. —Rua (mew) 00:18, 14 November 2017 (UTC)
Oppose. I find pronunciation information for individual languages very helpful, especially for the actual name of the letter (how else could you figure out how to spell in another language?). As an alternate, would it be possible to make the Translingual page the default, and move most language-specific information on the letter to an appendix, or break it up into more than one mainspace entry somehow (with links from the main entry), or have two versions of the page? I very badly don't want to lose all the pronunciation information, so I think simply stripping the entries down is a bad idea. Andrew Sheedy (talk) 01:31, 14 November 2017 (UTC)
"Pronunciation can be found in appendices like Appendix:English pronunciation." ;) —suzukaze (tc) 01:54, 14 November 2017 (UTC)
Does that appendix explain that the letter A is pronounced /eɪ/? I can't see it. Ƿidsiþ 08:57, 14 November 2017 (UTC)
@Widsith: No it doesn't, but see my suggestion below. —Aɴɢʀ (talk) 10:08, 14 November 2017 (UTC)
Support, and for non-Latin scripts too. —suzukaze (tc) 01:54, 14 November 2017 (UTC)
  • Support and for non-Latin scripts too. However, although "Pronunciation can be found in appendices like Appendix:English pronunciation" as Suzukaze-c points out, those appendices do not usually contain spelling-to-pronunciation rules. Perhaps we could have separate appendices for those, e.g. Appendix:English reading rules (or a better name?) which tells us that ⟨a⟩ is most often pronounced /æ/, /eɪ/, /ɑː/, and /ə/, and in what environments; ⟨b⟩ is most often pronounced /b/ but is silent at the end of a word after ⟨m⟩; and so on. —Aɴɢʀ (talk) 07:24, 14 November 2017 (UTC)
    Yeah but the point is not about how the letter is pronounced in words (that's also an issue, but one covered by appendices) – it's about what the letter is called in different languages. H is called aitch in English and ache in French; Y is called i grec in French; where will that information be? Ƿidsiþ 10:31, 14 November 2017 (UTC)
    Letter names can be found in appendices too, but maybe not the same as the pronunciation appendices. Appendix:Romanian alphabet, Appendix:Letters/Spanish and other appendices contain lists or tables of letter names. Maybe some of these appendices could be renamed and revised to use a more consistent format, but this is a wiki. --Daniel Carrero (talk) 13:17, 14 November 2017 (UTC)
  • Support moving language-specific information about letters (not letter names!) to appendices, including non-Latin scripts. — Ungoliant (falai) 13:05, 14 November 2017 (UTC)
  • Support also for all other scripts that can be covered by Appendix:Writing systems and alphabets (I do not make any statement about Chinese characters) – it will be easier to find non-letter meanings and easier to compare writing systems. The letter meanings can be outsourced to an appendix for each graph/glyph which is linked by {{also}} in the namespace, having in the case of a for example the name Appendix:Graph/a or even in a new namespace Graph:a. If an entry can only be a graph and not lexical, the mainspace can redirect. Possibly for this purpose a namespace is better, as the software can habitually check for matchings in a namespace dedicated to writing system glyphs like it redirects me to أنبوبة if I type انبوبه. Appendix:Graph/َ  or Graph:َ  with fatha seem to work too, so there is no problem with diacritics? Palaestrator verborum (loquier) 14:04, 14 November 2017 (UTC)
    I think just Appendix:a would be ok too. —Rua (mew) 14:24, 14 November 2017 (UTC)
  • Support also for non-Latin scripts (except Chinese) —Aryaman (मुझसे बात करो) 16:17, 14 November 2017 (UTC)

Vote: Restricting Thesaurus to English

FYI, I created Wiktionary:Votes/pl-2017-11/Restricting Thesaurus to English.

Let us postpone the vote as much as discussion requires, if at all. --Dan Polansky (talk) 10:28, 15 November 2017 (UTC)

Comments on Recent Votes

I am curious if people feel as I do, that vote pages are often overrun with discussions and that those discussions are often tangential at best. I propose that we eliminate discussion within the vote page itself, and if someone would like to engage in some discussion concerning an individual's vote that they do so on the vote talk page. Perhaps a link to the discussion on the talk page can be placed on the voting page, but protracted conversations within the vote only obfuscate the results. I am only referring to secondary comments; a comment left by the voter to clarify or justify their vote is, of course, perfectly fine.
Secondly, a number of times recently there have been comments to the effect that votes should include a written rationale, or requests that individuals justify their votes in some way. This is inappropriate. Any member of the community is eligible to vote, and they can vote how they please for whatever reasons they please; if they wish to share their rationale that is their prerogative. Requesting clarification about a vote is acceptable (e.g. "if the proposal had been x instead of y would you have voted differently") but demanding justification is completely out of line. - TheDaveRoss 15:33, 15 November 2017 (UTC)

protracted conversations within the vote only obfuscate the results – well, that is why the three voting possibilities have colored badges so one can count through. And technically the “result” has only to be counted through at the end by one person. Isn’t it then actually good that the result is obfuscated? It diminishes group pressure and enables free voting because one does not see such heaps of votes.
Also I have suggested that some kind of hedge template be created that demarcates digressions from the topic and maybe needs a click for drop-down, because often nobody wants to move the discussions or one does not know whither exactly to move them. Palaestrator verborum (loquier) 11:04, 16 November 2017 (UTC)
Oppose. First of all, about: "demanding justification is completely out of line". I'm not sure about that. I would support having a different voting system where unjustified votes don't count. (except maybe votes for bots and admins, etc.) But I'm fine with the current system, anyway.
More importantly, I disagree with what has been proposed. I would prefer still freely allowing discussion in vote pages. Some written rationales are more likely than others to attract further comments from other people, and thus discussions begin. I think that's fine. In fact, I consider the in-vote discussions even more important than the actual simple count of "Support"/"Oppose"/"Abstain". The discussions should encourage us to think and justify stuff. If someone says "I'm voting because of reason X", it would most likely be an avoidable hassle to go to the talk page to discuss reason X, if the alternative is replying right then and there in the vote page itself.
When someone adds the first comment to someone's rationale, it's not clear yet if this will become an actual discussion. It's not clear if someone else will reply. I don't suppose we're expected to create separate sections in the talk page for each little comment when we don't know if this will become an actual discussion, right?
Interestingly, we have this rule in WT:Voting policy: "Debate is welcome on these pages." But this rule was never itself voted on. --Daniel Carrero (talk) 12:30, 16 November 2017 (UTC)
@Daniel Carrero Part of the reason I dislike discussions within the vote is that it indicates that the vote was premature, and there was not adequate discussion prior to its start. Votes should not be islands, they should be the final step in a process. The commentary often takes forms which I find distasteful, such as pressure to change one's vote, casting aspersions on the subject of the vote or the voter themselves, or other such nonsense. Why is having the discussion within the vote so important? Re whether or not a comment will lead to a discussion, I think any comment beyond that of the voter constitutes discussion, so every time it happens it is the case. - TheDaveRoss 14:08, 16 November 2017 (UTC)
I don't think discussions in the vote are (necessarily) an indication that the vote is premature. I think we can't ever say for sure that the discussions are over. If some idea is discussed by many people and has close to 100% approval before any vote starts, this would indicate that the issue is so noncontroversial that maybe a vote is not even needed. --Daniel Carrero (talk) 14:34, 16 November 2017 (UTC)
Approval and sufficient discussion are not the same thing, I am perfectly fine with a vote being run for a controversial topic. But if people feel the need to make arguments within the vote itself it is an obvious indicator that they have not had sufficient opportunity to do so ahead of time. - TheDaveRoss 15:07, 16 November 2017 (UTC)
I oppose removing discussions from the vote page. I like the idea of show-hide concealment, perhaps beginning a short time after the last comment in the thread or when the thread is 'too' long. DCDuring (talk) 13:08, 16 November 2017 (UTC)
I support using show-hide concealment in some cases too. --Daniel Carrero (talk) 13:09, 16 November 2017 (UTC)
@DCDuring If the consensus is that discussions within votes are appropriate, then I would support hiding them as well. - TheDaveRoss 14:08, 16 November 2017 (UTC)
  • I oppose removing, limiting or hiding discussions from the vote page. These discussions are a great feature. Votes should be understood to be votes/requests for comment. If votes have the potential to be "evil" to an extent, these discussions directly on the vote page make them less so. I think we should invite more discussion directly on the vote page. The discussions do not "obfuscate the results" since we use icons and vote counting. As for "demanding justification is completely out of line", I think the opposite: we should do more to encourage people to provide rationales and make votes more like requests for comments and discussions. The idea that enough people come to discuss a proposal before the vote starts is unrealistic; it is the vote that forces people to pay attention to a proposal lest it passes. --Dan Polansky (talk) 07:24, 17 November 2017 (UTC)
    That said, it may be a good practice to continue the discussion on the vote talk page once it becomes more protracted. --Dan Polansky (talk) 17:15, 17 November 2017 (UTC)
    I also oppose, for pretty much identical reasons to Dan. —Μετάknowledgediscuss/deeds 17:16, 17 November 2017 (UTC)
    @Dan, I would consider the sentiment of your final sentence to be an abuse of the voting system, not a feature. - TheDaveRoss 20:15, 17 November 2017 (UTC)
    @TheDaveRoss: I read that sentence as a simple statement of fact, not an endorsement of starting votes purely to force people to think about things. — Eru·tuon 22:40, 20 November 2017 (UTC)
This made me start vaguely wondering if a secret ballot (with names/votes posted at conclusion of vote only) would be an improvement over a public voting page. (I doubt it!) It wouldn't stop people from making comments, anyway; would just prevent voting based on how others have voted, which I bet also happens. Equinox 16:49, 17 November 2017 (UTC)

stroke order exceptions within the standard guidelines

Where could I find a database of exceptions to the stroke order expected under the standard guidelines? Input methods' lists of input sequences could be used for this issue, which is of great lexicographical value. --Backinstadiums (talk) 09:34, 19 November 2017 (UTC)

Request translations of Wikipedia pages related to languages

After jumping from page to page for an hour, I still do not know how to request a translation. The article I'd like to have an English version of, among others, is https://zh.wikipedia.org/wiki/%E5%90%88%E6%96%87. Would it be possible to request the translation of a whole "category"? Thanks --Backinstadiums (talk) 13:49, 20 November 2017 (UTC)

We have WT:TRREQ, but requesting translation of whole Wikipedia articles is asking an awful lot. Of course, you could enter the URL of the Wikipedia page into Google Translate, but for the URL you gave the result is pretty useless (Chinese speakers may find it good for a laugh, though). Chuck Entz (talk) 14:25, 20 November 2017 (UTC)
Indeed, and it starts right from the title... lol! Wyang (talk) 14:28, 20 November 2017 (UTC)
@Chuck Entz: I meant there used to be a page for this in Wikipedia, namely https://en.wikipedia.org/wiki/Category:Translation_Request, yet it's inactive now, and I cannot find the current formal procedure. --Backinstadiums (talk) 14:58, 20 November 2017 (UTC)
@Backinstadiums: Did you see w:Wikipedia:Translation#Requesting a translation from a foreign language to English? —Aɴɢʀ (talk) 16:38, 20 November 2017 (UTC)
@Angr: Yes, and I still do not know where to post... it's easier in Wiktionary. I cannot seem to find the exact final thread to request it. I think those articles are highly relevant for Wiktionary as well. --Backinstadiums (talk) 18:31, 20 November 2017 (UTC)

New print to pdf feature for mobile web readers

CKoerner (WMF) (talk) 22:07, 20 November 2017 (UTC)

Noto font family

Google has designed and released Noto Sans and Noto Serif (freely available) that can handle a large number of languages. Currently, Noto covers over 30 scripts, and will cover all of Unicode in the future. Noto download. Go to Noto Sans specimen to see how it looks. —Stephen (Talk) 05:31, 21 November 2017 (UTC)

Bah, still no Manichaean. Crom daba (talk) 14:23, 21 November 2017 (UTC)

Gender-neutral French plurals

Should we be concerned with documenting these? —Justin (koavf)TCM 18:03, 21 November 2017 (UTC)

If they're attestable, of course. —Mahāgaja (formerly Angr) · talk 19:24, 21 November 2017 (UTC)
The · entry might need the addition of a "French > Symbol" section — again, if attestable. Equinox 19:31, 21 November 2017 (UTC)