Wiktionary:Beer parlour




Welcome, all, to the Beer Parlour! This is the place where many a historic decision has been made and where important discussions are being held daily. If you have a question about fundamental Wiktionary aspects—that is, about policies, proposals and other community-wide features—please place it at the bottom of the list (click on Start a new discussion), and it will be considered. Please keep in mind the rules of discussion: remain civil, don't make personal attacks, don't change other people's posts, and sign your comments with four tildes (~~~~), which produces your name with timestamp. Also keep in mind the purpose of this page. There are various other discussion rooms which may serve the idea behind your questions better. Please take a look to see which is most appropriate.

Sometimes discussion identifies an issue as an idea for policy development or rewriting. Such discussions may be taken out of the Beer parlour to a relevant page, or a brand new page may be created. Usually, the active policy pages will be listed in one of the sections below. See also the policy development page and the votes page.

Questions and answers will not remain on this page indefinitely, as it would very soon become too long to be editable. After a period of time with no further activity (usually a couple of weeks), information will be moved to the archives. We make a point to preserve all discussions that were started here in the archives. However, talk that is clearly not intended for this page may be moved and will not end up in the archives. Enjoy the Beer parlour!



October 2018

Language log on Vietnamese

Evidently, it is more-or-less common to have diacrtictless Vietnamese. Also, there are some shorthands like "L/L" interpolated from Chinese varieties that are common. I'm not sure how editors here can incorporate that knowledge into the dictionary but I figured I would surface it for anyone who works on Vietnamese. —Justin (koavf)TCM 04:32, 1 October 2018 (UTC)

I created L/L. That sign was probably just created on a computer lacking Vietnamese input methods. Diacritic-less Vietnamese is common in chats, but it can be highly ambiguous: a commonly cited example is Em dang o truong in a text message sent by a female, which can mean either I'm at school or I'm completely naked. Wyang (talk) 04:40, 1 October 2018 (UTC)
I've made diacriticless ="accentless", not "diacrtictless". Diacriticless Vietnamese must be much worse than toneless Chinese pinyin. We don't need those as alt spellings or even redirects. --Anatoli T. (обсудить/вклад) 04:56, 1 October 2018 (UTC)
It's like in accentless Spanish "Toque el cono" can mean "touch the cunt" or "I touched the cone", both of which, of course, are great titles for songs on my next thrash metal album --WF110 (talk) 19:21, 1 October 2018 (UTC)

New language family and protolanguage request

There are currently no codes for the Mixtec family or for Proto-Mixtec. Mixtec is a branch of Mixtecan alongside Cuicatec (which also lacks a code) and Trique (which does have a code, for some reason). --Lvovmauro (talk) 05:11, 1 October 2018 (UTC)

Oh, and likewise for Otomi and Proto-Otomi. There are probably a lot of these missing. --Lvovmauro (talk) 05:20, 1 October 2018 (UTC)

Featured entries


Is there a list of featured entries in English Wiktionary, similar to Wikipedia's featured articles? There is a ticket in Phabricator to add to Cognate a way to indicate, alongside the links to other versions of Wiktionary, whether the page there is featured, but it may be too early to develop this feature if French Wiktionary is the only Wiktionary project having a list of good entries. Noé 08:58, 1 October 2018 (UTC)

The only featured entries are the words of the day on the main page. DTLHS (talk) 16:22, 1 October 2018 (UTC)


Can someone help me? I want to create a category for the Vietnamese language to house Chu Nom, like we have a Japanese category for hiragana, katakana, etc. ---> Tooironic (talk) 11:21, 1 October 2018 (UTC)

@Tooironic: It looks like these are appropriately categorized now. —Justin (koavf)TCM 16:59, 1 October 2018 (UTC)
Category:Vietnamese Nom sounds like what you're proposing. — Eru·tuon 20:06, 1 October 2018 (UTC)
No, this is for English terms that are contextually related to the Vietnamese language. DTLHS (talk) 20:11, 1 October 2018 (UTC)
Oh, I see. Duh. The place to add Category:en:Vietnamese is apparently Module:category tree/topic cat/data/Communication. Done. — Eru·tuon 20:25, 1 October 2018 (UTC)

What about uniformise a little bit all Wiktionaries?

Hello, could you tell me what you think about T150841, and especially Addshore's comment? In brief, the idea is to have a uniform structure on Wiktionaries in order to be able to identify a language section automatically. This is needed in order, for example, to apply different colours for interwiki links depending on:

  1. There is a section in English on enwikt and a section in English on frwikt => the interwiki link to the French Wiktionary will be blue.
  2. There is a section in English on enwikt but no section in English on frwikt => the interwiki link to the French Wiktionary will be green (or another colour).
  3. There is no section in English on enwikt and no section in English on frwikt => the interwiki link to frwikt may remain blue or have another colour.
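
The three cases above can be read as a small decision rule. What follows is only a minimal sketch under stated assumptions, not anything taken from the Phabricator ticket: the function name and its two boolean parameters are hypothetical, and the colours are simply the ones named in the list.

```python
def interwiki_link_colour(local_has_section: bool, remote_has_section: bool) -> str:
    """Return a colour name for an interwiki link (hypothetical helper).

    local_has_section:  this wiki's entry has the language section of interest
    remote_has_section: the linked wiki's entry has that section as well
    """
    if local_has_section and remote_has_section:
        return "blue"   # case 1: the section exists on both wikis
    if local_has_section:
        return "green"  # case 2: the section exists only on the local wiki
    return "blue"       # case 3 and anything unlisted: keep the default colour


# An English section exists on enwikt but not on frwikt.
print(interwiki_link_colour(True, False))
```

The list leaves the colour for case 3 open ("may remain blue or have another colour"), so the sketch falls back to the default blue there.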

This is only the use case that is discussed in this ticket, but we may imagine other use cases in the future.

On the French Wiktionary, we already have a template for language headings, so I think it will be easier than in your case (I guess using a template may be a good idea here). Pamputt (talk) 20:11, 1 October 2018 (UTC)

Wouldn't this depend on having uniform language codes? We have many customized, merged or deleted languages that will not be uniform across projects. DTLHS (talk) 20:21, 1 October 2018 (UTC)
I don't see how us having a template in the header would make a difference. We have a bijective list of language names and language codes. The real problem is that we don't have bijectivity of languages between Wiktionaries, as DTLHS points out. —Μετάknowledgediscuss/deeds 20:30, 1 October 2018 (UTC)
You are right, there are two points. One is the language code (the code has to be the same between Wiktionaries) and the other is the magic word that Addshore proposes to add to do what I described above.
About the language codes, I do not think they are too different among Wiktionaries, at least for main languages (for minority languages, this is another story, I guess). Most Wiktionaries use "en" for the English language (liwikt, rowikt, viwikt and zhwikt use the "eng" code). You can see how different the language headers are between Wiktionaries here. Pamputt (talk) 06:02, 2 October 2018 (UTC)
First of all, if a solution doesn't work for minority languages, then it doesn't work at all. Most of our coded languages are "minority languages" of one type or another. Secondly, even major languages like Serbo-Croatian (sh or hr+sr or hr+sr+bs) are not bijective. You can't sweep this under the rug and expect your proposal to work. —Μετάknowledgediscuss/deeds 16:51, 3 October 2018 (UTC)
Actually, the proposal applies mainly for major languages, i.e. languages that are used on one Wiktionary project (about 170 languages in total). Pamputt (talk) 05:30, 4 October 2018 (UTC)
Why would you only focus on languages that have Wiktionaries? If I want to know if the French Wiktionary has a Vandalic section at fr:drincan (it currently doesn't), would the link at en:drincan be blue or green as a default? If it's blue, it would imply that fr.wikt had a Vandalic section, which would be false. If it's green, then the Old English interwiki would also be green, which would imply that fr:drincan lacks an Old English section, even though it actually has one. Both Vandalic and Old English lack Wiktionaries, so by ignoring them, you're essentially setting up a situation where the links will be misleading. (Also, you didn't address the problem of Serbo-Croatian.) —Μετάknowledgediscuss/deeds 05:57, 4 October 2018 (UTC)
The proposal is the reverse: in the English Wiktionary, highlight the interwiki links that have an English section. In the French Wiktionary it would highlight the English interwiki if it detects somehow that on en.wikt there is a French section. This needs some magic uniformising the language headings across Wiktionaries. --Vriullop (talk) 09:05, 4 October 2018 (UTC)
Vriullop is right, the proposal is the reverse. About Serbo-Croatian, there exist a Serbo-Croatian Wiktionary, a Croatian Wiktionary, a Serbian Wiktionary and a Bosnian Wiktionary, so the "problem" will be addressed the same way as French or English. Pamputt (talk) 09:16, 4 October 2018 (UTC)
I think there's still a question about what should happen in the interwiki links from the Croatian, Serbian, and Bosnian Wiktionaries that point to the English Wiktionary. On English Wiktionary we only use Serbo-Croatian in headers, not Croatian, Serbian, or Bosnian. So in the proposed extension perhaps links from the Croatian, Serbian, and Bosnian Wiktionaries to the English Wiktionary should be colored to indicate whether there is a Serbo-Croatian entry in the English Wiktionary. Even though the language code is different, the word that the entry is about is probably the same. If the extension does not do this, then no links from the Croatian, Serbian, and Bosnian Wiktionaries to the English Wiktionary will ever be highlighted. — Eru·tuon 19:27, 4 October 2018 (UTC)
Given how butthurt some of the people in those speaker communities are, I suspect that they would choose to not have the link over linking to an "evil" language. —Rua (mew) 20:56, 6 October 2018 (UTC)
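
Erutuon's suggestion above, treating links from the Croatian, Serbian and Bosnian Wiktionaries as if they asked about Serbo-Croatian, can be sketched as a code-normalisation step. This is only an illustration under assumptions: the alias table and both function names are hypothetical, and only the Serbo-Croatian case from the discussion is filled in.

```python
# Hypothetical alias table: codes of other Wiktionaries mapped to the
# language code used for section headers on English Wiktionary.
CODE_ALIASES = {"hr": "sh", "sr": "sh", "bs": "sh"}


def normalise_code(code: str) -> str:
    """Map a wiki's language code to the code used in en.wikt headers."""
    return CODE_ALIASES.get(code, code)


def should_highlight(source_wiki_code: str, target_sections: set) -> bool:
    """True if the target entry has a section matching the source wiki's language."""
    return normalise_code(source_wiki_code) in target_sections


# A link from the Croatian Wiktionary to an en.wikt page with a Serbo-Croatian section.
print(should_highlight("hr", {"en", "sh"}))
```

Without such a table, a lookup for "hr" against en.wikt section codes would never match, which is the situation Erutuon describes.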



A proposal: this month, a lot of people all over the world are doing the Inktober challenge, drawing one ink artwork per day for a month. There is a list of 31 themes. We may take part in this challenge in our own way, by adding more content to those pages. Plus, those pages may be consulted more during this month, so it's better if the content is good. Most of them are already fine pages of course, but there is still some improvement to do. Noé 06:44, 2 October 2018 (UTC)

Hey Noe! To me it's quite an odd proposal - I doubt there'll be a significant rise in traffic to those Wiktionary pages, really - to be fair, Inktober is pretty obscure. Anyway, it's worth a try. You know what: while I'm here complaining, I'll be productive and add some more ====Derived terms==== to a few of these, as solidarity or whatever. --WF110 (talk) 08:51, 2 Inktober 2018 (UTC)
So, I added loads of Derived terms to gift, double, jolt, slice...chicken. Maybe do more another day --WF110 (talk) 09:37, 2 Inktober 2018 (UTC)
Thanks for your help! I am not sure about the traffic rise, but three of my friends are doing this Inktober, here in France. My hypothesis is: when several people want to design something more original than their neighbour within a delimited topic, they look at the definition and polysemy. Well, I can be wrong about that and it could be a random list after all, but we are still improving the content of our projects and that's what matters most. Noé 16:10, 3 October 2018 (UTC)
It can also be a good opportunity to record audio pronunciations with Lingua Libre for words that don't have one yet. Pamputt (talk) 05:26, 4 October 2018 (UTC)
I have created a list on Lingua Libre. You only need to take your microphone and read the list on Lingua Libre now :) Pamputt (talk) 06:06, 12 October 2018 (UTC)

Reminder: No editing for up to an hour on 10 October

12:03, 4 October 2018 (UTC)

Lemma of Southern Sami adjectives

I've started working on improving our Southern Sami coverage, after having worked on a bunch of other Sami languages already. And I've hit a bit of a snag. All Sami languages have both "predicative" and "attributive" forms of adjectives. Almost all Sami languages traditionally lemmatise adjectives under the predicative nominative singular form. In existing Southern Sami dictionaries, though, the attributive form is chosen instead. Should Wiktionary lemmatise Southern Sami like other Sami languages for consistency, or should it follow the practice of other dictionaries and use the attributive as the lemma? —Rua (mew) 20:53, 6 October 2018 (UTC)

I'd go with attributive, respecting existing native lexicographical practice is a good thing insofar as it exists. Crom daba (talk) 14:37, 7 October 2018 (UTC)
I found another reason to use the predicative instead. raeffie is both a noun meaning "peace" and an adjective meaning "peaceful", but the attributive of the adjective is raeffies. Both have exactly the same case forms, the noun just lacks the attributive form. It would make more sense to me to place these on the same page, because they have the same stem and one can't be said to derive from the other through suffixation. This connection is lost if the lemma for the adjective is raeffies; how would you write its etymology section then? —Rua (mew) 15:05, 7 October 2018 (UTC)
This also illustrates an important point: the attributive is usually a derived form, while the predicative is the more basic form. The former is derived from the latter with an -s suffix. There are some adjectives where it's the other way around, but those are more rare. —Rua (mew) 15:08, 7 October 2018 (UTC)

Removal of comments

Is there a policy here on Wiktionary regarding the removal of someone's comments, when those comments aren't vandalism? There is a long-standing idea in the wiki world that talk page comments are protected and not to be messed with. Comments welcome. -Inowen (talk) 02:40, 7 October 2018 (UTC)

Unless comments are disruptive or abusive etc., they should not be deleted, though comments on user talk pages can be deleted, more or less at the pleasure of the user. Comments that are merely wrong, deemed a waste of time, etc. should not be deleted IMO. I think that is our practice. DCDuring (talk) 03:52, 7 October 2018 (UTC)
Is there a policy page, like on Wikipedia? -Inowen (talk) 04:13, 7 October 2018 (UTC)
The Wikipedia:Talk page guidelines are, as the title says, guidelines, not policy. Quoting from that page: “Cautiously editing or removing another editor's comments is sometimes allowed, but normally you should stop if there is any objection.” Editing someone’s comment to change its meaning, or removal of comments for no good reason against other editors’ wishes, are considered unacceptable on Wikipedia and may get you blocked for disruptive behaviour. Same here, even if we do not spell it out, so wikilawyering won’t help here.  --Lambiam 08:58, 7 October 2018 (UTC)

A related issue: re-addition of comments

Users are permitted to delete comments from their own talk page unless it's a block notice, correct? I ask this because @Metaknowledge undid @IQ125's removal of comments from their talk page. When I undid Meta, citing that Meta had no right to re-add comments a user wants deleted from their page, Meta undid me. Purplebackpack89 22:36, 26 October 2018 (UTC)

If the owner of the talk page is trying to deceive people by pretending they haven't been warned about something of importance to the dictionary, I don't think that should be allowed. Chuck Entz (talk) 23:06, 26 October 2018 (UTC)
I'm in favor of a similar policy to what Wikipedia has: users are allowed to remove any comments except blocks. I think admins being allowed to add and remove comments like that gives them too much power. I'm not sure "deceive" is the word I'd use either. Purplebackpack89 01:11, 27 October 2018 (UTC)
Then perhaps we should move IQ's issues to WT:VIP. Equinox 10:24, 27 October 2018 (UTC)

Sanskrit categories for non-lemma

I am thinking of starting a new project for the Sanskrit language: adding categories to all the Sanskrit non-lemmas based on their grammatical form. I've commenced it for some words already; however, following a dispute on the Talk page of the word भवामि, I've been told to suggest this idea on the Beer parlour. The crux of this categorization is as follows:

  • For nouns: Creating categories based on case, number
  • For verbs: Creating categories based on tense/mood, voice, person, number
  • For pronouns: Creating categories based on case, number

Why is this needed? Non-lemmas in Sanskrit are currently divided into categories based only on parts of speech. However, it is worth mentioning that in Sanskrit, because of inflection and conjugation, there can be millions of non-lemmas. Thus, the current categorization is not that useful.

For example, in Sanskrit a single verb can have dozens of future tense forms because of 3 persons, 3 voices and 3 numbers. And on top of that, there are two future tenses in Sanskrit! So having over 50 future tense verb forms in Category:Sanskrit verb forms is utterly useless unless they can be further categorized based on the type of future tense.
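
The "over 50" figure can be checked by enumerating the grid of combinations. The sketch below only counts cells; the English labels for the voices, numbers and tenses are illustrative assumptions, and it does not claim that every combination is attested for every verb.

```python
from itertools import product

# Illustrative labels only; the counts (3, 3, 3, 2) come from the post above.
PERSONS = ("first", "second", "third")
VOICES = ("active", "middle", "passive")
NUMBERS = ("singular", "dual", "plural")
FUTURE_TENSES = ("simple future", "periphrastic future")

# Every (tense, voice, person, number) combination for one verb.
future_forms = list(product(FUTURE_TENSES, VOICES, PERSONS, NUMBERS))

print(len(future_forms))  # 2 * 3 * 3 * 3 = 54
```

So a single verb already contributes up to 54 future tense slots to one flat category, before any other tense or mood is counted.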

Thus, to avoid a dump of a million verb forms in the aforementioned category, I am suggesting adding, in addition to the already existing categories, some further categories to each term, so that it is presented in an organized form under the aforementioned category. I'm suggesting doing this for all Sanskrit non-lemmas.

Initially, it will be a monumental task, but since the Sanskrit language doesn't even have a complete dictionary on Wiktionary yet, this idea too will be a work in progress.

Hoping for positive feedback. JainismWikipedian (talk) 00:13, 11 October 2018 (UTC)

@JainismWikipedian: Manually adding over 100+ categories designed specifically for verb forms to each Sanskrit non-lemma verb is insane! What kind of lunatic would complete a project like that? If categorization were necessary I'd say that the only way to maintain the mind's well-being would be through automation, through robots. Aearthrise (talk) 08:43, 12 October 2018 (UTC)
I just want categories to be added, manual or automated. I am not well versed in coding, so if someone could create a module for it, I'd be extremely grateful. My idea was to have categories, not to add them manually forever, of course. That's why I mentioned it on this forum, so that someone can help to automate it. Till then, I can add them for major Sanskrit words. :) JainismWikipedian (talk) 11:05, 12 October 2018 (UTC)
We have already dumped millions of verb forms, to twist your words, in other categories and nobody seemed to care about it. Check out Category:Latin verb forms for a start - nigh on half a million beautiful words all lying together in harmony. BTW, I don't care about Sanskrit at all, I'm just giving information. --WF110 (talk) 11:06, 12 October 2018 (UTC)
Well, that's what. Then it just becomes a useless list of a million words. Categories will help to organize the non-lemmas in Sanskrit properly. And someone not caring about a language on Wiktionary is not a good argument to keep things as they are. And, as I mentioned just above, until some good soul creates an automated module or something, I can add categories manually to certain major Sanskrit non-lemmas. It will enrich the Wiki experience, not diminish it. That's all I am saying. JainismWikipedian (talk) 11:15, 12 October 2018 (UTC)
@JainismWikipedian: If you can find a way to automate the categorization process, I think that adding categories is a good idea. What do the categories do for Sanskrit lemmas? Aearthrise (talk) 23:49, 12 October 2018 (UTC)
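
As a rough idea of what the automation asked for in this thread might look like, here is a minimal sketch that derives category names from a form's grammatical features, using the feature lists from the proposal. Everything in it is an assumption for illustration: the function, the feature keys and above all the category names are hypothetical, and an actual Wiktionary solution would presumably be a Lua module rather than Python.

```python
# Features to categorise by, per part of speech, as listed in the proposal.
FEATURES_BY_POS = {
    "noun": ("case", "number"),
    "verb": ("tense_mood", "voice", "person", "number"),
    "pronoun": ("case", "number"),
}


def form_categories(pos: str, features: dict) -> list:
    """Build hypothetical category names for one Sanskrit non-lemma form."""
    categories = [f"Sanskrit {pos} forms"]  # the existing flat category
    for key in FEATURES_BY_POS[pos]:
        if key in features:
            # Hypothetical naming scheme: "Sanskrit <value> <pos> forms"
            categories.append(f"Sanskrit {features[key]} {pos} forms")
    return categories


print(form_categories("verb", {"tense_mood": "future", "number": "singular"}))
```

Generating the names from tagged features, rather than typing them by hand, is what would keep a scheme with this many combinations maintainable.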

Proposed revision to WT:FICTION: names of universes

I have a proposal to tweak the criteria for inclusion given at WT:FICTION.

As it currently stands, terms originating in fictional universes must have at least three durably archived uses that are independent of the universe in which they originated. This means, for instance, that Klingon gets in because there are many durably archived references to the Klingons or Marc Okrand's Klingon language that do not mention Star Trek nor Paramount. In addition, names like Homer Simpson and Pippi Longstocking are permissible because they have acquired secondary meanings (doofish dolts, in the former case, or freckled redheaded girls or spunky girls, in the latter) beyond their literal uses as the name of the characters. Bat'leth gets in because there are print references to real-life bat'leths, proving that the word (and object) have acquired a life beyond the Star Trek universe. And Eevolution and pedosaur get in because the former is not actually officially used in canon Pokémon and the latter certainly would never appear on Barney & Friends; they're kosher because they didn't originate in fictional universes.

But then you have terms like Pokémon. The word Pokémon is a nonexistent entry in Wiktionary. Why? Because it's impossible to make a reference to Pokémon, using the word "Pokémon", that doesn't mention the Pokémon universe, because Pokémon is Pokémon. The requirement that all terms originating in fictional universes have uses independent of the universe (i.e. not mentioning the universe) creates a bizarre legal fiction wherein the name of a franchise can't get into Wiktionary.

It seems bizarre that we have the non-canon spelling Pokemon and the outright misspelling Pokéman, but not Pokémon. That we have Pokédollar listing Pokémon in its etymology, but the link at that etymology is to a nonexistent page. That Pikachu, Squirtle and Jigglypuff are defined as species of Pokémon, but the link to Pokémon in each of their definitions won't tell a reader what a Pokémon is.

So my proposal is: The name of an entire franchise can get an entry in Wiktionary as long as (a) it would meet WT:FICTION were it not for the self-referential aspect as per the independence requirement (i.e. there are at least three uses, the uses are durably archived, they are at least a year apart from earliest to latest, they are uses and not mentions, the uses are by at least three different authors, etc. -- for the name of the franchise this would also require that the cited texts not mention the name of the franchise's creator nor its corporation), and (b) it is a one-word (solid) name, hyphenated word, or multi-word phrase of which the single words are not standalone meaningful words in English nor the native language of the franchise (Italian, French, German, Japanese, Spanish, Korean, Mandarin, Arabic, Hindi, Hebrew, Portuguese, or whatnot). So Pokémon, Digimon, Fraggle, She-ra, Spider-man, and Winnie-the-Pooh would all be permissible Wiktionary entries, whereas Bridge to Terabithia, My Little Pony, Paw Patrol, Masters of the Universe, and Dora the Explorer would not get in because they fail criterion (b), and that comic featuring ninja sharks that you made up last week would not get in because it doesn't have the durably archived uses by three different authors.

What do all of you think? Khemehekis (talk) 03:02, 12 October 2018 (UTC)

It seems to me that Pokémon as a term for a creature could be cited by the same kind of citations as would be sufficient for Klingon, like if someone spoke of "Game of Thrones' resident Pokémon Hodor" in reference to his only speech being his name, or something. Are you wanting a (second) sense line that defines it specifically as a franchise, and if so, why? (And, sincere non-rhetorical question: is it really unincludable in that sense? I see we have e.g. Star Trek and Star Wars, and the former was RFDed and kept. Should they be RFVed?) - -sche (discuss) 03:33, 12 October 2018 (UTC)
-sche, your comment provides many good points to think about. Excellent point about the "resident Pokémon Hodor" uses. TV Tropes even has a trope called Pokémon Speak. The word kryptonite gets in because of its allusory meaning of something someone is helpless against, and muggle gets in as a word for someone without a superpower or special distinction (for example, I once read an article on synaesthesia that referred to non-synaesthetes as "we muggles"). I personally think Pokémon as a word for a creature would be fine, although Pokémon has stayed a nonexistent entry for all these years, so some force must be driving the inertia at creating it (or re-creating the old entry). Look at Pokémon -- it has alternative spellings, a link to the (lower-case) Spanish translation, even a sound file, so it's just silly that it's a nonexistent entry -- following the letter of CFI at the expense of the spirit. To answer your first question: we don't really need a separate definition line for the franchise, vis-à-vis the creature sense. I just want the creature sense to get in, for the reasons stated in this edit and the OP. Come to think of it, in fact, Pokémon, Digimon, Fraggle, and Muppet are all names of creatures/species, while She-ra, Spider-man, and Winnie-the-Pooh are names of individual characters as well as franchises. My favorite guideline at Wikipedia is Use common sense, and that should be observed in drafting WT:FICTION at Wiktionary as well. And considering that we have Star Trek and Star Wars: maybe multi-word entries are permissible for really big franchises, or at least those that have gained extended metaphorical use, but not for every TV show, every toy, and every book under the sun? I'd be open to that as a tweak to my original tweak. Would you be OK with an entry for, say, Game of Thrones or Harry Potter? Khemehekis (talk) 04:58, 12 October 2018 (UTC)
How about an additional requirement that the franchise must have, say, three associated lemmas that must have already made it into Wiktionary by the old rules? Basically to stop the flood gates being opened too wide. e.g. Pokémon would pass this test thanks to the above mentioned Pikachu, Pokédollar, and Eevolution. But not every random show would make it in just by narrowly fitting the other criteria. Pengo (talk) 11:17, 16 October 2018 (UTC)
Support this idea. That's a good objective criterion to have, I think. That would justify the inclusion of major franchises like Star Wars (lightsaber, Wookiee, Jedi) without having to make exceptions to the rules. Andrew Sheedy (talk) 14:31, 16 October 2018 (UTC)
Pengo: That's brilliant! I guess the fictional universes at Category:en:American fiction, for instance, would be able to get in, without having to admit Shirt Tales or The Get-along Gang as entries. Khemehekis (talk) 22:59, 16 October 2018 (UTC)

Adding translation boxes in the entries for Chinese character components

The entries for Chinese character components lack translations, which I think would be useful --Backinstadiums (talk) 11:37, 14 October 2018 (UTC)

as long as we add the referent source, especially in well-known resources such as The Chinese Language Fact and Fantasy, it only adds to the entries. --Backinstadiums (talk) 15:21, 17 October 2018 (UTC)


Everyone's favourite troll, Wonderfool, had been putting emoji on a few pages before (s)he was blocked. Asking on WF's behalf, can anyone think of a valid reason why not to include the emoji on the corresponding pages? --XY3999 (talk) 12:22, 15 October 2018 (UTC)

They aren't words in a language. Equinox 18:26, 15 October 2018 (UTC)
Nonetheless, they are included in Wiktionary, so it doesn't seem out of place for them to be in a "See also" section. Andrew Sheedy (talk) 19:36, 15 October 2018 (UTC)
Many emojis represent feelings and emotions that can be translated objectively into other languages, such as "I feel happy". —Stephen (Talk) 04:10, 16 October 2018 (UTC)
Neither is r or ;. We should definitely include all Unicode characters. —Justin (koavf)TCM 04:36, 16 October 2018 (UTC)
This isn't about having entries for emoji, this is about linking to those entries from non-emoji pages. The current practice is to not link to other languages except in etymologies, descendants sections and translation tables. I think the argument is that linking to something that isn't English from an English section would go against that practice. Chuck Entz (talk) 06:39, 16 October 2018 (UTC)
But translingual terms get links in definitions (taxonomic names of organisms …). So maybe we can use emojis in definition lines: “A domestic fowl, Gallus gallus, 🐔, especially when young”. – “The meat from this bird eaten as food, 🍗 of 🐔.” Fay Freak (talk) 07:09, 16 October 2018 (UTC)
Ugh, I hope you're joking! I think a "see also" section is the only appropriate place for them. Andrew Sheedy (talk) 17:59, 16 October 2018 (UTC)
Emojis aren't a language. comma links to ,. —Justin (koavf)TCM 07:53, 16 October 2018 (UTC)
@Chuck Entz But emojis are used in English as if they were English, which is the key difference. Andrew Sheedy (talk) 17:59, 16 October 2018 (UTC)

What was the argumentation supporting the addition of pictures in entries? for example in house --Backinstadiums (talk) 20:49, 16 October 2018 (UTC)

Maybe that those pictures illustrate and are pretty and situated almost only on the edges of pages whereas emojis are ugly, often hard to read and intrusive wherever you put em. Fay Freak (talk) 21:09, 16 October 2018 (UTC)
Pictures in entries serve a clear purpose: I can see the definition of a word and now I want to know what the referent looks like. "A picture tells a thousand words." The emoji entry question is quite different: having non-word entries for pictures merely because those pictures are sanctioned by Unicode (which has lost its way as a project recently, with skin colour modifiers etc.). Equinox 22:25, 16 October 2018 (UTC)
@Equinox: The argumentation at issue is adding emojis as a referent in entries, just as pictures are, not entries of their own. The only reasoning to ban their addition, thereby sanctioning a lexicographic principle, would be that actual pictures are a "better" linguistic referent --Backinstadiums (talk) 23:03, 16 October 2018 (UTC)
I don't think we should include emoji or other non-words, and we should definitely not include all of Unicode. We're a dictionary, not a Unicode database. —Rua (mew) 11:09, 17 October 2018 (UTC)
Should we create a vote to ban all Unicode characters from Wiktionary (mainspace) that have the “Emoji” property? Serious question. It would be great to have such a clear stance against emoji. After all we are the guardians of the words, in natural opposition to idiocracy, and so not only would it save work from the encroachment from ever swelling emoji data but we would get media attention as stalwart conservatives and thus an influx of much needed new editors (to which it does not harm that I have mentioned this calculation already). No statement is so far made from my side about kaomojis like (´・ω・`), but those are a dangerous can too. Note that other websites document emojis, kaomojis and so on better anyway, hence we should not be sensitive about a motion for a ban. Fay Freak (talk) 11:43, 17 October 2018 (UTC)
I wouldn't want to get rid of them altogether, but I would support moving them to an Appendix. Andrew Sheedy (talk) 17:51, 17 October 2018 (UTC)
Of course. I explicitly talked about the mainspace. Having an appendix listing them is different. But I claim it is not worth having pages for each as these are of utmost shortness. Fay Freak (talk) 19:13, 17 October 2018 (UTC)
What about emoji which have a broad usage other than their nominal meaning. I am thinking of eggplant (🍆). - TheDaveRoss 18:27, 17 October 2018 (UTC)
We can’t add vulgar hand gestures either, such is appendix matter – emojis are somewhat of a 2D equivalent of hand gestures. Is ”paralinguistic” the correct word? Also we are not Know Your Meme. I have just found Category:Gestures, interesting. Totally underdeveloped also – I think we could link to hand gesture entries from the mainspace only in “see also” at the most, which is deniable too, whereas we would not like even that much to see emojis in “See also” sections, for emojis are trifling and excessive in variation at the same time. Fay Freak (talk) 19:13, 17 October 2018 (UTC)
We do have flip the bird, flip off, flick off, air quote and scare quotes to name a few gestures. Emoji are ideograms, and there is plenty of precedent for including those (see e.g. hieroglyphics and Chinese), however the degree to which any particular emoji has entered any particular language is open to interpretation. Some have become very common and have widely agreed upon meaning (e.g. smiley face) whereas others have little to no usage. - TheDaveRoss 14:17, 18 October 2018 (UTC)
We don't have the action "flip the bird" (or any of the others). We have the words used to describe the actions (just like we don't have an image of someone running, but we have the verb "run"). We do, however, include gestures themselves in appendices, and I don't think there's an issue with doing the same for emoji. Andrew Sheedy (talk) 17:54, 18 October 2018 (UTC)
But even if they are put in an appendix, what are they doing in a dictionary? Are we going to include things like road signs, TV/audio player button symbols, "emergency exit" signs, etc etc? All of these carry meaning, and I don't see any difference between them and emoji. What distinguishes an emoji of an eggplant from a JPEG photo of an eggplant? —Rua (mew) 17:59, 18 October 2018 (UTC)
Because if I'm texting or writing an email to a friend, I'm probably going to use some sort of emoji. Signs or symbols on buttons are not used in running text. I use emojis to convey a particular meaning. They're a sort of punctuation that indicates the tone of my text/email. Andrew Sheedy (talk) 18:09, 18 October 2018 (UTC)
But what would distinguish an emoji of an eggplant from a JPEG photo of an eggplant? —Rua (mew) 18:11, 18 October 2018 (UTC)
It's tough to use a JPEG in running text. For the record, I would support excluding emojis that are not attestable (although if we're putting them in appendices, I'd have no strong feelings either way). Andrew Sheedy (talk) 00:42, 19 October 2018 (UTC)
  • Like it or not, emojis convey meaning and thus should be included. Purplebackpack89 18:10, 18 October 2018 (UTC)
    • Can we exclude you, then? —Rua (mew) 18:12, 18 October 2018 (UTC)

@TheDaveRoss What you have mentioned, as Andrew Sheedy has meanwhile noted, are the names of gestures, as such they are in the dictionary. I mean gestures as such, independent of any name, existing as gestures. And similar to making a gesture is posting an image file of a meme, an emoticon, a sticker, an emoji (all depending on the platform, on Windows Live Messenger it was mostly emoticons, no emojis yet, on Telegram Messenger there are stickers and image files and emojis, imageboards insinuate images …) – we don’t add rage comics, not only because a page title cannot be an image and not only because of copyright, but because of what they are is out of scope: Currently emojis are only included because of their technical handling being like of words, while by their conversational nature they are apart, so we shouldn’t have emojis either. Emojis are just memes in Unicode; the deciding criterion for something being included in Unicode is to be used in text, apart from emoji submissions that lack this criterion. If not for copyright the Unicode Consortium would include variations of Pepe the Frog, NPC Wojak, Cereal Guy and what not, I am not even sure that I exaggerate. (Literal gestures are by nature in the grey area in what concerns their being language, but technically more difficult to add here) Fay Freak (talk) 18:15, 18 October 2018 (UTC)

  • I think we can get a small nonthreatening emoji box that can link, like a normal picture. --XY3999 (talk) 13:29, 22 October 2018 (UTC)
    • I tried to do something clever with [[Image:Emoji_u1f414.svg|50px|right|thumb|text=This is the chicken emoji, find an entry at [[🐔]]|]] but I failed. Also, I added an emoji to [[chicken]] for fun. --XY3999 (talk) 13:41, 22 October 2018 (UTC)
      • Yeah, I tried another one at aubergine. I wanted to have it so when you click on the picture you go to the aubergine emoji, but that apparently simple wikiformatting is too advanced for me. --XY3999 (talk) 21:09, 22 October 2018 (UTC)

Amazing Feature that Exists in other Chinese-English Online Dictionaries that We Lack: 田+女 ---> 㚻

Scene: I sit in my room scrolling through WeChat on my phone and see the character 㚻 in an online joke, and I want to search for it on en.wiktionary. I can't type it in via my pinyin software, so I put 田 and 女 in the search bar like this: 田女. It was the sixth result in a list of hundreds. I don't even scroll down to the sixth result at first, because I expect that the results are all going to be totally unrelated. But I go to ctext.org: [1] I just type 田+女 into the search bar and BOOM there it is, the two relevant characters: no fuss, no muss. I think to myself, this is an amazing feature that exists in another Chinese-English online dictionary (and therefore seems technically feasible to be implemented here) that we are lacking. I imagine: this dictionary will one day be better than the dictionary at ctext.org, and this is one of the things that we might want to think about a way to implement within the existing framework here. I open the beer parlor page and type my screed. Perhaps this 功能 already exists in some fashion but I don't know of it? --Geographyinitiative (talk) 14:56, 16 October 2018 (UTC)

@Geographyinitiative: check the section "consists of" in this link. --Backinstadiums (talk) 20:55, 16 October 2018 (UTC)

A bit crude, but you can try:

Wyang (talk) 21:18, 16 October 2018 (UTC)

@Geographyinitiative: CHISE IDS Find might be better and more comprehensive. —Suzukaze-c 02:59, 18 October 2018 (UTC)
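
For anyone who wants to experiment with the idea, here is a minimal Python sketch of a component search in the style of the 田+女 query described above. It is not how ctext.org or CHISE IDS Find actually work; the `COMPONENTS` table and the function name `find_by_components` are hypothetical, and the table is a tiny hand-entered sample rather than real decomposition data.

```python
# Toy decomposition table, hand-entered for illustration only.
# A real tool would load full decomposition (IDS) data instead.
COMPONENTS = {
    "㚻": {"田", "女"},
    "男": {"田", "力"},
    "好": {"女", "子"},
    "畑": {"火", "田"},
}


def find_by_components(query, table=COMPONENTS):
    """Return characters whose component set contains every part in the query.

    The query uses a "田+女" style syntax: parts separated by "+".
    """
    wanted = {part.strip() for part in query.split("+") if part.strip()}
    if not wanted:
        return []
    # "wanted <= parts" is a subset test: every requested part must be present.
    return sorted(ch for ch, parts in table.items() if wanted <= parts)


print(find_by_components("田+女"))
print(find_by_components("田"))
```

Searching on a subset of components rather than an exact match means a query with a single part still narrows the candidates, which is roughly the behaviour the original post asks for.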

Optional embolding parameter in Template:suffixusex

The user Rua decided to delete an optional embolding parameter I've added to the Template:suffixusex, claiming it was "inconsistent" (with what?).

May I remind everyone of the fact that the "Wiktionary:Entry layout#Example sentences" states that "Example sentences should" "be italicized, with the defined term boldfaced". --ARBN19 (talk) 10:58, 17 October 2018 (UTC)

Here was the desired (optional) result:

gir- (to enter) + -iş → giriş (entrance)

--ARBN19 (talk) 11:17, 17 October 2018 (UTC)

Firstly, it is inconsistent with usage elsewhere on Wiktionary. If the suffix should be bold, then it should be bold in every language, so why make it a parameter? Secondly, suffixes are not example sentences, so they have their own conventions. It may not always be possible to bold the suffix, because it may be obscured by morphological processes that occur as part of the suffixation, which makes it unclear which part of the word is the suffix and which part isn't. {{suffixusex}} was specifically made to not bold the suffix because of this. —Rua (mew) 11:01, 17 October 2018 (UTC)

Classical Nahuatl Possessive forms.

@Marrovi,@Lvovmauro I propose that we use the he/she/it form to demonstrate possessive Classical Nahuatl noun forms. Aearthrise (𓂀) 12:35, 21 October 2018 (UTC)

I'm fine with that. --Lvovmauro (talk) 06:23, 22 October 2018 (UTC)

Proposed new bot-generated list: Template:desc or Template:desctree with an invalid ancestor

@DTLHS Something we could perhaps create a list for: any uses of {{desc}} or {{desctree}} that have a parent within the list of descendants, or if there is no parent the language entry itself, that is not a valid ancestor of the descendant language (determined the way {{inh}} does it). Of course, if there is a parameter that places an arrow before the term, it's ok, so this should only catch uses of the templates without any of those parameters. I expect to find especially Latin terms in this list, but we'll see. The bot could perhaps also be adapted to handle raw descendants that don't use {{desc}}, but that may be harder and involve more parsing. —Rua (mew) 20:25, 21 October 2018 (UTC)

I barely found any, but maybe I misunderstood something. User:DTLHS/cleanup/descendant ancestors. DTLHS (talk) 21:56, 21 October 2018 (UTC)
Yeah, you shouldn't include descendants that have bor=1, because there is no inheritance then. I don't know why there are so few listed though. —Rua (mew) 09:55, 22 October 2018 (UTC)
At the very least, Danish should appear in the list for capio, because Latin is not an ancestor of Danish and Danish is not marked as being borrowed. On fluo, Esperanto and Ido should appear in the list, for the same reason. —Rua (mew) 09:58, 22 October 2018 (UTC)
I see, I didn't understand your second condition. I'll update it later. DTLHS (talk) 16:27, 22 October 2018 (UTC)
I think I don't understand how to construct the language ancestor tree. French for example, has ancestor frm which has ancestor fro, but fro doesn't have ancestors so how do I get to Latin? DTLHS (talk) 01:16, 23 October 2018 (UTC)
See the getAncestors method in Module:languages. When there aren't ancestors in the language's data table, you have to step through the families to which the language belongs and get their proto-languages. — Eru·tuon 01:30, 23 October 2018 (UTC)
OK, I think I got it now- updated with the first 50,000 entries. DTLHS (talk) 01:53, 23 October 2018 (UTC)
Wow, that's a lot! I think it'll take a while to fix all of these. Any help from who is reading this would be appreciated! —Rua (mew) 10:22, 23 October 2018 (UTC)
Also, @DTLHS I think your bot may have made some errors at language in the Old French section. The bot has included English in the list, but English is given here as a descendant of Middle English, which should be totally fine. —Rua (mew) 10:24, 23 October 2018 (UTC)
OK, I filtered out any nested descendants. DTLHS (talk) 04:35, 25 October 2018 (UTC)
I don't know if they should be filtered out altogether. Rather, they should be filtered using the list item that is given as the parent. In this case, English was given as a descendant of Middle English, which is valid, but if French were listed as a descendant of English in the list for example, then that would be an error, unless it was indicated as borrowed. —Rua (mew) 12:24, 25 October 2018 (UTC)
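
The check discussed in this thread can be sketched roughly as follows. This is a toy Python illustration, not the code of Module:languages or of the bot: the `LANGUAGES` and `FAMILIES` tables are small hypothetical stand-ins for the real data, and the function names are made up. It follows the description above: use a language's listed ancestors if it has any, otherwise step up through its families until one of them has a proto-language, and skip any descendant marked as borrowed.

```python
# Hypothetical stand-ins for the real language and family data tables.
LANGUAGES = {
    "fr":  {"ancestors": ["frm"], "family": "roa"},
    "frm": {"ancestors": ["fro"], "family": "roa"},
    "fro": {"ancestors": [],      "family": "roa"},
    "la":  {"ancestors": [],      "family": "itc"},
    "en":  {"ancestors": ["enm"], "family": "gmw"},
    "enm": {"ancestors": ["ang"], "family": "gmw"},
    "ang": {"ancestors": [],      "family": "gmw"},
    "da":  {"ancestors": [],      "family": "gmq"},
}
FAMILIES = {
    "roa": {"proto": "la", "parent": "itc"},
    "itc": {"proto": None, "parent": None},
    "gmw": {"proto": None, "parent": None},
    "gmq": {"proto": None, "parent": None},
}


def direct_ancestors(code):
    """Listed ancestors, or else the nearest family proto-language."""
    lang = LANGUAGES[code]
    if lang["ancestors"]:
        return list(lang["ancestors"])
    family = lang["family"]
    while family is not None:
        proto = FAMILIES[family]["proto"]
        if proto is not None and proto != code:
            return [proto]
        family = FAMILIES[family]["parent"]
    return []


def all_ancestors(code):
    """Every ancestor reachable by repeatedly taking direct ancestors."""
    seen = []
    stack = direct_ancestors(code)
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.append(current)
            stack.extend(direct_ancestors(current))
    return seen


def is_suspect(parent, child, borrowed=False):
    """True if `child` is listed under `parent` without being a valid heir."""
    return not borrowed and parent not in all_ancestors(child)


print(all_ancestors("fr"))
print(is_suspect("la", "da"))
print(is_suspect("enm", "en"))
```

Here `parent` is whichever list item the descendant is nested under (or the entry's own language for top-level items), so a nested item such as English under Middle English is checked against Middle English rather than being filtered out altogether.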

Copyright status of Category:Esperanto 9OA

The category Category:Esperanto 9OA contains the words from the work "Naŭa Oficiala Aldono al la Universala Vortaro" ("Ninth Official Addition to the Universal Dictionary"), a work released by the Academy of Esperanto in the year 2007. There is no indication on this page containing the work that it is freely usable under our license, so this category should probably be deleted. This list of words, which is directly copied from the aforementioned page, prevents the copyright holder from fully benefiting from their work. Robin van der Vliet (talk) (contribs) 19:36, 22 October 2018 (UTC)

The theory is that lists are not copyrightable, perhaps @BD2412 might want to weigh in. It may be appropriate to reference the source of the list, however. - TheDaveRoss 19:45, 22 October 2018 (UTC)
Lists of facts existing in the real world are not subject to copyright protection because anyone can assemble such a list by researching the field from which the facts are derived. However, since Esperanto is a manufactured language, if this is nothing more than a list of words newly coined by the manufacturers of that language, it is probably subject to copyright protection. There are probably other ways to approximately convey such information, such as categories for Esperanto words coined by decade, without reference to a particular work or release. bd2412 T 20:33, 22 October 2018 (UTC)
The words on that list are not copyrighted, they have been in use for a long time. For such a long time, that AdE has decided to add them to their official dictionary. That dictionary, called the "Universala Vortaro", has such a size that it has copyright protection. Listing them all together systematically in a category is a copyright infringement. Robin van der Vliet (talk) (contribs) 20:59, 22 October 2018 (UTC)
Lmaltier (talk, contribs) presented a very similar problem on this page:
"The list of words (nomenclature) of a dictionary should normally be considered as copyrighted, because it's the result of a huge selection work, and because customers may buy a dictionary only to check the presence or not of words (arbitration of a discussion, or dictionary used as a referee for word games). If such a list is copied, it's unfair competition, because there will be fewer customers for the copied dictionary."
Robin van der Vliet (talk) (contribs) 06:29, 23 October 2018 (UTC)
The full word list, perhaps. A very small subset of such words would obviously not constitute competition on those grounds since absence from such a list would not suffice to settle those conversations. It has become common for major dictionaries to release the lists of words they include in addenda to the press for publication. While I am not concerned about this being fodder for a lawsuit or being immoral, I have no problem removing this category since I don't find it to be particularly of use. - TheDaveRoss 12:36, 23 October 2018 (UTC)
This is a clear violation of copyright law; words were copied from a dictionary that invented words in a fake language; there is no way around this, we must delete the category. Aearthrise (𓂀) 15:01, 24 October 2018 (UTC)
First of all, @Aearthrise Esperanto is a real language, I speak it fluently and I know a lot of people who speak it. I care about Esperanto, but I also care about free projects like Wiktionary.
@TheDaveRoss A small subset of such a word categorization would be completely useless. The category Category:Esperanto 9OA invites people to complete it, and it also invites the creation of Category:Esperanto 8OA, Category:Esperanto 7OA, etc, until Category:Esperanto 1OA. This category would only be useful, if we have categories for all Official Additions to the Universal Dictionary.
In the future I would like to send a request to the Academy of Esperanto, asking them to publish all their publications under the public domain, so that it can be included here and in the future also on Wikidata. Robin van der Vliet (talk) (contribs) 22:56, 24 October 2018 (UTC)
Even if the category were complete and all of the categories you mention were complete, I still do not think it would be a copyright violation. Regarding your efforts to encourage the governing body to release their documents more permissively, that seems like a great idea. - TheDaveRoss 13:07, 25 October 2018 (UTC)
As far as I can see, the category is not a list of all words of the Ninth Official Addition, but only lists the ones for which we have an entry, half the number, extracted automatically from the fact that we have marked these entries as not only approved by the AdE, but also specifically stemming from that official edition. I don’t think anyone in their right mind would interpret that as a copyright violation.  --Lambiam 11:43, 25 October 2018 (UTC)
Today somebody showed me this page, at which the Academy of Esperanto declared in 2011 that all official publications are dual-licensed under CC BY-SA 3.0 and GPLv2. This is not declared at their main page, only on one subpage that I didn't see before. I asked a member of the academy months ago about the license status, and nothing came from it. I showed that specific academician that page, and they also didn't see it before and were also surprised... It appears that not all academicians even know about that page. Well anyway, I guess that this discussion can be closed now. The lists are free. Robin van der Vliet (talk) (contribs) 15:49, 25 October 2018 (UTC)

Request for bot permission

I would like to create pages with my bot for alternative orthographies of Esperanto. I detailed my proposal here. The 2 alternative orthographies I described are used and recognized by the language community. Robin van der Vliet (talk) (contribs) 23:08, 24 October 2018 (UTC)

-- “陶行行知“

There is a problem with zh-x that is causing a duplication of 行 here when I add the dash between 行 and 知. I don't think I'm making any mistakes. --Geographyinitiative (talk) 02:06, 26 October 2018 (UTC)

You can see a similar, probably related problem in the second quotation for , sense #4. Maybe more an issue for the Grease pit.  --Lambiam 09:43, 26 October 2018 (UTC)
moved conversation there --> [2]--Geographyinitiative (talk) 23:46, 26 October 2018 (UTC)

Pinyin for phonetic annotation

Victor Mair's last post on languagelog asks for help to use wikipedia's Pinyin for phonetic annotation (Template:Ruby-zh-p). Can it be used in wiktionary entries? --Backinstadiums (talk) 02:23, 28 October 2018 (UTC)

The template Template:Ruby-zh-p in Wikipedia is not smart enough: you have to provide each character with its pinyin, and it's a lot of work. It's not automated. The result looks good but it would take some effort to do it for longer texts. Not really a "wonderful tool", as the blogger says. Ruby has been implemented well for Japanese at Wiktionary, and it's also considered a common practice for Japanese, see Template:ja-usex/documentation or Template:ja-r/documentation. One needs to provide the kana spellings, spacing and use a few tricks to give accurate transliterations.
For Chinese, ruby has not been that popular and besides, you can see only one version - simplified or traditional, not both. See Template:zh-x/documentation for the best way to show Chinese usage examples, IMO. The blogger can try annotation tools such as Chinese Annotation Tool but you should know that neither our templates, nor the online annotation tools can give 100% accuracy. You have to provide the accurate spacing, work with irregular or less common readings and at times provide the correct simplified variant where there are variants. Wiktionary's modules and templates can be exported to Wikipedia but I don't think they can be used elsewhere. --Anatoli T. (обсудить/вклад) 08:52, 28 October 2018 (UTC)
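
To make the "provide each character with its pinyin" point concrete, here is a minimal Python sketch that turns hand-supplied (character, pinyin) pairs into standard HTML ruby markup. It is not the implementation of Template:Ruby-zh-p or of any Wiktionary template; the function name `ruby_annotate` is hypothetical, and the readings have to be supplied by the caller, which is exactly the manual work complained about above.

```python
from html import escape


def ruby_annotate(pairs):
    """Build HTML ruby markup from (base text, reading) pairs.

    Pairs with an empty reading (e.g. punctuation) are emitted unannotated.
    """
    parts = []
    for base, reading in pairs:
        if reading:
            parts.append(
                f"<ruby>{escape(base)}<rt>{escape(reading)}</rt></ruby>"
            )
        else:
            parts.append(escape(base))
    return "".join(parts)


print(ruby_annotate([("中", "zhōng"), ("文", "wén"), ("。", "")]))
# prints: <ruby>中<rt>zhōng</rt></ruby><ruby>文<rt>wén</rt></ruby>。
```

Automating the readings would need a pronunciation lookup plus handling for characters with several readings, which is where the accuracy caveats above come in.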

attributive -> relational for describing relational adjectives in Russian

(Notifying Atitarev, Cinemantique, KoreanQuoter, Useigor, Wanjuscha, Wikitiki89, Stephen G. Brown, Per utramque cavernam, Guldrelokk, Fay Freak): Wiktionary Russian entries use the term "attributive" for adjectives that stand in the place of nouns when modifying other nouns. For example, English can simply say "water wheel" or "processor architecture" or "Night Wolves", with one noun in apposition to another, but Russian needs to use a special adjectival form of the modifying noun, e.g. водяно́е колесо́ (vodjanóje kolesó) (cf. вода́ (vodá, water)), проце́ссорная архитекту́ра (procéssornaja arxitektúra) (cf. проце́ссор (procéssor, processor)), or ночны́е во́лки (nočnýje vólki) (cf. ночь (nočʹ, night)). (Note that English has such adjectives too, e.g. cellular vs. cell and senatorial vs. senator, but their use isn't mandatory, and many such terms have a decidedly literary or obscure flavor -- e.g. bovine vs. cow, apian vs. bee.) These adjectival forms are separate lexical entries, which are normally defined e.g.

  1. (attributive) processor

However, this terminology doesn't appear to be standard; instead, they are normally called "relational adjectives" in the literature, and "attributive" has a different meaning (in a directly modifying position, as opposed to predicative). I'm thinking of defining a new label "relational" to mark such adjectives, which classifies the term into e.g. Category:Russian relational adjectives and links to a glossary entry clarifying how such terms work, and using a bot to replace all "attributive" labels on Russian adjectives with "relational". Thoughts? Benwing2 (talk) 22:41, 28 October 2018 (UTC)

@Benwing2: by "they are normally called "relational adjectives" in the literature", do you mean the English-written literature about Russian? Per utramque cavernam 22:50, 28 October 2018 (UTC)
It’s indeed confusing to call those adjectives attributive if what you want to say is “relational”. But the reason that that label is used is that the editors who use the label use such labels because the English glosses belong to multiple parts of speech (example I pick from your edit history because you haven’t named any: древесноуксуснокислый (drevesnouksusnokislyj)). I would not place any labels there, glosses do not generally need to be of the same part of speech, particularly in this situation when an adjective would be part of a composed noun in English. The POS header and the form of the word (the ending ый) is enough, nobody could reasonably mistake the adjective for a noun. Well often in Arabic entries I just mechanically add -related to the gloss so the gloss is an adjective or write “related to X” as is done at عَقَارِيّ (ʿaqāriyy). Others write “pertaining to X” like at جِهَادِيّ (jihādiyy). Compare حَدِيدِيّ (ḥadīdiyy) created by Benwing where the gloss is just “iron”.
tl;dr: You do not need any labels where you think them. Fay Freak (talk) 22:59, 28 October 2018 (UTC)
@Per utramque cavernam Yes, in the English literature. The Russian equivalent appears to be относи́тельное прилага́тельное, which is a pretty direct equivalent. Benwing2 (talk) 23:39, 28 October 2018 (UTC)
@Benwing2: I have no strong opinion on the English terminology or labels to be used. --Anatoli T. (обсудить/вклад) 01:57, 29 October 2018 (UTC)
@Fay Freak I think it's important to put some sort of explanatory label or text if the definition is a part of speech other than the headword. I find it very confusing to have e.g. процессорный defined as just "processor" with no label. It wouldn't be obvious to me, encountering such a thing, that it's a relational adjective and is used in the attributive position (hence the Wiktionary label), modifying another noun. I would just conclude it's a mistake (not unusual in Wiktionary :-( ...). BTW your example of "iron" isn't probative because "iron" is defined as an adjective in English as well (although I would in fact put a "relational" label here, just to make it clearer). Benwing2 (talk) 02:12, 29 October 2018 (UTC)
BTW I don't like 'pertaining to X' or 'related to X' or similar sorts of definitions, because that's not the usual translation in English. It also makes the term sound quite formal, since the wording is so formal, which isn't the case. Benwing2 (talk) 02:14, 29 October 2018 (UTC)

Changing Wastek language name to Huastec

I would like to change the name of the label of Wastek Mayan to Huastec Mayan. 1. Huasteca is the region it comes from, 2. orthography mentioning the language spells it as Huastec, and 3. the Wiktionary Article names it as "Huastec language". Aearthrise (talk) 16:25, 29 October 2018 (UTC)

I second this. Huastec is an English word, derived from a Spanish word, derived from a Nahuatl word. There's no reason to spell it as if it's a Mayan word. --Lvovmauro (talk) 00:01, 30 October 2018 (UTC)
I third this. Frankly, I've never seen the spelling "Wastek" before. Khemehekis (talk) 04:15, 3 November 2018 (UTC)
It's the name of a couple of refuse-related businesses... This would seem to indicate that Huastec is still in use by scholars. It does seem odd that anyone would bother to make this non-Mayan name look superficially Mayan when its speakers call it Téenek, instead. Nonetheless, we should go by whatever is the predominant one in actual usage. However this turns out, it would be a good idea to add Teenek to the list of names (and Huastec, for that matter), and to create English entries for all the names that are in use (note the redlinks). Chuck Entz (talk) 06:13, 3 November 2018 (UTC)

The Community Wishlist Survey

11:05, 30 October 2018 (UTC)

Anyone in favor of a lexicographer's workbench? An example of a tool for such a workbench would be a way of extracting snippets of occurrences of words and sorting by collocated words within 1,2,3,4,5, etc. words/tokens, then grouping by synonym groups, then by role groups. This would enable us to do primary lexicographic work on polysemic words. It would also make us more attractive for students of linguistics, which might enable us to recruit more talent. It would greatly enhance our ability to mine whatever corpora we could get the required access to. DCDuring (talk) 21:10, 30 October 2018 (UTC)
Support. Andrew Sheedy (talk) 18:26, 31 October 2018 (UTC)
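
As a rough illustration of the first step of the proposed workbench tool (counting collocated words within a window of N tokens around a target word), here is a minimal Python sketch. It covers only the window counting, not the grouping by synonym or role that the proposal also mentions; the function name `collocates` and the naive regex tokenizer are assumptions made for the example.

```python
import re
from collections import Counter


def collocates(text, target, window=2):
    """Count words occurring within `window` tokens of each use of `target`."""
    tokens = re.findall(r"\w+", text.lower())
    target = target.lower()
    counts = Counter()
    for i, token in enumerate(tokens):
        if token != target:
            continue
        start = max(0, i - window)
        end = min(len(tokens), i + window + 1)
        for j in range(start, end):
            if j != i:
                counts[tokens[j]] += 1
    return counts


sample = "The river bank flooded. She went to the bank to deposit money."
print(collocates(sample, "bank", window=1))
```

With a window of 1 the sample yields "river" and "flooded" for one occurrence and "the" and "to" for the other, which hints at how collocates could help separate senses of a polysemic word once the counts are grouped.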

November 2018

Point/counterpoint: apostrophes

Abolish them, recognize them as letters. (Feel free to move this to the Tea Room if you feel it's more appropriate there.) —Justin (koavf)TCM 00:53, 1 November 2018 (UTC)

Our current practice is to recognize apostrophes as part of the spelling of a word (e.g. fo’c’s’le), except when used to form the possessive of a noun (so father’s is not an entry), with an exception to that exception when that possessive itself forms part of an idiomatic phrase or (part of) a proper name (e.g. men’s room and McDonald’s). We similarly recognize capitalization and hyphens as part of the spelling. We aim at being descriptive. As long as apostrophes are in widespread use, we are not going to abolish them. We do have an entry for anapostrophic aint because it is attestable but label it as a misspelling because that is how it is generally considered today. Whether or not we see apostrophes or hyphens as letters seems irrelevant. Do you feel we should take a position? In that case, please give some argument indicating the relevance.  --Lambiam 12:34, 1 November 2018 (UTC)
@Lambiam: Interestingly, anapostrophic doesn't come up on Google searches; non- is usually the best option --Backinstadiums (talk) 14:29, 1 November 2018 (UTC)
For every word, someone must have been the first to use it. Apart from my dislike of Greek–Latin chimeras like tele- + vision – why not teleopsis? – I feel that the Greek prefix an- carries a connotation of lacking and not merely negation. For example, since one cannot reasonably say that the word “sock” is lacking an apostrophe, one should not label it as anapostrophic. In fact, since it is not lacking an apostrophe, it is non-anapostrophic :).  --Lambiam 23:39, 1 November 2018 (UTC)
@Lambiam: Everyone is free to make up words, but it's important that those who read them can understand them. Simple prose is often best, and "without an apostrophe" would have been understood by everyone. — Paul G (talk) 07:51, 3 November 2018 (UTC)

Dictionary-only diacritics need to be documented better

Currently, where we do document dictionary-only diacritics, it's usually on the WT:About (language) page. But those pages seem to be primarily aimed at editors for a particular language, and are probably rather hard to read for people who merely read Wiktionary. I therefore think that these concerns should be separated more clearly: one page to document the reader-facing parts of Wiktionary, and another to document what editors need to know in addition. A paper dictionary would usually document its conventions somewhere in the prologue, so we should have something equivalent. Given that our namespace convention so far is to use Wiktionary: for editors and Appendix: for readers, Appendix: seems like the logical choice. However, Help: is also a good candidate, and is not used as much as it should be currently. —Rua (mew) 14:12, 1 November 2018 (UTC)

Sometimes I wonder if we should link to Wiktionary:About ____ from each L2 header. —Suzukaze-c 04:01, 4 November 2018 (UTC)
I would like to suggest that we have a User's Guide for each language (shortcut U+language code), giving the kind of information that one would find in the introduction to an English-language dictionary or lexicon for that language. That would be a good place for explaining practices like which form is the lemma, what we don't cover (things like forms with clitics like 's or Latin -que) and other features that need to be understood to use the entries, like the difference between attributive nouns and adjectives in English. We would need to develop a standard format and some ground rules to keep them from turning into reference grammars or verbose essays, but I think it would be worth the effort. Chuck Entz (talk) 04:21, 4 November 2018 (UTC)
I think something like Help:English would be a good name. Alternative names for the language can be redirects, and ambiguous names can be "disambiguation" pages. —Rua (mew) 11:49, 6 November 2018 (UTC)

Adding information about historical inflections for Japanese entries

Including what were called

  • 4-grade inflection
  • upper and lower 2-grade inflection
  • ra, na, ka and sa inflection (irregular)
  • ku and shiku inflection (adjectives)


These inflections are explicitly given in dictionaries so they can be verified. [3]

Although somewhat "archaic", they can still be found in fossil words, and are sometimes even productive in modern speech, e.g. ○○せず, ムダヅモ無改革.

Huhu9001 (talk) 14:20, 1 November 2018 (UTC)

  • @Huhu9001:
    • For verbs (conjugations, more specifically than inflections), I believe the {{ja-verb}} template already supports type=yo for yodan verbs, as well as ni, kami ni, and shimo ni for the nidan verbs. Granted, this isn't yet documented anywhere besides the module code itself, which is not ideal. I've added or edited a few verb entries where I've used these type values; see the relevant subcategories at Category:Japanese_verbs.
    • I'm not sure what you mean by "ra, na, ka and sa inflection". What part of speech are you referencing? I'm used to (sa) as the nominalizing adjective suffix that is a rough equivalent of English -ness, but I suspect you're talking about something else?
    • Re: -ku and -shiku adjective types, we haven't had occasion yet to build out those paradigms -- these are largely gone from modern Japanese, although they are widely known and occasionally used to deliberately archaic effect. See, for instance, that scene in the Ghibli movie Spirited Away when Sen saves the "stink god" who turns out to be the god of a polluted river -- at the end of his big reveal, he says, 良きかな (yoki ka na, “that's better, isn't it”) and disappears.
    • Re: せず, that form is already accounted for -- see する#Conjugation, in particular the Negative continuative row. Granted, the ず ending is also used as a terminal form in formal modern Japanese; however, I'm not sure if a per-entry conjugation table is the best place to treat such stylistic and usage information...
More broadly, we (Wiktionary JA editors) are collectively still in a bit of a muddle about what to do with these aspects of the language -- as you note, many of these constructions are still productive in limited fashion, and thus don't really belong to a wholly different stage of the language in quite the same way that Middle English is differentiated from modern English. We've batted around ideas of setting up "Classical Japanese" or some such as a new L2 "language". But there's no lang code, we're unclear on how to differentiate, and there's potential for tons of duplication -- some verbs are no longer in common use and linger on as yodan paradigms, while others are modern godan verbs that evolved regularly from earlier yodan forms → would we have "Classical Japanese" entries for both? Or only for the yodan-only verbs? Etc. etc.
Setting aside the bigger issue of our overall approach to earlier Japanese grammar, it'd be helpful for starters if you could explain what you mean by "ra, na, ka and sa inflection". ‑‑ Eiríkr Útlendi │Tala við mig 00:15, 7 November 2018 (UTC)
To Eiríkr Útlendi:
  • I mainly suggest we give two inflection tables, one modern and one archaic/classical, for every Japanese entry, e.g. 燃える: 燃え 燃え 燃える 燃える 燃えれ 燃えよ and 燃え 燃え 燃ゆ 燃ゆる 燃ゆれ 燃えよ. I think this is not hard.
  • "ra, na, ka and sa inflection" are archaic/classical irregular inflections, e.g. 死ぬ: 死な 死に 死ぬ 死ぬる 死ぬれ 死ね.
Huhu9001 (talk) 05:51, 7 November 2018 (UTC)
@Huhu9001 -- Re: the irregulars, ah, yes, now I've got you. For 死ぬ, for instance, that's more formally called the ナ行変格活用 (na-gyō henkaku katsuyō, na-row irregular conjugation). Only two verbs exhibited that pattern, 死ぬ (shinu, to die) and 往ぬ / 去ぬ (inu, to go away, obsolete), with scholars apparently undecided on whether shinu might derive from inu, thus suggesting only one underlying verb with this paradigm.
Re: adding two inflection tables to all modern verbs, I'm not sure I can agree with that. For instance, the classical terminal (i.e. dictionary) form of modern terminal-form 燃える (moeru) is 燃ゆ (moyu): the lemma form is different for certain verb classes. Additionally, various verbs spelled -える in the modern language, such as 変える (kaeru, to change something), were spelled -へる before the spelling reforms of the 20th century, tracing back to terminal (lemma) forms ending in -ふ. Wiktionary practice, as I've understood it, is for conjugation tables and other detailed information to go in the entry for the lemma form of a word.
We don't seem to have any consensus at present for building out a whole new L2 language as "Classical Japanese" or something along those lines. As such, it seems to me more like the conjugation tables for the older paradigms should go collectively on the WT:AJA page, or somewhere similar, rather than on each entry page. Would that be acceptable? ‑‑ Eiríkr Útlendi │Tala við mig 20:03, 7 November 2018 (UTC)
To Eiríkr Útlendi:
  • I insist every entry should contain this inflection information because they are in fact still in use. -> (where are they used)
  • Different lemmas are never a problem in Wiktionary, the entry τελέω gives the inflection of τελῶ as well.
  • WT:AJA is a policy page telling you how to compose an entry. An inflection table is obviously not supposed to be there. Huhu9001 (talk) 06:00, 8 November 2018 (UTC)

Use pronunciation template[edit]

@Huhu9001 What about dropping the inflection templates and extending {{ja-pron}} to achieve the effect?

|m=もえる    // modern kana spelling
|mcat=v1     // modern inflection type, v1 = vowel base (一段) verb
|h=もえる,moyeru    // historical kana spelling
|c=もゆ      // Classical Japanese predicative (終止形) form
|ccat=v2s    // Classical inflection type, v2s = e/u-alternating vowel base (下二段) verb
Conjugation of “燃える” (ichidan conjugation)
Traditional paradigm
Imperfective (未然形) 燃え もえ moe
Continuative (連用形) 燃え もえ moe
Terminal (終止形) 燃える もえる moeru
Attributive (連体形) 燃える もえる moeru
Hypothetical (仮定形) 燃えれ もえれ moere
Imperative (命令形) 燃えよ¹
Key constructions
Passive 燃えられる もえられる moerareru
Causative 燃えさせる
Potential 燃えられる
Volitional 燃えよう もえよう moeyō
Negative 燃えない
Negative continuative 燃えず もえず moezu
Formal 燃えます もえます moemasu
Perfective 燃えた もえた moeta
Conjunctive 燃えて もえて moete
Hypothetical conditional 燃えれば もえれば moereba
¹ Written imperative
² Spoken imperative
³ Colloquial potential
Classical conjugation of “燃ゆ” (shimo nidan conjugation)
Traditional paradigm
Imperfective (未然形) 燃え もえ ⟨moye⟩
Continuative (連用形) 燃え もえ ⟨moye⟩
Terminal (終止形) 燃ゆ もゆ ⟨moyu⟩
Attributive (連体形) 燃ゆる もゆる ⟨moyuru⟩
Evidential (已然形) 燃ゆれ もゆれ ⟨moyure⟩
Imperative (命令形) 燃えよ もえよ ⟨moyeyo⟩

(Notifying Eirikr, Wyang, TAKASUGI Shinji, Nibiko, Atitarev, Suzukaze-c, Poketalker, Cnilep, Britannic124, Fumiko Take, Nardog, Marlin Setia1, AstroVulpes, Tsukuyone, Aogaeru4): --Dine2016 (talk) 08:21, 8 November 2018 (UTC)

To Dine2016: Acceptable to me. Huhu9001 (talk) 09:14, 8 November 2018 (UTC)
Two thoughts.
  • Putting the classical form in {{ja-pron}} seems really weird -- the pronunciation template is intended to show the pronunciation of the headword. The example above ... hides the pronunciation, which seems confusing and poor usability. Also, /moju/ is not a pronunciation of 燃える (moeru, to burn, intransitive). Likewise, conjugation type, historical kana spellings, and romanization schemes do not belong in this template. I'd be fine if some other template were created for this purpose, but {{ja-pron}} is not the place for this.
(Separately, I'm confused by the comment, "what about dropping the inflection templates", since the example above ... includes the inflection templates?)
  • I'm resistant to the idea of conjugating 燃ゆ in the 燃える entry. The Ancient Greek example does not seem germane here; τελῶ (telô) appears to be simply a contemporaneous contraction of τελέω (teléō), and it looks like the editors collapsed the information about the former into the entry for the latter as an exceptional case. For classical Japanese, we're talking instead about every verb in sizable verb classes.
That said, I'd like to survey other dictionaries, get a handle on their approaches. Let me chew on this a bit (where to put classical conjugations).
‑‑ Eiríkr Útlendi │Tala við mig 13:07, 8 November 2018 (UTC)
@Eirikr: Thank you for your reply. By "what about dropping the inflection templates", I meant to make {{ja-pron}} generate the inflection tables, thus eliminating individual inflection templates such as {{ja-ichi}}. Here are some considerations.
  • Putting the historical kana spellings and the romanization schemes in headword templates might seem good at first, but when there is more than one part of speech, they have to be duplicated, which is not ideal. The Korean pronunciation template {{ko-IPA}} includes romanizations, and the Chinese {{zh-pron}} includes not only romanizations but also Middle and Old Chinese, which can be comparable to historical kana spellings. Therefore I find it convenient for Japanese to follow the model of Korean and Chinese, which means putting the historical kana spellings and the romanizations in the pronunciation template.
  • As for inflection tables, I have no objection to having a separate "Inflection/Conjugation" section and templates, but making {{ja-pron}} take over that functionality is more convenient for editors, isn't it? Moreover, by doing so, phonetic information such as もえる (for 燃える), き.いろい, or |acc=0 can be directly used when generating the inflected forms, eliminating the need to copy them to the inflection templates.
  • 燃ゆ and 燃える are pretty much the same lexeme if you consider what linguists call the stem or the base: the former is /moje-/, and the latter is /moe-/, clearly regular development. The reason the dictionary forms look different is that the modern 終止形 comes from the classical 連体形 and 二段動詞の一段化, both of which concern morphology only. Even in the traditional Japanese analysis, も・える(ア下一) is a direct reflex of classical も・ゆ(ヤ下二) too. Larger dictionaries such as 広辞苑, 大辞林, and 国語大辞典 are all diachronic and treat 上代 も古事記では甲ゆ, 中古 もゆ, and 中世 もゆる in the same entry as modern もえる, and I think we can follow the same approach too. --Dine2016 (talk) 16:04, 8 November 2018 (UTC)
I prefer separating Ancient Japanese and Modern Japanese. Some ancient verb forms are still used in Modern Japanese but the paradigm as a whole is already dead. (I have tried to include as many surviving forms as possible in Wiktionary talk:About Japanese/Conjugation.) — TAKASUGI Shinji (talk) 21:33, 8 November 2018 (UTC)
  • I agree that ja-pron is not the appropriate place for such information. I can sympathize with the desire to include as much information as possible as readily as possible. But there is no direct link between historical inflection and pronunciation as such. I also keep thinking about a comment left some time ago at 'Feedback' where a user complained about having Etymology at the top of the entry because the user was not interested in Etymology. I think that there probably are a lot of users interested in pronunciation of Japanese lemmas, but probably far fewer interested in historical forms. In that case, putting those pieces of information together at the top of an entry may be frustrating. Furthermore, as others have pointed out, although there are reflexes of historical forms in the conjugation of some words, this is not the case for all words. In short, ja-pron is the wrong place for this information, if it is even desirable for all entries. Cnilep (talk) 04:05, 9 November 2018 (UTC)
Ok, thanks, I agree with the idea to limit the pronunciation section to the modern lemma form only and put inflections in a separate section below the definitions. What about creating a unified interface, for example {{ja-infl|もえる|v1|acc=0}} (for modern Japanese) and {{ltj-infl|もゆ|v2s}} (for Classical Japanese), and coding the inflection rules in a module, instead of having a separate template for each inflection type? That would be easier to program and more extensible, I think. --Dine2016 (talk) 06:17, 9 November 2018 (UTC)
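To make the "unified interface" idea concrete, here is a minimal sketch in Python (not the Lua a real Wiktionary module would use). The type codes v1 and v2s come from the proposal above; the function name inflect, the form labels, and the partial kana table are assumptions made purely for illustration, and accent handling (acc=0) is left out.

```python
# Hypothetical sketch of one inflection function driven by a type code,
# instead of one template per inflection type. Not an existing module.

# Partial u-row -> e-row kana table for shimo nidan (v2s) verbs.
# Verbs ending in う are deliberately left out: the e-row form depends on
# the row (え vs. ゑ), which the kana spelling alone does not reveal.
U_TO_E = {
    "く": "け", "ぐ": "げ", "す": "せ", "ず": "ぜ", "つ": "て", "づ": "で",
    "ぬ": "ね", "ふ": "へ", "ぶ": "べ", "む": "め", "ゆ": "え", "る": "れ",
}


def inflect(kana, infl_type):
    """Return the six traditional forms as a list of (label, kana) pairs."""
    if infl_type == "v1":  # modern ichidan verb, e.g. もえる
        if not kana.endswith("る"):
            raise ValueError("v1 lemma must end in る: " + kana)
        stem = kana[:-1]
        forms = [stem, stem, stem + "る", stem + "る", stem + "れ", stem + "よ"]
        labels = ["未然形", "連用形", "終止形", "連体形", "仮定形", "命令形"]
    elif infl_type == "v2s":  # classical shimo nidan verb, e.g. もゆ
        last = kana[-1]
        if last not in U_TO_E:
            raise ValueError("unsupported v2s ending: " + last)
        e_stem = kana[:-1] + U_TO_E[last]
        forms = [e_stem, e_stem, kana, kana + "る", kana + "れ", e_stem + "よ"]
        labels = ["未然形", "連用形", "終止形", "連体形", "已然形", "命令形"]
    else:
        raise ValueError("unknown inflection type: " + infl_type)
    return list(zip(labels, forms))


if __name__ == "__main__":
    for label, form in inflect("もゆ", "v2s"):
        print(label, form)
```

Under these assumptions, inflect("もえる", "v1") and inflect("もゆ", "v2s") reproduce the two paradigms quoted earlier in this thread; adding another inflection class would mean adding one branch (or one table row) rather than a new template.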
@Eirikr: Putting the classical form (e.g. 燃ゆ for 燃える) in {{ja-pron}} was actually an idea from 日本国語大辞典, which has a 発音 section of entries that looked like this:

み-ちがえる みちが・ふ〔他ハ下二〕 … 発音 ミチカ゜エル〈標ア〉 〈京ア〉 「みちがふ」ミチカ゜ウ、ミチコ゜ーとも 〈標ア〉 〈京ア〉

As can be seen, the classical 終止形 also has relevance in modern Japanese, with modern pronunciation, Tokyo accent, and Kyoto accent. A Reference Grammar of Japanese even identifies more conjugated forms borrowed from the literary language such as V-uru and V-uréba and gives them modern accentuation, indicating their relevance to modern Japanese.
Since the pronunciation section already deals with two different forms: classical 終止形 (≒ modern 連体形) and the modern 終止形, I concluded that it would be convenient to deal with other forms there together. However, if the pronunciation deals with only these two forms, or better yet, only the modern 終止形, and other conjugated forms are listed in a “Conjugation”/“Inflection” section below the definitions with their pronunciation dealt with on their respective entries, I'm totally fine with that, and it would be more logical as each entry deals with its own pronunciation.
@TAKASUGI Shinji On the other hand, I also agree that separating Ancient Japanese and Modern Japanese has advantages. The current entry ない has three lexemes; the first (suffix in 切ない、幼けない、ぎこちない) is no longer productive in Modern Japanese and is likely to confuse learners. Putting the first lexeme under the Ancient Japanese L2 header would surely be an improvement. --Dine2016 (talk) 11:33, 12 November 2018 (UTC)
I don't believe I am knowledgeable enough to comment productively. —Suzukaze-c 04:29, 13 November 2018 (UTC)

Contribute to the Wikipedia Asian Month 2018[edit]

This month is Wikipedia Asian Month and Tofeiku suggested we, Wiktionarians, take part in this. So, we discussed it on Meta, and we ended up with the proposal of working on traditional Asian games such as go, hanafuda, xiangqi, shogi, mahjong. There are a lot of games but no category here, and we may also look at the vocabulary used in the games, such as the famous atari in go. So, it's part of the LexiSession initiative, and we aim at showing that short-term interproject projects can be cool, so I hope you like the suggestion, and let us know what you improve around it! Noé 15:40, 1 November 2018 (UTC)

Synonyms given after each sense[edit]

I see that color has synonyms given immediately after each sense rather than in a separate section. Is this how we are doing synonyms now, or is this a permissible alternative method? Or should they be moved to the "Synonyms" section that is also on that page? — Paul G (talk) 07:47, 3 November 2018 (UTC)

Both methods are currently permissible. Some discussion is here. — Vorziblix (talk · contribs) 08:13, 3 November 2018 (UTC)

Passive verbs: should they have full definitions or Template:passive of?[edit]

In Northern Sami, there is technically no passive inflection, but many verbs have an accompanying passive verb. This passive verb is a full verb in every respect, and has its own infinitive, participles and can also have words derived from it. Moreover, the meaning can sometimes be somewhat idiosyncratic, and there is more than one way a passive verb could be formed. This makes it seem like it is a lemma in its own right. On the other hand, passive verbs are used very much productively to form passive sentences, and I've often found them missing from other dictionaries, which suggests they are also somewhat inflectional in nature like the North Germanic passive.

I'd like to know how to best treat these. Should they be given a full definition, as if they were a true lemma, or should they be defined with {{passive of}}? Should they appear in the inflection tables of verbs, or the "derived terms" section? —Rua (mew) 13:11, 5 November 2018 (UTC)

In Mongolian I use {{passive of}}/{{causative of}} but also add definitions below that if they don't obviously follow from the verb it is derived from. I use the usual verb lemma headword and they of course don't have an etymology section, but I stopped listing them in derived terms, with a possible exception for passives that have many derived forms themselves. Crom daba (talk) 21:44, 12 November 2018 (UTC)

Vandalism with "by ___" edit summaries[edit]

Just curious whether anyone else has noticed the periodic vandalism (it looks like ignorance rather than malice) where the edit summaries strangely begin with "by", along the lines of "by adding it" or "by writing my name". What could be the cause of this odd grammar? Possibly some kind of automated translation? Equinox 14:00, 5 November 2018 (UTC)

Open call for Project Grants[edit]

Greetings! The Project Grants program is accepting proposals until November 30 to fund both experimental and proven ideas such as research, offline outreach (including editathon series, workshops, etc), online organizing (including contests), or providing other support for community building for Wikimedia projects.

We offer the following resources to help you plan your project and complete a grant proposal:

Also accepting candidates to join the Project Grants Committee through November 15.

With thanks, I JethroBT (WMF) 19:46, 5 November 2018 (UTC)

{{quote-meta}} should auto-transliterate[edit]

It's strange that {{quote-meta}} accepts a |transliteration= param but doesn't auto-transliterate the way that {{ux}} does. Unless someone objects, I'll fix this and make it auto-transliterate. Benwing2 (talk) 02:47, 6 November 2018 (UTC)

Don’t you need a language parameter for that?  --Lambiam 10:22, 6 November 2018 (UTC)
Please add any tracking categories for the modified templates missing the language parameter, also for {{quote-book}}, which didn't have it before. --Anatoli T. (обсудить/вклад) 11:41, 6 November 2018 (UTC)
Oppose - I think we should get away from having templates do this sort of thing. Any result which is unlikely to change often should be static text, which is better for the servers and better for anyone who might wish to use the underlying data in any context other than this wiki. If we could make the template subst in the transliteration automatically I would support that. - TheDaveRoss 14:41, 6 November 2018 (UTC)
Ah, but we want to be able to change our transliteration schemes and have that reflected instantly across the wiki. —Μετάknowledgediscuss/deeds 18:39, 6 November 2018 (UTC)
One would hope that such activity would be rare. And when it is required it can be done via a bot just as easily as through module editing. - TheDaveRoss 19:10, 6 November 2018 (UTC)
@TheDaveRoss: I don't understand your opposition. subst is always a preferred functionality for long transliterations but manual transliteration is also always an option for languages that don't transliterate automatically or occasionally need it. It shouldn't be a showstopper for implementing the auto-transliteration. Can you give an example where it's a bad idea?
Note that automatic transliteration for usage examples in Chinese, Japanese, Thai and Khmer exists but its implementation is different from other non-Roman languages for known reasons. There's also an implementation for Korean with a few tricks, which make it different from the automated Korean transliteration. It would be hard or impossible to merge these five into the common citation templates. --Anatoli T. (обсудить/вклад) 22:34, 6 November 2018 (UTC)
Support. DTLHS (talk) 16:09, 6 November 2018 (UTC)
I don't remember where I got the idea, but I use the quotation templates w/o giving them the actual text + templates like {{ja-usex}}. Perhaps doing this all the time would be more simple, and would avoid duplication of {{usex}} and {{quote}} functionality. —Suzukaze-c 02:33, 7 November 2018 (UTC)
@Suzukaze-c: Well, that's what I mentioned in the side comments. These language-specific templates are far more advanced and do a lot of great work but they don't necessarily have all the bits and pieces present in the other generic templates, such as categorisations of e.g. citations by language or parameters like year, URL, page number, etc. --Anatoli T. (обсудить/вклад) 23:57, 7 November 2018 (UTC)
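The fallback behaviour discussed in this thread (a manual transliteration wins; otherwise auto-transliterate, which needs a language code) can be sketched roughly as follows. This is an illustrative Python sketch only: the function resolve_transliteration, the TRANSLITERATORS registry, and the tiny Cyrillic table are all hypothetical and are not Wiktionary's actual modules or transliteration schemes.

```python
# Hypothetical sketch of "manual transliteration first, automatic second".

# Toy character table covering only a handful of letters, for demonstration.
_TOY_RU = {"д": "d", "а": "a", "н": "n", "е": "e", "т": "t"}


def _toy_russian(text):
    # Characters missing from the toy table are passed through unchanged.
    return "".join(_TOY_RU.get(ch, ch) for ch in text)


# Registry of per-language transliteration functions (assumed structure).
TRANSLITERATORS = {"ru": _toy_russian}


def resolve_transliteration(text, lang=None, manual=None):
    """Pick the transliteration to display for a quotation, or None."""
    if manual:  # an explicit |transliteration= value always takes priority
        return manual
    if lang is None:  # no language parameter: nothing can be generated
        return None
    transliterate = TRANSLITERATORS.get(lang)
    if transliterate is None:  # language has no automatic scheme
        return None
    return transliterate(text)


if __name__ == "__main__":
    print(resolve_transliteration("нет", lang="ru"))
    print(resolve_transliteration("нет", lang="ru", manual="njet"))
```

The lang=None branch is where a tracking category for quotations missing the language parameter could be hooked in, as requested above.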

Making Template:defdate more machine-readable[edit]

Right now this template has only one parameter, which is just plain text. This means that there is no actual standard format: the text could be written in any number of ways, which makes Wiktionary harder to parse for automated processes. I therefore propose to introduce a second parameter, so that the first and second give the start and end dates of use respectively. The second argument would be left empty if the sense is still in use. The format of these parameters would still need further standardisation, but it's a start. —Rua (mew) 13:20, 6 November 2018 (UTC)

I think that is a good idea. I am not sure it will ensure the values are standardized or machine readable, but it is a step in the right direction. Oh, and I think it should show still in use if the second argument is empty or missing. - TheDaveRoss 14:44, 6 November 2018 (UTC)
Introducing a second parameter has no effect on whether the values are standardized. Sure, let's add a second parameter as a first step though. DTLHS (talk) 16:14, 6 November 2018 (UTC)
Splitting it into a from and a to date makes that part at least readable. You know, then, that each parameter only contains a single time description, rather than a range between two. My plan is to automatically add "from" before the first argument if there is no second, like the template's talk page suggests that people enter manually. The next step would be to make a module that parses each date and throws an error for a format it doesn't understand. —Rua (mew) 18:46, 6 November 2018 (UTC)
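A rough sketch of the two steps described above, written in Python rather than as an actual template or module. Everything here is an assumption for illustration: the function name format_defdate, the accepted date pattern (a century such as "15th c.", optionally prefixed with early/mid/late, or a bare year), and the bracketed output, since no standard format has been agreed.

```python
import re

# Hypothetical date pattern: "15th c.", "late 18th c.", or a 3-4 digit year.
_DATE_RE = re.compile(r"(?:(?:early|mid|late) )?\d{1,2}(?:st|nd|rd|th) c\.|\d{3,4}")


def _check(date, strict):
    # Step two of the plan: reject a format the parser does not understand.
    if strict and not _DATE_RE.fullmatch(date):
        raise ValueError("unrecognised date format: " + repr(date))
    return date


def format_defdate(start, end=None, strict=True):
    """Format a from/to pair; "from" is added when there is no end date."""
    start = _check(start, strict)
    if not end:  # empty or missing second argument: sense still in use
        return "[from " + start + "]"
    return "[" + start + "–" + _check(end, strict) + "]"


if __name__ == "__main__":
    print(format_defdate("15th c."))
    print(format_defdate("15th c.", "late 18th c."))
```

A non-strict mode is included because a rigid pattern like this one would reject anything that is not a century or a year, for example a dynasty name.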
I vehemently oppose this effort as at best premature and probably wrong-headed.
  1. We don't have the information to support the changes proposed, nor is such information likely to be forthcoming, especially not without risking COPYVIO.
  2. At the very least could someone take the trouble to take a census of the information currently in {{defdate}}? I have never understood how such proposals can be made without a fact base.
  3. If we are to attempt to standardize the use of {{defdate}}, shouldn't we begin by agreeing on what is acceptable data?
@Widsith. DCDuring (talk) 19:40, 6 November 2018 (UTC)
I think "we don't have the information" is the main problem. If we lack it for most senses of most words, there seems little point in making things more complex than they are. Equinox 19:51, 6 November 2018 (UTC)
Splitting it into from and to parameters might be good, but parsing dates and throwing errors for strange formats would essentially make this template useless for some ancient languages (I am thinking particularly of Egyptian here), as chronologies are uncertain and dates can usually only be narrowed down to specific dynasties, reigns, bodies of religious texts, etc. rather than centuries or years. Such usages need to be taken into account somehow. — Vorziblix (talk · contribs) 02:28, 7 November 2018 (UTC)
I have no objection on the face of it. When I created {{defdate}} it was really just to force the font size and square brackets, the actual text bit was always a more-or-less stopgap decision. I do think we are safer continuing to use centuries rather than specific years (in most languages – ancient extinct languages will have to do things differently), partly from lack of information and partly to avoid COPYVIO issues. Ƿidsiþ 07:44, 7 November 2018 (UTC)
Tentative support. I quite like the idea (it bugs me that we vary between "c." and "century" and "Century", or "from" and "From", etc.). However, I'm hesitant about being too restrictive with this. Usually we give dates in terms of centuries, but occasionally we know the exact date something was coined, and it can be useful to give the specific date. We need to make sure that one can use "early", "mid-", "late" in descriptions, and it should be possible to add other notes as well, in case there is more complex information. Andrew Sheedy (talk) 18:51, 8 November 2018 (UTC)
Some format standardization seems appropriate, but will it be the same for English, Middle English, Old English, Torres Strait Creole, and Ancient Egyptian? DCDuring (talk) 18:57, 8 November 2018 (UTC)
I see no reason why the format itself wouldn't be the same most of the time (the meaning might be different: e.g. an Old English word might be marked as "from 10th century", meaning that it survived into Middle English, but not necessarily that it is used today). Ancient Egyptian, I suppose, might mention dynasties rather than centuries, or have more speculative dates, so I'm not sure if it could be forced into a regular format. It might be preferable to vote on a preferred format, rather than imposing it via a template. Andrew Sheedy (talk) 19:10, 8 November 2018 (UTC)
The first step would seem to me to be making a survey of the different formats in use right now. DTLHS (talk) 19:12, 8 November 2018 (UTC)

Per-lemma etymologies[edit]

In the past, "Alternative forms" was allowed to be entered as a level 4 header under the lemma/part-of-speech header, instead of at level 3 above everything else. I want to do this with the Etymology header as well. It makes no sense to me that the etymology is structurally disconnected from the lemma it belongs to (both on level 3) or even that the lemma is subsumed under the etymology. Assigning etymology to lemma instead of lemma to etymology is much more sensible, and it's the structure that Wikidata's lexemes follow as well. My ultimate desire is to entirely abolish level 3 etymology sections, and especially numbered etymologies, but I think in the meantime the two formats can be allowed as alternatives like we already do with "Alternative forms". That way, there is no immediate need to change any entries, and those who want to can experiment with the new format.

For those who think etymology should be visually the first thing to appear in a lemma, before senses, keep in mind that this proposal is about the logical structure of the entries, not the visual ordering. If you want to preserve the ordering, then the only option that also has a sensible logical structure is to have senses under their own L4 header with the etymology L4 header immediately above, both with the lemma's L3 header as their parent. I'm open to this option, but I think senses should come before etymology, since those are what users are mostly looking for, not etymologies. —Rua (mew) 19:42, 9 November 2018 (UTC)

What is your vision for the numbered etymology pages format? What headers would be used? DTLHS (talk) 19:48, 9 November 2018 (UTC)
Each lemma would have its own L4 Etymology section, without numbers. —Rua (mew) 20:13, 9 November 2018 (UTC)
So for the English entry set, for example, we would have the level 3 headers Verb, Noun, Adjective, Noun (2), Verb (2), each with potentially their own etymology and no possibility to group them? DTLHS (talk) 21:31, 9 November 2018 (UTC)
Yes, just like in other dictionaries. —Rua (mew) 22:24, 9 November 2018 (UTC)
I agree with the idea. I just think we should do it all at once and not introduce a 3rd layout possibility that will probably stick around with the others for years. DTLHS (talk) 22:29, 9 November 2018 (UTC)
I also think doing it all at once would be preferable if this is to be done. — Vorziblix (talk · contribs) 04:16, 13 November 2018 (UTC)
I was looking at leave as an example of how this would work out. @Rua, might you be inclined to create a trial page demonstrating the proposed new look for our perusal? (BTW, what is the point of the repeated use of heading in the labels for the first leave#Verb?) The proposal makes sense to me; it is also what is promised by the inscrutable passage in WT:ELE that states that “[a] key principle in ordering the headings and indentation levels is nesting”. As an aside, if we had a more structured way of marking up the entries, along the lines of {{entry|lemma=...|language=...|part_of_speech=...|...}}, somewhat similar to many of the citation templates, such sweeping changes could be effected in one swell foop. (See also the (much more modest) suggestion in Wiktionary:Beer parlour/2018/October#What about uniformise a little bit all Wiktionaries?.)  --Lambiam 08:22, 10 November 2018 (UTC)
I made diff. —Rua (mew) 11:32, 10 November 2018 (UTC)
I'm not opposed to this, but I don't like that closely related nouns and verbs will be no more closely grouped than completely unrelated homographs. I think if we make a change like this, there should be some way of linking them together (especially when one POS comes from another POS in English, and their etymologies are no more distinct than the etymologies of different senses under the same POS). Andrew Sheedy (talk) 18:49, 12 November 2018 (UTC)
Well, how do other dictionaries do it? I don't remember ever seeing a dictionary that gives special treatment to etymologically related words. —Rua (mew) 18:42, 13 November 2018 (UTC)
@Rua I don't use online dictionaries other than Wiktionary very much, but the way I usually see it in print dictionaries is that for different etymologies, the headword will be listed again (usually with a number, like a superscript 1, 2, etc.). Each headword is then followed by its various definitions, further grouped by part of speech. This is especially common with more inflected languages, I find, where homographs might have different genders. Andrew Sheedy (talk) 23:29, 13 November 2018 (UTC)
Even so, paper dictionaries don't normally give any etymology at all, so the idea of being "somewhat" related (e.g. from the same PIE root) is enough to justify a grouping. But on Wiktionary, we do give proper etymologies, and words that are related will still each have their own etymology. The verb sleep and the homographic noun have separate histories that each deserve their own etymology section. This is the same whether the history of the words is long or short (if the noun had been derived from the verb only in modern English, for example). My experience is that when multiple lemmas are grouped under a common etymology, this is almost always an error of omission and each one could be given its own etymology. One of them is always the older, and the other derived from the first, or something similar. So in a future situation where every lemma indeed does have its own proper etymology, each etymology section will have at most one lemma. What is there to gain from that layout then? —Rua (mew) 00:23, 14 November 2018 (UTC)
I definitely agree that it would be an improvement to give etymological information for each part of speech (as well as to better indicate the development of different senses over time). I still don't like that the first verb and noun sections of lay would be no more closely connected than they would be to any other senses. However, one possible compromise would be to adopt the same style as the French Wiktionary, and have only one etymology section, and have "Noun 1", "Verb 2", etc. headers. (See botte for an example of what I mean.)
I oppose etymologies following the definitions for the following reasons:
  1. There's enough clutter as it is with all the -nyms, translations, derived terms, references, usage notes, etc. following the defs;
  2. Many dictionaries begin with the etymology. Every print dictionary I've seen with etymological information has it before the defs;
  3. Etyms are currently at the beginning of entries, and it would be far, far more work to place them all below defs than it would be to make the modification I am suggesting;
  4. As has been mentioned, it's very easy to skip over information one doesn't care about, and far more confusing if there is inconsistency in the way information is presented, as there would be for some time if we tried to move etyms from where they are;
  5. And finally, if all etymological information were to be put under one header, as I am suggesting, putting it at the very bottom of an entry would make it hard to find and not very clear. Andrew Sheedy (talk) 03:58, 14 November 2018 (UTC)
As I said in my first post, this proposal is about rectifying the logical structure of the entries. If you want the etymology to appear first, then it should still be nested under a header of some sort, so that it's clear that the etymology belongs to some word and doesn't stand entirely on its own. However, I think both that and the current structure are less desirable than the one I proposed, which simply nests etymology below part of speech and definitions. Definitions are primary, so they should come first.
I'm also opposed to the French structure, which is essentially what we used to do for synonyms and antonyms using the {{sense}} template. Each lemma has its own etymology, just like each sense has its own synonyms and antonyms, so separating them out makes no sense for the logical structure of an entry. —Rua (mew) 13:37, 14 November 2018 (UTC)
I like what you are trying to accomplish, but I think that not having etymologies as headers at all is the way to go. Page ordering and layout is currently logical, but has nothing to do with usability (except, perhaps, that English is the first language presented). While etymologies are very interesting, they are, at best, secondary information for most users. The fact that etymology is presented before, and with more prominence than, definitions and translations is downright silly. I would have etymologies live in their own namespace, and transclude them at each relevant definition. I would have them displayed in the same manner as usage examples, hidden by default with a small link indicating their existence. Obviously that is not an easy transition to make.
To be clear, I like what Rua is proposing more than the status quo and would support pursuing it further. - TheDaveRoss 19:01, 13 November 2018 (UTC)
Why, heavy language learners sort the words by etymology. Older dictionaries mingle unrelated words to save space in a confusing manner. I would still separate unrelated words spelled the same (with all diacritical marks, if applicable) as different lemmas even if they aren’t nested under etymologies. عقار looks totally fine. Not knowing the relation of the meanings of كُور (kūr) makes me uncomfortable. And for a dictionary aiming at historical depiction, beginning with the etymology as the root of all things is obvious anyway. I can’t follow @Rua’s argument “It makes no sense to me (…) that the lemma is subsumed under the etymology.” – Umm that’s the way I imagine the words, so it makes sense to me?
If one uses this dictionary often, the consistent structure allows one to learn where to put the eyes to get only the information one wants, so abundant etymologies can be overlooked with resolve (your own fault then if you read too much). People don’t care what is under what, they only care that things are always in the same order. Fay Freak (talk) 23:14, 13 November 2018 (UTC)
That last point is of course an argument for the change as well as against it. However, I made arguments in favour of a logical nesting structure as well, and that is not achieved with the current layout, but is with my proposal. So if all else is equal, logical nesting wins out. —Rua (mew) 00:11, 14 November 2018 (UTC)
Not sure how it’s for it, it seemed to me that you wanted to introduce a new order. However I do remove splitups into numbered etymology sections, often they are really nasty as on حرس, in this example even containing the identical references under each section, which I have lamentably perceived to be the “Benwing layout”. Sectioning needs more discretion. Notably, I have now put that pronunciation file between a lemma template and its header because somebody recording a pronunciation file does not lead me into banjaxing the whole layout based on this file. By the way the pronunciation section layout believed to be standard is too much based on “inflectionless languages” – there is no point in having such sections when words have many permutations and the lemma form is just one, particularly when the pronunciation in a language is certain based on spelling or transcription. Instead the tables should convert the Latin spellings to IPA with a click (like tables resort) and allow inclusion of audio files in parameters. @Rua Fay Freak (talk) 16:27, 14 November 2018 (UTC)
@Fay Freak A problem I see with the current version of that Arabic entry, is that it's not possible to see how each of the words was formed. Surely, Arabic has well-developed methods of deriving words from roots, and it's not just a matter of "fill in some random vowels"? This is the kind of information that I would want to see in an etymology, and it's currently lacking. —Rua (mew) 12:44, 16 November 2018 (UTC)
I support the proposal. The "lateness" with which our entries get around to providing definitions is a recurring complaint from new and experienced users alike, on WT:FEED, entry talk pages, and even in other sections of this very subpage. I sympathise with the sentiment that this should be done all at once, though also with the original suggestion of allowing either format, like with alternative forms: most entries have only one POS and etymology, but for the few that have several etymologies, maybe the current setup is better. (Still, even switching all entries to nested/L4 etymologies would be better than the current setup, I think, if it were a choice between those two options.) Other online dictionaries I can think of also put etymologies after pronunciations and definitions. - -sche (discuss) 17:02, 14 November 2018 (UTC)
I was ambivalent at first, but having thought some more, I also support the proposal. For words with multiple etymology headers, currently the POS is L4, while all its subheadings are L5, but L4 is visually indistinguishable from L5, rendering the hierarchy confusing and unclear. While I think the loss of closer grouping for multiple headwords with similar etymologies is regrettable, it’s not that great a loss: those headwords will still directly follow each other, and their individual etymology sections will still indicate their close relation. While most print dictionaries have etymologies before definitions, most online dictionaries seem to have it the other way, so either choice seems fine as far as precedent (and thus new users’ expectations) goes. I am, however, opposed to hiding etymologies in small links like usage examples, as they’re one of Wiktionary’s greatest strengths (from what I’ve read elsewhere, many users come to Wiktionary specifically to find etymological information), and also as etymologies don’t generally correspond to specific senses but to the whole headword. — Vorziblix (talk · contribs) 19:02, 14 November 2018 (UTC)
Can’t the distinguishability of L4 and L5 be tweaked in some CSS?
The rest I rather harmonize with, particularly the emphasis about similar etymologies: But what’s with leave’s different etymologies for noun and verb if the headwords are even more riven by translation sections? @Rua, you have not mentioned this. Fay Freak (talk) 19:11, 14 November 2018 (UTC)
What do you mean? I didn't change the translations at all. —Rua (mew) 19:24, 14 November 2018 (UTC)
I mean that the translation sections are so heavy that it matters little whether the lemmas around them are ordered under etymologies.
Wow, I have just reached a wicked idea: What if we put translation sections under each gloss, like with {{synonyms}}? Fay Freak (talk) 20:24, 14 November 2018 (UTC)
That has been suggested before, and I would support it in principle. It would make the wikicode very hard to navigate though. —Rua (mew) 20:32, 14 November 2018 (UTC)
Please, let's keep this to one topic. DTLHS (talk) 20:40, 14 November 2018 (UTC)

The problem is that ideally (in my head, and in all historical dictionaries), the definitions flow on from the etymology and so should come after it. But I do agree that it's normal to see at least some indication of the lemma and part of speech before any of that. In an ideal world, the inflection line would itself be the L3, with etymology and definitions coming below, but I realise this is probably unfeasible at this point. Ƿidsiþ 08:47, 15 November 2018 (UTC)

Classical K'iche' (Quiché) Mayan vs Modern K'iche' Mayan

There are two languages that are currently sharing the tag K'iche': the modern variety, spelled "K'iche'", and the Classical variety, "Quiché". The Classical variety is written in documents from the 1600s and differs from the modern variety; for example, the Classical Quiché word for food is tioh, and its modern descendant is ti'ij. I have created the category "Classical K'iche'" to sort the Classical words from the Modern. We should change the "quc" tag or change the word headings to Classical K'iche' to avoid confusion. Aearthrise (𓂀) 14:19, 10 November 2018 (UTC)

Etymology of phrasal verbs

Would it be possible to trace some notion similar to etymology for phrasal verbs? Otherwise, what relevant lexicographic information could be added? --Backinstadiums (talk) 18:10, 10 November 2018 (UTC)

Date of first use is one possible item. DTLHS (talk) 20:26, 10 November 2018 (UTC)
@DTLHS: What about [[forbrecan]] > [[break up]] ? --Backinstadiums (talk) 20:46, 10 November 2018 (UTC)
@Leasnam could probably answer that. DTLHS (talk) 20:58, 10 November 2018 (UTC)
I've made a few etymologies for phrasal verbs (I can't seem to actually find an example at the moment though...) where it was possible that the verb + adverb was derived from an inflected or imperative form of an earlier separable-prefix type verb (similar in Dutch and German) in Middle English, like come up from upcomen. We still have upcome, however, which is fading away in favour of its replacement, and it's possible that the adjective upcoming may be a relic of the earlier verb's present participle. I'll keep looking for an example I've actually worked. Additionally, there may also be the kind mentioned by Backinstadiums where the verb has been supplemented (translated?) from an earlier synonymous verb with a different structure. Those are more difficult to prove a connection for, IMO, but it is possible that a move from for- + verb to verb + up (compare also be- + verb to verb + about) was made across multiple verbs at one time. I haven't really read anything (yet) stating that was the case. Leasnam (talk) 21:30, 10 November 2018 (UTC)
@Leasnam This URL for concise academic references to the reasons of the change. --Backinstadiums (talk) 21:56, 10 November 2018 (UTC)
@Backinstadiums Thanks ! I'll take a look at it. Also, I found a phrasal verb that I "etymologised": break off Leasnam (talk) 22:10, 10 November 2018 (UTC)
@Leasnam: According to this reference, abrecan > break off --Backinstadiums (talk) 22:18, 10 November 2018 (UTC)
There's nothing in that article that says that phrasal verbs are direct descendants in an etymological sense of anything in Old English. They're a different construction that replaced the prefixed-verb construction and filled in the same "slots", just as the gerund replaced many uses of the infinitive. Referring to one as the "ancestor" of the other seems to be strictly a metaphor, and not something to be cited in an etymology. I would say that the replacement of Old English forbrecan with English break up is just applying a different, but equivalent construction to the same verb: English up and Old English for- aren't related in any etymological sense, and one isn't a descendant of the other. Chuck Entz (talk) 01:29, 11 November 2018 (UTC)
@Chuck Entz, I would agree with you...there is no etymological connection between forbrecan and break up. Break up simply replaces OE forbrecan/ME forbreken/ModE forbreak with a newly constructed word: break up Leasnam (talk) 08:21, 11 November 2018 (UTC)
@Leasnam, Chuck Entz: ORIGINAL POST: ... some notion similar to etymology ... Otherwise, what relevant lexicographic information ... --Backinstadiums (talk) 10:28, 11 November 2018 (UTC)
@Backinstadiums I suppose then the answer is simply: not much. You could list the Middle English or Old English word for comparison, but that does little except show that those languages had a synonymous term for whatever the new English term is. In cases where the change-over is regular, a note showing break up replacing earlier forbreak might be interesting (and we do this often when a French word displaces a native word)...but it also can add unnecessary clutter. Leasnam (talk) 18:50, 11 November 2018 (UTC)
@Leasnam What reference did you use to etymologize break off? --Backinstadiums (talk) 20:01, 11 November 2018 (UTC)
Middle English Dictionary (University of Michigan) Leasnam (talk) 20:31, 11 November 2018 (UTC)
As far as I can tell, Proto-Germanic had no phrasal verbs. All it had were verbs prefixed by various (unstressed!) adverbs, and verb+adverb phrases like they still exist in English. This is the situation that's found in Gothic. The prefixed verbs have become relics in modern English, and the verb+adverb combinations have taken over. The separable verbs of Dutch and German arose from those same verb+adverb combinations, but the quirks of OV+V2 word order in those languages mean that the verb is sometimes after the adverb (OV) and sometimes before (V2). English has VO word order, so there is no need for separable verbs; the same situation is found in other Germanic languages with VO order, such as Swedish. —Rua (mew) 22:53, 10 November 2018 (UTC)

Change coming to how certain templates will appear on the mobile web

CKoerner (WMF) (talk) 19:34, 13 November 2018 (UTC)

confusing article layout

Please take a look at policy and imagine this is your first time here.

Our article layout makes (especially but not only longer pages in) our dictionary very user unfriendly or even unusable for most people because we don't have the policy of listing the rare and obsolete meanings last, as most or all other good modern dictionaries now do.

In addition, we need a better UI that puts meanings with different etymologies close to each other. Most people never find the only other common meaning of the noun "policy" because it's physically so far away, and confusingly after the verb with the first etymology.

Etymologies are wonderful and one of the strengths of this dictionary, but they should be folded away (like the quotations are now) instead of confusing users and preventing them from finding what they're looking for (or, in fact, ever coming back). --Espoo (talk) 05:57, 14 November 2018 (UTC)

I disagree that etymologies should be hidden by default, but I do think that we should organize definitions by how common they are, and/or by their relation to each other, and include information about historical development in the etymology. Andrew Sheedy (talk) 12:56, 14 November 2018 (UTC)
I definitely agree that an obsolete sense should never be the first sense listed, unless there are only obsolete senses. —Rua (mew) 13:31, 14 November 2018 (UTC)
I disagree. I think earliest senses should be first, including when they're obsolete, as in any historical dictionary (which Wiktionary is, like it or not). Ƿidsiþ 14:42, 14 November 2018 (UTC)
Currently entries have a chronological logic, hence first the origin, then the (known or presumed) first or original senses. That’s useful for historical reading. Often one can also group meanings in abstract formulation (so field) to make pages more readable. Frequency is also a criterion however, and it does not exclude counting historical usage. One just has to look if one thinks the result looks best. On policy one may perhaps try grouping. Fay Freak (talk) 14:50, 14 November 2018 (UTC)
I have no objections to a historical ordering of definitions, personally, but if we're going to do it, we should do it consistently, and recognize that mainstream users are not our target audience. I doubt Wiktionary will ever be a truly valuable resource for the average person, unless other dictionaries go out of business or something. We're much better off tailoring ourselves towards people with an interest in language. Andrew Sheedy (talk) 18:34, 14 November 2018 (UTC)
To be honest I agree. Serious dictionaries are not really ideal for casual users looking up common meanings of slightly uncommon words. On the other hand, there are ways that could be tried. For instance, we could order things historically but slightly "grey out" the obsolete senses, and/or perhaps "highlight" or "star" in some way the most common ones. Ƿidsiþ 08:37, 15 November 2018 (UTC)
"Currently entries have a chronological logic" They certainly do not; that is promulgated by certain users but is by no means universal in practice or required. DTLHS (talk) 18:36, 14 November 2018 (UTC)
Aye that’s what I said. They just have it. Not exclusively. Also frequential and topical or abstract grouping is done and of course no or arbitrary grouping in many entries. This dictionary is driven by fancy. Fay Freak (talk) 18:49, 14 November 2018 (UTC)
Apropos of one or two of the comments above, I would be disappointed if Wiktionary did not aspire to be accessible to mainstream users, or at least "upper mainstream" users. I would consider myself an upper-ish mainstream dictionary user, certainly not a linguist or language expert, and I find myself increasingly using Wiktionary as a go-to dictionary for practical purposes. Partly this is because of the absence of ads and extraneous crap, and partly because of its breadth of coverage and the pretty good quality of a large part (not all) of its content. Personally I disagree with the chronological order approach, and the consequent risk of obsolete senses appearing first, though I understand the reasons why some people advocate this approach. In an ideal world, in my opinion, definitions should by default appear with common modern senses first and/or in "logical" order, but each definition would have a "first recorded use" date attached, and there would be a function for users to sort by that if they desired. Obviously it would be a huge task to implement that across the project. Mihia (talk) 23:05, 17 November 2018 (UTC)
I've suggested that before, but I think it's a distant dream at this point... Maybe in 20 years, when people have nothing better to do here than adding {{defdate}} to entries. Andrew Sheedy (talk) 23:53, 17 November 2018 (UTC)

not it

[4] [ זכריה קהת ] Zack. — 13:10, 16 November 2018 (UTC)

WT:About Chinese: Romanisations of Cantonese

According to the present policy, Pinyin romanisations of monosyllables and polysyllables for Standard Mandarin (aka Putonghua), such as "" and "bùguò", are allowed. However, for Standard Cantonese, only Jyutping romanisations of monosyllables are allowed (e.g. jyut6, ping3), while those of polysyllables are disallowed. Why is there such unequal treatment of the two languages? I believe that Jyutping romanisations of polysyllables should be allowed and created en masse, as Pinyin romanisations of polysyllables are allowed and exist in large quantity. Jonashtand (talk) 15:59, 16 November 2018 (UTC)

Nest "Further reading" under lemma

According to WT:EL, the "Further reading" heading should be level 3, placed below all the lemmas. But there is always further reading about something. I doubt you'll find many cases where you find further reading material about all words spelled X, but instead you would find material about a particular lemma. To me, it therefore makes more sense to treat this as a level 4 header, and nest it underneath the lemma to which it applies. This is also useful in the interest of machine readability, because a bot can't tell which information belongs to which lemma if it's all put together under the same section. —Rua (mew) 11:57, 17 November 2018 (UTC)

  • I agree this makes more sense. In those cases where there are several lemmas for a given language and the further reading happens to apply to all, it should remain at the L3 level, though.
    • And if some of it applies to all, but not the rest? —Rua (mew) 11:54, 18 November 2018 (UTC)
      • I suppose the same should apply to References, Derived terms, Related terms, See also. Perhaps we should follow the example of Synonyms and place the items under appropriate senses and/or subsenses. DCDuring (talk) 16:36, 18 November 2018 (UTC)
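
The machine-readability point raised in this thread can be illustrated with a rough sketch. It assumes entries use plain wikitext headings such as ==English== and ===Noun===; the function name heading_parents and both sample entries are hypothetical and are not taken from any existing bot. It only reports which heading each "Further reading" section sits under.

```python
import re

# Matches a wikitext heading line such as "===Noun===".
HEADING = re.compile(r"^(=+)\s*(.*?)\s*(=+)\s*$")

def heading_parents(wikitext, target="Further reading"):
    """Return the title of the enclosing heading for every `target` heading."""
    stack = []    # (level, title) pairs for the sections currently open
    parents = []
    for line in wikitext.splitlines():
        m = HEADING.match(line)
        # Skip non-headings and headings with unbalanced equals signs.
        if not m or len(m.group(1)) != len(m.group(3)):
            continue
        level, title = len(m.group(1)), m.group(2)
        # A new heading closes every open section at the same or a deeper level.
        while stack and stack[-1][0] >= level:
            stack.pop()
        if title == target:
            parents.append(stack[-1][1] if stack else None)
        stack.append((level, title))
    return parents

# Hypothetical entry with "Further reading" at level 3, below all lemmas.
flat_entry = """==English==
===Noun===
===Verb===
===Further reading==="""

# Hypothetical entry with "Further reading" at level 4, nested under each lemma.
nested_entry = """==English==
===Noun===
====Further reading====
===Verb===
====Further reading===="""

flat_parents = heading_parents(flat_entry)
nested_parents = heading_parents(nested_entry)
```

With the level 3 layout, the only parent such a script can recover is the language section, so the lemma a reference belongs to is lost; with the level 4 layout, each section resolves to its own lemma.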